OpenXLA Project

`-chlo-legalize-to-stablehlo`

Legalizes CHLO ops to StableHLO and Shape ops.

`-shape-legalize-to-stablehlo`

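Legalizes shape-related ops to StableHLO.

An experimental pass that legalizes shape-related ops to StableHLO ops.

Bringing shape and data computations together through an optional pass lets the StableHLO ecosystem potentially leverage compilation pipelines that use StableHLO operations to model dynamism.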

`-stablehlo-canonicalize-dynamism`

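Canonicalizes dynamic StableHLO ops into static ops.

Replaces dynamic StableHLO ops such as DynamicReshapeOp with their static counterparts (for example, DynamicReshapeOp to ReshapeOp, or DynamicBroadcastInDim to BroadcastInDim) when all the dynamic elements of these ops are actually constants.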

  %c = stablehlo.constant dense<16> : tensor<1xi32>
  %0 = stablehlo.dynamic_broadcast_in_dim %cst, %c, dims = [] : (tensor<f32>, tensor<1xi32>) -> tensor<16xf32>

  ==>

  %0 = stablehlo.broadcast_in_dim %cst, dims = [] : (tensor<f32>) -> tensor<16xf32>

`-stablehlo-check-shape-assertions`

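Check stablehlo.custom_call @shape_assertion ops.

Validates shape_assertion custom calls.

Shape assertions validate constraints on dynamic dimensions in StableHLO. For example, if a framework needs to enforce the constraint DimA < 2, the following IR could be emitted: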

%dimA = <get_dimension_size or input arg> : tensor<i32>
%c2 = stablehlo.constant dense<2> : tensor<i32>
%is_lt = stablehlo.compare LT %dimA, %c2 : tensor<i1>
stablehlo.custom_call @shape_assertion(%is_lt) { error_message = "DimA must be less than 2" }

After the pass runs, if the shapes are correct, the stablehlo.custom_call is removed.

Options

-enable-shape-assertions : Whether shape assertions may generate errors.

`-stablehlo-compatibility-expander`

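Compatibility expander for StableHLO operations.

StableHLO ops are updated, and new ops are introduced, in the latest versions. This opt-in pass expands backward compatibility with older StableHLO versions by decomposing newer StableHLO operations into equivalent operations supported by those older versions.

Why is this an opt-in pass?

Occasionally, StableHLO op enhancements are used to greatly simplify the handling of certain common patterns in the OpenXLA ecosystem. This includes TanOp, which has broad framework and compiler support, as well as gather/scatter batching dimensions, which can be represented using slices but make sharding much more difficult. For this category of new features, no automatic downgrade is offered, since it may throw away important information used in subsequent optimizations. This pass can be used to expand these ops based on a target version to maximize compatibility, at the expense of potentially less optimal compilation.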

func.func @tan_op_non_complex(%arg0: tensor<4xf64>) -> tensor<4xf64> {
  %1 = stablehlo.tan %arg0 : tensor<4xf64>
  func.return %1 : tensor<4xf64>
}

==>

func.func @tan_op_non_complex(%arg0: tensor<4xf64>) -> tensor<4xf64> {
  %0 = stablehlo.sine %arg0 : tensor<4xf64>
  %1 = stablehlo.cosine %arg0 : tensor<4xf64>
  %2 = stablehlo.divide %0, %1 : tensor<4xf64>
  return %2 : tensor<4xf64>
}
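The expansion above rests on the identity tan(x) = sin(x) / cos(x). The following is a minimal C++ sketch, using only the standard library, that mirrors the three expanded ops on a scalar so the two forms can be compared numerically. The helper name `expanded_tan` is hypothetical, and the code models the arithmetic only; it is not the pass's implementation.

```cpp
#include <cmath>

// Hypothetical helper: mirrors the expanded IR on a single f64 value.
double expanded_tan(double x) {
  double s = std::sin(x);  // stablehlo.sine
  double c = std::cos(x);  // stablehlo.cosine
  return s / c;            // stablehlo.divide
}
```

Away from the poles of the tangent, `expanded_tan(x)` and `std::tan(x)` should agree to within a few units in the last place.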

Options

-target : The target version. Must be a version of the form #.#.#.

`-stablehlo-complex-math-expander`

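Expander for StableHLO complex math operations.

StableHLO complex math operations are decomposed using StableHLO real math operations.

This is based on the assumption that no hardware supports complex numbers or complex math operations natively. It follows that any fallback mechanisms for complex math operations that compilers may implement are redundant. When this pass is enabled, all StableHLO complex math operations are expanded.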

func.func @sqrt_op_complex(%arg0: tensor<4xcomplex<f64>>) -> tensor<4xcomplex<f64>> {
  %1 = stablehlo.sqrt %arg0 : tensor<4xcomplex<f64>>
  func.return %1 : tensor<4xcomplex<f64>>
}

==>

func.func @sqrt_op_complex(%arg0: tensor<4xcomplex<f64>>) -> tensor<4xcomplex<f64>> {
  TBD
  return %2 : tensor<4xcomplex<f64>>
}
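The expanded IR is left as TBD in the example above, and this sketch does not try to reproduce it. As an illustration of what "decomposed using real math operations" can mean, the following C++ snippet computes the principal square root of a + bi with real arithmetic only, using the textbook identity sqrt(z) = sqrt((|z| + a) / 2) + i * sign(b) * sqrt((|z| - a) / 2). The helper name `real_only_sqrt` is hypothetical, and the formula is an assumption chosen for clarity, not the decomposition the pass emits.

```cpp
#include <cmath>
#include <complex>

// Hypothetical helper: principal square root of a + bi using only real
// arithmetic. This is the textbook identity, not the pass's output; it
// ignores the overflow and cancellation issues a production expansion
// would need to handle.
std::complex<double> real_only_sqrt(double a, double b) {
  double mag = std::hypot(a, b);           // |z|
  double re = std::sqrt((mag + a) / 2.0);  // real part, always >= 0
  double im = std::sqrt((mag - a) / 2.0);  // magnitude of the imaginary part
  return {re, std::copysign(im, b)};       // imaginary part takes the sign of b
}
```

For well-scaled inputs the result can be compared against `std::sqrt` on `std::complex<double>`.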

`-stablehlo-convert-to-signless`

Pass to transform the IR to operate on signless integers.

`-stablehlo-legalize-composite-to-call`

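Replaces composite ops with a call to their decomposition.

Replaces composite ops with a call to their decomposition. For example, the following: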

stablehlo.composite "my_namespace.my_op" %arg0, %arg1 {
  decomposition = @bar,
  version = 1,
  composite_attributes = {
    "my_attribute": "my_value"
  }
}

becomes:

func.call @bar(%arg0, %arg1)

A subset of composites can be excluded from this transformation using the "except" flag, for example:

stablehlo-opt --stablehlo-legalize-composite-to-call=except='foo.baz,foo.qux'
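The effect of the flag can be pictured as a name filter. The sketch below is a standalone C++ model, assuming that exclusion is an exact string match on the composite name; the helper names `parse_except` and `should_replace_with_call` are hypothetical and are not part of the StableHLO API.

```cpp
#include <sstream>
#include <string>
#include <unordered_set>

// Hypothetical helper: split a comma-separated option value such as
// "foo.baz,foo.qux" into a set of composite names.
std::unordered_set<std::string> parse_except(const std::string& value) {
  std::unordered_set<std::string> names;
  std::stringstream ss(value);
  std::string item;
  while (std::getline(ss, item, ',')) {
    if (!item.empty()) names.insert(item);
  }
  return names;
}

// Assumed rule: a composite is replaced by a call unless its name is listed.
bool should_replace_with_call(const std::string& composite_name,
                              const std::unordered_set<std::string>& except) {
  return except.count(composite_name) == 0;
}
```

With the value from the command above, "foo.baz" and "foo.qux" would be left as composites, while "my_namespace.my_op" would still be replaced by a call.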

Options

-except : Names of composites that should not be replaced with calls.

`-stablehlo-legalize-deprecated-ops`

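Legalizes deprecated ops to well-supported ops.

The StableHLO v1.0 Opset Deprecations RFC (#2283) proposes removing several redundant ops. This pass helps evaluate the impact of these removals in various compilation pipelines by legalizing the ops to their long-term supported counterparts.

Options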

-fail-on-unused : Fail on (mostly) unused ops that are deprecated without any fallback.

`-stablehlo-legalize-qdq-to-quantized-op`

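Fuses the (dequantize, floating-point operation, quantize) pattern into a StableHLO quantized operation.

Fuses the (dequantize, floating-point operation, quantize) pattern into a StableHLO quantized operation. Note: the pass does not delete any preexisting op. For example, the following program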

func.func @add(%arg0: tensor<16x16x!quant.uniform<ui8:f32, 34.0:16>>) -> tensor<16x16x!quant.uniform<ui8:f32, 34.0:16>> {
  %0 = stablehlo.uniform_dequantize %arg0 : (tensor<16x16x!quant.uniform<ui8:f32, 34.0:16>>) -> tensor<16x16xf32>
  %1 = stablehlo.abs %0 : tensor<16x16xf32>
  %2 = stablehlo.uniform_quantize %1 : (tensor<16x16xf32>) -> tensor<16x16x!quant.uniform<ui8:f32, 34.0:16>>
  func.return %2 : tensor<16x16x!quant.uniform<ui8:f32, 34.0:16>>
}

becomes:

func.func @add(%arg0: tensor<16x16x!quant.uniform<u8:f32, 3.400000e+01:16>>) -> tensor<16x16x!quant.uniform<u8:f32, 3.400000e+01:16>> {
  %0 = stablehlo.uniform_dequantize %arg0 : (tensor<16x16x!quant.uniform<u8:f32, 3.400000e+01:16>>) -> tensor<16x16xf32>
  %1 = stablehlo.abs %0 : tensor<16x16xf32>
  %2 = stablehlo.abs %arg0 : tensor<16x16x!quant.uniform<u8:f32, 3.400000e+01:16>>
  %3 = stablehlo.uniform_quantize %1 : (tensor<16x16xf32>) -> tensor<16x16x!quant.uniform<u8:f32, 3.400000e+01:16>>
  return %2 : tensor<16x16x!quant.uniform<u8:f32, 3.400000e+01:16>>
}

`-stablehlo-legalize-quant-to-math`

Converts StableHLO quantized ops to StableHLO primitive math ops.

Converts StableHLO programs that use UniformQuantized types to semantically equivalent integer math operations.

func.func @add(%arg0: tensor<!quant.uniform<i8:f32,1.0:0>>, %arg1: tensor<!quant.uniform<i8:f32,2.0:1>>) ->  tensor<!quant.uniform<i8:f32,3.0:2>> {
  %0 = "stablehlo.add"(%arg0, %arg1) : (tensor<!quant.uniform<i8:f32,1.0:0>>, tensor<!quant.uniform<i8:f32,2.0:1>>) -> tensor<!quant.uniform<i8:f32,3.0:2>>
  func.return %0 : tensor<!quant.uniform<i8:f32,3.0:2>>
}

becomes:

func.func @add(%arg0: tensor<i8>, %arg1: tensor<i8>) -> tensor<i8> {
  %0 = stablehlo.convert %arg0 : (tensor<i8>) -> tensor<f32>
  %cst = stablehlo.constant dense<0.333333343> : tensor<f32>
  %1 = chlo.broadcast_multiply %0, %cst : (tensor<f32>, tensor<f32>) -> tensor<f32>
  %cst_0 = stablehlo.constant dense<2.000000e+00> : tensor<f32>
  %2 = chlo.broadcast_add %1, %cst_0 : (tensor<f32>, tensor<f32>) -> tensor<f32>
  %3 = stablehlo.round_nearest_even %2 : tensor<f32>
  %4 = stablehlo.convert %3 : (tensor<f32>) -> tensor<i32>
  %5 = stablehlo.convert %arg1 : (tensor<i8>) -> tensor<f32>
  %cst_1 = stablehlo.constant dense<0.666666686> : tensor<f32>
  %6 = chlo.broadcast_multiply %5, %cst_1 : (tensor<f32>, tensor<f32>) -> tensor<f32>
  %cst_2 = stablehlo.constant dense<1.33333337> : tensor<f32>
  %7 = chlo.broadcast_add %6, %cst_2 : (tensor<f32>, tensor<f32>) -> tensor<f32>
  %8 = stablehlo.round_nearest_even %7 : tensor<f32>
  %9 = stablehlo.convert %8 : (tensor<f32>) -> tensor<i32>
  %c = stablehlo.constant dense<2> : tensor<i32>
  %10 = chlo.broadcast_add %4, %9 : (tensor<i32>, tensor<i32>) -> tensor<i32>
  %11 = chlo.broadcast_subtract %10, %c : (tensor<i32>, tensor<i32>) -> tensor<i32>
  %c_3 = stablehlo.constant dense<-128> : tensor<i32>
  %c_4 = stablehlo.constant dense<127> : tensor<i32>
  %12 = stablehlo.clamp %c_3, %11, %c_4 : tensor<i32>
  %13 = stablehlo.convert %12 : (tensor<i32>) -> tensor<i8>
  return %13 : tensor<i8>
}
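The constants in the lowered program are consistent with rescaling each operand into the result's quantization parameters (scale 3.0, zero point 2): 1.0 / 3.0 with offset 2 for the first operand, and 2.0 / 3.0 with offset 2 - 1 * (2.0 / 3.0) for the second. The following C++ sketch replays the lowered ops on scalars so the arithmetic can be checked by hand. The function name `lowered_quantized_add` is hypothetical, and the code assumes the default floating-point rounding mode so that `std::nearbyint` rounds ties to even.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper mirroring the lowered IR above, op by op, on scalars.
// Assumes the default rounding mode (round-to-nearest-even), under which
// std::nearbyint behaves like stablehlo.round_nearest_even.
std::int8_t lowered_quantized_add(std::int8_t lhs, std::int8_t rhs) {
  // lhs: rescale from (scale 1.0, zero point 0) to the result parameters.
  float l = static_cast<float>(lhs) * 0.333333343f + 2.0f;
  std::int32_t l_i32 = static_cast<std::int32_t>(std::nearbyint(l));
  // rhs: rescale from (scale 2.0, zero point 1) to the result parameters.
  float r = static_cast<float>(rhs) * 0.666666686f + 1.33333337f;
  std::int32_t r_i32 = static_cast<std::int32_t>(std::nearbyint(r));
  // Both terms now include the result zero point, so subtract it once.
  std::int32_t sum = l_i32 + r_i32 - 2;
  // Saturate to the i8 storage range, as stablehlo.clamp does.
  sum = std::max<std::int32_t>(-128, std::min<std::int32_t>(sum, 127));
  return static_cast<std::int8_t>(sum);
}
```

For example, inputs 3 and 4 represent the real values 3.0 and 6.0; their sum 9.0 quantizes to 9 / 3 + 2 = 5, which is what the lowered arithmetic produces.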

`-stablehlo-legalize-quantized-op-to-qdq`

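Decomposes quantized StableHLO operations into the (dequantize, floating-point operation, quantize) pattern.

Decomposes StableHLO quantized programs using uniform quantize/dequantize operations. For example, the following program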

func.func @add(%arg0: tensor<!quant.uniform<i8:f32,1.0:0>>, %arg1: tensor<!quant.uniform<i8:f32,2.0:1>>) ->  tensor<!quant.uniform<i8:f32,3.0:2>> {
  %0 = "stablehlo.add"(%arg0, %arg1) : (tensor<!quant.uniform<i8:f32,1.0:0>>, tensor<!quant.uniform<i8:f32,2.0:1>>) -> tensor<!quant.uniform<i8:f32,3.0:2>>
  func.return %0 : tensor<!quant.uniform<i8:f32,3.0:2>>
}

becomes:

func.func @add(%arg0: tensor<!quant.uniform<i8:f32, 1.000000e+00>>, %arg1: tensor<!quant.uniform<i8:f32, 2.000000e+00:1>>) -> tensor<!quant.uniform<i8:f32, 3.000000e+00:2>> {
  %0 = stablehlo.uniform_dequantize %arg0 : (tensor<!quant.uniform<i8:f32, 1.000000e+00>>) -> tensor<f32>
  %1 = stablehlo.uniform_dequantize %arg1 : (tensor<!quant.uniform<i8:f32, 2.000000e+00:1>>) -> tensor<f32>
  %2 = stablehlo.add %0, %1 : tensor<f32>
  %3 = stablehlo.uniform_quantize %2 : (tensor<f32>) -> tensor<!quant.uniform<i8:f32, 3.000000e+00:2>>
  return %3 : tensor<!quant.uniform<i8:f32, 3.000000e+00:2>>
}
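The decomposed program can be read as plain scalar arithmetic. The sketch below models it in C++ under the usual uniform quantization formulas (real = scale * (stored - zero_point), and stored = clamp(round(real / scale) + zero_point)). The helper names `dequantize`, `quantize`, and `qdq_add` are hypothetical, and the rounding and saturation details are assumptions rather than a statement of what the StableHLO ops guarantee.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: map a stored i8 value to its real value.
float dequantize(std::int8_t q, float scale, std::int32_t zero_point) {
  return scale * static_cast<float>(static_cast<std::int32_t>(q) - zero_point);
}

// Hypothetical helper: map a real value back to i8 storage. Rounds with
// std::nearbyint (ties to even in the default rounding mode), adds the zero
// point, then saturates to [-128, 127].
std::int8_t quantize(float x, float scale, std::int32_t zero_point) {
  std::int32_t q =
      static_cast<std::int32_t>(std::nearbyint(x / scale)) + zero_point;
  q = std::max<std::int32_t>(-128, std::min<std::int32_t>(q, 127));
  return static_cast<std::int8_t>(q);
}

// Mirrors the decomposed program: dequantize both operands, add in f32,
// then quantize with the result parameters from the example.
std::int8_t qdq_add(std::int8_t lhs, std::int8_t rhs) {
  float a = dequantize(lhs, 1.0f, 0);
  float b = dequantize(rhs, 2.0f, 1);
  return quantize(a + b, 3.0f, 2);
}
```

With inputs 3 and 4 the real values are 3.0 and 6.0, and the quantized sum is 9 / 3 + 2 = 5.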

`-stablehlo-legalize-to-vhlo`

Legalizes StableHLO to VHLO.

Legalizes StableHLO to the latest version of ops in VHLO. These ops can then be downgraded to older versions of VHLO for forward compatibility using VhloToVersionPass.

stablehlo.exponential %[[ARG0]] <{result_accuracy = DEFAULT}> : tensor<f32>
# ====>
"vhlo.exponential_v2"(%[[ARG0]]) <{result_accuracy = #vhlo.DEFAULT_v1}> : !vhlo.tensor_v1<!vhlo.f32_v1>

See vhlo.md > The VHLO dialect for full details on how VHLO is used to preserve forward and backward compatibility.

Options

-allow-other-dialects : Allow serialization to use other (potentially unstable) dialects, inserts unrealized casts between dialects.

`-stablehlo-refine-arguments`

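Refines the argument shapes of the main function.

Modifies the arguments of the main function using the input type signature. Wraps arguments in custom_call @stablehlo.shape_refinement_operand_wrapper to keep the IR valid before shape refinement is run.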

func.func public @main(%arg0: tensor<?xf32>) -> tensor<?xf32> {
  ...
}

==>

func.func public @main(%arg0: tensor<16xf32>) -> tensor<?xf32> {
  %c = stablehlo.constant dense<16> : tensor<1xi64>
  %0 = stablehlo.custom_call @stablehlo.shape_refinement_operand_wrapper(%arg0, %c) {...}
    : (tensor<16xf32>, tensor<1xi64>) -> tensor<?xf32>
  ...
}

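The refinedTypesOption can be used to specify a list of refined types. This can be specified in MLIR with --types='tensor<...>,tensor<...>', or passed to the pass creation method. The refinement type list must specify the type of every argument of the main function being refined.

Options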

-types : The new types to be used for the main function's arguments, specified as an MLIR TypeRange 'tensor<1x2xf32>, ...'

`-stablehlo-refine-shapes`

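Refines shapes across a StableHLO program.

Walks through a StableHLO program, refining shapes within ops.

The flagship use case for this pass is specializing dynamically-shaped programs to static shapes. If a dynamically-shaped StableHLO program has the right structure, then updating its argument types from dynamic shapes to static shapes and running this pass propagates static shapes across the program.

This pass removes custom_call @shape_refinement_operand_wrapper by replacing uses of the result with the operand directly, and propagates static shapes throughout the program.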

  %c = stablehlo.constant dense<16> : tensor<1xi64>
  %0 = stablehlo.custom_call @stablehlo.shape_refinement_operand_wrapper(%arg0, %c) {...}
      : (tensor<16xf32>, tensor<1xi64>) -> tensor<?xf32>
  %1 = stablehlo.add %0, %0 : tensor<?xf32>

  ==>

  %1 = stablehlo.add %arg0, %arg0 : tensor<16xf32>

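Modules valid for shape refinement must have the following properties:

- All dynamic shapes depend only on the input shapes (there is no shape dependency on the contents of the input arrays). Operations that depend transitively only on the input shapes (for example, as given by stablehlo.get_dimension_size) or on global constants such as the resolved values of symbolic integers (i.e. tensor : A = 5) are referred to as dimension operations. All dimension values can be resolved to constants through inter-procedural constant folding.
- Intermediate functions may take a number of token arguments (of type !stablehlo.token) at the start of the argument list, followed by some global constant arguments, which are constant integer scalars such as the resolved values of symbolic integers (i.e. tensor : A = 5).
- Some intermediate functions may return computations on global constants, for example floordiv on symint values. These functions are identified by returning only constant values after refinement, and they are inlined.
- All calls to a single function resolve to the same argument shapes, and no recursive or co-recursive function calls are made.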

`-stablehlo-wrap-in-composite`

Wraps a non-composite StableHLO op in a composite op.

Wraps StableHLO operations in stablehlo.composite operations.

For instance, consider a simple StableHLO program:

func.func @main(%arg0 : tensor<2xf32>, %arg1 : tensor<2xf32>) -> tensor<2xf32> {
  %0 = stablehlo.add %arg0, %arg1 : tensor<2xf32>
  return %0 : tensor<2xf32>
}

Applying this pass to wrap stablehlo.add operations results in the following program:

func.func @main(%arg0: tensor<2xf32>, %arg1: tensor<2xf32>) -> tensor<2xf32> {
  %0 = stablehlo.composite "stablehlo.add" %arg0, %arg1 {decomposition = @stablehlo.add.impl} : (tensor<2xf32>, tensor<2xf32>) -> tensor<2xf32>
  return %0 : tensor<2xf32>
}
func.func private @stablehlo.add.impl(%arg0: tensor<2xf32>, %arg1: tensor<2xf32>) -> tensor<2xf32> {
  %0 = stablehlo.add %arg0, %arg1 : tensor<2xf32>
  return %0 : tensor<2xf32>
}

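Notes:

- The name attribute of the generated stablehlo.composite operation is always the same as the name of the original operation that was wrapped (for example, wrapping a stablehlo.add operation yields a composite named "stablehlo.add").
- The private function that encapsulates the original operation (referenced by the decomposition attribute of the stablehlo.composite operation) is named using the pattern <op_name>.impl[.N], where <op_name> is the name of the original operation and N is a unique integer identifier generated to prevent naming conflicts within the module.

This pass can be used in two distinct ways:

Mode 1: Command-line Usage

This mode is intended for debugging or testing, as it offers minimal control over the attributes of the generated stablehlo.composite operations. It wraps all instances of the operations specified with the op-names option (a comma-separated list of operation names). The attributes of the newly created stablehlo.composite operation are the same as the attributes of the original operation.

Usage Example: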

stablehlo-opt input.mlir --stablehlo-wrap-in-composite=op-names='stablehlo.add,stablehlo.multiply' -o output.mlir

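Mode 2: Programmatic Module-Wide Wrapping with Customized Attribute Handling

This mode extends programmatic wrapping to the entire module, offering fine-grained control over which operations are wrapped and over their attributes. This is achieved with the createStablehloWrapInCompositePass API, which takes a CompositeAttributeProviderMap as an argument.

The CompositeAttributeProviderMap is a map that dictates which operations are considered for wrapping and how their attributes are handled. Its semantics are as follows:

- Keys (mlir::TypeID): the TypeID of an MLIR operation. If an operation's TypeID matches a key in the map, it becomes a candidate for wrapping.
- Values (lambda functions): a lambda function of type std::function<std::optional<NamedAttrList>(Operation*)>. This function is applied to each candidate operation.
  - Input: an mlir::Operation*, which is an instance of the operation type corresponding to the TypeID key.
  - Return value: a std::optional<NamedAttrList>.
    - If the lambda returns a NamedAttrList (wrapped in std::optional), the operation is wrapped in a stablehlo::composite operation, and the returned attributes are used to set the composite's attributes.
    - If the lambda returns std::nullopt, the operation is not wrapped. This allows selective wrapping based on custom criteria.

Example (C++):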


stablehlo::CompositeAttributeProviderMap compositeAttributeProviderMap;

compositeAttributeProviderMap[mlir::TypeID::get<mlir::stablehlo::AddOp>()] =
  [](mlir::Operation* op) -> std::optional<mlir::NamedAttrList> {
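  // Custom logic to determine if and how to wrap the operation.
  // Example: only wrap ops whose first operand has an f32 element type.
  // The operand type is a tensor, so inspect its element type
  // (getElementTypeOrSelf is declared in mlir/IR/TypeUtilities.h).
  if (mlir::getElementTypeOrSelf(op->getOperand(0).getType()).isF32()) {
    return mlir::NamedAttrList(op->getAttrs());
  }
  return std::nullopt; // Do not wrap.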
};

pm.addPass(createStablehloWrapInCompositePass(compositeAttributeProviderMap, compositeVersion));
if (mlir::failed(pm.run(module))) {
  return;
}

Options

-op-names : The names of the ops to wrap.
-version  : The version number of the composite op.

`-vhlo-legalize-to-stablehlo`

Legalizes VHLO to StableHLO.

`-vhlo-to-version`

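Converts between versions of VHLO for compatibility.

Converts between versions of VHLO, upgrading and downgrading the IR to preserve forward and backward compatibility.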

"vhlo.exponential_v2"(%[[ARG0]]) <{result_accuracy = DEFAULT}>
# ==( -target=1.0.0 )==>
"vhlo.exponential_v1"(%[[ARG0]])
# ==( -target=1.9.0 )==>
"vhlo.exponential_v2"(%[[ARG0]]) <{result_accuracy = DEFAULT}>

See vhlo.md > The VHLO dialect for full details on how VHLO is used to preserve forward and backward compatibility.

Options

-target : The target version. Must be a version of the form #.#.#.
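To make the #.#.# form concrete, the following standalone C++ sketch parses a target string and compares it against the version in which a feature became available. It is only a model of the idea: the names `Version`, `parse_version`, and `supports` are hypothetical, the parsing is deliberately lenient, and the threshold passed to `supports` is whatever the caller chooses, not a value taken from the StableHLO sources.

```cpp
#include <array>
#include <cstdio>
#include <string>

using Version = std::array<int, 3>;  // {major, minor, patch}

// Hypothetical helper: parse a "#.#.#" string. Returns false when fewer than
// three numeric fields are present or when characters follow the last field.
bool parse_version(const std::string& text, Version& out) {
  int major = 0, minor = 0, patch = 0;
  int consumed = 0;
  if (std::sscanf(text.c_str(), "%d.%d.%d%n", &major, &minor, &patch,
                  &consumed) != 3) {
    return false;
  }
  if (consumed != static_cast<int>(text.size())) return false;
  if (major < 0 || minor < 0 || patch < 0) return false;
  out = {major, minor, patch};
  return true;
}

// True when `target` is at least `introduced`; std::array compares
// lexicographically, which matches major.minor.patch ordering.
bool supports(const Version& target, const Version& introduced) {
  return target >= introduced;
}
```

In the example above, a target of 1.0.0 yields vhlo.exponential_v1 while 1.9.0 keeps vhlo.exponential_v2; with 1.9.0 used as an illustrative threshold, `supports` returns false for the former and true for the latter.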