-stablehlo-aggressive-folder

Faz o dobramento de operações StableHLO

Opções

-assume-no-undeclared-side-effects : Allow dead code to be eliminated in some situations (e.g. dead while loops) under the assumption that ops are pure unless declared with explicit MLIR `MemoryEffects`. Notably, this means `func.call` ops will be assumed pure.
-fold-op-element-limit             : Folding an op into a constant can sometimes come at the cost of memory overhead. (This occurs if the op's inputs are reused, meaning that they can't be deleted after the op is folded to a constant, or when folding operations like `concat` whose outputs take up more memory than their inputs.) In such cases, this config option sets an upper limit on how many elements an op's result may have before the op is no longer folded. Splat folds are exempt from this limit.
-optimize-float                    : Allow float optimizations that, though mathematically equivalent, may result in slightly different quantization of floating-point values (e.g. `log(sqrt(x))` -> `0.5 * log(x)`). Float optimizations that can't affect numerical results are always enabled.

-stablehlo-aggressive-simplification

Canonicaliza operações do StableHLO

Realiza simplificações de gráficos, incluindo:

- add(cst, X) -> add(X, cst)
- add(X, 0) -> X
- and(cst, X) -> and(X, cst)
- and(X, 0) -> 0
- and(X, 1) -> X
- broadcast_in_dim(broadcast_in_dim(X, [dimsA...]), [dimsB...]) -> broadcast_in_dim(X, merge(dimsA, dimsB))
- broadcast_in_dim(X, [dims...]) -> transpose(X, [dims...]) [if same numel & rank]
- broadcast_in_dim(X, [iota...]) -> X
- broadcast_in_dim(X, [sorted...]) -> reshape(X, [sorted...]) [if same numel]
- compare(cst, X, comparator) -> compare(X, cst, inv(comparator))
- compare(X, X, [EQ,GE,LE]) -> true
- compare(X, X, [NE,GT,LT]) -> false
- complex(real(X), imag(X))) -> X
- concatenate(concatenate(X, Y), Z) -> concatenate(X, Y, Z)
- concatenate(X) -> X
- concatenate(X, Y, []) -> concatenate(X, Y)
- convert(X, [X.type]) -> X
- dynamic_broadcast_in_dim(dynamic_broadcast_in_dim(X, _, [dimsA...]), shape, [dimsB...]) -> dynamic_broadcast_in_dim(X, shape, merge(dimsA, dimsB))
- dynamic_broadcast_in_dim(dynamic_reshape(X, shape), shape) -> dynamic_reshape(X, shape)
- dynamic_broadcast_in_dim(X, _, _, [all_nonexpanding...]) -> convert(X)
- dynamic_broadcast_in_dim(X, shape_of(X)) -> X
- dynamic_gather(x, constant(slice_sizes)) -> gather(x, slice_sizes)
- dynamic_iota(shape, dim) ->
- dynamic_pad(X, low, high, interior) -> pad(X, low, high, interior)
- dynamic_reshape(dynamic_reshape(X, _), shape)) -> dynamic_reshape(X, shape)
- dynamic_reshape(op(dynamic_reshape(X, shape)), shape)
- dynamic_slice(X, begin, slice_sizes) -> slice(X, begin, slice_sizes)
- dynamic_update_slice(X, update, start_indices : zero)) -> update
- dynamic_update_slice(X, update : zero_extent)) -> X
- gather(X, cst_start_indices) -> slice(X, slice_start, slice_end)
- get_dimension_size(X, i) -> X.shape[i]
- get_tuple_element(tuple(X_0, X_1, ...), i) -> X_i
- imag(complex(R,I)) -> I
- iota(dim) : multi_rank
- iota(dim) : type -> constant(0) : type [if type[dim] == 1]
- max(cst, X) -> max(X, cst)
- minimum(cst, X) -> minimum(X, cst)
- multiply(cst, X) -> multiply(X, cst)
- multiply(X, 0i) -> 0i
- multiply(X, 1i) -> X
- op(X : zero_extent_tensor) -> constant([])
- or(cst, X) -> or(X, cst)
- or(X, 0) -> X
- or(X, 1) -> 1
- pad(empty_tensor, _) -> broadcast_in_dim(empty_tensor, _)
- real(complex(R,I)) -> X
- real_dynamic_slice(X, start, limit, strides)
- real_dynamic_slice(X, start, limit, strides)
- reduce[A](_, _, fn:return A) -> A...
- reduce(empty_0, empty_1, ...) -> [broadcast_in_dim(empty_i)...]
- reduce(in_1, in_2, _, _) -> reduce(in_1, _, _) [if unused(in_2)]
- reduce(X..., dims=[], add) -> X...
- reshape(reshape(X, _), [shape]) -> reshape(X, [shape])
- reshape(X, [X.shape]) -> X
- select(broadcast(not(p)), t, f) => select(broadcast(p), f, t)
- select(not(p), t, f) => select(p, f, t)
- shape_of(dynamic_reshape(X, shape)) -> shape
- slice(concat(X,Y,Z,...),...) -> concat(slice(X),slice(Y),slice(Z))
- slice(X, [A:A], [B:B], ...) -> X
- sort(X) -> sort(X, dim = N) [when dim can be inferred]
- sort(X,Y) -> sort(X) [if Y unused and unused in comparator]
- subtract(X, 0) -> X
- subtract(X, X) -> 0
- transpose(X, [iota...]) -> X
- transpose(X, [no_mem_layout_change...]) -> reshape(X)
- tuple(get_tuple_element(X, 0), get_tuple_element(X, 1), ...) -> X
- while -> while (loop invariants as implicit captures)
- xor(cst, X) -> xor(X, cst)
- (+more)

Essa lista é extraída de comentários de código e não é totalmente exaustiva, mas tem alta cobertura da aprovação de hoje.

Opções

-fold-op-element-limit : Folding an op into a constant can sometimes come at the cost of memory overhead. (This occurs if the op's inputs are reused, meaning that they can't be deleted after the op is folded to a constant, or when folding operations like `concat` whose outputs take up more memory than their inputs.) In such cases, this config option sets an upper limit on how many elements an op's result may have before the op is no longer folded. Splat folds are exempt from this limit.

-stablehlo-target-independent-optimization

Executa canonicalizadores, pastas e outras otimizações independentes da meta.

Usa padrões de StablehloAggressiveSimplificationPass e StablehloAggressiveFolderPass juntos, permitindo que a canonização e o dobramento sejam realizados no mesmo conjunto de padrões, o que geralmente leva a melhores resultados.

Os usuários devem preferir essa transmissão a chamar as outras diretamente.

Opções

-assume-no-undeclared-side-effects : Allow dead code to be eliminated in some situations (e.g. dead while loops) under the assumption that ops are pure unless declared with explicit MLIR `MemoryEffects`. Notably, this means `func.call` ops will be assumed pure.
-fold-op-element-limit             : Folding an op into a constant can sometimes come at the cost of memory overhead. (This occurs if the op's inputs are reused, meaning that they can't be deleted after the op is folded to a constant, or when folding operations like `concat` whose outputs take up more memory than their inputs.) In such cases, this config option sets an upper limit on how many elements an op's result may have before the op is no longer folded. Splat folds are exempt from this limit.
-optimize-float                    : Allow float optimizations that, though mathematically equivalent, may result in slightly different quantization of floating-point values (e.g. `log(sqrt(x))` -> `0.5 * log(x)`). Float optimizations that can't affect numerical results are always enabled.