-stablehlo-aggressive-folder
Folds StableHLO অপারেশন
অপশন
-assume-no-undeclared-side-effects : Allow dead code to be eliminated in some situations (e.g. dead while loops) under the assumption that ops are pure unless declared with explicit MLIR `MemoryEffects`. Notably, this means `func.call` ops will be assumed pure.
-fold-op-element-limit : Folding an op into a constant can sometimes come at the cost of memory overhead. (This occurs if the op's inputs are reused, meaning that they can't be deleted after the op is folded to a constant, or when folding operations like `iota` whose outputs take up more memory than their inputs.) In such cases, this config option sets an upper limit on how many elements an op's result may have before the op is no longer folded.
-optimize-float : Allow float optimizations that, though mathematically equivalent, may result in slightly different quantization of floating-point values (e.g. `log(sqrt(x))` -> `0.5 * log(x)`). Float optimizations that can't affect numerical results are always enabled.
-stablehlo-aggressive-simplification
StableHLO অপারেশনকে ক্যানোনিকালাইজ করে
গ্রাফ সরলীকরণ সম্পাদন করে, সহ:
- add(cst, X) -> add(X, cst)
- add(X, 0) -> X
- and(cst, X) -> and(X, cst)
- and(X, 0) -> 0
- and(X, 1) -> X
- broadcast_in_dim(broadcast_in_dim(X, [dimsA...]), [dimsB...]) -> broadcast_in_dim(X, merge(dimsA, dimsB))
- broadcast_in_dim(X, [dims...]) -> transpose(X, [dims...]) [if same numel & rank]
- broadcast_in_dim(X, [iota...]) -> X
- broadcast_in_dim(X, [sorted...]) -> reshape(X, [sorted...]) [if same numel]
- compare(cst, X, comparator) -> compare(X, cst, inv(comparator))
- compare(X, X, [EQ,GE,LE]) -> true
- compare(X, X, [NE,GT,LT]) -> false
- complex(real(X), imag(X))) -> X
- concatenate(concatenate(X, Y), Z) -> concatenate(X, Y, Z)
- concatenate(X) -> X
- concatenate(X, Y, []) -> concatenate(X, Y)
- convert(X, [X.type]) -> X
- dynamic_broadcast_in_dim(dynamic_broadcast_in_dim(X, _, [dimsA...]), shape, [dimsB...]) -> dynamic_broadcast_in_dim(X, shape, merge(dimsA, dimsB))
- dynamic_broadcast_in_dim(dynamic_reshape(X, shape), shape) -> dynamic_reshape(X, shape)
- dynamic_broadcast_in_dim(X, _, _, [all_nonexpanding...]) -> convert(X)
- dynamic_broadcast_in_dim(X, shape_of(X)) -> X
- dynamic_gather(x, constant(slice_sizes)) -> gather(x, slice_sizes)
- dynamic_iota(shape, dim) ->
- dynamic_pad(X, low, high, interior) -> pad(X, low, high, interior)
- dynamic_reshape(dynamic_reshape(X, _), shape)) -> dynamic_reshape(X, shape)
- dynamic_reshape(op(dynamic_reshape(X, shape)), shape)
- dynamic_slice(X, begin, slice_sizes) -> slice(X, begin, slice_sizes)
- dynamic_update_slice(X, update, start_indices : zero)) -> update
- dynamic_update_slice(X, update : zero_extent)) -> X
- gather(X, cst_start_indices) -> slice(X, slice_start, slice_end)
- get_dimension_size(X, i) -> X.shape[i]
- get_tuple_element(tuple(X_0, X_1, ...), i) -> X_i
- imag(complex(R,I)) -> I
- iota(dim) : multi_rank
- iota(dim) : type -> constant(0) : type [if type[dim] == 1]
- max(cst, X) -> max(X, cst)
- minimum(cst, X) -> minimum(X, cst)
- multiply(cst, X) -> multiply(X, cst)
- multiply(X, 0i) -> 0i
- multiply(X, 1i) -> X
- op(X : zero_extent_tensor) -> constant([])
- or(cst, X) -> or(X, cst)
- or(X, 0) -> X
- or(X, 1) -> 1
- pad(empty_tensor, _) -> broadcast_in_dim(empty_tensor, _)
- real(complex(R,I)) -> X
- real_dynamic_slice(X, start, limit, strides)
- real_dynamic_slice(X, start, limit, strides)
- reduce[A](_, _, fn:return A) -> A...
- reduce(empty_0, empty_1, ...) -> [broadcast_in_dim(empty_i)...]
- reduce(in_1, in_2, _, _) -> reduce(in_1, _, _) [if unused(in_2)]
- reduce(X..., dims=[], add) -> X...
- reshape(reshape(X, _), [shape]) -> reshape(X, [shape])
- reshape(X, [X.shape]) -> X
- select(broadcast(not(p)), t, f) => select(broadcast(p), f, t)
- select(not(p), t, f) => select(p, f, t)
- shape_of(dynamic_reshape(X, shape)) -> shape
- slice(concat(X,Y,Z,...),...) -> concat(slice(X),slice(Y),slice(Z))
- slice(X, [A:A], [B:B], ...) -> X
- sort(X) -> sort(X, dim = N) [when dim can be inferred]
- sort(X,Y) -> sort(X) [if Y unused and unused in comparator]
- subtract(X, 0) -> X
- subtract(X, X) -> 0
- transpose(X, [iota...]) -> X
- transpose(X, [no_mem_layout_change...]) -> reshape(X)
- tuple(get_tuple_element(X, 0), get_tuple_element(X, 1), ...) -> X
- while -> while (loop invariants as implicit captures)
- xor(cst, X) -> xor(X, cst)
- (+more)
এই তালিকাটি কোড মন্তব্য থেকে টেনে নেওয়া হয়েছে তাই সম্পূর্ণরূপে সম্পূর্ণ নয়, কিন্তু আজ পাসের উচ্চ কভারেজ রয়েছে।
অপশন
-fold-op-element-limit : Folding an op into a constant can sometimes come at the cost of memory overhead. (This occurs if the op's inputs are reused, meaning that they can't be deleted after the op is folded to a constant, or when folding operations like `iota` whose outputs take up more memory than their inputs.) In such cases, this config option sets an upper limit on how many elements an op's result may have before the op is no longer folded.
-stablehlo-target-independent-optimization
ক্যানোনিকালাইজার, ফোল্ডার এবং অন্যান্য লক্ষ্য-স্বাধীন অপ্টিমাইজেশান চালায়।
StablehloAggressiveSimplificationPass এবং StablehloAggressiveFolderPass থেকে প্যাটার্নগুলি একসাথে ব্যবহার করে, একই প্যাটার্ন সেটে ক্যানোনিকালাইজেশন এবং ভাঁজ করার অনুমতি দেয়, প্রায়শই ভাল ফলাফলের দিকে পরিচালিত করে।
ব্যবহারকারীদের অন্যদের সরাসরি কল করার চেয়ে এই পাসটি পছন্দ করা উচিত।
অপশন
-assume-no-undeclared-side-effects : Allow dead code to be eliminated in some situations (e.g. dead while loops) under the assumption that ops are pure unless declared with explicit MLIR `MemoryEffects`. Notably, this means `func.call` ops will be assumed pure.
-fold-op-element-limit : Folding an op into a constant can sometimes come at the cost of memory overhead. (This occurs if the op's inputs are reused, meaning that they can't be deleted after the op is folded to a constant, or when folding operations like `iota` whose outputs take up more memory than their inputs.) In such cases, this config option sets an upper limit on how many elements an op's result may have before the op is no longer folded.
-optimize-float : Allow float optimizations that, though mathematically equivalent, may result in slightly different quantization of floating-point values (e.g. `log(sqrt(x))` -> `0.5 * log(x)`). Float optimizations that can't affect numerical results are always enabled.