Skip to content

[Unhoisted] [Binary] [Should fallthrough] load & store order affects the constant propagation in binaryOp #7455

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
xuruiyang2002 opened this issue Apr 5, 2025 · 3 comments · Fixed by #7480

Comments

@xuruiyang2002
Copy link
Contributor

xuruiyang2002 commented Apr 5, 2025

Given the following code:

(module
  (type $0 (func))
  (type $1 (func (param i32 i32) (result i32)))
  (type $2 (func (result i32)))
  (type $3 (func (param i32) (result i32)))
  (import "External" "external_function" (func $external_function (type $0)))
  (func $_start (type $3) (param $0 i32) (result i32)
    (local $1 i32) (local $2 i32)
    i32.const 0
    i32.load
    local.set $2
    local.get $1
    i32.const 0
    call $foo
    local.set $1
    local.get $1
    local.get $2
    i32.gt_u
    if (result i32)  ;; label = @1
      call $external_function
      i32.const 0
    else
      i32.const 1
    end)
  (func $foo (type $1) (param $0 i32) (param $1 i32) (result i32)
    i32.const 0
    i32.const 0
    i32.store
    i32.const 0)
  (memory $0 258 258)
  (export "_start" (func $_start)))

The wasm-opt (16dbac1) can eliminate the dead true branch by -all -O2 but cannot by -all -O3.

Analysis

O2 produces the following if statement before inlining-optimizing:

(i32.gt_u
(call $foo) ==> return const 0, and it has side effect
(local.get $0) ;; ==> which is (i32.load $0 (i32.const 0) )
)

After inline, the condition is deduced to be zero, thus the true branch is deleted (call $external_function is in the true branch). (Of course it preserves the side-effect statements in the branch)

O3 produces the following if statement before inlining-optimizing:

(i32.lt_u
(i32.load $0 (i32.const 0) )
(call $foo)
)

After inline, the condition is transformed to

(i32.lt_u
 (i32.load $0 (i32.const 0) )
 (block (result i32)
  (i32.store $0
   (i32.const 0)
   (i32.const 0)
  )
  (i32.const 0)
 )
)

As you can see, the condition is almost the same as O2 produced before inlining-optimizing. However, wasm-opt failed to deduced the condition to be zero after inline -- It should be.

The only difference is the O2 uses gt_u and the O3 uses lt_u. Maybe the reorder of the comparison operands affects the optimization? So weird. (I am not sure about that).

Overall, I think there is a missed optimization in the updateAfterInlining -- wasm-opt may not handle constant propagation properly after inlining. (Similar to the #7454 )

@kripken
Copy link
Member

kripken commented Apr 7, 2025

Simplified:

(module
 (import "External" "external_function" (func $external_function))
 (memory $0 258 258)
 (export "_start" (func $_start))
 (func $_start (param $0 i32)
  (if
   (i32.lt_u
    (i32.load
     (i32.const 0)
    )
    (block (result i32)
     (i32.store
      (i32.const 0)
      (i32.const 0)
     )
     (i32.const 0)
    )
   )
   (then
    (call $external_function)
   )
  )
 )
)

This code is normally optimized by merge-blocks hoisting the store out of the if, after which we see the constant zero after it. However, the load and the store cannot be reordered, so we fail to do so, leaving the store where it is.

OptimizeInstructions could handle this even with the store in place, but we would need to use Properties::getFallthrough to see that 0 is the fallthrough value, and if we optimize, to also use getDroppedChildrenAndAppend to keep the store around.

@kripken kripken changed the title O2 removes dead code after inlining while O3 fails despite similar input structures O2 removes dead code after inlining while O3 fails despite similar input structures [OptimizeInstructions should use fallthrough] Apr 7, 2025
@xuruiyang2002
Copy link
Contributor Author

xuruiyang2002 commented Apr 9, 2025

Got it, that is the limitation of the merge-block. The solution you provided appears to implement the hoisting functionality (original should be carry by merge-block) in the OptimizeInstructions. Is this what you mean? If so, what concerns me is that to what extent can we fix this in generally (or just add a check rule)? Because I see there is a comment TODO left previously:

// Optimize given that the expression is flowing into a boolean context
Expression* optimizeBoolean(Expression* boolean) {
// TODO use a general getFallthroughs

@kripken
Copy link
Member

kripken commented Apr 9, 2025

Yes, I think that might be the relevant TODO. We need to look at the fallthrough, and then keep dropped children as needed.

xuruiyang2002 added a commit to xuruiyang2002/binaryen that referenced this issue Apr 10, 2025
@xuruiyang2002 xuruiyang2002 changed the title O2 removes dead code after inlining while O3 fails despite similar input structures [OptimizeInstructions should use fallthrough] [Unhoisted] [Binary] [Should fallthrough] load & store order affects the constant propagation in binaryOp Apr 14, 2025
xuruiyang2002 added a commit to xuruiyang2002/binaryen that referenced this issue Apr 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants