Structured code for the stack machine #753
@JSStats I think that with I agree that extra sequence of
call give_me_a_b_c_d [ a b c d ]
block [ a b c d ]
take 3 [ _ b c d a ]
take 3 [ _ _ c d a b ]
i32.const 42 [ _ _ c d a b 42 ]
i32.add [ _ _ c d a b+42 ]
take 3 [ _ _ _ d a b+42 c ]
take 3 [ _ _ _ _ a b+42 c d ]
end [ a b` c d ]
=>
const (i32 $a, i32 $b, i32 $c, i32 $d) = give_me_a_b_c_d();
{ // loop
$b = $b + 42;
}
The actual order of stack elements is not reflected in the structural code and is potentially ambiguous, unless the structural code is SSA, in which case we might need an equivalent of a Φ function. |
@JSStats
call give_me_a_b_c_d [ a b c d ]
block [ a b c d ]
pick 3 [ a b c d a ]
swap [ a b c a d ]
i32.sub [ a b c a-d ]
end [ a b c d` ]
call give_me_a_b_c_d [ a b c d ]
block [ a b c d ]
take 3 [ _ b c d a ]
take 3 [ _ _ c d a b ]
take 3 [ _ _ _ d a b c ]
pick 2 [ _ _ _ d a b c a ]
take 4 [ _ _ _ _ a b c a d ]
i32.sub [ _ _ _ _ a b c a-d ]
end [ a b c d` ]
=>
const (i32 $a, i32 $b, i32 $c, i32 $d) = give_me_a_b_c_d();
{
$d = $a - $d;
} |
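The take and pick traces above can be replayed mechanically. The following is a minimal sketch in Python, assuming that take N moves the value N slots below the top to the top and leaves a void slot (shown as _) behind, and that pick N copies it instead. The names run, live, VOID, ADD_42 and A_MINUS_D are hypothetical and used only for illustration; values are kept symbolic so the result can be compared with the bracketed stack comments.

```python
VOID = "_"  # marks a slot whose value has been moved away by take


def run(program, stack):
    """Symbolically execute (op, arg) pairs and return the final stack."""
    stack = list(stack)
    for op, arg in program:
        if op in ("take", "pick"):
            idx = len(stack) - 1 - arg  # depth 0 is the top of the stack
            assert stack[idx] != VOID, "cannot reference a void slot"
            stack.append(stack[idx])
            if op == "take":
                stack[idx] = VOID  # take moves the value, pick copies it
        elif op == "const":
            stack.append(str(arg))
        elif op in ("add", "sub"):
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + ("+" if op == "add" else "-") + rhs)
        else:
            raise ValueError(f"unknown operator: {op}")
    return stack


def live(stack):
    """Drop the void slots, as the end of a block is assumed to do."""
    return [value for value in stack if value != VOID]


# The two block bodies from the traces above.
ADD_42 = [("take", 3), ("take", 3), ("const", 42), ("add", None),
          ("take", 3), ("take", 3)]
A_MINUS_D = [("take", 3), ("take", 3), ("take", 3), ("pick", 2),
             ("take", 4), ("sub", None)]

print(live(run(ADD_42, "abcd")))     # ['a', 'b+42', 'c', 'd']
print(live(run(A_MINUS_D, "abcd")))  # ['a', 'b', 'c', 'a-d']
```

Under these assumptions both blocks end with four live values, which matches the stack shape at block entry.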
@drom Not sure
|
Some examples using
|
@JSStats
I introduced the block to illustrate a case where it is needed (like a loop) and where the stack balance needs to be preserved. |
@drom A key requirement is that excess values left on the stack in a block are implicitly discarded. This is a requirement that the current wasm design has been moving away from, and doing so is clearly not a tenable plan. For a function SSA style loop, here is a suggestion. Using this same approach for the function entry, the function arguments would be within the function's implicit block, and this might be implemented by block stack adjustments, so they could be dropped etc. A tail recursive call would expect the argument values to be on the stack and would discard all other block values. That would seem to leave no need for the local variables in wasm. Perhaps they could be retained to help some producers. |
@JSStats do you envision the structural code text to be SSA-like? |
@drom I think a functional SSA text format would be a good start. It might also help to add some lexical scoping of definitions to communicate the live range of definitions to the reader. If the wasm encoding does not support an expression-based language well, then I think this is where the text format will end up, because this is what developers would want to debug. |
@JSStats any idea about syntax for the function declaration with multi-returns? |
@drom There is already provision for declaring multiple return values in the binary encoding and in the s-exp format, so it's just a matter of some bike-shedding to choose a text format, and there are other issues discussing the structure of the text format. |
@JSStats IMHO on the topic: The answer will mostly depend on the following aspects:
|
One more successful example of structured code translation:
;; Iterative factorial named
(func $fac-iter (param $n i64) (result i64)
  (local $i i64)
  (local $res i64)
  (set_local $i (get_local $n))
  (set_local $res (i64.const 1))
  (loop $done $loop
    (if
      (i64.eq (get_local $i) (i64.const 0))
      (br $done)
      (block
        (set_local $res (i64.mul (get_local $i) (get_local $res)))
        (set_local $i (i64.sub (get_local $i) (i64.const 1)))
      )
    )
    (br $loop)
  )
  (get_local $res)
)
The same code in a structured language:
function fac-iter ($n:i64) : (i64) {
  i64 $i = $n;
  i64 $res = 1;
  loop {
    if ($i == 0) {
      break;
    } else {
      $res = $i * $res;
      $i = $i - 1;
    }
  }
  $res;
}
Direct translation:
function fac-iter // i64:$n
take 0 // _ i64:$i <-- dumb but consistent with the source
i64.const 1 // _ i64:$i i64:$res
begin // i64:$i i64:$res
pick 1 // i64:$i i64:$res i64:$i
i64.const 0 // i64:$i i64:$res i64:$i i64
i64.eq // i64:$i i64:$res i64
if // i64:$i i64:$res
break
else // i64:$i i64:$res
pick 1 // i64:$i i64:$res i64:$i
take 1 // i64:$i _ i64:$i i64:$res
i64.mul // i64:$i _ i64:$res
take 2 // _ _ i64:$res i64:$i
i64.const 1 // _ _ i64:$res i64:$i i64
i64.sub // _ _ i64:$res i64:$i
take 1 // _ _ _ i64:$i i64:$res <-- automatic cleanup
endif // i64:$i i64:$res
again
take 1 // _ i64:$res i64:$i
drop // _ i64:$res
} // i64:$res
Direct translation with swap:
function fac-iter // i64:$n
take 0 // _ i64:$i
i64.const 1 // _ i64:$i i64:$res
begin // i64:$i i64:$res
pick 1 // i64:$i i64:$res i64:$i
i64.const 0 // i64:$i i64:$res i64:$i i64
i64.eq // i64:$i i64:$res i64
if // i64:$i i64:$res
break
else // i64:$i i64:$res
pick 1 // i64:$i i64:$res i64:$i
swap // i64:$i i64:$i i64:$res
i64.mul // i64:$i i64:$res
swap // i64:$res i64:$i
i64.const 1 // i64:$res i64:$i i64
i64.sub // i64:$res i64:$i
swap // i64:$i i64:$res
endif // i64:$i i64:$res
again
swap // i64:$res i64:$i
drop // i64:$res
} // i64:$res |
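To check the stack balance of the swap-based listing, here is a minimal sketch in Python that performs the same operator sequence on a plain list. The helper names pick, swap, binop and fac_iter are hypothetical. It assumes n is non-negative, uses Python integers (so i64 wraparound is not modelled), and treats the initial take 0 as a simple rename of the argument to $i. Note that the else branch ends with a swap so that both branches of the if leave the stack in the order $i $res, which is the order the loop header and the final swap and drop expect.

```python
def pick(stack, depth):
    """Copy the value `depth` slots below the top onto the top."""
    stack.append(stack[-1 - depth])


def swap(stack):
    """Exchange the two topmost values."""
    stack[-1], stack[-2] = stack[-2], stack[-1]


def binop(stack, fn):
    """Pop rhs then lhs, push fn(lhs, rhs)."""
    rhs = stack.pop()
    lhs = stack.pop()
    stack.append(fn(lhs, rhs))


def fac_iter(n):
    stack = [n]                                # $i (argument reused as $i)
    stack.append(1)                            # $i $res
    while True:                                # begin
        height = len(stack)
        pick(stack, 1)                         # $i $res $i
        stack.append(0)                        # $i $res $i 0
        binop(stack, lambda a, b: int(a == b)) # $i $res cond
        if stack.pop():                        # if
            break                              # break
        pick(stack, 1)                         # $i $res $i
        swap(stack)                            # $i $i $res
        binop(stack, lambda a, b: a * b)       # $i $res
        swap(stack)                            # $res $i
        stack.append(1)                        # $res $i 1
        binop(stack, lambda a, b: a - b)       # $res $i
        swap(stack)                            # $i $res
        assert len(stack) == height            # again: stack is balanced
    swap(stack)                                # $res $i
    stack.pop()                                # drop
    assert len(stack) == 1                     # only the result remains
    return stack[0]


print(fac_iter(5))  # 120
```

The assertion at the back edge is the point of the exercise: every iteration must return to the loop header with the same stack height and ordering it started with.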
@drom Cool. Here's another suggestion for
|
@JSStats I don't understand how come Also why you have moved Why |
@drom It is just a suggestion, the If the branch at the end of the |
I've written up a bit about the stack machine here: https://docs.google.com/document/d/1CieRxPy3Fp62LQdtWfhymikb_veZI7S9MnuCZw7biII/edit?usp=sharing |
Document options and issues addressing the use case of a structured language uniquely representing code in a constrained stack machine. Wasm has been pivoting from an AST to a stack machine, but the goal of having a familiar structured text format has not been conceded. This is an attempt to reconcile the two, if that is possible, or to understand any show stoppers.
Here is one proposal for a single pass stack machine to text decoder:
pick operator is always an edge of an expression, and any node referenced by a pick operator is emitted as a lexical constant. If the node referenced is part of the pending expression then it creates a barrier at the referenced node, and this node and those before it are emitted, assigning their results to lexical constants.
drop operators are not assigned a constant name. If there is only one value and it is immediately discarded then this is presented as a statement in the text format.
drop operators not immediately following the producing operator are presented as an explicit drop() operation in the text format. There remains the challenge of text format code that explicitly consumes a value immediately after it is generated on the stack, conflicting with the above rule, and this could either be a syntax error (perhaps quietly canonicalized to be valid), or could generate an extra redundant pick operation to separate them.
Some examples:
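As a rough illustration of the decoder rules (a sketch, not one of the original examples), the following Python function decodes a tiny stack program in a single pass. Everything here is an assumption made for the illustration: the function name decode, the $tN naming scheme, the (op, arg) tuple encoding, and the restriction to const, add, sub, mul, pick and drop. It also assumes that a drop flushes any pending values below it to constants so that the emitted text keeps the original evaluation order.

```python
BINOPS = {"add": "+", "sub": "-", "mul": "*"}


def decode(program):
    """Return (emitted statements, texts of the values left on the stack)."""
    lines = []   # emitted text-format statements
    stack = []   # entries are [text, is_named]
    counter = 0  # next free constant number

    def flush(upto):
        """Emit every pending expression in stack[:upto] as a constant."""
        nonlocal counter
        for entry in stack[:upto]:
            if not entry[1]:
                name = f"$t{counter}"
                counter += 1
                lines.append(f"const {name} = {entry[0]};")
                entry[0], entry[1] = name, True

    for op, *args in program:
        if op == "const":
            stack.append([str(args[0]), False])
        elif op in BINOPS:
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append([f"({lhs[0]} {BINOPS[op]} {rhs[0]})", False])
        elif op == "pick":
            # Barrier: the referenced node and those before it get names.
            idx = len(stack) - 1 - args[0]
            flush(idx + 1)
            stack.append([stack[idx][0], True])
        elif op == "drop":
            text, is_named = stack.pop()
            flush(len(stack))
            # An immediately discarded value becomes a plain statement;
            # a value produced earlier is dropped explicitly by name.
            lines.append(f"drop({text});" if is_named else f"{text};")
        else:
            raise ValueError(f"unknown operator: {op}")
    return lines, [entry[0] for entry in stack]


square = [("const", 2), ("const", 3), ("add",), ("pick", 0), ("mul",)]
print(decode(square))
# (['const $t0 = (2 + 3);'], ['($t0 * $t0)'])
```

In this sketch a value referenced by pick is the only thing that ever receives a name in straight-line expression code, so stack code that never uses pick or drop decodes to a single nested expression.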
Note that this design leaves non-void values on the stack at the end of the block that would need to be implicitly discarded. This might be good for coding efficiency, but if optimizing for simple baseline compilers then a pick_and_void operator could be used on the last use of a stack element along a control flow path, and this would leave all stack elements void at the end of the block except for the block result values and this could be validated to enforce this pattern.
Discussion and other possible strategies:
pick operator which is a special case of the general re-ordering and duplication above. Validation to canonical patterns would keep the encoding choices out of the structure format. For example dup might need to be used rather than pick 0.
These uses can be expressed in a familiar expression:values text expression. For example:
pick operator addresses this.
pick operation in the stack machine code.
pick operation, even if it had other uses. The text decoder will have to keep track of barriers to forming expressions to split these into either being folded into an expression or a lexical constant reference.