Auto-vectorization via masked.load blocks constprop #134513
Comments
What's the original IR? Ideally this should get constant-folded before it gets vectorized.
Here's the full
What happens here is that the load of the loop counter gets load-only promoted by LICM. At that point we only have the load in the preheader, not the forwarded value from the initialization to 0, so LoopUnrollFull does not know that this is actually a loop with 6 iterations.

We've recently gained load-only promotion support in SROA, but we currently only use it for readonly calls. I believe we could also use it to SROA the case where we have unknown-offset loads, as long as all the stores are known-offset. That would allow the optimization to occur earlier, including direct forwarding of the initialization value.
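To make the access pattern concrete, here is a minimal Rust sketch of the general shape described above. It is an assumption for illustration only, not the reporter's code: the `Cursor` type, its fields, the `sum_table` function, and the table values are all hypothetical (the values are chosen only so that the sum matches the expected constant 165). The point is that the data and the counter are written at constant offsets, while `data[pos]` is read at an offset that depends on the counter.

```rust
/// Hypothetical stand-in for the kind of code that produces this pattern.
/// The data and the counter live in the same stack object.
struct Cursor {
    data: [i64; 6],
    pos: usize,
}

impl Cursor {
    /// Returns the next element, or `None` once all six have been read.
    fn next(&mut self) -> Option<i64> {
        if self.pos < self.data.len() {
            // Unknown-offset load: the address depends on `self.pos`.
            let v = self.data[self.pos];
            // Known-offset store: `pos` always lives at the same offset.
            self.pos += 1;
            Some(v)
        } else {
            None
        }
    }
}

/// Every input is a constant, so ideally this folds to a single constant.
pub fn sum_table() -> i64 {
    // Known-offset stores: the array elements and the counter's initial 0.
    let mut cursor = Cursor {
        data: [10, 20, 30, 40, 50, 15],
        pos: 0,
    };
    let mut total = 0;
    while let Some(v) = cursor.next() {
        total += v;
    }
    total
}
```

Whether this particular function actually hits the missed optimization depends on the compiler version and flags; it only illustrates why forwarding the initial `pos = 0` matters for recognizing a six-iteration loop.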
It looks like that would basically work, but there's an issue with llvm-project/llvm/lib/Transforms/Scalar/SROA.cpp, lines 5513 to 5514 in a6853cd
If we do load-only promotion, it is okay if we leave some loads alone. We only need to know all stores that affect a specific location. As such, we can handle loads with unknown offset via the "escaped read-only" code path. This is something we already support in LICM load-only promotion, but doing this in SROA is much better from a phase ordering perspective. Fixes llvm#134513.
…ds (#135609) If we do load-only promotion, it is okay if we leave some loads alone. We only need to know all stores that affect a specific location. As such, we can handle loads with unknown offset via the "escaped read-only" code path. This is something we already support in LICM load-only promotion, but doing this in SROA is much better from a phase ordering perspective. Fixes llvm/llvm-project#134513.
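The rule in the commit message can be illustrated with a toy model. This is a sketch under stated assumptions, not SROA's implementation: the `Access` type and the `can_promote_load_only` and `forward_loads` functions are hypothetical, the model covers a single stack slot with straight-line accesses only, and it ignores access sizes, partial overlaps, and control flow. It shows only the core idea: promotion is allowed when every store has a known offset, and loads with an unknown offset are simply left alone.

```rust
use std::collections::HashMap;

/// One access to a single stack slot. `offset` is `None` when the offset
/// is not a compile-time constant. (Hypothetical model type.)
#[derive(Clone, Copy, Debug, PartialEq)]
enum Access {
    Store { offset: Option<usize>, value: i64 },
    Load { offset: Option<usize> },
}

/// Toy legality check: every store must have a known offset.
/// Loads never block promotion, whatever their offset.
fn can_promote_load_only(accesses: &[Access]) -> bool {
    accesses.iter().all(|a| match a {
        Access::Store { offset, .. } => offset.is_some(),
        Access::Load { .. } => true,
    })
}

/// For each load, in program order, returns `Some(value)` if the load can
/// be replaced by a previously stored value, or `None` if it is left
/// alone. Returns `None` overall if promotion is not legal in this model.
fn forward_loads(accesses: &[Access]) -> Option<Vec<Option<i64>>> {
    if !can_promote_load_only(accesses) {
        return None;
    }
    let mut last_store: HashMap<usize, i64> = HashMap::new();
    let mut results = Vec::new();
    for access in accesses {
        match *access {
            Access::Store { offset: Some(off), value } => {
                last_store.insert(off, value);
            }
            // Ruled out by the legality check above.
            Access::Store { offset: None, .. } => unreachable!(),
            // Known-offset load: forward the most recent store, if any.
            Access::Load { offset: Some(off) } => {
                results.push(last_store.get(&off).copied());
            }
            // Unknown-offset load: left in place, not forwarded.
            Access::Load { offset: None } => results.push(None),
        }
    }
    Some(results)
}
```

In this model, a store of 0 at offset 0 followed by a load at offset 0 and a load at an unknown offset yields `Some(vec![Some(0), None])`: the known-offset load receives the initialization value and the other load is untouched.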
I was writing some code in Rust and ended up with the following IR. Even though everything's a constant, so it should just be
ret i64 165
the masked loads from autovectorization on -Ctarget-cpu=x86-64-v3 kept that from happening:

It looks like trunk can't optimize that to a constant either: https://llvm.godbolt.org/z/z6MKz6cz1

(Trunk at least doesn't need the store-load of the vector constant, but it still doesn't const-prop the stores and the masked.load.)