Add transposed support to the rowwise-fp8 sparse CUTLASS kernel. This kernel currently assumes that the weight is 2:4 sparse. Because 2:4 sparsity is only supported for the first operand, I'm using the identity $xW^T = (Wx^T)^T$ to reuse the kernel for activation sparsity, but this means the kernel's output is in column-major format instead of row-major.
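To illustrate why the transpose trick changes the output layout, here is a minimal pure-Python sketch. It does not call the CUTLASS kernel or any TorchAO API; the helper names (`matmul`, `transpose`, `flatten_row_major`, `flatten_col_major`) and the tiny matrices are assumptions made up for this example. It only shows that a buffer holding $Wx^T$ in row-major order contains the same values, in the same order, as $xW^T$ stored column-major.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def flatten_row_major(m):
    """Lay the matrix out row by row, as a row-major buffer would."""
    return [v for row in m for v in row]


def flatten_col_major(m):
    """Lay the matrix out column by column, as a col-major buffer would."""
    return [v for col in zip(*m) for v in col]


# Illustrative activations (2 x 4); each group of 4 values has 2 zeros (2:4 pattern).
x = [[1, 0, 0, 2],
     [0, 3, 4, 0]]

# Illustrative dense weights (3 x 4).
W = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

desired = matmul(x, transpose(W))   # x W^T, shape 2 x 3
swapped = matmul(W, transpose(x))   # W x^T, shape 3 x 2

# The identity from the description: x W^T == (W x^T)^T.
assert transpose(swapped) == desired

# Layout consequence: the row-major buffer of W x^T is exactly
# the column-major buffer of x W^T.
assert flatten_row_major(swapped) == flatten_col_major(desired)
```

In other words, no data movement is needed to get the values of $xW^T$ out of the swapped product, but any consumer that expects a row-major result has to either reinterpret the strides or perform an explicit transpose copy.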
This is a tracker issue for all the different ways we can accelerate training / inference with activation sparsity in TorchAO.
Inference
Training