-
Notifications
You must be signed in to change notification settings - Fork 11.6k
sycl : Fixes broken build and test-backend-ops #10257
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
06cb3c6
to
f6ea8b7
Compare
could you cherry-pick the |
Added the changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The oneMKL changes look good to me.
these commits negatively affect intel gpus. Is this expected ? For example :
After:
Reverting over master and performance returns |
I'm currently investigating another performance regression in the SYCL backend, though this seems a separate issue. Reverting the change would break the build for non-intel backends, but we also want to avoid this performance loss. Will look into it. Edit: Sorry for the inconvenience |
The issues is caused by the mul_mats marked as unsupported. I've found out a new issue for a specific test case:
The following is a temporary patch that fixes the regression and makes test-backend-ops not crash. I'm still investigating the test-case from above. diff --git a/ggml/src/ggml-sycl/ggml-sycl.cpp b/ggml/src/ggml-sycl/ggml-sycl.cpp
index 2dba15d2..72a94a50 100644
--- a/ggml/src/ggml-sycl/ggml-sycl.cpp
+++ b/ggml/src/ggml-sycl/ggml-sycl.cpp
@@ -4350,9 +4350,10 @@ static bool ggml_backend_sycl_device_supports_op(ggml_backend_dev_t dev, const g
if (op->op == GGML_OP_MUL_MAT) {
a = op->src[0];
b = op->src[1];
- if (ggml_is_permuted(a) || ggml_is_permuted(b)) {
+ if (ggml_is_permuted(a)) {
// TODO: fix like https://github.com/ggerganov/llama.cpp/pull/10021
- return false;
+ if (a->nb[0] <= a->nb[1] && a->nb[3] <= a->nb[2]) return false; // 0,1,3,2 Unsupported
+ if (b->type != GGML_TYPE_F32) return false;
}
} else {
a = op->src[2]; |
* Fixes broken build for the SYCL CUDA backend caused by non-explicit gemm call in outprod (merged in with RWKV6 in Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration ggml-org#10133) * Marks permuted MUL_MAT as unsupported to be able to run test-backend-ops * Fixes asserts in norm to fix debug builds.
@Alcpz How do you think? Thank you! |
Yes, let's revert until we get a proper fix for the test-backend-ops. Edit: I can't automatically do it. I will submit a new PR reverting just the problematic changes. CI is gonna fail for SYCL though, so we may need to have a convesation there. |
Tests confirmed passing in Nvidia A100 and Intel Data Center GPU Max 1100