ggml : add SSE 4.2 and x64 base variant for CPUs without AVX #12871

Merged
slaren merged 2 commits into master from sl/no-avx-variant on Apr 21, 2025

slaren (Member) commented Apr 10, 2025

Fixes #12866

github-actions bot added the "ggml" label (changes relating to the ggml tensor library for machine learning) on Apr 10, 2025
slaren changed the title from "ggml : add SSE 4.2 variant for CPUs without AVX" to "ggml : add SSE 4.2 and x64 base variant for CPUs without AVX" on Apr 10, 2025
corsairius commented

How long does it usually take for a bug fix to become certified? Is it necessary to pass all of the above tests to get it into the review queue?

Thanks!

slaren requested a review from ggerganov on Apr 21, 2025, 16:09
slaren merged commit 1d735c0 into master on Apr 21, 2025
50 of 51 checks passed
slaren deleted the sl/no-avx-variant branch on Apr 21, 2025, 16:13

acbits commented Apr 21, 2025

Just noticed this fix.

I am curious: why can't the user fix this issue by setting CFLAGS to -msse4.2 -mno-avx?
Is there a reason to add more code to CMake, given that it is a maintenance burden?

P.S. An even better option is to set -march=<cpu arch> and -mtune=<cpu>, as that enables the exact set of flags required for that architecture.
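
For readers weighing the question above, it may help to see what such flags actually decide. The following is a minimal sketch, assuming GCC or Clang on x86: flags like -msse4.2 -mno-avx fix the instruction set when the binary is compiled, and the compiler exposes that choice through predefined macros, while the features of the machine that later runs the binary can only be queried at run time. The helper names compiled_simd_level and host_supports_avx are hypothetical and are not part of ggml or llama.cpp.

```c
#include <stdio.h>

/* Hypothetical helper (not part of ggml): reports the SIMD level this
 * translation unit was compiled for, using the compiler's predefined
 * macros. The answer is fixed at build time by flags such as
 * -msse4.2, -mno-avx, or -march=<cpu arch>. */
static const char *compiled_simd_level(void) {
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#else
    return "base";
#endif
}

/* Hypothetical helper: asks the CPU that is running the binary whether it
 * supports AVX. Returns 1 or 0 on x86 with GCC/Clang, and -1 when the
 * builtin is not available (assumption: other compilers or targets). */
static int host_supports_avx(void) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") ? 1 : 0;
#else
    return -1;
#endif
}

/* Prints both views side by side so a mismatch is easy to spot. */
static void print_simd_report(void) {
    printf("compiled for: %s\n", compiled_simd_level());
    printf("host has AVX: %d\n", host_supports_avx());
}
```

Calling print_simd_report() from a program built with -msse4.2 -mno-avx reports SSE4.2 as the compiled level on any host, because the macros reflect the build flags rather than the machine. A binary built that way suits one class of CPU, whereas a distributed build has to cope with hosts whose features are only known at run time.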

colout pushed a commit to colout/llama.cpp that referenced this pull request Apr 21, 2025
…g#12871)

* ggml : add SSE 4.2 variant for CPUs without AVX

* ggml : add x64 base ABI variant

slaren (Member, Author) commented Apr 21, 2025

@acbits see #10606 and #10626 for more details.

pockers21 pushed a commit to pockers21/llama.cpp that referenced this pull request Apr 28, 2025
…g#12871)

* ggml : add SSE 4.2 variant for CPUs without AVX

* ggml : add x64 base ABI variant
Development

Successfully merging this pull request may close these issues.

Misc. bug: llama fails to run on older x86 hardware.