podman run --rm --pull=newer --device nvidia.com/gpu=all --volume /home/kraxel/.local/share/ramalama:/models ghcr.io/ggml-org/llama.cpp:full-cuda --run -m /models/models/ollama/tinyllama:latest -p hello
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce GTX 1060 6GB, compute capability 6.1, VMM: yes
load_backend: loaded CUDA backend from /app/libggml-cuda.so
[ ... ]
load_tensors: loading model tensors, this can take a while... (mmap = true)
It finds the GPU, loads the model, then just stops.
The Linux kernel logs a segfault:
[ 2408.935610] llama-cli[3154]: segfault at 78 ip 00007f92766fe4d4 sp 00007fff1bbe0f78 error 4 in libggml-base.so[284d4,7f92766e7000+63000] likely on CPU 3 (core 3, socket 0)
[ 2408.935673] Code: 84 00 00 00 00 00 f3 0f 1e fa 66 0f ef c0 48 c7 46 20 00 00 00 00 0f 11 06 0f 11 46 10 ff 67 20 66 0f 1f 44 00 00 f3 0f 1e fa <48> 8b 47 78 c3 0f 1f 80 00 00 00 00 f3 0f 1e fa ff 67 28 66 0f 1f
The CPU is older and has no AVX vector instructions. Apparently llama.cpp uses AVX instructions without first checking whether the CPU actually supports them.
I ran into this with ramalama first (see containers/ramalama#1145), where I see the same behavior but a slightly different kernel error message.
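For illustration, here is a minimal sketch of the kind of runtime feature check that would catch this situation, assuming GCC or Clang on x86_64. This is not llama.cpp's actual dispatch code, just an example built on the compilers' __builtin_cpu_supports builtin:

```cpp
// Sketch only: guard AVX code paths behind a runtime CPUID-based check
// instead of executing AVX instructions unconditionally.
#include <cstdio>
#include <cstdlib>

int main() {
    __builtin_cpu_init();                  // populate the CPU feature table (GCC/Clang builtin)
    if (!__builtin_cpu_supports("avx")) {  // query the "avx" feature bit reported by CPUID
        std::fprintf(stderr, "CPU reports no AVX support, not running AVX kernels\n");
        return EXIT_FAILURE;               // a real implementation would fall back to scalar/SSE kernels
    }
    std::printf("AVX available\n");
    return EXIT_SUCCESS;
}
```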
Name and Version
using the latest Docker image
build: 5097 (fe5b78c) with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-cli
First Bad Commit
No response