Problems when I try to use this inside the default Python 3.10 Docker container #70
Comments
Do you mind posting the Dockerfile? I haven't tried to use this inside a container but I'll investigate.
Alright, I can't send you the Dockerfile, but I created a toy example with your own server. Dockerfile:
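(A minimal sketch of such a toy-example Dockerfile, not the original file; the base image, model path and the way the server is started are assumptions:)

```dockerfile
FROM python:3.10

# Install the package together with its optional REST-server dependencies.
RUN pip install --no-cache-dir "llama-cpp-python[server]"

# The model file is expected to be mounted into the container at runtime;
# the path is a placeholder.
ENV MODEL=/models/ggml-model.bin

EXPOSE 8000

# Start the bundled OpenAI-compatible REST server.
CMD ["python3", "-m", "llama_cpp.server"]
```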
I then started the container via this docker-compose file:
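(Again only a sketch, not the original compose file; the service name, port mapping and model volume are placeholders:)

```yaml
version: "3"
services:
  llama-server:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - ./models:/models
    environment:
      - MODEL=/models/ggml-model.bin
```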
When running this, the 'GLIBCXX_3.4.29' error is gone, but the model loads for a very long time and the container sometimes gets OOM killed. I'm running on 64 GB, so that's very strange. The other thing I noticed is that the container fails to bind the address, but that's just Docker IPv4 shenanigans. I will try to use llama-cpp-python[server] as a dependency in my other project and see if it gets rid of the 'GLIBCXX_3.4.29' error. Oh, and just for completeness, I was using this model.
So the GLIBC error is fixed now? I would also make sure the OOM issue isn't some Docker default limit.
That's the strange thing: the Dockerfile listed above works without any problems. But if I try to run my Dockerfile I get the "GLIBC" error. This is my Dockerfile:
and it's using the following requirements file:
My guess is that one of the dependencies in the requirements does some weird stuff, but I actually have no clue where to start looking.
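(One place to start looking is whether the base image's libstdc++ actually provides the symbol version the compiled library wants; a rough check, using the paths from the reported error and assuming binutils is installed in the container:)

```bash
# List the GLIBCXX versions the image's libstdc++ actually provides ...
strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX

# ... and the versions the bundled shared library requires.
objdump -T /usr/local/lib/python3.10/site-packages/llama_cpp/libllama.so | grep GLIBCXX
```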
I run within an Ubuntu container, which works.
Took the opportunity to shrink my own Dockerfile:
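(A sketch of what a slimmed-down Ubuntu-based Dockerfile along those lines could look like; the Ubuntu version and package list are assumptions, not the actual file:)

```dockerfile
FROM ubuntu:22.04

# Build tools are needed because pip compiles llama.cpp from source.
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 python3-pip python3-dev build-essential cmake \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
# requirements.txt is assumed to contain llama-cpp-python[server].
RUN pip3 install --no-cache-dir -r requirements.txt

CMD ["python3", "-m", "llama_cpp.server"]
```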
Works without a hitch with requirements.txt.
I also played around a bit more and I couldn't get the container working, and I don't know why. All the other containers I build with llama-cpp-python work without any problems. I will probably host a separate llama-cpp-python container and then use it via its REST API. @abetlen Could we maybe get an official prebuilt Docker container for the REST server?
Yeah, if someone wants to open a PR I can test/review it when I have some time.
I could try later. But I noticed another problem: when building my containers via GitHub Actions and then trying to run them locally, the containers exit with exit code 132, hinting at an unsupported CPU instruction set. If built and run locally they work without any problems. My guess is that the llama.cpp dependencies are resolved in the build step of the container with the CPU feature flags of the GitHub Actions runner. Can I define specific feature flags for the llama.cpp compilation at setup time, or should I look into first building llama.cpp manually and then setting it via the environment?
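(If the package forwards CMAKE_ARGS to the llama.cpp CMake build, as its install documentation describes for other backends, pinning the feature flags at setup time might look roughly like this; the LLAMA_* option names come from llama.cpp's CMakeLists and may differ between versions:)

```bash
# Sketch: build the wheel with explicit SIMD options instead of whatever
# the build host's CPU happens to advertise at compile time.
CMAKE_ARGS="-DLLAMA_AVX=on -DLLAMA_AVX2=on -DLLAMA_FMA=on -DLLAMA_F16C=on" \
FORCE_CMAKE=1 \
pip install --no-cache-dir llama-cpp-python
```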
@LLukas22 that's definitely it, should check what llama.cpp does for this with their Docker containers.
@abetlen Yeah, it's pretty weird. I played around a bit but can't get it to work, even if I use QEMU to force a linux/amd64 platform while building the image on a GitHub Actions runner. And I'm actually really confused about what instruction set is missing, as I'm using relatively new processors (AMD Zen 2 & 3), so AVX and AVX2 are definitely there. But I'm also no Docker pro, so maybe I'm just missing something.
@LLukas22 is QEMU sufficient to emulate processor-specific instruction sets? I'd imagine it's because of a mismatch between processor architectures in the GitHub runner and your local machine.
@abetlen Theoretically QEMU should be able to emulate any CPU feature, but I don't know how it is implemented in docker build.
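(For reference, a cross-build along those lines; the image tag is a placeholder:)

```bash
# Cross-building with QEMU/buildx pins the target architecture, but the
# SIMD features detected at compile time come from the (emulated) build-time
# CPU, not from the machine the image is later run on.
docker buildx build --platform linux/amd64 -t llama-cpp-server .
```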
@LLukas22 https://github.com/ggerganov/llama.cpp/blob/master/.github/workflows/docker.yml describes how they build the llama.cpp Docker image, not sure if you've checked that out.
@abetlen Yup, I had a look and tried their image; it runs on my Intel machine but fails with exit code 132 on both my AMD-based systems. I will create a new issue in the llama.cpp repository.
These old build issues may be relevant, e.g. "Can it support avx cpu's older than 10 years old". EDIT: Is there a chance that the Docker image is configured for one type of h/w (e.g. Intel) and then fails when moved to another (e.g. AMD)?
@gjmulder Well, the images are built on a GitHub Actions runner, which probably uses a virtualized Intel CPU. I then downloaded them and tried to run them on my AMD systems, which leads to the errors. The thing that confuses me is that all the systems I used support all the CPU features listed in the CMake file, namely AVX, AVX2, FMA and F16C. I don't quite get why an image built on one of these systems shouldn't work when moved to another system supporting the exact same instruction sets.
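(A quick way to verify that claim on each machine, sketched with the flag names mentioned above:)

```bash
# Run on both the build host and the target machine and compare the output.
grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | grep -E '^(avx|avx2|fma|f16c)$'
```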
I'm guessing here, as I haven't had time to repro, but will do so in the next few days: Maybe do a |
@gjmulder Alright, I ran it locally and on the GitHub Actions runner. Here are the results: Actions Runner:
Local:
Maybe the problem is caused by the |
@LLukas22 this might be relevant: ggml-org/llama.cpp#809
@gjmulder Yes, this is probably related. Do you know if it is possible to somehow change which feature sets are used in the compilation of llama.cpp via environment variables? Then I could just set them in the Dockerfile and use the normal setup from this repo. If not, I probably have to build it manually and then copy the binary around.
@LLukas22 I seem to remember that the This might help. Maybe you can override the EDIT: GPT-4 to the rescue?
This line will keep the -march=native option but will disable SSE3, AVX, and AVX2 while enabling SSE and SSE2. Again, note that forcibly enabling SIMD instruction sets that your CPU doesn't support may cause your program to crash or produce incorrect results. Make sure you understand the implications of enabling or disabling these macros before making any changes.
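(The suggested snippet itself isn't preserved above; purely as an illustration, a compiler-flag line matching that description could look like the following, with the exact flags being an assumption:)

```bash
# Keep -march=native, then explicitly disable SSE3/AVX/AVX2 and keep SSE/SSE2.
# GCC's -mno-* switches also undefine the corresponding feature macros.
export CFLAGS="-march=native -mno-sse3 -mno-avx -mno-avx2 -msse -msse2"
```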
@gjmulder Hm, this could actually work. I'm gonna try it later and I will just try to disable
@LLukas22 You could just query
@gjmulder Alright, I had another look, setting the
@abetlen Alright, I now have an action that builds me a
My Dockerfile now looks like this and is available in this fork:
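(The actual Dockerfile lives in the fork; as a rough sketch of the general pattern only, with paths and file names as assumptions:)

```dockerfile
FROM python:3.10

RUN pip install --no-cache-dir "llama-cpp-python[server]"

# libllama.so stands for the shared library produced by the build action;
# it replaces the one compiled during pip install.
COPY libllama.so /usr/local/lib/python3.10/site-packages/llama_cpp/libllama.so

CMD ["python3", "-m", "llama_cpp.server"]
```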
This is the binary I used: llama.zip. I'm at my wits' end here; I'm probably just going to build my containers on the machines I will run them on.
Here to help you debug @LLukas22 😄 EDIT: Full clone and build of your repo:
@gjmulder Alright, that's kind of my bad, the
There is no need to build or recursively check out the repo. To reproduce the issue I'm experiencing, just clone the fork and run a docker build and run on the Dockerfile. The build should work without any errors, and when trying to run the container the
The build process of the
When I try to install and use this package via a requirements file in the default Python 3.10 container, I get the following error when I try to import the module:
Failed to load shared library '/usr/local/lib/python3.10/site-packages/llama_cpp/libllama.so': /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version 'GLIBCXX_3.4.29' not found (required by /usr/local/lib/python3.10/site-packages/llama_cpp/libllama.so)
Am I doing something wrong? Or am I just missing some dependencies?
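(A minimal sketch of the reported setup, for anyone trying to reproduce; the actual requirements file isn't shown here, so treat the details as assumptions:)

```dockerfile
FROM python:3.10

COPY requirements.txt .
# requirements.txt is assumed to include llama-cpp-python.
RUN pip install --no-cache-dir -r requirements.txt

# This import is where the GLIBCXX_3.4.29 error was reported.
RUN python -c "import llama_cpp"
```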