Breaking Changes #9276
-
I'm open to suggestions on how to improve the notification of breaking changes. We generally try to update the README when there are updates to the C-style API. Maybe we can start updating it for server-related changes too.
I don't think these options have any advantage over watching the repo. Please, correct me if I'm wrong.
For semantic versioning to be effective, we would need to start writing detailed release notes. We can try to do that, though I think that the existing commit messages already provide 99% of the information. How would semantic versioning have helped in this case?
cc @ngxson for further clarification of why this change is needed. AFAIU, without it, there is no way to efficiently check the server health, because the requests could time out during the processing of bigger batches.
-
My genuine question is: does llama.cpp aim to be used in prod? I mean, is that a project goal at all, or is llama.cpp supposed to be a tool for running open source LLMs locally? Because without any guarantees of backwards compatibility, it would be really hard to build anything on top of llama.cpp. cc @ggerganov
-
Just to state the obvious, using the
-
Just want to add to the discussion: a while ago @Vaibhavs10 and I had a small, informal discussion about having a "stable channel" of the llama.cpp server image. The idea is to have some commits on the master branch tested (semi-)automatically and then tagged as "stable". This could be done periodically, maybe once a week or so. We don't have a clear plan yet, but along the way, adding a communication channel for breaking changes could be a good idea.
-
This change broke both my project (https://github.com/distantmagic/paddler) and my infrastructure (it forced me to update the prod environment and other related projects by surprise):
#9056
I did not notice it in time because I was using an older build of llama.cpp at the time. It also forced me to rebuild the llama.cpp instances I had deployed in prod.
Do you plan to introduce a communication channel that notifies about breaking changes in llama.cpp? Something like this would matter for the stability and reliability of prod environments, and it would help avoid such unexpected changes in the future. It could be anything: a Discord server, a mailing list, and so on (ideally with notifications sent before the changes happen).
Also, since llama.cpp uses rolling releases instead of semantic versioning, it is impossible for 3rd party projects to target specific versions of llama.cpp. All that is left is "duck checking": I have to check whether the currently installed llama.cpp version supports the new endpoint or the older one, which is also not ideal.
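To make the "duck checking" concrete, here is a minimal sketch of how a monitoring tool might probe a server without knowing its version. It is not Paddler's actual code. The assumption that an older build embeds slot data in the `/health` payload under a `slots` key, and that `/slots` returns a JSON list, is illustrative only; `probe_server` and `http_get_json` are hypothetical helper names.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Callable, Dict, Optional

Fetch = Callable[[str], Optional[Any]]


def http_get_json(url: str, timeout: float = 2.0) -> Optional[Any]:
    """Return the parsed JSON body, or None on any HTTP, network, or parse failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        return None


def probe_server(base_url: str, fetch: Fetch = http_get_json) -> Dict[str, Any]:
    """Duck-check which endpoint exposes slot data instead of relying on a version."""
    health = fetch(base_url + "/health")
    if health is None:
        return {"reachable": False, "slots": None, "slots_source": None}

    # ASSUMPTION: an older build answers /health with slot data under a "slots" key.
    if isinstance(health, dict) and isinstance(health.get("slots"), list):
        return {"reachable": True, "slots": health["slots"], "slots_source": "/health"}

    # Otherwise fall back to a second request.
    # ASSUMPTION: /slots returns a JSON list with one entry per slot.
    slots = fetch(base_url + "/slots")
    if isinstance(slots, list):
        return {"reachable": True, "slots": slots, "slots_source": "/slots"}

    # Server is up, but slot data is not available from either endpoint.
    return {"reachable": True, "slots": None, "slots_source": None}
```

The `fetch` parameter is injectable so the probing logic can be exercised without a running server. Note that the fallback path costs two requests per check, which is exactly the extra load described in this post.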
Also, in the case of Paddler (or any tool that monitors llama.cpp), it forces requests to two endpoints (`/health` and `/slots`) instead of one, which is more taxing on the infra, and this wasn't discussed anywhere beforehand.
I love the project and do my best to use it in production and rely on it, but I was kind of taken by surprise. I think backward compatibility needs some consideration. This is a must if llama.cpp is to be trusted in production or to serve as the foundational building block for other projects.
cc @ggerganov