[Feature] Allow async model loading and cancellation #699

AsakusaRinne · 2024-04-26T00:49:46Z

In production environments, especially desktop apps, it's common to provide a button (or some other control) that lets users abort model loading. Fortunately, llama.cpp already supports this (ggml-org/llama.cpp#4462). I think we should introduce this feature in LLamaSharp.

Similarly, async model loading is important for applications built on LLamaSharp, because it avoids blocking the main thread for a long time while a large model loads. I've found similar work in the Node.js binding of llama.cpp: withcatai/node-llama-cpp#178. We could also implement it by polling the progress callback.
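To make the idea concrete, here is a minimal, language-agnostic sketch of the pattern, written in Python rather than C# so it runs on its own. It is not the LLamaSharp or llama.cpp API: `load_model_blocking`, `load_model_async`, and `LoadCancelled` are hypothetical names, and the loader is a stand-in that only simulates work. The one assumption it borrows from the llama.cpp change is that the progress callback returns a boolean, where `False` asks the loader to abort.

```python
import asyncio
import threading
from typing import Callable, List, Optional


class LoadCancelled(Exception):
    """Raised by the (hypothetical) loader when the callback asks it to stop."""


def load_model_blocking(
    path: str, on_progress: Callable[[float], bool], steps: int = 10
) -> str:
    """Hypothetical stand-in for a blocking native model loader.

    Calls on_progress(fraction) after each step and aborts as soon as the
    callback returns False.
    """
    for i in range(1, steps + 1):
        # A real loader would read or map one slice of the model file here.
        if not on_progress(i / steps):
            raise LoadCancelled(path)
    return f"model:{path}"


async def load_model_async(
    path: str,
    cancel: threading.Event,
    report: Optional[Callable[[float], None]] = None,
    steps: int = 10,
) -> str:
    """Run the blocking loader on a worker thread so the caller stays responsive.

    The cancel flag is polled from inside the progress callback, so
    cancellation takes effect at the next progress update.
    """

    def on_progress(fraction: float) -> bool:
        if report is not None:
            report(fraction)  # forward progress to the UI, a log, etc.
        return not cancel.is_set()  # False tells the loader to abort

    # asyncio.to_thread requires Python 3.9 or newer.
    return await asyncio.to_thread(load_model_blocking, path, on_progress, steps)


async def demo() -> List[float]:
    """Simulate a user pressing 'abort' once loading reaches 50%."""
    cancel = threading.Event()
    seen: List[float] = []

    def report(fraction: float) -> None:
        seen.append(fraction)
        if fraction >= 0.5:
            cancel.set()

    try:
        await load_model_async("model.gguf", cancel, report)
    except LoadCancelled:
        pass  # expected: the load was aborted part-way through
    return seen
```

Running `asyncio.run(demo())` returns the progress values reported before the abort. In a C# binding the same shape would presumably map onto a `CancellationToken` and a progress reporter, but that mapping is an assumption here, not something this issue specifies.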

AsakusaRinne added the feature request label Apr 26, 2024

AsakusaRinne self-assigned this Apr 26, 2024

martindevans mentioned this issue Apr 27, 2024

Interruptible Async Model Loading With Progress Monitoring #702

Merged

martindevans closed this as completed in #702 Apr 27, 2024
