feat: async operations #178


Merged
merged 62 commits into beta from gilad/asyncOperations on Mar 16, 2024

Conversation

@giladgd (Contributor) commented Mar 9, 2024

Description of change

  • feat: async model loading
  • feat: async context creation
  • feat: export TemplateChatWrapperOptions
  • feat: detect cmake binary issues and suggest fixes on detection
  • feat: automatically try to resolve the "Failed to detect a default CUDA architecture" CUDA compilation error
  • fix: adapt to breaking llama.cpp changes to make embedding work again
  • fix: adapt to breaking llama.cpp changes to support mamba models
  • fix: show console log prefix on postinstall
  • fix: call logger with last llama.cpp logs before exit
  • fix: rename .buildMetadata.json to not start with a dot, to make using this library together with bundlers easier
  • fix: DisposedError was thrown when calling .dispose()
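The async model loading and context creation above center on long-running work that reports fractional progress through a callback. A minimal sketch of that pattern, assuming nothing from node-llama-cpp itself (loadWithProgress is an illustrative stand-in, not the library's API):

```typescript
// Illustrative sketch of the async load-with-progress pattern this PR
// introduces: a long-running load reports fractional progress (0..1)
// through a callback, like the new onLoadProgress option on loadModel().
async function loadWithProgress(
    totalSteps: number,
    onLoadProgress: (progress: number) => void
): Promise<void> {
    for (let step = 1; step <= totalSteps; step++) {
        // Yield to the event loop, standing in for real async I/O
        await new Promise((resolve) => setImmediate(resolve));
        onLoadProgress(step / totalSteps);
    }
}

const progressValues: number[] = [];
await loadWithProgress(4, (p) => progressValues.push(p));
console.log(progressValues); // [0.25, 0.5, 0.75, 1]
```

Because the loop awaits between steps, the event loop stays responsive while the "load" runs, which is the point of making these operations async.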

How to use node-llama-cpp after this change

Regular context
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaModel, LlamaContext, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "dolphin-2.1-mistral-7b.Q4_K_M.gguf"),
    onLoadProgress(loadProgress: number) {
        console.log(`Load progress: ${loadProgress * 100}%`);
    }
});
const context = await model.createContext({
    contextSize: Math.min(4096, model.trainContextSize)
});
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});


const q1 = "Hi there, how are you?";
console.log("User: " + q1);

const a1 = await session.prompt(q1);
console.log("AI: " + a1);


const q2 = "Summarize what you said";
console.log("User: " + q2);

const a2 = await session.prompt(q2);
console.log("AI: " + a2);
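The contextSize expression in the example clamps the requested size to the model's training context, so you never ask for more context than the model supports. As a plain sketch (resolveContextSize is a hypothetical helper, not a node-llama-cpp export):

```typescript
// Hypothetical helper showing the clamping logic from the example:
// Math.min(4096, model.trainContextSize) caps the context at whichever
// is smaller — the requested size or the model's training context size.
function resolveContextSize(requested: number, trainContextSize: number): number {
    return Math.min(requested, trainContextSize);
}

console.log(resolveContextSize(4096, 32768)); // 4096 — capped by the request
console.log(resolveContextSize(4096, 2048));  // 2048 — capped by the model
```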

Embedding

import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaModel, LlamaEmbeddingContext} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "functionary-small-v2.2.q4_0.gguf")
});
const embeddingContext = await model.createEmbeddingContext({
    contextSize: Math.min(4096, model.trainContextSize)
});

const text = "Hello world";
const embedding = await embeddingContext.getEmbeddingFor(text);

console.log(text, embedding.vector);
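A common follow-up to getting an embedding is comparing two of them. A minimal sketch using cosine similarity, assuming the vectors are plain arrays of numbers (cosineSimilarity is our own helper here, not a node-llama-cpp export):

```typescript
// Illustrative helper: cosine similarity between two embedding vectors.
// Returns 1 for identical directions, 0 for orthogonal vectors.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

In practice you would pass two `embedding.vector` values (e.g. from two different texts) to rank texts by semantic similarity.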

Pull-Request Checklist

  • Code is up-to-date with the master branch
  • npm run format to apply eslint formatting
  • npm run test passes with this change
  • This pull request links relevant issues as Fixes #0000
  • There are new or updated unit tests validating the change
  • Documentation has been updated to reflect this change
  • The new commits and pull request title follow conventions explained in pull request guidelines (PRs that do not follow this convention will not be merged)

@giladgd giladgd requested a review from ido-pluto March 9, 2024 20:33
@giladgd giladgd self-assigned this Mar 9, 2024
@giladgd giladgd added this to the v3.0.0 milestone Mar 9, 2024
@ido-pluto (Contributor) left a comment


LGTM

@giladgd giladgd merged commit 315a3eb into beta Mar 16, 2024
12 checks passed
@giladgd giladgd deleted the gilad/asyncOperations branch March 16, 2024 22:14

🎉 This PR is included in version 3.0.0-beta.14 🎉

The release is available on:

Your semantic-release bot 📦🚀


github-actions bot commented Sep 24, 2024

🎉 This PR is included in version 3.0.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
