Prerequisites

[x] I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
When running a converted ggml model, the eps used in RMSNorm should be consistent with the original model definition.
Current Behavior
The eps used in RMSNorm is hardcoded to 1e-6 in all backends (x86, CUDA, and Metal).
Related commit: Change RMSNorm eps to 1e-6 #173 (22213a1)
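For context, here is a minimal sketch of the RMSNorm computation showing where eps enters (illustrative only, not the actual ggml kernel):

```c
// Minimal RMSNorm sketch: y[i] = x[i] * weight[i] / sqrt(mean(x^2) + eps).
// eps stabilizes the denominator; using 1e-6 for a model trained with
// 1e-5 shifts every normalized activation slightly, and the error
// compounds across layers.
#include <math.h>
#include <stddef.h>

void rms_norm(const float *x, const float *weight, float *y,
              size_t n, float eps) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += x[i] * x[i];
    }
    const float scale = 1.0f / sqrtf(sum / (float) n + eps);
    for (size_t i = 0; i < n; i++) {
        y[i] = x[i] * scale * weight[i];
    }
}
```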
Environment and Context
Recently I wanted to evaluate the LLaMA-1 and LLaMA-2 models on the MMLU (Measuring Massive Multitask Language Understanding, https://github.com/hendrycks/test) test set, and I chose llama.cpp as the inference engine.
The LLaMA-1 models score nearly the same as reported in the paper, but the LLaMA-2 7B and 13B models only reach LLaMA-1 7B level scores.
I then checked the model definitions of LLaMA-2 7B and 13B and found that `rms_norm_eps` in config.json is 1e-5 instead of 1e-6.
After recompiling the source code with eps changed to 1e-5, the test results of the LLaMA-2 models finally look good.
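For anyone who needs the same workaround until eps is made configurable: the constant lives in the compute kernel, so the change requires a rebuild. A sketch of the one-line edit (the exact file and variable placement in ggml may differ; treat it as illustrative):

```c
// Before (hardcoded for all models):
//     const float eps = 1e-6f;
// After, to match rms_norm_eps from the LLaMA-2 config.json:
const float eps = 1e-5f; // LLaMA-1 models use 1e-6f
```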
Related issue:
GGML model showing noticeable quality issues when compared to HF model #2354
Affected discussions:
LLaMA-2 Perplexities #2352
Presentation on llama.cpp on 25.07.2023 at karlsruhe.ai #2281