[Bug report] Performance deterioration of LLaMA-2 model due to hardcoded rms_norm_eps #2373


Closed
xx205 opened this issue Jul 24, 2023 · 2 comments · Fixed by #2374



xx205 commented Jul 24, 2023

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [Yes] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [Yes] I carefully followed the README.md.
  • [Yes] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [Yes] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

When running a converted ggml model, the eps used in RMSNorm should be consistent with the original model definition.

Current Behavior

The norm_eps used in RMSNorm is hardcoded to 1e-6 in all backends: x86, CUDA, and Metal.
Related commit: Change RMSNorm eps to 1e-6 #173 (22213a1)
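
For reference, RMSNorm divides each activation vector by its root mean square, with eps added inside the square root for numerical stability. A minimal NumPy sketch (an illustration of the math, not the actual ggml C implementation) shows where the constant enters:

```python
import numpy as np

def rms_norm(x: np.ndarray, weight: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """RMSNorm over the last dimension: x / sqrt(mean(x^2) + eps) * weight.

    Illustrative sketch only; llama.cpp implements this in C per backend.
    """
    mean_square = np.mean(np.square(x), axis=-1, keepdims=True)
    return x / np.sqrt(mean_square + eps) * weight

x = np.random.randn(4096).astype(np.float32)
w = np.ones(4096, dtype=np.float32)
# The two epsilon values produce slightly different outputs, and such
# differences can compound across the dozens of layers of the model.
y_eps_1e6 = rms_norm(x, w, eps=1e-6)
y_eps_1e5 = rms_norm(x, w, eps=1e-5)
print(np.max(np.abs(y_eps_1e6 - y_eps_1e5)))
```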

Environment and Context

Recently I wanted to evaluate LLaMA-1 and LLaMA-2 models on the MMLU (Measuring Massive Multitask Language Understanding, https://github.com/hendrycks/test) test set, and I chose llama.cpp as the inference engine.
The performance of the LLaMA-1 models was nearly the same as reported in the paper, but the LLaMA-2 7B and 13B models only reached LLaMA-1 7B level scores.
I then checked the model definitions of LLaMA-2 7B and 13B and found that the "rms_norm_eps" in config.json is 1e-5 instead of 1e-6.
After recompiling the source code with eps=1e-5, the test results of the LLaMA-2 models finally look good.
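
A quick way to confirm which epsilon a given checkpoint expects is to read it from the Hugging Face config.json; a small sketch (the local path is a hypothetical example, not a fixed location):

```python
import json

# Hypothetical local path to a LLaMA-2 checkpoint downloaded from Hugging Face.
with open("Llama-2-7b-hf/config.json") as f:
    config = json.load(f)

# LLaMA-1 checkpoints specify 1e-6 here; LLaMA-2 checkpoints specify 1e-5.
print(config["rms_norm_eps"])
```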

Related issue:
GGML model showing noticeable quality issues when compared to HF model #2354

Affected discussions:
LLaMA-2 Perplexities #2352
Presentation on llama.cpp on 25.07.2023 at karlsruhe.ai #2281

slaren (Member) commented Jul 24, 2023

Thanks for the report, I am working on a fix for this.

klosax (Contributor) commented Jul 24, 2023

Perplexity change on LLaMA-2 7B after changing epsilon to 1e-5:
6.006 --> 5.918
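
For context, perplexity is the exponential of the mean per-token negative log-likelihood, so this drop means the model assigns measurably higher probability to each token on average. A small sketch of the relationship (using only the reported numbers, not a rerun of the evaluation):

```python
import math

# Perplexity is exp(mean NLL per token); recover the mean NLL each
# reported perplexity implies to see the size of the improvement.
for ppl in (6.006, 5.918):
    print(f"ppl={ppl} -> mean NLL={math.log(ppl):.4f} nats/token")
```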
