Server: fallback to chatml, add AlphaMonarch chat template #5628


Merged: 5 commits, Feb 22, 2024

Conversation

@ngxson (Collaborator) commented Feb 21, 2024

Closes #5627

Fix server crash if the template is not supported

If the template is not supported, an error is now logged and the server falls back to the chatml template instead of crashing.

{"timestamp":1708509262,"level":"INFO","function":"main","line":2723,"message":"model loaded"}
{"timestamp":1708509262,"level":"ERROR","function":"validate_model_chat_template","line":408,"message":"The chat template comes with this model is not yet supported, falling back to chatml. This may cause the model to output suboptimal responses"}
{"timestamp":1708509262,"level":"VERBOSE","function":"start_loop","line":293,"message":"have new task"}
{"timestamp":1708509262,"level":"VERBOSE","function":"start_loop","line":308,"message":"callback_all_task_finished"}
all slots are idle and system prompt is empty, clear the KV cache
{"timestamp":1708509262,"level":"VERBOSE","function":"start_loop","line":329,"message":"wait for new task"}
{"timestamp":1708509286,"level":"VERBOSE","function":"format_chat","line":208,"message":"formatted_chat","text":"<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nhi, how are you<|im_end|>\n<|im_start|>assistant\n"}
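The chatml layout shown in that last log line can be sketched as follows. This is a minimal Python sketch of the format, not the server's actual C++ code; `format_chatml` is a hypothetical helper name.

```python
def format_chatml(messages):
    """Render chat messages in ChatML, the fallback template used
    when the model's built-in template is not recognized."""
    out = ""
    for m in messages:
        # each turn: <|im_start|>role\ncontent<|im_end|>\n
        out += "<|im_start|>" + m["role"] + "\n" + m["content"] + "<|im_end|>\n"
    out += "<|im_start|>assistant\n"  # generation prompt for the model's reply
    return out

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "hi, how are you"},
])
print(prompt)
```

Running this reproduces the `formatted_chat` string from the log above.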

Add new template

New template added: https://huggingface.co/mlabonne/AlphaMonarch-7B/blob/main/tokenizer_config.json#L36

The new template has also been added to the research issue: #5527

Example for the formatted prompt:

{"timestamp":1708510304,"level":"VERBOSE","function":"format_chat","line":208,"message":"formatted_chat","text":"system\nYou are a helpful assistant.</s>\n<s>user\nhi, how are you</s>\n<s>assistant\n"}
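Judging from that log line, the AlphaMonarch layout prepends `<s>` to every turn except the first (the leading BOS is presumably left to the tokenizer) and closes each turn with `</s>`. A minimal Python sketch of that layout, with a hypothetical function name, not the server's code:

```python
def format_alphamonarch(messages):
    """Sketch of the AlphaMonarch chat layout as seen in the server log."""
    out = ""
    for i, m in enumerate(messages):
        bos = "" if i == 0 else "<s>"  # leading BOS is left to the tokenizer
        out += bos + m["role"] + "\n" + m["content"] + "</s>\n"
    out += "<s>assistant\n"  # generation prompt
    return out
```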

@ngxson ngxson requested a review from ggerganov February 21, 2024 10:14
@ngxson (Collaborator, Author) commented Feb 21, 2024

@infozzdatalabs Can you please check this out if possible? Thank you!

@infozzdatalabs commented Feb 21, 2024

I think it's great that it now warns when the template doesn't match any available one; until now it was trial and error without any feedback. It would be amazing to directly use a template that is not hardcoded, taking it from the model's metadata, but this already solves many problems. Thanks a lot!

@ggerganov (Member) left a comment

Thank you for looking into this and fixing!

But there are 2 things still missing IMO:

  • For each new template we must add a test in tests/test-chat-template.cpp
  • We should not validate the model's template if a custom template is provided
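The second point could be handled with a check along these lines. This is a sketch under the assumption that an empty string means no custom template was passed; the function name is hypothetical, not taken from the codebase.

```python
def should_validate_model_template(custom_template):
    """Only inspect the model's built-in template when the user has
    not supplied one via --chat-template; a custom template, when
    present, takes precedence and is used as-is."""
    return custom_template is None or custom_template == ""
```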

@infozzdatalabs commented Feb 21, 2024

Thanks to both of you @ngxson @ggerganov for fixing this so fast.
I have not found any information about this: if I want to use a custom template in Docker with the --chat-template flag, do I have to provide the path to a Jinja file or pass it directly as a string? I tried passing it as a string, but the server did not start and threw a wrong-argument exception.

Edit: I searched the files in #5593 and found that it is a string.

@ngxson (Collaborator, Author) commented Feb 21, 2024

@ggerganov Well spotted! I've fixed that in the 2 commits above.

For the tests: before, I only checked whether the output contains a pre-defined string (I was too lazy to deal with \n). Now the test checks that the output is exactly equal to what we expect.

@infozzdatalabs Yeah, you need to pass a string as the argument. I suspect it doesn't work in your case because the template contains characters that are reserved by the shell (brackets {} for example). I'll document this better in a future PR.
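When launching from a POSIX shell, single-quoting (or programmatic quoting) keeps braces and other special characters from being interpreted. A sketch using Python's `shlex.quote`; the template string, `./server`, and `model.gguf` here are only placeholders for illustration:

```python
import shlex

# an illustrative Jinja-style template containing shell-special characters
tmpl = "{% for m in messages %}{{ m['content'] }}{% endfor %}"

# quote it so the whole template survives as a single shell argument
cmd = "./server -m model.gguf --chat-template " + shlex.quote(tmpl)
print(cmd)
```

`shlex.quote` wraps any string containing shell metacharacters in single quotes (escaping embedded single quotes), so the template reaches the server intact.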

@ngxson ngxson requested a review from ggerganov February 21, 2024 19:50
@ggerganov (Member) left a comment

Awesome 🦙

@ggerganov ggerganov merged commit a46f507 into ggml-org:master Feb 22, 2024
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
…#5628)

* server: fallback to chatml

* add new chat template

* server: add AlphaMonarch to test chat template

* server: only check model template if there is no custom tmpl

* remove TODO
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
…#5628)

* server: fallback to chatml

* add new chat template

* server: add AlphaMonarch to test chat template

* server: only check model template if there is no custom tmpl

* remove TODO

Successfully merging this pull request may close these issues.

server: Error\nvector::_M_default_append when using certain models since "llama_chat_apply_template"