llama.h (+6 -4)
```diff
@@ -704,18 +704,20 @@ extern "C" {
                               char * buf,
                              int32_t   length);
 
-    /// Apply chat template and maybe tokenize it. Inspired by hf apply_chat_template() on python.
+    /// Apply chat template. Inspired by hf apply_chat_template() in Python.
     /// Both "model" and "custom_template" are optional, but at least one is required. "custom_template" has higher precedence than "model"
     /// NOTE: This function only supports some known jinja templates. It is not a jinja parser.
-    /// @param custom_template A Jinja template to use for this conversion. If this is nullptr, the model’s default chat template will be used instead.
-    /// @param msg Pointer to a list of multiple llama_chat_message
+    /// @param custom_template A Jinja template to use for this chat. If this is nullptr, the model’s default chat template will be used instead.
+    /// @param chat Pointer to a list of multiple llama_chat_message
+    /// @param n_msg Number of llama_chat_message in this chat
     /// @param add_ass Whether to end the prompt with the token(s) that indicate the start of an assistant message.
     /// @param buf A buffer to hold the output formatted prompt. The recommended alloc size is 2 * (total number of characters of all messages)
+    /// @param length The size of the allocated buffer
     /// @return The total number of bytes of the formatted prompt. If it is larger than the size of the buffer, you may need to re-alloc it and then re-apply the template.
```
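For context, here is a minimal usage sketch of the API this comment block documents, including the re-alloc-and-re-apply pattern the `@return` note describes. It is not part of the diff: it assumes the parameter order documented above (model, custom template, chat, n_msg, add_ass, buf, length), a `llama_chat_message` struct with `role`/`content` string fields, and a `llama_model` loaded elsewhere; the helper name `format_chat` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "llama.h"

// Sketch: format a two-message chat using the model's default template
// (custom_template = NULL). Error handling is elided for brevity.
static char * format_chat(const struct llama_model * model) {
    struct llama_chat_message chat[] = {
        { "system", "You are a helpful assistant." },
        { "user",   "Hello!"                       },
    };
    const size_t n_msg = sizeof(chat) / sizeof(chat[0]);

    // Recommended alloc size: 2 * (total number of characters of all messages).
    int32_t alloc_size = 0;
    for (size_t i = 0; i < n_msg; i++) {
        alloc_size += 2 * (int32_t) strlen(chat[i].content);
    }
    char * buf = malloc(alloc_size);

    // add_ass = true ends the prompt with the assistant-start token(s).
    int32_t n = llama_chat_apply_template(model, NULL, chat, n_msg, true, buf, alloc_size);
    if (n > alloc_size) {
        // Output did not fit: re-alloc and re-apply, per the @return note.
        buf = realloc(buf, n);
        n   = llama_chat_apply_template(model, NULL, chat, n_msg, true, buf, n);
    }

    fwrite(buf, 1, n, stdout);
    return buf; // caller frees
}
```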