
Commit 2070378

ngxsonarthw authored and committed
server : (refactoring) do not rely on JSON internally (ggml-org#10643)
* server : (refactoring) reduce usage of json internally
* move all response types to struct
* wip [no ci]
* many fixes
* add virtual function
* fix index
* minor style fix
* add std::move
* refactor handle_completions_generic
* add virtual functions
* remove server.hpp
* clarify server_sent_event RFC specs
* apply review comments
* fix model_alias and completion_probabilities
* small clean up
* remove virtual for to_json_oai_compat()
* naming oai_compat --> oaicompat
* fix unwanted recursive call
* update docs
1 parent c28a202 commit 2070378

File tree

8 files changed: +983 −695 lines


common/common.h

Lines changed: 1 addition & 1 deletion
@@ -215,7 +215,7 @@ struct common_params {
     struct common_params_speculative speculative;

     std::string model = ""; // model path // NOLINT
-    std::string model_alias = "unknown"; // model alias // NOLINT
+    std::string model_alias = ""; // model alias // NOLINT
     std::string model_url = ""; // model url to download // NOLINT
     std::string hf_token = ""; // HF token // NOLINT
     std::string hf_repo = ""; // HF repo // NOLINT

examples/server/README.md

Lines changed: 5 additions & 3 deletions
@@ -473,9 +473,11 @@ Notice that each `probs` is an array of length `n_probs`.
 - `generation_settings`: The provided options above excluding `prompt` but including `n_ctx`, `model`. These options may differ from the original ones in some way (e.g. bad values filtered out, strings converted to tokens, etc.).
 - `model`: The path to the model loaded with `-m`
 - `prompt`: The provided `prompt`
-- `stopped_eos`: Indicating whether the completion has stopped because it encountered the EOS token
-- `stopped_limit`: Indicating whether the completion stopped because `n_predict` tokens were generated before stop words or EOS was encountered
-- `stopped_word`: Indicating whether the completion stopped due to encountering a stopping word from `stop` JSON array provided
+- `stop_type`: Indicating whether the completion has stopped. Possible values are:
+  - `none`: Generating (not stopped)
+  - `eos`: Stopped because it encountered the EOS token
+  - `limit`: Stopped because `n_predict` tokens were generated before stop words or EOS was encountered
+  - `word`: Stopped due to encountering a stopping word from `stop` JSON array provided
 - `stopping_word`: The stopping word encountered which stopped the generation (or "" if not stopped due to a stopping word)
 - `timings`: Hash of timing information about the completion such as the number of tokens `predicted_per_second`
 - `tokens_cached`: Number of tokens from the prompt which could be re-used from previous completion (`n_past`)
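For clients migrating from the three removed boolean fields to the single `stop_type` field, the correspondence documented above can be sketched as follows. This is a minimal illustration only; `stop_type_from_flags` is a hypothetical helper and not part of the server code:

```cpp
#include <string>

// Hypothetical helper: maps the old boolean response fields
// (stopped_eos, stopped_limit, stopped_word) to the new unified
// `stop_type` string values (none / eos / limit / word).
std::string stop_type_from_flags(bool stopped_eos, bool stopped_limit, bool stopped_word) {
    if (stopped_eos)   return "eos";   // hit the EOS token
    if (stopped_limit) return "limit"; // hit the n_predict limit
    if (stopped_word)  return "word";  // hit a word from the `stop` array
    return "none";                     // still generating
}
```

A client that previously branched on `stopped_eos` etc. can instead compare `stop_type` against these four string values.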
