
Commit 3ad986c

Authored by reidliu41

[doc] update wrong model id (#17287)

Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>

Parent: 344e193

1 file changed: +5 −3 lines

docs/source/features/quantization/gptqmodel.md (+5 −3)
@@ -58,7 +58,7 @@ model.save(quant_path)
 To run a GPTQModel quantized model with vLLM, you can use [DeepSeek-R1-Distill-Qwen-7B-gptqmodel-4bit-vortex-v2](https://huggingface.co/ModelCloud/DeepSeek-R1-Distill-Qwen-7B-gptqmodel-4bit-vortex-v2) with the following command:
 
 ```console
-python examples/offline_inference/llm_engine_example.py --model DeepSeek-R1-Distill-Qwen-7B-gptqmodel-4bit-vortex-v2
+python examples/offline_inference/llm_engine_example.py --model ModelCloud/DeepSeek-R1-Distill-Qwen-7B-gptqmodel-4bit-vortex-v2
 ```
 
 ## Using GPTQModel with vLLM's Python API
@@ -80,15 +80,17 @@ prompts = [
 sampling_params = SamplingParams(temperature=0.6, top_p=0.9)
 
 # Create an LLM.
-llm = LLM(model="DeepSeek-R1-Distill-Qwen-7B-gptqmodel-4bit-vortex-v2")
+llm = LLM(model="ModelCloud/DeepSeek-R1-Distill-Qwen-7B-gptqmodel-4bit-vortex-v2")
 
 # Generate texts from the prompts. The output is a list of RequestOutput objects
 # that contain the prompt, generated text, and other information.
 outputs = llm.generate(prompts, sampling_params)
 
 # Print the outputs.
+print("-"*50)
 for output in outputs:
     prompt = output.prompt
     generated_text = output.outputs[0].text
-    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
+    print(f"Prompt: {prompt!r}\nGenerated text: {generated_text!r}")
+    print("-"*50)
 ```
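For context, the documented snippet as it reads after this commit comes to roughly the self-contained sketch below. The prompts list is abbreviated here for illustration; everything else mirrors the diff above, and the model is fetched from the Hugging Face Hub by its full `ModelCloud/...` repo id on first use.

```python
from vllm import LLM, SamplingParams

# Abbreviated sample prompts (the doc page defines its own list).
prompts = [
    "Hello, my name is",
    "The capital of France is",
]

# Sampling settings from the documented example.
sampling_params = SamplingParams(temperature=0.6, top_p=0.9)

# Create an LLM. The full "org/name" Hub id is the fix this commit makes:
# the bare model name alone does not resolve on the Hugging Face Hub.
llm = LLM(model="ModelCloud/DeepSeek-R1-Distill-Qwen-7B-gptqmodel-4bit-vortex-v2")

# Generate texts from the prompts. The output is a list of RequestOutput
# objects that contain the prompt, generated text, and other information.
outputs = llm.generate(prompts, sampling_params)

# Print the outputs, with the separator lines this commit adds for readability.
print("-" * 50)
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}\nGenerated text: {generated_text!r}")
    print("-" * 50)
```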
