We have been running perf benchmarks on MT-Bench so that end-to-end speedup and acceptance length (AL) are comparable with other setups and academic papers. Thanks to @luyuzhe111 and others for the discussion and for helping measure the gaps!
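To make the two metrics above concrete, here is a minimal sketch of how acceptance length and the resulting speedup are commonly computed in speculative-decoding papers. The counting convention (accepted draft tokens plus one bonus token per verification step) and the first-order speedup model are assumptions for illustration, not numbers from this thread.

```python
# Sketch: acceptance length (AL) and a rough speedup estimate for
# speculative decoding. Conventions here are assumptions, not from the thread.

def mean_acceptance_length(accepted_per_step):
    """AL = average tokens emitted per target-model forward pass.

    Each verification step emits all accepted draft tokens plus one
    token sampled from the target model itself (the "bonus" token).
    """
    return sum(n + 1 for n in accepted_per_step) / len(accepted_per_step)

def expected_speedup(al, draft_overhead):
    """First-order speedup model: one verification step costs
    (1 + draft_overhead) target-pass equivalents and emits `al` tokens.
    `draft_overhead` is the drafting cost as a fraction of one target pass.
    """
    return al / (1.0 + draft_overhead)

# Example: three verification steps accepting 2, 3, and 1 draft tokens.
al = mean_acceptance_length([2, 3, 1])
print(al)                         # -> 3.0
print(expected_speedup(al, 0.2))  # -> 2.5
```

In practice the measured end-to-end speedup falls below this estimate because of scheduling, kernel-launch, and memory overheads, which is exactly the gap being tracked in this thread.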
ekagra-ranjan changed the title from "[V1][Benchmark][Spec Decode][EAGLE] Tracking benchmark done for V1 EAGLE" to "[Benchmark][V1][Spec Decode][EAGLE] Tracking benchmark done for V1 EAGLE" on May 7, 2025.
ekagra-ranjan changed the title from "[Benchmark][V1][Spec Decode][EAGLE] Tracking benchmark done for V1 EAGLE" to "[Benchmark][V1][Spec Decode][EAGLE] Tracking benchmark for V1 EAGLE" on May 7, 2025.
Do you have a unified table/doc that keeps all the benchmark results up to date (including the gaps from the ideal condition)? I believe that would be very helpful.
I didn't get the time to do it. I believe the most recent comment would be the one to refer to.
Llama 3 8B
During model weight loading
During KV cache slot
Llama 3.1 8B
torch.compile & CUDA graph:
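For reference, a hedged sketch of how a setup like the ones labeled above might be launched in vLLM with EAGLE speculation while keeping torch.compile / CUDA graph capture enabled. The argument names and the EAGLE draft-model repo are assumptions based on vLLM's `LLM` entrypoint around the time of this thread; check them against your installed version before use.

```python
# Config sketch (assumptions, not from the thread): vLLM with an EAGLE
# draft head and CUDA graphs enabled. Verify argument names for your version.
from vllm import LLM

llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    enforce_eager=False,  # leave CUDA graph capture / compilation enabled
    speculative_config={
        "method": "eagle",
        # Assumed EAGLE draft head for Llama 3 8B Instruct:
        "model": "yuhuili/EAGLE-LLaMA3-Instruct-8B",
        "num_speculative_tokens": 3,
    },
)
```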