ggml-ci: add run.sh #2877

Merged: ggerganov merged 1 commit into ggml-org:master on Mar 14, 2025

Conversation

redraskal (Collaborator)

Closes #2787

I created a new CI script (ci/run.sh) modeled after the one in llama.cpp.

  • Builds debug and release configurations and runs ctest (currently no tests are found, since they are commented out)
  • Runs benchmarks with output similar to scripts/bench-all.sh
  • Models can be selected with the GGML_TEST_MODELS env variable as a comma-separated list (GGML_TEST_MODELS="tiny,base,..."); otherwise all models are used
  • gg_run can take additional arguments to reduce redundant logic (the debug and release ctest logic is mostly the same, for example); see the sketch below
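
To illustrate that last point, here is a minimal sketch of an argument-forwarding gg_run, modeled on the dispatcher in llama.cpp's ci/run.sh; the body shown is an illustration under those assumptions, not the merged code:

```
# Sketch: run a named check, log its output, record its exit code, and
# forward any extra arguments so one check function can serve several
# variants. Assumes $OUT and ret=0 were set up earlier in the script.
function gg_run {
    ci=$1
    shift # remaining args (e.g. "debug" or "release") are passed through

    set -o pipefail
    set -x

    gg_run_$ci "$@" | tee $OUT/$ci.log
    cur=$?
    echo "$cur" > $OUT/$ci.exit

    set +x
    set +o pipefail

    gg_sum_$ci

    ret=$((ret | cur))
}

# usage (names illustrative):
#   gg_run ctest debug
#   gg_run ctest release
```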

Added a GG_BUILD_LOW_PERF env var that limits the models to "tiny", "base", and "small" for faster CI on low-performance systems (the cutoff may need adjustment); a sketch of the selection logic follows.
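
For illustration, the selection could be wired up roughly as follows; MODELS and MODELS_ALL are hypothetical names, and only GGML_TEST_MODELS and GG_BUILD_LOW_PERF come from the script:

```
# Sketch: build the model list from GGML_TEST_MODELS (comma-separated),
# defaulting to all models
MODELS_ALL="tiny base small medium large-v3"

if [ -n "${GGML_TEST_MODELS}" ]; then
    MODELS=$(echo "${GGML_TEST_MODELS}" | tr ',' ' ')
else
    MODELS="${MODELS_ALL}"
fi

# Sketch: cap the list on low-performance CI nodes
if [ -n "${GG_BUILD_LOW_PERF}" ]; then
    MODELS="tiny base small"
fi
```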

What else should be added? Running quantize or verifying binding generation?

The script downloads the required models if they don't already exist, storing them in $MNT/models/; a rough sketch of that step is shown below.
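
The sketch assumes whisper.cpp's models/download-ggml-model.sh helper accepts a destination directory as its second argument:

```
# Sketch: fetch each selected model into $MNT/models/ only when missing
mkdir -p "${MNT}/models"

for model in ${MODELS}; do
    if [ ! -f "${MNT}/models/ggml-${model}.bin" ]; then
        ./models/download-ggml-model.sh "${model}" "${MNT}/models"
    fi
done
```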

Example output:

GGML_TEST_MODELS="tiny,base" ./ci/run.sh ./tmp/results ./tmp/mnt

/tmp/results/README.md:

### ctest_debug

Runs ctest in debug mode
- status: 0
```
+ ctest --output-on-failure -L main -E test-opt
Test project /mnt/e/code/whisper.cpp/build-ci-debug
No tests were found!!!

real	0m0.031s
user	0m0.009s
sys	0m0.000s
```
### ctest_release

Runs ctest in release mode
- status: 0
```
+ ctest --output-on-failure -L main -E test-opt
Test project /mnt/e/code/whisper.cpp/build-ci-release
No tests were found!!!

real	0m0.031s
user	0m0.009s
sys	0m0.000s
```
### bench

Whisper Benchmark Results
- status: 0
#### memcpy Benchmark

```
memcpy:   10.68 GB/s (heat-up)
memcpy:   10.57 GB/s ( 1 thread)
memcpy:   10.94 GB/s ( 1 thread)
memcpy:   10.71 GB/s ( 2 thread)
memcpy:   10.91 GB/s ( 3 thread)
memcpy:   10.75 GB/s ( 4 thread)
sum:    -3071998456.000000
```

#### ggml_mul_mat Benchmark

```
  64 x   64: Q4_0    25.3 GFLOPS (128 runs) | Q4_1    24.9 GFLOPS (128 runs)
  64 x   64: Q5_0    24.8 GFLOPS (128 runs) | Q5_1    23.3 GFLOPS (128 runs) | Q8_0    26.9 GFLOPS (128 runs)
  64 x   64: F16     24.5 GFLOPS (128 runs) | F32     10.2 GFLOPS (128 runs)
 128 x  128: Q4_0    34.6 GFLOPS (128 runs) | Q4_1    31.4 GFLOPS (128 runs)
 128 x  128: Q5_0    30.8 GFLOPS (128 runs) | Q5_1    29.3 GFLOPS (128 runs) | Q8_0    47.2 GFLOPS (128 runs)
 128 x  128: F16     40.2 GFLOPS (128 runs) | F32     26.1 GFLOPS (128 runs)
 256 x  256: Q4_0    66.7 GFLOPS (128 runs) | Q4_1    55.9 GFLOPS (128 runs)
 256 x  256: Q5_0    55.9 GFLOPS (128 runs) | Q5_1    49.3 GFLOPS (128 runs) | Q8_0    68.7 GFLOPS (128 runs)
 256 x  256: F16     81.3 GFLOPS (128 runs) | F32     45.1 GFLOPS (128 runs)
 512 x  512: Q4_0    89.2 GFLOPS (128 runs) | Q4_1    85.6 GFLOPS (128 runs)
 512 x  512: Q5_0    79.7 GFLOPS (128 runs) | Q5_1    61.6 GFLOPS (128 runs) | Q8_0   110.5 GFLOPS (128 runs)
 512 x  512: F16     85.9 GFLOPS (128 runs) | F32     47.0 GFLOPS (128 runs)
1024 x 1024: Q4_0   104.1 GFLOPS ( 49 runs) | Q4_1    87.8 GFLOPS ( 41 runs)
1024 x 1024: Q5_0    86.3 GFLOPS ( 41 runs) | Q5_1    87.3 GFLOPS ( 41 runs) | Q8_0   127.5 GFLOPS ( 60 runs)
1024 x 1024: F16    105.2 GFLOPS ( 50 runs) | F32     50.0 GFLOPS ( 24 runs)
2048 x 2048: Q4_0   103.7 GFLOPS (  7 runs) | Q4_1   113.3 GFLOPS (  7 runs)
2048 x 2048: Q5_0    93.9 GFLOPS (  6 runs) | Q5_1    90.2 GFLOPS (  6 runs) | Q8_0   143.7 GFLOPS (  9 runs)
2048 x 2048: F16     96.5 GFLOPS (  6 runs) | F32     47.5 GFLOPS (  3 runs)
4096 x 4096: Q4_0   112.3 GFLOPS (  3 runs) | Q4_1    95.4 GFLOPS (  3 runs)
4096 x 4096: Q5_0    91.5 GFLOPS (  3 runs) | Q5_1    88.9 GFLOPS (  3 runs) | Q8_0   137.2 GFLOPS (  3 runs)
4096 x 4096: F16     96.2 GFLOPS (  3 runs) | F32     44.8 GFLOPS (  3 runs)
```

#### Model Benchmarks

|           Config |         Model |  Th |  FA |    Enc. |    Dec. |    Bch5 |      PP |  Commit |
|              --- |           --- | --- | --- |     --- |     --- |     --- |     --- |     --- |
|             AVX2 |          tiny |   4 |   0 |  782.14 |    3.06 |    1.27 |    0.96 | b39aebf |
|             AVX2 |          base |   4 |   0 | 1494.50 |    5.78 |    2.20 |    1.72 | b39aebf |
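
For context, the three sections in the output above map onto whisper.cpp's bench tool (-w selects what to benchmark in the upstream tool); the binary path, model path, and thread count below are assumptions, not the script's exact invocation:

```
# Sketch: the benchmark sections above, run against the release build
./build-ci-release/bin/bench -w 1                                  # memcpy bandwidth
./build-ci-release/bin/bench -w 2                                  # ggml_mul_mat GFLOPS
./build-ci-release/bin/bench -m "${MNT}/models/ggml-tiny.bin" -t 4 # per-model timings
```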


@ggerganov (Member)

Excellent! Let me add whisper.cpp to the ggml-ci nodes and merge this to see how it goes.

@redraskal Sending you a collaborator invite, which will allow you to push branches in this repo in order to refine the ggml-ci. I think the next step would be to add some sort of accuracy tests (see #2454) in order to keep track of any potential regressions in quality.

@ggerganov ggerganov merged commit f11de0e into ggml-org:master Mar 14, 2025
@ggerganov (Member)

This is the change in the ggml-ci repo to start monitoring whisper.cpp:

ggml-org/ci@8a1d8d5

The workflows are now running on the master branch:

Here is one of the runs:

https://github.com/ggml-org/ci/tree/results/whisper.cpp/f1/1de0e73c661efe4799e090be7caedbd9e193f1/ggml-100-mac-m4

@redraskal (Collaborator, Author)

Cool, I'll take a look at #2454.


ci/run.sh:

```
CMAKE_EXTRA="-DWHISPER_FATAL_WARNINGS=ON"

if [ ! -z ${GGML_CUDA} ]; then
```
@ggerganov (Member), Mar 14, 2025:

These conditions should check for the GG_BUILD_... environment variables (see the llama.cpp script).

For example, this is the environment on the CUDA node:

https://github.com/ggml-org/ci/tree/results/whisper.cpp/f1/1de0e73c661efe4799e090be7caedbd9e193f1/ggml-4-x86-cuda-v100#environment

So we have to check for GG_BUILD_CUDA here instead of GGML_CUDA.
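
A sketch of the suggested change, mirroring llama.cpp's ci/run.sh (the exact cmake flag appended is an assumption):

```
# Sketch: gate CUDA on the CI node's GG_BUILD_CUDA variable, not GGML_CUDA
if [ ! -z ${GG_BUILD_CUDA} ]; then
    CMAKE_EXTRA="${CMAKE_EXTRA} -DGGML_CUDA=ON"
fi
```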

@redraskal (Collaborator, Author):

Right, I see what you mean.

buxuku pushed a commit to buxuku/whisper.cpp that referenced this pull request on Mar 26, 2025.
Linked issue closed by this pull request: ggml-ci : add workflows (#2787)