Commit 058f183

Add usage of taskset in the TL1_decoder_perf (#5738)
- DALI uses more threads than the value of the `num_threads` argument alone, so the actual number of CPU cores in use is greater than intended. To keep the test from using extra cores, and to avoid inefficient data migration between cores affined with the GPU under test and distant ones, `taskset` is added to pin the test to a fixed set of cores.

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
1 parent f4b49c8 commit 058f183
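
The pinning described in the commit message can be sanity-checked with a small sketch like the one below. It counts how many CPUs a `taskset`-style list permits, so the list can be compared against the `-j` thread count passed to the benchmark. `count_cpus_in_list` is a hypothetical helper written for illustration, not part of DALI or of this commit, and it assumes simple lists of single CPUs and `lo-hi` ranges (no stride syntax).

```shell
#!/bin/sh
# Hypothetical helper (not part of DALI): count the CPUs named by a
# taskset-style list such as "0-127" or "0-3,8".
# Assumption: entries are single CPUs or lo-hi ranges, with no ":stride".
# The function body runs in a subshell, so IFS and the variables do not leak.
count_cpus_in_list() (
    total=0
    IFS=','
    for part in $1; do
        lo="${part%-*}"   # text before the dash (the whole entry if no dash)
        hi="${part#*-}"   # text after the dash (the whole entry if no dash)
        total=$(( $total + $hi - $lo + 1 ))
    done
    echo "$total"
)

# The two lists used in this commit:
count_cpus_in_list 0-127   # x86_64 (Hopper) branch
count_cpus_in_list 0-71    # GraceHopper branch
```

With the lists from this commit, the x86_64 branch allows 128 CPUs for a run started with `-j 70`, and the GraceHopper branch allows 72 CPUs for a run started with `-j 72`. On a Linux machine, running something like `taskset --cpu-list 0-3 nproc` should report the restricted CPU count, since `nproc` counts the processing units available to the current process.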

1 file changed: +4 -2 lines changed

qa/TL1_decoder_perf/test.sh

Lines changed: 4 additions & 2 deletions
@@ -14,11 +14,13 @@ test_body() {
     if [ "$(uname -p)" == "x86_64" ]; then
         # Hopper
         MIN_PERF=19000;
-        python hw_decoder_bench.py --width_hint 6000 --height_hint 6000 -b 408 -d 0 -g gpu -w 100 -t 100000 -i ${DALI_EXTRA_PATH}/db/single/jpeg -p rn50 -j 70 --hw_load 0.12 | tee ${LOG}
+        # use taskset to avoid inefficient data migration between cores we don't want to use
+        taskset --cpu-list 0-127 python hw_decoder_bench.py --width_hint 6000 --height_hint 6000 -b 408 -d 0 -g gpu -w 100 -t 100000 -i ${DALI_EXTRA_PATH}/db/single/jpeg -p rn50 -j 70 --hw_load 0.12 | tee ${LOG}
     else
         # GraceHopper
         MIN_PERF=29000;
-        python hw_decoder_bench.py --width_hint 6000 --height_hint 6000 -b 408 -d 0 -g gpu -w 100 -t 100000 -i ${DALI_EXTRA_PATH}/db/single/jpeg -p rn50 -j 72 --hw_load 0.11 | tee ${LOG}
+        # use taskset to avoid inefficient data migration between cores we don't want to use
+        taskset --cpu-list 0-71 python hw_decoder_bench.py --width_hint 6000 --height_hint 6000 -b 408 -d 0 -g gpu -w 100 -t 100000 -i ${DALI_EXTRA_PATH}/db/single/jpeg -p rn50 -j 72 --hw_load 0.11 | tee ${LOG}
     fi
 
     # Regex Explanation:
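
The two branches of the diff differ only in the CPU list, the `-j` value, and the `--hw_load` value. As a rough sketch of how the CPU list choice could be isolated, the snippet below maps an architecture string to the list used in this commit and wraps an arbitrary command in `taskset`. Both `cpu_list_for_arch` and `run_pinned` are hypothetical names introduced here for illustration; the commit itself keeps the two commands inline.

```shell
#!/bin/sh
# Hypothetical helper: return the CPU list this commit uses for a given
# architecture string (as printed by `uname -p` in the test script).
cpu_list_for_arch() {
    case "$1" in
        x86_64) echo "0-127" ;;   # Hopper branch in the diff
        *)      echo "0-71"  ;;   # GraceHopper (else) branch in the diff
    esac
}

# Hypothetical wrapper: run the given command pinned to the selected CPUs.
# Defined only; nothing is executed until run_pinned is called.
run_pinned() {
    taskset --cpu-list "$(cpu_list_for_arch "$(uname -p)")" "$@"
}
```

A call such as `run_pinned python hw_decoder_bench.py` followed by the benchmark arguments would then pick up the list for the current machine. This assumes `taskset` is installed and that every CPU in the list is available to the calling process.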