
feat: Support Mistral Small 3.1 24B VLM in TRT workflow #4183

Merged
1 commit merged into NVIDIA:main from user/brb/mistral-small-vlm on May 13, 2025

Conversation

@brb-nv (Collaborator) commented May 9, 2025

Description

# Build TRT engine for vision encoder.
$ rm -rf mistral_mm_eng/ && python examples/models/core/multimodal/build_multimodal_engine.py --model_path /home/bbuddharaju/scratch/random/hf_models/Mistral-Small-3.1-24B-Instruct-2503/ --model_type pixtral --max_batch_size 8 --output_dir mistral_mm_eng/vision/
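If the build succeeds, the vision engine lands under the --output_dir above. A quick sanity check (a sketch, not from the PR; the file name model.engine is assumed from the --visual_engine_name flag used in the run step below):

# Sanity check (sketch): confirm the serialized vision encoder was written.
$ ls -lh mistral_mm_eng/vision/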

# Create TRTLLM checkpoint for the LLM decoder.
$ rm -rf mistral_mm_llm_ckpt/ && python examples/models/core/llama/convert_checkpoint.py --model_dir /home/bbuddharaju/scratch/random/hf_models/Mistral-Small-3.1-24B-Instruct-2503/ --output_dir mistral_mm_llm_ckpt/ --dtype bfloat16

# Build TRT engine for LLM decoder.
$ rm -rf mistral_mm_eng/llm/ && trtllm-build --checkpoint_dir mistral_mm_llm_ckpt/ --output_dir mistral_mm_eng/llm/ --max_batch_size 8 --max_input_len 131072 --max_seq_len 131072 --max_multimodal_len $((131072 * 8)) --use_paged_context_fmha enable
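For context (not stated in the PR), the --max_multimodal_len value above appears to be max_batch_size * max_input_len (8 * 131072). The same build with that assumed relationship written out explicitly:

# Same LLM-decoder build as above; the sizing relationship is an assumption.
$ MAX_BATCH_SIZE=8
$ MAX_INPUT_LEN=131072
$ trtllm-build --checkpoint_dir mistral_mm_llm_ckpt/ --output_dir mistral_mm_eng/llm/ --max_batch_size ${MAX_BATCH_SIZE} --max_input_len ${MAX_INPUT_LEN} --max_seq_len ${MAX_INPUT_LEN} --max_multimodal_len $((MAX_BATCH_SIZE * MAX_INPUT_LEN)) --use_paged_context_fmha enable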

# Run the entire pipeline.
$ python examples/models/core/multimodal/run.py --hf_model_dir /home/bbuddharaju/scratch/random/hf_models/Mistral-Small-3.1-24B-Instruct-2503/ --engine_dir mistral_mm_eng/ --visual_engine_name model.engine --batch_size 1 --enable_chunked_context --max_new_tokens 512 --temperature 0 --top_p 1 --lora_task_uids -1 --session python
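The four steps can also be chained into one throwaway script. This is only a sketch that factors out the author's local model path (replace it with your own) and otherwise reuses the exact flags above:

#!/usr/bin/env bash
# Sketch: end-to-end build + run for Mistral Small 3.1 24B VLM, same flags as the commands above.
set -euo pipefail
HF_MODEL=/home/bbuddharaju/scratch/random/hf_models/Mistral-Small-3.1-24B-Instruct-2503/  # adjust to your local path
# 1. Build the TRT engine for the vision encoder.
python examples/models/core/multimodal/build_multimodal_engine.py --model_path "${HF_MODEL}" --model_type pixtral --max_batch_size 8 --output_dir mistral_mm_eng/vision/
# 2. Create the TRTLLM checkpoint for the LLM decoder.
python examples/models/core/llama/convert_checkpoint.py --model_dir "${HF_MODEL}" --output_dir mistral_mm_llm_ckpt/ --dtype bfloat16
# 3. Build the TRT engine for the LLM decoder.
trtllm-build --checkpoint_dir mistral_mm_llm_ckpt/ --output_dir mistral_mm_eng/llm/ --max_batch_size 8 --max_input_len 131072 --max_seq_len 131072 --max_multimodal_len $((131072 * 8)) --use_paged_context_fmha enable
# 4. Run the full multimodal pipeline.
python examples/models/core/multimodal/run.py --hf_model_dir "${HF_MODEL}" --engine_dir mistral_mm_eng/ --visual_engine_name model.engine --batch_size 1 --enable_chunked_context --max_new_tokens 512 --temperature 0 --top_p 1 --lora_task_uids -1 --session python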

Test Coverage

$ pytest tests/integration/defs/examples/test_multimodal.py::test_llm_multimodal_general[Mistral-Small-3.1-24B-Instruct-2503-pp:1-tp:1-bfloat16-bs:1-cpp_e2e:False-nb:1] -s -v
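If the exact parametrization ID drifts, standard pytest collection can be used to look up the current one (generic pytest usage, not part of the PR):

# List matching test parametrizations without running them.
$ pytest tests/integration/defs/examples/test_multimodal.py --collect-only -q -k "Mistral-Small-3.1"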

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user-friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]

Launch build/test pipelines. All previously running jobs will be killed.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".

kill

kill

Kill all running builds associated with the pull request.

skip

skip --comment COMMENT

Skip testing for the latest commit on the pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous, since skipping tests without careful validation can break the top of tree.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate the current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous, since reusing a pipeline without careful validation can break the top of tree.

@brb-nv force-pushed the user/brb/mistral-small-vlm branch from 446d12c to 6b96b7d on May 9, 2025 18:49
@brb-nv force-pushed the user/brb/mistral-small-vlm branch 3 times, most recently from 2522e92 to 162331f on May 9, 2025 18:59
@brb-nv (Collaborator, Author) commented May 9, 2025

/bot run

@tensorrt-cicd (Collaborator)

PR_Github #4734 [ run ] triggered by Bot

@brb-nv requested review from tijyojwad and amukkara on May 9, 2025 19:41
@tensorrt-cicd (Collaborator)

PR_Github #4734 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #3416 completed with status: 'FAILURE'

@brb-nv changed the title from "feat: Support for Mistral Small 3.1 24B VLM" to "feat: Support Mistral Small 3.1 24B VLM in TRT workflow" on May 12, 2025
@brb-nv force-pushed the user/brb/mistral-small-vlm branch from 162331f to 6bc8c39 on May 12, 2025 21:23
@brb-nv (Collaborator, Author) commented May 12, 2025

/bot run --disable-fail-fast

@tensorrt-cicd (Collaborator)

PR_Github #4913 [ run ] triggered by Bot

@brb-nv (Collaborator, Author) commented May 13, 2025

/bot run --disable-fail-fast

@tensorrt-cicd (Collaborator)

PR_Github #4953 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #4913 [ run ] completed with state ABORTED

@brb-nv requested a review from amukkara on May 13, 2025 04:57
@brb-nv force-pushed the user/brb/mistral-small-vlm branch 2 times, most recently from 2af78ba to 35c1ea5 on May 13, 2025 04:58
@brb-nv (Collaborator, Author) commented May 13, 2025

/bot run --disable-fail-fast

@tensorrt-cicd (Collaborator)

PR_Github #4957 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #4953 [ run ] completed with state ABORTED

@tensorrt-cicd (Collaborator)

PR_Github #4957 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #3599 completed with status: 'FAILURE'

@brb-nv force-pushed the user/brb/mistral-small-vlm branch 2 times, most recently from d038f54 to 7d26224 on May 13, 2025 17:16
@brb-nv (Collaborator, Author) commented May 13, 2025

/bot run --disable-fail-fast

@tensorrt-cicd (Collaborator)

PR_Github #5049 [ run ] triggered by Bot

Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
@brb-nv force-pushed the user/brb/mistral-small-vlm branch from 046970c to 7cadbe2 on May 13, 2025 19:03
@amukkara (Collaborator)

/bot kill

@amukkara enabled auto-merge (squash) on May 13, 2025 19:17
@amukkara (Collaborator)

/bot skip --comment "Unrelated failed tests in last CI run are waived on main currently."

@tensorrt-cicd (Collaborator)

PR_Github #5056 [ kill ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #5059 [ ] completed with state ABORTED

@tensorrt-cicd (Collaborator)

PR_Github #5049 [ run ] completed with state ABORTED

@tensorrt-cicd (Collaborator)

PR_Github #5056 [ kill ] completed with state SUCCESS
Successfully killed previous jobs for commit 7cadbe2

@amukkara (Collaborator)

/bot skip --comment "Unrelated failed tests in last CI run are waived on main currently."

@tensorrt-cicd (Collaborator)

PR_Github #5061 [ skip ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #5061 [ skip ] completed with state SUCCESS
Skipping testing for commit 7cadbe2

@amukkara merged commit cd5b3d2 into NVIDIA:main on May 13, 2025
3 checks passed