
NVIDIA / TensorRT-LLM Public

Fork 1.4k
Star 10.5k


Pull requests: NVIDIA/TensorRT-LLM

Labels 42 Milestones 0


226 Open 1,355 Closed


Pull requests list

Integrate trtllm-gen kernels for QKVGemm, FC13+swiGLU, and FC2 for Llama4

#4201 opened May 10, 2025 by eopXD


[TRTLLM-5188] fix: [AutoDeploy] update output shape of prepare_fused_mha_metadata_fake

#4199 opened May 9, 2025 by Fridah-nv


Added tests for Llama3.1-70B-BF16 on SM120

#4198 opened May 9, 2025 by farazkh80 • Draft

Extend the Llama-Nemotron-Nano-8B perf-integration-tests

#4195 opened May 9, 2025 by venkywonka

1 comment

Test main images CI result

#4194 opened May 9, 2025 by ZhanruiSunCh • Draft

5 comments

infra: [TRTLLM-325] Prepare for NGC release - multiplatform build

#4191 opened May 9, 2025 by MartinMarciniszyn

5 comments

fix: Revert NIXL and ETCD from the main image

#4190 opened May 9, 2025 by Shixiaowei02


Cherry-pick feat/llama4's 1-17 commits to main

#4189 opened May 9, 2025 by chenfeiz0326

3 comments

[bug/5247505] fix: CP accuracy on Blackwell

#4188 opened May 9, 2025 by DylanChen-NV

3 comments

infra: open source fmha v2 kernels

#4185 opened May 9, 2025 by qsang-nv

6 comments

feat: Support for Mistral Small 3.1 24B VLM

#4183 opened May 9, 2025 by brb-nv


^gdr_copy

#4181 opened May 9, 2025 by chuangz0

3 comments

add changes for fp8, nemotron-nas, API

#4180 opened May 9, 2025 by shaharmor98


[TRTQA-2802][fix]: add --host for mgmn serve examples script

#4175 opened May 9, 2025 by xinhe-nv


Breaking change: perf: Enable scheduling overlap by default

#4174 opened May 9, 2025 by kaiyux

7 comments

chore: Remove deprecated Python runtime benchmark

#4171 opened May 9, 2025 by kaiyux

7 comments

exp: pull/4114

#4170 opened May 9, 2025 by tongyuantongyu • Draft

6 comments

[feat] [AutoDeploy] Llama-4 Support

#4163 opened May 8, 2025 by lucaslie

2 of 5 tasks

4 comments

[TRTLLM-5054][fix] Removing repeated loading of input processor

#4161 opened May 8, 2025 by rakib-hasan

3 comments

fix: bump xgrammar

#4160 opened May 8, 2025 by milesial • Draft

3 comments

[nvbugs/5268808][fix] Fix the potential out-of-range-access issue of allreduce workspace.

#4159 opened May 8, 2025 by hyukn • Draft

Add test case for kv memory estimation

#4158 opened May 8, 2025 by HuiGao-NV


[TRTLLM-5050][feat] Enable per-request stats with PyT backend

#4156 opened May 8, 2025 by pcastonguay

9 comments

Feat: support exporting softmax statistics and update the kernel-selection heuristic

#4155 opened May 8, 2025 by PerkzZheng


remove cache_transceiver_prealloc_size

#4153 opened May 8, 2025 by chuangz0

9 comments
