[https://nvbugs/6336801][fix] Add the two skip_softmax_threshold_scale_factor_decode/prefill aliases to…#15482
Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com>
Summary
Root cause: _THOP_KWARG_SOURCE_ALIASES missed the two new skip_softmax_threshold_scale_factor_* aliases added by PR [TRTLLM-12807][feat] Add multiple FMHA library support to TRTLLM attention backend #15204, and _THOP_EXCLUDED_FIELDS missed multi_item_part_lens. That field was added to AttentionForwardArgs by PR [TRTLLM-12982][feat] support multi item scoring in LLM.encode #14693, but it is rejected upstream by TrtllmAttention.forward, so it never reaches the FallbackFmha thop call.

Fix: add the two skip_softmax_threshold_scale_factor_decode/prefill aliases to _THOP_KWARG_SOURCE_ALIASES in the test file, and add multi_item_part_lens to _THOP_EXCLUDED_FIELDS in fallback.py. This mirrors topk_indices and out_scale_sf, which other backends consume but the thop fallback does not.

Test plan
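The kind of sync check that these two allowlists feed can be sketched roughly as follows. This is a toy illustration only, not the repository's test: the dataclass, the op, the `TOY_*` tables, and `check_sync` are all hypothetical names, and the assumption that the alias table maps a kwarg name to its source field is mine rather than something the PR states.

```python
import inspect
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ToyForwardArgs:
    """Hypothetical stand-in for an args dataclass such as AttentionForwardArgs."""
    q: int = 0
    scale_decode: float = 1.0
    topk_indices: Optional[list] = None
    multi_item_part_lens: Optional[list] = None


def toy_thop_attention(*, q, threshold_scale_decode):
    """Hypothetical stand-in for the fallback op; it takes only some fields."""
    return q * threshold_scale_decode


# Assumed shape: kwarg name on the op -> field name it is sourced from.
TOY_KWARG_SOURCE_ALIASES = {"threshold_scale_decode": "scale_decode"}
# Assumed shape: fields that are deliberately never forwarded to the op.
TOY_EXCLUDED_FIELDS = {"topk_indices", "multi_item_part_lens"}


def check_sync(args_cls, op, aliases, excluded) -> List[str]:
    """Return a list of mismatches between the dataclass and the op signature."""
    field_names = {f.name for f in fields(args_cls)}
    kwarg_names = set(inspect.signature(op).parameters)
    # Resolve each kwarg to the field it is expected to come from.
    sources = {aliases.get(name, name) for name in kwarg_names}

    problems = []
    for name in sorted(sources - field_names):
        problems.append(f"kwarg source {name!r} is not a field")
    for name in sorted(field_names - sources - excluded):
        problems.append(f"field {name!r} is neither forwarded nor excluded")
    return problems


if __name__ == "__main__":
    print(check_sync(ToyForwardArgs, toy_thop_attention,
                     TOY_KWARG_SOURCE_ALIASES, TOY_EXCLUDED_FIELDS))
```

In this sketch, dropping `multi_item_part_lens` from the excluded set or dropping the alias entry makes `check_sync` report a mismatch, which is the same class of failure this PR resolves by extending the two tables.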