
Conversation

@yewentao256 (Member) commented on Dec 8, 2025

Purpose

Intended to fix #29999 (comment) and #29999 (comment).

However, after discussion with @ProExpertProg and @bnellnm, we found that requiring a must-pass parameter would mean updating everything, especially the tests, so we decided to use a moe_parallel_config instead. A rough sketch of the intended pattern follows.
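As a rough illustration of the intended direction (all names below are hypothetical placeholders, not the actual vLLM signatures), the idea is that the kernel accepts the MoE parallel configuration explicitly and only falls back to globally derived state when the caller omits it:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FusedMoEParallelConfig:
    """Illustrative stand-in for the MoE parallel configuration (TP/EP sizes)."""
    tp_size: int = 1
    ep_size: int = 1


def _parallel_config_from_global_state() -> FusedMoEParallelConfig:
    """Hypothetical fallback that derives the config from global vLLM state."""
    return FusedMoEParallelConfig()


class FusedMoEModularKernel:
    def __init__(self, moe_parallel_config: Optional[FusedMoEParallelConfig] = None):
        # Prefer the explicitly passed config; fall back to global state only so
        # existing tests that never build a full vLLM config keep working.
        self.moe_parallel_config = (
            moe_parallel_config
            if moe_parallel_config is not None
            else _parallel_config_from_global_state()
        )
```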

This PR could also fix an issue currently seen on main:

(Worker_TP2_EP2 pid=248418) WARNING 12-10 13:35:58 [vllm.py:1394] Current vLLM config is not set.
(Worker_TP2_EP2 pid=248418) INFO 12-10 13:35:58 [scheduler.py:228] Chunked prefill is enabled with max_num_batched_tokens=2048.
(Worker_TP4_EP4 pid=248420) WARNING 12-10 13:35:58 [vllm.py:1394] Current vLLM config is not set.
(Worker_TP4_EP4 pid=248420) INFO 12-10 13:35:58 [scheduler.py:228] Chunked prefill is enabled with max_num_batched_tokens=2048.
(Worker_TP3_EP3 pid=248419) WARNING 12-10 13:35:58 [vllm.py:1394] Current vLLM config is not set.
(Worker_TP3_EP3 pid=248419) INFO 12-10 13:35:58 [scheduler.py:228] Chunked prefill is enabled with max_num_batched_tokens=2048.
(Worker_TP5_EP5 pid=248421) WARNING 12-10 13:35:58 [vllm.py:1394] Current vLLM config is not set.
(Worker_TP5_EP5 pid=248421) INFO 12-10 13:35:58 [scheduler.py:228] Chunked prefill is enabled with max_num_batched_tokens=2048.
(Worker_TP0_EP0 pid=248416) WARNING 12-10 13:35:58 [vllm.py:1394] Current vLLM config is not set.
(Worker_TP0_EP0 pid=248416) INFO 12-10 13:35:58 [scheduler.py:228] Chunked prefill is enabled with max_num_batched_tokens=2048.
(Worker_TP6_EP6 pid=248422) WARNING 12-10 13:35:58 [vllm.py:1394] Current vLLM config is not set.
(Worker_TP6_EP6 pid=248422) INFO 12-10 13:35:58 [scheduler.py:228] Chunked prefill is enabled with max_num_batched_tokens=2

Signed-off-by: yewentao256 <zhyanwentao@126.com>
@gemini-code-assist (bot) left a comment


Code Review

This pull request adds a helpful comment to vllm/model_executor/layers/fused_moe/modular_kernel.py clarifying the logic for handling the parallel_config parameter in FusedMoEModularKernel. The comment explains that while explicit passing is preferred, a fallback to the current vLLM config exists for testing purposes. This improves code clarity and maintainability. The change is correct and I have no further suggestions.
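Continuing the hypothetical sketch above (again, these are illustrative names, not the exact vLLM API), the two call paths the comment describes would look roughly like this:

```python
# Preferred: production code passes the parallel config explicitly.
kernel = FusedMoEModularKernel(
    moe_parallel_config=FusedMoEParallelConfig(tp_size=2, ep_size=2)
)

# Fallback: a unit test may omit it and rely on the global-state default,
# which is the path that previously produced the
# "Current vLLM config is not set" warning shown above.
test_kernel = FusedMoEModularKernel()
```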

@yewentao256 yewentao256 marked this pull request as draft December 9, 2025 00:54
@yewentao256 yewentao256 changed the title [Small] Add comment for parallel_config in FusedMoEModularKernel [Feat] Refactor for parallel_config in FusedMoEModularKernel Dec 10, 2025
@mergify mergify bot added the nvidia label Dec 10, 2025
@yewentao256 yewentao256 marked this pull request as ready for review December 10, 2025 20:22
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

@yewentao256 (Member, Author) commented

@bnellnm @ProExpertProg CC

@yewentao256 yewentao256 added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 10, 2025
@yewentao256 yewentao256 requested a review from tjtanaa as a code owner December 10, 2025 22:03

Labels

nvidia, ready (ONLY add when PR is ready to merge/full CI is needed)
