1 parent d9393b8 commit 1da231f
ucm/patch/0.9.2/vllm-adapt.patch
@@ -278,7 +278,7 @@ index b06b7cc80..22c22a148 100644
 -        (output, ) = self.collective_rpc(
 +        non_block = self.max_concurrent_batches > 1
 +
-+        if not self.has_connector:
++        if not self.has_connector or self.vllm_config.model_config.use_mla:
 +            # get output only from a single worker (output_rank)
 +            (output, ) = self.collective_rpc(
 +                "execute_model",
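
The only behavioral change in this hunk is the widened branch condition. As a rough illustration, the sketch below compares the old and new conditions over all four flag combinations. `ExecutorFlags` and both helper functions are hypothetical names invented for this example and are not part of vLLM or this patch; they only mirror `self.has_connector` and `self.vllm_config.model_config.use_mla`. What the other branch does is not visible in this hunk, so it is not modeled.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ExecutorFlags:
    """Hypothetical stand-in for the executor state the patch reads."""
    has_connector: bool  # mirrors self.has_connector
    use_mla: bool        # mirrors self.vllm_config.model_config.use_mla


def single_worker_before(flags: ExecutorFlags) -> bool:
    """Condition before this commit: `if not self.has_connector`."""
    return not flags.has_connector


def single_worker_after(flags: ExecutorFlags) -> bool:
    """Condition after this commit: `... or ...model_config.use_mla`."""
    return (not flags.has_connector) or flags.use_mla


# Collect every flag combination whose branch choice changed.
changed = [
    flags
    for flags in (ExecutorFlags(c, m) for c, m in product([False, True], repeat=2))
    if single_worker_before(flags) != single_worker_after(flags)
]
```

Under these assumptions, `changed` contains exactly one entry: a connector is present and MLA is enabled. That is the only case this commit moves onto the single-worker (`output_rank`) path.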