
Commit ca1698e

MekkCyber and SunMarc authored

[Quantization] Fix FP8 experts replacing (#42654)

small fix

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

1 parent 81b8417 commit ca1698e

File tree

1 file changed (+1, -1 lines changed)


src/transformers/integrations/finegrained_fp8.py

Lines changed: 1 addition & 1 deletion

@@ -606,7 +606,7 @@ def replace_with_fp8_linear(
             module_kwargs = {} if pre_quantized else {"dtype": None}
             new_module = None
             with init_empty_weights():
-                if "gate_up_proj" in module_name or "down_proj" in module_name and "experts" in module_name:
+                if module_name.endswith(".experts"):
                     new_module = FP8Expert(
                         config=model.config, block_size=quantization_config.weight_block_size, **module_kwargs
                     )
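The bug the diff fixes is one of operator precedence: in Python, `and` binds tighter than `or`, so the old condition parsed as `"gate_up_proj" in module_name or ("down_proj" in module_name and "experts" in module_name)`. Any module whose name contained "gate_up_proj" therefore matched, whether or not it belonged to an MoE experts block. A minimal sketch of the two checks (the module names below are illustrative, not taken from a real model):

```python
def old_check(module_name: str) -> bool:
    # Parsed as: A or (B and C) -- `and` binds tighter than `or`,
    # so "gate_up_proj" alone is enough to match.
    return "gate_up_proj" in module_name or "down_proj" in module_name and "experts" in module_name

def new_check(module_name: str) -> bool:
    # Only the experts container module itself matches.
    return module_name.endswith(".experts")

# A dense (non-expert) projection wrongly matched the old check:
assert old_check("model.layers.0.mlp.gate_up_proj") is True
assert new_check("model.layers.0.mlp.gate_up_proj") is False

# The experts container is what the new check targets:
assert new_check("model.layers.0.mlp.experts") is True
```

With the new check, `FP8Expert` replacement is driven solely by the module path ending in `.experts`, so dense MLP projections fall through to the regular FP8 linear path.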
