Describe the bug
The option "lora_post_process_for_vllm" does not seem to have any effect. It is described in https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/build/README.md#configuration as "If tuning for inference on vLLM, set lora_post_process_for_vllm to true. Post process LoRA adapters to allow inferencing on vLLM. vLLM needs new token embedding weights added during tuning to be moved to a new file new_embeddings.safetensors."
When fine-tuning mistralai/Mixtral-8x7B-v0.1 with "lora_post_process_for_vllm": true, the new file new_embeddings.safetensors is not created. Later, when the fine-tuned adapter is served by vLLM, the following error occurs:
raise ValueError(f"{name} is unsupported LoRA weight")
ValueError: base_model.model.lm_head.weight is unsupported LoRA weight
Platform
- Interpreter version: Python 3.12.7
- Library version: main branch as of December 9, 4:27 pm ET
Sample Code
export SFT_TRAINER_CONFIG_JSON_PATH=config.json
accelerate launch --num_processes=5 --config_file fixtures/accelerate_fsdp_defaults.yaml tuning/sft_trainer.py
where the content of config.json is as follows:
{
"config_file": "fixtures/accelerate_fsdp_defaults.yaml",
"model_name_or_path": "mistralai/Mixtral-8x7B-v0.1",
"training_data_path": $TRAINING_PATH,
"output_dir": $OUTPUT_PATH,
"num_train_epochs": 10.0,
"per_device_train_batch_size": 1,
"gradient_accumulation_steps": 4,
"torch_dtype": "float16",
"peft_method": "lora",
"r": 8,
"lora_dropout": 0.05,
"target_modules": "all-linear",
"lora_post_process_for_vllm": true
}
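After tuning completes, I check the output directory for new_embeddings.safetensors. A minimal sketch of that check (the directory below is a placeholder for the same $OUTPUT_PATH value used in config.json):

import os

output_dir = "<OUTPUT_PATH>"  # placeholder for the output_dir passed in config.json
print(sorted(os.listdir(output_dir)))
print("new_embeddings.safetensors present:",
      os.path.exists(os.path.join(output_dir, "new_embeddings.safetensors")))

The file is never present in my runs.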
Expected behavior
The expected behavior is described in https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/build/README.md#configuration: "If tuning for inference on vLLM, set lora_post_process_for_vllm to true. Post process LoRA adapters to allow inferencing on vLLM. vLLM needs new token embedding weights added during tuning to be moved to a new file new_embeddings.safetensors."
Observed behavior
When fine-tuning mistralai/Mixtral-8x7B-v0.1 with "lora_post_process_for_vllm": true, the new file new_embeddings.safetensors is not created. Later, when the fine-tuned adapter is served by vLLM, the following error occurs:
raise ValueError(f"{name} is unsupported LoRA weight")
ValueError: base_model.model.lm_head.weight is unsupported LoRA weight
Additional context
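As a possible workaround, I considered manually splitting the full embedding weights out of the adapter checkpoint, which is roughly what the README describes lora_post_process_for_vllm as doing. The sketch below is an assumption on my part: the tensor-name filters ("lm_head", "embed_tokens") come from the error message and the usual PEFT naming, and I have not verified which key names vLLM expects inside new_embeddings.safetensors, so this only illustrates the idea rather than a confirmed fix.

import os
from safetensors.torch import load_file, save_file

adapter_dir = "<OUTPUT_PATH>"  # placeholder: directory containing adapter_model.safetensors
adapter_path = os.path.join(adapter_dir, "adapter_model.safetensors")

tensors = load_file(adapter_path)

# Move the full (non-LoRA) embedding/lm_head weights, which vLLM refuses to load
# from the adapter file, into a separate new_embeddings.safetensors file.
embedding_keys = [k for k in tensors
                  if ("lm_head" in k or "embed_tokens" in k) and "lora_" not in k]
new_embeddings = {k: tensors.pop(k) for k in embedding_keys}

if new_embeddings:
    save_file(new_embeddings, os.path.join(adapter_dir, "new_embeddings.safetensors"))
    save_file(tensors, adapter_path)

Even if this produces the expected file, having the documented flag perform this step automatically during tuning is what the README promises.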