
LoRA hot swap for service deployment #863

Merged
helloyongyang merged 7 commits into ModelTC:main from xjq:hot-lora-service
Feb 3, 2026

Conversation

@xjq (Contributor) commented on Feb 2, 2026

I am serving qwen-image with several LoRAs, so hot swapping is vital for service deployment.

The patch includes:

  1. A LoRA removal function added to mm_weight, weight_module, and base_model.
  2. If an empty lora_path is received in the pipeline, the LoRA is removed from the model.
  3. A new lora_dir parameter added for the service.
  4. New lora_name and lora_strength parameters added for API calls (see the sketch after this list).
  5. A service script and an inference example provided.
  6. Basic testing has been done.
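For orientation, here is a minimal sketch of a client request using the new parameters. Only lora_name and lora_strength come from this patch; the endpoint path, port, and the other payload fields are assumptions for illustration, so the provided infer example is the authoritative schema.

```python
import requests

# Hypothetical request against the lightx2v service; only lora_name and
# lora_strength are the parameters introduced by this PR. The route, port,
# and remaining fields are illustrative placeholders.
payload = {
    "prompt": "a watercolor painting of a lighthouse at dusk",
    "lora_name": "watercolor_style.safetensors",  # resolved under the service's lora_dir
    "lora_strength": 0.8,                         # scale applied to the LoRA weights
}

resp = requests.post("http://localhost:8000/v1/tasks", json=payload, timeout=600)
resp.raise_for_status()
print(resp.json())

# Per the pipeline behavior described above, an empty or omitted lora_name would
# presumably map to an empty lora_path and remove the currently loaded LoRA.
```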

@gemini-code-assist commented

Summary of Changes

Hello @xjq, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds dynamic LoRA (Low-Rank Adaptation) hot-swapping to the service, allowing it to load, switch, and remove LoRA weights at runtime without a restart. This is particularly useful for deployments serving multiple LoRAs, since different model adaptations can be applied on the fly as user demands change.

Highlights

  • LoRA Removal Functionality: Introduced a remove_lora function across mm_weight, weight_module, and base_model to allow for dynamic removal of LoRA weights from the model.
  • Dynamic LoRA Switching and Removal in Pipeline: The pipeline now supports removing LoRA weights when an empty lora_path is provided, alongside the existing dynamic switching capability (see the sketch after this list).
  • Service Configuration for LoRA Directory: Added a new lora_dir parameter for the service, allowing users to specify a directory where LoRA files are stored for dynamic loading.
  • API Parameters for LoRA Control: New lora_name and lora_strength parameters have been added to the API call schema, enabling clients to specify which LoRA to apply and its intensity.
  • Service and Inference Examples: Provided new scripts for starting the server with LoRA directory support and an example for making API calls with LoRA parameters.
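As referenced in the highlights, the sketch below illustrates the server-side control flow under stated assumptions: the model exposes the new remove_lora, and some loader applies a LoRA from a path. The names apply_lora_request and load_lora, and the strength keyword, are hypothetical and used only to show the flow.

```python
import os

def apply_lora_request(pipeline, lora_dir, lora_name=None, lora_strength=1.0):
    """Hypothetical per-request helper: hot-swap or remove a LoRA.

    Only remove_lora, lora_dir, lora_name, and lora_strength correspond to
    names introduced by this PR; everything else is illustrative.
    """
    # An empty lora_name means the client wants no LoRA applied: clear any
    # previously loaded weights via the new remove_lora path.
    if not lora_name:
        pipeline.model.remove_lora()
        return

    # Otherwise resolve the file under the service's lora_dir, drop the old
    # LoRA, and apply the requested one at the requested strength.
    lora_path = os.path.join(lora_dir, lora_name)
    pipeline.model.remove_lora()
    pipeline.model.load_lora(lora_path, strength=lora_strength)  # hypothetical loader
```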



@gemini-code-assist bot left a comment


Code Review

This pull request introduces LoRA hot-swapping capabilities, which is a great feature for service deployments. The changes are well-structured, adding remove_lora functions and updating the pipeline to handle LoRA removal. I've provided a few suggestions to improve code conciseness, robustness, and script usability. Overall, this is a solid contribution.

Comment on lines +209 to +216
if hasattr(self, "lora_down"):
del self.lora_down
if hasattr(self, "lora_up"):
del self.lora_up
if hasattr(self, "lora_alpha"):
del self.lora_alpha
if hasattr(self, "lora_scale"):
del self.lora_scale

Severity: medium

The repeated if hasattr/del blocks can be made more concise and easier to maintain by using a loop.

Suggested change
- if hasattr(self, "lora_down"):
-     del self.lora_down
- if hasattr(self, "lora_up"):
-     del self.lora_up
- if hasattr(self, "lora_alpha"):
-     del self.lora_alpha
- if hasattr(self, "lora_scale"):
-     del self.lora_scale
+ for attr in ("lora_down", "lora_up", "lora_alpha", "lora_scale"):
+     if hasattr(self, attr):
+         delattr(self, attr)

Comment on lines +134 to +135
if hasattr(self, "current_lora_strength"):
del self.current_lora_strength

Severity: medium

Instead of deleting current_lora_strength, it's more idiomatic and slightly cleaner to set it back to None. The attribute is initialized to None in __init__, so this keeps the object's structure consistent. The hasattr check is also redundant since the attribute is always present after initialization.

                    self.current_lora_strength = None

Comment on lines +4 to +6
lightx2v_path=
model_path=
lora_dir=

Severity: medium

To improve usability, it's helpful to add a check that verifies that these required variables have been set. This gives the user a clear error message if they forget to edit the script, preventing confusing errors later on.

Suggested change
- lightx2v_path=
- model_path=
- lora_dir=
+ lightx2v_path=
+ model_path=
+ lora_dir=
+ if [[ -z "${lightx2v_path}" || -z "${model_path}" || -z "${lora_dir}" ]]; then
+     echo "Error: lightx2v_path, model_path, and lora_dir must be set." >&2
+     exit 1
+ fi

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
helloyongyang merged commit b9a501e into ModelTC:main on Feb 3, 2026
1 check passed
