
Conversation

@TinLongYu

Description

This PR fixes a crash during LoRA-based RL fine-tuning with vLLM, caused by a KeyError raised while cleaning up finished LoRA requests.

It applies a simple defensive check before removing a request ID from running_requests and only affects the LoRA training pipeline: finish_request now verifies that req_state.request_id is in lora_stats.running_requests before removing it.

This patch can be removed once vLLM is updated to v0.12.0 or higher.
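For reference, a minimal sketch of the patched function as described (the body matches the hunk shown in the review thread below, copied from vLLM v0.11.0 with the membership check added; the inline comments are explanatory only):

def finish_request(self, req_state: 'RequestState'):
    # Requests without a LoRA adapter have nothing to clean up.
    if req_state.lora_name is None:
        return
    lora_stats = self.lora_name_to_stats[req_state.lora_name]
    # Defensive check: the request id may already be absent from
    # running_requests, which previously raised a KeyError.
    if req_state.request_id in lora_stats.running_requests:
        lora_stats.running_requests.remove(req_state.request_id)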

File changed

  • areal/thirdparty/vllm/areal_vllm_server.py

Related Issue

Fixes #751: [BUG] KeyError in vLLM LoRA request cleanup during LoRA RL fine-tuning (missing request_id in running_requests)

Testing

  • Tested xccl-based LoRA GSM8K training

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not
    work as expected)
  • Documentation update
  • Code refactoring (no functional changes)
  • Performance improvement
  • Test coverage improvement

Checklist

  • I have read the Contributing Guide
  • I have run formatting tools (pre-commit or manual)
  • I have run relevant unit tests and they pass
  • I have added tests for new functionality
  • I have updated documentation if needed
  • My branch is up to date with main
  • This PR introduces breaking changes (if yes, fill out details below)
  • If this PR changes documentation, I have built and previewed it locally with
    jb build docs
  • No critical issues raised by AI reviewers (/gemini review)

Breaking Change Details (if applicable):

Additional Context


Need help? Check the Contributing Guide or ask in
GitHub Discussions!

@gemini-code-assist
Contributor

Summary of Changes

Hello @TinLongYu, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical "KeyError" that occurred during the cleanup of LoRA requests within the vLLM integration, specifically impacting LoRA-based reinforcement learning fine-tuning. The change introduces a robust check in the finish_request function to prevent crashes by ensuring that a request ID is present before attempting its removal from tracking. This targeted bug fix enhances the stability of the LoRA training pipeline and is designed to be a temporary measure until a broader vLLM update is adopted.

Highlights

  • Bug Fix for LoRA Fine-tuning: Addresses a "KeyError" crash in vLLM during LoRA-based RL fine-tuning by adding a defensive check when cleaning up finished LoRA requests.
  • Enhanced Request Cleanup: Implements a check to ensure a request ID exists in lora_stats.running_requests before attempting to remove it, preventing unexpected crashes.
  • Temporary Patch: This specific patch is intended as a temporary solution and may be removed once the project updates to vLLM v0.12.0 or newer.



@gemini-code-assist bot left a comment (Contributor)

Code Review

This pull request addresses a KeyError crash during LoRA-based fine-tuning by adding a defensive check in the finish_request function. The change is a targeted fix to prevent removing a request ID that is no longer in the running_requests list. The approach of monkey-patching vllm's LoRARequestStates is a reasonable short-term solution as noted. My review includes a suggestion to further improve the robustness of the finish_request function by also handling cases where the LoRA name might not be present in lora_name_to_stats, preventing another potential KeyError.

Comment on lines +343 to +345
lora_stats = self.lora_name_to_stats[req_state.lora_name]
if req_state.request_id in lora_stats.running_requests:
    lora_stats.running_requests.remove(req_state.request_id)
Contributor

Severity: medium

The current implementation at line 343 (self.lora_name_to_stats[req_state.lora_name]) could still raise a KeyError if req_state.lora_name is not present in the lora_name_to_stats dictionary. To make this function more robust and prevent another potential crash, it's better to use the .get() method for safer dictionary access. This also allows for a more concise check.

Suggested change
-lora_stats = self.lora_name_to_stats[req_state.lora_name]
-if req_state.request_id in lora_stats.running_requests:
-    lora_stats.running_requests.remove(req_state.request_id)
+lora_stats = self.lora_name_to_stats.get(req_state.lora_name)
+if lora_stats and req_state.request_id in lora_stats.running_requests:
+    lora_stats.running_requests.remove(req_state.request_id)

Comment on lines +340 to +345
def finish_request(self, req_state: 'RequestState'):
    if req_state.lora_name is None:
        return
    lora_stats = self.lora_name_to_stats[req_state.lora_name]
    if req_state.request_id in lora_stats.running_requests:
        lora_stats.running_requests.remove(req_state.request_id)
Collaborator

LGTM, but could you add a comment about the change? We may want to note: (1) this is a simple copy from vLLM v0.11.0 with an added if-condition that checks whether the request is still running; (2) this patch can be removed once vLLM is updated to v0.12.0 or higher.

A better solution may be to apply the patch only under the following condition:

from areal.utils import pkg_version

if not pkg_version.is_version_higher("vllm", "0.11.1"):
    setattr(...)  # apply the patch

FYI, we wanted to update vLLM and SGLang, but they did not support the same PyTorch version until the latest vLLM 0.14.0rc1. Hopefully we will update both versions once 0.14.0 is officially released.
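For illustration, a hypothetical sketch of that version-gated patch. The pkg_version check is taken from the snippet above; the import path for vLLM's LoRARequestStates is an assumption and may differ across vLLM releases:

from areal.utils import pkg_version

def _patched_finish_request(self, req_state):
    # Copied from vLLM v0.11.0, plus a membership check so that finishing
    # an already-removed request no longer raises a KeyError.
    if req_state.lora_name is None:
        return
    lora_stats = self.lora_name_to_stats[req_state.lora_name]
    if req_state.request_id in lora_stats.running_requests:
        lora_stats.running_requests.remove(req_state.request_id)

# Only apply the patch on vLLM versions that still need it; per the discussion
# above it can be dropped once vLLM is updated to v0.12.0 or higher.
if not pkg_version.is_version_higher("vllm", "0.11.1"):
    # NOTE: hypothetical import path; reuse whatever reference to
    # LoRARequestStates areal_vllm_server.py already has.
    from vllm.v1.metrics.stats import LoRARequestStates
    setattr(LoRARequestStates, "finish_request", _patched_finish_request)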

