fix(examples): te_llama compatibility with transformers >= 4.57 #2572
base: main
Conversation
fix(examples): te_llama compatibility with transformers >= 4.57

The te_llama.py example was failing with HuggingFace transformers 4.57+ due to API changes in how decoder layer outputs are handled.

Changes:
- Handle case where hidden_states is passed as a tuple (older HF versions)
- Return tensor directly instead of wrapped in tuple (HF 4.57+ expects this)
- Fix regex pattern to use raw string (fixes SyntaxWarning)

Error fixed: AttributeError: 'tuple' object has no attribute 'contiguous'

Tested with:
- transformer_engine 2.5.0
- transformers 4.57.3
- PyTorch container nvcr.io/nvidia/pytorch:25.08-py3

Signed-off-by: Santosh Bhavani <santosh.bhavani@live.com>
Force-pushed from 2ca056c to a65fa49
Greptile Summary
This PR fixes compatibility issues with HuggingFace transformers >= 4.57 in the te_llama.py example. The main changes:
- Handle the case where hidden_states arrives as a tuple and unpack it
- Return the tensor directly instead of wrapping it in a tuple (HF 4.57+ expects this)
- Fix a regex SyntaxWarning with a raw string prefix
The fix properly addresses the reported AttributeError.
Confidence Score: 4/5
Sequence Diagram
sequenceDiagram
participant HF as HuggingFace LlamaModel
participant TELayer as TELlamaDecoderLayer
participant TE as TransformerEngine TransformerLayer
Note over HF,TE: Transformers < 4.57 (Old Behavior)
HF->>TELayer: forward(hidden_states)
Note over TELayer: hidden_states is tensor
TELayer->>TE: super().forward(hidden_states, ...)
TE-->>TELayer: returns tensor
TELayer-->>HF: returns (tensor,)
Note over HF: Extracts with layer_outputs[0]
Note over HF,TE: Transformers >= 4.57 (New Behavior)
HF->>TELayer: forward((hidden_states,))
Note over TELayer: Check if tuple and unpack [0]
TELayer->>TELayer: hidden_states = hidden_states[0]
TELayer->>TE: super().forward(hidden_states, ...)
TE-->>TELayer: returns tensor
TELayer-->>HF: returns tensor directly
Note over HF: Uses tensor directly (no unpacking)
Additional Comments (1)
- docs/examples/te_llama/te_llama.py, lines 77-78 (logic): No check for empty tuple before accessing [0]. If hidden_states is an empty tuple, this will raise an IndexError.
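A minimal sketch of the stricter guard the bot is suggesting (illustrative only; not the code as merged):

```python
# Only unpack when the tuple is non-empty; an empty tuple would otherwise
# raise IndexError on hidden_states[0].
if isinstance(hidden_states, tuple) and len(hidden_states) > 0:
    hidden_states = hidden_states[0]
```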
1 file reviewed, 1 comment
Thanks for fixing, @sbhavani! LGTM.
sudhakarsingh27
left a comment
Actually, fixing this for 4.57+ would break it for previous versions then?
Nope, it handles both the previous and current versions of transformers. I think we should pin the versions to support the latest transformers, as both TE and transformers APIs are constantly changing.
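For context, an explicit version gate would be an alternative to duck-typing on the input; a hypothetical sketch, not what the PR does:

```python
from packaging import version
import transformers

# Branch on the installed transformers version instead of inspecting the
# input type; hypothetical alternative to the PR's isinstance check.
HF_RETURNS_TENSOR = version.parse(transformers.__version__) >= version.parse("4.57.0")
```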
Okay, I see that it correctly handles version dependencies.
I agree with fixing library versions. Would you be open to:
- Creating a requirements.txt file with correct versions for the TE, huggingface, accelerate, peft, and datasets libraries
- Adding a small section at the start of the tutorial which mentions installing the prereqs using pip install -r requirements.txt?
:)
(I did that for te_gemma for your reference)
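For illustration, the requested file might look like the sketch below; only the transformers==4.57.0 pin is taken from this PR's review table, and the other entries are unpinned placeholders:

```
transformers==4.57.0
accelerate
peft
datasets
```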
Signed-off-by: Santosh Bhavani <santosh.bhavani@live.com>
Greptile Overview
Greptile Summary
This PR fixes compatibility issues with HuggingFace transformers >= 4.57 by changing how TELlamaDecoderLayer returns outputs. The breaking change in transformers 4.57+ modified the LlamaModel forward loop to expect decoder layers to return tensors directly instead of tuples.
Key changes:
- Modified TELlamaDecoderLayer.forward() to return the tensor directly instead of wrapping it in a tuple
- Added defensive tuple unpacking for the hidden_states input (backward compatibility safety)
- Fixed the regex SyntaxWarning by using a raw string prefix, r"model.layers.\d+." (see the sketch after this list)
- Added requirements.txt to pin tested dependency versions
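The regex item is small but worth showing; a minimal before/after sketch (the pattern is taken from the PR, the surrounding code is assumed):

```python
import re

# Before: "\d" in a plain string literal is an invalid escape sequence and
# triggers a SyntaxWarning on recent Python versions.
# pattern = re.compile("model.layers.\d+.")

# After: the raw string prefix keeps the backslash literal for the regex engine.
pattern = re.compile(r"model.layers.\d+.")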
Issues found:
- Minor: Confusing comment about when tuple unpacking is needed
- The tuple unpacking check appears to be defensive programming rather than necessary for backward compatibility, as the return type change itself handles version compatibility
Confidence Score: 4/5
- This PR is safe to merge with minimal risk - it addresses a critical compatibility issue with clear, targeted fixes
- Score reflects that the fix correctly addresses the transformers 4.57+ breaking change by modifying return types. The regex fix is correct. Minor deduction for a misleading comment about the tuple unpacking logic, which doesn't affect functionality but could confuse future maintainers. The changes are well-tested according to PR description and follow a clear pattern.
- No files require special attention - all changes are straightforward compatibility fixes
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| docs/examples/te_llama/te_llama.py | 4/5 | Fixed forward method to return tensor directly instead of tuple for transformers >= 4.57, added input tuple handling for backward compatibility, and fixed regex SyntaxWarning with raw string prefix |
| docs/examples/te_llama/requirements.txt | 4/5 | New file pinning dependency versions for the example, including transformers==4.57.0 which contains the breaking change this PR addresses |
Sequence Diagram
sequenceDiagram
participant HF as HuggingFace LlamaModel
participant TELayer as TELlamaDecoderLayer
participant TE as TransformerLayer (TE)
Note over HF,TE: Transformers >= 4.57
HF->>TELayer: forward(hidden_states=tensor)
TELayer->>TELayer: Check isinstance(hidden_states, tuple)
Note over TELayer: False - continue
TELayer->>TE: super().forward(hidden_states, ...)
TE-->>TELayer: returns tensor
TELayer-->>HF: returns tensor directly
Note over HF,TE: Transformers < 4.57 (with old code)
HF->>TELayer: forward(hidden_states=tensor)
TELayer->>TE: super().forward(hidden_states, ...)
TE-->>TELayer: returns tensor
TELayer-->>HF: returns (tensor,)
Note over HF: Extracts with layer_outputs[0]
# Handle case where hidden_states might be a tuple (from previous layer output)
# This can happen with older versions of HuggingFace transformers
if isinstance(hidden_states, tuple):
    hidden_states = hidden_states[0]
Comment is misleading about when tuple unpacking is needed.
With the new code that returns the tensor directly, hidden_states should never be a tuple when called from HuggingFace's LlamaModel forward loop (in any version). The old code returned (tensor,) for transformers < 4.57, but HF's loop extracted it with layer_outputs[0] before passing it to the next layer.
This check appears to be defensive programming rather than addressing a real backward compatibility scenario. Consider clarifying the comment to explain this is a safety check rather than expected behavior.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
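One way to reword the comment along the lines the bot suggests (illustrative wording only):

```python
# Safety check, not expected behavior: in HF's forward loop (any supported
# version) hidden_states arrives as a tensor; unpack defensively in case a
# caller passes the previous layer's output through as a tuple.
if isinstance(hidden_states, tuple):
    hidden_states = hidden_states[0]
```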
Description
The te_llama.py example fails with HF transformers 4.57+ due to a breaking change in how decoder layer outputs are handled. In transformers 4.57+, the LlamaModel forward loop changed, causing TELlamaDecoderLayer to fail because it was returning a tuple (tensor,) instead of the tensor directly.
Fixes #2567
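To make the shape of the fix concrete, here is a minimal sketch under the assumption that the example's TELlamaDecoderLayer subclasses TransformerEngine's TransformerLayer; signatures are simplified and this is not the exact code in the PR:

```python
import transformer_engine.pytorch as te

class TELlamaDecoderLayer(te.TransformerLayer):
    def forward(self, hidden_states, *args, attention_mask=None, **kwargs):
        # Defensive: older HF versions may pass the previous layer's output
        # through as a tuple; unpack to the tensor.
        if isinstance(hidden_states, tuple):
            hidden_states = hidden_states[0]
        out = super().forward(hidden_states, attention_mask=attention_mask)
        # HF 4.57+ consumes the tensor directly, so do not wrap it in a tuple.
        return out
```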
Type of change
Changes
Please list the changes introduced in this PR:
- Handle case where hidden_states is passed as a tuple (for backward compatibility with older HF versions)
- Return tensor directly instead of wrapped in tuple (HF 4.57+ expects this)
- Fix regex SyntaxWarning by using raw string prefix (r"model.layers.\d+.")
Checklist:
Testing
Tested with:
- transformer_engine 2.5.0+f05f12c
- transformers 4.57.3
- nvcr.io/nvidia/pytorch:25.08-py3
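A quick, illustrative way to confirm a matching environment before running the example (assumes both packages expose __version__):

```python
import transformers
import transformer_engine

# Expect 4.57.x and 2.5.0+<commit> per the versions tested in this PR.
print("transformers:", transformers.__version__)
print("transformer_engine:", transformer_engine.__version__)
```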