
Conversation

@ngxson ngxson commented Dec 16, 2025

Make sure to read the contributing guidelines before submitting a PR

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Fixed incorrect tensor name in model output normalization layer to improve model compatibility
  • Documentation

    • Updated development documentation for tensor configuration guidance when adding new models


@ngxson ngxson marked this pull request as draft December 16, 2025 11:24

coderabbitai bot commented Dec 16, 2025

Walkthrough

The pull request updates the tensor naming system for GGML model definitions by introducing a new model_tensors set member to the LLM_TN_IMPL struct, adding the LLM_TENSOR_OUTPUT_NORM_LFM2 enum value, and refactoring struct constructors. Documentation is updated to reflect the new guidance on adding tensor names.

Changes

  • Documentation update — docs/development/HOWTO-add-model.md
    Revised guidance for adding tensor-related names: additions now go to llm_get_tensor_names instead of the LLM_TENSOR_NAMES map, with a note that LLM_TENSOR_NAMES may also require updates.
  • Tensor system header changes — src/llama-arch.h
    Added the #include <set> dependency; extended the llm_tensor enum with LLM_TENSOR_OUTPUT_NORM_LFM2; introduced a new public member const std::set<llm_tensor> model_tensors to the LLM_TN_IMPL struct; added a new constructor LLM_TN_IMPL(llm_arch arch, llm_tensor tensor, const char * suffix, int bid, int xid); refactored the LLM_TN operators to use constructor calls instead of aggregate initialization.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Documentation change is straightforward and requires minimal review
  • Header file modifications are structural and consistent in pattern
  • Key areas for attention:
    • Verify the new constructor implementation matches the signature declaration
    • Confirm the std::set<llm_tensor> addition doesn't introduce unintended side effects
    • Ensure the enum extension LLM_TENSOR_OUTPUT_NORM_LFM2 is used consistently throughout the codebase

Poem

🐰 A tensor's name was twisted wrong, but now we set things right,
With sets and constructors dancing in the header's light,
The documentation guides the way, no more shall names confound,
This rabbit hops with glee today—the tensor truth is found! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

  • Description check — ⚠️ Warning
    Explanation: The pull request description only contains the template boilerplate with a link to contributing guidelines and lacks any substantive content explaining the changes, rationale, or implementation details.
    Resolution: Add a detailed description explaining what was refactored, why the changes were made, and how the new model_tensors structure improves the codebase.
  • Title check — ❓ Inconclusive
    Explanation: The title is vague and uses unclear phrasing like 'Xsn' (appears to be a username prefix) and 'arch refactor llm names', making it unclear what the primary change is without context.
    Resolution: Revise the title to be more descriptive and clear about what was refactored, e.g., 'Refactor LLM tensor name handling to use model_tensors' or similar specific terminology.

✅ Passed checks (1 passed)

  • Docstring Coverage — ✅ Passed
    No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
src/llama-arch.h (1)

531-533: Each LLM_TN_IMPL instance rebuilds the tensor set redundantly during model loading.

The constructor calls llm_get_tensor_names(arch) for every instantiation (1,760+ times per model load), rebuilding an identical set of ~83 tensors. Since model_tensors depends only on the arch parameter, consider caching the result per-architecture instead of recreating it for each instance. This would eliminate redundant allocations during the model loading phase.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7b1db3d and 4a78eba.

📒 Files selected for processing (2)
  • docs/development/HOWTO-add-model.md (1 hunks)
  • src/llama-arch.h (4 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{cpp,h,hpp}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{cpp,h,hpp}: Always format C++ code using git clang-format before committing, following .clang-format configuration (4-space indentation, 120 column limit, braces on same line for functions, pointer/reference alignment in middle)
Minimize external dependencies; avoid adding new external dependencies unless absolutely necessary

Files:

  • src/llama-arch.h
**/*.{cpp,h,hpp,py}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Ensure cross-platform compatibility by testing code changes on Linux, macOS, and Windows when possible

Files:

  • src/llama-arch.h
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: ngxson/llama.cpp PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T22:55:53.865Z
Learning: Applies to src/llama.cpp : Maintain core library implementation in `src/llama.cpp` with careful attention to API contracts and backward compatibility
Learnt from: CR
Repo: ngxson/llama.cpp PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T22:55:53.865Z
Learning: Applies to include/llama.h : Exercise careful consideration when making changes to the public API in `include/llama.h`, as API stability is critical
Learnt from: ngxson
Repo: ngxson/llama.cpp PR: 14
File: examples/lookahead/lookahead.cpp:97-98
Timestamp: 2025-03-13T22:56:47.417Z
Learning: ngxson prefers to prioritize simplification in some cases and defer adding error handling to a later time, as indicated by their response to a suggestion about adding error checks for llama_decode_ext calls.
📚 Learnings applied
The two CR learnings above (maintain src/llama.cpp with careful attention to API contracts and backward compatibility; exercise careful consideration when changing the public API in include/llama.h) were applied to:

  • docs/development/HOWTO-add-model.md
  • src/llama-arch.h
🧬 Code graph analysis (1)
src/llama-arch.h (1)
src/llama-arch.cpp (1)
  • LLM_TN_IMPL (2394-2396)
🔇 Additional comments (4)
docs/development/HOWTO-add-model.md (1)

100-100: Documentation update aligns with architectural changes.

The updated guidance to add tensor names via llm_get_tensor_names aligns with the new model_tensors member in LLM_TN_IMPL that initializes from this function. The parenthetical note about potentially updating LLM_TENSOR_NAMES provides helpful backward compatibility context.

src/llama-arch.h (3)

6-6: LGTM: Include added for new std::set member.

The <set> include is necessary for the new model_tensors member in LLM_TN_IMPL.


320-320: No action required. The new LLM_TENSOR_OUTPUT_NORM_LFM2 enum value is properly integrated into the tensor mapping system, including the name mapping ("token_embd_norm"), the LFM2 architecture tensor list, and the operation mapping.


555-561: Constructor calls replace aggregate initialization.

The operator overloads now correctly invoke the constructor to initialize the new model_tensors member. Codebase search confirms no aggregate initialization patterns for LLM_TN_IMPL exist elsewhere.

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Dec 16, 2025