(FOR CI) Xsn/arch refactor llm names #60
base: master
Conversation
Walkthrough
The pull request updates the tensor naming system for GGML model definitions by introducing a new `model_tensors` member in `LLM_TN_IMPL`, initialized via `llm_get_tensor_names`.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/llama-arch.h (1)
531-533: Each `LLM_TN_IMPL` instance rebuilds the tensor set redundantly during model loading.
The constructor calls `llm_get_tensor_names(arch)` for every instantiation (1,760+ times per model load), rebuilding an identical set of ~83 tensors. Since `model_tensors` depends only on the `arch` parameter, consider caching the result per architecture instead of recreating it for each instance. This would eliminate redundant allocations during the model loading phase.
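The per-architecture caching the review suggests could look roughly like the sketch below. The enum values and the body of `llm_get_tensor_names` are stand-ins, not the real llama.cpp definitions; only the caching wrapper is the point. If model loading can run on multiple threads, the map insertion would additionally need a mutex.

```cpp
#include <map>
#include <set>
#include <string>

// Illustrative stand-ins for the real llama.cpp types and lookup.
enum llm_arch { LLM_ARCH_LLAMA, LLM_ARCH_LFM2 };

// Stand-in for the real llm_get_tensor_names(), which builds the
// full set of tensor base names for an architecture.
static std::set<std::string> llm_get_tensor_names(llm_arch arch) {
    std::set<std::string> names = { "token_embd", "output_norm" };
    if (arch == LLM_ARCH_LFM2) {
        names.insert("token_embd_norm");
    }
    return names;
}

// Suggested fix: build each architecture's set once and hand out a
// shared reference, so repeated LLM_TN_IMPL construction reuses it.
static const std::set<std::string> & llm_get_tensor_names_cached(llm_arch arch) {
    static std::map<llm_arch, std::set<std::string>> cache;
    auto it = cache.find(arch);
    if (it == cache.end()) {
        it = cache.emplace(arch, llm_get_tensor_names(arch)).first;
    }
    return it->second;
}
```

The constructor would then hold a `const std::set<std::string> &` (or pointer) into the cache instead of a by-value copy.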
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
docs/development/HOWTO-add-model.md (1 hunks)
src/llama-arch.h (4 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{cpp,h,hpp}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{cpp,h,hpp}: Always format C++ code using `git clang-format` before committing, following the .clang-format configuration (4-space indentation, 120-column limit, braces on the same line for functions, pointer/reference alignment in the middle)
Minimize external dependencies; avoid adding new external dependencies unless absolutely necessary
Files:
src/llama-arch.h
**/*.{cpp,h,hpp,py}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Ensure cross-platform compatibility by testing code changes on Linux, macOS, and Windows when possible
Files:
src/llama-arch.h
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: ngxson/llama.cpp PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T22:55:53.865Z
Learning: Applies to src/llama.cpp : Maintain core library implementation in `src/llama.cpp` with careful attention to API contracts and backward compatibility
Learnt from: CR
Repo: ngxson/llama.cpp PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T22:55:53.865Z
Learning: Applies to include/llama.h : Exercise careful consideration when making changes to the public API in `include/llama.h`, as API stability is critical
Learnt from: ngxson
Repo: ngxson/llama.cpp PR: 14
File: examples/lookahead/lookahead.cpp:97-98
Timestamp: 2025-03-13T22:56:47.417Z
Learning: ngxson prefers to prioritize simplification in some cases and defer adding error handling to a later time, as indicated by their response to a suggestion about adding error checks for llama_decode_ext calls.
📚 Learning: 2025-11-29T22:55:53.865Z
Learnt from: CR
Repo: ngxson/llama.cpp PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T22:55:53.865Z
Learning: Applies to src/llama.cpp : Maintain core library implementation in `src/llama.cpp` with careful attention to API contracts and backward compatibility
Applied to files:
docs/development/HOWTO-add-model.md
src/llama-arch.h
📚 Learning: 2025-11-29T22:55:53.865Z
Learnt from: CR
Repo: ngxson/llama.cpp PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T22:55:53.865Z
Learning: Applies to include/llama.h : Exercise careful consideration when making changes to the public API in `include/llama.h`, as API stability is critical
Applied to files:
docs/development/HOWTO-add-model.md
src/llama-arch.h
🧬 Code graph analysis (1)
src/llama-arch.h (1)
src/llama-arch.cpp (1)
LLM_TN_IMPL (2394-2396)
🔇 Additional comments (4)
docs/development/HOWTO-add-model.md (1)
100-100: Documentation update aligns with architectural changes.
The updated guidance to add tensor names via `llm_get_tensor_names` aligns with the new `model_tensors` member in `LLM_TN_IMPL` that is initialized from this function. The parenthetical note about potentially updating `LLM_TENSOR_NAMES` provides helpful backward-compatibility context.
src/llama-arch.h (3)
6-6: LGTM: Include added for new std::set member.
The `<set>` include is necessary for the new `model_tensors` member in `LLM_TN_IMPL`.
320-320: No action required. The new `LLM_TENSOR_OUTPUT_NORM_LFM2` enum is properly integrated into the tensor mapping system, including the name mapping ("token_embd_norm"), the LFM2 architecture tensor list, and the operation mapping.
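The name mapping described here can be pictured as an entry in an enum-to-string table. The enum values and table shape below are illustrative stand-ins, not the actual llama.cpp definitions; only the `"token_embd_norm"` mapping comes from the review.

```cpp
#include <map>
#include <string>

// Illustrative stand-in for the tensor enum; only the LFM2 entry's
// mapping to "token_embd_norm" is taken from the review text.
enum llm_tensor { LLM_TENSOR_OUTPUT_NORM, LLM_TENSOR_OUTPUT_NORM_LFM2 };

// Hypothetical name table: each enum value resolves to a base name
// that the tensor-name formatter expands per layer/architecture.
static const std::map<llm_tensor, std::string> LLM_TENSOR_NAMES = {
    { LLM_TENSOR_OUTPUT_NORM,      "output_norm"     },
    { LLM_TENSOR_OUTPUT_NORM_LFM2, "token_embd_norm" },
};
```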
555-561: Constructor calls replace aggregate initialization.
The operator overloads now correctly invoke the constructor to initialize the new `model_tensors` member. Codebase search confirms no aggregate initialization patterns for `LLM_TN_IMPL` exist elsewhere.
Summary by CodeRabbit
Release Notes
Bug Fixes
Documentation