(FOR CI) Xsn/arch refactor llm names #60
base: master
Conversation
Walkthrough
The pull request updates the tensor naming system for GGML model definitions by introducing a new `model_tensors` member in `LLM_TN_IMPL`, initialized via `llm_get_tensor_names`.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/llama-arch.h (1)
531-533: Each `LLM_TN_IMPL` instance rebuilds the tensor set redundantly during model loading.
The constructor calls `llm_get_tensor_names(arch)` for every instantiation (1,760+ times per model load), rebuilding an identical set of ~83 tensors. Since `model_tensors` depends only on the `arch` parameter, consider caching the result per architecture instead of recreating it for each instance. This would eliminate redundant allocations during the model loading phase.
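The per-architecture caching the review suggests could look roughly like the sketch below. The enum values and the body of `llm_get_tensor_names` are stand-ins, not the real llama.cpp definitions; only the caching wrapper is the point. If model loading can run on multiple threads, the map insertion would additionally need a mutex.

```cpp
#include <map>
#include <set>
#include <string>

// Illustrative stand-ins for the real llama.cpp types and lookup.
enum llm_arch { LLM_ARCH_LLAMA, LLM_ARCH_LFM2 };

// Stand-in for the real llm_get_tensor_names(), which builds the
// full set of tensor base names for an architecture.
static std::set<std::string> llm_get_tensor_names(llm_arch arch) {
    std::set<std::string> names = { "token_embd", "output_norm" };
    if (arch == LLM_ARCH_LFM2) {
        names.insert("token_embd_norm");
    }
    return names;
}

// Suggested fix: build each architecture's set once and hand out a
// shared reference, so repeated LLM_TN_IMPL construction reuses it.
static const std::set<std::string> & llm_get_tensor_names_cached(llm_arch arch) {
    static std::map<llm_arch, std::set<std::string>> cache;
    auto it = cache.find(arch);
    if (it == cache.end()) {
        it = cache.emplace(arch, llm_get_tensor_names(arch)).first;
    }
    return it->second;
}
```

The constructor would then hold a `const std::set<std::string> &` (or pointer) into the cache instead of a by-value copy.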
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
docs/development/HOWTO-add-model.md (1 hunks)
src/llama-arch.h (4 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{cpp,h,hpp}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{cpp,h,hpp}: Always format C++ code using `git clang-format` before committing, following the .clang-format configuration (4-space indentation, 120-column limit, braces on the same line for functions, pointer/reference alignment in the middle)
Minimize external dependencies; avoid adding new external dependencies unless absolutely necessary
Files:
src/llama-arch.h
**/*.{cpp,h,hpp,py}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Ensure cross-platform compatibility by testing code changes on Linux, macOS, and Windows when possible
Files:
src/llama-arch.h
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: ngxson/llama.cpp PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T22:55:53.865Z
Learning: Applies to src/llama.cpp : Maintain core library implementation in `src/llama.cpp` with careful attention to API contracts and backward compatibility
Learnt from: CR
Repo: ngxson/llama.cpp PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T22:55:53.865Z
Learning: Applies to include/llama.h : Exercise careful consideration when making changes to the public API in `include/llama.h`, as API stability is critical
Learnt from: ngxson
Repo: ngxson/llama.cpp PR: 14
File: examples/lookahead/lookahead.cpp:97-98
Timestamp: 2025-03-13T22:56:47.417Z
Learning: ngxson prefers to prioritize simplification in some cases and defer adding error handling to a later time, as indicated by their response to a suggestion about adding error checks for llama_decode_ext calls.
📚 Learning: 2025-11-29T22:55:53.865Z
Learnt from: CR
Repo: ngxson/llama.cpp PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T22:55:53.865Z
Learning: Applies to src/llama.cpp : Maintain core library implementation in `src/llama.cpp` with careful attention to API contracts and backward compatibility
Applied to files:
docs/development/HOWTO-add-model.md
src/llama-arch.h
📚 Learning: 2025-11-29T22:55:53.865Z
Learnt from: CR
Repo: ngxson/llama.cpp PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T22:55:53.865Z
Learning: Applies to include/llama.h : Exercise careful consideration when making changes to the public API in `include/llama.h`, as API stability is critical
Applied to files:
docs/development/HOWTO-add-model.md
src/llama-arch.h
🧬 Code graph analysis (1)
src/llama-arch.h (1)
src/llama-arch.cpp (1)
LLM_TN_IMPL (2394-2396)
🔇 Additional comments (4)
docs/development/HOWTO-add-model.md (1)
100-100: Documentation update aligns with architectural changes.
The updated guidance to add tensor names via `llm_get_tensor_names` aligns with the new `model_tensors` member in `LLM_TN_IMPL` that is initialized from this function. The parenthetical note about potentially updating `LLM_TENSOR_NAMES` provides helpful backward-compatibility context.
src/llama-arch.h (3)
6-6: LGTM: Include added for new std::set member.
The `<set>` include is necessary for the new `model_tensors` member in `LLM_TN_IMPL`.
320-320: No action required. The new `LLM_TENSOR_OUTPUT_NORM_LFM2` enum is properly integrated into the tensor mapping system, including the name mapping ("token_embd_norm"), the LFM2 architecture tensor list, and the operation mapping.
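The name mapping described here can be pictured as an entry in an enum-to-string table. The enum values and table shape below are illustrative stand-ins, not the actual llama.cpp definitions; only the `"token_embd_norm"` mapping comes from the review.

```cpp
#include <map>
#include <string>

// Illustrative stand-in for the tensor enum; only the LFM2 entry's
// mapping to "token_embd_norm" is taken from the review text.
enum llm_tensor { LLM_TENSOR_OUTPUT_NORM, LLM_TENSOR_OUTPUT_NORM_LFM2 };

// Hypothetical name table: each enum value resolves to a base name
// that the tensor-name formatter expands per layer/architecture.
static const std::map<llm_tensor, std::string> LLM_TENSOR_NAMES = {
    { LLM_TENSOR_OUTPUT_NORM,      "output_norm"     },
    { LLM_TENSOR_OUTPUT_NORM_LFM2, "token_embd_norm" },
};
```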
555-561: Constructor calls replace aggregate initialization.
The operator overloads now correctly invoke the constructor to initialize the new `model_tensors` member. Codebase search confirms no aggregate initialization patterns for `LLM_TN_IMPL` exist elsewhere.
Summary by CodeRabbit
Release Notes
Bug Fixes
Documentation