Because the Q/K projection tensors get permuted during conversion:

if name.endswith(("q_proj.weight", "q_proj.bias")):
    data_torch = LlamaModel.permute(data_torch, n_head, n_head)
if name.endswith(("k_proj.weight", "k_proj.bias")):
    data_torch = LlamaModel.permute(data_torch, n_head, n_kv_head)
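For reference, the permutation itself reorders the rows of each head so that the interleaved RoPE layout used by the Hugging Face checkpoints becomes the split half/half layout expected by GGUF. Below is a minimal NumPy sketch of that row shuffle, assuming the same reshape/swapaxes structure as `LlamaModel.permute` (this is an illustration, not a verbatim copy of the converter code):

```python
import numpy as np

def permute(weights: np.ndarray, n_head: int, n_head_kv: int) -> np.ndarray:
    # For K tensors with grouped-query attention, permute per KV head.
    if n_head_kv is not None and n_head != n_head_kv:
        n_head = n_head_kv
    # Within each head, view the rows as (2, head_dim // 2) — the two
    # interleaved RoPE halves — then swap those axes so the halves
    # become contiguous, and flatten back to the original shape.
    return (weights.reshape(n_head, 2, weights.shape[0] // n_head // 2, *weights.shape[1:])
                   .swapaxes(1, 2)
                   .reshape(weights.shape))

# Tiny example: one head with 4 rows [0, 1, 2, 3] becomes [0, 2, 1, 3].
w = np.arange(4.0).reshape(4, 1)
print(permute(w, 1, 1).flatten())  # → [0. 2. 1. 3.]
```

Note the pure-row permutation: only the order of output rows changes, so the tensor values themselves are untouched and the operation is invertible.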

Answer selected by ggerganov