stephantul commented Jul 15, 2025

This PR adds mean encoding to the main trainer loop: instead of passing each token through the encoder separately, we pass the mean of all tokens. This leads to a 1.5-point improvement on NanoBEIR for a MiniLM model.
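
To illustrate the difference, here is a minimal sketch in plain Python. It is not the code from this PR: `mean_vector`, `encode_per_token`, `encode_mean`, and `toy_encoder` are hypothetical names, and the encoder is replaced by an elementwise ReLU purely so the example runs on its own. It assumes the per-token path averages the encoder outputs afterwards, which the description does not state.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Encoder = Callable[[Sequence[float]], Vector]


def mean_vector(vectors: Sequence[Sequence[float]]) -> Vector:
    """Average a non-empty sequence of equal-length vectors, dimension by dimension."""
    if not vectors:
        raise ValueError("need at least one token vector")
    n = len(vectors)
    return [sum(dim) / n for dim in zip(*vectors)]


def encode_per_token(tokens: Sequence[Sequence[float]], encoder: Encoder) -> Vector:
    """Run the encoder once per token, then average the outputs (assumed old behavior)."""
    return mean_vector([encoder(t) for t in tokens])


def encode_mean(tokens: Sequence[Sequence[float]], encoder: Encoder) -> Vector:
    """Average the token vectors first, then run the encoder once (mean encoding)."""
    return encoder(mean_vector(tokens))


def toy_encoder(v: Sequence[float]) -> Vector:
    # Hypothetical stand-in for the real encoder: an elementwise ReLU.
    return [max(0.0, x) for x in v]


# Two toy "token embeddings" of dimension 2.
tokens = [[1.0, -2.0], [3.0, 4.0]]

per_token = encode_per_token(tokens, toy_encoder)
mean_first = encode_mean(tokens, toy_encoder)

print(per_token)   # [2.0, 2.0]
print(mean_first)  # [2.0, 1.0]
```

Because the toy encoder is nonlinear, the two orders give different results, and the mean-first path calls the encoder once per sequence rather than once per token.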

This PR also fixes some typing issues.
