Llama.cpp tester #18144
Replies: 1 comment
I get a 504 when trying to view the results. But this is definitely useful for catching regressions, and it would be nice if something like this were part of the CI. It could even run nightly and verify a batch of commits from master rather than run on every commit. Currently the paths that are not tested AFAIK
All of these have a good user base (judging from the issues created when one of them breaks), and it would be nice to have a test suite that covers at least these. I realize this has a monetary cost, but considering the user/developer time saved by catching regressions early, the benefit would be significant IMO. I think the entire suite can be run in <1 GPU hour, which should cost less than $1.
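One possible reading of the nightly idea above, as a rough sketch rather than a proposal for the actual CI: list the commits that landed on master since the last tested one, then test only an evenly spaced subset that always includes the newest commit. The function names, the `origin/master` default, and the `max_runs` budget are hypothetical; the only external call is the standard `git rev-list --reverse A..B`.

```python
import subprocess


def commits_since(last_tested: str, branch: str = "origin/master", repo: str = ".") -> list[str]:
    """Return commits reachable from `branch` but not from `last_tested`, oldest first."""
    out = subprocess.run(
        ["git", "-C", repo, "rev-list", "--reverse", f"{last_tested}..{branch}"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return out.split()


def pick_batch(commits: list[str], max_runs: int) -> list[str]:
    """Pick at most `max_runs` commits, evenly spaced, always including the newest.

    `commits` is expected oldest first, as returned by `commits_since`.
    """
    if max_runs <= 0 or not commits:
        return []
    if len(commits) <= max_runs:
        return list(commits)
    n = len(commits)
    step = n / max_runs  # > 1 here, so the chosen indices are distinct
    # Walk backwards from the newest commit in equal strides.
    indices = sorted(n - 1 - int(i * step) for i in range(max_runs))
    return [commits[i] for i in indices]
```

If a nightly run fails on one of the picked commits, the skipped commits between it and the previous passing pick form the range to narrow down, for example with `git bisect`.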
I want to share a little experiment I'm doing. I've got a script running on a change-frozen VM that
https://llama-tester-results.surge.sh/#/models
Models were picked based on their ability to fit in VRAM, but I'm open to suggestions.
It has been running for 24h, and the highlights so far are:
https://llama-tester-results.surge.sh/#/model/unsloth-LFM2-8B-A1B-GGUF-Q8_0/prompt/p0
https://llama-tester-results.surge.sh/#/model/ggml-org-gpt-oss-20b-GGUF/prompt/p0
https://llama-tester-results.surge.sh/#/model/unsloth-granite-4.0-h-tiny-GGUF-q8_0/prompt/p0
https://llama-tester-results.surge.sh/#/model/unsloth-granite-4.0-h-tiny-GGUF-q8_0/prompt/p1