This table shows evaluation benchmark results for the Llama3 3B model trained on 1T tokens, with one column per checkpoint from 100k to 1000k.
| Metric | llama3_3b_1t_100k | llama3_3b_1t_200k | llama3_3b_1t_300k | llama3_3b_1t_400k | llama3_3b_1t_500k | llama3_3b_1t_600k | llama3_3b_1t_700k | llama3_3b_1t_800k | llama3_3b_1t_900k | llama3_3b_1t_1000k |
|---|---|---|---|---|---|---|---|---|---|---|
| winogrande | 0.606156 | 0.599053 | 0.610103 | 0.617995 | 0.617995 | 0.636148 | 0.633781 | 0.639305 | 0.647987 | 0.657459 |
| truthfulqa_mc2 | 0.346594 | 0.366435 | 0.404596 | 0.369011 | 0.426658 | 0.364889 | 0.394947 | 0.388509 | 0.350430 | 0.393013 |
| social_iqa | 0.325998 | 0.327533 | 0.329069 | 0.329069 | 0.328045 | 0.325486 | 0.329580 | 0.333675 | 0.320880 | 0.318321 |
| sciq | 0.860 | 0.878 | 0.854 | 0.864 | 0.866 | 0.882 | 0.875 | 0.901 | 0.884 | 0.889 |
| piqa | 0.731774 | 0.738303 | 0.751360 | 0.749184 | 0.764418 | 0.756801 | 0.766594 | 0.771491 | 0.772579 | 0.762242 |
| openbookqa | 0.410 | 0.410 | 0.418 | 0.408 | 0.430 | 0.426 | 0.448 | 0.450 | 0.456 | 0.438 |
| lambada | 23.339747 | 18.346882 | 14.713008 | 14.887917 | 13.840738 | 12.946245 | 11.497174 | 11.167095 | 10.157396 | 11.664451 |
| lambada_openai | 15.943495 | 14.2658622 | 12.675483 | 12.241740 | 10.795581 | 10.252835 | 9.736051 | 8.705719 | 8.582805 | 9.018882 |
| lambada_standard | 30.735998 | 22.427902 | 16.750534 | 17.534093 | 16.885896 | 15.639655 | 13.258297 | 13.628472 | 11.731986 | 14.310020 |
| hellaswag | 0.588229 | 0.608146 | 0.621291 | 0.625075 | 0.638120 | 0.648277 | 0.659928 | 0.671878 | 0.677455 | 0.677156 |
| copa | 0.77 | 0.76 | 0.78 | 0.78 | 0.76 | 0.82 | 0.80 | 0.82 | 0.83 | 0.82 |
| boolq | 0.633639 | 0.582263 | 0.575535 | 0.652294 | 0.612538 | 0.682263 | 0.664526 | 0.665138 | 0.681651 | 0.694190 |
| arc_easy | 0.689815 | 0.697811 | 0.699074 | 0.688131 | 0.697811 | 0.726431 | 0.727273 | 0.737374 | 0.753367 | 0.730219 |
| arc_challenge | 0.401024 | 0.403584 | 0.421502 | 0.408703 | 0.430887 | 0.460751 | 0.441980 | 0.467577 | 0.488908 | 0.461604 |
| mmlu | 0.257656 | 0.247828 | 0.261003 | 0.258154 | 0.270047 | 0.243341 | 0.273821 | 0.261216 | 0.257157 | 0.293263 |
Trend (plot format):
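A minimal sketch of how the per-checkpoint trend could be summarized from the table above. It copies a few rows verbatim and reports the first value, last value, change, and best checkpoint for each metric. The names `TABLE_ROWS`, `parse_rows`, and `summarize` are hypothetical helpers, not part of any evaluation tooling. Treating the `lambada*` rows as lower-is-better (perplexity-like) is an assumption based on their magnitude, not something the table states.

```python
from typing import Dict, List

# Checkpoint labels, taken from the suffix of each column name in the table.
CHECKPOINTS: List[str] = [f"{n}k" for n in range(100, 1100, 100)]

# A few rows copied verbatim from the table above; add more rows as needed.
TABLE_ROWS = """\
| winogrande | 0.606156 | 0.599053 | 0.610103 | 0.617995 | 0.617995 | 0.636148 | 0.633781 | 0.639305 | 0.647987 | 0.657459 |
| lambada_openai | 15.943495 | 14.2658622 | 12.675483 | 12.241740 | 10.795581 | 10.252835 | 9.736051 | 8.705719 | 8.582805 | 9.018882 |
| hellaswag | 0.588229 | 0.608146 | 0.621291 | 0.625075 | 0.638120 | 0.648277 | 0.659928 | 0.671878 | 0.677455 | 0.677156 |
| mmlu | 0.257656 | 0.247828 | 0.261003 | 0.258154 | 0.270047 | 0.243341 | 0.273821 | 0.261216 | 0.257157 | 0.293263 |
"""


def parse_rows(markdown: str) -> Dict[str, List[float]]:
    """Map each metric name to its list of per-checkpoint values."""
    rows: Dict[str, List[float]] = {}
    for line in markdown.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        rows[cells[0]] = [float(cell) for cell in cells[1:]]
    return rows


def summarize(values: List[float], lower_is_better: bool = False) -> dict:
    """Return first/last values, their difference, and the best checkpoint."""
    pick = min if lower_is_better else max
    best_index = pick(range(len(values)), key=values.__getitem__)
    return {
        "first": values[0],
        "last": values[-1],
        "delta": values[-1] - values[0],
        "best_checkpoint": CHECKPOINTS[best_index],
    }


if __name__ == "__main__":
    for metric, values in parse_rows(TABLE_ROWS).items():
        # Assumption: lambada rows are perplexity-like, so lower is better.
        summary = summarize(values, lower_is_better=metric.startswith("lambada"))
        print(metric, summary)
```

The dictionary returned by `parse_rows` can also be passed to a plotting library, with `CHECKPOINTS` on the x-axis, to draw one line per metric.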