diff --git a/README.md b/README.md
index 7924ff7..854dea9 100644
--- a/README.md
+++ b/README.md
@@ -25,11 +25,9 @@ issue with your env:

 ```bash
 git clone https://github.com/Dao-AILab/causal-conv1d.git
-cd causal-conv1d && pip install . && cd ..
+cd causal-conv1d && FORCE_BUILD=TRUE pip install . && cd ..
 git clone https://github.com/state-spaces/mamba.git
-cd mamba && pip install . && cd ..
-git clone https://github.com/Dao-AILab/flash-attention.git
-cd flash-attention && pip install . && cd ..
+cd mamba && MAMBA_FORCE_BUILD=TRUE pip install . && cd ..
 ```

 For users using our HF versions of the model, you would need to install the latest transformers which includes our newly merged implementation for our Bamba models:
@@ -72,123 +70,104 @@ For exact reproduction of Bamba 9.8B using the same training data, access is ava

 Benchmark

-Bamba 9B (2.2T)
+Bamba 9B (3.1T)

-General
+General

-MMLU (5-shot)
+MMLU

-60.77
+67.92

-ARC-C (25-shot)
+ARC-C

-63.23
+63.57

-GSM8K (5-shot)
+GSM8K

-36.77
+41.70

-Hellaswag (10-shot)
+Hellaswag

-81.8
+83.85

-OpenbookQA (5-shot)
+OpenbookQA

-47.6
+51.0

 Piqa (5-shot)

-82.26
+83.62

 TruthfulQA (0-shot)

-49.21
+50.86

 Winogrande (5-shot)

-76.87
+79.48

-HF OpenLLM- V2*
-
-MMLU-PRO (5-shot)
+Boolq

-17.53
+82.78

-BBH (3-shot)
-
-17.4
-
-
-
-GPQA (0-shot)
-
-4.14
+HF OpenLLM- V2*
-
-
-IFEval (0-shot)
+MMLU-PRO

-15.16
+25.41

-MATH Lvl 5 (4-shot)
+BBH

-1.66
+24.78

-MuSR (0-shot)
+GPQA

-9.59
+5.93

-Safety Tasks
-
-PopQA (5-shot)
+IFEval

-20.5
+19.00

-Toxigen (5-shot)
+MATH Lvl 5

-57.4
+6.42

-BBQ (5-shot)
+MuSR

-44.2
+9.28

-Crows-pairs english (5-shot)
-
-70.78
-
-
 *For the v2 leaderboard results, we perform [normalization](https://huggingface.co/docs/leaderboards/open_llm_leaderboard/normalization) and report the normalized results.
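
For anyone applying this patch by hand, the post-patch install flow would look roughly like the sketch below. The `FORCE_BUILD=TRUE` and `MAMBA_FORCE_BUILD=TRUE` variables are taken directly from the diff; the final `transformers` upgrade command is an assumption based on the README sentence quoted in the hunk context, not part of the patch itself.

```bash
# Sketch of the install flow after this patch is applied.

# Build causal-conv1d from source (the patch forces a local build
# instead of a prebuilt wheel):
git clone https://github.com/Dao-AILab/causal-conv1d.git
cd causal-conv1d && FORCE_BUILD=TRUE pip install . && cd ..

# Build mamba from source with its own force-build flag:
git clone https://github.com/state-spaces/mamba.git
cd mamba && MAMBA_FORCE_BUILD=TRUE pip install . && cd ..

# For the HF versions of the model, pull the latest transformers
# (assumed command; the README only says to install the latest transformers):
pip install --upgrade transformers
```

Note that the patch also drops the flash-attention source build, so it no longer appears in this sequence.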
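On the normalization footnote: per the linked Hugging Face documentation, v2 leaderboard scores are rescaled against each task's random-guess baseline before averaging. As a sketch (the symbols $s$ and $b$ are mine, not from the README), a raw score $s$ in percent on a task with random baseline $b$ becomes

$$ s_{\text{norm}} = \frac{s - b}{100 - b} \times 100, $$

so a score at the baseline maps to 0 and a perfect score maps to 100.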