[Speculator training] Update benchmark_speculator_logical.py to support gpt_bigcode/granite #62
base: specu-benchmark
Conversation
Thanks @sahilsuneja1 for putting this together! I don't think we need the caller script - it's ultimately just a simple python call with a bunch of arguments, right? I'd just add a comment with a sample call or two (Llama 7b / granite 20b?) to the top of the script itself.
# This example script measures the logical speedup of running a speculator atop a base model. Run as:
# export CUDA_VISIBLE_DEVICES=1
# e.g., #1: torchrun --nproc_per_node=1 benchmark_speculator_logical.py --architecture=paged_llama --variant=7b --model_path=~/models/7B-F --tokenizer=~/models/tokenizer.model --model_source=hf --speculator_path=~/models/speculator_7B_F.pth --compile
# e.g., #2: torchrun --nproc_per_node=1 benchmark_speculator_logical.py --architecture=paged_gpt_bigcode --variant=ibm.20b --model_path=~/models/granite-20b-instruct --tokenizer=~/models/granite-20b-instruct --model_source=hf --speculator_path=~/models/speculator_granite20B.pth --n_predict=4 --threshes=[6,4,3,3]
Needs a --data_path and --subdata!
Thanks, fixed with fake values
Looks good!
(Black formatter is requesting changes though)
Updated formatting
Further CI complaints. I'll try and figure out how to get them running automatically for you rather than having to wait for my explicit go-ahead.
Fixed isort issues
Dunno how to fix the mypy errors
There are some suggestions here on how to fix some of these, possibly via a fully qualified import? @afrittoli may also have suggestions. Pasting the mypy errors here, for easier reference:
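As an illustrative sketch of the fully qualified import fix suggested above (using a stdlib module here, since the actual failing imports aren't shown): mypy can only resolve an attribute chain like `xml.etree.ElementTree` when the submodule is imported explicitly.

```python
# A bare "import xml" binds only the top-level package; mypy (and, in a fresh
# interpreter, the runtime too) cannot resolve xml.etree.ElementTree through it.
# Importing the fully qualified submodule fixes that class of error.
import xml.etree.ElementTree

root = xml.etree.ElementTree.fromstring("<model name='granite'/>")
print(root.get("name"))  # granite
```

The same pattern applies to project-local packages: replace `import pkg` plus a deep attribute access with `import pkg.sub.module` (or `from pkg.sub import module`).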
Looks like a bunch of that is because this is relying on the paged attention branch, which hasn't fully landed in
We can either point directly at the GitHub URL within requirements.txt (see some examples here: https://stackoverflow.com/questions/16584552/how-to-state-in-requirements-txt-a-direct-github-source) or we could publish fms-extras to PyPI. Probably the first option is simpler/easier.
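For reference, a direct git source in requirements.txt would look something like the line below (PEP 508 direct-reference syntax). The repository URL and branch here are placeholders, not confirmed by this thread:

```
# requirements.txt: install fms-extras straight from a git ref until it is on PyPI
# NOTE: repository URL and ref below are assumed; adjust to the real repo/branch
fms-extras @ git+https://github.com/foundation-model-stack/fms-extras.git@main
```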
@daviswer: should we also add the caller script?