
Commit e5a9b63

ericyangpan and claude committed
docs: add model benchmark and specification reference documentation
Add comprehensive reference documentation for model benchmarks and specifications to guide manifest creation and updates.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
1 parent 2f86bf9 commit e5a9b63


2 files changed: +30 -0 lines changed


docs/MODEL_BENCHMARK_REFERENCE.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+SWE-bench:https://www.swebench.com
+Terminal-Bench:https://www.tbench.ai/leaderboard/terminal-bench/2.0
+MMMU:https://mmmu-benchmark.github.io/#leaderboard
+MMMU-Pro:https://mmmu-benchmark.github.io/#leaderboard
+WebDev Arena:https://web.lmarena.ai/leaderboard
+SciCode:https://scicode-bench.github.io/leaderboard/
+LiveCodeBench:https://livecodebench.github.io/leaderboard.html

docs/MODEL_SPEC_REFERENCE.md

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+# Model Spec Reference
+
+## Foundation Model Providers
+
+| Provider | Model List | Model Detail |
+|----------|------------|-------------|
+| Alibaba Cloud | https://help.aliyun.com/zh/dashscope/developer-reference/tongyiqianwen-large-language-models | https://help.aliyun.com/zh/dashscope/developer-reference/tongyiqianwen-large-language-models#`{modelId}` |
+| Anthropic | https://platform.claude.com/docs/en/about-claude/models/overview | |
+| DeepSeek | https://platform.deepseek.com/api-docs/zh-cn/ | https://platform.deepseek.com/api-docs/zh-cn/ |
+| Google AI | https://docs.cloud.google.com/vertex-ai/generative-ai/docs/models | https://docs.cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/3-pro |
+| Meta AI | https://llama.meta.com/docs/models/ | https://llama.meta.com/docs/models/`{modelId}` |
+| MiniMax | https://platform.minimaxi.com/document/LLM/introduction | https://platform.minimaxi.com/document/LLM/introduction |
+| Moonshot | https://platform.moonshot.ai/docs | - |
+| OpenAI | https://platform.openai.com/docs/models | https://platform.openai.com/docs/models/gpt-5.2 |
+| xAI | https://docs.x.ai/docs/models | https://docs.x.ai/docs |
+| Z.ai | https://docs.z.ai | - |
+
+## Model Service Providers
+
+| Provider | Model List | Model Detail |
+|----------|------------|-------------|
+| OpenRouter | https://openrouter.ai/models | https://openrouter.ai/models/`{modelId}` |
+| SiliconFlow | https://docs.siliconflow.cn/Models | - |
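
Several Model Detail entries above end in a `{modelId}` placeholder. A minimal sketch of how such a template might be expanded is shown here. It is not part of this commit: the helper name `detail_url`, the constant, and the sample id are hypothetical, and the assumption that model ids are slash-separated path segments is not confirmed by the diff.

```python
from urllib.parse import quote

# Template copied from the OpenRouter row of the table; the backticks that
# wrap {modelId} in the Markdown source are dropped here.
OPENROUTER_DETAIL = "https://openrouter.ai/models/{modelId}"


def detail_url(template: str, model_id: str) -> str:
    """Substitute a model id into a Model Detail URL template (hypothetical helper)."""
    # Percent-encode the id but keep "/" so vendor/model style ids stay readable.
    # That ids have this shape is an assumption, not something the table states.
    return template.replace("{modelId}", quote(model_id, safe="/"))


# The id below is a made-up placeholder, not a real model.
print(detail_url(OPENROUTER_DETAIL, "example-vendor/example-model"))
```

Rows whose Model Detail cell is `-` or empty have no template, so a caller would fall back to the Model List URL for those providers.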
