Llama/Qwen/DeepSeek Open-Source Rivalry: CLiB Open-Source LLM Leaderboard 03.04

渣渣兔 · 2025-4-22 20:21:14

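In the current open-source LLM race, Llama, Qwen, and DeepSeek have formed a three-way standoff. Llama was once the benchmark of the open-source field, but its position has been challenged by the rise of Qwen and DeepSeek. With a wide range of open-source model sizes and strong performance, Qwen has overtaken Llama in open-source communities such as Hugging Face and become the new benchmark. DeepSeek, for its part, uses distillation onto base models such as Qwen to build high-performance models quickly, pushing the open-source ecosystem forward.
Which one comes out ahead? Let's go straight to the leaderboard!
Evaluation dimensions: healthcare, education, law, administration and public affairs, reasoning and mathematical computation, and language and instruction following.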

Rank	Model	Organization	Output price (CNY per 1M tokens)	Total score
1	DeepSeek-R1	DeepSeek	16.0	87.34
2	qwq-32b-preview	Alibaba	7.0	77.85
3	DeepSeek-R1-Distill-Qwen-32B	DeepSeek	1.3	77.49
4	qwen2.5-72b-instruct	Alibaba	12.0	76.89
5	qwen2.5-32b-instruct	Alibaba	7.0	75.85
6	deepseek-chat-v3	DeepSeek	8.0	75.03
7	qwen2.5-14b-instruct	Alibaba	6.0	72.77
8	DeepSeek-R1-Distill-Qwen-14B	DeepSeek	0.7	72.77
9	DeepSeek-R1-Distill-Llama-70B	DeepSeek	4.1	71.37
10	internlm2_5-20b-chat	Shanghai AI Laboratory	1.0	70.20
11	Meta-Llama-3.1-405B-Instruct	Meta	21.0	69.55
12	qwen2.5-7b-instruct	Alibaba	2.0	69.11
13	internlm2_5-7b-chat	Shanghai AI Laboratory	0.4	68.05
14	Llama-3.3-70B-Instruct	Meta	4.1	67.86
15	glm-4-9b-chat	Zhipu AI	0.6	67.12
16	qwen2.5-math-72b-instruct	Alibaba	12.0	67.03
17	Llama-3.3-70B-Instruct-fp8	Meta	2.2	66.86
18	Llama-3.1-Nemotron-70B-Instruct-fp8	NVIDIA	2.2	66.67
19	Yi-1.5-34B-Chat	01.AI	1.3	66.64
20	Hermes-3-Llama-3.1-405B	NousResearch	5.8	65.65
21	phi-4	Microsoft	1.0	62.92
22	qwen2.5-3b-instruct	Alibaba	0.0	58.64
23	Yi-1.5-9B-Chat	01.AI	0.4	58.56
24	gemma-2-27b-it	Google	1.3	57.89
25	gemma-2-9b-it	Google	0.6	55.41
26	Llama-3.1-8B-Instruct	Meta	0.4	53.03
27	DeepSeek-R1-Distill-Qwen-7B	DeepSeek	0.4	52.42
28	DeepSeek-R1-Distill-Llama-8B	DeepSeek	0.4	52.35
29	Mistral-Nemo-Instruct-2407	Mistral	0.6	52.24
30	Meta-Llama-3.1-8B-Instruct-fp8	Meta	0.4	51.39
31	qwen2.5-1.5b-instruct	Alibaba	0.0	49.03
32	Llama-3.2-3B-Instruct	Meta	0.2	46.76
33	Mistral-7B-Instruct-v0.3	Mistral	0.4	42.19
34	DeepSeek-R1-Distill-Qwen-1.5B	DeepSeek	0.1	40.43
35	qwen2.5-0.5b-instruct	Alibaba	0.0	37.89
36	Llama-3.2-1B-Instruct	Meta	0.2	36.59
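
Since the table lists both output price and total score, one way to read it is to ask which model scores highest within a given price budget. Below is a minimal Python sketch of that lookup, not part of the CLiB benchmark itself. The names `Entry`, `parse`, and `best_under_budget` are made up for illustration, and `SAMPLE` holds only a handful of rows copied from the table (institution names written in English); paste in the full table to query all 36 models.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    rank: int
    model: str
    org: str
    price: float  # output price, CNY per 1M tokens
    score: float  # total score


# A few rows copied from the leaderboard, tab-separated like the table.
SAMPLE = (
    "1\tDeepSeek-R1\tDeepSeek\t16.0\t87.34\n"
    "2\tqwq-32b-preview\tAlibaba\t7.0\t77.85\n"
    "3\tDeepSeek-R1-Distill-Qwen-32B\tDeepSeek\t1.3\t77.49\n"
    "4\tqwen2.5-72b-instruct\tAlibaba\t12.0\t76.89\n"
    "8\tDeepSeek-R1-Distill-Qwen-14B\tDeepSeek\t0.7\t72.77\n"
    "10\tinternlm2_5-20b-chat\tShanghai AI Laboratory\t1.0\t70.20\n"
    "11\tMeta-Llama-3.1-405B-Instruct\tMeta\t21.0\t69.55\n"
)


def parse(text: str) -> list[Entry]:
    """Turn tab-separated leaderboard rows (no header line) into Entry records."""
    entries = []
    for line in text.strip().splitlines():
        rank, model, org, price, score = line.split("\t")
        entries.append(Entry(int(rank), model, org, float(price), float(score)))
    return entries


def best_under_budget(entries: list[Entry], max_price: float) -> Optional[Entry]:
    """Return the highest-scoring entry whose price is at most max_price, or None."""
    affordable = [e for e in entries if e.price <= max_price]
    return max(affordable, key=lambda e: e.score) if affordable else None


entries = parse(SAMPLE)
for budget in (1.0, 2.0, 10.0):
    best = best_under_budget(entries, budget)
    print(budget, best.model if best else None)
```

With these sample rows, a budget of 2.0 CNY per 1M tokens picks DeepSeek-R1-Distill-Qwen-32B, and a budget of 1.0 picks DeepSeek-R1-Distill-Qwen-14B.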

Full evaluation results for each subdomain are available at: https://github.com/jeinlee1991/chinese-llm-benchmark

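EasyLLM's LLM evaluation has so far covered DeepSeek and other large models comprehensively across different capability dimensions (for details, see the links below)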
