测试基准 | - | - | - | - | 差值 |
测试基准名 | 指标 | 指标值 - int8 | 指标值 - fp16 | 标准差 - fp16 | - |
hellaswag | acc_norm | 0.7849 | 0.7849 | 0.0041 | 0 |
hellaswag | acc | 0.5921 | 0.5931 | 0.0049 | 0.001 |
piqa | acc | 0.7965 | 0.7959 | 0.0094 | 0.0006 |
piqa | acc_norm | 0.8101 | 0.8107 | 0.0091 | 0.0006 |
lambada | ppl | 3.0142 | 3.0152 | 0.0552 | 0.001 |
lambada | acc | 0.7464 | 0.7466 | 0.0061 | 0.0002 |
winogrande | acc | 0.7174 | 0.7245 | 0.0125 | 0.0071 |
测试基准 | - | - | - | - | 差值 |
测试基准名 | 指标 | 指标值 - int8 | 指标值 - fp16 | 标准差 - fp16 | - |
hellaswag | acc_norm | 0.7274 | 0.7303 | 0.0044 | 0.0029 |
hellaswag | acc | 0.5563 | 0.5584 | 0.005 | 0.0021 |
piqa | acc | 0.7835 | 0.7884 | 0.0095 | 0.0049 |
piqa | acc_norm | 0.7922 | 0.7911 | 0.0095 | 0.0011 |
lambada | ppl | 3.9191 | 3.931 | 0.0846 | 0.0119 |
lambada | acc | 0.6808 | 0.6718 | 0.0065 | 0.009 |
winogrande | acc | 0.7048 | 0.7048 | 0.0128 | 0 |
精度 | 参数量 | 硬件 | 每词元延迟 (单位: 毫秒,batch size: 1) | 每词元延迟 (单位: 毫秒,batch size: 8) | 每词元延迟 (单位: 毫秒,batch size: 32) |
bf16 | 176B | 8xA100 80GB | 239 | 32 | 9.9 |
int8 | 176B | 4xA100 80GB | 282 | 37.5 | 10.2 |
bf16 | 176B | 14xA100 40GB | 285 | 36.5 | 10.4 |
int8 | 176B | 5xA100 40GB | 367 | 46.4 | oom |
fp16 | 11B | 2xT4 15GB | 11.7 | 1.7 | 0.5 |
int8 | 11B | 1xT4 15GB | 43.5 | 5.3 | 1.3 |
fp32 | 3B | 2xT4 15GB | 45 | 7.2 | 3.1 |
int8 | 3B | 1xT4 15GB | 312 | 39.1 | 10.2 |
欢迎光临 ToB企服应用市场:ToB评测及商务社交产业平台 (https://dis.qidao123.com/) | Powered by Discuz! X3.4 |