Fine-tuning DeepSeek-R1-Distill-Qwen-1.5B with unsloth in LLaMA-Factory

Posted by 尚未崩坏 on 2025-4-23 04:53:08


Development environment

Operating system: Windows 11 Home, Chinese edition (64-bit)
CPU: AMD 9950X
Motherboard: ASUS PRIME X870-P WIFI
Memory: 32GB (4800 MHz)
Main drive: ZhiTai TiPlus7100 1TB
GPU: NVIDIA GeForce RTX 4060
DeepSeek-R1-Distill-Qwen-1.5B
LLaMA-Factory
Anaconda3-2024.10-1-Windows-x86_64.exe
cuda_12.8.0_571.96_windows.exe
NVIDIA_app_v11.0.2.312.exe
torch-2.6.0+cu126-cp312-cp312-win_amd64.whl
triton-3.2.0-cp312-cp312-win_amd64.whl
Visual Studio 2022
VC_redist.x64.exe
Development process

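[*]Install Visual Studio 2022 and VC_redist.x64.exe, then add the directories containing cl.exe to the system PATH environment variable. This is needed mainly because the PTX or Triton compilation used for DeepSeek calls the compiler.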
D:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\bin\Hostx64\x86
D:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.42.34433\bin\Hostx64\x64
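
After editing the environment variable, it can help to confirm from a fresh terminal that the compiler is actually visible. The following is a minimal sketch using only the Python standard library; the helper name `find_compiler` is hypothetical and not part of any tool mentioned in this post.

```python
import shutil


def find_compiler(name="cl"):
    """Return the full path of the executable found on PATH, or None if absent."""
    # On Windows, shutil.which also tries the extensions listed in PATHEXT (e.g. .exe)
    return shutil.which(name)


if __name__ == "__main__":
    path = find_compiler()
    if path is None:
        print("cl.exe was not found on PATH; re-check the environment variable")
    else:
        print(f"cl.exe found at: {path}")
```

If the check reports nothing found, open a new terminal first, since an already open window keeps the old PATH.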

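[*]Install Anaconda and create a virtual environment named llama_factory. Install torch and triton from .whl files, because torch is large and slow to download through pip, and pip cannot install triton directly on Windows. For everything else, pip install whatever turns out to be missing at runtime.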
conda create --name llama_factory python=3.12
conda activate llama_factory
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
pip install .\torch-2.6.0+cu126-cp312-cp312-win_amd64.whl
pip install .\triton-3.2.0-cp312-cp312-win_amd64.whl
pip install unsloth
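
Since the remaining packages are installed on demand, a quick look at what is already present in the active environment can save a failed run. This is a minimal sketch based on the standard `importlib.metadata` module; the helper name `installed_versions` and the package list are illustrative assumptions.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the install commands above
    for name, version in installed_versions(["torch", "triton", "unsloth"]).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run it inside the activated llama_factory environment; the reported versions can then be compared with the wheel file names listed in the environment section.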

[*]Install LLaMA-Factory inside the llama_factory virtual environment
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
conda activate llama_factory
cd LLaMA-Factory
pip install -e "."
llamafactory-cli webui

[*]Convert the Parquet-format CoT dataset from the ModelScope community into LLaMA-Factory's alpaca format, and declare it in dataset_info.json
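import json

import pandas as pd

# Load one shard of the Parquet dataset
df = pd.read_parquet('D:\\DeepSeek-R1-Distill-Qwen-1.5B\\Magpie-Reasoning-V1-150K-CoT-Deepseek-R1-Llama-70B\\train-00000-of-00006.parquet')
print(df.tail())

# List that collects the converted records
alpaca_data = []
# Only the first 3000 rows are used for fine-tuning
for index, row in df.head(3000).iterrows():
    # One alpaca-format record per row
    data_point = {
        "instruction": row['instruction'],
        "input": "",
        "output": row['response']
    }
    print(data_point)
    alpaca_data.append(data_point)

# Inspect the last converted record
print(alpaca_data[-1])

# Serialize the list to a JSON string (ensure_ascii=False keeps non-ASCII text readable)
alpaca_json = json.dumps(alpaca_data, indent=4, ensure_ascii=False)

# Save to file, with an explicit encoding so Windows does not fall back to the locale code page
with open('D:/DeepSeek-R1-Distill-Qwen-1.5B/LLaMA-Factory/data/MRPC_train_data.json', 'w', encoding='utf-8') as f:
    f.write(alpaca_json)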
https://i-blog.csdnimg.cn/direct/b6d934b815e94474b90b463ffea12881.png
https://i-blog.csdnimg.cn/direct/4966d3192f82480bbb028b4870506adf.png
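
The declaration in dataset_info.json can also be written programmatically. The sketch below assumes the alpaca column mapping documented by LLaMA-Factory (`prompt`, `query`, `response`). The helper name `register_dataset` and any dataset key passed to it are hypothetical; only the file name MRPC_train_data.json comes from the conversion script.

```python
import json
from pathlib import Path


def register_dataset(info_path, name, file_name):
    """Add (or overwrite) an alpaca-format entry in dataset_info.json."""
    info_path = Path(info_path)
    # Start from the existing registry if there is one, so other entries are kept
    if info_path.exists():
        info = json.loads(info_path.read_text(encoding="utf-8"))
    else:
        info = {}
    info[name] = {
        "file_name": file_name,
        "columns": {
            "prompt": "instruction",
            "query": "input",
            "response": "output",
        },
    }
    info_path.write_text(
        json.dumps(info, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return info[name]
```

For example, calling `register_dataset(".../LLaMA-Factory/data/dataset_info.json", "my_cot_dataset", "MRPC_train_data.json")` would make the file selectable in the web UI under the illustrative key `my_cot_dataset`.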

[*]Version compatibility
==((====))==  Unsloth 2025.2.15: Fast Qwen2 patching. Transformers: 4.48.3.
   \\   /|    GPU: NVIDIA GeForce RTX 4060. Max memory: 7.996 GB. Platform: Windows.
O^O/ \_/ \    Torch: 2.6.0+cu126. CUDA: 8.9. CUDA Toolkit: 12.6. Triton: 3.2.0
\        /    Bfloat16 = TRUE. FA
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
https://i-blog.csdnimg.cn/direct/e43d30567397400abb0e82324a8810ea.png

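[*]Select unsloth as the acceleration method, choose the CoT dataset, and set the cutoff length to its maximum. With these settings, fine-tuning on 3000 CoT samples takes roughly 3 hours.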
https://i-blog.csdnimg.cn/direct/f9d7fe2cc0f4480bb14d3a98ace2a2b8.png
https://i-blog.csdnimg.cn/direct/399ed131cec6418fae53471e1e5a3474.jpeg#pic_center
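
The same web UI choices can be captured in a config file for reproducibility. The sketch below only assembles argument names that LLaMA-Factory defines (`stage`, `finetuning_type`, `cutoff_len`, `use_unsloth`, and so on). It is an assumption about how the screenshots map to arguments, not the author's exact configuration, and the helper names are hypothetical. The template name and cutoff length are left as parameters because the post does not state them.

```python
import json


def build_train_config(model_path, dataset, template, cutoff_len, output_dir):
    """Assemble a LoRA SFT argument dict with unsloth acceleration enabled."""
    return {
        "stage": "sft",
        "do_train": True,
        "model_name_or_path": model_path,
        "dataset": dataset,            # key declared in dataset_info.json
        "template": template,          # use the template the web UI shows for the model
        "finetuning_type": "lora",
        "cutoff_len": cutoff_len,      # truncation length in tokens
        "use_unsloth": True,
        "output_dir": output_dir,
    }


def save_train_config(config, path):
    """Write the argument dict to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
```

Recent LLaMA-Factory versions should accept such a file through `llamafactory-cli train <config file>`, but it is worth checking this against the installed version before relying on it.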
