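Preface
This article uses LLaMA-Factory as an example to run LoRA fine-tuning, merging, and inference for the Llama3-8B-Instruct model on the SCNet supercomputing internet platform, using the "heterogeneous accelerator card AI, 64 GB memory, PCIe" instance type.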
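I. References
GitHub repository: LLaMA-Factory, using the latest code branch: v0.8.3.
Supercomputing internet platform (SCNet)
Heterogeneous accelerator card AI, 64 GB memory, PCIe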
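II. Preparing the environment
1. System image
The heterogeneous accelerator card AI is a domestic Chinese accelerator (DCU) built on the DTK software stack (the counterpart of NVIDIA's CUDA). Choose an image based on dtk24.04.
This article uses the jupyterlab-pytorch:2.1.0-ubuntu20.04-dtk24.04-py310 image as an example.
2. Software and hardware dependencies
Important: according to requirements.txt, the LLaMA-Factory project requires at least transformers>=4.41.2 and vllm>=0.4.3.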
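The minimum versions quoted above can be checked against what the image actually ships before anything is installed. The following is a minimal sketch, not part of LLaMA-Factory: the helper names (release_tuple, meets_minimum, report) are hypothetical, and the comparison only looks at the leading numeric release, ignoring local build tags such as the "+git...dtk2404" suffix used by DTK builds.

```python
import re
from importlib import metadata

# Minimums quoted in the text above (taken from requirements.txt).
MINIMUMS = {"transformers": "4.41.2", "vllm": "0.4.3"}


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release of a version string as a tuple.

    Local build tags ('+git3380931.abi0.dtk2404.torch2.1') and suffixes
    such as '.dev0' are ignored.
    """
    public = version.split("+", 1)[0]
    parts = []
    for piece in public.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two versions by numeric release, padding with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def report(minimums=MINIMUMS) -> dict:
    """Map each package to (installed version or None, requirement met?)."""
    result = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            result[name] = (None, False)
        else:
            result[name] = (installed, meets_minimum(installed, minimum))
    return result
```

Calling print(report()) inside the target conda environment lists each package with its installed version and whether the minimum is met. For stricter ordering rules (pre-releases and so on), the packaging library's Version class is the more robust choice.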
Required       Minimum   Recommended
python         3.8       3.11
torch          1.13.1    2.3.0
transformers   4.41.2    4.41.2
datasets       2.16.0    2.19.2
accelerate     0.30.1    0.30.1
peft           0.11.1    0.11.1
trl            0.8.6     0.9.4

Optional       Minimum   Recommended
CUDA           11.6      12.2
deepspeed      0.10.0    0.14.0
bitsandbytes   0.39.0    0.43.1
vllm           0.4.3     0.4.3
flash-attn     2.3.0     2.5.9

3. Clone the base environment
- root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# conda create -n llama_factory_torch --clone base
- Source: /opt/conda
- Destination: /opt/conda/envs/llama_factory_torch
- The following packages cannot be cloned out of the root environment:
- - https://repo.anaconda.com/pkgs/main/linux-64::conda-23.7.4-py310h06a4308_0
- Packages: 44
- Files: 53489
- Downloading and Extracting Packages
- Downloading and Extracting Packages
- Preparing transaction: done
- Verifying transaction: done
- Executing transaction: done
- #
- # To activate this environment, use
- #
- # $ conda activate llama_factory_torch
- #
- # To deactivate an active environment, use
- #
- # $ conda deactivate
4. Install LLaMA Factory
- git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
- cd LLaMA-Factory
- pip install -e ".[torch,metrics]"
- root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# source activate llama_factory_torch
- (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# pip install -e ".[torch,metrics]"
- Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
- Obtaining file:///public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory
- Installing build dependencies ... done
- Checking if build backend supports build_editable ... done
- Getting requirements to build editable ... done
- Preparing editable metadata (pyproject.toml) ... done
- ...
- Checking if build backend supports build_editable ... done
- Building wheels for collected packages: llamafactory
- Building editable for llamafactory (pyproject.toml) ... done
- Created wheel for llamafactory: filename=llamafactory-0.8.4.dev0-0.editable-py3-none-any.whl size=20781 sha256=70c0480e2b648516e0eac3d39371d4100cbdaa1f277d87b657bf2adec9e0b2be
- Stored in directory: /tmp/pip-ephem-wheel-cache-uhypmj_8/wheels/e9/b4/89/f13e921e37904ee0c839434aad2d7b2951c2c68e596667c7ef
- Successfully built llamafactory
- DEPRECATION: lmdeploy 0.1.0-git782048c.abi0.dtk2404.torch2.1. has a non-standard version number. pip 24.1 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of lmdeploy or contact the author to suggest that they release a version with a conforming version number. Discussion can be found at https://github.com/pypa/pip/issues/12063
- DEPRECATION: mmcv 2.0.1-gitc0ccf15.abi0.dtk2404.torch2.1. has a non-standard version number. pip 24.1 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of mmcv or contact the author to suggest that they release a version with a conforming version number. Discussion can be found at https://github.com/pypa/pip/issues/12063
- Installing collected packages: pydub, jieba, urllib3, tomlkit, shtab, semantic-version, scipy, ruff, rouge-chinese, joblib, importlib-resources, ffmpy, docstring-parser, aiofiles, nltk, tyro, sse-starlette, tokenizers, gradio-client, transformers, trl, peft, gradio, llamafactory
- Attempting uninstall: urllib3
- Found existing installation: urllib3 1.26.13
- Uninstalling urllib3-1.26.13:
- Successfully uninstalled urllib3-1.26.13
- Attempting uninstall: tokenizers
- Found existing installation: tokenizers 0.15.0
- Uninstalling tokenizers-0.15.0:
- Successfully uninstalled tokenizers-0.15.0
- Attempting uninstall: transformers
- Found existing installation: transformers 4.38.0
- Uninstalling transformers-4.38.0:
- Successfully uninstalled transformers-4.38.0
- ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
- lmdeploy 0.1.0-git782048c.abi0.dtk2404.torch2.1. requires transformers==4.33.2, but you have transformers 4.43.3 which is incompatible.
- Successfully installed aiofiles-23.2.1 docstring-parser-0.16 ffmpy-0.4.0 gradio-4.40.0 gradio-client-1.2.0 importlib-resources-6.4.0 jieba-0.42.1 joblib-1.4.2 llamafactory-0.8.4.dev0 nltk-3.8.1 peft-0.12.0 pydub-0.25.1 rouge-chinese-1.0.3 ruff-0.5.5 scipy-1.14.0 semantic-version-2.10.0 shtab-1.7.1 sse-starlette-2.1.3 tokenizers-0.19.1 tomlkit-0.12.0 transformers-4.43.3 trl-0.9.6 tyro-0.8.5 urllib3-2.2.2
- WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
- [notice] A new release of pip is available: 24.0 -> 24.2
- [notice] To update, run: pip install --upgrade pip
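5. Resolve the dependency conflict
The previous step upgraded transformers to 4.43.3, which conflicts with the transformers==4.33.2 pin of the lmdeploy build already present in the environment. Reinstall LLaMA-Factory with --no-deps so that pip leaves the installed dependencies untouched.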
- (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# pip install --no-deps -e .
- Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
- Obtaining file:///public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory
- Installing build dependencies ... done
- Checking if build backend supports build_editable ... done
- Getting requirements to build editable ... done
- Preparing editable metadata (pyproject.toml) ... done
- Building wheels for collected packages: llamafactory
- Building editable for llamafactory (pyproject.toml) ... done
- Created wheel for llamafactory: filename=llamafactory-0.8.4.dev0-0.editable-py3-none-any.whl size=20781 sha256=f874a791bc9fdca02075cda0459104b48a57d300a077eca00eee7221cde429c3
- Stored in directory: /tmp/pip-ephem-wheel-cache-7vjiq3f3/wheels/e9/b4/89/f13e921e37904ee0c839434aad2d7b2951c2c68e596667c7ef
- Successfully built llamafactory
- Installing collected packages: llamafactory
- Attempting uninstall: llamafactory
- Found existing installation: llamafactory 0.8.4.dev0
- Uninstalling llamafactory-0.8.4.dev0:
- Successfully uninstalled llamafactory-0.8.4.dev0
- Successfully installed llamafactory-0.8.4.dev0
- WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
- [notice] A new release of pip is available: 24.0 -> 24.2
- [notice] To update, run: pip install --upgrade pip
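6. Install vllm==0.4.3
Note that the first command below greps for llvm, a misspelling of vllm, so it matches nothing. The environment does in fact contain vllm 0.3.3, as the uninstall message further down shows.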
- (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# pip list | grep llvm
- [notice] A new release of pip is available: 24.0 -> 24.2
- [notice] To update, run: pip install --upgrade pip
- (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# pip install --no-dependencies vllm==0.4.3
- Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
- Collecting vllm==0.4.3
- Using cached https://pypi.tuna.tsinghua.edu.cn/packages/1a/1e/10bcb6566f4fa8b95ff85bddfd1675ff7db33ba861f59bd70aa3b92a46b7/vllm-0.4.3-cp310-cp310-manylinux1_x86_64.whl (131.1 MB)
- Installing collected packages: vllm
- Attempting uninstall: vllm
- Found existing installation: vllm 0.3.3+git3380931.abi0.dtk2404.torch2.1
- Uninstalling vllm-0.3.3+git3380931.abi0.dtk2404.torch2.1:
- Successfully uninstalled vllm-0.3.3+git3380931.abi0.dtk2404.torch2.1
- Successfully installed vllm-0.4.3
- WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
- [notice] A new release of pip is available: 24.0 -> 24.2
- [notice] To update, run: pip install --upgrade pip
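III. Key steps
1. Obtain an access token
Obtain an access token and log in to your Hugging Face account.
For detailed steps, see another blog post: "Download acceleration methods for Hugging Face and ModelScope large models and datasets".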
- pip install --upgrade huggingface_hub
- huggingface-cli login
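In a notebook or batch job the interactive prompt of huggingface-cli login can be inconvenient. A non-interactive alternative is the login function of the huggingface_hub package. The sketch below assumes the token has been exported in an environment variable named HF_TOKEN (this variable name and the helper names are choices made for this example) and imports huggingface_hub lazily so the token check can be used on its own.

```python
import os


def read_token(env=os.environ, var="HF_TOKEN"):
    """Return the access token stored in `var`, or raise if it is missing."""
    token = env.get(var, "").strip()
    if not token:
        raise RuntimeError(f"environment variable {var} is not set")
    return token


def login_from_env():
    """Log in to the Hugging Face Hub with the token from the environment."""
    # Imported here so that read_token() works even without the package.
    from huggingface_hub import login

    login(token=read_token())
```

After export HF_TOKEN=... in the shell, calling login_from_env() once per environment stores the credential in the same way as the CLI command. Avoid writing the token itself into a notebook cell or a script.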
2. The llamafactory-cli command
Use llamafactory-cli help to display the help information.
- (llama_fct) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli help
- No ROCm runtime is found, using ROCM_HOME='/opt/dtk'
- /opt/conda/envs/llama_fct/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_hip.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
- warn(
- [2024-08-01 15:12:24,629] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
- ----------------------------------------------------------------------
- | Usage: |
- | llamafactory-cli api -h: launch an OpenAI-style API server |
- | llamafactory-cli chat -h: launch a chat interface in CLI |
- | llamafactory-cli eval -h: evaluate models |
- | llamafactory-cli export -h: merge LoRA adapters and export model |
- | llamafactory-cli train -h: train models |
- | llamafactory-cli webchat -h: launch a chat interface in Web UI |
- | llamafactory-cli webui: launch LlamaBoard |
- | llamafactory-cli version: show version info |
- ----------------------------------------------------------------------
3. Quick start
The following three commands run LoRA fine-tuning, merging, and inference for the Llama3-8B-Instruct model, respectively.
- llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
- llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
- llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
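When the three stages are launched from a script rather than typed one by one, it helps to stop as soon as one stage fails, since exporting depends on the adapter produced by training. The sketch below wraps the same three commands with subprocess; the function names and the injectable runner argument are hypothetical conveniences for this example, and the YAML paths are the ones shown above.

```python
import subprocess

# The three stages from the quick start, in the order they must run.
STAGES = [
    ("train", "examples/train_lora/llama3_lora_sft.yaml"),
    ("export", "examples/merge_lora/llama3_lora_sft.yaml"),
    ("chat", "examples/inference/llama3_lora_sft.yaml"),
]


def build_commands(stages=STAGES):
    """Turn (subcommand, config) pairs into argument lists."""
    return [["llamafactory-cli", sub, config] for sub, config in stages]


def run_pipeline(commands, runner=subprocess.run):
    """Run the commands in order and stop at the first non-zero exit code.

    `runner` defaults to subprocess.run; passing a stub allows a dry run.
    Returns the list of commands that completed successfully.
    """
    completed = []
    for command in commands:
        result = runner(command)
        if result.returncode != 0:
            raise RuntimeError(
                f"stage failed with exit code {result.returncode}: "
                + " ".join(command)
            )
        completed.append(command)
    return completed
```

run_pipeline(build_commands()) must be started from the LLaMA-Factory checkout because the config paths are relative. The chat stage is interactive, so it inherits the terminal of the calling process.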
3.1 LoRA fine-tuning
Fine-tuning runs on the DCU.
3.1.1 Single-card environment
- (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
- [2024-08-01 19:06:41,134] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
- 08/01/2024 19:06:44 - INFO - llamafactory.hparams.parser - Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: False, compute dtype: torch.bfloat16
- [INFO|tokenization_utils_base.py:2287] 2024-08-01 19:06:45,194 >> loading file tokenizer.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-01 19:06:45,194 >> loading file added_tokens.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-01 19:06:45,194 >> loading file special_tokens_map.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-01 19:06:45,194 >> loading file tokenizer_config.json
- [INFO|tokenization_utils_base.py:2533] 2024-08-01 19:06:45,563 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
- 08/01/2024 19:06:45 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
- 08/01/2024 19:06:45 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
- 08/01/2024 19:06:45 - INFO - llamafactory.data.loader - Loading dataset identity.json...
- Converting format of dataset (num_proc=16): 100%|███████████████████| 91/91 [00:00<00:00, 444.18 examples/s]
- 08/01/2024 19:06:47 - INFO - llamafactory.data.loader - Loading dataset alpaca_en_demo.json...
- Converting format of dataset (num_proc=16): 100%|██████████████| 1000/1000 [00:00<00:00, 4851.17 examples/s]
- Running tokenizer on dataset (num_proc=16): 100%|███████████████| 1091/1091 [00:02<00:00, 375.29 examples/s]
- training example:
- input_ids:
- [128000, 128006, 882, 128007, 271, 6151, 128009, 128006, 78191, 128007, 271, 9906, 0, 358, 1097, 5991, 609, 39254, 459, 15592, 18328, 8040, 555, 5991, 3170, 3500, 13, 2650, 649, 358, 7945, 499, 3432, 30, 128009]
- inputs:
- <|begin_of_text|><|start_header_id|>user<|end_header_id|>
- hi<|eot_id|><|start_header_id|>assistant<|end_header_id|>
- Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?<|eot_id|>
- label_ids:
- [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 9906, 0, 358, 1097, 5991, 609, 39254, 459, 15592, 18328, 8040, 555, 5991, 3170, 3500, 13, 2650, 649, 358, 7945, 499, 3432, 30, 128009]
- labels:
- Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?<|eot_id|>
- [INFO|configuration_utils.py:731] 2024-08-01 19:06:53,502 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
- [INFO|configuration_utils.py:800] 2024-08-01 19:06:53,503 >> Model config LlamaConfig {
- "_name_or_path": "/root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct",
- "architectures": [
- "LlamaForCausalLM"
- ],
- "attention_bias": false,
- "attention_dropout": 0.0,
- "bos_token_id": 128000,
- "eos_token_id": 128009,
- "hidden_act": "silu",
- "hidden_size": 4096,
- "initializer_range": 0.02,
- "intermediate_size": 14336,
- "max_position_embeddings": 8192,
- "mlp_bias": false,
- "model_type": "llama",
- "num_attention_heads": 32,
- "num_hidden_layers": 32,
- "num_key_value_heads": 8,
- "pretraining_tp": 1,
- "rms_norm_eps": 1e-05,
- "rope_scaling": null,
- "rope_theta": 500000.0,
- "tie_word_embeddings": false,
- "torch_dtype": "bfloat16",
- "transformers_version": "4.43.3",
- "use_cache": true,
- "vocab_size": 128256
- }
- [INFO|modeling_utils.py:3631] 2024-08-01 19:06:53,534 >> loading weights file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/model.safetensors.index.json
- [INFO|modeling_utils.py:1572] 2024-08-01 19:06:53,534 >> Instantiating LlamaForCausalLM model under default dtype torch.bfloat16.
- [INFO|configuration_utils.py:1038] 2024-08-01 19:06:53,536 >> Generate config GenerationConfig {
- "bos_token_id": 128000,
- "eos_token_id": 128009
- }
- Loading checkpoint shards: 100%|██████████████████████████████████████████████| 4/4 [00:08<00:00, 2.04s/it]
- [INFO|modeling_utils.py:4463] 2024-08-01 19:07:01,775 >> All model checkpoint weights were used when initializing LlamaForCausalLM.
- [INFO|modeling_utils.py:4471] 2024-08-01 19:07:01,775 >> All the weights of LlamaForCausalLM were initialized from the model checkpoint at /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct.
- If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
- [INFO|configuration_utils.py:991] 2024-08-01 19:07:01,779 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/generation_config.json
- [INFO|configuration_utils.py:1038] 2024-08-01 19:07:01,780 >> Generate config GenerationConfig {
- "bos_token_id": 128000,
- "do_sample": true,
- "eos_token_id": [
- 128001,
- 128009
- ],
- "max_length": 4096,
- "temperature": 0.6,
- "top_p": 0.9
- }
- 08/01/2024 19:07:01 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
- 08/01/2024 19:07:01 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
- 08/01/2024 19:07:01 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
- 08/01/2024 19:07:01 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
- 08/01/2024 19:07:01 - INFO - llamafactory.model.model_utils.misc - Found linear modules: q_proj,up_proj,v_proj,down_proj,k_proj,o_proj,gate_proj
- 08/01/2024 19:07:04 - INFO - llamafactory.model.loader - trainable params: 20,971,520 || all params: 8,051,232,768 || trainable%: 0.2605
- Detected kernel version 3.10.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
- [INFO|trainer.py:648] 2024-08-01 19:07:04,471 >> Using auto half precision backend
- [INFO|trainer.py:2134] 2024-08-01 19:07:04,831 >> ***** Running training *****
- [INFO|trainer.py:2135] 2024-08-01 19:07:04,831 >> Num examples = 981
- [INFO|trainer.py:2136] 2024-08-01 19:07:04,831 >> Num Epochs = 3
- [INFO|trainer.py:2137] 2024-08-01 19:07:04,832 >> Instantaneous batch size per device = 1
- [INFO|trainer.py:2140] 2024-08-01 19:07:04,832 >> Total train batch size (w. parallel, distributed & accumulation) = 8
- [INFO|trainer.py:2141] 2024-08-01 19:07:04,832 >> Gradient Accumulation steps = 8
- [INFO|trainer.py:2142] 2024-08-01 19:07:04,832 >> Total optimization steps = 366
- [INFO|trainer.py:2143] 2024-08-01 19:07:04,836 >> Number of trainable parameters = 20,971,520
- {'loss': 1.5025, 'grad_norm': 1.3309401273727417, 'learning_rate': 2.702702702702703e-05, 'epoch': 0.08}
- {'loss': 1.3424, 'grad_norm': 1.8096668720245361, 'learning_rate': 5.405405405405406e-05, 'epoch': 0.16}
- {'loss': 1.1286, 'grad_norm': 1.2990491390228271, 'learning_rate': 8.108108108108109e-05, 'epoch': 0.24}
- {'loss': 0.9808, 'grad_norm': 1.1075998544692993, 'learning_rate': 9.997948550797227e-05, 'epoch': 0.33}
- {'loss': 0.9924, 'grad_norm': 1.8073676824569702, 'learning_rate': 9.961525153583327e-05, 'epoch': 0.41}
- {'loss': 1.0052, 'grad_norm': 1.2079122066497803, 'learning_rate': 9.879896064123961e-05, 'epoch': 0.49}
- {'loss': 0.9973, 'grad_norm': 1.7361079454421997, 'learning_rate': 9.753805025397779e-05, 'epoch': 0.57}
- {'loss': 0.8488, 'grad_norm': 1.1059085130691528, 'learning_rate': 9.584400884284545e-05, 'epoch': 0.65}
- {'loss': 0.9893, 'grad_norm': 0.8711654543876648, 'learning_rate': 9.373227124134888e-05, 'epoch': 0.73}
- {'loss': 0.9116, 'grad_norm': 1.3793599605560303, 'learning_rate': 9.122207801708802e-05, 'epoch': 0.82}
- {'loss': 1.0429, 'grad_norm': 1.3769993782043457, 'learning_rate': 8.833630016614976e-05, 'epoch': 0.9}
- {'loss': 0.9323, 'grad_norm': 1.2503643035888672, 'learning_rate': 8.510123072976239e-05, 'epoch': 0.98}
- {'loss': 0.9213, 'grad_norm': 2.449227809906006, 'learning_rate': 8.154634523184388e-05, 'epoch': 1.06}
- {'loss': 0.8386, 'grad_norm': 1.009852409362793, 'learning_rate': 7.770403312015721e-05, 'epoch': 1.14}
- 40%|███████████████████████████▌ | 146/366 [10:19<15:11, 4.14s/it]
- {'loss': 0.856, 'grad_norm': 0.863474428653717, 'learning_rate': 7.360930265797935e-05, 'epoch': 1.22}
- {'loss': 0.838, 'grad_norm': 0.712546169757843, 'learning_rate': 6.929946195508932e-05, 'epoch': 1.3}
- {'loss': 0.8268, 'grad_norm': 1.6060960292816162, 'learning_rate': 6.481377904428171e-05, 'epoch': 1.39}
- {'loss': 0.7326, 'grad_norm': 0.7863644957542419, 'learning_rate': 6.019312410053286e-05, 'epoch': 1.47}
- {'loss': 0.7823, 'grad_norm': 0.8964634537696838, 'learning_rate': 5.547959706265068e-05, 'epoch': 1.55}
- {'loss': 0.7599, 'grad_norm': 0.5305138826370239, 'learning_rate': 5.0716144050239375e-05, 'epoch': 1.63}
- {'loss': 0.815, 'grad_norm': 0.8153926730155945, 'learning_rate': 4.594616607090028e-05, 'epoch': 1.71}
- {'loss': 0.8258, 'grad_norm': 1.3266267776489258, 'learning_rate': 4.121312358283463e-05, 'epoch': 1.79}
- {'loss': 0.7446, 'grad_norm': 1.8706341981887817, 'learning_rate': 3.656014051577713e-05, 'epoch': 1.88}
- {'loss': 0.7539, 'grad_norm': 1.5148639678955078, 'learning_rate': 3.202961135812437e-05, 'epoch': 1.96}
- {'loss': 0.7512, 'grad_norm': 1.3771291971206665, 'learning_rate': 2.7662814890184818e-05, 'epoch': 2.04}
- {'loss': 0.7128, 'grad_norm': 1.420331597328186, 'learning_rate': 2.3499538082923606e-05, 'epoch': 2.12}
- {'loss': 0.635, 'grad_norm': 0.9235875010490417, 'learning_rate': 1.9577713588953795e-05, 'epoch': 2.2}
- {'loss': 0.6628, 'grad_norm': 1.6558737754821777, 'learning_rate': 1.5933074128684332e-05, 'epoch': 2.28}
- {'loss': 0.681, 'grad_norm': 0.8138720393180847, 'learning_rate': 1.2598826920598772e-05, 'epoch': 2.36}
- {'loss': 0.6707, 'grad_norm': 1.0700312852859497, 'learning_rate': 9.605351122011309e-06, 'epoch': 2.45}
- {'loss': 0.6201, 'grad_norm': 1.3334729671478271, 'learning_rate': 6.979921036993042e-06, 'epoch': 2.53}
- {'loss': 0.6698, 'grad_norm': 1.440247893333435, 'learning_rate': 4.746457613389904e-06, 'epoch': 2.61}
- {'loss': 0.7072, 'grad_norm': 0.9171076416969299, 'learning_rate': 2.925310493105099e-06, 'epoch': 2.69}
- {'loss': 0.6871, 'grad_norm': 0.9809044003486633, 'learning_rate': 1.5330726014397668e-06, 'epoch': 2.77}
- {'loss': 0.5931, 'grad_norm': 1.7158288955688477, 'learning_rate': 5.824289648152126e-07, 'epoch': 2.85}
- {'loss': 0.6827, 'grad_norm': 1.3241132497787476, 'learning_rate': 8.204113433559201e-08, 'epoch': 2.94}
- 100%|█████████████████████████████████████████████████████████████████████| 366/366 [25:42<00:00, 4.02s/it]
- [INFO|trainer.py:3503] 2024-08-01 19:32:47,527 >> Saving model checkpoint to saves/llama3-8b/lora/sft/checkpoint-366
- [INFO|configuration_utils.py:731] 2024-08-01 19:32:47,556 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
- [INFO|configuration_utils.py:800] 2024-08-01 19:32:47,557 >> Model config LlamaConfig {
- "architectures": [
- "LlamaForCausalLM"
- ],
- "attention_bias": false,
- "attention_dropout": 0.0,
- "bos_token_id": 128000,
- "eos_token_id": 128009,
- "hidden_act": "silu",
- "hidden_size": 4096,
- "initializer_range": 0.02,
- "intermediate_size": 14336,
- "max_position_embeddings": 8192,
- "mlp_bias": false,
- "model_type": "llama",
- "num_attention_heads": 32,
- "num_hidden_layers": 32,
- "num_key_value_heads": 8,
- "pretraining_tp": 1,
- "rms_norm_eps": 1e-05,
- "rope_scaling": null,
- "rope_theta": 500000.0,
- "tie_word_embeddings": false,
- "torch_dtype": "bfloat16",
- "transformers_version": "4.43.3",
- "use_cache": true,
- "vocab_size": 128256
- }
- [INFO|tokenization_utils_base.py:2702] 2024-08-01 19:32:47,675 >> tokenizer config file saved in saves/llama3-8b/lora/sft/checkpoint-366/tokenizer_config.json
- [INFO|tokenization_utils_base.py:2711] 2024-08-01 19:32:47,677 >> Special tokens file saved in saves/llama3-8b/lora/sft/checkpoint-366/special_tokens_map.json
- [INFO|trainer.py:2394] 2024-08-01 19:32:48,046 >>
- Training completed. Do not forget to share your model on huggingface.co/models =)
- {'train_runtime': 1543.2099, 'train_samples_per_second': 1.907, 'train_steps_per_second': 0.237, 'train_loss': 0.8416516305318947, 'epoch': 2.98}
- 100%|█████████████████████████████████████████████████████████████████████| 366/366 [25:43<00:00, 4.22s/it]
- [INFO|trainer.py:3503] 2024-08-01 19:32:48,050 >> Saving model checkpoint to saves/llama3-8b/lora/sft
- [INFO|configuration_utils.py:731] 2024-08-01 19:32:48,081 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
- [INFO|configuration_utils.py:800] 2024-08-01 19:32:48,082 >> Model config LlamaConfig {
- "architectures": [
- "LlamaForCausalLM"
- ],
- "attention_bias": false,
- "attention_dropout": 0.0,
- "bos_token_id": 128000,
- "eos_token_id": 128009,
- "hidden_act": "silu",
- "hidden_size": 4096,
- "initializer_range": 0.02,
- "intermediate_size": 14336,
- "max_position_embeddings": 8192,
- "mlp_bias": false,
- "model_type": "llama",
- "num_attention_heads": 32,
- "num_hidden_layers": 32,
- "num_key_value_heads": 8,
- "pretraining_tp": 1,
- "rms_norm_eps": 1e-05,
- "rope_scaling": null,
- "rope_theta": 500000.0,
- "tie_word_embeddings": false,
- "torch_dtype": "bfloat16",
- "transformers_version": "4.43.3",
- "use_cache": true,
- "vocab_size": 128256
- }
- [INFO|tokenization_utils_base.py:2702] 2024-08-01 19:32:48,191 >> tokenizer config file saved in saves/llama3-8b/lora/sft/tokenizer_config.json
- [INFO|tokenization_utils_base.py:2711] 2024-08-01 19:32:48,192 >> Special tokens file saved in saves/llama3-8b/lora/sft/special_tokens_map.json
- ***** train metrics *****
- epoch = 2.9847
- total_flos = 20619353GF
- train_loss = 0.8417
- train_runtime = 0:25:43.20
- train_samples_per_second = 1.907
- train_steps_per_second = 0.237
- Figure saved at: saves/llama3-8b/lora/sft/training_loss.png
- 08/01/2024 19:32:48 - WARNING - llamafactory.extras.ploting - No metric eval_loss to plot.
- 08/01/2024 19:32:48 - WARNING - llamafactory.extras.ploting - No metric eval_accuracy to plot.
- [INFO|trainer.py:3819] 2024-08-01 19:32:48,529 >>
- ***** Running Evaluation *****
- [INFO|trainer.py:3821] 2024-08-01 19:32:48,529 >> Num examples = 110
- [INFO|trainer.py:3824] 2024-08-01 19:32:48,529 >> Batch size = 1
- 100%|█████████████████████████████████████████████████████████████████████| 110/110 [00:18<00:00, 6.07it/s]
- ***** eval metrics *****
- epoch = 2.9847
- eval_loss = 0.9957
- eval_runtime = 0:00:18.23
- eval_samples_per_second = 6.031
- eval_steps_per_second = 6.031
- [INFO|modelcard.py:449] 2024-08-01 19:33:06,773 >> Dropping the following result as it does not have all the necessary fields:
- {'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}
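Several numbers in the single-card log can be reproduced by hand, which is a quick way to confirm that the run used the settings you expect. The sketch below is a worked calculation, not LLaMA-Factory code. The batch size, accumulation steps, epoch count, and the 1091 tokenized examples come from the log; the validation fraction of 0.1, the peak learning rate of 1e-4, the warmup ratio of 0.1, and the cosine schedule are assumptions about the example YAML (not shown in this article), chosen because they match the logged values.

```python
import math

TOTAL_EXAMPLES = 1091   # log: "Running tokenizer on dataset ... 1091/1091"
PER_DEVICE_BATCH = 1    # log: "Instantaneous batch size per device = 1"
GRAD_ACCUM = 8          # log: "Gradient Accumulation steps = 8"
EPOCHS = 3              # log: "Num Epochs = 3"
VAL_SIZE = 0.1          # assumption: validation fraction in the YAML
PEAK_LR = 1.0e-4        # assumption: learning_rate in the YAML
WARMUP_RATIO = 0.1      # assumption: warmup_ratio in the YAML


def split_sizes(total, val_size):
    """Train/eval sizes when the eval share is rounded up."""
    n_eval = math.ceil(total * val_size)
    return total - n_eval, n_eval


def total_steps(n_train, per_device_batch, grad_accum, epochs, n_devices=1):
    """Optimizer updates: whole accumulation groups per epoch, times epochs."""
    batches_per_epoch = math.ceil(n_train / (per_device_batch * n_devices))
    updates_per_epoch = max(batches_per_epoch // grad_accum, 1)
    return updates_per_epoch * epochs


def lr_at(step, steps, peak_lr, warmup_ratio):
    """Linear warmup followed by a half-cosine decay to zero."""
    warmup = math.ceil(steps * warmup_ratio)
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, steps - warmup)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


n_train, n_eval = split_sizes(TOTAL_EXAMPLES, VAL_SIZE)         # 981 and 110
steps = total_steps(n_train, PER_DEVICE_BATCH, GRAD_ACCUM, EPOCHS)  # 366
final_epoch = steps * GRAD_ACCUM * PER_DEVICE_BATCH / n_train   # about 2.9847
lr_step_10 = lr_at(10, steps, PEAK_LR, WARMUP_RATIO)            # about 2.7027e-05
```

The epoch counter ends at 2.9847 rather than 3 because 981 batches per epoch do not divide evenly by 8: the last 5 batches of each epoch never form a complete accumulation group.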
Output
- root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models# tree -L 6 LLaMA-Factory/saves/
- LLaMA-Factory/saves/
- `-- llama3-8b
- `-- lora
- `-- sft
- |-- README.md
- |-- adapter_config.json
- |-- adapter_model.safetensors
- |-- all_results.json
- |-- checkpoint-366
- | |-- README.md
- | |-- adapter_config.json
- | |-- adapter_model.safetensors
- | |-- optimizer.pt
- | |-- rng_state.pth
- | |-- scheduler.pt
- | |-- special_tokens_map.json
- | |-- tokenizer.json
- | |-- tokenizer_config.json
- | |-- trainer_state.json
- | `-- training_args.bin
- |-- eval_results.json
- |-- special_tokens_map.json
- |-- tokenizer.json
- |-- tokenizer_config.json
- |-- train_results.json
- |-- trainer_log.jsonl
- |-- trainer_state.json
- |-- training_args.bin
- `-- training_loss.png
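The adapter_model.safetensors files in this listing hold only the LoRA matrices, which is why the training log reports just 20,971,520 trainable parameters out of roughly 8 billion. That figure can be recomputed from the model shapes. In the sketch below the layer sizes are taken from the LlamaConfig printed in the log and the module list from the "Found linear modules" line; the LoRA rank of 8 is an assumption (it is not printed in the log) that happens to reproduce the logged count exactly.

```python
# Shapes from the LlamaConfig printed in the training log.
HIDDEN = 4096          # hidden_size
INTERMEDIATE = 14336   # intermediate_size
LAYERS = 32            # num_hidden_layers
HEADS = 32             # num_attention_heads
KV_HEADS = 8           # num_key_value_heads
LORA_RANK = 8          # assumption: rank used by the example config


def linear_shapes():
    """(in_features, out_features) of each adapted linear layer in one block."""
    head_dim = HIDDEN // HEADS
    kv_dim = KV_HEADS * head_dim  # grouped-query attention: smaller K/V
    return {
        "q_proj": (HIDDEN, HIDDEN),
        "k_proj": (HIDDEN, kv_dim),
        "v_proj": (HIDDEN, kv_dim),
        "o_proj": (HIDDEN, HIDDEN),
        "gate_proj": (HIDDEN, INTERMEDIATE),
        "up_proj": (HIDDEN, INTERMEDIATE),
        "down_proj": (INTERMEDIATE, HIDDEN),
    }


def lora_parameter_count(rank=LORA_RANK, layers=LAYERS):
    """Each adapted layer adds A (rank x in) and B (out x rank)."""
    per_layer = sum(
        rank * (d_in + d_out) for d_in, d_out in linear_shapes().values()
    )
    return per_layer * layers


trainable = lora_parameter_count()       # 20,971,520, as in the log
all_params = 8_051_232_768               # "all params" from the log
trainable_percent = 100 * trainable / all_params  # about 0.2605
```

Because the count scales linearly with the rank, doubling the rank doubles the adapter size while the base model stays unchanged.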
Resource usage at runtime
3.1.2 Multi-card environment
- (llama_factory_torch) root@notebook-1819291427828183041-scnlbe5oi5-51898:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# CUDA_VISIBLE_DEVICES=0,1,2 llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
- [2024-08-02 18:08:58,775] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
- 08/02/2024 18:09:01 - INFO - llamafactory.cli - Initializing distributed tasks at: 127.0.0.1:26472
- [2024-08-02 18:09:04,227] torch.distributed.run: [WARNING]
- [2024-08-02 18:09:04,227] torch.distributed.run: [WARNING] *****************************************
- [2024-08-02 18:09:04,227] torch.distributed.run: [WARNING] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
- [2024-08-02 18:09:04,227] torch.distributed.run: [WARNING] *****************************************
- [2024-08-02 18:09:09,155] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
- [2024-08-02 18:09:09,269] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
- [2024-08-02 18:09:09,457] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
- WARNING: Logging before InitGoogleLogging() is written to STDERR
- I0802 18:09:12.353489 95618 ProcessGroupNCCL.cpp:686] [Rank 2] ProcessGroupNCCL initialization options:NCCL_ASYNC_ERROR_HANDLING: 1, NCCL_DESYNC_DEBUG: 0, NCCL_ENABLE_TIMING: 0, NCCL_BLOCKING_WAIT: 0, TIMEOUT(ms): 180000000000, USE_HIGH_PRIORITY_STREAM: 0, TORCH_DISTRIBUTED_DEBUG: OFF, NCCL_DEBUG: OFF, ID=353053408
- 08/02/2024 18:09:12 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
- 08/02/2024 18:09:12 - INFO - llamafactory.hparams.parser - Process rank: 2, device: cuda:2, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
- WARNING: Logging before InitGoogleLogging() is written to STDERR
- I0802 18:09:12.555290 95617 ProcessGroupNCCL.cpp:686] [Rank 1] ProcessGroupNCCL initialization options:NCCL_ASYNC_ERROR_HANDLING: 1, NCCL_DESYNC_DEBUG: 0, NCCL_ENABLE_TIMING: 0, NCCL_BLOCKING_WAIT: 0, TIMEOUT(ms): 180000000000, USE_HIGH_PRIORITY_STREAM: 0, TORCH_DISTRIBUTED_DEBUG: OFF, NCCL_DEBUG: OFF, ID=369111936
- 08/02/2024 18:09:12 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
- 08/02/2024 18:09:12 - INFO - llamafactory.hparams.parser - Process rank: 1, device: cuda:1, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
- WARNING: Logging before InitGoogleLogging() is written to STDERR
- I0802 18:09:13.120337 95616 ProcessGroupNCCL.cpp:686] [Rank 0] ProcessGroupNCCL initialization options:NCCL_ASYNC_ERROR_HANDLING: 1, NCCL_DESYNC_DEBUG: 0, NCCL_ENABLE_TIMING: 0, NCCL_BLOCKING_WAIT: 0, TIMEOUT(ms): 180000000000, USE_HIGH_PRIORITY_STREAM: 0, TORCH_DISTRIBUTED_DEBUG: OFF, NCCL_DEBUG: OFF, ID=359553664
- 08/02/2024 18:09:13 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
- 08/02/2024 18:09:13 - INFO - llamafactory.hparams.parser - Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
- 08/02/2024 18:09:13 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
- 08/02/2024 18:09:13 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
- 08/02/2024 18:09:13 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
- 08/02/2024 18:09:13 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
- I0802 18:09:14.158418 95618 ProcessGroupNCCL.cpp:2780] Rank 2 using GPU 2 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
- I0802 18:09:14.165846 95617 ProcessGroupNCCL.cpp:2780] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
- [INFO|tokenization_utils_base.py:2287] 2024-08-02 18:09:14,276 >> loading file tokenizer.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-02 18:09:14,276 >> loading file added_tokens.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-02 18:09:14,276 >> loading file special_tokens_map.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-02 18:09:14,276 >> loading file tokenizer_config.json
- [INFO|tokenization_utils_base.py:2533] 2024-08-02 18:09:14,684 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
- 08/02/2024 18:09:14 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
- 08/02/2024 18:09:14 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
- 08/02/2024 18:09:14 - INFO - llamafactory.data.loader - Loading dataset identity.json...
- Converting format of dataset (num_proc=16): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 91/91 [00:00<00:00, 301.60 examples/s]
- 08/02/2024 18:09:16 - INFO - llamafactory.data.loader - Loading dataset alpaca_en_demo.json...
- Converting format of dataset (num_proc=16): 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:00<00:00, 3399.93 examples/s]
- I0802 18:09:18.295866 95616 ProcessGroupNCCL.cpp:2780] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
- I0802 18:09:19.234498 95616 ProcessGroupNCCL.cpp:1340] NCCL_DEBUG: N/A
- 08/02/2024 18:09:19 - INFO - llamafactory.data.loader - Loading dataset identity.json...
- 08/02/2024 18:09:19 - INFO - llamafactory.data.loader - Loading dataset identity.json...
- Running tokenizer on dataset (num_proc=16): 0%| | 0/1091 [00:00<?, ? examples/s]08/02/2024 18:09:20 - INFO - llamafactory.data.loader - Loading dataset alpaca_en_demo.json...
- 08/02/2024 18:09:20 - INFO - llamafactory.data.loader - Loading dataset alpaca_en_demo.json...
- Running tokenizer on dataset (num_proc=16): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 1091/1091 [00:03<00:00, 273.44 examples/s]
- training example:
- input_ids:
- [128000, 128006, 882, 128007, 271, 6151, 128009, 128006, 78191, 128007, 271, 9906, 0, 358, 1097, 5991, 609, 39254, 459, 15592, 18328, 8040, 555, 5991, 3170, 3500, 13, 2650, 649, 358, 7945, 499, 3432, 30, 128009]
- inputs:
- <|begin_of_text|><|start_header_id|>user<|end_header_id|>
- hi<|eot_id|><|start_header_id|>assistant<|end_header_id|>
- Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?<|eot_id|>
- label_ids:
- [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 9906, 0, 358, 1097, 5991, 609, 39254, 459, 15592, 18328, 8040, 555, 5991, 3170, 3500, 13, 2650, 649, 358, 7945, 499, 3432, 30, 128009]
- labels:
- Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?<|eot_id|>
- [INFO|configuration_utils.py:731] 2024-08-02 18:09:24,080 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
- [INFO|configuration_utils.py:800] 2024-08-02 18:09:24,082 >> Model config LlamaConfig {
- "_name_or_path": "/root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct",
- "architectures": [
- "LlamaForCausalLM"
- ],
- "attention_bias": false,
- "attention_dropout": 0.0,
- "bos_token_id": 128000,
- "eos_token_id": 128009,
- "hidden_act": "silu",
- "hidden_size": 4096,
- "initializer_range": 0.02,
- "intermediate_size": 14336,
- "max_position_embeddings": 8192,
- "mlp_bias": false,
- "model_type": "llama",
- "num_attention_heads": 32,
- "num_hidden_layers": 32,
- "num_key_value_heads": 8,
- "pretraining_tp": 1,
- "rms_norm_eps": 1e-05,
- "rope_scaling": null,
- "rope_theta": 500000.0,
- "tie_word_embeddings": false,
- "torch_dtype": "bfloat16",
- "transformers_version": "4.43.3",
- "use_cache": true,
- "vocab_size": 128256
- }
- [INFO|modeling_utils.py:3631] 2024-08-02 18:09:24,119 >> loading weights file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/model.safetensors.index.json
- [INFO|modeling_utils.py:1572] 2024-08-02 18:09:24,119 >> Instantiating LlamaForCausalLM model under default dtype torch.bfloat16.
- [INFO|configuration_utils.py:1038] 2024-08-02 18:09:24,121 >> Generate config GenerationConfig {
- "bos_token_id": 128000,
- "eos_token_id": 128009
- }
- Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:09<00:00, 2.49s/it]
- 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
- 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
- 08/02/2024 18:09:34 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
- 08/02/2024 18:09:34 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
- 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.misc - Found linear modules: up_proj,gate_proj,o_proj,q_proj,k_proj,v_proj,down_proj
- Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:10<00:00, 2.59s/it]
- [INFO|modeling_utils.py:4463] 2024-08-02 18:09:34,552 >> All model checkpoint weights were used when initializing LlamaForCausalLM.
- [INFO|modeling_utils.py:4471] 2024-08-02 18:09:34,552 >> All the weights of LlamaForCausalLM were initialized from the model checkpoint at /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct.
- If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
- [INFO|configuration_utils.py:991] 2024-08-02 18:09:34,555 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/generation_config.json
- [INFO|configuration_utils.py:1038] 2024-08-02 18:09:34,555 >> Generate config GenerationConfig {
- "bos_token_id": 128000,
- "do_sample": true,
- "eos_token_id": [
- 128001,
- 128009
- ],
- "max_length": 4096,
- "temperature": 0.6,
- "top_p": 0.9
- }
- 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
- 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
- 08/02/2024 18:09:34 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
- 08/02/2024 18:09:34 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
- 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.misc - Found linear modules: k_proj,o_proj,v_proj,down_proj,q_proj,up_proj,gate_proj
- Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:10<00:00, 2.52s/it]
- 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
- 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
- 08/02/2024 18:09:34 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
- 08/02/2024 18:09:34 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
- 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.misc - Found linear modules: gate_proj,down_proj,q_proj,o_proj,up_proj,v_proj,k_proj
- 08/02/2024 18:09:34 - INFO - llamafactory.model.loader - trainable params: 20,971,520 || all params: 8,051,232,768 || trainable%: 0.2605
- 08/02/2024 18:09:34 - INFO - llamafactory.model.loader - trainable params: 20,971,520 || all params: 8,051,232,768 || trainable%: 0.2605
- Detected kernel version 3.10.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
- [INFO|trainer.py:648] 2024-08-02 18:09:34,983 >> Using auto half precision backend
- 08/02/2024 18:09:35 - INFO - llamafactory.model.loader - trainable params: 20,971,520 || all params: 8,051,232,768 || trainable%: 0.2605
- [INFO|trainer.py:2134] 2024-08-02 18:09:37,114 >> ***** Running training *****
- [INFO|trainer.py:2135] 2024-08-02 18:09:37,114 >> Num examples = 981
- [INFO|trainer.py:2136] 2024-08-02 18:09:37,114 >> Num Epochs = 3
- [INFO|trainer.py:2137] 2024-08-02 18:09:37,114 >> Instantaneous batch size per device = 1
- [INFO|trainer.py:2140] 2024-08-02 18:09:37,114 >> Total train batch size (w. parallel, distributed & accumulation) = 24
- [INFO|trainer.py:2141] 2024-08-02 18:09:37,114 >> Gradient Accumulation steps = 8
- [INFO|trainer.py:2142] 2024-08-02 18:09:37,114 >> Total optimization steps = 120
- [INFO|trainer.py:2143] 2024-08-02 18:09:37,119 >> Number of trainable parameters = 20,971,520
- {'loss': 1.4267, 'grad_norm': 1.401288628578186, 'learning_rate': 8.333333333333334e-05, 'epoch': 0.24}
- {'loss': 1.1319, 'grad_norm': 1.4780751466751099, 'learning_rate': 9.865224352899119e-05, 'epoch': 0.49}
- {'loss': 0.9963, 'grad_norm': 0.532632052898407, 'learning_rate': 9.330127018922194e-05, 'epoch': 0.73}
- {'loss': 0.9792, 'grad_norm': 0.7996620535850525, 'learning_rate': 8.43120818934367e-05, 'epoch': 0.98}
- {'loss': 0.937, 'grad_norm': 0.4041236639022827, 'learning_rate': 7.243995901002312e-05, 'epoch': 1.22}
- {'loss': 0.8805, 'grad_norm': 0.5675532221794128, 'learning_rate': 5.868240888334653e-05, 'epoch': 1.47}
- {'loss': 0.8467, 'grad_norm': 0.5038197636604309, 'learning_rate': 4.4195354293738484e-05, 'epoch': 1.71}
- {'loss': 0.8612, 'grad_norm': 0.7851077914237976, 'learning_rate': 3.019601169804216e-05, 'epoch': 1.96}
- {'loss': 0.818, 'grad_norm': 0.450968474149704, 'learning_rate': 1.7860619515673033e-05, 'epoch': 2.2}
- {'loss': 0.8308, 'grad_norm': 0.5961077809333801, 'learning_rate': 8.225609429353187e-06, 'epoch': 2.45}
- {'loss': 0.8071, 'grad_norm': 0.5323781371116638, 'learning_rate': 2.100524384225555e-06, 'epoch': 2.69}
- {'loss': 0.8061, 'grad_norm': 0.7563619017601013, 'learning_rate': 0.0, 'epoch': 2.94}
- 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 120/120 [09:46<00:00, 4.56s/it][INFO|trainer.py:3503] 2024-08-02 18:19:24,273 >> Saving model checkpoint to saves/llama3-8b/lora/sft/checkpoint-120
- [INFO|configuration_utils.py:731] 2024-08-02 18:19:24,304 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
- [INFO|configuration_utils.py:800] 2024-08-02 18:19:24,305 >> Model config LlamaConfig {
- "architectures": [
- "LlamaForCausalLM"
- ],
- "attention_bias": false,
- "attention_dropout": 0.0,
- "bos_token_id": 128000,
- "eos_token_id": 128009,
- "hidden_act": "silu",
- "hidden_size": 4096,
- "initializer_range": 0.02,
- "intermediate_size": 14336,
- "max_position_embeddings": 8192,
- "mlp_bias": false,
- "model_type": "llama",
- "num_attention_heads": 32,
- "num_hidden_layers": 32,
- "num_key_value_heads": 8,
- "pretraining_tp": 1,
- "rms_norm_eps": 1e-05,
- "rope_scaling": null,
- "rope_theta": 500000.0,
- "tie_word_embeddings": false,
- "torch_dtype": "bfloat16",
- "transformers_version": "4.43.3",
- "use_cache": true,
- "vocab_size": 128256
- }
- [INFO|tokenization_utils_base.py:2702] 2024-08-02 18:19:24,432 >> tokenizer config file saved in saves/llama3-8b/lora/sft/checkpoint-120/tokenizer_config.json
- [INFO|tokenization_utils_base.py:2711] 2024-08-02 18:19:24,434 >> Special tokens file saved in saves/llama3-8b/lora/sft/checkpoint-120/special_tokens_map.json
- [INFO|trainer.py:2394] 2024-08-02 18:19:24,832 >>
- Training completed. Do not forget to share your model on huggingface.co/models =)
- {'train_runtime': 587.7138, 'train_samples_per_second': 5.008, 'train_steps_per_second': 0.204, 'train_loss': 0.9434665679931641, 'epoch': 2.94}
- 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 120/120 [09:47<00:00, 4.90s/it]
- [INFO|trainer.py:3503] 2024-08-02 18:19:24,837 >> Saving model checkpoint to saves/llama3-8b/lora/sft
- [INFO|configuration_utils.py:731] 2024-08-02 18:19:24,907 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
- [INFO|configuration_utils.py:800] 2024-08-02 18:19:24,908 >> Model config LlamaConfig {
- "architectures": [
- "LlamaForCausalLM"
- ],
- "attention_bias": false,
- "attention_dropout": 0.0,
- "bos_token_id": 128000,
- "eos_token_id": 128009,
- "hidden_act": "silu",
- "hidden_size": 4096,
- "initializer_range": 0.02,
- "intermediate_size": 14336,
- "max_position_embeddings": 8192,
- "mlp_bias": false,
- "model_type": "llama",
- "num_attention_heads": 32,
- "num_hidden_layers": 32,
- "num_key_value_heads": 8,
- "pretraining_tp": 1,
- "rms_norm_eps": 1e-05,
- "rope_scaling": null,
- "rope_theta": 500000.0,
- "tie_word_embeddings": false,
- "torch_dtype": "bfloat16",
- "transformers_version": "4.43.3",
- "use_cache": true,
- "vocab_size": 128256
- }
- [INFO|tokenization_utils_base.py:2702] 2024-08-02 18:19:25,048 >> tokenizer config file saved in saves/llama3-8b/lora/sft/tokenizer_config.json
- [INFO|tokenization_utils_base.py:2711] 2024-08-02 18:19:25,055 >> Special tokens file saved in saves/llama3-8b/lora/sft/special_tokens_map.json
- ***** train metrics *****
- epoch = 2.9358
- total_flos = 20332711GF
- train_loss = 0.9435
- train_runtime = 0:09:47.71
- train_samples_per_second = 5.008
- train_steps_per_second = 0.204
- Figure saved at: saves/llama3-8b/lora/sft/training_loss.png
- 08/02/2024 18:19:25 - WARNING - llamafactory.extras.ploting - No metric eval_loss to plot.
- 08/02/2024 18:19:25 - WARNING - llamafactory.extras.ploting - No metric eval_accuracy to plot.
- [INFO|trainer.py:3819] 2024-08-02 18:19:25,357 >>
- ***** Running Evaluation *****
- [INFO|trainer.py:3821] 2024-08-02 18:19:25,357 >> Num examples = 110
- [INFO|trainer.py:3824] 2024-08-02 18:19:25,357 >> Batch size = 1
- 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 37/37 [00:08<00:00, 4.50it/s]
- ***** eval metrics *****
- epoch = 2.9358
- eval_loss = 0.9702
- eval_runtime = 0:00:08.33
- eval_samples_per_second = 13.193
- eval_steps_per_second = 4.438
- [INFO|modelcard.py:449] 2024-08-02 18:19:33,712 >> Dropping the following result as it does not have all the necessary fields:
- {'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}
Resource usage during the training run
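The trainer summary in the log above (total batch size 24, 120 optimization steps, final epoch 2.9358) and the logged learning rates can be reproduced with a little arithmetic. The sketch below is only a sanity check, not the trainer's actual code. It assumes three ranks (Rank 0 to 2 appear in the log), and it assumes a peak learning rate of 1e-4 with a 0.1 warmup ratio and cosine decay; those last three values are not printed in the log and are inferred from the logged learning rates.

```python
import math

# Values read directly from the training log.
num_examples = 981       # "Num examples = 981"
per_device_batch = 1     # "Instantaneous batch size per device = 1"
grad_accum = 8           # "Gradient Accumulation steps = 8"
num_epochs = 3           # "Num Epochs = 3"

# Assumptions, not printed in the log.
world_size = 3           # ranks 0, 1 and 2 appear in the log
peak_lr = 1.0e-4         # inferred from the logged learning rates
warmup_ratio = 0.1       # inferred from the logged learning rates

# Effective batch size across devices and accumulation steps.
total_batch = per_device_batch * grad_accum * world_size

# Each rank sees its share of the data (padded up), then accumulates gradients.
batches_per_rank = math.ceil(num_examples / (per_device_batch * world_size))
steps_per_epoch = batches_per_rank // grad_accum
total_steps = steps_per_epoch * num_epochs

# The run stops after the last full optimizer step, slightly before epoch 3.
final_epoch = total_steps * grad_accum / batches_per_rank

warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step):
    """Linear warmup followed by cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_batch, total_steps, round(final_epoch, 4))
for step in (10, 20, 30, 120):
    print(step, lr_at(step))
```

Under these assumptions the computed values line up with the log: the learning rates at steps 10, 20 and 30 agree with the first three logged entries, and the schedule reaches zero at step 120.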
3.2 Model merging
Model merging is performed on the CPU.
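For intuition, merging a LoRA adapter folds the low-rank update into each base weight matrix, W' = W + (alpha / r) · B · A, so that no adapter is needed at inference time. The toy sketch below illustrates this standard LoRA formula on tiny matrices in pure Python. It is not the code path the CLI runs, and the matrices, `alpha` and `rank` values are made up for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # (out x r) @ (r x in) -> (out x in)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Hypothetical toy layer: 3 inputs, 2 outputs, LoRA rank 1.
weight = [[1.0, 0.0, 2.0],
          [0.0, 1.0, 0.0]]
lora_a = [[1.0, 2.0, 3.0]]   # shape r x in
lora_b = [[0.5], [-1.0]]     # shape out x r

merged = merge_lora(weight, lora_a, lora_b, alpha=2, rank=1)

# The merged layer gives the same output as base layer plus scaled adapter.
x = [[1.0], [1.0], [1.0]]
base_plus_adapter = [
    [b[0] + 2.0 * a[0]]
    for b, a in zip(matmul(weight, x), matmul(lora_b, matmul(lora_a, x)))
]
print(merged)
print(matmul(merged, x) == base_plus_adapter)
```

Because the merge touches every adapted weight matrix of the 8B model, it takes a few minutes on the CPU, which is consistent with the gap between the 21:34 and 21:40 timestamps in the log below.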
- (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
- [2024-08-01 21:34:37,394] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
- [INFO|tokenization_utils_base.py:2287] 2024-08-01 21:34:41,664 >> loading file tokenizer.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-01 21:34:41,664 >> loading file added_tokens.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-01 21:34:41,664 >> loading file special_tokens_map.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-01 21:34:41,664 >> loading file tokenizer_config.json
- [INFO|tokenization_utils_base.py:2533] 2024-08-01 21:34:42,030 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
- 08/01/2024 21:34:42 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
- 08/01/2024 21:34:42 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
- [INFO|configuration_utils.py:731] 2024-08-01 21:34:42,031 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
- [INFO|configuration_utils.py:800] 2024-08-01 21:34:42,032 >> Model config LlamaConfig {
- "_name_or_path": "/root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct",
- "architectures": [
- "LlamaForCausalLM"
- ],
- "attention_bias": false,
- "attention_dropout": 0.0,
- "bos_token_id": 128000,
- "eos_token_id": 128009,
- "hidden_act": "silu",
- "hidden_size": 4096,
- "initializer_range": 0.02,
- "intermediate_size": 14336,
- "max_position_embeddings": 8192,
- "mlp_bias": false,
- "model_type": "llama",
- "num_attention_heads": 32,
- "num_hidden_layers": 32,
- "num_key_value_heads": 8,
- "pretraining_tp": 1,
- "rms_norm_eps": 1e-05,
- "rope_scaling": null,
- "rope_theta": 500000.0,
- "tie_word_embeddings": false,
- "torch_dtype": "bfloat16",
- "transformers_version": "4.43.3",
- "use_cache": true,
- "vocab_size": 128256
- }
- 08/01/2024 21:34:42 - INFO - llamafactory.model.patcher - Using KV cache for faster generation.
- [INFO|modeling_utils.py:3631] 2024-08-01 21:34:42,058 >> loading weights file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/model.safetensors.index.json
- [INFO|modeling_utils.py:1572] 2024-08-01 21:34:42,058 >> Instantiating LlamaForCausalLM model under default dtype torch.bfloat16.
- [INFO|configuration_utils.py:1038] 2024-08-01 21:34:42,059 >> Generate config GenerationConfig {
- "bos_token_id": 128000,
- "eos_token_id": 128009
- }
- Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00, 3.40it/s]
- [INFO|modeling_utils.py:4463] 2024-08-01 21:34:43,324 >> All model checkpoint weights were used when initializing LlamaForCausalLM.
- [INFO|modeling_utils.py:4471] 2024-08-01 21:34:43,324 >> All the weights of LlamaForCausalLM were initialized from the model checkpoint at /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct.
- If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
- [INFO|configuration_utils.py:991] 2024-08-01 21:34:43,327 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/generation_config.json
- [INFO|configuration_utils.py:1038] 2024-08-01 21:34:43,327 >> Generate config GenerationConfig {
- "bos_token_id": 128000,
- "do_sample": true,
- "eos_token_id": [
- 128001,
- 128009
- ],
- "max_length": 4096,
- "temperature": 0.6,
- "top_p": 0.9
- }
- 08/01/2024 21:34:43 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
- 08/01/2024 21:40:34 - INFO - llamafactory.model.adapter - Merged 1 adapter(s).
- 08/01/2024 21:40:34 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/llama3-8b/lora/sft
- 08/01/2024 21:40:34 - INFO - llamafactory.model.loader - all params: 8,030,261,248
- 08/01/2024 21:40:34 - INFO - llamafactory.train.tuner - Convert model dtype to: torch.bfloat16.
- [INFO|configuration_utils.py:472] 2024-08-01 21:40:34,700 >> Configuration saved in models/llama3_lora_sft/config.json
- [INFO|configuration_utils.py:807] 2024-08-01 21:40:34,704 >> Configuration saved in models/llama3_lora_sft/generation_config.json
- [INFO|modeling_utils.py:2763] 2024-08-01 21:40:49,039 >> The model is bigger than the maximum size per checkpoint (2GB) and is going to be split in 9 checkpoint shards. You can find where each parameters has been saved in the index located at models/llama3_lora_sft/model.safetensors.index.json.
- [INFO|tokenization_utils_base.py:2702] 2024-08-01 21:40:49,046 >> tokenizer config file saved in models/llama3_lora_sft/tokenizer_config.json
- [INFO|tokenization_utils_base.py:2711] 2024-08-01 21:40:49,048 >> Special tokens file saved in models/llama3_lora_sft/special_tokens_map.json
Output:
- (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models# tree -L 6 LLaMA-Factory/models/llama3_lora_sft/
- LLaMA-Factory/models/llama3_lora_sft/
- |-- config.json
- |-- generation_config.json
- |-- model-00001-of-00009.safetensors
- |-- model-00002-of-00009.safetensors
- |-- model-00003-of-00009.safetensors
- |-- model-00004-of-00009.safetensors
- |-- model-00005-of-00009.safetensors
- |-- model-00006-of-00009.safetensors
- |-- model-00007-of-00009.safetensors
- |-- model-00008-of-00009.safetensors
- |-- model-00009-of-00009.safetensors
- |-- model.safetensors.index.json
- |-- special_tokens_map.json
- |-- tokenizer.json
- `-- tokenizer_config.json
Resource usage during the merge
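The merge log reports 8,030,261,248 parameters, while the training log reported 8,051,232,768 in total with 20,971,520 trainable. The difference is exactly the LoRA adapter, which disappears as separate weights once it is merged. The sketch below recomputes the adapter size from the LlamaConfig values printed in the logs. The LoRA rank of 8 is an assumption, since it is not printed in the log excerpts, and the `lora_params` helper is hypothetical.

```python
# Dimensions taken from the LlamaConfig printed in the logs.
hidden = 4096
intermediate = 14336
num_layers = 32
num_heads = 32
num_kv_heads = 8
kv_dim = num_kv_heads * (hidden // num_heads)  # grouped-query K/V width

# Assumption: LoRA rank 8 (not shown in the log excerpts).
rank = 8

# (in_features, out_features) for each linear module listed under
# "Found linear modules" in the training log.
modules = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}


def lora_params(in_features, out_features, r):
    """Parameters in one adapter: A is r x in, B is out x r."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, rank) for i, o in modules.values())
lora_total = per_layer * num_layers

training_total = 8_051_232_768  # "all params" in the training log
merged_total = 8_030_261_248    # "all params" in the merge log

print(lora_total)
print(training_total - merged_total == lora_total)
```

With rank 8 the computed adapter size equals the 20,971,520 trainable parameters in the training log, which supports the assumption.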
3.3 LoRA inference
- (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-20553:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
- [2024-08-02 22:08:48,070] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
- 2024-08-02 22:08:52,267 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
- [INFO|tokenization_utils_base.py:2287] 2024-08-02 22:08:52,535 >> loading file tokenizer.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-02 22:08:52,535 >> loading file added_tokens.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-02 22:08:52,535 >> loading file special_tokens_map.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-02 22:08:52,535 >> loading file tokenizer_config.json
- [INFO|tokenization_utils_base.py:2533] 2024-08-02 22:08:52,818 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
- 08/02/2024 22:08:52 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
- 08/02/2024 22:08:52 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
- [INFO|configuration_utils.py:731] 2024-08-02 22:08:52,820 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
- [INFO|configuration_utils.py:800] 2024-08-02 22:08:52,821 >> Model config LlamaConfig {
- "_name_or_path": "/root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct",
- "architectures": [
- "LlamaForCausalLM"
- ],
- "attention_bias": false,
- "attention_dropout": 0.0,
- "bos_token_id": 128000,
- "eos_token_id": 128009,
- "hidden_act": "silu",
- "hidden_size": 4096,
- "initializer_range": 0.02,
- "intermediate_size": 14336,
- "max_position_embeddings": 8192,
- "mlp_bias": false,
- "model_type": "llama",
- "num_attention_heads": 32,
- "num_hidden_layers": 32,
- "num_key_value_heads": 8,
- "pretraining_tp": 1,
- "rms_norm_eps": 1e-05,
- "rope_scaling": null,
- "rope_theta": 500000.0,
- "tie_word_embeddings": false,
- "torch_dtype": "bfloat16",
- "transformers_version": "4.43.3",
- "use_cache": true,
- "vocab_size": 128256
- }
- 08/02/2024 22:08:52 - INFO - llamafactory.model.patcher - Using KV cache for faster generation.
- [INFO|modeling_utils.py:3631] 2024-08-02 22:08:52,847 >> loading weights file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/model.safetensors.index.json
- [INFO|modeling_utils.py:1572] 2024-08-02 22:08:52,847 >> Instantiating LlamaForCausalLM model under default dtype torch.bfloat16.
- [INFO|configuration_utils.py:1038] 2024-08-02 22:08:52,848 >> Generate config GenerationConfig {
- "bos_token_id": 128000,
- "eos_token_id": 128009
- }
- Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:07<00:00, 1.98s/it]
- [INFO|modeling_utils.py:4463] 2024-08-02 22:09:01,148 >> All model checkpoint weights were used when initializing LlamaForCausalLM.
- [INFO|modeling_utils.py:4471] 2024-08-02 22:09:01,148 >> All the weights of LlamaForCausalLM were initialized from the model checkpoint at /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct.
- If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
- [INFO|configuration_utils.py:991] 2024-08-02 22:09:01,151 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/generation_config.json
- [INFO|configuration_utils.py:1038] 2024-08-02 22:09:01,152 >> Generate config GenerationConfig {
- "bos_token_id": 128000,
- "do_sample": true,
- "eos_token_id": [
- 128001,
- 128009
- ],
- "max_length": 4096,
- "temperature": 0.6,
- "top_p": 0.9
- }
- 08/02/2024 22:09:01 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
- 08/02/2024 22:09:06 - INFO - llamafactory.model.adapter - Merged 1 adapter(s).
- 08/02/2024 22:09:06 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/llama3-8b/lora/sft
- 08/02/2024 22:09:06 - INFO - llamafactory.model.loader - all params: 8,030,261,248
- Welcome to the CLI application, use `clear` to remove the history, use `exit` to exit the application.
- User: 中国深圳有哪些旅游景点
- Assistant: 深圳是一个非常有名的旅游城市,拥有许多名副其名的旅游景点。以下是一些主要的旅游景点:
- 1. Window of the World:这是一个规模宏大的主题公园,展示了世界各地的风土人情。
- 2. Splendid China Miniature Theme Park:这个公园展现了中国的历史和文化,拥有许多精致的模型和景观。
- 3. Dafen Oil Painting Village:这个村庄是中国最大的油画村,拥有数以万计的油画作品,展示了中国油画的技艺。
- 4. Dameisha Beach:这个沙滩是深圳最为人知的旅游景点之一,拥有洁洁的沙滩和清澈的海水,是一个非常适合休闲的场所。
- 5. Mangrove Forest Nature Reserve:这个自然保护区拥有广泛的 mangrove 森林,展示了中国的自然景观。
- 6. Shenzhen Museum:这个博物馆展现了深圳的历史和文化,拥有许多历史和艺术的收藏品。
- 7. Lianhua Mountain Park:这个公园是深圳最大的公园,拥有许多山路和景观,展示了中国的自然美景。
- 8. Shenzhen Bay Sports Center:这个体育中心拥有许多不同的运动场所,展示了中国的体育技艺。
- 9. OCT-LOFT:这个文化区拥有许多艺术和文化的项目,展示了中国的艺术和文化。
- 10. Fairy Lake Botanical Garden:这个植物园拥有许多不同的植物和花卉,展示了中国的自然美景。
- User: 中国广州有哪些旅游景点
- Assistant: 广州是一个非常有名的旅游城市,拥有许多名副其名的旅游景点。以下是一些主要的旅游景点:
- 1. Canton Tower:这是一个位于广州的超高建筑,拥有360度的观景台,展示了广州的全景。
- 2. Chimelong Paradise:这个主题公园拥有许多不同的游乐设施和景观,展示了中国的游乐技艺。
- 3. Baiyun Mountain:这个山区拥有许多不同的景观和游乐设施,展示了中国的自然美景。
- 4. Yuexiu Park:这个公园是广州最大的公园,拥有许多不同的景观和游乐设施,展示了中国的自然美景。
- 5. Temple of the Six Banyan Trees:这个寺庙拥有许多不同的文化和历史的收藏品,展示了中国的历史和文化。
- 6. Museum of the Chinese Revolution:这个博物馆展现了中国革命的历史和文化,拥有许多不同的收藏品和展品。
- 7. Guangzhou Tower:这个塔楼是广州最早的建筑,拥有许多不同的景观和游乐设施,展示了中国的历史和文化。
- 8. Guangzhou Museum:这个博物馆展现了广州的历史和文化,拥有许多不同的收藏品和展品。
- 9. Flower Street:这个街区拥有许多不同的花卉和景观,展示了中国的自然美景。
- 10. Shamian Island:这个岛区拥有许多不同的景观和游乐设施,展示了中国的自然美景和历史文化。
4. FAQ
Q: OSError: You are trying to access a gated repo. Make sure to have access to it at https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct.
- (llama_fct) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
- No ROCm runtime is found, using ROCM_HOME='/opt/dtk'
- /opt/conda/envs/llama_fct/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_hip.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
- warn(
- [2024-08-01 15:13:21,242] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
- 08/01/2024 15:13:24 - INFO - llamafactory.hparams.parser - Process rank: 0, device: cpu, n_gpu: 0, distributed training: False, compute dtype: torch.bfloat16
- [INFO|tokenization_auto.py:682] 2024-08-01 15:13:25,152 >> Could not locate the tokenizer configuration file, will try to use the model config instead.
- Traceback (most recent call last):
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
- response.raise_for_status()
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
- raise HTTPError(http_error_msg, response=self)
- requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://hf-mirror.com/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/config.json
- The above exception was the direct cause of the following exception:
- Traceback (most recent call last):
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/transformers/utils/hub.py", line 402, in cached_file
- resolved_file = hf_hub_download(
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
- return f(*args, **kwargs)
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
- return fn(*args, **kwargs)
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1240, in hf_hub_download
- return _hf_hub_download_to_cache_dir(
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1347, in _hf_hub_download_to_cache_dir
- _raise_on_head_call_error(head_call_error, force_download, local_files_only)
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1854, in _raise_on_head_call_error
- raise head_call_error
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1751, in _get_metadata_or_catch_error
- metadata = get_hf_file_metadata(
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
- return fn(*args, **kwargs)
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1673, in get_hf_file_metadata
- r = _request_wrapper(
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 376, in _request_wrapper
- response = _request_wrapper(
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 400, in _request_wrapper
- hf_raise_for_status(response)
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 321, in hf_raise_for_status
- raise GatedRepoError(message, response) from e
- huggingface_hub.utils._errors.GatedRepoError: 401 Client Error. (Request ID: Root=1-66ab3595-53663c2f4d5cf81405b65b9e;080cfa15-3220-4ab1-b123-4a32ba31a03a)
- Cannot access gated repo for url https://hf-mirror.com/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/config.json.
- Access to model meta-llama/Meta-Llama-3-8B-Instruct is restricted. You must be authenticated to access it.
- The above exception was the direct cause of the following exception:
- Traceback (most recent call last):
- File "/opt/conda/envs/llama_fct/bin/llamafactory-cli", line 8, in <module>
- sys.exit(main())
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/cli.py", line 111, in main
- run_exp()
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/train/tuner.py", line 50, in run_exp
- run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 44, in run_sft
- tokenizer_module = load_tokenizer(model_args)
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/model/loader.py", line 69, in load_tokenizer
- tokenizer = AutoTokenizer.from_pretrained(
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 853, in from_pretrained
- config = AutoConfig.from_pretrained(
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 972, in from_pretrained
- config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/transformers/configuration_utils.py", line 632, in get_config_dict
- config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/transformers/configuration_utils.py", line 689, in _get_config_dict
- resolved_config_file = cached_file(
- File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/transformers/utils/hub.py", line 420, in cached_file
- raise EnvironmentError(
- OSError: You are trying to access a gated repo.
- Make sure to have access to it at https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct.
- 401 Client Error. (Request ID: Root=1-66ab3595-53663c2f4d5cf81405b65b9e;080cfa15-3220-4ab1-b123-4a32ba31a03a)
- Cannot access gated repo for url https://hf-mirror.com/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/config.json.
- Access to model meta-llama/Meta-Llama-3-8B-Instruct is restricted. You must be authenticated to access it.
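Cause: by default the model is fetched from Hugging Face. Meta-Llama-3-8B-Instruct is a gated repository there, and the request was not authenticated (401), so the download failed.
Solution: download the model from ModelScope instead.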
- export USE_MODELSCOPE_HUB=1
- # On Windows, use `set USE_MODELSCOPE_HUB=1`
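Set model_name_or_path to a ModelScope model ID to load the corresponding model. All available models can be browsed in the ModelScope community, for example LLM-Research/Meta-Llama-3-8B-Instruct.
Edit the llama3_lora_sft.yaml file: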
- # model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
- # change to
- model_name_or_path: LLM-Research/Meta-Llama-3-8B-Instruct
- llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
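The edit to the YAML file can also be scripted. The sketch below rewrites only the uncommented `model_name_or_path` line in a config's text, using the standard library alone. The function name `set_model_path` and the two-line sample config are illustrative stand-ins, not taken from the LLaMA-Factory repository.

```python
import re


def set_model_path(yaml_text: str, model_id: str) -> str:
    """Replace the value of the top-level `model_name_or_path` key.

    Only an uncommented key at the start of a line is rewritten; commented-out
    lines and every other key are left untouched.
    """
    pattern = re.compile(r"^model_name_or_path:.*$", flags=re.MULTILINE)
    if not pattern.search(yaml_text):
        raise ValueError("model_name_or_path not found in the config")
    # A function replacement avoids backslash interpretation in model_id.
    return pattern.sub(lambda _m: f"model_name_or_path: {model_id}", yaml_text, count=1)


if __name__ == "__main__":
    # Stand-in for the relevant part of llama3_lora_sft.yaml.
    sample = (
        "model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct\n"
        "stage: sft\n"
    )
    print(set_model_path(sample, "LLM-Research/Meta-Llama-3-8B-Instruct"))
```

To apply it to a real file, read the file's text, pass it through `set_model_path`, and write the result back.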
Q: OSError: LLM-Research/Meta-Llama-3-8B-Instruct is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
- (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
- [2024-08-01 21:17:22,212] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
- Traceback (most recent call last):
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
- response.raise_for_status()
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
- raise HTTPError(http_error_msg, response=self)
- requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://hf-mirror.com/LLM-Research/Meta-Llama-3-8B-Instruct/resolve/main/tokenizer_config.json
- The above exception was the direct cause of the following exception:
- Traceback (most recent call last):
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/transformers/utils/hub.py", line 402, in cached_file
- resolved_file = hf_hub_download(
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
- return fn(*args, **kwargs)
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1221, in hf_hub_download
- return _hf_hub_download_to_cache_dir(
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1325, in _hf_hub_download_to_cache_dir
- _raise_on_head_call_error(head_call_error, force_download, local_files_only)
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1823, in _raise_on_head_call_error
- raise head_call_error
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1722, in _get_metadata_or_catch_error
- metadata = get_hf_file_metadata(url=url, proxies=proxies, timeout=etag_timeout, headers=headers)
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
- return fn(*args, **kwargs)
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1645, in get_hf_file_metadata
- r = _request_wrapper(
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 372, in _request_wrapper
- response = _request_wrapper(
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 396, in _request_wrapper
- hf_raise_for_status(response)
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 352, in hf_raise_for_status
- raise RepositoryNotFoundError(message, response) from e
- huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-66ab8ae6-4ed0547e1f86fcb201b723f8;acee559e-0676-48e4-8871-b6eb58e797ca)
- Repository Not Found for url: https://hf-mirror.com/LLM-Research/Meta-Llama-3-8B-Instruct/resolve/main/tokenizer_config.json.
- Please make sure you specified the correct `repo_id` and `repo_type`.
- If you are trying to access a private or gated repo, make sure you are authenticated.
- Invalid username or password.
- The above exception was the direct cause of the following exception:
- Traceback (most recent call last):
- File "/opt/conda/envs/llama_factory_torch/bin/llamafactory-cli", line 8, in <module>
- sys.exit(main())
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/cli.py", line 81, in main
- run_chat()
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/chat/chat_model.py", line 125, in run_chat
- chat_model = ChatModel()
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/chat/chat_model.py", line 44, in __init__
- self.engine: "BaseEngine" = HuggingfaceEngine(model_args, data_args, finetuning_args, generating_args)
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/chat/hf_engine.py", line 53, in __init__
- tokenizer_module = load_tokenizer(model_args)
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/model/loader.py", line 69, in load_tokenizer
- tokenizer = AutoTokenizer.from_pretrained(
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 833, in from_pretrained
- tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 665, in get_tokenizer_config
- resolved_config_file = cached_file(
- File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/transformers/utils/hub.py", line 425, in cached_file
- raise EnvironmentError(
- OSError: LLM-Research/Meta-Llama-3-8B-Instruct is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
- If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`
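Cause: the model LLM-Research/Meta-Llama-3-8B-Instruct cannot be found. It is a ModelScope model ID, and the traceback shows it being requested from hf-mirror.com, where no such repository exists.
Solution: download the model from ModelScope by setting the variable in the shell that runs llamafactory-cli.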
- export USE_MODELSCOPE_HUB=1
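Once a download has succeeded, the model can also be referenced by its local directory. As a rough sketch, the helper below checks whether a model ID is already present in the ModelScope cache. The cache layout (`~/.cache/modelscope/hub/<org>/<name>`) is an assumption taken from the paths that appear in this article's training log; other ModelScope versions may lay the cache out differently. The name `cached_model_dir` is hypothetical.

```python
from pathlib import Path
from typing import Optional

# Assumption: cache root as it appears in this article's training log.
DEFAULT_CACHE_ROOT = Path.home() / ".cache" / "modelscope" / "hub"


def cached_model_dir(model_id: str, cache_root: Path = DEFAULT_CACHE_ROOT) -> Optional[Path]:
    """Return the local directory for `model_id` if it contains a config.json.

    Returns None when the directory or its config.json is missing.
    """
    candidate = cache_root.joinpath(*model_id.split("/"))
    return candidate if (candidate / "config.json").is_file() else None


if __name__ == "__main__":
    found = cached_model_dir("LLM-Research/Meta-Llama-3-8B-Instruct")
    if found is None:
        print("Model not cached yet; it would need to be downloaded first.")
    else:
        print(f"Model already cached at: {found}")
```

If a directory is returned, it can be used as the value of `model_name_or_path`, since the loader accepts a local folder as well as a model ID.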
Q: ModuleNotFoundError: No module named 'modelscope'
- (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
- [2024-08-01 19:05:15,320] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
- 08/01/2024 19:05:18 - INFO - llamafactory.hparams.parser - Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: False, compute dtype: torch.bfloat16
- Traceback (most recent call last):
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/extras/misc.py", line 219, in try_download_model_from_ms
- from modelscope import snapshot_download
- ModuleNotFoundError: No module named 'modelscope'
- During handling of the above exception, another exception occurred:
- Traceback (most recent call last):
- File "/opt/conda/envs/llama_factory_torch/bin/llamafactory-cli", line 8, in <module>
- sys.exit(main())
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/cli.py", line 111, in main
- run_exp()
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/train/tuner.py", line 50, in run_exp
- run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 44, in run_sft
- tokenizer_module = load_tokenizer(model_args)
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/model/loader.py", line 67, in load_tokenizer
- init_kwargs = _get_init_kwargs(model_args)
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/model/loader.py", line 52, in _get_init_kwargs
- model_args.model_name_or_path = try_download_model_from_ms(model_args)
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/extras/misc.py", line 224, in try_download_model_from_ms
- raise ImportError("Please install modelscope via `pip install modelscope -U`")
- ImportError: Please install modelscope via `pip install modelscope -U`
Cause: the modelscope package is not installed.
Solution: install modelscope.
- (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# pip install --no-dependencies modelscope
- Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
- Collecting modelscope
- Using cached https://pypi.tuna.tsinghua.edu.cn/packages/38/37/9fe505ebc67ba5e0345a69d6e8b2ee8630523975b484d221691ef60182bd/modelscope-1.16.1-py3-none-any.whl (5.7 MB)
- Installing collected packages: modelscope
- Successfully installed modelscope-1.16.1
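A quick way to confirm that the install fixed the problem is to check whether the package can be found by the interpreter of the active environment. This is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and it handles top-level module names only.

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["modelscope"])
    if missing:
        print("Missing:", ", ".join(missing))
        print("Try: pip install --no-dependencies modelscope")
    else:
        print("modelscope is importable in this environment")
```

Run it with the environment activated, so that the check uses the same Python as `llamafactory-cli`.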
Q: ImportError: /PATH/TO/site-packages/torch/lib/libtorch_hip.so: undefined symbol: ncclCommInitRankConfig
- (llama_fct_pytorch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
- Traceback (most recent call last):
- File "/opt/conda/envs/llama_fct_pytorch/bin/llamafactory-cli", line 5, in <module>
- from llamafactory.cli import main
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/__init__.py", line 38, in <module>
- from .cli import VERSION
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/cli.py", line 21, in <module>
- from . import launcher
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/launcher.py", line 15, in <module>
- from llamafactory.train.tuner import run_exp
- File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/train/tuner.py", line 19, in <module>
- import torch
- File "/opt/conda/envs/llama_fct_pytorch/lib/python3.10/site-packages/torch/__init__.py", line 237, in <module>
- from torch._C import * # noqa: F403
- ImportError: /opt/conda/envs/llama_fct_pytorch/lib/python3.10/site-packages/torch/lib/libtorch_hip.so: undefined symbol: ncclCommInitRankConfig
- >>> import torch
- Traceback (most recent call last):
- File "<stdin>", line 1, in <module>
- File "/opt/conda/envs/llama_fct_pytorch/lib/python3.10/site-packages/torch/__init__.py", line 237, in <module>
- from torch._C import * # noqa: F403
- ImportError: /opt/conda/envs/llama_fct_pytorch/lib/python3.10/site-packages/torch/lib/libtorch_hip.so: undefined symbol: ncclCommInitRankConfig
Cause: the installed PyTorch build does not support the DCU.
For the solution, see the next FAQ entry.
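When triaging this kind of failure, it is handy to pull the library path and the missing symbol out of the error text. The sketch below does that with a regular expression; the names `UndefinedSymbol` and `parse_undefined_symbol` are hypothetical, and the pattern is only assumed to cover messages shaped like the one shown above.

```python
import re
from typing import NamedTuple, Optional


class UndefinedSymbol(NamedTuple):
    library: str  # shared object that failed to load
    symbol: str   # symbol the dynamic loader could not resolve


_PATTERN = re.compile(r"(?P<library>\S+\.so[.\d]*): undefined symbol: (?P<symbol>\S+)")


def parse_undefined_symbol(message: str) -> Optional[UndefinedSymbol]:
    """Extract (library, symbol) from an 'undefined symbol' ImportError text."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return UndefinedSymbol(match["library"], match["symbol"])


if __name__ == "__main__":
    text = (
        "ImportError: /opt/conda/envs/llama_fct_pytorch/lib/python3.10/"
        "site-packages/torch/lib/libtorch_hip.so: undefined symbol: "
        "ncclCommInitRankConfig"
    )
    print(parse_undefined_symbol(text))
```

A missing `nccl*` symbol in `libtorch_hip.so` suggests that the torch binary and the libraries it is loaded against do not match, which is consistent with the cause given above.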
Q: The PyTorch build does not support the DCU
- (llama_fct) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
- No ROCm runtime is found, using ROCM_HOME='/opt/dtk'
- /opt/conda/envs/llama_fct/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_hip.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
- warn(
- [2024-08-01 17:49:08,805] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
- 08/01/2024 17:49:12 - INFO - llamafactory.hparams.parser - Process rank: 0, device: cpu, n_gpu: 0, distributed training: False, compute dtype: torch.bfloat16
- Downloading: 100%|█████████████████████████████████████████████████████████| 654/654 [00:00<00:00, 2.56kB/s]
- Downloading: 100%|█████████████████████████████████████████████████████████| 48.0/48.0 [00:00<00:00, 183B/s]
- Downloading: 100%|███████████████████████████████████████████████████████████| 187/187 [00:00<00:00, 759B/s]
- Downloading: 100%|█████████████████████████████████████████████████████| 7.62k/7.62k [00:00<00:00, 29.9kB/s]
- Downloading: 100%|█████████████████████████████████████████████████████| 4.63G/4.63G [01:33<00:00, 53.4MB/s]
- Downloading: 100%|█████████████████████████████████████████████████████| 4.66G/4.66G [01:02<00:00, 79.9MB/s]
- Downloading: 100%|█████████████████████████████████████████████████████| 4.58G/4.58G [01:00<00:00, 81.7MB/s]
- Downloading: 100%|█████████████████████████████████████████████████████| 1.09G/1.09G [00:22<00:00, 51.6MB/s]
- Downloading: 100%|█████████████████████████████████████████████████████| 23.4k/23.4k [00:00<00:00, 53.6kB/s]
- Downloading: 100%|██████████████████████████████████████████████████████| 36.3k/36.3k [00:00<00:00, 125kB/s]
- Downloading: 100%|█████████████████████████████████████████████████████████| 73.0/73.0 [00:00<00:00, 293B/s]
- Downloading: 100%|█████████████████████████████████████████████████████| 8.66M/8.66M [00:00<00:00, 13.5MB/s]
- Downloading: 100%|█████████████████████████████████████████████████████| 49.8k/49.8k [00:00<00:00, 90.0kB/s]
- Downloading: 100%|█████████████████████████████████████████████████████| 4.59k/4.59k [00:00<00:00, 18.7kB/s]
- [INFO|tokenization_utils_base.py:2287] 2024-08-01 17:53:53,510 >> loading file tokenizer.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-01 17:53:53,511 >> loading file added_tokens.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-01 17:53:53,511 >> loading file special_tokens_map.json
- [INFO|tokenization_utils_base.py:2287] 2024-08-01 17:53:53,511 >> loading file tokenizer_config.json
- [INFO|tokenization_utils_base.py:2533] 2024-08-01 17:53:53,854 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
- 08/01/2024 17:53:53 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
- 08/01/2024 17:53:53 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
- 08/01/2024 17:53:53 - INFO - llamafactory.data.loader - Loading dataset identity.json...
- Generating train split: 91 examples [00:00, 10580.81 examples/s]
- Converting format of dataset (num_proc=16): 100%|███████████████████| 91/91 [00:00<00:00, 427.78 examples/s]
- 08/01/2024 17:53:56 - INFO - llamafactory.data.loader - Loading dataset alpaca_en_demo.json...
- Generating train split: 1000 examples [00:00, 66788.28 examples/s]
- Converting format of dataset (num_proc=16): 100%|██████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:00<00:00, 4688.60 examples/s]
- Running tokenizer on dataset (num_proc=16): 100%|███████████████████████████████████████████████████████████████████████████████████████████| 1091/1091 [00:03<00:00, 295.08 examples/s]
- training example:
- input_ids:
- [128000, 128006, 882, 128007, 271, 6151, 128009, 128006, 78191, 128007, 271, 9906, 0, 358, 1097, 5991, 609, 39254, 459, 15592, 18328, 8040, 555, 5991, 3170, 3500, 13, 2650, 649, 358, 7945, 499, 3432, 30, 128009]
- inputs:
- <|begin_of_text|><|start_header_id|>user<|end_header_id|>hi<|eot_id|><|start_header_id|>assistant<|end_header_id|>Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?<|eot_id|>
- label_ids:
- [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 9906, 0, 358, 1097, 5991, 609, 39254, 459, 15592, 18328, 8040, 555, 5991, 3170, 3500, 13, 2650, 649, 358, 7945, 499, 3432, 30, 128009]
- labels:
- Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?<|eot_id|>
- [INFO|configuration_utils.py:731] 2024-08-01 17:54:02,106 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
- [INFO|configuration_utils.py:800] 2024-08-01 17:54:02,108 >> Model config LlamaConfig {
- "_name_or_path": "/root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct",
- "architectures": [
- "LlamaForCausalLM"
- ],
- "attention_bias": false,
- "attention_dropout": 0.0,
- "bos_token_id": 128000,
- "eos_token_id": 128009,
- "hidden_act": "silu",
- "hidden_size": 4096,
- "initializer_range": 0.02,
- "intermediate_size": 14336,
- "max_position_embeddings": 8192,
- "mlp_bias": false,
- "model_type": "llama",
- "num_attention_heads": 32,
- "num_hidden_layers": 32,
- "num_key_value_heads": 8,
- "pretraining_tp": 1,
- "rms_norm_eps": 1e-05,
- "rope_scaling": null,
- "rope_theta": 500000.0,
- "tie_word_embeddings": false,
- "torch_dtype": "bfloat16",
- "transformers_version": "4.43.3",
- "use_cache": true,
- "vocab_size": 128256
- }
- [INFO|modeling_utils.py:3631] 2024-08-01 17:54:02,139 >> loading weights file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/model.safetensors.index.json
- [INFO|modeling_utils.py:1572] 2024-08-01 17:54:02,140 >> Instantiating LlamaForCausalLM model under default dtype torch.bfloat16.
- [INFO|configuration_utils.py:1038] 2024-08-01 17:54:02,142 >> Generate config GenerationConfig {
- "bos_token_id": 128000,
- "eos_token_id": 128009
- }
- Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00, 2.68it/s]
- [INFO|modeling_utils.py:4463] 2024-08-01 17:54:03,708 >> All model checkpoint weights were used when initializing LlamaForCausalLM.
- [INFO|modeling_utils.py:4471] 2024-08-01 17:54:03,709 >> All the weights of LlamaForCausalLM were initialized from the model checkpoint at /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct.
- If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
- [INFO|configuration_utils.py:991] 2024-08-01 17:54:03,712 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/generation_config.json
- [INFO|configuration_utils.py:1038] 2024-08-01 17:54:03,713 >> Generate config GenerationConfig {
- "bos_token_id": 128000,
- "do_sample": true,
- "eos_token_id": [
- 128001,
- 128009
- ],
- "max_length": 4096,
- "temperature": 0.6,
- "top_p": 0.9
- }
- 08/01/2024 17:54:03 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
- 08/01/2024 17:54:03 - INFO - llamafactory.model.model_utils.attention - Using torch SDPA for faster training and inference.
- 08/01/2024 17:54:03 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
- 08/01/2024 17:54:03 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
- 08/01/2024 17:54:03 - INFO - llamafactory.model.model_utils.misc - Found linear modules: q_proj,down_proj,o_proj,k_proj,gate_proj,up_proj,v_proj
- 08/01/2024 17:54:08 - INFO - llamafactory.model.loader - trainable params: 20,971,520 || all params: 8,051,232,768 || trainable%: 0.2605
- Detected kernel version 3.10.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
- [INFO|trainer.py:648] 2024-08-01 17:54:08,091 >> Using cpu_amp half precision backend
- [INFO|trainer.py:2134] 2024-08-01 17:54:09,008 >> ***** Running training *****
- [INFO|trainer.py:2135] 2024-08-01 17:54:09,008 >> Num examples = 981
- [INFO|trainer.py:2136] 2024-08-01 17:54:09,008 >> Num Epochs = 3
- [INFO|trainer.py:2137] 2024-08-01 17:54:09,008 >> Instantaneous batch size per device = 1
- [INFO|trainer.py:2140] 2024-08-01 17:54:09,008 >> Total train batch size (w. parallel, distributed & accumulation) = 8
- [INFO|trainer.py:2141] 2024-08-01 17:54:09,008 >> Gradient Accumulation steps = 8
- [INFO|trainer.py:2142] 2024-08-01 17:54:09,008 >> Total optimization steps = 366
- [INFO|trainer.py:2143] 2024-08-01 17:54:09,012 >> Number of trainable parameters = 20,971,520
- 0%| | 0/366 [00:00<?, ?it/s
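Cause: the installed PyTorch build does not support the DCU. The log reports `device: cpu, n_gpu: 0`, so the program stalls at step 0 and the model cannot be fine-tuned.
Solution: look up, download, and install a DCU build of PyTorch from the 光合社区 (Guanghe developer community). Taking torch-2.1.0+das1.1.git3ac1bdd.abi1.dtk2404-cp310-cp310-manylinux_2_31_x86_64 as an example, try installing torch-2.1.0.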
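Before installing a wheel, it can be worth checking that its tags fit the environment, and afterwards that torch actually sees a device. The sketch below splits the wheel name quoted above into its fields and reports what an installed torch exposes. The helper names are hypothetical, and reading `dtk2404` as the DTK release is an assumption based on the naming in this article. On ROCm-style builds the accelerator is exposed through the `torch.cuda` API, which matches the `device: cuda:0` shown in the successful log earlier in this section.

```python
import sys
from typing import Dict


def parse_wheel_name(name: str) -> Dict[str, str]:
    """Split a wheel name into its dash-separated fields.

    Expected shape: distribution-version-python_tag-abi_tag-platform_tag.
    Wheels with an optional build tag (six fields) are rejected in this sketch.
    """
    stem = name[:-4] if name.endswith(".whl") else name
    parts = stem.split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel name: {name!r}")
    distribution, full_version, python_tag, abi_tag, platform_tag = parts
    # Everything after "+" is the local version label.
    version, _, local = full_version.partition("+")
    # Assumption: the DTK release is the dot-separated piece starting with "dtk".
    dtk_tag = next((p for p in local.split(".") if p.startswith("dtk")), "")
    return {
        "distribution": distribution,
        "version": version,
        "local": local,
        "dtk_tag": dtk_tag,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def fits_this_python(python_tag: str) -> bool:
    """True when a CPython tag such as "cp310" matches the running interpreter."""
    return python_tag == f"cp{sys.version_info.major}{sys.version_info.minor}"


def device_report() -> Dict[str, object]:
    """Report whether torch imports and whether it sees an accelerator."""
    try:
        import torch
    except (ImportError, OSError) as exc:
        return {"torch_importable": False, "error": str(exc)}
    return {
        "torch_importable": True,
        "version": torch.__version__,
        "accelerator_visible": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    wheel = "torch-2.1.0+das1.1.git3ac1bdd.abi1.dtk2404-cp310-cp310-manylinux_2_31_x86_64"
    info = parse_wheel_name(wheel)
    print(info)
    print("Matches this interpreter:", fits_this_python(info["python_tag"]))
    print(device_report())
```

If `accelerator_visible` is still false after the install, training would again fall back to the CPU as in the log above.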