Quick Start: LoRA Fine-Tuning of the Llama3-8B Model and Accelerated Inference on Hygon DCU (Sugon Supercomputing Internet Platform ...)


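Preface

This article uses LLaMA-Factory as an example to run LoRA fine-tuning, merging, and inference for the Llama3-8B-Instruct model on the SCNet supercomputing internet platform, using a heterogeneous accelerator card AI (64 GB device memory, PCIe).
I. References

GitHub repository: LLaMA-Factory, using the latest code branch: v0.8.3.
SCNet supercomputing internet platform
Heterogeneous accelerator card AI, 64 GB device memory, PCIe
II. Environment setup

1. System image

The heterogeneous accelerator card AI is a domestically produced accelerator (DCU) built on the DTK software stack (the counterpart of NVIDIA's CUDA). Select an image based on dtk24.04.
This article uses the jupyterlab-pytorch:2.1.0-ubuntu20.04-dtk24.04-py310 image as an example.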
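Once the image is running, it can help to confirm what PyTorch sees before installing anything else. The following is a minimal sketch, assuming the DTK build of PyTorch exposes the DCU through the regular torch.cuda API (the training logs later in this article report device: cuda:0, which suggests so). The helper name describe_accelerators is hypothetical.

```python
def describe_accelerators():
    """Return a small report of what PyTorch can see, or a stub if torch is absent."""
    report = {"torch": None, "hip": None, "available": False, "devices": []}
    try:
        import torch  # imported lazily so the helper also runs where torch is missing
    except ImportError:
        return report

    report["torch"] = torch.__version__
    # ROCm-style builds set torch.version.hip; CUDA builds leave it as None.
    report["hip"] = getattr(torch.version, "hip", None)
    report["available"] = torch.cuda.is_available()
    if report["available"]:
        count = torch.cuda.device_count()
        report["devices"] = [torch.cuda.get_device_name(i) for i in range(count)]
    return report


for key, value in describe_accelerators().items():
    print(f"{key}: {value}")
```

An empty devices list inside the container usually means the accelerator is not visible to the process, which is worth resolving before starting a training run.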
2. Software and hardware dependencies

Note: according to requirements.txt, the LLaMA-Factory project requires at least transformers>=4.41.2 and vllm>=0.4.3.
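The two minimums quoted above can be checked against the active environment. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the version parsing is deliberately simplified (numeric release segment only, with local suffixes such as +git... ignored), so it is an assumption-laden shortcut rather than full PEP 440 handling.

```python
import re
from importlib import metadata

# Minimums quoted above from requirements.txt.
MINIMUMS = {"transformers": "4.41.2", "vllm": "0.4.3"}


def release_tuple(version):
    """Turn '0.3.3+git3380931.abi0' into (0, 3, 3), dropping local and dev suffixes."""
    public = version.split("+", 1)[0]
    numbers = []
    for part in public.split("."):
        match = re.fullmatch(r"\d+", part)
        if not match:
            break
        numbers.append(int(part))
    return tuple(numbers)


def meets_minimum(installed, minimum):
    """Compare release tuples after padding them to the same length with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS):
    """Map each package to (installed version or None, whether the minimum is met)."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
        else:
            report[name] = (installed, meets_minimum(installed, minimum))
    return report
```

Calling check_environment() in the freshly cloned environment should flag any package that is older than the quoted minimum before the install steps are run.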
Required        Minimum    Recommended
python          3.8        3.11
torch           1.13.1     2.3.0
transformers    4.41.2     4.41.2
datasets        2.16.0     2.19.2
accelerate      0.30.1     0.30.1
peft            0.11.1     0.11.1
trl             0.8.6      0.9.4

Optional        Minimum    Recommended
CUDA            11.6       12.2
deepspeed       0.10.0     0.14.0
bitsandbytes    0.39.0     0.43.1
vllm            0.4.3      0.4.3
flash-attn      2.3.0      2.5.9

3. Clone the base environment

    root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# conda create -n llama_factory_torch --clone base
    Source:      /opt/conda
    Destination: /opt/conda/envs/llama_factory_torch
    The following packages cannot be cloned out of the root environment:
    - https://repo.anaconda.com/pkgs/main/linux-64::conda-23.7.4-py310h06a4308_0
    Packages: 44
    Files: 53489
    Downloading and Extracting Packages
    Downloading and Extracting Packages
    Preparing transaction: done
    Verifying transaction: done
    Executing transaction: done
    #
    # To activate this environment, use
    #
    #     $ conda activate llama_factory_torch
    #
    # To deactivate an active environment, use
    #
    #     $ conda deactivate
4. Install LLaMA Factory

    git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
    cd LLaMA-Factory
    pip install -e ".[torch,metrics]"

    root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# source activate llama_factory_torch
    (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# pip install -e ".[torch,metrics]"
    Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
    Obtaining file:///public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory
      Installing build dependencies ... done
      Checking if build backend supports build_editable ... done
      Getting requirements to build editable ... done
      Preparing editable metadata (pyproject.toml) ... done
    ...
    Checking if build backend supports build_editable ... done
    Building wheels for collected packages: llamafactory
      Building editable for llamafactory (pyproject.toml) ... done
      Created wheel for llamafactory: filename=llamafactory-0.8.4.dev0-0.editable-py3-none-any.whl size=20781 sha256=70c0480e2b648516e0eac3d39371d4100cbdaa1f277d87b657bf2adec9e0b2be
      Stored in directory: /tmp/pip-ephem-wheel-cache-uhypmj_8/wheels/e9/b4/89/f13e921e37904ee0c839434aad2d7b2951c2c68e596667c7ef
    Successfully built llamafactory
    DEPRECATION: lmdeploy 0.1.0-git782048c.abi0.dtk2404.torch2.1. has a non-standard version number. pip 24.1 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of lmdeploy or contact the author to suggest that they release a version with a conforming version number. Discussion can be found at https://github.com/pypa/pip/issues/12063
    DEPRECATION: mmcv 2.0.1-gitc0ccf15.abi0.dtk2404.torch2.1. has a non-standard version number. pip 24.1 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of mmcv or contact the author to suggest that they release a version with a conforming version number. Discussion can be found at https://github.com/pypa/pip/issues/12063
    Installing collected packages: pydub, jieba, urllib3, tomlkit, shtab, semantic-version, scipy, ruff, rouge-chinese, joblib, importlib-resources, ffmpy, docstring-parser, aiofiles, nltk, tyro, sse-starlette, tokenizers, gradio-client, transformers, trl, peft, gradio, llamafactory
      Attempting uninstall: urllib3
        Found existing installation: urllib3 1.26.13
        Uninstalling urllib3-1.26.13:
          Successfully uninstalled urllib3-1.26.13
      Attempting uninstall: tokenizers
        Found existing installation: tokenizers 0.15.0
        Uninstalling tokenizers-0.15.0:
          Successfully uninstalled tokenizers-0.15.0
      Attempting uninstall: transformers
        Found existing installation: transformers 4.38.0
        Uninstalling transformers-4.38.0:
          Successfully uninstalled transformers-4.38.0
    ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
    lmdeploy 0.1.0-git782048c.abi0.dtk2404.torch2.1. requires transformers==4.33.2, but you have transformers 4.43.3 which is incompatible.
    Successfully installed aiofiles-23.2.1 docstring-parser-0.16 ffmpy-0.4.0 gradio-4.40.0 gradio-client-1.2.0 importlib-resources-6.4.0 jieba-0.42.1 joblib-1.4.2 llamafactory-0.8.4.dev0 nltk-3.8.1 peft-0.12.0 pydub-0.25.1 rouge-chinese-1.0.3 ruff-0.5.5 scipy-1.14.0 semantic-version-2.10.0 shtab-1.7.1 sse-starlette-2.1.3 tokenizers-0.19.1 tomlkit-0.12.0 transformers-4.43.3 trl-0.9.6 tyro-0.8.5 urllib3-2.2.2
    WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
    [notice] A new release of pip is available: 24.0 -> 24.2
    [notice] To update, run: pip install --upgrade pip
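The install log above ends with a dependency conflict: the lmdeploy build shipped in the image pins transformers==4.33.2, while the install pulled in transformers 4.43.3. When such logs get long, the conflict lines are easy to miss. The following is a minimal sketch that extracts them from captured pip output. The regular expression assumes the wording seen in this log, and the function name find_conflicts is hypothetical.

```python
import re

# Matches lines of the form seen in the log above:
#   <pkg> <version> requires <requirement>, but you have <dist> <version> which is incompatible.
CONFLICT = re.compile(
    r"^(?P<package>\S+) (?P<package_version>\S+) requires (?P<requirement>.+?), "
    r"but you have (?P<installed>\S+) (?P<installed_version>\S+) which is incompatible\.$"
)


def find_conflicts(pip_output):
    """Return one dict per conflict line found in the captured pip output."""
    conflicts = []
    for line in pip_output.splitlines():
        match = CONFLICT.match(line.strip())
        if match:
            conflicts.append(match.groupdict())
    return conflicts


sample = (
    "ERROR: pip's dependency resolver does not currently take into account "
    "all the packages that are installed.\n"
    "lmdeploy 0.1.0-git782048c.abi0.dtk2404.torch2.1. requires transformers==4.33.2, "
    "but you have transformers 4.43.3 which is incompatible.\n"
)
for conflict in find_conflicts(sample):
    print(conflict["package"], "wants", conflict["requirement"],
          "but found", conflict["installed_version"])
```

Whether a reported conflict matters depends on whether the pinned package (here lmdeploy) is actually used in the workflow.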
5. Resolve dependency conflicts

    (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# pip install --no-deps -e .
    Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
    Obtaining file:///public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory
      Installing build dependencies ... done
      Checking if build backend supports build_editable ... done
      Getting requirements to build editable ... done
      Preparing editable metadata (pyproject.toml) ... done
    Building wheels for collected packages: llamafactory
      Building editable for llamafactory (pyproject.toml) ... done
      Created wheel for llamafactory: filename=llamafactory-0.8.4.dev0-0.editable-py3-none-any.whl size=20781 sha256=f874a791bc9fdca02075cda0459104b48a57d300a077eca00eee7221cde429c3
      Stored in directory: /tmp/pip-ephem-wheel-cache-7vjiq3f3/wheels/e9/b4/89/f13e921e37904ee0c839434aad2d7b2951c2c68e596667c7ef
    Successfully built llamafactory
    Installing collected packages: llamafactory
      Attempting uninstall: llamafactory
        Found existing installation: llamafactory 0.8.4.dev0
        Uninstalling llamafactory-0.8.4.dev0:
          Successfully uninstalled llamafactory-0.8.4.dev0
    Successfully installed llamafactory-0.8.4.dev0
    WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
    [notice] A new release of pip is available: 24.0 -> 24.2
    [notice] To update, run: pip install --upgrade pip
6. Install vllm==0.4.3

    (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# pip list | grep llvm
    [notice] A new release of pip is available: 24.0 -> 24.2
    [notice] To update, run: pip install --upgrade pip
    (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# pip install --no-dependencies vllm==0.4.3
    Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
    Collecting vllm==0.4.3
      Using cached https://pypi.tuna.tsinghua.edu.cn/packages/1a/1e/10bcb6566f4fa8b95ff85bddfd1675ff7db33ba861f59bd70aa3b92a46b7/vllm-0.4.3-cp310-cp310-manylinux1_x86_64.whl (131.1 MB)
    Installing collected packages: vllm
      Attempting uninstall: vllm
        Found existing installation: vllm 0.3.3+git3380931.abi0.dtk2404.torch2.1
        Uninstalling vllm-0.3.3+git3380931.abi0.dtk2404.torch2.1:
          Successfully uninstalled vllm-0.3.3+git3380931.abi0.dtk2404.torch2.1
    Successfully installed vllm-0.4.3
    WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
    [notice] A new release of pip is available: 24.0 -> 24.2
    [notice] To update, run: pip install --upgrade pip
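III. Key steps

1. Obtain an access token

Obtain an access token and log in to your Hugging Face account.
For detailed steps, see the companion blog post: "Accelerating downloads of large models and datasets from Hugging Face and ModelScope".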
    pip install --upgrade huggingface_hub
    huggingface-cli login
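For non-interactive sessions (for example a batch job on the platform), the same login can be done from Python. The following is a minimal sketch, assuming the token has been exported as an environment variable beforehand. The helper names resolve_token and login_from_environment are hypothetical; login is the function provided by huggingface_hub.

```python
import os


def resolve_token(env=os.environ, names=("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN")):
    """Return the first non-empty token found in the given mapping, else None."""
    for name in names:
        value = env.get(name, "").strip()
        if value:
            return value
    return None


def login_from_environment():
    """Log in with a token taken from the environment instead of an interactive prompt."""
    token = resolve_token()
    if token is None:
        raise SystemExit("No token found; set HF_TOKEN or run `huggingface-cli login`.")
    # Imported lazily so that resolve_token stays usable without huggingface_hub installed.
    from huggingface_hub import login
    login(token=token)
```

Calling login_from_environment() once at the start of a script has the same effect as the interactive command above, without pasting the token into a terminal.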
2. The llamafactory-cli command

Run llamafactory-cli help to display the help information.
    (llama_fct) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli help
    No ROCm runtime is found, using ROCM_HOME='/opt/dtk'
    /opt/conda/envs/llama_fct/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_hip.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
      warn(
    [2024-08-01 15:12:24,629] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
    ----------------------------------------------------------------------
    | Usage:                                                             |
    |   llamafactory-cli api -h: launch an OpenAI-style API server       |
    |   llamafactory-cli chat -h: launch a chat interface in CLI         |
    |   llamafactory-cli eval -h: evaluate models                        |
    |   llamafactory-cli export -h: merge LoRA adapters and export model |
    |   llamafactory-cli train -h: train models                          |
    |   llamafactory-cli webchat -h: launch a chat interface in Web UI   |
    |   llamafactory-cli webui: launch LlamaBoard                        |
    |   llamafactory-cli version: show version info                      |
    ----------------------------------------------------------------------
3. Quick start

The following three commands run LoRA fine-tuning, merging, and inference on the Llama3-8B-Instruct model, respectively.
    llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
    llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
    llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
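The three stages depend on each other: export needs the adapter produced by train, and chat needs a model to load. The following is a minimal sketch of a wrapper that runs them in order and stops at the first failure. The function names are hypothetical, and the config paths are the ones quoted above.

```python
import shlex
import subprocess

# The three stages from the quick start, in the order they must run.
STAGES = [
    ("train", "examples/train_lora/llama3_lora_sft.yaml"),
    ("export", "examples/merge_lora/llama3_lora_sft.yaml"),
    ("chat", "examples/inference/llama3_lora_sft.yaml"),
]


def build_commands(stages=STAGES, cli="llamafactory-cli"):
    """Return one argument list per stage, suitable for subprocess.run."""
    return [[cli, action, config] for action, config in stages]


def run_pipeline(commands, dry_run=False):
    """Run the commands in order; with dry_run=True only print them."""
    for command in commands:
        print("$", shlex.join(command))
        if not dry_run:
            # check=True raises CalledProcessError, so a failed stage stops the sequence.
            subprocess.run(command, check=True)


run_pipeline(build_commands(), dry_run=True)
```

Note that the chat stage opens an interactive prompt, so in unattended runs it may be preferable to pass only the first two commands to run_pipeline.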
3.1 LoRA fine-tuning

Fine-tuning is performed on the DCU.
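The single-card log in the next subsection reports 20,971,520 trainable parameters out of 8,051,232,768 (0.2605%). That figure can be reproduced from the model shapes printed in the same log. The following is a minimal sketch; the LoRA rank of 8 is an assumption (it is not printed in the log), chosen because it matches the reported count when adapters are attached to all seven linear modules listed under "Found linear modules".

```python
# Shapes from the LlamaConfig printed in the training log.
HIDDEN = 4096
INTERMEDIATE = 14336
NUM_LAYERS = 32
NUM_HEADS = 32
NUM_KV_HEADS = 8
HEAD_DIM = HIDDEN // NUM_HEADS        # 128
KV_DIM = NUM_KV_HEADS * HEAD_DIM      # 1024, because of grouped-query attention

# (in_features, out_features) of each linear module that receives an adapter.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(rank, targets=TARGETS, layers=NUM_LAYERS):
    """Each adapter adds A (rank x in) and B (out x rank): rank * (in + out) weights."""
    per_layer = sum(rank * (fan_in + fan_out) for fan_in, fan_out in targets.values())
    return per_layer * layers


ASSUMED_RANK = 8                      # assumption: not shown in the log
trainable = lora_params(ASSUMED_RANK)
total = 8_051_232_768                 # "all params" as logged (base weights plus adapters)
print(f"trainable params: {trainable:,} || trainable%: {100 * trainable / total:.4f}")
```

Doubling the rank doubles the adapter size, since the count is linear in the rank.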
3.1.1 Single-card environment

    (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
    [2024-08-01 19:06:41,134] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
    08/01/2024 19:06:44 - INFO - llamafactory.hparams.parser - Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: False, compute dtype: torch.bfloat16
    [INFO|tokenization_utils_base.py:2287] 2024-08-01 19:06:45,194 >> loading file tokenizer.json
    [INFO|tokenization_utils_base.py:2287] 2024-08-01 19:06:45,194 >> loading file added_tokens.json
    [INFO|tokenization_utils_base.py:2287] 2024-08-01 19:06:45,194 >> loading file special_tokens_map.json
    [INFO|tokenization_utils_base.py:2287] 2024-08-01 19:06:45,194 >> loading file tokenizer_config.json
    [INFO|tokenization_utils_base.py:2533] 2024-08-01 19:06:45,563 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
    08/01/2024 19:06:45 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
    08/01/2024 19:06:45 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
    08/01/2024 19:06:45 - INFO - llamafactory.data.loader - Loading dataset identity.json...
    Converting format of dataset (num_proc=16): 100%|███████████████████| 91/91 [00:00<00:00, 444.18 examples/s]
    08/01/2024 19:06:47 - INFO - llamafactory.data.loader - Loading dataset alpaca_en_demo.json...
    Converting format of dataset (num_proc=16): 100%|██████████████| 1000/1000 [00:00<00:00, 4851.17 examples/s]
    Running tokenizer on dataset (num_proc=16): 100%|███████████████| 1091/1091 [00:02<00:00, 375.29 examples/s]
    training example:
    input_ids:
    [128000, 128006, 882, 128007, 271, 6151, 128009, 128006, 78191, 128007, 271, 9906, 0, 358, 1097, 5991, 609, 39254, 459, 15592, 18328, 8040, 555, 5991, 3170, 3500, 13, 2650, 649, 358, 7945, 499, 3432, 30, 128009]
    inputs:
    <|begin_of_text|><|start_header_id|>user<|end_header_id|>
    hi<|eot_id|><|start_header_id|>assistant<|end_header_id|>
    Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?<|eot_id|>
    label_ids:
    [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 9906, 0, 358, 1097, 5991, 609, 39254, 459, 15592, 18328, 8040, 555, 5991, 3170, 3500, 13, 2650, 649, 358, 7945, 499, 3432, 30, 128009]
    labels:
    Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?<|eot_id|>
    [INFO|configuration_utils.py:731] 2024-08-01 19:06:53,502 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
    [INFO|configuration_utils.py:800] 2024-08-01 19:06:53,503 >> Model config LlamaConfig {
      "_name_or_path": "/root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct",
      "architectures": [
        "LlamaForCausalLM"
      ],
      "attention_bias": false,
      "attention_dropout": 0.0,
      "bos_token_id": 128000,
      "eos_token_id": 128009,
      "hidden_act": "silu",
      "hidden_size": 4096,
      "initializer_range": 0.02,
      "intermediate_size": 14336,
      "max_position_embeddings": 8192,
      "mlp_bias": false,
      "model_type": "llama",
      "num_attention_heads": 32,
      "num_hidden_layers": 32,
      "num_key_value_heads": 8,
      "pretraining_tp": 1,
      "rms_norm_eps": 1e-05,
      "rope_scaling": null,
      "rope_theta": 500000.0,
      "tie_word_embeddings": false,
      "torch_dtype": "bfloat16",
      "transformers_version": "4.43.3",
      "use_cache": true,
      "vocab_size": 128256
    }
    [INFO|modeling_utils.py:3631] 2024-08-01 19:06:53,534 >> loading weights file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/model.safetensors.index.json
    [INFO|modeling_utils.py:1572] 2024-08-01 19:06:53,534 >> Instantiating LlamaForCausalLM model under default dtype torch.bfloat16.
    [INFO|configuration_utils.py:1038] 2024-08-01 19:06:53,536 >> Generate config GenerationConfig {
      "bos_token_id": 128000,
      "eos_token_id": 128009
    }
    Loading checkpoint shards: 100%|██████████████████████████████████████████████| 4/4 [00:08<00:00,  2.04s/it]
    [INFO|modeling_utils.py:4463] 2024-08-01 19:07:01,775 >> All model checkpoint weights were used when initializing LlamaForCausalLM.
    [INFO|modeling_utils.py:4471] 2024-08-01 19:07:01,775 >> All the weights of LlamaForCausalLM were initialized from the model checkpoint at /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct.
    If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
    [INFO|configuration_utils.py:991] 2024-08-01 19:07:01,779 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/generation_config.json
    [INFO|configuration_utils.py:1038] 2024-08-01 19:07:01,780 >> Generate config GenerationConfig {
      "bos_token_id": 128000,
      "do_sample": true,
      "eos_token_id": [
        128001,
        128009
      ],
      "max_length": 4096,
      "temperature": 0.6,
      "top_p": 0.9
    }
    08/01/2024 19:07:01 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
    08/01/2024 19:07:01 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
    08/01/2024 19:07:01 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
    08/01/2024 19:07:01 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
    08/01/2024 19:07:01 - INFO - llamafactory.model.model_utils.misc - Found linear modules: q_proj,up_proj,v_proj,down_proj,k_proj,o_proj,gate_proj
    08/01/2024 19:07:04 - INFO - llamafactory.model.loader - trainable params: 20,971,520 || all params: 8,051,232,768 || trainable%: 0.2605
    Detected kernel version 3.10.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
    [INFO|trainer.py:648] 2024-08-01 19:07:04,471 >> Using auto half precision backend
    [INFO|trainer.py:2134] 2024-08-01 19:07:04,831 >> ***** Running training *****
    [INFO|trainer.py:2135] 2024-08-01 19:07:04,831 >>   Num examples = 981
    [INFO|trainer.py:2136] 2024-08-01 19:07:04,831 >>   Num Epochs = 3
    [INFO|trainer.py:2137] 2024-08-01 19:07:04,832 >>   Instantaneous batch size per device = 1
    [INFO|trainer.py:2140] 2024-08-01 19:07:04,832 >>   Total train batch size (w. parallel, distributed & accumulation) = 8
    [INFO|trainer.py:2141] 2024-08-01 19:07:04,832 >>   Gradient Accumulation steps = 8
    [INFO|trainer.py:2142] 2024-08-01 19:07:04,832 >>   Total optimization steps = 366
    [INFO|trainer.py:2143] 2024-08-01 19:07:04,836 >>   Number of trainable parameters = 20,971,520
    {'loss': 1.5025, 'grad_norm': 1.3309401273727417, 'learning_rate': 2.702702702702703e-05, 'epoch': 0.08}
    {'loss': 1.3424, 'grad_norm': 1.8096668720245361, 'learning_rate': 5.405405405405406e-05, 'epoch': 0.16}
    {'loss': 1.1286, 'grad_norm': 1.2990491390228271, 'learning_rate': 8.108108108108109e-05, 'epoch': 0.24}
    {'loss': 0.9808, 'grad_norm': 1.1075998544692993, 'learning_rate': 9.997948550797227e-05, 'epoch': 0.33}
    {'loss': 0.9924, 'grad_norm': 1.8073676824569702, 'learning_rate': 9.961525153583327e-05, 'epoch': 0.41}
    {'loss': 1.0052, 'grad_norm': 1.2079122066497803, 'learning_rate': 9.879896064123961e-05, 'epoch': 0.49}
    {'loss': 0.9973, 'grad_norm': 1.7361079454421997, 'learning_rate': 9.753805025397779e-05, 'epoch': 0.57}
    {'loss': 0.8488, 'grad_norm': 1.1059085130691528, 'learning_rate': 9.584400884284545e-05, 'epoch': 0.65}
    {'loss': 0.9893, 'grad_norm': 0.8711654543876648, 'learning_rate': 9.373227124134888e-05, 'epoch': 0.73}
    {'loss': 0.9116, 'grad_norm': 1.3793599605560303, 'learning_rate': 9.122207801708802e-05, 'epoch': 0.82}
    {'loss': 1.0429, 'grad_norm': 1.3769993782043457, 'learning_rate': 8.833630016614976e-05, 'epoch': 0.9}
    {'loss': 0.9323, 'grad_norm': 1.2503643035888672, 'learning_rate': 8.510123072976239e-05, 'epoch': 0.98}
    {'loss': 0.9213, 'grad_norm': 2.449227809906006, 'learning_rate': 8.154634523184388e-05, 'epoch': 1.06}
    {'loss': 0.8386, 'grad_norm': 1.009852409362793, 'learning_rate': 7.770403312015721e-05, 'epoch': 1.14}
    40%|███████████████████████████▌                                         | 146/366 [10:19<15:11,  4.14s/it]
    {'loss': 0.856, 'grad_norm': 0.863474428653717, 'learning_rate': 7.360930265797935e-05, 'epoch': 1.22}
    {'loss': 0.838, 'grad_norm': 0.712546169757843, 'learning_rate': 6.929946195508932e-05, 'epoch': 1.3}
    {'loss': 0.8268, 'grad_norm': 1.6060960292816162, 'learning_rate': 6.481377904428171e-05, 'epoch': 1.39}
    {'loss': 0.7326, 'grad_norm': 0.7863644957542419, 'learning_rate': 6.019312410053286e-05, 'epoch': 1.47}
    {'loss': 0.7823, 'grad_norm': 0.8964634537696838, 'learning_rate': 5.547959706265068e-05, 'epoch': 1.55}
    {'loss': 0.7599, 'grad_norm': 0.5305138826370239, 'learning_rate': 5.0716144050239375e-05, 'epoch': 1.63}
    {'loss': 0.815, 'grad_norm': 0.8153926730155945, 'learning_rate': 4.594616607090028e-05, 'epoch': 1.71}
    {'loss': 0.8258, 'grad_norm': 1.3266267776489258, 'learning_rate': 4.121312358283463e-05, 'epoch': 1.79}
    {'loss': 0.7446, 'grad_norm': 1.8706341981887817, 'learning_rate': 3.656014051577713e-05, 'epoch': 1.88}
    {'loss': 0.7539, 'grad_norm': 1.5148639678955078, 'learning_rate': 3.202961135812437e-05, 'epoch': 1.96}
    {'loss': 0.7512, 'grad_norm': 1.3771291971206665, 'learning_rate': 2.7662814890184818e-05, 'epoch': 2.04}
    {'loss': 0.7128, 'grad_norm': 1.420331597328186, 'learning_rate': 2.3499538082923606e-05, 'epoch': 2.12}
    {'loss': 0.635, 'grad_norm': 0.9235875010490417, 'learning_rate': 1.9577713588953795e-05, 'epoch': 2.2}
    {'loss': 0.6628, 'grad_norm': 1.6558737754821777, 'learning_rate': 1.5933074128684332e-05, 'epoch': 2.28}
    {'loss': 0.681, 'grad_norm': 0.8138720393180847, 'learning_rate': 1.2598826920598772e-05, 'epoch': 2.36}
    {'loss': 0.6707, 'grad_norm': 1.0700312852859497, 'learning_rate': 9.605351122011309e-06, 'epoch': 2.45}
    {'loss': 0.6201, 'grad_norm': 1.3334729671478271, 'learning_rate': 6.979921036993042e-06, 'epoch': 2.53}
    {'loss': 0.6698, 'grad_norm': 1.440247893333435, 'learning_rate': 4.746457613389904e-06, 'epoch': 2.61}
    {'loss': 0.7072, 'grad_norm': 0.9171076416969299, 'learning_rate': 2.925310493105099e-06, 'epoch': 2.69}
    {'loss': 0.6871, 'grad_norm': 0.9809044003486633, 'learning_rate': 1.5330726014397668e-06, 'epoch': 2.77}
    {'loss': 0.5931, 'grad_norm': 1.7158288955688477, 'learning_rate': 5.824289648152126e-07, 'epoch': 2.85}
    {'loss': 0.6827, 'grad_norm': 1.3241132497787476, 'learning_rate': 8.204113433559201e-08, 'epoch': 2.94}
    100%|█████████████████████████████████████████████████████████████████████| 366/366 [25:42<00:00,  4.02s/it]
    [INFO|trainer.py:3503] 2024-08-01 19:32:47,527 >> Saving model checkpoint to saves/llama3-8b/lora/sft/checkpoint-366
    [INFO|configuration_utils.py:731] 2024-08-01 19:32:47,556 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
    [INFO|configuration_utils.py:800] 2024-08-01 19:32:47,557 >> Model config LlamaConfig {
      "architectures": [
        "LlamaForCausalLM"
      ],
      "attention_bias": false,
      "attention_dropout": 0.0,
      "bos_token_id": 128000,
      "eos_token_id": 128009,
      "hidden_act": "silu",
      "hidden_size": 4096,
      "initializer_range": 0.02,
      "intermediate_size": 14336,
      "max_position_embeddings": 8192,
      "mlp_bias": false,
      "model_type": "llama",
      "num_attention_heads": 32,
      "num_hidden_layers": 32,
      "num_key_value_heads": 8,
      "pretraining_tp": 1,
      "rms_norm_eps": 1e-05,
      "rope_scaling": null,
      "rope_theta": 500000.0,
      "tie_word_embeddings": false,
      "torch_dtype": "bfloat16",
      "transformers_version": "4.43.3",
      "use_cache": true,
      "vocab_size": 128256
    }
    [INFO|tokenization_utils_base.py:2702] 2024-08-01 19:32:47,675 >> tokenizer config file saved in saves/llama3-8b/lora/sft/checkpoint-366/tokenizer_config.json
    [INFO|tokenization_utils_base.py:2711] 2024-08-01 19:32:47,677 >> Special tokens file saved in saves/llama3-8b/lora/sft/checkpoint-366/special_tokens_map.json
    [INFO|trainer.py:2394] 2024-08-01 19:32:48,046 >>
    Training completed. Do not forget to share your model on huggingface.co/models =)
    {'train_runtime': 1543.2099, 'train_samples_per_second': 1.907, 'train_steps_per_second': 0.237, 'train_loss': 0.8416516305318947, 'epoch': 2.98}
    100%|█████████████████████████████████████████████████████████████████████| 366/366 [25:43<00:00,  4.22s/it]
    [INFO|trainer.py:3503] 2024-08-01 19:32:48,050 >> Saving model checkpoint to saves/llama3-8b/lora/sft
    [INFO|configuration_utils.py:731] 2024-08-01 19:32:48,081 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
    [INFO|configuration_utils.py:800] 2024-08-01 19:32:48,082 >> Model config LlamaConfig {
      "architectures": [
        "LlamaForCausalLM"
      ],
      "attention_bias": false,
      "attention_dropout": 0.0,
      "bos_token_id": 128000,
      "eos_token_id": 128009,
      "hidden_act": "silu",
      "hidden_size": 4096,
      "initializer_range": 0.02,
      "intermediate_size": 14336,
      "max_position_embeddings": 8192,
      "mlp_bias": false,
      "model_type": "llama",
      "num_attention_heads": 32,
      "num_hidden_layers": 32,
      "num_key_value_heads": 8,
      "pretraining_tp": 1,
      "rms_norm_eps": 1e-05,
      "rope_scaling": null,
      "rope_theta": 500000.0,
      "tie_word_embeddings": false,
      "torch_dtype": "bfloat16",
      "transformers_version": "4.43.3",
      "use_cache": true,
      "vocab_size": 128256
    }
    [INFO|tokenization_utils_base.py:2702] 2024-08-01 19:32:48,191 >> tokenizer config file saved in saves/llama3-8b/lora/sft/tokenizer_config.json
    [INFO|tokenization_utils_base.py:2711] 2024-08-01 19:32:48,192 >> Special tokens file saved in saves/llama3-8b/lora/sft/special_tokens_map.json
    ***** train metrics *****
      epoch                    =     2.9847
      total_flos               = 20619353GF
      train_loss               =     0.8417
      train_runtime            = 0:25:43.20
      train_samples_per_second =      1.907
      train_steps_per_second   =      0.237
    Figure saved at: saves/llama3-8b/lora/sft/training_loss.png
    08/01/2024 19:32:48 - WARNING - llamafactory.extras.ploting - No metric eval_loss to plot.
    08/01/2024 19:32:48 - WARNING - llamafactory.extras.ploting - No metric eval_accuracy to plot.
    [INFO|trainer.py:3819] 2024-08-01 19:32:48,529 >>
    ***** Running Evaluation *****
    [INFO|trainer.py:3821] 2024-08-01 19:32:48,529 >>   Num examples = 110
    [INFO|trainer.py:3824] 2024-08-01 19:32:48,529 >>   Batch size = 1
    100%|█████████████████████████████████████████████████████████████████████| 110/110 [00:18<00:00,  6.07it/s]
    ***** eval metrics *****
      epoch                   =     2.9847
      eval_loss               =     0.9957
      eval_runtime            = 0:00:18.23
      eval_samples_per_second =      6.031
      eval_steps_per_second   =      6.031
    [INFO|modelcard.py:449] 2024-08-01 19:33:06,773 >> Dropping the following result as it does not have all the necessary fields:
    {'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}
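Several numbers in the log above follow from a few settings and can be cross-checked by hand: 366 optimization steps, a final epoch of 2.9847, and the logged learning rates. The following is a minimal sketch of that arithmetic. The peak learning rate of 1e-4, the warmup ratio of 0.1, and the cosine decay are assumptions inferred from the logged values (they are not printed in the log), and the function names are hypothetical.

```python
import math

NUM_EXAMPLES = 981        # "Num examples" in the log
PER_DEVICE_BATCH = 1      # "Instantaneous batch size per device"
GRAD_ACCUM = 8            # "Gradient Accumulation steps"
EPOCHS = 3                # "Num Epochs"

# Assumptions inferred from the logged learning rates, not printed in the log.
PEAK_LR = 1e-4
WARMUP_RATIO = 0.1


def total_steps(examples, batch, accum, epochs, devices=1):
    """Optimizer updates for the whole run."""
    batches_per_epoch = math.ceil(examples / (batch * devices))
    # Floor division reproduces the logged step count for this run.
    updates_per_epoch = max(batches_per_epoch // accum, 1)
    return math.ceil(epochs * updates_per_epoch)


def lr_at(step, total, peak=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to the peak, then cosine decay to zero."""
    warmup = math.ceil(total * warmup_ratio)
    if step < warmup:
        return peak * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


steps = total_steps(NUM_EXAMPLES, PER_DEVICE_BATCH, GRAD_ACCUM, EPOCHS)
final_epoch = steps * PER_DEVICE_BATCH * GRAD_ACCUM / NUM_EXAMPLES
print("steps:", steps, "final epoch:", round(final_epoch, 4))
for step in (10, 40, 360):
    print(step, lr_at(step, steps))
```

Under these assumptions the sketch yields 366 steps and a final epoch of 2.9847, and the learning rates at steps 10, 40, and 360 agree with the first, fourth, and last logged values.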
Output
    root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models# tree -L 6 LLaMA-Factory/saves/
    LLaMA-Factory/saves/
    `-- llama3-8b
        `-- lora
            `-- sft
                |-- README.md
                |-- adapter_config.json
                |-- adapter_model.safetensors
                |-- all_results.json
                |-- checkpoint-366
                |   |-- README.md
                |   |-- adapter_config.json
                |   |-- adapter_model.safetensors
                |   |-- optimizer.pt
                |   |-- rng_state.pth
                |   |-- scheduler.pt
                |   |-- special_tokens_map.json
                |   |-- tokenizer.json
                |   |-- tokenizer_config.json
                |   |-- trainer_state.json
                |   `-- training_args.bin
                |-- eval_results.json
                |-- special_tokens_map.json
                |-- tokenizer.json
                |-- tokenizer_config.json
                |-- train_results.json
                |-- trainer_log.jsonl
                |-- trainer_state.json
                |-- training_args.bin
                `-- training_loss.png
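Before moving on to the merge step, it can be useful to confirm that the adapter files shown in the listing above were actually written. The following is a minimal sketch; the function name missing_adapter_files is hypothetical, and the two file names are taken from the listing.

```python
from pathlib import Path

# File names taken from the directory listing above.
ADAPTER_FILES = ("adapter_config.json", "adapter_model.safetensors")


def missing_adapter_files(output_dir, required=ADAPTER_FILES):
    """Return the required adapter files that are absent from output_dir."""
    root = Path(output_dir)
    return [name for name in required if not (root / name).is_file()]


missing = missing_adapter_files("saves/llama3-8b/lora/sft")
print("missing:", missing if missing else "none")
```

An empty result means both files are present; any name in the list points to an interrupted or failed training run.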
Resource usage at runtime


3.1.2 Multi-card environment

    (llama_factory_torch) root@notebook-1819291427828183041-scnlbe5oi5-51898:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# CUDA_VISIBLE_DEVICES=0,1,2 llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
    [2024-08-02 18:08:58,775] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
    08/02/2024 18:09:01 - INFO - llamafactory.cli - Initializing distributed tasks at: 127.0.0.1:26472
    [2024-08-02 18:09:04,227] torch.distributed.run: [WARNING]
    [2024-08-02 18:09:04,227] torch.distributed.run: [WARNING] *****************************************
    [2024-08-02 18:09:04,227] torch.distributed.run: [WARNING] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
    [2024-08-02 18:09:04,227] torch.distributed.run: [WARNING] *****************************************
    [2024-08-02 18:09:09,155] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
    [2024-08-02 18:09:09,269] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
    [2024-08-02 18:09:09,457] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
    WARNING: Logging before InitGoogleLogging() is written to STDERR
    I0802 18:09:12.353489 95618 ProcessGroupNCCL.cpp:686] [Rank 2] ProcessGroupNCCL initialization options:NCCL_ASYNC_ERROR_HANDLING: 1, NCCL_DESYNC_DEBUG: 0, NCCL_ENABLE_TIMING: 0, NCCL_BLOCKING_WAIT: 0, TIMEOUT(ms): 180000000000, USE_HIGH_PRIORITY_STREAM: 0, TORCH_DISTRIBUTED_DEBUG: OFF, NCCL_DEBUG: OFF, ID=353053408
    08/02/2024 18:09:12 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
    08/02/2024 18:09:12 - INFO - llamafactory.hparams.parser - Process rank: 2, device: cuda:2, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
    WARNING: Logging before InitGoogleLogging() is written to STDERR
    I0802 18:09:12.555290 95617 ProcessGroupNCCL.cpp:686] [Rank 1] ProcessGroupNCCL initialization options:NCCL_ASYNC_ERROR_HANDLING: 1, NCCL_DESYNC_DEBUG: 0, NCCL_ENABLE_TIMING: 0, NCCL_BLOCKING_WAIT: 0, TIMEOUT(ms): 180000000000, USE_HIGH_PRIORITY_STREAM: 0, TORCH_DISTRIBUTED_DEBUG: OFF, NCCL_DEBUG: OFF, ID=369111936
    08/02/2024 18:09:12 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
    08/02/2024 18:09:12 - INFO - llamafactory.hparams.parser - Process rank: 1, device: cuda:1, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
    WARNING: Logging before InitGoogleLogging() is written to STDERR
    I0802 18:09:13.120337 95616 ProcessGroupNCCL.cpp:686] [Rank 0] ProcessGroupNCCL initialization options:NCCL_ASYNC_ERROR_HANDLING: 1, NCCL_DESYNC_DEBUG: 0, NCCL_ENABLE_TIMING: 0, NCCL_BLOCKING_WAIT: 0, TIMEOUT(ms): 180000000000, USE_HIGH_PRIORITY_STREAM: 0, TORCH_DISTRIBUTED_DEBUG: OFF, NCCL_DEBUG: OFF, ID=359553664
  21. 08/02/2024 18:09:13 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
  22. 08/02/2024 18:09:13 - INFO - llamafactory.hparams.parser - Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
  23. 08/02/2024 18:09:13 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
  24. 08/02/2024 18:09:13 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
  25. 08/02/2024 18:09:13 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
  26. 08/02/2024 18:09:13 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
  27. I0802 18:09:14.158418 95618 ProcessGroupNCCL.cpp:2780] Rank 2 using GPU 2 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
  28. I0802 18:09:14.165846 95617 ProcessGroupNCCL.cpp:2780] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
  29. [INFO|tokenization_utils_base.py:2287] 2024-08-02 18:09:14,276 >> loading file tokenizer.json
  30. [INFO|tokenization_utils_base.py:2287] 2024-08-02 18:09:14,276 >> loading file added_tokens.json
  31. [INFO|tokenization_utils_base.py:2287] 2024-08-02 18:09:14,276 >> loading file special_tokens_map.json
  32. [INFO|tokenization_utils_base.py:2287] 2024-08-02 18:09:14,276 >> loading file tokenizer_config.json
  33. [INFO|tokenization_utils_base.py:2533] 2024-08-02 18:09:14,684 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
  34. 08/02/2024 18:09:14 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
  35. 08/02/2024 18:09:14 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
  36. 08/02/2024 18:09:14 - INFO - llamafactory.data.loader - Loading dataset identity.json...
  37. Converting format of dataset (num_proc=16): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 91/91 [00:00<00:00, 301.60 examples/s]
  38. 08/02/2024 18:09:16 - INFO - llamafactory.data.loader - Loading dataset alpaca_en_demo.json...
  39. Converting format of dataset (num_proc=16): 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:00<00:00, 3399.93 examples/s]
  40. I0802 18:09:18.295866 95616 ProcessGroupNCCL.cpp:2780] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
  41. I0802 18:09:19.234498 95616 ProcessGroupNCCL.cpp:1340] NCCL_DEBUG: N/A
  42. 08/02/2024 18:09:19 - INFO - llamafactory.data.loader - Loading dataset identity.json...
  43. 08/02/2024 18:09:19 - INFO - llamafactory.data.loader - Loading dataset identity.json...
  44. Running tokenizer on dataset (num_proc=16):   0%|                                                                                                            | 0/1091 [00:00<?, ? examples/s]08/02/2024 18:09:20 - INFO - llamafactory.data.loader - Loading dataset alpaca_en_demo.json...
  45. 08/02/2024 18:09:20 - INFO - llamafactory.data.loader - Loading dataset alpaca_en_demo.json...
  46. Running tokenizer on dataset (num_proc=16): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 1091/1091 [00:03<00:00, 273.44 examples/s]
  47. training example:
  48. input_ids:
  49. [128000, 128006, 882, 128007, 271, 6151, 128009, 128006, 78191, 128007, 271, 9906, 0, 358, 1097, 5991, 609, 39254, 459, 15592, 18328, 8040, 555, 5991, 3170, 3500, 13, 2650, 649, 358, 7945, 499, 3432, 30, 128009]
  50. inputs:
  51. <|begin_of_text|><|start_header_id|>user<|end_header_id|>
  52. hi<|eot_id|><|start_header_id|>assistant<|end_header_id|>
  53. Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?<|eot_id|>
  54. label_ids:
  55. [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 9906, 0, 358, 1097, 5991, 609, 39254, 459, 15592, 18328, 8040, 555, 5991, 3170, 3500, 13, 2650, 649, 358, 7945, 499, 3432, 30, 128009]
  56. labels:
  57. Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?<|eot_id|>
  58. [INFO|configuration_utils.py:731] 2024-08-02 18:09:24,080 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
  59. [INFO|configuration_utils.py:800] 2024-08-02 18:09:24,082 >> Model config LlamaConfig {
  60.   "_name_or_path": "/root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct",
  61.   "architectures": [
  62.     "LlamaForCausalLM"
  63.   ],
  64.   "attention_bias": false,
  65.   "attention_dropout": 0.0,
  66.   "bos_token_id": 128000,
  67.   "eos_token_id": 128009,
  68.   "hidden_act": "silu",
  69.   "hidden_size": 4096,
  70.   "initializer_range": 0.02,
  71.   "intermediate_size": 14336,
  72.   "max_position_embeddings": 8192,
  73.   "mlp_bias": false,
  74.   "model_type": "llama",
  75.   "num_attention_heads": 32,
  76.   "num_hidden_layers": 32,
  77.   "num_key_value_heads": 8,
  78.   "pretraining_tp": 1,
  79.   "rms_norm_eps": 1e-05,
  80.   "rope_scaling": null,
  81.   "rope_theta": 500000.0,
  82.   "tie_word_embeddings": false,
  83.   "torch_dtype": "bfloat16",
  84.   "transformers_version": "4.43.3",
  85.   "use_cache": true,
  86.   "vocab_size": 128256
  87. }
  88. [INFO|modeling_utils.py:3631] 2024-08-02 18:09:24,119 >> loading weights file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/model.safetensors.index.json
  89. [INFO|modeling_utils.py:1572] 2024-08-02 18:09:24,119 >> Instantiating LlamaForCausalLM model under default dtype torch.bfloat16.
  90. [INFO|configuration_utils.py:1038] 2024-08-02 18:09:24,121 >> Generate config GenerationConfig {
  91.   "bos_token_id": 128000,
  92.   "eos_token_id": 128009
  93. }
  94. Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:09<00:00,  2.49s/it]
  95. 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
  96. 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
  97. 08/02/2024 18:09:34 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
  98. 08/02/2024 18:09:34 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
  99. 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.misc - Found linear modules: up_proj,gate_proj,o_proj,q_proj,k_proj,v_proj,down_proj
  100. Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:10<00:00,  2.59s/it]
  101. [INFO|modeling_utils.py:4463] 2024-08-02 18:09:34,552 >> All model checkpoint weights were used when initializing LlamaForCausalLM.
  102. [INFO|modeling_utils.py:4471] 2024-08-02 18:09:34,552 >> All the weights of LlamaForCausalLM were initialized from the model checkpoint at /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct.
  103. If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
  104. [INFO|configuration_utils.py:991] 2024-08-02 18:09:34,555 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/generation_config.json
  105. [INFO|configuration_utils.py:1038] 2024-08-02 18:09:34,555 >> Generate config GenerationConfig {
  106.   "bos_token_id": 128000,
  107.   "do_sample": true,
  108.   "eos_token_id": [
  109.     128001,
  110.     128009
  111.   ],
  112.   "max_length": 4096,
  113.   "temperature": 0.6,
  114.   "top_p": 0.9
  115. }
  116. 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
  117. 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
  118. 08/02/2024 18:09:34 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
  119. 08/02/2024 18:09:34 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
  120. 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.misc - Found linear modules: k_proj,o_proj,v_proj,down_proj,q_proj,up_proj,gate_proj
  121. Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:10<00:00,  2.52s/it]
  122. 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
  123. 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
  124. 08/02/2024 18:09:34 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
  125. 08/02/2024 18:09:34 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
  126. 08/02/2024 18:09:34 - INFO - llamafactory.model.model_utils.misc - Found linear modules: gate_proj,down_proj,q_proj,o_proj,up_proj,v_proj,k_proj
  127. 08/02/2024 18:09:34 - INFO - llamafactory.model.loader - trainable params: 20,971,520 || all params: 8,051,232,768 || trainable%: 0.2605
  128. 08/02/2024 18:09:34 - INFO - llamafactory.model.loader - trainable params: 20,971,520 || all params: 8,051,232,768 || trainable%: 0.2605
  129. Detected kernel version 3.10.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
  130. [INFO|trainer.py:648] 2024-08-02 18:09:34,983 >> Using auto half precision backend
  131. 08/02/2024 18:09:35 - INFO - llamafactory.model.loader - trainable params: 20,971,520 || all params: 8,051,232,768 || trainable%: 0.2605
  132. [INFO|trainer.py:2134] 2024-08-02 18:09:37,114 >> ***** Running training *****
  133. [INFO|trainer.py:2135] 2024-08-02 18:09:37,114 >>   Num examples = 981
  134. [INFO|trainer.py:2136] 2024-08-02 18:09:37,114 >>   Num Epochs = 3
  135. [INFO|trainer.py:2137] 2024-08-02 18:09:37,114 >>   Instantaneous batch size per device = 1
  136. [INFO|trainer.py:2140] 2024-08-02 18:09:37,114 >>   Total train batch size (w. parallel, distributed & accumulation) = 24
  137. [INFO|trainer.py:2141] 2024-08-02 18:09:37,114 >>   Gradient Accumulation steps = 8
  138. [INFO|trainer.py:2142] 2024-08-02 18:09:37,114 >>   Total optimization steps = 120
  139. [INFO|trainer.py:2143] 2024-08-02 18:09:37,119 >>   Number of trainable parameters = 20,971,520
  140. {'loss': 1.4267, 'grad_norm': 1.401288628578186, 'learning_rate': 8.333333333333334e-05, 'epoch': 0.24}
  141. {'loss': 1.1319, 'grad_norm': 1.4780751466751099, 'learning_rate': 9.865224352899119e-05, 'epoch': 0.49}
  142. {'loss': 0.9963, 'grad_norm': 0.532632052898407, 'learning_rate': 9.330127018922194e-05, 'epoch': 0.73}
  143. {'loss': 0.9792, 'grad_norm': 0.7996620535850525, 'learning_rate': 8.43120818934367e-05, 'epoch': 0.98}
  144. {'loss': 0.937, 'grad_norm': 0.4041236639022827, 'learning_rate': 7.243995901002312e-05, 'epoch': 1.22}
  145. {'loss': 0.8805, 'grad_norm': 0.5675532221794128, 'learning_rate': 5.868240888334653e-05, 'epoch': 1.47}
  146. {'loss': 0.8467, 'grad_norm': 0.5038197636604309, 'learning_rate': 4.4195354293738484e-05, 'epoch': 1.71}
  147. {'loss': 0.8612, 'grad_norm': 0.7851077914237976, 'learning_rate': 3.019601169804216e-05, 'epoch': 1.96}
  148. {'loss': 0.818, 'grad_norm': 0.450968474149704, 'learning_rate': 1.7860619515673033e-05, 'epoch': 2.2}
  149. {'loss': 0.8308, 'grad_norm': 0.5961077809333801, 'learning_rate': 8.225609429353187e-06, 'epoch': 2.45}
  150. {'loss': 0.8071, 'grad_norm': 0.5323781371116638, 'learning_rate': 2.100524384225555e-06, 'epoch': 2.69}
  151. {'loss': 0.8061, 'grad_norm': 0.7563619017601013, 'learning_rate': 0.0, 'epoch': 2.94}
  152. 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 120/120 [09:46<00:00,  4.56s/it][INFO|trainer.py:3503] 2024-08-02 18:19:24,273 >> Saving model checkpoint to saves/llama3-8b/lora/sft/checkpoint-120
  153. [INFO|configuration_utils.py:731] 2024-08-02 18:19:24,304 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
  154. [INFO|configuration_utils.py:800] 2024-08-02 18:19:24,305 >> Model config LlamaConfig {
  155.   "architectures": [
  156.     "LlamaForCausalLM"
  157.   ],
  158.   "attention_bias": false,
  159.   "attention_dropout": 0.0,
  160.   "bos_token_id": 128000,
  161.   "eos_token_id": 128009,
  162.   "hidden_act": "silu",
  163.   "hidden_size": 4096,
  164.   "initializer_range": 0.02,
  165.   "intermediate_size": 14336,
  166.   "max_position_embeddings": 8192,
  167.   "mlp_bias": false,
  168.   "model_type": "llama",
  169.   "num_attention_heads": 32,
  170.   "num_hidden_layers": 32,
  171.   "num_key_value_heads": 8,
  172.   "pretraining_tp": 1,
  173.   "rms_norm_eps": 1e-05,
  174.   "rope_scaling": null,
  175.   "rope_theta": 500000.0,
  176.   "tie_word_embeddings": false,
  177.   "torch_dtype": "bfloat16",
  178.   "transformers_version": "4.43.3",
  179.   "use_cache": true,
  180.   "vocab_size": 128256
  181. }
  182. [INFO|tokenization_utils_base.py:2702] 2024-08-02 18:19:24,432 >> tokenizer config file saved in saves/llama3-8b/lora/sft/checkpoint-120/tokenizer_config.json
  183. [INFO|tokenization_utils_base.py:2711] 2024-08-02 18:19:24,434 >> Special tokens file saved in saves/llama3-8b/lora/sft/checkpoint-120/special_tokens_map.json
  184. [INFO|trainer.py:2394] 2024-08-02 18:19:24,832 >>
  185. Training completed. Do not forget to share your model on huggingface.co/models =)
  186. {'train_runtime': 587.7138, 'train_samples_per_second': 5.008, 'train_steps_per_second': 0.204, 'train_loss': 0.9434665679931641, 'epoch': 2.94}
  187. 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 120/120 [09:47<00:00,  4.90s/it]
  188. [INFO|trainer.py:3503] 2024-08-02 18:19:24,837 >> Saving model checkpoint to saves/llama3-8b/lora/sft
  189. [INFO|configuration_utils.py:731] 2024-08-02 18:19:24,907 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
  190. [INFO|configuration_utils.py:800] 2024-08-02 18:19:24,908 >> Model config LlamaConfig {
  191.   "architectures": [
  192.     "LlamaForCausalLM"
  193.   ],
  194.   "attention_bias": false,
  195.   "attention_dropout": 0.0,
  196.   "bos_token_id": 128000,
  197.   "eos_token_id": 128009,
  198.   "hidden_act": "silu",
  199.   "hidden_size": 4096,
  200.   "initializer_range": 0.02,
  201.   "intermediate_size": 14336,
  202.   "max_position_embeddings": 8192,
  203.   "mlp_bias": false,
  204.   "model_type": "llama",
  205.   "num_attention_heads": 32,
  206.   "num_hidden_layers": 32,
  207.   "num_key_value_heads": 8,
  208.   "pretraining_tp": 1,
  209.   "rms_norm_eps": 1e-05,
  210.   "rope_scaling": null,
  211.   "rope_theta": 500000.0,
  212.   "tie_word_embeddings": false,
  213.   "torch_dtype": "bfloat16",
  214.   "transformers_version": "4.43.3",
  215.   "use_cache": true,
  216.   "vocab_size": 128256
  217. }
  218. [INFO|tokenization_utils_base.py:2702] 2024-08-02 18:19:25,048 >> tokenizer config file saved in saves/llama3-8b/lora/sft/tokenizer_config.json
  219. [INFO|tokenization_utils_base.py:2711] 2024-08-02 18:19:25,055 >> Special tokens file saved in saves/llama3-8b/lora/sft/special_tokens_map.json
  220. ***** train metrics *****
  221.   epoch                    =     2.9358
  222.   total_flos               = 20332711GF
  223.   train_loss               =     0.9435
  224.   train_runtime            = 0:09:47.71
  225.   train_samples_per_second =      5.008
  226.   train_steps_per_second   =      0.204
  227. Figure saved at: saves/llama3-8b/lora/sft/training_loss.png
  228. 08/02/2024 18:19:25 - WARNING - llamafactory.extras.ploting - No metric eval_loss to plot.
  229. 08/02/2024 18:19:25 - WARNING - llamafactory.extras.ploting - No metric eval_accuracy to plot.
  230. [INFO|trainer.py:3819] 2024-08-02 18:19:25,357 >>
  231. ***** Running Evaluation *****
  232. [INFO|trainer.py:3821] 2024-08-02 18:19:25,357 >>   Num examples = 110
  233. [INFO|trainer.py:3824] 2024-08-02 18:19:25,357 >>   Batch size = 1
  234. 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 37/37 [00:08<00:00,  4.50it/s]
  235. ***** eval metrics *****
  236.   epoch                   =     2.9358
  237.   eval_loss               =     0.9702
  238.   eval_runtime            = 0:00:08.33
  239.   eval_samples_per_second =     13.193
  240.   eval_steps_per_second   =      4.438
  241. [INFO|modelcard.py:449] 2024-08-02 18:19:33,712 >> Dropping the following result as it does not have all the necessary fields:
  242. {'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}
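The batch size and step count reported in the training log above can be reproduced with a little arithmetic. The sketch below is only a sanity check, not the trainer's actual code: the function name `optimizer_steps` is made up for illustration, and it assumes that each card receives an equal share of the 981 examples and that an incomplete gradient-accumulation group at the end of an epoch does not count as a step. Under those assumptions it matches the logged values.

```python
import math


def optimizer_steps(num_examples, num_devices, per_device_batch, grad_accum, epochs):
    """Hypothetical helper: estimate total batch size and optimizer steps."""
    # Micro-batches each device processes per epoch (data is split across devices).
    batches_per_device = math.ceil(num_examples / (num_devices * per_device_batch))
    # Assumption: one update per full group of `grad_accum` micro-batches.
    updates_per_epoch = max(batches_per_device // grad_accum, 1)
    total_batch = num_devices * per_device_batch * grad_accum
    total_steps = math.ceil(epochs * updates_per_epoch)
    return total_batch, total_steps


# Values taken from the log: 981 examples, 3 cards, batch size 1 per device,
# 8 gradient accumulation steps, 3 epochs.
total_batch, total_steps = optimizer_steps(981, 3, 1, 8, 3)

# Fraction of the dataset seen after all steps, comparable to the logged epoch.
epochs_done = total_steps * total_batch / 981

print(total_batch, total_steps, round(epochs_done, 4))  # 24 120 2.9358
```

This also explains why the final epoch is logged as 2.9358 rather than 3: 120 steps of 24 samples cover 2880 samples, slightly fewer than 3 × 981.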
Resource usage during the run


3.2 Model merging

Model merging is performed on the CPU.
  1. (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
  2. [2024-08-01 21:34:37,394] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
  3. [INFO|tokenization_utils_base.py:2287] 2024-08-01 21:34:41,664 >> loading file tokenizer.json
  4. [INFO|tokenization_utils_base.py:2287] 2024-08-01 21:34:41,664 >> loading file added_tokens.json
  5. [INFO|tokenization_utils_base.py:2287] 2024-08-01 21:34:41,664 >> loading file special_tokens_map.json
  6. [INFO|tokenization_utils_base.py:2287] 2024-08-01 21:34:41,664 >> loading file tokenizer_config.json
  7. [INFO|tokenization_utils_base.py:2533] 2024-08-01 21:34:42,030 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
  8. 08/01/2024 21:34:42 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
  9. 08/01/2024 21:34:42 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
  10. [INFO|configuration_utils.py:731] 2024-08-01 21:34:42,031 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
  11. [INFO|configuration_utils.py:800] 2024-08-01 21:34:42,032 >> Model config LlamaConfig {
  12.   "_name_or_path": "/root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct",
  13.   "architectures": [
  14.     "LlamaForCausalLM"
  15.   ],
  16.   "attention_bias": false,
  17.   "attention_dropout": 0.0,
  18.   "bos_token_id": 128000,
  19.   "eos_token_id": 128009,
  20.   "hidden_act": "silu",
  21.   "hidden_size": 4096,
  22.   "initializer_range": 0.02,
  23.   "intermediate_size": 14336,
  24.   "max_position_embeddings": 8192,
  25.   "mlp_bias": false,
  26.   "model_type": "llama",
  27.   "num_attention_heads": 32,
  28.   "num_hidden_layers": 32,
  29.   "num_key_value_heads": 8,
  30.   "pretraining_tp": 1,
  31.   "rms_norm_eps": 1e-05,
  32.   "rope_scaling": null,
  33.   "rope_theta": 500000.0,
  34.   "tie_word_embeddings": false,
  35.   "torch_dtype": "bfloat16",
  36.   "transformers_version": "4.43.3",
  37.   "use_cache": true,
  38.   "vocab_size": 128256
  39. }
  40. 08/01/2024 21:34:42 - INFO - llamafactory.model.patcher - Using KV cache for faster generation.
  41. [INFO|modeling_utils.py:3631] 2024-08-01 21:34:42,058 >> loading weights file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/model.safetensors.index.json
  42. [INFO|modeling_utils.py:1572] 2024-08-01 21:34:42,058 >> Instantiating LlamaForCausalLM model under default dtype torch.bfloat16.
  43. [INFO|configuration_utils.py:1038] 2024-08-01 21:34:42,059 >> Generate config GenerationConfig {
  44.   "bos_token_id": 128000,
  45.   "eos_token_id": 128009
  46. }
  47. Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00,  3.40it/s]
  48. [INFO|modeling_utils.py:4463] 2024-08-01 21:34:43,324 >> All model checkpoint weights were used when initializing LlamaForCausalLM.
  49. [INFO|modeling_utils.py:4471] 2024-08-01 21:34:43,324 >> All the weights of LlamaForCausalLM were initialized from the model checkpoint at /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct.
  50. If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
  51. [INFO|configuration_utils.py:991] 2024-08-01 21:34:43,327 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/generation_config.json
  52. [INFO|configuration_utils.py:1038] 2024-08-01 21:34:43,327 >> Generate config GenerationConfig {
  53.   "bos_token_id": 128000,
  54.   "do_sample": true,
  55.   "eos_token_id": [
  56.     128001,
  57.     128009
  58.   ],
  59.   "max_length": 4096,
  60.   "temperature": 0.6,
  61.   "top_p": 0.9
  62. }
  63. 08/01/2024 21:34:43 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
  64. 08/01/2024 21:40:34 - INFO - llamafactory.model.adapter - Merged 1 adapter(s).
  65. 08/01/2024 21:40:34 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/llama3-8b/lora/sft
  66. 08/01/2024 21:40:34 - INFO - llamafactory.model.loader - all params: 8,030,261,248
  67. 08/01/2024 21:40:34 - INFO - llamafactory.train.tuner - Convert model dtype to: torch.bfloat16.
  68. [INFO|configuration_utils.py:472] 2024-08-01 21:40:34,700 >> Configuration saved in models/llama3_lora_sft/config.json
  69. [INFO|configuration_utils.py:807] 2024-08-01 21:40:34,704 >> Configuration saved in models/llama3_lora_sft/generation_config.json
  70. [INFO|modeling_utils.py:2763] 2024-08-01 21:40:49,039 >> The model is bigger than the maximum size per checkpoint (2GB) and is going to be split in 9 checkpoint shards. You can find where each parameters has been saved in the index located at models/llama3_lora_sft/model.safetensors.index.json.
  71. [INFO|tokenization_utils_base.py:2702] 2024-08-01 21:40:49,046 >> tokenizer config file saved in models/llama3_lora_sft/tokenizer_config.json
  72. [INFO|tokenization_utils_base.py:2711] 2024-08-01 21:40:49,048 >> Special tokens file saved in models/llama3_lora_sft/special_tokens_map.json
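The parameter counts in the logs are consistent with each other: the merged model reports 8,030,261,248 parameters, while the training log reports 8,051,232,768 in total with 20,971,520 trainable. The sketch below recomputes the LoRA parameter count from the layer shapes in the logged LlamaConfig and the seven linear modules listed under "Found linear modules". The LoRA rank does not appear in the logs; `rank = 8` is an assumption, chosen because it reproduces the logged number.

```python
# Shapes taken from the logged LlamaConfig.
hidden, inter, layers = 4096, 14336, 32
heads, kv_heads = 32, 8
kv_dim = hidden // heads * kv_heads  # 1024: grouped-query attention shrinks k/v

# (in_features, out_features) of the linear modules that receive an adapter.
shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, inter),
    "up_proj": (hidden, inter),
    "down_proj": (inter, hidden),
}

rank = 8  # assumption: not shown in the logs

# Each adapted layer adds two matrices: A (rank x d_in) and B (d_out x rank).
lora_params = layers * sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())

merged_params = 8_030_261_248  # "all params" logged after merging
train_time_params = merged_params + lora_params
trainable_pct = round(100 * lora_params / train_time_params, 4)

print(lora_params, train_time_params, trainable_pct)  # 20971520 8051232768 0.2605
```

After merging, the adapter matrices are folded into the base weights, so the count drops back to that of the base model.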
Output:
  1. (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models# tree -L 6 LLaMA-Factory/models/llama3_lora_sft/
  2. LLaMA-Factory/models/llama3_lora_sft/
  3. |-- config.json
  4. |-- generation_config.json
  5. |-- model-00001-of-00009.safetensors
  6. |-- model-00002-of-00009.safetensors
  7. |-- model-00003-of-00009.safetensors
  8. |-- model-00004-of-00009.safetensors
  9. |-- model-00005-of-00009.safetensors
  10. |-- model-00006-of-00009.safetensors
  11. |-- model-00007-of-00009.safetensors
  12. |-- model-00008-of-00009.safetensors
  13. |-- model-00009-of-00009.safetensors
  14. |-- model.safetensors.index.json
  15. |-- special_tokens_map.json
  16. |-- tokenizer.json
  17. `-- tokenizer_config.json
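The nine shard files follow from the model size and the 2GB per-checkpoint limit mentioned in the export log. The estimate below is a rough sketch: it assumes 2 bytes per parameter (the model is saved as bfloat16) and reads "2GB" as 2 × 10^9 bytes. Because shards are split on whole tensors, this gives only a lower bound on the shard count, which here happens to equal the number of files produced.

```python
import math

params = 8_030_261_248        # "all params" from the export log
bytes_per_param = 2           # bfloat16
max_shard_bytes = 2 * 10**9   # assumption: "2GB" interpreted as decimal gigabytes

total_bytes = params * bytes_per_param
min_shards = math.ceil(total_bytes / max_shard_bytes)

print(total_bytes, min_shards)  # 16060522496 9
```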
Resource usage during the run

3.3 LoRA inference

  1. (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-20553:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
  2. [2024-08-02 22:08:48,070] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
  3. 2024-08-02 22:08:52,267 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
  4. [INFO|tokenization_utils_base.py:2287] 2024-08-02 22:08:52,535 >> loading file tokenizer.json
  5. [INFO|tokenization_utils_base.py:2287] 2024-08-02 22:08:52,535 >> loading file added_tokens.json
  6. [INFO|tokenization_utils_base.py:2287] 2024-08-02 22:08:52,535 >> loading file special_tokens_map.json
  7. [INFO|tokenization_utils_base.py:2287] 2024-08-02 22:08:52,535 >> loading file tokenizer_config.json
  8. [INFO|tokenization_utils_base.py:2533] 2024-08-02 22:08:52,818 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
  9. 08/02/2024 22:08:52 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
  10. 08/02/2024 22:08:52 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
  11. [INFO|configuration_utils.py:731] 2024-08-02 22:08:52,820 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
  12. [INFO|configuration_utils.py:800] 2024-08-02 22:08:52,821 >> Model config LlamaConfig {
  13.   "_name_or_path": "/root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct",
  14.   "architectures": [
  15.     "LlamaForCausalLM"
  16.   ],
  17.   "attention_bias": false,
  18.   "attention_dropout": 0.0,
  19.   "bos_token_id": 128000,
  20.   "eos_token_id": 128009,
  21.   "hidden_act": "silu",
  22.   "hidden_size": 4096,
  23.   "initializer_range": 0.02,
  24.   "intermediate_size": 14336,
  25.   "max_position_embeddings": 8192,
  26.   "mlp_bias": false,
  27.   "model_type": "llama",
  28.   "num_attention_heads": 32,
  29.   "num_hidden_layers": 32,
  30.   "num_key_value_heads": 8,
  31.   "pretraining_tp": 1,
  32.   "rms_norm_eps": 1e-05,
  33.   "rope_scaling": null,
  34.   "rope_theta": 500000.0,
  35.   "tie_word_embeddings": false,
  36.   "torch_dtype": "bfloat16",
  37.   "transformers_version": "4.43.3",
  38.   "use_cache": true,
  39.   "vocab_size": 128256
  40. }
  41. 08/02/2024 22:08:52 - INFO - llamafactory.model.patcher - Using KV cache for faster generation.
  42. [INFO|modeling_utils.py:3631] 2024-08-02 22:08:52,847 >> loading weights file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/model.safetensors.index.json
  43. [INFO|modeling_utils.py:1572] 2024-08-02 22:08:52,847 >> Instantiating LlamaForCausalLM model under default dtype torch.bfloat16.
  44. [INFO|configuration_utils.py:1038] 2024-08-02 22:08:52,848 >> Generate config GenerationConfig {
  45.   "bos_token_id": 128000,
  46.   "eos_token_id": 128009
  47. }
  48. Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:07<00:00,  1.98s/it]
  49. [INFO|modeling_utils.py:4463] 2024-08-02 22:09:01,148 >> All model checkpoint weights were used when initializing LlamaForCausalLM.
  50. [INFO|modeling_utils.py:4471] 2024-08-02 22:09:01,148 >> All the weights of LlamaForCausalLM were initialized from the model checkpoint at /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct.
  51. If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
  52. [INFO|configuration_utils.py:991] 2024-08-02 22:09:01,151 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/generation_config.json
  53. [INFO|configuration_utils.py:1038] 2024-08-02 22:09:01,152 >> Generate config GenerationConfig {
  54.   "bos_token_id": 128000,
  55.   "do_sample": true,
  56.   "eos_token_id": [
  57.     128001,
  58.     128009
  59.   ],
  60.   "max_length": 4096,
  61.   "temperature": 0.6,
  62.   "top_p": 0.9
  63. }
  64. 08/02/2024 22:09:01 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
  65. 08/02/2024 22:09:06 - INFO - llamafactory.model.adapter - Merged 1 adapter(s).
  66. 08/02/2024 22:09:06 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/llama3-8b/lora/sft
  67. 08/02/2024 22:09:06 - INFO - llamafactory.model.loader - all params: 8,030,261,248
  68. Welcome to the CLI application, use `clear` to remove the history, use `exit` to exit the application.
  69. User: 中国深圳有哪些旅游景点
  70. Assistant: 深圳是一个非常有名的旅游城市,拥有许多名副其名的旅游景点。以下是一些主要的旅游景点:
  71. 1. Window of the World:这是一个规模宏大的主题公园,展示了世界各地的风土人情。
  72. 2. Splendid China Miniature Theme Park:这个公园展现了中国的历史和文化,拥有许多精致的模型和景观。
  73. 3. Dafen Oil Painting Village:这个村庄是中国最大的油画村,拥有数以万计的油画作品,展示了中国油画的技艺。
  74. 4. Dameisha Beach:这个沙滩是深圳最为人知的旅游景点之一,拥有洁洁的沙滩和清澈的海水,是一个非常适合休闲的场所。
  75. 5. Mangrove Forest Nature Reserve:这个自然保护区拥有广泛的 mangrove 森林,展示了中国的自然景观。
  76. 6. Shenzhen Museum:这个博物馆展现了深圳的历史和文化,拥有许多历史和艺术的收藏品。
  77. 7. Lianhua Mountain Park:这个公园是深圳最大的公园,拥有许多山路和景观,展示了中国的自然美景。
  78. 8. Shenzhen Bay Sports Center:这个体育中心拥有许多不同的运动场所,展示了中国的体育技艺。
  79. 9. OCT-LOFT:这个文化区拥有许多艺术和文化的项目,展示了中国的艺术和文化。
  80. 10. Fairy Lake Botanical Garden:这个植物园拥有许多不同的植物和花卉,展示了中国的自然美景。
  81. User: 中国广州有哪些旅游景点
  82. Assistant: 广州是一个非常有名的旅游城市,拥有许多名副其名的旅游景点。以下是一些主要的旅游景点:
  83. 1. Canton Tower:这是一个位于广州的超高建筑,拥有360度的观景台,展示了广州的全景。
  84. 2. Chimelong Paradise:这个主题公园拥有许多不同的游乐设施和景观,展示了中国的游乐技艺。
  85. 3. Baiyun Mountain:这个山区拥有许多不同的景观和游乐设施,展示了中国的自然美景。
  86. 4. Yuexiu Park:这个公园是广州最大的公园,拥有许多不同的景观和游乐设施,展示了中国的自然美景。
  87. 5. Temple of the Six Banyan Trees:这个寺庙拥有许多不同的文化和历史的收藏品,展示了中国的历史和文化。
  88. 6. Museum of the Chinese Revolution:这个博物馆展现了中国革命的历史和文化,拥有许多不同的收藏品和展品。
  89. 7. Guangzhou Tower:这个塔楼是广州最早的建筑,拥有许多不同的景观和游乐设施,展示了中国的历史和文化。
  90. 8. Guangzhou Museum:这个博物馆展现了广州的历史和文化,拥有许多不同的收藏品和展品。
  91. 9. Flower Street:这个街区拥有许多不同的花卉和景观,展示了中国的自然美景。
  92. 10. Shamian Island:这个岛区拥有许多不同的景观和游乐设施,展示了中国的自然美景和历史文化。
IV. FAQ

Q: OSError: You are trying to access a gated repo. Make sure to have access to it at https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct.

    (llama_fct) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
    No ROCm runtime is found, using ROCM_HOME='/opt/dtk'
    /opt/conda/envs/llama_fct/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_hip.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
      warn(
    [2024-08-01 15:13:21,242] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
    08/01/2024 15:13:24 - INFO - llamafactory.hparams.parser - Process rank: 0, device: cpu, n_gpu: 0, distributed training: False, compute dtype: torch.bfloat16
    [INFO|tokenization_auto.py:682] 2024-08-01 15:13:25,152 >> Could not locate the tokenizer configuration file, will try to use the model config instead.
    Traceback (most recent call last):
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
        response.raise_for_status()
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
        raise HTTPError(http_error_msg, response=self)
    requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://hf-mirror.com/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/config.json

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/transformers/utils/hub.py", line 402, in cached_file
        resolved_file = hf_hub_download(
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
        return f(*args, **kwargs)
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
        return fn(*args, **kwargs)
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1240, in hf_hub_download
        return _hf_hub_download_to_cache_dir(
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1347, in _hf_hub_download_to_cache_dir
        _raise_on_head_call_error(head_call_error, force_download, local_files_only)
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1854, in _raise_on_head_call_error
        raise head_call_error
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1751, in _get_metadata_or_catch_error
        metadata = get_hf_file_metadata(
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
        return fn(*args, **kwargs)
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1673, in get_hf_file_metadata
        r = _request_wrapper(
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 376, in _request_wrapper
        response = _request_wrapper(
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 400, in _request_wrapper
        hf_raise_for_status(response)
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 321, in hf_raise_for_status
        raise GatedRepoError(message, response) from e
    huggingface_hub.utils._errors.GatedRepoError: 401 Client Error. (Request ID: Root=1-66ab3595-53663c2f4d5cf81405b65b9e;080cfa15-3220-4ab1-b123-4a32ba31a03a)
    Cannot access gated repo for url https://hf-mirror.com/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/config.json.
    Access to model meta-llama/Meta-Llama-3-8B-Instruct is restricted. You must be authenticated to access it.

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/opt/conda/envs/llama_fct/bin/llamafactory-cli", line 8, in <module>
        sys.exit(main())
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/cli.py", line 111, in main
        run_exp()
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/train/tuner.py", line 50, in run_exp
        run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 44, in run_sft
        tokenizer_module = load_tokenizer(model_args)
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/model/loader.py", line 69, in load_tokenizer
        tokenizer = AutoTokenizer.from_pretrained(
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 853, in from_pretrained
        config = AutoConfig.from_pretrained(
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 972, in from_pretrained
        config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/transformers/configuration_utils.py", line 632, in get_config_dict
        config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/transformers/configuration_utils.py", line 689, in _get_config_dict
        resolved_config_file = cached_file(
      File "/opt/conda/envs/llama_fct/lib/python3.10/site-packages/transformers/utils/hub.py", line 420, in cached_file
        raise EnvironmentError(
    OSError: You are trying to access a gated repo.
    Make sure to have access to it at https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct.
    401 Client Error. (Request ID: Root=1-66ab3595-53663c2f4d5cf81405b65b9e;080cfa15-3220-4ab1-b123-4a32ba31a03a)
    Cannot access gated repo for url https://hf-mirror.com/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/config.json.
    Access to model meta-llama/Meta-Llama-3-8B-Instruct is restricted. You must be authenticated to access it.
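Cause: by default the model is fetched from Hugging Face. meta-llama/Meta-Llama-3-8B-Instruct is a gated repo and the request was not authenticated (401), so the model could not be downloaded.
Solution: download the model from ModelScope instead.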
    export USE_MODELSCOPE_HUB=1
    # On Windows, use `set USE_MODELSCOPE_HUB=1`
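When training is launched from a script rather than an interactive shell, the same switch can be applied to the child process only. The sketch below is one possible way to do that; `modelscope_env`, `train_command`, and `run_training` are hypothetical helper names, and the only things taken from this post are the `USE_MODELSCOPE_HUB` variable and the `llamafactory-cli train` command.

```python
import os
import subprocess


def modelscope_env(base=None):
    """Return a copy of the environment with USE_MODELSCOPE_HUB=1 set."""
    env = dict(os.environ if base is None else base)
    env["USE_MODELSCOPE_HUB"] = "1"
    return env


def train_command(config="examples/train_lora/llama3_lora_sft.yaml"):
    """Build the training command used in this post as an argument list."""
    return ["llamafactory-cli", "train", config]


def run_training(config="examples/train_lora/llama3_lora_sft.yaml"):
    """Launch training; the variable is visible to the child process only."""
    return subprocess.run(train_command(config), env=modelscope_env(), check=True)
```

Calling `run_training()` from the LLaMA-Factory directory is meant to be equivalent to running `export USE_MODELSCOPE_HUB=1` followed by the train command, without changing the parent shell.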
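Set model_name_or_path to a model ID to load the corresponding model. All available models can be browsed in the ModelScope community (魔搭社区), for example LLM-Research/Meta-Llama-3-8B-Instruct.
Edit the llama3_lora_sft.yaml file: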
    # Before:
    # model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
    # After:
    model_name_or_path: LLM-Research/Meta-Llama-3-8B-Instruct
    llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
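The YAML change can also be scripted. The following is a minimal sketch, assuming `model_name_or_path` is an unindented top-level key in the config file; `set_model_id` is a hypothetical helper, and it works on plain text so that comments and the rest of the file are left untouched.

```python
def set_model_id(yaml_text, new_id):
    """Replace the value of the top-level `model_name_or_path` key.

    Commented-out lines (starting with '#') do not match and are kept as-is.
    """
    key = "model_name_or_path:"
    out = []
    replaced = False
    for line in yaml_text.splitlines():
        if line.startswith(key):
            out.append(f"{key} {new_id}")
            replaced = True
        else:
            out.append(line)
    if not replaced:
        raise ValueError("model_name_or_path not found at top level")
    return "\n".join(out) + "\n"
```

A typical use would be to read `examples/train_lora/llama3_lora_sft.yaml`, pass its text through `set_model_id(text, "LLM-Research/Meta-Llama-3-8B-Instruct")`, and write the result back.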
Q: OSError: LLM-Research/Meta-Llama-3-8B-Instruct is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'

    (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
    [2024-08-01 21:17:22,212] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
    Traceback (most recent call last):
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
        response.raise_for_status()
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
        raise HTTPError(http_error_msg, response=self)
    requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://hf-mirror.com/LLM-Research/Meta-Llama-3-8B-Instruct/resolve/main/tokenizer_config.json

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/transformers/utils/hub.py", line 402, in cached_file
        resolved_file = hf_hub_download(
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
        return fn(*args, **kwargs)
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1221, in hf_hub_download
        return _hf_hub_download_to_cache_dir(
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1325, in _hf_hub_download_to_cache_dir
        _raise_on_head_call_error(head_call_error, force_download, local_files_only)
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1823, in _raise_on_head_call_error
        raise head_call_error
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1722, in _get_metadata_or_catch_error
        metadata = get_hf_file_metadata(url=url, proxies=proxies, timeout=etag_timeout, headers=headers)
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
        return fn(*args, **kwargs)
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1645, in get_hf_file_metadata
        r = _request_wrapper(
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 372, in _request_wrapper
        response = _request_wrapper(
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 396, in _request_wrapper
        hf_raise_for_status(response)
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 352, in hf_raise_for_status
        raise RepositoryNotFoundError(message, response) from e
    huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-66ab8ae6-4ed0547e1f86fcb201b723f8;acee559e-0676-48e4-8871-b6eb58e797ca)
    Repository Not Found for url: https://hf-mirror.com/LLM-Research/Meta-Llama-3-8B-Instruct/resolve/main/tokenizer_config.json.
    Please make sure you specified the correct `repo_id` and `repo_type`.
    If you are trying to access a private or gated repo, make sure you are authenticated.
    Invalid username or password.

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/opt/conda/envs/llama_factory_torch/bin/llamafactory-cli", line 8, in <module>
        sys.exit(main())
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/cli.py", line 81, in main
        run_chat()
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/chat/chat_model.py", line 125, in run_chat
        chat_model = ChatModel()
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/chat/chat_model.py", line 44, in __init__
        self.engine: "BaseEngine" = HuggingfaceEngine(model_args, data_args, finetuning_args, generating_args)
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/chat/hf_engine.py", line 53, in __init__
        tokenizer_module = load_tokenizer(model_args)
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/model/loader.py", line 69, in load_tokenizer
        tokenizer = AutoTokenizer.from_pretrained(
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 833, in from_pretrained
        tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 665, in get_tokenizer_config
        resolved_config_file = cached_file(
      File "/opt/conda/envs/llama_factory_torch/lib/python3.10/site-packages/transformers/utils/hub.py", line 425, in cached_file
        raise EnvironmentError(
    OSError: LLM-Research/Meta-Llama-3-8B-Instruct is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
    If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`
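Cause: the model LLM-Research/Meta-Llama-3-8B-Instruct cannot be found. It is a ModelScope model ID, but the log shows the request still going to the Hugging Face endpoint (hf-mirror.com), where no such repository exists.
Solution: download the model from ModelScope.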
    export USE_MODELSCOPE_HUB=1
Q: ModuleNotFoundError: No module named 'modelscope'

    (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
    [2024-08-01 19:05:15,320] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
    08/01/2024 19:05:18 - INFO - llamafactory.hparams.parser - Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: False, compute dtype: torch.bfloat16
    Traceback (most recent call last):
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/extras/misc.py", line 219, in try_download_model_from_ms
        from modelscope import snapshot_download
    ModuleNotFoundError: No module named 'modelscope'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/opt/conda/envs/llama_factory_torch/bin/llamafactory-cli", line 8, in <module>
        sys.exit(main())
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/cli.py", line 111, in main
        run_exp()
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/train/tuner.py", line 50, in run_exp
        run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 44, in run_sft
        tokenizer_module = load_tokenizer(model_args)
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/model/loader.py", line 67, in load_tokenizer
        init_kwargs = _get_init_kwargs(model_args)
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/model/loader.py", line 52, in _get_init_kwargs
        model_args.model_name_or_path = try_download_model_from_ms(model_args)
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/extras/misc.py", line 224, in try_download_model_from_ms
        raise ImportError("Please install modelscope via `pip install modelscope -U`")
    ImportError: Please install modelscope via `pip install modelscope -U`
Cause: the modelscope package is not installed.
Solution: install modelscope.
    (llama_factory_torch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# pip install --no-dependencies modelscope
    Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
    Collecting modelscope
      Using cached https://pypi.tuna.tsinghua.edu.cn/packages/38/37/9fe505ebc67ba5e0345a69d6e8b2ee8630523975b484d221691ef60182bd/modelscope-1.16.1-py3-none-any.whl (5.7 MB)
    Installing collected packages: modelscope
    Successfully installed modelscope-1.16.1
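To catch this kind of failure before a long run starts, the environment can be checked for importable packages up front. The sketch below uses only the standard library; `missing_modules` is a hypothetical helper name, and the list of modules to check is up to the reader.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["modelscope"])
    if missing:
        # Same install command as used in this post.
        print("Missing:", ", ".join(missing))
        print("Try: pip install --no-dependencies modelscope")
    else:
        print("modelscope is importable")
```

`find_spec` only locates the module without importing it, so the check stays fast even for heavy packages.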
Q: ImportError: /PATH/TO/site-packages/torch/lib/libtorch_hip.so: undefined symbol: ncclCommInitRankConfig

    (llama_fct_pytorch) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
    Traceback (most recent call last):
      File "/opt/conda/envs/llama_fct_pytorch/bin/llamafactory-cli", line 5, in <module>
        from llamafactory.cli import main
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/__init__.py", line 38, in <module>
        from .cli import VERSION
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/cli.py", line 21, in <module>
        from . import launcher
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/launcher.py", line 15, in <module>
        from llamafactory.train.tuner import run_exp
      File "/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory/src/llamafactory/train/tuner.py", line 19, in <module>
        import torch
      File "/opt/conda/envs/llama_fct_pytorch/lib/python3.10/site-packages/torch/__init__.py", line 237, in <module>
        from torch._C import *  # noqa: F403
    ImportError: /opt/conda/envs/llama_fct_pytorch/lib/python3.10/site-packages/torch/lib/libtorch_hip.so: undefined symbol: ncclCommInitRankConfig
The same error appears when torch is imported directly in the Python interpreter:
    >>> import torch
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/opt/conda/envs/llama_fct_pytorch/lib/python3.10/site-packages/torch/__init__.py", line 237, in <module>
        from torch._C import *  # noqa: F403
    ImportError: /opt/conda/envs/llama_fct_pytorch/lib/python3.10/site-packages/torch/lib/libtorch_hip.so: undefined symbol: ncclCommInitRankConfig
Cause: the installed PyTorch build does not support DCU.
For the fix, see the next FAQ entry.
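The bare `import torch` check above can be automated so that a broken build is detected in a fresh interpreter, without crashing the calling script. This is a minimal sketch; `check_import` is a hypothetical helper name, and it reports only whether the import succeeded plus the last line of the error output.

```python
import subprocess
import sys


def check_import(module, timeout=120):
    """Run `import <module>` in a fresh interpreter.

    Returns (ok, message), where message is the last line of stderr
    (for example the ImportError line) or an empty string.
    """
    proc = subprocess.run(
        [sys.executable, "-c", f"import {module}"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    lines = proc.stderr.strip().splitlines()
    return proc.returncode == 0, (lines[-1] if lines else "")


if __name__ == "__main__":
    ok, message = check_import("torch")
    print("torch import ok" if ok else f"torch import failed: {message}")
```

With the broken environment from this entry, the message would be expected to contain the `undefined symbol: ncclCommInitRankConfig` line shown in the traceback.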
Q: The installed PyTorch build does not support DCU

    (llama_fct) root@notebook-1813389960667746306-scnlbe5oi5-17811:/public/home/scnlbe5oi5/Downloads/models/LLaMA-Factory# llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
    No ROCm runtime is found, using ROCM_HOME='/opt/dtk'
    /opt/conda/envs/llama_fct/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_hip.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
      warn(
    [2024-08-01 17:49:08,805] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
    08/01/2024 17:49:12 - INFO - llamafactory.hparams.parser - Process rank: 0, device: cpu, n_gpu: 0, distributed training: False, compute dtype: torch.bfloat16
    Downloading: 100%|█████████████████████████████████████████████████████████| 654/654 [00:00<00:00, 2.56kB/s]
    Downloading: 100%|█████████████████████████████████████████████████████████| 48.0/48.0 [00:00<00:00, 183B/s]
    Downloading: 100%|███████████████████████████████████████████████████████████| 187/187 [00:00<00:00, 759B/s]
    Downloading: 100%|█████████████████████████████████████████████████████| 7.62k/7.62k [00:00<00:00, 29.9kB/s]
    Downloading: 100%|█████████████████████████████████████████████████████| 4.63G/4.63G [01:33<00:00, 53.4MB/s]
    Downloading: 100%|█████████████████████████████████████████████████████| 4.66G/4.66G [01:02<00:00, 79.9MB/s]
    Downloading: 100%|█████████████████████████████████████████████████████| 4.58G/4.58G [01:00<00:00, 81.7MB/s]
    Downloading: 100%|█████████████████████████████████████████████████████| 1.09G/1.09G [00:22<00:00, 51.6MB/s]
    Downloading: 100%|█████████████████████████████████████████████████████| 23.4k/23.4k [00:00<00:00, 53.6kB/s]
    Downloading: 100%|██████████████████████████████████████████████████████| 36.3k/36.3k [00:00<00:00, 125kB/s]
    Downloading: 100%|█████████████████████████████████████████████████████████| 73.0/73.0 [00:00<00:00, 293B/s]
    Downloading: 100%|█████████████████████████████████████████████████████| 8.66M/8.66M [00:00<00:00, 13.5MB/s]
    Downloading: 100%|█████████████████████████████████████████████████████| 49.8k/49.8k [00:00<00:00, 90.0kB/s]
    Downloading: 100%|█████████████████████████████████████████████████████| 4.59k/4.59k [00:00<00:00, 18.7kB/s]
    [INFO|tokenization_utils_base.py:2287] 2024-08-01 17:53:53,510 >> loading file tokenizer.json
    [INFO|tokenization_utils_base.py:2287] 2024-08-01 17:53:53,511 >> loading file added_tokens.json
    [INFO|tokenization_utils_base.py:2287] 2024-08-01 17:53:53,511 >> loading file special_tokens_map.json
    [INFO|tokenization_utils_base.py:2287] 2024-08-01 17:53:53,511 >> loading file tokenizer_config.json
    [INFO|tokenization_utils_base.py:2533] 2024-08-01 17:53:53,854 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
    08/01/2024 17:53:53 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
    08/01/2024 17:53:53 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
    08/01/2024 17:53:53 - INFO - llamafactory.data.loader - Loading dataset identity.json...
    Generating train split: 91 examples [00:00, 10580.81 examples/s]
    Converting format of dataset (num_proc=16): 100%|███████████████████| 91/91 [00:00<00:00, 427.78 examples/s]
    08/01/2024 17:53:56 - INFO - llamafactory.data.loader - Loading dataset alpaca_en_demo.json...
    Generating train split: 1000 examples [00:00, 66788.28 examples/s]
    Converting format of dataset (num_proc=16): 100%|██████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:00<00:00, 4688.60 examples/s]
    Running tokenizer on dataset (num_proc=16): 100%|███████████████████████████████████████████████████████████████████████████████████████████| 1091/1091 [00:03<00:00, 295.08 examples/s]
    training example:
    input_ids:
    [128000, 128006, 882, 128007, 271, 6151, 128009, 128006, 78191, 128007, 271, 9906, 0, 358, 1097, 5991, 609, 39254, 459, 15592, 18328, 8040, 555, 5991, 3170, 3500, 13, 2650, 649, 358, 7945, 499, 3432, 30, 128009]
    inputs:
    <|begin_of_text|><|start_header_id|>user<|end_header_id|>hi<|eot_id|><|start_header_id|>assistant<|end_header_id|>Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?<|eot_id|>
    label_ids:
    [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 9906, 0, 358, 1097, 5991, 609, 39254, 459, 15592, 18328, 8040, 555, 5991, 3170, 3500, 13, 2650, 649, 358, 7945, 499, 3432, 30, 128009]
    labels:
    Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?<|eot_id|>
    [INFO|configuration_utils.py:731] 2024-08-01 17:54:02,106 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/config.json
    [INFO|configuration_utils.py:800] 2024-08-01 17:54:02,108 >> Model config LlamaConfig {
      "_name_or_path": "/root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct",
      "architectures": [
        "LlamaForCausalLM"
      ],
      "attention_bias": false,
      "attention_dropout": 0.0,
      "bos_token_id": 128000,
      "eos_token_id": 128009,
      "hidden_act": "silu",
      "hidden_size": 4096,
      "initializer_range": 0.02,
      "intermediate_size": 14336,
      "max_position_embeddings": 8192,
      "mlp_bias": false,
      "model_type": "llama",
      "num_attention_heads": 32,
      "num_hidden_layers": 32,
      "num_key_value_heads": 8,
      "pretraining_tp": 1,
      "rms_norm_eps": 1e-05,
      "rope_scaling": null,
      "rope_theta": 500000.0,
      "tie_word_embeddings": false,
      "torch_dtype": "bfloat16",
      "transformers_version": "4.43.3",
      "use_cache": true,
      "vocab_size": 128256
    }
    [INFO|modeling_utils.py:3631] 2024-08-01 17:54:02,139 >> loading weights file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/model.safetensors.index.json
    [INFO|modeling_utils.py:1572] 2024-08-01 17:54:02,140 >> Instantiating LlamaForCausalLM model under default dtype torch.bfloat16.
    [INFO|configuration_utils.py:1038] 2024-08-01 17:54:02,142 >> Generate config GenerationConfig {
      "bos_token_id": 128000,
      "eos_token_id": 128009
    }
    Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00,  2.68it/s]
    [INFO|modeling_utils.py:4463] 2024-08-01 17:54:03,708 >> All model checkpoint weights were used when initializing LlamaForCausalLM.
    [INFO|modeling_utils.py:4471] 2024-08-01 17:54:03,709 >> All the weights of LlamaForCausalLM were initialized from the model checkpoint at /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct.
    If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
    [INFO|configuration_utils.py:991] 2024-08-01 17:54:03,712 >> loading configuration file /root/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct/generation_config.json
    [INFO|configuration_utils.py:1038] 2024-08-01 17:54:03,713 >> Generate config GenerationConfig {
      "bos_token_id": 128000,
      "do_sample": true,
      "eos_token_id": [
        128001,
        128009
      ],
      "max_length": 4096,
      "temperature": 0.6,
      "top_p": 0.9
    }
    08/01/2024 17:54:03 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
    08/01/2024 17:54:03 - INFO - llamafactory.model.model_utils.attention - Using torch SDPA for faster training and inference.
    08/01/2024 17:54:03 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
    08/01/2024 17:54:03 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
    08/01/2024 17:54:03 - INFO - llamafactory.model.model_utils.misc - Found linear modules: q_proj,down_proj,o_proj,k_proj,gate_proj,up_proj,v_proj
    08/01/2024 17:54:08 - INFO - llamafactory.model.loader - trainable params: 20,971,520 || all params: 8,051,232,768 || trainable%: 0.2605
    Detected kernel version 3.10.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
    [INFO|trainer.py:648] 2024-08-01 17:54:08,091 >> Using cpu_amp half precision backend
    [INFO|trainer.py:2134] 2024-08-01 17:54:09,008 >> ***** Running training *****
    [INFO|trainer.py:2135] 2024-08-01 17:54:09,008 >>   Num examples = 981
    [INFO|trainer.py:2136] 2024-08-01 17:54:09,008 >>   Num Epochs = 3
    [INFO|trainer.py:2137] 2024-08-01 17:54:09,008 >>   Instantaneous batch size per device = 1
    [INFO|trainer.py:2140] 2024-08-01 17:54:09,008 >>   Total train batch size (w. parallel, distributed & accumulation) = 8
    [INFO|trainer.py:2141] 2024-08-01 17:54:09,008 >>   Gradient Accumulation steps = 8
    [INFO|trainer.py:2142] 2024-08-01 17:54:09,008 >>   Total optimization steps = 366
    [INFO|trainer.py:2143] 2024-08-01 17:54:09,012 >>   Number of trainable parameters = 20,971,520
      0%|                                                                                                                                                           | 0/366 [00:00<?, ?it/s
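Cause: the installed PyTorch build does not support DCU. Training falls back to the CPU (the log reports device: cpu, n_gpu: 0 and the cpu_amp backend), the program stalls at step 0, and the model cannot be fine-tuned.
Solution: look up, download, and install a DCU build of PyTorch from the 光合社区 developer community. Taking torch-2.1.0+das1.1.git3ac1bdd.abi1.dtk2404-cp310-cp310-manylinux_2_31_x86_64 as an example, try installing torch-2.1.0.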
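Before installing a downloaded wheel, its file name can be checked against the running environment. The sketch below splits the name into the standard five wheel fields and pulls out the `dtk` tag from the local version part. `parse_wheel_name` and `matches_interpreter` are hypothetical helpers, and reading `dtk2404` as "built for DTK 24.04" is an assumption based on the image name used in this post, not something the wheel format defines.

```python
import re
import sys


def parse_wheel_name(name):
    """Split a wheel name into dist, version, python tag, ABI tag, and platform.

    Wheels with an optional build tag (a sixth field) are not handled here.
    """
    stem = name[:-4] if name.endswith(".whl") else name
    dist, version, python_tag, abi_tag, platform = stem.split("-")
    dtk = re.search(r"dtk(\d+)", version)  # assumption: vendor-specific tag
    return {
        "dist": dist,
        "version": version,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform": platform,
        "dtk": dtk.group(1) if dtk else None,
    }


def matches_interpreter(python_tag, version_info=sys.version_info):
    """Check a CPython tag such as 'cp310' against a (major, minor, ...) version."""
    return python_tag == f"cp{version_info[0]}{version_info[1]}"


if __name__ == "__main__":
    info = parse_wheel_name(
        "torch-2.1.0+das1.1.git3ac1bdd.abi1.dtk2404-cp310-cp310-manylinux_2_31_x86_64"
    )
    print(info)
    print("matches this interpreter:", matches_interpreter(info["python_tag"]))
```

After installation, the earlier working log in this post (device: cuda:0, n_gpu: 1) suggests the DCU is exposed through the `torch.cuda` API, so `torch.cuda.is_available()` is a reasonable follow-up check.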
