[Notes] Chinese fine-tuning of Llama3 on Ubuntu, and loading the fine-tuned model: introduction to the Chinese fine-tuning dataset ...


Practice: about ollama


Install
  curl -fsSL https://ollama.com/install.sh | sh
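After the script finishes, you can confirm the CLI is available by printing the version (the -v / --version flag is listed in the ollama help output further below):
  ollama -v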

Deploy
  ollama create example -f Modelfile
Run
  ollama run example
Stop (the model that ollama has loaded stops occupying GPU memory; while stopped, ollama is unreachable, so deploy and run operations fail with the error:
Error: could not connect to ollama app, is it running?
The service must be started again before you can deploy or run.)
  systemctl stop ollama.service
Start again after stopping (once started, you can continue to use ollama to deploy and run models)
  systemctl start ollama.service
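To check the service state before deploying or running, two quick checks (the ollama.service unit name matches the stop/start commands above, and ps appears in the ollama help output below):
  systemctl status ollama.service   # is the systemd unit active?
  ollama ps                         # which models are currently loaded, once the service responds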


Modelfile contents:
FROM /home/wangbin/Desktop/Llama3/dir-unsloth.F16.gguf
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
TEMPLATE """
<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER temperature 0.8
PARAMETER num_ctx 8192
PARAMETER stop "<|system|>"
PARAMETER stop "<|user|>"
PARAMETER stop "<|assistant|>"
SYSTEM """You are a helpful, smart, kind, and efficient AI assistant. Your name is Aila. You always fulfill the user's requests to the best of your ability."""
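To turn this Modelfile into a runnable model, the same deploy and run commands from above apply; the model name aila here is just an illustrative choice:
  ollama create aila -f Modelfile
  ollama run aila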


ollama CLI options:
  (unsloth_env) wangbin@wangbin-LEGION-REN9000K-34IRZ:~/Desktop/Llama3$ ollama
  Usage:
    ollama [flags]
    ollama [command]
  Available Commands:
    serve       Start ollama
    create      Create a model from a Modelfile
    show        Show information for a model
    run         Run a model
    pull        Pull a model from a registry
    push        Push a model to a registry
    list        List models
    ps          List running models
    cp          Copy a model
    rm          Remove a model
    help        Help about any command
  Flags:
    -h, --help      help for ollama
    -v, --version   Show version information



Uninstall
  1. Stop the Ollama Service
     First things first, we need to stop the Ollama service from running. This ensures a smooth uninstallation process. Open your terminal and enter the following command:
     sudo systemctl stop ollama
     This command halts the Ollama service.
  2. Disable the Ollama Service
     Now that the service is stopped, we need to disable it so that it doesn't start up again upon system reboot. Enter the following command:
     sudo systemctl disable ollama
     This ensures that Ollama won't automatically start up in the future.
  3. Remove the Service File
     We need to tidy up by removing the service file associated with Ollama. Enter the following command:
     sudo rm /etc/systemd/system/ollama.service
     This deletes the service file from your system.
  4. Delete the Ollama Binary
     Next up, we'll remove the Ollama binary itself. Enter the following command:
     sudo rm $(which ollama)
     This command removes the binary from your bin directory.
  5. Remove Downloaded Models and the Ollama User
     Lastly, we'll clean up any remaining bits and pieces. Enter the following commands one by one:
     sudo rm -r /usr/share/ollama
     sudo userdel ollama
     sudo groupdel ollama
     These commands delete any downloaded models and remove the Ollama user and group from your system.
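If you prefer to run the whole cleanup in one pass, the same commands can be collected into a short script (a sketch of the steps above; adjust paths if your install differs):
  #!/usr/bin/env bash
  # Stop and disable the Ollama service, then remove its files, user and group
  sudo systemctl stop ollama
  sudo systemctl disable ollama
  sudo rm /etc/systemd/system/ollama.service
  sudo rm "$(which ollama)"
  sudo rm -r /usr/share/ollama
  sudo userdel ollama
  sudo groupdel ollama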




Main content:

Cleaning the PDF:
# Clean the PDF: extract text with PyPDF2, strip boilerplate, and normalize whitespace
import PyPDF2
import re
def clean_extracted_text(text):
    """Clean and preprocess extracted text."""
    # Remove chapter titles and sections
    text = re.sub(r'^(Introduction|Chapter \d+:|What is|Examples:|Chapter \d+)', '', text, flags=re.MULTILINE)
    text = re.sub(r'ctitious', 'fictitious', text)  # intended to restore "fictitious" where extraction dropped the "fi" ligature
    text = re.sub(r'ISBN[- ]13: \d{13}', '', text)
    text = re.sub(r'ISBN[- ]10: \d{10}', '', text)
    text = re.sub(r'Library of Congress Control Number : \d+', '', text)
    text = re.sub(r'(\.|\?|\!)(\S)', r'\1 \2', text)  # Ensure space after punctuation
    text = re.sub(r'All rights reserved|Copyright \d{4}', '', text)
    text = re.sub(r'\n\s*\n', '\n', text)
    text = re.sub(r'[^\x00-\x7F]+', ' ', text)
    text = re.sub(r'\s{2,}', ' ', text)
    # Remove all newlines, then reinsert newlines only after periods
    text = text.replace('\n', ' ')
    text = re.sub(r'(\.)(\s)', r'\1\n', text)
    return text
def extract_text_from_pdf(pdf_path):
    """Extract text from a PDF file."""
    with open(pdf_path, 'rb') as file:
        reader = PyPDF2.PdfReader(file)
        text = ''
        for page in reader.pages:
            if page.extract_text():
                text += page.extract_text() + ' '  # Append text of each page
    return text
def main():
    pdf_path = '/Users/charlesqin/Documents/The Art of Asking ChatGPT.pdf'  # Path to your PDF file
    extracted_text = extract_text_from_pdf(pdf_path)
    cleaned_text = clean_extracted_text(extracted_text)
    # Output the cleaned text to a file
    with open('cleaned_text_output.txt', 'w', encoding='utf-8') as file:
        file.write(cleaned_text)
if __name__ == '__main__':
    main()
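The fine-tuning step below actually trains on the public alpaca_gpt4_data_zh.json dataset, but if you wanted to turn the cleaned PDF text into the same instruction/input/output shape, a minimal sketch could look like the following (the chunking heuristic, task wording, and output file name are assumptions, not part of the original workflow):
import json

def build_alpaca_records(cleaned_path, out_path, chunk_size=5):
    """Group cleaned sentences into chunks and wrap them as alpaca-style records."""
    with open(cleaned_path, 'r', encoding='utf-8') as f:
        sentences = [line.strip() for line in f if line.strip()]
    records = []
    for i in range(0, len(sentences), chunk_size):
        chunk = ' '.join(sentences[i:i + chunk_size])
        records.append({
            "instruction": "Summarize the following passage.",  # placeholder task
            "input": chunk,
            "output": ""  # to be filled in by hand or by a stronger model
        })
    with open(out_path, 'w', encoding='utf-8') as f:
        json.dump(records, f, ensure_ascii=False, indent=2)

build_alpaca_records('cleaned_text_output.txt', 'my_dataset.json')  # hypothetical output file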




Fine-tuning code:
from unsloth import FastLanguageModel
import torch
from trl import SFTTrainer
from transformers import TrainingArguments
max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.
# 4bit pre quantized models we support for 4x faster downloading + no OOMs.
fourbit_models = [
    "unsloth/mistral-7b-bnb-4bit",
    "unsloth/mistral-7b-instruct-v0.2-bnb-4bit",
    "unsloth/llama-2-7b-bnb-4bit",
    "unsloth/gemma-7b-bnb-4bit",
    "unsloth/gemma-7b-it-bnb-4bit", # Instruct version of Gemma 7b
    "unsloth/gemma-2b-bnb-4bit",
    "unsloth/gemma-2b-it-bnb-4bit", # Instruct version of Gemma 2b
    "unsloth/llama-3-8b-bnb-4bit", # [NEW] 15 Trillion token Llama-3
] # More models at https://huggingface.co/unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
)
model = FastLanguageModel.get_peft_model(
    model,
    r = 16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:
{}"""
EOS_TOKEN = tokenizer.eos_token # Must add EOS_TOKEN
def formatting_prompts_func(examples):
    instructions = examples["instruction"]
    inputs       = examples["input"]
    outputs      = examples["output"]
    texts = []
    for instruction, input, output in zip(instructions, inputs, outputs):
        # Must add EOS_TOKEN, otherwise your generation will go on forever!
        text = alpaca_prompt.format(instruction, input, output) + EOS_TOKEN
        texts.append(text)
    return { "text" : texts, }
from datasets import load_dataset
file_path = "/home/Ubuntu/alpaca_gpt4_data_zh.json"
dataset = load_dataset("json", data_files={"train": file_path}, split="train")
dataset = dataset.map(formatting_prompts_func, batched = True,)
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False, # Can make training 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 60,
        learning_rate = 2e-4,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
    ),
)
trainer_stats = trainer.train()
# Export the fine-tuned model to GGUF at three quantization levels
model.save_pretrained_gguf("dir", tokenizer, quantization_method = "q4_k_m")
model.save_pretrained_gguf("dir", tokenizer, quantization_method = "q8_0")
model.save_pretrained_gguf("dir", tokenizer, quantization_method = "f16")
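Before (or instead of) exporting to GGUF, you can sanity-check the LoRA weights directly in Python. A minimal sketch following unsloth's usual inference pattern, reusing alpaca_prompt and tokenizer from the script above (the example question is just an illustration):
FastLanguageModel.for_inference(model)  # switch unsloth into its faster inference mode
inputs = tokenizer(
    [alpaca_prompt.format("请用一句话介绍一下大语言模型。", "", "")],  # instruction, empty input, empty response
    return_tensors = "pt",
).to("cuda")
outputs = model.generate(**inputs, max_new_tokens = 128, use_cache = True)
print(tokenizer.batch_decode(outputs, skip_special_tokens = True)[0])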


Ollama:

(screenshots of the fine-tuned model running in Ollama)


LM Studio:

(screenshots of the fine-tuned model running in LM Studio)

We ask the fine-tuned Llama3 model the question:



Then we ask the Llama3 model without fine-tuning the same question:
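One way to reproduce this comparison from the command line is to send the same prompt to both models with ollama run; the names are illustrative (example is the fine-tuned model created from the Modelfile above, llama3 is the stock model pulled from the ollama registry):
  ollama run example "<your question>"
  ollama pull llama3
  ollama run llama3 "<your question>"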







Reference link: https://www.youtube.com/watch?v=oxTVzGwKeoU

