Artificial Intelligence - AIGC: Fine-Tuning Techniques (Datawhale X ModelScope AI Summer Camp)

Posted by 种地 on 2024-9-25 04:24:43

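Fine-tuning is a technique widely used in deep learning and machine learning: it adjusts the parameters of a pre-trained model so that the model adapts better to a specific task. Understanding its basic principles and its parameters is essential for getting good results.

Preface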

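The goal here is to understand the basic principles of fine-tuning and to get a clearer picture of its parameters, so that we can achieve better results. This Task also introduces ComfyUI, a workflow platform for text-to-image generation that allows much more customized image generation.
I. First Look at the Tool: Exploring ComfyUI Use Cases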

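ComfyUI is a node-based WebUI for AI image generation. It focuses on precise workflow customization: by splitting the Stable Diffusion pipeline into nodes, it makes workflows customizable and reproducible.
1. Install ComfyUI in 20 minutes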

We use the Notebook and the free GPU compute provided by the ModelScope community to try out ComfyUI.
https://i-blog.csdnimg.cn/direct/c36cc84dd30f467e8d4c906aecc5c41e.png
2. Download the script files

Download the ComfyUI installation files and the LoRA file fine-tuned in task1:
git lfs install
git clone https://www.modelscope.cn/datasets/maochase/kolors_test_comfyui.git
mv kolors_test_comfyui/* ./
rm -rf kolors_test_comfyui/
mkdir -p /mnt/workspace/models/lightning_logs/version_0/checkpoints/
mv epoch=0-step=500.ckpt /mnt/workspace/models/lightning_logs/version_0/checkpoints/
https://i-blog.csdnimg.cn/direct/65bed4688a3e4d7da621ce9f8214fa5c.png
https://i-blog.csdnimg.cn/direct/f2f084d981d34dd282209a95077a53d0.png
3. Open the ComfyUI installation file

https://i-blog.csdnimg.cn/direct/4ac75f4c52a7461f8b731c23e1a13e5b.png
4. Run the installer with one click (about 10 min)

https://i-blog.csdnimg.cn/direct/cfdc84e2e8eb43e481d38c4c8ee485b7.png
https://i-blog.csdnimg.cn/direct/a8364035cee74887b22093d1d5d7bd61.png
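5. Open the preview interface

When the last cell has finished running and its output shows an access link, copy that link into your browser.
PS: If the link shows a blank page or an error, wait a moment and try again; the program may not have finished starting up.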
https://i-blog.csdnimg.cn/direct/dc7c5c5418b9400287a31a2dbce72074.png
https://i-blog.csdnimg.cn/direct/1f2b55a78e7549d3a3fc2c062cda0788.png
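
The "wait a moment and try again" advice can also be automated. The following is only a minimal sketch using the Python standard library; the function names `http_ok` and `wait_until_ready`, the retry counts, and the example address are assumptions made for illustration and are not part of the original notebook.

```python
import time
import urllib.request


def http_ok(url, timeout=5.0):
    """Return True if the URL answers with HTTP 200, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # urllib.error.URLError and HTTPError are subclasses of OSError
        return False


def wait_until_ready(url, probe=http_ok, attempts=30, delay=10.0, sleep=time.sleep):
    """Call probe(url) up to `attempts` times, pausing `delay` seconds between tries.

    Returns the number of tries it took, or None if the page never answered.
    """
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None


# Usage (hypothetical address; use the link printed by the notebook instead):
# wait_until_ready("http://127.0.0.1:8188")
```

The `probe` and `sleep` parameters exist only so the retry loop can be exercised without a running server.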
6. A first taste of ComfyUI workflows

1. Workflow example without LoRA

Create a .json file:
{
"last_node_id": 15,
"last_link_id": 18,
"nodes": [
{
   "id": 11,
   "type": "VAELoader",
   "pos": [
   1323,
   240
   ],
   "size": {
   "0": 315,
   "1": 58
   },
   "flags": {},
   "order": 0,
   "mode": 0,
   "outputs": [
   {
      "name": "VAE",
      "type": "VAE",
      "links": [
         12
      ],
      "shape": 3
   }
   ],
   "properties": {
   "Node name for S&R": "VAELoader"
   },
   "widgets_values": [
   "sdxl.vae.safetensors"
   ]
},
{
   "id": 10,
   "type": "VAEDecode",
   "pos": [
   1368,
   369
   ],
   "size": {
   "0": 210,
   "1": 46
   },
   "flags": {},
   "order": 6,
   "mode": 0,
   "inputs": [
   {
      "name": "samples",
      "type": "LATENT",
      "link": 18
   },
   {
      "name": "vae",
      "type": "VAE",
      "link": 12,
      "slot_index": 1
   }
   ],
   "outputs": [
   {
      "name": "IMAGE",
      "type": "IMAGE",
      "links": [
         13
      ],
      "shape": 3,
      "slot_index": 0
   }
   ],
   "properties": {
   "Node name for S&R": "VAEDecode"
   }
},
{
   "id": 14,
   "type": "KolorsSampler",
   "pos": [
   1011,
   371
   ],
   "size": {
   "0": 315,
   "1": 222
   },
   "flags": {},
   "order": 5,
   "mode": 0,
   "inputs": [
   {
      "name": "kolors_model",
      "type": "KOLORSMODEL",
      "link": 16
   },
   {
      "name": "kolors_embeds",
      "type": "KOLORS_EMBEDS",
      "link": 17
   }
   ],
   "outputs": [
   {
      "name": "latent",
      "type": "LATENT",
      "links": [
         18
      ],
      "shape": 3,
      "slot_index": 0
   }
   ],
   "properties": {
   "Node name for S&R": "KolorsSampler"
   },
   "widgets_values": [
   1024,
   1024,
   1000102404233412,
   "fixed",
   25,
   5,
   "EulerDiscreteScheduler"
   ]
},
{
   "id": 6,
   "type": "DownloadAndLoadKolorsModel",
   "pos": [
   201,
   368
   ],
   "size": {
   "0": 315,
   "1": 82
   },
   "flags": {},
   "order": 1,
   "mode": 0,
   "outputs": [
   {
      "name": "kolors_model",
      "type": "KOLORSMODEL",
      "links": [
         16
      ],
      "shape": 3,
      "slot_index": 0
   }
   ],
   "properties": {
   "Node name for S&R": "DownloadAndLoadKolorsModel"
   },
   "widgets_values": [
   "Kwai-Kolors/Kolors",
   "fp16"
   ]
},
{
   "id": 3,
   "type": "PreviewImage",
   "pos": [
   1366,
   468
   ],
   "size": [
   535.4001724243165,
   562.2001106262207
   ],
   "flags": {},
   "order": 7,
   "mode": 0,
   "inputs": [
   {
      "name": "images",
      "type": "IMAGE",
      "link": 13
   }
   ],
   "properties": {
   "Node name for S&R": "PreviewImage"
   }
},
{
   "id": 12,
   "type": "KolorsTextEncode",
   "pos": [
   519,
   529
   ],
   "size": [
   457.2893696934723,
   225.28656056301645
   ],
   "flags": {},
   "order": 4,
   "mode": 0,
   "inputs": [
   {
      "name": "chatglm3_model",
      "type": "CHATGLM3MODEL",
      "link": 14,
      "slot_index": 0
   }
   ],
   "outputs": [
   {
      "name": "kolors_embeds",
      "type": "KOLORS_EMBEDS",
      "links": [
         17
      ],
      "shape": 3,
      "slot_index": 0
   }
   ],
   "properties": {
   "Node name for S&R": "KolorsTextEncode"
   },
   "widgets_values": [
   "cinematic photograph of an astronaut riding a horse in space |\nillustration of a cat wearing a top hat and a scarf|\nphotograph of a goldfish in a bowl |\nanime screencap of a red haired girl",
   "",
   1
   ]
},
{
   "id": 15,
   "type": "Note",
   "pos": [
   200,
   636
   ],
   "size": [
   273.5273818969726,
   149.55464588512064
   ],
   "flags": {},
   "order": 2,
   "mode": 0,
   "properties": {
   "text": ""
   },
   "widgets_values": [
   "Text encoding takes the most VRAM, quantization can reduce that a lot.\n\nApproximate values I have observed:\nfp16 - 12 GB\nquant8 - 8-9 GB\nquant4 - 4-5 GB\n\nquant4 reduces the quality quite a bit, 8 seems fine"
   ],
   "color": "#432",
   "bgcolor": "#653"
},
{
   "id": 13,
   "type": "DownloadAndLoadChatGLM3",
   "pos": [
   206,
   522
   ],
   "size": [
   274.5334274291992,
   58
   ],
   "flags": {},
   "order": 3,
   "mode": 0,
   "outputs": [
   {
      "name": "chatglm3_model",
      "type": "CHATGLM3MODEL",
      "links": [
         14
      ],
      "shape": 3
   }
   ],
   "properties": {
   "Node name for S&R": "DownloadAndLoadChatGLM3"
   },
   "widgets_values": [
   "fp16"
   ]
}
],
"links": [
[
   12,
   11,
   0,
   10,
   1,
   "VAE"
],
[
   13,
   10,
   0,
   3,
   0,
   "IMAGE"
],
[
   14,
   13,
   0,
   12,
   0,
   "CHATGLM3MODEL"
],
[
   16,
   6,
   0,
   14,
   0,
   "KOLORSMODEL"
],
[
   17,
   12,
   0,
   14,
   1,
   "KOLORS_EMBEDS"
],
[
   18,
   14,
   0,
   10,
   0,
   "LATENT"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
   "scale": 1.1,
   "offset": {
   "0": -114.73954010009766,
   "1": -139.79705810546875
   }
}
},
"version": 0.4
}
Load the model and generate your first image.
PS: The first click on generate loads resources and takes a while, so please be patient.
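
To make the structure of the workflow file above easier to read, here is a small sketch that lists the nodes in execution order and prints each connection. It embeds a trimmed copy of the workflow (only the `id`, `type` and `order` fields, without the Note node). The field order assumed for each `links` entry (link id, source node, source slot, target node, target slot, data type) is inferred from the file itself, and the helper names are made up for this example.

```python
import json

# Trimmed copy of the workflow above: only the fields this sketch reads.
WORKFLOW_JSON = """
{
  "nodes": [
    {"id": 11, "type": "VAELoader", "order": 0},
    {"id": 6, "type": "DownloadAndLoadKolorsModel", "order": 1},
    {"id": 13, "type": "DownloadAndLoadChatGLM3", "order": 3},
    {"id": 12, "type": "KolorsTextEncode", "order": 4},
    {"id": 14, "type": "KolorsSampler", "order": 5},
    {"id": 10, "type": "VAEDecode", "order": 6},
    {"id": 3, "type": "PreviewImage", "order": 7}
  ],
  "links": [
    [12, 11, 0, 10, 1, "VAE"],
    [13, 10, 0, 3, 0, "IMAGE"],
    [14, 13, 0, 12, 0, "CHATGLM3MODEL"],
    [16, 6, 0, 14, 0, "KOLORSMODEL"],
    [17, 12, 0, 14, 1, "KOLORS_EMBEDS"],
    [18, 14, 0, 10, 0, "LATENT"]
  ]
}
"""


def execution_order(workflow):
    """Node type names sorted by each node's "order" field."""
    return [n["type"] for n in sorted(workflow["nodes"], key=lambda n: n["order"])]


def describe_links(workflow):
    """Render each link as 'Source --TYPE--> Target'."""
    type_by_id = {n["id"]: n["type"] for n in workflow["nodes"]}
    lines = []
    # Assumed layout: link id, source node, source slot, target node, target slot, data type.
    for _link_id, src, _src_slot, dst, _dst_slot, data_type in workflow["links"]:
        lines.append(f"{type_by_id[src]} --{data_type}--> {type_by_id[dst]}")
    return lines


workflow = json.loads(WORKFLOW_JSON)
print(" -> ".join(execution_order(workflow)))
for line in describe_links(workflow):
    print(line)
```

Read this way, the graph is a short chain: the two loaders feed the text encoder and the sampler, the sampler produces a latent, and the VAE decodes it into the previewed image.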

2. Workflow example with LoRA

Create a .json file:

{
"last_node_id": 16,
"last_link_id": 20,
"nodes": [
{
   "id": 11,
   "type": "VAELoader",
   "pos": [
   1323,
   240
   ],
   "size": {
   "0": 315,
   "1": 58
   },
   "flags": {},
   "order": 0,
   "mode": 0,
   "outputs": [
   {
      "name": "VAE",
      "type": "VAE",
      "links": [
         12
      ],
      "shape": 3
   }
   ],
   "properties": {
   "Node name for S&R": "VAELoader"
   },
   "widgets_values": [
   "sdxl.vae.safetensors"
   ]
},
{
   "id": 10,
   "type": "VAEDecode",
   "pos": [
   1368,
   369
   ],
   "size": {
   "0": 210,
   "1": 46
   },
   "flags": {},
   "order": 7,
   "mode": 0,
   "inputs": [
   {
      "name": "samples",
      "type": "LATENT",
      "link": 18
   },
   {
      "name": "vae",
      "type": "VAE",
      "link": 12,
      "slot_index": 1
   }
   ],
   "outputs": [
   {
      "name": "IMAGE",
      "type": "IMAGE",
      "links": [
         13
      ],
      "shape": 3,
      "slot_index": 0
   }
   ],
   "properties": {
   "Node name for S&R": "VAEDecode"
   }
},
{
   "id": 15,
   "type": "Note",
   "pos": [
   200,
   636
   ],
   "size": {
   "0": 273.5273742675781,
   "1": 149.5546417236328
   },
   "flags": {},
   "order": 1,
   "mode": 0,
   "properties": {
   "text": ""
   },
   "widgets_values": [
   "Text encoding takes the most VRAM, quantization can reduce that a lot.\n\nApproximate values I have observed:\nfp16 - 12 GB\nquant8 - 8-9 GB\nquant4 - 4-5 GB\n\nquant4 reduces the quality quite a bit, 8 seems fine"
   ],
   "color": "#432",
   "bgcolor": "#653"
},
{
   "id": 13,
   "type": "DownloadAndLoadChatGLM3",
   "pos": [
   206,
   522
   ],
   "size": {
   "0": 274.5334167480469,
   "1": 58
   },
   "flags": {},
   "order": 2,
   "mode": 0,
   "outputs": [
   {
      "name": "chatglm3_model",
      "type": "CHATGLM3MODEL",
      "links": [
         14
      ],
      "shape": 3
   }
   ],
   "properties": {
   "Node name for S&R": "DownloadAndLoadChatGLM3"
   },
   "widgets_values": [
   "fp16"
   ]
},
{
   "id": 6,
   "type": "DownloadAndLoadKolorsModel",
   "pos": [
   201,
   368
   ],
   "size": {
   "0": 315,
   "1": 82
   },
   "flags": {},
   "order": 3,
   "mode": 0,
   "outputs": [
   {
      "name": "kolors_model",
      "type": "KOLORSMODEL",
      "links": [
         19
      ],
      "shape": 3,
      "slot_index": 0
   }
   ],
   "properties": {
   "Node name for S&R": "DownloadAndLoadKolorsModel"
   },
   "widgets_values": [
   "Kwai-Kolors/Kolors",
   "fp16"
   ]
},
{
   "id": 12,
   "type": "KolorsTextEncode",
   "pos": [
   519,
   529
   ],
   "size": {
   "0": 457.28936767578125,
   "1": 225.28656005859375
   },
   "flags": {},
   "order": 4,
   "mode": 0,
   "inputs": [
   {
      "name": "chatglm3_model",
      "type": "CHATGLM3MODEL",
      "link": 14,
      "slot_index": 0
   }
   ],
   "outputs": [
   {
      "name": "kolors_embeds",
      "type": "KOLORS_EMBEDS",
      "links": [
         17
      ],
      "shape": 3,
      "slot_index": 0
   }
   ],
   "properties": {
   "Node name for S&R": "KolorsTextEncode"
   },
   "widgets_values": [
   "二次元，长发，少女，白色背景",
   "",
   1
   ]
},
{
   "id": 3,
   "type": "PreviewImage",
   "pos": [
   1366,
   469
   ],
   "size": {
   "0": 535.400146484375,
   "1": 562.2001342773438
   },
   "flags": {},
   "order": 8,
   "mode": 0,
   "inputs": [
   {
      "name": "images",
      "type": "IMAGE",
      "link": 13
   }
   ],
   "properties": {
   "Node name for S&R": "PreviewImage"
   }
},
{
   "id": 16,
   "type": "LoadKolorsLoRA",
   "pos": [
   606,
   368
   ],
   "size": {
   "0": 317.4000244140625,
   "1": 82
   },
   "flags": {},
   "order": 5,
   "mode": 0,
   "inputs": [
   {
      "name": "kolors_model",
      "type": "KOLORSMODEL",
      "link": 19
   }
   ],
   "outputs": [
   {
      "name": "kolors_model",
      "type": "KOLORSMODEL",
      "links": [
         20
      ],
      "shape": 3,
      "slot_index": 0
   }
   ],
   "properties": {
   "Node name for S&R": "LoadKolorsLoRA"
   },
   "widgets_values": [
   "/mnt/workspace/models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt",
   2
   ]
},
{
   "id": 14,
   "type": "KolorsSampler",
   "pos": [
   1011,
   371
   ],
   "size": {
   "0": 315,
   "1": 266
   },
   "flags": {},
   "order": 6,
   "mode": 0,
   "inputs": [
   {
      "name": "kolors_model",
      "type": "KOLORSMODEL",
      "link": 20
   },
   {
      "name": "kolors_embeds",
      "type": "KOLORS_EMBEDS",
      "link": 17
   },
   {
      "name": "latent",
      "type": "LATENT",
      "link": null
   }
   ],
   "outputs": [
   {
      "name": "latent",
      "type": "LATENT",
      "links": [
         18
      ],
      "shape": 3,
      "slot_index": 0
   }
   ],
   "properties": {
   "Node name for S&R": "KolorsSampler"
   },
   "widgets_values": [
   1024,
   1024,
   0,
   "fixed",
   25,
   5,
   "EulerDiscreteScheduler",
   1
   ]
}
],
"links": [
[
   12,
   11,
   0,
   10,
   1,
   "VAE"
],
[
   13,
   10,
   0,
   3,
   0,
   "IMAGE"
],
[
   14,
   13,
   0,
   12,
   0,
   "CHATGLM3MODEL"
],
[
   17,
   12,
   0,
   14,
   1,
   "KOLORS_EMBEDS"
],
[
   18,
   14,
   0,
   10,
   0,
   "LATENT"
],
[
   19,
   6,
   0,
   16,
   0,
   "KOLORSMODEL"
],
[
   20,
   16,
   0,
   14,
   0,
   "KOLORSMODEL"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
   "scale": 1.2100000000000002,
   "offset": {
   "0": -183.91309381910426,
   "1": -202.11110769225016
   }
}
},
"version": 0.4
}
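
Compared with the first workflow, the only wiring change is that the KOLORSMODEL connection from the model loader (node 6) to the sampler (node 14) now passes through the LoadKolorsLoRA node (node 16). The sketch below reproduces that rewiring on the `links` array alone. `splice_node` is a hypothetical helper written for this illustration; a real workflow file would also need the matching `link` and `links` fields inside the node entries updated.

```python
def splice_node(links, old_link_id, new_node_id, first_new_link_id):
    """Replace link A->B with A->new_node and new_node->B.

    Assumes the new node uses input slot 0 and output slot 0.
    """
    result = []
    for link in links:
        link_id, src, src_slot, dst, dst_slot, data_type = link
        if link_id != old_link_id:
            result.append(list(link))
            continue
        result.append([first_new_link_id, src, src_slot, new_node_id, 0, data_type])
        result.append([first_new_link_id + 1, new_node_id, 0, dst, dst_slot, data_type])
    return result


# The two links that reach the sampler in the workflow without LoRA.
links_without_lora = [
    [16, 6, 0, 14, 0, "KOLORSMODEL"],
    [17, 12, 0, 14, 1, "KOLORS_EMBEDS"],
]

links_with_lora = splice_node(
    links_without_lora, old_link_id=16, new_node_id=16, first_new_link_id=19
)
for link in links_with_lora:
    print(link)
```

The two new entries correspond to links 19 and 20 in the LoRA workflow, while the text-embedding link 17 is left untouched.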

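7. Shut down the ModelScope GPU environment

II. LoRA Fine-Tuning

LoRA stands for Low-Rank Adaptation. It is an efficient fine-tuning technique that is especially suited to large pre-trained models: it keeps most of the pre-trained parameters unchanged, introduces low-rank matrices, and trains only that small number of parameters to adapt the model to a specific task.
1. Basic principles of LoRA fine-tuning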

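[*]Low-rank approximation of the parameter matrix:

[*]Large models are usually over-parameterized: their parameter matrices have very high dimensions, yet for a specific task only a small portion of the parameters plays an important role.
[*]LoRA applies the idea of low-rank matrix decomposition. It introduces two much smaller matrices, A of shape d×r and B of shape r×d with r far smaller than d, whose product approximates the task-specific update to the original weight matrix rather than the weight matrix itself.
[*]The product AB has rank at most r, far below the rank of the original weight matrix, yet it can largely preserve the model's performance on the specific task.

[*]Bypass structure:

[*]A bypass branch is added to the network alongside the original weights; it computes the product of the two matrices A and B.
[*]During training, the original network parameters are frozen and only the bypass parameters A and B are trained.
[*]Because A and B together hold far fewer parameters than the original network, the GPU memory needed for training drops substantially.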
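
The points above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code used by the training script: the matrix shapes follow the text (A is d×r, B is r×d), while the `alpha / r` scaling and the zero initialization of B are common LoRA conventions assumed here, and all function names are made up for the example.

```python
import random


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def lora_forward(x, w, a, b, alpha):
    """Compute x @ W + (alpha / r) * x @ A @ B, with W frozen and only A, B trainable."""
    r = len(b)  # B has shape r x d, so its row count is the rank
    base = matmul(x, w)
    bypass = matmul(matmul(x, a), b)
    scale = alpha / r  # assumed convention, not stated in the text above
    return [[p + scale * q for p, q in zip(br, yr)] for br, yr in zip(base, bypass)]


def trainable_params(d, r):
    """Number of parameters in A (d x r) plus B (r x d)."""
    return 2 * d * r


random.seed(0)
d, r, alpha = 8, 2, 4.0
w = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]      # frozen weights, d x d
a = [[random.gauss(0, 0.01) for _ in range(r)] for _ in range(d)]   # trainable, d x r
b = [[0.0] * d for _ in range(r)]                                   # trainable, r x d, zero init
x = [[1.0] * d]

# With B at zero the bypass contributes nothing, so the output equals x @ W.
assert lora_forward(x, w, a, b, alpha) == matmul(x, w)

# Parameter counts for an illustrative 4096 x 4096 layer at rank 16.
print(trainable_params(4096, 16), "trainable vs", 4096 * 4096, "frozen")
```

For that illustrative layer, the bypass holds 131,072 parameters against 16,777,216 in the frozen matrix, which is less than 1%.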

2. Fine-tuning code from Task2

The code is as follows:
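import os

# Flags passed to the Kolors LoRA training script:
#   --pretrained_unet_path          UNet weights
#   --pretrained_text_encoder_path  text encoder
#   --pretrained_fp16_vae_path      VAE weights
#   --lora_rank 16                  rank of the LoRA matrices; trades model expressiveness against compute and memory
#   --lora_alpha 4.0                LoRA alpha, which controls the strength of the adjustment
#   --dataset_path                  dataset used for training
#   --output_path                   where the trained model is saved
#   --max_epochs 1                  train for at most 1 epoch
#   --center_crop                   center-crop images during preprocessing
#   --use_gradient_checkpointing    gradient checkpointing, to save memory
#   --precision "16-mixed"          16-bit mixed precision, which speeds up training and reduces GPU memory use
# The comments are kept here because text after a trailing backslash would break the shell command.
cmd = """
python DiffSynth-Studio/examples/train/kolors/train_kolors_lora.py \
  --pretrained_unet_path models/kolors/Kolors/unet/diffusion_pytorch_model.safetensors \
  --pretrained_text_encoder_path models/kolors/Kolors/text_encoder \
  --pretrained_fp16_vae_path models/sdxl-vae-fp16-fix/diffusion_pytorch_model.safetensors \
  --lora_rank 16 \
  --lora_alpha 4.0 \
  --dataset_path data/lora_dataset_processed \
  --output_path ./models \
  --max_epochs 1 \
  --center_crop \
  --use_gradient_checkpointing \
  --precision "16-mixed"
""".strip()
os.system(cmd)  # run the Kolors LoRA training

Summary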

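As an important branch of artificial intelligence, AIGC is gradually changing the way we live and work. As the technology keeps developing and maturing, there is good reason to believe that AIGC will show its distinctive appeal and value in more and more fields.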
