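Fine-tuning is a technique widely used in deep learning and machine learning that adjusts the parameters of a pretrained model so that it adapts better to a specific task. Understanding its basic principles and its parameters is essential for getting good results.
Preface
This task covers the basic principles of fine-tuning and gives a clearer picture of its parameters, so that you can get better results. It also introduces ComfyUI, a workflow platform for text-to-image generation that allows much more customized text-to-image pipelines.
I. A First Look at the Tool: Exploring ComfyUI Use Cases
ComfyUI is a node-based AI image generation tool with a web UI. It focuses on precise workflow customization: by splitting the Stable Diffusion pipeline into nodes, it makes workflows customizable and reproducible.
1. Install ComfyUI in 20 minutes
We use the Notebook and the free GPU compute provided by the ModelScope community to try out ComfyUI.
2. Download the script files
Download the files that install ComfyUI, together with the LoRA file fine-tuned in Task 1.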
- git lfs install
- git clone https://www.modelscope.cn/datasets/maochase/kolors_test_comfyui.git
- mv kolors_test_comfyui/* ./
- rm -rf kolors_test_comfyui/
- mkdir -p /mnt/workspace/models/lightning_logs/version_0/checkpoints/
- mv epoch=0-step=500.ckpt /mnt/workspace/models/lightning_logs/version_0/checkpoints/
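3. Open the ComfyUI installation file
4. Run the installer in one click (about 10 min)
5. Open the preview interface
When the last cell finishes and prints an access link, copy the link into your browser.
PS: If the link shows a blank page or an error, wait a moment and try again; the program may not have finished starting.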
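The "wait and retry" advice can also be scripted. The following is only a minimal sketch using the Python standard library; the helper names `is_up` and `wait_until_ready`, the retry count, and the delay are assumptions for illustration, and the URL is whatever link your notebook printed.

```python
import time
import urllib.error
import urllib.request


def is_up(url, timeout=5.0):
    """Return True if the URL answers with HTTP 200, False on any connection error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(url, attempts=30, delay=10.0, probe=is_up, sleep=time.sleep):
    """Probe the URL up to `attempts` times, pausing `delay` seconds between tries.

    Returns the 1-based attempt number that succeeded, or None if all failed.
    `probe` and `sleep` are injectable so the loop can be tried without a server.
    """
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

Calling `wait_until_ready(link)` with the printed link blocks until the page responds or the attempts run out.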
6. A first taste of ComfyUI workflows
1. Workflow example without LoRA
Create a .json file:
- {
- "last_node_id": 15,
- "last_link_id": 18,
- "nodes": [
- {
- "id": 11,
- "type": "VAELoader",
- "pos": [
- 1323,
- 240
- ],
- "size": {
- "0": 315,
- "1": 58
- },
- "flags": {},
- "order": 0,
- "mode": 0,
- "outputs": [
- {
- "name": "VAE",
- "type": "VAE",
- "links": [
- 12
- ],
- "shape": 3
- }
- ],
- "properties": {
- "Node name for S&R": "VAELoader"
- },
- "widgets_values": [
- "sdxl.vae.safetensors"
- ]
- },
- {
- "id": 10,
- "type": "VAEDecode",
- "pos": [
- 1368,
- 369
- ],
- "size": {
- "0": 210,
- "1": 46
- },
- "flags": {},
- "order": 6,
- "mode": 0,
- "inputs": [
- {
- "name": "samples",
- "type": "LATENT",
- "link": 18
- },
- {
- "name": "vae",
- "type": "VAE",
- "link": 12,
- "slot_index": 1
- }
- ],
- "outputs": [
- {
- "name": "IMAGE",
- "type": "IMAGE",
- "links": [
- 13
- ],
- "shape": 3,
- "slot_index": 0
- }
- ],
- "properties": {
- "Node name for S&R": "VAEDecode"
- }
- },
- {
- "id": 14,
- "type": "KolorsSampler",
- "pos": [
- 1011,
- 371
- ],
- "size": {
- "0": 315,
- "1": 222
- },
- "flags": {},
- "order": 5,
- "mode": 0,
- "inputs": [
- {
- "name": "kolors_model",
- "type": "KOLORSMODEL",
- "link": 16
- },
- {
- "name": "kolors_embeds",
- "type": "KOLORS_EMBEDS",
- "link": 17
- }
- ],
- "outputs": [
- {
- "name": "latent",
- "type": "LATENT",
- "links": [
- 18
- ],
- "shape": 3,
- "slot_index": 0
- }
- ],
- "properties": {
- "Node name for S&R": "KolorsSampler"
- },
- "widgets_values": [
- 1024,
- 1024,
- 1000102404233412,
- "fixed",
- 25,
- 5,
- "EulerDiscreteScheduler"
- ]
- },
- {
- "id": 6,
- "type": "DownloadAndLoadKolorsModel",
- "pos": [
- 201,
- 368
- ],
- "size": {
- "0": 315,
- "1": 82
- },
- "flags": {},
- "order": 1,
- "mode": 0,
- "outputs": [
- {
- "name": "kolors_model",
- "type": "KOLORSMODEL",
- "links": [
- 16
- ],
- "shape": 3,
- "slot_index": 0
- }
- ],
- "properties": {
- "Node name for S&R": "DownloadAndLoadKolorsModel"
- },
- "widgets_values": [
- "Kwai-Kolors/Kolors",
- "fp16"
- ]
- },
- {
- "id": 3,
- "type": "PreviewImage",
- "pos": [
- 1366,
- 468
- ],
- "size": [
- 535.4001724243165,
- 562.2001106262207
- ],
- "flags": {},
- "order": 7,
- "mode": 0,
- "inputs": [
- {
- "name": "images",
- "type": "IMAGE",
- "link": 13
- }
- ],
- "properties": {
- "Node name for S&R": "PreviewImage"
- }
- },
- {
- "id": 12,
- "type": "KolorsTextEncode",
- "pos": [
- 519,
- 529
- ],
- "size": [
- 457.2893696934723,
- 225.28656056301645
- ],
- "flags": {},
- "order": 4,
- "mode": 0,
- "inputs": [
- {
- "name": "chatglm3_model",
- "type": "CHATGLM3MODEL",
- "link": 14,
- "slot_index": 0
- }
- ],
- "outputs": [
- {
- "name": "kolors_embeds",
- "type": "KOLORS_EMBEDS",
- "links": [
- 17
- ],
- "shape": 3,
- "slot_index": 0
- }
- ],
- "properties": {
- "Node name for S&R": "KolorsTextEncode"
- },
- "widgets_values": [
- "cinematic photograph of an astronaut riding a horse in space |\nillustration of a cat wearing a top hat and a scarf |\nphotograph of a goldfish in a bowl |\nanime screencap of a red haired girl",
- "",
- 1
- ]
- },
- {
- "id": 15,
- "type": "Note",
- "pos": [
- 200,
- 636
- ],
- "size": [
- 273.5273818969726,
- 149.55464588512064
- ],
- "flags": {},
- "order": 2,
- "mode": 0,
- "properties": {
- "text": ""
- },
- "widgets_values": [
- "Text encoding takes the most VRAM, quantization can reduce that a lot.\n\nApproximate values I have observed:\nfp16 - 12 GB\nquant8 - 8-9 GB\nquant4 - 4-5 GB\n\nquant4 reduces the quality quite a bit, 8 seems fine"
- ],
- "color": "#432",
- "bgcolor": "#653"
- },
- {
- "id": 13,
- "type": "DownloadAndLoadChatGLM3",
- "pos": [
- 206,
- 522
- ],
- "size": [
- 274.5334274291992,
- 58
- ],
- "flags": {},
- "order": 3,
- "mode": 0,
- "outputs": [
- {
- "name": "chatglm3_model",
- "type": "CHATGLM3MODEL",
- "links": [
- 14
- ],
- "shape": 3
- }
- ],
- "properties": {
- "Node name for S&R": "DownloadAndLoadChatGLM3"
- },
- "widgets_values": [
- "fp16"
- ]
- }
- ],
- "links": [
- [
- 12,
- 11,
- 0,
- 10,
- 1,
- "VAE"
- ],
- [
- 13,
- 10,
- 0,
- 3,
- 0,
- "IMAGE"
- ],
- [
- 14,
- 13,
- 0,
- 12,
- 0,
- "CHATGLM3MODEL"
- ],
- [
- 16,
- 6,
- 0,
- 14,
- 0,
- "KOLORSMODEL"
- ],
- [
- 17,
- 12,
- 0,
- 14,
- 1,
- "KOLORS_EMBEDS"
- ],
- [
- 18,
- 14,
- 0,
- 10,
- 0,
- "LATENT"
- ]
- ],
- "groups": [],
- "config": {},
- "extra": {
- "ds": {
- "scale": 1.1,
- "offset": {
- "0": -114.73954010009766,
- "1": -139.79705810546875
- }
- }
- },
- "version": 0.4
- }
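Load the model and generate your first image.
PS: The first time you click to generate an image, resources have to be loaded, which takes a while, so please be patient.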
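To see how the nodes in the workflow file above fit together, it can help to read the JSON programmatically. The sketch below uses a trimmed excerpt of that workflow (only `id`, `type`, `order`, and `links` are kept, and the `Note` node is omitted). It assumes, based on how the link ids match up in the file, that each link entry reads as [link id, source node, source slot, target node, target slot, data type], and that `order` reflects the execution sequence; the function names are made up for illustration.

```python
# Trimmed excerpt of the no-LoRA workflow: only the fields used here are kept.
workflow = {
    "nodes": [
        {"id": 11, "type": "VAELoader", "order": 0},
        {"id": 10, "type": "VAEDecode", "order": 6},
        {"id": 14, "type": "KolorsSampler", "order": 5},
        {"id": 6, "type": "DownloadAndLoadKolorsModel", "order": 1},
        {"id": 3, "type": "PreviewImage", "order": 7},
        {"id": 12, "type": "KolorsTextEncode", "order": 4},
        {"id": 13, "type": "DownloadAndLoadChatGLM3", "order": 3},
    ],
    "links": [
        [12, 11, 0, 10, 1, "VAE"],
        [13, 10, 0, 3, 0, "IMAGE"],
        [14, 13, 0, 12, 0, "CHATGLM3MODEL"],
        [16, 6, 0, 14, 0, "KOLORSMODEL"],
        [17, 12, 0, 14, 1, "KOLORS_EMBEDS"],
        [18, 14, 0, 10, 0, "LATENT"],
    ],
}


def execution_order(wf):
    """Node types sorted by their "order" field."""
    return [n["type"] for n in sorted(wf["nodes"], key=lambda n: n["order"])]


def describe_links(wf):
    """Render each link as 'Source[slot] -> Target[slot] (TYPE)'."""
    names = {n["id"]: n["type"] for n in wf["nodes"]}
    return [
        f"{names[src]}[{src_slot}] -> {names[dst]}[{dst_slot}] ({kind})"
        for _link_id, src, src_slot, dst, dst_slot, kind in wf["links"]
    ]


for line in describe_links(workflow):
    print(line)
```

Read this way, the graph is a short pipeline: the two loaders feed the text encoder and the sampler, the sampler's latent goes to `VAEDecode`, and the decoded image goes to `PreviewImage`.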
2. Workflow example with LoRA
Create a .json file:
-
- {
- "last_node_id": 16,
- "last_link_id": 20,
- "nodes": [
- {
- "id": 11,
- "type": "VAELoader",
- "pos": [
- 1323,
- 240
- ],
- "size": {
- "0": 315,
- "1": 58
- },
- "flags": {},
- "order": 0,
- "mode": 0,
- "outputs": [
- {
- "name": "VAE",
- "type": "VAE",
- "links": [
- 12
- ],
- "shape": 3
- }
- ],
- "properties": {
- "Node name for S&R": "VAELoader"
- },
- "widgets_values": [
- "sdxl.vae.safetensors"
- ]
- },
- {
- "id": 10,
- "type": "VAEDecode",
- "pos": [
- 1368,
- 369
- ],
- "size": {
- "0": 210,
- "1": 46
- },
- "flags": {},
- "order": 7,
- "mode": 0,
- "inputs": [
- {
- "name": "samples",
- "type": "LATENT",
- "link": 18
- },
- {
- "name": "vae",
- "type": "VAE",
- "link": 12,
- "slot_index": 1
- }
- ],
- "outputs": [
- {
- "name": "IMAGE",
- "type": "IMAGE",
- "links": [
- 13
- ],
- "shape": 3,
- "slot_index": 0
- }
- ],
- "properties": {
- "Node name for S&R": "VAEDecode"
- }
- },
- {
- "id": 15,
- "type": "Note",
- "pos": [
- 200,
- 636
- ],
- "size": {
- "0": 273.5273742675781,
- "1": 149.5546417236328
- },
- "flags": {},
- "order": 1,
- "mode": 0,
- "properties": {
- "text": ""
- },
- "widgets_values": [
- "Text encoding takes the most VRAM, quantization can reduce that a lot.\n\nApproximate values I have observed:\nfp16 - 12 GB\nquant8 - 8-9 GB\nquant4 - 4-5 GB\n\nquant4 reduces the quality quite a bit, 8 seems fine"
- ],
- "color": "#432",
- "bgcolor": "#653"
- },
- {
- "id": 13,
- "type": "DownloadAndLoadChatGLM3",
- "pos": [
- 206,
- 522
- ],
- "size": {
- "0": 274.5334167480469,
- "1": 58
- },
- "flags": {},
- "order": 2,
- "mode": 0,
- "outputs": [
- {
- "name": "chatglm3_model",
- "type": "CHATGLM3MODEL",
- "links": [
- 14
- ],
- "shape": 3
- }
- ],
- "properties": {
- "Node name for S&R": "DownloadAndLoadChatGLM3"
- },
- "widgets_values": [
- "fp16"
- ]
- },
- {
- "id": 6,
- "type": "DownloadAndLoadKolorsModel",
- "pos": [
- 201,
- 368
- ],
- "size": {
- "0": 315,
- "1": 82
- },
- "flags": {},
- "order": 3,
- "mode": 0,
- "outputs": [
- {
- "name": "kolors_model",
- "type": "KOLORSMODEL",
- "links": [
- 19
- ],
- "shape": 3,
- "slot_index": 0
- }
- ],
- "properties": {
- "Node name for S&R": "DownloadAndLoadKolorsModel"
- },
- "widgets_values": [
- "Kwai-Kolors/Kolors",
- "fp16"
- ]
- },
- {
- "id": 12,
- "type": "KolorsTextEncode",
- "pos": [
- 519,
- 529
- ],
- "size": {
- "0": 457.28936767578125,
- "1": 225.28656005859375
- },
- "flags": {},
- "order": 4,
- "mode": 0,
- "inputs": [
- {
- "name": "chatglm3_model",
- "type": "CHATGLM3MODEL",
- "link": 14,
- "slot_index": 0
- }
- ],
- "outputs": [
- {
- "name": "kolors_embeds",
- "type": "KOLORS_EMBEDS",
- "links": [
- 17
- ],
- "shape": 3,
- "slot_index": 0
- }
- ],
- "properties": {
- "Node name for S&R": "KolorsTextEncode"
- },
- "widgets_values": [
- "二次元,长发,少女,白色背景",
- "",
- 1
- ]
- },
- {
- "id": 3,
- "type": "PreviewImage",
- "pos": [
- 1366,
- 469
- ],
- "size": {
- "0": 535.400146484375,
- "1": 562.2001342773438
- },
- "flags": {},
- "order": 8,
- "mode": 0,
- "inputs": [
- {
- "name": "images",
- "type": "IMAGE",
- "link": 13
- }
- ],
- "properties": {
- "Node name for S&R": "PreviewImage"
- }
- },
- {
- "id": 16,
- "type": "LoadKolorsLoRA",
- "pos": [
- 606,
- 368
- ],
- "size": {
- "0": 317.4000244140625,
- "1": 82
- },
- "flags": {},
- "order": 5,
- "mode": 0,
- "inputs": [
- {
- "name": "kolors_model",
- "type": "KOLORSMODEL",
- "link": 19
- }
- ],
- "outputs": [
- {
- "name": "kolors_model",
- "type": "KOLORSMODEL",
- "links": [
- 20
- ],
- "shape": 3,
- "slot_index": 0
- }
- ],
- "properties": {
- "Node name for S&R": "LoadKolorsLoRA"
- },
- "widgets_values": [
- "/mnt/workspace/models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt",
- 2
- ]
- },
- {
- "id": 14,
- "type": "KolorsSampler",
- "pos": [
- 1011,
- 371
- ],
- "size": {
- "0": 315,
- "1": 266
- },
- "flags": {},
- "order": 6,
- "mode": 0,
- "inputs": [
- {
- "name": "kolors_model",
- "type": "KOLORSMODEL",
- "link": 20
- },
- {
- "name": "kolors_embeds",
- "type": "KOLORS_EMBEDS",
- "link": 17
- },
- {
- "name": "latent",
- "type": "LATENT",
- "link": null
- }
- ],
- "outputs": [
- {
- "name": "latent",
- "type": "LATENT",
- "links": [
- 18
- ],
- "shape": 3,
- "slot_index": 0
- }
- ],
- "properties": {
- "Node name for S&R": "KolorsSampler"
- },
- "widgets_values": [
- 1024,
- 1024,
- 0,
- "fixed",
- 25,
- 5,
- "EulerDiscreteScheduler",
- 1
- ]
- }
- ],
- "links": [
- [
- 12,
- 11,
- 0,
- 10,
- 1,
- "VAE"
- ],
- [
- 13,
- 10,
- 0,
- 3,
- 0,
- "IMAGE"
- ],
- [
- 14,
- 13,
- 0,
- 12,
- 0,
- "CHATGLM3MODEL"
- ],
- [
- 17,
- 12,
- 0,
- 14,
- 1,
- "KOLORS_EMBEDS"
- ],
- [
- 18,
- 14,
- 0,
- 10,
- 0,
- "LATENT"
- ],
- [
- 19,
- 6,
- 0,
- 16,
- 0,
- "KOLORSMODEL"
- ],
- [
- 20,
- 16,
- 0,
- 14,
- 0,
- "KOLORSMODEL"
- ]
- ],
- "groups": [],
- "config": {},
- "extra": {
- "ds": {
- "scale": 1.2100000000000002,
- "offset": {
- "0": -183.91309381910426,
- "1": -202.11110769225016
- }
- }
- },
- "version": 0.4
- }
-
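The two workflow files are nearly identical, so it is worth checking exactly what the LoRA version changes. The sketch below copies the `links` arrays from both files and compares them as sets of (source, target, type) edges. The link-field interpretation (source node in position 1, target node in position 3) is an assumption read off from the JSON, and the helper name `edges` is made up for illustration.

```python
# Node ids and types as they appear in the two workflow files.
NODE_TYPES = {
    3: "PreviewImage", 6: "DownloadAndLoadKolorsModel", 10: "VAEDecode",
    11: "VAELoader", 12: "KolorsTextEncode", 13: "DownloadAndLoadChatGLM3",
    14: "KolorsSampler", 16: "LoadKolorsLoRA",
}

BASE_LINKS = [
    [12, 11, 0, 10, 1, "VAE"], [13, 10, 0, 3, 0, "IMAGE"],
    [14, 13, 0, 12, 0, "CHATGLM3MODEL"], [16, 6, 0, 14, 0, "KOLORSMODEL"],
    [17, 12, 0, 14, 1, "KOLORS_EMBEDS"], [18, 14, 0, 10, 0, "LATENT"],
]

LORA_LINKS = [
    [12, 11, 0, 10, 1, "VAE"], [13, 10, 0, 3, 0, "IMAGE"],
    [14, 13, 0, 12, 0, "CHATGLM3MODEL"], [17, 12, 0, 14, 1, "KOLORS_EMBEDS"],
    [18, 14, 0, 10, 0, "LATENT"], [19, 6, 0, 16, 0, "KOLORSMODEL"],
    [20, 16, 0, 14, 0, "KOLORSMODEL"],
]


def edges(links):
    """Reduce links to a set of (source type, target type, data type) tuples."""
    return {
        (NODE_TYPES[src], NODE_TYPES[dst], kind)
        for _link_id, src, _src_slot, dst, _dst_slot, kind in links
    }


removed = edges(BASE_LINKS) - edges(LORA_LINKS)
added = edges(LORA_LINKS) - edges(BASE_LINKS)
print("removed:", sorted(removed))
print("added:  ", sorted(added))
```

The only structural difference is that the direct model-to-sampler connection is replaced by a path through `LoadKolorsLoRA`, which points at the checkpoint moved into place earlier; the prompt and seed values in `widgets_values` also differ but do not affect the graph.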
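7. Shut down the ModelScope GPU environment
II. LoRA Fine-tuning
LoRA, short for Low-Rank Adaptation, is an efficient fine-tuning technique that is particularly well suited to large pretrained models. It keeps most of the pretrained parameters unchanged by introducing low-rank matrices, and adjusts only a small number of parameters to adapt the model to a specific task.
1. Basic principles of LoRA fine-tuning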
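- Low-rank approximation of the parameter matrix:
  - Large models are typically over-parameterized: their weight matrices have very high dimensions, but for a specific task only a small part of the parameters plays an important role.
  - LoRA borrows the idea of low-rank matrix factorization. It introduces two much smaller matrices A and B (A has shape d×r and B has shape r×d, with r far smaller than d) whose product approximates the task-specific update to the original weight matrix, rather than the weight matrix itself.
  - The product AB has rank at most r, far below the rank the original weight matrix can have, yet it is able to preserve the model's performance on the specific task to a large extent.
- Bypass structure:
  - A bypass branch is added to the network; it computes the product of the two matrices A and B.
  - During training, the parameters of the original network are frozen and only the bypass parameters A and B are trained.
  - Because A and B contain far fewer parameters than the original network, the GPU memory needed for training drops substantially.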
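The idea above can be made concrete with a few lines of arithmetic. The sketch below is a pure-Python illustration, not the training code: it follows the shapes given above (A is d×r, B is r×d), and it assumes the common LoRA convention that the bypass output is scaled by alpha / r before being added to the frozen weights. The function names and the toy matrices are made up for illustration.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def lora_effective_weight(W, A, B, alpha, r):
    """Frozen weight W plus the scaled low-rank bypass: W + (alpha / r) * A @ B."""
    scale = alpha / r
    AB = matmul(A, B)  # d x d matrix of rank at most r
    return [[w + scale * d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, AB)]


def param_counts(d, r):
    """Trainable parameters: full d x d matrix versus the two LoRA factors."""
    return d * d, 2 * d * r


# Toy example with d = 3, r = 1: W stays frozen, only A and B would be trained.
W = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A = [[1], [2], [3]]   # d x r
B = [[1, 0, 1]]       # r x d
print(lora_effective_weight(W, A, B, alpha=2, r=1))

full, lora = param_counts(d=4096, r=16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

For a 4096×4096 layer with r = 16, the two factors hold 131,072 values against 16,777,216 for the full matrix, which is where the memory saving during training comes from.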
2. The fine-tuning code in Task 2
The code is as follows:
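- import os
-
- # Train a Kolors LoRA with the DiffSynth-Studio example script.
- # The comments live here rather than inside the command string: text placed
- # after a trailing backslash breaks the line continuation of the command.
- #   --pretrained_unet_path / --pretrained_text_encoder_path / --pretrained_fp16_vae_path:
- #       the UNet, text encoder, and VAE weights to start from
- #   --lora_rank 16: rank of the low-rank update, a trade-off between capacity
- #       and training cost (less compute and memory without a large quality loss)
- #   --lora_alpha 4.0: LoRA alpha, which controls the strength of the update
- #   --dataset_path / --output_path: training data and where the model is saved
- #   --max_epochs 1: train for at most one epoch
- #   --center_crop: center-crop the images, a common preprocessing step
- #   --use_gradient_checkpointing: gradient checkpointing, to save memory
- #   --precision "16-mixed": 16-bit mixed precision, faster and lighter on GPU memory
- cmd = """
- python DiffSynth-Studio/examples/train/kolors/train_kolors_lora.py \
-   --pretrained_unet_path models/kolors/Kolors/unet/diffusion_pytorch_model.safetensors \
-   --pretrained_text_encoder_path models/kolors/Kolors/text_encoder \
-   --pretrained_fp16_vae_path models/sdxl-vae-fp16-fix/diffusion_pytorch_model.safetensors \
-   --lora_rank 16 \
-   --lora_alpha 4.0 \
-   --dataset_path data/lora_dataset_processed \
-   --output_path ./models \
-   --max_epochs 1 \
-   --center_crop \
-   --use_gradient_checkpointing \
-   --precision "16-mixed"
- """.strip()
- os.system(cmd)  # run the Kolors LoRA training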
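When experimenting with different ranks or alpha values, it can be convenient to build the command as an argument list rather than editing a long string. The following is a minimal sketch of that alternative, not the tutorial's own method: the function name `build_train_command` and its parameters are made up for illustration, while every flag and path is copied from the command above. It only builds and prints the command; launching it with `subprocess.run` is left as a commented-out line.

```python
import shlex


def build_train_command(lora_rank=16, lora_alpha=4.0, max_epochs=1, output_path="./models"):
    """Return the training command from the tutorial as a list of arguments."""
    return [
        "python", "DiffSynth-Studio/examples/train/kolors/train_kolors_lora.py",
        "--pretrained_unet_path", "models/kolors/Kolors/unet/diffusion_pytorch_model.safetensors",
        "--pretrained_text_encoder_path", "models/kolors/Kolors/text_encoder",
        "--pretrained_fp16_vae_path", "models/sdxl-vae-fp16-fix/diffusion_pytorch_model.safetensors",
        "--lora_rank", str(lora_rank),
        "--lora_alpha", str(lora_alpha),
        "--dataset_path", "data/lora_dataset_processed",
        "--output_path", output_path,
        "--max_epochs", str(max_epochs),
        "--center_crop",
        "--use_gradient_checkpointing",
        "--precision", "16-mixed",
    ]


args = build_train_command()
print(shlex.join(args))  # the same command as a single shell-safe line

# To actually launch training (requires the script and models to be present):
# import subprocess
# subprocess.run(args, check=True)
```

Passing a list avoids shell quoting problems entirely, and `check=True` would raise an error if the training script exits with a failure, which `os.system` does not do on its own.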
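Summary
As an important branch of artificial intelligence, AIGC is gradually changing the way we live and work. As the technology continues to develop and mature, we have reason to believe that AIGC will show its distinctive appeal and value in more and more fields.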