Making Children's Day (June 1) music with the open-source model MusicGen

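This walkthrough uses the open-source model MusicGen, which generates high-quality music (32 kHz) from a text description or an existing melody. It works by generating EnCodec tokens and then decoding them into audio. The model relies on the EnCodec neural audio codec to learn discrete audio tokens from the raw waveform: EnCodec maps the audio signal to one or more parallel streams of discrete tokens. An autoregressive language model then models those EnCodec audio tokens recursively. The generated tokens are fed to the EnCodec decoder, which maps them back to audio space to produce the output waveform. Finally, different types of conditioning models can be used to control the generation.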
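To make the "waveform to discrete tokens and back" idea concrete, here is a toy sketch. It is not EnCodec: it swaps the learned neural codec for plain uniform quantization, and the names `LEVELS`, `encode`, and `decode` are made up for this illustration. It only shows that audio can be represented as integer tokens and decoded back with a bounded error.

```python
import math

LEVELS = 256  # size of the toy "codebook" (an assumption for illustration)

def encode(waveform):
    """Map samples in [-1, 1] to integer tokens in [0, LEVELS - 1]."""
    tokens = []
    for x in waveform:
        x = max(-1.0, min(1.0, x))  # clip out-of-range samples
        tokens.append(round((x + 1.0) / 2.0 * (LEVELS - 1)))
    return tokens

def decode(tokens):
    """Map integer tokens back to samples in [-1, 1]."""
    return [t / (LEVELS - 1) * 2.0 - 1.0 for t in tokens]

sample_rate = 32000  # the output rate quoted above
# 10 ms of a 440 Hz sine wave
waveform = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(320)]

tokens = encode(waveform)
reconstructed = decode(tokens)
max_error = max(abs(a - b) for a, b in zip(waveform, reconstructed))
print("first tokens:", tokens[:8])
print("max reconstruction error:", max_error)
```

In MusicGen the language model predicts such tokens one step at a time, and the codec decoder turns them into sound; the toy quantizer here only stands in for the codec half of that pipeline.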
  

Prepare the runtime environment

Copy the model files

    import moxing as mox

    # Copy the MusicGen weights, the Chinese-to-English translation model,
    # and the frpc binary (needed for Gradio share links) from OBS.
    mox.file.copy_parallel('obs://modelarts-labs-bj4-v2/case_zoo/MusicGen/model/', 'model')
    mox.file.copy_parallel('obs://modelarts-labs-bj4-v2/course/ModelBox/opus-mt-zh-en', 'opus-mt-zh-en')
    mox.file.copy_parallel('obs://modelarts-labs-bj4-v2/course/ModelBox/frpc_linux_amd64', 'frpc_linux_amd64')

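Before moving on, it can help to confirm that the three copy targets actually exist locally. The following is a minimal sketch; `missing_paths` is a hypothetical helper written for this check, not part of moxing.

```python
import os

def missing_paths(paths):
    """Return the entries of `paths` that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]

# Local destinations used by the copy step above.
expected = ["model", "opus-mt-zh-en", "frpc_linux_amd64"]

missing = missing_paths(expected)
if missing:
    print("Missing after copy:", missing)
else:
    print("All files copied.")
```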
Create a virtual environment based on Python 3.9.15

    !/home/ma-user/anaconda3/bin/conda create -n python-3.9.15 python=3.9.15 -y --override-channels --channel https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    !/home/ma-user/anaconda3/envs/python-3.9.15/bin/pip install ipykernel

Modify the kernel file

    import json
    import os

    # Kernel spec that points Jupyter at the new python-3.9.15 environment.
    data = {
       "display_name": "python-3.9.15",
       "env": {
          "PATH": "/home/ma-user/anaconda3/envs/python-3.9.15/bin:/home/ma-user/anaconda3/envs/python-3.7.10/bin:/modelarts/authoring/notebook-conda/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/ma-user/modelarts/ma-cli/bin:/home/ma-user/modelarts/ma-cli/bin:/home/ma-user/anaconda3/envs/PyTorch-1.8/bin"
       },
       "language": "python",
       "argv": [
          "/home/ma-user/anaconda3/envs/python-3.9.15/bin/python",
          "-m",
          "ipykernel",
          "-f",
          "{connection_file}"
       ]
    }

    kernel_dir = "/home/ma-user/anaconda3/share/jupyter/kernels/python-3.9.15/"
    # makedirs also creates missing parent directories and tolerates reruns
    os.makedirs(kernel_dir, exist_ok=True)
    with open(os.path.join(kernel_dir, "kernel.json"), "w") as f:
        json.dump(data, f, indent=4)
        print("kernel.json has been updated")

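A kernel spec that is missing a field tends to fail only when the kernel is started, so a quick sanity check can save a restart. The sketch below checks the three fields written above plus the `{connection_file}` placeholder that Jupyter substitutes into `argv`; `kernel_spec_problems` is a hypothetical helper, and the example dict is a shortened copy of the one above.

```python
REQUIRED_KEYS = ("argv", "display_name", "language")

def kernel_spec_problems(spec):
    """Return a list of human-readable problems found in a kernel spec dict."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in spec]
    if "{connection_file}" not in spec.get("argv", []):
        problems.append("argv lacks the {connection_file} placeholder")
    return problems

# Shortened copy of the spec written above (the "env" block is optional).
example = {
    "display_name": "python-3.9.15",
    "language": "python",
    "argv": [
        "/home/ma-user/anaconda3/envs/python-3.9.15/bin/python",
        "-m", "ipykernel", "-f", "{connection_file}",
    ],
}

print(kernel_spec_problems(example))
```

To check the file on disk instead, load it with `json.load` and pass the resulting dict to the same function.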
Install dependencies

    !pip install --upgrade pip
    !pip install torch==2.0.1 torchvision==0.15.2
    !pip install sentencepiece
    !pip install librosa
    !pip install --upgrade transformers scipy
    !pip install gradio==4.16.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
    # Put the frpc binary where Gradio looks for it when creating a share link
    !cp frpc_linux_amd64 /home/ma-user/anaconda3/envs/python-3.9.15/lib/python3.9/site-packages/gradio/frpc_linux_amd64_v0.2
    !chmod +x /home/ma-user/anaconda3/envs/python-3.9.15/lib/python3.9/site-packages/gradio/frpc_linux_amd64_v0.2

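Since several packages are installed without pinned versions, it may be worth printing what actually ended up in the environment. This is a small sketch using the standard-library `importlib.metadata`; `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata

def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# Packages installed in the step above.
wanted = ["torch", "torchvision", "sentencepiece", "librosa",
          "transformers", "scipy", "gradio"]

for name, version in installed_versions(wanted).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```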
Test the model

Model inference

    import torch
    from transformers import AutoProcessor, MusicgenForConditionalGeneration, pipeline

    # Translate the Chinese prompt to English before passing it to MusicGen.
    zh2en = pipeline("translation", model="./opus-mt-zh-en")
    # Prompt: "Children's Day, strongly rhythmic music just for boys"
    prompt = "六一儿童节  男孩专属节奏感强的音乐"
    prompt = zh2en(prompt)[0].get("translation_text")
    print(prompt)

    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    processor = AutoProcessor.from_pretrained("./model/")
    model = MusicgenForConditionalGeneration.from_pretrained("./model/")
    model.to(device)

    inputs = processor(
        text=[prompt],
        padding=True,
        return_tensors="pt",
    ).to(device)

    # max_new_tokens controls the length of the generated music:
    # 1024 tokens give about 20 s of audio.
    # The current maximum is about 30 s, which corresponds to max_new_tokens=1536.
    audio_values = model.generate(**inputs, max_new_tokens=1024)

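The two figures in the comments above (1024 tokens for 20 s, 1536 tokens for 30 s) imply a ratio of 256 tokens per 5 seconds. The sketch below turns that ratio into a helper; `max_new_tokens_for` and the constants are names introduced here for illustration, and the ratio is only the approximation stated in this post.

```python
TOKENS_PER_5_SECONDS = 256  # ratio implied by the comments above
MAX_DURATION_SECONDS = 30   # upper limit stated above

def max_new_tokens_for(duration_seconds):
    """Convert a target duration in seconds to a max_new_tokens value."""
    if not 0 < duration_seconds <= MAX_DURATION_SECONDS:
        raise ValueError(
            f"duration must be in (0, {MAX_DURATION_SECONDS}] seconds, "
            f"got {duration_seconds}"
        )
    return int(duration_seconds / 5 * TOKENS_PER_5_SECONDS)

for seconds in (5, 20, 30):
    print(seconds, "s ->", max_new_tokens_for(seconds), "tokens")
```

The result can be passed straight to `model.generate(**inputs, max_new_tokens=...)`.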
Generate and play the audio

    from IPython.display import Audio

    sampling_rate = model.config.audio_encoder.sampling_rate
    # .cpu() is a no-op when the tensor already lives on the CPU,
    # so no CUDA check is needed here.
    audio_data = audio_values[0].cpu().numpy()

    Audio(audio_data, rate=sampling_rate)

Save the file

    # Import the submodule explicitly; a bare `import scipy` does not
    # guarantee that scipy.io.wavfile is loaded on every SciPy version.
    from scipy.io import wavfile

    sampling_rate = model.config.audio_encoder.sampling_rate
    # First batch item, first channel: a 1-D array of samples
    audio_data = audio_values[0, 0].cpu().numpy()
    wavfile.write("music_out.wav", rate=sampling_rate, data=audio_data)
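`scipy.io.wavfile.write` stores floating-point input as a floating-point WAV, which some players may not open. If that turns out to be a problem, one option is to write 16-bit PCM instead. The sketch below does this with the standard-library `wave` module; `write_pcm16` is a hypothetical helper, it assumes mono samples in the range [-1, 1], and the sine tone only stands in for `audio_data` so the block runs on its own.

```python
import math
import os
import tempfile
import wave
from array import array

def write_pcm16(path, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    # Clip each sample, then scale it to the signed 16-bit range.
    pcm = array("h", (int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)         # mono
        wav.setsampwidth(2)         # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm.tobytes())

# Stand-in data: 0.1 s of a 440 Hz tone at 32 kHz.
demo_rate = 32000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / demo_rate) for n in range(3200)]
demo_path = os.path.join(tempfile.gettempdir(), "tone_pcm16.wav")
write_pcm16(demo_path, tone, demo_rate)
print("wrote", demo_path)
```

In the notebook, the equivalent call would be `write_pcm16("music_out_pcm16.wav", audio_data, sampling_rate)`, where the output file name is only a suggestion.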

Build a graphical generation app

    import torch
    import librosa
    from scipy.io import wavfile
    from transformers import AutoProcessor, MusicgenForConditionalGeneration, pipeline

    # Load everything once instead of on every button click.
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    zh2en = pipeline("translation", model="./opus-mt-zh-en")
    processor = AutoProcessor.from_pretrained("./model/")
    model = MusicgenForConditionalGeneration.from_pretrained("./model/").to(device)

    def music_generate(prompt: str, duration: int):
        # 256 new tokens correspond to roughly 5 s of audio
        token = int(duration / 5 * 256)
        print('token:', token)

        # Translate the Chinese prompt to English
        prompt = zh2en(prompt)[0].get("translation_text")
        print('prompt:', prompt)

        inputs = processor(
            text=[prompt],
            padding=True,
            return_tensors="pt",
        ).to(device)
        audio_values = model.generate(**inputs, max_new_tokens=token)

        sampling_rate = model.config.audio_encoder.sampling_rate
        audio_data = audio_values[0, 0].cpu().numpy()
        wavfile.write("music_out.wav", rate=sampling_rate, data=audio_data)

        # sr=None keeps the file's own sampling rate; the librosa default
        # would resample the audio to 22050 Hz.
        audio, sr = librosa.load(path="music_out.wav", sr=None)

        # gr.Audio accepts a (sampling_rate, samples) tuple
        return sr, audio

    import gradio as gr

    with gr.Blocks() as demo:
        gr.HTML("""<h1 align="center">Text-to-Music Generation</h1>""")
        with gr.Row():
            with gr.Column(scale=1):
                prompt = gr.Textbox(lines=1, label="Prompt")
                duration = gr.Slider(5, 30, value=15, step=5, label="Track length (seconds)", interactive=True)
                runBtn = gr.Button(value="Generate", variant="primary")
            with gr.Column(scale=1):
                music = gr.Audio(label="Output")
        runBtn.click(music_generate, inputs=[prompt, duration], outputs=[music], show_progress="full")

    # share=True creates a temporary public URL in addition to the local one
    demo.queue().launch(share=True)

Output (the tokenizers warning is printed three times in total; it is shown once here):

    huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
    To disable this warning, you can either:
            - Avoid using `tokenizers` before the fork if possible
            - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
    Running on local URL:  http://127.0.0.1:7860
    IMPORTANT: You are using gradio version 4.16.0, however version 4.29.0 is available, please upgrade.
    --------
    Running on public URL: https://cd3ee3f9072d7e8f5d.gradio.live
    This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)

Click the public URL to open the graphical interface, as shown in the figure.
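The `huggingface/tokenizers` warning in the launch output is harmless, and the message itself names the fix: set the `TOKENIZERS_PARALLELISM` environment variable. A minimal sketch, assuming this cell runs before the tokenizer is first used in the notebook session:

```python
import os

# Set this before the tokenizer is first used in the process;
# "false" disables tokenizer parallelism and silences the fork warning.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

print("TOKENIZERS_PARALLELISM =", os.environ["TOKENIZERS_PARALLELISM"])
```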



Posted by: 杀鸡焉用牛刀
