Docker Deployment of SenseVoice on Ubuntu (mainland China network)
I just finished a Docker deployment of SenseVoice while it was fresh in my mind, so here is a record of the relevant files.
SenseVoice is a large-model speech recognition library. It supports multiple languages and is fast and accurate; for details, see the official GitHub repository:
https://github.com/FunAudioLLM/SenseVoice
This note mainly records the files used for the Docker deployment; the full file contents are listed at the end.
[*]Dockerfile
[*]compose.yaml
[*]requirements.txt
[*]start.sh
[*]webui.py
[*]model_download.py
Deployment steps
1. Download the required models
model_download.py
import os
import argparse

parser = argparse.ArgumentParser(description='Download a model from ModelScope')
parser.add_argument('--model_name', type=str, required=True,
                    help='the model name on ModelScope, e.g. AI-ModelScope/stable-diffusion-2-1')
parser.add_argument('--local_dir', type=str, required=True,
                    help='the local directory to cache the model in')

if __name__ == '__main__':
    args = parser.parse_args()
    print(f"current workspace is {os.getcwd()}")
    print(f"the model_name is {args.model_name}")
    print(f"the local_dir is {args.local_dir}")
    try:
        from modelscope import snapshot_download
        # Download the model snapshot into the requested local directory.
        model_dir = snapshot_download(args.model_name, local_dir=args.local_dir)
    except ImportError:
        # modelscope is missing; install it, then rerun this script.
        print("modelscope is not installed! trying to install...")
        os.system("pip install modelscope")
    except Exception as e:
        print(f"An error occurred: {e}")
Create a model_download.py file in the root directory of the SenseVoice project and paste the content above into it.
Run the following commands to download the SenseVoiceSmall and speech_fsmn_vad_zh-cn-16k-common-pytorch models:
python3 model_download.py --model_name=iic/SenseVoiceSmall --local_dir=models/iic/SenseVoiceSmall
python3 model_download.py --model_name=iic/speech_fsmn_vad_zh-cn-16k-common-pytorch --local_dir=models/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch
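After both commands finish, it can help to sanity-check that the snapshots landed where webui.py will later look for them. A minimal sketch, assuming a ModelScope snapshot contains a `configuration.json` file (the required-file list and the `missing_files` helper are illustrative, not part of the project):

```python
import os

def missing_files(local_dir, required=("configuration.json",)):
    """Hypothetical helper: report which expected files are absent
    from a downloaded model directory."""
    return [name for name in required
            if not os.path.isfile(os.path.join(local_dir, name))]

for path in ("models/iic/SenseVoiceSmall",
             "models/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch"):
    gaps = missing_files(path)
    print(path, "OK" if not gaps else f"missing: {gaps}")
```

If a directory reports missing files, rerun the corresponding download command before moving on to the Docker step.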
2. Docker deployment
[*]Dockerfile
[*]compose.yaml
[*]requirements.txt
[*]start.sh
[*]webui.py
Create a docker folder in the root directory of the SenseVoice project and place the files listed above inside it.
In webui.py, change the model variable on line 18 to models/iic/SenseVoiceSmall (the local path used when downloading the model in step 1), and change the vad_model parameter on line 20 to models/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch.
https://i-blog.csdnimg.cn/direct/e28cf7fa51d54b7fa21357a42138ac45.png
webui.py
# coding=utf-8
import os
import librosa
import base64
import io
import gradio as gr
import re
import numpy as np
import torch
import torchaudio
from argparse import ArgumentParser
from funasr import AutoModel

model = "models/iic/SenseVoiceSmall"
model = AutoModel(
    model=model,
    vad_model="models/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch",
    vad_kwargs={"max_single_segment_time": 30000},
    trust_remote_code=True,
)
emo_dict = {
    "<|HAPPY|>": "😊",
    # ... (the listing is truncated here; see the SenseVoice repository for the full webui.py)
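The truncated portion of webui.py goes on to replace rich-text tags such as `<|HAPPY|>` in the recognition output with display strings. A minimal sketch of that post-processing idea (the tag-to-emoji mapping below is illustrative, not the full table from the repository):

```python
import re

# Illustrative subset of the tag-to-display mapping.
emo_dict = {
    "<|HAPPY|>": "😊",
    "<|SAD|>": "😔",
    "<|ANGRY|>": "😡",
    "<|NEUTRAL|>": "",
}

def replace_tags(text: str, mapping: dict) -> str:
    """Replace SenseVoice rich-text tags (e.g. <|HAPPY|>) with display strings."""
    pattern = re.compile("|".join(re.escape(k) for k in mapping))
    return pattern.sub(lambda m: mapping[m.group(0)], text)

print(replace_tags("<|HAPPY|>今天天气真好", emo_dict))  # → 😊今天天气真好
```

Compiling one alternation of escaped keys keeps the substitution to a single pass over the text, rather than one `str.replace` call per tag.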