根本头脑:使用昇腾NPU处理芯片+昇腾Mindie推理框架+embeding分词+排序举行dify支撑,对外部客户使用,因为整套华为昇腾处理架构为aarch64位,所以团体docker镜像使用arm镜像,本教程以Atlas 800 9000为底子举行部署和测试,本博客的时间点为2025年2月23日,镜像文件和教程仅限于目前官方支持的驱动版本,后续官方有版本更新,一切以昇腾官网为基准;目前测试昇腾的硬件平台都是支持本文部署,抛开项目配景,举行简要部署记录,所有部署一定要以华为的硬件为底子;
第一步:先向昇腾方申请装备,申请到Atlas 800 9000服务器,使用昇腾官方提供的账号和暗码包管可以登录上服务器;

(1)更新一下驱动,因为昇腾官方的提供的镜像必要指定版本的驱动固件,下载安装更新 Version: 23.0.rc2将会变动为Version: 23.0.0,下载地址:社区版-固件与驱动-昇腾社区

更新安装固件,并更新固件,重启装备
- [root@dify HwHiAiUser]# pwd
- /home/HwHiAiUser
- [root@dify HwHiAiUser]# ls -l
- total 131112
- -rw------- 1 root root 134251528 Dec 7 16:16 Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run
- [root@dify HwHiAiUser]# chmod 777 Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run
- [root@dify HwHiAiUser]# ls
- Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run
- [root@dify HwHiAiUser]# sudo ./Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run --full --force
- Verifying archive integrity... 100% SHA256 checksums are OK. All good.
- Uncompressing ASCEND DRIVER RUN PACKAGE 100%
- [Driver] [2025-02-23 15:46:26] [INFO]Start time: 2025-02-23 15:46:26
- [Driver] [2025-02-23 15:46:26] [INFO]LogFile: /var/log/ascend_seclog/ascend_install.log
- [Driver] [2025-02-23 15:46:26] [INFO]OperationLogFile: /var/log/ascend_seclog/operation.log
- [Driver] [2025-02-23 15:46:26] [INFO]base version is 23.0.rc2.
- [Driver] [2025-02-23 15:46:26] [WARNING]Do not power off or restart the system during the installation/upgrade
- [Driver] [2025-02-23 15:46:26] [INFO]set username and usergroup, HwHiAiUser:HwHiAiUser
- [Driver] [2025-02-23 15:46:26] [INFO]Driver package has been installed on the path /usr/local/Ascend, the version is 23.0.rc2, and the version of this package is 23.0.0,do you want to continue? [y/n]
- y
- [Driver] [2025-02-23 15:46:36] [INFO]driver install type: Direct
- [Driver] [2025-02-23 15:46:36] [INFO]upgradePercentage:10%
- [Driver] [2025-02-23 15:46:40] [INFO]upgradePercentage:30%
- [Driver] [2025-02-23 15:46:40] [INFO]upgradePercentage:40%
- [Driver] [2025-02-23 15:46:42] [INFO]upgradePercentage:90%
- [Driver] [2025-02-23 15:46:45] [INFO]upgradePercentage:100%
- [Driver] [2025-02-23 15:46:45] [INFO]Driver package installed successfully! Reboot needed for installation/upgrade to take effect!
- [Driver] [2025-02-23 15:46:45] [INFO]End time: 2025-02-23 15:46:45
- [root@dify HwHiAiUser]# sudo reboot
复制代码 固件更新完成,查看驱动版本为Version: 23.0.0

(2)将底子模型先下载下来,一会举行挂载推理模型,分词模型、到排序模型,举行使用 ,可以去魔搭社区下载ModelScope魔搭社区,先下载模型:DeepSeek-R1-Distill-Qwen-32B ,下载使用方式参考官方引导方式即可;

使用python脚本下载模型
- [root@dify HwHiAiUser]# pwd
- /home/HwHiAiUser
- [root@dify HwHiAiUser]# pip3 install modelscope==1.18.0 -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
- [root@dify HwHiAiUser]# python3
- Python 3.7.0 (default, May 11 2024, 10:32:14)
- [GCC 7.3.0] on linux
- Type "help", "copyright", "credits" or "license" for more information.
- >>> import modelscope
- >>> exit()
- [root@dify HwHiAiUser]# cat down.py
- #模型下载
- from modelscope import snapshot_download
- model_dir = snapshot_download('deepseek-ai/DeepSeek-R1-Distill-Qwen-32B',cache_dir=".")
- [root@dify HwHiAiUser]# python3 down.py
- Downloading [figures/benchmark.jpg]: 100%|██████████████████████████████████████████████████████████████████████| 759k/759k [00:00<00:00, 1.78MB/s]
- Downloading [config.json]: 100%|██████████████████████████████████████████████████████████████████████████████████| 664/664 [00:00<00:00, 2.10kB/s]
- Downloading [configuration.json]: 100%|███████████████████████████████████████████████████████████████████████████| 73.0/73.0 [00:00<00:00, 233B/s]
- Downloading [generation_config.json]: 100%|█████████████████████████████████████████████████████████████████████████| 181/181 [00:00<00:00, 686B/s]
- Downloading [LICENSE]: 100%|██████████████████████████████████████████████████████████████████████████████████| 1.04k/1.04k [00:00<00:00, 2.92kB/s]
- Downloading [model-00001-of-000008.safetensors]: 0%| | 1.00M/8.19G [00:00<59:21, 2.47MB/s]Downloading [model-00001-of-000008.safetensors]: 0%| | 16.0M/8.19G [00:00<03:43, 39.3MB/s]
复制代码 下载完成,查看权重目次
- [root@dify HwHiAiUser]# pwd
- /home/HwHiAiUser
- [root@dify HwHiAiUser]# tree -L 2
- .
- ├── Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run
- ├── deepseek-ai
- │ ├── DeepSeek-R1-Distill-Qwen-32B
- │
- └── down.py
- 3 directories, 3 files
复制代码 二、使用官方镜像 昇腾镜像堆栈详情,举行昇腾MindIE环境构建,因为计划测试DeepSeek-R1-Distill-Qwen-32B-W8A8模型,所以记得创建容器挂载两张卡即可
(1)拉取Atals 800 9000镜像,发起从官方拉取,自己要根据自己的机型拉取对应的镜像
也可以从下面的公开链接拉取镜像,创建双卡容器
- [root@dify HwHiAiUser]#yum install docker
- [root@dify HwHiAiUser]# docker pull swr.cn-east-317.qdrgznjszx.com/qd-aicc/mindie:910A-ascend_24.1.rc3-cann_8.0.t63-py_3.10-ubuntu_20.04-aarch64-mindie_1.0.T71.02
- Error response from daemon: Get https://swr.cn-east-317.qdrgznjszx.com/v2/: x509: certificate signed by unknown authority
- [root@dify HwHiAiUser]#
复制代码 修改配置源,添加mindie的镜像源;
- 解决办法:
- [root@dify HwHiAiUser]#vim /etc/docker/daemon.json
- 填入内容
- { "insecure-registries": ["https://swr.cn-east-317.qdrgznjszx.com"], "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn"] }
- 保存退出、然后重启docker即可
- [root@dify HwHiAiUser]# systemctl restart docker.service
- [root@dify HwHiAiUser]# docker pull swr.cn-east-317.qdrgznjszx.com/qd-aicc/mindie:910A-ascend_24.1.rc3-cann_8.0.t63-py_3.10-ubuntu_20.04-aarch64-mindie_1.0.T71.02
- 910A-ascend_24.1.rc3-cann_8.0.t63-py_3.10-ubuntu_20.04-aarch64-mindie_1.0.T71.02: Pulling from qd-aicc/mindie
- edab87ea811e: Pull complete
- 72906c864c93: Pull complete
- 98f62a370e96: Pull complete
- Digest: sha256:6ceefe4506f58084717ec9bed7df75e51032fdd709d791a627084fe4bd92abea
- Status: Downloaded newer image for swr.cn-east-317.qdrgznjszx.com/qd-aicc/mindie:910A-ascend_24.1.rc3-cann_8.0.t63-py_3.10-ubuntu_20.04-aarch64-mindie_1.0.T71.02
- [root@dify HwHiAiUser]#
复制代码 创建容器,进入容器,计划使用两张昇腾NPU卡推理DeepSeek-R1-Distill-Qwen-32B的W8A8模型,所以构建的容器用两张卡,选6、7卡吧,0-6号卡可以跑文本嵌入模型、重排序模型;创建容器脚本
- [root@dify ~]# cd /home/HwHiAiUser/
- [root@dify HwHiAiUser]# ls
- Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run deepseek-ai down.py
- [root@dify HwHiAiUser]# docker images
- REPOSITORY TAG IMAGE ID CREATED SIZE
- swr.cn-east-317.qdrgznjszx.com/qd-aicc/mindie 910A-ascend_24.1.rc3-cann_8.0.t63-py_3.10-ubuntu_20.04-aarch64-mindie_1.0.T71.02 69f30d0c15be 5 weeks ago 16.5GB
- [root@dify HwHiAiUser]# vim docker_run.sh
- [root@dify HwHiAiUser]# vim docker_run.sh
- [root@dify HwHiAiUser]# vim docker_run.sh
- [root@dify HwHiAiUser]# cat docker_run.sh
- #!/bin/bash
- docker_images=swr.cn-east-317.qdrgznjszx.com/qd-aicc/mindie:910A-ascend_24.1.rc3-cann_8.0.t63-py_3.10-ubuntu_20.04-aarch64-mindie_1.0.T71.02
- model_dir=/home/HwHiAiUser #根据实际情况修改挂载目录
- docker run -it --name qdaicc --ipc=host --net=host \
- --device=/dev/davinci6 \
- --device=/dev/davinci7 \
- --device=/dev/davinci_manager \
- --device=/dev/devmm_svm \
- --device=/dev/hisi_hdc \
- -v /usr/local/dcmi:/usr/local/dcmi \
- -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
- -v /usr/local/Ascend/driver/lib64/common:/usr/local/Ascend/driver/lib64/common \
- -v /usr/local/Ascend/driver/lib64/driver:/usr/local/Ascend/driver/lib64/driver \
- -v /etc/ascend_install.info:/etc/ascend_install.info \
- -v /etc/vnpu.cfg:/etc/vnpu.cfg \
- -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
- -v ${model_dir}:${model_dir} \
- -v /var/log/npu:/usr/slog ${docker_images} \
- /bin/bash
- [root@dify HwHiAiUser]#
复制代码 填进去内容如上,启动镜像
- [root@dify HwHiAiUser]# bash docker_run.sh
- (Python310) root@dify:/usr/local/Ascend/atb-models# cd /home/HwHiAiUser/
- (Python310) root@dify:/home/HwHiAiUser# ls
- Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run deepseek-ai docker_run.sh down.py
复制代码 因为之前挂在的目次是 /home/HwHiAiUser/ ,所以可以在docker里面看到物理机的下载权重,再查看一下卡数是两张
(2)举行模型量化Ascend/ModelZoo-PyTorch - Gitee.com 直接进入量化阶段,在容器外面操作即可,环境不用管,因为系统已经默认配置了环境,直接跳到 权重量化 阶段,安装过程缺什么,,在docker外面git下源码,进入容器内部举行量化,这里的容器发起在创建个8卡的容器,双卡容器量化会体现npu显存不敷,除非你用cpu转模型,我就懒得创建容器了,使用cpu量化吧;
- [root@dify HwHiAiUser]# pwd
- /home/HwHiAiUser
- [root@dify HwHiAiUser]# git clone https://gitee.com/ascend/msit.git
- Cloning into 'msit'...
- remote: Enumerating objects: 81125, done.
- remote: Total 81125 (delta 0), reused 0 (delta 0), pack-reused 81125
- Receiving objects: 100% (81125/81125), 71.73 MiB | 12.14 MiB/s, done.
- Resolving deltas: 100% (59704/59704), done.
- [root@dify HwHiAiUser]# cd msit/
- .git/ .gitee/ msit/ msmodelslim/ msserviceprofiler/
- [root@dify Qwen]# docker start b5399c4da202
- b5399c4da202
- [root@dify Qwen]# docker exec -it b5399c4da202 /bin/bash
- (Python310) root@dify:/home/HwHiAiUser/msit# cd msmodelslim/
- (Python310) root@dify:/home/HwHiAiUser/msit/msmodelslim# bash install.sh
- #安装成功,pip缺啥安装啥
- (Python310) root@dify:/home/HwHiAiUser# cd /home/HwHiAiUser/msit/msmodelslim/example/Qwen
- #量化模型
- (Python310) root@dify:/home/HwHiAiUser/msit/msmodelslim/example/Qwen# python3 quant_qwen.py --model_path /home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/ --save_directory /home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-W8A8 --calib_file ../common/boolq.jsonl --w_bit 8 --a_bit 8 --device_type npu
- 2025-02-23 18:15:25,404 - msmodelslim-logger - WARNING - The current CANN version does not support LayerSelector quantile method.
- 或者cpu处理
- (Python310) root@dify:/home/HwHiAiUser/msit/msmodelslim/example/Qwen# python3 quant_qwen.py --model_path /home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/ --save_directory /home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-W8A8 --calib_file ../common/boolq.jsonl --w_bit 8 --a_bit 8 --device_type cpu
- 2025-02-23 18:25:10,776 - msmodelslim-logger - WARNING - The current CANN version does not support LayerSelector quantile method.
- 2025-02-23 18:25:10,783 - msmodelslim-logger - WARNING - `cpu` is set as `dev_type`, `dev_id` cannot be specified manually!
复制代码 转换完成之后生成权重文件
- (Python310) root@dify:/home/HwHiAiUser/deepseek-ai# cd /home/HwHiAiUser/msit/msmodelslim/example/Qwen
- (Python310) root@dify:/home/HwHiAiUser/msit/msmodelslim/example/Qwen# ls /home/HwHiAiUser/deepseek-ai/
- DeepSeek-R1-Distill-Qwen-32B DeepSeek-R1-Distill-Qwen-32B-W8A8
- (Python310) root@dify:/home/HwHiAiUser/msit/msmodelslim/example/Qwen#
复制代码 因为Atlas 800 9000不支持bf16,所以修改float16,别的装备参考昇腾手册
- (Python310) root@dify:/home/HwHiAiUser/msit/msmodelslim/example/Qwen# vim /home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-W8A8/config.json
- 该字段要设置为:"torch_dtype": "float16"
复制代码 (3)启动MindIE服务,先记录本机的ip地址,模型路径和以及模型名字
模型路径权重: /home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-W8A8/
模型名字: DeepSeek-R1-Distill-Qwen-32B-W8A8
修改配置文件
- (Python310) root@dify:/usr/local/Ascend/mindie/latest/mindie-service# pwd
- /usr/local/Ascend/mindie/latest/mindie-service
- (Python310) root@dify:/usr/local/Ascend/mindie/latest/mindie-service# vim conf/config.json
复制代码 修改解释一下,ipAddress,主要为了背面搭建dify使用的推理引擎模型,别的参考mindie手册
MindSpore Models服务化使用-MindSpore Models使用-模型推理使用流程-MindIE LLM开发指南-大模型开发-MindIE1.0.0开发文档-昇腾社区
单机推理-配置MindIE Server-配置MindIE-MindIE安装指南-环境预备-MindIE1.0.0开发文档-昇腾社区
- "ipAddress" : "192.168.1.115", 改为本地地址
- "httpsEnabled" : false,
- "npuDeviceIds" : [[0,1]],
- "modelName" : "DeepSeek-R1-Distill-Qwen-32B-W8A8",
- "modelWeightPath" : "/home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-W8A8/",
- "maxInputTokenLen" : 4096,
- "maxIterTimes" : 4096,
- "truncation" : true,
复制代码 修改内容如下:
- (Python310) root@dify:/usr/local/Ascend/mindie/latest/mindie-service# cat conf/config.json
- {
- "Version" : "1.0.0",
- "LogConfig" :
- {
- "logLevel" : "Info",
- "logFileSize" : 20,
- "logFileNum" : 20,
- "logPath" : "logs/mindie-server.log"
- },
- "ServerConfig" :
- {
- "ipAddress" : "192.168.1.115",
- "managementIpAddress" : "127.0.0.2",
- "port" : 1025,
- "managementPort" : 1026,
- "metricsPort" : 1027,
- "allowAllZeroIpListening" : false,
- "maxLinkNum" : 1000,
- "httpsEnabled" : false,
- "fullTextEnabled" : false,
- "tlsCaPath" : "security/ca/",
- "tlsCaFile" : ["ca.pem"],
- "tlsCert" : "security/certs/server.pem",
- "tlsPk" : "security/keys/server.key.pem",
- "tlsPkPwd" : "security/pass/key_pwd.txt",
- "tlsCrlPath" : "security/certs/",
- "tlsCrlFiles" : ["server_crl.pem"],
- "managementTlsCaFile" : ["management_ca.pem"],
- "managementTlsCert" : "security/certs/management/server.pem",
- "managementTlsPk" : "security/keys/management/server.key.pem",
- "managementTlsPkPwd" : "security/pass/management/key_pwd.txt",
- "managementTlsCrlPath" : "security/management/certs/",
- "managementTlsCrlFiles" : ["server_crl.pem"],
- "kmcKsfMaster" : "tools/pmt/master/ksfa",
- "kmcKsfStandby" : "tools/pmt/standby/ksfb",
- "inferMode" : "standard",
- "interCommTLSEnabled" : true,
- "interCommPort" : 1121,
- "interCommTlsCaPath" : "security/grpc/ca/",
- "interCommTlsCaFiles" : ["ca.pem"],
- "interCommTlsCert" : "security/grpc/certs/server.pem",
- "interCommPk" : "security/grpc/keys/server.key.pem",
- "interCommPkPwd" : "security/grpc/pass/key_pwd.txt",
- "interCommTlsCrlPath" : "security/grpc/certs/",
- "interCommTlsCrlFiles" : ["server_crl.pem"],
- "openAiSupport" : "vllm"
- },
- "BackendConfig" : {
- "backendName" : "mindieservice_llm_engine",
- "modelInstanceNumber" : 1,
- "npuDeviceIds" : [[0,1]],
- "tokenizerProcessNumber" : 8,
- "multiNodesInferEnabled" : false,
- "multiNodesInferPort" : 1120,
- "interNodeTLSEnabled" : true,
- "interNodeTlsCaPath" : "security/grpc/ca/",
- "interNodeTlsCaFiles" : ["ca.pem"],
- "interNodeTlsCert" : "security/grpc/certs/server.pem",
- "interNodeTlsPk" : "security/grpc/keys/server.key.pem",
- "interNodeTlsPkPwd" : "security/grpc/pass/mindie_server_key_pwd.txt",
- "interNodeTlsCrlPath" : "security/grpc/certs/",
- "interNodeTlsCrlFiles" : ["server_crl.pem"],
- "interNodeKmcKsfMaster" : "tools/pmt/master/ksfa",
- "interNodeKmcKsfStandby" : "tools/pmt/standby/ksfb",
- "ModelDeployConfig" :
- {
- "maxSeqLen" : 2560,
- "maxInputTokenLen" : 4096,
- "truncation" : true,
- "ModelConfig" : [
- {
- "modelInstanceType" : "Standard",
- "modelName" : "DeepSeek-R1-Distill-Qwen-32B-W8A8",
- "modelWeightPath" : "/home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-W8A8/",
- "worldSize" : 2,
- "cpuMemSize" : 5,
- "npuMemSize" : -1,
- "backendType" : "atb",
- "trustRemoteCode" : false
- }
- ]
- },
- "ScheduleConfig" :
- {
- "templateType" : "Standard",
- "templateName" : "Standard_LLM",
- "cacheBlockSize" : 128,
- "maxPrefillBatchSize" : 50,
- "maxPrefillTokens" : 8192,
- "prefillTimeMsPerReq" : 150,
- "prefillPolicyType" : 0,
- "decodeTimeMsPerReq" : 50,
- "decodePolicyType" : 0,
- "maxBatchSize" : 200,
- "maxIterTimes" : 4096,
- "maxPreemptCount" : 0,
- "supportSelectBatch" : false,
- "maxQueueDelayMicroseconds" : 5000
- }
- }
- }
复制代码 修改模型权限,启动服务
- (Python310) root@dify:/usr/local/Ascend/mindie/latest/mindie-service# chmod -R 750 /home/HwHiAiUser/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B-W8A8/
- (Python310) root@dify:/usr/local/Ascend/mindie/latest/mindie-service# ./bin/mindieservice_daemon
- Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
- [2025-02-23 19:04:44,279] [89160] [281464373506464] [llm] [INFO][logging.py-227] : Skip binding cpu.
- Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
- Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
- Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
- Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
- Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
- Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
- Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
- Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
- Daemon start success!
复制代码 重启一个终端,查看npu使用状况

本机测试
- [root@dify ~]# curl -H "Accept: application/json" -H "Content-type: application/json" -X POST -d '{"inputs":"如何赚大钱","parameters":{"decoder_input_details":true,"details":true,"do_sample":true,"max_new_tokens":50,"repetition_penalty":1.03,"return_full_text":false,"seed":null,"temperature":0.5,"top_k":10,"top_p":0.95,"truncate":null,"typical_p":0.5,"watermark":false}}' http://192.168.1.115:1025/generate
- {"details":{"prompt_tokens":5,"finish_reason":"length","generated_tokens":50,"prefill":[{"id":151646,"logprob":null,"special":null,"text":null},{"id":100007,"logprob":null,"special":null,"text":null},{"id":102223,"logprob":null,"special":null,"text":null},{"id":26288,"logprob":null,"special":null,"text":null},{"id":99428,"logprob":null,"special":null,"text":null}],"seed":2240260787,"tokens":[{"id":26850,"logprob":null,"special":null,"text":null},{"id":100007,"logprob":null,"special":null,"text":null},{"id":102223,"logprob":null,"special":null,"text":null},{"id":26288,"logprob":null,"special":null,"text":null},{"id":99428,"logprob":null,"special":null,"text":null},{"id":11319,"logprob":null,"special":null,"text":null},{"id":1406,"logprob":null,"special":null,"text":null},{"id":151649,"logprob":null,"special":null,"text":null},{"id":271,"logprob":null,"special":null,"text":null},{"id":102223,"logprob":null,"special":null,"text":null},{"id":26288,"logprob":null,"special":null,"text":null},{"id":99428,"logprob":null,"special":null,"text":null},{"id":102119,"logprob":null,"special":null,"text":null},{"id":85106,"logprob":null,"special":null,"text":null},{"id":100374,"logprob":null,"special":null,"text":null},{"id":99605,"logprob":null,"special":null,"text":null},{"id":9370,"logprob":null,"special":null,"text":null},{"id":101139,"logprob":null,"special":null,"text":null},{"id":5373,"logprob":null,"special":null,"text":null},{"id":85329,"logprob":null,"special":null,"text":null},{"id":33108,"logprob":null,"special":null,"text":null},{"id":99345,"logprob":null,"special":null,"text":null},{"id":101135,"logprob":null,"special":null,"text":null},{"id":1773,"logprob":null,"special":null,"text":null},{"id":87752,"logprob":null,"special":null,"text":null},{"id":99639,"logprob":null,"special":null,"text":null},{"id":97084,"logprob":null,"special":null,"text":null},{"id":102716,"logprob":null,"special":null,"text":null},{"id":39907,"logprob":null,"special":null,"text":null},{"id":48443,"logprob":null,"special":null,"text":null},{"id":14374,"logprob":null,"special":null,"text":null},{"id":220,"logprob":null,"special":null,"text":null},{"id":16,"logprob":null,"special":null,"text":null},{"id":13,"logprob":null,"special":null,"text":null},{"id":3070,"logprob":null,"special":null,"text":null},{"id":99716,"logprob":null,"special":null,"text":null},{"id":102447,"logprob":null,"special":null,"text":null},{"id":1019,"logprob":null,"special":null,"text":null},{"id":256,"logprob":null,"special":null,"text":null},{"id":481,"logprob":null,"special":null,"text":null},{"id":3070,"logprob":null,"special":null,"text":null},{"id":104023,"logprob":null,"special":null,"text":null},{"id":5373,"logprob":null,"special":null,"text":null},{"id":100025,"logprob":null,"special":null,"text":null},{"id":334,"logprob":null,"special":null,"text":null},{"id":5122,"logprob":null,"special":null,"text":null},{"id":67338,"logprob":null,"special":null,"text":null},{"id":101930,"logprob":null,"special":null,"text":null},{"id":99716,"logprob":null,"special":null,"text":null},{"id":101172,"logprob":null,"special":null,"text":null}]},"generated_text":"?\n\n如何赚大钱?\n\n\n</think>\n\n赚大钱通常需要结合个人的技能、资源和市场机会。以下是一些常见的方法:\n\n### 1. **投资理财**\n - **股票、基金**:通过长期投资优质"}[root@dify ~]#
复制代码 三、启动分词服务和重排序服务,首先去华为仓下载镜像 昇腾镜像堆栈详情, 对应自己的装备查找镜像

(1)拉取镜像Atlas 800 9000,一定要根据自己的硬件版本去官方仓拉取镜像,举行分词服务启动,新镜像以官方为主
- [root@dify ~]# docker pull swr.cn-east-317.qdrgznjszx.com/sxj731533730/mis-tei:6.0.RC3-910-aarch64
- [root@dify ~]# docker images
- REPOSITORY TAG IMAGE ID CREATED SIZE
- swr.cn-east-317.qdrgznjszx.com/sxj731533730/mis-tei 6.0.RC3-910-aarch64 affece68b209 2 days ago 22.6GB
- swr.cn-east-317.qdrgznjszx.com/qd-aicc/mindie 910A-ascend_24.1.rc3-cann_8.0.t63-py_3.10-ubuntu_20.04-aarch64-mindie_1.0.T71.02 69f30d0c15be 5 weeks ago 16.5GB
- [root@dify ~]#
复制代码 拉取完镜像之后,举行必要的权重模型下载
- [root@dify ~]# cd /home/HwHiAiUser/
- [root@dify HwHiAiUser]# pwd
- /home/HwHiAiUser
- [root@dify HwHiAiUser]# vim down.py
- [root@dify HwHiAiUser]# cat down.py
- #模型下载
- from modelscope import snapshot_download
- model_dir = snapshot_download('BAAI/bge-m3',cache_dir=".")
- from modelscope import snapshot_download
- model_dir = snapshot_download('BAAI/bge-large-zh-v1.5',cache_dir=".")
- from modelscope import snapshot_download
- model_dir = snapshot_download('BAAI/bge-reranker-large',cache_dir=".")
- [root@dify HwHiAiUser]# python3 down.py
复制代码 下载完模型,修改每一个模型内部的配置项 Atlas800 9000/300I Duo/300V Pro装备,Atlas 800T A2等装备不用走该步骤
- [root@dify HwHiAiUser]# ls
- Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run BAAI deepseek-ai docker_run.sh down.py msit
- [root@dify HwHiAiUser]# vim BAAI/bge-large-zh-v1___5/config.json
- [root@dify HwHiAiUser]# vim BAAI/bge-m3/config.json
- [root@dify HwHiAiUser]# vim BAAI/bge-reranker-large/config.json
- "torch_dtype": "float16",
复制代码 (2)创建三个容器,暂定容器名字是 bge-m3、bge-large-zh-v1___5、bge-reranker-large,在创建之前,必要接洽昇腾技能人员,开通服务器对外端口,暂定开通的为8001,8002,8003 和niginx转发端口-入方向:|出方向:TCP/8001,8002,8003,8004,442
将模型拷贝到/home/data下,参考官方手册来即可
- [root@dify ~]# cd /home/HwHiAiUser/
- [root@dify HwHiAiUser]# ls
- Ascend-hdk-910-npu-driver_23.0.0_linux-aarch64.run BAAI deepseek-ai docker_run.sh down.py msit
- [root@dify HwHiAiUser]# pwd
- /home/HwHiAiUser
- [root@dify HwHiAiUser]# mkdir -p /home/data
- [root@dify HwHiAiUser]# cp -r BAAI/* /home/data/
- [root@dify HwHiAiUser]# ls /home/data/
- bge-large-zh-v1___5 bge-m3 bge-reranker-large
- [root@dify HwHiAiUser]#
复制代码 参考官方阐明:
ASCEND_VISIBLE_DEVICES环境变量表现将宿主机上的npu卡挂载到容器,假如挂载多张卡使用逗号分隔,如:ASCEND_VISIBLE_DEVICES=0,1,2,3;挂载多张卡到容器时,默认会寻找最优的一张卡调用,假如不希望容器内部自动寻找最优的卡,启动容器时可通过TEI_NPU_DEVICE=卡id指定使用哪张卡,注意这里的变量TEI_NPU_DEVICE配置从0开始取,容器内已将外部卡id举行了逻辑映射,编号从0连续映射;注意:配置的ASCEND_VISIBLE_DEVICES对应的卡不能被其他容器已挂载,否则会报错
- [root@dify ~]# docker run -u root -e TEI_NPU_DEVICE=0 -itd --name=bge-reranker-large --net=host -e HOME=/home/HwHiAiUser --privileged=true -v /home/data:/home/HwHiAiUser/model -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi -v /usr/local/Ascend/driver:/usr/local/Ascend/driver --entrypoint /home/HwHiAiUser/start.sh swr.cn-east-317.qdrgznjszx.com/sxj731533730/mis-tei:6.0.RC3-910-aarch64 BAAI/bge-reranker-large 192.168.1.115 8001
- ef2383785c58ec5a650eb9d852ba965c48eb7b8cc7679cb7c194d2f2d0eb1a0d
- [root@dify ~]# docker start ef2383785c58ec5a650eb9d852ba965c48eb7b8cc7679cb7c194d2f2d0eb1a0d
- ef2383785c58ec5a650eb9d852ba965c48eb7b8cc7679cb7c194d2f2d0eb1a0d
- [root@dify ~]# docker run -u root -e TEI_NPU_DEVICE=1 -itd --name=bge-m3 --net=host -e HOME=/home/HwHiAiUser --privileged=true -v /home/data:/home/HwHiAiUser/model -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi -v /usr/local/Ascend/driver:/usr/local/Ascend/driver --entrypoint /home/HwHiAiUser/start.sh swr.cn-east-317.qdrgznjszx.com/sxj731533730/mis-tei:6.0.RC3-910-aarch64 BAAI/bge-m3 192.168.1.115 8002
- 50dd3573f1ae1363211791425a2f681445b220f5a45bbdbe572a361ce974f63a
- [root@dify ~]# docker start 50dd3573f1ae1363211791425a2f681445b220f5a45bbdbe572a361ce974f63a
- 50dd3573f1ae1363211791425a2f681445b220f5a45bbdbe572a361ce974f63a
- bge-large-zh-v1___5 bge-m3 bge-reranker-large
- [root@dify ~]# docker run -u root -e TEI_NPU_DEVICE=2 -itd --name=bge-large-zh-v1___5 --net=host -e HOME=/home/HwHiAiUser --privileged=true -v /home/data:/home/HwHiAiUser/model -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi -v /usr/local/Ascend/driver:/usr/local/Ascend/driver --entrypoint /home/HwHiAiUser/start.sh swr.cn-east-317.qdrgznjszx.com/sxj731533730/mis-tei:6.0.RC3-910-aarch64 BAAI/bge-large-zh-v1___5 192.168.1.115 8003
- d360f2b558c6556af53e19abd9f0782600f8cab1a7c60dc90fcf0b6061511c96
- [root@dify ~]# docker start d360f2b558c6556af53e19abd9f0782600f8cab1a7c60dc90fcf0b6061511c96
- d360f2b558c6556af53e19abd9f0782600f8cab1a7c60dc90fcf0b6061511c96
复制代码 查看一下三个服务,两个分词,一个排序模型,当然也可以放在一个NPU上运行
记录一下对外的服务端口 mindie推理服务 192.168.1.115:1025 ;bge-reranker-large服务:192.168.1.115:8001 bge-m3服务:192.168.1.115:8002 bge-large-zh-v1___5服务: 192.168.1.115:8003
四、部署dify环境举行部署配置,部署遇到的最大标题就是昇腾架构使用的aarch64,gitee使用docker镜像容器是x86_64,所以找镜像替代即可
(1)拉取dify的源码
- [root@dify HwHiAiUser]# git clone https://gitee.com/dify_ai/dify.git
- Cloning into 'dify'...
- remote: Enumerating objects: 206836, done.
- remote: Counting objects: 100% (10350/10350), done.
- remote: Compressing objects: 100% (5418/5418), done.
- remote: Total 206836 (delta 6559), reused 7867 (delta 4637), pack-reused 196486
- Receiving objects: 100% (206836/206836), 80.47 MiB | 3.03 MiB/s, done.
- Resolving deltas: 100% (161147/161147), done.
- [root@dify HwHiAiUser]# cd dify
- [root@dify dify]# git checkout 0.15.3
- Note: checking out '0.15.3'.
- You are in 'detached HEAD' state. You can look around, make experimental
- changes and commit them, and you can discard any commits you make in this
- state without impacting any branches by performing another checkout.
- If you want to create a new branch to retain commits you create, you may
- do so (now or later) by using -b with the checkout command again. Example:
- git checkout -b <new-branch-name>
- HEAD is now at ca19bd31d chore(*): Bump version to 0.15.3 (#13308)
- [root@dify HwHiAiUser]# cd docker/
- [root@dify docker]# cp .env.example .env
- [root@dify docker]# vim .env
复制代码 修改848行、906行
- NGINX_PORT=80
- # SSL settings are only applied when HTTPS_ENABLED is true
- NGINX_SSL_PORT=443
- 修改
- NGINX_PORT=8004
- # SSL settings are only applied when HTTPS_ENABLED is true
- NGINX_SSL_PORT=442
- 另一处
- EXPOSE_NGINX_PORT=80
- EXPOSE_NGINX_SSL_PORT=443
- 修改
- EXPOSE_NGINX_PORT=8004
- EXPOSE_NGINX_SSL_PORT=442
复制代码 修改配置文件
- [root@dify docker]# vim docker-compose.yaml
复制代码 第486行添加 --ignore-warnings ARM64-COW-BUG

将492行 修改0.2.10修改为0.2.1
(2)下载docker-compose,配置工具
- sudo curl -L https://github.com/docker/compose/releases/download/v2.33.0/docker-compose-linux-aarch64 -o /usr/local/bin/docker-compose
- 或者这样下载
- [root@dify docker]# cd /usr/local/bin/
- [root@dify bin]# pwd
- /usr/local/bin
- [root@dify bin]# wget https://sxj731533730.obs.cn-east-317.qdrgznjszx.com/docker-compose
- --2025-02-25 21:07:54-- https://sxj731533730.obs.cn-east-317.qdrgznjszx.com/docker-compose
- Resolving sxj731533730.obs.cn-east-317.qdrgznjszx.com (sxj731533730.obs.cn-east-317.qdrgznjszx.com)... 100.125.32.125
- Connecting to sxj731533730.obs.cn-east-317.qdrgznjszx.com (sxj731533730.obs.cn-east-317.qdrgznjszx.com)|100.125.32.125|:443... connected.
- HTTP request sent, awaiting response... 200 OK
- Length: 71778465 (68M) [application/octet-stream]
- Saving to: ‘docker-compose’
- docker-compose 100%[=====================================================================>] 68.45M 220MB/s in 0.3s
- 2025-02-25 21:07:54 (220 MB/s) - ‘docker-compose’ saved [71778465/71778465]
- [root@dify bin]# ls
- cloud-id cloud-init-per jsondiff jsonpointer modelscope npu-healthcheck.sh tqdm
- cloud-init docker-compose jsonpatch jsonschema normalizer npu-smi
- [root@dify bin]# chmod 777 docker-compose
- [root@dify bin]# docker-compose -v
- Docker Compose version v2.33.0
复制代码
(3)拉取镜像,预备启动dify环境,根据。yaml找aarch64位库即可
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-api:0.15.3-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-api:0.15.3-linuxarm64 docker.io/langgenius/dify-api:0.15.3
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-web:0.15.3-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-web:0.15.3-linuxarm64 docker.io/langgenius/dify-web:0.15.3
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/postgres:15-alpine-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/postgres:15-alpine-linuxarm64 docker.io/postgres:15-alpine
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/redis:6-alpine-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/redis:6-alpine-linuxarm64 docker.io/redis:6-alpine
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-sandbox:0.2.10-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-sandbox:0.2.10-linuxarm64 docker.io/langgenius/dify-sandbox:0.2.10
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-sandbox:0.2.1-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/dify-sandbox:0.2.1-linuxarm64 docker.io/langgenius/dify-sandbox:0.2.1
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/ubuntu/squid:latest-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/ubuntu/squid:latest-linuxarm64 docker.io/ubuntu/squid:latest
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/certbot/certbot:v3.1.0-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/certbot/certbot:v3.1.0-linuxarm64 docker.io/certbot/certbot:latest
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/nginx:latest-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/nginx:latest-linuxarm64 docker.io/nginx:latest
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/pingcap/tidb:v8.4.0-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/pingcap/tidb:v8.4.0-linuxarm64 docker.io/pingcap/tidb:v8.4.0
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/semitechnologies/weaviate:1.19.0-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/semitechnologies/weaviate:1.19.0-linuxarm64 docker.io/semitechnologies/weaviate:1.19.0
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/qdrant:v1.7.3-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/langgenius/qdrant:v1.7.3-linuxarm64 docker.io/langgenius/qdrant:v1.7.3
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/pgvector/pgvector:pg16-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/pgvector/pgvector:pg16-linuxarm64 docker.io/pgvector/pgvector:pg16
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/tensorchord/pgvecto-rs:pg16-v0.3.0-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/tensorchord/pgvecto-rs:pg16-v0.3.0-linuxarm64 docker.io/tensorchord/pgvecto-rs:pg16-v0.3.0
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/ghcr.io/chroma-core/chroma:0.5.20-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/ghcr.io/chroma-core/chroma:0.5.20-linuxarm64 ghcr.io/chroma-core/chroma:0.5.20
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/quay.io/oceanbase/oceanbase-ce:4.3.3.0-100000142024101215-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/quay.io/oceanbase/oceanbase-ce:4.3.3.0-100000142024101215-linuxarm64 quay.io/oceanbase/oceanbase-ce:4.3.3.0-100000142024101215
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/container-registry.oracle.com/database/free:latest-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/container-registry.oracle.com/database/free:latest-linuxarm64 docker.io/container-registry.oracle.com/database/free:latest
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/quay.io/coreos/etcd:v3.5.5-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/quay.io/coreos/etcd:v3.5.5-linuxarm64 quay.io/coreos/etcd:v3.5.5
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/minio/minio:RELEASE.2023-03-20T20-16-18Z-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/minio/minio:RELEASE.2023-03-20T20-16-18Z-linuxarm64 docker.io/minio/minio:RELEASE.2023-03-20T20-16-18Z
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/milvusdb/milvus:v2.5.0-beta-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/milvusdb/milvus:v2.5.0-beta-linuxarm64 docker.io/milvusdb/milvus:v2.5.0-beta
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/opensearchproject/opensearch:latest-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/opensearchproject/opensearch:latest-linuxarm64 docker.io/opensearchproject/opensearch:latest
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/opensearchproject/opensearch-dashboards:latest-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/opensearchproject/opensearch-dashboards:latest-linuxarm64 docker.io/opensearchproject/opensearch-dashboards:latest
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/myscale/myscaledb:1.6.4-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/myscale/myscaledb:1.6.4-linuxarm64 docker.io/myscale/myscaledb:1.6.4
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.elastic.co/elasticsearch/elasticsearch:8.14.3-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.elastic.co/elasticsearch/elasticsearch:8.14.3-linuxarm64 docker.elastic.co/elasticsearch/elasticsearch:8.14.3
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.elastic.co/kibana/kibana:8.14.3-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.elastic.co/kibana/kibana:8.14.3-linuxarm64 docker.elastic.co/kibana/kibana:8.14.3
- docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/robwilkes/unstructured-api:latest-linuxarm64
- docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/robwilkes/unstructured-api:latest-linuxarm64 docker.io/robwilkes/unstructured-api:latest
复制代码 然后启动dify成功
- [root@dify HwHiAiUser]# cd dify/
- [root@dify dify]# cd docker
- [root@dify docker]# pwd
- /home/HwHiAiUser/dify/docker
- [root@dify docker]# docker-compose up -d
- [+] Running 11/11
- ✔ Network docker_default Created 0.1s
- ✔ Network docker_ssrf_proxy_network Created 0.1s
- ✔ Container docker-sandbox-1 Started 1.6s
- ✔ Container docker-redis-1 Started 1.5s
- ✔ Container docker-web-1 Started 1.6s
- ✔ Container docker-weaviate-1 Started 1.9s
- ✔ Container docker-db-1 Started 1.8s
- ✔ Container docker-ssrf_proxy-1 Started 2.1s
- ✔ Container docker-api-1 Started 3.2s
- ✔ Container docker-worker-1 Started 3.0s
- ✔ Container docker-nginx-1 Started 3.7s
- [root@dify docker]#
复制代码 后台启动成功
五、启动dify举行配置界面,在地址栏输入http://ip(访问服务器的ip地址):8004端口,可以革新出dify界面
注册一下,这个是所有者权限,只能注册一次,无法修改,假如修改,必要重新拉dify服务
使用所有者权限进入账户,点击右边的设置
选择模型供应商
在下面的列表中找到这两个配置项
添加第一个模型deepseek
OpenAI-API-compatible
类型选LLM 模型名字对应你的mindie的name:DeepSeek-R1-Distill-Qwen-32B-W8A8 mindie的URL:http://192.168.1.115:1025/v1 只要后台服务启动中,前端可以生存,就是ok,秘钥随意填
Text Embedding Inference
然后配置排序模型和分词模型,支持RAG,秘钥随便写,只要后台服务启动中,前端可以生存,就是ok
1.1 选择 RERANK URL设置 http://192.168.1.115:8001 模型名 :bge-reranker-large
1.2 选择 TEXT EMBEDDING URL设置 http://192.168.1.115:8002 模型名 : bge-large-zh-v1___5
1.3 选择 TEXT EMBEDDING URL设置 http://192.168.1.115:8003 模型名 : bge-m3
继续添加
六、实际测试,跑在昇腾上面的DeepSeek-R1-Distill-Qwen-32B-W8A8 双卡
测试知识库RAG,看一下知识库的内容
开始处理文本
处理文本
不挂知识库的结果
挂了知识库的结果
找到了文本数据,并作出了解释
七、增加公司域名和添加约请人使用邮箱发送功能约请
目前该平台支持增加管理员权限-支持问答和知识库使用,平凡用户只支持问答,目前以复制链接形式约请,注册新用户即可,增加邮箱发送功能,阅读手册中
增加公司网址访问,修改前面的端口号
- NGINX_PORT=80
- # SSL settings are only applied when HTTPS_ENABLED is true
- NGINX_SSL_PORT=443
- 保持默认
- NGINX_PORT=80
- # SSL settings are only applied when HTTPS_ENABLED is true
- NGINX_SSL_PORT=442
- 另一处
- EXPOSE_NGINX_PORT=80
- EXPOSE_NGINX_SSL_PORT=443
- 保持默认
- EXPOSE_NGINX_PORT=80
- EXPOSE_NGINX_SSL_PORT=442
复制代码 同时将获的密钥添加到指定目次下
- [root@wuzhoutuili-0003 docker]# ls ./nginx/ssl/ -a
- . .. .gitkeep pem _****.cer _****.key
- [root@wuzhoutuili-0003 docker]#
复制代码 修改支持https://公司网址访问,访问应用了
或者是访问dify的ip:port
测试结果,直接输入网址登录即可
增加邮箱发送约请功能,就以qq邮箱为主吧,731533730@qq.com,不要给我发垃圾邮件哦,qq加不上好友~
首先仍然打开本地配置
- [root@wuzhoutuili-0003 docker]# docker-compose down
- /home/HwHiAiUser/dify/docker
- [root@wuzhoutuili-0003 docker]# vim .env
- [root@wuzhoutuili-0003 docker]# docker-compose up -d
复制代码 修改配置文件,这个配置文件的内容来自这里:
上图信息泉源
扫码获取密钥
进入账户发出约请
测试邮件收到 :https://wx.mail.qq.com/list/readtemplate?name=app_intro.html#/agreement/authorizationCode
假如要让链接打开,可用,必要修改官方代码
- [root@wuzhoutuili-0003 docker]# vim ../api/tasks/mail_invite_member_task.py
- [root@wuzhoutuili-0003 docker]# pwd
- /home/HwHiAiUser/dify/docker
复制代码
修改,浅读了一下代码,找到了标题地点;
- url = f"{dify_config.CONSOLE_WEB_URL}/activate?email={encoded_invitee_email}&token={token}"
复制代码 同时必要设置一下发送的网页环境变量
- [root@wuzhoutuili-0003 docker]# vim .env
复制代码
然后重启
- [root@wuzhoutuili-0003 docker]# vim .env
- [root@wuzhoutuili-0003 docker]# docker-compose down[+] Running 11/11 ✔ Container docker-worker-1 Removed 4.3s ✔ Container docker-nginx-1 Removed 10.7s ✔ Container docker-ssrf_proxy-1 Removed 10.7s ✔ Container docker-weaviate-1 Removed 0.4s ✔ Container docker-sandbox-1 Removed 0.3s ✔ Container docker-api-1 Removed 4.6s ✔ Container docker-web-1 Removed 10.3s ✔ Container docker-db-1 Removed 0.4s ✔ Container docker-redis-1 Removed 0.4s ✔ Network docker_default Removed 0.3s ✔ Network docker_ssrf_proxy_network Removed 0.1s [root@wuzhoutuili-0003 docker]# docker-compose up -d[+] Running 11/11 ✔ Network docker_ssrf_proxy_network Created 0.0s ✔ Network docker_default Created 0.1s ✔ Container docker-db-1 Started 1.1s ✔ Container docker-sandbox-1 Started 1.0s ✔ Container docker-ssrf_proxy-1 Started 1.2s ✔ Container docker-redis-1 Started 1.2s ✔ Container docker-web-1 Started 1.2s ✔ Container docker-weaviate-1 Started 1.1s ✔ Container docker-worker-1 Started 1.7s ✔ Container docker-api-1 Started 1.7s ✔ Container docker-nginx-1 Started 2.0s [root@wuzhoutuili-0003 docker]#
复制代码 然后就可以打开邮件链接,正常跳转了,我去给作者个pr,奖励一下他
免责声明:如果侵犯了您的权益,请联系站长,我们会及时删除侵权内容,谢谢合作!更多信息从访问主页:qidao123.com:ToB企服之家,中国第一个企服评测及商务社交产业平台。 |