Implementing RAG with llama-index and the Qwen LLM

Posted on 2026-2-23 08:56:52
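Background

Most llama-index RAG examples use English-language LLMs such as Llama, and there are few examples for models from China. This post implements a llama-index RAG pipeline with the Qwen LLM.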
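Environment setup

(1) pip packages

llama-index needs many packages to be preinstalled. Below is the pip package configuration from my working setup, as recorded in requirements.txt.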
    absl-py==1.4.0
    accelerate==0.27.2
    aiohttp==3.9.3
    aiosignal==1.3.1
    aliyun-python-sdk-core==2.13.36
    aliyun-python-sdk-kms==2.16.1
    annotated-types==0.6.0
    anyio==3.7.1
    apphub @ file:///environment/apps/apphub/dist/apphub-1.0.0.tar.gz#sha256=260f99c0de4c575b19ab913aa134877e9efd81b820b97511fc8379674643c253
    argon2-cffi==21.3.0
    argon2-cffi-bindings==21.2.0
    asgiref==3.7.2
    asttokens==2.2.1
    astunparse==1.6.3
    async-timeout==4.0.3
    attrs==23.1.0
    Babel==2.12.1
    backcall==0.2.0
    backoff==2.2.1
    bcrypt==4.1.2
    beautifulsoup4==4.12.3
    bleach==6.0.0
    boltons @ file:///croot/boltons_1677628692245/work
    brotlipy==0.7.0
    bs4==0.0.2
    build==1.1.1
    cachetools==5.3.1
    certifi @ file:///croot/certifi_1690232220950/work/certifi
    cffi @ file:///croot/cffi_1670423208954/work
    chardet==3.0.4
    charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
    chroma-hnswlib==0.7.3
    chromadb==0.4.24
    click==7.1.2
    cmake==3.25.0
    coloredlogs==15.0.1
    comm==0.1.4
    conda @ file:///croot/conda_1690494963117/work
    conda-content-trust @ file:///tmp/abs_5952f1c8-355c-4855-ad2e-538535021ba5h26t22e5/croots/recipe/conda-content-trust_1658126371814/work
    conda-libmamba-solver @ file:///croot/conda-libmamba-solver_1685032319139/work/src
    conda-package-handling @ file:///croot/conda-package-handling_1685024767917/work
    conda_package_streaming @ file:///croot/conda-package-streaming_1685019673878/work
    contourpy==1.2.0
    crcmod==1.7
    cryptography @ file:///croot/cryptography_1686613057838/work
    cycler==0.12.1
    dataclasses-json==0.6.4
    debugpy==1.6.7
    decorator==5.1.1
    defusedxml==0.7.1
    Deprecated==1.2.14
    dirtyjson==1.0.8
    distro==1.9.0
    ecdsa==0.18.0
    exceptiongroup==1.1.2
    executing==1.2.0
    fastapi==0.104.1
    fastjsonschema==2.18.0
    featurize==0.0.24
    filelock==3.9.0
    flatbuffers==23.5.26
    fonttools==4.44.0
    frozenlist==1.4.1
    fsspec==2024.2.0
    gast==0.4.0
    google-auth==2.22.0
    google-auth-oauthlib==1.0.0
    google-pasta==0.2.0
    googleapis-common-protos==1.62.0
    greenlet==3.0.3
    grpcio==1.62.0
    gunicorn==21.2.0
    h11==0.14.0
    h5py==3.9.0
    httpcore==0.17.3
    httptools==0.6.1
    httpx==0.24.1
    huggingface-hub==0.20.3
    humanfriendly==10.0
    idna==2.10
    imageio==2.32.0
    importlib-metadata==6.11.0
    importlib_resources==6.1.3
    ipykernel==6.25.0
    ipython==8.14.0
    ipython-genutils==0.2.0
    ipywidgets==8.1.2
    jedi==0.19.0
    Jinja2==3.1.2
    jmespath==0.10.0
    joblib==1.3.2
    json5==0.9.14
    jsonpatch @ file:///tmp/build/80754af9/jsonpatch_1615747632069/work
    jsonpointer==2.1
    jsonschema==4.18.6
    jsonschema-specifications==2023.7.1
    jupyter-server==1.24.0
    jupyter_client==8.3.0
    jupyter_core==5.3.1
    jupyterlab==3.2.9
    jupyterlab-pygments==0.2.2
    jupyterlab_server==2.24.0
    jupyterlab_widgets==3.0.10
    keras==2.13.1
    kiwisolver==1.4.5
    kubernetes==29.0.0
    lazy_loader==0.3
    libclang==16.0.6
    libmambapy @ file:///croot/mamba-split_1685993156657/work/libmambapy
    lit==15.0.7
    llama-index==0.10.17
    llama-index-agent-openai==0.1.5
    llama-index-cli==0.1.8
    llama-index-core==0.10.17
    llama-index-embeddings-huggingface==0.1.4
    llama-index-embeddings-openai==0.1.6
    llama-index-indices-managed-llama-cloud==0.1.3
    llama-index-legacy==0.9.48
    llama-index-llms-huggingface==0.1.3
    llama-index-llms-openai==0.1.7
    llama-index-multi-modal-llms-openai==0.1.4
    llama-index-program-openai==0.1.4
    llama-index-question-gen-openai==0.1.3
    llama-index-readers-file==0.1.8
    llama-index-readers-llama-parse==0.1.3
    llama-index-vector-stores-chroma==0.1.5
    llama-parse==0.3.8
    llamaindex-py-client==0.1.13
    Markdown==3.4.4
    MarkupSafe==2.1.2
    marshmallow==3.21.1
    matplotlib==3.8.1
    matplotlib-inline==0.1.6
    mistune==3.0.1
    mmh3==4.1.0
    monotonic==1.6
    mpmath==1.2.1
    multidict==6.0.4
    mypy-extensions==1.0.0
    nbclassic==0.2.8
    nbclient==0.8.0
    nbconvert==7.7.3
    nbformat==5.9.2
    nest-asyncio==1.6.0
    networkx==3.0
    nltk==3.8.1
    notebook==6.4.12
    numpy==1.24.1
    nvidia-cublas-cu12==12.1.3.1
    nvidia-cuda-cupti-cu12==12.1.105
    nvidia-cuda-nvrtc-cu12==12.1.105
    nvidia-cuda-runtime-cu12==12.1.105
    nvidia-cudnn-cu12==8.9.2.26
    nvidia-cufft-cu12==11.0.2.54
    nvidia-curand-cu12==10.3.2.106
    nvidia-cusolver-cu12==11.4.5.107
    nvidia-cusparse-cu12==12.1.0.106
    nvidia-nccl-cu12==2.19.3
    nvidia-nvjitlink-cu12==12.4.99
    nvidia-nvtx-cu12==12.1.105
    oauthlib==3.2.2
    onnxruntime==1.17.1
    openai==1.13.3
    opencv-python==4.8.1.78
    opentelemetry-api==1.23.0
    opentelemetry-exporter-otlp-proto-common==1.23.0
    opentelemetry-exporter-otlp-proto-grpc==1.23.0
    opentelemetry-instrumentation==0.44b0
    opentelemetry-instrumentation-asgi==0.44b0
    opentelemetry-instrumentation-fastapi==0.44b0
    opentelemetry-proto==1.23.0
    opentelemetry-sdk==1.23.0
    opentelemetry-semantic-conventions==0.44b0
    opentelemetry-util-http==0.44b0
    opt-einsum==3.3.0
    orjson==3.9.15
    oss2==2.18.1
    overrides==7.7.0
    packaging @ file:///croot/packaging_1678965309396/work
    pandas==2.1.2
    pandocfilters==1.5.0
    parso==0.8.3
    pexpect==4.8.0
    pickleshare==0.7.5
    Pillow==9.3.0
    platformdirs==3.10.0
    pluggy @ file:///tmp/build/80754af9/pluggy_1648024709248/work
    posthog==3.5.0
    prometheus-client==0.17.1
    prompt-toolkit==3.0.39
    protobuf==4.23.4
    psutil==5.9.5
    ptyprocess==0.7.0
    pulsar-client==3.4.0
    pure-eval==0.2.2
    pyasn1==0.5.0
    pyasn1-modules==0.3.0
    pycosat @ file:///croot/pycosat_1666805502580/work
    pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
    pycryptodome==3.18.0
    pydantic==2.4.2
    pydantic_core==2.10.1
    Pygments==2.15.1
    PyMuPDF==1.23.26
    PyMuPDFb==1.23.22
    pyOpenSSL @ file:///croot/pyopenssl_1677607685877/work
    pyparsing==3.1.1
    pypdf==4.1.0
    PyPika==0.48.9
    pyproject_hooks==1.0.0
    PySocks @ file:///home/builder/ci_310/pysocks_1640793678128/work
    python-dateutil==2.8.2
    python-dotenv==1.0.0
    pytz==2023.3.post1
    PyYAML==6.0.1
    pyzmq==25.1.0
    referencing==0.30.0
    regex==2023.12.25
    requests==2.31.0
    requests-oauthlib==1.3.1
    rpds-py==0.9.2
    rsa==4.9
    ruamel.yaml @ file:///croot/ruamel.yaml_1666304550667/work
    ruamel.yaml.clib @ file:///croot/ruamel.yaml.clib_1666302247304/work
    safetensors==0.4.2
    scikit-image==0.22.0
    scikit-learn==1.3.2
    scipy==1.11.3
    seaborn==0.13.0
    Send2Trash==1.8.2
    six @ file:///tmp/build/80754af9/six_1644875935023/work
    sniffio==1.3.0
    socksio==1.0.0
    soupsieve==2.4.1
    SQLAlchemy==2.0.28
    sshpubkeys==3.3.1
    stack-data==0.6.2
    starlette==0.27.0
    sympy==1.11.1
    tabulate==0.8.7
    tenacity==8.2.3
    tensorboard==2.13.0
    tensorboard-data-server==0.7.1
    tensorflow==2.13.0
    tensorflow-estimator==2.13.0
    tensorflow-io-gcs-filesystem==0.33.0
    termcolor==2.3.0
    terminado==0.17.1
    threadpoolctl==3.2.0
    tifffile==2023.9.26
    tiktoken==0.6.0
    tinycss2==1.2.1
    tokenizers==0.15.2
    tomli==2.0.1
    toolz @ file:///croot/toolz_1667464077321/work
    torch==2.2.1
    torchaudio==2.0.2+cu118
    torchvision==0.15.2+cu118
    tornado==6.3.2
    tqdm==4.66.2
    traitlets==5.9.0
    transformers==4.38.2
    triton==2.2.0
    typer==0.9.0
    typing-inspect==0.9.0
    typing_extensions==4.8.0
    tzdata==2023.3
    urllib3==1.25.11
    uvicorn==0.23.2
    uvloop==0.19.0
    watchfiles==0.21.0
    wcwidth==0.2.5
    webencodings==0.5.1
    websocket-client==1.2.1
    websockets==12.0
    Werkzeug==2.3.6
    widgetsnbextension==4.0.10
    workspace @ file:///home/featurize/work/workspace/dist/workspace-0.1.0.tar.gz#sha256=b292beb3599f79d3791771eff9dc422cc37c58c1fc8daadeafbf025a2e7ea986
    wrapt==1.15.0
    yarl==1.9.2
    zipp==3.17.0
    zstandard @ file:///croot/zstandard_1677013143055/work
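
The listing above is a full environment dump, so most of its entries are not specific to llama-index. As a rough aid, the sketch below filters a pip-freeze-style listing down to the pinned packages whose names start with a given prefix. The helper name `pinned_packages` and the shortened sample text are illustrative only and are not part of the original setup.

```python
def pinned_packages(requirements_text, prefix="llama-index"):
    """Return {name: version} for 'name==version' lines whose name starts with prefix."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and direct references such as 'pkg @ file:///...'.
        if not line or line.startswith("#") or "==" not in line or " @ " in line:
            continue
        name, version = line.split("==", 1)
        if name.strip().lower().startswith(prefix):
            pins[name.strip()] = version.strip()
    return pins


# A shortened sample taken from the listing above.
sample = """\
chromadb==0.4.24
conda @ file:///croot/conda_1690494963117/work
llama-index==0.10.17
llama-index-core==0.10.17
llama-index-llms-huggingface==0.1.3
llama-index-embeddings-huggingface==0.1.4
torch==2.2.1
"""

for name, version in pinned_packages(sample).items():
    print(f"{name}=={version}")
```

To apply it to the real file, the text could be read with `open("requirements.txt").read()`, assuming the file sits in the current directory.
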
(2) Python environment


(3) Installation commands
    !pip install llama-index
    !pip install llama-index-llms-huggingface
    !pip install llama-index-embeddings-huggingface
    !pip install llama-index ipywidgets
    !pip install torch
    !git clone https://www.modelscope.cn/AI-ModelScope/bge-small-zh-v1.5.git
    !git clone https://www.modelscope.cn/qwen/Qwen1.5-4B-Chat.git
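
After cloning, it can be worth confirming that the model directories actually contain the expected files, since a clone may finish without the large weight files if they are stored through Git LFS and LFS is not installed. The sketch below only checks for `config.json`, which Hugging Face-format checkpoints include. The helper `missing_files` is hypothetical, and the directory names assume the clones were made in the current working directory.

```python
from pathlib import Path


def missing_files(model_dir, required=("config.json",)):
    """Return the names in `required` that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


# Directory names match the repositories cloned above; the current working
# directory is assumed to be where `git clone` was run.
for model_dir in ("bge-small-zh-v1.5", "Qwen1.5-4B-Chat"):
    missing = missing_files(model_dir)
    status = "ok" if not missing else "missing " + ", ".join(missing)
    print(f"{model_dir}: {status}")
```
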
(4) Directory structure


Code

(1) Load the model
    import os

    import torch
    from llama_index.core import PromptTemplate
    from llama_index.llms.huggingface import HuggingFaceLLM

    # Tolerate duplicate OpenMP runtimes being loaded in the same process.
    os.environ["KMP_DUPLICATE_LIB_OK"] = "True"

    # Model paths. The constant names are left over from the Llama-2 example this
    # code was adapted from; both active entries point at the local Qwen1.5-4B-Chat clone.
    LLAMA2_7B = "/home/featurize/Qwen1.5-4B-Chat"
    # LLAMA2_7B_CHAT = "meta-llama/Llama-2-7b-chat-hf"
    # LLAMA2_13B = "meta-llama/Llama-2-13b-hf"
    LLAMA2_13B_CHAT = "/home/featurize/Qwen1.5-4B-Chat"
    # LLAMA2_70B = "meta-llama/Llama-2-70b-hf"
    # LLAMA2_70B_CHAT = "meta-llama/Llama-2-70b-chat-hf"
    selected_model = LLAMA2_13B_CHAT

    SYSTEM_PROMPT = """You are an AI assistant that answers questions in a friendly manner, based on the given source documents. Here are some rules you always follow:
    - Generate human readable output, avoid creating output with gibberish text.
    - Generate only the requested output, don't include any other language before or after the requested output.
    - Never say thank you, that you are happy to help, that you are an AI agent, etc. Just answer directly.
    - Generate professional language typically used in business documents in North America.
    - Never generate offensive or foul language.
    """

    # Wrap each query in the Llama-2 style [INST] / <<SYS>> layout, as in the original example.
    query_wrapper_prompt = PromptTemplate(
        "[INST]<<SYS>>\n" + SYSTEM_PROMPT + "<</SYS>>\n\n{query_str}[/INST] "
    )

    # Load the tokenizer and model from the same local directory.
    llm = HuggingFaceLLM(
        context_window=4096,
        max_new_tokens=2048,
        generate_kwargs={"temperature": 0.0, "do_sample": False},
        query_wrapper_prompt=query_wrapper_prompt,
        tokenizer_name=selected_model,
        model_name=selected_model,
        device_map="auto",
    )
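
The query wrapper above uses the Llama-2 `[INST]` / `<<SYS>>` layout. Qwen1.5 chat models are trained with a ChatML-style layout (`<|im_start|>` and `<|im_end|>` markers), so a wrapper in that layout may suit the model better. The sketch below builds such a template with plain strings. `build_chatml_prompt` is a hypothetical helper and the short system prompt is a stand-in; the chat template shipped with the model's tokenizer remains the authoritative format.

```python
def build_chatml_prompt(system_prompt, query):
    """Format a single-turn prompt in ChatML style."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{query}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


# Build a template that keeps the {query_str} placeholder, the same placeholder
# the query wrapper above relies on.
chatml_template = build_chatml_prompt(
    "You answer questions based on the given source documents.", "{query_str}"
)

# Fill in the placeholder to see the final prompt text.
print(chatml_template.format(query_str="小额贷款咋规定的?"))
```

If this layout works better in practice, `chatml_template` could be passed to `PromptTemplate` in place of the `[INST]` string.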

(2) Load the embedding model
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding

    # Local path of the bge-small-zh-v1.5 embedding model cloned earlier.
    embed_model = HuggingFaceEmbedding(model_name="/home/featurize/bge-small-zh-v1.5")

    from llama_index.core import Settings

    # Register the LLM and the embedding model as global defaults.
    Settings.llm = llm
    Settings.embed_model = embed_model

    from llama_index.core import SimpleDirectoryReader

    # load documents
    documents = SimpleDirectoryReader("./data/").load_data()
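
Before indexing, loaded documents are split into smaller chunks so that each one can be embedded and retrieved on its own. The sketch below is a toy character-based splitter with overlap. It is not llama-index's own node parser, which works on sentences and tokens; `chunk_text` and its parameter values are made up for illustration.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into windows of chunk_size characters that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# A 500-character dummy text stands in for a loaded document.
text = "".join(str(i % 10) for i in range(500))
chunks = chunk_text(text, chunk_size=200, overlap=50)
print(len(chunks), [len(c) for c in chunks])
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, at the cost of some duplicated text.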

    from llama_index.core import VectorStoreIndex

    # Embed the documents and build an in-memory vector index.
    index = VectorStoreIndex.from_documents(documents)

    # Notebook cell: display the index object.
    index
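
Conceptually, a vector index stores one embedding per chunk and answers a query by returning the chunks whose embeddings lie closest to the query embedding. The sketch below shows that idea with hand-written 3-dimensional vectors and cosine similarity. It is not how `VectorStoreIndex` is implemented internally; the vectors, the texts, and the `top_k` helper are invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, entries, k=2):
    """Return the k (score, text) pairs most similar to query_vec, best first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in entries]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Toy "index": (chunk text, embedding) pairs with made-up vectors.
entries = [
    ("loan limits", [0.9, 0.1, 0.0]),
    ("interest rates", [0.6, 0.6, 0.1]),
    ("office hours", [0.0, 0.1, 0.9]),
]

for score, text in top_k([1.0, 0.2, 0.0], entries):
    print(f"{score:.3f}  {text}")
```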
 
    # set logging to DEBUG for more detailed outputs
    query_engine = index.as_query_engine()

    # Query in Chinese: "What are the rules for small loans?"
    response = query_engine.query("小额贷款咋规定的?")
    print(response)
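
At query time, the engine retrieves the most relevant chunks and places them in a prompt together with the question before calling the LLM. The sketch below imitates that step with plain string formatting. The template wording and the `build_rag_prompt` helper are assumptions for illustration and do not reproduce llama-index's built-in prompt.

```python
def build_rag_prompt(question, chunks):
    """Combine retrieved chunks and the user question into one prompt string."""
    context = "\n---\n".join(chunk.strip() for chunk in chunks)
    return (
        "Context information is below.\n"
        f"{context}\n"
        "Using only the context above, answer the question.\n"
        f"Question: {question}\n"
        "Answer: "
    )


# Placeholder chunks; in the real pipeline these come from the vector index.
retrieved = ["<retrieved chunk 1>", "<retrieved chunk 2>"]
print(build_rag_prompt("小额贷款咋规定的?", retrieved))
```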
 
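Knowledge base

The knowledge base is a key part of RAG with llama-index. It consists mainly of documents of various types. The document used here is a PDF file, whose content is shown below.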

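Summary

As the code above shows, Qwen and the bge-zh embedding model can be combined into a RAG pipeline that runs on locally downloaded models, and the knowledge base content can be queried in Chinese. This makes the approach well suited to private deployment and leaves room to extend its functionality.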
