langchain 模子加载HuggingFaceEmbeddings、文本切割RecursiveCharacterTex
参考:https://github.com/TommyTang930/LangChain_LLM_ChatBot
https://python.langchain.com/docs/integrations/vectorstores/faiss
1、文本切割RecursiveCharacterTextSplitter
这里对着类举行了改写,对中文切分更友好
import re
from typing import List, Optional, Any
from langchain.text_splitter import RecursiveCharacterTextSplitter
import logging
logger = logging.getLogger(__name__)
def _split_text_with_regex_from_end(
text: str, separator: str, keep_separator: bool
) -> List:
# Now that we have the separator, split the text
if separator:
if keep_separator:
# The parentheses in the pattern keep the delimiters in the result.
_splits = re.split(f"({separator})", text)
splits = ["".join(i) for i in zip(_splits, _splits)]
if len(_splits) % 2 == 1:
splits += _splits[-
免责声明:如果侵犯了您的权益,请联系站长,我们会及时删除侵权内容,谢谢合作!更多信息从访问主页:qidao123.com:ToB企服之家,中国第一个企服评测及商务社交产业平台。
页:
[1]