Chinese Medicine Big Data (3): Building a TCM Knowledge Graph

铁佛 · 2025-3-21 03:29:25

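This project is entirely original; please credit the source when reposting.
If you need other knowledge graphs built, or want to build applications on top of a knowledge graph, feel free to get in touch!
1 A look at the results first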

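(1) Overall graph data

(2) Nature, flavor and channel tropism [partial view]

(3) Prescriptions recorded in medical books [partial view]

(4) Herbs recorded in medical books [partial view]

(5) Constituent herbs of prescriptions [partial view]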

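2 Data preprocessing

The task is to extract every herb that appears in the prescription field of each prescription record,
that is, to match the prescription text against the herb names in the herb table.

The code is as follows: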

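import pandas as pd
from sqlalchemy import text  # assumption: cnn is a SQLAlchemy connection created elsewhere

# Read the tb_prescription and tb_cmedicine tables
df_prescriptions = pd.read_sql('SELECT * FROM tb_prescription', cnn)  # prescription table
df_cmedicine = pd.read_sql('SELECT title FROM tb_cmedicine', cnn)  # herb table

# Herb names sorted longest first, so longer names are matched before their substrings
medicine_titles = df_cmedicine['title'].tolist()
medicine_titles.sort(key=len, reverse=True)

def find_medicines_in_prescription(prescription, medicine_titles):
    """Return the herbs found in one prescription text (longest match first)."""
    if not isinstance(prescription, str):
        return ''  # NULL or missing prescription text
    found_medicines = []
    for medicine in medicine_titles:
        if medicine in prescription:
            found_medicines.append(medicine)
            # Blank out the matched name so shorter names inside it cannot match again.
            # A space is used rather than '' so that the neighbouring characters are
            # not joined into a new, spurious herb name.
            prescription = prescription.replace(medicine, ' ')
    return ','.join(found_medicines)

# Find the herbs contained in each prescription
df_prescriptions['found_medicines'] = df_prescriptions['prescription'].apply(
    find_medicines_in_prescription, args=(medicine_titles,))

# Print the herbs found in each prescription
for index, row in df_prescriptions.iterrows():
    print(f"Prescription: {row['prescription']}\nFound Medicines: {row['found_medicines']}\n")

# Write the extracted herbs back to the fangji column of tb_prescription.
# Bound parameters replace string formatting so quotes in the data cannot break the statement.
# This assumes tb_prescription has an id column as its primary key.
update_query = text('UPDATE tb_prescription SET fangji = :fangji WHERE id = :id')
for index, row in df_prescriptions.iterrows():
    cnn.execute(update_query, {'fangji': row['found_medicines'], 'id': int(row['id'])})

# With SQLAlchemy 2.x, call cnn.commit() here so the updates are persisted
# Make sure the connection is closed
cnn.close()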


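One problem came up along the way:
Prescription: 春酒5升，葶苈子2升。
Found Medicines: 酒,葶苈子,葶苈
The prescription contains 葶苈子, yet both 葶苈子 and 葶苈 were matched. The cause is that matching has to be longest-match-first, in other words greedy: try the longer names first, and once a name matches, remove it from the text so that shorter names contained in it cannot match again.
3 Neo4j knowledge graph construction code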
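The difference between plain substring matching and longest-match-first can be seen in a small sketch. The three-name herb list below is only a stand-in for the titles in tb_cmedicine, and the function names `naive_match` and `greedy_match` are hypothetical, chosen for illustration.

```python
def naive_match(text, names):
    """Report every name that occurs anywhere in the text."""
    return [name for name in names if name in text]


def greedy_match(text, names):
    """Longest names first; each matched name is blanked out of the text."""
    found = []
    for name in sorted(names, key=len, reverse=True):
        if name in text:
            found.append(name)
            # Replace with a space so shorter names inside it cannot match again
            text = text.replace(name, " ")
    return found


# Stand-in herb list (assumption: the real list comes from tb_cmedicine)
names = ["酒", "葶苈", "葶苈子"]
text = "春酒5升，葶苈子2升。"

print(naive_match(text, names))   # 葶苈 is reported although only 葶苈子 is present
print(greedy_match(text, names))  # 葶苈 is no longer reported
```

The greedy version still depends on the herb table being complete: a name missing from the table can never be matched, and a short name such as 酒 will match inside any longer word that is not itself in the table.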

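Part of the Neo4j import code is shown below.
Create nodes with merge wherever possible; otherwise a large number of duplicate nodes will be created.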

import re
import pandas as pd
from py2neo import Node, Relationship  # assumption: the Node/Relationship/graph.merge calls match py2neo

# Connect to the Neo4j database: `graph` is the Neo4j graph object and `cnn` is the
# relational database connection. The post does not show how either one is created.

# Read the tb_prescription and tb_cmedicine tables
df_prescriptions = pd.read_sql('SELECT * FROM tb_prescription', cnn)
df_cmedicine = pd.read_sql('SELECT * FROM tb_cmedicine', cnn)

# Turn tb_cmedicine into a dict keyed by herb name, for looking up herb details
cmedicine_dict = df_cmedicine.set_index('title').T.to_dict()

def extract_book_title(excerpt):
    """Return the first book title enclosed in 《》, brackets included, or None."""
    match = re.search(r'《([^》]+)》', excerpt)
    if match:
        return f'《{match.group(1)}》'
    return None

# Build the prescription/herb graph; merge keeps nodes and relationships from being duplicated
for index, row in df_prescriptions.iterrows():
    # Prescription node, merged on name so it is created only once
    prescription_node = Node("Prescription", name=row['title'],
                             prescription=row['prescription'],
                             making=row['making'],
                             functional_indications=row['functional_indications'],
                             usage=row['usage'],
                             excerpt=row['excerpt'],
                             care=row['care'])
    graph.merge(prescription_node, "Prescription", "name")

    # Split the herb names stored in fangji
    medicines = row['fangji'].split(',') if row['fangji'] else []
    for medicine in medicines:
        medicine = medicine.strip()  # drop surrounding whitespace
        # Skip names that have no record in tb_cmedicine
        if medicine not in cmedicine_dict:
            continue
        med_info = cmedicine_dict[medicine]

        # Herb node, merged on name so it is created only once
        medicine_node = Node("Medicine", name=medicine,
                             pinyin=med_info.get('pinyin'),
                             alias=med_info.get('alias'),
                             source=med_info.get('source'),
                             english_name=med_info.get('english_name'),
                             habitat=med_info.get('habitat'),
                             flavor=med_info.get('flavor'),
                             functional_indications=med_info.get('functional_indications'),
                             usage=med_info.get('usage'),
                             excerpt=med_info.get('excerpt'),
                             provenance=med_info.get('provenance'),
                             shape_properties=med_info.get('shape_properties'),
                             attribution=med_info.get('attribution'),
                             prototype=med_info.get('prototype'),
                             discuss=med_info.get('discuss'),
                             chemical_composition=med_info.get('chemical_composition'))
        graph.merge(medicine_node, "Medicine", "name")

        # Prescription -> Medicine relationship "所用药材" (herbs used), merged to avoid duplicates
        relationship = Relationship(prescription_node, "所用药材", medicine_node)
        graph.merge(relationship, "Prescription", "name")

        # Extract the book title in 《》 from the herb's excerpt and create a Book node.
        # `or ''` guards against a NULL excerpt, which would make re.search raise a TypeError.
        book_title = extract_book_title(med_info.get('excerpt') or '')
        if book_title:
            book_node = Node("Book", name=book_title)
            graph.merge(book_node, "Book", "name")
            # Book -> Medicine relationship "收录药材" (herbs recorded), merged to avoid duplicates
            recorded_relationship_medicine = Relationship(book_node, "收录药材", medicine_node)
            graph.merge(recorded_relationship_medicine, "Book", "name")
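
The same "merge instead of create" idea can also be written directly in Cypher. The sketch below only builds the statement and its parameters; it does not connect to a database. The function name `merge_ingredient_query` and the prescription name "example prescription" are hypothetical, while the labels, the `name` key and the 所用药材 relationship type are taken from the code above.

```python
def merge_ingredient_query(prescription, medicine):
    """Build a parameterised Cypher statement that is safe to run repeatedly.

    MERGE matches an existing node or relationship and creates it only when
    no match exists, so running the statement twice adds nothing new.
    """
    query = (
        "MERGE (p:Prescription {name: $prescription}) "
        "MERGE (m:Medicine {name: $medicine}) "
        "MERGE (p)-[:`所用药材`]->(m)"
    )
    params = {"prescription": prescription, "medicine": medicine}
    return query, params


# Hypothetical prescription name, for illustration only
query, params = merge_ingredient_query("example prescription", "葶苈子")
print(query)
print(params)
```

With the official neo4j Python driver, a statement like this would typically be passed to `session.run(query, params)`. Merging on the `name` key alone, rather than on all properties, is what keeps a node from being duplicated when some other property differs between rows.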
