ToB企服应用市场 (ToB enterprise services app market): a ToB review and business networking platform

Title: LightRAG paper reading notes

Author: 宁睿    Time: 2024-12-16 05:32
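Original paper
https://arxiv.org/pdf/2410.05779v1
     First, my own impression: read as a whole, this paper has little that is striking. Its core is a knowledge graph: a model extracts entities and relations from the documents, and queries are then built on top of that graph. The central problem it tackles is still how to connect related pieces of knowledge.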
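
To make that idea concrete, here is a minimal Python sketch of such a graph index. It is an illustration under assumptions, not code from the paper: the sample entities, the relation tuple layout, and the `build_graph` and `retrieve` helpers are all hypothetical, and the tuples that would normally come from the model's extraction step are hard-coded.

```python
from collections import defaultdict

# Hypothetical toy data. In practice these would be produced by an LLM
# extraction step; they are hard-coded here purely for illustration.
entities = {
    "ALICE": "A researcher who studies retrieval systems.",
    "ACME LAB": "An organization where Alice works.",
}
relations = [
    # (source, target, description, keywords)
    ("ALICE", "ACME LAB", "Alice is employed by Acme Lab.", ["employment"]),
]


def build_graph(relations):
    """Build an adjacency map: entity name -> relations touching that entity."""
    graph = defaultdict(list)
    for rel in relations:
        source, target = rel[0], rel[1]
        graph[source].append(rel)
        graph[target].append(rel)
    return graph


def retrieve(graph, entities, entity_names):
    """Collect the descriptions of matched entities and of their relations."""
    context = []
    for name in entity_names:
        key = name.upper()  # entity names are stored capitalized
        if key in entities:
            context.append(entities[key])
        for rel in graph.get(key, []):
            if rel[2] not in context:  # avoid repeating a relation description
                context.append(rel[2])
    return context


graph = build_graph(relations)
context = retrieve(graph, entities, ["Alice"])
print(context)
```

The point of the sketch is only that a query hitting one entity also pulls in the relations attached to it, which is the "connection between pieces of knowledge" that plain chunk retrieval lacks.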
  

Main problems the paper addresses and its results

   Problems addressed:

    Results achieved:

    Quick read of the paper

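   This paper introduces a new retrieval-augmented generation (Retrieval-Augmented Generation, RAG) system called LightRAG. LightRAG aims to overcome the limitations of existing RAG systems by integrating graph structures into the text indexing and retrieval process. A detailed walkthrough of the paper follows:
  1. Introduction and background

  
  2. The LightRAG proposal

  
  3. LightRAG architecture

  
  4. Experimental evaluation

  
  5. Results and discussion

  
  6. Related work

  
  7. Conclusion

  
  The paper shows LightRAG's advantages on complex queries and large-scale datasets, and its experiments report clear improvements in both retrieval accuracy and efficiency.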
  
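Core prompts

I did not see much that is novel in this paper; the prompts are probably the only part worth a look.
Prompt for building the graph, used to extract entities and relations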

  -Goal-
  Given a text document that is potentially relevant to this activity and a list of entity types, identify all entities of those types from the text and all relationships among the identified entities.
  -Steps-
  1. Identify all entities. For each identified entity, extract the following information:
  - entity_name: Name of the entity, capitalized
  - entity_type: One of the following types: [organization, person, geo, event]
  - entity_description: Comprehensive description of the entity's attributes and activities
  Format each entity as ("entity"<|><entity_name><|><entity_type><|><entity_description>)
  2. From the entities identified in step 1, identify all pairs of (source_entity, target_entity) that are *clearly related* to each other.
  For each pair of related entities, extract the following information:
  - source_entity: name of the source entity, as identified in step 1
  - target_entity: name of the target entity, as identified in step 1
  - relationship_description: explanation as to why you think the source entity and the target entity are related to each other
  - relationship_strength: a numeric score indicating strength of the relationship between the source entity and target entity
  - relationship_keywords: one or more high-level key words that summarize the overarching nature of the relationship, focusing on concepts or themes rather than specific details
  Format each relationship as ("relationship"<|><source_entity><|><target_entity><|><relationship_description><|><relationship_keywords><|><relationship_strength>)
  3. Identify high-level key words that summarize the main concepts, themes, or topics of the entire text. These should capture the overarching ideas present in the document.
  Format the content-level key words as ("content_keywords"<|><high_level_keywords>)
  4. Return output in English as a single list of all the entities and relationships identified in steps 1 and 2. Use **##** as the list delimiter.
  5. When finished, output <|COMPLETE|>
  -Real Data-
  Entity_types: {entity_types}
  Text: {input_text}
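
The output format this prompt asks for is easy to parse mechanically. Below is a rough Python sketch of such a parser, assuming the model follows the format exactly: records separated by `##`, fields separated by `<|>`, and a trailing `<|COMPLETE|>` marker. The function name `parse_extraction`, the dictionary keys, and the sample reply are my own placeholders, not taken from the paper.

```python
TUPLE_DELIMITER = "<|>"
RECORD_DELIMITER = "##"
COMPLETION_MARKER = "<|COMPLETE|>"


def parse_extraction(raw):
    """Split a model reply into entity, relationship and keyword records."""
    entities, relationships, keywords = [], [], []
    text = raw.replace(COMPLETION_MARKER, "")
    for record in text.split(RECORD_DELIMITER):
        record = record.strip()
        # Skip anything that is not a parenthesized tuple.
        if not (record.startswith("(") and record.endswith(")")):
            continue
        fields = [f.strip().strip('"') for f in record[1:-1].split(TUPLE_DELIMITER)]
        kind = fields[0]
        if kind == "entity" and len(fields) == 4:
            entities.append(
                {"name": fields[1], "type": fields[2], "description": fields[3]}
            )
        elif kind == "relationship" and len(fields) == 6:
            relationships.append(
                {
                    "source": fields[1],
                    "target": fields[2],
                    "description": fields[3],
                    "keywords": [k.strip() for k in fields[4].split(",")],
                    "strength": float(fields[5]),
                }
            )
        elif kind == "content_keywords" and len(fields) == 2:
            keywords = [k.strip() for k in fields[1].split(",")]
    return entities, relationships, keywords


# Hypothetical model reply, written by hand to match the prompt's format.
sample = (
    '("entity"<|>ALICE<|>person<|>A researcher who studies retrieval systems.)##'
    '("entity"<|>ACME LAB<|>organization<|>A lab where Alice works.)##'
    '("relationship"<|>ALICE<|>ACME LAB<|>Alice is employed by Acme Lab.'
    "<|>employment, research<|>8)##"
    '("content_keywords"<|>retrieval research, employment)<|COMPLETE|>'
)

entities, relationships, content_keywords = parse_extraction(sample)
print(entities)
print(relationships)
print(content_keywords)
```

Real model output is rarely this clean, so the sketch silently drops records with the wrong field count; a production parser would at least log them.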

Prompt for extracting keywords

 
  ---Role---
  You are a helpful assistant tasked with identifying both high-level and low-level keywords in the user's query.
  ---Goal---
  Given the query, list both high-level and low-level keywords. High-level keywords focus on overarching concepts or themes, while low-level keywords focus on specific entities, details, or concrete terms.
  - Output the keywords in JSON format.
  - The JSON should have two keys:
  - "high_level_keywords" for overarching concepts or themes.
  - "low_level_keywords" for specific entities or details.
  -Examples-
  Example 1:
  Query: "How does international trade influence global economic stability?"
  Output: {{ "high_level_keywords": ["International trade", "Global economic stability", "Economic impact"], "low_level_keywords": ["Trade agreements", "Tariffs", "Currency exchange", "Imports", "Exports"] }}
  Example 2:
  Query: "What are the environmental consequences of deforestation on biodiversity?"
  Output: {{ "high_level_keywords": ["Environmental consequences", "Deforestation", "Biodiversity loss"], "low_level_keywords": ["Species extinction", "Habitat destruction", "Carbon emissions", "Rainforest", "Ecosystem"] }}
  Example 3:
  Query: "What is the role of education in reducing poverty?"
  Output: {{ "high_level_keywords": ["Education", "Poverty reduction", "Socioeconomic development"], "low_level_keywords": ["School access", "Literacy rates", "Job training", "Income inequality"] }}
  -Real Data-
  Query: {query}
  Output:
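
One detail worth noting in this prompt is the doubled braces. If the template is filled in with Python's `str.format` (an assumption on my part, suggested by the `{query}` placeholder), `{{` and `}}` come out as literal `{` and `}`, so the JSON examples survive formatting. The sketch below shows that, plus a tolerant way to read the two keyword lists back from a reply. `TEMPLATE` is a shortened stand-in for the full prompt, and `build_prompt`, `parse_keywords`, and the sample reply are hypothetical.

```python
import json

# Shortened stand-in for the prompt above; only the brace handling matters here.
TEMPLATE = (
    'Example output: {{ "high_level_keywords": ["Education"], '
    '"low_level_keywords": ["School access"] }}\n'
    "Query: {query}\n"
    "Output:"
)


def build_prompt(query):
    """Fill in the query; '{{' and '}}' become literal braces."""
    return TEMPLATE.format(query=query)


def parse_keywords(reply):
    """Return (high_level, low_level) keyword lists, or empty lists on failure."""
    # Keep only the outermost {...} span in case the model adds text around it.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        return [], []
    try:
        data = json.loads(reply[start : end + 1])
    except json.JSONDecodeError:
        return [], []
    return data.get("high_level_keywords", []), data.get("low_level_keywords", [])


prompt = build_prompt("What is the role of education in reducing poverty?")

# Hypothetical model reply with some extra text before the JSON.
reply = 'Sure: {"high_level_keywords": ["Education"], "low_level_keywords": ["Literacy rates"]}'
high, low = parse_keywords(reply)
print(high, low)
```

The two lists map naturally onto the graph: low-level keywords can be matched against entity names, high-level keywords against relationship keywords.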
 

