Structured output serves a similar purpose, but it also simplifies integration with downstream components of your system. Instructor and Outlines work well for structured output. (If you're importing an LLM API SDK, use Instructor; if you're importing Huggingface for a self-hosted model, use Outlines.) Structured input expresses the task clearly and resembles the format of the training data, increasing the probability of better output.
"content": "从这个产品描述中提取<name>, <size>, <price>和<color>并放入<response>标签中。\n<description>The SmartHome Mini is a compact smart home assistant available in black or white for only $49.99. At just 5 inches wide, it lets you control lights, thermostats, and other connected devices via voice or app—no matter where you place it in your home. This affordable little hub brings convenient hands-free control to your smart devices.</description>"
Recent research suggests that RAG may have the edge. One study compared RAG against unsupervised fine-tuning (a.k.a. continued pretraining), evaluating both on a subset of MMLU and on current events. They found that RAG consistently outperformed fine-tuning, both for knowledge encountered during training and for entirely new knowledge. In another paper, they compared RAG against supervised fine-tuning on an agricultural dataset. Again, the performance boost from RAG was greater than from fine-tuning, especially for GPT-4.