agent rag
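Overall paradigm of Agentic RAG: differences from traditional RAG, positioning, and capability mechanisms
Compares and delineates traditional RAG and Agentic RAG in terms of their differences and capability positioning, addressing why and how embedding the autonomy, iteration, and orchestration capabilities of agents into RAG improves adaptability and performance. These works provide paradigm frameworks and overall positioning.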
- Traditional rag vs. agentic rag: A comparative study of retrieval-augmented systems(F Neha, D Bhati, 2025, Authorea Preprints)
- Agentic RAG Systems for Improving Adaptability and Performance in AI-Driven Information Retrieval(Ajit Singh, 2025, Available at SSRN 5188363)
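Hierarchical/collaborative multi-agent architectures: task decomposition, multi-source retrieval, and fused decision-making
Centered on multi-agent collaboration: answer quality is improved through task decomposition, modality-specific or multi-source retrieval, and fusion of multiple result streams (voting, consistency checks, expert refinement, and so on). The entity-resolution work shows how task-specific agents cooperate to complete complex cognitive steps.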
- HM-RAG: Hierarchical Multi-Agent Multimodal Retrieval Augmented Generation(Pei Liu, Xin Liu, Ruoyu Yao, Junming Liu, Siyuan Meng, Ding Wang, Jun Ma, 2025, Proceedings of the 33rd ACM International Conference on Multimedia)
- Multi-agent Retrieval-Augmented Generation for Enhancing Answer Generation and Knowledge Retrieval(Deepak Kumar, Bhavesh Jain, 2025, Lecture Notes in Computer Science)
- Multi-Agent RAG Framework for Entity Resolution: Advancing Beyond Single-LLM Approaches with Specialized Agent Coordination(A. T. Muhammad, Muzakkiruddin Ahmed Mohammed, Mariofanna Milanova, John Talburt, Mert Can Cakmak, 2025, Computers)
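The fusion step described for this group can be sketched as a simple majority vote over answers returned by several retrieval agents. This is a minimal illustration, not the method of any listed paper: the `fuse_by_vote` function, the `min_agreement` threshold, and the three lambda retrievers are all hypothetical stand-ins for LLM-backed agents.

```python
from collections import Counter
from typing import Callable

Retriever = Callable[[str], str]  # hypothetical agent interface: sub-task -> answer


def fuse_by_vote(sub_task: str, retrievers: dict[str, Retriever],
                 min_agreement: float = 0.5):
    """Ask every retrieval agent, then keep the most common normalized answer."""
    answers = {name: fn(sub_task).strip().lower() for name, fn in retrievers.items()}
    winner, votes = Counter(answers.values()).most_common(1)[0]
    agreement = votes / len(answers)
    # Low agreement would be the point to hand over to a stronger "expert" model.
    needs_refinement = agreement <= min_agreement
    return winner, agreement, needs_refinement


# Toy agents standing in for vector, graph, and web retrieval (assumed names).
retrievers = {
    "vector": lambda q: "Paris",
    "graph": lambda q: "paris",
    "web": lambda q: "Lyon",
}
answer, agreement, needs_refinement = fuse_by_vote("capital of France?", retrievers)
```

Normalizing answers before counting is a deliberate simplification; a real system would likely compare answers semantically rather than by exact string match.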
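Iterative retrieval and retrieval-trajectory modeling: adaptive re-retrieval, context compression, and multi-hop progression
Focuses on making the retrieval process itself intelligent: iterative retrieval, retrieval-trajectory modeling, and retrieval compression (summarization and plan progression, context management) are used to optimize multi-round re-retrieval and multi-hop information gathering, with an emphasis on process control in the coupling between retrieval and generation.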
- Adaptive iterative retrieval for enhanced retrieval-augmented generation(Wenhan Han, Xiao Xiao, Yaohang Li, Jun Wang, Mykola Pechenizkiy, Meng Fang, 2025, Neurocomputing)
- Learning to Retrieve Iteratively for In-Context Learning(Yunmo Chen, Tongfei Chen, Harsh Jhamtani, Patrick Xia, Richard Shin, Jason Eisner, Benjamin Van Durme, 2024, Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing)
- Retrieve, Summarize, Plan: Advancing Multi-hop Question Answering with an Iterative Approach(Zhouyu Jiang, Mengshu Sun, Lei Liang, Zhiqiang Zhang, 2024, Companion Proceedings of the ACM on Web Conference 2025)
- MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation(Chia-Yuan Chang, Zhimeng Jiang, Vineeth Rakesh, Menghai Pan, Chin‐Chia Michael Yeh, Guanchu Wang, Mingzhi Hu, Zhichao Xu, Yan Zheng, Mahashweta Das, Na Zou, 2025, Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers))
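A multi-round retrieval loop of the kind this group studies can be sketched as follows. It is a toy under stated assumptions: the corpus, the `ENTITIES` set, the keyword-based `search`, and the `is_sufficient` check are invented for illustration, and truncating the note list stands in for real context compression.

```python
CORPUS = [
    "Alice manages the Atlas project.",
    "The Atlas project is hosted in Oslo.",
    "Bob manages the Borealis project.",
]
ENTITIES = {"alice", "atlas", "oslo", "bob", "borealis"}  # assumed entity list


def key_terms(text: str) -> set[str]:
    """Entities mentioned in the text (toy replacement for a learned retriever)."""
    return {w.strip(".,?!").lower() for w in text.split()} & ENTITIES


def search(query: str) -> list[str]:
    terms = key_terms(query)
    return [p for p in CORPUS if key_terms(p) & terms]


def is_sufficient(question: str, notes: list[str]) -> bool:
    # Hypothetical stop test: the question asks where something is hosted.
    return any("hosted in" in n for n in notes)


def iterative_retrieve(question: str, max_rounds: int = 3, max_notes: int = 5):
    notes: list[str] = []
    query = question
    for round_no in range(1, max_rounds + 1):
        for passage in search(query):
            if passage not in notes:
                notes.append(passage)
        notes = notes[-max_notes:]  # crude context management: keep the newest notes
        if is_sufficient(question, notes):
            return notes, round_no
        # Re-query with the latest evidence so the next hop can follow new entities.
        query = f"{question} {notes[-1]}" if notes else question
    return notes, max_rounds


notes, rounds = iterative_retrieve("Where is the project that Alice manages hosted?")
```

The first round only finds the passage about Alice; appending it to the query exposes "Atlas", which lets the second round reach the hosting fact.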
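Closed-loop agents for fact-checking and safety compliance: retrieve, verify, stop condition
Builds closed-loop agent workflows for high-risk, high-reliability settings: evidence retrieval comes first, followed by verification and a stop condition (driven by confidence or by requirements derivation), so that retrieval relevance and checking or safety derivation form a controlled closed loop.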
- FIRE: Fact-checking with Iterative Retrieval and Verification(Zhuohan Xie, Rui Xing, Yuxia Wang, Jiahui Geng, Iqbal Hasan, Dhruv Sahnan, Iryna Gurevych, Preslav Nakov, 2025, Findings of the Association for Computational Linguistics: NAACL 2025)
- Towards Automated Safety Requirements Derivation Using Agent-based RAG(Balahari Vignesh Balu, Florian Geißler, Francesco Carella, João-Vitor Zacchi, Josef Jiru, Núria Mata, R. Stollé, 2025, Proceedings of the AAAI Symposium Series)
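The retrieve, verify, stop pattern can be sketched as a loop that gathers evidence until a verifier is confident enough. Everything here is an assumption made for illustration: the `EVIDENCE` list, the counting-based `verify` function, and the 0.6 threshold do not come from the listed papers.

```python
EVIDENCE = [
    "source A supports the claim",
    "source B supports the claim",
    "source C supports the claim",
]


def retrieve(claim: str, step: int) -> list[str]:
    """Toy retriever: hand out one new snippet per step."""
    return EVIDENCE[step - 1:step]


def verify(claim: str, evidence: list[str]) -> tuple[str, float]:
    """Toy verifier: confidence grows with the number of supporting snippets."""
    support = sum("supports" in e for e in evidence)
    return "supported", support / len(EVIDENCE)


def check_claim(claim: str, threshold: float = 0.6, max_steps: int = 4) -> dict:
    evidence: list[str] = []
    confidence = 0.0
    for step in range(1, max_steps + 1):
        evidence.extend(retrieve(claim, step))
        label, confidence = verify(claim, evidence)
        if confidence >= threshold:  # stop condition: confident enough to answer
            return {"label": label, "confidence": confidence, "steps": step}
    # Budget exhausted without reaching the threshold.
    return {"label": "not enough evidence", "confidence": confidence, "steps": max_steps}


result = check_claim("The bridge opened in 1932.")
```

With these toy inputs the loop stops after the second step, so the third snippet is never fetched; that saved retrieval is the point of an explicit stop condition.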
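Domain-applied multi-agent RAG: clinical decision-making, energy/conversational decision support, and hazard scenarios
Multi-agent systems that bring Agentic RAG into specific application domains: the medical/clinical work emphasizes self-querying and retrieval-quality improvement; the conversational decision-support work emphasizes role-based agents covering understanding, retrieval, analysis, and presentation; the hazard/risk work builds user-facing evidence integration and generation pipelines.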
- Enhancing Clinical Decision Support with Adaptive Iterative Self-Query Retrieval for Retrieval-Augmented Large Language Models(Srinivagasam Prabha, C. A. Gomez-Cabello, S. A. Haider, Ariana Genovese, Maissa Trabilsy, Nadia G. Wood, Sanjay Bagaria, Cui Tao, AJ Forte, 2025, Bioengineering)
- Multi-Agent RAG Chatbot Architecture for Decision Support in Net-Zero Emission Energy Systems(Gihan Gamage, Nishan Mills, Daswin De Silva, Milos Manic, Harsha Moraliyage, Andrew Jennings, D. Alahakoon, 2024, 2024 IEEE International Conference on Industrial Technology (ICIT))
- MARSHA: multi-agent RAG system for hazard adaptation(Yangxinyu Xie, Bowen Jiang, Tanwi Mallick, J. Bergerson, John K. Hutchison, Duane R. Verner, Jordan Branham, M. R. Alexander, Robert B. Ross, Yan Feng, L. Levy, Weijie J. Su, C. J. Taylor, 2025, npj Climate Action)
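Dual-level retrieval augmentation for long-horizon, cross-application mobile tasks: high-level strategy and low-level execution
Targets long-horizon, multi-application, UI-execution tasks on mobile devices and in real-world settings: distinguishes the knowledge needed for high-level planning from that needed for low-level operation and applies retrieval augmentation to each separately (for example, a Manager/Operator dual-level design), addressing how knowledge acquisition and execution are coordinated in long-horizon tasks.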
- Mobile-Agent-RAG: Driving Smart Multi-Agent Coordination with Contextual Knowledge Empowerment for Long-Horizon Mobile Automation(Yuxiang Zhou, Jichang Li, Yanhao Zhang, Haonan Lu, Guanbin Li, 2026, Proceedings of the AAAI Conference on Artificial Intelligence)
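The dual-level idea can be illustrated with two separate lookups, one for plans and one for UI instructions. This is a sketch only: the knowledge-base contents, the word-overlap matching, and the function names `manager_rag` and `operator_rag` are hypothetical and are not taken from the Mobile-Agent-RAG implementation.

```python
# High-level knowledge base: task description -> ordered (app, subtask) plan.
PLAN_KB = {
    "share a photo with a contact": [
        ("Gallery", "open photo"),
        ("Gallery", "tap share"),
        ("Messages", "pick contact"),
    ],
    "set a morning alarm": [("Clock", "open alarm tab"), ("Clock", "add alarm")],
}

# Low-level knowledge base: (app, subtask) -> concrete UI instruction.
ACTION_KB = {
    ("Gallery", "open photo"): "tap the photo thumbnail",
    ("Gallery", "tap share"): "tap the share icon in the bottom bar",
    ("Messages", "pick contact"): "type the contact name, then tap the first result",
}


def manager_rag(task: str) -> list[tuple[str, str]]:
    """Retrieve the stored plan whose description shares the most words with the task."""
    words = set(task.lower().split())
    best = max(PLAN_KB, key=lambda k: len(words & set(k.split())))
    return PLAN_KB[best]


def operator_rag(app: str, subtask: str) -> str:
    """Retrieve app-specific guidance for one atomic step."""
    return ACTION_KB.get((app, subtask), "no stored guidance; fall back to the model")


def run(task: str) -> list[tuple[str, str, str]]:
    return [(app, sub, operator_rag(app, sub)) for app, sub in manager_rag(task)]


trace = run("Share a photo with Dana")
```

Keeping the two stores separate mirrors the observation that planning needs strategy-level experience while execution needs instructions tied to a specific app screen.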
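Multimodal Agentic RAG and ReAct-style multi-hop planning: cross-modal evidence integration and relational reasoning
Extends RAG to multimodal evidence and multi-hop reasoning: images, tables, multimodal relations, and other non-text information are jointly retrieved and integrated as evidence, and agent logic (collaboration, planning, and reasoning paradigms) is used to generate more complete and more trustworthy descriptions or answers.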
- A Multimodal Retrieval-Augmented Generation System with ReAct Agent Logic for Multi-Hop Reasoning(Denys Yuvzhenko, Valentyn Chymshyr, V. Shymkovych, Kyrylo Znova, Grzegorz Nowakowski, Sergii Telenyk, 2025, Information, Computing and Intelligent systems)
- Relational Reasoning Image Captioning Via Multi-Agent Retrieval-Augmented Generation(Aiwen Jiang, Duan Wang, Chao Peng, Mingwen Wang, 2025, Knowledge-Based Systems)
- CollEX – A Multimodal Agentic RAG System Enabling Interactive Exploration of Scientific Collections(Florian Schneider, Narges Baba Ahmadi, Niloufar Baba Ahmadi, Iris Vogel, Martin Semmann, Chris Biemann, 2025, Proceedings of the 1st Workshop on Multimodal Augmented Generation via Multimodal Retrieval (MAGMaR 2025))
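Explainability evaluation and analysis of RAG interaction dynamics (RAGTrace and similar)
Aimed at explainability and diagnosability: provides traceable evaluation of retrieval-generation dynamics, analyzing why a system fails and how retrieval relevance and generation fidelity evolve over the course of interaction, in support of iterative improvement.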
- RAGTrace: Understanding and Refining Retrieval-Generation Dynamics in Retrieval-Augmented Generation(Sizhe Cheng, Jiaping Li, Huanchen Wang, Yuxin Ma, 2025, Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology)
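Dynamic/universal data-source retrieval: Agent-based Universal RAG (AU-RAG)
Addresses dynamic data sources and universal retrieval access: an agent uses descriptive metadata to learn how to access continually changing data sources and retrieve from them dynamically, rather than relying on a one-time pre-encoded index.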
- AU-RAG: Agent-based Universal Retrieval Augmented Generation(Jisoo Jang, Wen-Syan Li, 2024, Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region)
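Metadata-driven source selection can be sketched as matching a query against source descriptions and only then calling the chosen source. The `Source` dataclass, the word-overlap scoring, and the two example sources are assumptions for illustration; the paper's agent learns from examples, which this toy does not attempt.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Source:
    name: str
    description: str                   # descriptive metadata the agent reads
    fetch: Callable[[str], list[str]]  # called at query time, nothing is pre-encoded


def select_sources(query: str, sources: list[Source], top_k: int = 1) -> list[Source]:
    """Rank sources by word overlap between the query and their descriptions."""
    q = set(query.lower().split())
    scored = [(len(q & set(s.description.lower().split())), s) for s in sources]
    scored = [(score, s) for score, s in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:top_k]]


sources = [
    Source("prices", "daily stock prices and trading volume",
           lambda q: [f"price feed result for: {q}"]),
    Source("filings", "annual company filings and earnings reports",
           lambda q: [f"filing search result for: {q}"]),
]

query = "latest stock prices for ACME"
chosen = select_sources(query, sources)
results = [hit for s in chosen for hit in s.fetch(query)]
```

Adding a new data source then only requires registering a description and a fetch function, with no re-indexing of existing content.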
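Tool- and workflow-orchestration agents with ReAct/RAG: tool retrieval and complex task workflows
Emphasizes toolchain and workflow orchestration: retrieval is integrated as one tool among others in complex task execution (ReAct and tool-retrieval logic), highlighting how workflow and reasoning are combined at the engineering level rather than focusing only on retrieval relevance or generic multi-agent architecture.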
- ToolReAGt: Tool Retrieval for LLM-based Complex Task Solution via Retrieval Augmented Generation(N. Braunschweiler, R. Doddipatla, T. Zorila, 2025, Proceedings of the 3rd Workshop on Towards Knowledgeable Foundation Models (KnowFM))
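Trustworthiness and reliability: critic-driven, explainable, and systematic-reasoning approaches to evaluation, error correction, and transparency
Centered on reliability and transparency: critic-driven self-correction and error-mining workflows, together with systematic reasoning and explainable-AI methodology, are used to improve the controllability and auditability of Agentic RAG rather than merely raising recall.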
- RAG-Critic: Leveraging Automated Critic-Guided Agentic Workflow for Retrieval Augmented Generation(Guanting Dong, Jiajie Jin, Xiaoxi Li, Yutao Zhu, Zhicheng Dou, Ji-Rong Wen, 2025, Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers))
- Reasoning RAG via System 1 or System 2: A Survey on Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges(Jintao Liang, Sugang, Huifeng Lin, You Wu, Rui Zhao, Ziyue Li, 2025, Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics)
- Towards Explainable AI in Agentic Retrieval-Augmented Generation: A Systematic Review(Afnan Habib, Osamah F. Abdulmahmod, Mukhlis Raza, Y. Gu, Murat Aydoğan, M. A. Al-antari, 2025, 2025 9th International Artificial Intelligence and Data Processing Symposium (IDAP))
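A critic-guided correction loop can be sketched as: generate, let a critic name the error type, then apply the fix that matches that error. The rule-based `critic` below is a hypothetical stand-in for a trained error-critic model, and the error tags, `retrieve`, and `generate` are invented for this example.

```python
def retrieve(question: str, broaden: bool) -> list[str]:
    """Toy retriever: the narrow search misses, the broadened one succeeds."""
    return ["The Atlas project is hosted in Oslo."] if broaden else []


def generate(question: str, passages: list[str]) -> str:
    """Toy generator: guesses when it has no passages to ground on."""
    return "Oslo" if passages else "Bergen"


def critic(question: str, passages: list[str], answer: str):
    """Return an error tag, or None when no error is detected."""
    if not passages:
        return "retrieval:empty"
    if not any(answer.lower() in p.lower() for p in passages):
        return "generation:unsupported"
    return None


def run_with_critic(question: str, max_fixes: int = 2):
    passages = retrieve(question, broaden=False)
    answer = generate(question, passages)
    history = []
    for _ in range(max_fixes):
        error = critic(question, passages, answer)
        if error is None:
            break
        history.append(error)
        if error.startswith("retrieval"):
            passages = retrieve(question, broaden=True)  # fix the retrieval side
        answer = generate(question, passages)            # then regenerate
    return answer, history


answer, history = run_with_critic("Where is the Atlas project hosted?")
```

Recording the error history alongside the answer keeps the correction path auditable, which matches the transparency goal of this group.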
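Trustworthy/controlled Agentic RAG workflows: human and expert involvement, access control, and end-to-end engineering architecture
The common thread is runnable Agentic RAG workflows with trustworthy, controlled access: role constraints, tool and data access control, human or expert participation and feedback, and traceable engineering architecture turn RAG into a controlled, trustworthy end-to-end workflow (including industry deployments). The framework-style pipelines serve as general implementation references.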
- Agentic AI Meets RAG: A Framework for Domain-Specific Knowledge Evaluation(V. K, Abirami A M, Uma K V, Viswanath V S, 2025, 2025 IEEE 32nd International Conference on High Performance Computing, Data and Analytics Workshop (HiPCW))
- Agentic RAG for Maritime AIoT: Natural Language Access to Structured Data(Oxana Sachenkova, Melker Andreasson, Dongzhu Tan, Alisa Lincke, 2026, Sensors)
- Agentic RAG for Personalized Learning: Design of an AI-Powered Learning Agent Using Open-Source Small Language Models(Shilpi Taneja, S. Biswas, Bhavya Alankar, Harleen Kaur, 2025, Electronic Journal of e-Learning)
- CyberRAG: An agentic RAG cyber attack classification and reporting tool(Francesco Blefari, Cristian Cosentino, F. A. Pironti, Angelo Furfaro, Fabrizio Marozzo, 2025, Future Generation Computer Systems)
- An Agent-Based RAG Architecture for Intelligent Tourism Assistance: The Valencia Case Study(Andrea Bonetti, Adrián Salcedo-Puche, Joan Vila-Francés, Xaro Benavent-Garcia, Emilio Fernández-Vargas, R. Magdalena-Benedito, E. Soria-Olivas, 2025, Tourism and Hospitality)
- Agentic RAG with Human-in-the-Retrieval(Xiwei Xu, Dawen Zhang, Qing Liu, Qinghua Lu, Liming Zhu, 2025, 2025 IEEE 22nd International Conference on Software Architecture Companion (ICSA-C))
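Role-constrained retrieval with provenance can be sketched by filtering documents on permissions before any ranking happens. The `Doc` structure, the role names, and the sample records are hypothetical; the listed systems enforce access through their own database and tool layers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    text: str
    allowed_roles: frozenset
    source: str  # provenance tag returned with every hit


def retrieve_permitted(query: str, docs: list[Doc], role: str) -> list[dict]:
    """Return only matching documents that the caller's role may read."""
    terms = set(query.lower().split())
    hits = []
    for d in docs:
        if role not in d.allowed_roles:
            continue  # permission check comes first, so restricted text never reaches the prompt
        if terms & set(d.text.lower().split()):
            hits.append({"text": d.text, "source": d.source})
    return hits


docs = [
    Doc("engine fuel consumption per trip", frozenset({"engineer", "analyst"}), "sensors.fuel"),
    Doc("crew payroll per trip", frozenset({"hr"}), "hr.payroll"),
]

analyst_hits = retrieve_permitted("fuel per trip", docs, role="analyst")
hr_hits = retrieve_permitted("fuel per trip", docs, role="hr")
```

The same query yields different evidence for different roles, and each hit carries its source so that an answer can be traced back to a permitted record.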
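Multi-agent orchestration and collaboration architectures (framework and service perspective): division of labor, iteration, and systematic implementation
Organized around multi-agent orchestration and collaboration frameworks and service architectures: covers specialized division of labor among agents, iteration cycles, and framework-based organization of the RAG pipeline. It includes both iterative-retrieval ideas designed for multi-agent settings and architecture- and service-level summaries and deployments tracing the path from RAG to multi-agent systems.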
- KAIR: Knowledge-Aware Iterative Retrieval for Multiagent Systems(Seyoung Song, Siddansh Chawla, Vijay K. Madisetti, 2026, IEEE Transactions on Computational Social Systems)
- … multi-agent paradigm for smart urban mobility: Opportunities and challenges for integrating large language models and retrieval-augmented generation with intelligent …(H Xu, J Yuan, A Zhou, G Xu, W Li, XJ Ban, 2025, Urban Human …)
- Sustainable Digitalization of Business with Multi-Agent RAG and LLM(Muhammad Arslan, Saba Munawar, Christophe Cruz, 2025, Procedia Computer Science)
- An Agent-Based Service Architecture for Smart Greenhouses: Telemetry Analytics and Decision Support with RAG-grounded LLM Agents(S. Pardo-Pina, J. Germer, R. Suay-Cortes, Manuel Platero-Horcajadas, Francisco-Javier Ferrández-Pastor, 2026, Smart Agricultural Technology)
- RAG-Enhanced Collaborative LLM Agents for Drug Discovery(Namkyeong Lee, Edward De Brouwer, Ehsan Hajiramezanali, Tommaso Biancalani, Chanyoung Park, Gabriele Scalia, 2026, Proceedings of the AAAI Conference on Artificial Intelligence)
- Multi-Agent Retrieval Augmented Generation for Clinical Decision Support: A Systematic Review and Integrative Conceptual Framework(Tarisai Mugambiwa, B. Ndlovu, 2026, Journal of Applied Informatics and Computing)
- From RAG to Multi-Agent Systems: A Survey of Modern Approaches in LLM Development(Gustavo Aquino, Nádila da Silva de Azevedo, Leandro Youiti Silva Okimoto, Leonardo Yuto Suzuki Camelo, Hendrio Bragança, Rubens de Andrade Fernandes, André Luiz Printes, Fábio Florença Cardoso, Raimundo Cláudio Souza Gomes, Israel Gondres Torné, 2025, Preprints.org)
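Specialized RAG for heterogeneous knowledge forms: support for multimodal data and tables/heterogeneous documents
Structural adaptations for heterogeneous knowledge forms: one line of work targets multimodal data (such as cross-modal evidence from pathology images), the other targets tables and heterogeneous documents (preserving table structure, executing SQL, multi-hop reasoning), emphasizing specialized retrieval and evidence-use mechanisms under the constraints of the data form.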
- Patho-AgenticRAG: Towards Multimodal Agentic Retrieval-Augmented Generation for Pathology VLMs via Reinforcement Learning(Wenchuan Zhang, Jingru Guo, Hengzhe Zhang, Penghao Zhang, Jie Chen, Shuwan Zhang, Zhang Zhang, Yuhao Yi, Hong Bu, 2026, Proceedings of the AAAI Conference on Artificial Intelligence)
- TableRAG: A Retrieval Augmented Generation Framework for Heterogeneous Document Reasoning(Xiaohan Yu, Pu Jian, Chong Chen, 2025, Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing)
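Keeping a table in relational form and answering by executing SQL, rather than flattening rows into text chunks, can be sketched with Python's built-in `sqlite3` module. The `sales` table, its values, and the hand-written query are invented for illustration; in an agentic system the SQL string would be produced by a model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("north", "Q1", 120.0), ("north", "Q2", 150.0),
     ("south", "Q1", 90.0), ("south", "Q2", 60.0)],
)


def answer_with_sql(conn: sqlite3.Connection, sql: str) -> list[tuple]:
    """Run a read-only query; anything other than SELECT is rejected."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql).fetchall()


# Question: "Which region had the highest total revenue?"
sql = (
    "SELECT region, SUM(revenue) AS total FROM sales "
    "GROUP BY region ORDER BY total DESC LIMIT 1"
)
rows = answer_with_sql(conn, sql)
```

Aggregations like this are exact when executed by the database, whereas a model reading flattened rows has to do the arithmetic itself.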
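Reasoning-paradigm-driven RAG reliability (systematic organization of reasoning)
(Included to cover the sub-topic of systematic reasoning paradigms.) This paper organizes the RAG reasoning process along System 1/System 2 lines, illustrating the relationship between reasoning paradigms and reliable generation in Agentic RAG.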
- Reasoning RAG via System 1 or System 2: A Survey on Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges(Jintao Liang, Sugang, Huifeng Lin, You Wu, Rui Zhao, Ziyue Li, 2025, Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics)
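After merging, research on agentic RAG can be grouped into the following parallel categories: (1) overall paradigm positioning; (2) hierarchical/collaborative multi-agent architectures; (3) iterative retrieval and process control over retrieval trajectories; (4) closed loops for fact-checking and safety compliance (retrieve, verify, stop); (5) domain-applied multi-agent deployments (clinical, decision support, hazards, and others); (6) dual-level retrieval augmentation for long-horizon mobile tasks; (7) multimodal evidence integration and multi-hop reasoning; (8) explainability and evaluation tools for understanding retrieval-generation dynamics; (9) dynamic, universal data-source retrieval (AU-RAG); (10) tool- and workflow-orchestration ReAct/RAG. Further work addresses reliability and trustworthy engineering (critic/XAI, controlled access and human participation, framework- and service-based collaboration), and the grouping additionally covers specialized RAG for heterogeneous knowledge forms (multimodal, tabular) and a separate subdivision on reasoning paradigms.
44 related papers in total.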
Retrieval-Augmented Generation (RAG) has emerged as a promising solution to address key challenges faced by GenAI, such as hallucination, outdated or non-removable parametric knowledge, and non-traceable reasoning processes. Existing RAG frameworks introduce dynamism into the RAG process through adaptive, recursive, and interactive usage of the retriever and generator. More recently, agentic RAG adds another layer of intelligence to RAG by leveraging GenAI agents to further enhance dynamism by autonomously planning the retrieval process as a complex orchestration workflow with various external tools. However, current RAG architectures often overlook the significant role that domain experts can play in the retrieval process, alongside passive knowledge bases. This paper introduces a new paradigm for agentic RAG systems, capable of integrating external passive knowledge bases as well as active domain experts. This integration further enhances the versatility and factual accuracy of RAG systems. The paper discusses the key components of this new paradigm and examines the associated design challenges.
Jintao Liang, Sugang, Huifeng Lin, You Wu, Rui Zhao, Ziyue Li. Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics. 2025.
Florian Schneider, Narges Baba Ahmadi, Niloufar Baba Ahmadi, Iris Vogel, Martin Semmann, Chris Biemann. Proceedings of the 1st Workshop on Multimodal Augmented Generation via Multimodal Retrieval (MAGMaR 2025). 2025.
… Agentic RAG extends this framework with autonomous … parison of Traditional and Agentic RAG in terms of architecture, … domain-specialized Agentic RAG frameworks with standardized …
… RAG frameworks remains largely unexplored [2][3]. The integration of agentic capabilities into RAG … This research aims to address these gaps by developing and testing Agentic RAG …
Intrusion Detection and Prevention Systems (IDS/IPS) in large enterprises can generate hundreds of thousands of alerts per hour, overwhelming analysts with logs requiring rapidly evolving expertise. Conventional machine-learning detectors reduce alert volume but still yield many false positives, while standard Retrieval-Augmented Generation (RAG) pipelines often retrieve irrelevant context and fail to justify predictions. We present CyberRAG, a modular agent-based RAG framework that delivers real-time classification, explanation, and structured reporting for cyber-attacks. A central LLM agent orchestrates: (i) fine-tuned classifiers specialized by attack family; (ii) tool adapters for enrichment and alerting; and (iii) an iterative retrieval-and-reason loop that queries a domain-specific knowledge base until evidence is relevant and self-consistent. Unlike traditional RAG, CyberRAG adopts an agentic design that enables dynamic control flow and adaptive reasoning. This architecture autonomously refines threat labels and natural-language justifications, reducing false positives and enhancing interpretability. It is also extensible: new attack types can be supported by adding classifiers without retraining the core agent. CyberRAG was evaluated on SQL Injection, XSS, and SSTI, achieving over 94% accuracy per class and a final classification accuracy of 94.92% through semantic orchestration. Generated explanations reached 0.94 in BERTScore and 4.9/5 in GPT-4-based expert evaluation, with robustness preserved against adversarial and unseen payloads. These results show that agentic, specialist-oriented RAG can combine high detection accuracy with trustworthy, SOC-ready prose, offering a flexible path toward partially automated cyber-defense workflows.
This paper presents the design of a personalized learning agent powered by the Agentic RAG technique. The agent can interpret learners’ queries and autonomously decide which tools should be used to generate the most suitable response. When the learner shares an Open Educational Resource (OER) they wish to learn from, the agent first breaks the content into smaller, manageable chunks. These chunks are then indexed sequentially to preserve the natural flow of the text. At the same time, chunks are also converted into vector embeddings that allow semantic retrieval. Depending on the learner’s request, different tools are selected by the agent. For example, when the learner requests learning aids like summaries, quizzes, or flashcards, the agent invokes the corresponding tool. This tool passes the sequentially indexed chunks to a small language model to generate the output. For context-specific queries, another specialized tool that relies on vector indexing and retrieval-augmented generation (RAG) is invoked. Visual question answering is handled by a separate tool that leverages multimodal RAG using a multimodal small language model. This agentic setup improves the accuracy and relevance of responses generated by the agent. To test its agentic behaviour, we probed our agent with a diverse set of questions drawn from four different OERs. We thoroughly examined each response and tracked the tools that were invoked autonomously. We also compared the similarity of summaries produced by our agent against those generated by ChatGPT (GPT-4o) using BERTScore as the evaluation metric. Our findings indicate that the agent consistently selected the appropriate tools and the summaries generated by our agent showed close semantic similarity to those produced by GPT-4o, suggesting that the proposed approach can provide performance reasonably close to a state-of-the-art model.
Being lightweight, the agent resides on the learner’s local machine and avoids dependence on cloud-based AI, ensuring the privacy of the learner’s data. It is affordable, as it relies entirely on open-source frameworks and small models. As the agent provides personalized support to learners by answering their context-based queries and providing on-demand learning aids, it improves their engagement with the educational content. This research shows that designing agentic AI tools using open-source software to address diverse learning needs is technically and economically feasible as well as educationally valuable.
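The two indexing paths and the tool routing described in this abstract can be sketched as follows. This is a simplified illustration with assumed names (`chunk`, `route`, `handle`): fixed-size word chunks stand in for the paper's chunking, keyword matching stands in for the agent's tool choice, and word overlap stands in for vector-embedding retrieval.

```python
def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into fixed-size word chunks; list order is the sequential index."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def route(request: str) -> str:
    """Pick a tool from the request wording (toy replacement for the agent's decision)."""
    aids = ("summar", "quiz", "flashcard")
    return "learning_aid" if any(a in request.lower() for a in aids) else "context_qa"


def handle(request: str, chunks: list[str]) -> list[str]:
    if route(request) == "learning_aid":
        return chunks  # learning aids read every chunk in original order
    terms = set(request.lower().split())
    # Context questions get only the best-matching chunk.
    return [max(chunks, key=lambda c: len(terms & set(c.lower().split())))]


text = (
    "Photosynthesis converts light energy into chemical energy. "
    "Plants absorb carbon dioxide and water. "
    "Chlorophyll in the leaves captures sunlight. "
    "Oxygen is released as a byproduct."
)
chunks = chunk(text)
quiz_input = handle("Give me a quiz on this", chunks)
qa_input = handle("what captures sunlight in leaves", chunks)
```

The sequential path preserves reading order for summaries and quizzes, while the retrieval path narrows the context for a specific question.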
Retrieval-augmented generation (RAG) has emerged as a pivotal technology in natural language processing, owing to its efficacy in generating factual content. However, its informative inputs and complex paradigms often lead to a greater variety of errors. Consequently, achieving automated on-policy assessment and error-oriented correction remains an unresolved issue. In this paper, we propose RAG-Critic, a novel framework that leverages a critic-guided agentic workflow to improve RAG capabilities autonomously. Specifically, we initially design a data-driven error mining pipeline to establish a hierarchical RAG error system. Based on this system, we progressively align an error-critic model using a coarse-to-fine training objective, which automatically provides fine-grained error feedback. Finally, we design a critic-guided agentic RAG workflow that customizes executor-based solution flows based on the error-critic model’s feedback, facilitating an error-driven self-correction process. Experimental results across seven RAG-related datasets confirm the effectiveness of RAG-Critic, while qualitative analysis offers practical insights for achieving reliable RAG systems. Our dataset and code are available at https://github.com/RUC-NLPIR/RAG-Critic.
Maritime operations are increasingly reliant on sensor data to drive efficiency and enhance decision-making. However, despite rapid advances in large language models, including expanded context windows and stronger generative capabilities, critical industrial settings still require secure, role-constrained access to enterprise data and explicit limitation of model context. Retrieval-Augmented Generation (RAG) remains essential to enforce data minimization, preserve privacy, support verifiability, and meet regulatory obligations by retrieving only permissioned, provenance-tracked slices of information at query time. However, current RAG solutions lack robust validation protocols for numerical accuracy for high-stakes industrial applications. This paper introduces Lighthouse Bot, a novel Agentic RAG system specifically designed to provide natural-language access to complex maritime sensor data, including time-series and relational sensor data. The system addresses a critical need for verifiable autonomous data analysis within the Artificial Intelligence of Things (AIoT) domain, which we explore through a case study on optimizing ferry operations. We present a detailed architecture that integrates a Large Language Model with a specialized database and coding agents to transform natural language into executable tasks, enabling core AIoT capabilities such as generating Python code for time-series analysis, executing complex SQL queries on relational sensor databases, and automating workflows, while keeping sensitive data outside the prompt and ensuring auditable, policy-aligned tool use. To evaluate performance, we designed a test suite of 24 questions with ground-truth answers, categorized by query complexity (simple, moderate, complex) and data interaction type (retrieval, aggregation, analysis). 
Our results show robust, controlled data access with high factual fidelity: the proprietary Claude 3.7 achieved close to 90% overall factual correctness, while the open-source Qwen 72B achieved 66% overall and 99% on simple retrieval and aggregation queries. These findings underscore the need for a secure limited-context RAG in maritime AIoT and the potential for cost-effective automation of routine exploratory analyses.
Personalization has become an essential capability in modern AI systems, enabling customized interactions that align with individual user preferences, contexts, and goals. Recent research has increasingly concentrated on Retrieval-Augmented Generation (RAG) frameworks and their evolution into more advanced agent-based architectures within personalized settings to enhance user satisfaction. Building on this foundation, this survey systematically examines personalization across the three core stages of RAG: pre-retrieval, retrieval, and generation. Beyond RAG, we further extend its capabilities into the realm of Personalized LLM-based Agents, which enhance traditional RAG systems with agentic functionalities, including user understanding, personalized planning and execution, and dynamic generation. For both personalization in RAG and agent-based personalization, we provide formal definitions, conduct a comprehensive review of recent literature, and summarize key datasets and evaluation metrics. Additionally, we discuss fundamental challenges, limitations, and promising research directions in this evolving field. Relevant papers and resources are continuously updated at the GitHub repo.
Retrieval Augmented Generation (RAG) has been effectively used to improve the accuracy of question-answering (Q&A) systems powered by Large Language Models (LLMs) by integrating local knowledge and more up-to-date content. However, traditional RAG methods, including those with re-ranking mechanisms, face challenges when dealing with large, frequently updated data sources or when accessing sources exclusively via APIs, as they require pre-encoding all content into embedding vectors. To address these limitations, we introduce Agent-based Universal RAG (AU-RAG), a novel approach that augments data sources with descriptive metadata, allowing an agent to dynamically search through diverse data pools. This agent-driven system can learn from examples to retrieve and consolidate data from various sources on the fly, functioning as a more flexible and adaptive RAG. We demonstrate AU-RAG's functionality with a financial analysis example and evaluate its performance using a multi-source QA dataset. The results show that AU-RAG performs comparably to RAG with re-ranking in data retrieval tasks while also demonstrating an enhanced ability to intelligently learn and access new data sources from examples, making it a robust solution for dynamic and complex information environments.
Chia-Yuan Chang, Zhimeng Jiang, Vineeth Rakesh, Menghai Pan, Chin-Chia Michael Yeh, Guanchu Wang, Mingzhi Hu, Zhichao Xu, Yan Zheng, Mahashweta Das, Na Zou. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2025.
The rapid evolution of intelligent chatbots has been largely driven by the advent of Large Language Models (LLMs), which have greatly enhanced natural language understanding and generation. However, the fast-paced advancements in generative Artificial Intelligence (AI) and LLM technologies present challenges for developers to stay up-to-date and to select optimal architectures or approaches from a wide range of available options. This survey article addresses these challenges by providing an overview of cutting-edge techniques and architectural application choices in modern generative chatbot development. We explore various approaches involving retrieval strategies, chunking methods, context management, embeddings, and the utilization of LLMs. Furthermore, we analyze paradigms such as naive Retrieval-Augmented Generation (RAG) compared to Graph-Based RAG, as well as single-agent versus multi-agent systems. We examine agent-based methodologies, comparing single-agent systems with multi-agent architectures, and analyze how multi-agent systems can proficiently handle intricate tasks, enhance scalability, and mitigate faults such as hallucinations through collaborative efforts. Additionally, we review tools and frameworks such as LangGraph that facilitate the implementation of stateful, multi-agent LLM applications. By categorizing and analyzing these modern techniques, this survey aims to present the current landscape and future directions in chatbot development.
While Retrieval-Augmented Generation (RAG) augments Large Language Models (LLMs) with external knowledge, conventional single-agent RAG remains fundamentally limited in resolving complex queries demanding coordinated reasoning across heterogeneous data ecosystems. We present HM-RAG, a novel Hierarchical Multi-agent Multimodal RAG framework that pioneers collaborative intelligence for dynamic knowledge synthesis across structured, unstructured, and graph-based data. The framework is composed of a three-tiered architecture with specialized agents: a Decomposition Agent that dissects complex queries into contextually coherent sub-tasks via semantic-aware query rewriting and schema-guided context augmentation; Multi-source Retrieval Agents that carry out parallel, modality-specific retrieval using plug-and-play modules designed for vector, graph, and web-based databases; and a Decision Agent that uses consistency voting to integrate multi-source answers and resolve discrepancies in retrieval results through Expert Model Refinement. This architecture attains comprehensive query understanding by combining textual, graph-relational, and web-derived evidence, resulting in a remarkable 12.95% improvement in answer accuracy and a 3.56% boost in question classification accuracy over baseline RAG systems on the ScienceQA and CrisisMMD benchmarks. Notably, HM-RAG establishes state-of-the-art results in zero-shot settings on both datasets. Its modular architecture ensures seamless integration of new data modalities while maintaining strict data governance, marking a significant advancement in addressing the critical challenges of multimodal reasoning and knowledge synthesis in RAG systems.
Recent advances in large language models (LLMs) have shown great potential to accelerate drug discovery. However, the specialized nature of biochemical data often necessitates costly domain-specific fine-tuning, posing critical challenges. First, it hinders the application of more flexible general-purpose LLMs in cutting-edge drug discovery tasks. More importantly, it limits the rapid integration of the vast amounts of scientific data continuously generated through experiments and research. Compounding these challenges is the fact that real-world scientific questions are typically complex and open-ended, requiring reasoning beyond pattern matching or static knowledge retrieval. To address these challenges, we propose CLADD, a retrieval-augmented generation (RAG)-empowered agentic system tailored to drug discovery tasks. Through the collaboration of multiple LLM agents, CLADD dynamically retrieves information from biomedical knowledge bases, contextualizes query molecules, and integrates relevant evidence to generate responses - all without the need for domain-specific fine-tuning. Crucially, we tackle key obstacles in applying RAG workflows to biochemical data, including data heterogeneity, ambiguity, and multi-source integration. We demonstrate the flexibility and effectiveness of this framework across a variety of drug discovery tasks, showing that it outperforms general-purpose and domain-specific LLMs as well as traditional deep learning approaches.
Mobile agents show immense potential, yet current state-of-the-art (SoTA) agents exhibit inadequate success rates on real-world, long-horizon, cross-application tasks. We attribute this bottleneck to the agents' excessive reliance on static, internal knowledge within MLLMs, which leads to two critical failure points: 1) strategic hallucinations in high-level planning and 2) operational errors during low-level execution on user interfaces (UI). The core insight of this paper is that high-level planning and low-level UI operations require fundamentally distinct types of knowledge. Planning demands high-level, strategy-oriented experiences, whereas operations necessitate low-level, precise instructions closely tied to specific app UIs. Motivated by these insights, we propose Mobile-Agent-RAG, a novel hierarchical multi-agent framework that innovatively integrates dual-level retrieval augmentation. At the planning stage, we introduce Manager-RAG to reduce strategic hallucinations by retrieving human-validated comprehensive task plans that provide high-level guidance. At the execution stage, we develop Operator-RAG to improve execution accuracy by retrieving the most precise low-level guidance for accurate atomic actions, aligned with the current app and subtask. To accurately deliver these knowledge types, we construct two specialized retrieval-oriented knowledge bases. Furthermore, we introduce Mobile-Eval-RAG, a challenging benchmark for evaluating such agents on realistic multi-app, long-horizon tasks. Extensive experiments demonstrate that Mobile-Agent-RAG significantly outperforms SoTA baselines, improving task completion rate by 11.0% and step efficiency by 10.2%, establishing a robust paradigm for context-aware, reliable multi-agent mobile automation.
We study the automated derivation of safety requirements in a self-driving vehicle use case, leveraging LLMs in combination with agent-based retrieval-augmented generation. Conventional approaches that utilise pre-trained LLMs to assist in safety analyses typically lack domain-specific knowledge. Existing RAG approaches address this issue, yet their performance deteriorates when handling complex queries and it becomes increasingly hard to retrieve the most relevant information. This is particularly relevant for safety-relevant applications. In this paper, we propose the use of agent-based RAG to derive safety requirements and show that the retrieved information is more relevant to the queries. We implement an agent-based approach on a document pool of automotive standards and the Apollo case study, as a representative example of an automated driving perception system. Our solution is tested on a data set of safety requirement questions and answers, extracted from the Apollo data. Evaluating a set of selected RAG metrics, we present and discuss advantages of an agent-based approach compared to default RAG methods.
The contemporary digital landscape overwhelms visitors with fragmented and dynamic information, complicating travel planning and often leading to decision paralysis. This paper presents a real-world case study on the design and deployment of an intelligent tourism assistant for Valencia, Spain, built upon a Retrieval-Augmented Generation (RAG) architecture. To address the complexity of integrating static attraction data, live events, and geospatial context, we implemented a multi-agent system orchestrated via the ReAct (Reason + Act) paradigm, comprising specialized Retrieval, Events, and Geospatial Agents. Powered by a large language model, the system unifies heterogeneous data sources—including official tourism repositories and OpenStreetMap—within a single conversational interface. Our contribution centers on practical insights and engineering lessons from developing RAG in an operational urban tourism environment. We outline data preprocessing strategies, such as coreference resolution, to improve contextual consistency and reduce hallucinations. System performance is evaluated using Retrieval Augmented Generation Assessment (RAGAS) metrics, yielding quantitative results that assess both retrieval efficiency and generation quality, with the Mistral Small 3.1 model achieving an Answer Relevancy score of 0.897. Overall, this work highlights both the challenges and advantages of using agent-based RAG to manage urban-scale information complexity, providing guidance for developers aiming to build trustworthy, context-aware AI systems for smart destination management.
… -based IoT architecture integrated with an Agent-Based Service Architecture (ASA) that orchestrates specialized agents to … -Agent grounded via Retrieval-Augmented Generation (RAG) …
Agentic AI framework for QA system development using Retrieval-Augmented Generation (RAG). The methodology integrates document extraction, contextual chunking, embedding-based retrieval, question generation with semantic deduplication, and multi-agent orchestration. The system leverages the LLaMA 3 70B model deployed on 8 × NVIDIA A100 GPUs, ensuring high fidelity and efficiency. Experimental results demonstrate strong retrieval performance (Precision@5 = 0.91, Recall@5 = 0.88, MRR = 0.87), semantic coverage (0.93), factual grounding (0.88), and a 95% end-to-end success rate in autonomous agentic orchestration. Question uniqueness (82%) and balanced multiple-choice distribution highlight its pedagogical value. The study contributes both applied outcomes—a scalable AI assistant and question bank generator—and theoretical insights into the evolution from RAG to Agentic AI.
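The retrieval metrics named in this abstract (Precision@5, Recall@5, MRR) can be illustrated with a minimal Python sketch. The document IDs and relevance sets below are hypothetical and not taken from the study; the definitions are the standard ones and may differ from the authors' exact evaluation protocol.

```python
from typing import List, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved items that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (ranking, relevant set) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


# Hypothetical query: five retrieved chunks, three relevant chunks overall.
ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d4"}
p5 = precision_at_k(ranked, relevant, 5)
r5 = recall_at_k(ranked, relevant, 5)
mrr = mean_reciprocal_rank([(ranked, relevant), (["a", "b"], {"a"})])
```

In this toy case two of the five retrieved chunks are relevant, one relevant chunk is missed entirely, and the first relevant hit sits at rank 2.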
The development of large language models (LLMs) is advancing rapidly, with recent Retrieval-Augmented Generation (RAG) and Agentic RAG systems combining external knowledge to improve factual accuracy and extending autonomous reasoning, multi-step planning, and tool interaction to address complex tasks. The complex composition of retrieval, planning, and generation processes in Agentic RAG systems poses considerable difficulty for transparency, accountability, and user trust. Explainable AI (XAI) in Agentic RAG is relatively unexplored, with divergent approaches and no agreed-upon evaluation methods. This survey provides an in-depth, systematic overview of XAI methods suitable for advanced Agentic RAG techniques, covering component-specific approaches for retrievers, planners/agents, and generators, as well as end-to-end pipeline-level explanations. Following the PRISMA and PICO (Population, Intervention, Comparison, Outcome) guidelines, articles published from 2017 to 2025 were collected from IEEE Xplore (61 articles), PubMed (31 articles), and other databases (25 articles). The Rayyan AI tool was used to remove duplicates and screen the articles, which were reviewed first by title and abstract and then by full text. The quality of the selected articles was assessed with the Meta Quality Appraisal Tool (MetaQAT); after screening, 39 records were excluded. In the full-text review of 78 articles, 34 lacked AI relevance and were excluded. Ultimately, 44 key articles were identified for their contribution to LLMs, Agentic RAG, and Explainable AI. This review provides a comprehensive analysis of XAI techniques in Agentic RAG, guiding future research and advancing the fields of XAI, LLMs, and agents.
Although Vision Language Models (VLMs) have shown generalization in medical imaging, pathology presents unique challenges due to ultra-high resolution, complex tissue structures, and nuanced semantics. These factors make pathology VLMs prone to hallucinations, i.e., generating outputs inconsistent with visual evidence, which undermines clinical trust. Existing RAG approaches in this domain largely depend on text-based knowledge bases, limiting their ability to leverage diagnostic visual cues. To address this, we propose Patho-AgenticRAG, a multimodal RAG framework with a database built on page-level embeddings from authoritative pathology textbooks. Unlike traditional text-only retrieval systems, it supports joint text–image search, enabling retrieval of textbook pages that contain both the queried text and relevant visual cues, thus avoiding the loss of critical image-based information. Patho-AgenticRAG also supports reasoning, task decomposition, and multi-turn search interactions, improving accuracy in complex diagnostic scenarios. Experiments show that Patho-AgenticRAG significantly outperforms existing multimodal models in complex pathology tasks like multiple-choice diagnosis and visual question answering.
… contextual prompts for caption generation. However, existing … relationship memory for Retrieval-Augmented Generation. It … We have further proposed a multi-agent-based approach to …
… , Multi-Agent Retrieval-Augmented Generation (MA-RAG), … agents responsible for query reformulation, iterative retrieval … -making across the retrieval and generation pipeline. We …
… language understanding, content generation, and reasoning. … retrieval-augmented generation (RAG) technologies to transform ITS and CVs into intelligent, human-centric multiagent …
PURPOSE The purpose of the study is to demonstrate the value of custom methods, namely Retrieval Augmented Generation (RAG)-based Large Language Models (LLMs) and Agentic Augmentation, over standard LLMs in delivering accurate information using an anterior cruciate ligament (ACL) injury case. METHODS A set of 100 questions and answers based on the 2022 AAOS ACL guidelines was curated. Closed-source (OpenAI GPT4/GPT 3.5 and Anthropic's Claude3) and open-source models (Llama3 8b/70b and Mistral8x7b) were asked questions in base form and again with AAOS guidelines embedded into a RAG system. The top-performing models were further augmented with Artificial Intelligence (AI) Agents and re-evaluated. Two fellowship-trained surgeons blindly evaluated the accuracy of the responses of each cohort. ROUGE and METEOR scores were calculated to assess semantic similarity in the response. RESULTS All non-custom LLM models started below 60% accuracy. Applying RAG improved the accuracy of every model by an average of 39.7%. The highest performing model with just RAG was Meta's open-source Llama3 70b (94%). The highest performing model with RAG and AI Agents was OpenAI's GPT4 (95%). CONCLUSION RAG improved accuracy by an average of 39.7%, with the highest accuracy rate of 94% in the Meta Llama3 70b. Incorporating AI agents into a previously RAG-augmented LLM improved the ChatGPT4 accuracy rate to 95%. Thus, Agentic and RAG-augmented LLMs can be accurate liaisons of information, supporting our hypothesis. CLINICAL RELEVANCE Despite literature surrounding the use of LLMs in medicine, there has been considerable and appropriate skepticism given the variably accurate response rates. This study establishes the groundwork to identify whether custom modifications to LLMs using RAG and Agentic augmentation can better deliver accurate information in orthopaedic care. With this knowledge, online medical information commonly sought in popular LLMs, such as ChatGPT, can be standardized and provide relevant online medical information to better support shared decision making between surgeon and patient.
Multi-agent retrieval-augmented generation (RAG) systems are increasingly explored as advanced architectures for clinical decision support, combining information retrieval, reasoning, and verification through coordinated agent interactions. This study systematically reviews applications of agentic and multi-agent RAG in clinical decision support systems (CDSS) and synthesizes an integrative conceptual framework linking technical design to technology adoption considerations. Following PRISMA guidelines, searches were conducted in PubMed, IEEE Xplore, and ScienceDirect using structured Boolean strings combining terms for multi-agent architectures, RAG, and CDSS. The search yielded 12 studies published between 2020 and 2025 that met the inclusion criteria. The review synthesizes evidence on multi-agent role configurations, retrieval and reasoning strategies, verification mechanisms, and reported clinical contexts. Across studies, dominant challenges include data and corpus limitations, retrieval quality dependency, limited clinical validation, and computational overhead, alongside governance concerns such as privacy, bias, and accountability. Building on the synthesis, we propose a four-agent CDSS framework (retriever, reasoner, verifier, safety) and map its deployment determinants to Technology Acceptance Model constructs (perceived usefulness, perceived ease of use, trust) and Diffusion of Innovations attributes. The review concludes with design-oriented recommendations for safer, explainable, and adoption-ready multi-agent RAG CDSS, particularly for low-resource contexts.
… iterative retrieval, a novel framework that empowers retrievers to make iterative decisions … We propose iterative retrieval to address this problem. Unlike traditional retrievers that perform …
… However, prior iterative-retrieval approaches typically optimize only the retriever’s … Iterative Retrieval for Retrieval-Augmented Generation (AIR-RAG), an adaptive, iterative retrieval …
Retrieval-Augmented Generation (RAG) offers a promising strategy to harness large language models (LLMs) for delivering up-to-date, accurate clinical guidance while reducing physicians’ cognitive burden, yet its effectiveness hinges on query clarity and structure. We propose an adaptive Self-Query Retrieval (SQR) framework that integrates three refinement modules—PICOT (Population, Intervention, Comparison, Outcome, Time), SPICE (Setting, Population, Intervention, Comparison, Evaluation), and Iterative Query Refinement (IQR)—to automatically restructure and iteratively enhance clinical questions until they meet predefined retrieval-quality thresholds. Implemented on Gemini-1.0 Pro, we benchmarked SQR using thirty postoperative rhinoplasty queries, evaluating responses for accuracy and relevance on a three-point Likert scale and for retrieval quality via precision, recall, and F1 score; statistical significance was assessed by one-way ANOVA with Tukey post-hoc testing. The full SQR pipeline achieved 87% accuracy (Likert 2.4 ± 0.7) and 100% relevance (Likert 3.0 ± 0.0), significantly outperforming a non-refined RAG baseline (50% accuracy, 80% relevance; p < 0.01 and p = 0.03). Precision, recall, and F1 rose from 0.17, 0.39 and 0.24 to 0.53, 1.00, and 0.70, respectively, while PICOT-only and SPICE-only variants yielded intermediate improvements. These findings demonstrate that automated structuring and iterative enhancement of queries via SQR substantially elevate LLM-based clinical decision support, and its model-agnostic architecture enables rapid adaptation across specialties, data sources, and LLM platforms.
… The proposed system adheres to the core principles of agent autonomy and goal-directed behavior through two interconnected mechanisms: an iterative decision cycle and dynamic …
Multi-hop question answering is a challenging task with distinct industrial relevance, and Retrieval-Augmented Generation (RAG) methods based on large language models (LLMs) have become a popular approach to tackle this task. Owing to the potential inability to retrieve all necessary information in a single iteration, a series of iterative RAG methods has been recently developed, showing significant performance improvements. However, existing methods still face two critical challenges: context overload resulting from multiple rounds of retrieval, and over-planning and repetitive planning due to the lack of a recorded retrieval trajectory. In this paper, we propose a novel iterative RAG method called ReSP, equipped with a dual-function summarizer. This summarizer compresses information from retrieved documents, targeting both the overarching question and the current sub-question concurrently. Experimental results on the multi-hop question-answering datasets HotpotQA and 2WikiMultihopQA demonstrate that our method significantly outperforms the state-of-the-art, and exhibits excellent robustness concerning context length.
Fact-checking long-form text is challenging, and it is therefore common practice to break it down into multiple atomic claims. The typical approach to fact-checking these atomic claims involves retrieving a fixed number of pieces of evidence, followed by a verification step. However, this method is usually not cost-effective, as it underutilizes the verification model's internal knowledge of the claim and fails to replicate the iterative reasoning process in human search strategies. To address these limitations, we propose FIRE, a novel agent-based framework that integrates evidence retrieval and claim verification in an iterative manner. Specifically, FIRE employs a unified mechanism to decide whether to provide a final answer or generate a subsequent search query, based on its confidence in the current judgment. We compare FIRE with other strong fact-checking frameworks and find that it achieves slightly better performance while reducing large language model (LLM) costs by an average of 7.6 times and search costs by 16.5 times. These results indicate that FIRE holds promise for application in large-scale fact-checking operations.
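The answer-or-search decision described in this abstract can be sketched as a small control loop. This is a minimal illustration and not FIRE's actual implementation: the `Step` record, the `judge` and `search` callables, the confidence threshold, and the step limit are all hypothetical stand-ins for an LLM call and a search API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    verdict: Optional[str]     # proposed label, or None if undecided
    confidence: float          # the judge's confidence in that label
    next_query: Optional[str]  # search query to issue if not confident


def iterative_verify(
    claim: str,
    judge: Callable[[str, List[str]], Step],
    search: Callable[[str], List[str]],
    threshold: float = 0.8,   # assumed cut-off, not from the paper
    max_steps: int = 5,       # assumed budget, not from the paper
) -> Tuple[str, List[str]]:
    """Alternate between judging and searching until confident or out of budget."""
    evidence: List[str] = []
    step = judge(claim, evidence)  # first judgment uses internal knowledge only
    for _ in range(max_steps):
        if step.verdict is not None and step.confidence >= threshold:
            break  # confident enough: stop without further searches
        if not step.next_query:
            break  # nothing more to look up
        evidence.extend(search(step.next_query))
        step = judge(claim, evidence)
    return step.verdict or "unverified", evidence


# Toy stand-ins: the judge becomes confident once any evidence is available.
def toy_judge(claim: str, evidence: List[str]) -> Step:
    if evidence:
        return Step("supported", 0.9, None)
    return Step(None, 0.3, claim)


def toy_search(query: str) -> List[str]:
    return ["doc about " + query]


verdict, used = iterative_verify("water boils at 100 C at sea level", toy_judge, toy_search)
```

Because the loop stops as soon as the judge is confident, claims that the model can already settle trigger no search at all, which is the cost-saving behavior the abstract highlights.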
… -augmented generation (RAG)-based multi-agent LLM system to … The architecture employs a user-centered, multi-agent … , and scientific literature through an RAG framework, the system …
Modern energy platforms are increasingly leveraging Artificial Intelligence (AI) for effective decision-making and efficient operations. This has led to the development of expansive data spaces that comprise both structured and unstructured energy data in various modalities. Conversational agents with the most recent advancements in Large Language Models (LLM) are primed to facilitate the efficient retrieval of this diverse information for decision support. In this paper, we propose a multi-agent chatbot architecture for decision support in net-zero emissions energy systems, leveraging LLMs and Retrieval-Augmented Generation (RAG). This architecture consists of a Chatbot User Interface (UI), an advanced Natural Language Understanding (NLU) module for precise entity and intent recognition, a robust Chatbot Core with four specialized agents: Observer, Knowledge Retriever, Behavior Analyzer, and Visualizer and Response Construction Module. These components work together to address diverse decision support needs in energy environments, specifically for net zero carbon emissions initiatives that need to consider diverse parameters and large volumes of data. We showcase the chatbot's successful integration and evaluation for decision support in the net-zero emissions energy system of a large tertiary education institution.
Businesses heavily rely on data sourced from various channels like news articles, financial reports, and consumer reviews to drive their operations, enabling informed decision-making and identifying opportunities. However, traditional manual methods for data extraction are often time-consuming and resource-intensive, prompting the adoption of digital transformation initiatives to enhance efficiency. Yet, concerns persist regarding the sustainability of such initiatives and their alignment with the United Nations (UN)'s Sustainable Development Goals (SDGs). This research aims to explore the integration of Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) as a sustainable solution for Information Extraction (IE) and processing. The research methodology involves reviewing existing solutions for business decision-making, noting that many systems require training new machine learning models, which are resource-intensive and have significant environmental impacts. Instead, we propose a sustainable business solution using pre-existing LLMs that can work with diverse datasets. We link domain-specific datasets to tailor LLMs to company needs and employ a Multi-Agent architecture to divide tasks such as information retrieval, enrichment, and classification among specialized agents. This approach optimizes the extraction process and improves overall efficiency. Through the utilization of these technologies, businesses can optimize resource utilization, improve decision-making processes, and contribute to sustainable development goals, thereby fostering environmental responsibility within the corporate sector.
Entity resolution in real-world datasets remains a persistent challenge, particularly in identifying households and detecting co-residence patterns within inconsistent and incomplete data. Recent advances using Large Language Models (LLMs) show promise but continue to struggle with scalability, interpretability, and task complexity when applied as single, monolithic systems. This study introduces a multi-agent Retrieval-Augmented Generation (RAG) framework that decomposes household entity resolution into coordinated and specialized agents. The system, implemented using LangGraph, includes four agents: a Direct Agent for name-based matching, an Indirect Agent for transitive linkage, a Household Agent for address-based clustering, and a Household Moves Agent for tracking residential relocations. Each agent employs a task-specific RAG retrieval strategy and a hybrid data cleaning pipeline that integrates rule-based and LLM-powered parsing. Evaluated on synthetic S12PX dataset segments containing 200–300 records with extensive duplicates and data quality issues, the framework achieved 94.3% accuracy on name variations, complete decision transparency, and a 61% reduction in API calls compared to single-LLM approaches. These results demonstrate that coordinated agent specialization enhances accuracy, efficiency, and interpretability, establishing a scalable paradigm for entity resolution applicable to census operations, healthcare, and other structured data domains.
Retrieval-augmented generation has attracted extensive attention as a promising way to address the limitations of large language models, including outdated knowledge and hallucinations. However, retrievers struggle to capture relevance, especially for queries with complex information needs. Recent work has proposed to improve relevance modeling by having large language models actively involved in retrieval, i.e., to guide retrieval with generation. In this paper, we show that strong performance can be achieved by a method we call Iter-RetGen, which synergizes retrieval and generation in an iterative manner: a model's response to a task input shows what might be needed to finish the task, and thus can serve as an informative context for retrieving more relevant knowledge, which in turn helps generate a better response in another iteration. Compared with recent work which interleaves retrieval with generation when completing a single output, Iter-RetGen processes all retrieved knowledge as a whole and largely preserves the flexibility in generation without structural constraints. We evaluate Iter-RetGen on multi-hop question answering, fact verification, and commonsense reasoning, and show that it can flexibly leverage parametric knowledge and non-parametric knowledge, and is superior to or competitive with state-of-the-art retrieval-augmented baselines while incurring less retrieval and generation overhead. We can further improve performance via generation-augmented retrieval adaptation.
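The retrieve-then-generate iteration described in this abstract can be sketched in a few lines. This is a simplified reading of the abstract, not the authors' code: `retrieve` and `generate` are hypothetical callables standing in for a retriever and an LLM, and forming the next query by concatenating the question with the previous response is an assumption of this sketch.

```python
from typing import Callable, List


def iter_retgen(
    question: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
    iterations: int = 2,  # assumed default for illustration
) -> str:
    """Each round retrieves with the question plus the previous response,
    then regenerates the whole answer from the newly retrieved passages."""
    answer = ""
    for _ in range(iterations):
        # The first round uses the question alone; later rounds add the draft.
        query = f"{question} {answer}".strip()
        passages = retrieve(query)
        answer = generate(question, passages)
    return answer


# Toy stand-ins that record the queries issued to the retriever.
queries: List[str] = []


def toy_retrieve(query: str) -> List[str]:
    queries.append(query)
    return ["p" + str(len(queries))]


def toy_generate(question: str, passages: List[str]) -> str:
    return "draft from " + passages[0]


final = iter_retgen("Q?", toy_retrieve, toy_generate, iterations=2)
```

The recorded queries show the mechanism: the second retrieval is conditioned on the first draft, so a draft that mentions an intermediate entity can pull in passages the bare question would miss.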
The rapid advancement of generative artificial intelligence models significantly influences modern methods of information processing and user interactions with information systems. One of the promising areas in this domain is Retrieval-Augmented Generation (RAG), which combines generative models with information retrieval methods to enhance the accuracy and relevance of responses. However, most existing RAG systems primarily focus on textual data, which does not meet contemporary needs for multimodal information processing (text, images, tables). The research object of this work is a multimodal RAG system based on ReAct agent logic, capable of multi-hop reasoning. The main emphasis is placed on integrating textual, graphical, and tabular information to generate accurate, complete, and relevant responses. The system's implementation utilized the ChromaDB vector storage, the OpenAI embedding generation model (text-embedding-ada-002), and the GPT-4 language model. The purpose of the study is the development, deployment, and empirical evaluation of the proposed multimodal RAG system based on the ReAct agent approach, capable of effectively integrating diverse knowledge sources into a unified informational context. The experimental evaluation utilized the Global Tuberculosis Report 2024 by the World Health Organization, containing various textual, graphical, and tabular data. A specialized test set of 50 queries (30 textual, 10 tabular, 10 graphical) was created for empirical analysis, allowing comprehensive testing of all aspects of multimodal integration. The research employed methods such as semantic vector search, multi-hop agent-based planning with ReAct logic, and evaluations of answer accuracy, answer recall, and response latency. Additionally, an analysis of response speed dependence on query volume was conducted. The obtained results confirmed the high efficiency of the proposed approach. 
The system demonstrated an answer accuracy of 92%, answer recall of 89%, and ensured complete (100%) coverage of all data types. The average response time was approximately 5 seconds, meeting interactive system requirements. Optimal parameters were experimentally determined (for example, parameter k = 6, classification threshold 0.35, and up to three reasoning iterations), ensuring the best balance among completeness, speed, and operational efficiency. The study's findings highlighted significant advantages of the multimodal agent-based approach compared to traditional textual RAG solutions, confirming the promising direction for further research.
Retrieval-Augmented Generation (RAG) has demonstrated considerable effectiveness in open-domain question answering. However, when applied to heterogeneous documents, comprising both textual and tabular components, existing RAG approaches exhibit critical limitations. The prevailing practice of flattening tables and chunking strategies disrupts the intrinsic tabular structure, leads to information loss, and undermines the reasoning capabilities of LLMs in multi-hop, global queries. To address these challenges, we propose TableRAG, an SQL-based framework that unifies textual understanding and complex manipulations over tabular data. TableRAG iteratively operates in four steps: context-sensitive query decomposition, text retrieval, SQL programming and execution, and compositional intermediate answer generation. We also develop HeteQA, a novel benchmark designed to evaluate the multi-hop heterogeneous reasoning capabilities. Experimental results demonstrate that TableRAG consistently outperforms existing baselines on both public datasets and our HeteQA, establishing a new state-of-the-art for heterogeneous document question answering. We release TableRAG at https://github.com/yxh-y/TableRAG/tree/main.
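The "SQL programming and execution" step in this abstract can be illustrated with Python's built-in `sqlite3` module. This is only a sketch of the general idea of querying a table in place instead of flattening it into text chunks; the `sales` table, its rows, and the hand-written query are hypothetical, and in the described system the SQL would be produced by the framework and not written by hand.

```python
import sqlite3
from typing import List, Tuple


def run_sql_step(rows: List[Tuple[str, int, float]], sql: str) -> list:
    """Load rows into an in-memory table and execute one SQL sub-query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, revenue REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Hypothetical table content.
rows = [("North", 2023, 120.0), ("South", 2023, 95.5), ("North", 2024, 140.0)]

# A global aggregation that is hard to answer from flattened text chunks,
# because the relevant cells may land in different chunks.
sql = "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
totals = run_sql_step(rows, sql)
```

The result of such a step would then serve as an intermediate answer to be composed with retrieved text in the next iteration.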
We introduce ChemReactSeek, an advanced artificial intelligence platform that integrates retrieval-augmented generation using large language models (LLMs) to automate the design of chemical reaction protocols. The system employs DeepSeek-v3 to extract and structure data from scientific literature, enabling the construction of a specialized knowledge base focused on hydrogenation reactions. By combining FAISS-based semantic search with LLM-driven reasoning, ChemReactSeek generates executable reaction conditions, which we further validate through experiments on heterogeneous hydrogenation.
Large language models (LLMs) have the remarkable ability to solve new tasks with just a few examples, but they need access to the right tools. Retrieval Augmented Generation (RAG) addresses this problem by retrieving a list of relevant tools for a given task. However, RAG's tool retrieval step requires all the required information to be explicitly present in the query. This is a limitation, as semantic search, the widely adopted tool retrieval method, can fail when the query is incomplete or lacks context. To address this limitation, we propose Context Tuning for RAG, which employs a smart context retrieval system to fetch relevant information that improves both tool retrieval and plan generation. Our lightweight context retrieval model uses numerical, categorical, and habitual usage signals to retrieve and rank context items. Our empirical results demonstrate that context tuning significantly enhances semantic search, achieving a 3.5-fold and 1.5-fold improvement in Recall@K for context retrieval and tool retrieval tasks respectively, and resulting in an 11.6% increase in LLM-based planner accuracy. Additionally, we show that our proposed lightweight model using Reciprocal Rank Fusion (RRF) with LambdaMART outperforms GPT-4 based retrieval. Moreover, we observe context augmentation at plan generation, even after tool retrieval, reduces hallucination.
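Reciprocal Rank Fusion, mentioned in this abstract, combines several ranked lists by summing 1 / (k + rank) for each item across the lists. The sketch below shows only that fusion step, under stated assumptions: the item names are hypothetical, k = 60 is the commonly used default and not a value reported in the abstract, and how the authors pair RRF with LambdaMART is not shown here.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: score(item) = sum over lists of 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Sort by descending score; break ties by name for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Three hypothetical rankers (e.g. one per usage signal) over context items.
rankings = [
    ["a", "b", "c"],
    ["b", "c", "a"],
    ["b", "a", "d"],
]
fused = reciprocal_rank_fusion(rankings)
```

Item `b` ends up first because it is ranked at or near the top by every list, even though `a` leads one of them; items seen by only one ranker fall to the bottom.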
RAGTrace: Understanding and Refining Retrieval-Generation Dynamics in Retrieval-Augmented Generation
Retrieval-Augmented Generation (RAG) systems have emerged as a promising solution to enhance large language models (LLMs) by integrating external knowledge retrieval with generative capabilities. While significant advancements have been made in improving retrieval accuracy and response quality, a critical challenge remains that the internal knowledge integration and retrieval-generation interactions in RAG workflows are largely opaque. This paper introduces RAGTrace, an interactive evaluation system designed to analyze retrieval and generation dynamics in RAG-based workflows. Informed by a comprehensive literature review and expert interviews, the system supports a multi-level analysis approach, ranging from high-level performance evaluation to fine-grained examination of retrieval relevance, generation fidelity, and cross-component interactions. Unlike conventional evaluation practices that focus on isolated retrieval or generation quality assessments, RAGTrace enables an integrated exploration of retrieval-generation relationships, allowing users to trace knowledge sources and identify potential failure cases. The system’s workflow allows users to build, evaluate, and iterate on retrieval processes tailored to their specific domains of interest. The effectiveness of the system is demonstrated through case studies and expert evaluations on real-world RAG applications.
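After merging, research on agentic RAG can be grouped into the following parallel themes: (1) overall positioning of the paradigm; (2) hierarchical/collaborative multi-agent architectures; (3) iterative retrieval and process control over retrieval trajectories; (4) closed loops for fact-checking and safety compliance (retrieve, verify, stop); (5) domain-specific multi-agent deployments (clinical, decision support, hazards, etc.); (6) dual-level retrieval augmentation for long-horizon mobile tasks; (7) multimodal evidence integration and multi-hop reasoning; (8) explainability and evaluation tools for understanding retrieval-generation dynamics; (9) retrieval over dynamic, general-purpose data sources (AU-RAG); (10) tool/workflow-orchestrating ReAct/RAG; as well as work on reliability and trustworthiness engineering (critic/XAI, controlled access and human-in-the-loop, framework/service-oriented collaboration), further covering RAG specialized for heterogeneous knowledge forms (multimodal, tabular) and a few finer-grained reasoning paradigms.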