Design and Implementation of an Intelligent Question-Generation System Based on Large Language Models and Knowledge Graphs
Automated Construction and Ontology Modeling of Educational Knowledge Graphs
This cluster addresses the data foundation of intelligent question-generation systems: using LLMs to extract entities and relations from heterogeneous educational resources, build dynamic knowledge graphs, design course ontologies, and perform graph completion, thereby solving the problem of structuring domain knowledge. A minimal code sketch of the shared extraction pipeline follows the reference list below.
- An Architectural Framework for Educational Knowledge Graphs (IEEE P2807.6): Ontology Design, LLM Integration, and Adaptive Learning Applications(Bin Xu, Richard Tong, Yanyan Li, Penghe Chen, Hanming Li, Joleen Liang, Xing Fan, Jessie Tong, 2025, 2025 IEEE Conference on Artificial Intelligence (CAI))
- Algorithm for Constructing Educational Dynamic Knowledge Graph and Predicting Intervention Nodes Based on LLM-GNN Fusion(Yixuan Song, Lingyue Fu, 2025, 2025 IEEE 8th International Conference on Information Systems and Computer Aided Education (ICISCAE))
- LLM-Assisted Knowledge Graph Completion for Curriculum and Domain Modelling in Personalized Higher Education Recommendations(Hasan Abu-Rasheed, Constance Jumbo, Rashed Al Amin, Christian Weber, Veit Wiese, Roman Obermaisser, M. Fathi, 2025, 2025 IEEE Global Engineering Education Conference (EDUCON))
- AI赋能下的基于知识图谱的《大数据运维》课程教学改革与探索(Unknown Authors, 2026, Unknown Journal)
- 基于知识图谱的信号与系统课程在线学习系统(Unknown Authors, Unknown Journal)
- AI双师赋能线上线下混合式教学的风险挑战及实施路径(Unknown Authors, Unknown Journal)
- LLM-based Multi-Level Knowledge Generation for Few-shot Knowledge Graph Completion(Qian Li, Zhuo Chen, Cheng Ji, Shiqi Jiang, Jianxin Li, 2024, Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence)
- Integrating machine learning and a large language model to construct a domain knowledge graph for reducing the risk of fall-from-height accidents(Zhipeng Zhou, Xinhui Yu, Joseph Jonathan Magoua, Jianqiang Cui, Haiying Luan, Dong Lin, 2025, Accident Analysis & Prevention)
- Large language model assisted fine-grained knowledge graph construction for robotic fault diagnosis(Xingming Liao, Chong Chen, Zhuowei Wang, Ying Liu, Tao Wang, Lianglun Cheng, 2025, Advanced Engineering Informatics)
- 长安大学交通运输本科专业知识图谱构建(Unknown Authors, Unknown Journal)
- Aircraft Fault Knowledge Graph Construction Based On Large Language Model Incorporating Chinese Airworthiness Knowledge(Yi Fan, Yu Sun, Baigang Mi, Xi Fu, 2025, International Journal of Software Engineering and Knowledge Engineering)
- 知识图谱与人工智能驱动的流行病学教学创新研究(Unknown Authors, 2026, Unknown Journal)
- Constructing a Knowledge System for Traditional Chinese Patent Medicine Using Large Language Model and Knowledge Graphs: A Case Study of TCPM-KG(Peng Yang, Yan Wang, Wenhao Yang, 2025, 2025 IEEE International Conference on Bioinformatics and Biomedicine (BIBM))
- Knowledge Graph Large Language Model (KG-LLM) for Link Prediction(Dong Shu, Tianle Chen, Mingyu Jin, Yiting Zhang, Chong Zhang, Mengnan Du, Yongfeng Zhang, 2024, Asian Conference on Machine Learning)
- Building AI Competency Knowledge Graphs with LLMs: From Job Market Analysis to Educational Guidance(Zhuoyuan Tang, Wei Wei, Yi Yang, Shile Zhang, Chi Kin Lam, 2025, 2025 IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE))
- From Chaos to Clarity: A Knowledge Graph-Driven Audit Dataset Generation Framework for LLM Unlearning(Weipeng Jiang, Juan Zhai, Shiqing Ma, Ziyan Lei, Xiaofei Xie, Yige Wang, Chao Shen, 2026, Proceedings of the AAAI Conference on Artificial Intelligence)
- Cross-Data Knowledge Graph Construction for LLM-enabled Educational Question-Answering System: A Case Study at HCMUT(Tuan Bui, Oanh Tran, Phuong Nguyen, B. Ho, Long Song Thien Nguyen, Thang Bui, Tho Quan, 2024, Proceedings of the 1st ACM Workshop on AI-Powered Q&A Systems for Multimedia)
- A Large Language Model-Driven Framework for Automated Knowledge Graph Construction for Aircraft Fault Diagnosis(Lianyu Sun, Yujie Jin, Xilang Tang, Bin Hu, 2025, 2025 International Conference on Data Science and Edge Computing (ICDSEC))
- Exploring Multi-aspect Information for Knowledge Graph Completion with Large Language Model(Linghui Wang, Jiawei Sheng, Wenyuan Zhang, Tingwen Liu, Chuang Zhang, 2025, 2025 International Joint Conference on Neural Networks (IJCNN))
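Across this cluster, the recurring pipeline is prompting an LLM to emit (head, relation, tail) triples from course text and accumulating them into a graph. The following is a minimal sketch, assuming a generic `llm` callable and illustrative prompt wording; neither is taken from any specific paper above.

```python
# Minimal sketch of LLM-driven triple extraction for a course knowledge graph.
# `llm` is any callable mapping a prompt string to a completion string
# (e.g., a wrapper around a local or hosted model); names are illustrative.
import json
import networkx as nx

EXTRACTION_PROMPT = """Extract (head, relation, tail) triples about course concepts
from the text below. Answer with a JSON list of 3-element lists only.

Text: {text}"""

def extract_triples(text: str, llm) -> list[tuple[str, str, str]]:
    raw = llm(EXTRACTION_PROMPT.format(text=text))
    try:
        return [tuple(t) for t in json.loads(raw) if len(t) == 3]
    except json.JSONDecodeError:
        return []  # in practice: retry with a repair prompt

def build_graph(chunks: list[str], llm) -> nx.MultiDiGraph:
    g = nx.MultiDiGraph()
    for chunk in chunks:
        for head, rel, tail in extract_triples(chunk, llm):
            g.add_edge(head, tail, relation=rel)  # one edge per extracted triple
    return g
```

The downstream steps these papers study, such as ontology alignment and few-shot graph completion, would then operate on the resulting graph.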
KG-Enhanced RAG and Explainable Reasoning Mechanisms
This cluster focuses on the middle layer of the technical architecture: how knowledge graphs guide LLMs through logical reasoning. Via Graph-RAG, path search, and graph-context augmentation, these works mitigate model hallucination and ensure the rigor and factual consistency of generated questions. A retrieval sketch follows the reference list below.
- KA-RAG: Integrating Knowledge Graphs and Agentic Retrieval-Augmented Generation for an Intelligent Educational Question-Answering Model(Fangqun Gao, Shun-Yi Xu, Weiyang Hao, Tao Lu, 2025, Applied Sciences)
- Plan-on-Graph: Self-Correcting Adaptive Planning of Large Language Model on Knowledge Graphs(Liyi Chen, Panrong Tong, Zhongming Jin, Ying Sun, Jieping Ye, Huixia Xiong, 2024, Neural Information Processing Systems)
- LightPROF: A Lightweight Reasoning Framework for Large Language Model on Knowledge Graph(Tu Ao, Yanhua Yu, Yuling Wang, Yang Deng, Zirui Guo, Liang Pang, Pinghui Wang, Tat-Seng Chua, Xiao Zhang, Zheng Cai, 2025, Proceedings of the AAAI Conference on Artificial Intelligence)
- Reasoning on Graphs: Faithful and Interpretable Large Language Model Reasoning(Linhao Luo, Yuan-Fang Li, Gholamreza Haffari, Shirui Pan, 2023, International Conference on Learning Representations)
- FiDeLiS: Faithful Reasoning in Large Language Model for Knowledge Graph Question Answering(Yuan Sui, Yufei He, Nian Liu, Xiaoxin He, Kun Wang, Bryan Hooi, 2024, Annual Meeting of the Association for Computational Linguistics)
- KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search(Haoran Luo, E. Haihong, Yikai Guo, Qika Lin, Xiaobao Wu, Xinyu Mu, Wenhao Liu, Meina Song, Yifan Zhu, Anh Tuan Luu, 2025, International Conference on Machine Learning)
- Enhancing graph multi-hop reasoning for question answering with LLMs: An approach based on adaptive path generation(Lianhong Ding, Na Ding, Qi Tao, Peng Shi, 2025, Journal of Intelligent Information Systems)
- GeAR: Graph-enhanced Agent for Retrieval-augmented Generation(Zhi-Hua Shen, Chenxin Diao, P. Vougiouklis, Pascual Merita, Shriram Piramanayagam, Damien Graux, Dandan Tu, Zeren Jiang, Ruofei Lai, Yang Ren, Jeff Z. Pan, 2024, Annual Meeting of the Association for Computational Linguistics)
- Knowledge Graphs as Context Sources for LLM-Based Explanations of Learning Recommendations(Hasan Abu-Rasheed, Christian Weber, M. Fathi, 2024, 2024 IEEE Global Engineering Education Conference (EDUCON))
- Reasoning Retrieval-Augmented Generation Method Integrated with Dynamic Semantic Expansion(Wenxian Zeng, Xiangqi Liu, 2025, 2025 5th International Conference on Machine Learning and Intelligent Systems Engineering (MLISE))
- Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning(Xingyu Tan, Xiaoyang Wang, Qing Liu, Xiwei Xu, Xin Yuan, Wenjie Zhang, 2024, Proceedings of the ACM on Web Conference 2025)
- AU-RAG: Agent-based Universal Retrieval Augmented Generation(Jisoo Jang, Wen-Syan Li, 2024, Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region)
- Intelligent question answering for water conservancy project inspection driven by knowledge graph and large language model collaboration(Yangrui Yang, Sisi Chen, Yaping Zhu, Xuemei Liu, Shifeng Pan, Xin Wang, 2024, LHB)
- MDKAG: Retrieval-Augmented Educational QA Powered by a Multimodal Disciplinary Knowledge Graph(Xu Zhao, Guozhong Wang, Yufei Lu, 2025, Applied Sciences)
- Knowledge Graph-Enhanced Large Language Model Reasoning with Prompt Engineering(Zishun Rui, Shucun Fu, Haolong Xiang, Siyu Wu, Shengjie Chen, Xiaolong Xu, 2025, 2025 Thirteenth International Conference on Advanced Cloud and Big Data (CBD))
- Logic-Aware Knowledge Graph Reasoning for Structural Sparsity under Large Language Model Supervision(Yudai Pan, Jiajie Hong, Tianzhe Zhao, Lingyun Song, Jun Liu, Xuequn Shang, 2025, Proceedings of the ACM on Web Conference 2025)
- KGLM-QA: A Novel Approach for Knowledge Graph-Enhanced Large Language Models for Question Answering(Alireza Akhavan Safaei, Pegah Saboori, Reza Ramezani, Mohammadali Nematbakhsh, 2024, 2024 15th International Conference on Information and Knowledge Technology (IKT))
- 问题图谱对大语言模型支持下的自主式学习影响机制分析(Unknown Authors, Unknown Journal)
- Beyond Flat Retrieval: Towards Dynamic Graph-Augmented Generation for Complex Question Answering(Yaxin Shang, Huayun Tang, Lanlan Gao, Chen Jia, Zhaoxun Lin, 2025, 2025 IEEE 6th International Conference on Computer, Big Data, Artificial Intelligence (ICCBD+AI))
- Using Retrieval-Augmented Generation to improve Performance of Large Language Models on the Brazilian University Admission Exam(L. Taschetto, Renato Fileto, 2024, Anais do XXXIX Simpósio Brasileiro de Banco de Dados (SBBD 2024))
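These Graph-RAG methods share a retrieval step: anchor the question to graph entities, expand a bounded neighborhood, and serialize the resulting facts into the LLM prompt so answers stay grounded. A minimal sketch, assuming the `networkx` graph built earlier and naive substring entity matching (real systems use entity linkers and path ranking):

```python
# Minimal Graph-RAG retrieval sketch: ground the question in KG entities,
# expand a bounded neighborhood, and serialize it as textual context.
import networkx as nx

def retrieve_context(question: str, g: nx.MultiDiGraph, hops: int = 2) -> str:
    q = question.lower()
    seeds = {n for n in g.nodes if str(n).lower() in q}  # naive entity match
    keep, frontier = set(seeds), set(seeds)
    for _ in range(hops):  # breadth-limited expansion around matched entities
        frontier = ({m for n in frontier for m in g.successors(n)}
                    | {m for n in frontier for m in g.predecessors(n)})
        keep |= frontier
    facts = {f"({u}) -[{d.get('relation', '?')}]-> ({v})"
             for u, v, d in g.subgraph(keep).edges(data=True)}
    return "\n".join(sorted(facts))
```

The serialized fact lines are prepended to the question in the generation prompt, which is the basic grounding mechanism these papers refine with path scoring, beam search, or agentic planning.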
Automated Item Generation, Difficulty Control, and Cognitive Taxonomy Applications
This cluster serves the question-generation task directly, covering automatic multiple-choice generation, interdisciplinary item design, item writing classified by Bloom's taxonomy of educational objectives, and difficulty-controllable generation via algorithms such as DPO. A DPO data-construction sketch follows the reference list below.
- ChatGPT for generating multiple-choice questions: Evidence on the use of artificial intelligence in automatic item generation for a rational pharmacotherapy exam(Yavuz Selim Kıyak, Ö. Coşkun, I. Budakoğlu, Canan Uluoğlu, 2024, European Journal of Clinical Pharmacology)
- Zero-shot Knowledge Graph Question Generation via Multi-agent LLMs and Small Models Synthesis(Runhao Zhao, Jiuyang Tang, Weixin Zeng, Ziyang Chen, Xiang Zhao, 2024, Proceedings of the 33rd ACM International Conference on Information and Knowledge Management)
- LEMON: A Knowledge-Enhanced, Type-Constrained, and Grammar-Guided Model for Question Generation Over Knowledge Graphs(Sheng Bi, Zeyi Miao, Qizhi Min, 2025, IEEE Transactions on Learning Technologies)
- Automatic question-answer pairs generation using pre-trained large language models in higher education(Jintao Ling, Muhammad Afzaal, 2024, Computers and Education: Artificial Intelligence)
- Interdisciplinary-QG: An LLM-Based Framework for Generating High-Quality Interdisciplinary Test Questions with Knowledge Graphs and Chain-of-Thought Reasoning(Chaocheng Zhong, Feihong Ye, Zihan Wang, Aerman Jigeer, Zehui Zhan, 2025, 2025 14th International Conference on Educational and Information Technology (ICEIT))
- Constrained LLM-Based Query Generation for Question Answering on Official Statistics(J. Kouwenhoven, Lucas Lageweg, Benno Kruit, 2024, Frontiers in Artificial Intelligence and Applications)
- Context Selection and Rewriting for Video-based Educational Question Generation(Mengxia Yu, Bang Nguyen, Olivia Zino, Meng Jiang, 2025, AAAI Conference on Artificial Intelligence)
- Advancing AI in Higher Education: A Comparative Study of Large Language Model-Based Agents for Exam Question Generation, Improvement, and Evaluation(V. Nikolovski, D. Trajanov, Ivan Chorbev, 2025, Algorithms)
- Synthetic Data Generation with Large Language Models for Personalized Community Question Answering(Marco Braga, Pranav Kasela, Alessandro Raganato, Gabriella Pasi, 2024, 2024 IEEE/WIC International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT))
- KAQG: A Knowledge-Graph-Enhanced RAG for Difficulty-Controlled Question Generation(Ching Han Chen, Ming Fang Shiu, 2025, IEEE Access)
- From Superficial to Deep: Integrating External Knowledge for Follow-up Question Generation Using Knowledge Graph and LLM(Jianyu Liu, Yi Huang, Sheng Bi, Junlan Feng, Guilin Qi, 2025, International Conference on Computational Linguistics)
- Asking Questions Like Educational Experts: Automatically Generating Question-Answer Pairs on Real-World Examination Data(Fanyi Qu, Xin Jia, Yunfang Wu, 2021, Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing)
- Difficulty-Controllable Multiple-Choice Question Generation Using Large Language Models and Direct Preference Optimization(Yuto Tomikawa, Masaki Uto, 2025, IEEE Access)
- KG-Enhanced LLM for Controlled Educational Question Generation(Jian Xu, Limin Zhang, Lihua Zhang, Shoujian Duan, 2025, 2025 7th International Academic Exchange Conference on Science and Technology Innovation (IAECST))
- Keyword-driven Conversational Question Generation in Human-like Dialog Agent(Zheng Fang, Bo Wang, 2023, 2023 8th International Conference on Big Data and Computing)
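For the DPO-based difficulty control reported above (e.g., Tomikawa & Uto), training data takes the form of preference pairs in which the "chosen" question matches a requested difficulty and the "rejected" one does not. Below is a minimal data-construction sketch; the difficulty estimate and field names are assumptions, not the papers' exact formulation.

```python
# Minimal sketch of building DPO preference pairs for difficulty-controlled
# question generation. Difficulty estimates and field names are illustrative.
from dataclasses import dataclass

@dataclass
class Candidate:
    question: str
    est_difficulty: float  # e.g., 1 - historical correct-answer rate

def dpo_pairs(topic: str, target: float, cands: list[Candidate], tol: float = 0.1):
    """Pair each on-target question (chosen) with an off-target one (rejected)."""
    prompt = (f"Write one multiple-choice question on '{topic}' "
              f"with difficulty about {target:.1f} on a 0-1 scale.")
    on = [c for c in cands if abs(c.est_difficulty - target) <= tol]
    off = [c for c in cands if abs(c.est_difficulty - target) > tol]
    return [{"prompt": prompt, "chosen": c.question, "rejected": r.question}
            for c, r in zip(on, off)]
```

Records in this shape can be fed to any standard DPO trainer.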
Multi-Agent Collaboration Architectures and System Quality Assurance
This cluster explores system frameworks in which specialized agents (e.g., item writer, reviewer, evaluator) work together, using closed-loop workflows to curb hallucinations in automated generation and to raise the automation level of complex teaching tasks. A generator-reviewer loop sketch follows the reference list below.
- EduPlanner: LLM-Based Multiagent Systems for Customized and Intelligent Instructional Design(Xueqiao Zhang, Chao Zhang, Jianwen Sun, Jun Xiao, Yi Yang, Yawei Luo, 2025, IEEE Transactions on Learning Technologies)
- Multi-Examiner: A Knowledge Graph-Driven System for Generating Comprehensive IT Questions with Higher-Order Thinking(Yonggu Wang, Zeyu Yu, Zihan Wang, Zengyi Yu, Jue Wang, 2025, Applied Sciences)
- Hallucination-Free Causal Graph-Guided AI Framework for Intuitive Question and Answer Generation(Nicholas X. Wang, A. Katsaggelos, 2026, International Journal of Multimedia Data Engineering and Management)
- Knowledge Graph Enhanced AI Agent Framework for Programming Competition Training(Yifei Xiao, Sigeng Li, Yuan Li, 2025, 2025 6th International Conference on Computer Vision and Data Mining (ICCVDM))
- Intelligent Course Assistant System Based On Large Language Model(Wenbo Jiang, Dongju Yang, Kang Wang, 2025, Proceedings of the 2025 2nd International Conference on Artificial Intelligence and Future Education)
- The Design and Development of a Computer Science Teaching Assistant Agent Based on Large Language Models(Lili Quan, Yong Pan, 2025, Proceedings of the 2025 2nd International Conference on Artificial Intelligence and Future Education)
- Dialogagent: An Auto-Engagement Agent for Code Question Answering Data Production(Xiaoyun Liang, Jingyi Ren, Jiayi Qi, Chao Peng, Bo Jiang, 2024, 2025 IEEE/ACM 47th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP))
- A Multi-Agent Communication Framework for Question-Worthy Phrase Extraction and Question Generation(Siyuan Wang, Zhongyu Wei, Zhihao Fan, Yang Liu, Xuanjing Huang, 2019, Proceedings of the AAAI Conference on Artificial Intelligence)
- Multi-Agent Based Casual Triple Extraction For Factuality Evaluation Using Large Language Models(Jian Zhang, Zeming Xu, Lifang Liu, Zhanfeng Shen, Yue Cui, Yongdong Zhang, 2024, 2024 IEEE International Conference on Data Mining Workshops (ICDMW))
- Coordinated LLM multi-agent systems for collaborative question-answer generation(S. Saadaoui, E. Alonso, 2025, Knowledge-Based Systems)
- Hallucination-Free Automatic Question & Answer Generation for Intuitive Learning(Nicholas X. Wang, A. Katsaggelos, 2025, 2025 IEEE International Conference on Image Processing Workshops (ICIPW))
- Automatic Question Generation for Intuitive Learning Utilizing Causal Graph Guided Chain of Thought Reasoning(Nicholas X. Wang, Neel V. Parpia, Aaryan D. Parikh, A. Katsaggelos, 2025, 2025 IEEE 8th International Conference on Multimedia Information Processing and Retrieval (MIPR))
- MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation(Chia-yuan Chang, Zhimeng Jiang, Vineeth Rakesh, Menghai Pan, Chin-Chia Michael Yeh, Guanchu Wang, Mingzhi Hu, Zhichao Xu, Yan Zheng, Mahashweta Das, Na Zou, 2024, Annual Meeting of the Association for Computational Linguistics)
- E-GPT: A Multi-Agent LLM Framework for Intelligent Educational Assistance(Tabassum Ara, Shreyansu Panda, Jason Samuel Das, Amit Das, Sidharth Vivek Prabhugoankar, 2025, 2025 1st International Conference on Advancement in Futuristic Technologies (ICAFT))
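The common architectural core of this cluster is a closed loop between a generating agent and a reviewing agent, with regeneration driven by review feedback. A minimal sketch, assuming both agents are plain prompt-to-text callables; the prompts and retry budget are illustrative.

```python
# Minimal generator-reviewer loop for the multi-agent pattern above.
# Both agents are LLM callables; the reviewer's verdict gates acceptance.
def generate_with_review(topic: str, generator, reviewer, max_rounds: int = 3):
    feedback = ""
    for _ in range(max_rounds):
        question = generator(f"Write an exam question on {topic}. {feedback}")
        verdict = reviewer(
            "Check this question for factual errors, ambiguity, and curriculum "
            f"fit. Reply PASS or a one-line fix request.\n\n{question}")
        if verdict.strip().upper().startswith("PASS"):
            return question
        feedback = f"Revise to address this review: {verdict}"
    return None  # escalate to a human editor after the retry budget is spent
```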
Personalized Assessment, Exam-Assembly Optimization, and Subject-Specific Deployment Practice
This cluster focuses on downstream applications, including reinforcement-learning-based exam paper assembly, student cognitive-graph modeling, personalized learning-path recommendation, and concrete deployments in vertical domains such as medicine, mathematics, and computer science. An exam-assembly sketch follows the reference list below.
- Exam paper generation based on performance prediction of student group(Zhengyang Wu, Tao He, Chenjie Mao, Changqin Huang, 2020, Information Sciences)
- Reinforcement Learning Guided Multi-Objective Exam Paper Generation(Yuhu Shang, Xuexiong Luo, Lihong Wang, Hao Peng, Xiang Zhang, Yimeng Ren, Kun Liang, 2023, SDM)
- LLM-based Intelligent Evaluation Agent with Knowledge Graph Construction for Human-Machine Interactive Learning(Chang-Shing Lee, Mei-Hui Wang, Guan-Ying Tseng, Chao-Cyuan Yue, Chun-Han Lin, Yi-Jun Lin, Naoyuki Kubota, 2025, 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC))
- Design of assessment algorithm and model for Chinese spoken language teaching based on natural language processing and knowledge graph(Xinyue Ma, Yandong Hu, Min Li, 2024, Theoretical and Natural Science)
- Collaborative and AI-aided Exam Question Generation using Wikidata in Education(Philipp Scharpf, M. Schubotz, Andreas Spitz, André Greiner-Petter, Bela Gipp, 2022, Unpublished)
- Optimising AI writing assessment using feedback and knowledge graph integration(Ci Zhang, 2025, PeerJ Computer Science)
- Hybrid Knowledge Graph–Neural Network Framework for Automated Student Assessment and Personalized Feedback in Higher Education(Sandeep Dongre, Vandna, Soujanya Saraswathi, M. S, Mamta Bansal, Angela Jean Mary E, 2025, 2025 IEEE 5th International Conference on ICT in Business Industry & Government (ICTBIG))
- An AI Education Framework Based on LLM-Knowledge Graph for Personalized Learning(Zihao Liang, Yio Wang, Mingjie Zhao, Sen Feng, Yunfan Zhang, Yiqun Zhang, 2025, 2025 21st International Conference on Computational Intelligence and Security (CIS))
- 生成式人工智能赋能基础教育教学的应用与反思(Unknown Authors, Unknown Journal)
- Enhancing Personalised Learning with a Context-Aware Intelligent Question-Answering System and Automated Frequently Asked Question Generation(Eleonora Bernasconi, Domenico Redavid, Stefano Ferilli, 2025, Electronics)
- AI-Powered Math Tutoring: Platform for Personalized and Adaptive Education(Jaroslaw A. Chudziak, Adam Kostka, 2025, Lecture Notes in Computer Science)
- A Large Language Model-Based System for Socratic Inquiry: Fostering Deep Learning and Memory Consolidation(Xin Xie, Xiaoli Yang, Rongyu Cui, 2025, 2025 14th International Conference on Educational and Information Technology (ICEIT))
- LLM and Knowledge Graph Based Framework for Dynamic Prerequisite Knowledge Identification in Foundational Mathematics(K.M.G.M.B. Alahakoon, D.D.M. Ranasinghe, 2025, 2025 9th SLAAI International Conference on Artificial Intelligence (SLAAI-ICAI))
- A Knowledge Graph and LLM-Enhanced Teaching System for AI Programming(Haoyu Wang, Ruifang Liu, 2025, 2025 21st International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD))
- TCM-KLLaMA: Intelligent generation model for Traditional Chinese Medicine Prescriptions based on knowledge graph and large language model(Zhuang Yi, Lingkai Yu, Nan Jiang, Yujia Ge, 2025, Computers in Biology and Medicine)
- MedKGGPT2: A Study on Knowledge Graph-Based Data Generation Methods for Fine-Tuning Large Pharmaceutical Models(Yang Gao, Lin Li, Tao Li, Penghui Gu, Yun Li, Liu He, 2025, 2025 10th International Conference on Cyber Security and Information Engineering (ICCSIE))
- Knowledge Graph Enhanced AI Generated Content for International Chinese Language Teaching and Learning(Liqing Yang, Yue Fu, 2025, 2025 IEEE International Conference on Computation, Big-Data and Engineering (ICCBE))
- Design and Implementation of a Knowledge Graph-Based Teaching Assistance System for English Courses in Colleges and Universities(Xinxin Gao, 2025, 2025 2nd International Conference on Digital Media, Communication and Information Systems (DMCIS))
- Free Ebooks for Computer Science Courses: Now With Support for Peer Instruction, Choice Questions, and Exam Generation(Barbara Ericson, Bradley N. Miller, 2022, Proceedings of the 53rd ACM Technical Symposium on Computer Science Education V. 2)
- 基于大语言模型的医学人工智能类课程AI教学助手开发与实践(Unknown Authors, Unknown Journal)
- 大语言模型驱动下的《信号与系统》智慧教学研究(Unknown Authors, 2025, Unknown Journal)
- Research on the Mechanism of Personalized Learning Path Generation Based on the Collaboration of KG and LLM(Yahong Leng, Haoran Zhang, Feng Tan, 2025, 2025 IEEE 3rd International Conference on Electrical, Automation and Computer Engineering (ICEACE))
- Research on Personalized Cognitive Graph Based on Large Language Models (LLM) for Education(Ying Li, Yiming Gai, Leilei Sun, Xingyu Wang, Chao Wang, Xuefei Huang, 2025, 2025 IEEE Frontiers in Education Conference (FIE))
- Question Generation for Adaptive Education(Megha Srivastava, Noah D. Goodman, 2021, Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers))
- Intelligent Question Sequencing via Concept-Graph-Aware Reinforcement Learning for Personalized Assessment(E. Byrne, F. Murphy, Patrick Keane, 2025, Frontiers in Humanities and Social Sciences)
- Cultivating College Students' Adaptive Learning Ability Based on LLM-Enhanced Knowledge Graph(Yuhong Xing, Yahong Ma, Jing Li, Zhe Liu, Ya-Huei Chen, Baochu Li, Guiru Xu, Yajing Lu, 2025, Journal of Modern Educational Theory and Practice)
- 新工科背景下AI赋能的“机电一体化系统设计”案例和项目驱动课程 ...(Unknown Authors, Unknown Journal)
- 基于大语言模型的教育智能体个性化学习应用理论研究(Unknown Authors, Unknown Journal)
- Scaling Retrieval Practice with LLM: Improving Multiple Choice Question (MCQ) Quality through Knowledge Graphs(Yuan An, Ruhma Hashmi, 2026, Proceedings of the 57th ACM Technical Symposium on Computer Science Education V.2)
- GISedu-GPT: a large language model framework with prior knowledge for GIS education question bank generation(Zhiyun Wang, Yifan Zhang, Wen Min, Qingfeng Guan, Wenhao Yu, 2025, Journal of Geography in Higher Education)
- A Framework for Automatic Exam Generation based on Intended Learning Outcomes(Ashraf Amria, Ahmed Ewais, R. Hodrob, 2018, Proceedings of the 10th International Conference on Computer Supported Education)
- 人工智能在高中英语听力命题中的应用研究(Unknown Authors, 2026, Unknown Journal)
- AI赋能的《数据分析》智慧课程构建与实践(Unknown Authors, Unknown Journal)
- 基于AIGC的高等数学教辅系统开发(Unknown Authors, 2025, Unknown Journal)
- Linking Knowledge to Care: Knowledge Graph-Augmented Medical Follow-Up Question Generation(Liwen Sun, Xiang Yu, Ming Tan, Zhuo Chen, Anqi Cheng, Ashutosh Joshi, Chenyan Xiong, 2026, Findings of the Association for Computational Linguistics: EACL 2026)
- 数学分析课程与人工智能技术深度融合的教学(Unknown Authors, Unknown Journal)
- A Large Language Model-Powered Bot for Calligraphy Knowledge Q&A and Appreciation(Yunbo Zheng, Zhigang Li, Ying Li, Xinyuan Feng, Liyi Chen, Jingjing Deng, 2025, 2025 IEEE 3rd International Conference on Image Processing and Computer Applications (ICIPCA))
- 高等数学知识图谱的构建与应用探究(Unknown Authors, Unknown Journal)
- 基于Coze平台与DeepSeek大模型的智能辅助教学智能体(Unknown Authors, 2025, Unknown Journal)
- 基于AI的基础医学图谱智能构建与应用(Unknown Authors, Unknown Journal)
- 人工智能技术用于小学数学错题整理的应用研究(Unknown Authors, Unknown Journal)
- 数据结构与算法自动组题系统的研究与实现(Unknown Authors, Unknown Journal)
- 基于超星学习通平台的知识图谱构建——以《人工智能基础 ...(Unknown Authors, Unknown Journal)
- 人工智能赋能实验课程的教学改革与实践概述(Unknown Authors, Unknown Journal)
- 基于知识图谱的数学分析课程教学与实践(Unknown Authors, Unknown Journal)
- 基于知识图谱的《信号与系统》课程创新教学改革(Unknown Authors, Unknown Journal)
- “AI + 知识图谱”赋能课程智慧教学新生态——以《实验心理学》 ...(Unknown Authors, Unknown Journal)
- 人工智能助力下工科数学分析N课程的个性化学习研究(Unknown Authors, Unknown Journal)
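The exam-assembly work in this cluster (e.g., the reinforcement-learning-guided multi-objective papers) optimizes item selection against coverage and difficulty targets. The following is a greedy baseline sketch under assumed item fields; the cited papers use learned policies rather than this heuristic.

```python
# Greedy exam-assembly sketch: cover required knowledge points while keeping
# the mean difficulty near a target. Fields and scoring are illustrative.
from dataclasses import dataclass

@dataclass
class Item:
    id: int
    knowledge_point: str
    difficulty: float  # 0 (easy) .. 1 (hard)

def assemble(items: list[Item], required_kps: list[str], target: float) -> list[Item]:
    paper: list[Item] = []
    for kp in required_kps:
        pool = [i for i in items if i.knowledge_point == kp and i not in paper]
        if not pool:
            continue  # uncovered knowledge point; flag for the item bank
        mean = (sum(i.difficulty for i in paper) / len(paper)) if paper else target
        # choose the item that pulls the running mean toward the target
        best = min(pool, key=lambda i: abs(
            (mean * len(paper) + i.difficulty) / (len(paper) + 1) - target))
        paper.append(best)
    return paper
```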
This synthesis organizes the surveyed work into five core dimensions: 1) foundation layer: automated construction and ontology design of educational knowledge graphs; 2) mechanism layer: KG-RAG fusion for augmented reasoning, aimed at eliminating LLM hallucinations; 3) task layer: cognition-oriented question generation and multi-dimensional difficulty control; 4) architecture layer: multi-agent collaboration for high system reliability and complex task workflows; 5) application layer: discipline-specific closed-loop assessment, exam-assembly optimization, and personalized adaptive learning-path planning. The overall trend shows systems shifting from generic generation toward vertical-domain expert collaboration, emphasizing deep integration of educational science with engineering practicality.
A total of 120 related publications. Abstract excerpts and full abstracts for selected references follow.
The study finds that question graphs can effectively overcome the "hallucination" problem of large language models in self-directed learning through three core mechanisms, namely structuring knowledge, optimizing learning paths, and providing cognitive support, thereby improving learning efficiency. This paper reveals the question ...
The system builds an education-oriented teaching-assistant agent on the domestic DeepSeek large model combined with the Coze platform, driven jointly by "AI + education experts". The agent has four core functional modules: intelligent lesson preparation, ...
This paper studies methods and formats for automatic question assembly, automatically generating diagrams of binary trees, Huffman trees, undirected graphs, directed graphs, undirected networks, and directed networks, and combining the generated diagrams with identical or different question stems to form distinct questions for ...
Addressing the many problems in teaching the Big Data Operations and Maintenance course, the article proposes AI-enabled teaching reform built around a course knowledge graph on the Chaoxing Xuexitong platform. By systematically constructing a five-dimensional "think-learn-know-act-apply" ...
Relying on the Chaoxing online teaching platform, four viewing modes ("outline, mind map, graph, map") were developed. Teachers link each knowledge node to question-bank exercises and use the platform's "mastery label" feature to tag knowledge points automatically according to students' homework and quiz score rates ...
Built on the Educoder platform, large-model technology is deeply integrated into course design, with modules for knowledge-graph navigation, a 24-hour intelligent teaching assistant, intelligent lesson preparation, and intelligent exam generation. Results show that this model effectively resolves the outdated content of traditional courses, ...
To make organizing wrongly answered problems more efficient and maximize their reuse, this paper introduces three technologies, image recognition, knowledge graphs, and AI personalized recommendation, which respectively replace the manual copying, categorizing, and similar-problem searching of traditional error-log keeping, emphasizing that artificial intelligence ...
The rapid development of today's large models enables educational agents to build dynamic knowledge graphs, perform multimodal perception, and use various tools to analyze learners' cognitive trajectories and emotional fluctuations, much like a private tutor.
An interactive question-bank system is built on the platform, with its data structure organized around "three core parameters": question type, knowledge point, and difficulty. The AI teaching assistant can automatically check homework for duplication via Chaoxing's large-model document service, ensuring the originality of student work.
Through the "knowledge base" feature in the platform settings, teachers can upload course resources such as textbooks, lecture notes, lesson plans, and exercises, lightly customizing and knowledge-augmenting the large model so that it performs better in course Q&A ...
As the intelligent era arrives, the digitalization of higher education faces new opportunities and challenges. This paper focuses on the Mathematical Analysis course, building a course knowledge-graph system that covers knowledge-point linking, visualization, and other practices, and applying it across teaching activities ...
To improve teaching effectiveness, this paper uses the strengths of knowledge graphs to design an online learning system for the Signals and Systems course, adopting blended online-offline teaching and improving instructional design and assessment. The approach presents the course knowledge graph online ...
A three-dimensional item-writing model of "knowledge point, ability, difficulty" is built on the knowledge graph, with graph neural networks used for intelligent question association and exam-assembly optimization. 4) Automated knowledge transfer: by constructing a multi-dimensional knowledge association network, the system automatically identifies ...
A knowledge graph is a networked, visual representation of knowledge resources and their carriers, intended to intuitively reveal the development of knowledge by mining, analyzing, constructing, drawing, and displaying knowledge and the complex relationships among its parts ...
This study focuses on the Chaoxing Xuexitong platform, aiming to build a knowledge graph for the Fundamentals of Artificial Intelligence course and to explore the application of knowledge graphs in education. It first analyzes the concept and composition of knowledge graphs in depth, ...
In view of this, the project designs a knowledge-graph-based online learning system for Signals and Systems, using the graph's knowledge associations to build a course knowledge base that manages each chapter's learning materials effectively and quickly organizes related content across chapters ...
Knowledge graph construction and intelligent navigation require systematically organizing the core concepts, theorems, methods, and logical relations of Mathematical Analysis, applied to navigating learning paths, visualizing the concept network, and diagnosing knowledge gaps (see Fig. 2).
By constructing an epidemiology knowledge graph that integrates course knowledge, knowledge is visualized and structured, giving students more systematic and intuitive learning resources. Meanwhile, applying AI to graph construction and teaching practice can improve the knowledge graph's ...
Teaching content is restructured around knowledge graphs and question graphs; a smart teaching agent for innovative mechatronic system design empowers teaching and learning; case-driven and project-driven methods are adopted; and AI enables personalized learning-path generation and diversified ...
AI content generation and intelligent teaching assistants constitute a new generation of educational technology platform built on large language models (LLMs) [15], knowledge graphs, and adaptive learning. Core capabilities include automatic generation of multimodal teaching content (e.g., course designs, exercises, ...
Addressing the current state of course teaching and learning, this paper builds a course knowledge graph and proposes using it with an AI teaching assistant to realize personalized learning. Application shows that the graph's linked knowledge points and learning materials promote students' autonomous, personalized ...
The knowledge Q&A module provides real-time answers through semantic matching and generation; the exercise-solving module recognizes math problems in user-uploaded images and solves them through reasoning; the question-practice module generates personalized questions with explanations, ...
Researchers have already applied AI effectively to the automated writing of complex item types (such as multiple-choice questions) in language testing, greatly improving item-writing efficiency (Chen Dajian & Hu Jiehui, 2025) [9].
Abstract: Against the backdrop of LLM-empowered education, and to address general-purpose models' unfamiliarity with course knowledge, this study builds on the DeepSeek-R1 model with RAG techniques on the Dify low-code platform, optimizing prompts (setting ...
Adaptive content generation: GenAI can reorganize teaching-unit content and, based on each student's situation, run starting-point diagnosis or progress-adjustment modes, letting students choose their learning order while exercise difficulty and knowledge are adjusted automatically to their level ...
... AI lesson-preparation assistants and one-click question generation enrich teaching case design [17]. In building and applying the new-engineering smart cloud platform, AI enables intelligent construction of course knowledge graphs, intelligent tracking and visualization of learning status, intelligent examinations ...
In a conversational system, dynamically generating follow-up questions based on context can help users explore information and provide a better user experience. Humans are usually able to ask questions that involve some general life knowledge and demonstrate higher order cognitive skills. However, the questions generated by existing methods are often limited to shallow contextual questions that are uninspiring and have a large gap to the human level. In this paper, we propose a three-stage external knowledge-enhanced follow-up question generation method, which generates questions by identifying contextual topics, constructing a knowledge graph (KG) online, and finally combining these with a large language model to generate the final question. The model generates information-rich and exploratory follow-up questions by introducing external common sense knowledge and performing a knowledge fusion operation. Experiments show that compared to baseline models, our method generates questions that are more informative and closer to human questioning levels while maintaining contextual relevance.
Clinical diagnosis is time-consuming, requiring intensive interactions between patients and medical professionals. While large language models (LLMs) could ease the pre-diagnostic workload, their limited domain knowledge hinders effective medical question generation. We introduce a Knowledge Graph-augmented LLM with active in-context learning to generate relevant and important follow-up questions, KG-Followup, serving as a critical module for the pre-diagnostic assessment. The structured medical domain knowledge graph serves as a seamless patch-up to provide professional domain expertise upon which the LLM can reason. Experiments demonstrate that KG-Followup outperforms state-of-the-art methods by 5% - 8% on relevant benchmarks in recall.
In today's rapidly evolving landscape of Artificial Intelligence, large language models (LLMs) have emerged as a vibrant research topic. LLMs find applications in various fields and contribute significantly. Despite their powerful language capabilities, similar to pre-trained language models (PLMs), LLMs still face challenges in remembering events, incorporating new information, and addressing domain-specific issues or hallucinations. To overcome these limitations, researchers have proposed Retrieval-Augmented Generation (RAG) techniques, some others have proposed the integration of LLMs with Knowledge Graphs (KGs) to provide factual context, thereby improving performance and delivering more accurate feedback to user queries. Education plays a crucial role in human development and progress. With the technology transformation, traditional education is being replaced by digital or blended education. Therefore, educational data in the digital environment is increasing day by day. Data in higher education institutions are diverse, comprising various sources such as unstructured/structured text, relational databases, web/app-based API access, etc. Constructing a Knowledge Graph from these cross-data sources is not a simple task. This article proposes a method for automatically constructing a Knowledge Graph from multiple data sources and discusses some initial applications (experimental trials) of KG in conjunction with LLMs for question-answering tasks.
Recently LLMs have faced increasing demands to selectively remove specific information through Machine Unlearning. While evaluating unlearning effectiveness is crucial, existing benchmarks suffer from fundamental limitations in audit dataset generation from unstructured corpora. We identify two critical challenges: ensuring audit adequacy and handling knowledge redundancy between forget and retain datasets. Current approaches rely on ad-hoc question generation from unstructured text, leading to unpredictable coverage gaps and evaluation blind spots. Knowledge redundancy between forget and retain corpora further obscures evaluation, making it difficult to distinguish genuine unlearning failures from legitimately retained knowledge. To bring clarity to this challenge, we propose LUCID, an automated framework that leverages knowledge graphs to achieve comprehensive audit dataset generation with fine-grained coverage and systematic redundancy elimination. By converting unstructured corpora into structured knowledge representations, it transforms the ad-hoc audit dataset generation process into a transparent and automated generation pipeline that ensures both adequacy and non-redundancy. Applying LUCID to the MUSE benchmark, we generated over 69,000 and 111,000 audit cases for News and Books datasets respectively, identifying thousands of previously undetected knowledge memorization instances. Our analysis reveals that knowledge redundancy significantly skews metrics, artificially inflating ROUGE from 19.7% to 26.1% and Entailment Scores from 32.4% to 35.2%, highlighting the necessity of deduplication for accurate assessment.
Intuitive learning plays a vital role in building deep conceptual understanding, particularly in STEM education, where students often grapple with abstract and interdependent ideas. Automatic question generation has emerged as an effective strategy to support personalized and adaptive learning. However, its effectiveness is limited by hallucinations in large language models (LLMs), which can produce factually incorrect, ambiguous, or pedagogically inconsistent questions. To address this challenge, we propose a novel framework that combines causal-graph-guided Chain-of-Thought (CoT) reasoning with a multi-agent LLM architecture to ensure the generation of accurate, meaningful, and curriculum-aligned questions. In this approach, causal graphs offer an explicit representation of domain knowledge, while CoT reasoning enables structured, step-by-step traversal through related concepts. Dedicated LLM agents handle specific tasks such as graph pathfinding, reasoning, validation, and output, all operating under domain constraints. A dual validation mechanism, at both the conceptual and output stages, substantially reduces hallucinations. Experimental results show up to a 70% improvement in quality over reference methods and highly favorable outcomes in subjective evaluations.
Automated question generation plays a crucial role in modern educational technology, reducing teachers' workload and enabling personalized learning. However, current mainstream question generation methods (including large language models and neural network methods) often suffer from drawbacks such as factual errors, weak alignment with the curriculum, and poor interpretability. This paper proposes a hybrid framework called KG-Enhanced LLM for Controlled Educational Question Generation (KG-LLM-CEQG), which combines knowledge graphs and large language models to generate fact-based questions that meet curriculum requirements and are controlled by cognitive theory. This framework integrates knowledge extraction based on knowledge graphs, multi-level instructional prompts, LLM-guided question generation, and a knowledge graph-based validation mechanism. The framework can be deployed in intelligent tutor systems and assessment-driven educational platforms.
In the pharmaceutical industry, large language models (LLMs) in specialized domains, such as medicine, rely heavily on high-quality fine-tuning datasets to generate high-performance models. Therefore, the generation of high-quality fine-tuning datasets becomes a key research focus. Building upon the MedKGGPT model, a framework for generating fine-tuning data for pharmaceutical large models based on knowledge graphs (KG) is proposed. This framework consists of two main parts. The first part involves combining KG triples with a data format and utilizing three methods for generating question-answer pairs. This approach enhances the reliability and diversity of the question-answer data, thereby enriching the LLM's understanding of specialized pharmaceutical knowledge and improving the accuracy of its responses in the medical domain. The second part proposes a method for evaluating the generated question-answer data using a combination of human evaluation and BLEU and ROUGE metrics. By validating the quality of data through the LLM's output, the effectiveness of fine-tuning the LLM with this generated data to improve its accuracy in medical responses is assessed. After comparing several popular methods, it was found that there was a significant improvement in both ROUGE and BLEU evaluations. Experimental results demonstrate that the data generated through a knowledge graph-based approach significantly improves the accuracy of the MedKGGPT model in responding to medical knowledge queries. The final performance score reaches 89.65, indicating a notable enhancement in effectiveness. This enhancement not only strengthens the safety of the output but also increases the MedKGGPT model's utility in the medical field.
Large language models excel in various natural language processing tasks but often struggle with knowledge-intensive queries, particularly those involve rare entities or require precise factual information. This paper presents a novel framework that enhances capabilities of an LLM-based question answering system by incorporating structured knowledge from knowledge graphs. Our approach employs entity extraction, semantic similarity scoring, and adaptive graph exploration to efficiently navigate and extract relevant information from knowledge graphs. The core of the presented solution is a knowledge graph-enhanced language model process that iteratively refines subgraph exploration and answer generation, complemented by a fallback mechanism for robustness across diverse question types. Experiments on location-based questions from the Entity Questions dataset demonstrate significant improvements in the quality of responses. Using the Gemini 1.5 Flash model, our system achieved an accuracy increase from 36% to 71% for partially correct answers and from 22% to 69% for exactly correct answers, as evaluated by human assessors. This approach offers a promising direction for developing more reliable and accurate question answering systems, particularly for queries involving long-tail entities or specific factual knowledge.
This research explores the development of a knowledge graph statistical question answering system for Statistics Netherlands. Aimed at efficiently retrieving single statistical values from their extensive database, which encompasses over a billion values across more than 4,000 tables, we propose a comprehensive three-component framework consisting of: (1) a data augmentation method to generate synthetic data, (2) an entity retrieval system that leverages various encoder networks along with different hard negative mining techniques for the effective retrieval of tables, measures, and dimensions, and (3) an innovative large language model-based query generator. A central innovation of our research is the introduction of a dynamic prompting technique for query generation, which creates prompts specifically for a certain phase of the token generation. This approach ensures that the model is supplied with information relevant for generating specific tokens in a symbolic query. With this approach, we propose a novel system that can help find relevant information in official statistics and similar systems, which is vital for governmental decision making and all fields of research utilising and relying on these statistics.
Retrieval-Augmented Generation (RAG) has emerged as a mainstream paradigm for grounding large language models (LLMs) with external knowledge. However, conventional RAG systems suffer from two fundamental limitations: (i) they treat documents as isolated bags of sentences, disregarding the rich relational structures between documents; and (ii) they struggle to perform effective multi-hop reasoning over temporally evolving facts. To address these issues, this paper presents improved GraphRAG, a plug-and-play framework that augments LLM reasoning capabilities through a dynamic, hierarchical, and path-aware graph-structured memory. The framework comprises two core components: (1) a time-aware relation graph, in which edge weights decay exponentially over time to capture the temporal dynamics of knowledge; and (2) a hierarchical community graph that clusters entities into a multi-level taxonomy, forming semantically coherent hierarchical representations. Building on this, we design a dual-retrieval mechanism: a semantics-enhanced retriever that integrates dense vector representations with graph structural priors, and a path-guided retriever that performs beam search over relational sequences to identify multi-hop reasoning chains. Extensive experiments on five open-domain QA benchmarks (Natural Questions, TriviaQA, HotpotQA, WikiHop, and Musique) demonstrate that improved GraphRAG achieves an exact match accuracy of 82.7% on multi-hop questions, surpassing vanilla RAG by 24.4% and outperforming HybridRAG by 6.1%.
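The time-aware relation graph described above weights edges by recency. Below is a one-line sketch of the stated exponential decay; the decay rate is an assumed hyperparameter, not a value from the paper.

```python
# Sketch of the exponentially time-decaying edge weight described in the
# abstract above; lam (decay rate) is an assumed hyperparameter.
import math

def decayed_weight(w0: float, age_days: float, lam: float = 0.01) -> float:
    """w(t) = w0 * exp(-lam * age): older facts contribute less to retrieval."""
    return w0 * math.exp(-lam * age_days)
```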
With the accelerated digital transformation in education, the efficient integration of massive multimodal instructional resources and the support for interactive question answering (QA) remains a prominent challenge. This study introduces Multimodal Disciplinary Knowledge-Augmented Generation (MDKAG), a framework integrating retrieval-augmented generation (RAG) with a multimodal disciplinary knowledge graph (MDKG). MDKAG first extracts high-precision entities from digital textbooks, lecture slides, and classroom videos by using the Enhanced Representation through Knowledge Integration 3.0 (ERNIE 3.0) model and then links them into a graph that supports fine-grained retrieval. At inference time, the framework retrieves graph-adjacent passages, integrates multimodal data, and feeds them into a large language model (LLM) to generate context-aligned answers. An answer-verification module checks semantic overlap and entity coverage to filter hallucinations and triggers incremental graph updates when new concepts appear. Experiments on three university courses show that MDKAG reduces hallucination rates by up to 23% and increases answer accuracy by 11% over text-only RAG and knowledge-augmented generation (KAG) baselines, demonstrating strong adaptability across subject domains. The results indicate that MDKAG offers an effective route for scalable knowledge organization and reliable interactive QA in education.
We constructed a model for generating international Chinese teaching resources based on large language modeling (LLM) and knowledge graph (KG). The model adopts a dual-engine design, integrating the natural language processing (NLP) strengths of the LLM with the advantages of the KG, such as structured knowledge reasoning, to enable the creation of high-quality and specialized Chinese teaching resources. The research process includes the design of modules for information filtering, professional question and answer, and extraction and transformation to ensure the accuracy and personalization of resource generation. The developed model addresses the deficiencies of traditional systems and meets dynamic and personalized demands. The model excels in knowledge localization, the quality of teaching resource generation, and system efficiency, particularly in cultural adaptation and grammatical accuracy. It also outperforms other models and promotes intelligent and personalized Chinese teaching resources through the integration of LLM and KG, which substantially enhances user experience and teaching quality.
Teaching introductory computer science has become increasingly difficult with the rise of AI code-completion tools. Frequent retrieval practice, especially through multiple-choice questions (MCQs), offers a promising way to maintain active learning, yet producing high quality MCQs at scale remains a challenge for instructors. Large language models (LLMs) can automate MCQ generation, enabling scalable in-lecture retrieval practice. This poster presents two preliminary studies examining this potential. First, in higher-education programming courses, students who received LLM-generated MCQs scored significantly higher on follow-up quizzes than during periods without retrieval practice. However, raw LLM generated MCQs often suffered from hallucinations, weak distractors, trivial content, and formatting issues. Second, we evaluated a knowledge graph (KG)–guided generation pipeline. By structuring key concepts and relations before prompting, the KG-based approach produced more relevant, integrative, and appropriately challenging MCQs. In a dataset of 400+ items, KG-generated MCQs outperformed text-only generation across 15 quality criteria. These preliminary studies open promising directions for future research, including adaptive generation of personalized MCQs tailored to student mastery levels.
As interdisciplinary education gains prominence in global educational reforms, the design of high-quality interdisciplinary test items remains a challenge due to the complexity of knowledge integration, question difficulty control, and the inefficiency of manual generation. To address these issues, this study introduces Interdisciplinary-QG, an automated interdisciplinary question generation framework based on GPT-4. The framework integrates knowledge graph-enhanced retrieval-based generation with chain-of-thought reasoning and employs a structured BRTE (Background-Role-Task-Example) prompt template, enhancing both accuracy and interdisciplinary coherence. A case study in chemistry demonstrates that Interdisciplinary-QG effectively constructs interdisciplinary knowledge structures and generates high-quality test items with both depth and breadth. Experimental results show that it outperforms the general-purpose LLM ChatGLM in validity, efficiency, and interdisciplinary integration. This study provides new insights into leveraging AI for interdisciplinary education.
The IEEE P2807.6 Education Knowledge Graph (EduKG) standard defines a semantic infrastructure to represent educational knowledge, resources, and pedagogy in a unified graph format. This paper expands on the core EduKG architecture, detailing its ontology design and key entities-Learning Points, Resource Items, and Pedagogical Rules-that collectively model the domain, content, and instructional strategies of learning systems. We further explore how EduKG can be integrated with advanced AI technologies, including large language models (LLMs) and retrieval-augmented generation (Graph-RAG) via embedding databases, to enable intelligent behavior such as semantic search, question answering, and dynamic content generation. These integrations position EduKG as a central component in next-generation smart education systems, wherein knowledge graphs work in concert with intelligent agents and adaptive instructional systems to deliver fully automated, personalized, and interactive learning experiences. By leveraging the standardized graph-structured representation and semantic reasoning capabilities of EduKG, such systems can achieve interoperability across platforms and support complex AI-driven tutoring and training scenarios. This work provides a comprehensive overview of the EduKG framework and highlights its role in empowering adaptive, cognitive, and collaborative learning solutions for the future of digital education.
The task of multi-hop link prediction within knowledge graphs (KGs) stands as a challenge in the field of knowledge graph analysis, as it requires the model to reason through and understand all intermediate connections before making a prediction. In this paper, we introduce the Knowledge Graph Large Language Model (KG-LLM), a novel framework that leverages large language models (LLMs) for knowledge graph tasks. We first convert structured knowledge graph data into natural language and then use these natural language prompts to fine-tune LLMs to enhance multi-hop link prediction in KGs. By converting the KG to natural language prompts, our framework is designed to learn the latent representations of entities and their interrelations. To show the efficacy of the KG-LLM Framework, we fine-tune three leading LLMs within this framework, including Flan-T5, LLaMa2 and Gemma. Further, we explore the framework's potential to provide LLMs with zero-shot capabilities for handling previously unseen prompts. Experimental results show that KG-LLM significantly improves the models' generalization capabilities, leading to more accurate predictions in unfamiliar scenarios.
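KG-LLM's first step, as stated above, is converting structured triples into natural-language prompts for fine-tuning. Here is a minimal verbalization sketch; the template wording is an assumption, not the paper's exact prompt.

```python
# Sketch of verbalizing a multi-hop KG path into a fine-tuning prompt,
# in the spirit of the KG-LLM framework above. Template wording is assumed.
def verbalize_path(path: list[tuple[str, str, str]], question_tail: str) -> str:
    """Turn a path of (head, relation, tail) triples into a link-prediction prompt."""
    facts = ". ".join(f"{h} {r.replace('_', ' ')} {t}" for h, r, t in path)
    return (f"{facts}. Given these facts, is there a link between "
            f"{path[0][0]} and {question_tail}? Answer yes or no.")
```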
This study aims to construct a knowledge system for Traditional Chinese Patent Medicine (TCPM) based on a Large Language Model (LLM) and Knowledge Graph (KG) to address the complexity of TCPM knowledge and its clinical application challenges. It seeks to solve issues inherent in traditional TCPM knowledge utilization, such as information silos, retrieval difficulties, and the lack of effective decision support. By integrating multi-source heterogeneous data, including the Pharmacopoeia of the People's Republic of China and various medical information platforms, this research utilizes an LLM for efficient knowledge extraction and fusion to build the Traditional Chinese Patent Medicine Knowledge Graph (TCPM-KG). Experimental results show that the model achieved an F1-score of 0.9553 in the Named Entity Recognition (NER) task, indicating that LLMs can effectively process complex texts in the TCPM domain and significantly enhance the automation of knowledge extraction. To date, TCPM-KG contains 1,607 TCPM instances and 77,100 knowledge triplets. The advantages of this study lie in its integration of multi-source heterogeneous data to create a KG with broad coverage and fine granularity, and its proposal of a "KG+LLM" methodology that spans the entire construction process. This work successfully demonstrates the feasibility and efficiency of using an LLM as a core technology for building a TCPM-KG, providing a new technical path and foundation for the modernization and intelligent development of Traditional Chinese Medicine.
With the rapid deployment of industrial robots in manufacturing, the demand for advanced maintenance techniques to sustain operational efficiency has become crucial. Fault diagnosis Knowledge Graph (KG) is essential as it interlinks multi-source data related to industrial robot faults, capturing multi-level semantic associations among different fault events. However, the construction and application of fine-grained fault diagnosis KG face significant challenges due to the inherent complexity of nested entities in maintenance texts and the severe scarcity of annotated industrial data. In this study, we propose a Large Language Model (LLM) assisted data augmentation approach, which handles the complex nested entities in maintenance corpora and constructs a more fine-grained fault diagnosis KG. Firstly, the fine-grained ontology is constructed via LLM Assistance in Industrial Nested Named Entity Recognition (assInNNER). Then, an Industrial Nested Label Classification Template (INCT) is designed, enabling the use of nested entities in Attention-map aware keyword selection for the Industrial Nested Language Model (ANLM) data augmentation methods. ANLM can effectively improve the model's performance in nested entity extraction when corpora are scarce. Subsequently, a Confidence Filtering Mechanism (CFM) is introduced to evaluate and select the generated data for enhancement, and assInNNER is further deployed to recall the negative samples corpus again to further improve performance. Experimental studies based on multi-source corpora demonstrate that compared to existing algorithms, our method achieves an average F1 increase of 8.25%, 3.31%, and 1.96% in 5%, 10%
Traditional Chinese medicine (TCM) prescriptions are a basic component of TCM treatment, developed by assessing patient symptoms and prescribing a mix of herbs. Accurate prescription generation is critical for enhancing treatment outcomes and maintaining patient safety. However, conventional methods based on Large Language Models (LLMs) focus mainly on symptom information, neglecting other TCM diagnostic expertise, such as tongue and pulse diagnosis, and are prone to hallucination, which is unacceptable in medical applications. To address these challenges, the paper proposes an effective prescription generation model enriched by a TCM knowledge graph (KG) called the TCM-KLLaMA model. In this model, the Chinese-LLaMA2-7B model is provided with a new output layer and loss function to suppress hallucinations and increase recommendation accuracy. A TCM KG including symptoms, tongue diagnosis, and pulse diagnosis was developed, and the model was fine-tuned utilizing the suggested synonym and matching knowledge injection (SMKI) mechanism. Extensive experiments demonstrate that TCM-KLLaMA outperforms baseline models in both Precision and F1 Score, proving its superior performance in prescription generation tasks.
Large Language Models (LLMs) have impressive capabilities in text understanding and zero-shot reasoning. However, delays in knowledge updates may cause them to reason incorrectly or produce harmful results. Knowledge Graphs (KGs) provide rich and reliable contextual information for the reasoning process of LLMs by structurally organizing and connecting a wide range of entities and relations. Existing KG-based LLM reasoning methods only inject KGs' knowledge into prompts in a textual form, ignoring its structural information. Moreover, they mostly rely on close-source models or open-source models with large parameters, which poses challenges to high resource consumption. To address this, we propose a novel Lightweight and efficient Prompt learning-ReasOning Framework for KGQA (LightPROF), which leverages the full potential of LLMs to tackle complex reasoning tasks in a parameter-efficient manner. Specifically, LightPROF follows a “Retrieve-Embed-Reason” process, first accurately, and stably retrieving the corresponding reasoning graph from the KG through retrieval module. Next, through a Transformer-based Knowledge Adapter, it finely extracts and integrates factual and structural information from the KG, then maps this information to the LLM’s token embedding space, creating an LLM-friendly prompt to be used by the LLM for the final reasoning. Additionally, LightPROF only requires training Knowledge Adapter and can be compatible with any open-source LLM. Extensive experiments on two public KGQA benchmarks demonstrate that LightPROF achieves superior performance with small-scale LLMs. Furthermore, LightPROF shows significant advantages in terms of input token count and reasoning time.
This paper introduces a novel Large Language Model (LLM)-based system designed to enhance learning effect through Socratic inquiry, thereby fostering deep understanding and long-term knowledge consolidation. Recognizing the challenges of implementing Socratic pedagogy and memory reinforcement principles (as highlighted by the Ebbinghaus forgetting curve) in traditional settings, this research explores a new framework that integrates the power of LLMs with established pedagogical approaches grounded in constructivist learning theory and cognitive load theory. This system, which includes carefully designed AI-driven questioning techniques informed by the Socratic method, dynamic note management, and memory reinforcement strategies guided by the testing effect, was evaluated through a quasi-experimental classroom intervention. The study revealed that students using the LLM-based framework demonstrated significantly better understanding of complex topics, and were also more engaged with the learning process, when compared to students using traditional methods. These results highlight the potential for LLMs to transform educational practices by creating new, more effective and personalized learning pathways that reduce cognitive load and facilitate long-term retention. This research offers practical insights for educators and researchers seeking to leverage AI to promote meaningful learning experiences and to help students internalize knowledge more effectively.
In the paper, we propose an approach of multimodal RAG (Retrieval-Augmented Generation) on the LLM (Large Language Model) Qwen-VL to implement a bot for knowledge Q&A (Question and Answer) and appreciation of Chinese brush calligraphy. Firstly, electronic files on calligraphy, which contain a large number of images of brush strokes and texts, are divided into multiple chunks and vectorized, then stored in a vector library as the knowledge base. When a calligraphy enthusiast inputs an image or some text in the bot dialog for knowledge questioning or art appreciation, the image features are extracted using the CLIP model, and the text is encoded using the word embedding model. Then the proposed method is used to retrieve the appropriate text and images from the knowledge base and pass them to the LLM, which answers the enthusiast's questions in a multimodal way. In terms of system implementation, the paper uses the LangChain framework to orchestrate the task flow and Gradio to build a lightweight interactive interface, which supports mixed graphical input and visualized output. The bot developed in the paper is an assistant for calligraphy enthusiasts to learn and appreciate Chinese calligraphy.
Large Language Models (LLMs) have shown remarkable reasoning capabilities on complex tasks, but they still suffer from out-of-date knowledge, hallucinations, and opaque decision-making. In contrast, Knowledge Graphs (KGs) can provide explicit and editable knowledge for LLMs to alleviate these issues. Existing paradigm of KG-augmented LLM manually predefines the breadth of exploration space and requires flawless navigation in KGs. However, this paradigm cannot adaptively explore reasoning paths in KGs based on the question semantics and self-correct erroneous reasoning paths, resulting in a bottleneck in efficiency and effect. To address these limitations, we propose a novel self-correcting adaptive planning paradigm for KG-augmented LLM named Plan-on-Graph (PoG), which first decomposes the question into several sub-objectives and then repeats the process of adaptively exploring reasoning paths, updating memory, and reflecting on the need to self-correct erroneous reasoning paths until arriving at the answer. Specifically, three important mechanisms of Guidance, Memory, and Reflection are designed to work together, to guarantee the adaptive breadth of self-correcting planning for graph reasoning. Finally, extensive experiments on three real-world datasets demonstrate the effectiveness and efficiency of PoG.
Knowledge Graph (KG) reasoning aims to predict missing entities in incomplete triples, which requires adequate structural information to derive accurate embeddings. However, KGs in the real world are not as dense as the idealized benchmarks, where sparse graph structures restrict the comprehensive structural information for superior performance. Although the logical semantics in KGs shows its potential in alleviating the impact of structural sparsity, there still exist some challenges. The deficient supervision and the semantic gap of logic make it difficult to introduce logical semantics in sparse KG reasoning. To this end, we propose a novel KG reasoning approach LoLLM injecting logic with the supervised information supplied by the Large Language Model (LLM), which is proved to be effective in evaluating and scoring. Firstly, LoLLM derives structural embeddings employing a graph convolutional network (GCN) with relation-aware and triple-aware attention. LoLLM secondly constructs reasoning paths instantiated from the first-order logic rules extracted from sparse KGs, and injects the logical semantics by a designed LLM-enhanced tuning strategy. We propose a textual loss (TL) and a logical loss (LL) in the optimization and obtain logical tuning embeddings of KG in this process. Finally, LoLLM fuses structural embeddings from the GCN and logical tuning embeddings from the LLM-enhanced tuning for scoring and incomplete triple prediction. Extensive experiments on two sparse KGs and a benchmark show that LoLLM outperforms state-of-the-art structure-based and Language Model (LM)-augmented baselines. Moreover, the logic rules with corresponding confidences provide explicit explanations as an interpretable paradigm.
Large language models (LLMs) have demonstrated remarkable success across diverse natural language processing (NLP) tasks, yet their limited knowledge reserves and persistent hallucinations undermine performance in complex reasoning. Knowledge graphs (KGs), with structured and verified information, provide a reliable foundation for reasoning, but existing KG-based LLM methods often treat KGs as static bases, neglecting their structural information. This oversight introduces spurious knowledge and compromises answer accuracy. To address these limitations, we propose LRwP, i.e., knowledge graph-enhanced Large language model Reasoning with Prompt engineering framework, which synergistically integrates LLMs and KGs while enhancing prompt design to improve reasoning fidelity and interpretability. LRwP comprises two key stages: (1) Subgraph retrieval, where a refined Personalized PageRank algorithm aligns queries with KG structures to yield compact subgraphs with maximized answer coverage; and (2) Reasoning, where task-specific prompts guide LLMs to generate chains of thought and KG paths, which are semantically and structurally aligned to identify the most relevant answers. Extensive experiments demonstrate that LRwP significantly outperforms state-of-the-art baselines on both simple and multi-hop reasoning tasks, delivering more faithful and interpretable results.
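The subgraph-retrieval stage can be illustrated with a plain Personalized PageRank call. This sketch uses networkx as a stand-in graph backend and does not reproduce the paper's refinements to PPR; all names are illustrative:

```python
# Sketch of the subgraph-retrieval stage: Personalized PageRank seeded on
# the question's linked entities; the top-scoring nodes form the compact
# subgraph handed to the LLM for reasoning.
import networkx as nx

def ppr_subgraph(kg: nx.DiGraph, seed_entities, top_k=20, alpha=0.85):
    seeds = {n for n in seed_entities if n in kg}
    assert seeds, "question entities must be linked to KG nodes first"
    personalization = {n: (1.0 if n in seeds else 0.0) for n in kg}
    scores = nx.pagerank(kg, alpha=alpha, personalization=personalization)
    keep = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return kg.subgraph(keep).copy()  # compact, query-biased subgraph
```

Restricting the random walk's teleport set to the question entities is what biases the ranking toward answer-covering neighborhoods rather than globally central nodes.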
Airworthiness directives contain rich, standardized information critical for diagnosing aircraft faults. However, the complexity, domain specificity, and heterogeneous data characteristics of these texts make it difficult to extract and structure this knowledge into an organized fault knowledge graph (KG), thereby limiting progress toward intelligent civil-aviation maintenance and management. To address this challenge, we propose a large language model (LLM) fine-tuning approach that integrates domain knowledge to mine fault knowledge from Chinese Airworthiness Directive (CAD) texts. After comprehensive text preprocessing and expert-guided manual annotation, we constructed a specialized dataset for aircraft fault knowledge discovery, encompassing named entity recognition (NER) and relation extraction (RE) tasks. The LLM was fine-tuned through parameter-efficient adaptation methods (Freeze, P-tuning, and LoRA), with domain knowledge incorporated via tailored prompt templates to enable intelligent knowledge extraction from CAD texts. Experimental results demonstrate that the domain-enhanced LLM achieves F1 scores of 81.64% on NER and 88.30% on RE -- improvements of 12.76% and 3.97%, respectively, over conventional pretrained language models (PLMs). These results confirm the effectiveness of the proposed knowledge-embedded LLM framework in constructing aircraft fault KGs and advancing expert systems for civil-aviation safety and airworthiness management.
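As a rough illustration of the parameter-efficient adaptation route (the LoRA variant), a Hugging Face `peft` setup looks like the sketch below; the base model name and every hyperparameter are placeholders, not the paper's configuration:

```python
# Illustrative LoRA fine-tuning setup for the extraction tasks (NER/RE
# framed as text generation). Only the low-rank adapter matrices train;
# the frozen base model keeps its pretrained weights.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-7B")  # placeholder base
lora = LoraConfig(
    r=8,                       # adapter rank
    lora_alpha=16,             # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # typically well under 1% of the model
```

Domain knowledge would then enter through the prompt templates wrapped around each annotated CAD sentence during training.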
Large Language Models (LLMs) are often challenged by generating erroneous or hallucinated responses, especially in complex reasoning tasks. Leveraging Knowledge Graphs (KGs) as external knowledge sources has emerged as a viable solution. However, existing KG-enhanced methods, either retrieval-based or agent-based, encounter difficulties in accurately retrieving knowledge and efficiently traversing KGs at scale. In this paper, we propose a unified framework, FiDeLiS, designed to improve the factuality of LLM responses by anchoring answers to verifiable reasoning steps retrieved from KGs. To achieve this, we leverage step-wise beam search with a deductive scoring function, allowing the LLM to validate the reasoning process step by step and halt the search once the question is deducible. In addition, we propose a Path-RAG module to pre-select a smaller candidate set for each beam search step, reducing computational costs by narrowing the search space. Extensive experiments show that our method, as a training-free framework, not only improves performance but also enhances factuality and interpretability across different benchmarks. Code is released at https://github.com/Y-Sui/FiDeLiS.
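The step-wise beam search with deductive halting reduces to the loop below; `path_rag_fn`, `score_fn`, and `deducible_fn` are placeholder callables standing in for the Path-RAG pre-selection, the deductive scorer, and the deducibility check, respectively:

```python
# Control-flow sketch of step-wise beam search with deductive halting.
# Each beam is a partial reasoning path (a list of KG steps).
def beam_search_reasoning(question, beams, path_rag_fn, score_fn,
                          deducible_fn, beam_width=3, max_depth=4):
    for _ in range(max_depth):
        candidates = []
        for path in beams:
            for step in path_rag_fn(path):        # pre-selected candidates
                candidates.append(path + [step])
        # Deductive scoring: how well does the extension entail an answer?
        candidates.sort(key=lambda p: score_fn(question, p), reverse=True)
        beams = candidates[:beam_width]
        done = [p for p in beams if deducible_fn(question, p)]
        if done:
            return done[0]   # halt as soon as the question is deducible
    return beams[0] if beams else None
```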
Knowledge Graph Completion (KGC) is important in addressing the incompleteness of Knowledge Graphs (KGs) and supporting intelligent infrastructures. Recently, numerous methods have been developed to use Large Language Models (LLMs) to complete KGs in a textual generation manner. However, integrating KGs with LLMs presents multiple challenges. First, the output of LLMs is often unconstrained, so directly performing KGC with LLMs may generate entities beyond the scope of KGs and suffer from hallucination. Second, the inherent token length limitation of LLMs may hinder the integration of multi-aspect information, restricting effectiveness and efficiency in inference. Third, existing methods that integrate LLMs with KGs typically leverage only partial aspects of KG contexts, overlooking the crucial role of multi-aspect information in prompting KGC. In fact, there exists crucial multi-aspect information in KG contexts that supports the correctness of a factual triple, such as entity/relation descriptions, reasoning paths, and entity neighbors. To this end, we propose a novel framework, termed MAKGC, to explore the impact of multi-aspect information from KG contexts on KGC. In particular, given a target incomplete triple, MAKGC generates a list of candidate entities using an embedding model, incorporates the most relevant relation paths and entity descriptions as embeddings, and integrates them with structural embeddings into a set of instructions. In this way, the multi-aspect information facilitates accurate LLM predictions. Extensive experiments demonstrate its effectiveness on benchmark datasets, and our model outperforms previous competitive methods.
As the aviation industry develops rapidly, the increasing complexity of aircraft systems demands more efficient and accurate fault diagnosis. Knowledge Graphs (KGs) enable structured fault knowledge organization but are costly to construct. Large Language Models (LLMs) offer new possibilities for automated domain KG construction with their natural language processing capabilities. This paper proposes an LLM-driven framework for aircraft fault diagnosis KG automation. It preprocesses unstructured data via hierarchical chunking to fit LLM context limits while retaining semantics. The core is a prompt-based “extract-verify-correct” iterative mechanism, which extracts entities/relations accurately and mitigates LLM hallucination and omission. Finally, knowledge vectorization and similarity calculation fuse local knowledge to eliminate redundancy, building a unified KG. The paper details the framework's design and key technologies, providing a high-quality knowledge base for intelligent fault diagnosis and advancing aviation maintenance intelligence.
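A minimal rendering of the “extract-verify-correct” iteration over a single chunk might look like this; `llm` is a placeholder client and the prompt wording is purely illustrative, not the paper's prompts:

```python
# Sketch of the iterative extraction mechanism: extract triples, have the
# LLM verify them against the source chunk, then correct and repeat.
def extract_triples(chunk, llm, max_rounds=3):
    triples = llm(f"Extract (entity, relation, entity) triples:\n{chunk}")
    for _ in range(max_rounds):
        issues = llm("Check these triples against the text for hallucinated "
                     f"or missing facts; reply OK if none:\n{chunk}\n{triples}")
        if issues.strip().lower() == "ok":
            break  # verified: no hallucinations or omissions found
        triples = llm(f"Correct the triples given these issues:\n{issues}\n{triples}")
    return triples
```

Downstream, each chunk's verified triples would be embedded and merged by vector similarity to de-duplicate entities across chunks, per the fusion step described above.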
Large Language Models (LLMs) have achieved impressive results in various tasks but struggle with hallucination problems and lack of relevant knowledge, especially in deep complex reasoning and knowledge-intensive tasks. Knowledge Graphs (KGs), which capture vast amounts of facts in a structured format, offer a reliable source of knowledge for reasoning. However, existing KG-based LLM reasoning methods face challenges like handling multi-hop reasoning, multi-entity questions, and effectively utilizing graph structures. To address these issues, we propose Paths-over-Graph (PoG), a novel method that enhances LLM reasoning by integrating knowledge reasoning paths from KGs, improving the interpretability and faithfulness of LLM outputs. PoG tackles multi-hop and multi-entity questions through a three-phase dynamic multi-hop path exploration, which combines the inherent knowledge of LLMs with factual knowledge from KGs. In order to improve the efficiency, PoG prunes irrelevant information from the graph exploration first and introduces efficient three-step pruning techniques that incorporate graph structures, LLM prompting, and a pre-trained language model (e.g., SBERT) to effectively narrow down the explored candidate paths. This ensures all reasoning paths contain highly relevant information captured from KGs, making the reasoning faithful and interpretable in problem-solving. PoG innovatively utilizes graph structure to prune the irrelevant noise and represents the first method to implement multi-entity deep path detection on KGs for LLM reasoning tasks. Comprehensive experiments on five benchmark KGQA datasets demonstrate PoG outperforms the state-of-the-art method ToG across GPT-3.5-Turbo and GPT-4, achieving an average accuracy improvement of 18.9%. Notably, PoG with GPT-3.5-Turbo surpasses ToG with GPT-4 by up to 23.9%.
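The SBERT-based pruning step can be sketched in a few lines. The path verbalization below is deliberately naive, and the checkpoint `all-MiniLM-L6-v2` is an assumption standing in for whatever encoder the paper used:

```python
# Sketch of pre-trained-LM path pruning: verbalize each candidate reasoning
# path and keep only the ones most similar to the question, so the LLM only
# sees highly relevant KG evidence.
from sentence_transformers import SentenceTransformer, util

sbert = SentenceTransformer("all-MiniLM-L6-v2")

def prune_paths(question, paths, keep=5):
    texts = [" -> ".join(p) for p in paths]   # naive path verbalization
    q = sbert.encode(question, convert_to_tensor=True)
    t = sbert.encode(texts, convert_to_tensor=True)
    sims = util.cos_sim(q, t)[0]
    ranked = sorted(zip(paths, sims.tolist()), key=lambda x: -x[1])
    return [p for p, _ in ranked[:keep]]
```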
Large language models (LLMs) have demonstrated impressive reasoning abilities in complex tasks. However, they lack up-to-date knowledge and experience hallucinations during reasoning, which can lead to incorrect reasoning processes and diminish their performance and trustworthiness. Knowledge graphs (KGs), which capture vast amounts of facts in a structured format, offer a reliable source of knowledge for reasoning. Nevertheless, existing KG-based LLM reasoning methods only treat KGs as factual knowledge bases and overlook the importance of their structural information for reasoning. In this paper, we propose a novel method called reasoning on graphs (RoG) that synergizes LLMs with KGs to enable faithful and interpretable reasoning. Specifically, we present a planning-retrieval-reasoning framework, where RoG first generates relation paths grounded by KGs as faithful plans. These plans are then used to retrieve valid reasoning paths from the KGs for LLMs to conduct faithful reasoning. Furthermore, RoG not only distills knowledge from KGs to improve the reasoning ability of LLMs through training but also allows seamless integration with any arbitrary LLMs during inference. Extensive experiments on two benchmark KGQA datasets demonstrate that RoG achieves state-of-the-art performance on KG reasoning tasks and generates faithful and interpretable reasoning results.
Engineering inspection is of great significance to ensure the safe operation of a project. However, the unclear query statements of inspectors pose a challenge for intelligent question-answering tasks. Existing knowledge graph-based question-answering systems face issues of vocabulary limitations and reliance on fixed templates. Solely relying on Large Language Models (LLMs) for questioning introduces noise and randomness due to their extensive knowledge base. Therefore, this paper proposes a novel approach that synergistically employs both knowledge graphs and LLMs for intelligent question-answering in hydroengineering inspection. The method divides the overall task into five units, progressively clarifying query statements to reach accurate answers. Leveraging the LLM's vast prior knowledge, robust semantic understanding, and contextual learning mitigates issues related to vocabulary limitations and template dependence. Simultaneously, the knowledge contained in the graph is integrated into an optimal clarification path and transferred to the LLM to address noise and randomness, thereby enhancing the efficiency of the clarification process. Benchmark experiments demonstrate that the proposed method achieves Mean Reciprocal Rank (MRR), Mean Average Precision (MAP), Precision, and Recall metrics all above 0.73. The results affirm the method's effectiveness in improving the accuracy of intelligent question-answering in hydroengineering inspection, with potential implications for similar applications in other domains of hydraulic engineering.
Fall-from-height (FFH) accidents remain a major source of workplace injuries and fatalities. Fall protection systems (FPS) are critical for preventing falls in the work-at-height (WAH) environment. However, challenges in designing and selecting effective FPS persist across various industries, and existing tools often lack practical references. This study aims to develop an FFH-specific knowledge graph (FFH-KG) to support FPS design. By structuring accident data, the FFH-KG provides empirical insights to help designers improve FPS frameworks, aiding safety planning and decision-making. It serves as a decision support system for FPS designers and safety professionals, guiding the selection and design of appropriate protection solutions for diverse WAH scenarios. The FFH-KG was constructed using a hybrid natural language processing approach, combining manual extraction, entity recognition, text segmentation, and rule-based relation extraction. It was grounded in a schema layer (i.e., ontology) established by experts. A text-mining approach, integrating machine learning with a large language model, facilitated the categorization of fall types, refinement of WAH scenarios, and identification of fall causes, enhancing the content and applicability of the knowledge graph. A total of 2,200 entities and 4,820 relationships were created based on fall protection equipment standard documents and fall-from-height accident investigation reports, forming a foundation for developing countermeasures. The retrieval performance of the FFH-KG was validated through three case studies. This research also marks significant progress in intelligent safety engineering and management across industries.
With the rapid advancement of artificial intelligence, the demand for AI talent is constantly evolving. Traditional expert-driven competency modeling approaches suffer from slow update cycles, hindering their ability to provide timely educational guidance. This study proposes an AI competency knowledge graph constructed using large language models (LLMs), enabling the transformation of unstructured recruitment texts into structured educational knowledge through an end-to-end automated framework. A total of 1,142 industry job postings were collected, and competency entities were extracted using few-shot prompt engineering. A two-stage strategy combining semantic embedding and LLM-assisted validation was employed for entity alignment and standardization. The method achieved a micro F1 score of 72.5% on a validation set of 120 samples, resulting in a knowledge graph containing 5,793 standardized competency entities. Application cases such as core skill identification and personalized career planning demonstrate the graph’s applicability in curriculum design, career guidance, and learning support. This research establishes a data-driven approach for translating dynamic labor market demands into structured educational knowledge, providing a digital foundation for AI education.
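The two-stage alignment strategy (embedding proposal, then LLM validation) can be sketched as follows; `llm_confirms`, the encoder checkpoint, and the similarity threshold are illustrative placeholders, not the study's settings:

```python
# Sketch of two-stage entity alignment: embeddings propose near-duplicate
# competency entities; an LLM call (placeholder) confirms each merge before
# the raw entity is mapped onto the canonical vocabulary.
from sentence_transformers import SentenceTransformer, util

enc = SentenceTransformer("all-MiniLM-L6-v2")

def align(entities, canon, llm_confirms, threshold=0.8):
    """Map raw extracted entities onto a canonical competency list."""
    e_vecs, c_vecs = enc.encode(entities), enc.encode(canon)
    sims = util.cos_sim(e_vecs, c_vecs)
    mapping = {}
    for i, ent in enumerate(entities):
        j = int(sims[i].argmax())
        # Stage 1 proposes the closest canonical entity; Stage 2 validates.
        if sims[i][j] >= threshold and llm_confirms(ent, canon[j]):
            mapping[ent] = canon[j]
    return mapping
```

The threshold trades recall for precision: borderline pairs fall through to the LLM check instead of being merged blindly.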
While learning personalization offers great potential for learners, modern practices in higher education require a deeper consideration of domain models and learning contexts to develop effective personalization algorithms. This paper introduces an innovative approach to higher education curriculum modelling that utilizes large language models (LLMs) for knowledge graph (KG) completion, with the goal of creating personalized learning-path recommendations. Our research focuses on modelling university subjects and linking their topics to corresponding domain models, enabling the integration of learning modules from different faculties and institutions in the student's learning path. Central to our approach is a collaborative process, where LLMs assist human experts in extracting high-quality, fine-grained topics from lecture materials. We develop domain, curriculum, and user models for university modules and stakeholders. We implement this model to create the KG from two study modules: Embedded Systems and Development of Embedded Systems Using FPGA. The resulting KG structures the curriculum and links it to the domain models. We evaluate our approach through qualitative expert feedback and quantitative graph quality metrics. Domain experts validated the relevance and accuracy of the model, while the graph quality metrics measured the structural properties of our KG. Our results show that the LLM-assisted graph completion approach enhances the ability to connect related courses across disciplines to personalize the learning experience. Expert feedback also showed high acceptance of the proposed collaborative approach for concept extraction and classification.
Educational applications of Large Language Models (LLMs) face two key challenges: balancing their generative adaptability with factual reliability, and ensuring content accuracy within personalized learning paths. This paper presents LKPS (Large Language Model-Knowledge Graph for Personalized Learning System), an adaptive learning system that addresses these issues via two core innovations: (1) the SafeRAG framework, which uses structured prompting, multimodal cross-verification, and post-process verification to minimize hallucinations while maintaining content reliability; and (2) a hybrid path planner that integrates collaborative filtering with real-time knowledge tracing for personalized pathway optimization. In simulated multimodal machine learning scenarios, LKPS significantly outperforms traditional methods in both personalization and learning success, offering a viable pathway toward secure and deeply personalized educational AI.
The goal of the suggested Hybrid Knowledge Graph-Neural Network Framework is to transform automated student evaluation and customized feedback in higher education. The extensive nature of individual learning patterns and the contextual connections between course contents, student interactions, and performance indicators are frequently missed by traditional feedback and evaluation systems. While the Neural Network (NN) component improves flexibility and learning accuracy through deep data-driven analysis, the integration of Knowledge Graphs (KG) permits organized representation of educational material. In order to provide students with timely, relevant, and specific feedback, this hybrid method successfully models the links between learning activities, assessments, and results. In terms of accuracy, flexibility, personalization, and ethical dependability, the framework performs better than current models like Computerized Formative Testing, Student Relationship Engagement Systems, and AI-driven Learning Analytics. The findings of the experiment show that the hybrid model offers better performance and scalability, encouraging context-aware and intelligent feedback systems for better learning outcomes and educational involvement.
This paper proposes an Intelligent Evaluation Agent (IEA) with knowledge graph construction based on the Large Language Model (LLM) and Trustworthy AI Dialogue Engine (TAIDE) for personalized Human-Machine Interactive Learning (HMIL). The intelligent agent handles multiple tasks such as learning data preparation and the learner's data generation, preprocessing, analysis, and evaluation. Multi-modal data is collected from human-machine interactive activities and processed by an IEA to generate structured data stored in human learning repositories. The intelligent agent focuses on various temporal learning periods, such as macro, meso, and micro-level assessments, by integrating Human Intelligence (HI) and Machine Intelligence (MI) results, with the MI-based Genetic Algorithm and Neural Network (GANN) learning mechanism employed to optimize the intelligent evaluation model. The learning data evaluation phase aims to identify a model that best fits the group's learning behavior through HI-based evaluation and to train it further using MI, ensuring that the trained GANN-IEA model closely approximates the HI-based model. An LLM-based knowledge graph agent also supports the evaluation process by helping teachers analyze and visualize students' learning progress. Experimental results demonstrate that students who study diligently gain knowledge and exhibit increased interest in learning through HMIL. However, the evidence also suggests that some students who excessively rely on Generative AI (GAI) to reproduce learning content without modification become less inclined to engage in diligent study. Additionally, the proposed IEA effectively reduces teachers' workload in assessing students' learning status at the end of the semester and supports personalized learning through the designed HMIL model.
With the rapid development of internet technology, the field of education is undergoing profound changes. As college students will be the main force of future society, their adaptive learning ability is particularly important. This article explores how to enhance college students' adaptive learning ability by combining large-scale models and knowledge graph technologies. By constructing a knowledge graph system enhanced by large-scale models, in-depth mining of learning content and personalized recommendation can be achieved, providing college students with more accurate and efficient learning paths and thereby cultivating their capacity for autonomous and lifelong learning.
Undergraduate students from non-mathematical backgrounds often struggle with foundational subjects like calculus, especially in environments lacking personalized guidance to navigate complex prerequisite dependencies. To address this, a systematic literature review was first conducted, which revealed a significant gap: a lack of systems designed for the hierarchical nature of university-level mathematics. This paper presents the design and implementation of a novel software system that addresses this gap. The system integrates a curriculum-specific Knowledge Graph (KG) with a Large Language Model (LLM) through a hybrid Retrieval-Augmented Generation (RAG) architecture. This architecture uses the KG for structural validation and a Vector Database for semantic context, grounding the LLM to prevent hallucinations. The system dynamically identifies prerequisite knowledge concepts from student queries and generates accurate, visual learning road maps. Technical validation of the functional prototype confirms the RAG pipeline's ability to produce pedagogically sound paths and handle unknown concepts via a human-in-the-loop workflow. This work offers a validated blueprint for building reliable and effective LLM-powered educational platforms for personalized learning.
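A compact sketch of this hybrid grounding step, under the assumption that prerequisites are stored as a directed graph: the KG validates and orders the prerequisite chain while a vector retriever (placeholder `retriever`) supplies semantic context for the prompt. The `llm` callable and prompt wording are illustrative:

```python
# Sketch of hybrid KG + vector-DB RAG for prerequisite roadmaps: the KG
# gives structurally validated ordering, the vector store gives context,
# and unknown concepts fall back to a human-in-the-loop queue.
import networkx as nx

def learning_roadmap(kg: nx.DiGraph, concept, retriever, llm):
    if concept not in kg:
        return None  # unknown concept -> route to human review
    # Structural validation: collect every prerequisite of the concept
    # (edges point prerequisite -> dependent) and order them topologically.
    prereqs = set(nx.ancestors(kg, concept)) | {concept}
    ordered = list(nx.topological_sort(kg.subgraph(prereqs)))
    context = retriever(concept)  # semantic chunks from the vector DB
    prompt = (f"Using only this context:\n{context}\n"
              f"Explain the learning path {' -> '.join(ordered)} step by step.")
    return llm(prompt)
```

Keeping the path derivation in the graph rather than the LLM is what prevents hallucinated prerequisites: the model only verbalizes an ordering the KG has already certified.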
This paper proposes a novel educational system that integrates Large Language Models (LLMs) with Knowledge Graphs (KGs) to generate personalized learning paths. By leveraging the structured knowledge representation of KGs and the generative capabilities of LLMs, the system aims to enhance the learning experience in the field of Artificial Intelligence Programming. Our contributions include the construction of a comprehensive KG for AI programming, the development of a prompt engineering framework for LLMs, and the implementation of a web-based learning system. The system demonstrates significant potential in providing context-aware and user-friendly learning experiences, paving the way for more intelligent educational tools.
Traditional educational systems struggle to model dynamic cognitive processes, limiting personalized interventions. This paper presents a Learner Cognitive Graph (LCG) framework using educational large language models with bias mitigation to address this challenge. We introduce a Dynamic Cognition Graph (DCG) to represent spatiotemporal interactions among students, knowledge, and exercises, capturing cognitive evolution and state transitions. A reverse Turing test-driven agent collects multi-modal behavioral data via structured prompts with hallucination control, while dynamic graph neural networks and reinforcement learning enable behavior prediction and personalized intervention optimization. The framework forms a closed loop from perception to adaptive support, enhancing cognitive modeling precision and providing scalable learning support. Key innovations include heterogeneous DCG construction, interactive data extraction with bias detection, and data-driven intervention design. This work advances intelligent educational systems while addressing inherent biases in large language models.
Knowledge Graphs (KGs) are pivotal in various NLP applications but often grapple with incompleteness, especially due to the long-tail problem where infrequent, unpopular relationships drastically reduce the KG completion performance. In this paper, we focus on Few-shot Knowledge Graph Completion (FKGC), a task addressing these gaps in long-tail scenarios. Amidst the rapid evolution of Large Language Models, we propose a generation-based FKGC paradigm facilitated by LLM distillation. Our MuKDC framework employs multi-level knowledge distillation for few-shot KG completion, generating supplementary knowledge to mitigate data scarcity in few-shot environments. MuKDC comprises two primary components: Multi-level Knowledge Generation, which enriches the KG at various levels, and Consistency Assessment, to ensure the coherence and reliability of the generated knowledge. Most notably, our method achieves SOTA results in both FKGC and multi-modal FKGC benchmarks, significantly advancing KG completion and enhancing the understanding and application of LLMs in structured knowledge generation and assessment.
Personalized assessment systems require sophisticated question sequencing strategies that can adapt to individual student knowledge states while efficiently evaluating learning outcomes across complex domain structures. Traditional assessment approaches rely on static question ordering or simple adaptive algorithms that fail to leverage the intricate relationships between learning concepts and cannot optimize question sequences for both assessment efficiency and learning reinforcement. The challenge lies in developing intelligent systems that can dynamically select and sequence questions based on real-time student performance while considering conceptual dependencies and individual learning characteristics. This study proposes a novel framework that integrates concept-graph-aware structures with Reinforcement Learning (RL) techniques to enable intelligent question sequencing for personalized assessment systems. The framework employs graph neural networks to model domain knowledge relationships while utilizing deep RL agents to learn optimal question selection policies that maximize assessment accuracy and educational value. The concept-graph representation captures prerequisite dependencies, difficulty progressions, and semantic relationships between assessment items, enabling more informed sequencing decisions that align with pedagogical principles and individual learning pathways. Experimental evaluation using comprehensive educational datasets demonstrates that the proposed framework achieves 39% improvement in assessment efficiency compared to traditional adaptive testing methods. The concept-graph-aware approach results in 45% better knowledge state estimation accuracy and 33% reduction in assessment duration while maintaining equivalent measurement precision. The framework successfully balances assessment objectives with learning reinforcement, resulting in 28% improvement in student engagement and 31% better learning outcome prediction compared to conventional question sequencing approaches.
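As a toy illustration of the decision loop only: the framework above uses GNN state encoders and deep RL policies, but a tabular Q-learner over a coarse knowledge state is enough to show the select-observe-update cycle it builds on. All names here are illustrative:

```python
# Toy sketch of RL-driven question sequencing: pick the next question from
# the current (coarse) knowledge state, observe the student's response as
# reward, and update the value table. The real framework replaces the table
# with GNN-encoded states and a deep RL agent.
import random
from collections import defaultdict

Q = defaultdict(float)  # (state, question) -> estimated value

def next_question(state, questions, eps=0.1):
    if random.random() < eps:                 # occasionally explore
        return random.choice(questions)
    return max(questions, key=lambda q: Q[(state, q)])  # otherwise exploit

def update(state, q, reward, next_state, questions, lr=0.1, gamma=0.9):
    best_next = max(Q[(next_state, nq)] for nq in questions)
    Q[(state, q)] += lr * (reward + gamma * best_next - Q[(state, q)])
```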
In this work, the authors provide a novel framework for improving the effectiveness of AI writing assessment systems by embedding state-of-the-art deep learning networks, user feedback mechanisms, and knowledge graph frameworks. Most writing assessment tools cannot give personalized, detailed feedback. To tackle this problem, we employ the transformer models BERT and GPT-3 for writing assessment, which allow exploring and scoring the writing on various features, including phrase structure, semantics, and vocabulary usage. In our system, we propose a dynamic relational knowledge graph that incorporates writing concepts and their relations, making it easier for the system to devise contextualized, thesaurus-style suggestions. The addition of graph neural networks (GNNs) empowers the model by boosting its learning over the knowledge graph and improving comprehension of complex semantics. Additionally, we include an iterative design whereby user feedback is collected and the system adjusts the feedback it gives in light of historical feedback and changes in a user's writing behavior over time. The system reframes user-AI interaction as inherently dynamic, moving the system toward the user rather than the reverse, and thereby achieves higher efficiency. To assess user satisfaction and improvements in the quality of the prepared texts, the authors conduct a series of user studies evaluating the efficiency of this integrated system. Preliminary data from the task performance analysis show that the proposed framework performs far better than traditional methods, achieving greater engagement and better feedback during assessment. This study underscores the potential of integrating deep learning, feedback, and knowledge graphs in writing education: it can potentially reform learners' capabilities, enabling them to write better and more effectively.
The research on assessment algorithms and models for Chinese spoken language teaching based on natural language processing and knowledge graphs has two main motivations: the growing global demand for learning Chinese, and the potential application of advanced computing technology in language learning. The significance of this research lies in providing a more scientific and systematic assessment method for Chinese teaching and, on a macro level, paving the way for the future development of language learning technologies. Through this study, not only can teachers better guide students' learning, but students can also receive more effective learning feedback and guidance. Experimental data show that the accuracy rate of the Chinese spoken language teaching assessment algorithm system based on natural language processing and knowledge graphs is 98.05%, and the satisfaction rate for personalized teaching evaluation reaches 98.12%. In summary, this research provides new methods and approaches for the personalization and technologization of Chinese spoken language teaching, with a lasting impact on the field of language education.
This paper proposes a programming competition training framework based on the synergy of large language model (LLM) agents and a knowledge graph, which realizes personalized generation of training paths and accurate identification of knowledge blind spots by constructing a multidimensional ability diagnosis model and knowledge graph. The experimental results show that the framework significantly improves users' problem-solving efficiency and knowledge mastery speed; its core technology provides an intelligent training paradigm for competitive programming education and has the potential to extend to other structured knowledge learning domains.
This paper proposes a computer-aided teaching model using knowledge graph construction and learning path recommendation. It first creates a multimodal knowledge graph to illustrate complex relationships among knowledge elements. Learning elements and sequences are then used to form time sequences stored as directed graphs, supporting flexible path recommendations. Learners select elements based on their interests and prior knowledge, updating behavior data for precise path recommendations. The platform, employing a distributed architecture, integrates data processing and teaching applications for comprehensive cycle management and assessment. Controlled experiments validate its efficacy in enhancing learning outcomes compared to traditional methods, catering to personalized learning needs and advancing intelligent teaching.
In the era of personalized education, the provision of comprehensible explanations for learning recommendations is of great value to enhance the learner's understanding of and engagement with the recommended learning content. Large language models (LLMs) and generative AI have recently opened new doors for generating human-like explanations for, and alongside, learning recommendations. However, their precision is still far from acceptable in a sensitive field like education. To harness the abilities of LLMs while still ensuring a high level of precision towards the intent of the learners, this paper proposes an approach that utilizes knowledge graphs (KG) as a source of factual context for LLM prompts, reducing the risk of model hallucinations and safeguarding against wrong or imprecise information, while maintaining an application-intended learning context. We utilize the semantic relations in the knowledge graph to offer curated knowledge about learning recommendations. With domain experts in the loop, we design the explanation as a textual template, which is filled and completed by the LLM. Domain experts were integrated in the prompt engineering phase as part of a study, to ensure that explanations include information that is relevant to the learner. We evaluate our approach quantitatively using Rouge-N and Rouge-L measures, as well as qualitatively with experts and learners. Our results show enhanced recall and precision of the generated explanations compared to those generated solely by the GPT model, with a greatly reduced risk of generating imprecise information in the final learning explanation.
Aiming at the problems of cold start, rigid paths, and lack of interpretability in current learning path recommendation systems, this paper proposes a personalized learning path generation mechanism driven by the collaboration of Knowledge Graph (KG) and Large Language Model (LLM). A K-L collaborative architecture is constructed, which models knowledge entities and relationships through the knowledge layer, realizes collaborative reasoning between the two systems through the reasoning layer, and perceives learners' states through the interaction layer. A path generation and adaptive adjustment mechanism is designed: a static skeleton is generated based on topological sorting, which is fused with dynamic strategies generated by LLM to optimize the path in real time. Experiments show that the proposed model outperforms traditional methods in recommendation quality, cognitive alignment, and learning effectiveness. It can effectively generate learning paths that are more in line with cognitive laws and adapt to individual differences, providing a new solution for personalized education.
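The static-skeleton stage reduces to a topological sort of the prerequisite graph, as in this standard-library sketch; the edge list is illustrative, not from the paper:

```python
# Sketch of the static-skeleton stage: a topological sort of the
# prerequisite relation yields a baseline learning path, which the LLM
# layer then adjusts with dynamic strategies (pacing, remediation, etc.).
from graphlib import TopologicalSorter

prereqs = {                      # concept -> set of prerequisite concepts
    "loops": {"variables"},
    "functions": {"variables"},
    "recursion": {"functions"},
    "variables": set(),
}
skeleton = list(TopologicalSorter(prereqs).static_order())
print(skeleton)  # e.g. ['variables', 'loops', 'functions', 'recursion']
```

The sort guarantees the skeleton never violates a prerequisite; the LLM-side reasoning layer described above only reorders or enriches within that constraint.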
Students encounter multi-dimensional dynamic challenges in the learning process, while traditional student performance analysis, which relies on static data, fails to capture the complex relationships in learning behaviors, resulting in delayed interventions. To address this issue, this paper proposes an algorithm for constructing educational dynamic knowledge graphs and predicting intervention nodes by integrating Large Language Models (LLMs) and Graph Neural Networks (GNNs). The semantic understanding capability of LLMs (such as parsing homework errors and classroom needs) assists GNNs in building a dynamic heterogeneous knowledge graph of “students-knowledge points-teaching resources-interactive behaviors”, breaking the limitations of static data. To make up for the deficiency of GNNs in deep semantic mining, a knowledge-based semantic structure mining module (combined with Qwen2) is designed to improve the accuracy of node representation. In addition, an Integrated One Graph (IOG) module is adopted to unify individual and group classification into the prediction of “key intervention nodes”, enhancing the generalization ability across educational scenarios. Experimental results show that the IOG-CIQAN model achieves an accuracy of over 87% in tasks such as performance early warning and personalized path recommendation, outperforming traditional machine learning baselines. This study provides an effective technical framework for precise educational intervention.
In order to enhance the personalization and intelligence of English teaching in colleges and universities, a teaching assistance system based on a knowledge graph is designed and implemented. The system integrates multi-source data through a four-layer distributed architecture and adopts a knowledge processing method combining deep learning and rule extraction to construct a complete ontology model and knowledge graph. The system provides personalized learning paths and ability assessment through intelligent recommendation, visual navigation, and automatic assessment functions. Experimental results show that the system has significant advantages in enhancing students' learning effectiveness and ability development, especially in course resource recommendation and learning behavior analysis.
Large language models have advanced automatic question generation, yet hallucinations continue to undermine correctness and instructional reliability. This paper introduces a unified framework that integrates causal-graph-guided chain-of-thought reasoning with a multi-agent hallucination-mitigation architecture to generate accurate and pedagogically sound question-answer pairs. Causal graphs provide structured domain knowledge, while specialized agents collaboratively detect and correct logical, factual, solvability, and computational errors through iterative refinement. A formal hallucination-scoring model guides optimization, enabling lightweight models to achieve high fidelity. Experiments on a large learning platform show up to a 90% reduction in hallucination and a 70% improvement in question quality over baseline systems, demonstrating a scalable foundation for trustworthy artificial intelligence-powered education.
The rapid development of artificial intelligence and large language models has brought new opportunities and challenges to information technology education. Aiming at problems such as inaccurate feedback when information technology teachers apply general-purpose artificial intelligence, this study constructed the technical architecture of a teaching assistant agent based on the ReAct mechanism and developed an agent for teaching assistance with the help of Retrieval-Augmented Generation (RAG) technology. The accuracy of the feedback is evaluated by comparing the performance of the teaching assistant agent and a general large model in terms of precision and recall. The research results show that the information technology subject teaching assistant agent significantly outperforms the general large model in feedback accuracy. Its precision and recall both reach a higher level, especially in tasks such as teaching design and test question generation, where it has obvious advantages, effectively compensating for the general large model's limitations in textbook-content alignment and output accuracy.
Aiming at the problems of insufficient accuracy of retrieved content, poor portability of hybrid retrieval, and semantic drift in professional-domain question-answering systems using Retrieval-Augmented Generation (RAG) technology, this paper proposes a Reasoning Retrieval-Augmented Generation Method Integrated with Dynamic Semantic Expansion (DSE-RAG). This method enhances question-answering performance through a four-stage process: first, a dynamic semantic expansion model is utilized to expand the semantic diversity of user queries; second, a hybrid retrieval agent with reasoning capabilities is designed; then, a large language model-based text filter is introduced to accurately extract key text fragments and reduce input noise; finally, optimized contextual information and questions are combined to construct prompts, driving the generation model to output highly reliable answers. Experiments on the self-built Computer Knowledge Education dataset (CKE) and a public medical dataset show that DSE-RAG improves answer accuracy by 13% and 9%, respectively, over the traditional RAG method, and significantly outperforms mainstream retrieval-augmented methods in retrieval recall.
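The four-stage flow can be summarized as a short pipeline; every callable below is a placeholder standing in for the corresponding component (expansion model, reasoning hybrid retriever, LLM filter, generator), not the paper's implementation:

```python
# Stage-by-stage sketch of a DSE-RAG-style pipeline.
def dse_rag(query, expand, hybrid_retrieve, llm_filter, generate):
    variants = expand(query)              # 1. dynamic semantic expansion
    docs = []
    for q in variants:                    # 2. reasoning hybrid retrieval
        docs.extend(hybrid_retrieve(q))
    key_spans = llm_filter(query, docs)   # 3. extract key fragments,
                                          #    dropping input noise
    prompt = f"Context:\n{key_spans}\n\nQuestion: {query}\nAnswer:"
    return generate(prompt)               # 4. grounded generation
```

Expanding the query before retrieval is what counteracts semantic drift: paraphrased variants recover documents a single literal query would miss, and the filter stage then removes whatever the wider net dragged in.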
Recent developments in Artificial Intelligence (AI) and Natural Language Processing (NLP) have significantly influenced the education sector, mainly through Large Language Models (LLMs) and Multi-Agent Systems (MAS). Educational chatbots such as GPTutor use Retrieval-Augmented Generation (RAG) and transformer-based models to offer more intelligent tutoring and automated assessment support. Still, most existing systems depend on centralized APIs and monolithic designs, which restrict scalability, adaptability to different contexts, and data privacy. In this paper, we introduce E-GPT, a locally deployable multi-agent educational chatbot framework designed to divide learning tasks among independent, specialized agents. These include a PDF-to-Quiz Generator, PDF-to-Technical Article Writer, RAG-based Chat Assistant, Question Paper Generator, and an OCR-enabled PDF Analyzer. The system integrates fine-tuned versions of LLaMA3 and Mistral models with a MongoDB-FAISS vector database, allowing efficient context retrieval and smooth collaboration between agents. Experimental results show that E-GPT improves contextual accuracy, modular scalability, and response time while maintaining user data privacy. By distributing different cognitive functions across coordinated agents, the system takes a step toward building a more adaptive, transparent, and scalable AI-driven learning environment.
Ensuring the factual accuracy of content generated by large language models (LLMs) is of paramount importance for applications in domains such as finance, healthcare, and education, where reliability and trustworthiness are critical. In this paper, we propose a multi-agent pipeline framework (MAPF) that integrates content generation, fact extraction, and factuality verification into a structured and robust process. The pipeline leverages five specialized agents: Question Parse Agent, Search Agent, Answer Generation Agent, Fact Description Extraction Agent, and Factuality Judge Agent, each responsible for a specific task to ensure both coherence and factual consistency. By extracting and evaluating factual content through a novel structured representation based on causal triples, we introduce an approach that significantly improves the precision of factual judgments. Experimental results on a custom dataset and the public FiQA dataset demonstrate the effectiveness of the proposed pipeline in improving factual consistency by leveraging structured causal fact representations, especially in complex, knowledge-intensive domains. Our findings suggest that structured fact extraction and agent-based task specialization offer a promising pathway for enhancing LLM factuality across various applications.
The growing ubiquity of artificial intelligence (AI), in particular large language models (LLMs), has profoundly altered the way in which learners gain knowledge and interact with learning material, with many claiming that AI positively influences their learning achievements. Despite this advancement, current AI tutoring systems face limitations associated with their reactive nature, often providing direct answers without encouraging deep reflection or incorporating structured pedagogical tools and strategies. This limitation is most apparent in the field of mathematics, in which AI tutoring systems remain underdeveloped. This research addresses the question: How can AI tutoring systems move beyond providing reactive assistance to enable structured, individualized, and tool-assisted learning experiences? We introduce a novel multi-agent AI tutoring platform that combines adaptive and personalized feedback, structured course generation, and textbook knowledge retrieval to enable modular, tool-assisted learning processes. This system allows students to learn new topics while identifying and targeting their weaknesses, revise for exams effectively, and practice on an unlimited number of personalized exercises. This article contributes to the field of artificial intelligence in education by introducing a novel platform that brings together pedagogical agents and AI-driven components, augmenting the field with modular and effective systems for teaching mathematics.
Generating high-quality question-answer pairs is a hard but meaningful task. Although previous works have achieved great results on answer-aware question generation, it is difficult to apply them in practical applications in the education field. This paper is the first to address the question-answer pair generation task on real-world examination data, and proposes a new unified framework on RACE. To capture the important information of the input passage, we first automatically generate (rather than extract) keyphrases, thus reducing this task to keyphrase-question-answer triplet joint generation. Accordingly, we propose a multi-agent communication model to generate and optimize the question and keyphrases iteratively, and then apply the generated question and keyphrases to guide the generation of answers. To establish a solid benchmark, we build our model on a strong generative pre-training model. Experimental results show that our model makes great breakthroughs in the question-answer pair generation task. Moreover, we provide a comprehensive analysis of our model, suggesting new directions for this challenging task.
Large language models (LLMs) have significantly advanced smart education in the artificial general intelligence era. A promising application lies in the automated generation of instructional design for curriculum and learning activities, focusing on two key aspects: 1) customized generation: generating niche-targeted teaching content based on students' varying learning abilities and states, and 2) intelligent optimization: iteratively optimizing content based on feedback from learning effectiveness or test scores. Currently, a single large LLM cannot effectively manage the entire process, posing a challenge for designing intelligent teaching plans. To address these issues, we developed EduPlanner, an LLM-based multiagent system comprising an evaluator agent, an optimizer agent, and a question analyst, working in adversarial collaboration to generate customized and intelligent instructional design for curriculum and learning activities. Taking mathematics lessons as our example, EduPlanner employs a novel Skill-Tree structure to accurately model the background mathematics knowledge of student groups, personalizing instructional design for curriculum and learning activities according to students' knowledge levels and learning abilities. In addition, we introduce the CIDDP, an LLM-based 5-D evaluation module encompassing Clarity, Integrity, Depth, Practicality, and Pertinence, to comprehensively assess mathematics lesson plan quality and bootstrap intelligent optimization. Experiments conducted on the GSM8K and Algebra datasets demonstrate that EduPlanner excels in evaluating and optimizing instructional design for curriculum and learning activities. Ablation studies further validate the significance and effectiveness of each component within the framework.
The question generation system (QGS) for information technology (IT) education, designed to create, evaluate, and improve Multiple-Choice Questions (MCQs) using knowledge graphs (KGs) and large language models (LLMs), encounters three major needs: ensuring the generation of contextually relevant and accurate distractors, enhancing the diversity of generated questions, and balancing the higher-order thinking of questions to match various learning levels. To address these needs, we proposed a multi-agent system named Multi-Examiner, which integrates KGs, domain-specific search tools, and local knowledge bases, categorized according to Bloom’s taxonomy, to enhance the contextual relevance, diversity, and higher-order thinking of automatically generated information technology MCQs. Our methodology employed a mixed-methods approach combining system development with experimental evaluation. We first constructed a specialized architecture combining knowledge graphs with LLMs, then implemented a comparative study generating questions across six knowledge points from K-12 Computer Science Standard. We designed a multidimensional evaluation rubric to assess the semantic coherence, answer correctness, question validity, distractor relevance, question diversity, and higher-order thinking, and conducted a statistical analysis of ratings provided by 30 high school IT teachers. Results showed statistically significant improvements (p < 0.01) with Multi-Examiner outperforming GPT-4 by an average of 0.87 points (on a 5-point scale) for evaluation-level questions and 1.12 points for creation-level questions. The results demonstrated that: (i) overall, questions generated by the Multi-Examiner system outperformed those generated by GPT-4 across all dimensions and closely matched the quality of human-crafted questions in several dimensions; (ii) domain-specific search tools significantly enhanced the diversity of questions generated by Multi-Examiner; and (iii) GPT-4 generated better questions for knowledge points at the “remembering” and “understanding” levels, while Multi-Examiner significantly improved the higher-order thinking of questions for the “evaluating” and “creating” levels. This study contributes to the growing body of research on AI-supported educational assessment by demonstrating how specialized knowledge structures can enhance automated generation of higher-order thinking questions beyond what general-purpose language models can achieve.
The transformative capabilities of large language models (LLMs) are reshaping educational assessment and question design in higher education. This study proposes a systematic framework for leveraging LLMs to enhance question-centric tasks: aligning exam questions with course objectives, improving clarity and difficulty, and generating new items guided by learning goals. The research spans four university courses—two theory-focused and two application-focused—covering diverse cognitive levels according to Bloom’s taxonomy. A balanced dataset ensures representation of question categories and structures. Three LLM-based agents—VectorRAG, VectorGraphRAG, and a fine-tuned LLM—are developed and evaluated against a meta-evaluator, supervised by human experts, to assess alignment accuracy and explanation quality. Robust analytical methods, including mixed-effects modeling, yield actionable insights for integrating generative AI into university assessment processes. Beyond exam-specific applications, this methodology provides a foundational approach for the broader adoption of AI in post-secondary education, emphasizing fairness, contextual relevance, and collaboration. The findings offer a comprehensive framework for aligning AI-generated content with learning objectives, detailing effective integration strategies, and addressing challenges such as bias and contextual limitations. Overall, this work underscores the potential of generative AI to enhance educational assessment while identifying pathways for responsible implementation.
Knowledge Graph Question Generation (KGQG) is the task of generating natural language questions based on a given knowledge graph (KG). Although extensively explored in recent years, prevailing models predominantly depend on labelled data for training deep learning models or employ large parametric frameworks, e.g., Large Language Models (LLMs), which can incur significant deployment costs and pose practical implementation challenges. To address these issues, in this work we put forward a zero-shot, multi-agent KGQG framework. This framework integrates the capabilities of LLMs with small models to facilitate cost-effective, high-quality question generation. Specifically, we develop a professional editorial-team architecture accompanied by two workflow optimization tools to reduce unproductive collaboration among LLM-based agents and enhance the robustness of the system. Extensive experiments demonstrate that our proposed framework achieves new state-of-the-art performance on zero-shot KGQG tasks, with relative gains of 20.24% and 13.57% on two KGQG datasets, respectively, rivaling fully supervised state-of-the-art models.
Intelligent education relies on the generation of multi-level, comprehensive, and diverse question banks to assess student learning effectiveness and teaching efficacy. However, the development of professional question banks often presents challenges such as reliance on expert knowledge and experience, limited transferability, high workload, and subjective biases. In Geographical Information Systems (GIS), personalized question settings could be impacted by diverse knowledge sources and varying student orientations. To address this issue, we propose a novel large language model (LLM) framework guided by GIS prior knowledge for generating professional GIS question banks. Specifically, we tackle three major challenges in intelligent GIS question bank generation: incomplete knowledge coverage, skewed difficulty distribution, and limited adaptability of question types. This framework is founded upon the autonomous understanding, planning, and reasoning capabilities of LLMs, augmented by an elaborate retrieval strategy. It comprises three key modules: subtask matching and partitioning, subtask importance evaluation and quantity allocation, as well as adaptive scenario question generation. Together, these components enable the generation of personalized GIS question banks for learning and teaching tasks. Extensive experiments demonstrate its effectiveness across various metrics. Furthermore, our method with specialized knowledge organization can serve as a valuable resource for advancing research and applications in GIS education.
Recent advances in large language models (LLMs) have made automated multiple-choice question (MCQ) generation increasingly feasible; however, reliably producing items that satisfy controlled cognitive demands remains a challenge. To address this gap, we introduce ReQUESTA, a hybrid, multi-agent framework for generating cognitively diverse MCQs that systematically target text-based, inferential, and main idea comprehension. ReQUESTA decomposes MCQ authoring into specialized subtasks and coordinates LLM-powered agents with rule-based components to support planning, controlled generation, iterative evaluation, and post-processing. We evaluated the framework in a large-scale reading comprehension study using academic expository texts, comparing ReQUESTA-generated MCQs with those produced by a single-pass GPT-5 zero-shot baseline. Psychometric analyses of learner responses assessed item difficulty and discrimination, while expert raters evaluated question quality across multiple dimensions, including topic relevance and distractor quality. Results showed that ReQUESTA-generated items were consistently more challenging, more discriminative, and more strongly aligned with overall reading comprehension performance. Expert evaluations further indicated stronger alignment with central concepts and superior distractor linguistic consistency and semantic plausibility, particularly for inferential questions. These findings demonstrate that hybrid, agentic orchestration can systematically improve the reliability and controllability of LLM-based generation, highlighting workflow design as a key lever for structured artifact generation beyond single-pass prompting.
No abstract available
This study introduces Knowledge Augmented Question Generation (KAQG), an educational assessment framework that integrates Item Response Theory (IRT), Bloom’s Taxonomy, and knowledge graphs into a multi-agent Retrieval-Augmented Generation (RAG) system. The proposed approach overcomes limitations of existing methods by enabling fine-grained control over item difficulty, psychometric calibration, and cognitive alignment. It employs multi-graph isolation to preserve domain-specific semantics and leverages a distributed agent architecture coordinated through Data Distribution Service (DDS) for scalable and fault-tolerant operations. Each agent specializes in tasks such as retrieval, generation, or evaluation, forming a modular and traceable pipeline. Distinctively, the framework encodes semantic hierarchies, PageRank-based concept weighting, and assessment-theory parameters directly into the generation process, ensuring that questions are both contextually grounded and cognitively calibrated. Deployed at Taiwan’s National Institute of Environmental Research, the system has demonstrated practical value by reducing manual workload, improving reliability and validity, and supporting both adaptive and standardized assessments. By integrating psychometric theory with AI-driven retrieval and generation, this work establishes a scalable and cognitively aligned solution for education and professional certification.
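For reference, the item model that such psychometric calibration typically targets is the standard two-parameter logistic form from IRT, shown below; the parameter values in the example are illustrative, not from the deployed system:

```python
# Two-parameter logistic IRT model: probability that a student of ability
# theta answers an item correctly, given discrimination a and difficulty b.
import math

def p_correct(theta: float, a: float, b: float) -> float:
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# A generated item calibrated to difficulty b=0.5: a median student
# (theta = 0) answers correctly about 38% of the time when a = 1.0.
print(round(p_correct(theta=0.0, a=1.0, b=0.5), 2))  # 0.38
```

Encoding target (a, b) values into the generation prompt is what lets the pipeline aim items at a chosen difficulty band rather than discovering difficulty only after field testing.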
Intelligent and adaptive online education systems aim to make high-quality education available for a diverse range of students. However, existing systems usually depend on a pool of hand-made questions, limiting how fine-grained and open-ended they can be in adapting to individual students. We explore targeted question generation as a controllable sequence generation task. We first show how to fine-tune pre-trained language models for deep knowledge tracing (LM-KT). This model accurately predicts the probability of a student answering a question correctly, and generalizes to questions not seen in training. We then use LM-KT to specify the objective and data for training a model to generate questions conditioned on the student and target difficulty. Our results show we succeed at generating novel, well-calibrated language translation questions for second language learners from a real online education platform.
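A minimal sketch of the LM-KT idea follows: the predicted probability that a given student answers a question correctly serves as a per-student difficulty signal, and items are chosen (or generations accepted) when that probability is near a target success rate. The stub predictor and question pool below are illustrative assumptions standing in for the fine-tuned language model.

```python
# Sketch: knowledge-tracing-driven difficulty targeting (stubbed predictor).
def predict_correct(history: list[bool], difficulty: float) -> float:
    """Stub for the LM-KT model: past accuracy shifted by item difficulty,
    clamped to [0, 1]."""
    past = sum(history) / max(len(history), 1)
    return min(max(past - difficulty + 0.5, 0.0), 1.0)

def pick_question(history: list[bool], pool: list[dict], target_p: float) -> dict:
    """Choose the item whose predicted success probability is closest to the
    target (e.g. 0.7 for a desirable challenge level)."""
    return min(pool, key=lambda q: abs(predict_correct(history, q["difficulty"]) - target_p))

pool = [{"id": i, "difficulty": d} for i, d in enumerate([0.2, 0.5, 0.8])]
print(pick_question([True, True, False], pool, target_p=0.7))  # picks id 1
```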
Since the COVID-19 outbreak, the use of digital learning and education platforms has substantially increased. Teachers now distribute homework and provide exercise questions digitally. In both cases, teachers need to continuously develop novel, individualized questions. This process can be very time-consuming and should be facilitated and accelerated both through exchange with other teachers and by using Artificial Intelligence (AI) capabilities. To address this need, we propose a multilingual Wikimedia framework that allows for collaborative worldwide teacher knowledge engineering and subsequent AI-aided question generation, testing, and correction. As a proof of concept, we present "PhysWikiQuiz", a physics question generation and test engine. Our system (hosted by Wikimedia at https://physwikiquiz.wmflabs.org) retrieves physics knowledge from the open community-curated database Wikidata. It can generate questions in different variations and verify answer values and units using a Computer Algebra System (CAS). We evaluate the performance on a public benchmark dataset at each stage of the system workflow. For an average formula with three variables, the system can generate and correct up to 300 questions for individual students, based on a single formula concept name as input by the teacher.
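The CAS verification step can be sketched with sympy standing in for the system's Computer Algebra System; the formula v = s/t, the helper names, and the tolerance are illustrative, and unit handling is omitted here.

```python
# Sketch: solve a physics formula for the unknown and check a numeric answer.
import sympy as sp

s, t, v = sp.symbols("s t v", positive=True)
formula = sp.Eq(v, s / t)  # illustrative formula: speed = distance / time

def check_answer(given: dict, answer_value: float, tol: float = 1e-6) -> bool:
    """Substitute known quantities, solve for the remaining symbol, and
    compare with the student's numeric answer."""
    unknown = (formula.free_symbols - set(given)).pop()
    solution = sp.solve(formula.subs(given), unknown)[0]
    return abs(float(solution) - answer_value) < tol

# "A car travels s = 100 m in t = 8 s. What is v?"
print(check_answer({s: 100, t: 8}, answer_value=12.5))  # True
```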
The increasing integration of Artificial Intelligence (AI) in education has led to the development of innovative tools like Intelligent Question-Answering Systems (IQASs), aiming to revolutionize traditional learning paradigms. However, many existing IQAS struggle with the nuances of natural language and the complexities of student questions. This research focuses on developing a context-aware IQAS that leverages advanced Natural Language Processing (NLP) techniques and contextual information, including student learning history and educational content, to provide personalised support. This study also introduces a software tool that utilizes NLP techniques to automatically generate FAQs from educational materials. Employing a hybrid approach combining rule-based and machine learning techniques, the IQAS demonstrated high accuracy in interpreting and responding to a wide range of student queries. The software tool effectively automated the generation of FAQs, creating a valuable resource for personalised learning. The findings suggest that these tools can significantly improve student engagement, motivation, and learning outcomes, highlighting the potential of AI to transform education and pave the way for more personalised, adaptive, and effective learning environments.
The objective of question generation from knowledge graphs (KGQG) is to create coherent and answerable questions from a given subgraph and a specified answer entity. KGQG has garnered significant attention due to its pivotal role in enhancing online education. Encoder–decoder architectures have advanced traditional KGQG approaches. However, these approaches encounter challenges in achieving question diversity and grammatical accuracy. They often suffer from a disconnect between the phrasing of the question and the type of the answer entity, a phenomenon known as semantic drift. To address these challenges, we introduce LEMON, a knowledge-enhanced, type-constrained, and grammar-guided model for KGQG. LEMON enhances the input by integrating entity-related knowledge using heuristic rules, which fosters diversity in question generation. It employs a hierarchical global relation embedding with translation loss to align questions with entity types. In addition, it utilizes a graph-based module to aggregate type information from neighboring nodes. The LEMON model incorporates a type-constrained decoder to generate diverse expressions and improves grammatical accuracy through a syntactic and semantic reward function via reinforcement learning. Evaluations on benchmark datasets demonstrate LEMON's strong competitiveness. The study also examines the impact of question generation quality on question-answering systems, providing guidance for future research endeavors in this domain.
Difficulty-controllable question generation for reading comprehension has gained significant attention in the field of education as a fundamental tool for adaptive learning support. Although several neural question generation methods have recently succeeded in controlling difficulty, conventional approaches still face two major limitations. First, they cannot directly generate multiple-choice questions, which are the most widely used question type in educational contexts. Second, they are not explicitly trained to optimize the accuracy of difficulty control, leaving room for further improvement in difficulty controllability. To address these limitations, this study proposes a novel difficulty-controllable multiple-choice question generation method for reading comprehension which leverages a large language model trained using a direct preference optimization technique to improve the accuracy of difficulty control. Experiments on an actual multiple-choice reading comprehension question dataset showed that the proposed method effectively improves difficulty controllability while maintaining fluency, content relevance, and answerability. Additionally, comparison of the proposed method with a few-shot learning approach indicates that few-shot learning is insufficient for achieving difficulty control, highlighting the advantages of the proposed method.
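The abstract names direct preference optimization as the training technique; for reference, the standard DPO objective is shown below, with policy π_θ, frozen reference model π_ref, and preferred/dispreferred generations y_w and y_l. How preference pairs are constructed from difficulty-control accuracy is the paper's contribution and is not reproduced here.

```latex
\mathcal{L}_{\mathrm{DPO}}
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\left[
      \log \sigma\!\left(
        \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
        - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
      \right)
    \right]
```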
Educational question generation (EQG) is a crucial component of intelligent educational systems, significantly aiding self-assessment, active learning, and personalized education. While EQG systems have emerged, existing datasets typically rely on predefined, carefully edited texts, failing to represent real-world classroom content, including lecture speech with a set of complementary slides. To bridge this gap, we collect a dataset of educational questions based on videos from real-world lectures. On this realistic dataset, we find that current methods for EQG struggle to accurately generate questions from educational videos, particularly in aligning with specific timestamps and target answers. Common challenges include selecting informative contexts from extensive transcripts and ensuring generated questions meaningfully incorporate the target answer. To address the challenges, we introduce a novel framework utilizing large language models (LLMs) for dynamically selecting and rewriting contexts based on target timestamps and answers in lecture videos. First, our framework selects contexts from both lecture transcripts and video keyframes based on answer relevance and temporal proximity. Then, we integrate the contexts selected from both modalities and rewrite them into answer-containing knowledge statements, to enhance the logical connection between the contexts and the desired answer. Quantitative evaluation and human evaluation show that our approach improves the quality and relevance of the generated questions.
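A minimal sketch of the two-signal context selection described above: transcript segments are ranked by relevance to the target answer plus temporal proximity to the target timestamp. Token-overlap relevance, the per-minute decay, and the weights are illustrative stand-ins for the paper's actual scoring.

```python
# Sketch: score lecture-transcript segments by answer relevance + proximity.
def score_segment(segment: dict, answer: str, t_target: float,
                  w_rel: float = 0.7, w_time: float = 0.3) -> float:
    seg_tokens = set(segment["text"].lower().split())
    ans_tokens = set(answer.lower().split())
    relevance = len(seg_tokens & ans_tokens) / max(len(ans_tokens), 1)
    proximity = 1.0 / (1.0 + abs(segment["time"] - t_target) / 60.0)  # per-minute decay
    return w_rel * relevance + w_time * proximity

segments = [
    {"time": 120.0, "text": "gradient descent updates weights iteratively"},
    {"time": 900.0, "text": "backpropagation computes gradients layer by layer"},
]
best = max(segments, key=lambda s: score_segment(s, "gradient descent", t_target=150.0))
print(best["text"])  # the segment near t=150 s that mentions the answer
```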
In daily human discourse, individuals often disregard correct grammatical structure when articulating their ideas, favoring brevity and only the essential information. Similarly, when posing questions, individuals often omit many of the details relevant to the answer. To address this issue, this paper introduces the "KdCQG" model, which takes the keywords present in the answer as input, rather than the entire answer sentence, to generate conversational questions. KdCQG follows a sequence-to-sequence framework and relies on a language constraint mechanism and a coreference alignment mechanism to formulate questions based on the keywords present in the answer. Analysis on the CoQA dataset shows that the KdCQG model effectively generates high-quality conversational questions even when only keywords are provided as input.
Question generation aims to produce questions automatically given a piece of text as input. Existing research follows a sequence-to-sequence fashion that constructs a single question based on the input. Considering that each question usually focuses on a specific fragment of the input, especially in the scenario of reading comprehension, it is reasonable to identify the corresponding focus before constructing the question. In this paper, we propose to identify question-worthy phrases first and generate questions with the assistance of these phrases. We introduce a multi-agent communication framework, taking phrase extraction and question generation as two agents, and learn these two tasks simultaneously via a message-passing mechanism. Experimental results show the effectiveness of our framework: we can extract question-worthy phrases that improve the performance of question generation. Moreover, our system can extract more than one question-worthy phrase and generate multiple questions accordingly.
Hallucinations in large language models (LLMs), defined as fluent yet incorrect or incoherent outputs, pose a significant challenge to the automatic generation of educational multiple-choice questions (MCQs). We identified four key hallucination types in MCQ generation: reasoning inconsistencies, insolvability, factual errors, and mathematical errors. To address this, we propose a hallucination-free multi-agent generation framework that breaks down MCQ generation into discrete, verifiable stages. Our framework utilizes both rule-based and LLM-based detection agents, as well as hallucination scoring metrics, to optimize question quality. We redefined MCQ generation as an optimization task that minimizes hallucination risk while maximizing validity, answerability, and cost-efficiency. We also introduce an agent-led refinement process that uses counterfactual reasoning and chain-of-thought (CoT) prompting to iteratively reduce hallucinations in question generation. On a sample of AP-aligned STEM questions, our system reduced hallucination rates by over 90% compared to baseline generation while preserving the educational value and style of the questions. Our results demonstrate that structured multi-agent collaboration can mitigate hallucinations in educational content creation at scale, paving the way for more reliable LLM-powered learning tools.
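The generate-detect-refine loop can be sketched as follows; the stub generator and the single flaw signal stand in for the paper's LLM and its four detection agents, and the detector weights and acceptance threshold are illustrative assumptions.

```python
# Sketch: regenerate an MCQ until its combined hallucination score is low.
import random

random.seed(0)  # deterministic demo

def generate_mcq(topic: str) -> dict:
    # Stub generator; in the real framework this is an LLM agent.
    return {"stem": f"Question about {topic}", "flaws": random.random()}

def hallucination_score(mcq: dict) -> float:
    """Combine detector verdicts (reasoning, solvability, factual, math)
    into one risk score in [0, 1]; one stub signal drives all four here."""
    detector_weights = {"reasoning": 0.4, "solvability": 0.3, "factual": 0.2, "math": 0.1}
    return sum(w * mcq["flaws"] for w in detector_weights.values())

def generate_verified(topic: str, threshold: float = 0.5, max_rounds: int = 5) -> dict:
    for _ in range(max_rounds):
        mcq = generate_mcq(topic)
        if hallucination_score(mcq) < threshold:
            return mcq
    raise RuntimeError("no low-risk question found; escalate to human review")

print(generate_verified("photosynthesis"))
```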
Generative artificial intelligence (AI) and large language models (LLMs) are reshaping the landscape of intelligent educational systems; however, existing solutions often suffer from unstructured resource organization, limited interpretability, and suboptimal retrieval precision. To address these challenges, this study introduces KA-RAG, a course-oriented question answering (QA) framework that integrates a structured Knowledge Graph (KG) with an Agentic Retrieval-Augmented Generation (Agentic-RAG) workflow. The system incorporates a responsive interface, a unified agent controller (ToolPlanner), a course knowledge graph, and a vector-based retrieval subsystem. By combining symbolic graph reasoning with dense semantic retrieval, the proposed dual-retrieval strategy supports interpretable, context-aware responses to course-related queries. Experiments conducted on a graduate-level Pattern Recognition course demonstrate that KA-RAG achieves a retrieval accuracy of 91.4%, semantic consistency of 87.6%, and an average response latency of 2.8 s. User surveys further reveal significant improvements in learning efficiency and satisfaction. The results validate the feasibility of integrating KG and Agentic-RAG techniques for knowledge-grounded educational applications, offering a practical pathway toward intelligent knowledge organization and interactive learning support.
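The dual-retrieval strategy can be sketched as follows: symbolic hits from a hop over the course concept graph are fused with dense-retrieval hits (stubbed here with token overlap) via reciprocal rank fusion. The graph, corpus, and RRF fusion rule are illustrative assumptions, not the system's actual implementation.

```python
# Sketch: fuse knowledge-graph retrieval with a (stubbed) dense retriever.
KG = {"bayes_classifier": ["prior", "likelihood"], "svm": ["margin", "kernel"]}
DOCS = {
    "d1": "the bayes classifier combines prior and likelihood",
    "d2": "svm maximizes the margin using a kernel function",
}

def graph_hits(query: str) -> list[str]:
    """Docs mentioning any concept linked (in the KG) to a queried concept."""
    concepts = [c for c in KG if c in query]
    linked = {n for c in concepts for n in KG[c]} | set(concepts)
    return [d for d, text in DOCS.items() if any(n in text for n in linked)]

def dense_hits(query: str) -> list[str]:
    """Stub dense retriever: rank docs by token overlap with the query."""
    q = set(query.split())
    return sorted(DOCS, key=lambda d: -len(q & set(DOCS[d].split())))

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion of several rankings."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

query = "how does the bayes_classifier use the prior"
print(rrf([graph_hits(query), dense_hits(query)]))  # ['d1', 'd2']
```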
Large Language Models (LLMs) have become increasingly integral to enhancing developer productivity, particularly in code generation, comprehension, and repair tasks. However, fine-tuning these models with high-quality, real-world data is challenging due to privacy concerns and the lack of accessible, labeled datasets. In this paper, we present DialogAgent, an automated tool for generating synthetic training data that closely mimics real developer interactions within Integrated Development Environments (IDEs). DialogAgent enables the production of diverse, high-fidelity query-response pairs by simulating multi-turn dialogues and contextual behaviors observed in real-world programming scenarios. The tool significantly reduces the reliance on manual data generation, increasing efficiency by 4.8 times compared to traditional methods. Our experiments and online deployment demonstrate substantial improvements in model performance for code-related question-answering tasks: the acceptance rate of responses generated by our in-house model is improved by 33%, after training on synthesized data generated by DialogAgent.
Personalization in Information Retrieval (IR) is a topic the research community has studied for a long time. However, there is still a lack of datasets for large-scale evaluations of personalized IR, mainly because collecting and curating high-quality user-related information requires significant cost and time. Furthermore, the creation of datasets for Personalized IR (PIR) tasks is affected both by privacy concerns and by the need for accurate user-related data, which is often not publicly available. Recently, researchers have started to explore the use of Large Language Models (LLMs) to generate synthetic datasets, a possible solution for generating data for low-resource tasks. In this paper, we investigate the potential of LLMs for generating synthetic documents to train an IR system for a Personalized Community Question Answering task. To study the effectiveness of IR models fine-tuned on LLM-generated data, we introduce a new dataset, named Sy-SE-PQA. We build Sy-SE-PQA on an existing dataset, SE-PQA (https://zenodo.org/records/10679181), which consists of questions and answers posted on the popular StackExchange communities. Starting from questions in SE-PQA, we generate synthetic answers using different prompting techniques and LLMs. Our findings suggest that LLMs have high potential in generating data tailored to users' needs. The synthetic data can replace human-written training data, even though the generated data may contain incorrect information. The code is publicly available (https://github.com/pkase1a!SY_SE-PQA).
Retrieval-augmented Generation (RAG) relies on effective retrieval capabilities, yet traditional sparse and dense retrievers inherently struggle with multi-hop retrieval scenarios. In this paper, we introduce GeAR, a system that advances RAG performance through two key innovations: (i) an efficient graph expansion mechanism that augments any conventional base retriever, such as BM25, and (ii) an agent framework that incorporates the resulting graph-based retrieval into a multi-step retrieval framework. Our evaluation demonstrates GeAR's superior retrieval capabilities across three multi-hop question answering datasets. Notably, our system achieves state-of-the-art results with improvements exceeding 10% on the challenging MuSiQue dataset, while consuming fewer tokens and requiring fewer iterations than existing multi-step retrieval systems. The project page is available at https://gear-rag.github.io.
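The graph-expansion mechanism can be sketched as follows: a base retriever (token overlap stands in for BM25) supplies seed documents, and a one-hop walk over shared-entity links pulls in neighbors that the seed query alone would miss, which is precisely the multi-hop case. The corpus and entity annotations are hypothetical.

```python
# Sketch: base retrieval + one-hop entity-graph expansion for multi-hop QA.
DOCS = {
    "d1": {"text": "Marie Curie studied at the University of Paris",
           "entities": {"Marie Curie", "University of Paris"}},
    "d2": {"text": "The University of Paris was founded around 1150",
           "entities": {"University of Paris"}},
}

def base_retrieve(query: str, top_k: int = 1) -> list[str]:
    """Stub base retriever (BM25 stand-in): token-overlap ranking."""
    q = set(query.lower().split())
    ranked = sorted(DOCS, key=lambda d: -len(q & set(DOCS[d]["text"].lower().split())))
    return ranked[:top_k]

def expand(seeds: list[str]) -> list[str]:
    """One-hop expansion: add docs sharing an entity with any seed doc."""
    seed_entities = {e for d in seeds for e in DOCS[d]["entities"]}
    extra = [d for d in DOCS if d not in seeds and DOCS[d]["entities"] & seed_entities]
    return seeds + extra

# Multi-hop question whose answer needs d2, which base retrieval misses:
print(expand(base_retrieve("where did Marie Curie study")))  # ['d1', 'd2']
```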
Retrieval Augmented Generation (RAG) has been effectively used to improve the accuracy of question-answering (Q&A) systems powered by Large Language Models (LLMs) by integrating local knowledge and more up-to-date content. However, traditional RAG methods, including those with re-ranking mechanisms, face challenges when dealing with large, frequently updated data sources or when accessing sources exclusively via APIs, as they require pre-encoding all content into embedding vectors. To address these limitations, we introduce Agent-based Universal RAG (AU-RAG), a novel approach that augments data sources with descriptive metadata, allowing an agent to dynamically search through diverse data pools. This agent-driven system can learn from examples to retrieve and consolidate data from various sources on the fly, functioning as a more flexible and adaptive RAG. We demonstrate AU-RAG's functionality with a financial analysis example and evaluate its performance using a multi-source QA dataset. The results show that AU-RAG performs comparably to RAG with re-ranking in data retrieval tasks while also demonstrating an enhanced ability to intelligently learn and access new data sources from examples, making it a robust solution for dynamic and complex information environments.
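A minimal sketch of the metadata-driven idea: each data source carries a natural-language description, and an agent (stubbed here with keyword overlap) decides on the fly which pool to query. The source names and the matching rule are illustrative assumptions.

```python
# Sketch: route a query to the data source whose metadata matches best.
SOURCES = {
    "earnings_api":  "quarterly earnings reports and financial statements",
    "news_feed":     "real-time market and company news articles",
    "filings_index": "regulatory filings such as 10-K and 10-Q documents",
}

def pick_source(query: str) -> str:
    """Stub agent: keyword overlap between query and source metadata; the
    real system would let an LLM agent reason over the descriptions."""
    q = set(query.lower().split())
    return max(SOURCES, key=lambda s: len(q & set(SOURCES[s].split())))

print(pick_source("latest quarterly earnings for ACME"))  # earnings_api
```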
Knowledge Base Question Answering (KBQA) aims to answer natural language questions with a large-scale structured knowledge base (KB). Despite advancements with large language models (LLMs), KBQA still faces challenges in weak KB awareness, imbalance between effectiveness and efficiency, and high reliance on annotated data. To address these challenges, we propose KBQA-o1, a novel agentic KBQA method with Monte Carlo Tree Search (MCTS). It introduces a ReAct-based agent process for stepwise logical form generation with KB environment exploration. Moreover, it employs MCTS, a heuristic search method driven by policy and reward models, to balance the performance and search space of agentic exploration. With heuristic exploration, KBQA-o1 generates high-quality annotations for further improvement by incremental fine-tuning. Experimental results show that KBQA-o1 outperforms previous low-resource KBQA methods with limited annotated data, boosting the Llama-3.1-8B model's GrailQA F1 performance to 78.5%, compared to 48.5% for the previous state-of-the-art method with GPT-3.5-turbo. Our code is publicly available.
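For reference, plain MCTS balances exploitation against exploration with the classic UCT selection rule shown below, where Q(s,a) is the mean action value, N(s) and N(s,a) are visit counts, and c is the exploration constant; in KBQA-o1 these statistics are additionally shaped by the learned policy and reward models, which the formula does not capture.

```latex
a^{*} = \arg\max_{a} \left[\, Q(s,a) + c \sqrt{\frac{\ln N(s)}{N(s,a)}} \,\right]
```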
Addressing core pain points in higher education (time-consuming lesson preparation, inefficient grading, delayed Q&A, and weak progress monitoring), this project employs a "front-end/back-end separation + Coze multi-AI-agent integration" architecture to develop an intelligent course assistant system powered by large language models (LLMs). The system serves three roles (teachers, students, and administrators) by constructing Coze agent workflows, enabling collaborative invocation of LLMs and plugins alongside precise subject-database retrieval:
- Teachers gain access to BOPPPS structured lesson plan generation, intelligent assignment grading, learning progress analysis, and exam creation/management.
- Students receive real-time Q&A support and personalized practice problem generation.
- Administrators benefit from multidimensional data visualization capabilities.
The system supports multidisciplinary adaptation with real-time subject replacement and updates. The project has completed local deployment and undergone an 8-week pilot in real teaching scenarios. Results demonstrate that the system effectively enhances teaching efficiency, optimizes student learning experiences, and provides robust support for the digital transformation of higher education.
We propose a new method to measure the task-specific accuracy of Retrieval-Augmented Large Language Models (RAG). Evaluation is performed by scoring the RAG on an automatically-generated synthetic exam composed of multiple choice questions based on the corpus of documents associated with the task. Our method is an automated, cost-efficient, interpretable, and robust strategy to select the optimal components for a RAG system. We leverage Item Response Theory (IRT) to estimate the quality of an exam and its informativeness on task-specific accuracy. IRT also provides a natural way to iteratively improve the exam by eliminating the exam questions that are not sufficiently informative about a model's ability. We demonstrate our approach on four new open-ended Question-Answering tasks based on Arxiv abstracts, StackExchange questions, AWS DevOps troubleshooting guides, and SEC filings. In addition, our experiments reveal more general insights into factors impacting RAG performance like size, retrieval mechanism, prompting and fine-tuning. Most notably, our findings show that choosing the right retrieval algorithms often leads to bigger performance gains than simply using a larger language model.
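The IRT machinery can be made concrete with the textbook two-parameter logistic model: item i, with discrimination a_i and difficulty b_i, is answered correctly by a model of ability θ with probability P_i(θ), and the item's Fisher information I_i(θ) quantifies how much it tells us about θ. Items with low information across the relevant ability range are the natural candidates for elimination; the paper's exact parameterization is not reproduced here.

```latex
P_i(\theta) = \frac{1}{1 + e^{-a_i(\theta - b_i)}},
\qquad
I_i(\theta) = a_i^{2}\, P_i(\theta)\,\bigl(1 - P_i(\theta)\bigr)
```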
The Brazilian University Admission Exam (ENEM) presents a unique challenge for artificial intelligence, as it requires deep mastery of knowledge from diverse fields. Recently, Language Models (LMs) with growing numbers of parameters have established state-of-the-art performance on ENEM. However, techniques like Retrieval-Augmented Generation (RAG) can help achieve further improvements by exploiting trustworthy knowledge bases to enrich contexts and reduce non-factual responses. This study investigates how RAG can improve LMs' performance on ENEM. The experiments reported in this article use up-to-date versions of four popular LMs, with and without RAG, on text-only and multi-modal data. The results reveal consistent gains from using RAG with both kinds of data, across diverse fields, demonstrating the potential of RAG to improve LMs' performance on tasks requiring multidisciplinary knowledge.
Large Language Models (LLMs) are becoming essential tools for various natural language processing tasks but often suffer from generating outdated or incorrect information. Retrieval-Augmented Generation (RAG) addresses this issue by incorporating external, real-time information retrieval to ground LLM responses. However, the existing RAG systems frequently struggle with the quality of retrieval documents, as irrelevant or noisy documents degrade performance, increase computational overhead, and undermine response reliability. To tackle this problem, we propose Multi-Agent Filtering Retrieval-Augmented Generation (MAIN-RAG), a training-free RAG framework that leverages multiple LLM agents to collaboratively filter and score retrieved documents. Specifically, MAIN-RAG introduces an adaptive filtering mechanism that dynamically adjusts the relevance filtering threshold based on score distributions, effectively minimizing noise while maintaining high recall of relevant documents. The proposed approach leverages inter-agent consensus to ensure robust document selection without requiring additional training data or fine-tuning. Experimental results across four QA benchmarks demonstrate that MAIN-RAG consistently outperforms traditional RAG approaches, achieving a 2-11% improvement in answer accuracy while reducing the number of irrelevant retrieved documents. Quantitative analysis further reveals that our approach achieves superior response consistency and answer accuracy over baseline methods, offering a competitive and practical alternative to training-based solutions.
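The adaptive filtering mechanism can be sketched as follows: judge agents each score every retrieved document, per-document consensus is the mean of those scores, and the keep/drop threshold adapts to the score distribution instead of being fixed. The mean-minus-half-sigma rule below is an illustrative choice, not the paper's published formula.

```python
# Sketch: consensus scoring with a distribution-adaptive relevance threshold.
from statistics import mean, stdev

def filter_docs(doc_scores: dict[str, list[float]]) -> list[str]:
    """doc_scores maps doc id -> per-agent relevance scores in [0, 1]."""
    consensus = {d: mean(s) for d, s in doc_scores.items()}
    mu = mean(consensus.values())
    sigma = stdev(consensus.values()) if len(consensus) > 1 else 0.0
    threshold = mu - 0.5 * sigma  # adapts to how spread out the scores are
    return [d for d, c in consensus.items() if c >= threshold]

scores = {
    "doc_a": [0.9, 0.8, 0.9],  # clearly relevant
    "doc_b": [0.6, 0.7, 0.5],  # borderline
    "doc_c": [0.1, 0.2, 0.1],  # noise
}
print(filter_docs(scores))  # ['doc_a', 'doc_b']
```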
To reduce the repetitive and complex work of instructors, the exam paper generation (EPG) technique has become a salient topic in the intelligent education field; it aims to generate high-quality exam papers automatically according to instructor-specified assessment criteria. Current advances utilize heuristic algorithms to optimize several well-known objective constraints, such as difficulty degree and number of questions, to produce optimal solutions. However, in real scenarios, considering other equally relevant objectives (e.g., distribution of exam scores, skill coverage) is extremely important. Besides, developing an automatic multi-objective solution that finds an optimal subset of questions from the huge search space of large question datasets, and thus composes a high-quality exam paper, is urgent but non-trivial. To this end, we design a reinforcement-learning-guided Multi-Objective Exam Paper Generation framework, termed MOEPG, to simultaneously optimize three exam-domain-specific objectives: difficulty degree, distribution of exam scores, and skill coverage. Specifically, to accurately measure the skill proficiency of the examinee group, we first employ deep knowledge tracing to model the interaction information between examinees and response logs. We then design the flexible Exam Q-Network, a function approximator, which automatically selects the appropriate question to update the exam paper composition process. Later, MOEPG divides the decision space into multiple subspaces to better guide the update direction of the exam paper. Through extensive experiments on two real-world datasets, we demonstrate that MOEPG is feasible in addressing the multiple dilemmas of the exam paper generation scenario.
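The three exam-level objectives can be illustrated as one scalarized reward, as below; the actual MOEPG framework optimizes them with a reinforcement-learning Exam Q-Network and decision-space partitioning rather than a weighted sum, and the expected-score term is a deliberate simplification.

```python
# Sketch: toy reward combining difficulty fit, skill coverage, and score gap.
def exam_reward(paper: list[dict], target_difficulty: float,
                required_skills: set[str], target_mean_score: float) -> float:
    difficulty = sum(q["difficulty"] for q in paper) / len(paper)
    covered = {s for q in paper for s in q["skills"]}
    coverage = len(covered & required_skills) / len(required_skills)
    # Stub: approximate the expected mean score as 1 - difficulty.
    score_gap = abs((1.0 - difficulty) - target_mean_score)
    return -abs(difficulty - target_difficulty) + coverage - score_gap

paper = [
    {"difficulty": 0.6, "skills": {"algebra"}},
    {"difficulty": 0.4, "skills": {"geometry", "algebra"}},
]
print(exam_reward(paper, target_difficulty=0.5,
                  required_skills={"algebra", "geometry"},
                  target_mean_score=0.5))  # 1.0 for this well-balanced paper
```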
Assessment plays an important role in the learning process at higher education institutions. However, poorly designed exams can fail to achieve the intended learning outcomes of a specific course, which can also reflect badly on programs and educational institutes. One possible solution is to standardize exams based on educational taxonomies; however, this is not an easy process for educators. With recent technologies, assessment approaches have been improved by automatically generating exams based on educational taxonomies. This paper presents a framework that allows educators to map questions to intended learning outcomes based on Bloom's taxonomy. Furthermore, it elaborates on the principles and requirements for generating exams automatically and reports on a prototype implementation of an authoring tool for generating exams that evaluate the achievement of intended learning outcomes.
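One rule such an authoring tool could use is mapping question stems to Bloom levels via indicative verbs; the abbreviated verb lists below reflect a common textbook association, not the framework's actual mapping.

```python
# Sketch: verb-based mapping of a question stem to a Bloom's taxonomy level.
BLOOM_VERBS = {
    "remember":   {"define", "list", "recall", "state"},
    "understand": {"explain", "summarize", "describe", "classify"},
    "apply":      {"use", "solve", "demonstrate", "calculate"},
    "analyze":    {"compare", "differentiate", "examine"},
    "evaluate":   {"justify", "critique", "assess"},
    "create":     {"design", "construct", "formulate"},
}

def bloom_level(question: str) -> str:
    tokens = set(question.lower().replace("?", "").split())
    for level, verbs in BLOOM_VERBS.items():
        if tokens & verbs:
            return level
    return "unclassified"

print(bloom_level("Explain the difference between TCP and UDP"))  # understand
print(bloom_level("Design a schema for a course database"))       # create
```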
Exam paper generation is an indispensable part of teaching. Existing methods focus on question extraction algorithms that require labels for each question; manual labeling is inefficient and cannot avoid label bias, and the quality of the exam papers generated by existing methods is not guaranteed. To address these problems, we propose a novel approach to generating exam papers based on prediction of exam performance. We update the quality of the initially generated questions one by one using dynamic programming, as well as in batches using genetic algorithms, and perform the prediction task using Deep Knowledge Tracing. Our approach considers skill weight, difficulty, and the distribution of exam scores. Experimental results indicate that our approach performed better than the two baselines: it can generate exam papers with difficulties close to the expected levels, and the resulting student exam scores follow a relatively reasonable distribution. In addition, our approach was evaluated in real learning scenarios and shows advantages.
This merged collection organizes the research into five core dimensions: 1) Foundation layer: automated construction and ontology design of educational knowledge graphs; 2) Mechanism layer: KG-RAG fusion-driven augmented reasoning aimed at eliminating LLM hallucinations; 3) Task layer: cognition-oriented question generation and multi-dimensional difficulty control; 4) Architecture layer: multi-agent collaboration for high system reliability and complex task workflows; 5) Application layer: discipline-specific closed-loop assessment, exam paper composition optimization, and personalized adaptive learning-path planning. The overall trend shows systems shifting from "general-purpose generation" toward "vertical-domain expert collaboration," emphasizing the deep integration of educational rigor and engineering deployability.