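Research on Knowledge Graph-Based Cybersecurity Threat Early-Warning Technology
Automated Construction, Validation, and Quality Assessment of Cybersecurity Knowledge Graphs
This group of works focuses on automatically constructing knowledge graphs from unstructured cyber threat intelligence (CTI), social media, system logs, and heterogeneous data sources using NLP, named entity recognition (NER), and relation extraction. It also covers schema validation, data repair, and quality assessment during construction.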
- Formal Model for Constructing Sensitive Data Graphs from Cyber Reports using Large Language Models(Viktor Turskyi, 2025, Theoretical and Applied Cybersecurity)
- RDLSH: Adaptive Entity Recognition and Relation Extraction for IoT Knowledge Graph(Daojing He, Chen Tu, Sammy Kwok‐Keung Chan, 2025, IEEE Internet of Things Journal)
- Enhancing cybersecurity through autonomous knowledge graph construction by integrating heterogeneous data sources(Hatoon Alharbi, Ali Hur, Hasan Alkahtani, Hafiz Farooq Ahmad, 2025, PeerJ Computer Science)
- Building a Cybersecurity Knowledge Graph with CyberGraph(Paolo Falcarin, Fabio Dainese, 2024, 2024 IEEE/ACM 4th International Workshop on Engineering and Cybersecurity of Critical Systems and 2024 IEEE/ACM Second International Workshop on Software Vulnerability (EnCyCriS/SVM))
- A Syntax-Aware Graph Network with Contrastive Learning for Threat Intelligence Triple Extraction(Zhenxiang He, Ziqi Zhao, Zhihao Liu, 2025, Symmetry)
- Gathering Cyber Threat Intelligence from Twitter Using Novelty Classification(Ba Dung Le, Guanhua Wang, Mehwish Nasim, Ali Babar, 2019, ArXiv Preprint)
- From Retrieval to Reasoning: A Framework for Cyber Threat Intelligence NER with Explicit and Adaptive Instructions(Jiaren Peng, Hongda Sun, Xuan Tian, Chen Huang, Zeqing Li, Rui Yan, 2025, ArXiv)
- CTINexus: Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models(Yutong Cheng, Osama Bajaber, Saimon Amanuel Tsegai, D. Song, Peng Gao, 2024, 2025 IEEE 10th European Symposium on Security and Privacy (EuroS&P))
- Dynamic Vulnerability Knowledge Graph Construction via Multi-Source Data Fusion and Large Language Model Reasoning(Ruitong Liu, Yaxuan Xie, Zexu Dang, Jinyi Hao, Xiaowen Quan, Yongcai Xiao, Chunlei Peng, 2025, Electronics)
- Data-Driven Cybersecurity Knowledge Graph Construction for Industrial Control System Security(Guowei Shen, Wanling Wang, Qilin Mu, Yanhong Pu, Ya Qin, Miao Yu, 2020, Wirel. Commun. Mob. Comput.)
- MCKG: Advancing Attack Knowledge Graph Construction via Multi-Source Cross-Modal Threat Intelligence(Xiaoming Wu, Zedong Shen, Chao Mu, Ming Yang, Xin Wang, 2025, 2025 21st International Conference on Mobility, Sensing and Networking (MSN))
- AttacKG: Constructing Technique Knowledge Graph from Cyber Threat Intelligence Reports(Zhenyuan Li, Jun Zeng, Yan Chen, Zhenkai Liang, 2021, No journal)
- SC-LKM: A Semantic Chunking and Large Language Model-Based Cybersecurity Knowledge Graph Construction Method(Pu Wang, Yangsen Zhang, Zicheng Zhou, Yuqi Wang, 2025, Electronics)
- Automated Validation and Repair of Knowledge Graph Triples for Cyber Threat Intelligence(Vitaly Andrejeus, William Mitchell, Sawyer Cawthon, Jesse Sullins, Erdogan Dogdu, Roya Choupani, 2026, 2026 IEEE 5th International Conference on AI in Cybersecurity (ICAIC))
- Quality assessment of cyber threat intelligence knowledge graph based on adaptive joining of embedding model(Bin Chen, Hongyi Li, Di Zhao, Y. Yang, Chengwei Pan, 2024, Complex & Intelligent Systems)
- CyberKG: Constructing a Cybersecurity Knowledge Graph Based on SecureBERT_Plus for CTI Reports(Binyong Li, Qiaoxi Yang, Chuang Deng, Hua Pan, 2025, Informatics)
- A dataset for cyber threat intelligence modeling of connected autonomous vehicles(Yinghui Wang, Yilong Ren, Hongmao Qin, Zhiyong Cui, Yanan Zhao, Haiyang Yu, 2024, ArXiv Preprint)
- Network Security Threat Intelligence Modeling Based on Knowledge Graph(Yibo Yang, Yuankang Zhao, 2024, 2024 5th International Conference on Computer, Big Data and Artificial Intelligence (ICCBD+AI))
- Social engineering in cybersecurity: a domain ontology and knowledge graph application examples(Zuoguang Wang, Hongsong Zhu, Peipei Liu, Limin Sun, 2021, Cybersecurity)
- Enabling Efficient Cyber Threat Hunting With Cyber Threat Intelligence(Peng Gao, Fei Shao, Xiaoyuan Liu, Xusheng Xiao, Zheng Qin, Fengyuan Xu, Prateek Mittal, Sanjeev R. Kulkarni, Dawn Song, 2020, ArXiv Preprint)
- CSKG4APT: A Cybersecurity Knowledge Graph for Advanced Persistent Threat Organization Attribution(Yitong Ren, Yanjun Xiao, Yinghai Zhou, Zhiyong Zhang, Zhihong Tian, 2023, IEEE Transactions on Knowledge and Data Engineering)
- Cybersecurity threat perception technology based on knowledge graph(A. Sali, A. Al-Jumaily, Víctor P. Gil Jiménez, D. Al-Jumeily, 2023, Journal of Autonomous Intelligence)
- Actionable Cyber Threat Intelligence using Knowledge Graphs and Large Language Models(Romy Fieblinger, Md Tanvirul Alam, Nidhi Rastogi, 2024, ArXiv Preprint)
- Cyber Threat Intelligence Model: An Evaluation of Taxonomies, Sharing Standards, and Ontologies within Cyber Threat Intelligence(Vasileios Mavroeidis, Siri Bromander, 2021, ArXiv Preprint)
- CyLens: Towards Reinventing Cyber Threat Intelligence in the Paradigm of Agentic Large Language Models(Xiaoqun Liu, Jiacheng Liang, Qiben Yan, Jiyong Jang, Sicheng Mao, Muchao Ye, Jinyuan Jia, Zhaohan Xi, 2025, ArXiv Preprint)
- Open-CyKG: An Open Cyber Threat Intelligence Knowledge Graph(I. Sarhan, M. Spruit, 2021, Knowl. Based Syst.)
- A System for Efficiently Hunting for Cyber Threats in Computer Systems Using Threat Intelligence(Peng Gao, Fei Shao, Xiaoyuan Liu, Xusheng Xiao, Haoyuan Liu, Zheng Qin, Fengyuan Xu, Prateek Mittal, Sanjeev R. Kulkarni, Dawn Song, 2021, ArXiv Preprint)
- AEKG4APT: An AI-Enhanced Knowledge Graph for Advanced Persistent Threats with Large Language Model Analysis(Yinghai Zhou, Ziyu Wang, Yunxin Jiang, Bingqi Ma, Rui Wang, Yuan Liu, Yue Zhao, Z. Tian, 2025, ACM Transactions on Intelligent Systems and Technology)
- A Threat Intelligence Event Extraction Conceptual Model for Cyber Threat Intelligence Feeds(Jamal H. Al-Yasiri, Mohamad Fadli Bin Zolkipli, Nik Fatinah N Mohd Farid, Mohammed Alsamman, Zainab Ali Mohammed, 2025, ArXiv Preprint)
- A Method to Construct Vulnerability Knowledge Graph based on Heterogeneous Data(Yizhen Sun, Dandan Lin, Hong Song, Minjia Yan, Linjing Cao, 2020, 2020 16th International Conference on Mobility, Sensing and Networking (MSN))
- RelExt: Relation Extraction using Deep Learning approaches for Cybersecurity Knowledge Graph Improvement(Aditya Pingle, Aritran Piplai, Sudip Mittal, A. Joshi, James Holt, Richard Zak, 2019, 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM))
- K-CTIAA: Automatic Analysis of Cyber Threat Intelligence Based on a Knowledge Graph(Zong-Xun Li, Yujie Li, Yi-Wei Liu, Cheng Liu, Nan Zhou, 2023, Symmetry)
- Research on the Construction and Optimization Algorithm of Cybersecurity Knowledge Graphs Combining Open Information Extraction with Graph Convolutional Networks(Yihong Zou, 2025, 2025 2nd International Conference on Intelligent Algorithms for Computational Intelligence Systems (IACIS))
- OntoLogX: Ontology-Guided Knowledge Graph Extraction from Cybersecurity Logs with Large Language Models(Luca Cotti, Idilio Drago, Anisa Rula, D. Bianchini, Federico Cerutti, 2025, ArXiv)
- Constructing Knowledge Graph from Cyber Threat Intelligence Using Large Language Model(Jiehui Liu, Jieyu Zhan, 2023, 2023 IEEE International Conference on Big Data (BigData))
- Graph neural networks embedded with domain knowledge for cyber threat intelligence entity and relationship mining(Gan Liu, Kai-Ping Lu, Saiqi Pi, 2025, PeerJ Computer Science)
- Ontologies for Network Security and Future Challenges(Danny Velasco, Glen Rodriguez, 2017, ArXiv Preprint)
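The extraction step that this group of works automates can be illustrated with a minimal sketch. It is a rule-based stand-in, not the method of any listed paper: the regular-expression patterns, the relation names `exploits` and `uses`, and the entity names `GroupX` and `MalwareY` are all made up for illustration, whereas the surveyed systems rely on learned NER and relation-extraction models or LLMs.

```python
import re
from collections import defaultdict

# Hypothetical surface patterns standing in for a learned relation extractor.
RELATION_PATTERNS = [
    (re.compile(r"(?P<head>[A-Z][\w-]+) exploits (?P<tail>CVE-\d{4}-\d{4,7})"), "exploits"),
    (re.compile(r"(?P<head>[A-Z][\w-]+) uses (?P<tail>[A-Z][\w-]+)"), "uses"),
]

def extract_triples(text):
    """Return (head, relation, tail) triples matched by the toy patterns."""
    triples = []
    for pattern, relation in RELATION_PATTERNS:
        for match in pattern.finditer(text):
            triples.append((match.group("head"), relation, match.group("tail")))
    return triples

def build_graph(triples):
    """Store triples as an adjacency map: head -> set of (relation, tail)."""
    graph = defaultdict(set)
    for head, relation, tail in triples:
        graph[head].add((relation, tail))
    return graph

# Fictional report text; the actor, malware, and CVE identifier are invented.
report = "GroupX uses MalwareY. MalwareY exploits CVE-2099-0001."
triples = extract_triples(report)
graph = build_graph(triples)
```

Keeping the graph as a plain adjacency map makes later validation simple, for example checking that every tail of an `exploits` edge looks like a CVE identifier.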
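Knowledge Graph Completion, Multi-Hop Reasoning, and Semantic Representation Enhancement
These works study how to exploit the structural properties of knowledge graphs, using link prediction, graph embedding, time-aware reasoning, and causal discovery, to address missing data and to improve both the semantic understanding of complex attack chains and the prediction of unknown threats.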
- A Novel Multimodal Data Fusion Framework: Enhancing Prediction and Understanding of Inter-State Cyberattacks(Jiping Dong, Mengmeng Hao, Fangyu Ding, Shuai Chen, Jiajie Wu, Jun Zhuo, Dong Jiang, 2025, Big Data Cogn. Comput.)
- Attack prediction in Internet of Things using knowledge graph(Shuqin Zhang, Chunxia Zhao, Shijie Wang, Shuhan Li, Peng Chen, Yunfei Han, 2023, No journal)
- Uncovering CWE-CVE-CPE Relations with Threat Knowledge Graphs(Zhenpeng Shi, Nikolay Matyunin, Kalman Graffi, D. Starobinski, 2023, ACM Transactions on Privacy and Security)
- A Multi-hop Reasoning Framework for Cyber Threat Intelligence Knowledge Graph(Kai Zhou, Yong Xie, Xin Liu, 2024, 2024 IEEE 23rd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom))
- Design of 0-day Vulnerability Monitoring and Defense Architecture based on Artificial Intelligence Technology(Jian Hu, Zhenhong Zhang, Feilu Hang, Linjiang Xie, 2024, Scalable Comput. Pract. Exp.)
- Association Analysis and Prediction of Network Security Vulnerabilities Based on Knowledge Graph(Shi Wu, Li Feng, 2025, Journal of Cyber Security and Mobility)
- Learning Attention-based Representations from Multiple Patterns for Relation Prediction in Knowledge Graphs(Vítor Lourenço, Aline Paes, 2022, ArXiv Preprint)
- CGraph: Graph Based Extensible Predictive Domain Threat Intelligence Platform(Wathsara Daluwatta, Ravindu De Silva, Sanduni Kariyawasam, Mohamed Nabeel, Charith Elvitigala, Kasun De Zoysa, Chamath Keppitiyagama, 2022, ArXiv Preprint)
- Research on Parameter-Efficient Knowledge Graph Completion Methods and Their Performance in the Cybersecurity Field(Bin Chen, Hongyi Li, Ze Shi, 2025, IEEE Access)
- Edge propagation for link prediction in requirement-cyber threat intelligence knowledge graph(Yang Zhang, Jiarui Chen, Zhe Cheng, Xiong Shen, Jiancheng Qin, Ying-Rong Han, Yiqin Lu, 2023, Inf. Sci.)
- Beyond Single-Hop: Link Prediction Through Multi-Hop Reasoning in Malware Knowledge Graphs(Kibrom Bahlibi, Hoang Long Nguyen, Erdogan Dogdu, Roya Choupani, William Mitchell, 2026, 2026 IEEE 5th International Conference on AI in Cybersecurity (ICAIC))
- Exploiting Global Semantic Similarities in Knowledge Graphs by Relational Prototype Entities(Xueliang Wang, Jiajun Chen, Feng Wu, Jie Wang, 2022, ArXiv Preprint)
- Missing Data Imputation Based on Causal Inference to Enhance Advanced Persistent Threat Attack Prediction(Xiang Cheng, Miaomiao Kuang, Hongyu Yang, 2024, Symmetry)
- Cyber Threat Analysis Using CTI Knowledge Graph and RAG Model(Jun-Ho Choi, 2025, Korean Institute of Smart Media)
- Knowledge Enrichment by Fusing Representations for Malware Threat Intelligence and Behavior(Aritran Piplai, Sudip Mittal, Mahmoud Abdelsalam, Maanak Gupta, A. Joshi, Tim Finin, 2020, 2020 IEEE International Conference on Intelligence and Security Informatics (ISI))
- Subsampling for Knowledge Graph Embedding Explained(Hidetaka Kamigaito, Katsuhiko Hayashi, 2022, ArXiv Preprint)
- Time-Aware Cybersecurity Knowledge Graph Reasoning Method for Vulnerability Analysis(Mengjie Wang, Kunlin Li, Yunlong Lu, Fan Zhang, Jiangtao Ma, Yaqiong Qiao, 2026, IEEE Transactions on Automation Science and Engineering)
- TITAN: Graph-Executable Reasoning for Cyber Threat Intelligence(Marco Simoni, Aleksandar Fontana, Andrea Saracino, Paolo Mori, 2025, ArXiv)
- Discriminative Predicate Path Mining for Fact Checking in Knowledge Graphs(Baoxu Shi, Tim Weninger, 2015, ArXiv Preprint)
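Embedding-based link prediction, one of the completion techniques named for this group, can be sketched with a TransE-style score. TransE is used here only as a well-known example and is not claimed to be the model of any listed paper. The two-dimensional embeddings are hand-set toy values, and the names `CVE-A`, `CWE-X`, `CWE-Y`, and `instance_of` are hypothetical; a real system would learn the vectors from the graph.

```python
import math

def transe_score(h, r, t):
    """TransE plausibility: negative L2 distance between h + r and t."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# Hand-set toy embeddings (assumption: a trained model would supply these).
entities = {
    "CVE-A": [0.0, 0.0],
    "CWE-X": [1.0, 0.0],
    "CWE-Y": [0.0, 2.0],
}
relations = {"instance_of": [1.0, 0.0]}

def rank_tails(head, relation, candidates):
    """Rank candidate tail entities for (head, relation, ?), best first."""
    h, r = entities[head], relations[relation]
    scored = [(transe_score(h, r, entities[c]), c) for c in candidates]
    return [c for _, c in sorted(scored, reverse=True)]

ranking = rank_tails("CVE-A", "instance_of", ["CWE-X", "CWE-Y"])
```

The top-ranked candidate is the proposed missing link; in practice the ranking would be filtered against triples already in the graph.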
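Attack Path Analysis, Risk Assessment, and Defense Decision Optimization
These works use attack graphs, reinforcement learning, Bayesian networks, and graph neural networks to quantitatively assess system vulnerabilities, simulate attacker penetration paths, and optimize defense decisions with automated playbooks.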
- An Attack Graph and Reinforcement Learning-Based Analysis of Attack Paths for Vulnerability Assessment(Jinhyuck Kim, Myung-Mook Han, 2025, Journal of Korean Institute of Intelligent Systems)
- Fast Algorithm for Cyber-Attack Estimation and Attack Path Extraction Using Attack Graphs with AND/OR Nodes(Eugene Levner, Dmitry Tsadikovich, 2024, Algorithms)
- Dual-Reinforcement-Learning-Based Attack Path Prediction for 5G Industrial Cyber–Physical Systems(Xinge Li, Xiaoya Hu, Tao Jiang, 2024, IEEE Internet of Things Journal)
- Assessing Cyber-Physical Security in Industrial Control Systems(Martín Barrère, Chris Hankin, Demetrios G. Eliades, Nicolas Nicolau, Thomas Parisini, 2019, ArXiv Preprint)
- GAT-APG: Graph Attention Network-Based Attack Path Generation for Security Simulation(Min Geun Song, Jaewoong Choi, Huy Kang Kim, 2025, IEEE Access)
- A Hybrid Approach to Vulnerability Assessment Combining Attack Graph and Hidden Markov(Yikang Wang, Yuqing Zhai, 2023, 2023 8th International Conference on Signal and Image Processing (ICSIP))
- Physics-Informed Graph Neural Networks for Attack Path Prediction(François Marin, Pierre-Emmanuel Arduin, Myriam Merad, 2025, J. Cybersecur. Priv.)
- GAPPO: Graph-Attention Enhanced Reinforcement Learning for Efficient Attack Path Planning(Xiuping Li, Yong Li, Yangbai Zhang, Wen Sun, Xiaowei Li, Junchao Fan, X. Chang, 2025, 2025 IEEE 24th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom))
- Optimal Attack Path Planning based on Reinforcement Learning and Cyber Threat Knowledge Graph Combining the ATT&CK for Air Traffic Management System(Chao Liu, Buhong Wang, Fan Li, Jiwei Tian, Yong Yang, Peng Luo, Zhouzhou Liu, 2024, IEEE Transactions on Transportation Electrification)
- Efficient network attack path optimization method based on prior knowledge-based PPO algorithm(Qiuxiang Li, Jianping Wu, 2025, Cybersecurity)
- A Risk Prediction Model for Network Security Using Knowledge Graph(Tingting Luo, 2023, 2023 IEEE 6th International Conference on Knowledge Innovation and Invention (ICKII))
- Attack Hypotheses Generation Based on Threat Intelligence Knowledge Graph(F. Kaiser, Uriel Dardik, Aviad Elitzur, Polina Zilberman, Nir Daniel, M. Wiens, F. Schultmann, Y. Elovici, Rami Puzis, 2023, IEEE Transactions on Dependable and Secure Computing)
- VKG2AG : Generating Automated Knowledge-Enriched Attack Graph (AG) from Vulnerability Knowledge Graph (VKG)(M. Talukder, Rakesh Podder, Indrajit Ray, 2025, No journal)
- Dynamic Attack Path Prediction and Visualization for Industrial Cyber-Physical Systems Under Cyber Attacks(Zijin Wang, Minrui Fei, Yao Xiong, Aimin Wang, 2024, 2024 43rd Chinese Control Conference (CCC))
- Security risk situation quantification method based on threat prediction for multimedia communication network(Hao Hu, Hongqi Zhang, Yingjie Yang, 2018, Multimedia Tools and Applications)
- Network Vulnerability Assessment based on Knowledge Graph(Jinyan Cheng, Xiaobin Tan, Hao Wang, Jian Wang, Xiaofeng Jiang, Zhaoyang Jin, 2023, 2023 9th International Conference on Big Data Computing and Communications (BigCom))
- Graph-Based Attack Path Discovery for Network Security(Qiaoran Meng, Huilin Wang, Nay Oo, Hoon Wei Lim, Benedikt Johannes Schätz, Biplab Sikdar, 2023, 2023 7th Cyber Security in Networking Conference (CSNet))
- Optimal Security Protection Selection Strategy Based on Markov Model Attack Graph(Jinwei Yang, Yu Yang, 2021, Journal of Physics: Conference Series)
- Quantitative Method for Network Security Situation Based on Attack Prediction(Hao Hu, Hongqi Zhang, Yuling Liu, Yongwei Wang, 2017, Secur. Commun. Networks)
- Hyper attack graph: Constructing a hypergraph for cyber threat intelligence analysis(Junbo Jia, Li Yang, Yuchen Wang, Anyuan Sang, 2024, Comput. Secur.)
- Optimization of the Decision-making Process of Digital Twins in Network Security Based on Graph Neural Networks(Fangzhou He, Wei Bai, Zhiqi Wang, 2025, Netw. Secur.)
- Reducing benign positives in threat detection systems: A graph-based approach to contextualizing security alerts(Emmanuel Joshua, 2025, International Journal of Science and Research Archive)
- Incorporating Distributed Invariants in Autonomous Cybersecurity Knowledge Graphs: A Scalable Approach Using GNNs and LLMs(Nader Belhadj, M. Mezghich, Jaouher Fattahi, Ridha Ghayoula, Lassaad Latrach, 2025, 2025 International Joint Conference on Neural Networks (IJCNN))
- Operationalizing Cybersecurity Knowledge: Design, Implementation & Evaluation of a Knowledge Management System for CACAO Playbooks(Orestis Tsirakis, Konstantinos Fysarakis, Vasileios Mavroeidis, Ioannis Papaefstathiou, 2025, ArXiv Preprint)
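A simple form of attack path analysis on an attack graph is finding the path an attacker is most likely to complete. The sketch below assumes each edge carries an independent exploit success probability and runs Dijkstra's algorithm over the costs -log(p), so the cheapest path is the most probable one. The hosts and probabilities are invented, and the listed papers use richer models (reinforcement learning, Bayesian networks, GNNs).

```python
import heapq
import math

# Hypothetical attack graph: node -> list of (next_node, exploit_success_probability).
attack_graph = {
    "internet": [("web_server", 0.8)],
    "web_server": [("app_server", 0.5), ("database", 0.1)],
    "app_server": [("database", 0.9)],
    "database": [],
}

def most_likely_path(graph, source, target):
    """Dijkstra over -log(p): minimizing summed cost maximizes the path probability."""
    queue = [(0.0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return path, math.exp(-cost)
        if node in visited:
            continue
        visited.add(node)
        for neighbor, prob in graph[node]:
            if neighbor not in visited:
                heapq.heappush(queue, (cost - math.log(prob), neighbor, path + [neighbor]))
    return None, 0.0

path, probability = most_likely_path(attack_graph, "internet", "database")
```

In this toy graph the three-hop route through `app_server` (0.8 × 0.5 × 0.9 = 0.36) beats the direct two-hop route (0.8 × 0.1 = 0.08), which is the kind of non-obvious result that motivates path-level rather than per-vulnerability assessment.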
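Advanced Persistent Threat (APT) Modeling, Attribution, and Anomaly Detection
Targeting the stealth of APT attacks specifically, these works use provenance graphs for real-time detection, attacker profiling, threat actor attribution, and variant prediction, with an emphasis on shortening attacker dwell time.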
- Advance persistent threat prediction using knowledge graph(Nagendrababu NC, Samyama Gunjal GH, Himabindhu N, 2024, International Journal of Science and Technology Research Archive)
- APT Attribution Using Heterogeneous Graph Neural Networks with Contextual Threat Intelligence(Abdirahman Jibril Mead, A. Arabo, 2025, Electronics)
- Threatify: APT Threat Variant Generation Using Graph-Based Machine Learning(Boubakr Nour, M. Pourzandi, Mourad Debbabi, 2025, IEEE Transactions on Network and Service Management)
- Cyber Threat Hunting: Non-Parametric Mining of Attack Patterns from Cyber Threat Intelligence for Precise Threats Attribution(Rimsha Kanwal, Umara Noor, Zafar Iqbal, Zahid Rashid, 2025, ArXiv Preprint)
- Threat Actor Type Inference and Characterization within Cyber Threat Intelligence(Vasileios Mavroeidis, Ryan Hohimer, Tim Casey, Audun Jøsang, 2021, ArXiv Preprint)
- APTStop: A Real-Time Framework for APT Defense via Strategic Threat Observation and Prediction(Sungho Lee, Kyeongsik Lee, Sungyoung Cho, Changhee Choi, 2025, IEEE Access)
- Real-time Analytics for APT Detection and Threat Hunting Using Cyber-threat Intelligence and Provenance Graphs(V. Venkatakrishnan, 2025, Proceedings of the 2025 ACM International Workshop on Security and Privacy Analytics)
- Uncovering multi-step attacks with threat knowledge graph reasoning(Xiayu Xiang, Changchang Ma, Liyi Zeng, Wenying Feng, Yushun Xie, Zhaoquan Gu, 2024, Secur. Saf.)
- A Modular Approach to Automatic Cyber Threat Attribution using Opinion Pools(Koen T. W. Teuwen, 2024, ArXiv Preprint)
- CyberVeriGNN: A Graph Neural Network‐Based Approach for Detecting Fake Cyber Threat Intelligence(Congyu Huang, Chenxiao Wang, 2025, Security and Privacy)
- Research on Network Security Threat Analysis Method Based on Knowledge Graph(Zhenwan Zou, Bin Wang, Feng Li, Bo Ye, 2024, 2024 IEEE 7th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC))
- Wireless Network Security Prediction Using Machine Learning and Graph Neural Networks(D. R. Tripathi, Prajwal Rushi Kshirsagar, Vibhor Sheshrao Patil, 2024, International Journal for Research in Applied Science and Engineering Technology)
- Development of a Cyber Security System for Active Directory Using AI-Powered Threat Detection and Response for Enhanced Enterprise Security(Manish Kumar, Pell Reddy Rajender Reddy, Ramesh Julakanti, Ram Reddy Jonnalagadda, K. Reddy, Kamalakar Gunupati, 2025, 2025 International Conference on Computing and Communications (COMPUTINGCON))
- Web Scale Graph Mining for Cyber Threat Intelligence(Scott Freitas, Amirhossein Gharib, 2024, Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2)
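The provenance-graph idea named in this group's description can be sketched as a backward trace from an alerted entity to everything that causally influenced it. This is a minimal illustration under stated assumptions: the event tuples, process names, and file names are invented, and each tuple is assumed to mean that the source causally influenced the target. Real provenance systems work on audit logs with timestamps and far larger graphs.

```python
from collections import deque

# Hypothetical audit events: (source, action, target), where source influenced target.
events = [
    ("browser.exe", "wrote", "dropper.exe"),
    ("dropper.exe", "spawned", "payload.exe"),
    ("payload.exe", "wrote", "exfil.zip"),
    ("notepad.exe", "wrote", "notes.txt"),
]

def backward_trace(events, alert_entity):
    """Return the alert entity plus all of its causal ancestors (breadth-first)."""
    parents = {}
    for source, _action, target in events:
        parents.setdefault(target, []).append(source)
    seen = {alert_entity}
    queue = deque([alert_entity])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

ancestors = backward_trace(events, "payload.exe")
```

Unrelated activity such as `notepad.exe` is excluded from the result, which is how backward tracing narrows an investigation to the entities on the attack's causal chain.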
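Intelligent Early Warning Driven by Large Language Models (LLMs) and Neuro-Symbolic AI
Reflecting the latest trends, these works combine the semantic understanding of LLMs with the logical reasoning of symbolic AI, using retrieval-augmented generation (RAG) and neuro-symbolic frameworks to improve the explainability of early warnings and the accuracy of vulnerability detection.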
- KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment(Zongzong Wu, Fengxiao Tang, Ming Zhao, Yufeng Li, 2024, ArXiv)
- LRCTI: A Large Language Model-Based Framework for Multi-Step Evidence Retrieval and Reasoning in Cyber Threat Intelligence Credibility Verification(Fengxiao Tang, Huan Li, Ming Zhao, Zongzong Wu, Shisong Peng, Tao Yin, 2025, ArXiv)
- CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence(Md Tanvirul Alam, Dipkamal Bhusal, Le Nguyen, Nidhi Rastogi, 2024, ArXiv Preprint)
- CTIArena: Benchmarking LLM Knowledge and Reasoning Across Heterogeneous Cyber Threat Intelligence(Yutong Cheng, Yang Liu, Changze Li, D. Song, Peng Gao, 2025, ArXiv)
- CTI-Thinker: an LLM-driven system for CTI knowledge graph construction and attack reasoning(Xiuzhang Yang, Ruijie Zhong, Yuling Chen, Guojun Peng, Di Yao, Chaofan Chen, Chenyang Wang, Dongni Zhang, Yilin Zhou, Zixuan Yang, 2026, Cybersecurity)
- Coherence-driven inference for cybersecurity(Steve Huntsman, 2025, ArXiv Preprint)
- Causal-Aware Knowledge Graph Enhanced RAG for Predictive Cybersecurity Intelligence: A Framework for Attack Progression Analysis and Consequence Prediction(Mounir Belmahjoub, Lamia Benhiba, 2025, 2025 12th International Conference on Soft Computing & Machine Intelligence (ISCMI))
- Neuro-Symbolic AI for Automated Cyber Threat Intelligence Generation(Prajwalasimha S N, D. K. J. Saini, Shital Akash Yewale, Neeru Malik, Fenita F, A. Patil, 2025, 2025 9th International Conference on Computing, Communication, Control and Automation (ICCCBEA))
- Neurosymbolic AI for IoT Security: A Knowledge-Guided Framework for Real-Time IoT Anomaly Detection and Response(Anusha Nerella, Pratik Badri, Siva Teja Reddy Kandula, Vinodkumar Reddy Surasani, Pradeep Kumar Muthukamatchi, Arpit Jain, 2025, 2025 Seventeenth International Conference on Contemporary Computing (IC3))
- VulKiller: Java Web Vulnerability Detection with Code Property Graph and Large Language Models(Xingchen Chen, Baizhu Wang, Mengjun Zhang, Yaqin Cao, Qixu Liu, 2025, ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP))
- CRUSH: Cybersecurity Research using Universal LLMs and Semantic Hypernetworks(Mohit Sewak, Vamsi K Emani, Annam Naresh, 2023, No journal)
- NSCTI: A Hybrid Neuro-Symbolic Framework for AI-Driven Predictive Cyber Threat Intelligence(Suryaprakash Nalluri, Murali Mohan Malyala, Hemalatha Kandagiri, Kiran Kumar Kandagiri, 2025, 2025 4th International Conference on Computational Modelling, Simulation and Optimization (ICCMSO))
- Efficient Instruction Vulnerability Prediction With Heterogeneous SDC Propagation Knowledge Graph(Bao Wen, Jingjing Gu, Dazhong Shen, Qiang Zhou, Fuzhen Zhuang, Yang Liu, Haocheng Song, Xinyi Huang, 2026, IEEE Transactions on Dependable and Secure Computing)
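Knowledge Graph Applications in Specific Vertical Domains and Industrial Scenarios
These works explore customized early warning and alarm tracing with knowledge graphs in specific complex environments such as industrial control systems (ICS), the Internet of Things (IoT), connected vehicles, smart grids, and supply chain security.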
- Proactive security defense: cyber threat intelligence modeling for connected autonomous vehicles(Yinghui Wang, Yilong Ren, Zhiyong Cui, Haiyang Yu, 2024, ArXiv Preprint)
- A heterogeneous graph-based approach for cyber threat attribution using threat intelligence(Junting Duan, Yujie Luo, Zhicheng Zhang, Jianjian Peng, 2024, Proceedings of the 2024 16th International Conference on Machine Learning and Computing)
- BWG: An IOC Identification Method for Imbalanced Threat Intelligence Datasets(Juncheng Lu, Yiyang Zhao, Yan Wang, Jiyuan Cui, Sanfeng Zhang, 2024, 2024 IEEE 23rd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom))
- Blockchain-Powered Secure and Scalable Threat Intelligence System With Graph Convolutional Autoencoder and Reinforcement Learning Feedback Loop(Mohamad Khayat, E. Barka, Mohamed Adel Serhani, F. Sallabi, Khaled Shuaib, Heba M. Khater, 2025, IEEE Access)
- Agentic AI for Cybersecurity: Explainable Graph Neural Networks with Federated and Reinforcement Learning for Threat Prediction and Mitigation(Muhammad Saeed, Hina Shafique, Haseena Manzoor, Muhammad Ibrar, Muhammad Javed, Arbab Kanwal, 2025, 2025 International Conference on Frontiers of Information Technology (FIT))
- Design and Develop MITRE ATT&CK Based Cybersecurity Threat Detection Model(Gurinder Pal Singh, Rohit Bajaj, Manish Kumar Hooda, 2025, 2025 IEEE 5th International Conference on ICT in Business Industry & Government (ICTBIG))
- AI-Driven Threat Intelligence using Graph Neural Networks for Advanced Cybersecurity Defense(R. Rajendran, Venkata pavan kumar reddy Chintham, R.Beaulah Jeyavathana, Vamsi Krishna, Chidipothu, Venu Karunanithi, Dr.B.Jegajothi, 2025, 2025 8th International Conference on Computing Methodologies and Communication (ICCMC))
- Cyber Threat Intelligence Framework using Graph Attention Networks for Dark Web Activity Monitoring(David Neels, Ponkumar Devadhas Professor, Mahmad Muskhan, Satheesh Kumar, Sangaraju Joshika, Syed Mahammed Afzal, Arul Kirubaharan, S. Manager, 2025, 2025 International Conference on NexGen Networks and Cybernetics (IC2NC))
- BRIDG-ICS: AI-Grounded Knowledge Graphs for Intelligent Threat Analytics in Industry 5.0 Cyber-Physical Systems(Padmeswari Nandiya, Ahmad Mohsin, Ahmed Ibrahim, Iqbal H. Sarker, Helge Janicke, 2025, ArXiv Preprint)
- Research on Penetration Testing Method of Power Information System Based on Knowledge Graph(Liu Sheng, Xinyue Shi, Song Yilei, Zhang Lei, Yingying Wang, Yuan Ze, Dandan Li, Liu Xiue, 2023, 2023 IEEE 11th Joint International Information Technology and Artificial Intelligence Conference (ITAIC))
- Cargo Ecosystem Dependency-Vulnerability Knowledge Graph Construction and Vulnerability Propagation Study(Peiyang Jia, Chengwei Liu, Hongyu Sun, Chengyi Sun, Mianxue Gu, Yang Liu, Yuqing Zhang, 2022, ArXiv)
- Insider Threats Risk Warning and Traceability Based on User Behavior Entity Analysis and Knowledge Graph(Qi Ji, Xiaoshuang Xu, Chao Shen, Hongkai Xue, Yuhang Ma, Xiang Qiu, 2023, 2023 5th International Conference on Robotics, Intelligent Control and Artificial Intelligence (RICAI))
- Integrating Behavioral Biometrics and CTI Ontologies for Predictive Analysis of Insider Threats and APT Actor Behavior Patterns(Pamela Gado, Funmi Eko Ezeh, Stephanie Onyekachi Oparah, Adeyeni Adeleke, Stephen Vure Gbaraba, 2024, Global Multidisciplinary Perspectives Journal)
- A Hybrid Graph-Based Risk Assessment and Attack Path Detection Model for IoT Systems(Ferhat Arat, S. Akleylek, Zaliha Yüce Tok, 2025, IEEE Access)
- Towards Cyber Threat Intelligence for the IoT(Alfonso Iacovazzi, Han Wang, Ismail Butun, Shahid Raza, 2024, ArXiv Preprint)
- Cyber Threat Intelligence : Challenges and Opportunities(Mauro Conti, Ali Dehghantanha, Tooska Dargahi, 2018, ArXiv Preprint)
- CAKG: A Framework for Cybersecurity Threat Detection of Automotive via Knowledge Graph(Peng Yang, Li-Juan Wang, Yun Li, Xuedong Song, Yaxin Wang, Biheng Guo, 2023, 2023 8th International Conference on Data Science in Cyberspace (DSC))
- Research on Dynamic Anomaly Detection Method of Edge Devices Based on Attack Chain Knowledge Graph(Xinhao Chen, Houding Zhang, Dongying Gao, Jie Fu, Zhilei Lv, Zheng Li, 2025, 2025 IEEE 8th Information Technology and Mechatronics Engineering Conference (ITOEC))
- Research on Power Cyber-Physical Cross-Domain Attack Paths Based on Graph Knowledge(Shenjian Qiu, Zhipeng Shao, Jian Wang, Shiyou Xu, Jiaxuan Fei, 2024, Applied Sciences)
- Knowledge graph and behavior portrait of intelligent attack against path planning(Li Zhang, Zhao Li, Huali Ren, Xiao Yu, Yuxi Ma, Quanxin Zhang, 2022, International Journal of Intelligent Systems)
- A Power Monitor System Cybersecurity Alarm-Tracing Method Based on Knowledge Graph and GCNN(Tianhao Ma, Juan Yu, Binquan Wang, Maosheng Gao, Zhifang Yang, Yajie Li, Mao Fan, 2025, Applied Sciences)
- Research on Threat Prediction Method for Smart Grid System Information Domain Based on Graph Neural Network(Zhihong Zhang, Linhao Li, 2025, 2025 10th International Conference on Cyber Security and Information Engineering (ICCSIE))
- A Global Analysis of Cyber Threats to the Energy Sector: "Currents of Conflict" from a Geopolitical Perspective(Gustavo Sánchez, Ghada Elbez, Veit Hagenmeyer, 2025, ArXiv Preprint)
- A Collaborative Programmable LFA Defense Using Temporal Graph Learning in AIoT(Yikun Li, Ying Liu, Yu Xia, Weiting Zhang, Wei Quan, Jiawen Kang, Hongke Zhang, 2025, IEEE Internet of Things Journal)
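Robustness, Adversarial Defense, and Integrated Management Systems
These works address the security of knowledge graphs themselves (for example, adversarial attacks and intelligence poisoning), uncertainty modeling, and the construction of cross-organization threat intelligence sharing and integrated management platforms.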
- Research Directions in Cyber Threat Intelligence(Stjepan Groš, 2020, ArXiv Preprint)
- Untargeted Adversarial Attack on Knowledge Graph Embeddings(Tianzhe Zhao, Jiaoyan Chen, Yanchi Ru, Qika Lin, Yuxia Geng, Jun Liu, 2024, Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval)
- Mathematical Frameworks for Threat Intelligence Analysis: Leveraging Graph Theory and Machine Learning for Cyber Threat Assessment(Kanchan Rahul Jamnik, Nita Swapnil Dhake, Sonam Rani, Rajiv Mishra, V. Ukey, J. Patil, 2024, Communications on Applied Nonlinear Analysis)
- Research on Spurious-Negative Sample Augmentation-Based Quality Evaluation Method for Cybersecurity Knowledge Graph(Bin Chen, Hongyi Li, Ze Shi, 2024, Mathematics)
- Soft Topology for Cyber Threat Intelligence: A Knowledge Graph Perspective(R. Deepa, 2025, International Journal of Applied Mathematics)
- LoSA: A Local Structural Approach to Adversarial Attack on the Knowledge Graph-based Question Answering System(Neha Pokharel, Arnab Sharma, Adel Memariani, Michael Röder, A. Ngomo, 2025, No journal)
- A Graph Neural Network Framework for Real Time Cyber Threat Intelligence and Risk Analysis(Voruganti Naresh Kumar, Mns Suvarna Kumar, R. Suhasini, Manikala Lakshman, Naveenkumar Anbalagan, Krithika. D.R., 2025, 2025 IEEE International Conference on Advanced Computing Technologies (ICACT))
- Beyond Traditional Methods: A NLP and Knowledge Graph Approach to Cyber Threat Detection and Visualization(M. Qaisar, Bilal Ali, Hasan Ali Khattak, 2025, 2025 3rd International Conference on Foundation and Large Language Models (FLLM))
- Network Security Threat Detection System Based on Knowledge Graph(Meilun Zheng, Tianqi He, Nan Wang, Xinyu Wang, J. Zuo, 2025, 2025 40th Youth Academic Annual Conference of Chinese Association of Automation (YAC))
- Generating Fake Cyber Threat Intelligence Using Transformer-Based Models(P. Ranade, Aritran Piplai, Sudip Mittal, A. Joshi, Tim Finin, 2021, 2021 International Joint Conference on Neural Networks (IJCNN))
- A Knowledge Graph-Based Early Warning Model for Network Nuisance Actors(Ying Xiong, Jiaxin Yao, Fucai Luo, Yanhua Liu, 2025, 2025 21st International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD))
- Cybersecurity Threat Hunting and Vulnerability Analysis Using a Neo4j Graph Database of Open Source Intelligence(Elijah Pelofske, Lorie M. Liebrock, Vincent Urias, 2023, ArXiv Preprint)
- A Cyber Threat Intelligence Management Platform for Industrial Environments(Alexandros Papanikolaou, Aggelos Alevizopoulos, Christos Ilioudis, Konstantinos Demertzis, Konstantinos Rantos, 2023, ArXiv Preprint)
- Knowledge Acquisition and Insider Threat Prediction in Relational Database Systems(Qussai M. Yaseen, B. Panda, 2009, 2009 International Conference on Computational Science and Engineering)
- PRTIRG: A Knowledge Graph for People-Readable Threat Intelligence Recommendation(Ming Du, Jun Jiang, Zhengwei Jiang, Zhigang Lu, Xiangyu Du, 2019, No journal)
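This survey shows that knowledge graph-based cybersecurity threat early-warning technology has formed a complete stack, from low-level automated construction to high-level intelligent decision-making. The core trends are: 1) greater automation, using LLMs and NLP to extract intelligence accurately from massive heterogeneous sources; 2) deeper reasoning, using GNNs, time-aware reasoning, and neuro-symbolic AI to better predict and explain complex attack chains; 3) more diverse scenario adaptation, building cyber-physical coupled defense models for vertical domains such as ICS and IoT; 4) attention to security and robustness, with early research on adversarial attacks against knowledge graphs and on intelligence quality assurance. The deep integration of knowledge graphs and LLMs (KG+LLM) is becoming a key path toward proactive, accurate, and explainable threat early warning.
149 related works in total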
Open-source cyber threat intelligence (OSCTI) is becoming more influential in obtaining current network security information. Most studies on cyber threat intelligence (CTI) focus on automating the extraction of threat entities from public sources that describe attack events. The cybersecurity knowledge graph aims to change the expression of threat knowledge so that security researchers can accurately and efficiently obtain various types of threat information for preliminary intelligent decisions. The attribution technology can not only assist security analysts in detecting advanced persistent threats, but can also identify the same threat from different attack events. Therefore, it is important to trace the attack threat actor. In this study, we used the knowledge graph technology, considered the latest research on cyber threat attack attribution, and thoroughly examined key related technologies and theories in the process of constructing and applying the advanced persistent threat (APT) knowledge graph from OSCTI. We designed a cybersecurity platform named CSKG4APT based on a knowledge graph. Inspired by the theory of ontology, we constructed CSKG4APT as an APT knowledge graph model based on real APT attack scenarios. We then designed an APT threat knowledge extraction algorithm for completing and updating the knowledge graph using deep learning and expert knowledge. Finally, we proposed a practical APT attack attribution method with attribution and countermeasures. CSKG4APT is not a passive defense method in traditional network confrontation but one that integrates a large amount of fragmented intelligence and can actively adjust its defense strategy. It lays the foundation for further dominance in network attack and defense.
In cybersecurity, constructing an accurate knowledge graph is vital for discovering key entities and relationships in security incidents buried in vast unstructured threat reports. Traditional knowledge-graph construction pipelines based on handcrafted rules or conventional machine learning models falter when the data scale and linguistic variety grow. GraphRAG, a retrieval-augmented generation (RAG) framework that splits documents into fixed-length chunks and then retrieves the most relevant ones for generation, offers a scalable alternative yet still suffers from fragmentation and semantic gaps that erode graph integrity. To resolve these issues, this paper proposes SC-LKM, a cybersecurity knowledge-graph construction method that couples the GraphRAG backbone with hierarchical semantic chunking. SC-LKM applies semantic chunking to build a cybersecurity knowledge graph that avoids the fragmentation and inconsistency seen in prior work. The semantic chunking method first respects the native document hierarchy and then refines boundaries with topic similarity and named-entity continuity, maintaining logical coherence while limiting information loss during the fine-grained processing of unstructured text. SC-LKM further integrates the semantic comprehension capacity of Qwen2.5-14B-Instruct, markedly boosting extraction accuracy and reasoning quality. Experimental results show that SC-LKM surpasses baseline systems in entity-recognition coverage, topology density, and semantic consistency.
In the digital age, software security is essential for the stability of information systems and data protection, yet increasing complexity in software systems has made vulnerabilities a significant cybersecurity threat, leading to data breaches, system crashes, and service disruptions. Traditional vulnerability assessments usually analyze vulnerabilities in isolation, ignoring their temporal relations and the risk of attackers exploiting multiple vulnerabilities simultaneously, known as co-exploitation. This paper proposes an innovative time-aware cybersecurity knowledge graph (TCG) reasoning method called TCGFormer, which is designed to address these challenges. TCGFormer comprises four modules: 1) an entity encoding module that adjusts attention based on positional information and interaction frequency, 2) a novel attention mechanism for encoding relational topology graphs, 3) a joint sequence encoding module for extracting temporal representations and node relations from historical interactions, and 4) a parameter learning module for predicting entities and relations. Extensive experiments on three public temporal datasets demonstrate that TCGFormer significantly outperforms existing baseline methods, and validation on a cybersecurity knowledge graph dataset (including NVD, CVE details, CWE database, and EDB) further confirms its efficacy in identifying co-exploitation behaviors. Note to Practitioners: The rising complexity of software systems introduces new challenges in cybersecurity, notably in how multiple vulnerabilities are exploited simultaneously, a tactic increasingly common in cyber-attacks. This work’s contribution, TCGFormer, offers a novel approach to predict and analyze such threats by recognizing patterns in historical data and predicting future exploits. While our method advances the detection of complex attack strategies, it remains essential to integrate these insights with existing security protocols to enhance their effectiveness. Practitioners are encouraged to consider how this method could be tailored to fit specific security needs, and to explore its integration into broader cybersecurity frameworks to protect against sophisticated cyber threats effectively.
Cyberattacks, especially Advanced Persistent Threats (APTs), have become more complex. These evolving threats challenge traditional defense systems, which struggle to counter long-lasting and covert attacks. Cybersecurity Knowledge Graphs (CKGs), enabled through the integration of multi-source CTI, introduce novel approaches for proactive defense. However, building CKGs faces challenges such as unclear terminology, overlapping entity relationships in attack chains, and differences in CTI across sources. To tackle these challenges, we propose the CyberKG framework, which improves entity recognition and relation extraction using a SecureBERT_Plus-BiLSTM-Attention-CRF joint architecture. Semantic features are captured using a domain-adapted SecureBERT_Plus model, while temporal dependencies are modeled through BiLSTM. Attention mechanisms highlight key cross-sentence relationships, while CRF incorporates ATT&CK rule constraints. Hierarchical clustering (HAC), based on contextual embeddings, facilitates dynamic entity disambiguation and semantic fusion. Experimental evaluations on the DNRTI and MalwareDB datasets demonstrate strong performance in extraction accuracy, entity normalization, and the resolution of overlapping relations. The constructed knowledge graph supports APT tracking, attack-chain provenance, and proactive defense prediction.
To address the complexity of security threat intelligence sources and the difficulty of understanding and sharing that intelligence, this paper applies deep learning to threat intelligence features using a Restricted Boltzmann Machine, which maps the original threat intelligence features from a high-dimensional space to a low-dimensional space layer by layer, and constructs a cyberspace security threat knowledge graph. Using deep learning to build a multi-level, structured knowledge graph of cyberspace security threats reflects the structural characteristics of the knowledge graph, giving the graph a lower dimension and a higher level of abstraction. The experiment verifies the feasibility of constructing the cyberspace security threat knowledge graph and, by comparison with traditional threat detection methods, shows that the knowledge-graph-based security threat perception method is better suited to perceiving high-intensity security threats.
With the increasing complexity and connectivity of vehicles, ensuring their security has become a critical concern. In this study, we propose CAKG: A Framework For Cybersecurity Threat Detection Of Automotive Via Knowledge Graph, achieved with a knowledge graph for vehicle vulnerability and threat intelligence. We integrate existing cybersecurity knowledge bases to analyze potential attack surfaces and scenarios specific to automobiles. By using keyword extraction and text similarity analysis, we identify threats relevant to the events produced by automotive Intrusion Detection Systems (IDS). Moreover, we leverage another knowledge graph to analyze attack logs obtained from actual vehicles, enabling us to further correlate security events and produce situational analysis. Our framework provides a holistic perspective on vehicle security, facilitating threat modeling and enhancing our understanding of potential attack scenarios.
Software engineers and security professionals rely on a variety of sources of information, including known vulnerabilities, newly identified weaknesses, and threats, as well as attack patterns and current mitigations. Because this information is spread across different places, developers must spend more effort following all the cross-referenced data and finding appropriate solutions to their security issues in a timely manner. Software developers cannot keep up with the breadth of issues and vulnerabilities, which grows constantly over time; the rising number of security issues cannot be matched by developers, who need more help from intelligent tools. Therefore, in this work, we present CyberGraph, a tool to automatically build and update a single, easily queryable cybersecurity knowledge graph by automatically linking heterogeneous data from different public repositories. The resulting unique integrated dataset, thanks to its magnitude, allows the execution of sophisticated queries that can quickly provide new insights and valuable perspectives.
As the forms of cyber threats become increasingly severe, cybersecurity knowledge graphs (KGs) have become essential tools for understanding and mitigating these threats. However, the quality of the KG is critical to its effectiveness in cybersecurity applications. In this paper, we propose a spurious-negative sample augmentation-based quality evaluation method for cybersecurity KGs (SNAQE) that includes two key modules: the multi-scale spurious-negative triple detection module and the adaptive mixup based on the attention mechanism module. The multi-scale spurious-negative triple detection module classifies the sampled negative triples into spurious-negative and true-negative triples. Subsequently, the attention mechanism-based adaptive mixup module selects appropriate mixup targets for each spurious-negative triple, constructing partially correct triples and achieving more precise sample generation in the entity embedding space to assist in training the KG quality evaluation models. Through extensive experimental validation, the SNAQE model not only performs excellently in general-domain KG quality evaluation but also achieves outstanding outcomes in the cybersecurity KGs, significantly enhancing the accuracy and F1 score of the model, with the best F1 score of 0.969 achieved on the FB15K dataset.
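The mixup step mentioned in the abstract above can be illustrated with a minimal sketch. It shows only plain mixup, a convex combination of two embedding vectors; the attention-based choice of mixup targets and the adaptive mixing weight described in the abstract are not reproduced. The function name `mixup` and the example vectors are assumptions made for illustration.

```python
def mixup(vec_a, vec_b, lam):
    """Plain mixup: lam * vec_a + (1 - lam) * vec_b, element by element.

    Assumption: embeddings are equal-length lists of floats and lam is a
    fixed value in [0, 1]. The abstract describes an adaptive, attention-based
    variant whose details are not given, so it is not modelled here.
    """
    if len(vec_a) != len(vec_b):
        raise ValueError("embeddings must have the same dimension")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * a + (1.0 - lam) * b for a, b in zip(vec_a, vec_b)]


# Hypothetical 2-d embeddings of a spurious-negative entity and a mixup target.
spurious_negative = [1.0, 0.0]
mixup_target = [0.0, 1.0]
mixed = mixup(spurious_negative, mixup_target, lam=0.75)
```

With `lam` close to 1 the generated sample stays near the first embedding; with `lam` close to 0 it moves toward the mixup target.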
Cybersecurity plays a critical role in modern society, and leveraging knowledge graphs can enhance cybersecurity and privacy in cyberspace. By harnessing the heterogeneous and vast amount of information on potential attacks, organizations can improve their ability to proactively detect and mitigate any threat or damage to their valuable online resources. Integrating critical cyberattack information into a knowledge graph offers a significant boost to cybersecurity, safeguarding cyberspace from malicious activities. This information can be obtained from structured and unstructured data, with a particular focus on extracting valuable insights from unstructured text through natural language processing (NLP). By storing a wide range of cyber threat information in the form of semantic triples, which machines can interpret autonomously, cybersecurity experts gain improved visibility and are better equipped to identify and address cyber threats. However, constructing an efficient knowledge graph poses challenges. In our research, we construct a cybersecurity knowledge graph (CKG) autonomously using heterogeneous data sources. We further enhance the CKG by applying logical rules and employing graph analytic algorithms. To evaluate the effectiveness of our proposed CKG, we formulate a set of queries as questions to validate the logical rules. Ultimately, the CKG empowers experts to efficiently analyze data and gain a comprehensive understanding of cyberattacks, thereby helping to minimize potential attack vectors.
Security Analysts that work in a ‘Security Operations Center’ (SoC) play a major role in ensuring the security of the organization. The amount of background knowledge they have about the evolving and new attacks makes a significant difference in their ability to detect attacks. Open source threat intelligence sources, like text descriptions about cyber-attacks, can be stored in a structured fashion in a cybersecurity knowledge graph. A cybersecurity knowledge graph can be paramount in aiding a security analyst to detect cyber threats because it stores a vast range of cyber threat information in the form of semantic triples which can be queried. A semantic triple contains two cybersecurity entities with a relationship between them. In this work, we propose a system to create semantic triples over cybersecurity text, using deep learning approaches to extract possible relationships. We use the set of semantic triples generated through our system to assert in a cybersecurity knowledge graph. Security Analysts can retrieve this data from the knowledge graph, and use this information to form a decision about a cyber-attack.
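The idea of asserting semantic triples into a knowledge graph and querying them can be sketched as follows. This is a minimal in-memory illustration, not the system described in the abstract; the class `TripleStore`, its methods, and all entity names are hypothetical placeholders.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """A semantic triple: two entities joined by a relationship."""
    head: str
    relation: str
    tail: str


class TripleStore:
    """Tiny in-memory store of triples, indexed by head entity (illustrative only)."""

    def __init__(self):
        self._triples = set()
        self._by_head = defaultdict(set)

    def assert_triple(self, head, relation, tail):
        triple = Triple(head, relation, tail)
        self._triples.add(triple)
        self._by_head[head].add(triple)

    def query(self, head=None, relation=None, tail=None):
        """Return, in sorted order, the triples matching every field that is not None."""
        if head is not None:
            candidates = self._by_head.get(head, set())
        else:
            candidates = self._triples
        return sorted(
            t for t in candidates
            if (relation is None or t.relation == relation)
            and (tail is None or t.tail == tail)
        )


store = TripleStore()
# Hypothetical facts, not taken from any real report.
store.assert_triple("ExampleMalware", "exploits", "CVE-0000-0001")
store.assert_triple("ExampleMalware", "uses", "spear-phishing")
store.assert_triple("ExampleGroup", "deploys", "ExampleMalware")

# An analyst-style question: what does ExampleMalware exploit?
exploited = store.query(head="ExampleMalware", relation="exploits")
```

A production knowledge graph would normally live in a graph database or RDF store; the dictionary index here only shows why triples are convenient to query.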
This paper presents ETRGNN-ZT, a scalable and automated cybersecurity framework that integrates Neo4j-based knowledge graphs, Graph Neural Networks (GNNs) using the Deep Graph Library (DGL), and Zero-Trust (ZT) policy-driven mitigation. The proposed framework addresses key limitations of static security models by introducing a real-time pipeline for threat detection, risk prioritization, and proactive mitigation. Structured data from the MITRE ATT&CK framework and unstructured Open Source Intelligence (OSINT) are fused to construct a dynamic knowledge graph representing attack tactics, vulnerabilities, and indicators of compromise (IOCs). The GNN module detects attack paths and generates threat scores, which drive automated ZT policy updates. Experimental evaluation demonstrates a detection accuracy of 96.7%, a mitigation success rate of 98.4%, and sub-0.1-second inference latency, while maintaining linear scalability across millions of graph nodes. The ETRGNN-ZT framework offers an intelligent, resilient, and adaptive approach to modern cyber defense.
As cyber-attack techniques become increasingly sophisticated, cyber threat intelligence has emerged as a crucial resource for cybersecurity defense. However, the vast and fragmented nature of CTI makes it challenging to utilize efficiently. To address this issue, this research introduces a new methodology that integrates open information extraction with graph convolutional networks for building and refining cybersecurity knowledge graphs. In this study, we use open information extraction technology to extract entities and relationships from cybersecurity threat information to construct an initial knowledge graph. We then introduce graph convolutional networks to perform deep representation learning on the graph, optimizing node connectivity and enhancing the quality of the knowledge graph. Based on this, we conduct in-depth research on a joint extraction technique for entity-relationship pairs using multi-grained dilated convolutions, aiming to reduce error propagation and enhance information interaction. The proposed method significantly improves entity recognition, relationship extraction, and the quality of knowledge graphs, providing robust support for cybersecurity defense.
Cybersecurity threat analysis requires systems that not only retrieve information but also reason about attack progressions, quantify uncertainty, and guide defensive actions. Traditional Retrieval-Augmented Generation (RAG) tools lack causal reasoning, temporal prediction, and confidence estimation. We introduce CA-RAG, a causal-aware extension of knowledge graph-based RAG, which integrates pre-computed causal relationships from graph neural network (GNN) analysis into a Neo4j cybersecurity graph based on the SEPSES Knowledge Graph. CA-RAG combines graph traversal, SEPSES-enriched embeddings, and causal path discovery to deliver three key advances: temporal attack progression prediction with confidence scores, prioritized root-cause roadmaps, and defensive optimization through causal bottleneck analysis. Comparative evaluation against CyKG-RAG shows significant gains in prediction accuracy, uncertainty-aware threat assessment, and evidence-based decision support. By embedding causal reasoning, CA-RAG shifts threat intelligence from descriptive retrieval to proactive, predictive analysis.
As an advanced knowledge management and reasoning tool, knowledge graph technology can significantly enhance the efficiency of managing and analyzing cyberthreat intelligence, providing strong technical support for cybersecurity threat identification and situational awareness. This paper proposes a multi-level graph clustering-based model for completing cyberthreat intelligence knowledge graphs (GCCKG). The model utilizes a multi-level graph clustering approach to divide the knowledge graph into communities and selects a portion of high-degree entities within each community as the reserved entities. Additionally, we propose a path-enhanced cosine similarity measurement method to measure the similarity between entities, and a graph attention network is employed to iteratively update the embeddings of entities and relations, capturing key information in the graph-structured data. Experimental results demonstrate that the proposed GCCKG model significantly improves evaluation values such as MRR and Hits@K in knowledge graph completion tasks on several general domain knowledge graphs, including FB15K, WN18, FB15K-237, as well as the cyberthreat intelligence knowledge graph CS13K, while also significantly reducing the model’s parameter size. This provides a novel solution for knowledge graph completion in the field of cyberthreat intelligence.
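The MRR and Hits@K metrics named in the abstract above have standard definitions, sketched here for reference. The helper names and the example scores are assumptions; ties are resolved optimistically in `rank_of`, which is one common convention and not necessarily the one used in the paper.

```python
def rank_of(correct, scores):
    """1-based rank of `correct` among the candidates in `scores`.

    Higher score means a better candidate. Ties are resolved optimistically:
    only strictly higher scores push the correct entity down.
    """
    target = scores[correct]
    return 1 + sum(1 for s in scores.values() if s > target)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test queries (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test queries whose correct entity is ranked in the top k."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical candidate scores for one link-prediction query.
example_scores = {"entity_a": 0.9, "entity_b": 0.7, "entity_c": 0.2}
example_rank = rank_of("entity_b", example_scores)

# Hypothetical ranks of the correct entity over four test queries.
ranks = [1, 2, 4, 10]
mrr = mean_reciprocal_rank(ranks)
hits1 = hits_at_k(ranks, 1)
hits3 = hits_at_k(ranks, 3)
hits10 = hits_at_k(ranks, 10)
```

MRR rewards placing the correct entity near the top of the ranking, while Hits@K only asks whether it appears within the first K candidates.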
No abstract available
Cyber threat detection, analysis, and mitigation are becoming more and more complex, as traditional signature-based cybersecurity methods tend to be limited in their effectiveness against evolving threats. This work presents a novel framework for real-time threat detection and visualisation leveraging Natural Language Processing (NLP) and Knowledge Graphs (KG). By combining these techniques, this approach connects threat patterns and entities, enabling a more detailed understanding of cyber-attacks. Considering the ever-increasing amount of data and communications, this work demonstrates improvements in detection accuracy as well as the ability to visualise complex attack patterns using the proposed framework, leading to a deeper understanding of cyber-security operations. The proposed framework has been validated using real-world datasets, demonstrating its effectiveness in predicting and mitigating threats better than traditional systems. The study provides new insights into the growing role of NLP and KGs in modern cybersecurity environments.
Industrial control systems (ICS) underpin many key industries, and attacks on them cause heavy losses. However, traditional passive cybersecurity defense methods have difficulty dealing effectively with increasingly complex threats; knowledge graphs offer a new way to analyze and process data in cybersecurity analysis. We propose a novel overall framework for data-driven industrial control network security defense, which integrates fragmented multisource threat data with an industrial network layout through a cybersecurity knowledge graph. To better correlate data when constructing the knowledge graph, we propose a distantly supervised relation extraction model, ResPCNN-ATT; it is based on a deep residual convolutional neural network and an attention mechanism, reduces the influence of noisy data in distant supervision, and better extracts deep semantic features in sentences by using deep residuals. We empirically demonstrate the performance of the proposed method in the general cybersecurity domain using the CSER dataset; the model proposed in this paper achieves higher accuracy than other models. The ICSER dataset was then used to construct a cybersecurity knowledge graph (CSKG) based on an analysis of specific industrial control scenarios, and the graph was visualized for further security analysis of the industrial control system.
System logs represent a valuable source of Cyber Threat Intelligence (CTI), capturing attacker behaviors, exploited vulnerabilities, and traces of malicious activity. Yet their utility is often limited by lack of structure, semantic inconsistency, and fragmentation across devices and sessions. Extracting actionable CTI from logs therefore requires approaches that can reconcile noisy, heterogeneous data into coherent and interoperable representations. We introduce OntoLogX, an autonomous Artificial Intelligence (AI) agent that leverages Large Language Models (LLMs) to transform raw logs into ontology-grounded Knowledge Graphs (KGs). OntoLogX integrates a lightweight log ontology with Retrieval Augmented Generation (RAG) and iterative correction steps, ensuring that generated KGs are syntactically and semantically valid. Beyond event-level analysis, the system aggregates KGs into sessions and employs an LLM to predict MITRE ATT&CK tactics, linking low-level log evidence to higher-level adversarial objectives. We evaluate OntoLogX on both logs from a public benchmark and a real-world honeypot dataset, demonstrating robust KG generation across multiple KG backends and accurate mapping of adversarial activity to ATT&CK tactics. Results highlight the benefits of retrieval and correction for precision and recall, the effectiveness of code-oriented models in structured log analysis, and the value of ontology-grounded representations for actionable CTI extraction.
Textual descriptions in cyber threat intelligence (CTI) reports, such as security articles and news, are rich sources of knowledge about cyber threats, crucial for organizations to stay informed about the rapidly evolving threat landscape. However, current CTI knowledge extraction methods lack flexibility and generalizability, often resulting in inaccurate and incomplete knowledge extraction. Syntax parsing relies on fixed rules and dictionaries, while model fine-tuning requires large annotated datasets, making both paradigms challenging to adapt to new threats and ontologies. To bridge the gap, we propose CTINexus, a novel framework leveraging optimized in-context learning (ICL) of large language models (LLMs) for data-efficient CTI knowledge extraction and high-quality cybersecurity knowledge graph (CSKG) construction. Unlike existing methods, CTINexus requires neither extensive data nor parameter tuning and can adapt to various ontologies with minimal annotated examples. This is achieved through: (1) a carefully designed automatic prompt construction strategy with optimal demonstration retrieval for extracting a wide range of cybersecurity entities and relations; (2) a hierarchical entity alignment technique that canonicalizes the extracted knowledge and removes redundancy; (3) a long-distance relation prediction technique to further complete the CSKG with missing links. Our extensive evaluations using 150 real-world CTI reports collected from 10 platforms demonstrate that CTINexus significantly outperforms existing methods in constructing accurate and complete CSKGs, highlighting its potential to transform CTI analysis with an efficient and adaptable solution for the dynamic threat landscape.
Advanced persistent threats (APTs) are a major threat to cybersecurity, and they are typically attributed to nation-state actors or well-organized groups with sophisticated capabilities. This knowledge graph is intended to help you understand and attribute APT organizations by providing a framework for understanding their characteristics and the challenges, clues, methodologies, and limitations of attribution. By understanding APT organizations together with these attribution challenges, clues, methodologies, and limitations, you can gain valuable insights and methods for unraveling the mystery surrounding APT organizations. The graph highlights the difficulties and intricacies associated with attribution, such as false flags, use of proxies, cooperation between APTs, and the evolving tactics employed by threat actors. State-sponsored attribution is based on government statements or intelligence agency reports; private sector attribution is based on cybersecurity firms’ reports or threat intelligence sharing; and academic and independent research is based on academic and non-academic sources. The graph serves as a resource for cybersecurity professionals, analysts, and researchers looking for a systematic framework to improve their understanding and ability to attribute cyberattacks to attack actors. It offers in-depth analysis and practical advice to navigate the complex landscape of APT attribution in today’s rapidly changing cybersecurity environment.
The rapid advancement of information technologies has significantly intensified the focus on cyberspace security across various sectors. In this evolving landscape, attackers deploy many techniques, including exploits, weakness identification, and complex multi-step attacks, to gain unauthorized access to systems. Conversely, defenders harness insights from a variety of sources to pinpoint potential threats. Prominent public cybersecurity databases such as the Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK), Common Attack Pattern Enumeration and Classification (CAPEC), Common Vulnerabilities and Exposures (CVE), Common Weakness Enumeration (CWE), and Common Platform Enumeration (CPE) provide extensive data on security entities and their interrelations, playing a pivotal role in enriching the understanding of cybersecurity challenges and assisting in comprehensive defensive analyses. However, the semantic cross-analysis of these databases, crucial for identifying obscure threat patterns, remains underexploited. In this study, we amalgamate data from these disparate sources into a cohesive threat knowledge graph and introduce a novel knowledge representation learning approach, A4CKGE (ATT&CK-CAPEC-CWE-CVE-CPE Knowledge Graph Embedding). This method utilizes advanced structural and textual analytics to predict interactions among security entities such as products, vulnerabilities, weaknesses, and multi-step attack sequences, employing complex attack templates generated through a Large Language Model (LLM). Our extensive experiments demonstrate that this approach significantly outperforms existing state-of-the-art methods in effectively predicting these relationships. The findings validate the efficacy of our threat knowledge graph in unveiling hidden connections, thereby highlighting its potential to strengthen cybersecurity defenses substantially.
This paper presents a novel methodology to enhance Autonomous Cybersecurity Knowledge Graphs (ACKGs) by incorporating distributed invariants, ensuring robust consistency in data integrity, access control, and threat detection. The proposed framework integrates Graph Neural Networks (GNNs) and Large Language Models (LLMs), facilitating real-time validation, automated threat mitigation, and continuous system updates as evolving threats are detected. By embedding these invariants into the cybersecurity architecture, the approach offers a scalable, dynamic, and self-sustaining solution, significantly improving the resilience, adaptability, and operational efficiency of cybersecurity systems in complex, large-scale environments.
This paper introduces AEKG4APT, an APT Knowledge Graph (KG) enhanced by Large Language Models (LLMs), as a way to deal with the cybersecurity problems caused by Advanced Persistent Threats (APTs). The core of AEKG4APT lies in the combined application of LLMs, Cyber Threat Intelligence (CTI), and KG. The first part of the paper describes in detail how AEKG4APT was constructed, including its ontology schema, data sources, and dataset features. There are also statistics on AEKG4APT’s nodes, relationships, and key attributes. Second, the paper shows how to use LLMs and public sandboxes for the collection and analysis of CTI. Additionally, tests that compare traditional deep learning models to LLM methods show that LLMs are both more efficient and more accurate at extracting information. Subsequently, the Decision Making Trial and Evaluation Laboratory - Interpretive Structural Modeling (DEMATEL-ISM) analytical method is introduced to identify and analyze the factors and their interrelationships within the AEKG4APT data, thereby revealing the key dependencies and influence paths within the data structure. Experiments were designed to demonstrate its applications in modeling, computing, and obtaining interpretable computational results on AEKG4APT. In addition, this paper explores the dynamic expansion capabilities of AEKG4APT, including data expansion, schema expansion, and permanent maintenance strategies, to address evolving APT threats. Finally, this paper summarizes the competitiveness and application value of AEKG4APT by comparing it with other CTI KGs and platforms in academia and industry, demonstrating its extensive application potential in the field of cybersecurity.
Ensuring cybersecurity in power monitoring systems is of paramount importance to maintain the operational safety and stability of modern power grids. With the rapid expansion of grid infrastructure and increasing sophistication of cyber threats, existing manual alarm-tracing methods face significant challenges in handling the massive volume of security alerts, leading to delayed responses and potential system vulnerabilities. Current approaches often lack the capability to effectively model complex relationships among alerts and are hindered by imbalanced data distributions, which degrade tracing accuracy. To this end, this paper proposes a cybersecurity alarm-tracing method for power monitoring systems based on the knowledge graph (KG) and graph convolutional neural networks (GCNN). Specifically, a cybersecurity KG is constructed from historical alerts, accurately representing the entities and relationships in massive alerts. Then, a GCNN with attention mechanisms is applied to extract the topological features among alarms in the KG so that it can precisely and effectively trace the massive alarms. Most importantly, to mitigate the influence of imbalanced alarms on tracing, a specialized data-processing and model-ensemble strategy that adaptively weights imbalanced samples is proposed. Finally, on 70,000 alarm records from a regional power grid, the proposed method achieved an alarm-tracing accuracy of 96.59%. Moreover, compared with the traditional manual method, tracing efficiency improved by more than 80%.
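The abstract above does not give the exact weighting scheme for imbalanced alarms, so the following is only a sketch of one common baseline: inverse-frequency class weights, where each class weight equals n_samples / (n_classes * class_count). The function name and the alarm labels are assumptions made for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Map each class to n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so each class contributes equally
    to a weighted loss in total. This is a generic baseline, not the
    adaptive scheme of the paper, whose details the abstract does not give.
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical alarm labels: 8 frequent "scan" alarms and 2 rare "intrusion" alarms.
alarm_labels = ["scan"] * 8 + ["intrusion"] * 2
weights = inverse_frequency_weights(alarm_labels)

# Per-sample weights that could be passed to a weighted loss function.
sample_weights = [weights[label] for label in alarm_labels]
```

With these weights the per-sample weights sum to the number of samples, so the overall scale of the loss is unchanged while rare alarm types count for more.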
This paper proposes a method to detect and trace insider threats in tobacco industry databases. The method is based on user entity behavior analysis and knowledge graph. It tackles the challenges of insider threat detection, such as high detection costs and tracing difficulties. The proposed method extracts salient features and augments attack vectors to improve classification performance. Additionally, it encodes user behavior as high-quality color images, overcoming the limitations of conventional grayscale encoding. By analyzing the UEBA and KG detection results, the threat cause can be traced. The proposed method demonstrates its effectiveness and superiority on a benchmark dataset.
With the rapid development of social media and networking platforms, nuisance information disseminated online has a significant negative impact on daily life. This data is characterised by multimodality, fragmentation and a lack of labelling, making it challenging to aggregate and analyse raw data, identify nuisance information and actors, and discover potential nuisance events. To address this issue, this paper presents a knowledge graph-based early warning model for identifying network nuisance subjects. Firstly, we construct a multimodal knowledge graph for network nuisance governance, based on the characteristics of network nuisance data. This allows us to align multi-source data and integrate redundant information through structured data mapping and attribute fusion. Secondly, we establish a nuisance labelling system from three dimensions: user attributes, behavioural habits, and interest areas. We then apply the K-Means algorithm to construct a group profile of network nuisance. Thirdly, to address the issue of potential subjects being strongly concealed, we design a social influence assessment model that integrates improved PageRank and KShell algorithms. Combined with the similarity of profiles, we propose an early warning method based on threat level to accurately identify potential nuisance subjects. Experiments show that the model achieves an accuracy rate of 83.3% in predicting potential network nuisance subjects, demonstrating its practical application value.
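One ingredient of the influence assessment above, k-shell decomposition, can be sketched in its standard form. The improvements to PageRank and KShell mentioned in the abstract are not specified, so none are modelled here. The graph is assumed to be undirected and given as a symmetric adjacency dictionary in which every node appears as a key; the function name and example graph are illustrative.

```python
def k_shell(adjacency):
    """Return {node: shell index} for an undirected graph.

    Assumption: `adjacency` maps every node to the set of its neighbours and
    is symmetric. Nodes are peeled in order of increasing remaining degree;
    a node's shell index is the peeling level at which it is removed.
    """
    degrees = {node: len(nbrs) for node, nbrs in adjacency.items()}
    remaining = set(adjacency)
    shell = {}
    k = 0
    while remaining:
        # The next peeling level is the smallest degree still present.
        k = max(k, min(degrees[n] for n in remaining))
        stack = [n for n in remaining if degrees[n] <= k]
        while stack:
            node = stack.pop()
            if node not in remaining:
                continue  # already peeled through a duplicate stack entry
            remaining.discard(node)
            shell[node] = k
            for nbr in adjacency[node]:
                if nbr in remaining:
                    degrees[nbr] -= 1
                    if degrees[nbr] <= k:
                        stack.append(nbr)
    return shell


# Hypothetical social graph: a triangle a-b-c with one peripheral user d.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
shells = k_shell(graph)
```

Users in higher shells sit in the densely connected core of the network, which is why the shell index is often used as a coarse influence signal alongside PageRank.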
Social engineering has posed a serious threat to cyberspace security. To protect against social engineering attacks, a fundamental work is to know what constitutes social engineering. This paper first develops a domain ontology of social engineering in cybersecurity and conducts ontology evaluation by its knowledge graph application. The domain ontology defines 11 concepts of core entities that significantly constitute or affect the social engineering domain, together with 22 kinds of relations describing how these entities relate to each other. It provides a formal and explicit knowledge schema to understand, analyze, reuse and share domain knowledge of social engineering. Furthermore, this paper builds a knowledge graph based on 15 social engineering attack incidents and scenarios. Seven knowledge graph application examples (in 6 analysis patterns) demonstrate that the ontology together with the knowledge graph is useful to 1) understand and analyze social engineering attack scenarios and incidents, 2) find the top ranked social engineering threat elements (e.g. the most exploited human vulnerabilities and most used attack mediums), 3) find potential social engineering threats to victims, 4) find potential targets for social engineering attackers, 5) find potential attack paths from a specific attacker to a specific target, and 6) analyze same-origin attacks.
As Advanced Persistent Threats (APTs) continue to evolve, constructing a dynamic cybersecurity knowledge graph requires precise extraction of entity–relationship triples from unstructured threat intelligence. Existing approaches, however, face significant challenges in modeling low-frequency threat associations, extracting multi-relational entities, and resolving overlapping entity scenarios. To overcome these limitations, we propose the Symmetry-Aware Prototype Contrastive Learning (SAPCL) framework for joint entity and relation extraction. By explicitly modeling syntactic symmetry in attack-chain dependency structures and its interaction with asymmetric adversarial semantics, SAPCL integrates dependency relation types with contextual features using a type-enhanced Graph Attention Network. This symmetry–asymmetry fusion facilitates a more effective extraction of multi-relational triples. Furthermore, we introduce a triple prototype contrastive learning mechanism that enhances the robustness of low-frequency relations through hierarchical semantic alignment and adaptive prototype updates. A non-autoregressive decoding architecture is also employed to globally generate multi-relational triples while mitigating semantic ambiguities. SAPCL was evaluated on three publicly available CTI datasets: HACKER, ACTI, and LADDER. It achieved F1-scores of 56.63%, 60.21%, and 53.65%, respectively. Notably, SAPCL demonstrated a substantial improvement of 14.5 percentage points on the HACKER dataset, validating its effectiveness in real-world cyber threat extraction scenarios. By synergizing syntactic–semantic multi-feature fusion with symmetry-driven dynamic representation learning, SAPCL establishes a symmetry–asymmetry adaptive paradigm for cybersecurity knowledge graph construction, thus enhancing APT attack tracing, threat hunting, and proactive cyber defense.
Instant analysis of cybersecurity reports is a fundamental challenge for security experts as an immeasurable amount of cyber information is generated on a daily basis, which necessitates automated information extraction tools to facilitate querying and retrieval of data. Hence, we present Open-CyKG: an Open Cyber Threat Intelligence (CTI) Knowledge Graph (KG) framework that is constructed using an attention-based neural Open Information Extraction (OIE) model to extract valuable cyber threat information from unstructured Advanced Persistent Threat (APT) reports. More specifically, we first identify relevant entities by developing a neural cybersecurity Named Entity Recognizer (NER) that aids in labeling relation triples generated by the OIE model. Afterwards, the extracted structured data is canonicalized to build the KG by employing fusion techniques using word embeddings. As a result, security professionals can execute queries to retrieve valuable information from the Open-CyKG framework. Experimental results demonstrate that our proposed components that build up Open-CyKG outperform state-of-the-art models.
This study introduces an advanced cybersecurity threat detection framework built upon the MITRE ATT&CK knowledge base to address persistent threats in Internet of Things (IoT) environments. The proposed system integrates feature extraction, temporal modeling, anomaly scoring, attention mechanisms, and graph-based correlations to capture both short-term and long-term dependencies in telemetry data from RAM, CPU, registry, file systems, and network events. Experimental evaluation demonstrates that the model surpasses traditional and modern intrusion detection systems in multiple dimensions of performance, achieving 97% precision, 96% recall, 97% F1 score, 96% accuracy, and 95% balanced accuracy while maintaining a detection delay as low as 3 seconds and minimizing the false negative rate to 4%. Furthermore, the model achieves superior ranking accuracy (96%) and alert efficiency (95%), ensuring effective prioritization of critical threats in real-time. These findings validate the model as a robust, adaptive, and context-aware solution for strengthening IoT cybersecurity against evolving advanced persistent threats.
As the field of cybersecurity has experienced continual changes, up-to-date techniques have become increasingly necessary to analyze and defend against threats. Furthermore, the current methods consistently produce false alarms and sometimes completely miss real threats. This paper proposes an approach that integrates secure blockchain technology with data preprocessing, deep learning, and reinforcement learning to enhance threat detection and response capabilities. To secure the exchange of threat intelligence information, a safe blockchain network is used, which comprises Byzantine Fault Tolerance for high data integrity and Zero-Knowledge Proofs for access control. All relevant information is cleaned and standardized prior to analysis. Subsequently, graph convolutional neural networks with autoencoders are trained on large unlabeled sets of threat data to automatically label various types of threats, with the system employing fuzzy logic to rank and score possible threats. Furthermore, we implemented a feedback loop that incorporates reinforcement learning, thereby improving model performance over time according to guidance provided by cybersecurity specialists. The proposed system achieved high accuracy, precision, negative predictive value, and MCC, as well as notably low FPR and FNR values. The results establish that the proposed system is a reliable and effective measure for detecting cyberthreats.
Threat detection systems form the backbone of modern enterprise cybersecurity programs, analyzing massive volumes of logs, network flows, and user activities to identify potentially malicious events. Despite continuous advances in detection techniques, these systems generate an abundance of benign positives, leading to alert fatigue, wasted analyst resources, and a delayed response to actual threats. This paper surveys the problem of benign positives and proposes a graph-based framework that unifies alerts, user roles, infrastructure metadata, and historical dispositions in a knowledge graph. By representing alerts and contextual entities as interconnected nodes and edges, security teams can quickly detect recurring benign patterns (e.g., routine scanning tasks, staging environment bulk transfers) and implement precise suppression rules. Experimental findings from a simulated enterprise environment indicate that this approach significantly reduces benign positives compared to conventional static filters or standalone machine learning methods. The paper closes with recommendations for integrating multi-cloud data, automated rule generation, privacy safeguards, and user-friendly interfaces that support non-expert security analysts.
Ensuring cybersecurity in an ever-evolving threat landscape requires proactive identification and understanding of potential threats. Conventional detection and prediction solutions often fall short as they predominantly focus on known attack vectors. Advanced Persistent Threats (APTs) are becoming increasingly sophisticated and stealthy, resulting in new threat variants that are undetectable by these detection solutions. This paper introduces Threatify, a novel approach to predicting the most probable threat variants from existing APTs and previously seen attack campaigns. Our approach automates the generation of threat variants using graph-based machine learning based on the attack definition, past attack campaigns, and the security context between different techniques. Threatify leverages a security knowledge base of realistic attack scenarios and cybersecurity expertise to model, generate, and predict new forms of potential future threats by combining intra- (i.e., within the same APT attack) and inter- (i.e., between different APTs) techniques used by threat actors. It is crucial to emphasize that Threatify does not merely mix techniques from different APTs; rather, it constructs a logical and pragmatic kill chain based on their security context. Threatify is able to predict new attack steps, find relevant substitute techniques, and merge APTs’ techniques in the current security context, and thus create previously unexplored threat variants. Our extensive experimental results demonstrate the efficacy of our approach in generating relevant and novel threat variants with a similarity score of 92%, uniqueness of 82%, validity of 95%, and reduction rate of 96%, including those that have never occurred before.
With the widespread application of information technology in smart grids, the security threats faced by its information domain are becoming increasingly complex and diverse, such as malicious attacks and data leaks, which seriously affect the normal operation of smart grids. To improve the security and stability of smart grids, this study focuses on the threat prediction method for the information domain of smart grid systems and proposes a threat prediction model based on graph neural networks. This model, constructed by combining graph neural network technology, thoroughly analyzes the characteristics of the smart grid information domain and the types of threats. It can fully capture the global and local features in graph data and effectively handle complex network structures and dynamically changing threat data. This provides an innovative technical solution for the security protection of the smart grid information domain and promotes the safe and stable development of smart grids.
The emerging complexity of cyber threats, such as sophisticated ransomware, Artificial Intelligence-powered phishing, and adversarial attacks has overtaken the established cyber-security defenses and novel ways to provide sufficient protection and privacy must be sought. In this study, we propose an agentic AI framework that autonomously detects, analyzes, predicts, and responds to cyber threats using four specialized agents: anomaly detection (Isolation Forest), threat classification (XGBoost), predictive modeling (CNN-LSTM with federated learning), and automated response (PPO reinforcement learning). A Graph Neural Network (GNN) models network traffic as a dynamic graph to identify anomalous patterns, while SHAP ensures explainable decisions. Evaluated on the NSL-KDD benchmark, the integrated system achieves high accuracy and sub-second response time, offering a balanced, privacy-preserving, and adaptive defense against evolving cyber threats.
Network security threat analysis is an important component of network security assessment. Traditional threat analysis methods cannot effectively integrate multi-source information and adapt to the rapidly changing network attack and defense situation. To address the above issues, a network security threat analysis method based on knowledge graph is proposed. Firstly, this article constructs a network security ontology model, which models the concepts and relationships in the field of network security, and then associates and integrates multi-source network security information. Next, this article constructs a network security knowledge graph based on ontology. Then, a network security threat path prediction, traceability, and analysis method based on knowledge graph is proposed to accurately perceive cyberspace security threat. Finally, by comparing with traditional threat detection methods, it is verified that the network security threat analysis method based on knowledge graph proposed in this article is more suitable for perceiving high-strength security threat.
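To make the idea of threat path prediction over a knowledge graph more concrete, here is a minimal sketch. It assumes the graph is stored as (head, relation, tail) triples and uses a breadth-first search for the shortest relation path from an attacker entity to an asset; the abstract above does not say which traversal or reasoning method the authors use, and every entity and relation name below is hypothetical.

```python
from collections import deque


def shortest_attack_path(triples, source, target):
    """Return the shortest list of (head, relation, tail) hops, or None if unreachable."""
    graph = {}
    for head, relation, tail in triples:
        graph.setdefault(head, []).append((relation, tail))

    came_from = {source: None}  # node -> (previous node, relation used)
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for relation, tail in graph.get(node, []):
            if tail not in came_from:
                came_from[tail] = (node, relation)
                queue.append(tail)

    if target not in came_from:
        return None

    # Walk back from the target to rebuild the path in forward order.
    path = []
    node = target
    while came_from[node] is not None:
        prev, relation = came_from[node]
        path.append((prev, relation, node))
        node = prev
    return path[::-1]


# Hypothetical triples that an ontology-driven fusion step might produce.
kg = [
    ("attacker", "sends", "phishing_mail"),
    ("phishing_mail", "targets", "workstation"),
    ("workstation", "connects_to", "file_server"),
    ("file_server", "connects_to", "customer_db"),
    ("attacker", "scans", "web_server"),
    ("web_server", "connects_to", "customer_db"),
]

for hop in shortest_attack_path(kg, "attacker", "customer_db"):
    print(" -> ".join(hop))
```

Running the same search from the asset back toward candidate sources over reversed edges gives a simple form of traceability, which is the other direction the abstract mentions.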
We study the application of graph neural networks and machine learning in the prediction of risks to wireless networks. Advanced predictive methods are necessary, since basic security measures alone cannot thwart the rising complexity of cyber-attacks. In the present work, we use a hybrid model that combines several machine learning methods to reduce false negatives and false positives and improve accuracy in the prediction of risks to wireless networks. Models are trained and validated on large amounts of data and evaluated using accuracy, precision, recall, F1-score, and AUC-ROC. The results show that the hybrid model performs much better than the standard models in real-time threat detection. The impact of dataset size on model performance was also studied, and larger datasets were found to improve predictive power significantly. The results demonstrate that the latest advancements in machine learning techniques can lead to important improvements in the security of wireless networks.
The increasing sophistication of cyber threats necessitates intelligent, adaptive, and real-time threat detection mechanisms. Long-standing cyber security systems face difficulties in managing attacks because the complex patterns of cyber threats form intricate interconnections among networks, devices, and users. This research presents a Graph Neural Network (GNN) framework for real-time cyber threat analysis and intelligence, using deep learning graph approaches to model complex attack patterns and anomaly detection to enhance security prediction. Cyber threats are represented as graph structures in which entities (such as users, devices, and IP addresses) are nodes and their relationships are edges. This representation helps the proposed GNN model uncover complex relationships in cyber security data. GNNs surpass traditional machine learning algorithms by detecting underlying spatial and temporal elements of threat environments, thus attaining superior threat identification and risk prediction with minimal false alarm rates. Through this framework, real-time data streams are integrated with anomaly detection and adversarial threat modeling while performing dynamic cyber risk prediction and mitigation. GNN-based threat analysis demonstrates better accuracy at 96.8%, with a 40% decrease in false detection rates and enhanced speed over traditional security methods. The model demonstrates its ability to identify unknown cyber attacks while adapting to emerging online security risks.
Cyber threats are rapidly evolving, and the vast amount of Cyber Threat Intelligence (CTI) data exists in an unstructured form, making effective analysis challenging. This study proposes a method to structure CTI data and analyze relationships between threat elements by constructing a CTI-KG knowledge graph and applying an RAG-based threat analysis model. To achieve this, security data was normalized to build the CTI-KG, and the CompGCN model was utilized to learn entity relationships. Additionally, vector search was employed to explore relationships between threat elements, and the LLaMA 3 model was used to perform a more precise threat assessment. Experiments demonstrated that data normalization enhances the structural consistency of the CTI-KG and that link prediction techniques effectively infer hidden relationships between threat elements. Furthermore, RAG-based analysis was used to evaluate consistency with existing CTI data, confirming that the threat analysis results generated by the LLaMA 3 model exhibit high semantic similarity.
This research proposes a heterogeneous graph neural network (GNN) framework to attribute advanced persistent threat (APT) activity using enriched cyber threat intelligence (CTI). We construct a tripartite graph linking APT groups, contextualised Tactics, Techniques, and Procedures (TTPs), and their Cyber Kill Chain (CKC) stages. TTP nodes are embedded with Sentence-BERT (SBERT) vectors for semantic similarity, while CKC stages provide procedural context. This design captures both behavioural semantics and attack-stage relationships, enabling robust and interpretable attribution. Empirical evaluation on the APTNotes corpus achieves a Macro-F1 score of 0.84 and 85% accuracy, addressing limitations in baselines such as DeepOP (technique prediction without CKC integration) and APT-MMF (no procedural or temporal TTP modelling). The framework is suitable for Security Operations Centres (SOCs), enabling faster and more accurate decision-making during incident response. Overall, the study advances automated and explainable APT attribution for practical SOC deployment.
To reduce the probability of network risk, a knowledge graph-driven risk prediction model for network security was developed by exploring the information on network detection data. Indicators of the model were determined for security protection capability, attack threat risk, network performance anomaly index, and disaster tolerance, which were included in the index system of the developed model. The primary and secondary indicators were categorized to construct the knowledge graph. The graph attention mechanism was also integrated into the model. The entity-level attention network layer learned the attention coefficients among neighboring entities in the relationship path. The relationship of level attention network layers was obtained using new entity feature vectors and the attention coefficients. Entity feature vectors were aggregated to output risk prediction results. The experimental results showed that the model accurately predicted the level of risk in network security, reducing the number of risk events by more than 38%.
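The entity-level attention step described above can be sketched as follows. The abstract does not give the scoring function, so this assumes the common graph-attention form: a score `LeakyReLU(a · [h_i || h_j])` for each neighbour, a softmax over the neighbourhood, and a weighted sum of neighbour features. The vector `a`, the feature vectors, and the function names are hypothetical placeholders rather than learned or published values.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def attention_coefficients(h_center, h_neighbors, a):
    """Softmax-normalised attention of one entity over its neighbouring entities."""
    scores = []
    for h in h_neighbors:
        concat = h_center + h  # list concatenation: [h_i || h_j]
        scores.append(leaky_relu(sum(w * x for w, x in zip(a, concat))))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(h_neighbors, alphas):
    """Attention-weighted sum of neighbour feature vectors."""
    dim = len(h_neighbors[0])
    return [sum(alpha * h[d] for alpha, h in zip(alphas, h_neighbors)) for d in range(dim)]


# Toy example: one indicator entity with two neighbouring indicator entities.
center = [1.0, 0.0]
neighbors = [[2.0, 0.0], [0.0, 0.0]]
a = [0.0, 0.0, 1.0, 0.0]  # hypothetical attention vector, normally learned

alphas = attention_coefficients(center, neighbors, a)
print(alphas)
print(aggregate(neighbors, alphas))
```

In a trained model the aggregated vector would feed a classifier that outputs the risk level; that final layer is omitted here.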
As digital transformation accelerates, cyberspace has become increasingly active, resulting in a rise in cyberattacks. In particular, Advanced Persistent Threats (APTs) targeting high-value assets are difficult to defend against with conventional security systems due to their stealthy and persistent characteristics. This paper proposes a proactive defense framework for APT attacks that enables real-time responses by observing attacker behavior in near real-time and predicting subsequent attack steps. Unlike traditional methods that detect isolated attacks at individual security points, the proposed framework holistically observes attacker actions and constructs a provenance graph by linking correlated events. An attack scoring mechanism is applied to the graph, and once the score exceeds a predefined threshold, the activity is classified as an APT attack, prompting immediate response actions. Additionally, the framework learns attack technique patterns from over 1,300 past cyberattack campaigns to predict the next likely attacker behavior with a certain level of accuracy. This prediction allows for estimating the timing of attacks on victim systems and determining the optimal timing for defense measures. Experimental evaluation using six APT scenarios from the MITRE Evaluation demonstrated that the proposed system reduced the attacker’s dwell time on victim systems by 67% and effectively blocked APT progression. Furthermore, the framework outperformed conventional Endpoint Detection and Response (EDR) solutions.
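The scoring mechanism described above, in which correlated events are linked into a graph and a response is triggered once the accumulated score exceeds a threshold, can be sketched roughly as below. This is an assumption-laden illustration: events are linked whenever they share an entity (process, file, or address), linking uses a simple union-find, and the per-event scores and threshold are invented. The paper's actual correlation rules and scoring are not given in the abstract.

```python
def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def score_chains(events, threshold):
    """Group events that share an entity and flag groups whose total score exceeds threshold."""
    parent = list(range(len(events)))
    first_seen = {}  # entity -> index of the first event that touched it
    for i, event in enumerate(events):
        for entity in event["entities"]:
            if entity in first_seen:
                root_a = find(parent, i)
                root_b = find(parent, first_seen[entity])
                parent[root_a] = root_b
            else:
                first_seen[entity] = i

    chains = {}
    for i, event in enumerate(events):
        chains.setdefault(find(parent, i), []).append(event)

    results = []
    for members in chains.values():
        total = sum(event["score"] for event in members)
        results.append({
            "events": [event["name"] for event in members],
            "score": total,
            "alert": total > threshold,
        })
    return results


# Hypothetical telemetry; names, entities, and scores are illustrative only.
telemetry = [
    {"name": "doc_opened", "entities": {"winword.exe", "invoice.docm"}, "score": 3},
    {"name": "shell_spawned", "entities": {"winword.exe", "powershell.exe"}, "score": 4},
    {"name": "outbound_beacon", "entities": {"powershell.exe", "10.0.0.5"}, "score": 5},
    {"name": "nightly_backup", "entities": {"backup.exe", "nas01"}, "score": 1},
]

for chain in score_chains(telemetry, threshold=10):
    print(chain)
```

Scoring the linked chain rather than each event is what lets three individually unremarkable events cross the threshold together, while the unrelated backup job stays below it.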
Cyber threat attribution is the process of associating a cyberattack with threat groups. This process is essential for enhancing defense strategies and enabling rapid response to threats, making threat attribution a critical component of an effective network security defense system. Current methods often struggle to leverage the intricate relationships among threat behaviors or lack a mechanism for extracting attacker features, resulting in the need for manual analysis of vast data, thereby presenting challenges in the face of the escalating number and complexity of attacks. To tackle these challenges, we propose HG-CTA, a novel cyber threat attribution method based on a heterogeneous graph. We first utilize cyber threat intelligence (CTI) to construct a heterogeneous knowledge base. Then we formalize threat attribution as a link prediction task on the heterogeneous graph and propose a metapath-context-based heterogeneous graph embedding method to extract features of attackers. Finally, attribution is achieved by inferring the relationship between the attackers and threat groups. Through experiments on a dataset constructed from threat intelligence provided by AlienVault and MITRE ATT&CK, we demonstrate the effectiveness of our proposed attribution method compared with baseline models.
No abstract available
There are novel, complicated security concerns because of the widespread use of intricate technology structures and services as well as the constantly changing threat environment. These threats are generally linked to a wide range of weaknesses, including errors, operational errors, and safety defects in hardware or software. A prompt evaluation and prevention of the security threats influencing the technological settings of enterprises are crucial in this situation. The need for enhanced cybersecurity systems that can detect threats proactively and respond automatically derives from the growing complexity of cyber threats for active directory (AD) in companies. This study explores how advanced artificial intelligence (AI) methods might be applied to improve cybersecurity systems, with a particular emphasis on threat detection, prediction analytics, and defense automated processes. The system is built upon the integration of machine learning (ML) with graph-based approaches. More specifically, graphs that portray the target's elements and targets are used to identify possible security issues by analyzing the attack vectors. ML methods categorize these routes and offer the target's security evaluation. About fifty percent of the 220 artificially created AD settings used for the experimental evaluation of the suggested system had threats introduced into them. The classification technique produced generally positive findings. For the purpose of evaluating susceptible networks, the Random Forest model's F1-value was 0.92. According to these findings, this method may be used to automate the security evaluation processes in intricate networked systems.
As the network security situation continues to evolve, the types of attacks increase sharply, but they can be divided into symmetric attacks and asymmetric attacks. Symmetric attacks such as phishing and DDoS attacks exploit fixed patterns, resulting in system crashes and data breaches that cause losses to businesses. Asymmetric attacks such as Advanced Persistent Threats (APTs), a highly sophisticated and organized form of cyber attack, achieve data theft through long-term latency thanks to their concealment and complexity, and pose a greater threat to organizational security. In addition, there are challenges in the processing of missing data, especially in the application of symmetric and asymmetric data filling: the former is simple but not flexible, and the latter is complex and more suitable for highly complex attack scenarios. Since asymmetric attack research is particularly important, this paper proposes a method that combines causal discovery with a graph autoencoder to handle missing data, classify potentially malicious nodes, and reveal causal relationships. The core is to use graph autoencoders to learn the underlying causal structure of APT attacks, with a special focus on the complex causal relationships in asymmetric attacks. This causal knowledge is then applied to enhance the robustness of the model by compensating for data gaps. In the final phase, the method also reveals causality and predicts and classifies potential APT attack nodes, providing a comprehensive framework that not only predicts potential threats but also provides insight into the logical sequence of the attacker’s actions.
Cyber threat intelligence on past attacks may help with attack reconstruction and the prediction of the course of an ongoing attack by providing deeper understanding of the tools and attack patterns used by attackers. Therefore, cyber security analysts employ threat intelligence, alert correlations, machine learning, and advanced visualizations in order to produce sound attack hypotheses. In this article, we present AttackDB, a multi-level threat knowledge base that combines data from multiple threat intelligence sources to associate high-level ATT&CK techniques with low-level telemetry found in behavioral malware reports. We also present the Attack Hypothesis Generator which relies on knowledge graph traversal algorithms and a variety of link prediction methods to automatically infer ATT&CK techniques from a set of observable artifacts. Results of experiments performed with 53K VirusTotal reports indicate that the proposed algorithms employed by the Attack Hypothesis Generator are able to produce accurate adversarial technique hypotheses with a mean average precision greater than 0.5 and area under the receiver operating characteristic curve of over 0.8 when it is implemented on the basis of AttackDB. The presented toolkit will help analysts to improve the accuracy of attack hypotheses and to automate the attack hypothesis generation process.
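As a rough illustration of inferring ATT&CK techniques from observable artifacts by link prediction, the sketch below scores each technique by the artifacts it shares with an observation, down-weighting artifacts that are linked to many techniques. The weighting is an Adamic-Adar-style heuristic (with `1 + degree` inside the logarithm to avoid division by zero), chosen here for illustration; it is not necessarily one of the methods used in AttackDB. The technique IDs are real ATT&CK identifiers, but the artifact names and the technique-to-artifact mapping are hypothetical. An average-precision helper is included because the abstract reports mean average precision.

```python
import math


def rank_techniques(technique_artifacts, observed):
    """Rank techniques by weighted overlap between their artifacts and the observation."""
    degree = {}
    for artifacts in technique_artifacts.values():
        for artifact in artifacts:
            degree[artifact] = degree.get(artifact, 0) + 1

    scores = {}
    for technique, artifacts in technique_artifacts.items():
        shared = artifacts & observed
        scores[technique] = sum(1.0 / math.log(1 + degree[a]) for a in shared)
    # Sort by descending score, then by ID so that ties are deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def average_precision(ranked_ids, relevant):
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    hits = 0
    precisions = []
    for k, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0


# Hypothetical knowledge base: technique ID -> artifacts seen in sandbox reports.
knowledge_base = {
    "T1059": {"powershell.exe", "cmd.exe"},        # Command and Scripting Interpreter
    "T1547": {"reg_run_key", "startup_folder"},    # Boot or Logon Autostart Execution
    "T1071": {"http_beacon", "dns_query"},         # Application Layer Protocol
}

observation = {"powershell.exe", "cmd.exe", "reg_run_key"}
ranking = rank_techniques(knowledge_base, observation)
print(ranking)
print(average_precision([t for t, _ in ranking], {"T1059", "T1071"}))
```

Averaging `average_precision` over many reports yields the mean average precision figure of the kind quoted in the abstract.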
The network security model based on static rules lacks flexibility and adaptability, and it is difficult to adapt to and respond to dynamically changing network threats in real-time. Therefore, this paper proposes a method for optimizing the decision-making process of digital twins (DTs) in network security based on graph neural networks (GNNs). First, this paper applies the Adam optimizer to adjust the learning rate by optimizing the GNN structure, combines the cross entropy loss function to improve the attack recognition ability, and uses L2 regularization and Dropout to prevent overfitting to enhance the model’s performance in complex network data. Then, real-time network threat detection and attack path prediction are performed based on the optimized GNN model. To further improve the intelligent level of network security protection, this paper applies the online learning (OL) algorithm to continuously update the model to adapt to changes in the network environment and threat patterns. At the same time, combined with the policy gradient (PG) method, an intelligent decision-making module is designed to automatically adjust the defense strategy to achieve dynamic protection against changing network threats. Experimental results show that the optimized GNN model’s accuracy in network threat detection reaches 93.3%, which is 9.7% higher than that of the non-DT model, and the malware’s precision is increased by 6.9%. The system’s response time is reduced to 50ms, which significantly improves the real-time performance and decision accuracy of network security protection and demonstrates the excellent performance and broad application prospects of this method in dynamic network environments.
No abstract available
In the current era of rapid advancements in Artificial Intelligence of Things (AIoT), with the increase in cloud data center operations and the limited security computing capabilities of AIoT terminal devices, link flooding attack (LFA) has emerged as a complex and stealthy new threat. However, the existing defense methods based on programmable networks usually have issues of slow offline inference and delayed defense activation. To address these issues, we propose a collaborative programmable defense framework (CPDTG) to predict, detect, and mitigate LFA. First, an early attack intention prediction model based on temporal graph learning (TGL) is proposed to accurately locate attacks and promptly activate defenses to save resource consumption during idle time. Second, a switch-native clustering algorithm independent of the global perspective is introduced for line-speed detection of LFA. The unsupervised algorithm does not rely on labeled datasets for training, which enhances its robustness against differentiated attack scenarios. Third, we propose a distributed defense mechanism that achieves the pushback deployment of adaptive rate-limiting strategies. Compressing the potential attack vector space effectively increases the difficulty of launching rolling attacks. Extensive experimental validation demonstrates the effectiveness of the proposed CPDTG in predicting and defending against LFA.
With the rapid development of the Internet of Things (IoT), security issues are becoming increasingly severe. Malicious attackers use IoT devices to carry out network attacks, resulting in data leakage. Knowledge graphs can effectively prevent and resist attacks through deep mining and association analysis in security situation awareness and threat prediction. Entity recognition and relationship extraction are the core steps in the construction of knowledge graphs. They are used to automatically extract meaningful entities and relationships from massive data and perform reasoning, but they still face challenges in accuracy and computational cost when handling long texts and complex relationships. To address these issues, this article proposes the reformer-based dynamic locality-sensitive hashing (RDLSH) model for processing of local context, low-frequency entity recognition, and global semantic associations. Based on the Reformer architecture, it dynamically adjusts the locality-sensitive hashing parameters and combines the multihead attention mechanism to achieve good performance in capturing cross-paragraph and long-range dependencies, efficiently handling entity recognition and relationship extraction tasks. In addition, the RDLSH model introduces reversible residual networks and bidirectional transfer mechanisms to optimize the memory usage of large-scale data processing and improve computational efficiency. Experimental results show that the RDLSH model not only improves the accuracy of entity and relationship extraction, but also enhances the cross-sentence dependency processing capability and computational efficiency.
Intrusion intent and path prediction are important for security administrators to gain insight into the possible threat behavior of attackers. Existing research has mainly focused on path prediction in ideal attack scenarios, yet the ideal attack path is not always the real path taken by an intruder. In order to accurately and comprehensively predict the path information of network intrusion, a multi-step attack path prediction method based on absorbing Markov chains is proposed. Firstly, a node state transfer probability normalization algorithm is designed using the memoryless (no after-effect) and absorbing properties of state transfer in an absorbing Markov chain, and it is proved that a complete attack graph can be mapped to an absorbing Markov chain. Secondly, economic indexes of protection cost and attack benefit, together with a method for quantifying them, are constructed, and an optimal security protection policy selection algorithm based on particle swarm optimization is proposed. Finally, we experimentally verify the feasibility and effectiveness of the model in protection policy decision-making, which can effectively reduce network security risks and provide more security protection guidance for timely response to network attack threats.
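A small worked example may help show what an absorbing Markov chain gives the defender. The sketch below assumes a toy attack graph with three transient states and two absorbing states, normalises raw edge weights into transition probabilities (a plain row normalisation, which may differ from the paper's algorithm), and then uses the standard results for absorbing chains: absorption probabilities `B = (I - Q)^{-1} R` and expected steps `t = (I - Q)^{-1} 1`. State names and weights are invented; exact fractions are used so the results can be checked by hand.

```python
from fractions import Fraction


def normalize(weights):
    """Turn raw outgoing edge weights into transition probabilities that sum to 1."""
    return {
        state: {nxt: Fraction(w, sum(out.values())) for nxt, w in out.items()}
        for state, out in weights.items()
    }


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination (A square and invertible)."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [x / pv for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]


def analyse(P, transient, absorbing):
    """Return absorption probabilities and expected steps for each transient state."""
    i_minus_q = [
        [Fraction(1 if s == t else 0) - P[s].get(t, 0) for t in transient]
        for s in transient
    ]
    R = [[Fraction(P[s].get(a, 0)) for a in absorbing] for s in transient]
    ones = [[Fraction(1)] for _ in transient]
    B = solve(i_minus_q, R)
    steps = solve(i_minus_q, ones)
    probs = {s: dict(zip(absorbing, B[i])) for i, s in enumerate(transient)}
    expected = {s: steps[i][0] for i, s in enumerate(transient)}
    return probs, expected


# Hypothetical attack graph with raw (unnormalised) edge weights.
raw = {
    "entry":       {"web_server": 3, "workstation": 1},
    "web_server":  {"db_compromised": 1, "blocked": 1},
    "workstation": {"web_server": 1, "blocked": 1},
}
P = normalize(raw)
probs, expected = analyse(
    P, ["entry", "web_server", "workstation"], ["db_compromised", "blocked"]
)
print(probs["entry"]["db_compromised"])  # 7/16
print(expected["entry"])                 # 17/8
```

Checking by hand: from `entry` the attacker reaches the database with probability 3/4 · 1/2 + 1/4 · 1/2 · 1/2 = 7/16. A protection policy that lowers one edge weight can be evaluated by re-running the same computation and comparing the change in this probability against the policy's cost.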
Inter-state cyberattacks are increasingly becoming a major hidden threat to national security and global order. However, current prediction models are often constrained by single-source data due to insufficient consideration of complex influencing factors, resulting in limitations in understanding and predicting cyberattacks. To address this issue, we comprehensively consider multiple data sources including cyberattacks, bilateral interactions, armed conflicts, international trade, and national attributes, and propose an interpretable multimodal data fusion framework for predicting cyberattacks among countries. On one hand, we design a dynamic multi-view graph neural network model incorporating temporal interaction attention and multi-view attention, which effectively captures time-varying dynamic features and the importance of node representations from various modalities. Our proposed model exhibits greater performance in comparison to many cutting-edge models, achieving an F1 score of 0.838. On the other hand, our interpretability analysis reveals unique characteristics of national cyberattack behavior. For example, countries with different income levels show varying preferences for data sources, reflecting their different strategic focuses in cyberspace. This unveils the factors and regional differences that affect cyberattack prediction, enhancing the transparency and credibility of the proposed model.
No abstract available
APT attacks are becoming increasingly complex and stealthy. To effectively counter APT attacks, modelling threat intelligence data based on graphs, identifying Indicators of Compromise (IOC) nodes, and providing early warnings have become new research hotspots. However, the problem of node category imbalance in such graph datasets restricts the identification capabilities of these methods. Therefore, this paper proposes a supervised graph data augmentation method. In the training phase, graph disentangled representation learning is utilized to perform feature embedding for minority class nodes, effectively alleviating the sparsity problem faced by traditional methods and effectively integrating neighbourhood information of minority class nodes at a higher semantic level. Additionally, two loss functions designed based on link prediction and prototype constraints enhance node type consistency and semantic consistency, respectively. Experimental results on the APT and PDNS datasets demonstrate that the proposed method outperforms other baseline models in identification performance; even in highly imbalanced scenarios, it surpasses the second-best model.
Given the ever-changing threat landscape, in which new attacks emerge regularly, resilient and adaptive detection systems are crucial for protecting Internet of Things (IoT) networks. In this paper we present a novel AI-based anomaly detection model that detects anomalies in networks from traffic features and provides a clear justification of the most influential factors behind each detected anomaly. The A3T2A model leverages domain knowledge in a knowledge graph to check whether a detected anomaly is a real attack. Once an anomaly is validated, the model determines which core cybersecurity principles of the CIA triad map to the most influential feature values. The result is then down-selected to align with MITRE ATT&CK attacker tactics and techniques, and finally with intelligence-based defences, to provide recommendations for defence improvement. By encoding expert domain knowledge and using explainable AI (XAI), we bridge the gap between powerful AI prediction and human-interpretable decisions, improving both detection accuracy and result interpretability. This visibility enables rapid reaction and real-time decision-making, and it enhances the ability to react to new, previously unseen cyber threats. Our experiments on network traffic datasets show that the proposed method not only detects and explains anomalies but also achieves an overall detection accuracy of 0.97 when domain knowledge is used to check attack legitimacy. Its threat intelligence output aligns 100% with MITRE ATT&CK, ensuring the suitability of the security measures in place and enhancing the defences of the IoT environment by providing immediate threat intelligence and a response capability that reduces human response time.
Multistep attack prediction and security situation awareness are two major challenges for network administrators because the future is generally unknown. Many investigations have been made in recent years, but they are not sufficient. To improve the comprehensiveness of prediction, in this paper we quantitatively convert attack threat into security situation. Two algorithms are proposed: an attack prediction algorithm using a dynamic Bayesian attack graph, and a security situation quantification algorithm based on attack prediction. The first algorithm aims to provide richer information about future attack behaviors by simulating incremental network penetration. By evaluating the attack capacity of the intruder and the defense strategies of the defender in a timely manner, the likely attack goal, path, probability, and time cost are predicted dynamically as security events unfold. Furthermore, in combination with the common vulnerability scoring system (CVSS) metric and network asset information, the second algorithm quantifies the concealed attack threat into surfaced security risk at two levels: host and network. Examples show that our method is feasible and flexible for the attack-defense adversarial network environment, which helps the administrator infer the security situation in advance and repair critical compromised hosts beforehand to maintain normal network communication.
Security assessment relies on public information about products, vulnerabilities, and weaknesses. So far, databases in these categories have rarely been analyzed in combination. Yet, doing so could help predict unreported vulnerabilities and identify common threat patterns. In this article, we propose a methodology for producing and optimizing a knowledge graph that aggregates knowledge from common threat databases (CVE, CWE, and CPE). We apply the threat knowledge graph to predict associations between threat databases, specifically between products, vulnerabilities, and weaknesses. We evaluate the prediction performance both in closed world with associations from the knowledge graph and in open world with associations revealed afterward. Using rank-based metrics (i.e., Mean Rank, Mean Reciprocal Rank, and Hits@N scores), we demonstrate the ability of the threat knowledge graph to uncover many associations that are currently unknown but will be revealed in the future, which remains useful over different time periods. We propose approaches to optimize the knowledge graph and show that they indeed help in further uncovering associations. We have made the artifacts of our work publicly available.
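The rank-based metrics named in this abstract can be computed as in the sketch below. It assumes each test association has already been assigned the 1-based rank of its true entity among all candidates; the function name and the example ranks are illustrative and are not taken from the paper.

```python
def rank_metrics(ranks, ns=(1, 3, 10)):
    """Compute Mean Rank, Mean Reciprocal Rank, and Hits@N.

    ranks: 1-based rank of the true entity for each test association.
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    count = len(ranks)
    metrics = {
        "MR": sum(ranks) / count,
        "MRR": sum(1.0 / r for r in ranks) / count,
    }
    for n in ns:
        # Fraction of test cases whose true entity is ranked in the top n.
        metrics[f"Hits@{n}"] = sum(r <= n for r in ranks) / count
    return metrics


# Hypothetical ranks for four held-out product-vulnerability associations.
example = rank_metrics([1, 2, 4, 10])
print(example)
```

Lower Mean Rank is better, while higher MRR and Hits@N are better. MRR is less sensitive than Mean Rank to a few very poorly ranked associations.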
TITAN (Threat Intelligence Through Automated Navigation) is a framework that connects natural-language cyber threat queries with executable reasoning over a structured knowledge graph. It integrates a path planner model, which predicts logical relation chains from text, and a graph executor that traverses the TITAN Ontology to retrieve factual answers and supporting evidence. Unlike traditional retrieval systems, TITAN operates on a typed, bidirectional graph derived from MITRE, allowing reasoning to move clearly and reversibly between threats, behaviors, and defenses. To support training and evaluation, we introduce the TITAN Dataset, a corpus of 88209 examples (Train: 74258; Test: 13951) pairing natural language questions with executable reasoning paths and step by step Chain of Thought explanations. Empirical evaluations show that TITAN enables models to generate syntactically valid and semantically coherent reasoning paths that can be deterministically executed on the underlying graph.
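The idea of deterministically executing a predicted relation chain on a typed, bidirectional graph can be sketched as follows. The class, the relation names, and the placeholder entities are assumptions made for illustration; they are not the TITAN Ontology or its executor.

```python
from collections import defaultdict


class TypedGraph:
    """Directed graph with typed edges, each stored with its inverse."""

    def __init__(self):
        # (node, relation) -> set of neighbouring nodes
        self._edges = defaultdict(set)

    def add(self, head, relation, tail, inverse):
        """Store the edge in both directions so traversal is reversible."""
        self._edges[(head, relation)].add(tail)
        self._edges[(tail, inverse)].add(head)

    def execute(self, start, relation_chain):
        """Follow a chain of relations from start and return the end nodes."""
        frontier = {start}
        for relation in relation_chain:
            frontier = {
                neighbour
                for node in frontier
                for neighbour in self._edges.get((node, relation), ())
            }
        return frontier


# Placeholder entities standing in for a threat group, a technique,
# and a mitigation.
graph = TypedGraph()
graph.add("GroupA", "uses", "TechniqueX", inverse="used_by")
graph.add("TechniqueX", "mitigated_by", "MitigationM", inverse="mitigates")

forward = graph.execute("GroupA", ["uses", "mitigated_by"])
backward = graph.execute("MitigationM", ["mitigates", "used_by"])
print(forward, backward)
```

In this sketch a path planner would only need to emit the relation chain, for example `["uses", "mitigated_by"]`; the traversal itself involves no model inference, so the same chain always returns the same answer.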
Cyber threat intelligence is important information for analysing cyber threats. Most research focuses on extracting threat entities from threat intelligence and constructing knowledge graphs, with less research on reasoning over the threat intelligence knowledge graph. Existing research has limited ability to reason about implicit information in the threat intelligence knowledge graph and is not interpretable. In addition, the amount of information contained in the threat intelligence knowledge graph has an inherent upper limit, and existing multi-hop reasoning methods are constrained by this limit, which affects their performance. In this paper, we propose a multi-hop reasoning framework for the cyber threat intelligence knowledge graph, which uses a language model for multi-hop reasoning, views the reasoning process as a sequence-to-sequence task, and generates multi-hop reasoning paths based on reasoning queries. To alleviate the limitation of the threat intelligence knowledge graph, we inject external knowledge graphs into the language model and add a rule enhancement strategy, and the experimental results show some improvement in multi-hop reasoning performance and interpretability.
With the rapid evolution of cyber threats, traditional Artificial Intelligence (AI)-driven security models often fail to provide real-time, interpretable, and adaptive threat intelligence. This paper proposes Neuro-Symbolic Cyber Threat Intelligence (NSCTI), a novel hybrid framework that integrates Deep Neuro-Symbolic Learning (DNSL) with Graph-Based Threat Reasoning (GBTR) to enhance predictive cybersecurity analytics. The NSCTI framework comprises three core components: Hybrid Deep Learning-Based Threat Detection (HDL-TD), which leverages Graph Neural Networks (GNNs), Long Short-Term Memory (LSTM), and Hidden Markov Models (HMMs) for dynamic attack pattern recognition; Neuro-Symbolic Adversarial Defense (NSAD), which employs reinforcement learning-driven adversarial resilience mechanisms to mitigate evolving cyber threats; and Trust-Aware Federated Cyber Intelligence (TFCI), which utilizes federated learning (FL) and blockchain-based threat sharing to ensure secure, decentralized CTI. Experimental evaluations on benchmark datasets (CICIDS2017, UNSW-NB15, and N-BaIoT) demonstrate that NSCTI achieves 99.7% attack detection accuracy, a 0.5% false positive rate (FPR), and a 35% reduction in computational overhead compared to existing cybersecurity frameworks. Security analysis confirms NSCTI's robustness against Man-in-the-Middle (MITM), replay attacks, and adversarial perturbations, making it a scalable and proactive cyber defense solution. Future research will explore quantum-enhanced neuro-symbolic learning and self-adaptive reinforcement learning to further strengthen autonomous cybersecurity in large-scale IoT networks.
The persistent and stealthy nature of Advanced Persistent Threats (APTs) poses a significant challenge to enterprise security. Traditional detection mechanisms often fall short in identifying coordinated multi-step attacks or leveraging the rich context available in Cyber Threat Intelligence (CTI). The three works presented tackle this problem from complementary angles: real-time detection, correlation-based threat hunting, and automated intelligence extraction. A unifying thread across these three works is their shared reliance on provenance graphs as a powerful abstraction for capturing and reasoning about complex attacker behavior. Together, these approaches form a complementary ecosystem: Extractor extracts threat knowledge, POIROT hunts for manifestations of that knowledge, and HOLMES detects emergent threats in real-time, all grounded in a common graph-based representation of system activity and threat behavior. HOLMES introduces a real-time detection framework aimed at identifying the coordinated activities typical of APT campaigns. It does so by correlating suspicious information flows to generate a robust detection signal and constructing high-level provenance graphs that summarize attacker behavior for analyst response. Its evaluation shows high precision and low false alarm rates, supporting its applicability in live operational environments. POIROT builds on the growing use of CTI standards by actively leveraging the relationships between indicators (often underused in practice) for threat hunting. It treats the problem as a graph pattern matching task, aligning CTI-derived graphs with system-level provenance data obtained from kernel audits. Its novel similarity metric enables efficient search through massive graphs, revealing APT traces within minutes and demonstrating the operational utility of CTI relationship data. Extractor addresses the challenge of unstructured CTI reports by automatically transforming them into structured, machine-usable provenance graphs. Without requiring strict assumptions about the input text, it extracts concise behavioral indicators that can be fed into threat-hunting tools like POIROT, bridging the gap between raw intelligence and analytical application. Together, these systems represent a shift toward graph-based, intelligence-driven detection and response. They emphasize the value of integrating real-time monitoring with structured threat intelligence and automation, setting the stage for more adaptive and effective cybersecurity operations.
The swift development of cyber threats such as Zero-Day Attacks (ZDA) and Advanced Persistent Threats (APTs) necessitates smart, adaptive, and explainable security solutions. Current threat detection platforms are not well suited to real-time reasoning and adversarial robustness, and pure Deep Learning (DL) methods fail to generalize to new attack patterns and lack interpretability. To alleviate these limitations, this work presents a Neuro-Symbolic Deep Learning (NSDL) architecture for Automated Cyber Threat Intelligence (ACTI) generation that combines the pattern-perception strengths of DL with the logical reasoning capabilities of symbolic AI. Our technique performs dynamic threat pattern discovery through Graph Neural Networks (GNNs) and Transformers (TMs), together with symbolic reasoning that supports causal inference and explainable decision-making in detecting cyber threats. Moreover, we integrate adversarial training for model robustness and Federated Learning (FL) for privacy-preserving intelligence sharing in distributed settings. Benchmark tests on real-world cybersecurity datasets (CIC-IDS2017, TON_IoT, BoT-IoT) show state-of-the-art threat detection accuracy, fewer false positives, and improved ZDA detection. The framework opens avenues to trustworthy, scalable, and interpretable AI-based cybersecurity, shifting automated cyber threat intelligence generation towards next-generation security operations.
Cyber threat intelligence (CTI) is a crucial tool to prevent sophisticated, organized, and weaponized cyber attacks. However, few studies have focused on the credibility assessment of CTI, and this work still requires manual analysis by cybersecurity experts. In this paper, we propose Knowledge Graph-based Verifier (KGV), the first framework integrating large language models (LLMs) with simple structured knowledge graphs (KGs) for automated CTI credibility assessment. Unlike entity-centric KGs, KGV constructs paragraph-level semantic graphs where nodes represent text segments connected through similarity analysis, which effectively enhances the semantic understanding ability of the model, reduces KG density and greatly improves response speed. Experimental results demonstrate that our KGV outperforms state-of-the-art fact reasoning methods on the CTI-200 dataset, achieving a 5.7% improvement in F1. Additionally, it shows strong scalability on factual QA and fake news detection datasets. Compared to entity-based knowledge graphs (KGs) for equivalent-length texts, our structurally simple KG reduces node quantities by nearly two-thirds while boosting precision by 1.7% and cutting response time by 46.7%. In addition, we have created and publicly released the first CTI credibility assessment dataset, CTI-200. Distinct from CTI identification datasets, CTI-200 refines CTI summaries and key sentences to focus specifically on credibility assessment.
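A paragraph-level graph whose nodes are text segments linked by similarity can be sketched as below. Jaccard overlap of word sets stands in for the similarity analysis, which the abstract does not specify; the threshold, the function names, and the sample segments are assumptions for illustration only.

```python
import re
from itertools import combinations


def tokens(text):
    """Lower-cased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_segment_graph(segments, threshold=0.2):
    """Return {(i, j): score} for segment pairs at or above the threshold."""
    token_sets = [tokens(s) for s in segments]
    edges = {}
    for i, j in combinations(range(len(segments)), 2):
        score = jaccard(token_sets[i], token_sets[j])
        if score >= threshold:
            edges[(i, j)] = score
    return edges


# Hypothetical report segments: the first two describe the same activity.
segments = [
    "The loader drops a backdoor on the mail server",
    "The backdoor on the mail server contacts a remote host",
    "Patch management reduces exposure",
]
edges = build_segment_graph(segments)
print(edges)
```

With one node per segment rather than one per entity, the graph stays small even for long reports, which is consistent with the reduced node counts the abstract describes.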
The continuous growth of cyber threats calls for sophisticated and adaptive defensive systems capable of proactively detecting malicious activity across complicated network architectures. This study presents TIGNET, an AI-driven threat detection architecture that combines Graph Neural Networks (GNNs) with a bespoke feature enhancement module called CAFE (Context-Aware Feature Extractor). TIGNET is designed to capture both structural and temporal relationships in network data and provides a multi-level reasoning framework for precise threat assessment. We test the model on the CSE-CIC-IDS2018 dataset, which comprises a wide range of attack scenarios spanning several days of simulated network activity. In these tests, TIGNET outperforms baseline models such as SVM, Random Forest, LSTM, GCN, and GAT, with an accuracy of 98.12%, precision of 97.02%, recall of 97.89%, and an F1-score of 97.45%. The proposed model also has a low false positive rate (1.21%) and high detection rates for several attack types, including 99.45% for DoS Hulk and 98.91% for PortScan. The proposed architecture has high potential for real-world application owing to its scalability, interpretability, and resilience. This study contributes a practical new approach to AI-powered cybersecurity by addressing both detection accuracy and dependability in dynamic threat environments.
Knowledge graph, as a form of efficient organization of entities and concepts, can organize scattered distributed data and provide support for data analysis and knowledge reasoning for threat modeling of cyber security. In this paper, based on ATT&CK threat intelligence, we use knowledge graph modeling technology to study the construction of a threat intelligence visualization model for cyber security attacks.
Cyber-defense systems are being developed to automatically ingest Cyber Threat Intelligence (CTI) that contains semi-structured data and/or text to populate knowledge graphs. A potential risk is that fake CTI can be generated and spread through Open-Source Intelligence (OSINT) communities or on the Web to effect a data poisoning attack on these systems. Adversaries can use fake CTI examples as training input to subvert cyber defense systems, forcing their models to learn incorrect inputs to serve the attackers' malicious needs. In this paper, we show how to automatically generate fake CTI text descriptions using transformers. Given an initial prompt sentence, a public language model like GPT-2 with fine-tuning can generate plausible CTI text that can mislead cyber-defense systems. We use the generated fake CTI text to perform a data poisoning attack on a Cybersecurity Knowledge Graph (CKG) and a cybersecurity corpus. The attack introduced adverse impacts such as returning incorrect reasoning outputs, representation poisoning, and corruption of other dependent AI-based cyber defense systems. We evaluate with traditional approaches and conduct a human evaluation study with cyber-security professionals and threat hunters. Based on the study, professional threat hunters were equally likely to consider our fake generated CTI and authentic CTI as true.
No abstract available
Security engineers and researchers use their disparate knowledge and discretion to identify malware present in a system. Sometimes, they may also use previously extracted knowledge and available Cyber Threat Intelligence (CTI) about known attacks to establish a pattern. To aid in this process, they need knowledge about malware behavior mapped to the available CTI. Such mappings enrich our representations and also help verify the information. In this paper, we describe how we retrieve malware samples and execute them in a local system. The tracked malware behavior is represented in our Cybersecurity Knowledge Graph (CKG), so that a security professional can reason with behavioral information present in the graph and draw parallels with that information. We also merge the behavioral information with knowledge extracted from the text in CTI sources like technical reports and blogs about the same malware to improve the reasoning capabilities of our CKG significantly.
Unstructured cyber threat intelligence (CTI) reports present major challenges for systematic analysis, particularly when accuracy and reliability are critical. This paper introduces a formal, four-stage mathematical model for constructing canonical knowledge graphs from sensitive textual data. The model integrates the advanced extraction and reasoning capabilities of GPT-5 with deterministic rule-based inference and network analysis to bridge the “formalization gap” between probabilistic large language model (LLM) outputs and verifiable analytical structures. Using a corpus of 204 official CERT-UA incident reports as a test case, the methodology successfully normalized thousands of raw entities, identified central threat actors and high-value targets, and revealed distinct operational ecosystems within Ukraine’s cyber threat landscape. Theoretically, the study contributes a replicable and mathematically defined framework for integrating next-generation LLMs into formalized knowledge graph pipelines. Practically, it provides a scalable and reliable tool for analysts in cybersecurity, national security, and related fields, enabling the transformation of unstructured reports into actionable intelligence.
Knowledge graphs (KGs) have become essential for representing complex relationships among entities such as malware, vulnerabilities, and attack techniques extracted from cyber threat reports. Traditional KG embedding methods primarily focus on single-hop link prediction, limiting their ability to capture longer relational chains that are critical in real-world attack scenarios. Multi-hop reasoning, on the other hand, reflects the true structure of cyber threats, where interactions often span multiple entities (for example, malware → technique → vulnerability → software → sector). Inferring such indirect connections enables early detection of potential attack surfaces, improved attribution, and richer contextual threat intelligence. In this work, we propose a unified framework for multi-hop link prediction in malware knowledge graphs. The framework learns latent representations of entities and relations by leveraging both graph-based and language-based reasoning paradigms. Specifically, we fine-tune large language models (LLaMA-2-7B and Phi-2) for path-based reasoning using natural language representations, and we benchmark them against graph neural network (GNN) variants such as GraphSAGE, GCN, and GAT. Experimental results show that GNNs generalize effectively on shorter paths but degrade as relational complexity increases, while LLMs maintain robust performance across longer reasoning chains due to their chain-of-thought and contextual learning capabilities. These findings highlight the complementary strengths of symbolic and neural reasoning approaches and demonstrate the potential of LLMs for dynamic malware analysis and reasoning over unseen entities.
The increasing sophistication of insider threats and Advanced Persistent Threats (APTs) necessitates intelligent, proactive cybersecurity systems that go beyond traditional rule-based detection. This paper presents an integrated framework that combines behavioral biometrics with Cyber Threat Intelligence (CTI) ontologies for the predictive analysis of insider threats and APT actor behavior. Behavioral biometrics—such as keystroke dynamics, mouse movement, and touch gestures—are leveraged to establish dynamic, continuous identity verification baselines. These are semantically mapped using CTI ontologies, including MITRE ATT&CK and STIX/TAXII, to associate anomalous behavior with known adversary tactics, techniques, and procedures (TTPs). The proposed model employs multi-layered learning with clustering algorithms, Bayesian networks, and graph convolutional reasoning to detect behavioral deviations and attribute them to malicious intent. Empirical evaluations using a 12-month dataset of over 80,000 user behavior records show that the integrated system achieved an accuracy of over 92% in identifying insider threat activity and predicting APT behavioral patterns. Explainable AI techniques such as SHAP and LIME enhance interpretability, while fairness audits ensure ethical compliance. The findings demonstrate that integrating behavioral biometrics and CTI ontologies significantly improves detection fidelity, contextual awareness, and cyber threat mitigation. This work contributes to the development of cognitive, adaptive defense mechanisms essential for modern enterprise security.
No abstract available
Cyber threat intelligence (CTI) is vital for the proactive identification and mitigation of cybersecurity threats. However, inconsistent quality and the spread of misinformation undermine decision‐making, weaken defensive measures, and increase exposure to attack. To address these challenges, we propose CyberVeriGNN, a graph neural network (GNN) framework that detects forged CTI by jointly modeling semantic consistency and structural coherence. The method parses unstructured threat reports into heterogeneous attack subgraphs, enriches node and edge representations using BERT contextual embeddings and MITRE ATT&CK semantics, and fuses these features via a multi‐head graph attention network (GAT). A graph autoencoder (GAE) subsequently computes reconstruction error, yielding an interpretable anomaly score. Evaluated on real CTI corpora augmented with expert‐validated adversarial samples, CyberVeriGNN achieves a precision of 81.57% and an F1 score of 81.25%, significantly outperforming both traditional and graph‐based baselines. These results demonstrate the value of deep semantic–structural fusion for CTI authenticity verification and lay a strong foundation for scalable, trustworthy validation in security‐critical environments.
The escalating frequency and severity of cyber-attacks have presented formidable challenges to the safeguarding of cyberspace. Named Entity Recognition (NER) technology is used to rapidly identify threat entities and their relationships within cyber threat intelligence, so that security researchers are promptly informed of cyber threats and can defend and analyse more efficiently. However, current models for identifying network threat entities and extracting relationships suffer from limitations such as inadequate representation of textual semantic information, insufficient granularity in threat entity recognition, and error propagation in relationship extraction. To address these issues, this article proposes a novel model for Network Threat Entity Recognition and Relationship Extraction (CtiErRe). Additionally, it redefines seven network threat entities and two types of relationships between threat entities. Specifically, domain knowledge is first collected to build a domain knowledge graph, which is then embedded using graph convolutional networks (GCN) to enhance the feature representation of threat intelligence text. Next, the features from the domain knowledge graph embedding and those generated by the bidirectional encoder representations from transformers (BERT) model are fused using layer normalization (LayerNorm). Finally, the fused features are processed using the GlobalPointer algorithm to generate both the threat entity type matrix and the threat entity relation type matrix, thereby enabling the identification of threat entities and their relationships. To validate the proposed model, we conducted extensive experiments, and the results demonstrate its superiority over existing models. In threat entity recognition, the model reaches accuracy and F1 scores of 92.13% and 93.11%, respectively. In the relationship extraction task, it achieves accuracy and F1 scores of 91.45% and 92.45%, respectively.
The rapid growth of cyber threats emanating from the Dark Web has become a daunting problem for modern cybersecurity systems. This research proposes a Cyber Threat Intelligence framework that uses Graph Attention Networks (GATs) for effective monitoring and analysis of the Dark Web. GATs model the complex relationships between entities, such as threat actors, marketplaces, and communication channels, by assigning weights to the connections between nodes. Initially, Dark Web data is gathered and preprocessed with NLP and NER to remove irrelevant items such as age, virus names, and online identities. A graph is then built from this feature set, in which nodes are the extracted entities and edges represent relationships. The GAT model learns the hierarchical and semantic patterns of these graphs. Once the model has learned a refined representation of the graph, emerging threats, unusual activities, and attack methods can be identified. Moreover, the framework can discover potential threats and alert the network operator in time. Furthermore, the proposed framework incorporates a temporal anomaly detection module to monitor behavioral changes over time. With this technique, the detection of cyber threats is not only more accurate but also accounts for changes in the system. The experimental results show that the proposed GAT-based framework outperforms traditional graph neural networks and baseline machine learning models in detecting hidden criminal acts on the Dark Web.
Cyber threat intelligence (CTI) is central to modern cybersecurity, providing critical insights for detecting and mitigating evolving threats. With the natural language understanding and reasoning capabilities of large language models (LLMs), there is increasing interest in applying them to CTI, which calls for benchmarks that can rigorously evaluate their performance. Several early efforts have studied LLMs on some CTI tasks but remain limited: (i) they adopt only closed-book settings, relying on parametric knowledge without leveraging CTI knowledge bases; (ii) they cover only a narrow set of tasks, lacking a systematic view of the CTI landscape; and (iii) they restrict evaluation to single-source analysis, unlike realistic scenarios that require reasoning across multiple sources. To fill these gaps, we present CTIArena, the first benchmark for evaluating LLM performance on heterogeneous, multi-source CTI under knowledge-augmented settings. CTIArena spans three categories, structured, unstructured, and hybrid, further divided into nine tasks that capture the breadth of CTI analysis in modern security operations. We evaluate ten widely used LLMs and find that most struggle in closed-book setups but show noticeable gains when augmented with security-specific knowledge through our designed retrieval-augmented techniques. These findings highlight the limitations of general-purpose LLMs and the need for domain-tailored techniques to fully unlock their potential for CTI.
Cyber Threat Intelligence (CTI) reports provide valuable insights into cyber threats. However, manually constructing attack graphs from the unstructured CTI reports requires significant human effort. With the development of Large Language Models (LLMs), researchers have begun to harness LLMs for attack graph construction from CTI reports. Nevertheless, existing works mainly focus on describing abstract and high-level attack behaviors through these graphs, which cannot be used as query graphs for threat detection based on graph matching, as provenance graphs are at the system level. Moreover, these works do not consider modeling cross-host attack behaviors. To address these problems, we propose a novel method for automatically constructing attack graphs from CTI reports. We utilize prompt engineering, and leverage the in-context learning ability of LLMs to generate attack graphs. In this method, we first restructure the CTI report by grouping continuous sentences with the same tactics, and then use multiple LLMs to extract entities and relations. Finally, we use an LLM to integrate all the results. We also design a cross-host threat detection algorithm using the generated attack graphs. The evaluation results show that our method constructs attack graphs with an average conversion rate of 72.4%. It also achieves nearly 94.8% precision and 96.5% recall for IoC and relation extraction compared with manually labeled results.
Verifying the credibility of Cyber Threat Intelligence (CTI) is essential for reliable cybersecurity defense. However, traditional approaches typically treat this task as a static classification problem, relying on handcrafted features or isolated deep learning models. These methods often lack the robustness needed to handle incomplete, heterogeneous, or noisy intelligence, and they provide limited transparency in decision-making, factors that reduce their effectiveness in real-world threat environments. To address these limitations, we propose LRCTI, a Large Language Model (LLM)-based framework designed for multi-step CTI credibility verification. The framework first employs a text summarization module to distill complex intelligence reports into concise and actionable threat claims. It then uses an adaptive multi-step evidence retrieval mechanism that iteratively identifies and refines supporting information from a CTI-specific corpus, guided by LLM feedback. Finally, a prompt-based Natural Language Inference (NLI) module is applied to evaluate the credibility of each claim while generating interpretable justifications for the classification outcome. Experiments conducted on two benchmark datasets, CTI-200 and PolitiFact, show that LRCTI improves F1-Macro and F1-Micro scores by over 5%, reaching 90.9% and 93.6%, respectively, compared to state-of-the-art baselines. These results demonstrate that LRCTI effectively addresses the core limitations of prior methods, offering a scalable, accurate, and explainable solution for automated CTI credibility verification.
The automation of Cyber Threat Intelligence (CTI) relies heavily on Named Entity Recognition (NER) to extract critical entities from unstructured text. Currently, Large Language Models (LLMs) primarily address this task through retrieval-based In-Context Learning (ICL). This paper analyzes this mainstream paradigm, revealing a fundamental flaw: its success stems not from global semantic similarity but largely from the incidental overlap of entity types within retrieved examples. This exposes the limitations of relying on unreliable implicit induction. To address this, we propose TTPrompt, a framework shifting from implicit induction to explicit instruction. TTPrompt maps the core concepts of CTI's Tactics, Techniques, and Procedures (TTPs) into an instruction hierarchy: formulating task definitions as Tactics, guiding strategies as Techniques, and annotation guidelines as Procedures. Furthermore, to handle the adaptability challenge of static guidelines, we introduce Feedback-driven Instruction Refinement (FIR). FIR enables LLMs to self-refine guidelines by learning from errors on minimal labeled data, adapting to distinct annotation dialects. Experiments on five CTI NER benchmarks demonstrate that TTPrompt consistently surpasses retrieval-based baselines. Notably, with refinement on just 1% of training data, it rivals models fine-tuned on the full dataset. For instance, on LADDER, its Micro F1 of 71.96% approaches the fine-tuned baseline, and on the complex CTINexus, its Macro F1 exceeds the fine-tuned ACLM model by 10.91%.
Cyber threat intelligence (CTI) increasingly relies on knowledge graphs (KGs) to represent entities, relationships, and context across vulnerabilities, exploits, actors, and mitigations. Real-world CTI is, however, uncertain and parameter-dependent: intelligence feeds vary in confidence, temporal validity, and applicability. In this paper we introduce a soft-topological framework for cybersecurity knowledge graphs (ST-CKG). Our framework integrates soft set theory and soft topology with KG representations to model parameterized uncertainty and dynamic relationships. We define soft-open subgraphs, soft-closure and boundary operators, soft-connected components, and soft-continuous mappings to formalize KG evolution and risk propagation. A prototype implementation built from public CTI sources demonstrates the framework’s utility in identifying robust threat clusters and emerging vulnerabilities. We discuss applications in threat prioritization, risk assessment, and cyber situational awareness, and outline directions for future research.
Defending against today's increasingly sophisticated and large-scale cyberattacks demands accurate, real-time threat intelligence. Traditional approaches struggle to scale, integrate diverse telemetry, and adapt to a constantly evolving security landscape. We introduce Threat Intelligence Tracking via Adaptive Networks (Titan), an industry-scale graph mining framework that generates cyber threat intelligence at unprecedented speed and scale. Titan introduces a suite of innovations specifically designed to address the complexities of the modern security landscape, including: (1) a dynamic threat intelligence graph that maps the intricate relationships between millions of entities, incidents, and organizations; (2) real-time update mechanisms that automatically decay and prune outdated intel; (3) integration of security domain knowledge to bootstrap initial reputation scores; and (4) reputation propagation algorithms that uncover hidden threat actor infrastructure. Integrated into Microsoft Unified Security Operations Platform (USOP), which is deployed across hundreds of thousands of organizations worldwide, Titan's threat intelligence powers key detection and disruption capabilities. With an impressive average macro-F1 score of 0.89 and a precision-recall AUC of 0.94, Titan identifies millions of high-risk entities each week, enabling a 6x increase in non-file threat intelligence. Since its deployment, Titan has increased the product's incident disruption rate by a remarkable 21%, while reducing the time to disrupt by a factor of 1.9x, and maintaining 99% precision, as confirmed by customer feedback and thorough manual evaluation by security experts--ultimately saving customers from costly security breaches.
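The idea of decaying and pruning outdated intelligence, item (2) above, can be illustrated with a small sketch. The abstract does not say which decay function Titan uses; the exponential half-life, the threshold, and the entity names below are all assumptions chosen for illustration.

```python
def decayed_score(score, age_days, half_life_days=30.0):
    """Exponentially decay a reputation score; it halves every `half_life_days`."""
    return score * 0.5 ** (age_days / half_life_days)

def prune(entities, now_day, threshold=0.1, half_life_days=30.0):
    """Return entities whose decayed score is still at or above `threshold`.

    `entities` maps a name to (score_when_last_seen, day_last_seen).
    """
    kept = {}
    for name, (score, last_seen_day) in entities.items():
        current = decayed_score(score, now_day - last_seen_day, half_life_days)
        if current >= threshold:
            kept[name] = current
    return kept

# Hypothetical entities, both last seen on day 0 (documentation-range names).
entities = {"198.51.100.7": (0.9, 0), "evil.example": (0.2, 0)}
kept = prune(entities, now_day=60)  # two half-lives later
print(kept)
```

After two half-lives the first score has dropped to a quarter of its value and survives, while the second falls below the threshold and is pruned.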
In the research of cyber threat intelligence knowledge graphs, the current challenge is that there are errors, inconsistencies, or missing knowledge graph triples, which makes it difficult to cope with the complexity and diversified application requirements. Currently, the predominant approach in quality assessment research for knowledge graphs involves employing word embeddings. This method evaluates the rationality of triples to assess the quality of knowledge graphs. Recent studies have found that better word representations can be obtained by splicing different types of embeddings, and applied to tasks such as named entity recognition (NER). However, amidst the proliferation of embedding typologies, the conundrum of selecting optimal embeddings for constructing connection representations has emerged as a pressing issue. In this paper, we propose an adaptive joining of embedding (AJE) model to automatically find better word embedding representations for knowledge graph quality assessment. The AJE model operates through a coordinated interplay between a task model and a selector. The former samples word embeddings generated by various models, while the latter generates rewards predicated on feedback obtained from current task outcomes to decide whether or not to splice the embedding. Experiments were conducted on two generic datasets and one cybersecurity dataset for knowledge graph quality assessment. The results show that our model outperforms the baseline model and achieves significant advantages in key metrics such as accuracy and F1 value, obtaining accuracy of 95.8%, 95.6% and 91.3% on the generic datasets WN11, FB13 and cybersecurity dataset CS13K, respectively, representing increases of 1.0%, 0.2% and 0.5% over the AttTucker model.
The number and complexity of online risks call for new ways of analyzing threat data. This study proposes an approach that combines graph theory and machine learning to improve cyber threat analysis. Graph theory represents cyber entities and their interactions as nodes and edges, giving a full picture of the threat landscape and making trends and outliers easier to spot. Machine learning techniques exploit the large volume of data available for cyber threat analysis. Supervised learning is used for classification, grouping threats according to historical data and known patterns of malicious behavior. Unsupervised learning supports anomaly detection by noticing deviations from normal network behavior. Through repeated training and refinement, these models adapt to changing threats, which makes threat detection and prevention more effective. Graph-based analytics bring heterogeneous data, such as network traffic, system logs, and threat intelligence feeds, into a single view, which helps reveal connections between seemingly unrelated entities. Machine learning algorithms then refine this analysis by finding subtle patterns and trends that indicate malicious behavior, enabling cybersecurity professionals to stop emerging threats before they materialize. The proposed system is designed for scalability and flexibility so that it can adapt to changing cyber threats and network platforms. By using distributed computing architectures and flexible machine learning methods, it handles large datasets and real-time streaming data, so that threats are found and dealt with quickly. Combining graph theory and machine learning is thus a promising way to improve threat intelligence research for defense.
No abstract available
Cyber Threat Intelligence (CTI) knowledge graphs depend on extracting Structured Threat Information Expression (STIX) 2.1-compliant entity–relation triples from unstructured threat reports, but current systems often treat schema violations as terminal errors and discard invalid outputs. In practice, Large Language Model (LLM) extraction frequently produces near-correct triples that fail validation due to minor formatting artifacts, entity type drift across sentences, alias fragmentation, or STIX domain-range mismatches. This paper presents a multi-stage triple validation and repair framework that recovers such rejected triples while enforcing strict STIX 2.1 constraints. Starting from an invalid-triple log, the framework applies deterministic normalization (artifact stripping, type/predicate canonicalization, and domain-range enforcement), followed by probabilistic Markov smoothing to stabilize entity typing across document contexts. A unified, schema-constrained LLM repair module generates a small set of candidate repairs, each revalidated under the same STIX rules, ensuring that no hallucinated entities or unsupported predicates enter the final output. Finally, a graph-based Skew Zero Forcing (SZF) pass reinforces structurally consistent neighborhoods by propagating trust from high-confidence nodes and filtering incompatible relationships. Using DNRTI as a benchmark, the results show that staged repair significantly improves the usable triple set by converting a portion of validation failures into STIX-valid triples, increasing graph connectivity and reducing information loss without relaxing ontology constraints.
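The deterministic normalization stage (artifact stripping, type/predicate canonicalization, domain-range enforcement) might look roughly like the sketch below. It is not the paper's code: the alias tables are hypothetical, and `ALLOWED` holds only a small illustrative subset of the relationship constraints defined in STIX 2.1.

```python
import re

# Assumption: a tiny subset of STIX 2.1 (source type, relationship, target type) rules.
ALLOWED = {
    ("threat-actor", "uses", "malware"),
    ("threat-actor", "uses", "tool"),
    ("malware", "exploits", "vulnerability"),
    ("indicator", "indicates", "malware"),
}
# Hypothetical alias tables for canonicalization.
TYPE_ALIASES = {"threat actor": "threat-actor", "vuln": "vulnerability"}
PREDICATE_ALIASES = {"use": "uses", "used": "uses", "exploit": "exploits"}

def strip_artifacts(text):
    """Drop surrounding quotes/brackets and collapse internal whitespace."""
    cleaned = text.strip().strip("\"'`[](){}<>").strip()
    return re.sub(r"\s+", " ", cleaned)

def canonical_type(raw):
    key = strip_artifacts(raw).lower().replace("_", " ")
    return TYPE_ALIASES.get(key, key.replace(" ", "-"))

def canonical_predicate(raw):
    key = strip_artifacts(raw).lower()
    return PREDICATE_ALIASES.get(key, key)

def normalize(triple):
    """Normalize ((name, type), predicate, (name, type)) without inventing content."""
    (s_name, s_type), pred, (o_name, o_type) = triple
    return ((strip_artifacts(s_name), canonical_type(s_type)),
            canonical_predicate(pred),
            (strip_artifacts(o_name), canonical_type(o_type)))

def is_valid(triple):
    """Domain-range check against the allowed (type, predicate, type) set."""
    (_, s_type), pred, (_, o_type) = triple
    return (s_type, pred, o_type) in ALLOWED

raw = (('"APT29"', "Threat_Actor"), " used ", ("[WellMess]", "Malware"))
fixed = normalize(raw)
print(is_valid(raw), is_valid(fixed), fixed)
```

Triples that still fail `is_valid` after normalization would be the ones handed to the later probabilistic and LLM-based repair stages.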
Cyber Threat Intelligence (CTI) reports are valuable resources in various applications, but manually extracting information from them is time-consuming. Existing approaches for automating extraction require specialized models trained on a substantial corpus. In this paper, we present an efficient methodology for constructing knowledge graphs from CTI by leveraging a Large Language Model (LLM), using ChatGPT as an example. Our approach automatically extracts attack-related entities and their relationships, organizing them within a CTI knowledge graph. We evaluate our approach on 13 CTI reports, demonstrating better performance compared to AttacKG and REBEL while requiring less manual intervention and fewer computational resources. This demonstrates the feasibility and suitability of our method in low-resource scenarios, specifically within the domain of cyber threat intelligence.
Cyber threat intelligence (CTI) sharing has gradually become an important means of dealing with security threats. Considering the growth of cyber threat intelligence, the quick analysis of threats has become a hot topic at present. Researchers have proposed some machine learning and deep learning models to automatically analyze these immense amounts of cyber threat intelligence. However, due to a large amount of network security terminology in CTI, these models based on open-domain corpus perform poorly in the CTI automatic analysis task. To address this problem, we propose an automatic CTI analysis method named K-CTIAA, which can extract threat actions from unstructured CTI by pre-trained models and knowledge graphs. First, the related knowledge in knowledge graphs will be supplemented to the corresponding position in CTI through knowledge query and knowledge insertion, which help the pre-trained model understand the semantics of network security terms and extract threat actions. Second, K-CTIAA reduces the adverse effects of knowledge insertion, usually called the knowledge noise problem, by introducing a visibility matrix and modifying the calculation formula of the self-attention. Third, K-CTIAA maps corresponding countermeasures by using digital artifacts, which can provide some feasible suggestions to prevent attacks. In the test data set, the F1 score of K-CTIAA reaches 0.941. The experimental results show that K-CTIAA can improve the performance of automatic threat intelligence analysis and it has certain significance for dealing with security threats.
Cyber attacks are becoming more sophisticated and diverse, making detection increasingly challenging. To combat these attacks, security practitioners actively summarize and exchange their knowledge about attacks across organizations in the form of cyber threat intelligence (CTI) reports. However, because CTI reports are written in natural language and are not structured for automatic analysis, using them requires tedious manual effort to recover the threat intelligence. Additionally, individual reports typically cover only a limited aspect of attack patterns (techniques) and thus are insufficient to provide a comprehensive view of attacks with multiple variants. To take advantage of threat intelligence delivered by CTI reports, we propose AttacKG to automatically extract structured attack behavior graphs from CTI reports and identify the adopted attack techniques. We then aggregate cyber threat intelligence across reports to collect different aspects of techniques and enhance attack behavior graphs into technique knowledge graphs (TKGs). In our evaluation against 1,515 real-world CTI reports from diverse intelligence sources, AttacKG effectively identifies 28,262 attack techniques with 8,393 unique Indicators of Compromise (IoCs). To further verify the accuracy of AttacKG in extracting threat intelligence, we run AttacKG on 16 manually labeled CTI reports. Empirical results show that AttacKG accurately identifies attack-relevant entities, dependencies, and techniques with F1-scores of 0.887, 0.896, and 0.789, which outperforms the state-of-the-art approaches Extractor and TTPDrill. Moreover, the unique technique-level intelligence will directly benefit downstream security tasks that rely on technique specifications, e.g., APT detection and cyber attack reconstruction.
No abstract available
The Internet of Things (IoT) has numerous applications in industry and society, thanks to its ability to bring automation and connectivity to a range of activities. Despite its great potential, IoT is susceptible to physical and cyber attacks, which cause security threats (e.g., financial risk and leakage of privacy). To address this problem, an approach for attack prediction is proposed for IoT. Aiming at a high degree of flexibility, an intelligent model is designed to construct a knowledge graph by integrating equipment information (CPE), vulnerability information (CVE), and attack pattern information (CAPEC) disclosed by the National Institute of Standards and Technology (NIST) and the security organization MITRE. Based on the knowledge graph, safety analysis and operation analysis of IoT information are carried out. To infer possible attacks, a knowledge representation learning method that fuses the triple information and semantic path combination information of the knowledge graph (FTSPC) is employed. We transform the attack prediction task into a link prediction problem. The proposed method is evaluated on a public dataset and our own dataset; the results demonstrate that the method can predict attacks on IoT infrastructure, providing rich IoT security knowledge to security researchers and professionals and a useful reference for active defense.
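Casting attack prediction as link prediction means scoring candidate (head, relation, tail) triples and ranking the tails. The sketch below uses a TransE-style translation score as a stand-in; it is not the FTSPC method from the abstract, and the entity names and hand-set two-dimensional embeddings are purely hypothetical (real embeddings would be learned from the graph).

```python
def transe_score(head, relation, tail):
    """TransE-style plausibility: negative L1 distance of head + relation from tail."""
    return -sum(abs(h + r - t) for h, r, t in zip(head, relation, tail))

def rank_tails(head, relation, candidates, entity_vecs, relation_vecs):
    """Rank candidate tail entities for the query (head, relation, ?), best first."""
    h, r = entity_vecs[head], relation_vecs[relation]
    scored = [(transe_score(h, r, entity_vecs[c]), c) for c in candidates]
    return [c for _, c in sorted(scored, reverse=True)]

# Toy embeddings, set by hand for illustration only.
entity_vecs = {
    "device-A": [0.0, 0.0],
    "attack-X": [1.0, 0.0],
    "attack-Y": [0.0, 3.0],
}
relation_vecs = {"targeted_by": [1.0, 0.0]}

ranking = rank_tails("device-A", "targeted_by", ["attack-X", "attack-Y"],
                     entity_vecs, relation_vecs)
print(ranking)
```

The top-ranked tail is read as the most likely missing link, that is, the predicted attack for the queried device.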
This research presents a novel approach for network security vulnerability association analysis and prediction leveraging knowledge graph technology. We construct a comprehensive vulnerability knowledge graph that captures semantic relationships between vulnerabilities, attack patterns, and affected systems by integrating data from multiple sources including NVD, CVE, and vendor security bulletins. Our methodology encompasses three complementary analysis approaches: semantic association analysis using path-based algorithms, temporal association analysis employing multi-scale time-series techniques, and attack chain association analysis through exploitation chain construction. The prediction framework combines knowledge graph embeddings, graph neural networks, and multi-modal feature fusion to forecast vulnerability exploitation with 89.2% accuracy within a 30-day window, significantly outperforming statistical baselines (71.3%) and non-knowledge graph methods (82.6%). Experimental evaluation on real-world datasets demonstrates that our semantic association analysis achieved 0.87 precision and 0.82 recall (F1: 0.84), outperforming baselines by 18.7%. Our attack chain discovery identified 76.8% of known attack chains while discovering 23 previously undocumented but plausible vectors. The system maintained 83.7% performance with 30% missing attributes, demonstrating robust adaptability to real-world challenges. In enterprise deployment, our approach identified 37 critical vulnerability associations and predicted 14 high-priority vulnerabilities, with 11 being missed by existing tools. The methodology aids in the proactive cybersecurity management in networks that are becoming increasingly complex.
The security issues of power information systems are becoming more and more severe. Actively discovering system vulnerabilities is of great significance to improve system security. To realize the automation of penetration testing, in this paper a penetration testing method based on knowledge graph is proposed for power information systems. The method uses knowledge graph to represent and infer network topology, asset information and vulnerability information to guide the automated execution of penetration testing. Firstly, the knowledge graph information extraction and framework construction are completed to realize knowledge inference; secondly, an attack graph generation framework based on knowledge graph is constructed, penetration testing algorithms and penetration paths are designed to realize path searching and optimization; finally, penetration path automatic planning is realized based on attack condition inference of knowledge graph. The method can realize the automation of customized penetration testing path search and decision-making for power information systems, significantly improving the testing efficiency.
In response to the difficulty of detecting attacks caused by the unknown nature of 0-day vulnerabilities, the author proposes a knowledge graph based 0-day attack path prediction method. By extracting attack-related concepts and entities from existing research on network security ontologies and from network security databases, a network defense knowledge graph is constructed that turns discrete security data such as threats, vulnerabilities, and assets into interrelated security knowledge. A knowledge graph reasoning method based on a path sorting algorithm is then used to explore possible 0-day attacks in the target system. Experimental results show that the proposed method can rely on the knowledge system provided by the knowledge graph to provide comprehensive knowledge support for attack prediction, reduce the dependence of prediction analysis on expert models, and effectively overcome the adverse effects of unknown 0-day vulnerabilities on prediction analysis. It improves the accuracy of 0-day attack prediction. Because the path sorting algorithm infers from explicit features of the graph structure, the reasons behind a reasoning result can be traced back, which to some extent improves the interpretability of attack prediction results.
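Assuming the "path sorting algorithm" refers to Path Ranking Algorithm (PRA) style reasoning, its basic feature is the probability that a random walk following a fixed sequence of relations reaches a given node. The sketch below computes that feature on a toy graph; the node and relation names are hypothetical, and the paper's actual feature set and classifier are not reproduced.

```python
from collections import defaultdict

def path_probability(edges, source, relation_path):
    """Distribution over end nodes of a random walk from `source` that follows
    `relation_path`, choosing uniformly among matching outgoing edges."""
    out = defaultdict(list)
    for head, rel, tail in edges:
        out[(head, rel)].append(tail)
    dist = {source: 1.0}
    for rel in relation_path:
        nxt = defaultdict(float)
        for node, p in dist.items():
            targets = out.get((node, rel), [])
            for t in targets:
                nxt[t] += p / len(targets)
        dist = dict(nxt)  # walks with no matching edge are dropped
    return dist

# Hypothetical network defense graph.
edges = [
    ("attacker", "reaches", "web-server"),
    ("web-server", "has_vuln", "vuln-1"),
    ("web-server", "has_vuln", "vuln-2"),
    ("vuln-1", "grants_access", "db-server"),
    ("vuln-1", "grants_access", "file-server"),
    ("vuln-2", "grants_access", "db-server"),
]
dist = path_probability(edges, "attacker", ["reaches", "has_vuln", "grants_access"])
print(dist)
```

Because every score is tied to an explicit relation path, a high-scoring prediction can be traced back to the edges that produced it, which is the interpretability benefit the abstract describes.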
No abstract available
With the increasing number of network security threats and the frequent occurrence of software vulnerability attacks, the effective management and large-scale retrieval of vulnerability data have become urgent needs. Existing vulnerability information is scattered across heterogeneous sources and is difficult to integrate, which in turn makes it hard for security analysts to quickly retrieve and analyze relevant security knowledge. To address this problem, this paper proposes a method to construct a vulnerability knowledge graph by integrating multi-source vulnerability data, combining graph embedding technology with large language model reasoning to aggregate, infer, and enrich vulnerability knowledge. Experiments demonstrated that our domain-tuned Bidirectional Long Short-Term Memory–Conditional Random Field (BiLSTM-CRF) named entity recognition (NER), enhanced with a cybersecurity dictionary, achieved a 90.1% F1-score for entity extraction. For link prediction, a hybrid Graph Attention Network fused with GPT-3 reasoning boosted Hits1 by 0.137, Hits3 by 0.116, and Hits10 by 0.101 over the baseline. These results confirm that our approach markedly enhanced entity identification and relationship inference, yielding a more complete and dynamically updatable cybersecurity knowledge graph.
Currently, little is known about the structure of the Cargo ecosystem and the potential for vulnerability propagation. Many empirical studies generalize third-party dependency governance strategies from a single software ecosystem to other ecosystems but ignore the differences in the technical structures of different software ecosystems, making it difficult to directly generalize security governance strategies from other ecosystems to the Cargo ecosystem. To fill this gap, this paper constructs a knowledge graph of dependency vulnerabilities for the Cargo ecosystem. This paper is the first large-scale empirical study in a related research area to address vulnerability propagation in the Cargo ecosystem. This paper proposes a dependency-vulnerability knowledge graph parsing algorithm to determine the vulnerability propagation path and propagation range and empirically studies the characteristics of vulnerabilities in the Cargo ecosystem, the propagation range, and the factors that cause vulnerability propagation. Our research has found that the Cargo ecosystem's security vulnerabilities are primarily memory-related. 18% of the libraries affected by a vulnerability are still affected by it in their latest version. The versions affected by the propagation of vulnerabilities account for 19.78% of all versions in the Cargo ecosystem. This paper looks at the characteristics and propagation factors triggering vulnerabilities in the Cargo ecosystem. It provides some practical resolution strategies for administrators of the Cargo community, developers who use Cargo to manage third-party libraries, and library owners. This paper provides new ideas for improving the overall security of the Cargo ecosystem.
Existing attack path generation methods face limitations in dynamic simulation environments due to their reliance on static network models and computational inefficiencies when network configurations change frequently. This study proposes GAT-APG, a reinforcement learning framework that combines Graph Attention Networks with policy gradient methods to generate adaptive attack paths for security simulation. The approach employs node-centric vulnerability assessment that transforms network traffic data into vulnerability metrics, enabling adaptation to network changes without complete graph recalculation. Experimental validation on controlled graphs and the Kyoto dataset demonstrates competitive performance, achieving 93% accuracy against brute-force methods and showing 69.1% exact matches with Dijkstra’s algorithm on real-world topologies. The framework provides a simulation-ready environment for vulnerability assessment and defensive planning in dynamic networks.
The rapid expansion of Internet of Things (IoT) technology has led to a proliferation of smart devices and interconnected systems. Critical factors such as production limitations, cost constraints, and insufficient technical capabilities have rendered these devices more vulnerable and at higher risk compared to traditional devices. The extensive data processing and communication requirements due to the increasing number of devices and connections have also introduced significant security challenges. Consequently, security risk assessment methodologies have gained relevance for a wide range of IoT systems. However, identifying vulnerable nodes within the system, individually assessing devices, and performing compact and efficient analyses of the entire topology remain underexplored areas. To address these gaps, this paper presents a quantitative assessment approach based on risk and vulnerability metrics. By integrating computational metrics from existing literature, we conduct a host-based attack probability assessment and extend this analysis to devices, communication paths, and the overall graph within a security context. Beyond establishing a mathematical framework, we refine the IOTA approach, typically used for attack path detection and graph generation, into a hybrid risk-based model to enhance search-domain efficiency. Our proposed approach targets high-vulnerability components within the system through risk-weighted backtracking, thereby facilitating more efficient attack path detection and filtering. The developed method is evaluated in a border security case scenario, with comparisons made in terms of algorithmic and asymptotic complexity. Simulation results demonstrate that our hybrid approach for detecting potential attack paths achieves an average runtime improvement of 16.9% compared to existing methods.
No abstract available
MCKG: Advancing Attack Knowledge Graph Construction via Multi-Source Cross-Modal Threat Intelligence
The cyber threat intelligence knowledge graph aims to structure attack knowledge to guide cyber defences, yet existing approaches predominantly rely on natural language CTI, failing to effectively integrate multi-source, cross-modal threat intelligence. This results in incomplete and imprecise knowledge representation. We propose the Multi-Source Cross-Modal Threat Intelligence Knowledge Graph Construction Framework (MCKG). This framework employs a two-tier progressive fusion mechanism: first, a three-level attention mechanism establishes deep associations between natural language and visual features; subsequently, it constructs a three-dimensional associative network linking CTI knowledge graphs, malicious code graphs, and log-timeline graphs. Leveraging a heterogeneous graph attention network to unify cross-source semantic spaces, it achieves multigranularity attack path modelling, forming a unified, fine-grained attack technique knowledge graph that is cross-modal and multisource. Experiments demonstrate that MCKG accurately parses multi-source cross-modal threat intelligence, efficiently aggregates cross-modal attack knowledge, and constructs a comprehensive, precise attack technique knowledge graph. This provides structured decision-making support for complex network attack attribution analysis.
Attack-path planning plays a key role in proactive cybersecurity because of its ability in helping defenders anticipate adversaries and uncover critical vulnerabilities. This paper proposes GAPPO, a novel deep reinforcement learning-based attack path planning scheme that integrates Graph Attention Networks (GAT) and expert knowledge into Proximal Policy Optimization (PPO). There are three mechanisms in GAPPO. The first is using GAT to produce graph-structure-aware embeddings that emphasize critical connections, enabling expressive state representations for decision making. The second is a ruled-based action masking mechanism, which incorporates expert knowledge to prune the action space based on node dependencies and then to prevent illegal actions from negatively impacting training. The third is combining the results of the first two mechanisms into PPO for attack path planning. Our extensive experimental results demonstrate that GAPPO outperforms existing methods in terms of faster convergence and higher-quality attack paths across diverse scenarios.
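The rule-based action masking mechanism can be illustrated with a masked softmax over the policy's logits. This is a minimal sketch, not GAPPO itself: the dependency rule (a node is attackable only if one of its prerequisites is already compromised), the node names, and the logits are assumptions.

```python
import math

def legal_actions(nodes, prerequisites, compromised):
    """Hypothetical expert rule: a node may be attacked only if it is not yet
    compromised and at least one of its prerequisite nodes already is."""
    return [n not in compromised and any(p in compromised for p in prerequisites[n])
            for n in nodes]

def masked_softmax(logits, legal):
    """Softmax restricted to legal actions; illegal actions get probability 0."""
    if not any(legal):
        raise ValueError("at least one action must be legal")
    top = max(l for l, ok in zip(logits, legal) if ok)  # for numerical stability
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, legal)]
    total = sum(exps)
    return [e / total for e in exps]

nodes = ["web", "db", "hmi"]
prerequisites = {"web": ["internet"], "db": ["web"], "hmi": ["web", "db"]}
compromised = {"internet", "web"}

legal = legal_actions(nodes, prerequisites, compromised)
probs = masked_softmax([5.0, 1.0, 1.0], legal)
print(legal, probs)
```

Even though the first action has the largest logit, the mask removes it, so the policy never samples an illegal move and its probability mass is spread over the remaining legal actions.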
As power distribution IoT technology advances, traditional security measures are increasingly inadequate. This study proposes a dynamic anomaly detection method for edge devices using attack chain knowledge graph technology. By analyzing IoT attack paths and constructing an attack chain model, the method establishes a hierarchical network security knowledge extraction model. It automatically builds a security knowledge graph based on the attack chain, enabling effective dynamic anomaly detection and improving security performance. The approach combines graph structure storage and algorithms for fast attack detection and threat traceability. Additionally, it incorporates atlas similarity and critical path analysis for dynamic detection. Experimental results show that this method effectively identifies potential security threats, enhances system security, and meets the complex protection needs of power distribution IoT systems.
The broad application of artificial intelligence (AI) shows more and more vulnerabilities. Adversaries have more opportunities to attack AI systems. For example, unmanned vehicles may be interfered with by adversaries in path planning, resulting in unmanned vehicles being unable to move according to the planned route, and even serious safety problems. On the other side, the portrait technology can extract highly refined characteristics of different attack strategies, so that unmanned vehicles can defend themselves based on the characteristics of each attack. Existing research lacks intelligent attack research on path planning in the field of unmanned vehicles, and lacks portraits of attack behaviors in this scenario. This paper combines multiagent reinforcement learning technology, time‐series segmentation clustering technology, and knowledge graph technology to study the portrait technology of adversary intelligent attack behavior in the field of unmanned vehicle path planning. First, the simulation results of unmanned vehicle path planning are obtained, and the steps of adversary attack behavior are extracted by using Toeplitz inverse covariance‐based clustering time‐series segmentation cluster technology. Second, the knowledge graph is used to save the attack strategy, so as to form the attack behavior portrait of unmanned vehicle path planning. The test on the Neo4j platform shows that our method is universal, can effectively describe the attack steps for unmanned vehicle path planning, and provides the basis for attack detection to establish the defense system of unmanned vehicles.
Vulnerability assessment is a critical aspect of cybersecurity, and its importance has grown significantly. However, traditional methods based on attack graphs are expensive and lack interpretability, while emerging methods based on knowledge graphs have yet to produce a scheme that is widely accepted in the industry. To address this issue, we propose a new scheme named KG-Rank, which leverages the correlations between vulnerabilities and assets through a knowledge graph and improves PageRank to assess and rank vulnerabilities. The KG-Rank scheme involves constructing a Vulnerability Knowledge Graph (VKG) and a Weakness Knowledge Graph (WKG), and obtaining new relations between weaknesses through relational reasoning on the WKG, using a relational reasoning model called Doc2TransR, to complete the VKG. We then transform the nodes and edges in the VKG and assess nodes on the transformed graph using a designed random walk strategy. Our experimental results demonstrate the effectiveness and reliability of KG-Rank, which considers not only the severity of vulnerabilities but also the importance of assets and the exposure scope of vulnerabilities and assets.
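The ranking idea in the abstract above can be illustrated with plain PageRank computed by power iteration. This is only a minimal sketch: it is not KG-Rank's designed random walk, and the node names, edges, and damping factor are hypothetical values chosen for illustration.

```python
def pagerank(edges, damping=0.85, iters=100, tol=1e-10):
    """Plain PageRank by power iteration over a directed edge list."""
    nodes = sorted({n for edge in edges for n in edge})
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        # Rank held by nodes without out-edges is spread uniformly.
        dangling = sum(rank[n] for n in nodes if not out[n])
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new = {n: base for n in nodes}
        for src in nodes:
            for dst in out[src]:
                new[dst] += damping * rank[src] / len(out[src])
        delta = sum(abs(new[n] - rank[n]) for n in nodes)
        rank = new
        if delta < tol:
            break
    return rank

# Hypothetical toy graph: assets point to the vulnerabilities they expose,
# and vulnerabilities point to a shared weakness class.
EDGES = [
    ("web-server", "CVE-A"),
    ("db-server", "CVE-A"),
    ("db-server", "CVE-B"),
    ("CVE-A", "CWE-X"),
    ("CVE-B", "CWE-X"),
]
ranks = pagerank(EDGES)
```

In this toy graph `CVE-A` ends up ranked above `CVE-B` because it is exposed by more assets, which mirrors the abstract's point that exposure scope matters alongside severity.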
With System on Chip integration on the rise, Silent Data Corruption (SDC) poses a significant threat to computer systems, corrupting outputs silently and without clear faults. Traditional error detection methods lack either energy efficiency or accuracy, failing to capture SDC’s complex propagation patterns. To this end, in this paper, we propose a new paradigm called VP-HPKG, which leverages a Heterogeneous Program Knowledge Graph to intricately map structural interdependencies between basic blocks and instructions, enabling the exploration of potential SDC propagation paths. Specifically, first, we build an instruction execution and register fault generation system, based on which we can simulate bit-flip errors to obtain data such as program semantic information and execution status. Second, due to complex inter-instruction relations and random error propagation, we construct a multi-layer heterogeneous program knowledge graph by characterizing entities and relations of instructions, which implies the potential paths of error propagation. Then, we model contextual correlations within and among basic blocks with a Graph Neural Network and a Transformer, aiming to uncover abnormal inter-block jumps and extract the instruction embedding to predict vulnerable instructions within blocks accurately with low overhead. In particular, we perform 1,144,070 fault injections based on LLFI to obtain sufficient samples of SDCs.
Our experimental results with 22 programs show that our method outperforms the state-of-the-art baselines in terms of accuracy (average 10.3%↑) and F1-score (average 18.4%↑). In addition, the model complexity of VP-HPKG is reduced by about 2 times compared to the most competitive method, and the fault injection overhead is reduced by 30%.
The automated identification and evaluation of potential attack paths within infrastructures is a critical aspect of cybersecurity risk assessment. However, existing methods become impractical when applied to complex infrastructures. While machine learning (ML) has proven effective in predicting the exploitation of individual vulnerabilities, its potential for full-path prediction remains largely untapped. This challenge stems from two key obstacles: the lack of adequate datasets for training the models and the dimensionality of the learning problem. To address the first issue, we provide a dataset of 1033 detailed environment graphs and associated attack paths, with the objective of supporting the community in advancing ML-based attack path prediction. To tackle the second, we introduce a novel Physics-Informed Graph Neural Network (PIGNN) architecture for attack path prediction. Our experiments demonstrate its effectiveness, achieving an F1 score of 0.9308 for full-path prediction. We also introduce a self-supervised learning architecture for initial access and impact prediction, achieving F1 scores of 0.9780 and 0.8214, respectively. Our results indicate that the PIGNN effectively captures adversarial patterns in high-dimensional spaces, demonstrating promising generalization potential towards fully automated assessments.
Against the background of the construction of new power systems, power generation, transmission, distribution, and dispatching services are open to the outside world for interaction, and the accessibility of attack paths has been significantly enhanced. We are facing cyber-physical cross-domain attacks with the characteristics of strong targeting, high concealment, and cross-space threats. This paper proposes a quantitative analysis method for the influence of power cyber-physical cross-domain attack paths based on graph knowledge. First, a layered attack graph was constructed based on the cross-space and strong coupling characteristics of the power cyber-physical system business and the vertical architecture of network security protection focusing on border protection. The attack graph included cyber-physical cross-domain attacks, control master stations, measurement and control equipment failures, transient stable node disturbances, and other vertices, and achieved a comprehensive depiction of the attack path. Second, the out-degree, in-degree, vertex betweenness, etc., of each vertex in the attack graph were comprehensively considered to calculate the vertex vulnerability, and by defining the cyber-physical coupling degree and edge weights, the risk of each attack path was analyzed in detail. Finally, the IEEE RTS79 and RTS96 node systems were selected, and the impact of risk conduction on the cascading failures of the physical space system under typical attack paths was analyzed using examples, verifying the effectiveness of the proposed method.
Enterprise network systems are confronted with an escalating threat landscape, requiring timely and effective attack detection and mitigation of the risk of potential financial losses and system damages. However, existing algorithms mostly rely on machine learning techniques or attack knowledge bases. They face challenges dealing with the large volumes of noisy network logs in enterprises, as well as the emergence of unknown cyber attacks. Moreover, previous research has predominantly focused on anomaly detection using raw network traffic capture, with limited exploration on attack path prioritization. To address these challenges, this paper introduces a novel algorithm for attack path detection and prioritization in network systems. Our approach gathers comprehensive asset information and network logs from multiple Network Intrusion Detection Systems (NIDSs). Through data processing and collation, the network data undergoes significant noise reduction and transformation into a network communication graph format. Subsequently, a Graph Neural Network (GNN) based anomaly detection algorithm is employed to extract and prioritize potential attack paths on the graph. This methodology leverages the power of unsupervised Machine Learning (ML) techniques and operates independently of prior attack databases. Incorporating path mining techniques, our algorithm provides visibility into identified attack propagation chain and the sequence of assets involved, which offers more valuable information compared to the repetitive atomic network traffic data from NIDSs. The algorithm is evaluated using the UNSW-NB15 dataset and proven to be effective and accurate with comprehensive experiment settings.
In recent years, attacks that exploit network security vulnerabilities have become increasingly common. Timely knowledge of vulnerabilities and threats is an effective way to prevent criminals from exploiting vulnerabilities and to help security analysts maintain equipment security. With the knowledge graph, we can effectively organize, manage, and utilize the massive information in cyberspace. In this paper, we construct a vulnerability ontology after analyzing multi-source heterogeneous databases and then establish the vulnerability knowledge graph. Experimental results show that the accuracy of entity recognition for extracting vendor names reaches 89.76%. The more rules used in entity recognition, the higher the accuracy and the lower the error rate.
No abstract available
5G industrial cyber–physical systems (5G-ICPSs) have attracted substantial research interests due to their capability in the interconnection of everything. However, integrating the 5G network may expose systems to more potential risks. To reveal attack propagation, an attack path prediction approach based on dual reinforcement learning (RL) is proposed. First, a dual-network model is established, incorporating the security constraints for attacks against the 5G network into the attack graph. Second, employing RL, $Q$-value updating functions and reward mechanisms based on topology and vulnerability are designed. Finally, an optimal attack path prediction algorithm is developed. Unlike traditional methods, the proposed approach does not rely on the monotonicity assumption that a system component has only one vulnerability, enabling it to accurately predict the optimal attack paths. Our simulation results demonstrate that the proposed approach can identify possible attack sources and paths from a 5G-ICPS.
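To make the Q-value update concrete, the following is a minimal sketch of plain tabular Q-learning on a toy attack graph. It is not the paper's dual-RL design: the graph, the vulnerability scores, the goal bonus, and the hyperparameters are all hypothetical, and the reward simply uses the vulnerability score of the node being entered.

```python
import random

# Hypothetical attack graph: node -> nodes reachable in one step.
GRAPH = {
    "internet": ["gateway", "hmi"],
    "gateway": ["core5g"],
    "hmi": ["plc"],
    "core5g": ["plc"],
    "plc": [],
}
# Hypothetical exploitability scores in [0, 1] for each target node.
VULN = {"gateway": 0.9, "hmi": 0.1, "core5g": 0.8, "plc": 0.5}
GOAL, GOAL_BONUS = "plc", 1.0

def reward(dst):
    """Reward for stepping onto dst: its score, plus a bonus at the goal."""
    return VULN[dst] + (GOAL_BONUS if dst == GOAL else 0.0)

def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s, nxt in GRAPH.items() for a in nxt}
    for _ in range(episodes):
        state = "internet"
        while GRAPH[state]:
            # Uniformly random behaviour policy; the update itself is off-policy.
            action = rng.choice(GRAPH[state])
            future = max((q[(action, a)] for a in GRAPH[action]), default=0.0)
            target = reward(action) + gamma * future
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = action
    return q

def greedy_path(q, start="internet"):
    """Follow the highest-valued action from start until a terminal node."""
    path = [start]
    while GRAPH[path[-1]]:
        s = path[-1]
        path.append(max(GRAPH[s], key=lambda a: q[(s, a)]))
    return path

q_table = train()
best_path = greedy_path(q_table)
```

With these assumed scores the learned greedy path goes through `gateway` and `core5g` rather than `hmi`, because the discounted sum of rewards along that route is larger.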
The use of attack graphs in the study of network vulnerability assessment is a classical and effective way, which breaks the deficiency of traditional methods that can only do a static assessment based on the threat level of vulnerabilities. In order to assess the overall vulnerability of the network system, this paper studies different vulnerability assessment methods based on attack graphs and proposes the AGH model, which uses CVSS to statically assess the probability of vulnerability exploitation and combines the Hidden Markov Model and Viterbi decoding algorithm to calculate the maximum probability of attacker attack path. Finally, the feasibility of the method is verified by conducting experiments with a real laboratory network topology.
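The Viterbi decoding step mentioned above can be sketched as follows. This is a generic Viterbi implementation, not the AGH model itself: the two hidden attacker states, the alert names, and all probabilities are hypothetical stand-ins for values that the paper derives from CVSS.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state sequence and its probability."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        prev = best[-1]
        layer = {}
        for s in states:
            layer[s] = max(
                (prev[r][0] * trans_p[r][s] * emit_p[s][o], r) for r in states
            )
        best.append(layer)
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    # Walk the back-pointers from the final state to the first one.
    path = [last]
    for layer in reversed(best[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()
    return path, prob

# Hypothetical model: hidden attacker stages and observed alert types.
STATES = ["probe", "compromise"]
START = {"probe": 0.6, "compromise": 0.4}
TRANS = {
    "probe": {"probe": 0.7, "compromise": 0.3},
    "compromise": {"probe": 0.4, "compromise": 0.6},
}
EMIT = {
    "probe": {"scan_alert": 0.5, "login_alert": 0.4, "exfil_alert": 0.1},
    "compromise": {"scan_alert": 0.1, "login_alert": 0.3, "exfil_alert": 0.6},
}
alerts = ["scan_alert", "login_alert", "exfil_alert"]
attack_path, attack_prob = viterbi(alerts, STATES, START, TRANS, EMIT)
# attack_path is ["probe", "probe", "compromise"] with probability 0.01512
```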
No abstract available
This paper studies the security issues for cyber–physical systems, aimed at countering potential malicious cyber-attacks. The main focus is on solving the problem of extracting the most vulnerable attack path in a known attack graph, where an attack path is a sequence of steps that an attacker can take to compromise the underlying network. Determining an attacker’s possible attack path is critical to cyber defenders as it helps identify threats, harden the network, and thwart attacker’s intentions. We formulate this problem as a path-finding optimization problem with logical constraints represented by AND and OR nodes. We propose a new Dijkstra-type algorithm that combines elements from Dijkstra’s shortest path algorithm and the critical path method. Although the path extraction problem is generally NP-hard, for the studied special case, the proposed algorithm determines the optimal attack path in polynomial time, O(nm), where n is the number of nodes and m is the number of edges in the attack graph. To our knowledge this is the first exact polynomial algorithm that can solve the path extraction problem for different attack graphs, both cycle-containing and cycle-free. Computational experiments with real and synthetic data have shown that the proposed algorithm consistently and quickly finds optimal solutions to the problem.
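A rough sketch of how Dijkstra-style label setting can be combined with a critical-path rule on an AND/OR graph is shown below. It is an illustration under stated assumptions, not the paper's algorithm: here an OR node costs its own weight plus the cheapest settled predecessor, an AND node costs its own weight plus the most expensive predecessor, all weights are non-negative, and every node name and weight is hypothetical.

```python
import heapq

def min_cost_attack(nodes, preds, sources, target):
    """nodes: name -> (kind, cost) with kind "AND" or "OR";
    preds: name -> list of distinct predecessor names."""
    succs = {n: [] for n in nodes}
    for n, ps in preds.items():
        for p in ps:
            succs[p].append(n)
    pending = {n: len(preds.get(n, [])) for n in nodes}  # unsettled preds (AND)
    worst = {n: 0 for n in nodes}                        # max settled pred cost (AND)
    dist = {}
    heap = [(nodes[s][1], s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, n = heapq.heappop(heap)
        if n in dist:
            continue  # already settled with a cost no larger than d
        dist[n] = d
        if n == target:
            return d
        for m in succs[n]:
            if m in dist:
                continue
            kind, cost = nodes[m]
            if kind == "OR":
                # Any single predecessor is enough to reach an OR node.
                heapq.heappush(heap, (d + cost, m))
            else:
                # An AND node is reachable only once every predecessor is settled.
                pending[m] -= 1
                worst[m] = max(worst[m], d)
                if pending[m] == 0:
                    heapq.heappush(heap, (worst[m] + cost, m))
    return None  # target cannot be reached

# Hypothetical attack graph.
NODES = {
    "phish": ("OR", 2), "scan": ("OR", 1),
    "creds": ("OR", 3), "vuln": ("OR", 4),
    "pivot": ("AND", 1), "direct": ("OR", 9), "plc": ("OR", 2),
}
PREDS = {
    "creds": ["phish"], "vuln": ["scan"], "direct": ["scan"],
    "pivot": ["creds", "vuln"], "plc": ["pivot", "direct"],
}
cheapest = min_cost_attack(NODES, PREDS, ["phish", "scan"], "plc")  # 8
```

In the example, reaching `plc` through the AND node `pivot` costs 8, which beats the cost of 12 through `direct`.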
The accurate and effective prediction of network attack paths has become a crucial concern in the realm of network security, given the inherent uncertainty and subjectivity associated with network attack methods. To solve this problem, this paper proposes a visualized dynamic attack path prediction scheme for industrial cyber-physical systems (ICPSs). The method combines the Bayesian attack graph with the knowledge graph and considers the topology of the digital twin layer to make it closer to the actual situation. In addition, node dynamic reachability probabilities are considered to provide support for the interpretation of the prediction results. The simulation results demonstrate that the proposed scheme is more flexible and scalable than the static attack graph. These improvements enable more accurate prediction of the network attack path and enhance the network’s security protection ability.
In recent years, web application development has become more efficient, yet vulnerabilities still pose significant risks. Traditional static and dynamic detection techniques are prone to false positives and negatives, making it challenging for small and medium-sized developers with limited security knowledge to accurately assess the results. To address these challenges, we introduced VulKiller, an automated vulnerability detection tool powered by large language models (LLM). VulKiller leverages static analysis to convert application code into Code Property Graphs (CPG) and utilizes Neo4j to identify high-risk method call chains. By designing structured interactions with ChatGPT, these call chains and corresponding code are transformed into Proofs of Concept (PoCs), which are then parsed into attack payloads and evaluated by a vulnerability monitor for effectiveness. In comparison with traditional tools, VulKiller excels in reducing false positives and negatives. Additionally, in zero-day vulnerability detection experiments, VulKiller identified 12 zero-day vulnerabilities. Our results offer significant encouragement for using LLM to enhance vulnerability detection.
Knowledge graph embedding (KGE) methods have achieved great success in handling various knowledge graph (KG) downstream tasks. However, KGE methods may learn biased representations on low-quality KGs that are prevalent in the real world. Some recent studies propose adversarial attacks to investigate the vulnerabilities of KGE methods, but their attackers are target-oriented with the KGE method and the target triples to predict are given in advance, which lacks practicability. In this work, we explore untargeted attacks with the aim of reducing the global performances of KGE methods over a set of unknown test triples and conducting systematic analyses on KGE robustness. Considering logic rules can effectively summarize the global structure of a KG, we develop rule-based attack strategies to enhance the attack efficiency. In particular, we consider adversarial deletion which learns rules, applying the rules to score triple importance and delete important triples, and adversarial addition which corrupts the learned rules and applies them for negative triples as perturbations. Extensive experiments on two datasets over three representative classes of KGE methods demonstrate the effectiveness of our proposed untargeted attacks in diminishing the link prediction results. And we also find that different KGE methods exhibit different robustness to untargeted attacks. For example, the robustness of methods engaged with graph neural networks and logic rules depends on the density of the graph. But rule-based methods like NCRL are easily affected by adversarial addition attacks to capture negative rules.
Ability to effectively investigate indicators of compromise and associated network resources involved in cyber attacks is paramount not only to identify affected network resources but also to detect related malicious resources. Today, most of the cyber threat intelligence platforms are reactive in that they can identify attack resources only after the attack is carried out. Further, these systems have limited functionality to investigate associated network resources. In this work, we propose an extensible predictive cyber threat intelligence platform called cGraph that addresses the above limitations. cGraph is built as a graph-first system where investigators can explore network resources utilizing a graph based API. Further, cGraph provides real-time predictive capabilities based on state-of-the-art inference algorithms to predict malicious domains from network graphs with a few known malicious and benign seeds. To the best of our knowledge, cGraph is the only threat intelligence platform to do so. cGraph is extensible in that additional network resources can be added to the system transparently.
Industry 5.0's increasing integration of IT and OT systems is transforming industrial operations but also expanding the cyber-physical attack surface. Industrial Control Systems (ICS) face escalating security challenges as traditional siloed defences fail to provide coherent, cross-domain threat insights. We present BRIDG-ICS (BRIDge for Industrial Control Systems), an AI-driven Knowledge Graph (KG) framework for context-aware threat analysis and quantitative assessment of cyber resilience in smart manufacturing environments. BRIDG-ICS fuses heterogeneous industrial and cybersecurity data into an integrated Industrial Security Knowledge Graph linking assets, vulnerabilities, and adversarial behaviours with probabilistic risk metrics (e.g. exploit likelihood, attack cost). This unified graph representation enables multi-stage attack path simulation using graph-analytic techniques. To enrich the graph's semantic depth, the framework leverages Large Language Models (LLMs): domain-specific LLMs extract cybersecurity entities, predict relationships, and translate natural-language threat descriptions into structured graph triples, thereby populating the knowledge graph with missing associations and latent risk indicators. This unified AI-enriched KG supports multi-hop, causality-aware threat reasoning, improving visibility into complex attack chains and guiding data-driven mitigation. In simulated industrial scenarios, BRIDG-ICS scales well, reduces potential attack exposure, and can enhance cyber-physical system resilience in Industry 5.0 settings.
In this article, we explain the recent advance of subsampling methods in knowledge graph embedding (KGE) starting from the original one used in word2vec.
Large language models (LLMs) can compile weighted graphs on natural language data to enable automatic coherence-driven inference (CDI) relevant to red and blue team operations in cybersecurity. This represents an early application of automatic CDI that holds near- to medium-term promise for decision-making in cybersecurity and eventually also for autonomous blue team operations.
Knowledge bases, and their representations in the form of knowledge graphs (KGs), are naturally incomplete. Since scientific and industrial applications have extensively adopted them, there is a high demand for solutions that complete their information. Several recent works tackle this challenge by learning embeddings for entities and relations, then employing them to predict new relations among the entities. Despite their aggrandizement, most of those methods focus only on the local neighbors of a relation to learn the embeddings. As a result, they may fail to capture the KGs' context information by neglecting long-term dependencies and the propagation of entities' semantics. In this manuscript, we propose ÆMP (Attention-based Embeddings from Multiple Patterns), a novel model for learning contextualized representations by: (i) acquiring entities' context information through an attention-enhanced message-passing scheme, which captures the entities' local semantics while focusing on different aspects of their neighborhood; and (ii) capturing the semantic context, by leveraging the paths and their relationships between entities. Our empirical findings draw insights into how attention mechanisms can improve entities' context representation and how combining entities and semantic path contexts improves the general representation of entities and the relation predictions. Experimental results on several large and small knowledge graph benchmarks show that ÆMP either outperforms or competes with state-of-the-art relation prediction methods.
Modern cybersecurity threats are growing in complexity, targeting increasingly intricate & interconnected systems. To effectively defend against these evolving threats, security teams utilize automation & orchestration to enhance response efficiency and consistency. In that sense, cybersecurity playbooks are key enablers, providing a structured, reusable, and continuously improving approach to incident response, enabling organizations to codify requirements, domain expertise, and best practices and automate decision-making processes to the extent possible. The emerging Collaborative Automated Course of Action Operations (CACAO) standard defines a common machine-processable schema for cybersecurity playbooks, facilitating interoperability for their exchange and ensuring the ability to orchestrate and automate cybersecurity operations. However, despite its potential and the fact that it is a relatively new standardization work, there is a lack of tools to support its adoption and, in particular, the management & lifecycle development of CACAO playbooks, limiting their practical deployment. Motivated by the above, this work presents the design, development, and evaluation of a Knowledge Management System (KMS) for managing CACAO cybersecurity playbooks throughout their lifecycle, providing essential tools to streamline playbook management. Using open technologies & standards, the proposed approach fosters standards-based interoperability & enhances the usability of state-of-the-art cybersecurity orchestration & automation primitives. To encourage adoption, the resulting implementation is released as open-source, which, to the extent of our knowledge, comprises the first publicly available & documented work in this domain, supporting the broader uptake of CACAO playbooks & promoting the widespread use of interoperable automation and orchestration mechanisms in cybersecurity operations.
Open source intelligence is a powerful tool for cybersecurity analysts to gather information both for analysis of discovered vulnerabilities and for detecting novel cybersecurity threats and exploits. However, the scale of information on the internet that is relevant to information security is always increasing and is intractable for analysts to parse comprehensively. Therefore, methods for condensing the available open source intelligence and automatically developing connections between disparate sources of information are incredibly valuable. In this research, we present a system which constructs a Neo4j graph database formed by shared connections between open source intelligence text including blogs, cybersecurity bulletins, news sites, antivirus scans, social media posts (e.g., Reddit and Twitter), and threat reports. These connections comprise possible indicators of compromise (e.g., IP addresses, domains, hashes, email addresses, phone numbers), information on known exploits and techniques (e.g., CVEs and MITRE ATT&CK Technique IDs), and potential sources of information on cybersecurity exploits such as Twitter usernames. The construction of the database of potential IoCs is detailed, including the addition of machine learning and metadata which can be used to filter the data for a specific domain (for example, a specific natural language) when needed. Examples of utilizing the graph database for querying connections between known malicious IoCs and open source intelligence documents, including threat reports, are shown. We show three specific examples of interesting connections found in the graph database: the connections to a known exploited CVE, a known malicious IP address, and a malware hash signature.
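The idea of linking documents through shared indicators can be sketched with a few regular expressions and an in-memory index. This is an illustrative sketch only: the patterns are simplified, the document IDs and texts are made up, and a plain dictionary stands in for the Neo4j database described in the abstract.

```python
import re
from collections import defaultdict

# Simplified, hypothetical patterns for a handful of indicator types.
PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "cve": re.compile(r"\bCVE-\d{4}-\d{4,}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "attack_technique": re.compile(r"\bT\d{4}(?:\.\d{3})?\b"),
}

def extract_iocs(text):
    """Return the set of (kind, value) indicators found in text."""
    return {(kind, m) for kind, rx in PATTERNS.items() for m in rx.findall(text)}

def shared_indicators(docs):
    """Map each indicator to the documents mentioning it, keeping only
    indicators that connect at least two documents."""
    seen = defaultdict(set)
    for doc_id, text in docs.items():
        for ioc in extract_iocs(text):
            seen[ioc].add(doc_id)
    return {ioc: ids for ioc, ids in seen.items() if len(ids) > 1}

# Made-up example documents.
DOCS = {
    "blog-1": "Exploit for CVE-2021-44228 seen from 203.0.113.7 using T1190.",
    "tweet-9": "Scanning from 203.0.113.7, likely CVE-2021-44228",
}
links = shared_indicators(DOCS)
```

Each key in `links` corresponds to what would be a shared node between two document nodes in a graph database.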
Efforts have been recently made to construct ontologies for network security. The proposed ontologies are related to specific aspects of network security. Therefore, it is necessary to identify the specific aspects covered by existing ontologies for network security. A review and analysis of the principal issues, challenges, and the extent of progress related to distinct ontologies was performed. Each example was classified according to the typology of the ontologies for network security. Some aspects include identifying threats, intrusion detection systems (IDS), alerts, attacks, countermeasures, security policies, and network management tools. The research performed here proposes the use of three stages: 1. Inputs; 2. Processing; and 3. Outputs. The analysis resulted in the introduction of new challenges and aspects that may be used as the basis for future research. One major issue that was discovered identifies the need to develop new ontologies that relate to distinct aspects of network security, thereby facilitating management tasks.
Over the last years, Industrial Control Systems (ICS) have become increasingly exposed to a wide range of cyber-physical threats. Efficient models and techniques able to capture their complex structure and identify critical cyber-physical components are therefore essential. AND/OR graphs have proven very useful in this context as they are able to semantically grasp intricate logical interdependencies among ICS components. However, identifying critical nodes in AND/OR graphs is an NP-complete problem. In addition, ICS settings normally involve various cyber and physical security measures that simultaneously protect multiple ICS components in overlapping manners, which makes this problem even harder. In this paper, we present an extended security metric based on AND/OR hypergraphs which efficiently identifies the set of critical ICS components and security measures that should be compromised, with minimum cost (effort) for an attacker, in order to disrupt the operation of vital ICS assets. Our approach relies on MAX-SAT techniques, which we have incorporated in META4ICS, a Java-based security metric analyser for ICS. We also provide a thorough performance evaluation that shows the feasibility of our method. Finally, we illustrate our methodology through a case study in which we analyse the security posture of a realistic Water Transport Network (WTN).
Cyber threats are constantly evolving. Extracting actionable insights from unstructured Cyber Threat Intelligence (CTI) data is essential to guide cybersecurity decisions. Increasingly, organizations like Microsoft, Trend Micro, and CrowdStrike are using generative AI to facilitate CTI extraction. This paper addresses the challenge of automating the extraction of actionable CTI using advancements in Large Language Models (LLMs) and Knowledge Graphs (KGs). We explore the application of state-of-the-art open-source LLMs, including the Llama 2 series, Mistral 7B Instruct, and Zephyr for extracting meaningful triples from CTI texts. Our methodology evaluates techniques such as prompt engineering, the guidance framework, and fine-tuning to optimize information extraction and structuring. The extracted data is then utilized to construct a KG, offering a structured and queryable representation of threat intelligence. Experimental results demonstrate the effectiveness of our approach in extracting relevant information, with guidance and fine-tuning showing superior performance over prompt engineering. However, while our methods prove effective in small-scale tests, applying LLMs to large-scale data for KG construction and Link Prediction presents ongoing challenges.
As the cyber threat landscape grows increasingly complex and polymorphic, it becomes ever more critical to understand the enemy and its modus operandi for anticipatory threat reduction. Even though the cyber security community has developed a certain maturity in describing and sharing technical indicators for informing defense components, we still struggle with non-uniform, unstructured, and ambiguous higher-level information, such as the threat actor context, thereby limiting our ability to correlate with different sources to derive more contextual, accurate, and relevant intelligence. We see the need to overcome this limitation in order to increase our ability to produce and better operationalize cyber threat intelligence. Our research demonstrates how commonly agreed upon controlled vocabularies for characterizing threat actors and their operations can be used to enrich cyber threat intelligence and infer new information at a higher contextual level that is explicable and queryable. In particular, we present an ontological approach to automatically inferring the types of threat actors based on their personas, understanding their nature, and capturing polymorphism and changes in their behavior and characteristics over time. Such an approach not only enables interoperability by providing a structured way and means for sharing highly contextual cyber threat intelligence but also derives new information at machine speed and minimizes cognitive biases that manual classification approaches entail.
The escalating frequency and sophistication of cyber threats have increased the need for their comprehensive understanding. This paper explores the intersection of geopolitical dynamics, cyber threat intelligence analysis, and advanced detection technologies, with a focus on the energy domain. We leverage generative artificial intelligence to extract and structure information from raw cyber threat descriptions, enabling enhanced analysis. By conducting a geopolitical comparison of threat actor origins and target regions across multiple databases, we provide insights into trends within the general threat landscape. Additionally, we evaluate the effectiveness of cybersecurity tools (with particular emphasis on learning-based techniques) in detecting indicators of compromise for energy-targeted attacks. This analysis yields new insights, providing actionable information to researchers, policy makers, and cybersecurity professionals.
The ever-increasing number of cyber attacks requires cyber security and forensic specialists to detect, analyze, and defend against cyber threats in almost real time. In practice, dealing with such a large number of attacks in a timely manner is not possible without closely examining the attack features and taking corresponding intelligent defensive actions; this, in essence, defines the notion of cyber threat intelligence. However, such intelligence would not be possible without the aid of artificial intelligence, machine learning, and advanced data mining techniques to collect, analyze, and interpret cyber attack evidence. In this introductory chapter, we first discuss the notion of cyber threat intelligence and its main challenges and opportunities, and then briefly introduce the chapters of the book which either address the identified challenges or present opportunistic solutions to provide threat intelligence.
Cyber threat intelligence is the provision of evidence-based knowledge about existing or emerging threats. Benefits from threat intelligence include increased situational awareness, efficiency in security operations, and improved prevention, detection, and response capabilities. To process, correlate, and analyze vast amounts of threat information and data and derive intelligence that can be shared and consumed in meaningful times, it is required to utilize structured, machine-readable formats that incorporate the industry-required expressivity while at the same time being unambiguous. To a large extent, this is achieved with technologies like ontologies, schemas, and taxonomies. This research evaluates the coverage and high-level conceptual expressivity of cyber-threat-intelligence-relevant ontologies, sharing standards, and taxonomies pertaining to the who, what, why, where, when, and how elements of threats and attacks in addition to courses of action and technical indicators. The results confirm that little emphasis has been given to developing a comprehensive cyber threat intelligence ontology, with existing efforts being not thoroughly designed, non-interoperable, ambiguous, and lacking proper semantics and axioms for reasoning.
Cyber threat intelligence is a relatively new field that has grown from two distinct fields, cyber security and intelligence. As such, it draws knowledge from and mixes the two fields. Yet current scientific research on cyber threat intelligence is relatively scarce, which opens up a lot of opportunities. In this paper we define what cyber threat intelligence is and briefly review some of its aspects. Then, we analyze existing research fields that are much older than cyber threat intelligence but related to it. This opens up an opportunity to draw knowledge and methods from those older fields, and in that way advance cyber threat intelligence much faster than it would by following its own path. With such an approach we effectively give research directions for CTI.
Cyber threat attribution can play an important role in increasing resilience against digital threats. Recent research focuses on automating the threat attribution process and on integrating it with other efforts, such as threat hunting. To support increasing automation of the cyber threat attribution process, this paper proposes a modular architecture as an alternative to current monolithic automated approaches. The modular architecture can utilize opinion pools to combine the output of concrete attributors. The proposed solution increases the tractability of the threat attribution problem and offers increased usability and interpretability, as opposed to monolithic alternatives. In addition, a Pairing Aggregator is proposed as an aggregation method that forms pairs of attributors based on distinct features to produce intermediary results before finally producing a single Probability Mass Function (PMF) as output. The Pairing Aggregator sequentially applies both the logarithmic opinion pool and the linear opinion pool. An experimental validation suggests that the modular approach does not result in decreased performance and can even enhance precision and recall compared to monolithic alternatives. The results also suggest that the Pairing Aggregator can improve precision over the linear and logarithmic opinion pools. Furthermore, the improved k-accuracy in the experiment suggests that forensic experts can leverage the resulting PMF during their manual attribution processes to enhance their efficiency.
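The two opinion pools named in the abstract above can be illustrated with a minimal sketch. It assumes that each attributor outputs a PMF over the same set of candidate threat actors; the function names, the equal default weights, and the example actor labels are illustrative assumptions, and the paper's Pairing Aggregator (which chains both pools over feature-based pairs of attributors) is not reproduced here.

```python
import math

def linear_pool(pmfs, weights=None):
    """Linear opinion pool: weighted arithmetic mean of the input PMFs."""
    weights = weights or [1.0 / len(pmfs)] * len(pmfs)
    return {label: sum(w * pmf[label] for w, pmf in zip(weights, pmfs))
            for label in pmfs[0]}

def log_pool(pmfs, weights=None, eps=1e-12):
    """Logarithmic opinion pool: weighted geometric mean, renormalised."""
    weights = weights or [1.0 / len(pmfs)] * len(pmfs)
    unnormalised = {
        label: math.exp(sum(w * math.log(max(pmf[label], eps))
                            for w, pmf in zip(weights, pmfs)))
        for label in pmfs[0]
    }
    total = sum(unnormalised.values())
    return {label: value / total for label, value in unnormalised.items()}

# Hypothetical outputs of two attributors over three made-up actor labels.
attributor_a = {"actor_1": 0.7, "actor_2": 0.2, "actor_3": 0.1}
attributor_b = {"actor_1": 0.4, "actor_2": 0.5, "actor_3": 0.1}

linear_result = linear_pool([attributor_a, attributor_b])
log_result = log_pool([attributor_a, attributor_b])
```

As a general property of the two pools (not a result taken from the paper), the logarithmic pool penalises labels that any single attributor considers unlikely, whereas the linear pool averages out disagreements.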
Cyber attacks have become a serious threat to connected autonomous vehicles in intelligent transportation systems. Cyber threat intelligence, as the collection of cyber threat information, provides an ideal approach for responding to emerging vehicle cyber threats and enabling proactive security defense. Obtaining valuable information from enormous volumes of cybersecurity data using knowledge extraction technologies to achieve cyber threat intelligence modeling is an effective means to ensure automotive cybersecurity. Unfortunately, there is no existing cybersecurity dataset available for cyber threat intelligence modeling research in the automotive field. This paper reports the creation of a cyber threat intelligence corpus focusing on vehicle cybersecurity knowledge mining. This dataset, annotated using a joint labeling strategy, comprises 908 real automotive cybersecurity reports, containing 3678 sentences, 8195 security entities and 4852 semantic relations. We further conduct a comprehensive analysis of cyber threat intelligence mining algorithms based on this corpus. The proposed dataset will serve as a valuable resource for evaluating the performance of existing algorithms and advancing research in cyber threat intelligence modeling within the automotive field.
With the ever-changing landscape of cyber threats, identifying their origin has become paramount, surpassing the simple task of attack classification. Cyber threat attribution gives security analysts the insights they need to devise effective threat mitigation strategies. Such strategies empower enterprises to proactively detect and defend against future cyber-attacks. However, existing approaches exhibit limitations in accurately identifying threat actors, leading to low precision and a significant occurrence of false positives. Machine learning offers the potential to automate certain aspects of cyber threat attribution. The distributed nature of information regarding cyber threat actors and their intricate attack methodologies has hindered substantial progress in this domain. Cybersecurity analysts deal with an ever-expanding collection of cyber threat intelligence documents. While these documents hold valuable insights, their sheer volume challenges efficient organization and retrieval of pertinent information. To assist cybersecurity analysts, we propose a machine learning based approach featuring a visually interactive analytics tool named the Cyber-Attack Pattern Explorer (CAPE), designed to facilitate efficient information discovery by employing interactive visualization and mining techniques. In the proposed system, a non-parametric mining technique is used to create a dataset for identifying the attack patterns within cyber threat intelligence documents. These attack patterns align semantically with commonly employed themes, ensuring ease of interpretation. The extracted dataset is used to train the proposed machine learning algorithms, which enable the attribution of cyber threats to their respective actors.
Protecting organizations from Cyber exploits requires timely intelligence about Cyber vulnerabilities and attacks, referred to as threats. Cyber threat intelligence can be extracted from various sources including social media platforms where users publish the threat information in real time. Gathering Cyber threat intelligence from social media sites is a time-consuming task for security analysts that can delay timely response to emerging Cyber threats. We propose a framework for automatically gathering Cyber threat intelligence from Twitter by using a novelty detection model. Our model learns the features of Cyber threat intelligence from the threat descriptions published in public repositories such as Common Vulnerabilities and Exposures (CVE) and classifies a new unseen tweet as either normal or anomalous to Cyber threat intelligence. We evaluate our framework using a purpose-built data set of tweets from 50 influential Cyber security related accounts over twelve months (in 2018). Our classifier achieves an F1-score of 0.643 for classifying Cyber threat tweets and outperforms several baselines including binary classification models. Our analysis of the classification results suggests that Cyber threat relevant tweets on Twitter do not often include the CVE identifier of the related threats. Hence, it would be valuable to collect these tweets and associate them with the related CVE identifier for cyber security applications.
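The one-class idea in the abstract above (learn only from threat descriptions, then judge whether a new tweet resembles them) can be sketched as follows. This is not the paper's model: it swaps in a simple bag-of-words centroid with a cosine-similarity threshold, and the class name, the threshold value, and the training sentences (invented, CVE-style wording) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class CentroidNoveltyDetector:
    """Hypothetical stand-in for a novelty detector trained on one class."""

    def __init__(self, threshold=0.2):
        self.threshold = threshold  # assumed value, not tuned on real data
        self.centroid = Counter()

    def fit(self, threat_descriptions):
        # Accumulate term counts over all threat descriptions.
        for description in threat_descriptions:
            self.centroid.update(tokens(description))
        return self

    def score(self, text):
        return cosine(Counter(tokens(text)), self.centroid)

    def is_threat_related(self, text):
        # "Normal" with respect to the training class means threat-related.
        return self.score(text) >= self.threshold

# Invented training sentences written in the style of vulnerability descriptions.
training = [
    "buffer overflow in the web server allows remote attackers to execute arbitrary code",
    "sql injection vulnerability allows remote attackers to read sensitive data",
    "improper input validation allows attackers to cause a denial of service",
]
detector = CentroidNoveltyDetector().fit(training)

threat_tweet = "new vulnerability allows remote attackers to execute code on the server"
casual_tweet = "enjoying coffee with friends this morning"
```

Calling `detector.is_threat_related(...)` on each tweet then gives the normal-versus-anomalous decision; a realistic system would replace the centroid scorer with a trained novelty detection model and a tuned threshold.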
Developing intelligent, interoperable Cyber Threat Information (CTI) sharing technologies can help build strong defences against modern cyber threats. CTI sharing allows the community to exchange information about cybercriminals' threats, vulnerabilities, and countermeasures in order to defend themselves or detect malicious activity. A crucial requirement for success is that the data connected to cyber risks be understandable, organized, and of good quality, so that the receiving parties can grasp its content and utilize it effectively. This article describes an innovative cyber threat intelligence management platform (CTIMP) for industrial environments, one of the Cyber-pi project's significant elements. The suggested architecture, in particular, uses cyber knowledge from trusted public sources and integrates it with relevant information from the organization's supervised infrastructure in an entirely interoperable and intelligent way. When combined with an advanced visualization mechanism and user interface, these services provide administrators with the situational awareness they require while also allowing for extended cooperation, intelligent selection of advanced coping strategies, and a set of automated self-healing rules for dealing with threats.
In response to escalating cyber threats, the efficiency of Cyber Threat Intelligence (CTI) data collection has become paramount in ensuring robust cybersecurity. However, existing works encounter significant challenges in preprocessing large volumes of multilingual threat data, leading to inefficiencies in real-time threat analysis. This paper presents a systematic review of current techniques aimed at enhancing CTI data collection efficiency. Additionally, it proposes a conceptual model to further advance the effectiveness of threat intelligence feeds. Following the PRISMA guidelines, the review examines relevant studies from the Scopus database, highlighting the critical role of artificial intelligence (AI) and machine learning models in optimizing CTI data preprocessing. The findings underscore the importance of AI-driven methods, particularly supervised and unsupervised learning, in significantly improving the accuracy of threat detection and event extraction, thereby strengthening cybersecurity. Furthermore, the study identifies a gap in the existing research and introduces the XBC conceptual model, which integrates XLM-RoBERTa, BiGRU, and CRF and was specifically developed to address this gap. This paper contributes conceptually to the field by providing a detailed analysis of current CTI data collection techniques and introducing an innovative conceptual model to enhance future threat intelligence capabilities.
Understanding the modus operandi of adversaries aids organizations in employing efficient defensive strategies and sharing intelligence in the community. This knowledge is often present in unstructured natural language text within threat analysis reports. A translation tool is needed to interpret the modus operandi explained in the sentences of the threat report and translate it into a structured format. This research introduces a methodology named TTPXHunter for the automated extraction of threat intelligence in terms of Tactics, Techniques, and Procedures (TTPs) from finished cyber threat reports. It leverages cyber domain-specific state-of-the-art natural language processing (NLP) to augment sentences for minority class TTPs and to significantly refine the pinpointing of TTPs in threat analysis reports. The knowledge of threat intelligence in terms of TTPs is essential for comprehensively understanding cyber threats and enhancing detection and mitigation strategies. We create two datasets: an augmented sentence-TTP dataset of 39,296 samples and a report-to-TTP dataset of 149 real-world cyber threat intelligence reports. Further, we evaluate TTPXHunter on the augmented sentence dataset and the cyber threat reports. TTPXHunter achieves the highest performance, a 92.42% F1-score, on the augmented dataset, and it also outperforms existing state-of-the-art solutions in TTP extraction by achieving an F1-score of 97.09% when evaluated over the report dataset. TTPXHunter significantly improves cybersecurity threat intelligence by offering quick, actionable insights into attacker behaviors. This advancement automates threat intelligence analysis, providing a crucial tool for cybersecurity professionals fighting cyber threats.
Log-based cyber threat hunting has emerged as an important solution to counter sophisticated attacks. However, existing approaches require non-trivial efforts of manual query construction and have overlooked the rich external threat knowledge provided by open-source Cyber Threat Intelligence (OSCTI). To bridge the gap, we propose ThreatRaptor, a system that facilitates threat hunting in computer systems using OSCTI. Built upon system auditing frameworks, ThreatRaptor provides (1) an unsupervised, light-weight, and accurate NLP pipeline that extracts structured threat behaviors from unstructured OSCTI text, (2) a concise and expressive domain-specific query language, TBQL, to hunt for malicious system activities, (3) a query synthesis mechanism that automatically synthesizes a TBQL query for hunting, and (4) an efficient query execution engine to search the big audit logging data. Evaluations on a broad set of attack cases demonstrate the accuracy and efficiency of ThreatRaptor in practical threat hunting.
Cybersecurity has become a crucial concern in the field of connected autonomous vehicles. Cyber threat intelligence (CTI), as the collection of cyber threat information, offers an ideal way of responding to emerging cyber threats and realizing proactive security defense. However, instant analysis and modeling of vehicle cybersecurity data is a fundamental challenge because of its complex and specialized context. In this paper, we propose an automotive CTI modeling framework, Actim, to extract and analyse the interrelated relationships among cyber threat elements. Specifically, we first design a vehicle security-safety conceptual ontology model to depict various threat entity classes and their relations. Then, we manually annotate the first automobile CTI corpus by using real cybersecurity data, which comprises 908 threat intelligence texts, including 8195 entities and 4852 relationships. To effectively extract cyber threat entities and their relations, we propose an automotive CTI mining model based on cross-sentence context. Experimental results show that the proposed BERT-DocHiatt-BiLSTM-LSTM model exceeds the performance of existing methods. Finally, we define entity-relation matching rules and create a CTI knowledge graph that structurally fuses various elements of cyber threats. The Actim framework enables mining the intrinsic connections among threat entities, providing valuable insight into the evolving cyber threat landscape.
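The final step described above, applying entity-relation matching rules to build a knowledge graph, can be sketched in a few lines. The entity types, relation names, and example entities below are hypothetical placeholders, not the Actim ontology; the sketch only assumes that a rule is an allowed (head type, relation, tail type) signature and that extracted relations violating every rule are set aside.

```python
from collections import defaultdict

# Hypothetical matching rules: allowed (head type, relation, tail type).
ALLOWED_SIGNATURES = {
    ("Attacker", "uses", "AttackMethod"),
    ("AttackMethod", "exploits", "Vulnerability"),
    ("Vulnerability", "affects", "VehicleComponent"),
}

def build_graph(entity_types, candidate_relations):
    """Keep only candidate triples whose type signature matches a rule.

    entity_types: dict mapping entity name -> entity type
    candidate_relations: iterable of (head, relation, tail) triples
    Returns (adjacency dict, list of rejected triples).
    """
    graph = defaultdict(list)
    rejected = []
    for head, relation, tail in candidate_relations:
        signature = (entity_types.get(head), relation, entity_types.get(tail))
        if signature in ALLOWED_SIGNATURES:
            graph[head].append((relation, tail))
        else:
            rejected.append((head, relation, tail))
    return dict(graph), rejected

# Made-up extraction output for illustration only.
entity_types = {
    "researcher_x": "Attacker",
    "can_injection": "AttackMethod",
    "unauthenticated_bus": "Vulnerability",
    "brake_ecu": "VehicleComponent",
}
candidates = [
    ("researcher_x", "uses", "can_injection"),
    ("can_injection", "exploits", "unauthenticated_bus"),
    ("unauthenticated_bus", "affects", "brake_ecu"),
    ("brake_ecu", "uses", "researcher_x"),  # violates the rules
]
graph, rejected = build_graph(entity_types, candidates)
```

Filtering by type signature is one simple way to keep noisy extractions out of the graph; rejected triples can be logged for manual review rather than silently dropped.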
Cyber threat intelligence (CTI) is crucial in today's cybersecurity landscape, providing essential insights to understand and mitigate ever-evolving cyber threats. The recent rise of Large Language Models (LLMs) has shown potential in this domain, but concerns about their reliability, accuracy, and hallucinations persist. While existing benchmarks provide general evaluations of LLMs, there are no benchmarks that address the practical and applied aspects of CTI-specific tasks. To bridge this gap, we introduce CTIBench, a benchmark designed to assess LLMs' performance in CTI applications. CTIBench includes multiple datasets focused on evaluating knowledge acquired by LLMs in the cyber-threat landscape. Our evaluation of several state-of-the-art models on these tasks provides insights into their strengths and weaknesses in CTI contexts, contributing to a better understanding of LLM capabilities in CTI.
Log-based cyber threat hunting has emerged as an important solution to counter sophisticated cyber attacks. However, existing approaches require non-trivial efforts of manual query construction and have overlooked the rich external knowledge about threat behaviors provided by open-source Cyber Threat Intelligence (OSCTI). To bridge the gap, we build ThreatRaptor, a system that facilitates cyber threat hunting in computer systems using OSCTI. Built upon mature system auditing frameworks, ThreatRaptor provides (1) an unsupervised, light-weight, and accurate NLP pipeline that extracts structured threat behaviors from unstructured OSCTI text, (2) a concise and expressive domain-specific query language, TBQL, to hunt for malicious system activities, (3) a query synthesis mechanism that automatically synthesizes a TBQL query from the extracted threat behaviors, and (4) an efficient query execution engine to search the big system audit logging data.
With the proliferation of digitization and its usage in critical sectors, it is necessary to include information about the occurrence and assessment of cyber threats in an organization's threat mitigation strategy. This Cyber Threat Intelligence (CTI) is becoming increasingly important, or rather necessary, for critical national and industrial infrastructures. Current CTI solutions are rather federated and unsuitable for sharing threat information from low-power IoT devices. This paper presents a taxonomy and analysis of the CTI frameworks and CTI exchange platforms available today. It proposes a new CTI architecture that relies on the MISP Threat Intelligence Sharing Platform, customized for and focused on the IoT environment. The paper also introduces a tailored version of STIX (which we call tinySTIX), one of the most prominent standards adopted for CTI data modeling, optimized for low-power IoT devices using new lightweight encoding and cryptography solutions. The proposed CTI architecture will be very beneficial for securing IoT networks, especially those operating in harsh and adversarial environments.
The exponential growth of cyber threat knowledge, exemplified by the expansion of databases such as MITRE-CVE and NVD, poses significant challenges for cyber threat analysis. Security professionals are increasingly burdened by the sheer volume and complexity of information, creating an urgent need for effective tools to navigate, synthesize, and act on large-scale data to counter evolving threats proactively. However, conventional threat intelligence tools often fail to scale with the dynamic nature of this data and lack the adaptability to support diverse threat intelligence tasks. In this work, we introduce CYLENS, a cyber threat intelligence copilot powered by large language models (LLMs). CYLENS is designed to assist security professionals throughout the entire threat management lifecycle, supporting threat attribution, contextualization, detection, correlation, prioritization, and remediation. To ensure domain expertise, CYLENS integrates knowledge from 271,570 threat reports into its model parameters and incorporates six specialized NLP modules to enhance reasoning capabilities. Furthermore, CYLENS can be customized to meet the unique needs of different organizations, underscoring its adaptability. Through extensive evaluations, we demonstrate that CYLENS consistently outperforms industry-leading LLMs and state-of-the-art cybersecurity agents. By detailing its design, development, and evaluation, this work provides a blueprint for leveraging LLMs to address complex, data-intensive cybersecurity challenges.
Traditional fact checking by experts and analysts cannot keep pace with the volume of newly created information. It is important and necessary, therefore, to enhance our ability to computationally determine whether some statement of fact is true or false. We view this problem as a link-prediction task in a knowledge graph, and present a discriminative path-based method for fact checking in knowledge graphs that incorporates connectivity, type information, and predicate interactions. Given a statement S of the form (subject, predicate, object), for example, (Chicago, capitalOf, Illinois), our approach mines discriminative paths that alternatively define the generalized statement (U.S. city, predicate, U.S. state) and uses the mined rules to evaluate the veracity of statement S. We evaluate our approach by examining thousands of claims related to history, geography, biology, and politics using a public, million node knowledge graph extracted from Wikipedia and PubMedDB. Not only does our approach significantly outperform related models, we also find that the discriminative predicate path model is easily interpretable and provides sensible reasons for the final determination.
Knowledge graph (KG) embedding aims at learning latent representations for the entities and relations of a KG in continuous vector spaces. An empirical observation is that the head (tail) entities connected by the same relation often share similar semantic attributes (specifically, they often belong to the same category) no matter how far away they are from each other in the KG; that is, they share global semantic similarities. However, many existing methods derive KG embeddings based on local information and therefore fail to effectively capture such global semantic similarities among entities. To address this challenge, we propose a novel approach, which introduces a set of virtual nodes called relational prototype entities to represent the prototypes of the head and tail entities connected by the same relations. By enforcing the entities' embeddings to be close to their associated prototypes' embeddings, our approach can effectively encourage the global semantic similarities of entities connected by the same relation, even when they are far apart in the KG. Experiments on the entity alignment and KG completion tasks demonstrate that our approach significantly outperforms recent state-of-the-art methods.
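The idea of pulling entity embeddings toward per-relation prototypes can be illustrated with a small penalty term. The abstract does not give the loss, so the squared Euclidean distance, the separate head and tail prototype tables, and all names and vectors below are assumptions for illustration; in training, such a term would typically be added to a base KG embedding loss with some weight.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def prototype_penalty(triples, entity_vecs, head_protos, tail_protos):
    """Sum of distances from each entity to the prototype of its role.

    triples: iterable of (head, relation, tail)
    entity_vecs: dict entity -> embedding (list of floats)
    head_protos / tail_protos: dict relation -> prototype embedding
    """
    total = 0.0
    for head, relation, tail in triples:
        total += squared_distance(entity_vecs[head], head_protos[relation])
        total += squared_distance(entity_vecs[tail], tail_protos[relation])
    return total

# Toy two-dimensional embeddings, invented for the example.
entity_vecs = {"paris": [1.0, 0.0], "france": [0.0, 1.0]}
head_protos = {"capital_of": [1.0, 1.0]}  # prototype of entities that act as heads
tail_protos = {"capital_of": [0.0, 0.0]}  # prototype of entities that act as tails
triples = [("paris", "capital_of", "france")]

penalty = prototype_penalty(triples, entity_vecs, head_protos, tail_protos)
```

Minimising this penalty moves all heads of a relation toward one point and all tails toward another, which is one way to encode the "same relation, same category" observation regardless of graph distance.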
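This survey shows that knowledge-graph-based techniques for cybersecurity threat early warning have formed a complete system, spanning automated construction at the lower layer to intelligent decision-making at the upper layer. The core trends are: 1) greater automation, using LLMs and NLP to accurately extract intelligence from massive heterogeneous sources; 2) deeper reasoning, using GNNs, time-aware methods, and neuro-symbolic AI to strengthen the prediction and explanation of complex attack chains; 3) more diverse scenario adaptation, building cyber-physical coupled defense models for vertical domains such as ICS and IoT; 4) attention to security and robustness, with research beginning on adversarial attacks against knowledge graphs and on assuring intelligence quality. The deep integration of KGs and LLMs (KG+LLM) is becoming a key path to proactive, accurate, and explainable threat early warning.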