A Taxonomy of Chinese Emergency Misinformation Detection with Multi-Source Data
Multi-Source Multimodal Misinformation Detection Techniques
Focuses on the deep fusion of heterogeneous multi-source data, including text, visual content (images/video), and social context, and examines cross-modal interaction, feature representation learning, and multimodal alignment strategies to improve detection accuracy.
- A multi-level fusion-based framework for multimodal fake news classification using semantic feature extraction(Fakhar Abbas, Araz Taeihagh, 2025, International Journal of Machine Learning and Cybernetics)
- Evidence-Aware Multimodal Chinese Social Media Rumor Detection(Kaixuan Wu, Donglin Cao, 2024, ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP))
- A social context-aware graph-based multimodal attentive learning framework for disaster content classification during emergencies(Shahid Shafi Dar, M. Z. Rehman, Karan Bais, Mohammed Abdul Haseeb, Nagendra Kumar, 2024, Expert Systems with Applications)
- Rumor detection model based on multimodal machine learning(Lulu Gao, Yali Gao, Jie Yuan, Xiaoyong Li, 2023, Second International Conference on Algorithms, Microchips, and Network Applications (AMNA 2023))
- Rumor Detection Mechanism for Multi-Modal Information in Social Media(Yong Liu, Xiaoling Li, 2024, 2024 3rd International Conference on Artificial Intelligence and Computer Information Technology (AICIT))
- Image-Text Out-Of-Context Detection Using Synthetic Multimodal Misinformation(Fatma Shalabi, H. Nguyen, Hichem Felouat, Ching-Chun Chang, Isao Echizen, 2023, 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC))
- Cross-Platform Multimodal Misinformation: Taxonomy, Characteristics and Detection for Textual Posts and Videos(Nicholas Micallef, Marcelo Sandoval-Castañeda, Adir Cohen, M. Ahamad, Srijan Kumar, Nasir D. Memon, 2022, Proceedings of the International AAAI Conference on Web and Social Media)
- A Multimodal Framework for Real-Time Disaster Event Classification and Prioritization Using Social Media(Thai Dinh Kim, Khanh Luong, O. T. Tran, T. An, D. T. N. Anh, 2025, 2025 International Conference on Advanced Technologies for Communications (ATC))
- Rumor Classification through a Multimodal Fusion Framework and Ensemble Learning(Abderrazek Azri, Cécile Favre, Nouria Harbi, J. Darmont, Camille Noûs, 2022, Information Systems Frontiers)
- Multimodal Pipeline for Collection of Misinformation Data from Telegram(José Sosa, S. Sharoff, 2022, International Conference on Language Resources and Evaluation)
- A Heuristic Framework for Sources Detection in Social Networks via Graph Convolutional Networks(Le Cheng, Peican Zhu, Chao Gao, Zhen Wang, Xuelong Li, 2024, IEEE Transactions on Systems, Man, and Cybernetics: Systems)
- Cross‐Social Media Platform Emergency Knowledge Collaboration Based on Multimodal Heterogeneous Information Networks(Wei Zhou, Lu An, Ruilian Han, Gang Li, Chuanming Yu, 2025, Proceedings of the Association for Information Science and Technology)
- Multi-modal Knowledge-aware Event Memory Network for Social Media Rumor Detection(Huaiwen Zhang, Quan Fang, Shengsheng Qian, Changsheng Xu, 2019, Proceedings of the 27th ACM International Conference on Multimedia)
- Contrastive Learning for Multimodal Classification of Crisis related Tweets(Bishwas Mandal, Sarthak Khanal, Doina Caragea, 2024, Proceedings of the ACM Web Conference 2024)
- SmoothDectector: A Smoothed Dirichlet Multimodal Approach for Combating Fake News on Social Media(Akinlolu Oluwabusayo Ojo, Fatma Najar, Nuha Zamzami, Hanen T. Himdi, Nizar Bouguila, 2025, IEEE Access)
- Multi-modal affine fusion network for social media rumor detection(Bo Fu, Jie Sui, 2022, PeerJ Computer Science)
- Dual-Key Prompt Learning Network for Enhanced Multi-Modal Rumor Detection(Qiuyue Wei, Jingyi Zhang, Mingjie Zhang, 2025, 2025 44th Chinese Control Conference (CCC))
- Harmfully Manipulated Images Matter in Multimodal Misinformation Detection(Bing Wang, Shengsheng Wang, C. Li, Renchu Guan, Ximing Li, 2024, Proceedings of the 32nd ACM International Conference on Multimedia)
- Augmenting Multimodal Content Representation with Transformers for Misinformation Detection(Jenq-Haur Wang, M. Norouzi, Shu Ming Tsai, 2024, Big Data and Cognitive Computing)
- Fine-Grained Discrepancy Contrastive Learning for Robust Fake News Detection(Junwei Yin, Min Gao, Kai Shu, Jia Wang, Yinqiu Huang, Wei Zhou, 2024, ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP))
- Multimodal Misinformation Detection Using Early Fusion of Linguistic, Visual, and Social Features(Gautam Kishore Shahi, 2025, Companion Publication of the 17th ACM Web Science Conference 2025)
- ISMAF: Intrinsic-Social Modality Alignment and Fusion for Multimodal Rumor Detection(Zihao Yu, Xiang Li, Jing Zhang, 2025, Neurocomputing)
- EEGCN: Event Evolutionary Graph Comparison Network for Multi-Modal Fake News Detection(Xiang Dai, Yue Wang, Xin Liu, Dan Song, Zhaobo Juan, Shengze Wang, Liangyu Lu, Haibo Liu, Jing Shen, 2023, Proceedings of the 2023 6th International Conference on Machine Learning and Natural Language Processing)
- Hybrid Machine Learning Framework for Secure Information Verification in Social Media(Neelam Sanjeev Kumar, S. S, Aaron Mammen Enoch, K. T, William Carey R, 2025, 2025 IEEE 17th International Conference on Computational Intelligence and Communication Networks (CICN))
- Deep visual-linguistic fusion network considering cross-modal inconsistency for rumor detection(Yang Yang, Ran Bao, Weili Guo, De-Chuan Zhan, Yilong Yin, Jian Yang, 2023, Science China Information Sciences)
- Rumor Detection Based on Cross-modal Information-enhanced Fusion Network(Zhiwei Guo, Zhenguo Yang, Dahuang Liu, 2024, 2024 16th International Conference on Advanced Computational Intelligence (ICACI))
- Enhancing video rumor detection through multimodal deep feature fusion with time-sync comments(Ming Yin, Wei Chen, Dan Zhu, Jijiao Jiang, 2025, Information Processing & Management)
- MRML: Multimodal Rumor Detection by Deep Metric Learning(Liwen Peng, Songlei Jian, Dongsheng Li, Siqi Shen, 2023, ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP))
- Public Crisis Events Tweet Classification Based on Multimodal Cycle-GAN(Jinyan Zhou, Xingang Wang, Jiandong Lv, Ning Liu, Hong Zhang, Rui Cao, Xiaoyu Liu, Xiaomin Li, 2023, 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC))
- Discourse-Driven Detection of Multimodal Misinformation(Adrián Girón, Javier Huertas-Tato, David Camacho, 2026, IEEE Transactions on Emerging Topics in Computational Intelligence)
- Machine Learning-Based Classification of Multi-modal Fact-Checked Misinformation on Social Networks(Javeriya Naaz I. Syed, R. Keole, 2025, EPJ Web of Conferences)
- Impact of Misinformation on Reddit User Behavior: A Multimodal Transformer-Based Detection Model(S. K. Rao, K. S. Vineet, D. Adithya, S. H. Kishore Kumar, Akshay Paramesha, 2024, 2024 Asian Conference on Intelligent Technologies (ACOIT))
- MMCoVaR: multimodal COVID-19 vaccine focused data repository for fake news detection and a baseline architecture for classification(Mingxuan Chen, Xinqiao Chu, K. P. Subbalakshmi, 2021, Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining)
- 'Humor, Art, or Misinformation?': A Multimodal Dataset for Intent-Aware Synthetic Image Detection(Anastasios Skoularikis, Stefanos-Iordanis Papadopoulos, Symeon Papadopoulos, P. Petrantonakis, 2025, Proceedings of the 2nd International Workshop on Diffusion of Harmful Content on Online Web)
- Hybrid-CNN Classifier for Fake News Detection: a Multimodal Defense Against Digital Misinformation(S. K., Jayashrinidhi Vijayaraghavan, Shruthi G, Sri Balaji S, Ranjith Kumar S, S. P, 2025, 2025 Second International Conference on Networks and Soft Computing (ICNSoC))
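Most fusion frameworks listed above share one skeleton: encode each modality separately, then combine the representations before a classifier scores the fused vector. A minimal, stdlib-only late-fusion sketch; the modality weights, toy features, and classifier coefficients are illustrative placeholders, not any cited paper's model:

```python
# Minimal late-fusion sketch: per-modality feature vectors are scaled by
# fixed (illustrative) modality weights, concatenated, and scored by a
# logistic classifier over the fused representation.
from math import exp

def fuse(text_feat, image_feat, social_feat, weights=(0.5, 0.3, 0.2)):
    """Weighted concatenation of per-modality feature vectors."""
    wt, wi, ws = weights
    return ([wt * x for x in text_feat]
            + [wi * x for x in image_feat]
            + [ws * x for x in social_feat])

def rumor_score(fused, coef, bias=0.0):
    """Logistic score over the fused vector (coef is a toy classifier)."""
    z = sum(c * x for c, x in zip(coef, fused)) + bias
    return 1.0 / (1.0 + exp(-z))

# Toy 2-dim features per modality; a real system would use BERT / CNN /
# graph encoders as in the papers above.
fused = fuse([0.9, 0.1], [0.4, 0.6], [0.7, 0.3])
score = rumor_score(fused, coef=[1.0] * len(fused))
```

The cited works differ mainly in where this combination happens (early vs. late fusion) and in replacing the fixed weights with learned cross-modal attention.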
Modeling Propagation Structure and Contextual Interaction in Social Networks
Based on graph neural networks, propagation trees, and user interaction behavior, this line of work studies the topological diffusion patterns of misinformation across networks, stance evolution, and multi-view spatiotemporal propagation characteristics.
- Dual Graph Networks with Synthetic Oversampling for Imbalanced Rumor Detection on Social Media(Yen-Wen Lu, Chih-Yao Chen, Cheng-Te Li, 2024, Companion Proceedings of the ACM Web Conference 2024)
- TRGCN: A Hybrid Framework for Social Network Rumor Detection(Yanqin Yan, Suiyu Zhang, Dingguo Yu, Yijie Zhou, Cheng-Jun Wang, K. Shang, 2026, Humanities and Social Sciences Communications)
- RP-DNN: A Tweet Level Propagation Context Based Deep Neural Networks for Early Rumor Detection in Social Media(Jie Gao, Sooji Han, Xingyi Song, F. Ciravegna, 2020, International Conference on Language Resources and Evaluation)
- Graph-Aware Multi-View Fusion for Rumor Detection on Social Media(Yang Wu, Jing Yang, Liming Wang, Zhen Xu, 2024, ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP))
- A Weakly Supervised Propagation Model for Rumor Verification and Stance Detection with Multiple Instance Learning(R. Yang, Jing Ma, Hongzhan Lin, Wei Gao, 2022, Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval)
- Propagation Tree Is Not Deep: Adaptive Graph Contrastive Learning Approach for Rumor Detection(Chaoqun Cui, Caiyan Jia, 2024, Proceedings of the AAAI Conference on Artificial Intelligence)
- LLM-Enhanced Multiple Instance Learning for Joint Rumor and Stance Detection with Social Context Information(Ruichao Yang, Jing Ma, Wei Gao, Hongzhan Lin, 2025, ACM Transactions on Intelligent Systems and Technology)
- Twitter Fake News Detection by Using Xlnet Model(Senthil Athithan, Savya Sachi, Ajay Kumar Singh, Arpit Jain, Divya, Yogesh Kumar Sharma, 2023, 2023 3rd International Conference on Technological Advancements in Computational Sciences (ICTACS))
- Rumor detection on social networks based on Temporal Tree Transformer(Sirong Wu, Yuhui Deng, Junjie Liu, Xi Luo, Gengchen Sun, 2025, PLOS ONE)
- Deep Feature Fusion for Rumor Detection on Twitter(Zhirui Luo, Qingqing Li, Jun Zheng, 2021, IEEE Access)
- Leveraging Dual GNNs and BERT for Context-Aware Rumor Detection in Social Networks(Ayesha Seerat, Muhammad Wasim, 2025, 2025 International Conference on Innovation in Artificial Intelligence and Internet of Things (AIIT))
- Attention Based Neural Architecture for Rumor Detection with Author Context Awareness(Sansiri Tarnpradab, K. Hua, 2018, 2018 Thirteenth International Conference on Digital Information Management (ICDIM))
- CLFFRD: Curriculum Learning and Fine-grained Fusion for Multimodal Rumor Detection(Fan Xu, Lei Zeng, Bowei Zou, AiTi Aw, Huan Rong, 2024, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024))
- Experimental Design for Evaluating the Effectiveness of Health Information Dissemination Based on the PHEME Dataset(Wenjie Li, Xuan Tan, Linze Li, 2025, 2025 IEEE 7th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC))
- Social Network Rumor Detection Method Combining Dual-Attention Mechanism With Graph Convolutional Network(Xiao-Yang Liu, Zhengyang Zhao, Yihao Zhang, Chao Liu, Fan Yang, 2023, IEEE Transactions on Computational Social Systems)
- A Rumor Propagation Model Considering Media Effect and Suspicion Mechanism under Public Emergencies(Shan Yang, Shihan Liu, Kai Su, Jianhong Chen, 2024, Mathematics)
- eventAI at SemEval-2019 Task 7: Rumor Detection on Social Media by Exploiting Content, User Credibility and Propagation Information(Quanzhi Li, Qiong Zhang, Luo Si, 2019, Proceedings of the 13th International Workshop on Semantic Evaluation)
- Adaptive Weighted Ensemble Deep Learning for Robust Rumor Detection on Social Media(Yanran Ren, Yehan Liu, Jie Sui, 2023, 2023 4th International Conference on Computers and Artificial Intelligence Technology (CAIT))
- Rumor Detection Model with Deep Cross Fusion of Text and Propagation Graph Structures(Kai Xu, Junfei He, 2023, 2023 3rd International Conference on Computer Science, Electronic Information Engineering and Intelligent Control Technology (CEI))
- Filter-based Stance Network for Rumor Verification(Jun Li, Yi Bin, Yunshan Ma, Yang Yang, Z. Huang, Tat-Seng Chua, 2024, ACM Transactions on Information Systems)
- A Novel and High-Accuracy Rumor Detection Approach using Kernel Subtree and Deep Learning Networks(Ziyu Wei, Xi Xiao, Guangwu Hu, Bin Zhang, Qing Li, Shutao Xia, 2021, 2021 International Joint Conference on Neural Networks (IJCNN))
- Propagation Structure Fusion for Rumor Detection Based on Node-Level Contrastive Learning(Jiachen Ma, Yong Liu, Meng Han, Chunqiang Hu, Zhaojie Ju, 2023, IEEE Transactions on Neural Networks and Learning Systems)
- Bi-directional temporal graph attention networks for rumor detection in online social networks(Qiao Zhou, Xingpeng Lin, Li Xu, Yan Sun, 2025, Computing)
- Graph Enhanced BERT for Stance-Aware Rumor Verification on Social Media(Kai Ye, Yangheran Piao, Kun Zhao, Xiaohui Cui, 2021, Lecture Notes in Computer Science)
- STANKER: Stacking Network based on Level-grained Attention-masked BERT for Rumor Detection on Social Media(Dongning Rao, Xin Miao, Zhihua Jiang, Ran Li, 2021, Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing)
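The propagation-based methods above typically model a conversation as a tree rooted at the source post, with reply relations as edges, and feed topological signals to the detector. A small illustrative sketch of such feature extraction; the specific feature set is a hypothetical example, not drawn from any single paper:

```python
from collections import defaultdict, deque

def tree_features(edges, root):
    """Depth and branching statistics of a propagation tree.

    edges: list of (parent, child) reply pairs; root: id of the source
    post. Returns max depth, node count, and mean out-degree of non-leaf
    nodes, the kind of topological signals propagation-based detectors
    consume alongside text features.
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)
    depth = {root: 0}
    queue = deque([root])
    while queue:                      # BFS from the source post
        node = queue.popleft()
        for child in children[node]:
            depth[child] = depth[node] + 1
            queue.append(child)
    internal = [len(children[n]) for n in depth if children[n]]
    return {
        "max_depth": max(depth.values()),
        "size": len(depth),
        "mean_branching": sum(internal) / len(internal) if internal else 0.0,
    }

# Source post 'p0' with two replies, one of which gets a further reply.
feats = tree_features([("p0", "r1"), ("p0", "r2"), ("r1", "r3")], "p0")
```

GNN-based detectors replace these hand-crafted statistics with message passing over the same tree, but the data structure is identical.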
Semantic Detection Methods Driven by Pre-Trained and Large Language Models
Leverages pre-trained language models (BERT/RoBERTa) and the prompt-learning, instruction-tuning, and zero-shot capabilities of large language models (LLMs) to handle complex semantic understanding and emergency classification in low-resource scenarios.
- Rumor Detection on Social Media with Crowd Intelligence and ChatGPT-Assisted Networks(Chang Yang, Peng Zhang, Wenbo Qiao, Hui Gao, Jiaming Zhao, 2023, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing)
- Collaborate large and small language models for multi-modal emergency rumor detection(Youcheng Yan, Jinshuo Liu, Juan Deng, Junyang Li, Lina Wang, Jeff Z. Pan, 2025, Neural Networks)
- Few-shot rumor detection based on prompt learning and meta learning(Zhaorui Fan, Xin Chen, Yaling Xun, 2024, Fifth International Conference on Computer Communication and Network Security (CCNS 2024))
- Overcoming Language Disparity in Online Content Classification with Multimodal Learning(Gaurav Verma, Rohit Mujumdar, Zijie J. Wang, M. Choudhury, Srijan Kumar, 2022, International Conference on Web and Social Media)
- Evaluating the Zero-Shot Chinese Rumor Detection Capability of Large Language Models Based on Chain-of-Thought Prompting(Rui Guo, Yu Yang, Zhuoyun Yang, Shang Wu, J. Zhou, 2025, Proceedings of the 2025 International Conference on Computer Technology, Digital Media and Communication)
- Detecting Medical Misinformation for Clinical Decision Support: A Robust Neural Approach with Consistency Regularization(Bingqiang Huo, Fang Li, Sudan Xin, Hongjun Wang, 2026, European Heart Journal Supplements)
- An Improved FakeBERT for Fake News Detection(Arshad Ali, M. Gulzar, 2023, Applied Computer Systems)
- Early Detection of Emergencies in Social Networks Based on BERT-BiLSTM Model(Yong Luo, Haoyu Li, 2025, 2025 IEEE 12th Joint International Information Technology and Artificial Intelligence Conference (ITAIC))
- ABiLSTM with BERT Embedding for Classification of Imbalanced COVID-19 Rumors(Rakesh Dutta, Mukta Majumder, 2024, CURRENT APPLIED SCIENCE AND TECHNOLOGY)
- Social Network Public Opinion Analysis Using BERT-BMA in Big Data Environment(Hanqing Sun, Zhengyang Liu, Weimin Lian, Guizhi Wang, 2024, International Journal of Information Technologies and Systems Approach)
- An Explainable Artificial Intelligence Text Classifier for Suicidality Prediction in Youth Crisis Text Line Users: Development and Validation Study(Julia Thomas, Antonia Lucht, Jacob Segler, Richard Wundrack, M. Miché, R. Lieb, Lars Kuchinke, Gunther Meinlschmidt, 2024, JMIR Public Health and Surveillance)
- Continually Detection, Rapidly React: Unseen Rumors Detection Based on Continual Prompt-Tuning(Yuhui Zuo, Wei Zhu, Guoyong Cai, 2022, International Conference on Computational Linguistics)
- CrisisSense: A Real-Time Framework for Event Detection and Misinformation Filtering in Social Media Streams(S. Ganeshmoorthy, Mouniga. K, Abinaya. C. M, Ilakkiya. G, Selvimuthu. J, Vikash Kumar. V. S, 2025, 2025 International Conference on Sustainable Communication Networks and Application (ICSCN))
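For the zero-shot chain-of-thought line of work above (e.g., CoT prompting for Chinese rumor detection), the core engineering artifact is a prompt template plus a verdict parser. A hedged sketch: the template wording and the output convention are invented for illustration, and the actual LLM call is omitted:

```python
def build_cot_prompt(post: str) -> str:
    """Assemble a zero-shot chain-of-thought prompt for rumor judgment.

    The template wording is illustrative; papers in this area tune both
    the reasoning instructions and the output format to the target LLM.
    """
    return (
        "You are a fact-checking assistant for Chinese social media.\n"
        f"Post: {post}\n"
        "Let's think step by step: check the claim's source, internal "
        "consistency, and plausibility, then answer on the final line "
        "with exactly 'Verdict: rumor' or 'Verdict: non-rumor'."
    )

def parse_verdict(llm_output: str) -> str:
    """Extract the verdict label from the model's final line."""
    last = llm_output.strip().splitlines()[-1].lower()
    return "rumor" if "verdict: rumor" in last else "non-rumor"

prompt = build_cot_prompt("某地自来水已被污染,请勿饮用!")
```

Constraining the model to a fixed final-line format is what makes the free-form reasoning machine-readable; prompt-tuning approaches learn soft tokens in place of this hand-written template.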
Automated Detection and Real-Time Response Architectures for Emergency Scenarios
Oriented toward sudden public-health or disaster emergencies, with an emphasis on system timeliness, emergency geo-awareness, crowdsourcing management, and targeted debunking and governance practices for misinformation.
- Fake News Detection in Model Integral: A Hybrid CNN-BiLSTM Model(Renuka Nyayadhish, C. Jadhav, Ch Bhupati, R. A. Mabel Rose, M. Prabhu, 2025, International Journal of Engineering, Science and Information Technology)
- Design Principles for Information Categorization Quality in Crowdsourced Crisis Mapping Platforms(R. Valecha, Onook Oh, H. Rao, 2025, MIS Quarterly)
- Unveiling temporal patterns in information for improved rumor detection(Omel Mairaj, S. R. Khan, 2025, Social Network Analysis and Mining)
- Time-Critical Geolocation for Social Good(Reem Suwaileh, 2020, Lecture Notes in Computer Science)
- Real-Time Event Detection from Social Media Using Big Data Analytics(Napat Sukthong, 2025, 2025 6th International Conference on Big Data Analytics and Practices (IBDAP))
- ESARS: A Situation-Aware Multi-Agent System for Real-Time Emergency Response Management(Dumnamene Jolley Sunday Sako, C. G. Igiri, E. O. Bennett, F. B. Deedam, 2024, European Journal of Information Technologies and Computer Science)
- Zero-Shot Social Media Crisis Classification: A Training-Free Multimodal Approach(Franziska Schwarz, Klaus Dieter Schwarz, Daniel Arias Aranda, Kendrick Bollens, Navaneeth Shivananjappa, Reiner Creutzburg, Vesna Dimitrova, 2026, Applied Sciences)
- Mix-Persona Comment Generation and Geographically Enhanced Context Retrieval for LLM Fine-Tuning in Multimodal Crisis Post Classification(Tong Bie, Yongli Hu, Yu Fu, Linjia Hao, Tengfei Liu, Kan Guo, Huajie Jiang, Junbin Gao, Yanfeng Sun, Baocai Yin, 2026, ISPRS International Journal of Geo-Information)
- Rumor Detection on Social Media Using Deep Learning Algorithms with Fuzzy Inference System for Healthcare Analytics System Using COVID-19 Dataset(Akila Rathakrishnan, Revathi Sathiyanarayanan, 2023, International Journal of Computational Intelligence and Applications)
- Beyond just saying it’s false: explainable AI for multimodal misinformation detection(Saswata Roy, M. Bhanu, S. Priya, Joydeep Chandra, Sourav Kumar Dandapat, 2025, Applied Intelligence)
- Mitigating information overload in social media during conflicts and crises: design and evaluation of a cross-platform alerting system(M. Kaufhold, N. Rupp, Christian Reuter, Matthias Habdank, 2019, Behaviour & Information Technology)
- A Semi-Automatic Method for Efficient Detection of Stories on Social Media(S. Vosoughi, D. Roy, 2016, Proceedings of the International AAAI Conference on Web and Social Media)
- Know it to Defeat it: Exploring Health Rumor Characteristics and Debunking Efforts on Chinese Social Media during COVID-19 Crisis(Wenjie Yang, Sitong Wang, Zhenhui Peng, Chuhan Shi, Xiaojuan Ma, Diyi Yang, 2021, Proceedings of the International AAAI Conference on Web and Social Media)
- Crowdsourced Disaster Management(Farhat A. Patel, Adnan Memon, Soham Nikam, A. Marathe, Shrushty Meshram, 2025, International Research Journal on Advanced Engineering Hub (IRJAEH))
- Rumors, Fakes and Fact-checking in Chinese Social Media: The Case of WeChat(Yihua Yang, Aleksandr Anatol'evich Grabel'nikov, 2025, Филология: научные исследования)
- What You Perceive Is What You Get: Enhancing Rumor-Combating Effectiveness on Social Media Based on Elaboration Likelihood Model(Cheng Zhou, Qian Chang, 2024, Social Media + Society)
- Real-Time Classification Model of Public Emergencies Using Fusion Expansion Network(Haiou Xiong, Gang Wang, 2024, Journal of Organizational and End User Computing)
- A Hybrid Deep Learning Framework for Real-Time and Explainable Social Network Analytics(B. Adithya, V. Seethalakshmi, D. Amu, Addanki Mounika, J. Ranjith, 2025, 2025 IEEE International Conference on Advanced Computing Technologies (ICACT))
- Multimodal Classification of Social Media Disaster Posts With Graph Neural Networks and Few-Shot Learning(José Nascimento, P. Bestagini, Anderson Rocha, 2025, IEEE Access)
- AuthEv-LKolb at CheckThat! 2024: A Two-Stage Approach To Evidence-Based Social Media Claim Verification(Luis E Kolb, Allan Hanbury, 2024, Conference and Labs of the Evaluation Forum)
- ES-VRAI at CheckThat!-2023: Leveraging Bio and Lists Information for Enhanced Rumor Verification in Twitter(H. Sadouk, Faouzi Sebbak, Hussem Eddine Zekiri, 2023, Conference and Labs of the Evaluation Forum)
- Rumor conversations detection in twitter through extraction of structural features(S. Lotfi, Mitra Mirzarezaee, M. Hosseinzadeh, V. Seydi, 2021, Information Technology and Management)
- Multi-view learning with distinguishable feature fusion for rumor detection(Xueqin Chen, Fan Zhou, Goce Trajcevski, Marcello Bonsangue, 2022, Knowledge-Based Systems)
- Rumor detection using BERT-based social circle and interaction network model(Thirumoorthy K, J. J., H. R, Shreenee N, 2024, Social Network Analysis and Mining)
- A mutual attention based multimodal fusion for fake news detection on social network(Ying-Chun Guo, 2022, Applied Intelligence)
- Rumor detection for emergency events via few-shot ensembled prompt learning(Chen Su, Jun Zhou, Zhentao Jiang, Shuwei Zhu, Chao Li, Wei Fang, Heng-yang Lu, 2025, Journal of Intelligent Information Systems)
- Detection of Rumor in Social Media(Manan Vohra, Misha Kakkar, 2018, 2018 8th International Conference on Cloud Computing, Data Science & Engineering (Confluence))
- Rumor Identification and Verification for Text in Social Media Content(P. Suthanthiradevi, S. Karthika, 2021, The Computer Journal)
- The “Parallel Pandemic” in the Context of China: The Spread of Rumors and Rumor-Corrections During COVID-19 in Chinese Social Media(Yunya Song, K. H. Kwon, Yin Lu, Yining Fan, Baiqi Li, 2021, American Behavioral Scientist)
- Efficient Source Inference in Large-Scale Networks based on Community Detection(Yangyu Long, Caiyu Zhang, Congduan Li, 2023, Proceedings of the 2023 9th International Conference on Computing and Artificial Intelligence)
- Social media rumor refutation effectiveness: Evaluation, modelling and enhancement(Zongmin Li, Qi Zhang, Xinyu Du, Yanfang Ma, Shihang Wang, 2021, Information Processing & Management)
- Identification of Vortex Information. Detection of fake news eruption time(Wlodimierz Gogolek, 2024, Studia Medioznawcze)
- Real-World Witness Detection in Social Media via Hybrid Crowdsensing(S. Cresci, Andrea Cimino, M. Avvenuti, Maurizio Tesconi, F. Dell’orletta, 2018, Proceedings of the International AAAI Conference on Web and Social Media)
- Rumors detection on Social Media during Crisis Management(C. Laudy, 2017, International Conference on Information Systems for Crisis Response and Management)
- Community Notes vs. Related Articles: Assessing Real-World Integrated Counter-Rumor Features in Response to Different Rumor Types on Social Media(Sarawut Kankham, Jian-Ren Hou, 2024, International Journal of Human–Computer Interaction)
- Rumors detection in Chinese via crowd responses(Guoyong Cai, Hao Wu, Rui Lv, 2014, 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014))
- An agent-based model of cross-platform information diffusion and moderation(Isabel Murdock, K. Carley, Osman Yağan, 2024, Social Network Analysis and Mining)
- Blowing Seeds Across Gardens: Visualizing Implicit Propagation of Cross-Platform Social Media Posts(Jianing Yin, Hanze Jia, Buwei Zhou, Tan Tang, Lu Ying, Shuainan Ye, Tai-Quan Peng, Yingcai Wu, 2024, IEEE Transactions on Visualization and Computer Graphics)
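The real-time systems above usually sit on a streaming pipeline: ingest posts, maintain a sliding window, and surface bursting topics for human verification. A stdlib-only sketch of a sliding-window burst monitor; the window size and threshold are arbitrary illustrative defaults, not tuned settings:

```python
from collections import deque

class BurstMonitor:
    """Flags keywords whose mention rate spikes inside a sliding window.

    window: number of most recent posts considered; threshold: minimum
    share of window posts containing a keyword before it is flagged.
    """

    def __init__(self, keywords, window=50, threshold=0.2):
        self.keywords = keywords
        self.threshold = threshold
        self.posts = deque(maxlen=window)  # old posts drop out automatically

    def ingest(self, post: str):
        """Add one post; return the keywords currently bursting."""
        self.posts.append(post)
        n = len(self.posts)
        return [
            k for k in self.keywords
            if sum(k in p for p in self.posts) / n >= self.threshold
        ]

monitor = BurstMonitor(["flood", "collapse"], window=10, threshold=0.3)
flags = []
for text in ["flood warning"] * 4 + ["normal post"] * 6:
    flags = monitor.ingest(text)
```

In a deployed pipeline the flagged clusters would be routed to a classifier or fact-checking queue; deduplication and geolocation (as in several papers above) sit on the same windowed stream.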
Robustness Enhancement and General Machine-Learning Detection Strategies
Examines optimization paths for base models, including adversarial robustness, domain adaptation, multi-task learning, contrastive learning, and lightweight model design, to improve generalization in complex environments.
- Adversarial Vulnerability of Transformer-Based Rumor Detection(Aditi Agrawal, Shivani Tufchi, Chanchal Ahlawat, Ravi Kumar, 2025, 2025 3rd World Conference on Communication & Computing (WCONF))
- Enhancing Social Media Integrity: A Machine Learning-Based Rumor Identification System Utilizing CNN for Accurate Real-Time Tweet Analysis(Joe Leon, Angelin Jeba, 2025, 2025 International Conference on Visual Analytics and Data Visualization (ICVADV))
- A hybrid deep learning model based on CNN-BiLSTM for rumor detection(N. Rani, Praseniit Das, A. Bhardwaj, 2021, 2021 6th International Conference on Communication and Electronics Systems (ICCES))
- Countering Social Media Misinformation: A Sentence-Level Feature Approach to Fake News Classification(Rakhmat Arianto, Rudy Ariyanto, A. Ananta, I. F. Rozi, Erfan Rohadi, Natasha Dwi Pramudita, Bagus Winarko, 2025, 2025 1st International Conference on Artificial Intelligence Technology (ICoAIT))
- Improving fake news detection with domain-adversarial and graph-attention neural network(Hua Yuan, Jie Zheng, Qiongwei Ye, Yu Qian, Yan Zhang, 2021, Decision Support Systems)
- Enhancing Rumor Detection from Social Media Texts with an Ensemble Model for Residual Graph Convolutional Networks(S. Vanitha, R. Prabahari, 2025, 2025 5th International Conference on Pervasive Computing and Social Networking (ICPCSN))
- Detecting Online Fake News Using Machine Learning and Topological Data Analysis(Agbasonu Valerian Chinedum, Amanze Bethran Chibuike, A. Onyekachi, 2023, International Journal of Engineering Applied Sciences and Technology)
- Social Media Rumor Identification Based on Random Forest Classification and Feature Engineering: Case Study on Weibo Platform: Social Media Rumor Identification Based on Random Forest Classification(Yuxin Guo, Chen Jia, Chenglong Wu, Yan Tu, 2022, 2022 7th International Conference on Big Data and Computing)
- Identifying Hoaxes in Fake Spotter using XG Boost Machine Learning based Classification Method(S. Chitti, Dhruva R. Rinku, Tarun Kumar Juluru, P. R. Rao, S. Gouthami, 2024, 2024 4th International Conference on Ubiquitous Computing and Intelligent Information Systems (ICUIS))
- Integrating UNet and LSTM Architectures for Enhanced Fake News(C. M J, E. Kannan, Almas Begum, S. K, A. S., Mahesh C, 2023, 2023 International Conference on Research Methodologies in Knowledge Management, Artificial Intelligence and Telecommunication Engineering (RMKMATE))
- Automatic Identification of Fake News Using Deep Learning(Ethar Qawasmeh, Mais Tawalbeh, Malak Abdullah, 2019, 2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS))
- Enhancing Rumor Detection of Twitter Tweets Using a Deep Learning Approach(Rasmy Akter, Md. Motiur Rahman, Anjan Debnath, Nadim Samrat, Marjana Roman, Abu Naim Khan, Nasif Alvi, M. Raihan, 2024, 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT))
- MARV: Multi-task learning and Attention based Rumor Verification scheme for Social Media(Yufeng Wang, Bo Zhang, Jianhua Ma, Qun Jin, 2022, 2022 IEEE/CIC International Conference on Communications in China (ICCC))
- DistilBERT and RoBERTa Models for Identification of Fake News(Aleksandar Kitanovski, M. Toshevska, Georgina Mirceva, 2023, 2023 46th MIPRO ICT and Electronics Convention (MIPRO))
- A deep neural networks-based fusion model for COVID-19 rumor detection from online social media(Heng-yang Lu, Jun Yang, Wei Fang, Xiaoning Song, Chongjun Wang, 2022, Data Technologies and Applications)
- Fake News Detection Using SVM Algorithm in Machine Learning(S. Vinothkumar, S. Varadhaganapathy, M. Ramalingam, D. Ramkishore, S. Rithik, K.P. Tharanies, 2022, 2022 International Conference on Computer Communication and Informatics (ICCCI))
- A Machine Learning Framework for Real-Time Rumor Detection on Social Media Platform(B. Sri, G. Chowdary, Earladinne Sai Dheeraj Dr, V. Nirmalrani, 2025, 2025 Third International Conference on Augmented Intelligence and Sustainable Systems (ICAISS))
- RTBERT: A Transformer Based Approach for Improved Rumor Classification from Tweet(Shakib Mahmud Dipto, Mostofa Kamal Sagor, Anatte Rozario, Sahal Bin Saad, 2023, 2023 26th International Conference on Computer and Information Technology (ICCIT))
- A Topic-Agnostic Approach for Identifying Fake News Pages(Sonia Castelo, Thais G. Almeida, Anas Elghafari, Aécio Santos, Kien Pham, E. Nakamura, J. Freire, 2019, Companion Proceedings of The 2019 World Wide Web Conference)
- A Rumor Detection Method from Social Network Based on Deep Learning in Big Data Environment(J. Cen, Yongbo Li, 2022, Computational Intelligence and Neuroscience)
- Machine Learning and MADIT methodology for the fake news identification: the persuasion index(G. Turchi, L. Orrù, Christian Moro, Marco Cuccarini, Monia Paita, Marta Silvia Dalla Riva, Davide Bassi, Giovanni Da San Martino, Nicoló Navarin, 2022, 4th International Conference on Advanced Research Methods and Analytics (CARMA 2022))
- Boosting Fake News Detection Accuracy: A Deep Dive into LSTM Classifiers(E. G, K. D, S. M, 2024, 2024 10th International Conference on Advanced Computing and Communication Systems (ICACCS))
- Real-Time Misinformation Detection with Cyclic Evidence-Based Framework(Zhiwen Hu, Lv Han, Haihua Jiang, Xi-Ao Ma, Saihua Lei, Yang Qiu, Haojia Niu, Zehui Zhou, Xun Wang, 2025, 2025 International Joint Conference on Neural Networks (IJCNN))
- An Improved Deep Learning Network, Addressing Graph Node Imbalance in Social Media Rumor Source Detection(G. N. Gopal, Binsu C. Kovoor, S. Shailesh, 2024, New Generation Computing)
- Cycle mapping with adversarial event classification network for fake news detection(Fei Wu, Hong Zhou, Yujian Feng, Guangwei Gao, Yimu Ji, Xiao-Yuan Jing, 2024, Multimedia Tools and Applications)
- SmartFactCheckBot: A Dual-Interface AI Platform for Real-Time Misinformation Detection with Explainable Predictions(Baskaran Jeyarajan, Vigneshwaran Jagadeesan Pugazhenthi, 2025, 2025 International Conference on Computer and Applications (ICCA))
- Adaptive Trans-Bidirectional Long Short Term Memory for Detecting COVID19 Epidemic Fake News on Social Media(V. Rathinapriya, J. Kalaivani, 2023, 2023 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems (ICSES))
- A network-based positive and unlabeled learning approach for fake news detection(Mariana Caravanti de Souza, Bruno M. Nogueira, R. G. Rossi, R. Marcacini, Brucce Neves dos Santos, S. O. Rezende, 2021, Machine Learning)
- Rapid identification of rumors based on BERT and CNN(Huaizhong Zhu, Lina Ran, Wanlu Li, Yanni Zhao, Lisha Wang, Xufeng Ling, 2025, International Conference on Computer Application and Information Security (ICCAIS 2024))
- Multimodal Meta-Learning for Early Rumor Detection Based on Few-Shot Learning(Yanyan Ye, Hongzhe Chen, Qianqian Cai, Housheng Su, 2026, IEEE Transactions on Computational Social Systems)
- Rumor Verification on Social Media with Stance-Aware Recursive Tree(Xiaoyun Han, Zhen Huang, Menglong Lu, Dongsheng Li, Jinyan Qiu, 2021, Lecture Notes in Computer Science)
- Adaptive cost-sensitive stance classification model for rumor detection in social networks(Z. Zojaji, Behrouz Tork Ladani, 2022, Social Network Analysis and Mining)
- Automatic Fake News Identification: A Hybrid CNN-LSTM-Logistic Regression Approach(Pummy Dhiman, 2025, Journal of Information Systems Engineering and Management)
- Reinforcement Learning for Fake News Detection on Social Media with Blockchain Security(Arpit Jain, A. Musunuri, Saketh Reddy Cheruku, Vijay Bhasker Reddy Bhimanapati, Shreyas Mahimkar, Mohammed H. Al-Farouni, 2024, 2024 4th International Conference on Blockchain Technology and Information Security (ICBCTIS))
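A recurring idea in the robustness work above is adversarial data augmentation: train on perturbed inputs so the detector does not overfit surface forms. A toy sketch; the character-swap perturbation is a deliberately simple stand-in for the attack models studied in these papers:

```python
import random

def perturb(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Swap a fraction of adjacent character pairs, a crude proxy for
    the character-level attacks used to probe rumor detectors."""
    rng = random.Random(seed)          # seeded for reproducible augmentation
    chars = list(text)
    for i in range(len(chars) - 1):
        if rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def augment(dataset, rate=0.1, seed=0):
    """Return the dataset plus one perturbed copy of each example,
    keeping labels intact (adversarial training augmentation)."""
    return dataset + [(perturb(t, rate, seed), y) for t, y in dataset]

train = [("the dam has burst", 1), ("routine maintenance notice", 0)]
augmented = augment(train)
```

Domain-adversarial and contrastive approaches in the list pursue the same goal at the representation level, penalizing features that separate clean from perturbed (or in-domain from out-of-domain) inputs.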
This report integrates the current state of research on Chinese emergency misinformation detection from multiple perspectives, organizing the literature into five core modules: multimodal fusion techniques, propagation structure modeling, LLM-driven semantic analysis, operational emergency response frameworks, and robust learning strategies. Overall, the field is undergoing a paradigm shift from single-source detection toward multi-source cross-modal fusion, from traditional machine learning toward large language model fine-tuning, and from offline algorithmic analysis toward real-time systems for emergency scenarios, aiming to solve the information verification challenges posed by complex sudden-onset events.
A total of 161 related publications.
Multi-modal emergency rumors are spreading in the current digital era, causing significant disruptions and negative impacts. Most existing methods focus on exploring rumor detection using individual small language models (SLMs) or large language models (LLMs), achieving a certain degree of success but with underlying issues. Approaches based on SLMs have reached a bottleneck due to their limited knowledge and capacity. In contrast, LLMs have unique strengths in deep analysis that compensate for the weaknesses of SLMs; however, they struggle to select and integrate analyses to draw appropriate conclusions. Furthermore, recent works on multi-modal feature fusion remain superficial, limiting the ability of these models to fully comprehend and identify rumors. In this work, we propose Collaborate Large and Small Language Models for Multi-Modal Emergency Rumor Detection (M2ERD). Specifically, it consists of two main components. First, LLMs generate multi-dimensional rationales based on multi-perspective prompts, from which SLMs selectively derive insights for rumor detection. Second, a multi-source cross-modal penetration fusion network not only accomplishes unidirectional fusion of auxiliary information such as multi-dimensional rationales but also achieves complete mutual complementation between text and the image. Comprehensive experiments demonstrate the effectiveness of M2ERD for rumor detection on Weibo, RumorEval, and Pheme datasets, achieving a 2.6% improvement in accuracy and a 1.9% improvement in F1-score compared to all baselines. We release the code and data at https://github.com/youchengyan/M2ERD.
The rapid development of Internet technology has made online rumors increasingly rampant. Current research on rumor detection relies on large-scale labeled instance data; however, in the face of emergent events, obtaining large-scale labeled data becomes extremely difficult. To address this issue, this paper proposes a few-shot rumor detection model based on prompt learning and meta-learning, named MPRD (Meta Prompt Rumor Detection). MPRD combines hard and soft prompts to explore universal rumor prompt templates that are not limited to specific topics, thus offering greater flexibility. Moreover, it employs meta-learning to sample meta-tasks from past events, trains prompt models on these meta-tasks, and ultimately applies them to rumor detection in emergency events. Experiments on two public datasets show that MPRD outperforms state-of-the-art baseline models in detection accuracy and exhibits significant performance improvement in few-shot rumor detection across different topics.
Rumor detection is a critical task for addressing the spread of misinformation and maintaining the credibility of information sources. Natural Language Processing (NLP) techniques have been employed to propose efficient and effective methods for rumor detection. In the wake of the widespread COVID-19 pandemic, the world has faced extensive strain on health, economics, and social structures. The dissemination of false or inaccurate information on social media, whether intentionally malicious or unintentional, has had detrimental consequences for individuals and society, particularly in critical situations such as real-world emergencies. In this study, we explore the textual and temporal features present in social media posts (specifically tweets) related to COVID-19 to detect rumors, since time is a unique feature of text and any event can be mapped onto a timeline. Previous studies utilized textual features, while temporal features were largely neglected for rumor detection. We utilize both temporal and textual features independently, as well as in combination, to train machine learning and neural network models. The evaluation of multiple algorithms (RNN, LSTM, CNN, DNN, BERT) across various feature sets reveals diverse performance. RNN and LSTM improve with combined textual and temporal features, highlighting the importance of temporal information. CNN performs well with textual features but declines with temporal features. DNN maintains consistent performance, while BERT demonstrates moderate effectiveness in classification tasks.
No abstract available
No abstract available
In this paper, we collect basic data on online rumors and highly topical public opinions. To model the propagation of online public-opinion rumors, we use an improved SCIR model to analyze the characteristics of online rumor propagation under a suspicion mechanism at different propagation stages, while accounting for the flow of rumor propagation. We analyze the stability of the evolution of rumor propagation using a time-delay differential equation under a punishment mechanism. We also study the evolution of heterogeneous views with different acceptance and exchange thresholds, using the standard Deffuant model and an improved model that incorporates media influence, to analyze the evolution process and characteristics of rumor opinions. Based on the above results, we find that improving the recovery rate is more effective than reducing the deception rate, and increasing the eviction rate is more effective than improving the detection rate. When the time lag τ < 110, the spread of rumors tends to be asymptotically stable, and the punishment mechanism can reduce both the propagation time and the maximum proportion of deceived people. The proportion of deceived people increases as the exchange threshold decreases, and the range of opinion clusters increases as acceptance declines.
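The qualitative finding above (raising the recovery rate suppresses the rumor peak more reliably than other levers) can be illustrated with a toy discrete-time compartmental model in the spirit of the SCIR family. The transition structure and all parameter values below are illustrative assumptions, not the paper's actual differential-equation model.

```python
def simulate_scir(beta=0.3, suspicion=0.2, recovery=0.1, steps=200):
    """Toy rumor compartments (fractions of the population):
    S susceptible, C sceptical (suspicion mechanism), I infected
    (actively spreading), R recovered (no longer spreading)."""
    s, c, i, r = 0.99, 0.0, 0.01, 0.0
    history = []
    for _ in range(steps):
        new_exposed = beta * s * i               # contacts with spreaders
        to_sceptical = suspicion * new_exposed   # doubt instead of spreading
        to_infected = new_exposed - to_sceptical
        to_recovered = recovery * i              # spreaders stop or are debunked
        s -= new_exposed
        c += to_sceptical
        i += to_infected - to_recovered
        r += to_recovered
        history.append((s, c, i, r))
    return history

# A higher recovery rate flattens the spreader peak.
peak = max(i for _, _, i, _ in simulate_scir())
```

Comparing runs with `recovery=0.05` and `recovery=0.3` shows the peak spreader fraction dropping sharply, consistent with the paper's conclusion about the recovery rate.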
The rapid development of social networks has given rumors opportunities to disturb the order of society. However, due to the diversity and complexity of users and information dissemination dynamics, localizing rumor sources in social networks remains a critical problem yet to be well solved. In recent years, although several methods have been proposed for this problem, they suffer from a contradiction between accuracy and model complexity. To detect information sources efficiently, this article proposes a heuristic framework for source detection (HFSD) in social networks via graph convolutional networks, which handles three major challenges: 1) the diversity and complexity of users and information dissemination dynamics; 2) the difficulty of detecting multiple sources, especially without knowing their number; and 3) the class imbalance caused by the large difference in sample size between sources and nonsources. Specifically, first, to counteract the diversity and complexity of users and information, different kinds of user and information features are encoded in the raw feature vectors; second, to address multisource detection, we adopt binary classification in the last layer of the model, unlike the n-classification methods usually applied to the single-source scenario; finally, to solve the class imbalance problem, we design a balance mechanism that offsets the difference in sample size between the sets of sources and nonsources. Extensive experiments conducted on 12 real-world datasets demonstrate that HFSD handles the problems above and significantly outperforms state-of-the-art algorithms.
Propagation of rumors is a big challenge for public safety; limiting their spread helps maintain social harmony and stability. Rumor source detection addresses this problem by locating the source node to control the spread, but such algorithms usually suffer from high time complexity. This paper reduces the time complexity by introducing a community detection algorithm so that source detection can be applied to larger networks, and proposes a heuristic method to improve its accuracy. To verify the effectiveness of the algorithm, experiments were conducted on networks of different scales under the susceptible-infected (SI) model. Results show that the proposed algorithm converges faster than the benchmark method without reducing accuracy, while the heuristic method further reduces its errors. The proposed algorithm is expected to be useful in fields such as public safety, intelligence analysis, and emergency response, providing an effective method to identify and prevent the spread of false information.
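To make the source-detection setting concrete, here is a plain-Python sketch of a common baseline heuristic: given the observed infected subgraph under an SI-style spread, pick the infected node with minimal total distance to all other infected nodes (distance centrality). This is a generic baseline, not the paper's community-detection-accelerated algorithm.

```python
from collections import deque

def bfs_dists(adj, start):
    """Hop distances from start in an unweighted graph {node: [neighbors]}."""
    dist = {start: 0}
    q = deque([start])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def estimate_source(adj, infected):
    """Return the infected node minimizing the sum of distances to all
    other infected nodes (distance-centrality source estimator)."""
    best, best_score = None, float("inf")
    for u in infected:
        d = bfs_dists(adj, u)
        score = sum(d.get(v, float("inf")) for v in infected)
        if score < best_score:
            best, best_score = u, score
    return best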
Since open social platforms allow a large and continuous flow of unverified information, rumors can emerge unexpectedly and spread quickly. However, existing rumor detection (RD) models often assume identical training and testing distributions and cannot cope with the continuously changing social network environment. This paper proposes a Continual Prompt-Tuning RD (CPT-RD) framework, which avoids catastrophic forgetting (CF) of upstream tasks during sequential task learning and enables bidirectional knowledge transfer between domain tasks. Specifically, we propose the following strategies: (a) our design explicitly decouples shared and domain-specific knowledge, reducing interference among different domains during optimization; (b) several techniques transfer knowledge of upstream tasks to deal with emergencies; and (c) a task-conditioned prompt-wise hypernetwork (TPHNet) consolidates past domains. In addition, CPT-RD avoids CF without needing a rehearsal buffer. Finally, CPT-RD is evaluated on English and Chinese RD datasets and proves effective and efficient compared to prior state-of-the-art methods.
Social networks are now the main channel through which people obtain news, especially about emergencies. However, while obtaining news in a timely manner, people also face the harm of false news and rumors. There is therefore great interest in the early detection of emergencies and in managing the network environment. For the early detection of emergencies, a model based on BERT-BiLSTM is proposed. This model can detect social network emergencies early and in a timely manner with few samples. In experimental verification, it achieves a test-set accuracy of 0.9398 at the lowest loss value, the best among similar algorithms. The test results show that the proposed model can better realize the early detection of social network emergencies.
Twitter, as a famous social media site, not only helps people share their thoughts in microblogs but also plays a pivotal role in emergencies for communication and announcements. However, it can have an aversive effect when an inappropriate tweet is reposted or shared, thereby spreading rumors. This work describes methodologies for identifying rumors, evaluated with metrics such as precision, recall, F1-score, and support, to address rumor issues across the social media platform. The system detects candidate rumors from Twitter and then evaluates them accordingly. Experimental results show that the proposed algorithm detects rumors with acceptable accuracy.
Health misinformation on social media threatens public health responses and trust. This study analyzes Ebola-related social media data to investigate rumor propagation mechanisms. We propose a novel Hierarchical Information Graph Network (HIGN) that combines Graph Neural Networks with multi-modal features to capture structural, temporal, and content-based diffusion patterns. The model employs multi-scale temporal attention and hierarchical graph convolution to analyze propagation networks. Evaluations on the PHEME dataset show HIGN outperforms baseline methods (SVM, LSTM, GCN, GraphSAGE), achieving 84.7% accuracy and 84.5% F1-score. Analysis reveals distinct patterns: rumors spread through “deep and narrow” structures with higher clustering coefficients (0.23 vs. 0.16) and multi-peak resurgences, while non-rumors propagate in “shallow and broad” patterns. HIGN also demonstrates strong early detection capability, reaching 68.7% F1-score within the first hour of propagation. This work provides computational insights into health misinformation dynamics and offers a robust framework for early rumor detection and intervention design in public health emergencies.
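The structural contrast HIGN reports (clustering coefficient 0.23 for rumor cascades vs. 0.16 for non-rumors) rests on the standard local clustering coefficient. Below is a plain-Python sketch of that measurement on a toy propagation graph; it illustrates the metric only, not the HIGN model itself.

```python
def avg_clustering(adj):
    """Average local clustering coefficient of an undirected graph
    given as {node: set_of_neighbors}. For each node, count edges
    among its neighbors relative to the maximum possible."""
    total = 0.0
    for u, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # coefficient is 0 for degree < 2
        links = sum(1 for v in nbrs for w in nbrs
                    if v < w and w in adj[v])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)
```

A triangle (tightly clustered replies) scores 1.0, while a star (one post with isolated repliers) scores 0.0, matching the intuition behind "deep and narrow" versus "shallow and broad" diffusion patterns.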
The rapid and widespread dissemination of medical misinformation on social media platforms poses a significant threat to public health, leading to treatment delays, vaccine hesitancy, and the adoption of harmful practices. Addressing the challenges of overfitting and prediction instability is crucial when fine-tuning pre-trained language models for Chinese microblog health rumor detection, especially given the short, emotionally charged, and noisy nature of public health surveillance data. This paper aims to propose a simple yet effective training strategy to enhance model generalization and robustness in detecting Chinese microblog health rumors. By integrating R-Drop consistency regularization with robust loss functions, this study seeks to establish a stronger and more reliable performance baseline for identifying and mitigating the impact of harmful health-related rumors, thereby contributing to a healthier online information ecosystem. We fine-tuned two powerful pre-trained Chinese language models, RoBERTa-wwm-ext and MacBERT, using a large corpus from the public Chinese Emergency Corpus (CED) dataset. Our approach introduced R-Drop consistency regularization, which constrains prediction consistency through dual forward passes with dropout and minimizes symmetric Kullback-Leibler (KL) divergence. To handle noisy labels and class imbalance, we compared standard Cross-Entropy loss against Focal Loss and Symmetric Cross-Entropy (SCE), employing a stable learning rate schedule and early stopping to prevent overfitting. The MacBERT model consistently outperformed RoBERTa-wwm-ext in health rumor detection. R-Drop consistency regularization yielded stable and significant improvements across all configurations, enhancing both accuracy and F1-score. The optimal configuration, pairing MacBERT with standard Cross-Entropy loss and R-Drop, achieved a peak test accuracy of 0.9198. 
Notably, a well-regularized standard Cross-Entropy loss proved more effective for this dataset and task than the specialized robust loss functions. This research successfully establishes that integrating R-Drop consistency regularization offers a computationally efficient, easily implementable, and highly effective training paradigm for advancing Chinese microblog health rumor detection. The proposed methodology provides a strong, reproducible, and high-performing baseline, offering valuable guidance for public health authorities and social media platforms in combating medical misinformation and enhancing automated systems for safeguarding public health.
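The R-Drop consistency term used above can be sketched in a few lines of numpy: two dropout forward passes yield two predicted distributions, and their symmetric Kullback-Leibler divergence is added to the averaged cross-entropy. The model and dropout machinery are omitted here; `p1` and `p2` stand in for the two passes' softmax outputs, and `alpha` is a generic weighting, not the paper's tuned value.

```python
import numpy as np

def symmetric_kl(p1, p2, eps=1e-12):
    """0.5 * (KL(p1 || p2) + KL(p2 || p1)) for probability vectors."""
    p1 = np.clip(p1, eps, 1.0)
    p2 = np.clip(p2, eps, 1.0)
    kl12 = np.sum(p1 * (np.log(p1) - np.log(p2)))
    kl21 = np.sum(p2 * (np.log(p2) - np.log(p1)))
    return 0.5 * (kl12 + kl21)

def rdrop_loss(ce1, ce2, p1, p2, alpha=1.0):
    """R-Drop objective: mean cross-entropy of the two dropout passes
    plus alpha times the symmetric KL consistency penalty."""
    return 0.5 * (ce1 + ce2) + alpha * symmetric_kl(p1, p2)
```

When the two passes agree exactly, the penalty vanishes and the loss reduces to the plain cross-entropy; disagreement between passes is what the regularizer suppresses.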
The task of witness detection in social media is crucial for many practical applications, including rumor debunking, emergency management, and public opinion mining. Yet to date, it has been approached in an approximated way. We propose a method for addressing witness detection in a strict and realistic fashion. By employing hybrid crowdsensing over Twitter, we contact real-life witnesses and use their reactions to build a strong ground-truth, thus avoiding a manual, subjective annotation of the dataset. Using this dataset, we develop a witness detection system based on a machine learning classifier using a wide set of linguistic features and metadata associated with the tweets.
No abstract available
No abstract available
Existing social network public opinion analysis methods suffer from poor semantic representation quality and weak detection ability on short texts. Therefore, a social network public opinion analysis method based on BERT-BMA is proposed. To normalize the comment text, the rumor text is first transformed into a word vector matrix using the BERT (Bidirectional Encoder Representations from Transformers) model. A BiLSTM-based network architecture is then employed to capture the trace features of data transmission. Finally, a multi-head attention mechanism extracts the feature information most significant for online public opinion analysis by mining the dependency relationships between users, increasing the ability to detect public opinion emergencies. Experimental results show that the method outperforms comparative models on both the Twitter and Weibo datasets.
Twitter has become one of the main sources of news for many people. As real-world events and emergencies unfold, Twitter is abuzz with hundreds of thousands of stories about the events. Some of these stories are harmless, while others could potentially be life-saving or sources of malicious rumors. Thus, it is critically important to be able to efficiently track stories that spread on Twitter during these events. In this paper, we present a novel semi-automatic tool that enables users to efficiently identify and track stories about real-world events on Twitter. We ran a user study with 25 participants, demonstrating that compared to more conventional methods, our tool can increase the speed and the accuracy with which users can track stories about real-world events.
Twitter has become an instrumental source of news in emergencies where efficient access, dissemination of information, and immediate reactions are critical. Nevertheless, due to several challenges, the current fully-automated processing methods are not yet mature enough for deployment in real scenarios. In this dissertation, I focus on tackling the lack of context problem by studying automatic geo-location techniques. I specifically aim to study the Location Mention Prediction problem in which the system has to extract location mentions in tweets and pin them on the map. To address this problem, I aim to exploit different techniques such as training neural models, enriching the tweet representation, and studying methods to mitigate the lack of labeled data. I anticipate many downstream applications for the Location Mention Prediction problem such as incident detection, real-time action management during emergencies, and fake news and rumor detection among others.
The internet is rife with different types of disinformation openly available to the public. The spread of accidental or malicious misinformation on social media, specifically in critical situations such as real-world emergencies, can have negative consequences for our health, democracy and economy. Social media facilitates the spread of rumors, as users share and exchange the latest information with many readers, including a large volume of new information every second; however, news shared on social media is not always true. Consequently, disinformation is rapidly being recognized as a global risk alongside terrorism, cancer and global warming. The increasing demand for fact checking at scale has stimulated rapid development of automated solutions using technologies such as Natural Language Processing (NLP) and Machine Learning (ML) in order to reduce the required human effort. This paper explores novel methods for automated fake news detection through the integration of two powerful approaches to data science, ML and Topological Data Analysis (TDA). The main strength of ML lies in its predictive power; deep learning in particular has yielded impressive practical successes in various text processing tasks. However, ML can fail in more exploratory tasks aimed at understanding the nature of the data and uncovering its insights. TDA is a fairly new field that applies topology and geometry to analyze high-dimensional data and construct a compressed representation of it, offering a more exploratory approach. This paper explores how the strengths of these two fields can be integrated to create a novel method for the analysis of text data, which may lead to a state-of-the-art fake news detection model.
The proliferation of online misinformation videos poses serious societal risks. Current datasets and detection methods primarily target binary classification or single-modality localization based on post-processed data, lacking the interpretability needed to counter persuasive misinformation. In this paper, we introduce the task of Grounding Multimodal Misinformation (GroundMM), which verifies multimodal content and localizes misleading segments across modalities. We present the first real-world dataset for this task, GroundLie360, featuring a taxonomy of misinformation types, fine-grained annotations across text, speech, and visuals, and validation with Snopes evidence and annotator reasoning. We also propose a VLM-based, QA-driven baseline, FakeMark, using single and cross-modal cues for effective detection and grounding. Our experiments highlight the challenges of this task and lay a foundation for explainable multimodal misinformation detection. Dataset will be released at https://github.com/yangbingjian/GroundLie360.
No abstract available
Amid a tidal wave of misinformation flooding social media during elections and crises, extensive research has been conducted on misinformation detection, primarily focusing on text-based or image-based approaches. However, only a few studies have explored multimodal feature combinations, such as integrating text and images for building a classification model to detect misinformation. This study investigates the effectiveness of different multimodal feature combinations, incorporating text, images, and social features using an early fusion approach for the classification model. This study analyzed 1,529 tweets containing both text and images during the COVID-19 pandemic and election periods collected from Twitter (now X). A data enrichment process was applied to extract additional social features, as well as visual features, through techniques such as object detection and optical character recognition (OCR). The results show that combining unsupervised and supervised machine learning models improves classification performance by 15% compared to unimodal models and by 5% compared to bimodal models. Additionally, the study analyzes the propagation patterns of misinformation based on the characteristics of misinformation tweets and the users who disseminate them.
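The early-fusion approach described above reduces to a simple operation: features are extracted per modality, then concatenated into one vector before a single classifier sees them. The sketch below illustrates that step only; the feature dimensions and social-feature names are illustrative placeholders, not the study's actual pipeline.

```python
import numpy as np

def early_fuse(text_feat, image_feat, social_feat):
    """Concatenate per-modality feature vectors into one input for a
    single downstream classifier (the essence of early fusion)."""
    return np.concatenate([text_feat, image_feat, social_feat])

# Illustrative dimensions only: a 768-d text embedding, a 512-d visual
# embedding, and 10 hand-crafted social features (e.g. retweet count,
# follower count, flags from object detection and OCR).
fused = early_fuse(np.zeros(768), np.zeros(512), np.zeros(10))
```

Late fusion, by contrast, would train one model per modality and merge their predictions; the study's reported gains come from letting one model see the concatenated features jointly.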
Multimodal tasks require learning a joint representation of the constituent modalities of data. Contrastive learning learns a joint representation by using a contrastive loss. For example, CLIP takes as input image-caption pairs and is trained to maximize the similarity between an image and its corresponding caption in actual image-caption pairs, while minimizing the similarity for arbitrary image-caption pairs. This approach operates on the premise that the caption depicts the image's content. However, this assumption does not always hold true for tweets that contain both text and images. Previous studies have indicated that the connection between the image and the text in a tweet is more intricate and complex. We study the effectiveness of pre-trained multimodal contrastive learning models, specifically, CLIP, and ALIGN, on the task of classifying multimodal crisis related tweets. Our experiments using two publicly available datasets, CrisisMMD and DMD, show that despite the intricate relationships in tweets, pre-trained contrastive learning models fine-tuned with task-specific data produce better results than prior approaches used for the multimodal classification of crisis related tweets. Additionally, the experiments show that the contrastive learning models are effective in low-data few-shot and cross-domain settings.
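The contrastive objective that CLIP-style models are pre-trained with can be sketched in numpy: a similarity matrix over a batch of image/text pairs, with matched pairs on the diagonal pushed up via cross-entropy in both directions. This is the generic loss that motivates the paper's premise, not its fine-tuning setup.

```python
import numpy as np

def clip_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric contrastive loss for (n, d) L2-normalized embeddings
    of n image-text pairs; row i of each matrix is the i-th pair."""
    logits = img_emb @ txt_emb.T / temperature   # (n, n) similarities
    labels = np.arange(len(logits))              # diagonal = true pairs

    def xent(l):
        # cross-entropy of each row against its diagonal target
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # image->text and text->image directions, averaged
    return 0.5 * (xent(logits) + xent(logits.T))
```

Perfectly aligned embeddings yield a near-zero loss, while mismatched pairings are penalized heavily; tweets whose image and text are only loosely related sit between these extremes, which is the intricacy the paper investigates.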
No abstract available
The spread of misinformation on social platforms is amplified by multimodal posts and the discourse they trigger. We present LANTERN, a lightweight framework that detects fake posts by pairing a compact LVLM (Large Vision-Language Model) with a graph-to-text rendering of reply threads. Comment trees are pruned breadth-first and serialized with depth-based indentation and in-line metadata (author, engagement), then concatenated with image tokens; the LVLM is fine-tuned with low-rank adapters and a linear head, using the end-of-sequence state for classification. On Fakeddit, LANTERN attains state-of-the-art accuracy with only 150k training samples, achieving 97.16% (2-way), 97.05% (3-way), and 93.72% (6-way). To assess versatility beyond Reddit, we additionally evaluate LANTERN-TEXT on PHEME (text-only rumor cascades) and adapt the pipeline to Twitter15/16 (structure-only replies), observing robust gains from shallow discourse and measurable gains even when reply text is absent. Large-scale multimodal pretraining plus lightweight adaptation and structured conversational context yields a practical and reproducible approach to misinformation detection.
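The graph-to-text step above can be sketched directly: a reply tree is pruned breadth-first to a node budget, then serialized with depth-based indentation and inline metadata. The field names and template below are illustrative assumptions; the paper's exact serialization format is not reproduced here.

```python
from collections import deque

def render_thread(tree, root, meta, budget=4):
    """Serialize a reply tree for an LVLM prompt.
    tree: {post_id: [child post_ids]}
    meta: {post_id: (author, likes, text)}  -- illustrative fields."""
    keep, q = [], deque([(root, 0)])
    while q and len(keep) < budget:          # breadth-first pruning
        node, depth = q.popleft()
        keep.append((node, depth))
        for child in tree.get(node, []):
            q.append((child, depth + 1))
    lines = []
    for node, depth in keep:
        author, likes, text = meta[node]
        # depth-based indentation + inline metadata
        lines.append("  " * depth + f"[{author}|{likes} likes] {text}")
    return "\n".join(lines)
```

Breadth-first pruning keeps the shallow discourse, which is exactly where the paper reports the strongest signal; deep, low-engagement branches are dropped before the text ever reaches the model.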
Misinformation has become a major challenge in the era of increasing digital information, requiring the development of effective detection methods. We have investigated a novel approach to Out-Of-Context detection (OOCD) that uses synthetic data generation. We created a dataset specifically designed for OOCD and developed an efficient detector for accurate classification. Our experimental findings validate the use of synthetic data generation and demonstrate its efficacy in addressing the data limitations associated with OOCD. The dataset and detector should serve as valuable resources for future research and the development of robust misinformation detection systems.
Social media has become a vital source for humanitarian organizations to gather information during crises. However, existing multimodal classification methods operate primarily as isolated systems, while neglecting external references crucial for accurate judgment. Furthermore, while user comments can provide valuable context, they are often scarce during the early stages of a crisis. To address these limitations, we propose a framework named Mix-Persona Comment Generation with Geographically Enhanced Context Retrieval for LLM Instruction Fine-tuning (MPCG-GECR). To mitigate comment scarcity, we employ a Synthetic Persona Generator (SPG) that prompts LLMs to adopt diverse mix-personas, generating synthetic comments that simulate multi-perspective public discourse. To incorporate external references, we introduce a Geographically Enhanced Context Retrieval (GECR) module. Unlike standard retrieval approaches, GECR utilizes a hybrid re-ranking strategy to identify samples that are both multimodally similar and geographically consistent, serving as reliable reference anchors for the LLM. By integrating these social perspectives and geographic references into a unified instruction-tuning format, we transform the classification task into a context-aware text generation problem and fine-tune the LLM using Low-Rank Adaptation (LoRA). Extensive experiments on the CrisisMMD and DMD datasets demonstrate that MPCG-GECR effectively overcomes data scarcity and context isolation, significantly outperforming existing methods.
Rapid classification of social media content during humanitarian crises is essential for effective disaster relief; however, traditional approaches require extensive annotated training data, which are often unavailable during new disasters. This paper presents a training-free, multimodal classification framework that leverages zero-shot vision-language models to analyze disaster-related social media content without task-specific training. The framework employs a two-stage prompt-engineered pipeline using the locally deployable Mistral-Small-3.1-24B-Instruct model, performing binary informativeness detection followed by multiclass categorization into eight humanitarian categories through structured JSON output generation. Evaluation on the CrisisMMD dataset of 18,082 multimodal samples from seven natural disasters demonstrated binary F1 scores above 0.84 for both text and image informativeness detection and weighted F1 scores of 0.61 (text) and 0.72 (image) for humanitarian categorization. The framework generalizes consistently across all disaster types with minimal performance variance (standard deviation below 0.031) and operates entirely on local infrastructure without cloud dependencies, requiring only moderate GPU resources. By eliminating training data requirements, this approach enables immediate deployment during new disasters, demonstrating that zero-shot multimodal classification achieves practically relevant performance for real-time crisis response.
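The two-stage structured-output pipeline described above hinges on prompting the model for JSON so its answer can be parsed deterministically. The sketch below shows that scaffolding only; `call_model` is deliberately absent, the category labels are illustrative stand-ins for CrisisMMD's humanitarian classes, and the prompt wording is an assumption, not the paper's template.

```python
import json

# Illustrative labels approximating CrisisMMD-style humanitarian classes.
CATEGORIES = ["infrastructure_damage", "rescue_volunteering",
              "affected_individuals", "injured_or_dead",
              "missing_or_found", "displaced_people",
              "caution_advice", "other_relevant"]

def stage1_prompt(text):
    """Stage 1: binary informativeness check."""
    return (f'Post: "{text}"\n'
            'Is this post informative for disaster response? '
            'Answer as JSON: {"informative": true|false}')

def stage2_prompt(text):
    """Stage 2: multiclass humanitarian categorization."""
    return (f'Post: "{text}"\n'
            f'Choose one category from {CATEGORIES}. '
            'Answer as JSON: {"category": "<label>"}')

def parse_stage(raw, key, allowed=None):
    """Validate the model's raw reply; None signals a retry."""
    try:
        value = json.loads(raw)[key]
    except (json.JSONDecodeError, KeyError):
        return None
    if allowed is not None and value not in allowed:
        return None
    return value
```

Constraining stage 2 to a closed label set is what makes the zero-shot output usable without any task-specific training: malformed or off-vocabulary replies are rejected and re-queried rather than silently accepted.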
Public crisis events are unexpected and disastrous events that endanger the overall life of the entire public. Most public crisis events are unpredictable and sudden, including accidents, natural disasters, social unrest, public health emergencies, and so on. When public crisis events occur, users usually send many tweets on social media platforms such as Facebook and Twitter to share real-time conditions and seek help. These tweets, if effectively selected and utilized, will help humanitarian organizations assess the situation and plan relief operations. Previous studies mainly used text data for tweet classification but ignored the complementarity among multimodal data. Although some works combined multimodal data from tweets for tweet classification, but the fusion method was not comprehensive enough, ignoring the heterogeneous differences between the multimodal data. Therefore, the MMC-GAN (Multimodal Cycle-GAN With Mixed Fusion Strategy) model is introduced, which classifies public crisis. The MMC-GAN model consists of three modules: image feature extractor, text feature extractor, and multimodal fusion module. The image feature extractor and the text feature extractor use the newly proposed ResNet(V+H) and BERT to learn visual and textual feature from tweets, respectively. The multimodal fusion module uses the mixed fusion method to get the final classification outcomes after fusing text and image data from several data domains into a single data domain. The MMC-GAN model is validated on the Crisis MMD v2.0 dataset, and the model performance is significantly better than the baseline algorithm and other related research works. SO the MMC-GAN model can effectively fuse the data of different modalities, improving the accuracy of the results about classifying tweets.
Nowadays, misinformation is widely spreading over various social media platforms and causes extremely negative impacts on society. To combat this issue, automatically identifying misinformation, especially those containing multimodal content, has attracted growing attention from the academic and industrial communities, and induced an active research topic named Multimodal Misinformation Detection (MMD). Typically, existing MMD methods capture the semantic correlation and inconsistency between multiple modalities, but neglect some potential clues in multimodal content. Recent studies suggest that manipulated traces of the images in articles are non-trivial clues for detecting misinformation. Meanwhile, we find that the underlying intentions behind the manipulation, e.g., harmful and harmless, also matter in MMD. Accordingly, in this work, we propose to detect misinformation by learning manipulation features that indicate whether the image has been manipulated, as well as intention features regarding the harmful and harmless intentions of the manipulation. Unfortunately, the manipulation and intention labels that make these features discriminative are unknown. To overcome the problem, we propose two weakly supervised signals as alternatives by introducing additional datasets on image manipulation detection and formulating two classification tasks as positive and unlabeled learning problems. Based on these ideas, we propose a novel MMD method, namely Harmfully Manipulated Images Matter in MMD (Hami-m3d). Extensive experiments across three benchmark datasets can demonstrate that Hami-m3d can consistently improve the performance of any MMD baselines.
Recent advances in multimodal AI have enabled progress in detecting synthetic and out-of-context content. However, existing efforts largely overlook the intent behind AI-generated images. To fill this gap, we introduce S-HArM, a multimodal dataset for intent-aware classification, comprising 9,576 ''in the wild'' image–text pairs from Twitter/X and Reddit, labeled as Humor/Satire, Art, or Misinformation. Additionally, we explore three prompting strategies (image-guided, description-guided, and multimodally-guided) to construct a large-scale synthetic training dataset with Stable Diffusion. We conduct an extensive comparative study including modality fusion, contrastive learning, reconstruction networks, attention mechanisms, and large vision-language models. Our results show that models trained on image- and multimodally-guided data generalize better to ''in the wild'' content, due to preserved visual context. However, overall performance remains limited, highlighting the complexity of inferring intent and the need for specialized architectures.
The rise of misinformation on social networks creates serious problems for public awareness, policy-making, and trust in society. Social media content is getting more complex, often including text, metadata, and multimedia, which makes it essential to have systems that can classify misinformation using various signals. This paper introduces a machine learning approach to misinformation detection that uses the MuMiN (Multilingual Multimodal Fact-Checked Misinformation) dataset, which contains annotated claims, supporting evidence, user tweets, and fact-check labels. A structured preprocessing pipeline was applied to prepare the dataset for analysis, and textual and structural features were extracted. Three machine learning models, Random Forest (RF), Gradient Boosting (GB), and a Stacking Classifier, were developed and assessed using key performance metrics. The experimental findings indicate that the stacking ensemble consistently surpasses the individual base classifiers, attaining an accuracy of 89.12%. This highlights the advantage of combining models to manage complex, noisy, multimodal social media data, and underscores the value of merging multimodal feature representations with ensemble learning for effective and scalable misinformation detection on online platforms.
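The stacking idea behind that result can be sketched compactly in numpy: base models' predicted probabilities become meta-features for a second-level learner. The base "models" below are trivial single-feature rules on synthetic data, standing in for the study's RF/GB classifiers, and the meta-learner is a hand-rolled logistic regression; everything here is an illustrative assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_meta(meta_X, y, lr=0.5, steps=500):
    """Logistic-regression meta-learner via plain gradient descent."""
    w, b = np.zeros(meta_X.shape[1]), 0.0
    for _ in range(steps):
        p = sigmoid(meta_X @ w + b)
        grad = p - y                      # dL/dlogit for cross-entropy
        w -= lr * meta_X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

# Synthetic task: label depends on the SUM of two features, but each
# weak base model only sees ONE feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
base1 = sigmoid(3 * X[:, 0])              # weak model on feature 0
base2 = sigmoid(3 * X[:, 1])              # weak model on feature 1
meta_X = np.column_stack([base1, base2])  # stacked meta-features
w, b = fit_meta(meta_X, y)
stacked_acc = ((sigmoid(meta_X @ w + b) > 0.5) == y).mean()
```

Each base rule alone is only about 75% accurate on this task, but the meta-learner recombines their soft outputs into a far better decision, which is the mechanism behind the ensemble's edge in the study.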
Social media posts that direct users to YouTube videos are one of the most effective techniques for spreading misinformation. However, it has been observed that such posts rarely get deleted or flagged. Since multi-modal misinformation that leads to compelling videos has more impact than using just textual content, it is important to characterize and detect such textual post and video pairs to prevent users from becoming victims of misinformation. To address this gap, we build a taxonomy of how links to YouTube videos are used on social media platforms. We then use pairs of posts and videos annotated with this taxonomy to test several classification models built using cross-platform features. Our work reveals several characteristics of post-video pairs, in terms of how posts and videos are related to each other, the type of content they share, and their collective outcome. In addition, we find that traditional approaches to misinformation detection that rely only on text from posts miss a significant number of post-video pairs that contain misinformation. More importantly, we find that to reduce the spread of misinformation via post-video pairs, classifiers would be more effective if they are designed to use data and features from multiple diverse platforms.
In the era of rapid digital news development, fake news poses a severe threat to societal trust and information authenticity, especially with the advent of online media platforms that facilitate the creation and dissemination of fabricated information. Although various techniques have been developed to discriminate between authentic and fake news, a practical fake news classification framework is still needed to automatically deliver high classification performance and impede the spread of misinformation. To fill this gap, this study proposes a multi-level fusion-based CNN with dual-Conv layers-RNN (CDLR) framework that fuses Convolutional Neural Networks (CNN) with dual Conv layers, Recurrent Neural Networks (RNN), and a classification module for multimodal fake news classification. The framework fuses the CNN (with dual Conv layers) and the RNN to enhance classification ability and extract high-quality semantic textual and visual features for identifying misinformation effectively. After pre-processing, the extracted weight matrix was fed to the CNN (with dual Conv layers) to learn and extract deep visual features, and to the RNN for high-quality feature extraction from textual data and news articles. Likewise, a fusion mechanism was designed to validate the framework across different variants, namely mean fusion, weighted-mean fusion, maximum fusion, and sum fusion. Finally, a classification module with a polynomial kernel was employed to categorize the extracted data as fake or real. A comprehensive experimental analysis evaluated the proposed framework's effectiveness by combining early and late fusion mechanisms with baseline methods on five extensive, fair, and diverse datasets. The framework's accuracy was 0.9725 on ISOT, 0.9107 on Fake vs. Real News, 0.9816 on WELFake, 0.5403 on FA-KES, and 0.9163 on the Twitter dataset, indicating its robustness in classifying fake news compared to benchmark methods. Lastly, the study proposes recommendations to mitigate the adverse effects of fake news based on the predictions made using the fusion-based framework.
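The four fusion variants named above can be sketched in a few lines; the feature vectors below are invented placeholders, not outputs of the CDLR framework:

```python
import numpy as np

# Toy illustration of the mean, weighted-mean, maximum, and sum fusion
# variants, applied to a pair of hypothetical visual/textual feature vectors.
visual = np.array([0.2, 0.8, 0.5])
textual = np.array([0.6, 0.4, 0.7])

mean_fused = (visual + textual) / 2
w = 0.7  # illustrative weight favouring the visual branch
weighted_mean_fused = w * visual + (1 - w) * textual
max_fused = np.maximum(visual, textual)  # element-wise maximum
sum_fused = visual + textual

print(mean_fused, weighted_mean_fused, max_fused, sum_fused)
```

Each variant trades off differently: max fusion keeps the strongest per-dimension signal, while (weighted-)mean fusion smooths disagreement between modalities.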
No abstract available
The widespread circulation of false information through digital media presents a significant challenge to public confidence, societal harmony, and informed decision-making. To tackle this issue, this work proposes a dual-input fake news detection framework that leverages both textual and image-based analysis for improved performance and trustworthiness. Central to the system is a Hybrid-CNN model that extracts features from text using convolutional layers and subsequently classifies them using Support Vector Machines (SVM) and Naive Bayes algorithms, enhancing classification precision. Alongside, a Transformer-based image analysis module is incorporated to detect manipulated or deceptive visuals frequently used in misleading content. These components operate independently, enabling accurate assessment of both text and images, thereby ensuring adaptability across content types. Offered as an online application, the platform allows users to verify the credibility of news through text or image submissions in real time. By analyzing both linguistic patterns and visual elements, the system achieves higher accuracy in identifying misinformation. Experimental results highlight the effectiveness of the proposed solution, which provides a resilient and scalable approach to addressing the proliferation of fake news in the current digital ecosystem.
Social media has emerged during the last decade as a potential information source in crisis scenarios, providing data in real time or just after the occurrence of an event. Nevertheless, current social media data acquisition procedures result in datasets containing myriad non-informative content, which hinders subsequent analyses. While previous studies have used machine learning to address this issue, they typically require many labeled examples, hindering their use in real-world scenarios. Moreover, social media posts tend to be multimodal, which adds complexity to how these data should be represented. This paper extends our previous work and presents a new method for identifying the most informative content related to an event in textual and visual data through few-shot learning. The results show that this method outperforms existing approaches in both performance and efficiency, offering a valuable solution for timely analysis of crisis-related social media data and advancing research in this area.
Natural disasters generate vast amounts of multi-modal social media content on platforms like Twitter, containing both textual and visual information crucial for emergency response. However, existing disaster monitoring systems often analyze text and images in isolation, leading to incomplete situational awareness and delayed crisis response. This study proposes a comprehensive multimodal framework that integrates natural language processing and computer vision techniques through a context-aware late fusion strategy. Our system employs a dual-pipeline architecture: the text pipeline utilizes XLNet for binary disaster detection, BERTweet for 16-category disaster type classification, and Gemini for zero-shot semantic intent analysis (Disaster Information, Emergency Help, and Emotion Sharing), while the image pipeline applies EfficientNet-B4 for informativeness assessment, humanitarian categorization, and damage severity evaluation. The multimodal outputs are integrated using weighted voting, where weights are assigned proportionally to each model’s validation accuracy, and validated against ReliefWeb for real-time verification. Experimental results on the Disaster Tweets Normalized and CrisisMMD datasets demonstrate exceptional performance: XLNet achieved 98% F1-score for disaster detection, BERTweet attained 96% F1-score for type classification, and Gemini reached 94% F1-score for intent analysis. EfficientNet-B4 showed consistent performance across visual tasks with F1-scores of 82% (informativeness), 76% (humanitarian), and 65% (damage severity). The multimodal fusion approach significantly outperformed unimodal baselines, successfully prioritizing critical disaster events through severity-based ranking, thus providing emergency responders with actionable intelligence for effective crisis management.
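The accuracy-weighted late-fusion vote described above can be illustrated with a minimal sketch; the model names, labels, and accuracy values are invented placeholders:

```python
# Minimal sketch of late fusion by weighted voting, where each model's vote
# is weighted in proportion to its validation accuracy.
def weighted_vote(predictions, val_accuracies):
    """predictions: {model: label}; val_accuracies: {model: accuracy}."""
    total = sum(val_accuracies.values())
    scores = {}
    for model, label in predictions.items():
        weight = val_accuracies[model] / total  # proportional weighting
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)

preds = {"text_model": "disaster", "image_model": "not_disaster",
         "intent_model": "disaster"}
accs = {"text_model": 0.98, "image_model": 0.82, "intent_model": 0.94}
print(weighted_vote(preds, accs))  # the two text-side votes outweigh the image vote
```

The design choice here matters: a stronger validation model can override a weaker one even when outnumbered, yet no single model dictates the outcome.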
In times of crisis, the prompt and precise classification of disaster-related information shared on social media platforms is crucial for effective disaster response and public safety. During such critical events, individuals use social media to communicate, sharing multimodal textual and visual content. However, due to the significant influx of unfiltered and diverse data, humanitarian organizations face challenges in leveraging this information efficiently. Existing methods for classifying disaster-related content often fail to model users' credibility, emotional context, and social interaction information, which are essential for accurate classification. To address this gap, we propose CrisisSpot, a method that utilizes a Graph-based Neural Network to capture complex relationships between textual and visual modalities, as well as Social Context Features to incorporate user-centric and content-centric information. We also introduce Inverted Dual Embedded Attention (IDEA), which captures both harmonious and contrasting patterns within the data to enhance multimodal interactions and provide richer insights. Additionally, we present TSEqD (Turkey-Syria Earthquake Dataset), a large annotated dataset for a single disaster event, containing 10,352 samples. Through extensive experiments, CrisisSpot demonstrated significant improvements, achieving an average F1-score gain of 9.45% and 5.01% compared to state-of-the-art methods on the publicly available CrisisMMD dataset and the TSEqD dataset, respectively.
The proliferation of misinformation poses significant challenges to information integrity in the digital age. This research work presents a detailed analysis of how misinformation affects user behaviour on social networking sites like Reddit, and proposes a unified multimodal approach for detecting misinformation using both fake news texts and tampered images, by combining state-of-the-art deep learning models such as DistilBERT for text classification and Vision Transformer (ViT) for image classification. The analysis suggested that posts containing fake news encourage users to engage superficially, without promoting critical thinking. The proposed method integrates the transformer-based models to create a robust system capable of addressing the dual challenges of content verification. The models were trained and evaluated on a comprehensive dataset containing paired news articles and images, achieving an impressive accuracy of 92.5% in detecting fake news texts and 88.1% in identifying tampered images. The proposed unified model serves as a scalable solution for mitigating the spread of misinformation and manipulated visuals, contributing to the broader effort of maintaining the integrity of online information.
Information sharing on social media has become a common practice for people around the world. Since it is difficult to check user-generated content on social media, huge amounts of rumors and misinformation are being spread with authentic information. On the one hand, most of the social platforms identify rumors through manual fact-checking, which is very inefficient. On the other hand, with an emerging form of misinformation that contains inconsistent image–text pairs, it would be beneficial if we could compare the meaning of multimodal content within the same post for detecting image–text inconsistency. In this paper, we propose a novel approach to misinformation detection by multimodal feature fusion with transformers and credibility assessment with self-attention-based Bi-RNN networks. Firstly, captions are derived from images using an image captioning module to obtain their semantic descriptions. These are compared with surrounding text by fine-tuning transformers for consistency check in semantics. Then, to further aggregate sentiment features into text representation, we fine-tune a separate transformer for text sentiment classification, where the output is concatenated to augment text embeddings. Finally, Multi-Cell Bi-GRUs with self-attention are used to train the credibility assessment model for misinformation detection. From the experimental results on tweets, the best performance with an accuracy of 0.904 and an F1-score of 0.921 can be obtained when applying feature fusion of augmented embeddings with sentiment classification results. This shows the potential of the innovative way of applying transformers in our proposed approach to misinformation detection. Further investigation is needed to validate the performance on various types of multimodal discrepancies.
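The caption-versus-text consistency check above is, at its core, a semantic similarity comparison. A minimal sketch using bag-of-words cosine similarity follows (the paper fine-tunes transformers instead; the caption and posts below are invented examples):

```python
import math
from collections import Counter

# Toy image-text consistency check: compare an auto-generated image caption
# with the surrounding post text via cosine similarity of word-count vectors.
def cosine_sim(a, b):
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in set(va) & set(vb))
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

caption = "a flooded street with stranded cars"
post = "massive flood leaves cars stranded on city street"
# A consistent post scores higher than an unrelated one.
print(cosine_sim(caption, post) > cosine_sim(caption, "celebrity wedding photos"))
```

A low similarity between what the image shows and what the text claims is the signal the method exploits for image-text inconsistency detection.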
No abstract available
The paper presents the outcomes of AI-COVID19, our project aimed at better understanding of misinformation flow about COVID-19 across social media platforms. The specific focus of the study reported in this paper is on collecting data from Telegram groups which are active in promotion of COVID-related misinformation. Our corpus collected so far contains around 28 million words, from almost one million messages. Given that a substantial portion of misinformation flow in social media is spread via multimodal means, such as images and video, we have also developed a mechanism for utilising such channels via producing automatic transcripts for videos and automatic classification for images into such categories as memes, screenshots of posts and other kinds of images. The accuracy of the image classification pipeline is around 87%.
Advances in Natural Language Processing (NLP) have revolutionized the way researchers and practitioners address crucial societal problems. Large language models are now the standard to develop state-of-the-art solutions for text detection and classification tasks. However, the development of advanced computational techniques and resources is disproportionately focused on the English language, sidelining a majority of the languages spoken globally. While existing research has developed better multilingual and monolingual language models to bridge this language disparity between English and non-English languages, we explore the promise of incorporating the information contained in images via multimodal machine learning. Our comparative analyses on three detection tasks focusing on crisis information, fake news, and emotion recognition, as well as five high-resource non-English languages, demonstrate that: (a) detection frameworks based on pre-trained large language models like BERT and multilingual-BERT systematically perform better on the English language compared against non-English languages, and (b) including images via multimodal learning bridges this performance gap. We situate our findings with respect to existing work on the pitfalls of large language models, and discuss their theoretical and practical implications.
The outbreak of COVID-19 has resulted in an "infodemic" that has encouraged the propagation of misinformation about COVID-19 and cure methods which, in turn, could negatively affect the adoption of recommended public health measures in the larger population. In this paper, we provide a new multimodal (consisting of images, text and temporal information) labeled dataset containing news articles and tweets on the COVID-19 vaccine. We collected 2,593 news articles from 80 publishers over one year, from Feb 16th 2020 to May 8th 2021, and 24,184 Twitter posts (collected between April 17th 2021 and May 8th 2021). We combine ratings from two news media ranking sites, the Media Bias Chart and Media Bias/Fact Check (MBFC), to classify the news dataset into two levels of credibility: reliable and unreliable. The combination of two filters allows for higher precision of labeling. We also propose a stance detection mechanism to annotate tweets into three levels of credibility: reliable, unreliable and inconclusive. We provide several statistics as well as other analytics, such as publisher distribution, publication date distribution, and topic analysis. We also provide a novel architecture that classifies the news data into misinformation or truth to provide a baseline performance for this dataset. We find that the proposed architecture has an F-score of 0.919 and accuracy of 0.882 for fake news detection. Furthermore, we provide benchmark performance for misinformation detection on the tweet dataset. This new multimodal dataset can be used in research on the COVID-19 vaccine, including misinformation detection and the influence of fake COVID-19 vaccine information.
Motivated by the practical need to enhance social media rumor refutation effectiveness, this paper develops a rumor refutation effectiveness index (REI), identifies key factors influencing REI, and proposes decision-making suggestions for rumor refutation platforms. 298,118 comments and 185,209 reposter verification-status records for 248 rumor refutation microblogs on Sina Weibo (the Chinese equivalent of Twitter) were collected during a one-year period using a web crawler. Natural Language Processing (NLP) approaches are applied to extract text characteristics and analyze the sentiment of the rumor refutation microblogs. To explore the relationship between REI and the content and contextual factors of the microblogs, four regression models are established on the collected data: a linear regression model, Support Vector Regression (SVR), Extreme Gradient Boosting (XGBoostRegressor), and Light Gradient Boosting Machine (LGBMRegressor). The LGBMRegressor has the best goodness-of-fit among the compared regression models. SHapley Additive exPlanations (SHAP) is then employed to visualize and explain the LGBMRegressor results. Finally, decision-making suggestions are proposed for how rumor refutation platforms should organize refutation microblogs under different situations, such as rumor category, author influence, and heat of topic.
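The model-comparison step (selecting the regressor with the best goodness-of-fit) can be sketched library-agnostically; here linear regression and scikit-learn's gradient boosting stand in for the paper's four models, on invented nonlinear data:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Sketch of selecting a regression model by goodness-of-fit (R^2 on held-out
# data); the target and the two candidate models are toy stand-ins.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(400)  # nonlinear target
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {"linear": LinearRegression(),
          "boosting": GradientBoostingRegressor(random_state=0)}
r2 = {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}
best = max(r2, key=r2.get)
print(best)
```

On nonlinear data the boosting model fits better, mirroring why LGBMRegressor outperformed the linear baseline in the study; an explainer such as SHAP would then be applied to the selected model.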
This research examines the typology of rumors and fact-checking mechanisms in Chinese social media, focusing on the WeChat platform. The study analyzes 300 cases of disinformation extracted from the "Rumor Refutation Assistant" application in WeChat between 2023 and 2025 using Python-based tools. The author investigates the structural and content characteristics of rumors, their thematic classification across categories (healthcare, public safety, and others), and both institutional and user-driven verification strategies. Special attention is given to the relationship between rumor types and the effectiveness of fact-checking mechanisms in China. The methodology includes content analysis for fake typology, text mining techniques (TF-IDF, LDA), and social network analysis to examine information dissemination patterns. Findings reveal significant patterns in the distribution of fakes, where algorithmic and institutional factors substantially influence information perception. Healthcare-related messages (39.67%), technology information (23.00%), and public safety content (21.33%) dominate the landscape of fakes. The author's contribution lies in analyzing information verification mechanisms within Chinese social media and identifying correlations between fake typologies and the effectiveness of refutation strategies. The research novelty stems from examining rumor typology and fact-checking in the Chinese context, emphasizing WeChat's role in information dissemination. The study demonstrates that mitigating disinformation requires AI integration, active user participation in fact-checking, and effective legal regulation of the information space.
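The text-mining step (topic modelling of rumor texts) can be sketched with scikit-learn; the corpus below is an invented stand-in for the WeChat cases, and LDA is fit on raw term counts, the usual practice:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy sketch of LDA topic modelling over rumor texts (the study combines
# TF-IDF keyword weighting and LDA; only the LDA step is shown here).
docs = [
    "vaccine cures virus overnight miracle health claim",
    "new vaccine health warning virus spread hospital",
    "earthquake warning city evacuation safety alert",
    "flood safety alert city evacuation rescue",
]
X = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
topics = lda.transform(X)  # per-document topic distribution, rows sum to 1
print(topics.shape)
```

Each row of `topics` gives the document's mixture over the two latent topics, which is how thematic categories such as healthcare versus public safety can be surfaced from raw rumor texts.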
The rapid proliferation of the internet has expedited the dissemination of multimedia rumor content on social media. Nevertheless, existing multimodal rumor detection approaches predominantly concentrate on the intrinsic features of the multimodal rumor content, lacking factual evidence support, which constrains their generalizability and rationality. Inspired by the task of Natural Language Inference (NLI), we propose a multimodal rumor-evidence-aware method in this paper. Given a multimodal rumor instance, we predict the extent to which available evidence supports the multimodal rumor content via an evidence-aware model. To prove the effectiveness of this method, we have collected a Chinese Social Media Rumor dataset with Evidence (CSMRE) comprising 12,371 instances. Experimental results on the CSMRE dataset show the superior performance of our method.
Considering individuals can freely post messages on social media platforms, there is a large amount of unverified information, so-called rumors, spreading on these platforms, which seriously affects users' experience and even disturbs social order. The application of Multi-Task Learning (MTL) in the field of rumor verification has witnessed great development, improving rumor verification performance through jointly training the main task of rumor verification and the auxiliary task of stance classification. However, traditional MTL-based rumor verification schemes cannot adaptively weight different positions of the data sequence to represent it effectively, which in turn limits verification performance. This paper proposes a novel rumor verification scheme for social media, MARV, that effectively exploits MTL and the multi-head attention mechanism. Specifically, the shared LSTM layer in MARV processes and represents the tweet sequences and generates high-level virtual features. Then, in the branch of the rumor verification task, the multi-head attention layer learns the local dependencies in the high-level representations extracted from the shared layer. The experimental results on the PHEME and RumourEval datasets demonstrate that the proposed MARV scheme is superior to other MTL-based rumor verification schemes. Moreover, we also investigate the impact of placing the attention module at different positions in the MTL-based rumor verification architecture.
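Multi-head scaled dot-product attention over a sequence of hidden states, the mechanism MARV places on top of its shared LSTM, can be sketched in NumPy; the dimensions and random projections below are illustrative, not MARV's trained parameters:

```python
import numpy as np

# Toy multi-head self-attention over a sequence of hidden states, standing
# in for attention over shared-LSTM outputs.
rng = np.random.default_rng(0)
T, d, heads = 5, 8, 2              # sequence length, model dim, head count
H = rng.standard_normal((T, d))    # stand-in for LSTM hidden states
dh = d // heads

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

outputs = []
for _ in range(heads):
    Wq, Wk, Wv = (rng.standard_normal((d, dh)) for _ in range(3))
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    A = softmax(Q @ K.T / np.sqrt(dh))  # each position weights all positions
    outputs.append(A @ V)
out = np.concatenate(outputs, axis=-1)  # (T, d) attended representation
print(out.shape)
```

The per-position attention weights are exactly what lets the model "adaptively weight different positions of the data sequence", the limitation the paper identifies in plain MTL schemes.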
No abstract available
Although studies have investigated cyber-rumoring prior to the pandemic, little research has been undertaken on rumors and rumor corrections during the COVID-19 (coronavirus disease 2019) pandemic. Drawing on prior studies about how online stories become viral, this study fills that gap by investigating the retransmission of COVID-19 rumors and corrective messages on Sina Weibo, the largest and most popular microblogging site in China. This study examines the impact of rumor types, content attributes (including frames, emotion, and rationality), and source characteristics (including follower size and source identity) to show how they affect the likelihood of a COVID-19 rumor and its correction being shared. By exploring the retransmission of rumors and their corrections in Chinese social media, this study not only advances scholarly understanding but also reveals how corrective messages can be crafted to debunk cyber-rumors in particular cultural contexts.
No abstract available
No abstract available
No abstract available
Health-related rumors being spread online during a public crisis may pose a serious threat to people's well-being. Existing crisis informatics research lacks in-depth insights into the characteristics of health rumors and the efforts to debunk them on social media in a pandemic. To fill this gap, we conduct a comprehensive analysis of four months of rumor-related online discussion during COVID-19 on Weibo, a Chinese microblogging site. Results suggest that the dread (cause fear) type of health rumors provoked significantly more discussions and lasted longer than the wish (raise hope) type. We further explore how four kinds of social media users (i.e., government, media, organization, and individual) combat health rumors, and identify their preferred way of sharing the debunking information and the key rhetoric strategies used in the process. We examine the relationship between debunking and rumor discussions using a Granger causality approach, and show the efficacy of debunking in suppressing rumor discussions, which is time-sensitive and varies according to rumor type and debunker. Our results can provide insights into crisis informatics and risk management on social media in pandemic settings.
The wide dissemination and misleading effects of online rumors on social media have become a critical issue concerning the public and government. Detecting and regulating social media rumors is important for ensuring users receive truthful information and maintaining social harmony. Most of the existing rumor detection methods focus on inferring clues from media content and social context, which largely ignores the rich knowledge information behind the highly condensed text which is useful for rumor verification. Furthermore, existing rumor detection models underperform on unseen events because they tend to capture lots of event-specific features in seen data which cannot be transferred to newly emerged events. In order to address these issues, we propose a novel Multimodal Knowledge-aware Event Memory Network (MKEMN) which utilizes the Multi-modal Knowledge-aware Network (MKN) and Event Memory Network (EMN) as building blocks for social media rumor detection. Specifically, the MKN learns the multi-modal representation of the post on social media and retrieves external knowledge from real-world knowledge graph to complement the semantic representation of short texts of posts and takes conceptual knowledge as additional evidence to improve rumor detection. The EMN extracts event-invariant features of events and stores them into global memory. Given an event representation, the EMN takes it as a query to retrieve the memory network and output the corresponding features shared among events. With the additional information provided by EMN, our model can learn robust representations of events and consistently perform well on the newly emerged events. Extensive experiments on two Twitter benchmark datasets demonstrate that our rumor detection method achieves much better results than state-of-the-art methods.
This study develops a machine learning-based system for the real-time detection of rumors on social media, particularly focusing on Twitter. We evaluated several models including Logistic Regression, Support Vector Machine (SVM), Naive Bayes, Random Forest, Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and an Ensemble model combining Random Forest, SVM, and Logistic Regression. The CNN model demonstrated superior performance with the highest accuracy of 88%, closely followed by the Ensemble and SVM models at 87%. Naive Bayes reported the lowest accuracy at 82%, indicating challenges with complex social media text. The effectiveness of the CNN is attributed to its advanced capability to capture complex linguistic and contextual nuances, significantly enhancing classification accuracy. Furthermore, the system includes a user-friendly interface developed with Streamlit, facilitating swift and accurate information verification by users. The robustness of these machine learning models, particularly CNN, marks a significant advancement over traditional rumor detection methods, offering a scalable and effective solution for combating misinformation on social media platforms.
There is no doubt that social media has changed how people access information. However, it has also allowed misinformation to spread widely. Fake news, rumors, phishing links, and spam have become serious threats. These issues have led to societal division, economic damage, and a loss of trust in digital platforms. Furthermore, most existing detection systems focus on either text or images and do not work well across different types of content. This study introduces a Hybrid Machine Learning Framework that combines various signals, text features, visual cues, and user-based metadata with graph neural networks for secure information verification. The approach uses contextual embeddings from BERT for text, ResNet or Vision Transformers for images, handcrafted statistical features for URLs and user accounts, and graph-based propagation analyses for rumor detection. An attention-based fusion module dynamically adjusts the fusion weights over these modalities based on the input, making predictions more robust. Experimental results on benchmarks including FakeNewsNet, Weibo, and UCI Phishing show that the hybrid framework outperforms unimodal baselines in terms of accuracy, precision, recall, and F1-score. This approach provides a scalable foundation for reliable social media monitoring.
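The attention-based fusion module, which adjusts per-modality weights for each input, can be sketched as a softmax gate over modality features; the feature vectors and the scoring vector below are invented placeholders (they would be learned in a real model):

```python
import numpy as np

# Minimal sketch of an attention-style fusion gate over modality features.
rng = np.random.default_rng(0)
d = 16
modalities = {
    "text": rng.standard_normal(d),   # e.g. a BERT embedding
    "image": rng.standard_normal(d),  # e.g. a ResNet/ViT embedding
    "user": rng.standard_normal(d),   # handcrafted account features
}
w_score = rng.standard_normal(d)      # scoring vector (learned in practice)

scores = np.array([f @ w_score for f in modalities.values()])
weights = np.exp(scores) / np.exp(scores).sum()  # softmax over modalities
fused = sum(w * f for w, f in zip(weights, modalities.values()))
print(fused.shape, round(float(weights.sum()), 6))
```

Because the weights are recomputed per input, an image-heavy post can lean on its visual features while a text-only rumor leans on its embeddings, which is the robustness argument the framework makes.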
No abstract available
With the rapid development of the Internet, people obtain much information from social media such as Twitter and Weibo every day. However, due to the complex structure of social media, many rumors with corresponding images are mixed in factual information to be widely spread, which misleads readers and exerts adverse effects on society. Automatically detecting social media rumors has become a challenge faced by contemporary society. To overcome this challenge, we proposed the multimodal affine fusion network (MAFN) combined with entity recognition, a new end-to-end framework that fuses multimodal features to detect rumors effectively. The MAFN mainly consists of four parts: the entity recognition enhanced textual feature extractor, the visual feature extractor, the multimodal affine fuser, and the rumor detector. The entity recognition enhanced textual feature extractor is responsible for extracting textual features that enhance semantics with entity recognition from posts. The visual feature extractor extracts visual features. The multimodal affine fuser extracts the three types of modal features and fuses them by the affine method. It cooperates with the rumor detector to learn the representations for rumor detection to produce reliable fusion detection. Extensive experiments were conducted on the MAFN based on real Weibo and Twitter multimodal datasets, which verified the effectiveness of the proposed multimodal fusion neural network in rumor detection.
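One common reading of "affine fusion" is a scale-and-shift (FiLM-style) transform of one modality conditioned on the other; the sketch below follows that reading under stated assumptions, and the exact affine method in MAFN may differ. All tensors are toy placeholders:

```python
import numpy as np

# Hedged sketch of affine fusion: the textual features produce scale (gamma)
# and shift (beta) parameters that transform the visual features.
rng = np.random.default_rng(0)
d = 8
text_feat = rng.standard_normal(d)    # entity-enhanced textual features
visual_feat = rng.standard_normal(d)  # visual features

W_gamma = rng.standard_normal((d, d))
W_beta = rng.standard_normal((d, d))
gamma = W_gamma @ text_feat  # text-conditioned scale
beta = W_beta @ text_feat    # text-conditioned shift
fused = gamma * visual_feat + beta  # affine transform of visual features
print(fused.shape)
```

The fused vector would then feed the rumor detector, so that the fusion parameters are trained jointly with the classification objective.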
No abstract available
Rumor verification on social media aims to identify the truth value of a rumor, which is important to decrease the detrimental public effects. A rumor might arouse heated discussions and replies, conveying different stances of users that could be helpful in identifying the rumor. Thus, several works have been proposed to verify a rumor by modelling its entire stance sequence in the time domain. However, these works ignore that such a stance sequence could be decomposed into controversies with different intensities, which could be used to cluster the stance sequences with the same consensus. In addition, the existing stance extractors fail to consider both the impact of all previously posted tweets and the reply chain on obtaining the stance of a new reply. To address the above problems, in this article, we propose a novel stance-based network to aggregate the controversies of the stance sequence for rumor verification, termed Filter-based Stance Network (FSNet). As controversies with different intensities are reflected as the different changes of stances, it is convenient to represent different controversies in the frequency domain, but it is hard in the time domain. Our proposed FSNet decomposes the stance sequence into multiple controversies in the frequency domain and obtains the weighted aggregation of them. Specifically, FSNet consists of two modules: the stance extractor and the filter block. To obtain better stance features toward the source, the stance extractor contains two stages. In the first stage, the tweet representation of each reply is obtained by aggregating information from all previously posted tweets in a conversation. Then, the features of stance toward the source, i.e., rumor-aware stance, are extracted with the reply chains in the second stage. In the filter block module, a rumor-aware stance sequence is constructed by sorting all the tweets of a conversation in chronological order. 
Fourier Transform thereafter is employed to convert the stance sequence into the frequency domain, where different frequency components reflect controversies of different intensities. Finally, a frequency filter is applied to explore the different contributions of controversies. We supervise our FSNet with both stance labels and rumor labels to strengthen the relations between rumor veracity and crowd stances. Extensive experiments on two benchmark datasets demonstrate that our model substantially outperforms all the baselines.
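FSNet's frequency-domain view can be illustrated with a toy stance sequence: low-frequency components capture slow consensus drift, while high-frequency components capture sharp back-and-forth controversy. The scalar stance scores and the cutoff below are invented for illustration:

```python
import numpy as np

# Toy frequency-domain decomposition of a stance sequence: +1 = supporting
# reply, -1 = denying reply, ordered chronologically.
stances = np.array([1, 1, -1, 1, -1, 1, 1, 1, -1, 1, 1, 1], dtype=float)

spectrum = np.fft.rfft(stances)   # stance sequence -> frequency domain
keep = 3                          # illustrative low-pass cutoff
low_pass = spectrum.copy()
low_pass[keep:] = 0               # suppress high-frequency controversy
consensus = np.fft.irfft(low_pass, n=len(stances))  # smoothed consensus signal
print(consensus.shape, round(float(consensus.mean()), 3))
```

FSNet learns which frequency bands to emphasize rather than hard-zeroing them as here, but the mechanism is the same: different controversy intensities live in different frequency components.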
Fake news is one of the most significant problems of today's Internet and social media landscape. The fact that news can travel great distances in a matter of hours is a blessing, but it also enables individuals and organizations to disseminate false information, and many fake news reports carry propaganda against an individual, group, organization, or political party. There are far too many fake news stories for humans to handle, so machine learning classifiers that can recognize these reports automatically are needed. In this study, fake news data collected from Kaggle is analyzed with several approaches: logistic regression, a hard-voting classifier, gradient boosting, AdaBoost, and random forest. Such techniques can help readers assess the degree of truth in a news item rather than relying on it blindly, and can help reduce the unrest that false reports often provoke between religious or community groups in many countries.
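The classical ensemble setup this study describes can be sketched with scikit-learn; the synthetic features below stand in for vectorized Kaggle news text, and the base-model settings are assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for vectorized news features (label 1 = fake)
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Hard voting: each base model casts one vote per article,
# and the majority label wins.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("ada", AdaBoostClassifier(random_state=0)),
        ("rf", RandomForestClassifier(random_state=0)),
    ],
    voting="hard",
)
ensemble.fit(X_tr, y_tr)
test_acc = ensemble.score(X_te, y_te)
```

Hard voting aggregates the individual classifiers named in the abstract; on real data each base model would be fed the same vectorized article text.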
No abstract available
With the proliferation of social media, the widespread dissemination of rumors has caused severe societal harm. While traditional rumor detection methods heavily depend on labeled datasets for model training, zero-shot rumor detection has emerged as a key research focus. Concurrently, the rapid advancement of Large Language Models (LLMs) presents new opportunities for establishing a training-free paradigm for rumor detection. This paper aims to systematically evaluate the behavioral patterns of open-source LLMs, represented by ChatGLM3, under different prompting strategies, and to explore how Chain-of-Thought (CoT) prompts can activate their zero-shot rumor detection potential. We conducted experiments on two public Chinese Weibo rumor datasets (Weibo Rumor Dataset and MDFEND), comparing “direct prompting” with various “Chain-of-Thought prompting” strategies. To gain a more comprehensive understanding of LLM performance, we also introduced Kimi and Qwen2 as comparative models and compared the zero-shot performance of all LLMs against a fully supervised BERT baseline. The experimental results reveal a critical finding: under direct prompting, ChatGLM3 exhibits a severe “conservative bias,” rendering it almost incapable of effectively identifying rumors. However, when guided by CoT prompts, its F1-score improved by over 440%, showcasing a dramatic transformation from 'failure' to 'efficacy'. Although its absolute performance still lags behind more powerful closed-source models and the supervised baseline, this transformation clearly demonstrates that CoT acts as a critical 'switch' to activate the task-specific capabilities of open-source LLMs. Our study provides valuable empirical insights and a methodological framework for effectively leveraging existing open-source LLMs through prompt engineering, especially under resource-constrained conditions.
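The contrast between the two prompting strategies can be illustrated with simple template functions; the wording below is illustrative only and not the paper's actual prompts:

```python
def direct_prompt(post: str) -> str:
    """Direct prompting: ask for a verdict with no reasoning steps."""
    return (f"判断以下微博是否为谣言,只回答'谣言'或'非谣言'。\n"
            f"微博:{post}")

def cot_prompt(post: str) -> str:
    """Chain-of-Thought prompting: elicit step-by-step analysis before the verdict."""
    return (f"请一步一步分析以下微博:先检查信息来源是否可靠,"
            f"再检查内容是否符合常识,最后给出结论'谣言'或'非谣言'。\n"
            f"微博:{post}")

p = cot_prompt("某地自来水已被污染,请大家不要饮用!")
```

The key difference the paper measures is exactly this: the same post, with and without an explicit instruction to reason step by step, sent to the same zero-shot LLM.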
The pervasive reach of the Internet has revolutionized information access and transmission, which has contributed to the widespread dissemination of rumors on social media. This study explored the impact of real-world integrated counter-rumor features, specifically community notes (which provide context and additional information from the online community) and related articles (which link to verified news sources that address the rumor), on online users’ intentions to believe and spread rumor tweets on social media. Additionally, we investigated how these features mitigate online users’ intentions to believe and spread different types of rumor messages, including wish and dread rumors. After conducting an experimental study with 201 online users on social media, we found that the presence of integrated counter-rumor features in rumor tweets can reduce online users’ intentions to believe and spread rumors, regardless of the specific feature used. While we observed no significant differences between the effects of community notes and related articles on overall online users’ intentions, a nuanced pattern emerged when we considered wish and dread rumors. Specifically, community notes proved more effective at reducing online users’ intentions to believe and spread wish-related rumors due to the diverse perspectives and opinions within the online community. By contrast, related articles were found to have greater efficacy at mitigating online users’ intentions to believe and spread dread rumors, as they can provide more concrete information to alleviate any associated fear or anxiety. Our findings contribute theoretical and practical insights for effectively countering the spread of rumor tweets on social media platforms.
Rumor detection aims to identify and mitigate potentially damaging falsehoods, thereby shielding the public from misleading information. However, existing methods fall short of tackling class imbalance, meaning that rumors are less common than true messages, as they lack specific adaptation for the context of rumor dissemination. In this work, we propose Dual Graph Networks with Synthetic Oversampling (SynDGN), a novel method that can determine whether a claim made on social media is a rumor or not in the presence of class imbalance. SynDGN properly utilizes dual graphs to integrate social media contexts and user characteristics to make accurate predictions. Experiments conducted on two well-known datasets verify that SynDGN consistently outperforms state-of-the-art models, regardless of whether the data is balanced or not.
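The abstract does not spell out SynDGN's exact oversampling procedure; a SMOTE-style interpolation over minority-class (rumor) embeddings, a common choice for synthetic oversampling, can be sketched as:

```python
import numpy as np

def synthetic_oversample(minority, n_new, seed=0):
    """SMOTE-style oversampling: interpolate between random pairs of
    minority-class (rumor) embeddings to rebalance the classes."""
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(minority), n_new)   # first parent of each pair
    j = rng.integers(0, len(minority), n_new)   # second parent of each pair
    lam = rng.random((n_new, 1))                # interpolation weight per sample
    return minority[i] + lam * (minority[j] - minority[i])

rumors = np.array([[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]])  # toy rumor embeddings
synthetic = synthetic_oversample(rumors, n_new=5)
```

Each synthetic point is a convex combination of two real rumor embeddings, so it stays inside the minority class's region of the feature space.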
Rumors spread on social media overshadow the truth and trigger public panic. One effective countermeasure is online rumor-combating; however, its effectiveness on social media has not been fully verified. In this study, drawing on construal level theory, we use temporal distance, the time interval between a rumor-combating post being released and receiving responses from social media users, to measure the effectiveness of rumor-combating. We also adopt the elaboration likelihood model to explore the factors that could enhance this effectiveness. The empirical results show that perceptible (central-route) factors, including the author's authoritative combating methods, media richness, and positive emotions, are negatively related to temporal distance and are more effective for enhancing rumor-combating effectiveness than imperceptible (peripheral-route) factors, such as the author's influence and activeness. In addition, media richness exerts positive moderating effects on the relationship between perceptible-route factors and rumor-combating effectiveness, implying that with the help of images or videos, rumor-combating effectiveness improves. This study addresses the need to enhance the effectiveness of rumor-combating and has practical implications for combating rumors on social media.
Rumor detection on social media puts pre-trained language models (LMs), such as BERT, and auxiliary features, such as comments, into use. However, on the one hand, Chinese rumor detection datasets with comments are rare; on the other hand, intensive interaction of attention in Transformer-based models like BERT may hinder performance improvement. To alleviate these problems, we build a new Chinese microblog dataset named Weibo20 by collecting posts and associated comments from Sina Weibo, and propose a new ensemble named STANKER (Stacking neTwork bAsed-on atteNtion-masKed BERT). STANKER adopts two level-grained attention-masked BERT (LGAM-BERT) models as base encoders. Unlike the original BERT, our new LGAM-BERT model takes comments as important auxiliary features and masks co-attention between posts and comments in the lower layers. Experiments on Weibo20 and three existing social media datasets show that STANKER outperforms all compared models, notably beating the previous state-of-the-art on the Weibo dataset.
The diffusion of rumors on social media generally follows a propagation tree structure, which provides valuable clues on how an original message is transmitted and responded to by users over time. Recent studies reveal that rumor verification and stance detection are two related tasks that can jointly enhance each other despite their differences. For example, rumors can be debunked by cross-checking the stances conveyed by their relevant posts, and stances are also conditioned on the nature of the rumor. However, stance detection typically requires a large training set of post-level stance labels, which are rare and costly to annotate. Inspired by the Multiple Instance Learning (MIL) scheme, we propose a novel weakly supervised joint learning framework for rumor verification and stance detection which only requires bag-level class labels concerning the rumor's veracity. Specifically, based on the propagation trees of source posts, we convert the two multi-class problems into multiple MIL-based binary classification problems where each binary model is focused on differentiating a target class (of rumor or stance) from the remaining classes. Then, we propose a hierarchical attention mechanism to aggregate the binary predictions, including (1) a bottom-up/top-down tree attention layer to aggregate binary stances into binary veracity; and (2) a discriminative attention layer to aggregate the binary classes into finer-grained classes. Extensive experiments conducted on three Twitter-based datasets demonstrate promising performance of our model on both claim-level rumor detection and post-level stance classification compared with state-of-the-art methods.
Purpose: COVID-19 has become a global pandemic, causing a large number of deaths and huge economic losses. These losses are caused not only by the virus but also by related rumors. Online social media are now quite popular, with billions of people expressing opinions and propagating information. Rumors about COVID-19 posted on online social media usually spread rapidly, and it is hard to analyze and detect them by manual processing alone. The purpose of this paper is to propose a novel model called the Topic-Comment-based Rumor Detection model (TopCom) to detect rumors as soon as possible. Design/methodology/approach: The authors conducted COVID-19 rumor detection on Sina Weibo, one of the most widely used Chinese online social media platforms. They constructed a dataset about COVID-19 from January 1 to June 30, 2020 with a web crawler, including both rumors and non-rumors. The rumor detection task is regarded as a binary classification problem. The proposed TopCom model exploits topical memory networks to fuse latent topic information with the original microblogs, which addresses the sparsity problems brought by short-text microblogs. In addition, TopCom fuses comments with their corresponding microblogs to further improve performance. Findings: Experimental results on a publicly available dataset and the proposed COVID dataset show superiority and efficiency compared with baselines. The authors further randomly selected microblogs posted from July 1–31, 2020 for a case study, which also shows the effectiveness and application prospects of detecting rumors about COVID-19 automatically. Originality/value: The originality of TopCom lies in the fusion of the latent topic information of original microblogs and corresponding comments with DNN-based models for the COVID-19 rumor detection task; its value is to help detect rumors automatically in a short time.
This paper describes our system for SemEval 2019 RumorEval: Determining rumor veracity and support for rumors (SemEval 2019 Task 7). This track has two tasks: Task A is to determine a user’s stance towards the source rumor, and Task B is to detect the veracity of the rumor: true, false or unverified. For stance classification, a neural network model with language features is utilized. For rumor verification, our approach exploits information from different dimensions: rumor content, source credibility, user credibility, user stance, event propagation path, etc. We use an ensemble approach in both tasks, which includes neural network models as well as the traditional classification algorithms. Our system is ranked 1st place in the rumor verification task by both the macro F1 measure and the RMSE measure.
The news provides important insights into current events and acts as a vital window into the world, so the propagation of fake news poses a serious problem. News that appears genuine but is fabricated is considered fake news. Such fake news can propagate inadvertently or deliberately, foment strife, and erode trust. Identifying fake news has been the focus of various studies addressing this issue. To contribute in this direction, we propose a stacking approach that combines convolutional neural networks (CNN) and long short-term memory (LSTM), with logistic regression (LR) as a meta-classifier for the final classification. We use accuracy, precision, recall, and F1-score as performance evaluation metrics on a real-world dataset. The dataset included in this study reflects a wide range of information and consists of both content from social media platforms and news items from reliable sources. We use McNemar's test to determine the statistical significance of the model's performance. The proposed hybrid approach yields impressive results: 95.19% accuracy, 95.05% precision, 95.54% recall, and 95.29% F1-score. These findings highlight the hybrid model's efficacy in correctly identifying fake news, supporting social peace and the preservation of real news.
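The stacking scheme, base learners whose out-of-fold predictions feed a logistic regression meta-classifier, can be sketched with scikit-learn; simple classifiers and synthetic features stand in here for the paper's CNN and LSTM bases over news text:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for text features (label 1 = fake)
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(random_state=0)),  # stand-in for CNN
        ("nb", GaussianNB()),                              # stand-in for LSTM
    ],
    final_estimator=LogisticRegression(),  # LR as the meta-classifier
    cv=3,  # base predictions are produced out-of-fold to avoid leakage
)
stack.fit(X, y)
train_acc = stack.score(X, y)
```

The `cv` argument is the important design choice: the meta-classifier is trained on cross-validated base predictions, not on predictions from models that have already seen the same samples.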
No abstract available
The purpose of this study is to develop a procedure known as the Information Vortex Indicator (IVI) and validate its effectiveness; the procedure is designed to detect the timing of information vortex formation in textual data streams. Research has established that the formation of this vortex coincides with the onset of the dissemination of fake news (FN) concerning a particular object (such as a person, organization, company, or event). The primary aim of this detection is to minimize the time required for an appropriate response or defense against the adverse effects of information turbulence caused by the spread of fake news. Methodology: The study used Big Data information resources analysis instruments (Gogołek, 2019, 2022), including selected statistical and artificial intelligence techniques and tools, to automatically detect vortex occurrence in real time. Experimental validation of the efficacy of these tools was conducted, enabling a reliable assessment of the timing of vortex emergence. This assessment is quantified using the V-function, which formally describes the IVI procedure; the V-function's parameters are derived from the distribution patterns of letter-pair clusters within the textual information stream. Conclusions: A comparison of manual (reference) and automatic detection of vortex emergence times confirmed an accuracy rate of over 80% in detecting the appearance of fake news. These results underscore the effectiveness of the IVI procedure and the utility of the selected tools for rapidly automating the detection of information vortices, which herald the propagation of fake news. Furthermore, the study demonstrates the applicability of IVI for the continuous monitoring of information with significant media value across multiple multilingual data streams.
Originality: This research introduces a novel approach utilizing the distribution of letter pair clusters within information streams to detect the onset of information vortices, coinciding with the emergence of fake news. This methodology represents a unique contribution to the field, as prior research on this subject is limited.
The phenomenon of fake news has grown concurrently with the rise of social networks that allow people to access news directly, without the mediation of reliable sources. Recognizing news as fake is a difficult task for humans, and even tougher for a machine. This proposal aims to redesign the problem: from checking the truthfulness of news content to analyzing the persuasion level of texts, that is, how information is presented to the reader, assuming that fake news aims to persuade the reader of the reality of sense it intends to convey. The M.A.D.I.T. methodology has been chosen; it is useful for describing how texts are built, going beyond the content/structure level of analysis and stressing the study of Discursive Repertories: discursive modalities of reality-of-sense building, classified into real and fake news categories through machine learning. For the dataset building, 7,387 news items were analysed. The results highlight different profiles of text building between the two groups: the distinct and typical discursive repertories validate the methodological approach as a good predictor of the persuasion level of texts, not only of news but also of information in domains such as economics and finance (e.g., the GameStop event).
In the contemporary landscape characterized by the pervasive use of social media, the proliferation of counterfeit news has become conspicuous. Consequently, the precise identification of such disinformation has assumed paramount significance. However, several existing methods for fake news detection focus solely on entity information within the text or on relationships between multimodal information, often overlooking the inherent knowledge of event evolutionary patterns embedded within the news text. To address this issue, we propose an Event Evolutionary Graph Comparison Network for multimodal fake news detection (EEGCN). The primary objective of this network is to assist fake news detection by comparing the patterns of event evolution in text data with those in a pre-constructed event evolutionary graph. To fully harness the capabilities of the Event Evolutionary Graph Comparison Network, various comparison methods were explored. Experimental results on the Weibo dataset demonstrate that EEGCN outperforms previous multimodal fake news detection models, achieving superior performance.
In recent years, fake news on social media has become a significant threat to societal security, elevating fake news detection to a research priority. Among various strategies, fact-checking detection methods stand out for their accuracy, leveraging evidence from dedicated fact databases. However, these methods often retrieve raw truth, including vast amounts of irrelevant data, based on semantic similarity. This approach results in information redundancy and risks missing the nuanced differences between fake news and the truth. As a result, subtle changes in fake news can greatly increase the risk of misclassification, compromising the methods’ robustness. To this end, we propose a robust fake news detection framework with Fine-grained Discrepancy Contrastive Learning (FinDCL). By simulating subtle discrepancies between fake news and event-related truth, our method enhances the capture and identification of nuanced falsehoods. Specifically, we construct an adversarial dataset to pre-train a fine-grained discrepancy calculation module with contrastive learning. Moreover, the truth extraction module is devised to alleviate information redundancy by extracting event-related truth. At last, FinDCL jointly utilizes the aforementioned modules to detect fake news in event truth-known and truth-unknown scenarios. Extensive experiments on six real-world datasets demonstrate the effectiveness of FinDCL.
The proliferation of fake news has become a significant issue in today's society, affecting the public's perception of current events and causing harm to individuals and organizations. Therefore, the need for automated systems that can identify and flag fake news is critical. This paper presents a study on the effectiveness of DistilBERT and RoBERTa, two state-of-the-art language models, for detecting fake news. In this study, we trained both models on a dataset of labelled news articles and evaluated them on two different datasets, comparing their performance in terms of accuracy, precision, recall and F1-score. The results of our experiments show that both models perform well in detecting fake news, with the RoBERTa model achieving slightly better results overall. Our study highlights the ability of these models to effectively identify fake news and help combat misinformation.
The rapid dissemination of fake news in the digital era has become a pressing concern. The ease of generating and manipulating fake content, including images, text, audio, and videos, has significantly fueled the spread of misinformation on social media platforms. These platforms often lack rigorous editorial scrutiny, exacerbating this problem. Although recent studies have explored multimodal fake news detection to learn shared representations of textual and visual information, they often learn discrete latent representations, merely concatenations of multimodal features. Simple concatenation or summation operations hinder the dynamic interaction of multimodal features. Furthermore, most models rely on additional subtasks, such as reconstruction and event discrimination. The performance of these models depends heavily on subtasks, which can be mathematically complex and time-consuming. This reliance limits the ability of researchers to explore different modeling assumptions freely. This study introduces a novel approach that integrates a probabilistic algorithm with a deep neural network to effectively capture the uncertainties and diversities in the shared latent representation of multimodal data. Specifically, our model utilizes continuous latent representations by leveraging a smoothed Dirichlet distribution, facilitating the identification of shared hidden patterns across textual and visual modalities. In addition, our model demonstrates the powerful properties of generative models when integrated with neural network models. Our results underscore the potential of integrating a probabilistic algorithm with a deep neural network to address the challenges of fake news detection in a multimodal setting. To support further research and reproducibility, we made the code related to this work publicly accessible.
In the present era of the internet and social media, the way information is disseminated has changed. However, due to the rapid growth in the amount of news generated regularly and the unsupervised nature of social media, fake news has turned out to be a big problem. Fake news can easily build a false positive or negative perception about a person or an event, and it was also used as a tool by propagandists during the Coronavirus (COVID-19) pandemic. Thus, there is a need to use technology to tag fake news and prevent its dissemination. Previous algorithms designed to detect fake news did not consider semantic meaning and long-sentence dependence. This research work proposes a new approach to the detection of fake news in the context of COVID-19. The suggested approach combines Bidirectional Encoder Representations from Transformers (BERT) for extracting contextual meaning from sentences, a Support Vector Machine (SVM) for pattern identification on the COVID-19 dataset, and an evolutionary algorithm, Non-dominated Sorting Genetic Algorithm II (NSGA-II), to distribute text for SVM classification. The suggested approach improves accuracy by 5.2% by removing a certain amount of ambiguity from sentences.
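The embedding-then-SVM part of such a pipeline can be sketched as follows, with TF-IDF standing in for BERT sentence representations and the NSGA-II text-distribution step omitted; the tiny corpus is invented for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Tiny illustrative corpus (label 1 = fake)
texts = [
    "official WHO report confirms vaccine trial results",
    "health ministry publishes verified case statistics",
    "government announces official testing guidelines",
    "hospital data released by official sources",
    "miracle cure hoax spreads on messaging apps",
    "secret hoax claims garlic kills the virus",
    "viral hoax says 5G towers cause infection",
    "hoax message warns of fake lockdown order",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Vectorize the text, then classify with a linear SVM
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(texts, labels)
pred = model.predict(["new hoax claims a miracle cure for the virus"])[0]
```

On real data, the TF-IDF step would be replaced by contextual BERT embeddings, which is where the approach's claimed accuracy gain over non-semantic features comes from.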
Arguing in a disinformation era in the wake of the COVID-19 pandemic is a challenging endeavour for many nations. On social media, disinformation and rumours travel like genuine news, and individuals are frequently misled by them. Believing rumours can have severe repercussions for both society and individuals, all the more so during a global epidemic that has already caused chaos among nations and people. Fake news can not only produce cognitive uncertainty during periods like the COVID-19 pandemic but can also endanger lives. To address these difficulties, a deep learning-based fake news detector is developed to determine whether news circulating during pandemics is fake or real. First, text data is gathered from Twitter and other social networks and passed through text preprocessing strategies to improve its quality. Fake news is then detected with an "Adaptive Transformer Bidirectional Long Short Term Memory (ATrans-Bi-LSTM)" model. The hyperparameters of the Bi-LSTM network are tuned via the Komodo Mlipir Algorithm (KMA) to improve detection effectiveness on social networks. The classification outcome is compared with traditional fake news detection models in terms of various performance measures to ensure effectiveness.
Information sharing on social media, especially about daily news and events, is a major focus area. Timely identification of urgent needs, sharing relevant posts, and delivering accurate information are crucial tasks. To combat the spread of fake news, a Reinforcement Learning (RL) technique is used alongside blockchain security to verify social media content. Twitter, a key platform with a major influence on public discourse, is particularly susceptible to false information due to its rapid news dissemination. The approach involves collecting news articles and their metadata, which are then pre-processed to clean and tokenize the data. An RL agent is trained on attributes like word frequency and readability, learning to distinguish between genuine and fake news through rewards and penalties. The trained RL agent classifies new news as true or false based on learned patterns. While blockchain's role in enhancing security is highlighted, further details are necessary to clarify its integration. This approach aims to reduce the spread of misinformation in digital news effectively.
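A minimal stand-in for the reward/penalty training loop described above (essentially a perceptron driven by environment feedback; the blockchain layer and the real feature extraction are omitted, and all names below are illustrative) might look like:

```python
import numpy as np

def train_rl_agent(X, y, epochs=20, lr=0.1, seed=0):
    """Minimal reward-driven classifier: the agent predicts a label,
    receives +1 (reward) if correct and -1 (penalty) if wrong, and
    nudges its weights toward the true label only when penalized."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for idx in rng.permutation(len(X)):
            action = 1 if X[idx] @ w > 0 else 0        # agent's prediction
            reward = 1 if action == y[idx] else -1     # environment feedback
            if reward < 0:                             # penalized: update weights
                w += lr * (1 if y[idx] == 1 else -1) * X[idx]
    return w

# Toy features per article: [word-frequency score, readability score, bias term]
X = np.array([[0.9, 0.2, 1.0], [0.8, 0.3, 1.0],
              [0.2, 0.9, 1.0], [0.1, 0.8, 1.0]])
y = np.array([1, 1, 0, 0])  # 1 = fake, 0 = genuine
w = train_rl_agent(X, y)
preds = (X @ w > 0).astype(int)
```

A full RL formulation would add states, a policy, and discounted returns; this sketch keeps only the reward/penalty signal that the abstract emphasizes.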
No abstract available
The rapid development of computing trends, wireless communications, and the smart devices industry has contributed to the widespread use of the internet. People can access internet services and applications from anywhere in the world at any time. There is no doubt that these technological advances have made our lives easier and saved time and effort. On the other hand, we should admit that the internet and its applications, including online platforms, are misused. For example, online platforms have been involved in spreading fake news all over the world to serve certain purposes (political, economic, or social). Detecting fake news is considered one of the hard challenges for the existing content-based analysis of traditional methods. Recently, neural network models have outperformed traditional machine learning methods due to their outstanding ability for feature extraction. Still, there is a lack of research work on detecting fake news in news- and time-critical events. Therefore, in this paper, we investigate the automatic identification of fake news over online communication platforms and propose an automatic identification method using modern machine learning techniques. The proposed model is a bidirectional LSTM concatenated model applied to the FNC-1 dataset, achieving 85.3% accuracy.
In today's society, fake news is a growing problem, and identifying it precisely is a crucial challenge. Numerous studies have investigated machine learning methods for identifying fake news. One study proposed a fake news detection system employing UNet and LSTM to identify fake news images. Another proposed a deep learning-based approach to find false information on Twitter. A third combined CNN with LSTM to identify fake news on the Chinese social media site Weibo. Subsequent research in this area could centre on developing more sophisticated and precise machine learning models. This could involve extending fake news detection to platforms beyond social media, combining diverse modalities to enhance detection accuracy, addressing the issue of emerging events, and developing user-friendly tools for spotting fake news. These research directions hold the promise of considerably enhancing the efficiency of fake news detection and reducing the detrimental effects of fake news on our society.
In recent years, social media has grown in popularity as a means for individuals to consume news. Given that the propagation of false information on social media sites like Twitter harms both individuals and society, many research communities have investigated automated fake news identification as a means of countering fake news on the Twitter platform. One phenomenon brought on by the development of multimedia technology is the increasing prevalence of social media news that combines several media, including text, photos, and videos. These multiple informational channels provide greater evidence of whether news events really happened and more opportunities to spot components of fake news on social media platforms. Differentiating between true (or accurate) and fake news material is one of the most difficult problems in natural language processing. To identify fake news on the Twitter network, we propose an XLNet model in this analysis. Numerous pre-trained transformer-based models were tested, but XLNet produced the most accurate results. We tuned the model's performance until it worked rather well and measured various performance metrics.
The act of recognizing news that intentionally spreads false information via social media or traditional news sources is known as fake news detection. The characteristics of fake news make it difficult to identify. The spread of fake news and misleading information has increased dramatically due to social media's role as a communication tool and the rapid advancement of technology. Given the fast dissemination of unverified content, there is an urgent need for automated, intelligent systems that can differentiate between authentic and fraudulent information. The proposed hybrid model addresses this by combining multiscale residual CNN and BiLSTM layers to capture local and global relationships in textual detail: the CNN layers concentrate on extracting deep local features, while the BiLSTM layers manage contextual representations and sequential dependencies. This dual architecture enhances the model's capacity to recognize patterns of deception in textual content and comprehend semantic flow. The Edge-IIoT dataset and the Aposemat IoT-23 dataset were used to assess the suggested framework empirically. Drawing on concepts from information transfer and complex adaptive systems, we provide a "generation–spread–identification–refutation" paradigm for managing outliers when identifying false information during emergencies. Experimental findings clearly illustrate the superiority of the BiLSTM approach, demonstrating not only state-of-the-art efficacy in identifying fake news but also a significant edge over traditional machine learning algorithms, highlighting its critical role in protecting our information ecosystems from the ubiquitous threat of misinformation.
Fake news can spread rapidly among internet users and deceive a large audience. Because of these characteristics, it can directly impact political and economic events. Machine learning approaches have been used to assist fake news identification. However, since the spectrum of real news is broad, hard to characterize, and expensive to label due to its high update frequency, One-Class Learning (OCL) and Positive and Unlabeled Learning (PUL) emerge as interesting approaches for content-based fake news detection that use a smaller set of labeled data than traditional machine learning techniques require. In particular, network-based approaches are well suited to fake news detection since they allow information from different aspects of a publication to be incorporated into the problem modeling. In this paper, we propose a network-based approach built on Positive and Unlabeled Learning by Label Propagation (PU-LP), a one-class, transductive, semi-supervised algorithm that first identifies potential interest and non-interest documents among the unlabeled data and then employs label propagation to classify the remaining unlabeled documents. We assessed the performance of our proposal on homogeneous (documents only) and heterogeneous (documents and terms) networks. Our comparative analysis considered four OCL algorithms extensively employed in one-class text classification (k-Means, density-based k-Nearest Neighbors, One-Class Support Vector Machine, and Dense Autoencoder) and another traditional PUL algorithm (Rocchio Support Vector Machine). The algorithms were evaluated on three news collections, considering balanced and extremely unbalanced scenarios. We used Bag-of-Words and Doc2Vec models to transform news into structured data.
Results indicated that PU-LP approaches are more stable and achieve better results than the other PUL and OCL approaches in most scenarios, performing similarly to semi-supervised binary algorithms. Also, including terms in the news network yields better results, especially when news items are distributed in the feature space according to veracity and subject. News representation using Doc2Vec achieved better results than the Bag-of-Words model both for algorithms based on the vector-space model and for those based on the document similarity network.
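The core PU-LP idea, spreading a few positive ("known fake") labels over a document similarity network, can be sketched in a few lines. The graph, similarity weights, and damping factor below are toy assumptions; the actual algorithm also identifies reliable non-interest documents before propagating.

```python
import numpy as np

def pu_label_propagation(W, positive, n_iter=50):
    """Propagate interest scores from a few labeled positive documents
    over a similarity graph; unlabeled nodes start at 0.
    W: symmetric (n, n) similarity matrix; positive: labeled doc indices."""
    n = W.shape[0]
    # row-normalize similarities into transition probabilities
    P = W / W.sum(axis=1, keepdims=True)
    f = np.zeros(n)
    f[list(positive)] = 1.0
    for _ in range(n_iter):
        f = 0.5 * (P @ f)          # diffuse scores to graph neighbors
        f[list(positive)] = 1.0    # clamp the known positives
    return f

# toy graph: docs 0-2 form one dense cluster, docs 3-5 another
W = np.array([
    [0, 5, 5, 1, 0, 0],
    [5, 0, 5, 0, 1, 0],
    [5, 5, 0, 0, 0, 1],
    [1, 0, 0, 0, 5, 5],
    [0, 1, 0, 5, 0, 5],
    [0, 0, 1, 5, 5, 0],
], dtype=float)
scores = pu_label_propagation(W, positive=[0])
# documents in the same cluster as the labeled positive score higher
```

In the heterogeneous setting described above, term nodes would simply be extra rows/columns of `W`, letting scores flow between documents through shared terms.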
The internet is one of the most important inventions, and many people use it for various purposes, including access to a variety of social media networks. These online platforms allow any user to create content and share news, and neither the users nor their posts are verified. Consequently, some individuals attempt to use these platforms to disseminate false information, often as propaganda intended to harm a particular person, group, institution, or political party. There is far too much fake news for humans to recognize manually, so a system that can automatically identify false information is required. As social media sites such as Facebook and Twitter grow ever more popular, word spreads rapidly, reaching millions of people in a short time. Addressing problems such as crime, political unrest, and suffering driven by misleading information therefore requires acknowledging this issue. The goal of this project is to automatically identify fake news from a collection of current events. Such claims must be compared and contrasted: knowing the difference between real and fake is essential. Most importantly, we use an appropriate dataset to discriminate real from non-real content, allowing us to identify what is incorrect even among confusingly similar items. In this research, we examine fake news and its profound effects on a range of societal issues, including reputational harm, stoking controversy, and even influencing the outcome of elections. News publishers, news aggregators, and social media platforms have all attempted to stop the spread of incorrect or misleading information misrepresented as news by implementing automatic text classification systems.
In the proposed research work, after data collection, data preprocessing involves removing irrelevant words, splitting text into smaller units, reducing words to their stem form, and ensuring a consistent format. Once the data has been cleaned, the model is trained with a Long Short-Term Memory (LSTM) classification algorithm, which yields the highest accuracy rate. The trained model is exported to H5 format using Python, and Streamlit is used to display whether the given information is real or fake news. In conclusion, we examine the outcomes of our model in relation to important issues such as filtering and information overload, taking advantage of the exceptionally high precision and recall obtained.
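The preprocessing steps listed above (cleaning, tokenization, stemming, normalization) can be sketched in plain Python. The stopword list and suffix-stripping rules below are deliberately simplified stand-ins for a real pipeline (e.g. NLTK's stopword list and the Porter stemmer):

```python
import re

# tiny illustrative stopword list; a real pipeline would use a full one
STOPWORDS = {"the", "a", "an", "is", "are", "was", "were", "to", "of", "and", "in"}

def crude_stem(word):
    # toy suffix stripper standing in for a real stemmer (e.g. Porter)
    for suffix in ("ing", "edly", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    text = text.lower()                        # normalize case for consistency
    text = re.sub(r"[^a-z\s]", " ", text)      # drop punctuation and digits
    tokens = text.split()                      # split into smaller units
    tokens = [t for t in tokens if t not in STOPWORDS]  # remove irrelevant words
    return [crude_stem(t) for t in tokens]     # reduce words to a stem form

tokens = preprocess("The reporters were sharing UNVERIFIED claims, again!")
# -> ['reporter', 'shar', 'unverifi', 'claim', 'again']
```

The resulting token lists would then be mapped to integer indices and padded before being fed to the LSTM classifier.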
Online news has taken over as the primary source of information in recent years. People do not have enough time to read the newspaper, so they use social media to keep up with the latest news. However, information on the internet is sometimes unclear and may be intended to deceive. Automated fake news identification technologies, such as machine learning models, have become a necessity. Using hold-out cross-validation, the performance of machine learning models was evaluated on two fake/real news datasets of varying sizes. The proposed novel stacking model obtained testing accuracies of 99.94 percent on the ISOT dataset and 96.05 percent on the KDnuggets dataset. These datasets, however, only supported detecting fake news within specific groups of events rather than in current events, so the approach is extended to detect fake news in real-time tweets from Twitter. A global model, shared across multiple tweets, captures general sentiment information, while Trend-specific components, such as a Support Vector Machine model with greedy and dynamic blocking algorithms, adapt to individual Trends. In addition, sentiment knowledge is collected from both labelled and unlabelled samples in each Trend to improve the learning of Trend-specific sentiment categorization, and regularization over the Trend-specific sentiment classifiers encourages the exchange of sentiment information between related keywords.
Fake news and misinformation have been increasingly used to manipulate popular opinion and influence political processes. To better understand fake news, how they are propagated, and how to counter their effect, it is necessary to first identify them. Recently, approaches have been proposed to automatically classify articles as fake based on their content. An important challenge for these approaches comes from the dynamic nature of news: as new political events are covered, topics and discourse constantly change and thus, a classifier trained using content from articles published at a given time is likely to become ineffective in the future. To address this challenge, we propose a topic-agnostic (TAG) classification strategy that uses linguistic and web-markup features to identify fake news pages. We report experimental results using multiple data sets which show that our approach attains high accuracy in the identification of fake news, even as topics evolve over time.
With the widespread use of online social media, we have witnessed fake news causing enormous distress and inconvenience in people's social lives. Although previous studies have proposed rich machine learning methods for identifying fake news in social media, detecting fake news in emerging news events/domains remains a challenging problem due to the wide range of news topics on social media and the evolution and variation of fake news content on the web. In this study, we propose an approach we term the "domain-adversarial and graph-attention neural network" (DAGA-NN) model to address this challenge. Its main advantage is that, in a text environment with multiple events/domains, only partial domain sample data are needed to train the model to achieve accurate cross-domain fake news detection in domains with few (or even no) samples, which makes up for the limitations of traditional machine learning in fake news detection tasks caused by news content evolution or cross-domain identification without sample data. Extensive experiments were conducted on two multimedia datasets from Twitter and Weibo, and the results showed that the proposed model was very effective in detecting fake news across events/domains.
Social media has greatly streamlined communication in recent years, letting people network, share information, and keep up with current events. However, many posts on social media are questionable and meant to deceive; fake news is everywhere, and online fake news can cause serious societal issues. It is crucial to identify false and misleading internet content, yet technically difficult to differentiate real from fake information. Social media platforms have made the creation and sharing of content easier, producing a significant volume that must be examined, and the diversity of information on the internet makes this process difficult. Assessing the reliability and intent of statements requires humans and machines to work together. Manually identifying fake news is challenging and possible only for individuals with extensive knowledge of the news; hence the need for machine learning classifiers that can identify false news automatically. The main goal is to measure the accuracy of deception detection using various machine learning algorithms. Several machine learning (ML) techniques, including Decision Trees (DT), Random Forest (RF), Logistic Regression (LR), Passive Aggressive Classifier (PAC), Extra Trees Classifier (ETC), and boosting algorithms, were employed on datasets to determine whether content is authentic, in order to identify hoaxes on online networking platforms.
Rumor detection is the process of identifying and analyzing false or misleading content. With its rapid development, social media has become both an important channel for people to obtain external information and a platform for the widespread dissemination of false news. In view of this phenomenon, this paper uses algorithms to analyze key elements of news text, such as time, place, people, and events, together with multiple social media signals, including news sources and historical data, to comprehensively assess the authenticity of news. The paper introduces the steps and methods of the Bidirectional Encoder Representations from Transformers (BERT) model and a BERT-Convolutional Neural Network (BERT-CNN) model for capturing text information, and tests and optimizes the algorithms. The test results show that both algorithms can identify most fake news efficiently and quickly, presenting the results intuitively to users through the system output, and are worth applying and popularizing in the corresponding field.
No abstract available
Rumor detection in social media is a critical concern and is exacerbated by the complexity and diversity of rumor texts. Within social media rumor posts, auxiliary elements such as real-time emerging comments regarding the event can reveal public uncertainty and skepticism about the content in question. Motivated by this intuition, we propose the Adaptive Weighted Ensemble Deep Learning model (AWEDL), an innovative framework that integrates rumor and stance models. AWEDL adeptly captures user attitudes embedded in comment data, without requiring explicit stance labels. Through a comprehensive process of weighted fusion and dynamic adjustment of vector representations pertaining to rumors and stances, AWEDL strikes an optimal balance between these features. This approach culminates in an increase in the precision of rumor detection. Extensive validation on three well-recognized datasets, including Twitter15, Twitter16, and Weibo, firmly establishes AWEDL's superiority over existing benchmarks, with notably remarkable performance on the Weibo dataset, achieving an average F1 score of 95.7%.
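The weighted fusion of rumor-content and comment-stance representations described above can be illustrated with a small numpy sketch. The gating parameter and softmax weighting here are assumptions for illustration, not AWEDL's exact dynamic-adjustment mechanism.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def weighted_fusion(rumor_vec, stance_vec, gate):
    """Fuse a rumor-content vector and a comment-stance vector with
    weights produced from a (toy, assumed) gating parameter vector."""
    feats = np.stack([rumor_vec, stance_vec])          # (2, d)
    scores = feats @ gate                              # one scalar per branch
    w = softmax(scores)                                # adaptive weights, sum to 1
    return w, w[0] * rumor_vec + w[1] * stance_vec     # weighted-sum fusion

rng = np.random.default_rng(1)
d = 6  # toy representation size
rumor_vec, stance_vec = rng.normal(size=d), rng.normal(size=d)
gate = rng.normal(size=d)
w, fused = weighted_fusion(rumor_vec, stance_vec, gate)
```

A classifier head would then score `fused`; because the weights are input-dependent, posts whose comments carry strong skepticism signals can lean more on the stance branch.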
Multimodal rumor detection aims at detecting rumors using information from textual and visual modalities. The most critical difficulty in multimodal rumor detection lies in capturing both the intra-modal and inter-modal relationships from multimodal data. However, existing methods mainly focus on the multimodal fusion process while paying little attention to the intra-modal relationships. To address these limitations, we propose a multimodal rumor detection method with deep metric learning (MRML) to effectively extract multimodal relationships of news for detecting rumors. Specifically, we design metric-based triplet learning to extract the intra-modal relationships between rumors and non-rumors in every modality, and contrastive pairwise learning to capture the inter-modal relationships across modalities. Extensive experiments on two real-world multimodal datasets show the superior performance of our rumor detection method.
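The two losses named above, metric-based triplet learning within a modality and contrastive pairwise learning across modalities, can be written directly in numpy. Margins and the toy vectors are illustrative; MRML's actual formulation may differ in its details.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Intra-modal metric learning: pull same-class items together and
    push rumor/non-rumor items apart by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

def pairwise_contrastive_loss(text_vec, image_vec, label, margin=1.0):
    """Inter-modal learning: matched text/image pairs (label=1) are pulled
    close; mismatched pairs (label=0) are pushed beyond `margin`."""
    d = np.linalg.norm(text_vec - image_vec)
    return label * d ** 2 + (1 - label) * max(0.0, margin - d) ** 2

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])
n = np.array([2.0, 0.0])
l_trip = triplet_loss(a, p, n)                        # 0.1 - 2.0 + 1.0 < 0 -> 0.0
l_match = pairwise_contrastive_loss(a, p, label=1)    # close match: small loss
l_mismatch = pairwise_contrastive_loss(a, p, label=0) # close mismatch: penalized
```

In training, both losses would be summed with the classification loss so the embedding space separates rumors from non-rumors while keeping matched modalities aligned.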
No abstract available
The rapid flow of information in social networks leads to the rapid spread of rumors, which has serious negative impacts on society. To overcome the limitation of existing multi-modal rumor detection methods that overlook inter-modal correlations during information fusion, we propose the Dual-Key Prompt Learning Network for Enhanced Multi-modal Rumor Detection (DPLE). We use a superpixel segmentation method based on the Simple Linear Iterative Clustering (SLIC) algorithm to segment the image, combined with a Graph Convolutional Network (GCN) to capture the local details and structural information of the image. We design a globally shared dual-key prompt pool that dynamically selects prompts through a joint index of text-image prompt keys, effectively building deep links between modalities and enhancing mutual understanding. At the same time, a momentum updating strategy is introduced during training to improve the robustness of the model. Experimental results show that DPLE achieves superior performance on both the PHEME and Weibo datasets, demonstrating the effectiveness and advantages of the method for the multimodal rumor detection task.
The rapid spread of information through online platforms and social media networks has led to an increase in the propagation of rumors, which can have detrimental effects. Researchers have proposed various deep learning models for multimodal rumor detection. However, these models often handle each modality individually, limiting the abilities of information complementation and modal enhancement. To address this challenge, we propose a Cross-modal Information-enhanced Fusion Network (CIFN) for rumor detection on social media platforms. CIFN enhances the representation of different modalities in a unified framework before effectively combining textual and visual information to accomplish the rumor detection task. Specifically, CIFN introduces the Feature Information Enhancement (FIE) module, which enhances different modal information by selectively focusing on relevant features and capturing interdependencies between modalities. Additionally, CIFN introduces a Review-based Fusion Mechanism (RFM) to integrate textual and visual features, considering the weight allocation of different modalities at the feature level. Extensive experiments conducted on two public datasets show that the proposed CIFN outperforms existing methods in rumor detection.
Traditional rumor detection methods that focus only on text content have achieved certain results. However, with the rapid development of social platforms, combined image-and-text content now accounts for a large proportion of posts, and traditional detection methods cannot make full use of picture information. For this scenario, a rumor detection model integrating multimodal features is proposed. First, text and visual features, together with their hidden states, are extracted using pre-trained deep learning models, and preliminary fusion features are obtained by integrating the text and image hidden states through an attention mechanism. Next, the text features, preliminary fusion features, and social features are concatenated, as are the image features, preliminary fusion features, and social features, yielding two final fusion feature vectors. These two vectors are fed into separate fully connected layers to obtain their respective predictions, which are finally combined to produce the detection result. Experimental results show that the proposed model is effective in detecting multimodal rumor data.
Rumors on social media platforms have a significant negative impact on society, making rumor detection increasingly critical. However, most existing methods identify rumors only after they have already spread widely and caused harm, so identifying rumors in the early stages is necessary. Early rumor detection is typically characterized by limited spread and small sample sizes, making it impractical to rely on large datasets or rumor propagation structures. To address these constraints, a multimodal meta-learning method based on few-shot learning (FSL) for early rumor detection is proposed. A multimodal feature extraction layer is designed to extract data features of various modalities, while a multimodal hidden information extraction layer is constructed to uncover deep information from these features. Furthermore, a multimodal fusion output layer is developed to combine and process the multimodal information, leading to rumor classification. A meta-learning algorithm is introduced to address the challenge of small sample sizes, utilizing fast and multistep update methods to enhance the adaptability and stability of the model. Comparative experiments conducted on two publicly available datasets confirm that our proposed method demonstrates strong performance in early rumor detection.
This paper proposes a novel approach for rumor detection in social media by integrating multi-modal information fusion. With the prevalence of false information and rumors on social platforms, an effective detection mechanism is essential. Our method leverages deep neural network architectures, including image captioning and recurrent neural networks (RNN) with attention mechanisms, to integrate text, image, and social features. We adopt two feature fusion strategies, early fusion and late fusion, to comprehensively integrate multimodal data. In addition, we design a multi-layer bidirectional recurrent neural network (BRNN) architecture to capture textual relations and improve classification accuracy. Experimental results on real datasets demonstrate the superiority of the proposed method, with an F-value of 0.89. This study promotes the advancement of social media rumor detection technology through effective multimodal information fusion and deep learning architectures. Future work will focus on optimizing model components and exploring advanced sentiment analysis methods for further performance improvements.
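The two fusion strategies mentioned above can be contrasted in a short numpy sketch: early fusion concatenates modality features before a single classifier, while late fusion combines per-modality classifier outputs. Dimensions and weights below are toy assumptions, not the paper's configuration.

```python
import numpy as np

def early_fusion(text_feat, image_feat, social_feat, w):
    """Early fusion: concatenate modality features, then one classifier."""
    x = np.concatenate([text_feat, image_feat, social_feat])
    return 1.0 / (1.0 + np.exp(-(x @ w)))  # single sigmoid classifier

def late_fusion(probs, weights):
    """Late fusion: weighted combination of per-modality predictions."""
    weights = np.asarray(weights, dtype=float)
    return float(np.asarray(probs) @ (weights / weights.sum()))

rng = np.random.default_rng(2)
text_feat = rng.normal(size=4)    # toy text features
image_feat = rng.normal(size=4)   # toy image features
social_feat = rng.normal(size=2)  # toy social features
w = rng.normal(size=10) * 0.1     # toy classifier weights

p_early = early_fusion(text_feat, image_feat, social_feat, w)
p_late = late_fusion([0.9, 0.6, 0.3], [2, 1, 1])  # text branch trusted most
```

Early fusion lets the classifier learn cross-modal interactions directly, while late fusion keeps the per-modality models independent and is more robust when one modality is missing.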
No abstract available
Rumor detection on social media has become increasingly important. Most existing graph-based models presume rumor propagation trees (RPTs) have deep structures and learn sequential stance features along branches. However, through statistical analysis on real-world datasets, we find RPTs exhibit wide structures, with most nodes being shallow 1-level replies. To focus learning on intensive substructures, we propose Rumor Adaptive Graph Contrastive Learning (RAGCL) method with adaptive view augmentation guided by node centralities. We summarize three principles for RPT augmentation: 1) exempt root nodes, 2) retain deep reply nodes, 3) preserve lower-level nodes in deep sections. We employ node dropping, attribute masking and edge dropping with probabilities from centrality-based importance scores to generate views. A graph contrastive objective then learns robust rumor representations. Extensive experiments on four benchmark datasets demonstrate RAGCL outperforms state-of-the-art methods. Our work reveals the wide-structure nature of RPTs and contributes an effective graph contrastive learning approach tailored for rumor detection through principled adaptive augmentation. The proposed principles and augmentation techniques can potentially benefit other applications involving tree-structured graphs.
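The three augmentation principles above can be illustrated with a deterministic toy version (the actual RAGCL samples node/edge drops probabilistically from centrality-based importance scores): exempt the root, keep nodes on deep branches, and drop only shallow one-level leaf replies.

```python
from collections import defaultdict, deque

def node_depths(edges, root=0):
    """BFS depths of every node in an undirected tree given as edge list."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    depth = {root: 0}
    q = deque([root])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                q.append(v)
    return depth, adj

def augment_view(edges, n, root=0):
    """Toy sketch of the stated principles: exempt the root, retain deep
    reply nodes (depth >= 2) and nodes with replies of their own, and
    drop only shallow one-level leaf replies."""
    depth, adj = node_depths(edges, root)
    dropped = {v for v in range(n)
               if v != root and depth[v] == 1 and len(adj[v]) == 1}
    kept_edges = [(u, v) for u, v in edges
                  if u not in dropped and v not in dropped]
    return dropped, kept_edges

# root 0 with two deep branches (1 -> 4, 2 -> 5) and shallow replies 3 and 6
edges = [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (0, 6)]
dropped, kept = augment_view(edges, n=7)
```

Two such views of the same propagation tree would then be fed to the graph encoder, and the contrastive objective pulls their representations together.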
No abstract available
Information spreads swiftly via social media in response to breaking news, modifying several aspects of the data. Nevertheless, it is crucial to understand that early modifications may have been brought about by hearsay or unconfirmed information. Unverified data can be labeled to stop potentially false information from spreading, and rumor detection provides additional data to a rumor-tracking system that evaluates the veracity of rumors. In recent research, predictive models such as Support Vector Machines (SVM), Conditional Random Fields (CRF), and Random Forests have been employed to detect rumors. Twitter, a popular medium for exchanging information, rebranded itself as X amid recent changes in social networking sites; this shift might affect the dynamics of information sharing and represents a major turning point in social media history. We employed Deep Neural Network (DNN), Gated Recurrent Unit (GRU), and Long Short-Term Memory (LSTM) models to detect rumors in recent narratives by investigating the sequential differences in reporting when news breaks on social media platforms. On the test data, LSTM obtained 88.55% accuracy, DNN achieved 85.44% accuracy, and GRU achieved 89.62% accuracy.
In an era where rumors can propagate rapidly across social media platforms such as Twitter and Weibo, automatic rumor detection has garnered considerable attention from both academia and industry. Existing multimodal rumor detection models often overlook the intricacies of sample difficulty, e.g., text-level difficulty, image-level difficulty, and multimodal-level difficulty, as well as their order when training. Inspired by the concept of curriculum learning, we propose the Curriculum Learning and Fine-grained Fusion-driven multimodal Rumor Detection (CLFFRD) framework, which employs curriculum learning to automatically select and train samples according to their difficulty at different training stages. Furthermore, we introduce a fine-grained fusion strategy that unifies entities from text and objects from images, enhancing their semantic cohesion. We also propose a novel data augmentation method that utilizes linear interpolation between textual and visual modalities to generate diverse data. Additionally, our approach incorporates deep fusion for both intra-modality (e.g., text entities and image objects) and inter-modality (e.g., CLIP and social graph) features. Extensive experimental results demonstrate that CLFFRD outperforms state-of-the-art models on both English and Chinese benchmark datasets for rumor detection in social media.
The generation and spread of rumors are phenomena that cannot be eradicated from human society, yet the unchecked spread of rumors harms social development, especially in the current information era. Existing rumor detection models and frameworks are mostly based on deep neural networks, which extract text, propagation, and other features that aid rumor assessment, and then combine these features to filter rumors. Although these methods are effective, it remains difficult to distinguish rumors in complex contexts or in instances where the signs of forgery are not obvious. Therefore, based on the interactive characteristics of text content and propagation graph structure, this research proposes a rumor detection model with deep cross fusion. First, the text content and the propagation graph structure are encoded separately while ensuring that each is complete and independent, thereby extracting important information. Then, the separately encoded text content is deeply cross-fused with the propagation graph structure so that the two sources of information can penetrate and complement each other, with a self-attention operation performed after each layer of cross fusion. Finally, the encodings from both components are fully integrated, a classification feature vector is appended to the output of each part, and the classification result is produced after multiple fully connected layers. Experiments on three classical rumor detection datasets show that the model significantly improves both accuracy and the specified delay time, with a maximum accuracy improvement of 1.21%.
Spreading rumors on social media is a phenomenon with destructive implications for societal interaction, diverting attention toward destructive behavior; the impact is even greater in healthcare management. This research aims to detect rumors and identify their sources using deep learning algorithms. In the proposed system, after pre-processing, tweet comments are extracted by topic and ranked as deny, support, query, or comment. The comments are then classified as positive, negative, or neutral using an Artificial Neural Network Neuro-fuzzy Inference System with a Spline-based Pi-shaped Membership Function (ANISPIMF). Negative comments are further classified into offensive, violence, misogyny, and hate-mongering using an Improved Deep Learning Neural Network (IDLNN), a Deep Neural Network combined with a Cuckoo Search–Flower Pollination Algorithm to optimize the weight values. The optimized ANISPIMF performs very well on the COVID-19 dataset in terms of accuracy, precision, and recall. Weighed against prevailing methodologies, the proposed system improves accuracy by 0.6%, recall by 0.7%, and precision by 1%, together with an F1-score improvement of 1.2% over the Multiloss Hierarchical Bi-LSTM with Attenuation Factor (MHA).
With the rise of social media, the rapid spread of rumors online has resulted in numerous negative effects on society and the economy, and methods for rumor detection have attracted great interest from both academia and industry. Given the widespread effectiveness of contrastive learning, many graph contrastive learning models for rumor detection have been proposed that use the event propagation structure as graph data. However, existing contrastive models usually treat the propagation structures of events similar to the anchor event as negative samples. While this design choice allows for discriminative learning, it also inevitably pushes apart semantically similar samples and thus degrades model performance. In this article, we propose a propagation structure fusion model based on node-level contrastive learning (PFNC) for rumor detection. PFNC first obtains three augmented propagation structures by randomly masking the text of each node and perturbing some edges according to their importance. PFNC then applies node-level contrastive learning between every pair of augmented propagation structures to prevent samples with similar propagation structures from being pushed apart. Finally, a convolutional neural network (CNN)-based model captures the consistent and complementary information among the three augmented propagation structures by treating the event's propagation structure as a color picture, the three augmented structures as color channels, and each node as a pixel. Experimental results on real datasets show that PFNC significantly outperforms state-of-the-art models for rumor detection.
Aiming at the limited feature extraction ability of rumor detection methods based on deep learning models, this study proposes a rumor detection method based on deep learning in a social network big data environment. Firstly, a combination of an API interface and a third-party crawler program is used to obtain Weibo rumor information from the Weibo "false Weibo information" public page, yielding a Weibo dataset containing both rumor and non-rumor information. Secondly, distributed word vectors are used to encode text words, with hierarchical Softmax and negative sampling improving training efficiency. Finally, a classification and detection model combining semantic and statistical features is constructed: the memory function of Multi-BiLSTM explores dependencies between data, and statistical features are combined with semantic features to expand the feature space in rumor detection and better describe the distribution of the data in that space. Experiments show that with a word vector dimension of 300, the accuracy of the proposed method improves on the two compared methods by 4.232% and 1.478%, respectively, and the F1 value by 5.011% and 1.795%, respectively. The proposed method can better extract data features and has better rumor detection ability.
No abstract available
No abstract available
Rumor detection is the task of identifying information spreading among people whose truth value is false or unverified, and it has become a great challenge with the rapid development of social media. Traditional machine learning detection methods can make full use of informative features but cannot extract high-level representations. Other methods involving deep neural networks exploit propagation structure information to achieve high accuracy; for example, Bi-Directional Graph Convolutional Networks (BiGCN) achieved the best performance on rumor detection by operating on bottom-up and top-down structures. However, those deep learning methods ignore other useful features such as content-based features. In this paper, we make full use of three aspects of features based on a new concept, the kernel subtree, which focuses on informative features of influential nodes of an event, and we propose a new model consisting of separation convolution blocks, Long Short-Term Memory (LSTM), and Squeeze-and-Excitation Networks (SENet) to make comprehensive use of the features extracted from the kernel subtree. First, separation convolutions learn more local information with different kernel sizes; then LSTM learns high-level interactions among features and finds more global information. After that, SENet applies an attention mechanism to put more weight on informative channels of the feature maps. Meanwhile, on the test set, Gradient Boosting Decision Trees (GBDT) assist our model on events with few samples. Experiments on the PHEME dataset show that our approach identifies rumors with 95% accuracy, outperforming BiGCN by at least 10%.
No abstract available
The increasing popularity of social media has made the creation and spread of rumors much easier. Widespread rumors on social media could cause devastating damages to society and individuals. Automatically detecting rumors in a timely manner is greatly needed but also very challenging technically. In this paper, we propose a new deep feature fusion method that employs the linguistic characteristics of the source tweet text and the underlying patterns of the propagation tree of the source tweet for Twitter rumor detection. Specifically, the pre-trained Transformer-based model is applied to extract context-sensitive linguistic features from the short source tweet text. A novel sequential encoding method is proposed to embed the propagation tree of a source tweet into the vector space. A convolutional neural network (CNN) architecture is then developed to extract temporal-structural features from the encoded propagation tree. The performance of the proposed deep feature fusion method is evaluated with two public Twitter rumor datasets. The results demonstrate that the proposed method achieves significantly better detection performance than other state-of-the-art baseline methods.
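The sequential encoding of a propagation tree can be sketched as a depth-first walk that emits (node, depth) pairs, flattening the retweet/reply tree into a sequence a 1-D CNN can consume. This particular traversal scheme is an illustrative assumption, not necessarily the paper's exact encoding.

```python
from collections import defaultdict

def encode_propagation_tree(edges, root=0):
    """Flatten a propagation tree into a sequence of (node, depth) pairs
    via a depth-first preorder walk; edges are (parent, child) pairs."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    seq = []
    def walk(node, depth):
        seq.append((node, depth))
        for c in sorted(children[node]):  # deterministic child order
            walk(c, depth + 1)
    walk(root, 0)
    return seq

# source tweet 0, replies 1 and 2, replies 3 and 4 under node 1
edges = [(0, 1), (0, 2), (1, 3), (1, 4)]
seq = encode_propagation_tree(edges)
# -> [(0, 0), (1, 1), (3, 2), (4, 2), (2, 1)]
```

Each (node, depth) pair would in practice be replaced by a feature vector (e.g. the reply's timestamp and user features) before the CNN extracts temporal-structural patterns.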
The evolution of the World Wide Web and the rapid growth of social sites such as Facebook and Twitter have changed the way we communicate. Through these platforms, people create and disseminate more information, but rumors and fake news have also spread widely. Automatically classifying text as information or disinformation is a challenging task. We investigate the rumor identification problem by considering contextual information. The proposed work hybridizes CNN and BiLSTM with GloVe embeddings to classify tweets into rumor and non-rumor. All experiments are performed on a publicly available dataset collected from Kaggle, the world's largest data-science community. Experimental results show that the proposed model outperforms the baseline model, achieving 90.93% accuracy.
During public crisis events, multimodal contents from social platforms such as text, images, and videos, contain valuable knowledge for official rapid response. However, the fragmentation of such knowledge across platforms undermines timely decision‐making and limits the effectiveness of intelligent emergency response. This study proposes a cross‐platform emergency knowledge collaboration method based on the multimodal heterogeneous information network. Firstly, the structure of the multimodal heterogeneous information network is defined and constructed for each specific platform, followed by corresponding visualizations. Then, the Enhanced‐HGCN model with attention‐based fusion is proposed to learn effective representations from the constructed networks. Based on the learned representations, the node–community collaboration strategy is designed to enable semantic and structural alignment across different platforms by linking similar nodes and their corresponding communities. Experimental results indicate that the constructed collaboration network achieves superior structural connectivity and richer semantic representation compared to single‐platform networks. This collaboration network provides a stronger foundation for downstream tasks such as emergency knowledge recommendation.
Social media platforms are highly interconnected because many users maintain a presence across multiple platforms. Consequently, efforts to limit the spread of misinformation taken by individual platforms can have complex consequences on misinformation diffusion across the social media ecosystem. This is further complicated by the diverse social structures, platform standards, and moderation mechanisms provided on each platform. We study this issue by extending our previous model of Reddit interactions and community-specific moderation measures. By adding a followership-based model of Twitter interactions and facilitating cross-platform user participation, we simulate information diffusion across heterogeneous social media platforms. While incorporating platform-specific moderation mechanisms, we simulate interactions at the user level and specify user-specific attributes. This allows practitioners to conduct experiments with various types of actors and different combinations of moderation. We show how the model can simulate the impacts of such features on discussions facilitated by Reddit and Twitter and the cross-platform spread of misinformation. To validate this model, we use a combination of empirical datasets from three U.S. political events and prior findings from user surveys and studies.
The research field of crisis informatics examines, amongst others, the potentials and barriers of social media use during conflicts and crises. Social media allow emergency services to reach the public easily in the context of crisis communication and receive valuable information (e.g. pictures) from social media data. However, the vast amount of data generated during large-scale incidents can lead to issues of information overload and quality. To mitigate these issues, this paper proposes the semi-automatic creation of alerts including keyword, relevance and information quality filters based on cross-platform social media data. We conducted empirical studies and workshops with emergency services across Europe to raise requirements, then iteratively designed and implemented an approach to support emergency services, and performed multiple evaluations, including live demonstrations and field trials, to research the potentials of social media-based alerts. Finally, we present the findings and implications based on semi-structured interviews with emergency services, highlighting the need for usable configurability and white-box algorithm representation.
Propagation analysis refers to studying how information spreads on social media, a pivotal endeavor for understanding social sentiment and public opinions. Numerous studies contribute to visualizing information spread, but few have considered the implicit and complex diffusion patterns among multiple platforms. To bridge the gap, we summarize cross-platform diffusion patterns with experts and identify significant factors that dissect the mechanisms of cross-platform information spread. Based on that, we propose an information diffusion model that estimates the likelihood of a topic/post spreading among different social media platforms. Moreover, we propose a novel visual metaphor that encapsulates cross-platform propagation in a manner analogous to the spread of seeds across gardens. Specifically, we visualize platforms, posts, implicit cross-platform routes, and salient instances as elements of a virtual ecosystem — gardens, flowers, winds, and seeds, respectively. We further develop a visual analytic system, namely BloomWind, that enables users to quickly identify the cross-platform diffusion patterns and investigate the relevant social media posts. Ultimately, we demonstrate the usage of BloomWind through two case studies and validate its effectiveness using expert interviews.
Background Suicide represents a critical public health concern, and machine learning (ML) models offer the potential for identifying at-risk individuals. Recent studies using benchmark datasets and real-world social media data have demonstrated the capability of pretrained large language models in predicting suicidal ideation and behaviors (SIB) in speech and text. Objective This study aimed to (1) develop and implement ML methods for predicting SIBs in a real-world crisis helpline dataset, using transformer-based pretrained models as a foundation; (2) evaluate, cross-validate, and benchmark the model against traditional text classification approaches; and (3) train an explainable model to highlight relevant risk-associated features. Methods We analyzed chat protocols from adolescents and young adults (aged 14-25 years) seeking assistance from a German crisis helpline. An ML model was developed using a transformer-based language model architecture with pretrained weights and long short-term memory layers. The model predicted suicidal ideation (SI) and advanced suicidal engagement (ASE), as indicated by composite Columbia-Suicide Severity Rating Scale scores. We compared model performance against a classical word-vector-based ML model. We subsequently computed discrimination, calibration, clinical utility, and explainability information using a Shapley Additive Explanations value-based post hoc estimation model. Results The dataset comprised 1348 help-seeking encounters (1011 for training and 337 for testing). The transformer-based classifier achieved a macroaveraged area under the curve (AUC) receiver operating characteristic (ROC) of 0.89 (95% CI 0.81-0.91) and an overall accuracy of 0.79 (95% CI 0.73-0.99). This performance surpassed the word-vector-based baseline model (AUC-ROC=0.77, 95% CI 0.64-0.90; accuracy=0.61, 95% CI 0.61-0.80). 
The transformer model demonstrated excellent prediction for nonsuicidal sessions (AUC-ROC=0.96, 95% CI 0.96-0.99) and good prediction for SI and ASE, with AUC-ROCs of 0.85 (95% CI 0.97-0.86) and 0.87 (95% CI 0.81-0.88), respectively. The Brier Skill Score indicated a 44% improvement in classification performance over the baseline model. The Shapley Additive Explanations model identified language features predictive of SIBs, including self-reference, negation, expressions of low self-esteem, and absolutist language. Conclusions Neural networks using large language model–based transfer learning can accurately identify SI and ASE. The post hoc explainer model revealed language features associated with SI and ASE. Such models may potentially support clinical decision-making in suicide prevention services. Future research should explore multimodal input features and temporal aspects of suicide risk.
Crisis mapping platforms have transformed disaster management and digital humanitarian efforts by allowing victims to quickly submit “Requests for Help” (RFH) messages directly from disaster locations via mobile devices. On these platforms, online volunteers are often engaged in processing and categorizing messy and incomplete RFH messages into structured and useful crisis reports that can aid first responders in their recovery efforts. This research note examines the case of the Ushahidi platform deployed during the 2010 Haiti Earthquake to propose design principles for a crisis mapping platform that facilitates the conversion and categorization of victim-submitted RFH messages into actionable crisis reports for on-site first responders. To validate the proposed design principles, we instantiated them with the help of a template, and conducted a series of experiments to confirm the effectiveness of the template in improving the categorization quality of crisis reports. We expect that the design principles will be particularly useful for developing digital platforms aimed at humanitarian crisis response that requires a large-scale participation of online crowd volunteers.
This research introduces a comprehensive crowdsourced disaster management system utilizing artificial intelligence to enhance real-time response, decision-making, and disaster mitigation. The system integrates deep learning models for disaster detection, categorization, and prediction, leveraging cloud-based AWS services for scalability, reliability, and accessibility. The methodology includes real-time data gathering from social media platforms, IoT sensors, governmental databases, and user-generated reports, ensuring a robust and multi-source approach for situational awareness. By actively involving community participation through mobile and web-based applications, the system strengthens resilience and ensures immediate response to emergency situations. The project addresses critical challenges such as misinformation filtering, automatic classification of disaster severity, automated response recommendations, and infrastructure scalability. With advancements in AI-driven data analytics, the platform ensures efficient disaster response by optimizing resource allocation, reducing response time, and improving the coordination between emergency services and affected populations. The paper highlights the transformative potential of AI in disaster preparedness, mitigation, and response through intelligent automation and crowdsourced intelligence.
This paper describes ESARS, a real-time situation-aware social media-enabled emergency situation alert and reporting system, as a decision support system built on multi-agent software design architecture for emergency situation management. The impact of an incident or disruption due to the incident could be minimized by implementing real-time intervention strategies that involve event monitoring, detection and situation identification via classification and prediction, notification, visualization and reporting that culminate in providing emergency support within time. The nature of agent behavior, which is autonomous, proactive and cooperative, makes agents a suitable method for the design and deployment of a dynamic system of this nature. The system relies on historical and streamed real-time geolocation-enabled Twitter data for the target emergency events to provide decision-makers with dynamic, comprehensive, and timely information specific to the emergency situation.
The rapid spread of misinformation across digital platforms has created a critical need for accessible, real-time fact-checking tools. Existing fact-checking systems are typically web-only, limited to research prototypes, or lack public accessibility and real-time responsiveness. This paper presents SmartFactCheckBot, a dual-interface AI misinformation detection platform deployed through a Telegram bot and a web-based interface. The system uses a DistilBERT classifier fine-tuned on the Kaggle Fake News Classification dataset and incorporates an explainability engine to increase public trust. The backend is deployed on an AWS EC2 t2.small instance in us-east-1, with a production-grade stack including FastAPI, Nginx, systemd services, and HTTPS. Real-world usage analytics collected via Google Analytics 4 (GA4) and Amazon CloudWatch show 125 bot users and 135 unique web users during the initial deployment phase. Experimental results demonstrate an accuracy of 94.7%, F1-score of 94.1%, and an inference time of 72 ms, making the system suitable for public, real-time use. SmartFactCheckBot demonstrates a practical, scalable AI solution to support society in mitigating misinformation and serves as an open-source public-benefit technology.
In today's deep learning-dominated era, real-time classification of public emergencies is a critical research area. Existing methods, however, often fall short in considering both temporal and spatial aspects comprehensively. This study introduces GEDNAS, a novel model that combines an atrous (dilated) convolutional neural network (DCNN), a gated recurrent unit (GRU), and neural architecture search (NAS) to address these limitations. GEDNAS utilizes the DCNN to capture local spatio-temporal features, integrates the GRU for time-series modeling, and employs NAS for overall structural optimization. The approach significantly enhances real-time public emergency classification performance, showcasing its efficiency and accuracy in responding to real-time scenarios and providing robust support for emergency response efforts. This research introduces an innovative solution for public safety, advancing the application of deep learning in emergency management and inspiring the design of real-time classification models, ultimately enhancing overall societal safety.
The explosive growth of social platforms has transformed user-generated content into a vital source for detecting real-world crises in real time. Platforms such as Twitter and Reddit capture early indicators of natural disasters, political unrest, and public health emergencies. Yet, the overwhelming scale of data and the rapid spread of misinformation undermine the accuracy and timeliness of actionable insights. Traditional event detection approaches rely on keyword tracking or clustering, but these methods struggle with scalability, semantic ambiguity, and susceptibility to false alarms. Deep learning–based models have improved detection quality, but they still lack robust mechanisms for filtering unreliable content as it emerges. To address these challenges, this paper proposes CrisisSense, a real-time framework that combines transformer-based semantic embeddings with "Density-Based Spatial Clustering of Applications with Noise (DBSCAN)" for event detection and a hybrid misinformation filtering algorithm. DBSCAN is chosen due to its ability to identify dense semantic clusters while effectively ignoring noisy or irrelevant posts, making it highly suitable for dynamic social streams. The filtering layer integrates credibility scoring, external fact-checking signals, and attention-driven text classification to actively suppress misleading content during crises. The novelty of CrisisSense lies in its dual-layered architecture, which ensures both timely event detection and reliability by mitigating misinformation at the source. Evaluation on large-scale Twitter and Reddit datasets demonstrates superior performance in terms of precision, recall, and detection latency, establishing CrisisSense as a scalable solution for enhancing crisis intelligence, emergency response, and decision-making in high-stakes environments.
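The DBSCAN step named in this abstract can be shown with a minimal pure-Python implementation over embedding points (a textbook sketch, not CrisisSense's code; the `eps` and `min_pts` values below are illustrative):

```python
def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one cluster label per point (-1 = noise).

    points: list of equal-length numeric tuples (e.g. post embeddings).
    Dense regions become clusters; sparse points stay noise, which is why
    the abstract favors it for ignoring irrelevant posts.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    def neighbors(i):
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1          # noise (may later join a cluster as border)
            continue
        cluster += 1
        labels[i] = cluster
        seeds = list(nbrs)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            jn = neighbors(j)
            if len(jn) >= min_pts:   # core point: keep expanding the cluster
                seeds.extend(jn)
    return labels
```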
The increasing reliance on social media as a primary information source has heightened the need for efficient real-time event detection. This study introduces a scalable framework that leverages big data analytics, natural language processing (NLP), and machine learning to identify emerging events with high accuracy. Our methodology integrates streaming data processing, deep learning-based text analysis, and graph-based clustering. The proposed model demonstrates superior performance in detecting real-world events while effectively filtering misinformation and reducing noise, as evidenced by experimental evaluation. The results suggest that our approach enhances the accuracy and efficiency of event detection in social media streams, making it a valuable tool for emergency response, public safety, and news verification [1][2].
Existing misinformation detection benchmark datasets (e.g., COVMIS and LIAR2) are limited by their reliance on fact-checking labels that are prone to factual inaccuracies due to cognitive constraints of fact-checkers and outdated labels. Prior misinformation detection tasks have been hindered by the dual problems of label redundancy and cold start. To this end, we propose a novel Cyclic Evidence-based Misinformation Detection (CEMD) framework, which incorporates two core mechanisms: (i) a Retrieval Augmented Generation (RAG) pipeline that leverages the latest external knowledge to augment insufficient prior knowledge; and (ii) a cyclic evidence-bootstrapping mechanism that mitigates label redundancy and cold start. We introduce an improved dataset, COVMIS2, built upon COVMIS, and conduct comprehensive experiments to evaluate the efficacy of our framework. Our results demonstrate that the CEMDo outperforms the prior state-of-the-art (SOTA) baseline on LIAR2 by 11.95% and surpasses the human baseline on COVMIS2 by 6.31%, leveraging the Llama-3-70B-Instruct model to augment prior knowledge and the DoRA fine-tuned Llama-3-8B-Instruct model for binary classification. Furthermore, we curate new benchmark datasets, COVMIS2024 and LIAR2024, by recategorizing the redundant labels of COVMIS2 and LIAR2 through the CEMDo.
This research introduces a machine learning framework for real-time rumor detection on social media platforms, particularly focusing on tweet classification. Several machine learning approaches, including Support Vector Machine (SVM), Logistic Regression, Naïve Bayes, Random Forest, and a Convolutional Neural Network (CNN), are explored. An ensemble model integrating Random Forest, SVM, and Logistic Regression is also examined. The CNN architecture demonstrates the highest accuracy (88%), outperforming the ensemble model (87%) and SVM (87%). CNN's superior performance is attributed to its ability to capture deep linguistic patterns and contextual nuances in social media content. The system features an intuitive user interface for immediate analysis, offering a practical solution to combat misinformation. The results underscore the potential of advanced machine learning techniques in enhancing the accuracy and efficiency of automated rumor detection.
In this paper, a novel hybrid deep learning framework for real-time and explainable social network analytics is presented. BERT and graph neural networks (GNNs) are combined to analyze textual content and user interaction graphs. By fusing these two complementary categories of signals through a multi-modal attention mechanism, the framework captures both the meaning of user-generated content and the underlying social dynamics. It is designed for multi-task learning, supporting sentiment classification, misinformation detection, and influence modeling in dynamic, large-scale social media environments. The model includes Explainable AI (XAI) modules, such as SHAP values and attention heat maps, to ensure transparency so that analysts and users can interpret how predictions are made. In addition, a trust index is introduced to evaluate model reliability in scenarios such as health monitoring and crisis response. A Kafka-based streaming engine enables real-time data processing, so network metrics can be updated with low latency as data arrives. The framework is evaluated on benchmark datasets: SemEval for sentiment analysis, FakeNewsNet for misinformation detection, and SNAP Twitter graphs for structural analysis. Results show that the proposed model outperforms the baselines in accuracy, F1-score, and robustness while remaining computationally efficient enough for real-time deployment. The XAI tools also provide meaningful interpretability, which is invaluable for decision-making systems operating in high-impact domains such as healthcare.
This paper introduces a hybrid framework for fake news classification in Indonesian-Language social media by integrating document-level similarity with sentence-level linguistic analysis. The approach begins with cosine similarity computation between user-submitted headlines and a curated hoax corpus represented through TF-IDF vectors. Headlines that exceed the threshold are immediately labeled as fake, while those falling below undergo a second stage of analysis. In this stage, relevant tweets are retrieved in real time and examined using six carefully selected sentence-level features: average sentence length, punctuation frequency, function word usage, phrase structure count, sentiment polarity, and the type–token ratio of content words. These features are designed to capture the syntactic and stylistic patterns commonly found in misinformation. The dataset, collected from TurnBackHoax.id, Komdigi, and Kompas, consists of 32,865 labeled entries. A stratified 10-fold cross-validation was employed to evaluate five machine learning classifiers. Results demonstrate that the Support Vector Machine (SVM) with an RBF kernel achieved the strongest performance, recording an F1-score of 84.4% and surpassing MLP, KNN, Decision Tree, and Naive Bayes. Validation on 15 real news headlines further confirmed the robustness of the framework in low-similarity cases. These findings underscore that the integration of vector-based similarity with optimized sentence-level features enhances detection accuracy while preserving transparency and adaptability. The proposed model offers a lightweight and domain-flexible solution that is particularly suitable for real-time misinformation mitigation in low-resource contexts.
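The first stage described above, thresholded cosine similarity over TF-IDF vectors, can be sketched as follows (a simplified whitespace-tokenized version; the corpus, threshold value, and function names are illustrative, not the paper's):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight) for a small corpus."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # +1 keeps shared terms
    return [{t: c / len(toks) * idf[t] for t, c in Counter(toks).items()}
            for toks in tokenized]

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def stage_one_label(headline, hoax_corpus, threshold=0.6):
    """Label 'fake' if the headline is close enough to a known hoax;
    otherwise defer to the sentence-level second stage (not shown)."""
    vecs = tfidf_vectors(hoax_corpus + [headline])
    best = max(cosine(vecs[-1], v) for v in vecs[:-1])
    return "fake" if best >= threshold else "needs-stage-two"
```

Only headlines below the threshold incur the cost of the second stage (tweet retrieval plus six sentence-level features), which is what keeps the pipeline lightweight.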
The proliferation of misinformation, such as rumors on social media, has drawn significant attention, prompting various expressions of stance among users. Although rumor detection and stance detection are distinct tasks, they can complement each other. Rumors can be identified by cross-referencing stances in related posts, and stances are influenced by the nature of the rumor. However, existing stance detection methods often require post-level stance annotations, which are costly to obtain. We propose a novel LLM-enhanced Multiple Instance Learning (MIL) approach to jointly predict post stance and claim class labels, supervised solely by claim labels, using an undirected microblog propagation model. Our weakly supervised approach relies only on bag-level labels of claim veracity, aligning with MIL principles. To achieve this, we transform the multi-class problem into multiple MIL-based binary classification problems. We then employ a discriminative attention layer to aggregate the outputs from these classifiers into finer-grained classes. Experiments conducted on three rumor datasets and two stance datasets demonstrate the effectiveness of our approach, highlighting strong connections between rumor veracity and expressed stances in responding posts. Our method shows promising performance in joint rumor and stance detection compared to the state-of-the-art methods.
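The MIL aggregation idea above, combining post-level (instance) predictions into a claim-level (bag) score through an attention layer, can be sketched as (a generic attention-pooling illustration, not the paper's exact discriminative attention layer; in the real model the attention logits are learned):

```python
import math

def mil_aggregate(instance_scores, attn_logits):
    """Attention-based MIL pooling: softmax the per-post attention logits
    and return the weighted claim-level rumor score plus the weights.

    Only the bag (claim) label supervises training; the weights give the
    per-post contribution, which is how post stances emerge weakly supervised.
    """
    m = max(attn_logits)                       # subtract max for stability
    exps = [math.exp(a - m) for a in attn_logits]
    z = sum(exps)
    weights = [e / z for e in exps]
    bag_score = sum(w * s for w, s in zip(weights, instance_scores))
    return bag_score, weights
```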
No abstract available
With the proliferation of user-generated content on social media platforms, the dissemination of rumors and misinformation has become a pressing concern. The rapid spread of such content poses significant challenges to public trust, political stability, and information integrity. In this context, the development of automated rumor detection systems has gained increasing attention. However, a critical challenge lies in ensuring the robustness of these systems against adversarial manipulations—subtle alterations to input text that can degrade model performance. This study systematically evaluates the vulnerability of transformer-based models, specifically DistilBERT, to adversarial attacks in the domain of rumor detection. The experiments are conducted on the LIAR dataset, a benchmark corpus comprising over 12,000 fact-checked political statements annotated with six fine-grained truthfulness labels, which are restructured into a binary classification framework. The model is subjected to three types of adversarial perturbations: synonym substitution, character-level noise, and insertion of distracting terms. While the fine-tuned DistilBERT model demonstrates strong performance on clean data—achieving an accuracy of 0.97, F1-score of 0.96, precision of 0.95, and recall of 0.98—its performance degrades notably under adversarial conditions. For instance, character-level attacks reduce accuracy to 0.90 and recall to 0.81, underscoring the model’s susceptibility to minor textual variations. These findings emphasize the necessity of enhancing adversarial robustness in rumor detection models to ensure their reliability in real-world social media environments.
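A character-level perturbation of the kind evaluated above can be sketched as (an illustrative adjacent-swap attack; the paper's exact perturbation operators may differ):

```python
import random

def char_noise(text, rate=0.1, seed=0):
    """Character-level adversarial perturbation (illustrative): randomly swap
    adjacent alphabetic characters at the given rate. The seeded RNG makes
    the perturbation reproducible for robustness experiments."""
    rng = random.Random(seed)
    chars = list(text)
    for i in range(len(chars) - 1):
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)
```

Such swaps leave the claim readable to humans while shifting the subword tokenization a model like DistilBERT sees, which is why accuracy degrades under this attack.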
The prevalence of social media has made information sharing possible across the globe. The downside, unfortunately, is the wide spread of misinformation. Methods applied in most previous rumor classifiers give equal weight, or attention, to every word in a microblog and do not take context beyond the microblog content into account; as a result, accuracy plateaus. In this research, we propose an ensemble neural architecture to detect rumors on Twitter. The architecture incorporates word attention and context from the author to enhance classification performance. In particular, the word-level attention mechanism enables the architecture to put more emphasis on important words when constructing the text representation. To derive further context, microblog posts composed by individual authors are exploited, since they reflect the style and characteristics of how those authors spread information, which are significant cues for classifying whether shared content is rumor or legitimate news. Experiments on a real-world Twitter dataset collected from two well-known rumor-tracking websites demonstrate promising results.
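The word-level attention mechanism described above can be illustrated with a dot-product scoring sketch (generic attention pooling; in the real model the query vector and word embeddings are learned, not hand-set):

```python
import math

def word_attention(word_vecs, query):
    """Dot-product word attention: score each word vector against a query
    vector, softmax the scores, and return the attention-weighted sum as
    the text representation, plus the weights themselves."""
    scores = [sum(q * x for q, x in zip(query, v)) for v in word_vecs]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(word_vecs[0])
    text_vec = [sum(w * v[d] for w, v in zip(weights, word_vecs))
                for d in range(dim)]
    return text_vec, weights
```

A word whose vector aligns with the query dominates the pooled representation, which is the "more emphasis on important words" behavior the abstract describes.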
No abstract available
The coronavirus emerged at the end of 2019 and has caused thousands of casualties all over the world. The pandemic has also been accompanied by loss of employment and economic downfall. Naturally, the pandemic and the lack of knowledge about the coronavirus have created public anxiety and panic. Nowadays, social media platforms such as Twitter and Facebook and online news forums reach most people and have become popular channels of communication and information sharing. Unfortunately, they have also become easy targets for rumors and fake news. The rapid flow of rumors and misleading information about the coronavirus over these online platforms has promoted public anxiety and fear. Consequently, rumor detection has become essential for the economy and public safety. In this context, the present research focuses on detecting and classifying rumors so that precautionary measures can be taken. An attention-based BiLSTM with BERT is proposed for rumor classification on the COVID-19 rumor dataset. The suggested classification model achieved an accuracy of 80.71% and a micro-F1 score of 90.85. Furthermore, the experimental outcomes affirm the superior efficacy of the proposed technique over existing methods.
No abstract available
In the current digital era, the rapid expansion of social media has accelerated the spread of rumors, allowing misinformation to circulate rapidly and affect a wide audience. Detecting rumors is a challenging task due to their varying degrees of credibility and diverse categories. Current detection techniques generally focus on single features, such as textual information or sentiment analysis, while more advanced methods combine different data modalities, including text, images, and metadata, to improve accuracy. Some methods employ hierarchical structures; however, they often fail to comprehend the semantic and contextual nuances of the content and overlook the complete retweet propagation structure, impairing their effectiveness in detecting rumors. This paper proposes a Dual GNN BERT-based Model (DGBM) for rumor detection. Our model utilizes a Graph Convolutional Network (GCN) and a Graph Neural Network (GNN) to effectively capture contextual and semantic relationships between the words in source tweets and their retweets. We applied Gated-GNN to combine information from different levels and better understand how rumors spread. We conducted experiments on the Sina Weibo benchmark dataset, achieving an accuracy of 97%, which is 2% higher than the previous methods. This improvement demonstrates that our approach effectively captures local and global features of the propagation graph structure.
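The GCN-style neighborhood aggregation underlying such propagation models can be illustrated with one simplified layer (mean aggregation with self-loops, omitting the learned weight matrix and nonlinearity of a real GCN):

```python
def gcn_layer(adj, features):
    """One mean-aggregation graph convolution step (simplified sketch):
    each node averages its own and its neighbors' feature vectors.

    adj: n x n 0/1 adjacency matrix (e.g. tweet-retweet edges).
    features: n feature vectors of equal length.
    """
    n = len(features)
    dim = len(features[0])
    out = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j] or j == i]  # self-loop
        out.append([sum(features[j][d] for j in nbrs) / len(nbrs)
                    for d in range(dim)])
    return out
```

Stacking such layers lets a node's representation absorb information from multi-hop neighborhoods of the retweet graph, which is how local and global propagation features are captured.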
As online social networks are experiencing extreme popularity growth, determining the veracity of online statements denoted by rumors automatically as earliest as possible is essential to prevent the harmful effects of propagating misinformation. Early detection of rumors is facilitated by considering the wisdom of the crowd through analyzing different attitudes expressed toward a rumor (i.e., users’ stances). Stance detection is an imbalanced problem as the querying and denying stances against a given rumor are significantly less than supportive and commenting stances. However, the success of stance-based rumor detection significantly depends on the efficient detection of “query” and “deny” classes. The imbalance problem has led the previous stance classifier models to bias toward the majority classes and ignore the minority ones. Consequently, the stance and subsequently rumor classifiers have been faced with the problem of low performance. This paper proposes a novel adaptive cost-sensitive loss function for learning imbalanced stance data using deep neural networks, which improves the performance of stance classifiers in rare classes. The proposed loss function is a cost-sensitive form of cross-entropy loss. In contrast to most of the existing cost-sensitive deep neural network models, the utilized cost matrix is not manually set but adaptively tuned during the learning process. Hence, the contributions of the proposed method are both in the formulation of the loss function and the algorithm for calculating adaptive costs. The experimental results of applying the proposed algorithm to stance classification of real Twitter and Reddit data demonstrate its capability in detecting rare classes while improving the overall performance. The proposed method improves the mean F-score of rare classes by about 13% in RumorEval 2017 dataset and about 20% in RumorEval 2019 dataset.
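The core idea above, scaling cross-entropy by per-class costs that grow for rare stances such as "query" and "deny", can be sketched as (fixed inverse-frequency costs for illustration; the paper instead tunes the cost matrix adaptively during training):

```python
import math

def inverse_freq_costs(class_counts):
    """Cost per class proportional to inverse frequency, normalized so the
    mean cost is 1; rare classes receive costs above 1."""
    inv = [1.0 / c for c in class_counts]
    mean = sum(inv) / len(inv)
    return [x / mean for x in inv]

def cost_sensitive_ce(probs, label, class_costs):
    """Cost-sensitive cross-entropy for one example: the usual -log p(label)
    scaled by the cost of the true class, so errors on rare classes are
    penalized more heavily."""
    return class_costs[label] * -math.log(probs[label])
```

With counts of 80 "support/comment" vs. 20 "query/deny" examples, the rare class gets a cost of 1.6 versus 0.4, pushing the classifier away from majority-class bias.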
The rapid dissemination of rumors on social media highlights the urgent need for automatic detection methods to safeguard societal trust and stability. While existing multimodal rumor detection models primarily emphasize capturing consistency between intrinsic modalities (e.g., news text and images), they often overlook the intricate interplay between intrinsic and social modalities. This limitation hampers the ability to fully capture nuanced relationships that are crucial for a comprehensive understanding. Additionally, current methods struggle with effectively fusing social context with textual and visual information, resulting in fragmented interpretations. To address these challenges, this paper proposes a novel Intrinsic-Social Modality Alignment and Fusion (ISMAF) framework for multimodal rumor detection. ISMAF first employs a cross-modal consistency alignment strategy to align complex interactions between intrinsic and social modalities. It then leverages a mutual learning approach to facilitate collaborative refinement and integration of complementary information across modalities. Finally, an adaptive fusion mechanism is incorporated to dynamically adjust the contribution of each modality, tackling the complexities of three-modality fusion. Extensive experiments on both English and Chinese real-world multimedia datasets demonstrate that ISMAF consistently outperforms state-of-the-art models.
No abstract available
The rapid propagation of rumors on social media can give rise to various social issues, underscoring the necessity of swift and automated rumor detection. Existing studies typically identify rumors based on their textual or static propagation-structure information, without considering how the structure of rumor propagation changes dynamically over time. In this paper, we propose the Temporal Tree Transformer model, which simultaneously considers text, propagation structure, and temporal change. By observing the growth of propagation tree structures across different time windows, we use a Gated Recurrent Unit (GRU) to encode these trees and obtain better representations for the classification task. We evaluate our model's performance on the PHEME dataset. In most existing studies, information leakage occurs when conversation threads from all events are randomly divided into training and test sets. We instead perform Leave-One-Event-Out (LOEO) cross-validation, which better reflects real-world scenarios. The experimental results show that our model achieves a state-of-the-art accuracy of 75.84% and a Macro F1 score of 71.98%. These results demonstrate that extracting temporal features from propagation structures improves model generalization.
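The idea of encoding a growing propagation tree with a GRU can be sketched with a scalar GRU cell stepped over per-window tree statistics. This is a deliberately tiny illustration, assuming scalar states and fixed toy weights; the actual model uses full vector-valued GRUs over learned tree encodings.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(h, x, wz=1.0, wr=1.0, wh=1.0):
    # Scalar GRU cell: z is the update gate, r the reset gate.
    z = sigmoid(wz * (h + x))
    r = sigmoid(wr * (h + x))
    h_tilde = math.tanh(wh * (r * h + x))
    return (1 - z) * h + z * h_tilde

# Each x stands for an encoding of the propagation tree observed in one
# time window (here: a normalized tree-size statistic that grows).
h = 0.0
for x in [0.1, 0.4, 0.9]:
    h = gru_step(h, x)
```

The final state `h` summarizes how the tree evolved across windows, which is what the classifier consumes instead of a single static snapshot.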
The development of the Internet has been accompanied by many problems and challenges, among which the negative impact of rumors on society urgently needs addressing. The Weibo platform is virtual and anonymous and has a very large user base, so identifying and predicting Weibo rumors can help reduce the harm rumors cause to society. In this paper, NLP techniques and feature engineering are used together with a random forest classifier to achieve accurate identification and prediction of rumors. The research shows that Weibo rumors can be identified precisely. By accurately identifying rumors, this work aims to reduce their harmful effects and provide a reference for society.
Detecting rumors is among the most difficult tasks when working with social media sites. To address this problem, various Deep Learning (DL)-based rumor detection methods have evolved in recent decades. Among them, Bidirectional Encoder Representations from Transformers with Attention-based Balanced Spatial-Temporal Residual Graph Convolutional Networks (BERT-ABSTRGCN) was developed to address model convergence issues in rumor stance classification. However, training slows down significantly due to complex features, which increase computational complexity and strain the hardware resources needed for the large feature set. To address this, an ensemble-based multiple time-series analysis technique is developed for timely rumor recognition on social media by eliminating feature-set complexity on larger datasets. Initially, the collected tweet streams are pre-processed to obtain clean data. Then, BERT is applied to the pre-processed data to extract relevant tweet-timestamp features without time delay for model training. Reaction counts are used as features in time-series vectors created from Twitter conversations (lists of tweets); these vectors are then fed into DL models to produce the time-series data. The suggested ensemble model improves classification performance using majority voting across multiple ABSTRGCNs within the ensemble, leveraging the individual strengths of each network to enhance overall performance. The proposed model reduces resource and computational complexity when training on large datasets, improving rumor detection efficiency on social media. The complete model is termed BERT-Ensemble of ABSTRGCN (BERT-EABSTRGCN). The findings show that the suggested model reaches 96.89% accuracy on the PHEME dataset, compared to classical models.
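The majority-voting step used by the ensemble is straightforward and can be sketched independently of the underlying networks. This is a generic illustration, not the paper's code; each inner list stands in for the predictions of one ABSTRGCN member.

```python
from collections import Counter

def majority_vote(predictions):
    # predictions: one label list per ensemble member, aligned by example.
    # For each example, return the label most members agree on.
    voted = []
    for per_example in zip(*predictions):
        voted.append(Counter(per_example).most_common(1)[0][0])
    return voted

votes = majority_vote([
    ["rumor", "non-rumor", "rumor"],      # member 1
    ["rumor", "rumor",     "rumor"],      # member 2
    ["non-rumor", "non-rumor", "rumor"],  # member 3
])
```

Voting over independently trained members tends to cancel out individual errors, which is the "individual strengths" argument made in the abstract.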
Early rumor detection (ERD) on social media platforms is very challenging when only limited, incomplete, and noisy information is available. Most existing methods have largely worked on event-level detection, which requires collecting posts relevant to a specific event and relies only on user-generated content. They are not appropriate for detecting rumor sources in the very early stages, before an event unfolds and becomes widespread. In this paper, we address the task of ERD at the message level. We present a novel hybrid neural network architecture, which combines a task-specific character-based bidirectional language model and stacked Long Short-Term Memory (LSTM) networks to represent the textual contents and social-temporal contexts of input source tweets, for modelling propagation patterns of rumors in the early stages of their development. We apply multi-layered attention models to jointly learn attentive context embeddings over multiple context inputs. Our experiments employ a stringent leave-one-out cross-validation (LOO-CV) evaluation setup on seven publicly available real-life rumor event datasets. Our models achieve state-of-the-art (SoA) performance for detecting unseen rumors on large augmented data covering more than 12 events and 2,967 rumors. An ablation study is conducted to understand the relative contribution of each component of our proposed model.
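The leave-one-out setup at the event level (hold out every post from one event, train on the rest) is easy to implement and prevents the event leakage noted in several of the abstracts above. A minimal sketch, assuming each sample is an `(event_id, text, label)` tuple; the event names below are illustrative, echoing PHEME-style events.

```python
def leave_one_event_out(samples):
    # samples: list of (event_id, text, label) tuples. Yields one
    # (held_out_event, train, test) split per event, so no post from
    # the held-out event leaks into training.
    events = sorted({s[0] for s in samples})
    for held_out in events:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test

data = [("charliehebdo", "t1", 1), ("charliehebdo", "t2", 0),
        ("ferguson", "t3", 1), ("sydneysiege", "t4", 0)]
folds = list(leave_one_event_out(data))
```

Compared with random thread-level splits, this setup measures generalization to genuinely unseen events, which is what an early-detection system faces in practice.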
No abstract available
Automatically detecting rumors on social media has become a challenging task. Previous studies focus on learning indicative clues from conversation threads to identify rumorous information. However, these methods model rumorous conversation threads from various views but fail to fuse the multi-view features effectively. In this paper, we propose a novel multi-view fusion framework for rumor representation learning and classification. It encodes the multiple views with Graph Convolutional Networks (GCN) and leverages Convolutional Neural Networks (CNN) to capture the consistent and complementary information among all views and fuse them. Experimental results on two public datasets demonstrate that our method outperforms state-of-the-art approaches.
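The GCN encoding of a conversation view boils down to repeated neighborhood averaging over the thread graph. A minimal sketch of one propagation step, H' = D^(-1/2)(A + I)D^(-1/2)H, with no learned weight matrix so it stays self-contained; real GCN layers multiply by a trained weight matrix and apply a nonlinearity.

```python
import math

def gcn_layer(adj, feats):
    # One symmetric-normalized propagation step over a dense 0/1
    # adjacency matrix with self-loops added.
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    out = []
    for i in range(n):
        row = []
        for f in range(len(feats[0])):
            s = sum(a_hat[i][j] * feats[j][f] / math.sqrt(deg[i] * deg[j])
                    for j in range(n))
            row.append(s)
        out.append(row)
    return out

# Tiny conversation thread: node 0 is the source post, 1 and 2 are replies.
adj = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
h = gcn_layer(adj, [[1.0], [0.0], [0.0]])
```

After one step, the replies pick up signal from the source post; stacking layers spreads indicative clues further along the thread before the CNN fuses the views.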
In an era where both the general public and established news outlets increasingly rely on social media for real-time information, the abundance of rumors poses a major challenge. False information can have far-reaching implications, affecting individuals, communities, and even entire countries. To address this issue, a low-cost, self-regulating, and forward-thinking rumor detection technique is required. This research performs an intensive analysis of the performance of three robust machine learning algorithms, XGBoost, SVM, and Random Forest, as well as two deep learning-based transformers, BERT and DistilBERT. The models are trained and evaluated on a combined dataset comprising the Twitter15 and Twitter16 datasets. Support Vector Machine (SVM) and Random Forest exhibit the best accuracy among the classical machine learning models, reaching 89.05%, while among the transformer-based deep learning models, BERT achieves the best accuracy at 90.20%. In conclusion, the transformer-based models beat their competitors in terms of accuracy, recall, precision, and F-measure, proving their efficacy in minimizing the detrimental impact of rumors on people, communities, and society as a whole.
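The metrics these comparisons report (precision, recall, F-measure) reduce to simple confusion counts. A small self-contained helper, for a binary rumor/non-rumor setting; the labels and example predictions are illustrative.

```python
def prf1(y_true, y_pred, positive="rumor"):
    # Precision, recall, and F1 for the chosen positive class,
    # computed from true-positive/false-positive/false-negative counts.
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

p, r, f = prf1(["rumor", "rumor", "non-rumor", "non-rumor"],
               ["rumor", "non-rumor", "rumor", "non-rumor"])
```

Because rumor datasets are often imbalanced, F1 on the rumor class is usually more informative than raw accuracy when comparing models like those above.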
No abstract available
In the era of widespread dissemination through social media, the task of rumor detection plays a pivotal role in establishing a trustworthy and reliable information environment. Nonetheless, existing research on rumor detection confronts several challenges: the limited expressive power of text encoding sequences, difficulties in domain-knowledge coverage and effective information extraction with knowledge graph-based methods, and insufficient mining of semantic structural information. To address these issues, we propose a Crowd Intelligence and ChatGPT-Assisted Network (CICAN) for rumor classification. Specifically, we present a crowd intelligence-based semantic feature learning module to capture the sequential and hierarchical features of textual content. Then, we design a knowledge-based semantic structural mining module that leverages ChatGPT for knowledge enhancement. Finally, we construct an entity-sentence heterogeneous graph and design Entity-Aware Heterogeneous Attention to effectively integrate diverse structural-information meta-paths. Experimental results demonstrate that CICAN achieves performance improvements in rumor detection tasks, validating the effectiveness and rationality of using large language models as auxiliary tools.
Most rumor detection methods extract rumor features from two aspects, text semantics and propagation structure, to achieve automatic rumor classification, yet most existing methods do not recognize that false and irrelevant interactions in the propagation structure reduce the accuracy of rumor detection. In addition, most existing rumor detection methods fail to effectively extract key clues from the comments of social network users. In response, this article proposes a social network rumor detection method combining a dual attention mechanism and a graph convolutional network (GCN), termed dual-attention GCN (DA-GCN). First, an event propagation graph is built; then the GCN is used to extract the propagation-structure information of each event-related microblog (tweet), combined with an attention mechanism to suppress false and irrelevant interactive relationships, so that anti-interference propagation-structure features are extracted from the propagation graph. Second, to fully utilize the clues in users' comments, the article uses an attention mechanism to fuse the source microblog (tweet) with the comment-retweet information and extract interactive semantic features from it. Finally, the two features are fused to generate a new event representation. Experimental results show that the proposed DA-GCN achieves accuracies of 94.4%, 90.5%, and 90.2% on the Weibo, Twitter15, and Twitter16 datasets, respectively, and performs excellently on the early rumor detection task, demonstrating that the proposed method is reasonable and effective.
The proliferation of rumors on social media has become a major concern due to their ability to create a devastating impact. Manually assessing the veracity of social media messages is a very time-consuming task that can be greatly aided by machine learning. Most message veracity verification methods exploit only textual contents and metadata; very few take both textual and visual contents, and more particularly images, into account. Moreover, prior works have used many classical machine learning models to detect rumors, but although recent studies have proven the effectiveness of ensemble machine learning approaches, such models have seldom been applied. Thus, in this paper, we propose a set of advanced image features inspired by the field of image quality assessment, and introduce the Multimodal fusiON framework to assess message veracIty in social neTwORks (MONITOR), which exploits all message features by exploring various machine learning models. Moreover, we demonstrate the effectiveness of ensemble learning algorithms for rumor detection by using five metalearning models. Finally, we conduct extensive experiments on two real-world datasets. Results show that MONITOR outperforms state-of-the-art machine learning baselines and that ensemble models significantly increase MONITOR's performance.
Accurate and efficient rumor detection is critical for information governance, particularly in the context of the rapid spread of misinformation on social networks. Traditional rumor detection relied primarily on manual analysis. With the continuous advancement of technology, machine learning and deep learning approaches for rumor identification have gradually emerged and gained prominence. However, previous approaches often struggle to simultaneously capture both the sequential and the global structural relationships among topological nodes within a social network. To tackle this issue, we introduce a hybrid model for detecting rumors that integrates a Graph Convolutional Network (GCN) with a Transformer architecture, aiming to leverage the complementary strengths of structural and semantic feature extraction. Positional encoding helps preserve the sequential order of these nodes within the propagation structure. The use of multi-head attention mechanisms enables the model to capture features across diverse representational subspaces, thereby enhancing both the richness and depth of text comprehension. This integration allows the framework to concurrently identify the key propagation network of rumors, the textual content, the long-range dependencies, and the sequence among propagation nodes. Experimental evaluations on publicly available datasets, including Twitter15 and Twitter16, demonstrate that our proposed fusion model significantly outperforms both standalone models and existing mainstream methods in terms of accuracy. These results validate the effectiveness and superiority of our approach for the rumor detection task.
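The positional encoding mentioned above is, in the standard Transformer formulation, the sinusoidal scheme PE[pos, 2i] = sin(pos / 10000^(2i/d)) and PE[pos, 2i+1] = cos(pos / 10000^(2i/d)). A minimal sketch, assuming this standard scheme is the one applied to the propagation-node sequence (the paper may use a learned variant):

```python
import math

def positional_encoding(n_positions, d_model):
    # Sinusoidal positional encoding: even dimensions use sin, odd use
    # cos, with wavelengths increasing geometrically across dimensions.
    pe = []
    for pos in range(n_positions):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

# One row per propagation node, added to that node's feature vector so
# the attention layers can distinguish early posts from late ones.
pe = positional_encoding(4, 8)
```

Without this additive encoding, self-attention is permutation-invariant and would lose exactly the sequential ordering of propagation nodes that the hybrid model sets out to preserve.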
No abstract available
This report integrates the current state of research on Chinese emergency misinformation detection from multiple perspectives, organizing the literature into five core modules: multimodal fusion techniques, propagation-structure modeling, large-model-driven semantic analysis, emergency response frameworks for real-world deployment, and robust learning strategies. Overall, research in this field is undergoing a paradigm shift from single-source detection toward multi-source cross-modal fusion, from traditional machine learning toward fine-tuned large language models, and from offline algorithmic analysis toward real-time systems for emergency scenarios, aiming to solve the problem of information verification amid complex sudden-onset events.