AI Dialogue Systems for Psychological Screening
Multimodal Perception and Screening via Fusion of Behavioral and Physiological Features
This group of studies focuses on non-invasive techniques for extracting multidimensional biometric features, including text semantics, speech (pitch/rhythm), vision (facial expressions/eye movements), and physiological signals (ECG/RRI). The core research question is how multimodal fusion algorithms (e.g., attention mechanisms, Transformers, CNN-LSTM) can improve the accuracy and robustness of automatic detection of depression, anxiety, and other mental disorders; a minimal fusion sketch follows the reference list below.
- Multimodal machine learning for video based single question mental health assessment(Bradley Grimm, Pernille Yilmam, Brett Talbot, Loren Larsen, 2025, npj Digital Medicine)
- MDAM-Net: A Multi-Dimensional Adaptive Attention Network for Multimodal Depression Detection(Yiming Gao, 2025, 2025 IEEE 5th International Conference on Data Science and Computer Application (ICDSCA))
- A Multimodal Virtual Psychiatrist Interviewer and Mental Health Screener(Suresh Yeresime, 2026, International Journal for Research in Applied Science and Engineering Technology)
- AI-Powered Mental Health Assessment Using Speech and Text Analysis(S. Patil, Samruddhi Faratkhane, 2025, International Journal For Multidisciplinary Research)
- WavFace: A Multimodal Transformer-Based Model for Depression Screening(Ricardo Flores, M. Tlachac, Avantika Shrestha, Elke A. Rundensteiner, 2025, IEEE Journal of Biomedical and Health Informatics)
- Machine Learning Models for Speech-Based Depression Screening(Kunanon Kongchatree, Ongon Suriyo, Natvara Pichedpan, R. Kongkachandra, Pokpong Songmuang, 2025, 2025 IEEE International Conference on Cybernetics and Innovations (ICCI))
- A transparent four-feature speech model for depression screening applicable across clinical and community settings, including assisted-living environments(K. Mekulu, F. Aqlan, Hui Yang, 2026, Frontiers in Digital Health)
- Heart2Mind: Human-Centered Contestable Psychiatric Disorder Prediction System Using Wearable ECG Monitors(Hung Nguyen, Alireza Rahimi, Veronica Whitford, Hélène Fournier, Irina Kondratova, René Richard, Hung Cao, 2026, ACM Transactions on Computing for Healthcare)
- Deep Learning Multimodal Ensemble Techniques for Detecting the Depression Level using Multitudinous Selective Features(Deepak Joshi, Manpreet Kaur, 2024, 2024 2nd International Conference on Advances in Computation, Communication and Information Technology (ICAICCIT))
- Mindsphere: An AI-Driven Multimodal Framework for Personalized Mental Health Assessment(T. S, S. S, S. P, V. R, 2025, 2025 International Conference on Information, Implementation, and Innovation in Technology (I2ITCON))
- Multimodal Mental Health Screening via Questionnaire Fusion and Semantic Embeddings(Chaitra B V, S. M. K., Aditya Anand, Raksha B V, Sara Suha, S. Keerthi, 2025, 2025 2nd International Conference on Software, Systems and Information Technology (SSITCON))
- Real‐Time Anxiety and Depression Detection by Combining Large Language Models and Machine Learning With Explainability Capabilities on a User‐Centric, Engaging Conversational Assistant(Silvia García-Méndez, Francisco de Arriba-Pérez, Julen Beiro‐Suso, F. González-Castaño, 2026, Expert Systems)
- Depression screening with textual and audio features based on large language models and machine learning.(Yu Jin, Xin Chen, Xintian Hong, Mingge Wang, Wenbang Niu, Ao Liu, Yi Li, Yanjun Bu, Yuanyuan Wang, 2025, Journal of Affective Disorders)
- Cross-Modal Attention for Multimodal Depression Detection Using Limited DAIC-WOZ Data(Farras Shaabihah, Kusnawi Kusnawi, 2025, 2025 12th International Conference on Information Technology, Computer, and Electrical Engineering (ICITACEE))
- Manasvita : AI-Powered Multimodal Mental Wellness Platform(Ms. Anuradha Singh, 2025, International Journal for Research in Applied Science and Engineering Technology)
- Integrating Visual Modalities with Large Language Models for Mental Health Support(Zhouan Zhu, Shangfei Wang, Yuxin Wang, Jiaqiang Wu, 2025, International Conference on Computational Linguistics)
- Semi-Structural Interview-Based Chinese Multimodal Depression Corpus Towards Automatic Preliminary Screening of Depressive Disorders(Bochao Zou, Jiali Han, Yingxue Wang, R. Liu, Shenghui Zhao, Lei Feng, Xiang-wen Lyu, Huimin Ma, 2023, IEEE Transactions on Affective Computing)
- SilentCare: A Trust-Aware and Explainable AI Companion for Mental Health Support Through Adaptive Multichannel Dialogue Management(Md. Saad, Fazal Md. Kaifulla, Kazim Md. Fauzaan, Zaki Meccai, Dr. P.Naresh, D. S. L. Reddy, Dr.Praveen Kulkarni, 2025, 2025 10th International Conference on Communication and Electronics Systems (ICCES))
- Dual-Pipeline LSTM Screening for Anxiety and Depression via Video Interviews(P. Rahi, Sanjay Singla, 2025, 2025 IEEE 7th International Conference on Computing, Communication and Automation (ICCCA))
- End-to-end multimodal system for depression detection from online recordings(Mateusz Kowalewski, Maciej Stroinski, Kamil Kwarciak, Volodymyr Laptiev, D. Hemmerling, 2023, 2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC))
- Deep learning-based detection of depression by fusing auditory, visual and textual clues.(Chenyang Xu, Yangbin Chen, Yanbao Tao, Wanqing Xie, Xiaofeng Liu, Yunhan Lin, Chunfeng Liang, Fan Du, Zhixiong Zhi, Chuan Shi, 2025, Journal of Affective Disorders)
- Neuroreflect: A Multimodal AI Framework for Real-Time Emotion and Mental Disorder Detection Using Speech and Text(S. Philip, Nehal Shaju, E. Priya, Reena Pagare, 2025, 2025 IEEE Pune Section International Conference (PuneCon))
- AI-Powered Depression Detection Using Text and Speech(Ranjith Durgunala, Tharun Nalamasu, Shruthika Baswa, Vaishnavi Sama, 2025, International Journal For Multidisciplinary Research)
- Multimodal Deep Learning for Early Detection of Depression and Anxiety through Explainable AI(Reuel Stefan Nallapalli, 2025, International Journal of Science and Research Archive)
- Depression Detection Using Multimodal Analysis with Chatbot Support(Archana Sharma, A. Saxena, Ashok Kumar, Divyanshu Singh, 2024, 2024 2nd International Conference on Disruptive Technologies (ICDT))
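To make the fusion pattern above concrete, here is a minimal cross-modal attention sketch in PyTorch. It assumes pre-extracted 40-dimensional audio frames and 768-dimensional text token embeddings as inputs; it illustrates the general attention-fusion idea and does not reproduce any specific model cited above.

```python
# Minimal cross-modal attention fusion sketch (PyTorch); feature dimensions are assumptions.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim=128, heads=4, n_classes=2):
        super().__init__()
        self.audio_proj = nn.Linear(40, dim)    # e.g., 40-dim MFCC frames (assumed)
        self.text_proj = nn.Linear(768, dim)    # e.g., 768-dim BERT token embeddings (assumed)
        # Text attends to audio: text tokens are queries, audio frames are keys/values.
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, n_classes))

    def forward(self, audio_feats, text_feats):
        a = self.audio_proj(audio_feats)        # (B, T_audio, dim)
        t = self.text_proj(text_feats)          # (B, T_text, dim)
        fused, _ = self.cross_attn(query=t, key=a, value=a)
        pooled = fused.mean(dim=1)              # simple mean pooling over text tokens
        return self.classifier(pooled)          # logits for depressed / not depressed

model = CrossModalFusion()
logits = model(torch.randn(2, 300, 40), torch.randn(2, 50, 768))
print(logits.shape)  # torch.Size([2, 2])
```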
LLM-Driven Diagnostic Reasoning and Conversion of Assessments into Dialogue
These studies examine how LLMs (e.g., GPT, Llama) can be specialized for psychology. Through prompt engineering (HCoT/APOLO), retrieval-augmented generation (RAG), instruction fine-tuning, and multi-turn dialogue reconstruction, traditional psychological scales (PHQ-9/BDI) or projective tests are turned into intelligent dialogue flows, evolving from simple question answering to deep diagnostic reasoning; a scale-to-dialogue sketch follows the reference list below.
- DS@GT at eRisk 2025: From prompts to predictions, benchmarking early depression detection with conversational agent based assessments and temporal attention models(Anthony Miyaguchi, David Guecha, Y. Chiu, S. Gaur, 2025, Conference and Labs of the Evaluation Forum)
- Interpretable depression assessment using a large language model(Jae-Joong Lee, Jihoon Han, Choong-Wan Woo, 2026, PLOS Digital Health)
- Contextual Prompt Enabler for Mental Health (CPEMH): An Agent-Based LLM Framework for Prompt Design, Evaluation, and Selection for Depression Screening from Transcripts(Giuliano Lorenzoni, I. Portugal, Paulo S. C. Alencar, Donald D. Cowan, 2025, 2025 IEEE International Conference on Big Data (BigData))
- MonBarta: Deep Dive into Psychological Diagnosis of Mental Health Conversations Using LLM(Saidur Rahman Sujon, Ahmadul Karim Chowdhury, Ashek Seum, Anup Kumar Sutradhar, Faisal Muhammad Shah, 2025, 2025 IEEE 9th International Conference on Software Engineering & Computer Systems (ICSECS))
- COGITO: A Multilingual Psychiatric Mental Health Consultation and Patient Care Management Retrieval-Augmented AI platform(Nivindhan M, J. P, 2026, 2026 International Conference on AI-Driven Smart Systems and Ubiquitous Computing (ICAUC))
- PsycoLLM: Enhancing LLM for Psychological Understanding and Evaluation(Jinpeng Hu, Tengteng Dong, Gang Luo, Hui Ma, Peng Zou, Xiao Sun, Dan Guo, Xun Yang, Meng Wang, 2024, IEEE Transactions on Computational Social Systems)
- A Data Construction and Fine-Tuning Framework for LLM-Based Vietnamese Mental Health Assessment(X. Tran, T. Vo, D. T. Nguyen, Haitao Chu, Thanh Tran, Thai Le, M. Nguyen, 2025, 2025 RIVF International Conference on Computing and Communication Technologies (RIVF))
- Beyond rating scales: With targeted evaluation, large language models are poised for psychological assessment(O. Kjell, Katarina Kjell, H. A. Schwartz, 2023, Psychiatry Research)
- SINAI at eRisk@CLEF 2025: Transformer-Based and Conversational Strategies for Depression Detection(A. M. Mármol-Romero, Manuel García Vega, Miguel Ángel García Cumbreras, Arturo Montejo Ráez, 2025, Conference and Labs of the Evaluation Forum)
- SMILE: Single-turn to Multi-turn Inclusive Language Expansion via ChatGPT for Mental Health Support(Huachuan Qiu, Hongliang He, Shuai Zhang, Anqi Li, Zhenzhong Lan, 2023, Findings of the Association for Computational Linguistics: EMNLP 2024)
- Exploring the Efficacy of Large Language Models in Summarizing Mental Health Counseling Sessions: Benchmark Study(Prottay Kumar Adhikary, Aseem Srivastava, Shivani Kumar, Salam Michael Singh, Puneet Manuja, Jini K Gopinath, Vijay Krishnan, Swati Kedia, K. Deb, Tanmoy Chakraborty, 2024, JMIR Mental Health)
- Chat, Summary and Diagnosis: A LLM - Enhanced Conversational Agent for Interactive Depression Detection(Xiaoheng Zhang, Weigang Cui, Junjie Wang, Yang Li, 2024, 2024 4th International Conference on Industrial Automation, Robotics and Control Engineering (IARCE))
- Constructing and applying a multi-turn psychological support dialogue corpus based on the Helping Skills Chain-of-Thought(Lanqing Du, Yunong Li, Yujie Long, Shihong Chen, 2026, Frontiers in Psychology)
- CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling(Chenhao Zhang, Renhao Li, Minghuan Tan, Min Yang, Jingwei Zhu, Di Yang, Jiahao Zhao, Guancheng Ye, Chengming Li, Xiping Hu, Derek F. Wong, 2024, Annual Meeting of the Association for Computational Linguistics)
- Transformer Ensembles and LLM-Powered Approaches for Depression Symptom Analysis and Contextualized Early Risk Detection(Poojan Vachharajani, 2025, Conference and Labs of the Evaluation Forum)
- Providers of relief in distress: RAG-based LLMs as situation and intent-aware assistants(Ahmad M. Nazar, Brianna Norman, Halle Northway, Abrahim Toutoungi, Emma Zatkalik, Gabriel Carlson, E. Sabado, Hamza Shawa, Mohamed Y. Selim, 2026, Frontiers in Artificial Intelligence)
- AdaptiCare: A Psychological Counseling Dialogue Enhancement Framework Based on Multi-Agent Collaboration(Linying Su, 2025, 2025 5th International Conference on Artificial Intelligence, Big Data and Algorithms (CAIBDA))
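As a concrete illustration of converting a fixed scale into a dialogue flow, here is a hedged sketch in Python. `call_llm` and `get_user_reply` are placeholders for an actual chat-completion client and user interface; the 0-3 frequency mapping follows standard PHQ-9 scoring, but the dialogue design itself is illustrative rather than that of any cited system.

```python
# Hedged sketch: turning a fixed questionnaire (PHQ-9) into a conversational flow.
PHQ9_ITEMS = [
    "Little interest or pleasure in doing things",
    "Feeling down, depressed, or hopeless",
    # ... the remaining seven items are omitted here for brevity
]

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an actual chat-completion client here")

def ask_item(item: str) -> str:
    return call_llm(
        "You are a gentle mental-health screening assistant. Rephrase the following "
        f"PHQ-9 item as one short, empathetic question about the past two weeks:\n{item}"
    )

def score_reply(item: str, reply: str) -> int:
    answer = call_llm(
        f"PHQ-9 item: {item}\nUser reply: {reply}\n"
        "Map the reply to a frequency score (0=not at all, 1=several days, "
        "2=more than half the days, 3=nearly every day). Answer with a single digit."
    )
    return int(answer.strip()[0])

def run_screening(get_user_reply) -> int:
    total = 0
    for item in PHQ9_ITEMS:
        reply = get_user_reply(ask_item(item))
        total += score_reply(item, reply)
    return total  # 0-27 on the full scale; a score of 10 or more is a common screening cutoff
```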
Dialogue Management, Empathetic Interaction, and Multi-Agent Architecture Design
This group focuses on optimizing the clinical interaction capabilities of AI by incorporating therapeutic theories (e.g., CBT, schema therapy) and Theory of Mind (ToM). The research covers multi-agent collaboration frameworks, empathy recognition and generation, conversational politeness, and techniques such as gamification (PsychoGAT) or gaze enhancement to increase users' trust and willingness to self-disclose; a minimal multi-agent interview sketch follows the reference list below.
- Towards Multimodal Emotional Support Conversation Systems(Yuqi Chu, Lizi Liao, Zhiyuan Zhou, Chong-Wah Ngo, Richang Hong, 2024, IEEE Transactions on Multimedia)
- Structured Dialogue System for Mental Health: An LLM Chatbot Leveraging the PM+ Guidelines(Yixiang Chen, Xinyu Zhang, Jinran Wang, Xurong Xie, Nan Yan, Hui Chen, Lan Wang, 2024, No journal)
- ConText at WASSA 2024 Empathy and Personality Shared Task: History-Dependent Embedding Utterance Representations for Empathy and Emotion Prediction in Conversations(Patrícia Pereira, Helena Moniz, Joao Paulo Carvalho, 2024, Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis)
- ‘Am I Understood?’: How the Interplay Between Embodiment and Theory of Mind Behavior Affects LLM-based Conversational Agents on Perceived Trust, Anthropomorphism, Presence, Usability, and User Experience(Elizabeth A. Schlesener, Marcin Ziolkowski, Sai-Keung Wong, Brent Westmoreland, Sabarish V. Babu, 2025, ACM Transactions on Interactive Intelligent Systems)
- Facilitating Early Maladaptive Schema-Guided Polite and Empathetic Psychotherapeutic Support: An LLM-Driven MoE-RL-Based Dialogue System(Priyanshu Priya, Asif Ekbal, 2026, Proceedings of the AAAI Conference on Artificial Intelligence)
- Promoting Mental Self-Disclosure in a Spoken Dialogue System(Mahdin Rohmatillah, B. Aditya, L.-J. Yang, B. Ngo, W. Sulaiman, Jen-Tzung Chien, 2023, Interspeech)
- Beyond Words: Gaze-Enhanced LLM-based Dialogue System for Therapeutic Purposes(Karolina Gabor-Siatkowska, I. Stefaniak, Artur Janicki, 2025, 2025 32nd International Conference on Systems, Signals and Image Processing (IWSSIP))
- MAGI: Multi-Agent Guided Interview for Psychiatric Assessment(Guanqun Bi, Zhuang Chen, Zhou Liu, Hongkai Wang, Xiyao Xiao, Yuqiang Xie, Wen Zhang, Yongkang Huang, Yuxuan Chen, Libiao Peng, Yi Feng, Minlie Huang, 2025, Annual Meeting of the Association for Computational Linguistics)
- AgentMental: An Interactive Multi-Agent Framework for Explainable and Adaptive Mental Health Assessment(Jinpeng Hu, Ao Wang, Qianqian Xie, Hui Ma, Zhuo Li, Dan Guo, 2025, AAAI Conference on Artificial Intelligence)
- AnnaAgent: Dynamic Evolution Agent System with Multi-Session Memory for Realistic Seeker Simulation(Ming Wang, Peidong Wang, L. Wu, Xiaocui Yang, Daling Wang, Shi Feng, Yuxin Chen, Bixuan Wang, Yifei Zhang, 2025, Findings of the Association for Computational Linguistics: ACL 2025)
- A multi-task learning framework for politeness and emotion detection in dialogues for mental health counselling and legal aid(Priyanshu Priya, Mauajama Firdaus, Asif Ekbal, 2023, Expert Systems with Applications)
- EmotiCare: An Emotion-Aware Conversational Agent for Mental Health Support(Mary Valentina Janet A, A. R, G. Brindha, 2025, 2025 7th International Conference on Innovative Data Communication Technologies and Application (ICIDCA))
- PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents(Qisen Yang, Z. Wang, Honghui Chen, Shenzhi Wang, Yifan Pu, Xin Gao, Wenhao Huang, Shiji Song, Gao Huang, 2024, Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers))
- Probing Empathetic Dialogue for Mental Health Support: An Emotional Intelligence-enhanced Multi-agent Reinforcement Learning Approach(Liang Yang, Jiaheng Xie, Qiuju Yin, Zhijun Yan, Yimeng Dong, Yangzi Lin, 2025, International Conference on Interaction Sciences)
- Towards Efficient and Robust Linguistic Emotion Diagnosis for Mental Health via Multi-Agent Instruction Refinement(Jian Zhang, Zhangqi Wang, Zhiyuan Wang, Weiping Fu, Yu He, Haiping Zhu, Qika Lin, Jun Liu, 2026, IEEE Transactions on Affective Computing)
- An LLM-based Simulation Framework for Embodied Conversational Agents in Psychological Counseling(Lixiu Wu, Yuanrong Tang, Qisen Pan, Xianyang Zhan, Yuchen Han, L. Xiao, Tianhong Wang, Chen Zhong, Jiangtao Gong, 2024, AAAI Conference on Artificial Intelligence)
- Computational Psychotherapy System for Mental Health Prediction and Behavior Change with a Conversational Agent(Tine Kolenik, Günter Schiepek, M. Gams, 2024, Neuropsychiatric Disease and Treatment)
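The multi-agent idea above can be sketched as a few role-specific prompts around one placeholder LLM call. The three roles below (interviewer, empathy rewriter, assessor) are illustrative and do not reproduce MAGI, AgentMental, or any other cited framework.

```python
# Minimal multi-agent coordination sketch for a guided screening interview.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an actual chat-completion client here")

def interviewer(topic: str, transcript: str) -> str:
    return call_llm(f"Ask one focused follow-up question about '{topic}'.\nTranscript so far:\n{transcript}")

def empathize(question: str, last_user_turn: str) -> str:
    return call_llm(
        "Rewrite the question below so it first briefly acknowledges the user's last message, "
        f"then asks the question politely.\nLast message: {last_user_turn}\nQuestion: {question}"
    )

def assessor(transcript: str) -> str:
    return call_llm(
        "From this screening transcript, summarize observed symptoms and state whether "
        f"further clinical evaluation seems warranted, with reasons:\n{transcript}"
    )

def run_interview(topics, get_user_reply):
    transcript, last_turn = "", ""
    for topic in topics:
        q = empathize(interviewer(topic, transcript), last_turn)
        last_turn = get_user_reply(q)
        transcript += f"Assistant: {q}\nUser: {last_turn}\n"
    return assessor(transcript)
```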
Empirical Applications for Specific Populations and Niche Scenarios
These studies apply AI systems to specific real-world social groups, including children, university students, older adults, caregivers of people with cancer or Alzheimer's disease, and medical practitioners from different cultural backgrounds. The emphasis is on addressing the uneven distribution of healthcare resources and on providing tailored interventions that fit the language habits and psychological characteristics of each population.
- AI for Personalized Mental Health Support – Early Intervention in Rural/Underserved India(Shangavi S, K. N., Renuka N, Vibhitha V, Namirthaa S, Sanjay P, 2025, 2025 International Conference on Signal Processing, Computation, Electronics, Power and Telecommunication (IConSCEPT))
- Camily: A BERT-Powered Chatbot for Employee Workplace Wellness(Meenakshi Reji, Honey Mol O, 2025, 2025 5th International Conference on Pervasive Computing and Social Networking (ICPCSN))
- A Text-Based AI Chatbot for Emotional Support in Student Mental Health(Satish Babu Thunuguntla, Poduri Sesha Sai Sathwik, Nannapaneni Lalitya, P. Surya, K. Kiran, L. Pallavi, 2025, 2025 6th International Conference on Inventive Research in Computing Applications (ICIRCA))
- Conversational Agent Utilization Patterns of Individuals with Autism Spectrum Disorder(S. Aghakhani, A. Rousseau, S. Mizrahi, X. Tan, G. Dosovitsky, L. Mlodzianowski, Z. Marshall, E. Bunge, 2024, Journal of Technology in Behavioral Science)
- Mapping Caregiver Needs to AI Chatbot Design: Strengths and Gaps in Mental Health Support for Alzheimer's and Dementia Caregivers(J. Shi, Dong Whi Yoo, Keran Wang, Violeta J. Rodriguez, Ravi Karkar, Koustuv Saha, 2025, ACM Transactions on Computing for Healthcare)
- Behavioral Markers of Childhood Depression from a Neurodevelopmental Perspective: Linguistic Fragmentation and Hostile Projection in Chatbot Conversations(Sihoon Lee, Jaeyong Lee, 2025, Brain, Digital, & Learning)
- HelpMe: Early Detection of University Students' Mental Health Issues Using a Chatbot-Integrated Dashboard(N. A. Wahab, A. Zainudin., Aznoora Osman, Norfiza Ibrahim, Abdul Hapes Mohammed, 2025, Journal of Computing Research and Innovation)
- LLM-Driven Psychological Companionship for LBC: A Multi-Tone Emotional Voice Synthesis Framework(Shouyuan Qin, Guixia Wang, Xuhui Xiong, 2025, International Journal on Artificial Intelligence Tools)
- Engagement of Sri Lankan Medical Practitioners with AI and Mental Health Decision Support AI Chatbot(T. Adhikari, J. Wijayanayake, K. Vidanage, 2025, 2025 5th International Conference on Advanced Research in Computing (ICARC))
- Adolescents' Experience with a Conversational Agent for Depression(A. Testerman, A. Bharat, T. Patterson, E. Bunge, 2026, Information)
- Role-Playing LLM-Based Multi-Agent Support Framework for Detecting and Addressing Family Communication Bias(Rushia Harada, Yuken Kimura, Keito Inoshita, 2025, 2025 IEEE Cyber Science and Technology Congress (CyberSciTech))
- Grow with Your AI Buddy: Designing an LLMs-based Conversational Agent for the Measurement and Cultivation of Children's Mental Resilience(Zihui Hu, Hanchao Hou, Shiguang Ni, 2024, Proceedings of the 23rd Annual ACM Interaction Design and Children Conference)
- Evaluating an AI-Enabled Mobile Mental Health Monitoring Tool Among Family Caregivers of Adults Living With Cancer: Single-Arm Feasibility and Acceptability Trial Protocol.(C. Acquati, Michael Aratow, Tahmida Nazreen, Arunima Bhattacharjee, Isabella K Marra, Ashley S Alexander, 2026, JMIR Research Protocols)
- Using an AI-powered Mobile Application Chatbot to Address Maternal Depression Indicators and Inquiries in the Perinatal and Postpartum Periods: A Multimethod Analysis(Carson J Peters, Valerie Aldana Lainez, Kaili Clark, Michelle Jasczynski, Quynh C. Nguyen, Elizabeth M Norell, 2026, INQUIRY: The Journal of Health Care Organization, Provision, and Financing)
- MindWell: A Conversational Agent for Professional Depression Screening on Social Media(Eliseo Bao, Anxo Pérez, Javier Parapar, 2025, Lecture Notes in Computer Science)
- Rates and correlates of study enrolment and use of a chatbot aimed to promote mental health services use for eating disorders following online screening.(Laura D'Adamo, A. C. Grammer, G. Rackoff, Jillian Shah, Marie-Laure Firebaugh, C. B. Taylor, D. Wilfley, Ellen E. Fitzsimmons-Craft, 2024, European Eating Disorders Review)
- Digital Humans for Depression Assessment and Intervention Support: Scoping Review(Jiashuo Cao, Wujie Gao, Ruoyu Wen, Chen Li, Simon Hoermann, Nilufar Baghaei, M. Billinghurst, 2025, JMIR Mental Health)
System Safety, Ethical Risk, and Engineering Efficacy Evaluation
This group addresses the last-mile problems of deploying psychological AI systems clinically: detecting and warning about suicidal or other high-risk behavior, mitigating algorithmic bias, building dialogue-safety benchmarks, improving model explainability (XAI), and validating real-world effectiveness and feasibility through standardized frameworks such as Ψ-Arena; a minimal risk-gating sketch follows the reference list below.
- Development of Mental Health Prediction App for the Depression Assistance Based on AI Chatbot(Harsh Pratap Singh, Nagendra Singh, Avani Trivedi, Pramod Kumar Panda, Sudeesh Chouhan, Kiran Bidua, Neeraj Sharma, 2025, 2025 International Conference on Engineering Innovations and Technologies (ICoEIT))
- Ψ-Arena: Interactive Assessment and Optimization of LLM-based Psychological Counselors with Tripartite Feedback(Shijin Zhu, Zhuang Chen, Guanqun Bi, Binghang Li, Yaxi Deng, Dazhen Wan, Libiao Peng, Xiyao Xiao, Rongsheng Zhang, Tangjie Lv, Zhipeng Hu, Fangfang Li, Minlie Huang, 2026, Proceedings of the AAAI Conference on Artificial Intelligence)
- Real-World Deployment of a Bias‑Mitigated, Multimodal AI Chatbot for Empathetic Mental Health Support and Stress Relief(R. Wadibhasme, Talari Umadevi, K. Rani, R. Kalaiarasu, P. T., Y. Roopa, 2025, Proceedings of the 1st International Conference on Research and Development in Information, Communication, and Computing Technologies)
- Risk-Aware Bilingual Spoken Dialogue for Campus Mental Health Support(You-Teng Lin, Li-Yang Zhang, Yitian Chen, Jen-Tzung Chien, 2026, Proceedings of the AAAI Conference on Artificial Intelligence)
- A Benchmark for Understanding Dialogue Safety in Mental Health Support(Huachuan Qiu, Tong Zhao, Anqi Li, Shuai Zhang, Hongliang He, Zhenzhong Lan, 2023, Natural Language Processing and Chinese Computing)
- EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety(Jiahao Qiu, Yinghui He, Xinzhe Juan, Yiming Wang, Yuhan Liu, Zixin Yao, Yue Wu, Xun Jiang, Ling Yang, Mengdi Wang, 2025, Conference on Empirical Methods in Natural Language Processing)
- Development and evaluation of LLM-based suicide intervention chatbot(Xueting Cui, Yun Gu, Hui Fang, Tingshao Zhu, 2025, Frontiers in Psychiatry)
- Adversarial Evaluation Algorithm for Detecting Extreme Behaviors of LLMs in Psychological Counseling Scenarios(Qingxing Zeng, Xiang Li, Siyu Wang, Kai Liu, 2025, 2025 2nd International Conference on Algorithms, Software Engineering and Network Security (ASENS))
- Decomposing depression: a Comparative Study on Self-Disclosure using Web-Surveys and Conversational Agents(R. Rubiano-Cruz, Stefan Greulich, Christian Huchler, Michael Hies, Valentin Petzold, 2025, European Conference on Information Systems)
- Evaluating the Feasibility and Acceptability of a GPT-Based Chatbot for Depression Screening: A Mixed-Methods Study(Zhijun Guo, Alvina Lai, Z. Deng, Kezhi Li, 2024, No journal)
- Performance Evaluation and Analysis of LLMs for Detecting Dangerous Speech in Emotional Disorder Data(Shuo Yang, Jiaxin Chen, Yongfeng Tao, 2025, 2025 IEEE International Conference on Cloud Computing Technology and Science (CloudCom))
- Developing a single‐session outcome measure using natural language processing on digital mental health transcripts(Gregor Milligan, Aynsley Bernard, Liz Dowthwaite, Elvira Perez Vallejos, Jamie Davis, L. Salhi, James Goulding, 2024, Counselling and Psychotherapy Research)
- Thrive Path: Navigating Emotional Journey with AI Chatbot and Machine Learning techniques for Mental Health(D. S. Sirdeshpande, 2025, INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT)
- AI Chatbot For Mental Health(Himanshu Ranjan Singh, Sparsh Singh, Manmohan Singh, 2025, SSRN Electronic Journal)
- Smart Chatbots for Mental Health Support: AI-Based Screening and Early Identification of Depression(Er. Monika Devi, Anushka Jaiswal, Gurjot Singh, Akarshan Jangid, Suhail Sama, Jeet Bharti, 2025, 2025 IEEE 6th Global Conference for Advancement in Technology (GCAT))
- Vickybot, a chatbot for anxiety-depressive symptoms and work-related burnout(G. Anmella, M. Sanabra, M. Primé-tous, X. Segú, M. Cavero, R. Navinés, A. Mas, V. Olivé, L. Pujol, S. Quesada, C. Pio, M. Villegas, I. Grande, I. Morilla, A. Martínez-Àran, V. Ruiz, E. Vieta, D. Hidalgo-Mazzei, 2023, European Psychiatry)
- AI-Powered Virtual Mental Health Assistant for Early-Stage NLP-Based Mental Health Screening(D. Loy, P. Yau, Dennis Wong, 2025, 2025 3rd Cognitive Models and Artificial Intelligence Conference (AICCONF))
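A common engineering pattern behind several of the safety papers above is a risk gate that screens each user turn before the normal dialogue policy runs. The sketch below is illustrative: the keyword list, threshold, and escalation message are placeholders, and a deployed system would rely on a validated risk classifier and a clinically approved protocol.

```python
# Illustrative safety-gate sketch; patterns and threshold are placeholders, not a clinical protocol.
RISK_PATTERNS = ("kill myself", "end my life", "suicide", "self-harm", "hurt myself")

def rule_based_risk(text: str) -> bool:
    lowered = text.lower()
    return any(p in lowered for p in RISK_PATTERNS)

def classifier_risk(text: str) -> float:
    # Placeholder for a fine-tuned risk classifier returning P(high risk).
    raise NotImplementedError

def respond(user_turn: str, normal_policy) -> str:
    try:
        p_risk = classifier_risk(user_turn)
    except NotImplementedError:
        p_risk = 0.0
    if rule_based_risk(user_turn) or p_risk >= 0.8:
        # Escalate to crisis resources instead of continuing the screening dialogue.
        return ("I'm concerned about your safety. If you are in immediate danger, please "
                "contact local emergency services or a crisis hotline right now.")
    return normal_policy(user_turn)
```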
The final grouping comprehensively covers the four dimensions of AI dialogue systems for psychological screening: technology, interaction, application, and ethics. The research trend shows a leap from single-modality approaches to deep multimodal fusion, and shows how large language models are reshaping dialogue systems in terms of clinical reasoning and empathetic interaction. At the same time, through targeted development for specific populations and rigorous safety-evaluation frameworks, psychological AI is gradually moving from laboratory research toward regulated medical practice, aiming to close the vast global gap in mental health services through technical means.
A total of 121 related publications.
Purpose: This study aims to design and evaluate a chatbot-based artificial intelligence system to identify stress levels in students using the Naïve Bayes classification method. With increasing mental health concerns among students, early stress detection is considered crucial for timely intervention. Methods: This study proposes an AI-based chatbot system to detect student stress levels using a comparative approach between the Naïve Bayes and Support Vector Machine (SVM) algorithms. A Kaggle dataset with 15 psychological and academic indicators was preprocessed and balanced using SMOTE. The trained model was deployed via Flask with Ngrok tunneling and integrated into a Flutter mobile app connected to the Gemini AI API for real-time stress screening. This research offers a practical and scalable solution for early mental health detection in students through intelligent chatbot interaction. Results: The findings show that the Naïve Bayes model achieves a classification accuracy of 90%, slightly surpassing the SVM model, which records an accuracy of 89%. Evaluation through ROC and AUC metrics supports the reliability of Naïve Bayes in detecting stress levels. The integrated chatbot offers a responsive and engaging platform for preliminary mental health assessments. Novelty: This research combines AI-driven stress detection with a real-time chatbot interface, offering an accessible and scalable approach to student mental health support. The integration of machine learning models with conversational AI provides an innovative solution for early intervention. Future developments may involve deep learning and more diverse psychological inputs to further improve accuracy and effectiveness.
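A minimal reproduction skeleton of the comparison described above (SMOTE balancing, then Naïve Bayes vs. SVM) is shown below. Synthetic data stands in for the Kaggle stress dataset with its 15 indicators, so the accuracies will differ from the reported 90%/89%.

```python
# Skeleton: class balancing with SMOTE, then Naive Bayes vs. SVM on synthetic stand-in data.
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=15, n_informative=8,
                           weights=[0.7, 0.3], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    stratify=y, random_state=42)
X_train, y_train = SMOTE(random_state=42).fit_resample(X_train, y_train)  # balance the training split only

for name, model in [("Naive Bayes", GaussianNB()), ("SVM", SVC(kernel="rbf"))]:
    model.fit(X_train, y_train)
    print(name, round(accuracy_score(y_test, model.predict(X_test)), 3))
```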
While chatbots show promise for large-scale mental health screening, few offer interactive, free-text conversations, limiting their appeal for self-administered screening and impeding the timely detection of mental health issues. This study introduces an AI-based chatbot that allows users to respond to validated screening surveys for mental disorders (PHQ-9, GAD-7, and PCL-5) in a natural, free-text conversational manner with real-time feedback. The study's objectives include evaluating the chatbot's usability and reducing the frequency of response clarifications while accurately interpreting users' responses. The system was assessed running in hybrid NLU mode (Phase 2; N = 587; mean age = 21.56, SD = 5.56, 67.8% women) after being trained on data collected while running in rule-based mode (Phase 1; N = 274; mean age = 21.86, SD = 5.50). During user-chatbot interactions, the chatbot required clarification only 4.64% of the time. Using the AI NLU model, the chatbot could understand user responses in 85.65% of cases and interpret free text similarly to human annotators. In terms of usability, the chatbot in hybrid NLU mode was perceived as more engaging, friendly, and easier to use than in the rule-based NLU mode, which may be indirectly attributed to the enhanced autonomy provided by the AI NLU model.
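The hybrid-NLU idea above, mapping a free-text reply to the nearest questionnaire response option and asking for clarification only when confidence is low, can be sketched as follows. TF-IDF similarity is only a stand-in for the trained NLU model, and the anchor phrases and threshold are assumptions.

```python
# Rough sketch: map a free-text reply onto the closest PHQ-9 response option,
# falling back to a clarification question when similarity is low.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

OPTIONS = {0: "not at all", 1: "several days", 2: "more than half the days", 3: "nearly every day"}
ANCHORS = {
    0: "no never not at all rarely",
    1: "sometimes a few days occasionally several days",
    2: "often most days more than half the days frequently",
    3: "always every day nearly every day constantly",
}

vectorizer = TfidfVectorizer().fit(list(ANCHORS.values()))
anchor_vecs = vectorizer.transform([ANCHORS[k] for k in sorted(ANCHORS)])

def interpret(reply: str, threshold: float = 0.15):
    sims = cosine_similarity(vectorizer.transform([reply]), anchor_vecs)[0]
    best = int(sims.argmax())
    if sims[best] < threshold:
        return None, "Sorry, could you tell me roughly how often that happens?"
    return best, OPTIONS[best]

print(interpret("honestly it's been almost every single day"))
print(interpret("banana"))  # no overlap with any anchor -> asks for clarification
```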
Mental health conditions, especially depression, are a major and growing global issue in terms of both prevalence and their adverse effects on quality of life. Early identification and treatment are essential to keep depressive symptoms from worsening, but conventional screening mechanisms are hampered by stigma, limited accessibility, and scarce resources. This study proposes an AI-based chatbot that conducts psychological screening to identify symptoms of depression at an early stage. The chatbot uses natural language processing and machine learning (NLP/ML) to engage users in conversation and detect linguistic, sentiment, and behavioral cues of depressive behavior. The system is trained on curated psychological datasets and incorporates sentiment analysis models to improve accuracy and reliability. Experimental analysis shows that the chatbot is a promising approach to scalable and efficient detection of depressive symptoms. Because it combines anonymity with continuous accessibility, it has strong potential to supplement conventional clinical practice by increasing the chances of early intervention and proactive mental health care.
OBJECTIVE Few individuals with eating disorders (EDs) receive treatment. Innovations are needed to identify individuals with EDs and address care barriers. We developed a chatbot for promoting services uptake that could be paired with online screening. However, it is not yet known which components drive effects. This study estimated individual and combined contributions of four chatbot components on mental health services use (primary), chatbot helpfulness, and attitudes toward changing eating/shape/weight concerns ("change attitudes," with higher scores indicating greater importance/readiness). METHODS Two hundred five individuals screening with an ED but not in treatment were randomized in an optimization randomized controlled trial to receive up to four chatbot components: psychoeducation, motivational interviewing, personalized service recommendations, and repeated administration (follow-up check-ins/reminders). Assessments were at baseline and 2, 6, and 14 weeks. RESULTS Participants who received repeated administration were more likely to report mental health services use, with no significant effects of other components on services use. Repeated administration slowed the decline in change attitudes participants experienced over time. Participants who received motivational interviewing found the chatbot more helpful, but this component was also associated with larger declines in change attitudes. Participants who received personalized recommendations found the chatbot more helpful, and receiving this component on its own was associated with the most favorable change attitude time trend. Psychoeducation showed no effects. DISCUSSION Results indicated important effects of components on outcomes; findings will be used to finalize decision making about the optimized intervention package. The chatbot shows high potential for addressing the treatment gap for EDs.
OBJECTIVE We developed a chatbot aimed to facilitate mental health services use for eating disorders (EDs) and offered the opportunity to enrol in a research study and use the chatbot to all adult respondents to a publicly available online ED screen who screened positive for clinical/subclinical EDs and reported not currently being in treatment. We examined the rates and correlates of enrolment in the study and uptake of the chatbot. METHOD Following screening, eligible respondents (≥18 years, screened positive for a clinical/subclinical ED, not in treatment for an ED) were shown the study opportunity. Chi-square tests and logistic regressions explored differences in demographics, ED symptoms, suicidality, weight, and probable ED diagnoses between those who enrolled and engaged with the chatbot versus those who did not. RESULTS 6747 respondents were shown the opportunity (80.0% of all adult screens). 3.0% enrolled, of whom 90.2% subsequently used the chatbot. Enrolment and chatbot uptake were more common among respondents aged ≥25 years versus those aged 18-24 and less common among respondents who reported engaging in regular dietary restriction. CONCLUSIONS Overall enrolment was low, yet uptake was high among those who enrolled and did not differ across most demographics and symptom presentations. Future directions include evaluating respondents' attitudes towards treatment-promoting tools and removing barriers to uptake.
University students face increasing mental health challenges due to academic, social, and financial pressures, yet a shortage of mental health professionals limits early intervention. To bridge this gap, HelpMe, a web-based system with an interactive dashboard and chatbot, was developed for early mental health detection. The system provides a private space for students to monitor their well-being, with the chatbot guiding users through mental health screenings and offering conversational support, while the dashboard visualizes data for tracking emotional states over time. Developed using the Design Science Research Methodology (DSRM), HelpMe follows a structured process of problem identification, system design, and evaluation. The dashboard prioritizes simplicity and engagement, utilizing Power BI for data visualization, while the chatbot ensures a user-friendly mental health screening experience. User Experience Testing (UXT) with 30 university students assessed the system across six key scales, including attractiveness, efficiency, and dependability. Feedback was largely positive, especially regarding simplicity and visual appeal, though challenges were noted in chatbot responsiveness and dashboard efficiency, with occasional delays. This study highlights HelpMe's potential as an accessible mental health support tool and identifies areas for improvement. Future enhancements will focus on refining chatbot interactions and optimizing real-time dashboard functionality to better support student well-being.
Mental health issues among adolescents and youth are rising at an alarming rate, driven by academic pressure, social stigma, and digital dependency. This paper presents an NLP-based intelligent application designed to provide mental health support through real-time facial emotion recognition, sentiment analysis, depression screening, and AI-powered chatbot interaction. By combining Artificial Intelligence (AI), Machine Learning (ML), and Natural Language Processing (NLP), the proposed system bridges the gap between professional mental healthcare services and accessible digital interventions. The application ensures user privacy, anonymity, and real-time engagement, making it a scalable and supportive tool for individuals experiencing emotional distress.
Using AI-powered mobile applications for mental health screening can help reduce maternal mental health disparities among Black mothers who are pregnant or parenting in the United States. A maternal health education question and answer mobile application chatbot has the potential to intervene in the maternal depression cascade, specifically screening. Extant research demonstrates the usability of mobile applications addressing mental health. However, limited scholarship explores the intersection between AI-powered mobile application chatbots and maternal mental health. This study uses a multimethod analysis to evaluate the usability of an AI-powered mobile application to address maternal mental health among Black women. Data sources, including mobile application engagement, mental health disorder scales, and secondary qualitative analysis from focus group discussions (n = 5), will be assessed through a multimethod approach. The study team previously collected data across the United States for this clinical intervention in 2022. Findings indicate that the mobile application demonstrated promise in the application’s usability to screen for maternal health depression indicators. This was achieved using the mobile application’s intent classification functionality that classified users’ questions that contained targeted search terms (e.g., postpartum depression) or specific inquiries about mental health and appropriate follow-up from the study team to provide mental health resources. Critical interconnected themes were assessed and reflected high confidence, acceptance, and usability of the mobile application in addressing maternal mental health inquiries. Findings contribute to evidence about the usability of AI-powered mobile applications informed by Black mothers in appropriate screening for maternal depression indicators and inquiries. This study provides insight into closing the gap in maternal health disparities in depression outcomes for Black mothers. Trial Registration: ClinicalTrials.gov NCT06053515; https://clinicaltrials.gov/study/NCT06053515.
Mental health at the workplace is an emerging issue, yet traditional psychological approaches in the workplace often fail because they have limited accessibility, carry stigma, are time-bound, and lack direct applicability to the work environment. Chatbots provide a ready alternative, offering 24/7 access, anonymity, scalability, and AI-driven personalization. They can relieve stress immediately, conduct mental health screening or assessment, and intervene with stress-management reminders. Beyond access and practicality, chatbots are cheaper and more stigma-free than traditional methods. Camily is a chatbot application that employs BERT (Bidirectional Encoder Representations from Transformers) for sentiment analysis. Using BERT's improved contextual word understanding and emotional-tone detection, the system achieved up to 96 percent accuracy in recognizing anxiety or stress levels from a conversation. Trained on a purpose-built dataset of 1380 entries annotated with wellness lexicons, Camily provides immediate wellness tips, stress-management techniques, or referrals when necessary. After a chat, it predicts the stress level as Low, Medium, or High based on the conversation and the training dataset. This research contributes to a developing mental health support system that can be integrated into workplaces.
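A minimal inference sketch of BERT-based stress-level classification in the spirit of Camily is shown below. It loads the public bert-base-uncased checkpoint with an untrained three-class head, since the paper's fine-tuned weights and 1380-entry dataset are not available here; outputs are therefore illustrative only.

```python
# Sketch of a BERT stress classifier; the classification head is untrained and labels are assumed.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["Low", "Medium", "High"]
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)
model.eval()

def predict_stress(text: str) -> str:
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

print(predict_stress("I can't keep up with these deadlines and I barely sleep anymore."))
```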
Depression is a prevalent mental health condition worldwide, often characterized by persistent sadness, loss of interest or pleasure, and feelings of worthlessness. It is a leading cause of mental ill health globally and tends to worsen without self-awareness, early screening, and timely treatment. Early detection and intervention are therefore critical in mitigating its adverse effects. Leveraging advances in Artificial Intelligence (AI), particularly Natural Language Processing (NLP), chatbots have emerged as potential tools for early indication of depression and for assisting at-risk users. This paper presents the development of a rule-based chatbot aimed at detecting early signs of depression through conversational interactions that screen for depressive symptoms. Predefined rules ensure that the assessment generates reliable results; the chatbot assists at-risk individuals with an early-stage depression indication assessment and directs them to appropriate support and resources. The assessment adopts the Depression Anxiety and Stress Scale 21 (DASS-21) instrument. Based on System Usability Scale (SUS) results, the rule-based chatbot was accepted by all 30 respondents with a good average SUS score of 77.2. The chatbot can thus serve as a platform that encourages self-disclosure of depression indications and as an initial reference before recommending further action and professional help-seeking.
Introduction A significant proportion of people attending Primary Care (PC) have anxiety-depressive symptoms and work-related burnout, and there is a lack of resources to attend to them. The COVID-19 pandemic has worsened this problem, particularly affecting healthcare workers, and digital tools have been proposed as a workaround. Objectives We present the development, feasibility and effectiveness studies of a chatbot (Vickybot) aimed at screening, monitoring, and reducing anxiety-depressive symptoms and work-related burnout in PC patients and healthcare workers. Methods User-centered development strategies were adopted. Main functions included self-assessments, psychological modules, and emergency alerts. (1) Simulation: HCs used Vickybot for 2 weeks to simulate different possible clinical situations and evaluated their experience. (2) Feasibility and effectiveness study: people consulting PC or healthcare workers with mental health problems were offered Vickybot for one month. Self-assessments for anxiety (GAD-7) and depression (PHQ-9) symptoms, and work-related burnout (based on the Maslach Burnout Inventory), were administered at baseline and every two weeks. Feasibility was determined from a combination of subjective and objective user-engagement indicators (UEIs). Effectiveness was measured using paired t-tests as the change in self-assessment scores. Results (1) Simulation: 17 HCs (73% female; mean age=36.5±9.7) simulated different clinical situations; 98.8% of the expected modules were recommended according to each simulation, and suicidal alerts were correctly activated and received by the research team. (2) Feasibility and effectiveness study: 34 patients (15 from PC and 19 healthcare workers; 77% female; mean age=35.3±10.1) completed the first self-assessments, with 34 (100%) presenting anxiety symptoms, 32 (94%) depressive symptoms, and 22 (64.7%) work-related burnout. Nine (26.5%) patients completed the second self-assessments after 2 weeks of use. No significant differences were found for anxiety [t(8) = 1.000, p = 0.347] or depressive [t(8) = 0.400, p = 0.700] symptoms, but work-related burnout was significantly reduced [t(8) = 2.874, p = 0.021] between the first and second self-assessments. Vickybot showed high subjective UEIs but low objective UEIs (completion, adherence, compliance, and engagement). Conclusions The chatbot proved useful in screening the presence and severity of anxiety and depressive symptoms, in reducing work-related burnout, and in detecting suicidal risk. Subjective perceptions of use contrasted with low objective-use metrics. The results are promising but suggest the need to adapt and enhance the smartphone-based solution to improve engagement. Consensus on how to report UEIs and validate digital solutions, especially chatbots, is required.
Family caregivers of individuals with Alzheimer’s Disease and Related Dementia (AD/ADRD) face significant emotional and logistical challenges that place them at heightened risk for stress, anxiety, and depression. Although recent advances in generative AI—particularly large language models (LLMs)—offer new opportunities to support mental health, little is known about how caregivers perceive and engage with such technologies. To address this gap, we developed Carey, a GPT-4o–based chatbot designed to provide informational and emotional support to AD/ADRD caregivers. Using Carey as a technology probe, we conducted semi-structured interviews with 16 family caregivers following scenario-driven interactions grounded in common caregiving stressors. Through inductive coding and reflexive thematic analysis, we surface a systemic understanding of caregiver needs and expectations across six themes—on-demand information access, safe space for disclosure, emotional support, crisis management, personalization, and data privacy. For each of these themes, we also identified the nuanced tensions in the caregivers’ desires and concerns. We present a mapping of caregiver needs, AI chatbots’ strengths, gaps, and design recommendations. Our findings offer theoretical and practical insights to inform the design of proactive, trustworthy, and caregiver-centered AI systems that better support the evolving mental health needs of AD/ADRD caregivers.
In this paper we introduce the design, development, and evaluation of a bias-mitigated and explainable multimodal AI chatbot for providing empathetic mental health support and stress alleviation in real-life settings. The chatbot is pre-trained on generic, anonymized conversation datasets and is equipped with bias detection-and-mitigation mechanisms, along with explainable AI modules, in order to promote transparency and user trust. Covering both text and voice interactions and incorporating cognitive-behavioral therapy principles, the system provides rich, context-aware dialogue based on individual emotional states. A mixed-methods assessment with therapists and end users across divergent age groups reveals substantial gains in emotional regulation, felt empathy, and stress reduction. Trust scoring and emotional-affinity tracking support continuous improvement and demonstrate the practicality of responsibly introducing AI-driven technologies for mental health in the wild, besides offering a scalable framework for building empathetic support systems.
The increasing demand for accessible mental health support highlights the need for scalable and inclusive solutions. This study presents a multilingual conversational AI chatbot designed to provide personalized and empathetic mental health assistance. Leveraging Natural Language Processing (NLP), Machine Learning (ML), and the Gemini API, the chatbot can understand user sentiment, detect emotional cues, and deliver contextually appropriate responses across multiple languages. Integrating Cognitive Behavioral Therapy (CBT) principles, it offers therapeutic interventions such as mood tracking, coping strategies, and emotional guidance. The chatbot's architecture ensures user privacy through encrypted communication and enables scalability for large user bases. Experimental results demonstrate its ability to engage users effectively, identify emotional states accurately, and provide supportive, human-like interactions. This system contributes to making mental health care more accessible, inclusive, and responsive, marking a step forward in AI-driven psychological support and early mental health intervention.
Engagement of Sri Lankan Medical Practitioners with AI and Mental Health Decision Support AI Chatbot
This study addresses the critical challenges in mental healthcare delivery in Sri Lanka, where limited resources hinder accurate and efficient diagnoses, particularly in low-resource settings. Mental health disorders affect millions globally, highlighting the need for innovative solutions. This research investigates factors influencing medical practitioners’ engagement with AI technologies and evaluates a Mental Health Decision Support AI Chatbot designed to improve mental healthcare outcomes. Using a mixed-methods approach that includes a systematic literature review, survey analysis, and chatbot evaluation, the study identifies key technological, organizational, and environmental factors affecting engagement. The chatbot, developed and tested on the Microsoft Azure platform, shows significant potential in enhancing diagnostic efficiency and improving patient management. The findings contribute to the growing body of knowledge on AI applications in healthcare and provide practical insights for implementing AI-driven solutions in resource-constrained settings.
The AI chatbot, a mental wellness tool, provides an instant self-help portal of non-judgmental advice to people experiencing depressive states. The chatbot uses ML and NLP techniques to generate responses to user inquiries and comments while keeping the conversation engaging and allowing the user to open up about their feelings in a safe manner. Through its features, users can analyze their emotional states and receive a depression assessment, while appropriate interventions, emotional support, and cognitive behavioral therapy exercises can be delivered in conversation.
This project introduces a healthcare chatbot designed to improve mental well-being through mood-responsive interactions. The system integrates Large Language Model (LLM)-powered contextual understanding, enabling adaptive and personalized responses to user input. By analyzing users' language patterns, the chatbot adapts its responses to provide context-aware, empathetic support that aligns with the user's current emotional state. Built on the Rasa framework with Natural Language Understanding (NLU) capabilities, the chatbot offers personalized, therapeutic conversations aimed at alleviating symptoms of mental health issues such as stress and loneliness. This healthcare tool shows potential for enhancing user engagement and supporting mental health recovery by offering a more personalized and language-sensitive approach to care.
Mental disorders such as depression and anxiety are on the rise, creating a need for scalable and accessible support frameworks. AI-powered chatbots provide a new way of delivering mental health care through 24/7 availability, personalized interventions, and anonymized interaction [2]. This paper describes the development of an AI-powered chatbot for mental well-being support using Natural Language Processing (NLP) and emotional computing. We evaluate the chatbot's performance with key performance indicators and address ethical concerns, limitations, and future improvements. We further analyze how the chatbot can reduce the workload of mental health professionals, enhance early-intervention methods, and provide proactive support to users [2].
Recently, mental health issues, especially depression, have increased significantly, requiring additional support systems. The objective of this work is to develop a mental health prediction app that supports individuals by integrating an AI-enabled chatbot. The app uses a client-server architecture with various modules and offers personalized treatment plans and self-care strategies based on evidence-based practices. It also includes features such as tracking and monitoring of mood, stress, and sleep patterns, a peer-support mechanism, and the option to connect with healthcare providers. The success of the app will depend on user engagement, adoption rates, privacy assurance, and data security. This work underscores the potential of mobile health (mHealth) Android apps to support mental health and well-being, offering a promising solution for enhancing accessibility and empowering individuals to manage their well-being. For this study, a multi-turn transcript dataset is used and simulated using Python. System performance is evaluated using standard metrics including accuracy, precision, recall, and ROC. The proposed model achieved an accuracy of 93.5%, outperforming the traditional approach and demonstrating its effectiveness as a robust solution for mental health prediction.
Mental illness is common among students, and the majority do not receive timely and sufficient treatment due to stigma or a lack of resources. A chatbot developed with artificial intelligence for text-based conversation provides emotional support, confidentiality, and convenience. The chatbot uses sentiment analysis and natural language processing to detect emotions, respond with sympathy, and present personalized advice that helps users cope with stress and emotional issues. Around-the-clock availability ensures that assistance is readily available when needed. Its conversational nature promotes non-judgmental interactions, facilitating open-ended discussions and building trust. User feedback supports its efficacy in enhancing emotional resilience and reducing loneliness. The diversity-focused, accessible chatbot connects students with mental health resources, improving well-being and helping them better cope with adversity. This combination of AI and emotional support has great potential for wide adoption.
In today's world, where mental health challenges are becoming increasingly common, there is a growing need for accessible, personalized, and compassionate support systems. ThrivePath was designed to help people overcome obstacles to seeking mental healthcare, including stigma and lack of awareness. To provide users with an immersive and supportive setting for navigating their emotional journey, the platform combines interactive tools, an AI-driven chatbot, and machine learning. The chatbot, powered by Google Gemini 1.5 Pro, offers personalized, compassionate interactions that guide users through their issues confidentially and supportively. A mental health questionnaire evaluates emotional well-being, producing insights that inform tailored recommendations for wellness tools including yoga, exercises, music therapy, and journaling. Guests have access to multimedia content, registered users receive a more individualized experience, and both kinds of users have distinct capabilities on the platform. ThrivePath promotes a proactive attitude toward mental health through machine learning, the chatbot, and customized content that not only helps users manage stress and anxiety but also equips them with means for self-care and emotional growth. Keywords: emotional journey, chatbot, machine learning, mental health, AI, quiz, support system, mental well-being.
Adolescent mental health issues require early detection to prevent worsening conditions. This study developed a rule-based expert system for automating the interpretation of mental health screening instruments, using Forward Chaining inference and Certainty Factors for uncertainty handling. The system encodes interpretation guidelines from two validated instruments: the Mini MindHEAR Youth Scale V.1 for ages 10-18 years and the Self-Reporting Questionnaire-29 for ages 19-24 years. From 710 survey respondents, 494 representative samples were selected using stratified random sampling for validation. The knowledge base consists of 17 rules for the Mini MindHEAR Youth Scale V.1 and 8 rules for the Self-Reporting Questionnaire-29, with Certainty Factor values ranging from 0.5 to 0.95 based on symptom severity. Validation results showed the system achieved an overall guideline-alignment accuracy of 89.68% (443 matching interpretations out of 494 samples), measuring the system's ability to faithfully reproduce instrument interpretation guidelines rather than clinical diagnostic accuracy. The system demonstrated high explainability through transparent reasoning traces. This expert system can assist healthcare workers in automating screening instrument interpretation, particularly in resource-limited settings.
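The inference pattern described above, forward chaining over rules with certainty factors, can be illustrated with a toy sketch. The rules and CF values below are invented examples, not the paper's 17 + 8 rule knowledge base, and the CF combination follows the classic MYCIN-style formula.

```python
# Toy forward-chaining sketch with MYCIN-style certainty-factor combination.
RULES = [
    # (required symptoms, conclusion, rule CF)
    ({"low_mood", "sleep_problems"}, "possible_depression", 0.7),
    ({"low_mood", "loss_of_interest"}, "possible_depression", 0.8),
    ({"excessive_worry", "restlessness"}, "possible_anxiety", 0.6),
]

def combine(cf_old: float, cf_new: float) -> float:
    # Combination rule for two positive certainty factors.
    return cf_old + cf_new * (1.0 - cf_old)

def forward_chain(facts: set[str]) -> dict[str, float]:
    conclusions: dict[str, float] = {}
    for premises, conclusion, cf in RULES:
        if premises <= facts:                       # all premises observed
            conclusions[conclusion] = combine(conclusions.get(conclusion, 0.0), cf)
    return conclusions

print(forward_chain({"low_mood", "sleep_problems", "loss_of_interest"}))
# {'possible_depression': 0.94}  -> 0.7 + 0.8 * (1 - 0.7)
```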
Depression is a significant mental illness that affects how individuals express their emotions and engage with others, making communication challenging. Most depression assessment tools are self-report questionnaires, such as the Patient Health Questionnaire (PHQ-9). These psychometric instruments can easily be adapted into electronic forms; however, this approach cannot provide human-like explanations and interactions, leading to poor interactivity. Furthermore, we identify critical limitations of previous prompting methods: they are either constrained to queries using a single identifiable relation or agnostic to input contexts, making it difficult to capture the variability that occurs across different inference steps. To address these issues, we develop a large language model (LLM)-enhanced conversational agent for depression detection that is more effective and interactive. Specifically, we first explore an iterative knowledge-aware prompter (IKP), a new prompting paradigm that injects specific knowledge from language models progressively for multi-step reasoning and learns to synthesize prompts conditioned on the current step's context. Second, the proposed system introduces a multi-step diagnosis (MSD) approach: it not only delivers a diagnosis but also generates a symptom summary through interactive conversations. The agent enables users to hold interactive natural-language dialogues with the system, enhancing their personalized understanding of their mental state. Our experiments demonstrate the effectiveness of the iterative knowledge-aware prompter design.
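The IKP/MSD pipeline is described only at a high level; the loop below is one plausible reading of an iterative, multi-step prompting scheme in which each step is conditioned on the findings accumulated so far. `call_llm` is a placeholder for any chat-completion client, and the symptom steps are illustrative.

```python
# Sketch of an iterative, multi-step prompting loop for depression screening.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an actual chat-completion client here")

SYMPTOM_STEPS = ["mood", "sleep", "energy", "concentration", "self-worth"]

def screen(dialogue_so_far: str) -> str:
    findings = []
    for step in SYMPTOM_STEPS:
        # Each step injects the findings accumulated so far, so later prompts are
        # conditioned on earlier reasoning rather than on the raw transcript alone.
        prompt = (
            f"Known findings: {findings}\n"
            f"Dialogue:\n{dialogue_so_far}\n"
            f"Focusing only on '{step}', state what the dialogue suggests in one sentence."
        )
        findings.append(f"{step}: {call_llm(prompt)}")
    summary_prompt = (
        "Findings:\n" + "\n".join(findings) +
        "\nWrite a short symptom summary and a tentative screening impression."
    )
    return call_llm(summary_prompt)
```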
Background Conversational agents based on large language models (LLMs) have shown moderate efficacy in reducing depressive and anxiety symptoms. However, most existing evaluations lack methodological transparency, rely on closed-source models, and show limited standardization in performance and safety assessment. Objective We have two study objectives: (1) to develop an LLM-based conversational agent through system design analysis and initial functionality testing, and (2) to evaluate its safety and performance through standardized assessment in controlled simulated interactions focused on depression and anxiety of two LLMs (GPT-4o and Llama 3.1-8B). Methods We conducted a cross-sectional study in two phases. First, we developed a mental health platform integrating a conversational agent with functionalities including personalized context, pretrained therapeutic modules, self-assessment tools, and an emergency alert system. Second, we evaluated the agent's responses in simulated interactions based on predefined user personas for each LLM. Four expert raters assessed 816 interaction pairs using a 5-criterion Likert scale evaluating tone, clarity, domain accuracy (correctness), robustness, completeness, boundaries, target language, and safety. In addition, we use performance metrics based on numerical criteria such as cost, response length, and number of tokens. Multiple linear regression models were used to compare LLM performance and assess metric interrelations. Results First, we developed a web-based mental health platform using a user-centered design, structured into frontend, backend, and database layers. The system integrates therapeutic chat (GPT-4o and Llama 3.1-8B), psychological assessments (PHQ-9, GAD-7), CBT-based tasks, and an emergency alert system. The platform supports secure user authentication, data encryption, multilingual access, and session tracking. Second, GPT-4o outperformed Llama 3.1-8B in both performance metrics based on numerical criteria and Likert scale criteria, generating longer and more lexically diverse responses, using more tokens, and scoring higher in clarity, robustness, completeness, boundaries, and target language. However, it incurred higher costs, with no significant differences in tone, accuracy, or safety. Conclusion Our study presents a conversational agent with multiple functionalities and shows that GPT-4o outperforms Llama 3.1-8B in performance, although at a higher cost. This platform could be used in future clinical trials or real-world implementation studies.
Mental health disorders, including depression, anxiety, and suicidal ideation, present significant challenges for continuous care, as symptoms often evolve undetected between clinical visits. This paper introduces Cognicare, an AI-driven conversational agent designed for real-time emotional monitoring, longitudinal risk assessment, and clinically actionable insights. Cognicare combines a fine-tuned RoBERTa model for multi-class mental health classification with DistilBERT-based sentiment analysis. These outputs are fused via a Dynamic Distress Scoring Algorithm, generating personalized, context-aware distress metrics that account for linguistic cues, temporal trends, and model confidence. Therapeutic interactions leverage a large language model (LLM) aligned with Cognitive Behavioral Therapy principles through structured prompt-chaining, ensuring emotionally congruent, contextually relevant, and psychologically safe responses. The system tracks longitudinal emotion trajectories, detects anomalies, and produces HL7 FHIR-compliant reports for clinicians, highlighting high-risk cases and trend patterns to support timely interventions. Evaluations demonstrate improved classification accuracy, enhanced empathy, and reduced toxicity in generated responses. Cognicare illustrates how integrating advanced NLP models with clinically informed design can provide scalable, accessible, and reliable continuous mental health support, bridging the gap between user self-expression and evidence-based care.
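To make the fusion step above concrete, here is a minimal sketch of a confidence-weighted distress score that combines classifier probabilities, sentiment, and a simple temporal trend. The label weights, mixing coefficients, and trend rule are illustrative assumptions, not Cognicare's actual Dynamic Distress Scoring Algorithm.

```python
import numpy as np

# Illustrative label weights; higher = more distress-relevant (assumption, not Cognicare's).
RISK_WEIGHTS = {"neutral": 0.0, "depression": 0.7, "anxiety": 0.6, "suicidal": 1.0}

def distress_score(class_probs: dict, sentiment: float, recent_scores: list) -> float:
    """class_probs: label -> probability from the classifier; sentiment in [-1, 1]
    (negative = distressed); recent_scores: prior scores for a simple trend term."""
    confidence = max(class_probs.values())
    # Expected risk under the classifier's distribution, scaled by its confidence.
    risk = sum(RISK_WEIGHTS.get(label, 0.0) * p for label, p in class_probs.items())
    # Map sentiment from [-1, 1] to a [0, 1] negative-affect contribution.
    neg_affect = (1.0 - sentiment) / 2.0
    base = 0.6 * confidence * risk + 0.4 * neg_affect
    # Small upward adjustment if the recent trajectory is worsening.
    trend = float(np.mean(np.diff(recent_scores))) if len(recent_scores) > 1 else 0.0
    return float(np.clip(base + 0.1 * max(trend, 0.0), 0.0, 1.0))

# Example: a fairly confident "depression" prediction with negative sentiment.
print(distress_score({"depression": 0.8, "neutral": 0.2}, sentiment=-0.5, recent_scores=[0.3, 0.4]))
```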
No abstract available
This Working Note summarizes the participation of the DS@GT team in two eRisk 2025 challenges. For the Pilot Task on conversational depression detection with large language models (LLMs), we adopted a prompt-engineering strategy in which diverse LLMs conducted BDI-II-based assessments and produced structured JSON outputs. Because ground-truth labels were unavailable, we evaluated cross-model agreement and internal consistency. Our prompt design methodology aligned model outputs with BDI-II criteria and enabled the analysis of conversational cues that influenced the prediction of symptoms. Our best submission, second on the official leaderboard, achieved DCHR = 0.50, ADODL = 0.89, and ASHR = 0.27.
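A rough sketch of the two ingredients mentioned above, a prompt requesting structured BDI-II-style JSON and a cross-model agreement check, is shown below. The prompt wording, the three example items, and the toy score lists are assumptions for illustration only.

```python
import json
from sklearn.metrics import cohen_kappa_score

def bdi_prompt(conversation: str) -> str:
    """Hypothetical prompt asking for a structured BDI-II-style item rating as JSON."""
    return (
        "Assess the conversation below for depressive symptoms following BDI-II items.\n"
        'Return ONLY JSON of the form {"sadness": 0-3, "pessimism": 0-3, "loss_of_pleasure": 0-3}.\n'
        f"Conversation:\n{conversation}"
    )

def parse_item_scores(raw_json: str) -> dict:
    """Parse the model's JSON output; return an empty dict on malformed output."""
    try:
        return {k: int(v) for k, v in json.loads(raw_json).items()}
    except (json.JSONDecodeError, ValueError):
        return {}

# Cross-model agreement on one item across several conversations (toy scores).
model_a = [2, 1, 0, 3, 2]   # e.g. "sadness" ratings from model A
model_b = [2, 1, 1, 3, 2]   # same item from model B
print("Cohen's kappa:", cohen_kappa_score(model_a, model_b))
```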
Abstract The limitations of current cognitive impairment screening instruments, combined with the widespread adoption of smart speakers, highlight the need for an easy-to-use and accessible cognitive assessment tool. This study introduces DigiMoCA, a conversational agent for the detection of cognitive impairment, and presents its diagnostic, convergent, and content validity. DigiMoCA was tested with 46 senior adults, and standard statistical analysis was utilized to determine its validity. We used T-MoCA as the gold standard. In the best scenario, we obtained a Pearson correlation of r=0.88 with the gold standard, and an area under the ROC curve of AUC=0.79 for the detection of cognitive impairment. Additionally, linear regression was applied to predict the gold-standard outcomes from DigiMoCA's scores, obtaining a coefficient of determination of R²=0.77. This demonstrates the potential of DigiMoCA as a digital screening tool for the early detection of cognitive impairment among senior adults.
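The validity statistics reported above (Pearson correlation, ROC AUC, and a linear-regression R²) can be reproduced on toy data with a few lines of standard tooling; the arrays and the impairment cutoff below are placeholders, not study data.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression
from sklearn.metrics import roc_auc_score

digimoca = np.array([18, 20, 15, 22, 12, 19, 16, 21])   # agent-administered scores (toy)
tmoca    = np.array([17, 21, 14, 22, 11, 18, 15, 20])   # telephone MoCA, gold standard (toy)
impaired = (tmoca < 18).astype(int)                      # illustrative impairment cutoff

r, _ = pearsonr(digimoca, tmoca)
auc = roc_auc_score(impaired, -digimoca)                 # lower score -> higher impairment risk
reg = LinearRegression().fit(digimoca.reshape(-1, 1), tmoca)
r2 = reg.score(digimoca.reshape(-1, 1), tmoca)
print(f"Pearson r={r:.2f}, AUC={auc:.2f}, R^2={r2:.2f}")
```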
No abstract available
An intelligent agent for sentence completion test: creation and application in depression assessment
During large-scale psychological screening, traditional self-report questionnaires face challenges such as response deception and social desirability bias, while the Sentence Completion Test (SCT), as a projective technique, shows potential but is limited by manual scoring and high costs. Leveraging advancements in Large Language Models (LLMs), this study integrates SCT's theoretical framework with LLM capabilities to develop a specialized set of SCT items for depression assessment in Chinese university students, using a self-built intelligent agent across three progressive empirical studies. Results show the agent demonstrates good reliability (Cronbach's α = 0.89–0.92) and validity, with high consistency with manual scoring (r = 0.96), significant criterion correlations with the Beck Depression Inventory (r = 0.89) and the Self-Rating Depression Scale (r = 0.85), and structural validity confirmed via exploratory factor analysis. Furthermore, the intelligent agent could identify most invalid responses (F1 = 0.94, Accuracy = 0.99, Precision = 0.99, Recall = 0.90). This research marks a key milestone in the intelligent transformation of the SCT, driving innovation in psychological assessment and offering new academic and practical pathways.
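For readers less familiar with the reliability figure quoted above, a minimal Cronbach's alpha computation on a toy item-response matrix (rows are respondents, columns are SCT items) looks like this.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: 2-D array of shape (n_respondents, n_items)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()      # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

toy = np.array([[3, 2, 3, 4], [1, 1, 2, 1], [4, 3, 4, 4], [2, 2, 1, 2], [3, 3, 3, 4]])
print(f"alpha = {cronbach_alpha(toy):.2f}")
```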
BACKGROUND Depression is prevalent, chronic, and burdensome. Due to limited screening access, depression often remains undiagnosed. Artificial intelligence (AI) models based on spoken responses to interview questions may offer an effective, efficient alternative to other screening methods. OBJECTIVE The primary aim was to use a demographically diverse sample to validate an AI model, previously trained on human-administered interviews, on novel bot-administered interviews, and to check for algorithmic biases related to age, sex, race, and ethnicity. METHODS Using the Aiberry app, adults recruited via social media (N = 393) completed a brief bot-administered interview and a depression self-report form. An AI model was used to predict form scores based on interview responses alone. For all meaningful discrepancies between model inference and form score, clinicians performed a masked review to determine which one they preferred. RESULTS There was strong concurrent validity between the model predictions and raw self-report scores (r = 0.73, MAE = 3.3). 90 % of AI predictions either agreed with self-report or with clinical expert opinion when AI contradicted self-report. There was no differential model performance across age, sex, race, or ethnicity. LIMITATIONS Limitations include access restrictions (English-speaking ability and access to smartphone or computer with broadband internet) and potential self-selection of participants more favorably predisposed toward AI technology. CONCLUSION The Aiberry model made accurate predictions of depression severity based on remotely collected spoken responses to a bot-administered interview. This study shows promising results for the use of AI as a mental health screening tool on par with self-report measures.
Conversational agents (CAs) have shown short-term promise for depression in adults, yet little research has examined CAs for depression in adolescents. This study aimed to determine adolescents' user experience with Athenabot, a behavioral activation CA for depression. The study included 66 participants who interacted with Athenabot. Participants were aged 13 to 18 (mean = 14.12) and predominantly identified as female (56.1%). Participants' confidence in the CA's utility to improve mood significantly increased from baseline to post-intervention (p < 0.001). Adolescents provided an acceptable Net Promoter Score of 6.73. Positive themes from feedback included the CA being helpful and favorably viewed, while negative themes included its perceived audience-dependency and impersonal nature. Recommendations for improvement included reducing repetitive questions and enhancing personalization. Adolescents significantly preferred multiple-choice questions over typed response questions (p < 0.05). However, there were no significant differences in preference for emojis, memes, or GIFs. Adolescents reported an increased confidence that the CA could improve their mood. While the CA received acceptable support, feedback highlighted a need for improved engagement and personalization. Adolescents favored multiple-choice button questions over typed responses and preferred GIFs over memes and emojis, with no significant demographic differences.
Abstract Background The growing global burden of mental health disorders has intensified the search for scalable, accessible, and cost-effective interventions. Conversational agents in the form of digital humans have emerged as promising tools to deliver mental health support across diverse populations and settings. Objective This scoping review aimed to analyze the role of digital humans in depression management, identifying their specific applications in both diagnostic processes and therapeutic interventions. Additionally, it aimed to evaluate the design choices implemented in digital human systems, including their appearance, interaction modalities, back-end intelligence systems, and the various roles they assume. Methods Following the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines, we systematically searched peer-reviewed literature across major databases, including ACM Digital Library, IEEE Xplore, Web of Science, and PubMed, to capture both psychological and technological perspectives. The search query included a wide variety of synonyms for digital humans and depression: (“avatar” OR “virtual agent” OR “embodied conversational agent” OR “relational agent” OR “digital human” OR “virtual human” OR “virtual character”) AND (“Major Depressive Disorder” OR “Depression”). Studies were included if they described the development, implementation, or evaluation of digital humans designed to support mental health outcomes. Data were charted on agent design, therapeutic approach, target population, delivery context, and reported effectiveness. Results In total, 20 studies (2010-2024) were included. Depression assessment studies comprised 35% (n=7), interventions 55% (n=11), and combined approaches 10% (n=2). Assessment protocols included the Patient Health Questionnaire-9, the very short visual analog scale version of the Center for Epidemiologic Studies Depression Scale, semistructured interviews based on Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition criteria, and interactive tasks designed to elicit emotional responses. Intervention approaches used cognitive behavioral therapy, psychoeducation, compassion-focused therapy, and avatar therapy. Digital humans assumed 5 distinct roles: interviewer (n=6), facilitator (n=3), counselor (n=3), educator (n=3), and actor (n=5). Interviewers primarily appeared in assessment studies, presenting structured questions. Counselors engaged in therapeutic dialogues, while educators delivered psychoeducational content. Facilitators assisted participants in achieving system goals. Actors portrayed specific emotions or dysfunctional beliefs to facilitate therapeutic processes. Studies highlighted digital humans’ utility in enhancing diagnostic processes and therapeutic interventions, noting the potential for transformation through physiological data integration. Conclusions This study demonstrates that digital humans represent a transformative advancement in depression management, offering innovative applications across both assessment and intervention phases. The evidence reveals digital humans’ effectiveness in replicating traditional therapeutic roles while providing unique advantages, including 24/7 accessibility, reduced stigma, consistent care delivery, and personalized support.
Digital humans can successfully function to establish therapeutic alliances and elicit meaningful engagement comparable with human providers. Findings underscore the need for continued research to fully realize digital humans’ potential in addressing depression-specific needs, advocating for expansion into diverse therapeutic scenarios, and exploration of unexplored digital human applications.
Background The importance of computational psychotherapy is increasing due to the record-high prevalence of mental health issues worldwide. Despite advancements, current computational psychotherapy systems lack advanced prediction and behavior change mechanisms using conversational agents. Purpose This work presents a computational psychotherapy system for mental health prediction and behavior change using a conversational agent. It makes two major contributions. First, we introduce a novel gold-standard dataset comprising panel data with 1495 instances of quantitative stress, anxiety, and depression (SAD) symptom scores from diagnostic-level questionnaires and qualitative daily diary entries. Second, we present the computational psychotherapy system itself. Hypothesis We hypothesize that simulating a theory of mind (the human cognitive ability to understand others) in a conversational agent enhances its effectiveness in relieving mental health issues. Methods The system simulates theory of mind with a cognitive architecture comprising an ensemble of computational models, using cognitive modelling, machine learning models trained on the novel dataset, and novel ontologies. The system was evaluated through a computational experiment on predicting mental health phenomena from text, and an empirical interventional study on relieving mental health issues in 42 participants. Results The system outperformed state-of-the-art systems in terms of the number of detected categories and detection accuracy (highest accuracy: 91.41% using k-nearest neighbors (kNN); highest accuracy of other systems: 84% using a long short-term memory network (LSTM)). The highest accuracy for 7-day forecasting was 87.68%, whereas the other systems were not able to forecast trends. In the study, the system outperformed Woebot, the current state of the art, in reducing stress (p = 0.004) and anxiety (p = 0.008) levels. Conclusion The confirmation of our hypothesis indicates that incorporating theory of mind simulation in conversational agents significantly enhances their efficacy in computational psychotherapy, offering a promising advancement for mental health interventions and support compared to current state-of-the-art systems.
Psychological resilience refers to an individual's ability to adapt to adversity and stress. Education on psychological resilience during childhood can contribute to future mental health and well-being, such as reducing anxiety and depression [1] [2]. However, traditional psychosocial resilience training faces challenges with accessibility, heavily constrained by cost and spatiotemporal limitations. Recently, emerging large language models (LLMs) have demonstrated exceptional capabilities in conversational tasks, indicating new prospects for cultivating children's psychological resilience. In our work, 1) we conducted qualitative interviews with 10 Chinese children (aged 8-12) and their parents to understand their needs and current conditions; 2) based on the interview results and theories of psychological resilience, we summarized three pathways for developing children's psychological resilience using conversational agents (CAs) and identified six key challenges for designing child-centered CAs; 3) we designed and developed a web prototype using optimized LLMs (see Figure 1), which integrates personal and social support factors, to measure and foster children's psychological resilience through conversations; and 4) we invited 48 child volunteers to take part in user testing and designed three sets of experiments to evaluate the effectiveness of system interventions, the effectiveness of measurements, and overall acceptability. Results indicate that the intervention tasks actively promoted psychological resilience in the children. Intelligent measurement scores were effectively consistent with traditional scales in objective scoring, while subjective evaluations, such as appeal and fun, significantly exceeded traditional scale scores. Through our practice, we show the potential of CAs in enhancing children's mental health and present a reference application case. Moreover, we have identified notable future research issues, including challenges in designing psychologically educational CAs that remain persistently attractive to children, combining real-life support factors with CAs, and ethical concerns regarding safety and privacy.
Individuals with autism spectrum disorder (ASD) face unique challenges in their social interactions. The use of conversational agents (CAs) can provide support and help reduce barriers to care. However, research on CA use by individuals with ASD is limited. The present study sought to better understand CA utilization patterns among users with ASD. A subset of data was collected from users of Wysa, a mental health CA. Engagement with the CA, utilization of offered mental health interventions (CA tools), collection of energy scores, and depression (PHQ-9) and anxiety (GAD-7) outcomes were gathered and analyzed. Users engaged with Wysa on average 8.59 days and had a median of 97 conversational exchanges. Almost half of the users utilized at least one of the 230 tools offered. The most frequently used interventions focused on mindfulness, thought recording, sleep, grounding, and social support. Energy scores were reported on average 10.59 times, and the mean energy score was 42.77 out of 100. Mean baseline PHQ-9 and GAD-7 scores were 10.34 and 8.81, respectively. Overall, the current findings show that users with ASD engaged regularly with a CA, despite its targeted design for other mental health concerns such as depression and anxiety rather than ASD. If users with ASD engage with these types of resources, this could become a new avenue of support for a population facing multiple challenges in accessing treatment.
No abstract available
The global aging trend demands innovative solutions to support older adults in maintaining emotional well-being and independent living. Depression and anxiety remain underdiagnosed in this population due to social stigma and lack of regular screening. To address this gap, we propose ESPELHAR, a smart mirror-based conversational assistant for continued mental health assessment at home. Building upon clinically validated instruments, namely the Patient Health Questionnaire-9 (PHQ-9) and General Anxiety Disorder-7 (GAD-7), our approach reimagines these tools as daily micro-assessments embedded into natural conversation. The assistant dynamically adapts the questions, scores responses through an inverted cumulative method, and maintains a temporal emotional profile to detect trends and worsening symptoms. This contribution presents the definition of Personas and Scenarios that guided system requirements, a detailed proposal for the mood assessment method, and the implementation of an initial system prototype. By grounding daily mood monitoring in robust psychometrics and more natural interaction, our approach bridges the gap between traditional clinical screening and real-world smart environments. Overall, we believe this work has the potential to support proactive mental health care at home.
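A hedged sketch of the daily micro-assessment idea described above: draw a couple of PHQ-9-style items per day, keep a rolling temporal profile, and flag worsening trends. The item rotation and the worsening rule are illustrative assumptions, not ESPELHAR's inverted cumulative scoring method.

```python
import random
from collections import deque

PHQ9_ITEMS = [
    "Little interest or pleasure in doing things",
    "Feeling down, depressed, or hopeless",
    "Trouble falling or staying asleep, or sleeping too much",
    "Feeling tired or having little energy",
]

def daily_micro_assessment(ask) -> float:
    """Ask two rotating items (each answered 0-3) and return the day's mean item score."""
    items = random.sample(PHQ9_ITEMS, 2)
    return sum(ask(item) for item in items) / 2.0

def worsening(profile: deque, window: int = 3, delta: float = 0.5) -> bool:
    """Flag if the recent mean exceeds the earlier mean by `delta` points."""
    if len(profile) < 2 * window:
        return False
    recent = list(profile)[-window:]
    earlier = list(profile)[-2 * window:-window]
    return (sum(recent) / window) - (sum(earlier) / window) >= delta

profile = deque(maxlen=30)  # rolling 30-day emotional profile
for day_score in [0.5, 0.5, 1.0, 1.0, 1.5, 2.0]:
    profile.append(day_score)
print("worsening trend:", worsening(profile))
```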
Background Adolescents living with HIV are disproportionally affected by depression, which worsens antiretroviral therapy adherence, increases viral load, and doubles the risk of mortality. Because most adolescents living with HIV live in low- and middle-income countries, few receive depression treatment due to a lack of mental health services and specialists in low-resource settings. Chatbot technology, used increasingly in health service delivery, is a promising approach for delivering low-intensity depression care to adolescents living with HIV in resource-constrained settings. Objective The goal of this study is to develop a prototype, optimized conversational agent (chatbot) that provides mental health education, self-help skills, and care linkage for adolescents living with HIV, and to pilot-test it for feasibility and acceptability. Methods Chatbot development comprises 3 phases conducted over 2 years. In the first phase (year 1), formative research will be conducted to understand the views, opinions, and preferences of up to 48 youths aged 10-19 years (6 focus groups of up to 8 adolescents living with HIV per group), their caregivers (5 in-depth interviews), and HIV program personnel (5 in-depth interviews) regarding depression among adolescents living with HIV. We will also investigate the perceived acceptability of a mental health chatbot, including barriers and facilitators to accessing and using a chatbot for depression care by adolescents living with HIV. In the second phase (year 1), we will iteratively program a chatbot using the SmartBot360 software with successive versions (0.1, 0.2, and 0.3), meeting regularly with a Youth Advisory Board composed of adolescents living with HIV who will guide and inform the chatbot development and content to arrive at a prototype version (version 1.0) for pilot-testing. In the third phase (year 2), we will pilot-test the prototype chatbot among 50 adolescents living with HIV naïve to its development. Participants will interact with the chatbot for up to 2 weeks, and data will be collected on the acceptability of the chatbot-delivered depression education and self-help strategies, depression knowledge changes, and intention to seek care linkage. Results The study was awarded in April 2022, received institutional review board approval in November 2022, received funding in December 2022, and commenced recruitment in March 2023. By the completion of study phases 1 and 2, we expect the chatbot to incorporate the key needs and preferences gathered from the focus groups and interviews. By the completion of study phase 3, we will have assessed the feasibility and acceptability of the prototype chatbot. Study phase 3 began in April 2024. Final results are expected by January 2025 and published thereafter. Conclusions The study will produce a prototype mental health chatbot developed with and for adolescents living with HIV that will be ready for efficacy testing in a subsequent, larger study. International Registered Report Identifier (IRRID) DERR1-10.2196/55559
Detecting depression from conversational text using large language models (LLMs) has garnered significant interest. However, the limited interpretability of existing methods presents a major challenge for clinical application. To address this, we propose a novel framework for automatic depression assessment, which employs LLM prompting to extract interpretable factors linked to depression from text and uses linear regression to predict severity scores. We evaluated our approach using a benchmark dataset (DAIC-WOZ; n = 186), predicting Patient Health Questionnaire (PHQ)-8 scores from clinical interview transcripts. Our method identifies key behavioral and linguistic features indicative of depression while also achieving state-of-the-art performance with a mean absolute error (MAE) of 2.91 on the test set. The resulting model further generalizes to an independent test dataset (E-DAIC; n = 86) with an MAE of 2.86. These findings suggest that interpretable LLM-based approaches hold significant promise for enhancing the clinical utility of automated depression assessment.
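The interpretable pipeline described above pairs LLM-rated factors with an ordinary linear model. The sketch below illustrates that shape; the factor names, prompt, and toy feature matrix are assumptions, not the paper's feature set or DAIC-WOZ data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

FACTORS = ["low_mood", "anhedonia", "sleep_problems", "fatigue", "negative_self_view"]

def rate_factors(transcript: str, call_llm) -> list:
    """Hypothetical extraction step: ask the LLM for a 0-3 rating of each factor."""
    prompt = (
        "Rate each factor from 0 (absent) to 3 (severe), comma-separated, in this order: "
        f"{', '.join(FACTORS)}.\nTranscript:\n{transcript}"
    )
    return [float(x) for x in call_llm(prompt).split(",")]

# With a feature matrix X (interviews x factors) and PHQ-8 labels y, a plain linear
# model keeps every coefficient inspectable. Toy numbers only.
X = np.array([[2, 2, 1, 2, 1], [0, 1, 0, 0, 0], [3, 3, 2, 2, 3], [1, 0, 1, 1, 0]])
y = np.array([14, 3, 21, 6])
model = LinearRegression().fit(X, y)
print("coefficients:", dict(zip(FACTORS, model.coef_.round(2))))
print("training MAE:", mean_absolute_error(y, model.predict(X)))
```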
Large Language Models (LLMs) have shown increasing potential in mental health applications, such as depression screening from clinical interviews. However, the effectiveness and reliability of these models strongly depend on how prompts are formulated: small variations in context, instruction, or role can significantly affect the results. To address this challenge, this paper introduces the Contextual Prompt Enabler for Mental Health (CPEMH), an agentic multi-stage framework designed to systematically investigate, compare, and select prompting strategies in a controlled and reproducible manner. CPEMH integrates specialized modules covering intelligent data sampling, analytical performance assessment, and consistency evaluation. The framework enables transparent and adaptable analyses of prompt strategies without relying on fine-tuning, retrieval-augmented generation (RAG), or external datasets. It supports multiple prompting paradigms, including Direct Instruction, Role-Based Prompting, and Chain-of-Thought Reasoning, and incorporates evaluation metrics that measure predictive performance (e.g., recall, F1-score), output consistency, and classification bias. A clinical case study conducted using the DAIC-WOZ dataset and the GPT-4 model demonstrated that CPEMH goes beyond illustrating the framework's operation: it can generate empirically validated recommendations and insights. The prompting strategies selected by the system achieved higher overall performance (F1-score) and a better balance between bias and robustness, which were subsequently confirmed on a larger sample, four times greater than the original subset, reinforcing the generalization and stability of the framework's outcomes. Furthermore, the analysis revealed structural and semantic patterns across different prompt styles, offering valuable insights into the role of contextual formulation in clinical classification tasks, particularly in the automated diagnosis of depression from interview transcripts. By combining the reasoning capabilities of LLMs with a configurable agentic architecture and diverse evaluation metrics, CPEMH offers a lightweight, transparent, and flexible approach to prompting strategy evaluation, promoting reliability, reproducibility, and ethical validity in automated mental health screening applications.
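The core comparison loop of such a framework can be sketched as follows: run several prompting strategies over the same transcripts, then score predictive performance and run-to-run consistency. The strategy templates and the `classify` callable are illustrative assumptions, not the CPEMH modules themselves.

```python
from collections import Counter
from sklearn.metrics import f1_score, recall_score

# Illustrative prompt templates for three common paradigms.
STRATEGIES = {
    "direct": "Answer 'depressed' or 'not depressed' for this transcript:\n{t}",
    "role":   "You are a clinical psychologist. Classify the speaker as 'depressed' or 'not depressed':\n{t}",
    "cot":    "Think step by step about depressive cues, then conclude 'depressed' or 'not depressed':\n{t}",
}

def evaluate_strategy(template, transcripts, labels, classify, repeats=3):
    """classify(prompt) -> 0/1 prediction. Returns F1, recall, and run-to-run consistency."""
    runs = [[classify(template.format(t=t)) for t in transcripts] for _ in range(repeats)]
    # Majority vote per transcript across repeated runs.
    majority = [Counter(col).most_common(1)[0][0] for col in zip(*runs)]
    # Consistency: how often the repeated runs agree on each transcript, averaged.
    consistency = sum(max(Counter(col).values()) / repeats for col in zip(*runs)) / len(transcripts)
    return {
        "f1": f1_score(labels, majority),
        "recall": recall_score(labels, majority),
        "consistency": consistency,
    }
```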
Automating structured clinical interviews could revolutionize mental healthcare accessibility, yet existing large language model (LLM) approaches fail to align with psychiatric diagnostic protocols. We present MAGI, the first framework that transforms the gold-standard Mini International Neuropsychiatric Interview (MINI) into automatic computational workflows through coordinated multi-agent collaboration. MAGI dynamically navigates clinical logic via four specialized agents: 1) an interview-tree-guided navigation agent adhering to the MINI's branching structure, 2) an adaptive question agent blending diagnostic probing, explaining, and empathy, 3) a judgment agent validating whether a participant's response meets the current node's criterion, and 4) a diagnosis agent generating Psychometric Chain-of-Thought (PsyCoT) traces that explicitly map symptoms to clinical criteria. Experimental results on 1,002 real-world participants covering depression, generalized anxiety, social anxiety, and suicide show that MAGI advances LLM-assisted mental health assessment by combining clinical rigor, conversational adaptability, and explainable reasoning.
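A toy sketch of the coordinated agents described above: a navigation step that walks a MINI-like branching structure, a question step, and a judgment step that decides whether each node's criterion is met. The three-node tree and the `ask_user`/`judge` stubs are illustrative assumptions, not MAGI's interview tree.

```python
# Minimal MINI-like branching structure (illustrative, not the actual instrument).
MINI_TREE = {
    "A1": {"question": "Have you felt depressed most of the day, nearly every day, for two weeks?",
           "yes": "A2", "no": "STOP"},
    "A2": {"question": "Have you lost interest in most things during that period?",
           "yes": "A3", "no": "STOP"},
    "A3": {"question": "Did these problems affect sleep, appetite, or concentration?",
           "yes": "DIAGNOSE", "no": "STOP"},
}

def run_interview(ask_user, judge) -> list:
    """ask_user(question) -> free-text answer; judge(question, answer) -> True/False."""
    node, trace = "A1", []
    while node not in ("STOP", "DIAGNOSE"):
        question = MINI_TREE[node]["question"]
        answer = ask_user(question)                      # adaptive question agent
        met = judge(question, answer)                    # judgment agent
        trace.append({"node": node, "answer": answer, "criterion_met": met})
        node = MINI_TREE[node]["yes" if met else "no"]   # navigation agent
    trace.append({"outcome": node})                      # material for a diagnosis trace
    return trace
```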
Mental well‐being is a worldwide priority of health systems. Nevertheless, the diagnosis and recovery rates are still low. This work demonstrates an Artificial Intelligence (AI)‐based, entertainment‐oriented, engaging assistant that can deliver on‐demand, non‐judgmental assessment in an accessible, scalable, and personalised way to people affected by anxiety and depression. For this purpose, we combine Machine Learning (ML) and Large Language Models (LLMs) in a stream‐based framework. Here, the LLMs are exploited to extract high‐level reasoning features from natural language utterances for an accurate ML prediction model. During a study lasting for 14 months, the participants, 146 users mostly within the 65–80 age range, used the conversational assistant. Each user was free to participate as they pleased, with average individual activity times of 4.5 months. During their participation, each user completed an average of two standard mental condition tests, which allowed updating the mental condition tags for classifier retraining in streaming mode. Our solution achieved promising results for detecting anxiety and depression in free dialogues, with accuracy metrics exceeding 90%, outperforming competing works from the literature. Moreover, prior research on anxiety and depression detection has often been limited to providing binary outcomes without explanations behind their rationale. Therefore, this work also addresses interpretability by automatically explaining its prediction in natural language. The contributions of this work are threefold: (i) detecting mental conditions from free dialogues in real‐time with minimal supervision, (ii) conducting a non‐invasive longitudinal analysis based on user engagement, and (iii) automatically providing explanations of the predictive capabilities of the solution. Our approach, supporting continuous interactions suitable for longitudinal studies, combined with explainability mechanisms, connects directly with several strategic lines promoted by the European Union (EU). In particular, the emphasis on early prevention and detection is aligned with a conversational tool that monitors indicators over time. Similarly, the EU promotes transparency, ethical governance, and trust in digital health technologies, as well as data interoperability and common standards as part of the push toward the European Health Data Space. In this context, incorporating explainability, that is, providing the user or researcher with understandable reasons for the model's inferences, strengthens acceptability, accountability, and alignment with good digital governance principles. In this way, our approach contributes to translating the European objectives of promoting accessible, safe, ethical, and evidence‐based digital mental health tools into practice, while facilitating longitudinal monitoring and proactive intervention.
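The stream-based combination described above, LLM-derived high-level features feeding a lightweight classifier that is retrained as new labeled tests arrive, might be sketched as follows. The feature list and the `call_llm` stub are assumptions; scikit-learn's `partial_fit` stands in for the streaming retraining.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

FEATURES = ["hopelessness", "worry", "social_withdrawal", "sleep_complaints"]

def extract_features(utterance: str, call_llm) -> np.ndarray:
    """Hypothetical: ask the LLM to score each high-level cue from 0 to 1."""
    prompt = f"Score each of {FEATURES} from 0 to 1, comma-separated, for: {utterance}"
    return np.array([[float(x) for x in call_llm(prompt).split(",")]])

clf = SGDClassifier(loss="log_loss", random_state=0)
classes = np.array([0, 1])  # 0 = no flag, 1 = anxiety/depression flag

# Each time a user completes a standard mental-condition test, the new (features, label)
# pair updates the streaming model; `classes` is required on the first call only.
X_new = np.array([[0.8, 0.6, 0.7, 0.5]])
y_new = np.array([1])
clf.partial_fit(X_new, y_new, classes=classes)
```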
This paper describes the participation of the SINAI-UJA team in the eRisk@CLEF 2025 lab. Specifically, we addressed two of the proposed tasks: (i) Task 2: Contextualized Early Detection of Depression, and (ii) Pilot Task: Conversational Depression Detection via LLMs. Our approach for Task 2 combines an extensive preprocessing pipeline with the use of several transformer-based models, such as RoBERTa Base or MentalRoBERTA Large, to capture the contextual and sequential nature of multi-user conversations. For the Pilot Task, we designed a set of conversational strategies to interact with LLM-powered personas, focusing on maximizing information gain within a limited number of dialogue turns. In Task 2, our system ranked 8th out of 12 participating teams based on F1 score. However, a deeper analysis revealed that our models were among the fastest in issuing early predictions, which is a critical factor in real-world deployment scenarios. This highlights the trade-off between early detection and classification accuracy, suggesting potential avenues for optimizing both jointly in future work. In the Pilot Task, we achieved 1st place out of 5 teams, obtaining the best overall performance across all evaluation metrics: DCHR, ADODL and ASHR. Our success in this task demonstrates the effectiveness of structured conversational design when combined with powerful language models, reinforcing the feasibility of deploying LLMs in sensitive mental health assessment contexts.
Social robots are designed to interact with humans in social and emotional ways. Although social robots can have practical use cases where privacy concerns prevail, the use of social robots in such cases has been limited. This paper addresses this literature gap by evaluating a social robot in assessing people’s stress, anxiety, and depression via conversational AI. In this work, we develop a social robot called “Furhat” for assessment of stress, anxiety, and depression via conversational AI as an alternative to the traditional pencil and paper method. The Furhat robot is designed to interact with individuals and provide a safe and non-judgmental space for individuals to express their stress, anxiety, and depression symptoms. The study compared the levels of stress, anxiety and depression assessed from Furhat with those assessed using the conventional pencil-and-paper method. Results demonstrated that the Furhat robot-based data collection and analysis method was similar to the pencil-and-paper method. Additionally, social robots were a preferred option for patients, as they reported higher levels of comfort and satisfaction with the Furhat-based screening process compared to that via the pencil-and-paper method. The use of social robots for data collection and emotional support has excellent potential for various applications in healthcare, education, and other domains.
No abstract available
Mental health has attracted substantial attention in recent years and large language model (LLM) can be an effective technology for alleviating this problem owing to its capability in text understanding and dialogue. However, existing research in this domain often suffers from limitations, such as training on datasets lacking crucial prior knowledge and evidence, and the absence of comprehensive evaluation methods. In this article, we propose a specialized psychological LLM, named PsycoLLM, trained on a proposed high-quality psychological dataset, including single-turn QA, multiturn dialogues, and knowledge-based QA. Specifically, we construct multi-turn dialogues through a three-step pipeline comprising multiturn QA generation, evidence judgment, and dialogue refinement. We augment this process with real-world psychological case backgrounds extracted from online platforms, enhancing the relevance and applicability of the generated data. Additionally, to compare the performance of PsycoLLM with other LLMs, we develop a comprehensive psychological benchmark based on authoritative psychological counseling examinations in China, which includes assessments of professional ethics, theoretical proficiency, and case analysis. The experimental results on the benchmark illustrate the effectiveness of PsycoLLM, which demonstrates superior performance compared with other LLMs.
Introduction Suicide accounts for over 720,000 deaths globally each year, and many more individuals experience suicidal ideation; thus, implementing large-scale, effective suicide intervention is vital for reducing suicidal behaviors. Traditional suicide intervention methods are hampered by shortages of qualified practitioners, variability in clinical competence, and high service costs. This study leverages Large Language Models (LLMs) to develop an effective suicide intervention chatbot, which provides early, large-scale, rapid self-help interventions. Methods First, following existing psychological crisis intervention methods, we adapted ChatGPT-4 via prompt engineering to develop a chatbot that promptly responds to the needs of individuals experiencing suicidal ideation. Then, we implemented a self-help web-based dialogue platform powered by this chatbot and evaluated its usability and intervention efficacy. Results We found that the self-help suicide intervention chatbot achieved high effectiveness and quality in terms of user interface operability, interaction experience, emotional support, intervention efficacy, safety and privacy, and overall satisfaction. Discussion These findings demonstrate that the suicide intervention chatbot can provide effective emotional support and therapeutic intervention to a large cohort experiencing suicidal ideation.
Due to privacy concerns, open dialogue datasets for mental health are primarily generated through human or AI synthesis methods. However, the inherent implicit nature of psychological processes, particularly those of clients, poses challenges to the authenticity and diversity of synthetic data. In this paper, we propose ECAs (short for Embodied Conversational Agents), a framework for embodied agent simulation based on Large Language Models (LLMs) that incorporates multiple psychological theoretical principles. Using simulation, we expand real counseling case data into a nuanced embodied cognitive memory space and generate dialogue data based on high-frequency counseling questions. We validated our framework using the D4 dataset. First, we created a public ECAs dataset through batch simulations based on D4. Licensed counselors evaluated our method, demonstrating that it significantly outperforms baselines in simulation authenticity and necessity. Additionally, two LLM-based automated evaluation methods were employed to confirm the higher quality of the generated dialogues compared to the baselines.
As global demand for mental health support continues to rise, AI-driven dialogue systems have become a vital supplement to traditional psychotherapy. However, existing systems that rely on a single large language model (LLM) struggle to manage multi-phase intervention processes and lack the ability to dynamically adapt to users' changing psychological states. This paper introduces AdaptiCare, a multi-agent collaborative framework for mental health counseling, which simulates professional team-based intervention through specialized agents (Counselor, Assistant, Reasoner, and Coach) and comprehensively covers the structured three-phase process of assessment, intervention, and consolidation. We further propose a multi-dimensional evaluation scheme integrating the CARE and TEQ scales to assess empathy, professionalism, and strategic guidance, using GPT-3.5-turbo for consistent fine-grained scoring. Experimental results show that AdaptiCare outperforms a single-LLM baseline by 11.2%, 6.5%, and 4.5%, all statistically significant (p < 0.05). In comparison with seven leading models (e.g., LLaMA, Vicuna, PsyChat), AdaptiCare achieves superior performance in strategic coherence and overall balance. In a preliminary study involving 20 quasi-real users, participants showed significant improvement in PHQ-9 and GAD-7 scores, and gave generally positive feedback on the system's empathy, practicality of suggestions, and interaction fluency, further validating AdaptiCare's potential for real-world mental health support applications.
Using large language models (LLMs) to assist psychological counseling is a significant but challenging task at present. Attempts have been made on improving empathetic conversations or acting as effective assistants in the treatment with LLMs. However, the existing datasets lack consulting knowledge, resulting in LLMs lacking professional consulting competence. Moreover, how to automatically evaluate multi-turn dialogues within the counseling process remains an understudied area. To bridge the gap, we propose CPsyCoun, a report-based multi-turn dialogue reconstruction and evaluation framework for Chinese psychological counseling. To fully exploit psychological counseling reports, a two-phase approach is devised to construct high-quality dialogues while a comprehensive evaluation benchmark is developed for the effective automatic evaluation of multi-turn psychological consultations. Competitive experimental results demonstrate the effectiveness of our proposed framework in psychological counseling. We open-source the datasets and model for future research at https://github.com/CAS-SIAT-XinHai/CPsyCoun
The rapid development of large language models (LLMs) has significantly benefited various industries. However, the psychological challenges experienced by left-behind children (LBC) due to a lack of companionship have received insufficient attention. To address this issue, we propose a Multi-Tone Emotional Voice Synthesis Framework that integrates speech recognition, LLMs, and voice synthesis technologies. Specifically, we construct an emotional-psychology knowledge base and a psychological dialogue dataset tailored for LBC interactions. These resources enhance LLMs' capability to understand psychological issues and generate contextually appropriate responses that foster positive psychological development. Unlike conventional single-voice systems, our framework incorporates multi-tone audio samples through advanced speech recognition and synthesis techniques, enabling both diverse vocal outputs and efficient voice reconstruction. This approach significantly improves the naturalness and emotional expressiveness of synthesized speech. Experimental comparisons with ChatGLM3-6B and ChatGPT-3 demonstrate that our framework achieves superior performance in psychological question-answering tasks and voice expression. The results suggest its potential in providing effective psychological companionship for LBC, while offering practical implications for developing mental health support systems targeting this vulnerable population.
Mental health remains a growing public health concern in Bangladesh, where access to psychological services is limited by a shortage of professionals, cultural stigma, and economic barriers. To address these challenges, this study explores the development of a Bengali-language mental health assistant, PsychAI, using large language models (LLMs). We introduce MonBarta, a synthetic dataset of 200 validated mental health conversations in Bengali, created using a few-shot generation approach and supervised by clinical psychologists. Leveraging fine-tuning methodologies, we developed a PsychAI Analyzer Model (PAM) capable of classifying symptoms from client-assistant conversations. The system effectively identifies psychological conditions such as depression, anxiety, post-traumatic stress disorder (PTSD), and schizophrenia-related disorders. Evaluation by mental health professionals shows an accuracy of 86.84% and an Average Human Evaluation Score (AHES) of 4.34 out of 5, indicating strong clinical relevance. This research contributes a linguistically and culturally grounded AI framework for early mental health assessment in under-resourced Bengali-speaking communities.
The Structured Dialogue System, referred to as SuDoSys, is an innovative Large Language Model (LLM)-based chatbot designed to provide psychological counseling. SuDoSys leverages the World Health Organization (WHO)'s Problem Management Plus (PM+) guidelines to deliver stage-aware multi-turn dialogues. Existing methods for employing an LLM in multi-turn psychological counseling typically involve direct fine-tuning using generated dialogues, often neglecting the dynamic stage shifts of counseling sessions. Unlike previous approaches, SuDoSys considers the different stages of counseling and stores essential information throughout the counseling process, ensuring coherent and directed conversations. The system employs an LLM, a stage-aware instruction generator, a response unpacker, a topic database, and a stage controller to maintain dialogue flow. In addition, we propose a novel technique that simulates counseling clients to interact with the evaluated system and evaluate its performance automatically. When assessed using both objective and subjective evaluations, SuDoSys demonstrates its effectiveness in generating logically coherent responses. The system's code and program scripts for evaluation are open-sourced.
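A minimal sketch of a stage-aware controller in the spirit of the description above: a fixed stage order, a per-stage instruction for the LLM, and a controller that carries key information across stages. The stage names and the completion rule are illustrative assumptions based on the PM+ outline, not the released SuDoSys code.

```python
PMPLUS_STAGES = ["rapport", "problem_identification", "stress_management",
                 "problem_solving", "social_support", "closing"]

STAGE_INSTRUCTIONS = {
    "rapport": "Build rapport and explain the session structure.",
    "problem_identification": "Help the client name and prioritize current problems.",
    "stress_management": "Guide a brief slow-breathing exercise.",
    "problem_solving": "Work through one concrete, manageable problem.",
    "social_support": "Explore ways to strengthen the client's support network.",
    "closing": "Summarize the session and agree on next steps.",
}

class StageController:
    def __init__(self):
        self.index = 0
        self.memory = {}                       # topic store: facts carried across stages

    def instruction(self) -> str:
        """Stage-aware instruction handed to the LLM for the next turn."""
        stage = PMPLUS_STAGES[self.index]
        return f"[stage: {stage}] {STAGE_INSTRUCTIONS[stage]} Known context: {self.memory}"

    def advance_if_done(self, stage_complete: bool, extracted_facts: dict) -> None:
        """Unpack key information from the last response and move on when ready."""
        self.memory.update(extracted_facts)
        if stage_complete and self.index < len(PMPLUS_STAGES) - 1:
            self.index += 1
```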
Psychological measurement is essential for mental health, self-understanding, and personal development. Traditional methods, such as self-report scales and psychologist interviews, often face challenges with engagement and accessibility. While game-based and LLM-based tools have been explored to improve user interest and automate assessment, they struggle to balance engagement with generalizability. In this work, we propose PsychoGAT (Psychological Game AgenTs) to achieve a generic gamification of psychological assessment. The main insight is that powerful LLMs can function both as adept psychologists and innovative game designers. By incorporating LLM agents into designated roles and carefully managing their interactions, PsychoGAT can transform any standardized scales into personalized and engaging interactive fiction games. To validate the proposed method, we conduct psychometric evaluations to assess its effectiveness and employ human evaluators to examine the generated content across various psychological constructs, including depression, cognitive distortions, and personality traits. Results demonstrate that PsychoGAT serves as an effective assessment tool, achieving statistically significant excellence in psychometric metrics such as reliability, convergent validity, and discriminant validity. Moreover, human evaluations confirm PsychoGAT's enhancements in content coherence, interactivity, interest, immersion, and satisfaction.
Large language models (LLMs) have shown promise in providing scalable mental health support, while evaluating their counseling capability remains crucial to ensure both efficacy and safety. Existing evaluations are limited by the static assessment that focuses on knowledge tests, the single perspective that centers on user experience, and the open-loop framework that lacks actionable feedback. To address these issues, we propose Ψ-Arena, an interactive framework for comprehensive assessment and optimization of LLM-based counselors, featuring three key characteristics: (1) Realistic arena interactions that simulate real-world counseling through multi-stage dialogues with psychologically profiled NPC clients; (2) Tripartite evaluation that integrates assessments from the client, supervisor, and counselor perspectives; (3) Closed-loop optimization that iteratively improves LLM counselors using diagnostic feedback. Experiments across eight state-of-the-art LLMs show significant performance variations in different real-world scenarios and evaluation perspectives. Moreover, reflection-based optimization results in up to a 141% improvement in counseling performance. We hope Ψ-Arena provides a foundational resource for advancing reliable and human-aligned LLM applications in mental healthcare.
Recent advancements in large language models (LLMs) have opened new avenues in psychological counseling. This study leverages LLMs to develop chatbots capable of conducting empathetic and personalized counseling sessions by applying various prompt engineering techniques, including zero-shot, few-shot, meta-learning, Chain of Thought, and our newly developed Empathetic Meta-Chain (EMC) method. The EMC method demonstrated superior performance in empathy, response accuracy, interaction continuity, fluency, and understanding, as confirmed by expert evaluations. By integrating advanced empathetic strategies, the EMC chatbot significantly enhances its ability to support users' mental well-being through natural and engaging counseling interactions. These findings highlight the potential of LLM-based counseling chatbots to serve as effective tools in mental health support.
In this narrative review, we survey recent empirical evaluations of AI-based language assessments and make the case that large language models are poised to change standardized psychological assessment. Artificial intelligence has been undergoing a purported “paradigm shift” initiated by new machine learning models, namely large language models (e.g., BERT, LLaMA, and the model behind ChatGPT). These models have achieved unprecedented accuracy on most computerized language processing tasks, from web search to automatic machine translation and question answering, while their dialogue-based forms, like ChatGPT, have captured the interest of over a million users. The success of large language models is mostly attributed to their ability to numerically represent words in context, long a weakness of previous attempts to automate psychological assessment from language. While potential applications for automated therapy are beginning to be studied on the heels of ChatGPT's success, here we present evidence suggesting that, with thorough validation of targeted deployment scenarios, AI's newest technology can move mental health assessment away from rating scales and toward how people naturally communicate, in language.
In Psychotherapy, Early Maladaptive Schemas (EMS) are entrenched negative perceptions of self or others that perpetuate mental health challenges, contribute to treatment resistance and relapse, and obstruct therapeutic progress. Addressing EMS using appropriate psychotherapeutic support (PS) strategies helps resolve core emotional deficits, mitigate resistance, and improve client engagement. Moreover, adapting polite and empathetic communication based on clients’ emotional states fosters trust, emotional safety, and a conducive therapeutic environment, which is critical for addressing EMS and achieving positive outcomes. Motivated by these insights, we introduce MATE - a novel EMS-guided polite and empAthetic dialogue sysTem for psychothErapeutic support. MATE integrates a Large Language Model (LLM) with a Mixture of Experts-based Reinforcement Learning (MoE-RL) approach to overcome the limitations of traditional RL methods, such as large action spaces and generic responses. The LLM captures diverse semantic patterns from dialogue context. MoE-RL leverages dedicated psychotherapeutic, politeness, and empathy experts, along with a new reward function, comprising PS, politeness, empathy, contextual consistency, and diversity rewards to guide policy learning for effective response generation. Evaluations on the HOPE and PSYCON datasets demonstrate MATE’s efficacy in generating polite and empathetic psychotherapeutic responses based on clients’ EMS and emotional cues while ensuring contextual consistency and diversity.
With the increasing prominence of mental health issues, automated psychological support dialogue systems have gradually gained attention. However, existing Chinese corpora mostly remain at the level of single-turn Q&A or lack psychological counseling theoretical grounding, making it difficult to cover the progressive interactions common in psychological counseling. Meanwhile, collecting and releasing large-scale real multi-turn dialogues faces challenges related to privacy protection and high costs. To address this, this paper proposes the Helping Skills Chain-of-Thought (HCoT) method, which integrates Helping Skills Theory with Chain-of-Thought prompting. We utilized GPT-4o to rewrite CD-CN single-turn data into a Chinese multi-turn psychological support corpus, HCoT-Corpus. This corpus contains 22,341 dialogues and 211,473 strategy annotations, achieving a systematic expansion in scale, structural depth, and theoretical grounding. Analysis results indicate that HCoT-Corpus demonstrates high structural coherence and multi-strategy collaborative characteristics under the “Exploration-Comfort-Action” three-stage framework. Experimental evaluations show that, compared to baselines like SMILE, the HCoT method achieves the most balanced performance in emotional resonance, strategy application, and structural integrity. Furthermore, HCoT-Chat, fine-tuned on Qwen2.5-7B-Instruct, achieved significant advantages in both automatic metrics and cross-model evaluations. This study demonstrates the HCoT method as a promising path for constructing large-scale, theoretically grounded psychological support dialogue datasets.
LLM-based conversational agents have become increasingly popular in recent years due to their novel capacity for natural, human-like dialogue interactions. However, mistrust in LLMs persists due to concerns about privacy, the potential for incorrect responses (often referred to as ’hallucinations’), and issues related to social bias. Previous AI research shows that anthropomorphic form positively influences users’ perceptions. However, this aspect remains under-explored in LLM-based conversational agent research. Our research features two anthropomorphic forms: embodied and behavioral. Embodied Anthropomorphic Form (EA) encompasses chatbot, chatbot with text-to-speech (TTS), and embodied conversational agent (ECA) interface designs. Behavioral Anthropomorphic Form (BA) involves LLMs instructed with and without Theory of Mind (ToM) principles. In an empirical evaluation, we explored how interplay between BA form and EA form, and vice-versa, affects users’ perceptions of LLM-based conversational agents on trust, anthropomorphism, presence, usability, and user experience. Our findings provide evidence of such effects, offering novel insight into the influence of both anthropomorphic forms on perceived anthropomorphism, presence, usability, user experience, and their positive impact on user trust in LLM-based conversational agents. However, the combined highest (i.e., ECA with ToM behaviors) and lowest (i.e., Chatbot without ToM behaviors) levels of both forms result in lower user trust, suggesting a complex relationship between embodiment and ToM behaviors that warrants further investigation.
This study aims to develop an adversarial evaluation algorithm to detect the ability of large language models (LLMs) to recognize extreme behaviors in psychological counseling scenarios. We constructed a CEA dataset containing background, emotion, and behavioral characteristics, and proposed a dynamic adversarial test method based on cross-variation to evaluate the sensitivity of LLMs to extreme behaviors and their ability to intervene ethically. The experimental results showed that different LLMs had different performances: DeepSeek, Wenxin Yiyan and Claude were usually able to actively intervene, showing a high sensitivity to extreme behaviors and effective intervention ability; GPT-4, on the other hand, was more indifferent and had a lower intervention rate. This study verifies the robustness and intervention ability of different LLMs in complex emotional scenarios, but also points out that their intervention strategies and response consistency still need to be further optimized in practical applications. Future research can focus on expanding the diversity of datasets, improving the ability of model behavior feature recognition, verifying in real-world scenarios, and strengthening ethics and privacy protection, so as to improve the application value of LLMs in the field of mental health.
Well-being in family settings involves subtle psychological dynamics that conventional metrics often overlook. In particular, unconscious parental expectations, termed ideal parent bias, can suppress children's emotional expression and autonomy. This suppression, referred to as suppressed emotion, often stems from well-meaning but value-driven communication, which is hard to detect or address from outside the family context. Focusing on these latent dynamics, this study explores Large Language Model (LLM)-based support for psychologically safe family communication. We constructed a Japanese parent–child dialogue corpus of 30 scenarios, each annotated with metadata on ideal parent bias and suppressed emotion. Based on this corpus, we developed a role-playing LLM-based multi-agent dialogue support framework that analyzes dialogue and generates feedback. Specialized agents detect suppressed emotion, describe implicit ideal parent bias in parental speech, and infer contextual attributes such as the child's age and background. A meta-agent compiles these outputs into a structured report, which is then passed to five selected expert agents. These agents collaboratively generate empathetic and actionable feedback through a structured four-step discussion process. Experiments show that the system can detect categories of suppressed emotion with reasonable discrimination, and produce feedback rated highly in empathy and practicality. Moreover, simulated follow-up dialogues incorporating this feedback exhibited signs of improved emotional expression and mutual understanding, suggesting the framework's potential to support positive transformation in family interactions.
Large language models (LLMs) have advanced rapidly in Natural Language Processing (NLP) and show strong promise for mental-health assessment in underserved languages such as Vietnamese. We introduce Mindful Companion, a compact, culturally adaptive Vietnamese LLM. Our pipeline comprises three stages: (i) translating IMHI dialogues [1] into Vietnamese with GPT-4 [2] while preserving linguistic and cultural fidelity; (ii) refining the corpus with perplexity-based filters to retain semantic diversity; and (iii) adapting the Qwen2.5-1.5B backbone using parameter-efficient LoRA/QLoRA fine-tuning. On held-out evaluation, the model achieves 79.9% accuracy and a macro-F1 of 0.81, indicating readiness for real-world Vietnamese mental-health research and digital-empathy applications under limited compute. The system supports early linguistic screening and empathetic dialogue understanding, not clinical diagnosis or therapeutic decision-making. Our code implementation is available at: https://github.com/ai4li/mainproject/tree/main/MindfulBot
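Stage (iii) above refers to parameter-efficient adaptation; a hedged sketch using the Hugging Face `peft` library on a Qwen2.5-1.5B backbone is shown below. The model identifier, rank, and target modules are illustrative choices, not the paper's reported configuration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "Qwen/Qwen2.5-1.5B-Instruct"              # assumed backbone checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)  # needed later when tokenizing the corpus
model = AutoModelForCausalLM.from_pretrained(base)

lora_cfg = LoraConfig(
    r=16,                                   # low-rank update dimension (illustrative)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # attention projections; common LoRA targets
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()          # only the adapter weights are trainable
```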
Mental health problems are becoming more common. Unfortunately, there are still not enough health specialists available to assist those affected. As a way to provide accessible and immediate support to people in need, digital interventions are receiving increased attention. Conversational agents based on large language models (LLMs) are increasingly used in mental health support. To help psychiatric patients, we have developed a therapeutic dialogue system called Terabot. Earlier experiments revealed several situations that interrupted the natural flow of conversation with patients. We therefore created a new LLM-based dialogue system with an eye tracker as an additional input signal, proposing a feedback loop in which the eye tracker feeds the patient's real-time gaze data into the dialogue system. Terabot can then respond in a way better suited to the current dialogue situation, improving the agent's responses and yielding a more human-like flow of conversation.
The article presents initial experiences using Terabot, a task-oriented spoken dialogue system, in a psychiatric clinical study. Seven patients conversed with the system, completing five therapeutic sessions each, which allowed the collection of 600 recordings that were then analyzed. We measured speech and intent recognition accuracy and compared it with the results achieved with healthy testers. We demonstrated that speech and intent recognition accuracy decreased in patient testing and discuss potential reasons for this. We also showed that by adding misrecognized words to the training data for the dialogue system, we managed to increase recognition accuracy for some intents. Six out of seven patients confirmed that the relaxation exercises offered by Terabot were helpful, and five even said they liked talking with the dialogue system. These results encourage us to continue the study and further develop the dialogue model.
Psychiatric disorders affect millions, yet diagnosis depends on subjective assessments and uneven access to care. To address these challenges, there is a growing need for Contestable AI (CAI), a framework that extends beyond Explainable AI (XAI) by allowing clinicians to inspect, question, and revise algorithmic outputs, thereby reducing automation bias and strengthening accountability. We present Heart2Mind, a human-centered CAI system for psychiatric disorder prediction that provides objective evidence while preserving clinical oversight. Heart2Mind collects R-R interval (RRI) time series from Polar H9/H10 wearable ECG sensors via a Cardiac Monitoring Interface and analyzes them using a Multi-Scale Temporal-Frequency Transformer (MSTFT) that combines time-domain and frequency-domain features. For contestability, the Contestable Diagnosis Interface integrates model explanations with dialogue. Self-Adversarial Explanations compare attention-based and gradient-based explanation maps to flag inconsistent predictions, and a collaboration chatbot helps users verify and challenge outputs. On the HRV-ACC dataset, MSTFT achieved 91.7% accuracy under leave-one-out cross-validation, outperforming benchmark methods. Human-centered evaluation with the Human-CAI Consensus Rate showed experts and CAI could confirm correct decisions and correct errors through readable, efficient dialogues (FKGL ≈ 15, median 8.3 minutes, 4 turns). These results support low-cost wearable CAI screening with objective biomarkers, safeguards, and an interactive path for clinicians to refine recommendations.
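As a rough illustration of the time- and frequency-domain information an RRI-based model such as MSTFT consumes, the sketch below computes standard HRV features (SDNN, RMSSD, LF/HF power) from an R-R interval series; the resampling rate and band limits follow common HRV conventions, not the paper's exact pipeline.

```python
# Sketch: time- and frequency-domain HRV features from an RRI series (milliseconds).
# Band limits (LF 0.04-0.15 Hz, HF 0.15-0.40 Hz) are standard HRV conventions.
import numpy as np
from scipy.signal import welch
from scipy.interpolate import interp1d

def hrv_features(rri_ms, fs=4.0):
    rri = np.asarray(rri_ms, dtype=float)
    # Time-domain statistics
    sdnn = rri.std(ddof=1)
    rmssd = np.sqrt(np.mean(np.diff(rri) ** 2))
    # Resample the irregularly spaced RRI series to a uniform grid for spectral analysis
    t = np.cumsum(rri) / 1000.0
    grid = np.arange(t[0], t[-1], 1.0 / fs)
    rri_even = interp1d(t, rri, kind="cubic")(grid)
    f, pxx = welch(rri_even - rri_even.mean(), fs=fs, nperseg=min(256, len(grid)))
    lf_band = (f >= 0.04) & (f < 0.15)
    hf_band = (f >= 0.15) & (f < 0.40)
    lf, hf = np.trapz(pxx[lf_band], f[lf_band]), np.trapz(pxx[hf_band], f[hf_band])
    return {"sdnn": sdnn, "rmssd": rmssd, "lf": lf, "hf": hf, "lf_hf": lf / hf}
```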
COGITO is a privacy-conscious, retrieval-augmented mental-health consultation and patient-care support system that offers reliable, context-sensitive, and multilingual conversational assistance. The system combines large language models with semantic retrieval using FAISS, anchoring responses in curated mental-health knowledge, which minimizes hallucinations and enhances factual accuracy. Implemented with a simple Flask-based architecture, COGITO enables secure user authentication, session continuity, and encrypted interaction storage, as well as automated creation of clinically styled mental-health reports. Experiments show that retrieval augmentation improves factual accuracy by 18.4% and reduces hallucinated responses by 34.7% compared to a plain LLM chatbot, while maintaining response latency of 0.8-1.4 seconds. The platform supports multiple languages and compassionate dialogue and complies with principles of privacy and ethical design. These findings suggest that retrieval systems combined with controlled LLM reasoning provide a scalable and responsible basis for AI-based mental-health support systems.
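A minimal sketch of FAISS-based semantic retrieval for grounding responses, assuming a curated list of mental-health knowledge snippets and a sentence-transformers encoder; the model name and index type are illustrative, not COGITO's exact configuration.

```python
# Sketch: embed a curated knowledge base, index it with FAISS, and retrieve
# passages to prepend to the LLM prompt. Encoder and snippets are placeholders.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")
knowledge = [
    "Sleep disturbance is a common symptom of depression.",
    "Grounding techniques can help manage acute anxiety.",
]
emb = encoder.encode(knowledge, normalize_embeddings=True).astype("float32")

index = faiss.IndexFlatIP(emb.shape[1])   # inner product equals cosine on normalized vectors
index.add(emb)

def retrieve(query, k=2):
    q = encoder.encode([query], normalize_embeddings=True).astype("float32")
    scores, ids = index.search(q, k)
    return [(knowledge[i], float(s)) for i, s in zip(ids[0], scores[0])]

# Retrieved passages would then be injected into the prompt to anchor the answer.
print(retrieve("I can't sleep and feel low"))
```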
The COVID-19 pandemic increased anxiety, depression, and post-traumatic stress disorder around the world. It also revealed gaps in services and led to a rapid rise in the use of machine learning (ML) tools for assessment and support. This scoping review examined post-2020 evidence on ML-based screening, prediction, and intervention in populations affected by the pandemic. Following the Population-Concept-Context framework and PRISMA-ScR, searches in APA PsycINFO and MEDLINE/PubMed found 476 records. After removing duplicates and screening, 19 studies were included. Ten studies validated machine learning models using data from social media, surveys, smartphone or wearable sensors, speech, or electrocardiograms. Nine studies looked at machine learning-enabled chatbots in community or clinical settings. Screening and prediction models showed good results for anxiety, depression, and post-traumatic stress disorder, with multimodal approaches often achieving the best outcomes. Chatbot interventions were practical and well-received, leading to small reductions in depression, anxiety, or loneliness. However, these effects often matched those of self-help controls and depended on ongoing engagement. Across methodologies, the primary challenges included privacy concerns, potential cultural and linguistic biases, insufficient external validation, and evolving datasets. In general, machine learning approaches offer valuable alternatives for detection and the provision of low-threshold support after the pandemic. Nevertheless, safe scaling requires validation for equity, privacy-preserving designs, transparent reporting, and appropriate practical implementation within routine care in order to establish trust, governance, and clinical acceptance.
Depression, a prevalent mental health disorder with severe health and economic consequences, can be costly and difficult to detect. To alleviate this burden, recent research has been exploring the depression screening capabilities of deep learning (DL) models trained on videos of clinical interviews conducted by a virtual agent. Such DL models need to consider the challenges of modality representation, alignment, and fusion as well as small sample sizes. To address them, we propose WavFace, a multimodal deep learning model that inputs audio and temporal facial features. WavFace adds an encoder-transformer layer over pre-trained models to improve the unimodal representation. It also applies an explicit alignment method for both modalities and then uses sequential and spatial self-attention over the alignment. Finally, WavFace fuses the sequential and spatial self-attentions between the two modality embeddings, inspired by how mental health professionals simultaneously observe visual and vocal presentation during clinical interviews. By leveraging sequential and spatial self-attention, WavFace outperforms pre-trained unimodal and multimodal models from the literature. With a single interview question, WavFace screened for depression with a balanced accuracy of 0.81. This presents a valuable modeling approach for audio-visual mental health screening.
Depression, a widespread psychiatric disorder affecting people globally, spans all age groups, predominantly impacting adults. The disorder, characterized by symptoms including pessimism, hopelessness, anhedonia, and sadness, significantly affects individuals' lives. Our paper proposes a multi-model approach for depression detection, utilizing facial expression analysis, audio evaluation, and user text input through deep learning algorithms, alongside an intelligent chatbot for personalized support. This hybrid model integrates facial expressions, audio features, and textual input for a comprehensive approach to depression detection. The methodology includes four key objectives: a CNN model for real-time or pre-recorded video facial expression analysis, audio evaluation using an NLP algorithm to transcribe users' voices, text-based analysis uncovering linguistic patterns and emotional context, and multimodal fusion integrating the outputs into a unified multimodal approach. The intelligent chatbot encourages users to share emotions openly, enhancing the system's accuracy in identifying individuals at risk of depression. Results demonstrate the fusion's contribution to early depression detection, enabling timely interventions and improving accuracy, efficiency, and overall performance.
Depression is a common psychiatric disorder worldwide. However, in China, a considerable number of patients with depression are not diagnosed, and most of them are not aware of their depression. Despite increasing efforts, the goal of automatic depression screening from behavioral indicators has not been achieved. A major limitation is the lack of an available multimodal depression corpus in Chinese, since linguistic knowledge is crucial in clinical practice. Therefore, we first carried out a comprehensive survey with psychiatrists from a renowned psychiatric hospital to identify key interview topics that are highly related to the diagnosis of depression. Then, a semi-structured interview study was conducted over a year with subjects who had undergone clinical diagnosis and professional assessment. After that, visual, acoustic, and textual features were extracted and compared between the two groups; statistically significant differences were observed in all three modalities. Benchmark evaluations of both single-modal and multimodal fusion methods of depression assessment were also performed. A multimodal transformer-based fusion approach achieved the best performance. Finally, the proposed Chinese Multimodal Depression Corpus (CMDC) was made publicly available after de-identification and annotation. Hopefully, the release of this corpus will promote the research progress and practical applications of automatic depression screening.
BACKGROUND Depression is a complex disorder that cannot be fully screened by textual features alone, as audio features capture additional psychomotor and affective changes. This study integrates textual and audio features for depression screening and compares the performance of various machine learning models. METHODS This study used a large-scale, multimodal psychology dataset of 1275 participants (707 males, 568 females; aged 12-16 years) that integrates PHQ-9 scores, 18,834 textual interview responses, and mel-spectrograms derived from audio recordings. Textual features were calculated using suicide risk scores from the Chinese Suicide Dictionary (CSD), emotional polarity probabilities, and depression severity probabilities generated by large language models (LLMs). For audio data, we estimated the emotional composition as the frequency (ratio) of eight emotions, using a fine-tuned U-Net model with mel-spectrograms, mel-frequency cepstral coefficients (MFCCs), and chroma features. Finally, these features were combined and evaluated with five machine learning models using eight metrics to identify the best-performing model. RESULTS Among the five machine learning methods, multimodal fusion outperformed unimodal approaches (text-only and audio-only) with the lowest MAE and RMSE. The RFR model showed the best performance for depression prediction (Accuracy = 0.98 and Precision = 0.98) in combination with prompt3 from the LLMs. The most important features for depression prediction were depression severity, negative and positive emotional polarity, and suicide risk from textual features, and emotional features (happy, angry, neutral, and surprise) from audio features. CONCLUSIONS Combining audio and textual features improved depression screening accuracy. Future research could include facial expressions and physiological indicators to further enhance screening performance.
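A minimal sketch of the fusion-and-regression step, assuming the text-derived scores (suicide risk, polarity, LLM-estimated severity) and audio emotion ratios have already been extracted into a feature table; the file name, column names, and regressor settings are illustrative.

```python
# Sketch: concatenate text-derived and audio-derived features and fit a
# random-forest regressor to predict PHQ-9, reporting MAE and RMSE.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error

text_cols = ["suicide_risk", "neg_polarity", "pos_polarity", "llm_severity"]
audio_cols = ["happy", "sad", "angry", "neutral", "fear", "disgust", "surprise", "calm"]

df = pd.read_csv("fused_features.csv")            # assumed pre-extracted feature table
X = df[text_cols + audio_cols].values
y = df["phq9"].values

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
rfr = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
pred = rfr.predict(X_te)
print("MAE:", mean_absolute_error(y_te, pred),
      "RMSE:", np.sqrt(mean_squared_error(y_te, pred)))
```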
Mental health in modern society is a subject seldom given importance. Dealing with issues such as stress, depression, and anxiety can be difficult unless help is sought and provided. The Emotion-Aware Chatbot is a project we built with the goal of providing such help. The chatbot applies concepts such as facial recognition, emotion detection, text-based interaction with the user, and natural language processing to provide a comfortable, safe experience. We aim for the chatbot to be scalable, deployable, and compatible with any environment to encourage mental wellness and support.
Depression is a severe mental health disorder, and its early detection and diagnosis are of great importance for effective treatment and patient recovery. This study proposes a Multi-Dimensional Adaptive Attention Network (MDAM-Net), which integrates audio, video, and personalized features to achieve accurate recognition of depression states. MDAM-Net incorporates three advanced techniques— Modality-Specific Enhanced Dynamic Emotion Experts (EMOE), Token-Channel Compounded Cross Attention (TACO), and Density Adaptive Attention Mechanism (DAAM)—to construct a comprehensive multimodal learning framework. Five-fold cross-validation experiments conducted on the MPDD dataset, which contains 330 samples, demonstrate that MDAM-Net significantly outperforms baseline methods across all four evaluation tasks. The model achieves an overall score of 64%, exceeding the official baseline by more than 10%. Furthermore, ablation studies validate the effectiveness of each component, providing a novel technological pathway for the automatic detection of depression. These findings highlight the strong generalization and robustness of MDAM-Net, suggesting its potential application in real-world clinical and psychological screening scenarios for depression detection.
Depression remains one of the most prevalent yet underdiagnosed mental health conditions worldwide. Traditional diagnostic tools often rely on subjective evaluations, which limit timely and scalable intervention. Recent advances in affective computing have enabled the use of multimodal data, such as audio, visual, and text, for automated depression detection. This study proposes a cross-modal deep learning architecture that leverages attention-based fusion to integrate heterogeneous behavioral signals from limited DAIC-WOZ data. The model processes each modality through dedicated subnetworks and employs a multi-head cross-attention mechanism to learn inter-modality dependencies before final classification. Unlike prior approaches such as ACMA and FPTFormer, our model emphasizes lightweight deployment by reducing architectural complexity while maintaining competitive accuracy. Despite being trained on only 97 participants due to storage constraints, the system achieves 80% accuracy and a macro-averaged F1-score of 0.78, demonstrating strong performance under data limitations. These findings highlight the feasibility of scalable, interpretable, and efficient AI frameworks for mental health screening in resource-constrained environments.
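A minimal sketch of multi-head cross-modal attention in PyTorch, where one modality's token sequence queries another's before pooling and classification; the dimensions, layer choices, and pooling are illustrative, not the paper's architecture.

```python
# Sketch: text tokens attend over audio frames via multi-head cross-attention,
# followed by a residual connection, mean pooling, and a linear classifier.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, d_model=256, n_heads=4, n_classes=2):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, text_seq, audio_seq):
        # queries = text tokens, keys/values = audio frames
        fused, _ = self.attn(query=text_seq, key=audio_seq, value=audio_seq)
        fused = self.norm(fused + text_seq)      # residual connection
        return self.head(fused.mean(dim=1))      # mean-pool, then classify

# Example: batch of 8, 40 text tokens, 120 audio frames, hidden size 256
logits = CrossModalFusion()(torch.randn(8, 40, 256), torch.randn(8, 120, 256))
```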
This study investigates linguistic patterns in elementary students’ chatbot conversations from a neurodevelopmental perspective. A total of 123 students interacted with an AI chatbot for two weeks. Using a dual-method strategy combining Latent Dirichlet Allocation and qualitative analysis, the study identified an exploratory three-stage behavioral spectrum corresponding to depression severity: (1) Situational Complaint & Playful Distraction (Mild), (2) Relational Ambivalence & Narrative Effort (Moderate), and (3) Structural Disintegration & Hostile Projection (Severe). As severity increased, patterns shifted from playful interaction to fragmented hostility, consistent with deficits in cognitive control and social pain processing. Although the study did not include direct physiological measurements, these linguistic behaviors can be interpreted within a neurodevelopmental perspective as conceptually relevant patterns. The findings suggest that chatbot dialogue serves as a meaningful process-based marker for tracking emotional shifts often missed by traditional screening, highlighting its potential for early risk detection in schools. Given the small moderate and severe subgroups, these findings remain exploratory but provide promising markers for future validation.
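As a rough illustration of the topic-modelling half of the dual-method strategy, the sketch below runs Latent Dirichlet Allocation over de-identified chatbot utterances with scikit-learn; the example corpus, vocabulary settings, and topic count are illustrative assumptions.

```python
# Sketch: extract topics from chatbot utterances with LDA and print top terms.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

utterances = [
    "i don't want to go to school tomorrow",
    "my friend ignored me again today",
    "nobody ever listens to me anyway",
]
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(utterances)

lda = LatentDirichletAllocation(n_components=3, random_state=0).fit(X)
terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [terms[i] for i in topic.argsort()[-5:][::-1]]
    print(f"topic {k}:", ", ".join(top))
```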
Mental illness, particularly depression and anxiety, is a leading cause of global disease burden. Underdiagnosis is common due to misperceptions and negative stigma around mental health, limited resources, and self-reporting bias. Newer multimodal deep learning (MDL) frameworks have demonstrated the ability to distill behavioral, linguistic, and physiological signals pertaining to mental health from a number of data streams. However, uptake in clinical practice has been limited, partly due to a lack of transparency in how the models reach their conclusions. This study proposes a multimodal deep learning framework for the automatic early detection of anxiety and depression from text, audio, and video signals, with a special focus on Explainable AI (XAI). Evaluated on the benchmark datasets DAIC-WOZ, E-DAIC, and eRisk, the model outperformed unimodal baselines and delivered interpretable, clinically meaningful results. The research shows that combining explainable artificial intelligence with MDL frameworks can create a more reliable and transparent AI-based screening tool for mental health problems.
In the modern era, a massive share of the world's population is rapidly falling prey to mental illness and psychological disorders alongside other physical diseases. Among these, depressive disorder is becoming a major and serious psychological problem. Due to a lack of initial treatment options and resources for depression detection, millions of people struggle with mental illness. It is a primary cause of anxiety disorders, bipolar disorders, sleep disorders, and depression, and can occasionally result in suicide and self-harm. As a result, recognizing and promptly treating people experiencing mental illness is an extremely difficult but critically important endeavour. The purpose of this research is to propose a deep learning-based model that primarily incorporates audio and textual aspects of patient answers to diagnose depression using the Extended Distress Analysis Corpus Wizard of Oz (EDAIC-WOZ) dataset. The aim of this work is to help psychologists and people with depressive symptoms through accurate, AI-based detection so that quick and timely remedial steps can be taken. The proposed model classifies patients into two groups based on depression level: those with a depressive disorder and those without, using speech spectral and temporal features together with textual emotional analysis. The model uses a late-fusion ensemble applied to the outputs of an audio CNN model and a textual Bi-LSTM model. By using optimized methods for extracting the textual and audio features, this work ensures that relevant characteristics of audio and text are considered. The results derived from the dataset demonstrate the effectiveness of the model in complementing conventional diagnostic methods for depression screening. The proposed method is an efficient and scalable solution for monitoring mental health based on depression level. Results reveal that the proposed model outperforms other models with an F1 score of 0.98.
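A minimal sketch of the late-fusion ensemble step described above: class probabilities from an already-trained audio CNN and a textual Bi-LSTM are combined by (optionally weighted) averaging; the weighting and class ordering are assumptions.

```python
# Sketch: late fusion by weighted averaging of per-class probabilities
# produced by two independently trained models.
import numpy as np

def late_fusion(p_audio, p_text, w_audio=0.5):
    """p_audio, p_text: arrays of shape (n_samples, 2) with class probabilities."""
    p = w_audio * np.asarray(p_audio) + (1.0 - w_audio) * np.asarray(p_text)
    return p.argmax(axis=1)   # assumed ordering: 0 = no depressive disorder, 1 = depressive disorder

preds = late_fusion([[0.3, 0.7]], [[0.4, 0.6]])   # -> array([1])
```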
We introduce a multilingual chatbot tailored for remote data collection, focusing on mental health issues such as depression. It gathers both self-reported information and voice recordings. Implemented as a Telegram bot, the system enables participants to complete psychological assessments, including the 8-item Patient Health Questionnaire (PHQ-8) and open-ended questions, entirely through a conversational interface. The chatbot supports multiple languages, making it well-suited for cross-cultural research. Users respond to both close-ended and open-ended questions, with voice messages serving as the required mode of response for the latter. These recordings are stored for future multimodal analysis. To strengthen privacy guarantees, the chatbot operates in conjunction with a self-hosted Telegram Bot API server, ensuring that all user data remains within researcher-controlled infrastructure. The system emphasizes accessibility, privacy, and extensibility, featuring built-in mechanisms for consent withdrawal and data deletion, as well as future support for additional modalities such as video responses and alternative psychometric instruments. This demonstration highlights the core functionality of the chatbot and its potential as a lightweight, scalable tool for affective computing and digital mental health research. The code is available at: https://github.com/danila-mamontov/mentalhealth-chatbot.
Depression is one of the most common civilizational diseases. In this paper, we propose a new approach for detecting depression through the analysis of social media content using face analysis, emotion recognition neural networks, and speech processing. We utilized audio-visual analysis and acquired more than 605 features in the time domain, which are fed to machine learning and deep learning models for depression classification. Our approach outperforms other state-of-the-art models, achieving an F1-score of 0.77. The results have the potential to provide valuable insights for mental health professionals, offer early detection and intervention, and serve as a resource for individuals seeking help with their mental health. This study enables real-time analysis, represents a significant advancement at the intersection of mental health and technology, and has the potential to impact society. Clinical relevance: The system aims to provide a fast and accurate way to detect depression in individuals through online recordings. The use of multimodal information (e.g., audio, image) enhances the performance of the non-verbal behavioral analysis. The end-to-end system reduces the need for manual analysis by mental health professionals and increases the efficiency of depression screening. The system can potentially help identify individuals who are at risk for depression, enabling early intervention and treatment. The results from the system can complement traditional assessments and support mental health professionals in making a diagnosis. The system can be used for real-time processing, e.g., during online calls, and provides objective measurements summarizing overall behavior based on computer vision and audio analysis.
Mental health issues such as stress, anxiety, and depression have become increasingly prevalent in today's fast-paced world. Early detection and timely intervention can significantly improve mental well-being. This project presents Manasvita, an AI-powered multimodal mental wellness platform integrating advanced machine learning techniques and AI-driven solutions to assist users in understanding and managing their mental health. The system incorporates user authentication, enabling secure access to personalized assessments and services. A doctor appointment module allows users to schedule consultations with mental health professionals. The platform utilizes Convolutional Neural Networks (CNNs) to analyze facial expressions and predict mental health conditions based on the FER2013 dataset. Additionally, a Random Forest classifier assesses stress levels using a structured dataset. An AI-powered chatbot, leveraging the Gemini AI API, provides users with immediate mental health-related support and guidance. To encourage positive behavioral changes, the platform includes a Task and Reward system, where doctors assign therapeutic tasks and users earn incentives such as discounts or coupons upon completion. By integrating machine learning, artificial intelligence, and user engagement strategies, this project aims to provide an accessible, technology-driven solution for mental health monitoring and support. The proposed system enhances self-awareness, promotes timely intervention, and bridges the gap between users and professional healthcare services.
This study demonstrates that a single video question can predict self-reported depression (PHQ-9), anxiety (GAD-7), and trauma (PCL-5) through text and voice analysis. As mental health screening needs increase, efficient multi-condition assessment methods could reduce patient burden in clinical settings. Our multimodal model, integrating MPNet for textual analysis and HuBERT for voice prosody, was trained on data from 2420 participants. Our approach achieves a 64.6% reduction in assessment time (78.4 s vs. 221.7 s) while screening all three conditions from one response, with only 1.4% of participants unwilling to use video-based screening. Results demonstrate strong performance and demographic consistency across age, gender, and race/ethnicity, supporting the feasibility of efficient multi-condition screening from brief video responses.
Mental health disorders affect nearly 25% of the global population, with depression alone accounting for 7.5% of all years lived with disability. MIND-SPHERE addresses these difficulties by providing tailored mental health assessments using an interactive avatar powered by AI-driven technology. Using advanced natural language processing models, such as RoBERTa for sentiment analysis, and combining multimodal input (text, audio, visual), the system achieves a sentiment identification accuracy of 91.2%, outperforming standard methods by about 15%. Federated learning ensures secure processing of decentralized data, mitigating privacy risks. MIND-SPHERE demonstrates 92.8% predictive accuracy in mental health diagnoses while reducing patient screening times by 30%, alleviating the strain on healthcare systems already facing a shortage of mental health professionals. With an intervention success rate of 87.5% and a system response time of <150, MIND-SPHERE significantly enhances user engagement, achieving rates as high as 91.4% compared to 78% in existing systems. As AI's projected market size in mental health surpasses $11 billion by 2026, MIND-SPHERE positions itself at the forefront of scalable, accessible mental health solutions. By merging psychology and AI, MIND-SPHERE aims to revolutionize global mental health care through advanced analytics, robust data security, and exceptional performance metrics.
BACKGROUND Early detection of depression is crucial for implementing interventions. Deep learning-based computer vision (CV), semantic, and acoustic analysis have enabled the automated analysis of visual and auditory signals. OBJECTIVE We proposed an automated depression detection model based on artificial intelligence (AI) that combined visual, audio, and text clues. Moreover, we validated the model's performance in multiple scenarios, including interviews with a chatbot. METHODS A chatbot for depressive symptom inquiry powered by GPT-2.0 was developed. The Brief Affective Interview Task was designed as a supplement. Audio-video and textual clues were captured during the interviews, and features of different modalities were fused using a network with a multi-head cross-attention mechanism. To validate the model's generalizability, we conducted external validation using an independent dataset. RESULTS (1) In the internal validation set (152 depression patients and 118 healthy controls, HCs), the multimodal model achieved good predictive power for depression in all scenarios, with an area under the curve (AUC) over 0.950 and an accuracy over 0.930. Under the symptomatic interview-by-chatbot scenario, the model achieved exceptional performance, with an AUC of 0.999. Specificity decreased slightly (0.883) in the Brief Affective Interview Task. The multimodal model outperformed unimodal and bimodal counterparts. (2) For external validation under the symptomatic interview-by-chatbot scenario, a geographically distinct dataset (55 depression patients and 45 HCs) was employed. The multimodal fusion model achieved an AUC of 0.978, though all modality combinations exhibited reduced performance compared to internal validation. LIMITATIONS Longitudinal follow-up was not conducted in this study, and applicability to severe depression requires further study.
No abstract available
Constrained by the cost and ethical concerns of involving real seekers in AI-driven mental health research, researchers develop LLM-based conversational agents (CAs) with tailored configurations, such as profiles, symptoms, and scenarios, to simulate seekers. While these efforts advance AI in mental health, achieving more realistic seeker simulation remains hindered by two key challenges: dynamic evolution and multi-session memory. Seekers' mental states often fluctuate during counseling, which typically spans multiple sessions. To address this, we propose AnnaAgent, an emotional and cognitive dynamic agent system equipped with tertiary memory. AnnaAgent incorporates an emotion modulator and a complaint elicitor trained on real counseling dialogues, enabling dynamic control of the simulator's configurations. Additionally, its tertiary memory mechanism effectively integrates short-term and long-term memory across sessions. Evaluation results, both automated and manual, demonstrate that AnnaAgent achieves more realistic seeker simulation in psychological counseling compared to existing baselines. The ethically reviewed and screened code can be found on https://github.com/sci-m-wang/AnnaAgent.
Developing specialized dialogue systems for mental health support requires multi-turn conversation data, which has recently garnered increasing attention. However, gathering and releasing large-scale, real-life multi-turn conversations that could facilitate advancements in mental health support presents challenges in data privacy protection and the time and cost involved in crowdsourcing. To address these challenges, we introduce SMILE, a single-turn to multi-turn inclusive language expansion technique that prompts ChatGPT to rewrite public single-turn dialogues into multi-turn ones. Our work begins by analyzing the language transformation and validating the feasibility of the proposed method. We then conduct a study on dialogue diversity, including lexical features, semantic features, and dialogue topics, demonstrating the effectiveness of our method. Further, we employ our method to generate a large-scale, lifelike, and diverse dialogue dataset named SMILECHAT, consisting of 55k dialogues. Finally, we utilize the collected corpus to develop a mental health chatbot, MeChat. To better assess the quality of SMILECHAT, we collect a small-scale, real-life counseling dataset processed with data anonymization. Both automatic and human evaluations demonstrate significant improvements in our dialogue system and confirm that SMILECHAT is high-quality. Code, data, and model are publicly available at https://github.com/qiuhuachuan/smile.
Current outcome measures in digital mental health lack granularity, especially for single‐session interventions. This study aimed to address this by utilising natural language processing (NLP) methods to create a clear and relevant outcome measure. This paper describes the development of the Adult Session Wants and Needs Outcome Measure (Adult SWAN‐OM), a novel outcome measure for the Qwell digital mental healthcare platform to understand the needs of service users (SUs) engaging in single‐session therapy (SST). The research employs a multi‐phased approach combining NLP methods with the typical stages of outcome measure development as follows: (1) assumption definition and validation with SUs and clinicians; (2) transcript theme extraction using the RoBERTa large language model (LLM) in conjunction with topic modelling to extract themes from 254 single‐session transcripts from 192 SUs; (3) a clinical item refinement focus group; (4) content validity work with clinicians and SUs to improve the relevance and clarity of the items; and (5) outcome measure finalisation in a workshop held with clinicians to consolidate the final wording. Ninety‐six potential wants and needs were generated and distilled into 12 measure items. The outcome measure was shown to be relevant and clear to both SUs and clinicians when used in the context of SST. This study highlights the potential of combining NLP approaches with co‐creation methods in single‐session outcome measure development. We argue that the incorporation of clinical expertise and SU experience ensures the clarity and applicability of such measures and that this approach to capturing single‐session wants and needs promises novel insights for supporting digital mental health interventions.
BACKGROUND Increasing an individual's ability to focus on concrete, specific detail, thus reducing the tendency toward overly broad, decontextualised generalisations about the self and world, is a target within cognitive behavioural therapy (CBT). However, empirical investigation of the impact of within-treatment specificity on treatment outcomes is scarce. We evaluated whether the specificity of patient dialogue predicted a) end-of-treatment symptoms and b) session completion for CBT for common mental health issues. METHODS This preregistered (https://osf.io/agr4t) study trained a deep learning model to score the specificity of patient dialogue in transcripts from 353,614 internet-enabled CBT sessions for common mental health disorders, delivered on behalf of UK NHS services. Data were obtained from 65,030 participants (n = 47,308 female, n = 241 unstated) aged 18-94 years (M = 34.69, SD = 12.35). Depressive disorders were the most common (39.1%) primary diagnosis. The primary outcome was end-of-treatment score on the Patient Health Questionnaire-9 (PHQ-9). The secondary outcome was the number of sessions attended. RESULTS Linear mixed-effects models demonstrated that increased patient specificity significantly predicted lower post-treatment symptoms on the PHQ-9, although the size and direction of the effect varied depending on the type of therapeutic activity being completed. Effect sizes were consistently small. Higher patient specificity was associated with completing a greater number of sessions. LIMITATIONS We are unable to infer causation from our data. CONCLUSIONS Although effect sizes were small, an effect of specificity was observed across common mental health disorders. Further studies are needed to explore whether encouraging patient specificity during CBT may provide an enhancement of treatment attendance and treatment effects.
Mental health assessment is crucial for early intervention and effective treatment, yet traditional clinician-based approaches are limited by the shortage of qualified professionals. Recent advances in artificial intelligence have sparked growing interest in automated psychological assessment, yet most existing approaches are constrained by their reliance on static text analysis, limiting their ability to capture deeper and more informative insights that emerge through dynamic interaction and iterative questioning. Therefore, in this paper, we propose a multi-agent framework for mental health evaluation that simulates clinical doctor-patient dialogues, with specialized agents assigned to questioning, adequacy evaluation, scoring, and updating. In detail, we introduce an adaptive questioning mechanism in which an evaluation agent assesses the adequacy of user responses to determine the necessity of generating targeted follow-up queries to address ambiguity and missing information. Additionally, we employ a tree-structured memory in which the root node encodes the user's basic information, while child nodes (e.g., topic and statement) organize key information according to distinct symptom categories and interaction turns. This memory is dynamically updated throughout the interaction to reduce redundant questioning and enhance the information extraction and contextual tracking capabilities. Experimental results on the DAIC-WOZ dataset illustrate the effectiveness of our proposed method, which achieves better performance than existing approaches. Our code is released at https://github.com/MindIntLab-HFUT/AgentMental.
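An illustrative data-structure sketch of the tree-structured memory described above, with a root holding basic user information and topic/statement child nodes organized per symptom category and interaction turn; all class and field names are assumptions, not the released implementation.

```python
# Sketch: a root node (user profile) with topic nodes per symptom category,
# each holding statement leaves recorded per interaction turn.
from dataclasses import dataclass, field

@dataclass
class StatementNode:
    turn: int
    text: str

@dataclass
class TopicNode:
    symptom: str                              # e.g. "sleep", "anhedonia"
    statements: list = field(default_factory=list)

@dataclass
class MemoryTree:
    user_profile: dict
    topics: dict = field(default_factory=dict)

    def update(self, symptom, turn, text):
        node = self.topics.setdefault(symptom, TopicNode(symptom))
        node.statements.append(StatementNode(turn, text))

    def summary(self, symptom):
        node = self.topics.get(symptom)
        return [s.text for s in node.statements] if node else []

mem = MemoryTree(user_profile={"age": 29})
mem.update("sleep", turn=3, text="I wake up at 4am most nights.")
```

Keeping per-symptom summaries in the tree lets the questioning agent check what has already been covered before generating a follow-up, which is the stated purpose of reducing redundant questioning.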
No abstract available
The rise of mental health problems around the world demands intelligent digital interventions that are also privacy-preserving. Traditional chatbots are often limited by rule-based logic and shallow sentiment mapping, failing to stay contextually sensitive in their tracking of emotions. EmotiCare is a multi-component AI agent for mental health support that combines real-time multi-label emotion classification, empathetic dialogue generation, and long-term mood tracking with LLMs. Important features include SQLite-based emotion logging, spontaneous dynamic coping responses, and emergency guardian alerts in severe cases. The system is designed to combine local inference with ethical design, achieving personalization, privacy, and scalability. Evaluation metrics, including precision, recall, F1-score, accuracy, Hamming loss, confusion matrix, and ROC-AUC, indicate more than 95% accuracy on a 27-emotion classification task with only minimal misclassification. The system handles ambiguous input, builds user trust, and exhibits highly human-like empathy. Comparisons confirm that the approach outperforms existing models in creating a sustainable and proactive paradigm for mental health intervention.
Dialogue safety remains a pervasive challenge in open-domain human-machine interaction. Existing approaches propose distinctive dialogue safety taxonomies and datasets for detecting explicitly harmful responses. However, these taxonomies may not be suitable for analyzing response safety in mental health support. In real-world interactions, a model response deemed acceptable in casual conversations might have a negligible positive impact on users seeking mental health support. To address these limitations, this paper aims to develop a theoretically and factually grounded taxonomy that prioritizes the positive impact on help-seekers. Additionally, we create a benchmark corpus with fine-grained labels for each dialogue session to facilitate further research. We analyze the dataset using popular language models, including BERT-base, RoBERTa-large, and ChatGPT, to detect and understand unsafe responses within the context of mental health support. Our study reveals that ChatGPT struggles to detect safety categories with detailed safety definitions in a zero- and few-shot paradigm, whereas the fine-tuned model proves to be more suitable. The developed dataset and findings serve as valuable benchmarks for advancing research on dialogue safety in mental health support, with significant implications for improving the design and deployment of conversation agents in real-world applications. We release our code and data here: https://github.com/qiuhuachuan/DialogueSafety.
The rise of LLM-driven AI characters raises safety concerns, particularly for vulnerable human users with psychological disorders. To address these risks, we propose EmoAgent, a multi-agent AI framework designed to evaluate and mitigate mental health hazards in human-AI interactions. EmoAgent comprises two components: EmoEval simulates virtual users, including those portraying mentally vulnerable individuals, to assess mental health changes before and after interactions with AI characters. It uses clinically validated psychological and psychiatric assessment tools (PHQ-9, PDI, PANSS) to evaluate mental risks induced by LLMs. EmoGuard serves as an intermediary, monitoring users' mental status, predicting potential harm, and providing corrective feedback to mitigate risks. Experiments conducted in popular character-based chatbots show that emotionally engaging dialogues can lead to psychological deterioration in vulnerable users, with mental state deterioration in more than 34.4% of the simulations. EmoGuard significantly reduces these deterioration rates, underscoring its role in ensuring safer AI-human interactions. Our code is available at: https://github.com/1akaman/EmoAgent
Background Comprehensive session summaries enable effective continuity in mental health counseling, facilitating informed therapy planning. However, manual summarization presents a significant challenge, diverting experts’ attention from the core counseling process. Leveraging advances in automatic summarization to streamline the summarization process addresses this issue because this enables mental health professionals to access concise summaries of lengthy therapy sessions, thereby increasing their efficiency. However, existing approaches often overlook the nuanced intricacies inherent in counseling interactions. Objective This study evaluates the effectiveness of state-of-the-art large language models (LLMs) in selectively summarizing various components of therapy sessions through aspect-based summarization, aiming to benchmark their performance. Methods We first created Mental Health Counseling-Component–Guided Dialogue Summaries, a benchmarking data set that consists of 191 counseling sessions with summaries focused on 3 distinct counseling components (also known as counseling aspects). Next, we assessed the capabilities of 11 state-of-the-art LLMs in addressing the task of counseling-component–guided summarization. The generated summaries were evaluated quantitatively using standard summarization metrics and verified qualitatively by mental health professionals. Results Our findings demonstrated the superior performance of task-specific LLMs such as MentalLlama, Mistral, and MentalBART evaluated using standard quantitative metrics such as Recall-Oriented Understudy for Gisting Evaluation (ROUGE)-1, ROUGE-2, ROUGE-L, and Bidirectional Encoder Representations from Transformers Score across all aspects of the counseling components. Furthermore, expert evaluation revealed that Mistral superseded both MentalLlama and MentalBART across 6 parameters: affective attitude, burden, ethicality, coherence, opportunity costs, and perceived effectiveness. However, these models exhibit a common weakness in terms of room for improvement in the opportunity costs and perceived effectiveness metrics. Conclusions While LLMs fine-tuned specifically on mental health domain data display better performance based on automatic evaluation scores, expert assessments indicate that these models are not yet reliable for clinical application. Further refinement and validation are necessary before their implementation in practice.
No abstract available
Linguistic expressions of emotions such as depression, anxiety, and trauma-related states are pervasive in clinical notes, counseling dialogues, and online mental health communities, and accurate recognition of these emotions is essential for clinical triage, risk assessment, and timely intervention. Although large language models (LLMs) have demonstrated strong generalization ability in emotion analysis tasks, their diagnostic reliability in high-stakes, context-intensive medical settings remains highly sensitive to prompt design. Moreover, existing methods face two key challenges: emotional comorbidity, in which multiple intertwined emotional states complicate prediction, and inefficient exploration of clinically relevant cues. To address these challenges, we propose APOLO (Automated Prompt Optimization for Linguistic Emotion Diagnosis), a framework that systematically explores a broader and finer-grained prompt space to improve diagnostic efficiency and robustness. APOLO formulates instruction refinement as a Partially Observable Markov Decision Process and adopts a multi-agent collaboration mechanism involving Planner, Teacher, Critic, Student, and Target roles. Within this closed-loop framework, the Planner defines an optimization trajectory, while the Teacher-Critic-Student agents iteratively refine prompts to enhance reasoning stability and effectiveness, and the Target agent determines whether to continue optimization based on performance evaluation. Experimental results show that APOLO consistently improves diagnostic accuracy and robustness across domain-specific and stratified benchmarks, demonstrating a scalable and generalizable paradigm for trustworthy LLM applications in mental healthcare.
Empathy and emotion prediction are key components in the development of effective and empathetic agents, amongst several other applications. The WASSA shared task on empathy and emotion prediction in interactions presents an opportunity to benchmark approaches to these tasks. Appropriately selecting and representing the historical context is crucial in the modelling of empathy and emotion in conversations. In our submissions, we model the empathy, emotion polarity, and emotion intensity of each utterance in a conversation by feeding the utterance to be classified, together with its conversational context (i.e., a certain number of previous conversational turns), as input to an encoder Pre-trained Language Model (PLM), to which we append a regression head for prediction. We also model the perceived counterparty empathy of each interlocutor by feeding in all utterances from the conversation and a token identifying the interlocutor for which we are predicting the empathy. Our system officially ranked 1st in the CONV-turn track and 2nd in the CONV-dialog track.
No abstract available
In rural areas of India that lack proper support, the limited availability of mental health services, connectivity problems, and societal stigma greatly slow down mental health diagnosis and intervention for psychological conditions such as stress, anxiety, and burnout. We propose an integrated AI-powered system consisting of risk prediction through machine learning along with a culturally adapted, language-independent conversational bot that supports English and Tamil. We have developed a large-scale, localized dataset supplemented by audio that allows the system to serve as a personal and well-suited mental health guide for users in low-resource settings. The system uses Random Forest classifiers with a maximum accuracy of 98.2% along with LLaMA (Large Language Model Meta AI) for empathetic interactions. The system intends to address early-stage mental health issues in remote areas, break down stigma, and motivate people to seek the help they need. Future steps include expanding the system with wearable data and real-time adaptive feedback, as well as pilot deployment and clinical validation, to further deepen the reach of scalable mental health support in rural India.
In high-stress humanitarian and mental health contexts, timely access to accurate, empathetic, and actionable information remains critically limited, especially for at-risk and underserved populations. This work introduces LLooMi, an open-source, retrieval-augmented generation (RAG) conversational agent designed to deliver trustworthy, emotionally attuned, and context-aware support across domains such as mental health crises, housing insecurity, medical emergencies, immigration, and food access. Leveraging large language models (LLMs) with structured prompting, LLooMi reformulates user queries, which are often implicit, emotionally charged, or vague, into actionable intents. It then retrieves and grounds responses in a curated, domain-specific knowledge base, without storing personal user data, aligning with privacy-preserving and ethical AI design principles. LLooMi adopts an intent-aware architecture that adapts its tone, content, and level of detail based on the user's inferred psychological state and informational goals. This enables delivering fast, directive responses in acute distress scenarios or longer, validation-oriented support when emotional reassurance is needed, emulating key facets of therapeutic communication. By integrating NLP-driven semantic retrieval, structured dialogue memory, and emotionally adaptive generation, LLooMi offers a novel approach to scalable, human-centered digital mental health interventions. Evaluation shows an average answer correctness (AC) of 92.4% and answer relevancy (AR) of 84.9%, with high scores in readability, perceived trust, and ease of use. These results suggest LLooMi's potential as a complementary NLP-based tool for mental health support in digital psychiatry and crisis care.
The integration of conversational artificial intelligence (AI) into mental health care promises a new horizon for therapist-client interactions, aiming to closely emulate the depth and nuance of human conversations. Despite the potential, the current landscape of conversational AI is markedly limited by its reliance on single-modal data, constraining the systems’ ability to empathize and provide effective emotional support. This limitation stems from a paucity of resources that encapsulate the multimodal nature of human communication essential for therapeutic counseling. To address this gap, we introduce the Multimodal Emotional Support Conversation (MESC) dataset, a first-of-its-kind resource enriched with comprehensive annotations across text, audio, and video modalities. This dataset captures the intricate interplay of user emotions, system strategies, system emotions, and system responses, setting a new precedent in the field. Leveraging the MESC dataset, we propose a general Sequential Multimodal Emotional Support framework (SMES) grounded in Therapeutic Skills Theory. Tailored for multimodal dialogue systems, the SMES framework incorporates an LLM-based reasoning model that sequentially generates user emotion recognition, system strategy prediction, system emotion prediction, and response generation. Our rigorous evaluations demonstrate that this framework significantly enhances the capability of AI systems to mimic therapist behaviors with heightened empathy and strategic responsiveness. By integrating multimodal data in this innovative manner, we bridge the critical gap between emotion recognition and emotional support, marking a significant advancement in conversational AI for mental health support. This work not only pushes the boundaries of AI’s role in mental health care but also establishes a foundation for developing conversational agents that can provide more empathetic and effective emotional support.
Depression, a prevalent mental health condition, significantly impacts individuals' well-being and quality of life. Early and accurate detection is crucial for effective intervention, yet many individuals struggle to recognize their symptoms or seek timely help. To address this challenge, we developed ZENMIND, an AI-powered system that utilizes machine learning and Generative AI (GenAI) to assist in depression detection and personalized treatment recommendations. The system employs a questionnaire-based assessment, inspired by standardized tools such as the PHQ-9 and BDI, to systematically evaluate users' symptoms, emotional state, and behavioral patterns. By leveraging advanced machine learning algorithms, the platform analyzes quiz responses to provide a personalized depression risk assessment, encouraging individuals to seek professional support when needed. Beyond detection, the solution integrates GenAI to generate tailored wellness recommendations, including customized workout routines, dietary suggestions, and self-care strategies aligned with the user's mental health profile. Additionally, it features an AI-powered conversational agent, offering empathetic guidance, mental health education, and continuous emotional support. Users also gain access to a curated library of reliable information on depression, treatment options, and a directory of mental health organizations. By combining scientifically validated assessments with AI-driven insights, the system aims to bridge the gap between self-assessment and professional care, fostering early intervention, personalized well-being strategies, and improved mental health outcomes.
Using natural language processing (NLP), computer vision, and machine learning technologies, a comprehensive chatbot system named SilentCare was developed. The technical architecture of SilentCare integrates speech-to-text components, real-time sign language recognition through gesture-based CNNs, emotion analysis using an enhanced LSTM model, and multilingual text reception for optimal interaction with the user. Assistive technologies tend to emphasize supporting disabled and mute users through singular interfaces, often neglecting the comprehensive requirements of emotional and mental health care services for the deaf and mute, mental health support, and other specialized services. SilentCare utilizes multimodal representations including text, audio, and sign language, and customizes healthcare services using emotional sentiment analysis to fill the void left by other technologies [1], [14]. A new LSTM-based emotion recognition algorithm that offers real-time accuracy and response speed in low-resource environments has been proposed. Compared to traditional models, SilentCare performed better in emotional intent classification and user satisfaction evaluation, thus advancing the boundaries and scope of accessible assistive communication technologies.
NeuroReflect is a modular, offline-capable AI framework for early mental health screening using speech and text. It integrates (i) speech emotion recognition from acoustic features (MFCC, Chroma, Contrast) on RAVDESS, (ii) multilabel text emotion classification via TF-IDF and One-vs-Rest Logistic Regression on Reddit/eRisk data, and (iii) binary depression detection from voice prosody on DAIC-WOZ. We introduce emotion conflict scoring, detecting co-occurring opposite-valence emotions (e.g., joy + fear), as a lightweight proxy for linguistic disorganization. The system achieves 74.2% accuracy (ROC-AUC 0.75) in depression detection, a 71% Hamming score in text emotion labeling, and 62.4% in speech emotion recognition. All processing runs locally via a Python CLI, enabling privacy-preserving, real-time assessment in telehealth and self-monitoring scenarios.
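A minimal sketch of the two lightweight components named above, assuming librosa for the acoustic features (MFCC, chroma, spectral contrast) and scikit-learn for the TF-IDF One-vs-Rest text classifier; the file path, example texts, and label columns are illustrative.

```python
# Sketch: (a) acoustic feature extraction per clip, (b) multilabel text emotion
# classification with TF-IDF features and one-vs-rest logistic regression.
import numpy as np
import librosa
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

def acoustic_features(path):
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr).mean(axis=1)
    contrast = librosa.feature.spectral_contrast(y=y, sr=sr).mean(axis=1)
    return np.concatenate([mfcc, chroma, contrast])   # one fixed-length vector per clip

texts = ["I feel empty and tired", "excited but also terrified"]
labels = np.array([[1, 0], [0, 1]])                   # assumed columns: [sadness, fear]
vec = TfidfVectorizer()
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(vec.fit_transform(texts), labels)
print(clf.predict(vec.transform(["terrified of tomorrow"])))
```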
Mental health issues significantly affect society, requiring long-term treatment that is both costly and impacts personal relationships. This research paper proposes a speech-based method to screen for depression risk as an accessible alternative. Five machine learning models, Logistic Regression, Support Vector Machine (SVM), K-Nearest Neighbour (KNN), Naive Bayes, and Linear Discriminant Analysis (LDA), were used to analyze speech data from 25 participants with varying mental health conditions. Three data preparation techniques (Leave-One-Out Cross-Validation, SMOTE, and raw data) were applied to optimize the models. Results showed that SVM with SMOTE achieved the highest accuracy (98%), emphasizing the importance of effective data preparation. This research demonstrates the potential of using speech analysis as a cost-effective tool for early mental health screening and provides a foundation for practical applications in healthcare and beyond.
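A minimal sketch of the best-performing configuration reported above (SMOTE oversampling plus an SVM, evaluated with leave-one-out cross-validation), using imbalanced-learn so that oversampling happens only inside each training fold; the feature matrix, labels, and hyperparameters are placeholders.

```python
# Sketch: SMOTE + SVM inside an imblearn Pipeline, scored with LOOCV accuracy.
import numpy as np
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import LeaveOneOut, cross_val_score

X = np.random.rand(25, 40)          # placeholder: 25 participants x 40 speech features
y = np.array([0] * 15 + [1] * 10)   # placeholder labels (imbalanced classes)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("smote", SMOTE(k_neighbors=3, random_state=0)),   # oversampling applied per training fold
    ("svm", SVC(kernel="rbf", C=1.0)),
])
acc = cross_val_score(pipe, X, y, cv=LeaveOneOut()).mean()
print(f"LOOCV accuracy: {acc:.2f}")
```

Placing SMOTE inside the pipeline matters: oversampling before splitting would leak synthetic copies of test samples into training and inflate the reported accuracy.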
More than 280 million people worldwide suffer from mental health conditions like anxiety and depression. Despite increased awareness, early diagnosis remains challenging due to reliance on conventional methods like clinical interviews and self-reported questionnaires, which are often subjective and inaccessible. This study proposes a scalable, objective, and non-invasive AI-powered mental health assessment framework that utilizes Natural Language Processing (NLP) and speech signal analysis. The system combines transformer-based models like BERT for understanding text and LSTM-based models for analyzing speech patterns including pitch, jitter, shimmer, and MFCCs. Using datasets such as Reddit, therapy transcripts, and DAIC-WOZ, the system is trained to detect early signs of depression and anxiety. A multimodal fusion layer further integrates both modalities to enhance predictive accuracy. The final product is a real-time, mobile-compatible application that accepts voice or text inputs for screening and provides feedback, risk assessment, and recommendations. Ethical considerations including data privacy, explainability, and algorithmic fairness are addressed throughout. The proposed system is aimed at reducing the diagnostic gap, supporting early intervention, and making mental health tools more accessible.
The growing prevalence of mental disorders, coupled with stigma, cost, and limited clinician access, highlights the urgent need for scalable AI solutions in mental health. Key challenges include delayed detection, poor integration of structured and unstructured data, and the lack of personalized screening methods. To address these gaps, we propose a hybrid model that combines SentenceTransformer-based text embeddings with structured data from the Patient Health Questionnaire-8 (PHQ8) and Post-Traumatic Stress Disorder Checklist-Civilian Version (PCL-C). Using 218 preprocessed files from the Extended Distress Analysis Interview Corpus (E-DAIC), we trained and optimized an Extreme Gradient Boosting (XGBoost) classifier. The model achieved strong performance, with an F1 score of 0.90 and an ROC-AUC of 0.9752. Through feature fusion, interpretability, and threshold optimization, the framework generalizes well and provides clinically meaningful feedback. These findings suggest that integrating structured assessments with semantic dialogue analysis improves early detection and supports explainable AI in mental health.
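A minimal sketch of the hybrid feature construction described above: SentenceTransformer embeddings of interview transcripts concatenated with structured PHQ-8/PCL-C item scores and fed to an XGBoost classifier; the file layout, column names, and hyperparameters are assumptions.

```python
# Sketch: fuse text embeddings with structured questionnaire scores, then
# train an XGBoost classifier and report F1 and ROC-AUC on a held-out split.
import numpy as np
import pandas as pd
from sentence_transformers import SentenceTransformer
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, roc_auc_score

df = pd.read_csv("edaic_features.csv")       # assumed columns: transcript, PHQ8_*, PCL_*, label
emb = SentenceTransformer("all-MiniLM-L6-v2").encode(df["transcript"].tolist())
structured = df.filter(regex="^(PHQ8|PCL)_").values
X = np.hstack([emb, structured])
y = df["label"].values

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss").fit(X_tr, y_tr)
print("F1:", f1_score(y_te, clf.predict(X_te)),
      "AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```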
Mental health issues have become a significant global concern, affecting individuals' ability to perform daily tasks and increasing the risk of social prejudice. This paper presents a solution to address mental health challenges while reducing the burden on medical professionals. Our research includes a survey of 41 respondents to gauge public perception of mental health and their willingness to share their thoughts. We propose a web-based platform that provides mental health guidance and facilitates direct communication with psychologists. The platform integrates a speech-to-text model for audio transcription and a natural language processing (NLP) model to classify mental health conditions. Its architecture ensures secure data storage while enabling users to access essential resources without fear of discrimination.
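A hedged sketch of the transcription-then-classification flow is shown below, using an off-the-shelf local speech-to-text model and a generic text classifier as stand-ins for the platform's actual components.

```python
# Sketch only: audio is transcribed locally, then the transcript is classified.
# Both model choices are illustrative stand-ins, not the platform's components.
import whisper
from transformers import pipeline

stt = whisper.load_model("base")                     # local speech-to-text
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",  # generic placeholder
)

def screen_audio(path: str):
    transcript = stt.transcribe(path)["text"]
    label = classifier(transcript)[0]                # e.g. {'label': ..., 'score': ...}
    return transcript, label

# transcript, label = screen_audio("session_001.wav")  # hypothetical input file
```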
This work presents a web-based system that introduces an active-listening strategy into spoken dialogue for self-disclosure, supporting the mental health of campus users. To enhance usability and safety, the demo conducts bilingual (Mandarin/English) spoken dialogue augmented with reliable high-risk dialogue detection during speech interaction. In particular, a prompt-driven GPT classifier identifies utterances indicating self-harm or suicide intent and triggers safety alerts with help-center and counselor notification. We also integrate a TTS module for Taiwanese Mandarin and standard English, and redesign the user interface to automatically pop up alert messages when high-risk dialogue is detected. In addition, we collect bilingual speech data under diverse mental-health dialogue scenarios to enable system analysis, evaluation, and refinement. Overall, these extensions build a framework that promotes empathetic interaction, enables timely alerts in critical cases, and improves accessibility for diverse users.
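The prompt-driven risk classification loop could look roughly like the sketch below, in which an LLM labels each utterance for self-harm or suicide intent and a positive label triggers an alert hook; the prompt wording, model name, and notification logic are illustrative assumptions.

```python
# Illustrative prompt-driven risk classifier with a hypothetical alert hook.
from openai import OpenAI

client = OpenAI()

RISK_PROMPT = (
    "You are a safety classifier for a mental-health dialogue system. "
    "Label the following user utterance as HIGH_RISK if it expresses "
    "self-harm or suicide intent, otherwise LOW_RISK. Reply with one label."
)

def classify_risk(utterance: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",                          # assumed model choice
        messages=[
            {"role": "system", "content": RISK_PROMPT},
            {"role": "user", "content": utterance},
        ],
    )
    return resp.choices[0].message.content.strip()

def notify_counselor(utterance: str):
    # Hypothetical alert hook standing in for help-center / counselor notification.
    print("ALERT: high-risk utterance flagged for counselor review.")

def on_user_utterance(utterance: str):
    if classify_risk(utterance) == "HIGH_RISK":
        notify_counselor(utterance)
```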
No abstract available
Depression in older adults is common, frequently underdiagnosed, and often conflated with cognitive symptoms, making it a major challenge in settings such as assisted-living communities; the need for scalable, speech-based screening tools, however, extends across diverse populations and is not restricted to older adults or residential care. Depression also frequently co-occurs with mild cognitive impairment, creating a complex and vulnerable clinical landscape, yet scalable, interpretable, and easy-to-administer tools for early screening remain scarce. In this study, we introduce a transparent and lightweight AI-driven screening model that uses only four linguistic features extracted from brief conversational speech to detect depression with high sensitivity. Trained on the DAIC-WOZ dataset and optimized for deployment in resource-constrained settings, our model achieved moderate discriminative performance (AUC = 0.760) with a clinically calibrated sensitivity of 92%. Beyond raw accuracy, the model offers insights into how affective language, syntactic complexity, and latent semantic content relate to psychological states. Notably, one semantic feature derived from transformer embeddings, emb_1, appears to capture deeper emotional or cognitive tension not directly expressed through lexical negativity. Although the dataset does not contain explicit cognitive-status labels, these findings motivate future research to test whether similar semantic patterns overlap with linguistic indicators of cognitive-affective strain observed in prior work. Our approach outperforms many more complex models in the literature, yet remains simple enough for real-time, on-device use, marking a step forward in making mental health AI both interpretable and clinically actionable. The resulting framework is population-agnostic, with assisted-living environments representing one of several high-need deployment settings in which it can be validated.
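To illustrate how a transparent four-feature screener with a sensitivity-oriented threshold might be assembled, the sketch below fits a logistic model on four synthetic features and picks the largest decision threshold that keeps sensitivity at or above 92%; the feature names, data, and calibration rule are assumptions, not the published model.

```python
# Sketch of a four-feature screener with a sensitivity-calibrated threshold.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve

rng = np.random.default_rng(1)
# Four illustrative linguistic features: affect score, syntactic complexity,
# lexical negativity, and one semantic embedding dimension ("emb_1").
X = rng.normal(size=(200, 4))
y = (X @ np.array([1.2, -0.8, 0.9, 1.5]) + rng.normal(size=200) > 0).astype(int)

clf = LogisticRegression().fit(X, y)
scores = clf.predict_proba(X)[:, 1]

# Choose the largest threshold that still keeps sensitivity (recall) >= 0.92.
fpr, tpr, thresholds = roc_curve(y, scores)
ok = tpr >= 0.92
threshold = thresholds[ok][0]
print(f"calibrated threshold = {threshold:.3f}, sensitivity = {tpr[ok][0]:.2f}")
```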
Depression is a growing global concern, affecting millions of individuals and often going undiagnosed due to stigma and lack of access to mental health professionals. This project aims to develop an AI-powered system for early detection of depression using text and speech analysis. The model leverages Natural Language Processing (NLP) to analyze textual input for depressive language patterns and Machine Learning (ML) techniques to detect vocal features associated with depression from speech samples. By integrating sentiment analysis, emotion detection, and acoustic feature extraction, the system can assess linguistic and vocal cues indicative of depressive tendencies. The model will be trained on datasets containing both text-based conversations and speech recordings labeled for depression severity. The goal is to create an assistive tool that can help mental health professionals with early screening and intervention, ultimately improving mental health outcomes.
Affective computing, as a core technology for intelligent services, is rapidly transitioning from the laboratory to real-world, cloud-driven applications, particularly in critical areas such as mental health. Research utilizing large language models (LLMs) to study affective disorders has a long history, with numerous AI-based diagnostic approaches proposed. However, few studies have evaluated LLMs' ability to identify dangerous discourse in natural conversational settings. To address this gap, this paper conducts the first systematic evaluation of five cutting-edge affective LLMs (GPT-5, Gemini, DeepSeek, Kimi, and Doubao) in identifying depression- and anxiety-related dangerous statements within general affective contexts. Testing was performed using a meticulously curated real clinical dialogue dataset. We designed a comprehensive evaluation framework integrating human annotation, multi-scenario sample data, and multidimensional metrics to measure model performance and generalization capabilities. Experimental results reveal significant performance disparities among models, with accuracy ranging from 0.66 to 0.92. DeepSeek achieves optimal generalization with 0.92 accuracy and 0.98 recall, while Gemini exhibits notable limitations with the lowest recall of 0.59. Analysis of the results reveals that current LLM deployment research faces three core challenges: the persistent precision-recall trade-off dilemma; limited contextual understanding capabilities; and inadequate risk stratification beyond binary classification. Based on these findings, we propose two practical recommendations: clinicians should adopt human-supervised hybrid approaches; developers must enhance safety alignment mechanisms, improve contextual reasoning capabilities, and build explainable affective LLMs to achieve generalization across diverse scenarios.
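The evaluation framework boils down to comparing each model's labels against human annotations and reporting multiple metrics per model, as in the toy sketch below; the labels are synthetic placeholders rather than the study's clinical dialogue data.

```python
# Toy sketch of per-model evaluation against human-annotated risk labels.
from sklearn.metrics import accuracy_score, precision_score, recall_score

human = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]          # annotator gold labels (synthetic)
model_outputs = {
    "model_a": [1, 0, 1, 1, 0, 1, 1, 0, 1, 1],  # broad, recall-oriented behavior
    "model_b": [1, 0, 0, 1, 0, 0, 0, 0, 1, 0],  # conservative, precision-oriented
}

for name, preds in model_outputs.items():
    print(name,
          f"acc={accuracy_score(human, preds):.2f}",
          f"prec={precision_score(human, preds):.2f}",
          f"rec={recall_score(human, preds):.2f}")
```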
This paper presents the design, implementation, and evaluation of a multimodal intelligent conversational agent embodied within a WebGL-accessible 3D virtual environment. The system integrates voice and text-based interaction with a personalized avatar capable of delivering emotionally attuned and context-aware responses. Developed using Unity WebGL, Flask microservices, and containerized with Docker, the architecture incorporates SpeechRecognition for local speech-to-text (STT), and a fine-tuned DeepSeek-r1 large language model deployed via the Ollama framework. The agent's dialogue engine has been specialized through domain-specific transfer learning focused on mental health support, enabling empathetic and contextually appropriate responses. The fine-tuned model achieves an average response time that is only 23.5% of that of the original DeepSeek model, while also demonstrating superior semantic similarity performance. The personalized avatars are synchronized dynamically with the affective content of each interaction, enhancing user engagement. A usability study with 20 participants demonstrates high satisfaction, low-latency performance, and emotional resonance. This work contributes a modular framework for deploying browser-based embodied AI agents in sensitive domains such as digital mental health care.
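A minimal sketch of the service wiring is shown below: a Flask endpoint forwards each chat turn to a locally served model through Ollama's REST API. The model tag and endpoint layout are assumptions; the deployed system uses a fine-tuned DeepSeek-r1 variant behind additional avatar and TTS components.

```python
# Illustrative Flask microservice forwarding chat turns to a local Ollama model.
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
OLLAMA_URL = "http://localhost:11434/api/chat"       # default Ollama REST endpoint

@app.post("/chat")
def chat():
    user_text = request.get_json()["text"]
    resp = requests.post(OLLAMA_URL, json={
        "model": "deepseek-r1",                      # assumed local model tag
        "messages": [{"role": "user", "content": user_text}],
        "stream": False,
    })
    reply = resp.json()["message"]["content"]
    return jsonify({"reply": reply})

if __name__ == "__main__":
    app.run(port=5000)
```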
More than 264 million people suffer from depression and more than 301 million have anxiety disorders globally, as stated by the World Health Organization, indicating the urgency of early diagnosis. Conventional diagnosis typically requires specialized clinicians and lengthy testing, creating obstacles to rapid treatment. To address this, we propose a hierarchical Long Short-Term Memory (LSTM) architecture for AI-based binary screening of mental-illness risk based on pre-prepared interview responses. Subjects undergo a video interview whose questions are pre-prepared according to the DAIC-WOZ protocol and PHQ-8 clinical guidelines; their responses are transcribed and fed into the analysis framework. The design consists of a word-level LSTM for semantic encoding of speech and an utterance-level LSTM for contextual flow encoding at the session level. Experiments were conducted on two benchmark datasets, the DAIC-WOZ corpus for depression and the Simulated Face-to-Face Medical Consultation Corpus (SFMCM) for anxiety. Results indicate that the model attains 90.62% classification accuracy with an F1-Score of 0.91, demonstrating its ability to flag high-risk individuals. This work demonstrates the promise of LSTM-based models for large-scale text-based mental health screening. Future research will scale this up to a dual-pipeline diagnostic model that screens for depression and anxiety simultaneously, bringing AI-enabled mental health solutions a step closer to practical reality.
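The hierarchical design can be sketched in PyTorch as a word-level LSTM that encodes each utterance and an utterance-level LSTM that encodes the session, as below; vocabulary size, dimensions, and the toy input are illustrative assumptions.

```python
# Hierarchical LSTM sketch: word-level encoding per utterance, then
# utterance-level encoding per session, then binary risk classification.
import torch
import torch.nn as nn

class HierarchicalLSTM(nn.Module):
    def __init__(self, vocab_size=10000, emb=128, word_hidden=128, utt_hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb, padding_idx=0)
        self.word_lstm = nn.LSTM(emb, word_hidden, batch_first=True)
        self.utt_lstm = nn.LSTM(word_hidden, utt_hidden, batch_first=True)
        self.classifier = nn.Linear(utt_hidden, 2)

    def forward(self, sessions):
        # sessions: (batch, n_utterances, n_words) of token ids
        b, u, w = sessions.shape
        words = self.embed(sessions.view(b * u, w))      # encode every utterance
        _, (h_word, _) = self.word_lstm(words)
        utt_vecs = h_word[-1].view(b, u, -1)             # one vector per utterance
        _, (h_utt, _) = self.utt_lstm(utt_vecs)          # session-level context
        return self.classifier(h_utt[-1])

model = HierarchicalLSTM()
logits = model(torch.randint(1, 10000, (2, 12, 30)))     # 2 sessions, 12 utterances
print(logits.shape)  # torch.Size([2, 2])
```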
BACKGROUND Psychological distress, particularly symptoms of depression and anxiety (D&A), is highly prevalent among family caregivers of individuals living with cancer, who often assume central roles in care coordination, treatment adherence, symptom monitoring, and emotional support. Rates of distress among caregivers frequently equal or exceed those observed in patients themselves. Despite increased attention to caregivers' mental health needs, routine distress screening remains limited in oncology care settings. Advances in mobile health technology and artificial intelligence (AI) offer opportunities to address these needs by providing accessible and user-driven tools. The Ellipsis Caregiver Assessment Enhancement (eCARE; Ellipsis Health, Inc) is a speech-based, AI-enabled mobile app designed to screen and monitor symptoms of depression and anxiety. By collecting brief voice recordings and in-app survey data, eCARE offers a scalable approach for integrating caregiver distress monitoring into cancer care. OBJECTIVE This single-arm trial will evaluate the feasibility and acceptability of the eCARE app among family members who are the primary caregivers of patients diagnosed with cancer within the past 5 years. Specifically, the study aims to (1) determine feasibility based on platform completion rates, (2) assess acceptability using validated measures, and (3) identify barriers and facilitators influencing the uptake and sustained use of eCARE. METHODS In Phase 1, a total of 60 United States-based family caregivers will be recruited from community health clinics, cancer and caregiving advocacy groups, and online postings. Screened and enrolled caregivers will complete 6 eCARE sessions over an 8-week period. Pre- and posttest surveys assess depression, anxiety, caregiving burden, and relational processes. Feasibility will be evaluated based on the proportion of participants who complete at least 66% of weekly assessments, and acceptability will be assessed using the acceptability of intervention measure (AIM). In Phase 2, a total of 20 caregivers will be invited to participate in semi-structured online interviews to explore user experience, including perceived benefits, barriers to use, and preferences for future implementation. Qualitative data will be analyzed thematically to inform tool refinement. RESULTS The study has received Institutional Review Board approval from the University of Houston. Participant recruitment and enrollment began in June 2024, with data collection expected to conclude by August 2025. Data analysis will begin in December 2025, with preliminary results anticipated by May 2026. CONCLUSIONS This study will generate preliminary evidence on the feasibility, acceptability, and utility of a speech-based, AI-enabled smartphone tool for monitoring D&A symptoms among family cancer caregivers. Findings will inform the design of a larger, fully powered trial and guide future implementation of remote psychological distress monitoring strategies in oncology care. By offering a low-burden, caregiver-centered approach, eCARE has the potential to expand access to psychosocial support and facilitate timely identification of needs and coordination of services across cancer care settings. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) DERR1-10.2196/83276.
The global increase in mental-health conditions such as anxiety, depression, and stress highlights the need for accessible and timely psychological evaluation. Conventional evaluation remains limited due to clinician shortages and the stigma associated with seeking help. This work presents a multimodal Virtual Psychiatrist Interviewer designed to facilitate adaptive and scalable early-stage mental-health screening. The proposed framework integrates DistilBERT for linguistic interpretation, a convolutional audio-emotion model to analyze vocal cues, and V2Face-based facial-affect recognition for visual understanding. An attention-driven fusion mechanism combines text, acoustic, and facial embeddings to capture complementary behavioral signals and produce robust preliminary assessments. The system is trained and evaluated on a curated mental health text dataset, the RAVDESS emotional speech corpus, and publicly available facial expression datasets. Experimental results demonstrate competitive performance on anxiety, depression, and stress detection tasks, while ablation studies confirm the contribution of each modality. The findings indicate the potential of the proposed system for real-time AI-assisted mental-health support.
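An attention-driven fusion layer of the kind described can be sketched as follows: each modality embedding is projected into a shared space, scored by a small attention head, and combined as a weighted sum before classification. Embedding sizes and the head design are assumptions, not the paper's exact architecture.

```python
# Sketch of attention-weighted fusion over text, acoustic, and facial embeddings.
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, dims=(768, 256, 512), shared=256, n_classes=3):
        super().__init__()
        self.projections = nn.ModuleList([nn.Linear(d, shared) for d in dims])
        self.attn = nn.Linear(shared, 1)
        self.classifier = nn.Linear(shared, n_classes)

    def forward(self, text, audio, face):
        # Project each modality into the shared space: (batch, 3, shared)
        stacked = torch.stack(
            [p(x) for p, x in zip(self.projections, (text, audio, face))], dim=1)
        weights = torch.softmax(self.attn(torch.tanh(stacked)), dim=1)  # (batch, 3, 1)
        fused = (weights * stacked).sum(dim=1)                          # weighted sum
        return self.classifier(fused)

model = AttentionFusion()
out = model(torch.randn(4, 768), torch.randn(4, 256), torch.randn(4, 512))
print(out.shape)  # torch.Size([4, 3]) -> e.g. anxiety / depression / stress logits
```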
The final grouping comprehensively covers four dimensions of AI dialogue systems for psychological screening: technology, interaction, application, and ethics. Research trends show the leap from single-modality approaches to deep multimodal fusion, as well as how large language models are reshaping dialogue systems in terms of clinical reasoning and empathetic interaction. At the same time, through targeted development for specific populations and rigorous safety evaluation frameworks, mental health AI is gradually moving from laboratory research toward regulated clinical practice, aiming to close the vast global gap in mental health services through technological means.