Depression Detection from EEG Using Deep Learning-Based Granger Causality Methods
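Deep learning-driven algorithms for nonlinear Granger causality inference
This group of papers addresses, at the methodological level, the limitations of traditional linear Granger causality (GC). The methods combine neural networks (e.g., MLP, RNN, LSTM, KAN, Transformer) with regularization techniques (e.g., Jacobian regularization, sparsity penalties) and self-attention mechanisms to handle nonlinearity, long-range dependencies, and latent-variable confounding in high-dimensional EEG data, providing more accurate mathematical tools for effective connectivity analysis.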
- Frequency decomposition of conditional Granger causality and application to multivariate neural field potential data(Yonghong Chen, Steven L. Bressler, Mingzhou Ding, 2006, ArXiv Preprint)
- Perturbing a Neural Network to Infer Effective Connectivity: Evidence from Synthetic EEG Data(Peizhen Yang, Xinke Shen, Zongsheng Li, Zixiang Luo, Kexin Lou, Quanying Liu, 2023, ArXiv Preprint)
- Learning Granger Causality from Instance-wise Self-attentive Hawkes Processes(Dongxia Wu, Tsuyoshi Idé, Aurélie Lozano, Georgios Kollias, Jiří Navrátil, Naoki Abe, Yi-An Ma, Rose Yu, 2024, ArXiv Preprint)
- Estimating Granger Causality with Unobserved Confounders via Deep Latent-Variable Recurrent Neural Network(Yuan Meng, 2019, ArXiv Preprint)
- Sparse Causal Discovery in Multivariate Time Series(Stefan Haufe, Guido Nolte, Klaus-Robert Mueller, Nicole Kraemer, 2009, ArXiv Preprint)
- An Interpretable and Sparse Neural Network Model for Nonlinear Granger Causality Discovery(Alex Tank, Ian Cover, Nicholas J. Foti, Ali Shojaie, Emily B. Fox, 2017, ArXiv Preprint)
- Jacobian Granger Causal Neural Networks for Analysis of Stationary and Nonstationary Data(Suryadi, Yew-Soon Ong, Lock Yue Chew, 2022, ArXiv Preprint)
- Interpretable Models for Granger Causality Using Self-explaining Neural Networks(Ričards Marcinkevičs, Julia E. Vogt, 2021, ArXiv Preprint)
- GLACIAL: Granger and Learning-based Causality Analysis for Longitudinal Imaging Studies(Minh Nguyen, Gia H. Ngo, Mert R. Sabuncu, 2022, ArXiv Preprint)
- Kernel Granger causality and the analysis of dynamical networks(Daniele Marinazzo, Mario Pellicoro, Sebastiano Stramaglia, 2008, ArXiv Preprint)
- Economy Statistical Recurrent Units For Inferring Nonlinear Granger Causality(Saurabh Khanna, Vincent Y. F. Tan, 2019, ArXiv Preprint)
- Deep learning based doubly robust test for Granger causality(Yongchang Hui, Chijin Liu, Xiaojun Song, 2025, ArXiv Preprint)
- Granger Causality using Neural Networks(Malik Shahid Sultan, Samuel Horvath, Hernando Ombao, 2022, ArXiv Preprint)
- Neural Granger Causality(Alex Tank, Ian Covert, Nicholas Foti, Ali Shojaie, Emily Fox, 2018, ArXiv Preprint)
- Deep Recurrent Modelling of Granger Causality with Latent Confounding(Zexuan Yin, Paolo Barucca, 2022, ArXiv Preprint)
- Jacobian Regularizer-based Neural Granger Causality(Wanqi Zhou, Shuanghao Bai, Shujian Yu, Qibin Zhao, Badong Chen, 2024, ArXiv Preprint)
- Kolmogorov-Arnold Networks for Time Series Granger Causality Inference(Meiliang Liu, Yunfang Xu, Zijin Li, Zhengye Si, Xiaoxiao Yang, Xinyue Yang, Zhiwen Zhao, 2025, ArXiv Preprint)
- Variable-lag Granger Causality and Transfer Entropy for Time Series Analysis(Chainarong Amornbunchornvej, Elena Zheleva, Tanya Berger-Wolf, 2020, ArXiv Preprint)
- CAUSE: Learning Granger Causality from Event Sequences using Attribution Methods(Wei Zhang, Thomas Kobber Panum, Somesh Jha, Prasad Chalasani, David Page, 2020, ArXiv Preprint)
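For orientation, the traditional linear GC that the methods in this group extend can be sketched as a comparison of two autoregressive fits: one predicting the target from its own past, and one that also includes the past of the candidate source. The sketch below is a minimal illustration and is not taken from any of the listed papers; the function names `lagged` and `granger_index`, the model order, and the simulated signals are all assumptions made for the example.

```python
import numpy as np

def lagged(series, order):
    """Stack `order` past values of a 1-D series as columns (lag 1 first)."""
    n = len(series)
    return np.column_stack([series[order - k: n - k] for k in range(1, order + 1)])

def granger_index(source, target, order=2):
    """Linear Granger index source -> target: ln(var_restricted / var_full)."""
    y = target[order:]
    ones = np.ones((len(y), 1))
    restricted = np.hstack([ones, lagged(target, order)])   # target's own past only
    full = np.hstack([restricted, lagged(source, order)])   # plus the source's past

    def residual_variance(design):
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        return np.mean((y - design @ coef) ** 2)

    return float(np.log(residual_variance(restricted) / residual_variance(full)))

# Hypothetical toy data: x drives y with a one-sample delay, never the reverse.
rng = np.random.default_rng(0)
n = 2000
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.standard_normal()

gc_xy = granger_index(x, y)  # expected to be clearly positive
gc_yx = granger_index(y, x)  # expected to be close to zero
```

The neural variants listed above can be read as replacing the two least-squares fits with nonlinear predictors and reading the causal structure from input weights, Jacobians, or attention scores rather than from a variance ratio.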
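Automatic depression detection models combining effective connectivity with graph neural networks
These studies focus on engineering implementation: inferred EEG effective connectivity (EC) or functional connectivity is used as graph features and fed into deep learning architectures such as GCN, CapsNet, and Vision Transformer. Through end-to-end learning, the models automatically extract spatiotemporal features, with the aim of building high-accuracy systems for objective depression diagnosis.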
- AMGCN-L: an adaptive multi-time-window graph convolutional network with long-short-term memory for depression detection(Han-Guang Wang, Qing‐Hao Meng, Li-Cheng Jin, Hui-Rang Hou, 2023, Journal of Neural Engineering)
- Major depressive disorder diagnosis based on effective connectivity in EEG signals: a convolutional neural network and long short-term memory approach(Abdolkarim Saeedi, Maryam Saeedi, Arash Maghsoudi, Ahmad Shalbaf, 2020, Cognitive Neurodynamics)
- Major Depressive Disorder Detection Using Effective Connectivity of EEG Signals and Deep Learning Transformer Model(Nur Amira Ahmad Rezal, Norashikin Yahya, Farah Diana Azman, Muhammad Ali Ashraf Hanapi, Azrina Abd Aziz, Danish M. Khan, 2024, No journal)
- CI-GNN: A Granger Causality-Inspired Graph Neural Network for Interpretable Brain Network-Based Psychiatric Diagnosis(Kaizhong Zheng, Shujian Yu, Badong Chen, 2023, ArXiv Preprint)
- TSF-MDD: A Deep Learning Approach for Electroencephalography-Based Diagnosis of Major Depressive Disorder with Temporal–Spatial–Frequency Feature Fusion(Wei Gan, R. P. Zhao, Yujie Ma, Xiaolin Ning, 2025, Bioengineering)
- Attention-Based Convolutional Recurrent Deep Neural Networks for the Prediction of Response to Repetitive Transcranial Magnetic Stimulation for Major Depressive Disorder.(Mohsen Sadat Shahabi, Ahmad Shalbaf, Behrooz Nobakhsh, Reza Rostami, Reza Kazemi, 2023, International journal of neural systems)
- Major Depressive Disorder Classification Based on Different Convolutional Neural Network Models: Deep Learning Approach.(Caglar Uyulan, Türker Tekin Ergüzel, Huseyin Unubol, Merve Cebi, Gokben Hizli Sayar, Mahdi Nezhad Asad, Nevzat Tarhan, 2021, Clinical EEG and neuroscience)
- A Hybrid Graph Neural Network for Enhanced EEG-Based Depression Detection(Yiye Wang, Wenming Zheng, Yang Li, Hao Yang, 2024, ArXiv Preprint)
- Discovery of Shared Latent Nonlinear Effective Connectivity for EEG-Based Depression Detection.(Wenjie Yuan, Xiaowei Zhang, Xuejuan Zhang, Shuangyan Wang, Tianzhi Wang, Tong Zhang, Qinglin Zhao, Bin Hu, 2025, IEEE transactions on neural networks and learning systems)
- An End-to-End Deep Learning Model for EEG-Based Major Depressive Disorder Classification(Min Xia, Yangsong Zhang, Yihan Wu, Xiuzhu Wang, 2023, IEEE Access)
- Neural Networks with Different Initialization Methods for Depression Detection(Tianle Yang, 2022, ArXiv Preprint)
- Boosted Convolutional Neural Networks for Motor Imagery EEG Decoding with Multiwavelet-based Time-Frequency Conditional Granger Causality Analysis(Yang Li, Mengying Lei, Xianrui Zhang, Weigang Cui, Yuzhu Guo, Ting-Wen Huang, Hua-Liang Wei, 2018, ArXiv Preprint)
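To make the idea of using a connectivity matrix as graph input concrete, the following is a minimal sketch of a single graph-convolution step over a directed EC matrix. It does not reproduce any of the listed architectures. The row normalization, the convention that `adjacency[i, j]` holds the influence of channel j on channel i, the non-negative edge weights, and the toy matrices are all assumptions chosen for illustration.

```python
import numpy as np

def gcn_layer(adjacency, features, weights):
    """One graph-convolution step: ReLU(D^-1 (A + I) X W), row-normalized."""
    a_hat = adjacency + np.eye(adjacency.shape[0])   # add self-loops
    degree = a_hat.sum(axis=1, keepdims=True)        # incoming weight per node (>= 1)
    propagated = (a_hat / degree) @ features         # average over self and sources
    return np.maximum(propagated @ weights, 0.0)     # linear map followed by ReLU

# Hypothetical 4-channel example: ec[i, j] is the strength of influence j -> i.
rng = np.random.default_rng(0)
ec = np.array([[0.0, 0.6, 0.0, 0.0],
               [0.0, 0.0, 0.0, 0.0],
               [0.3, 0.0, 0.0, 0.9],
               [0.0, 0.0, 0.0, 0.0]])
node_features = rng.standard_normal((4, 3))   # e.g. three features per channel
layer_weights = rng.standard_normal((3, 2))   # stands in for learned parameters
hidden = gcn_layer(ec, node_features, layer_weights)
```

Row normalization is used here because an EC matrix is generally asymmetric; the symmetric normalization common for undirected graphs would discard the direction of influence.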
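Neural mechanisms of depression and prediction of individual treatment response
This group of papers applies causal analysis to clinical neuroscience, examining abnormal patterns of information flow in the brains of patients with depression (e.g., weakened top-down control). It also uses pre-treatment EEG connectivity features to predict patient response to medication (e.g., sertraline) or physical therapies (e.g., rTMS), advancing precision psychiatry.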
- Anxiety and depression: A top‐down, bottom‐up model of circuit function(Deryn O. LeDuke, Matilde Borio, Raymundo L Miranda, Kay M. Tye, 2023, Annals of the New York Academy of Sciences)
- Automatic Positive and Negative Emotion Regulation in Adolescents with Major Depressive Disorder.(Wenhai Zhang, Cancan Zhao, Fanggui Tang, Wenbo Luo, 2024, Psychopathology)
- Extracting the Multiscale Causal Backbone of Brain Dynamics(Gabriele D'Acunto, Francesco Bonchi, Gianmarco De Francisci Morales, Giovanni Petri, 2023, ArXiv Preprint)
- Atypical intra- and inter-regional coupling patterns involved in repetitive non-suicidal self-injury.(Yi Xia, Ciqing Bao, Jiayu Liu, Yingying Huang, Xiaoqin Wang, Lingling Hua, Rui Yan, Jiabo Shi, Zhijian Yao, Qing Lu, 2026, Journal of psychiatric research)
- Unconscious elevated bottom-up processing in depression: Insights from dynamic causal modeling with EEG and fMRI.(Julia Schräder, Thilo Kellermann, Damin Kühn, Lennard Rompelberg, Michael T Schaub, Lisa Wagels, 2026, Journal of affective disorders)
- The human orbitofrontal cortex, vmPFC, and anterior cingulate cortex effective connectome: emotion, memory, and action(Edmund T. Rolls, Gustavo Deco, Chu‐Chung Huang, Jianfeng Feng, 2022, Cerebral Cortex)
- A study of resting-state EEG biomarkers for depression recognition(Shuting Sun, Jianxiu Li, Huayu Chen, Tao Gong, Xiaowei Li, Bin Hu, 2020, ArXiv Preprint)
- Depression Diagnosis Modeling With Advanced Computational Methods: Frequency-Domain eMVAR and Deep Learning.(Caglar Uyulan, Sara de la Salle, Turker T Erguzel, Emma Lynn, Pierre Blier, Verner Knott, Maheen M Adamson, Mehmet Zelka, Nevzat Tarhan, 2022, Clinical EEG and neuroscience)
- Prediction of treatment response in major depressive disorder using a hybrid of convolutional recurrent deep neural networks and effective connectivity based on EEG signal.(Seyed Morteza Mirjebreili, Reza Shalbaf, Ahmad Shalbaf, 2024, Physical and engineering sciences in medicine)
- Multi-scale EEG analysis identifies neural circuit signatures of iTBS responsiveness in major depressive disorder.(Leilei Zheng, Jihan Fu, Ziqi Sun, Yuhang Han, Tiannan Shao, Qiutang Wang, Zheng Lin, 2026, NeuroImage)
- Using deep learning and pretreatment EEG to predict response to sertraline, bupropion, and placebo.(Marman Ravan, Amin Noroozi, Harshil Gediya, Kennette James Basco, Gary Hasey, 2024, Clinical neurophysiology : official journal of the International Federation of Clinical Neurophysiology)
- Deep graph learning of multimodal brain networks defines treatment-predictive signatures in major depression.(Yong Jiao, Kanhao Zhao, Xinxu Wei, Nancy B Carlisle, Corey J Keller, Desmond J Oathes, Gregory A Fonzo, Yu Zhang, 2025, Molecular psychiatry)
- Brain stimulation outcome prediction in Major Depressive Disorder by deep learning models using EEG representations.(Mohsen Sadat Shahabi, Ahmad Shalbaf, Reza Rostami, Reza Kazemi, 2025, Computer methods in biomechanics and biomedical engineering)
- Predicting antidepressant response via local-global graph neural network and neuroimaging biomarkers(Rui Liu, Ximan Hou, Shuyu Liu, Yuan Zhou, Jingjing Zhou, Kaini Qiao, Han Qi, Ruinan Li, Zhiyu Yang, Ling Zhang, Jian Cui, Cheng Jin, Aihong Yu, Gang Wang, 2025, npj Digital Medicine)
- Predicting depression severity using effective and functional brain connectivity of the electroencephalography signals(Naif H. Alotaibi, Dalal M. Bakheet, 2025, Computers in Biology and Medicine)
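EEG robustness, generative augmentation, and general-purpose models in complex scenarios
To address practical challenges such as noisy EEG data, class imbalance, and privacy protection, this group of studies proposes generative adversarial networks (GANs), federated learning, large-scale pre-trained models (LaBraM), and algorithmic fairness mitigation schemes, aiming to improve the robustness and generalization of depression detection models in complex real-world settings.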
- Incomplete Depression Feature Selection with Missing EEG Channels(Zhijian Gong, Wenjia Dong, Xueyuan Xu, Fulin Wei, Chunyu Liu, Li Zhuo, 2025, ArXiv Preprint)
- On the use of generative deep neural networks to synthesize artificial multichannel EEG signals(Ozan Ozdenizci, Deniz Erdogmus, 2021, ArXiv Preprint)
- EEG Based Generative Depression Discriminator(Ziming Mao, Hao wu, Yongxi Tan, Yuhe Jin, 2024, ArXiv Preprint)
- TNPAR: Topological Neural Poisson Auto-Regressive Model for Learning Granger Causal Structure from Event Sequences(Yuequn Liu, Ruichu Cai, Wei Chen, Jie Qiao, Yuguang Yan, Zijian Li, Keli Zhang, Zhifeng Hao, 2023, ArXiv Preprint)
- Federated Granger Causality Learning for Interdependent Clients with State Space Representation(Ayush Mohanty, Nazal Mohamed, Paritosh Ramanan, Nagi Gebraeel, 2025, ArXiv Preprint)
- A VAE-based Framework for Learning Multi-Level Neural Granger-Causal Connectivity(Jiahe Lin, Huitian Lei, George Michailidis, 2024, ArXiv Preprint)
- Large Brain Model for Learning Generic Representations with Tremendous EEG Data in BCI(Wei-Bang Jiang, Li-Ming Zhao, Bao-Liang Lu, 2024, ArXiv Preprint)
- Machine Learning Fairness for Depression Detection using EEG Data(Angus Man Ho Kwok, Jiaee Cheong, Sinan Kalkan, Hatice Gunes, 2025, ArXiv Preprint)
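Technical evaluation, benchmarking, and reviews of brain connectivity analysis
This group of papers provides community standards and methodological guidance, including systematic reviews of EEG connectivity analysis techniques, a benchmark for GNN models on brain networks (BrainGB), and a critical assessment of how valid graph-theoretic measures such as node centrality are in clinical applications.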
- Connectivity Analysis in EEG Data: A Tutorial Review of the State of the Art and Emerging Trends(Giovanni Chiarion, Laura Sparacino, Yuri Antonacci, Luca Faes, Luca Mesin, 2023, Bioengineering)
- BrainGB: A Benchmark for Brain Network Analysis With Graph Neural Networks(Hejie Cui, Wei Dai, Yanqiao Zhu, Xuan Kan, Antonio Aodong Chen Gu, Joshua Lukemire, Liang Zhan, Lifang He, Ying Guo, Carl Yang, 2022, IEEE Transactions on Medical Imaging)
- Node centrality measures are a poor substitute for causal inference(Fabian Dablander, Max Hinne, 2019, Scientific Reports)
- Technical and clinical considerations for electroencephalography-based biomarkers for major depressive disorder(Leif Simmatis, Emma E. Russo, Joseph Geraci, Irene E. Harmsen, Nardin Samuel, 2023, npj Mental Health Research)
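The field is undergoing a paradigm shift from traditional linear connectivity analysis to deep learning-driven nonlinear causal modeling. The core research path is as follows: first, deep neural networks are used to improve Granger causality algorithms so that they capture the complex nonlinear characteristics of EEG; second, these causal features are integrated into graph neural networks (GNNs) to build efficient systems for automatic depression recognition; in parallel, studies investigate the neural circuit mechanisms of depression and pursue accurate prediction of individual treatment response. In addition, with the introduction of large-scale pre-trained models, federated learning, and generative augmentation, the field is moving toward greater robustness, fairness, and clinical utility.
59 related publications in total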
No abstract
Objective. Depression is a common chronic mental disorder characterized by high rates of prevalence, recurrence, suicide, and disability as well as heavy disease burden. An accurate diagnosis of depression is a prerequisite for treatment. However, existing questionnaire-based diagnostic methods are limited by the innate subjectivity of medical practitioners and subjects. In the search for more objective diagnostic methods for depression, researchers have recently started to use deep learning approaches. Approach. In this work, a deep-learning network, named adaptively multi-time-window graph convolutional network (GCN) with long-short-term memory (LSTM) (i.e. AMGCN-L), is proposed. This network can automatically categorize depressed and non-depressed people by testing for the existence of inherent brain functional connectivity and spatiotemporal features contained in electroencephalogram (EEG) signals. AMGCN-L is mainly composed of two sub-networks: the first sub-network is an adaptive multi-time-window graph generation block with which adjacency matrices that contain brain functional connectivity on different time periods are adaptively designed. The second sub-network consists of GCN and LSTM, which are used to fully extract the innate spatial and temporal features of EEG signals, respectively. Main results. Two public datasets, namely the patient repository for EEG data and computational tools, and the multi-modal open dataset for mental-disorder analysis, were used to test the performance of the proposed network; the depression recognition accuracies achieved in both datasets (using tenfold cross-validation) were 90.38% and 90.57%, respectively. Significance. This work demonstrates that GCN and LSTM have eminent effects on spatial and temporal feature extraction, respectively, suggesting that the exploration of brain connectivity and the exploitation of spatiotemporal features benefit the detection of depression. Moreover, the proposed method provides effective support and supplement for the detection of clinical depression and later treatment procedures.
A serious public health problem, major depressive disorder (MDD) is characterised by a person's increased risk of suicidal thoughts, chronic melancholy, and disinterest in once-enjoyable activities. Inaccurate diagnosis and inadequate treatment are frequently the results of surveys and psychiatric examinations of MDD. This work aims to assess the use of effective connectivity (EC) and vision transformers (ViT) to address complexities associated with interpreting the intricate temporal and spatial properties of EEG data. This study estimated the EC within the brain default mode network (DMN) using the EEG signals from 30 MDD and 30 healthy controls (HC). The six key regions of the DMN's effective connectivity are taken from a 3D matrix and placed into a 2D matrix for a suitable input into a deep learning model. Next, a vision transformer (ViT) model is trained, validated, and tested using the 2D picture of EC. The results show that the proposed MDD diagnosis algorithm achieved 82.17% accuracy with an AUC of 0.887 in classifying MDD and HC test subjects. The model's accuracy decreased with the introduction of new subjects due to increasing variation in the EC images of MDD and HC classes.
No abstract
No abstract
Understanding how different areas of the human brain communicate with each other is a crucial issue in neuroscience. The concepts of structural, functional and effective connectivity have been widely exploited to describe the human connectome, consisting of brain networks, their structural connections and functional interactions. Despite high-spatial-resolution imaging techniques such as functional magnetic resonance imaging (fMRI) being widely used to map this complex network of multiple interactions, electroencephalographic (EEG) recordings claim high temporal resolution and are thus perfectly suitable to describe both spatially distributed and temporally dynamic patterns of neural activation and connectivity. In this work, we provide a technical account and a categorization of the most-used data-driven approaches to assess brain-functional connectivity, intended as the study of the statistical dependencies between the recorded EEG signals. Different pairwise and multivariate, as well as directed and non-directed connectivity metrics are discussed with a pros-cons approach, in the time, frequency, and information-theoretic domains. The establishment of conceptual and mathematical relationships between metrics from these three frameworks, and the discussion of novel methodological approaches, will allow the reader to go deep into the problem of inferring functional connectivity in complex networks. Furthermore, emerging trends for the description of extended forms of connectivity (e.g., high-order interactions) are also discussed, along with graph-theory tools exploring the topological properties of the network of connections provided by the proposed metrics. Applications to EEG data are reviewed. In addition, the importance of source localization, and the impacts of signal acquisition and pre-processing techniques (e.g., filtering, source localization, and artifact rejection) on the connectivity estimates are recognized and discussed. By going through this review, the reader could delve deeply into the entire process of EEG pre-processing and analysis for the study of brain functional connectivity and learning, thereby exploiting novel methodologies and approaches to the problem of inferring connectivity within complex networks.
The human orbitofrontal cortex, ventromedial prefrontal cortex (vmPFC), and anterior cingulate cortex are involved in reward processing and thereby in emotion but are also implicated in episodic memory. To understand these regions better, the effective connectivity between 360 cortical regions and 24 subcortical regions was measured in 172 humans from the Human Connectome Project and complemented with functional connectivity and diffusion tractography. The orbitofrontal cortex has effective connectivity from gustatory, olfactory, and temporal visual, auditory, and pole cortical areas. The orbitofrontal cortex has connectivity to the pregenual anterior and posterior cingulate cortex and hippocampal system and provides for rewards to be used in memory and navigation to goals. The orbitofrontal and pregenual anterior cortex have connectivity to the supracallosal anterior cingulate cortex, which projects to midcingulate and other premotor cortical areas and provides for action-outcome learning including limb withdrawal or flight or fight to aversive and nonreward stimuli. The lateral orbitofrontal cortex has outputs to language systems in the inferior frontal gyrus. The medial orbitofrontal cortex connects to the nucleus basalis of Meynert and the pregenual cingulate to the septum, and damage to these cortical regions may contribute to memory impairments by disrupting cholinergic influences on the neocortex and hippocampus.
Mapping the connectome of the human brain using structural or functional connectivity has become one of the most pervasive paradigms for neuroimaging analysis. Recently, Graph Neural Networks (GNNs) motivated from geometric deep learning have attracted broad interest due to their established power for modeling complex networked data. Despite their superior performance in many fields, there has not yet been a systematic study of how to design effective GNNs for brain network analysis. To bridge this gap, we present BrainGB, a benchmark for brain network analysis with GNNs. BrainGB standardizes the process by (1) summarizing brain network construction pipelines for both functional and structural neuroimaging modalities and (2) modularizing the implementation of GNN designs. We conduct extensive experiments on datasets across cohorts and modalities and recommend a set of general recipes for effective GNN designs on brain networks. To support open and reproducible research on GNN-based brain network analysis, we host the BrainGB website at https://braingb.us with models, tutorials, examples, as well as an out-of-box Python package. We hope that this work will provide useful empirical evidence and offer insights for future research in this novel and promising direction.
Major depressive disorder (MDD) is a prevalent mental illness associated with abnormalities in structural and functional brain connectivity, and it has become a global public health problem. Early diagnosis is important and challenging for the treatment of MDD. Previous studies proposed classification methods for MDD based on brain connectivity features obtained through functional connectivity (FC) or effective connectivity (EC) measures. However, manually selecting an algorithm to calculate the brain connectivity features requires prior knowledge and experience. Given the representation learning capabilities of deep learning (DL) models and the ability of the self-attention mechanism to capture correlations between data, we proposed an end-to-end integrated DL model for classifying MDD patients and healthy controls (HCs) based on resting-state electroencephalography (EEG) data. This model first automatically learned the potential connectivity relationships among EEG channels through a multi-head self-attention mechanism, then extracted higher-level features through a parallel two-branch convolutional neural network (CNN) module, and finally completed the classification through a fully connected layer. A public resting-state EEG dataset was utilized to evaluate the validity of the proposed model. The experimental results indicated that the proposed model achieved 91.06% average accuracy under leave-one-subject-out cross-validation (LOSOCV), which was better than the comparison methods. This study may provide a novel approach for brain connectivity modeling of MDD detection.
No abstract
A functional interplay of bottom-up and top-down processing allows an individual to appropriately respond to the dynamic environment around them. These processing modalities can be represented as attractor states using a dynamical systems model of the brain. The transition probability to move from one attractor state to another is dependent on the stability, depth, neuromodulatory tone, and tonic changes in plasticity. However, how does the relationship between these states change in disease states, such as anxiety or depression? We describe bottom-up and top-down processing from Marr's computational-algorithmic-implementation perspective to understand depressive and anxious disease states. We illustrate examples of bottom-up processing as basolateral amygdala signaling and projections and top-down processing as medial prefrontal cortex internal signaling and projections. Understanding these internal processing dynamics can help us better model the multifaceted elements of anxiety and depression.
Major depressive disorder (MDD) is a prevalent mental illness characterized by persistent sadness, loss of interest in activities, and significant functional impairment. It poses severe risks to individuals' physical and psychological well-being. The development of automated diagnostic systems for MDD is essential to improve diagnostic accuracy and efficiency. Electroencephalography (EEG) has been extensively utilized in MDD diagnostic research. However, studies employing deep learning methods still face several challenges, such as difficulty in extracting effective information from EEG signals and risks of data leakage due to experimental designs. These issues result in limited generalization capabilities when models are tested on unseen individuals, thereby restricting their practical application. In this study, we propose a novel deep learning approach, termed TSF-MDD, which integrates temporal, spatial, and frequency-domain information. TSF-MDD first applies a data reconstruction scheme to obtain a four-dimensional temporal-spatial-frequency representation of EEG signals. These data are then processed by a model based on 3D-CNN and CapsNet, enabling comprehensive feature extraction across domains. Finally, a subject-independent data partitioning strategy is employed during training and testing to eliminate data leakage. The proposed approach achieves an accuracy of 92.1%, precision of 90.0%, recall of 94.9%, and F1-score of 92.4%, respectively, on the Mumtaz2016 public dataset. The results demonstrate that TSF-MDD exhibits excellent generalization performance.
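The subject-independent partitioning described above, and the leave-one-subject-out protocol reported by several of the listed studies, amounts to splitting by subject rather than by epoch, so that no participant contributes data to both the training and the test set. The sketch below shows one way to generate such folds; it is not the procedure of any specific paper, and the function name `leave_one_subject_out` and the subject labels are hypothetical.

```python
from collections import defaultdict

def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject."""
    by_subject = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        by_subject[subject].append(index)
    for held_out in sorted(by_subject):
        test_indices = by_subject[held_out]
        train_indices = [i for subject, indices in by_subject.items()
                         if subject != held_out for i in indices]
        yield held_out, train_indices, test_indices

# Hypothetical example: six EEG epochs recorded from three subjects.
epoch_subjects = ["s01", "s01", "s02", "s02", "s02", "s03"]
folds = list(leave_one_subject_out(epoch_subjects))
```

Splitting epochs at random instead would place segments of the same recording on both sides of the split, which is the data leakage the abstract refers to.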
Predicting antidepressant response via local-global graph neural network and neuroimaging biomarkers
No abstract
Adolescents with major depressive disorder (MDD) exhibit hypoactivity to positive stimuli and hyperactivity to negative stimuli in terms of neural responses. Automatic emotion regulation (AER) activates triple networks (i.e., the central control network, default mode network, and salience network). Based on previous studies, we hypothesized that adolescents with MDD exhibit dissociable spatiotemporal deficits during positive and negative AER. We first collected EEG data from 32 adolescents with MDD and 35 healthy adolescents while they performed an implicit emotional Go/NoGo task. Then, we characterized the spatiotemporal dynamics of cortical activity during AER. In Go trials, MDD adolescents exhibited reduced N2 amplitudes, enhanced theta power for positive pictures, and stronger bottom-up information flow from the left orbitofrontal cortex (OFC) to the right superior frontal gyrus compared to top-down information flow than the controls. In contrast, in NoGo trials, MDD adolescents exhibited elevated P3 amplitudes, enhanced theta power, and stronger top-down information flows from the right middle frontal gyrus to the right OFC and the left insula than the controls. Overall, adolescents with MDD exhibited impaired automatic attention to positive emotions and impaired automatic response inhibition. These findings have potential implications for the clinical treatment of adolescents with MDD.
Major depressive disorder (MDD) presents a substantial health burden with low treatment response rates. Predicting antidepressant efficacy is challenging due to MDD's complex and varied neuropathology. Identifying biomarkers for antidepressant treatment requires thorough analysis of clinical trial data. Multimodal neuroimaging, combined with advanced data-driven methods, can enhance our understanding of the neurobiological processes influencing treatment outcomes. To address this, we analyzed resting-state fMRI and EEG connectivity data from 130 patients treated with sertraline and 135 patients with placebo from the Establishing Moderators and Biosignatures of Antidepressant Response in Clinical Care (EMBARC) study. A deep learning framework was developed using graph neural networks to integrate data-augmented connectivity and cross-modality correlation, aiming to predict individual symptom changes by revealing multimodal brain network signatures. The results showed that our model demonstrated promising prediction accuracy, with an R
Predicting an individual's response to antidepressant medication remains one of the most challenging tasks in the treatment of major depressive disorder (MDD). Our objective was to use the large EMBARC study database to develop an electroencephalography (EEG)-based method to predict response to antidepressant treatment. Pre-treatment EEG data were collected from study participants treated with either sertraline (N = 105), placebo (N = 119), or bupropion (N = 35). After preprocessing, the robust exact low-resolution electromagnetic tomography (ReLORETA) brain source localization method was used to reconstruct the source signals in 54 brain regions. Connectivity between regions was determined using symbolic transfer entropy (STE). A convolutional neural network (CNN) classified participants as responders or non-responders to each treatment. Classification accuracy was 91.0%, 95.4%, and 86.8% for sertraline, placebo, and bupropion, respectively. The most highly predictive features were connectivity between i) the anterior cingulate cortex and superior parietal lobule (alpha frequency), ii) the anterior cingulate cortex and orbitofrontal area (beta frequency), and iii) the orbitofrontal area and anterior cingulate cortex (gamma frequency). CNN analysis of EEG connectivity may accurately predict response to sertraline, bupropion, and placebo. The suggested method may offer clinicians an accessible and cost-effective tool for speedy treatment and helps pharmaceutical firms to test new antidepressants efficiently.
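The symbolic transfer entropy (STE) named in the abstract above can be illustrated in two steps: each signal is converted into a sequence of ordinal (rank-order) patterns, and transfer entropy is then estimated from the symbol counts. The sketch below is a minimal plug-in estimator under stated assumptions, not the cited study's pipeline: the embedding dimension of 3, the history length of 1, the function names, and the toy signals are all chosen for illustration.

```python
import math
import random
from collections import Counter

def ordinal_symbols(series, m=3):
    """Map each length-m window to its rank-order pattern (argsort as a tuple)."""
    return [tuple(sorted(range(m), key=lambda k: series[i + k]))
            for i in range(len(series) - m + 1)]

def transfer_entropy(src, dst):
    """Plug-in TE(src -> dst) in bits on symbol sequences, history length 1."""
    triples = Counter(zip(dst[1:], dst[:-1], src[:-1]))
    joint_past = Counter(zip(dst[:-1], src[:-1]))
    dst_pairs = Counter(zip(dst[1:], dst[:-1]))
    dst_past = Counter(dst[:-1])
    total = len(dst) - 1
    te = 0.0
    for (nxt, cur, s), count in triples.items():
        p_given_both = count / joint_past[(cur, s)]          # p(next | dst, src)
        p_given_self = dst_pairs[(nxt, cur)] / dst_past[cur]  # p(next | dst)
        te += (count / total) * math.log2(p_given_both / p_given_self)
    return te

# Hypothetical toy signals: y follows x with a one-sample delay plus small noise.
gen = random.Random(0)
n = 5000
x = [gen.gauss(0, 1) for _ in range(n)]
y = [gen.gauss(0, 1)] + [x[t - 1] + 0.1 * gen.gauss(0, 1) for t in range(1, n)]

sx, sy = ordinal_symbols(x), ordinal_symbols(y)
te_xy = transfer_entropy(sx, sy)  # driven direction, expected to be larger
te_yx = transfer_entropy(sy, sx)
```

Plug-in estimates of this kind are biased upward for short recordings, so in practice the two directions are usually compared against each other or against surrogate data rather than read as absolute values.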
In this study, we have developed a novel method based on deep learning and brain effective connectivity to classify responders and non-responders to selective serotonin reuptake inhibitor (SSRI) antidepressants in major depressive disorder (MDD) patients prior to the treatment using EEG signal. The effective connectivity of 30 MDD patients was determined by analyzing their pretreatment EEG signals, which were then concatenated into delta, theta, alpha, and beta bands and transformed into images. Using these images, we then fine-tuned a hybrid Convolutional Neural Network that is enhanced with bidirectional Long Short-Term Memory cells based on transfer learning. The Inception-v3, ResNet18, DenseNet121, and EfficientNet-B0 models are implemented as base models. Finally, the models are followed by BiLSTM and dense layers in order to classify responders and non-responders to SSRI treatment. Results showed that EfficientNet-B0 has the highest accuracy of 98.33, followed by DenseNet121, ResNet18 and Inception-v3. Therefore, a new method was proposed in this study that uses deep learning models to extract both spatial and temporal features automatically, which will improve classification results. The proposed method provides accurate identification of MDD patients who are responding, thereby reducing the cost of medical facilities and patient care.
Major Depressive Disorder (MDD) is known as a widespread illness and needs timely treatment. The treatment procedure is currently based on trial and error between various treatments. An individualized treatment selection is crucial for saving time and financial resources and preventing possible side effects. Because of the complex nature of this problem, a Deep Learning (DL) approach, as a promising method for precision medicine, was utilized to identify the responders to the treatment using pre-treatment EEG signals. Eighty-three patients with MDD participated in this study to receive treatment using Repetitive Transcranial Magnetic Stimulation (rTMS). A deep hybrid neural network was developed based on three pre-trained convolutional neural networks named DenseNet121, EfficientNetB0, and Xception. The training of each model was performed by feeding three types of EEG representations as the inputs into the models including the Wavelet Transform (WT) images, the connectivity matrix between electrode pairs, and the raw EEG signals. The performance of the proposed models was assessed for the three different input types and achieved the highest accuracy of 94.7% in classifying patients as responders or non-responders when utilizing a sequence of raw EEG images. For the WT and connectivity inputs, the best model accuracy was 94.38% and 94.25%, respectively. Therefore, the proposed model can be a step forward towards the clinical implementation of an end-to-end treatment selection framework using raw EEG signals.
Repetitive Transcranial Magnetic Stimulation (rTMS) is proposed as an effective treatment for major depressive disorder (MDD). However, because of the suboptimal treatment outcome of rTMS, the prediction of response to this technique is a crucial task. We developed a deep learning (DL) model to classify responders (R) and non-responders (NR). With this aim, we assessed the pre-treatment EEG signal of 34 MDD patients and extracted effective connectivity (EC) among all electrodes in four frequency bands of EEG signal. Two-dimensional EC maps are put together to create a rich connectivity image and a sequence of these images is fed to the DL model. Then, the DL framework was constructed based on transfer learning (TL) models which are pre-trained convolutional neural networks (CNN) named VGG16, Xception, and EfficientNetB0. Then, long short-term memory (LSTM) cells are equipped with an attention mechanism added on top of TL models to fully exploit the spatiotemporal information of EEG signal. Using leave-one subject out cross validation (LOSO CV), Xception-BLSTM-Attention acquired the highest performance with 98.86% of accuracy and 97.73% of specificity. Fusion of these models as an ensemble model based on optimized majority voting gained 99.32% accuracy and 98.34% of specificity. Therefore, the ensemble of TL-LSTM-Attention models can predict accurately the treatment outcome.
The human brain is characterized by complex structural and functional connections that integrate unique cognitive characteristics. Evaluating both the structural and functional connections of the brain, and their effects on the diagnosis and treatment of neurodegenerative diseases, remains a fundamental hurdle. Currently, there is no clinically specific diagnostic biomarker capable of confirming the diagnosis of major depressive disorder (MDD). Therefore, exploring translational biomarkers of mood disorders based on deep learning (DL) has valuable potential, given its recently reported promising outcomes. In this article, an electroencephalography (EEG)-based diagnosis model for MDD is built through advanced computational neuroscience methodology coupled with a deep convolutional neural network (CNN) approach. EEG recordings are analyzed by modeling 3 different deep CNN structures, namely ResNet-50, MobileNet, and Inception-v3, in order to dichotomize MDD patients and healthy controls. EEG data are collected for 4 main frequency bands (Δ, θ, α, and β), with spatial resolution provided by location information from 19 electrodes. Following the pre-processing step, different DL architectures were employed to assess discrimination performance by comparing classification accuracies. Among the models based on location data, the MobileNet architecture achieved 89.33% and 92.66% classification accuracy. As to the frequency bands, the delta band outperformed the other bands, with 90.22% predictive accuracy and an area under the curve (AUC) value of 0.9 for the ResNet-50 architecture. The main contribution of the study is the delineation of distinctive spatial and temporal features using various DL architectures to dichotomize 46 MDD subjects from 46 healthy subjects.
Exploring translational biomarkers of mood disorders based on DL perspective is the main focus of this study and, though it is challenging, with its promising potential to improve our understanding of the psychiatric disorders, computational methods are highly worthy for the diagnosis process and valuable in terms of both speed and accuracy compared with classical approaches.
Response to transcranial magnetic stimulation (TMS) in major depressive disorder (MDD) is highly variable, underscoring the need for biomarkers that both predict treatment efficacy and elucidate underlying neural mechanisms. We integrated deep learning and computational modeling to identify subtype-specific responses to intermittent theta-burst stimulation (iTBS) in MDD. Resting-state EEG and event-related potentials were collected from 198 patients across two independent cohorts (training: N = 125; validation: N = 73). A total of 55,476 EEG epochs were analyzed using a multi-scale convolutional recurrent neural network (MCRNN). To probe circuit-level mechanisms, Dynamic Causal Modeling with Parametric Empirical Bayes (DCM-PEB) was applied to assess subtype-specific effective connectivity. The MCRNN achieved robust predictive performance (accuracy = 0.91; 95% CI: 0.85-0.97 in the training cohort; 0.86; 95% CI: 0.76-0.96 in the validation cohort), reliably stratifying patients into two neurophysiological subtypes. These subtypes differed in baseline symptom severity and clinical response trajectories. DCM-PEB revealed distinct effective connectivity signatures within frontal-temporal-parietal-motor circuits, with posterior probability exceeding 0.99, linking subtype-specific neural dynamics to treatment outcomes. EEG-based deep learning, combined with biophysically informed connectivity modeling, enables reliable prediction of iTBS outcomes in MDD. Subtype-specific disruptions in frontal-temporal coupling emerge as candidate biomarkers, offering mechanistic insight into neuromodulation response and a framework for personalized TMS interventions.
Granger causality (GC) effective connectivity (EC) calculated from electroencephalogram (EEG) signals has been widely used in mental disorder detection. However, the existing methods only take into account linear dynamics or nonlinear dynamics within a single sample, ignoring the nonlinear dynamics shared by the same class of subjects. In this article, a model combining graph neural networks (GNNs) and variational autoencoders (VAEs) is proposed to construct shared latent nonlinear EC from raw EEG signals for depression detection. Several convolution modules and fully connected layers are used in the graph encoding network to learn the embeddings of the connectivity connected by every two EEG channels. In the graph decoding network, a class-specific Gaussian mixture model (GMM) is introduced in the VAEs to model shared dynamics in EC of the same class of subjects, and the shared dynamics combine the encoded embeddings of the EC and the past time series to restore raw EEG signals. Through a node-to-edge encoding process and an edge-to-node decoding process, the shared latent nonlinear EC in EEG signals can ultimately be learned by gradually optimizing the model's loss function. The performance of the proposed method is verified on several open-accessed datasets. The excellent results prove that the proposed neural networks can learn more generalized nonlinear EC representations, and shared latent dynamics discovery can also help to identify depression better. The code is available at https://github.com/william-yuan2012/DSLNEC-tscausality.
MRI-compatible EEG systems enable simultaneous EEG-fMRI data assessment, which provides high spatial and high temporal resolution of neural signaling data. Functional connectivity analyses suggest altered fronto-limbic emotion regulation in patients with major depressive disorder (MDD). Sixty patients with MDD and 66 healthy controls (HC) performed a priming task using unconsciously and consciously presented emotional facial expressions (happy, sad, neutral). Effective connectivity of simultaneously recorded EEG-fMRI data between cortical (bilateral dorsolateral prefrontal cortex and fusiform gyrus) and subcortical regions (bilateral amygdala) was captured using dynamic causal modeling (DCM). Stimulus-related changes in bottom-up and top-down neurophysiological networks across both EEG and fMRI data were estimated in models of unconscious and conscious processing, defined for both groups. Bayesian model selection favored a bottom-up processing model for both groups and input conditions (conscious and unconscious) in EEG-DCMs. Mixed top-down and bottom-up processing models best represented conscious and unconscious stimulus processing in HC fMRI-DCM, while bottom-up models were most representative for MDD fMRI data. Amygdala activity leads to higher DLPFC activity in conscious conditions, and lower DLPFC activity in unconscious conditions, in both groups. This study demonstrates the distinct capabilities of EEG and fMRI data by showing that EEG captures early and fast processing (bottom-up) while fMRI reflects both bottom-up and top-down regulation. Activity reduction of DLPFC through FFA bottom-up connectivity in early processing (EEG-DCM) might inhibit later top-down emotion regulation through the DLPFC in MDD (fMRI-DCM).
Electroencephalogram (EEG)-based automated depression diagnosis systems have been suggested for early and accurate detection of mood disorders. EEG signals are highly irregular, nonlinear, and nonstationary in nature, yet they are traditionally studied from a linear viewpoint by means of statistical and frequency features. Linear metrics present certain limitations, whereas nonlinear methods have proven to be an efficient tool for understanding the complexities of the brain and for identifying the underlying behavior of biological signals such as the electrocardiogram, EEG, and magnetoencephalogram; they can therefore be applied to all nonstationary signals. Various nonlinear algorithms can be used in the analysis of EEG signals. In this research paper, we aim to develop a novel methodology for EEG-based depression diagnosis utilizing 2 advanced computational techniques: frequency-domain extended multivariate autoregressive (eMVAR) modeling and deep learning (DL). We proposed a hybrid method comprising a pretrained ResNet-50 and long short-term memory (LSTM) to capture depression-specific information, and compared it with a strong conventional machine learning (ML) framework using eMVAR connectivity features. The following 8 causality measures, which interpret the interaction mechanisms among spectrally decomposed oscillations, were used to extract features from multivariate EEG time series: directed coherence (DC), directed transfer function (DTF), partial DC (PDC), generalized PDC (gPDC), extended DC (eDC), delayed DC (dDC), extended PDC (ePDC), and delayed PDC (dPDC). The classification accuracies were 84% with DC, 85% with DTF, 95.3% with PDC, 95.1% with gPDC, 84.8% with eDC, 84.6% with dDC, 84.2% with ePDC, and 95.9% with dPDC for the eMVAR framework. With the DL framework (ResNet-50 + LSTM), the classification accuracy was 90.22%.
The results demonstrate that our DL methodology is a competitive alternative to the strong feature extraction-based ML methods in depression classification.
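To make the connectivity features above more concrete, the following is a minimal sketch of partial directed coherence (PDC) computed from already-fitted MVAR coefficients, using the commonly cited normalization (each entry of the frequency-domain coefficient matrix divided by the norm of its column). The function name, the coefficient layout `coeffs[r][i][j]`, and the toy two-channel model are illustrative assumptions, not taken from the study; the extended and delayed variants are not covered.

```python
import cmath
import math


def partial_directed_coherence(coeffs, freq):
    """PDC at normalized frequency `freq` (cycles per sample, 0 to 0.5).

    coeffs[r][i][j] is the lag-(r+1) MVAR coefficient from channel j to
    channel i (a layout assumed for this sketch).
    Returns pdc[i][j], the normalized influence of channel j on channel i.
    """
    n = len(coeffs[0])
    # A_bar(f) = I - sum_r A_r * exp(-i * 2 * pi * f * r)
    a_bar = [[complex(1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    for r, a_r in enumerate(coeffs, start=1):
        phase = cmath.exp(-2j * math.pi * freq * r)
        for i in range(n):
            for j in range(n):
                a_bar[i][j] -= a_r[i][j] * phase
    pdc = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Normalize by the norm of the source channel's column.
        col_norm = math.sqrt(sum(abs(a_bar[k][j]) ** 2 for k in range(n)))
        for i in range(n):
            pdc[i][j] = abs(a_bar[i][j]) / col_norm
    return pdc


# Toy order-1 model with two channels: channel 0 drives channel 1 only.
toy_coeffs = [[[0.5, 0.0],
               [0.4, 0.3]]]
toy_pdc = partial_directed_coherence(toy_coeffs, 0.1)
```

In this toy model `toy_pdc[0][1]` is zero because channel 1 has no coefficient into channel 0, while `toy_pdc[1][0]` is positive. In a classification pipeline, such matrices would be evaluated over a grid of frequencies and flattened into features.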
Non-suicidal self-injury (NSSI) is becoming more common among youth with major depressive disorder (MDD). Repetitive NSSI (R-NSSI), a severe type of NSSI, is a growing concern. As a behavioural addiction, R-NSSI may be closely linked to aberrant response inhibition. This study aimed to investigate the neural correlation of response inhibition within R-NSSI. A total of 85 youths participated in a Go/NoGo task during magnetoencephalography scanning, including 27 MDD youths with R-NSSI, 28 MDD youths without NSSI and 30 healthy controls. Phase amplitude coupling and functional connectivity were calculated to detect the intra- and inter-regional coupling patterns. Further, phase slope indexes were used to ascertain the brain information flow direction. Locally, the R-NSSI group showed hyper-coupling of gamma amplitude and beta phase within the temporal region. Globally, connectivity within the prefrontal cognitive control circuit, mediated by both beta and gamma oscillations, was diminished. Notably, the R-NSSI group showed a negative frontoparietal information flow contrasted with a positive temporoparietal information flow. Furthermore, the strength of functional connectivity mediated the association between NSSI and childhood trauma. Our findings suggest disrupted local and regional control processes in youths with R-NSSI. Overall, NSSI manifests an inefficient prefrontal network compensated by enhanced activity in temporal lobe regions. These results offer electrophysiological insights into the neural mechanisms of NSSI.
Granger causality is a widely used criterion for analyzing interactions in large-scale networks. As most physical interactions are inherently nonlinear, we consider the problem of inferring the existence of pairwise Granger causality between nonlinearly interacting stochastic processes from their time series measurements. Our proposed approach relies on modeling the embedded nonlinearities in the measurements using a component-wise time series prediction model based on Statistical Recurrent Units (SRUs). We make a case that the network topology of Granger causal relations is directly inferrable from a structured sparse estimate of the internal parameters of the SRU networks trained to predict the processes' time series measurements. We propose a variant of SRU, called economy-SRU, which by design has considerably fewer trainable parameters and is therefore less prone to overfitting. The economy-SRU computes a low-dimensional sketch of its high-dimensional hidden state in the form of random projections to generate the feedback for its recurrent processing. Additionally, the internal weight parameters of the economy-SRU are strategically regularized in a group-wise manner to help the proposed network extract meaningful predictive features that are highly time-localized, mimicking real-world causal events. Extensive experiments are carried out to demonstrate that the proposed economy-SRU based time series prediction model outperforms the MLP, LSTM and attention-gated CNN-based time series models considered previously for inferring Granger causality.
Granger causality is a fundamental technique for causal inference in time series data, commonly used in the social and biological sciences. Typical operationalizations of Granger causality make a strong assumption that every time point of the effect time series is influenced by a combination of other time series with a fixed time delay. The assumption of a fixed time delay also exists in Transfer Entropy, which is considered to be a non-linear version of Granger causality. However, the assumption of a fixed time delay does not hold in many applications, such as collective behavior, financial markets, and many natural phenomena. To address this issue, we develop Variable-lag Granger causality and Variable-lag Transfer Entropy, generalizations of both Granger causality and Transfer Entropy that relax the assumption of the fixed time delay and allow causes to influence effects with arbitrary time delays. In addition, we propose methods for inferring both variable-lag Granger causality and Transfer Entropy relations. In our approaches, we utilize an optimal warping path of Dynamic Time Warping (DTW) to infer variable-lag causal relations. We demonstrate our approaches on an application for studying coordinated collective behavior and on other real-world causal-inference datasets, and show that our proposed approaches perform better than several existing methods on both simulated and real-world datasets. Our approaches can be applied in any domain of time series analysis. The software of this work is available in the R-CRAN package: VLTimeCausality.
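As an illustration of the DTW step mentioned above, here is a minimal sketch of classic dynamic time warping with backtracking, followed by reading per-point lags off the warping path. It is a generic textbook DTW, not the VLTimeCausality implementation, and it omits the subsequent Granger or transfer-entropy test; the function name and the toy series are assumptions made for the example.

```python
def dtw_path(x, y):
    """Return an optimal DTW warping path between x and y as (i, j) index pairs."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = []
        if i > 1 and j > 1:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 1:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 1:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return path


# Toy example: `effect` repeats the bump in `cause` two samples later.
cause = [0, 0, 1, 2, 1, 0, 0, 0]
effect = [0, 0, 0, 0, 1, 2, 1, 0]
path = dtw_path(cause, effect)
# Lag of each aligned pair: effect index minus cause index.
lags = {(i, j): j - i for i, j in path}
```

On the toy series, the peak of `cause` at index 3 is aligned with the peak of `effect` at index 5, a lag of 2. A variable-lag analysis would use such per-point lags, rather than one fixed delay, when forming the lagged predictor for the effect series.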
Advanced sensors and IoT devices have improved the monitoring and control of complex industrial enterprises. They have also created an interdependent fabric of geographically distributed process operations (clients) across these enterprises. Granger causality is an effective approach to detect and quantify interdependencies by examining how one client's state affects others over time. Understanding these interdependencies captures how localized events, such as faults and disruptions, can propagate throughout the system, possibly causing widespread operational impacts. However, the large volume and complexity of industrial data pose challenges in modeling these interdependencies. This paper develops a federated approach to learning Granger causality. We utilize a linear state space system framework that leverages low-dimensional state estimates to analyze interdependencies. This addresses bandwidth limitations and the computational burden commonly associated with centralized data processing. We propose augmenting the client models with the Granger causality information learned by the server through a Machine Learning (ML) function. We examine the co-dependence between the augmented client and server models and reformulate the framework as a standalone ML algorithm providing conditions for its sublinear and linear convergence rates. We also study the convergence of the framework to a centralized oracle model. Moreover, we include a differential privacy analysis to ensure data security while preserving causal insights. Using synthetic data, we conduct comprehensive experiments to demonstrate the robustness of our approach to perturbations in causality, the scalability to the size of communication, number of clients, and the dimensions of raw data. We also evaluate the performance on two real-world industrial control system datasets by reporting the volume of data saved by decentralization.
Granger causality has been widely used in various application domains to capture lead-lag relationships amongst the components of complex dynamical systems, and the focus in extant literature has been on a single dynamical system. In certain applications in macroeconomics and neuroscience, one has access to data from a collection of related such systems, wherein the modeling task of interest is to extract the shared common structure that is embedded across them, as well as to identify the idiosyncrasies within individual ones. This paper introduces a Variational Autoencoder (VAE) based framework that jointly learns Granger-causal relationships amongst components in a collection of related-yet-heterogeneous dynamical systems, and handles the aforementioned task in a principled way. The performance of the proposed framework is evaluated on several synthetic data settings and benchmarked against existing approaches designed for individual system learning. The method is further illustrated on a real dataset involving time series data from a neurophysiological experiment and produces interpretable results.
The current electroencephalogram (EEG) based deep learning models are typically designed for specific datasets and applications in brain-computer interaction (BCI), limiting the scale of the models and thus diminishing their perceptual capabilities and generalizability. Recently, Large Language Models (LLMs) have achieved unprecedented success in text processing, prompting us to explore the capabilities of Large EEG Models (LEMs). We hope that LEMs can break through the limitations of different task types of EEG datasets, and obtain universal perceptual capabilities of EEG signals through unsupervised pre-training. Then the models can be fine-tuned for different downstream tasks. However, compared to text data, the volume of EEG datasets is generally small and the format varies widely. For example, there can be mismatched numbers of electrodes, unequal length data samples, varied task designs, and low signal-to-noise ratio. To overcome these challenges, we propose a unified foundation model for EEG called Large Brain Model (LaBraM). LaBraM enables cross-dataset learning by segmenting the EEG signals into EEG channel patches. Vector-quantized neural spectrum prediction is used to train a semantically rich neural tokenizer that encodes continuous raw EEG channel patches into compact neural codes. We then pre-train neural Transformers by predicting the original neural codes for the masked EEG channel patches. The LaBraMs were pre-trained on about 2,500 hours of various types of EEG signals from around 20 datasets and validated on multiple different types of downstream tasks. Experiments on abnormal detection, event type classification, emotion recognition, and gait prediction show that our LaBraM outperforms all compared SOTA methods in their respective fields. Our code is available at https://github.com/935963004/LaBraM.
The Granger framework is useful for discovering causal relations in time-varying signals. However, most Granger causality (GC) methods are developed for densely sampled timeseries data. A substantially different setting, particularly common in medical imaging, is the longitudinal study design, where multiple subjects are followed and sparsely observed over time. Longitudinal studies commonly track several biomarkers, which are likely governed by nonlinear dynamics that might have subject-specific idiosyncrasies and exhibit both direct and indirect causes. Furthermore, real-world longitudinal data often suffer from widespread missingness. GC methods are not well-suited to handle these issues. In this paper, we propose an approach named GLACIAL (Granger and LeArning-based CausalIty Analysis for Longitudinal studies) to fill this methodological gap by marrying GC with a multi-task neural forecasting model. GLACIAL treats subjects as independent samples and uses the model's average prediction accuracy on hold-out subjects to probe causal links. Input dropout and model interpolation are used to efficiently learn nonlinear dynamic relationships between a large number of variables and to handle missing values respectively. Extensive simulations and experiments on a real longitudinal medical imaging dataset show GLACIAL beating competitive baselines and confirm its utility. Our code is available at https://github.com/mnhng/GLACIAL.
We address the problem of learning Granger causality from asynchronous, interdependent, multi-type event sequences. In particular, we are interested in discovering instance-level causal structures in an unsupervised manner. Instance-level causality identifies causal relationships among individual events, providing more fine-grained information for decision-making. Existing work in the literature either requires strong assumptions, such as linearity in the intensity function, or heuristically defined model parameters that do not necessarily meet the requirements of Granger causality. We propose Instance-wise Self-Attentive Hawkes Processes (ISAHP), a novel deep learning framework that can directly infer the Granger causality at the event instance level. ISAHP is the first neural point process model that meets the requirements of Granger causality. It leverages the self-attention mechanism of the transformer to align with the principles of Granger causality. We empirically demonstrate that ISAHP is capable of discovering complex instance-level causal structures that cannot be handled by classical models. We also show that ISAHP achieves state-of-the-art performance in proxy tasks involving type-level causal discovery and instance-level event type prediction.
Granger causality is popular for analyzing time series data in many applications from natural science to social science, including genomics, neuroscience, economics, and finance. Consequently, the Granger causality test has been one of the main concerns of econometricians for decades. Taking advantage of the theoretical breakthroughs in deep learning in recent years, we propose a doubly robust Granger causality test (DRGCT). Our method offers several key advantages. First, and most directly beneficial to users, DRGCT allows them to handle large lag orders while alleviating the curse of dimensionality that traditional nonlinear Granger causality tests usually face. Second, introducing a doubly robust test statistic for time series based on neural networks that achieves a parametric convergence rate not only suggests a new paradigm for nonparametric inference in econometrics, but also broadens the application scope of deep learning. Third, a multiplier bootstrap method, combined with the doubly robust approach, provides an efficient way to obtain critical values, effectively reducing computational time and avoiding redundant calculations. We prove that the test asymptotically controls the type I error while achieving power approaching one, and validate the effectiveness of our test through numerical simulations. In real data analysis, we apply DRGCT to revisit the price-volume relationship problem in the stock markets of America, China, and Japan.
Inferring causal relationships in observational time series data is an important task when interventions cannot be performed. Granger causality is a popular framework to infer potential causal mechanisms between different time series. The original definition of Granger causality is restricted to linear processes and leads to spurious conclusions in the presence of a latent confounder. In this work, we harness the expressive power of recurrent neural networks and propose a deep learning-based approach to model non-linear Granger causality by directly accounting for latent confounders. Our approach leverages multiple recurrent neural networks to parameterise predictive distributions and we propose the novel use of a dual-decoder setup to conduct the Granger tests. We demonstrate the model performance on non-linear stochastic time series for which the latent confounder influences the cause and effect with different time lags; results show the effectiveness of our model compared to existing benchmarks.
We study the problem of learning Granger causality between event types from asynchronous, interdependent, multi-type event sequences. Existing work suffers from either limited model flexibility or poor model explainability and thus fails to uncover Granger causality across a wide variety of event sequences with diverse event interdependency. To address these weaknesses, we propose CAUSE (Causality from AttribUtions on Sequence of Events), a novel framework for the studied task. The key idea of CAUSE is to first implicitly capture the underlying event interdependency by fitting a neural point process, and then extract from the process a Granger causality statistic using an axiomatic attribution method. Across multiple datasets riddled with diverse event interdependency, we demonstrate that CAUSE achieves superior performance on correctly inferring the inter-type Granger causality over a range of state-of-the-art methods.
Our goal is to estimate causal interactions in multivariate time series. Using vector autoregressive (VAR) models, these can be defined based on non-vanishing coefficients belonging to respective time-lagged instances. As in most cases a parsimonious causality structure is assumed, a promising approach to causal discovery consists in fitting VAR models with an additional sparsity-promoting regularization. Along this line we here propose that sparsity should be enforced for the subgroups of coefficients that belong to each pair of time series, as the absence of a causal relation requires the coefficients for all time-lags to become jointly zero. Such behavior can be achieved by means of l1-l2-norm regularized regression, for which an efficient active set solver has been proposed recently. Our method is shown to outperform standard methods in recovering simulated causality graphs. The results are on par with a second novel approach which uses multiple statistical testing.
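The grouping idea above can be sketched in a few lines: one coefficient group per ordered pair of series, collecting all lags; the l1-l2 penalty is the sum of the groups' Euclidean norms, and a causal edge is read off wherever a group is not jointly zero. The coefficient layout `coeffs[r][i][j]`, the restriction of the penalty to off-diagonal pairs, and the tolerance are assumptions made for this sketch; the solver itself (the active set method mentioned in the abstract) is not shown.

```python
import math


def group_norms(coeffs):
    """Euclidean norm over lags of the VAR coefficients for each ordered pair.

    coeffs[r][i][j] is the lag-(r+1) coefficient from series j to series i
    (a layout assumed for this sketch).
    """
    n = len(coeffs[0])
    return [[math.sqrt(sum(a[i][j] ** 2 for a in coeffs)) for j in range(n)]
            for i in range(n)]


def l1_l2_penalty(coeffs):
    """Sum of group norms over ordered pairs i != j (an l1 norm of l2 norms)."""
    norms = group_norms(coeffs)
    n = len(norms)
    return sum(norms[i][j] for i in range(n) for j in range(n) if i != j)


def causal_edges(coeffs, tol=1e-8):
    """Edges (source, target) whose coefficient group is not jointly zero."""
    norms = group_norms(coeffs)
    n = len(norms)
    return [(j, i) for i in range(n) for j in range(n)
            if i != j and norms[i][j] > tol]


# Toy order-2 VAR with two series: series 0 influences series 1 at both lags.
var_coeffs = [
    [[0.5, 0.0], [0.3, 0.2]],  # lag 1
    [[0.1, 0.0], [0.4, 0.0]],  # lag 2
]
penalty = l1_l2_penalty(var_coeffs)
edges = causal_edges(var_coeffs)
```

Here the only non-zero off-diagonal group is the one from series 0 to series 1, so `edges` contains the single pair `(0, 1)`. Penalizing whole groups rather than individual coefficients is what drives all lags of an absent link to zero together.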
Decoding EEG signals of different mental states is a challenging task for brain-computer interfaces (BCIs) due to the nonstationarity of perceptual decision processes. This paper presents a novel boosted convolutional neural network (ConvNet) decoding scheme for motor imagery (MI) EEG signals, assisted by multiwavelet-based time-frequency (TF) causality analysis. Specifically, multiwavelet basis functions are first combined with the Geweke spectral measure to obtain high-resolution TF-conditional Granger causality (CGC) representations, where a regularized orthogonal forward regression (ROFR) algorithm is adopted to detect a parsimonious model with good generalization performance. The causality images used as network input, which preserve the time, frequency and location information of connectivity, are then designed based on the TF-CGC distributions of alpha band multichannel EEG signals. Boosted ConvNets are further constructed using spatio-temporal convolutions as well as advances in deep learning, including cropping and boosting methods, to extract discriminative causality features and classify MI tasks. Our proposed approach outperforms the competition winner algorithm with a 12.15% increase in average accuracy and a 74.02% decrease in the associated inter-subject standard deviation for the same binary classification on BCI competition-IV dataset-IIa. Experimental results indicate that the boosted ConvNets with causality images work well in decoding MI-EEG signals and provide a promising framework for developing MI-BCI systems.
Exploratory analysis of time series data can yield a better understanding of complex dynamical systems. Granger causality is a practical framework for analysing interactions in sequential data, applied in a wide range of domains. In this paper, we propose a novel framework for inferring multivariate Granger causality under nonlinear dynamics based on an extension of self-explaining neural networks. This framework is more interpretable than other neural-network-based techniques for inferring Granger causality, since in addition to relational inference, it also allows detecting signs of Granger-causal effects and inspecting their variability over time. In comprehensive experiments on simulated data, we show that our framework performs on par with several powerful baseline methods at inferring Granger causality and that it achieves better performance at inferring interaction signs. The results suggest that our framework is a viable and more interpretable alternative to sparse-input neural networks for inferring Granger causality.
With the advancement of neural networks, diverse methods for neural Granger causality have emerged, which demonstrate proficiency in handling complex data and nonlinear relationships. However, the existing framework of neural Granger causality has several limitations. It requires the construction of separate predictive models for each target variable, and the inferred relationship depends on the sparsity of the first-layer weights, which makes it difficult to model complex relationships between variables effectively and leads to unsatisfactory estimation accuracy of Granger causality. Moreover, most existing methods cannot capture full-time Granger causality. To address these drawbacks, we propose a Jacobian Regularizer-based Neural Granger Causality (JRNGC) approach, a straightforward yet highly effective method for learning multivariate summary Granger causality and full-time Granger causality by constructing a single model for all target variables. Specifically, our method eliminates the sparsity constraints on weights by leveraging an input-output Jacobian matrix regularizer, which can subsequently be represented as the weighted causal matrix in the post-hoc analysis. Extensive experiments show that our proposed approach achieves competitive performance with the state-of-the-art methods for learning summary Granger causality and full-time Granger causality while maintaining lower model complexity and high scalability.
Granger causality is a commonly used method for uncovering information flow and dependencies in a time series. Here we introduce JGC (Jacobian Granger Causality), a neural network-based approach to Granger causality using the Jacobian as a measure of variable importance, and propose a thresholding procedure for inferring Granger causal variables using this measure. The resulting approach performs consistently well compared to other approaches in identifying Granger causal variables, the associated time lags, as well as interaction signs. Lastly, through the inclusion of a time variable, we show that this approach is able to learn the temporal dependencies for nonstationary systems whose Granger causal structures change in time.
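A minimal sketch of the idea of using the Jacobian as a variable-importance measure follows. It approximates the input-output Jacobian of a predictor by central finite differences, averages its absolute value over samples, and thresholds the result. The finite-difference approximation, the hand-written `toy_model` standing in for a trained network, and the threshold value are all assumptions for illustration; they are not the JGC procedure itself, which the abstract does not specify in detail.

```python
import math


def jacobian_importance(model, samples, eps=1e-5):
    """Mean absolute Jacobian of `model`, estimated by central differences.

    model maps a list of inputs to a list of outputs.
    Returns scores[i][j], the average |d output_i / d input_j| over samples.
    """
    n_in = len(samples[0])
    n_out = len(model(samples[0]))
    scores = [[0.0] * n_in for _ in range(n_out)]
    for x in samples:
        for j in range(n_in):
            up = list(x)
            down = list(x)
            up[j] += eps
            down[j] -= eps
            f_up, f_down = model(up), model(down)
            for i in range(n_out):
                scores[i][j] += abs(f_up[i] - f_down[i]) / (2 * eps) / len(samples)
    return scores


def toy_model(x):
    """Hypothetical stand-in for a trained one-step-ahead predictor.

    Output 0 depends only on input 0; output 1 depends on both inputs.
    """
    return [math.tanh(0.8 * x[0]), 0.5 * x[1] + math.sin(x[0])]


samples = [[0.1, -0.2], [0.5, 0.3], [-0.4, 0.9]]
scores = jacobian_importance(toy_model, samples)
threshold = 0.05  # illustrative value; choosing it is a separate procedure
granger_graph = [[s > threshold for s in row] for row in scores]
```

With this toy predictor, `granger_graph[0][1]` is `False` (input 1 never affects output 0) and the remaining entries are `True`. With a real network, the Jacobian would normally come from automatic differentiation rather than finite differences.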
While most classical approaches to Granger causality detection assume linear dynamics, many interactions in real-world applications, like neuroscience and genomics, are inherently nonlinear. In these cases, using linear models may lead to inconsistent estimation of Granger causal interactions. We propose a class of nonlinear methods by applying structured multilayer perceptrons (MLPs) or recurrent neural networks (RNNs) combined with sparsity-inducing penalties on the weights. By encouraging specific sets of weights to be zero--in particular, through the use of convex group-lasso penalties--we can extract the Granger causal structure. To further contrast with traditional approaches, our framework naturally enables us to efficiently capture long-range dependencies between series either via our RNNs or through an automatic lag selection in the MLP. We show that our neural Granger causality methods outperform state-of-the-art nonlinear Granger causality methods on the DREAM3 challenge data. This data consists of nonlinear gene expression and regulation time courses with only a limited number of time points. The successes we show in this challenging dataset provide a powerful example of how deep learning can be useful in cases that go beyond prediction on large datasets. We likewise illustrate our methods in detecting nonlinear interactions in a human motion capture dataset.
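One standard way to drive whole groups of weights exactly to zero under a group-lasso penalty is proximal gradient descent, whose proximal step is group soft-thresholding. The sketch below shows that step applied to the first-layer weights of a network predicting one target series, with one group per input series. Whether the authors optimize in this way is not stated in the abstract, so the optimizer, the group layout, and all names here are assumptions.

```python
import math


def group_soft_threshold(group, threshold):
    """Proximal operator of threshold * ||group||_2 (group soft-thresholding)."""
    norm = math.sqrt(sum(w * w for w in group))
    if norm <= threshold:
        # The whole group is zeroed: this input series is dropped.
        return [0.0] * len(group)
    scale = 1.0 - threshold / norm
    return [w * scale for w in group]


def prox_step(first_layer_groups, threshold):
    """Apply group soft-thresholding to every input series' weight group.

    first_layer_groups maps an input-series name to the flat list of
    first-layer weights attached to that series (all lags, all hidden units).
    """
    return {name: group_soft_threshold(g, threshold)
            for name, g in first_layer_groups.items()}


# Hypothetical weights of a network predicting one target series.
weights = {"series_a": [3.0, 4.0], "series_b": [0.3, 0.4]}
shrunk = prox_step(weights, threshold=1.0)
# Inputs whose group survives are read as Granger-causal for the target.
selected = [name for name, g in shrunk.items() if any(w != 0.0 for w in g)]
```

In this example the group for `series_a` (norm 5) is shrunk but kept, while the group for `series_b` (norm 0.5) is set exactly to zero, so only `series_a` is selected. In training, this step would alternate with gradient updates of the prediction loss.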
Discovering causal relationships in time series data is central in many scientific areas, ranging from economics to climate science. Granger causality is a powerful tool for causality detection. However, its original formulation is limited by its linear form and only recently nonlinear machine-learning generalizations have been introduced. This study contributes to the definition of neural Granger causality models by investigating the application of Kolmogorov-Arnold networks (KANs) in Granger causality detection and comparing their capabilities against multilayer perceptrons (MLP). In this work, we develop a framework called Granger Causality KAN (GC-KAN) along with a tailored training approach designed specifically for Granger causality detection. We test this framework on both Vector Autoregressive (VAR) models and chaotic Lorenz-96 systems, analysing the ability of KANs to sparsify input features by identifying Granger causal relationships, providing a concise yet accurate model for Granger causality detection. Our findings show the potential of KANs to outperform MLPs in discerning interpretable Granger causal relationships, particularly for the ability of identifying sparse Granger causality patterns in high-dimensional settings, and more generally, the potential of AI in causality discovery for the dynamical laws in physical systems.
We propose the Granger causality inference Kolmogorov-Arnold Networks (KANGCI), a novel architecture that extends the recently proposed Kolmogorov-Arnold Networks (KAN) to the domain of causal inference. By extracting base weights from KAN layers and incorporating the sparsity-inducing penalty and ridge regularization, KANGCI effectively infers the Granger causality from time series. Additionally, we propose an algorithm based on time-reversed Granger causality that automatically selects causal relationships with better inference performance from the original or time-reversed time series or integrates the results to mitigate spurious connectivities. Comprehensive experiments conducted on Lorenz-96, Gene regulatory networks, fMRI BOLD signals, VAR, and real-world EEG datasets demonstrate that the proposed model achieves competitive performance to state-of-the-art methods in inferring Granger causality from nonlinear, high-dimensional, and limited-sample time series.
Granger causality analysis, one of the most popular time series causality methods, has been widely used in economics and neuroscience. However, unobserved confounders are a fundamental problem in observational studies, and one that is still unsolved for nonlinear Granger causality. Applied work often handles this problem through proxy variables, which can be treated as noisy measurements of the confounder. But proxy variables have been shown to be unreliable because of the bias they may induce. In this paper, we try to "recover" the unobserved confounders for Granger causality. We use a generative model with a latent variable to model the relationship between the unobserved confounders and the observed variables (the tested variable and the proxy variables). The posterior distribution of the latent variable is adopted to represent the distribution of the confounders, and it can be sampled to obtain the estimated confounders. We adopt a variational autoencoder to estimate the intractable posterior distribution, and a recurrent neural network to model the temporal relationships in the data. We evaluate our method on synthetic and semi-synthetic datasets. The results show that our estimated confounders perform better than the proxy variables for nonlinear Granger causality with multiple proxies on the semi-synthetic dataset. However, performance on the synthetic dataset and under different proxy noise levels is poor. Any advice would be welcome.
Dependence between nodes in a network is an important concept that pervades many areas including finance, politics, sociology, genomics and the brain sciences. One way to characterize dependence between components of a multivariate time series is via Granger Causality (GC). Traditional approaches to GC estimation and inference commonly assume linear dynamics; however, this simplification does not hold in many real-world applications where signals are inherently non-linear. In such cases, imposing linear models such as vector autoregressive (VAR) models can lead to mis-characterization of true Granger causal interactions. To overcome this limitation, Tank et al. (IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022) proposed a solution that uses neural networks with sparse regularization penalties. The regularization encourages learnable weights to be sparse, which enables inference on GC. This paper overcomes the limitations of current methods by leveraging advances in machine learning and deep learning, which have been demonstrated to learn hidden patterns in the data. We propose novel classes of models that can handle underlying non-linearity in a computationally efficient manner, simultaneously providing GC and lag order selection. Firstly, we present the Learned Kernel VAR (LeKVAR) model, which learns a kernel parameterized by a shared neural net, followed by penalization on learnable weights to discover GC structure. Secondly, we show that one can directly decouple lags and individual time series importance via decoupled penalties. This is important because we want to select the lag order during the process of GC estimation. This decoupling acts as a filter and can be extended to any DL model, including Multi-Layer Perceptrons (MLP), Recurrent Neural Networks (RNN), Long Short-Term Memory networks (LSTM), and Transformers, for simultaneous GC estimation and lag selection.
We propose a method for analyzing dynamical networks based on a recent kernel-based measure of Granger causality between time series. The generalization of kernel Granger causality to the multivariate case, presented here, shares the following features with the bivariate measures: (i) the nonlinearity of the regression model can be controlled by choosing the kernel function and (ii) the problem of false causalities, arising as the complexity of the model increases, is addressed by a selection strategy for the eigenvectors of a reduced Gram matrix whose range represents the additional features due to the second time series. Moreover, there is no a priori assumption that the network must be a directed acyclic graph. We apply the proposed approach to a network of chaotic maps and to a simulated genetic regulatory network: it is shown that the underlying topology of the network can be reconstructed from time series of the nodes' dynamics, provided that a sufficient number of samples is available. Considering a linear dynamical network built by a preferential attachment scheme, we show that for limited data the use of bivariate Granger causality is a better choice than methods using L1 minimization. Finally, we consider real expression data from HeLa cells, with 94 genes and 48 time points. The analysis of static correlations between genes reveals two modules corresponding to well-known transcription factors; Granger analysis reveals nineteen causal relationships, all involving genes related to tumor development.
It is often useful in multivariate time series analysis to determine statistical causal relations between different time series. Granger causality is a fundamental measure for this purpose. Yet the traditional pairwise approach to Granger causality analysis may not clearly distinguish between direct causal influences from one time series to another and indirect ones acting through a third time series. In order to differentiate direct from indirect Granger causality, a conditional Granger causality measure in the frequency domain is derived based on a partition matrix technique. Simulations and an application to neural field potential time series are demonstrated to validate the method.
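The distinction between direct and indirect influence can be made concrete with a time-domain sketch. The abstract above derives a frequency-domain measure; the code below instead uses the simpler time-domain log-variance ratio from ordinary least-squares autoregressions, and the chain y → z → x, the model order, and the function names are assumptions chosen for illustration. Pairwise analysis reports an influence from y to x, while conditioning on z removes it.

```python
import numpy as np

def lagged(data, order):
    """Stack lags 1..order of data (T, d) into a (T - order, d * order) matrix."""
    T = data.shape[0]
    return np.hstack([data[order - k: T - k] for k in range(1, order + 1)])

def residual_variance(target, design):
    """Residual variance of a least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(design)), design])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return np.var(target - design @ coef)

def granger(x, y, z=None, order=2):
    """ln(var_restricted / var_full) for y -> x, optionally conditioned on z."""
    cond = [x] if z is None else [x, z]
    target = x[order:]
    restricted = lagged(np.column_stack(cond), order)
    full = lagged(np.column_stack(cond + [y]), order)
    return np.log(residual_variance(target, restricted)
                  / residual_variance(target, full))

# Simulated chain: y drives z, z drives x; there is no direct y -> x link.
rng = np.random.default_rng(0)
T = 5000
x, y, z = np.zeros(T), np.zeros(T), np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + rng.normal()
    z[t] = 0.8 * y[t - 1] + rng.normal()
    x[t] = 0.8 * z[t - 1] + rng.normal()

pairwise = granger(x, y)          # clearly positive: indirect path via z
conditional = granger(x, y, z)    # close to zero once z is accounted for
```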
While most classical approaches to Granger causality detection repose upon linear time series assumptions, many interactions in neuroscience and economics applications are nonlinear. We develop an approach to nonlinear Granger causality detection using multilayer perceptrons where the input to the network is the past time lags of all series and the output is the future value of a single series. A sufficient condition for Granger non-causality in this setting is that all of the outgoing weights of the input data, the past lags of a series, to the first hidden layer are zero. For estimation, we utilize a group lasso penalty to shrink groups of input weights to zero. We also propose a hierarchical penalty for simultaneous Granger causality and lag estimation. We validate our approach on simulated data from both a sparse linear autoregressive model and the sparse and nonlinear Lorenz-96 model.
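The penalties described above can be sketched as follows. This is a minimal illustration, not the authors' code: the weight layout `(n_hidden, n_series, n_lags)`, the function names, and the toy values are assumptions, and the nested-group form of the hierarchical penalty is one common way to write it. The proximal step shows how group soft-thresholding sets a whole series' input weights to exactly zero, which is what allows Granger non-causality to be read off the first layer.

```python
import numpy as np

def group_lasso_penalty(W):
    """Sum over input series of the Frobenius norm of that series' weights.

    W has shape (n_hidden, n_series, n_lags): first-layer weights grouped
    by input series and lag.
    """
    return sum(np.linalg.norm(W[:, j, :]) for j in range(W.shape[1]))

def hierarchical_penalty(W):
    """Nested groups: for each series and lag k, penalize lags k..K jointly,
    so that higher lags are shrunk to zero before lower ones."""
    _, p, K = W.shape
    return sum(np.linalg.norm(W[:, j, k:]) for j in range(p) for k in range(K))

def prox_group_lasso(W, step):
    """Group soft-thresholding: shrink each series' weight block toward zero."""
    out = W.copy()
    for j in range(W.shape[1]):
        norm = np.linalg.norm(W[:, j, :])
        scale = max(0.0, 1.0 - step / norm) if norm > 0 else 0.0
        out[:, j, :] = scale * W[:, j, :]
    return out

def granger_causal_inputs(W, tol=1e-12):
    """Series whose first-layer weight block is non-zero."""
    return np.array([np.linalg.norm(W[:, j, :]) > tol
                     for j in range(W.shape[1])])

# Toy first layer: series 0 has large weights, series 1 has tiny ones.
W = np.zeros((4, 2, 3))
W[:, 0, :] = 1.0
W[:, 1, :] = 0.01
W_shrunk = prox_group_lasso(W, step=0.1)
selected = granger_causal_inputs(W_shrunk)
```

In training, the proximal step would alternate with gradient steps on the prediction loss.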
Learning Granger causality from event sequences is a challenging but essential task across various applications. Most existing methods rely on the assumption that event sequences are independent and identically distributed (i.i.d.). However, this i.i.d. assumption is often violated due to the inherent dependencies among the event sequences. Fortunately, in practice, we find these dependencies can be modeled by a topological network, suggesting a potential solution to the non-i.i.d. problem by introducing the prior topological network into Granger causal discovery. This observation prompts us to tackle two ensuing challenges: 1) how to model the event sequences while incorporating both the prior topological network and the latent Granger causal structure, and 2) how to learn the Granger causal structure. To this end, we devise a unified topological neural Poisson auto-regressive model with two processes. In the generation process, we employ a variant of the neural Poisson process to model the event sequences, considering influences from both the topological network and the Granger causal structure. In the inference process, we formulate an amortized inference algorithm to infer the latent Granger causal structure. We encapsulate these two processes within a unified likelihood function, providing an end-to-end framework for this task. Experiments on simulated and real-world data demonstrate the effectiveness of our approach.
This paper presents the very first attempt to evaluate machine learning fairness for depression detection using electroencephalogram (EEG) data. We conduct experiments using different deep learning architectures such as Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) networks, and Gated Recurrent Unit (GRU) networks across three EEG datasets: Mumtaz, MODMA and Rest. We employ five different bias mitigation strategies at the pre-, in- and post-processing stages and evaluate their effectiveness. Our experimental results show that bias exists in existing EEG datasets and algorithms for depression detection, and different bias mitigation methods address bias at different levels across different fairness measures.
As a critical mental health disorder, depression has severe effects on both physical and mental well-being. Recent developments in EEG-based depression analysis have shown promise in improving depression detection accuracy. However, EEG features often contain redundant, irrelevant, and noisy information. Additionally, real-world EEG data acquisition frequently faces challenges such as data loss from electrode detachment and heavy noise interference. To tackle these challenges, we propose a novel feature selection approach for robust depression analysis, called Incomplete Depression Feature Selection with Missing EEG Channels (IDFS-MEC). IDFS-MEC integrates missing-channel indicator information and adaptive channel weighting learning into orthogonal regression to lessen the effects of incomplete channels on model construction, and then utilizes global redundancy minimization learning to reduce redundant information among selected feature subsets. Extensive experiments conducted on the MODMA and PRED-d003 datasets reveal that the EEG feature subsets chosen by IDFS-MEC outperform those of 10 popular feature selection methods across 3-, 64-, and 128-channel settings.
The bulk of the research effort on brain connectivity revolves around statistical associations among brain regions, which do not directly relate to the causal mechanisms governing brain dynamics. Here we propose the multiscale causal backbone (MCB) of brain dynamics, shared by a set of individuals across multiple temporal scales, and devise a principled methodology to extract it. Our approach leverages recent advances in multiscale causal structure learning and optimizes the trade-off between the model fit and its complexity. Empirical assessment on synthetic data shows the superiority of our methodology over a baseline based on canonical functional connectivity networks. When applied to resting-state fMRI data, we find sparse MCBs for both the left and right brain hemispheres. Thanks to its multiscale nature, our approach shows that at low-frequency bands, causal dynamics are driven by brain regions associated with high-level cognitive functions; at higher frequencies instead, nodes related to sensory processing play a crucial role. Finally, our analysis of individual multiscale causal structures confirms the existence of a causal fingerprint of brain connectivity, thus supporting the existing extensive research in brain connectivity fingerprinting from a causal perspective.
Graph neural networks (GNNs) are becoming increasingly popular for EEG-based depression detection. However, previous GNN-based methods fail to sufficiently consider the characteristics of depression, which limits their performance. Firstly, studies in neuroscience indicate that patients with depression exhibit both common and individualized abnormal brain patterns. Previous GNN-based approaches typically focus either on fixed graph connections to capture common abnormal brain patterns or on adaptive connections to capture individualized patterns, which is inadequate for depression detection. Secondly, the brain network exhibits a hierarchical structure, arranged from a channel-level graph to a region-level graph. This hierarchical structure varies among individuals and contains significant information relevant to detecting depression. Nonetheless, previous GNN-based methods overlook this individualized hierarchical information. To address these issues, we propose a Hybrid GNN (HGNN) that merges a Common Graph Neural Network (CGNN) branch utilizing fixed connections and an Individualized Graph Neural Network (IGNN) branch employing adaptive connections. The two branches capture common and individualized depression patterns respectively, complementing each other. Furthermore, we enhance the IGNN branch with a Graph Pooling and Unpooling Module (GPUM) to extract individualized hierarchical information. Extensive experiments on two public datasets show that our model achieves state-of-the-art performance.
Identifying causal relationships among distinct brain areas, known as effective connectivity, holds key insights into the brain's information processing and cognitive functions. Electroencephalogram (EEG) signals exhibit intricate dynamics and inter-areal interactions within the brain. However, methods for characterizing nonlinear causal interactions among multiple brain regions remain relatively underdeveloped. In this study, we propose a data-driven framework to infer effective connectivity by perturbing trained neural networks. Specifically, we trained neural networks (i.e., CNN, vanilla RNN, GRU, LSTM, and Transformer) to predict future EEG signals from historical data and perturbed the networks' input to obtain the effective connectivity (EC) between the perturbed EEG channel and the rest of the channels. The EC reflects the causal impact of perturbing one node on the others. Performance was tested on synthetic EEG generated by a biologically plausible Jansen-Rit model. CNN and Transformer obtained the best performance on both 3-channel and 90-channel synthetic EEG data, outperforming the classical Granger causality method. Our work demonstrates the potential of perturbing an artificial neural network, trained to predict future system dynamics, to uncover the underlying causal structure.
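The perturbation idea can be sketched on a toy system. Everything below is an assumption made for illustration: a least-squares linear one-step predictor stands in for the CNN, RNN, and Transformer models named in the abstract, the data come from a 3-channel linear autoregression rather than a Jansen-Rit model, and `perturbation_ec` is a hypothetical helper. The sketch bumps one input channel at a time and records how much the predicted next value of every other channel changes.

```python
import numpy as np

# Toy 3-channel system in which channel 0 drives channel 1.
rng = np.random.default_rng(0)
A = np.array([[0.5, 0.0, 0.0],
              [0.4, 0.5, 0.0],
              [0.0, 0.0, 0.5]])
T = 2000
X = np.zeros((T, 3))
for t in range(1, T):
    X[t] = A @ X[t - 1] + rng.normal(scale=0.1, size=3)

# Stand-in predictor: least-squares fit of x[t] from x[t-1].
B, *_ = np.linalg.lstsq(X[:-1], X[1:], rcond=None)

def predict(past):
    """Predict the next sample for each row of past (n_samples, n_channels)."""
    return past @ B

def perturbation_ec(predict, inputs, delta=1.0):
    """ec[src, dst]: mean absolute change in predicted dst per unit bump of src."""
    n = inputs.shape[1]
    ec = np.zeros((n, n))
    base = predict(inputs)
    for src in range(n):
        bumped = inputs.copy()
        bumped[:, src] += delta
        ec[src] = np.abs(predict(bumped) - base).mean(axis=0) / delta
    np.fill_diagonal(ec, 0.0)  # ignore self-influence
    return ec

ec = perturbation_ec(predict, X[:-1])
```

For a linear predictor this simply recovers the fitted coefficients; the approach becomes informative when the predictor is nonlinear, since the response then depends on the state being perturbed.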
As a common mental disorder, depression is a leading cause of various diseases worldwide. Early detection and treatment of depression can dramatically promote remission and prevent relapse. However, conventional ways of diagnosing depression require considerable human effort and impose an economic burden, while still being prone to misdiagnosis. On the other hand, recent studies report that physical characteristics are major contributors to the diagnosis of depression, which inspires us to mine the internal relationship with neural networks instead of relying on clinical experience. In this paper, neural networks are constructed to predict depression from physical characteristics. Two initialization methods are examined: Xavier and Kaiming initialization. Experimental results show that a 3-layer neural network with Kaiming initialization achieves 83% accuracy.
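The two initialization schemes differ only in how the weight scale is derived from the layer size. The sketch below uses the standard formulas (Xavier uniform with limit sqrt(6 / (fan_in + fan_out)), Kaiming normal with standard deviation sqrt(2 / fan_in)); the layer dimensions and function names are hypothetical and are not taken from the paper, which may have used other variants of either scheme.

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng):
    """Xavier (Glorot) uniform: variance 2 / (fan_in + fan_out)."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

def kaiming_normal(fan_in, fan_out, rng):
    """Kaiming (He) normal: variance 2 / fan_in, intended for ReLU layers."""
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))

# Hypothetical layer with 256 inputs and 512 outputs.
rng = np.random.default_rng(0)
W_xavier = xavier_uniform(256, 512, rng)
W_kaiming = kaiming_normal(256, 512, rng)
```

With these sizes the Kaiming weights have a larger spread than the Xavier weights, which compensates for ReLU zeroing roughly half of the activations.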
Background: Depression has become a major health burden worldwide, and effective detection of depression is a great public-health challenge. This electroencephalography (EEG)-based study explores effective biomarkers for depression recognition. Methods: Resting-state EEG data were collected from 24 patients with major depressive disorder (MDD) and 29 normal controls using a 128-channel HydroCel Geodesic Sensor Net (HCGSN). To better identify depression, we extracted different types of EEG features, including linear features, nonlinear features, and the functional connectivity feature phase lag index (PLI), to comprehensively analyze the EEG signals of patients with MDD. We then used different feature selection methods and classifiers to evaluate the optimal feature sets. Results: The functional connectivity feature PLI is superior to the linear and nonlinear features. When all feature types are combined to classify MDD patients, the highest classification accuracy, 82.31%, is obtained with the ReliefF feature selection method and a logistic regression (LR) classifier. Analysis of the distribution of the optimal feature set showed that intrahemispheric PLI connection edges far outnumbered interhemispheric ones, and that the intrahemispheric connection edges differed significantly between the two groups. Conclusion: The functional connectivity feature PLI plays an important role in depression recognition. In particular, intrahemispheric PLI connection edges might be an effective biomarker for identifying depression. The statistical results also suggest that MDD patients might have functional impairment in the left hemisphere.
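The phase lag index used as a connectivity feature above is commonly defined as the absolute mean sign of the sine of the instantaneous phase difference between two signals. The sketch below follows that definition; the FFT-based analytic signal, the function names, the 250 Hz sampling rate, and the test sinusoids are assumptions for illustration, and real EEG would first be band-pass filtered and epoched.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT (zero out negative frequencies)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    h = np.zeros(n)
    if n % 2 == 0:
        h[0] = h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[0] = 1.0
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * h)

def phase_lag_index(x, y):
    """PLI = |mean(sign(sin(phase_x - phase_y)))|, in [0, 1]."""
    dphi = np.angle(analytic_signal(x)) - np.angle(analytic_signal(y))
    return np.abs(np.mean(np.sign(np.sin(dphi))))

def pli_matrix(data):
    """Symmetric PLI matrix for data of shape (n_channels, n_samples)."""
    n = data.shape[0]
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            out[i, j] = out[j, i] = phase_lag_index(data[i], data[j])
    return out

# Two 10 Hz sinusoids sampled at an assumed 250 Hz, with a constant phase lag.
t = np.arange(1000) / 250.0
x = np.sin(2 * np.pi * 10 * t)
y = np.sin(2 * np.pi * 10 * t - np.pi / 4)
pli_xy = phase_lag_index(x, y)
pli_xx = phase_lag_index(x, x)
connectivity = pli_matrix(np.vstack([x, y]))
```

A consistent non-zero lag gives a PLI near 1, while zero-lag coupling gives 0; this insensitivity to zero-lag coupling is why PLI is often preferred when volume conduction is a concern.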
Depression is a very common but serious mood disorder. In this paper, we build a generative detection network (GDN) in accordance with three physiological laws. Our aim is for the neural network to learn the relevant brain activity from the EEG signal and, at the same time, to regenerate the target electrode signal from that brain activity. We trained two generators: the first learns the characteristics of depressed brain activity, and the second learns the characteristics of the control group's brain activity. At test time, a segment of EEG signal is fed into the two generators separately; if the relationship between the EEG signal and brain activity conforms to the characteristics of a certain category, then the signal generated by that category's generator is more consistent with the original signal. This makes it possible to determine the category of a given EEG segment. We obtained an accuracy of 92.30% on the MODMA dataset and 86.73% on the HUSM dataset. Moreover, the model can output explainable information, which can help the user discover possible misjudgments of the network. Our code will be released.
The recent promise of generative deep learning has brought interest to its potential uses in neural engineering. In this paper, we first review recently emerging studies on generating artificial electroencephalography (EEG) signals with deep neural networks. Subsequently, we present our feasibility experiments on generating condition-specific multichannel EEG signals using conditional variational autoencoders. By manipulating real resting-state EEG epochs, we present an approach to synthetically generate multichannel time-series signals that show the spectro-temporal EEG patterns expected during distinct motor imagery conditions.
There is a recent trend to leverage the power of graph neural networks (GNNs) for brain-network-based psychiatric diagnosis, which, in turn, also motivates an urgent need for psychiatrists to fully understand the decision behavior of the GNNs used. However, most existing GNN explainers are either post-hoc, in which case another interpretive model needs to be created to explain a well-trained GNN, or do not consider the causal relationship between the extracted explanation and the decision, such that the explanation itself contains spurious correlations and suffers from weak faithfulness. In this work, we propose a Granger causality-inspired graph neural network (CI-GNN), a built-in interpretable model that is able to identify the most influential subgraph (i.e., functional connectivity within brain regions) that is causally related to the decision (e.g., major depressive disorder patients or healthy controls), without training an auxiliary interpretive network. CI-GNN learns disentangled subgraph-level representations α and β that encode, respectively, the causal and non-causal aspects of the original graph under a graph variational autoencoder framework, regularized by a conditional mutual information (CMI) constraint. We theoretically justify the validity of the CMI regularization in capturing the causal relationship. We also empirically evaluate the performance of CI-GNN against three baseline GNNs and four state-of-the-art GNN explainers on synthetic data and three large-scale brain disease datasets. We observe that CI-GNN achieves the best performance across a wide range of metrics and provides more reliable and concise explanations that are supported by clinical evidence. The source code and implementation details of CI-GNN are freely available in the GitHub repository (https://github.com/ZKZ-Brain/CI-GNN/).
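This research area is undergoing a paradigm shift from traditional linear connectivity analysis to nonlinear causal modeling driven by deep learning. The core research path is as follows: first, deep neural networks are used to improve Granger causality algorithms so that they capture the complex nonlinear characteristics of EEG; second, these causal features are fed into graph neural networks (GNNs) to build efficient systems for automatic depression recognition; at the same time, studies examine the neural circuit mechanisms of depression and aim to predict individual treatment response accurately. In addition, with the introduction of large-scale pretrained models, federated learning, and generative augmentation techniques, the field is moving toward greater robustness, fairness, and clinical practicality.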