A Dual-Path Spatio-Temporal Multi-Task Convolutional Graph Network Architecture
Dual-Path Spatio-Temporal Decoupling, Dual-Branch Interactive Fusion, and Cross-Channel Modeling
The papers in this group share an explicit dual-branch/dual-path/dual-channel/dual-graph structural design: spatial and temporal information (or information from different graphs/modalities) is extracted in parallel, and cross-path information exchange is realized through cross-attention, bidirectional/dual-channel interaction, fusion gates, and similar mechanisms, improving spatio-temporal representation alignment and expressive efficiency. Their core goal and architectural philosophy closely match Dual-Path Spatio-Temporal Separation.
- Dual-branch spatio-temporal graph neural networks for pedestrian trajectory prediction(Xingchen Zhang, Panagiotis Angeloudis, Y. Demiris, 2023, Pattern Recognition)
- DPCA-GCN: Dual-Path Cross-Attention Graph Convolutional Networks for Skeleton-Based Action Recognition(Khadija Lasri, Khalid El Fazazy, M. A. Mahraz, H. Tairi, J. Riffi, 2025, Computation)
- A Dual-path Spatial-Temporal Separation Framework for 3D Human Pose Estimation(Yiran Liu, Ruirui Li, Yuandong Hu, 2025, 2025 International Joint Conference on Neural Networks (IJCNN))
- A dual-path network based on temporal and spatial features fusion for remaining useful life prediction of aero-engines(M Wu, Z Li, H Shi, Y Zheng, X Liu, 2025, Measurement Science and …)
- STDGFN: A spatio-temporal dual-graph fusion network for traffic flow prediction(Ruotian Ye, Yitong Tao, Qingjian Ni, 2026, Applied Intelligence)
- DC-STGCN: Dual-Channel Based Graph Convolutional Networks for Network Traffic Forecasting(Cheng-yi Pan, Jiang Zhu, Zhixiang Kong, Huaifeng Shi, Wensheng Yang, 2021, Electronics)
- Spatio–Temporal Bidirectional Gated Graph Convolutional Network for Skeleton Action Recognition in Dynamic Complex Environments(Lifeng Yin, Yunhan Wang, Qianxi Zhou, Miao Wang, Maohua Sun, Wu Deng, 2025, IEEE Internet of Things Journal)
- Spatial-temporal dual interactive graph convolutional networks for traffic flow forecasting(Wensheng Zhang, H. Cai, Hongli Shi, Zhenzhen Han, 2025, Future Generation Computer Systems)
- Dual-STGAT: Dual Spatio-Temporal Graph Attention Networks With Feature Fusion for Pedestrian Crossing Intention Prediction(Jing Lian, Yiyang Luo, Xuecheng Wang, Linhui Li, Ge Guo, Weiwei Ren, Tao Zhang, 2025, IEEE Transactions on Intelligent Transportation Systems)
- Spatiotemporal Dual-Graph Interaction Network for Wind-Speed Forecasting(Jianlong Zhang, Li Wu, Zheng Zhao, Botao Zhang, 2025, Proceedings of the 2025 5th International Conference on Big Data, Artificial Intelligence and Risk Management)
- A dual-path dynamic directed graph convolutional network for air quality prediction.(Xiao Xiao, Zhiling Jin, Shuo Wang, Jing Xu, Ziyan Peng, Rui Wang, Wei Shao, Yilong Hui, 2022, Science of The Total Environment)
- MTGnet: Multi-Task Spatiotemporal Graph Convolutional Networks for Air Quality Prediction(Dan. Lu, R. Chen, Shanshan Sui, Qilong Han, Linglong Kong, Yichen Wang, 2022, 2022 International Joint Conference on Neural Networks (IJCNN))
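The dual-path pattern these papers share can be reduced to a few lines: two parallel branches extract intra-frame (spatial) and inter-frame (temporal) features, and a gate fuses them. This is a minimal NumPy illustration, not any specific paper's architecture; the mean-filter temporal convolution and the (s - t) gate logits are simplifications chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy skeleton sequence: T frames, N joints, C feature channels.
T, N, C = 8, 5, 4
x = rng.standard_normal((T, N, C))

# Row-normalized adjacency for the spatial path (a chain graph over joints).
A = np.eye(N) + np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
A = A / A.sum(axis=1, keepdims=True)

def spatial_path(x, W):
    # Graph convolution within each frame: aggregate neighboring joints, project channels.
    return np.einsum("ij,tjc,cd->tid", A, x, W)

def temporal_path(x, W, k=3):
    # Same-padded temporal smoothing along the frame axis, then a channel projection
    # (a stand-in for a learned temporal convolution).
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0), (0, 0)))
    smoothed = sum(xp[i:i + T] for i in range(k)) / k
    return np.einsum("tnc,cd->tnd", smoothed, W)

Ws, Wt = rng.standard_normal((C, C)), rng.standard_normal((C, C))
s, t = spatial_path(x, Ws), temporal_path(x, Wt)

# Gated fusion: an element-wise sigmoid gate balances the two paths.
g = 1.0 / (1.0 + np.exp(-(s - t)))
y = g * s + (1.0 - g) * t
print(y.shape)  # (8, 5, 4)
```

In the surveyed models each path is a deep stack of learned graph/temporal layers and the gate has its own parameters; parallel paths plus gated fusion is the common denominator.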
Dynamic/Adaptive Graph Structure Learning and Multi-View/Multi-Graph Spatial Dependency Modeling
The shared focus of this group is making adjacency/relations adaptive and dynamic, together with multi-view/multi-graph structure generation: graph structures (adjacency matrices, directional connections, correlation-driven dynamic edges, etc.) are learned or reconstructed, and predefined graphs are fused with data-driven graphs to capture the variability of spatial dependencies and hidden relationships. Compared with pure attention-based reweighting, the emphasis here is on dynamically constructing the graph itself and improving the quality of the spatial topology.
- MAGCN: A Multi-Adaptive Graph Convolutional Network for Traffic Forecasting(Qingyuan Zhan, Guixing Wu, Chuang Gan, 2021, 2021 International Joint Conference on Neural Networks (IJCNN))
- Multi-View Spatial–Temporal Graph Convolutional Network for Traffic Prediction(Shuqing Wei, Siyuan Feng, Hai Yang, 2024, IEEE Transactions on Intelligent Transportation Systems)
- D-STGCN: Dynamic Pedestrian Trajectory Prediction Using Spatio-Temporal Graph Convolutional Networks(Bogdan Ilie Sighencea, I. Stanciu, C. Căleanu, 2023, Electronics)
- Dual-STGAT: Dual Spatio-Temporal Graph Attention Networks With Feature Fusion for Pedestrian Crossing Intention Prediction(Jing Lian, Yiyang Luo, Xuecheng Wang, Linhui Li, Ge Guo, Weiwei Ren, Tao Zhang, 2025, IEEE Transactions on Intelligent Transportation Systems)
- MTDA-STGCN: Modern Temporal and Dual-Attention-Based Spatiotemporal Graph Convolutional Network for 4D Trajectory Prediction(Yuheng Kuang, Shuxuan Yuan, Xiang Zou, Jianping Zhang, Zhenyu Shi, Yuding Zhang, Fanman Meng, Zhengning Wang, 2025, IEEE Transactions on Intelligent Transportation Systems)
- A dual-path dynamic directed graph convolutional network for air quality prediction.(Xiao Xiao, Zhiling Jin, Shuo Wang, Jing Xu, Ziyan Peng, Rui Wang, Wei Shao, Yilong Hui, 2022, Science of The Total Environment)
- Spatiotemporal Dual-Graph Interaction Network for Wind-Speed Forecasting(Jianlong Zhang, Li Wu, Zheng Zhao, Botao Zhang, 2025, Proceedings of the 2025 5th International Conference on Big Data, Artificial Intelligence and Risk Management)
- Spatiotemporal interactive learning dynamic adaptive graph convolutional network for traffic forecasting(Feng Jiang, Xingyu Han, Shiping Wen, Tianhai Tian, 2025, Knowledge-Based Systems)
- ST-DAGCN: A spatiotemporal dual adaptive graph convolutional network model for traffic prediction(Yutian Liu, Tao Feng, S. Rasouli, Melvin Wong, 2024, Neurocomputing)
- Correlation Adaptive Dynamic Graph Convolutional Networks for Traffic Flow Prediction(Yan Chen, Dawen Xia, Yang Hu, Wenyong Zhang, Fuchu Zhang, 2025, Lecture Notes in Computer Science)
- A Spatio-Temporal Traffic Flow Prediction Method Based on Dynamic Graph Convolution Network(Guoliang Yang, Huasheng Yu, Hao Xi, 2022, 2022 34th Chinese Control and Decision Conference (CCDC))
- AdpSTGCN: Adaptive spatial-temporal graph convolutional network for traffic forecasting(Xudong Zhang, Xuewen Chen, Haina Tang, Yulei Wu, Hanji Shen, Jun Li, 2024, Knowledge-Based Systems)
- MTAGCN: Mixed Temporal Adaptive GCN for Air Pollution Prediction(Yuxuan Cheng, Zhanquan Wang, 2024, 2024 20th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD))
- SA-STGCN: Structural-Adaptive Spatio-Temporal Graph Convolution with Spatio-Temporal Attunement for skeleton-based gesture recognition(Junhui Li, M. Al-qaness, 2026, Robotics and Autonomous Systems)
- Adaptive Spatial-Temporal Graph Convolution Networks for Collaborative Local-Global Learning in Traffic Prediction(Yibi Chen, Yunchuan Qin, Kenli Li, C. Yeo, Keqin Li, 2023, IEEE Transactions on Vehicular Technology)
- STADGCN: spatial–temporal adaptive dynamic graph convolutional network for traffic flow prediction(Ying Shi, Wentian Cui, Ruiqin Wang, Jungang Lou, Qing Shen, 2025, Neural Computing and Applications)
- ADSTGCN: A Dynamic Adaptive Deeper Spatio-Temporal Graph Convolutional Network for Multi-Step Traffic Forecasting(Zhe Cui, Junjun Zhang, Giseop Noh, Hyunmok Park, 2023, Sensors)
- Spatiotemporal Adaptive Gated Graph Convolution Network for Urban Traffic Flow Forecasting(Bin Lu, Xiaoying Gan, Haiming Jin, Luoyi Fu, Haisong Zhang, 2020, Proceedings of the 29th ACM International Conference on Information & Knowledge Management)
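The common mechanism in this group — learning a data-driven adjacency and fusing it with a predefined one — can be sketched as follows. This is a toy NumPy version in the spirit of embedding-based adaptive graphs; the node-embedding similarity and the fixed mixing weight alpha are illustrative choices, not taken from any one paper.

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 6, 3

# Predefined graph (e.g. from road distances), row-normalized.
A_pre = (rng.random((N, N)) < 0.4).astype(float)
np.fill_diagonal(A_pre, 1.0)
A_pre /= A_pre.sum(axis=1, keepdims=True)

# Learnable node embeddings; during training these would be model parameters.
E = rng.standard_normal((N, d))

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Data-driven adjacency: ReLU similarity of embeddings, row-wise softmax.
A_adp = softmax(np.maximum(E @ E.T, 0.0))

# Fuse the predefined and adaptive graphs with a mixing weight.
alpha = 0.5
A = alpha * A_pre + (1 - alpha) * A_adp

# Both inputs are row-stochastic, so the fused graph is too.
assert np.allclose(A.sum(axis=1), 1.0)
```

Because the embeddings are trained end-to-end, the adaptive term can discover hidden edges absent from the predefined topology — the core claim of this group.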
Spatio-Temporal Attention and Adaptive/Gating Mechanisms (Dynamic Reweighting of Dependency Strengths and Feature Fusion)
The papers in this group use attention (channel/spatial/temporal/multi-head/CBAM, etc.) together with gating/adaptive modules to dynamically adjust the strength of spatio-temporal dependencies and the way features are fused: explicit "weight learning then feature recalibration" improves expressiveness and robustness on complex dynamic data. The focus is on modeling dynamic dependencies rather than primarily updating the graph structure itself.
- Channel attention and multi-scale graph neural networks for skeleton-based action recognition(Ronghao Dang, Chengju Liu, Meilin Liu, Qi Chen, 2022, AI Communications)
- Self-Relational Graph Convolution Network for Skeleton-Based Action Recognition(S. B. Yussif, Ning Xie, Yang Yang, H. Shen, 2023, Proceedings of the 31st ACM International Conference on Multimedia)
- Multi-attention Augmented Spatio-Temporal Graph Convolution Network for Gait Recognition Based on IMUs Data(Jianjun Yan, Zhihao Yang, Yue Lin, Wei Zhou, 2025, 2025 2nd International Conference on Electronic Engineering and Information Systems (EEISS))
- Human Behavior Recognition Based on Attention Mechanism and Bottleneck Residual Dual-Path Spatiotemporal Graph Convolutional Network(Jingke Xu, Cong Pan, 2024, 2024 4th International Conference on Neural Networks, Information and Communication (NNICE))
- SA-STGCN: Structural-Adaptive Spatio-Temporal Graph Convolution with Spatio-Temporal Attunement for skeleton-based gesture recognition(Junhui Li, M. Al-qaness, 2026, Robotics and Autonomous Systems)
- Attention-based spatial–temporal adaptive dual-graph convolutional network for traffic flow forecasting(Dawen Xia, Bingqi Shen, Jian Geng, Yang Hu, Yantao Li, Huaqing Li, 2023, Neural Computing and Applications)
- STAE-GCN: A Spatio-Temporal Adaptive Embedding Graph Convolutional Network for Multistep Traffic Forecasting(Changxi Yuan, Xin Li, Yuliang He, Hanchi Zhang, Jianing Liu, 2025, 2025 18th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI))
- Spatiotemporal Adaptive Gated Graph Convolution Network for Urban Traffic Flow Forecasting(Bin Lu, Xiaoying Gan, Haiming Jin, Luoyi Fu, Haisong Zhang, 2020, Proceedings of the 29th ACM International Conference on Information & Knowledge Management)
- Spatio-Temporal Gating-Adjacency GCN for Human Motion Prediction(Chongyang Zhong, Lei Hu, Zihao Zhang, Yongjing Ye, Shi-hong Xia, 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR))
- Spatio–Temporal Bidirectional Gated Graph Convolutional Network for Skeleton Action Recognition in Dynamic Complex Environments(Lifeng Yin, Yunhan Wang, Qianxi Zhou, Miao Wang, Maohua Sun, Wu Deng, 2025, IEEE Internet of Things Journal)
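As a concrete instance of "weight learning then feature recalibration", here is a minimal squeeze-and-excitation-style channel attention over a (time, nodes, channels) feature map. The bottleneck size and random weights are placeholders for learned parameters, not any surveyed model's exact module.

```python
import numpy as np

rng = np.random.default_rng(2)
T, N, C = 6, 4, 8
x = rng.standard_normal((T, N, C))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Squeeze: global average over time and nodes -> one descriptor per channel.
s = x.mean(axis=(0, 1))                       # shape (C,)

# Excite: a tiny bottleneck MLP produces per-channel gates in (0, 1).
W1, W2 = rng.standard_normal((C, C // 2)), rng.standard_normal((C // 2, C))
g = sigmoid(np.maximum(s @ W1, 0.0) @ W2)     # shape (C,)

# Recalibrate: reweight each channel of the spatio-temporal feature map.
y = x * g
print(y.shape)  # (6, 4, 8)
```

Spatial or temporal attention follows the same squeeze/excite/rescale template, just pooled over different axes.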
Multi-Task Learning and Joint Prediction (Multi-Objective Outputs over Shared Spatio-Temporal Representations)
The papers in this group are all centered on the multi-task learning paradigm: within a single spatio-temporal graph network framework, they share representations while jointly predicting different targets, granularities, or levels (e.g., action recognition plus missing-data completion, regions plus inter-region relations, different prediction statistics, or tasks related to external factors), using auxiliary tasks to improve the main task's discriminability and generalization. This is joint spatio-temporal graph learning oriented toward multiple objectives.
- Multi-Task Synchronous Graph Neural Networks for Traffic Spatial-Temporal Prediction(He Li, D. Jin, Xuejiao Li, Jianbin Huang, Jaesoo Yoo, 2021, Proceedings of the 29th International Conference on Advances in Geographic Information Systems)
- Towards Accurate 3D Human Motion Prediction from Incomplete Observations(Qiongjie Cui, Huaijiang Sun, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR))
- Traffic Accident Risk Prediction via Multi-View Multi-Task Spatio-Temporal Networks(Senzhang Wang, Jiaqiang Zhang, Jiyue Li, Hao Miao, Jiannong Cao, 2023, IEEE Transactions on Knowledge and Data Engineering)
- Multi-View Multi-Task Spatiotemporal Graph Convolutional Network for Air Quality Prediction(Shanshan Sui, Qilong Han, 2022, SSRN Electronic Journal)
- MTGnet: Multi-Task Spatiotemporal Graph Convolutional Networks for Air Quality Prediction(Dan. Lu, R. Chen, Shanshan Sui, Qilong Han, Linglong Kong, Yichen Wang, 2022, 2022 International Joint Conference on Neural Networks (IJCNN))
- Human Movement Science-Informed Multi-Task Spatio Temporal Graph Convolutional Networks for Fitness Action Recognition and Evaluation(Jia-Wei Chang, Ming‐Hung Chen, Hao-Shang Ma, Hao-Lan Liu, 2023, Applied Soft Computing)
- TAP: Traffic Accident Profiling via Multi-Task Spatio-Temporal Graph Representation Learning(Zhi Liu, Yang Chen, Feng Xia, Jixin Bian, Bing Zhu, Guojiang Shen, Xiangjie Kong, 2022, ACM Transactions on Knowledge Discovery from Data)
- Multi-Task Spatial-Temporal Graph Attention Network for Taxi Demand Prediction(Mingming Wu, Chaochao Zhu, Lianliang Chen, 2020, Proceedings of the 2020 5th International Conference on Mathematics and Artificial Intelligence)
- Traffic Flow Prediction for Highway Based on Multi-Task Spatiotemporal Graph Network(Jinyong Gao, Sheng Luo, Junshan Tian, Cheng Zhou, Lianhua An, 2025, Transportation Safety and Environment)
- MT-STNets: Multi-Task Spatial-Temporal Networks for Multi-Scale Traffic Prediction(Senzhang Wang, Meiyue Zhang, Hao Miao, Philip S. Yu, 2021, Proceedings of the 2021 SIAM International Conference on Data Mining (SDM))
- MT-STNet: A Novel Multi-Task Spatiotemporal Network for Highway Traffic Flow Prediction(Guojian Zou, Ziliang Lai, Ting Wang, Zongshi Liu, Ye Li, 2024, IEEE Transactions on Intelligent Transportation Systems)
- AutoSTL: Automated Spatio-Temporal Multi-Task Learning(Zijian Zhang, Xiangyu Zhao, Miao Hao, Chunxu Zhang, Hongwei Zhao, Junbo Zhang, 2023, Proceedings of the AAAI Conference on Artificial Intelligence)
- STL Net: A spatio-temporal multi-task learning network for Autism spectrum disorder identification(Yongjie Huang, Yanyan Zhang, Man Chen, Xiao Han, Zhisong Pan, 2025, Biomedical Signal Processing and Control)
- Multi-task learning for gait-based identity recognition and emotion recognition using attention enhanced temporal graph convolutional network(Weijie Sheng, Xinde Li, 2021, Pattern Recognition)
- Traffic Flow Forecasting Model Based on Spatio-temporal Probability SparseAdaptive Hybrid Graph Convolution Network(Jia Luo, Linlong Chen, 2024, 2024 14th International Conference on Information Technology in Medicine and Education (ITME))
- Spatio-temporal Dual Graph Neural Networks for Travel Time Estimation(G. Jin, Huan Yan, Fuxian Li, Jincai Huang, Yong Li, 2021, ACM Transactions on Spatial Algorithms and Systems)
- A multi-task spatio-temporal fully convolutional model incorporating interaction patterns for traffic flow prediction(Qianqian Zhou, Ping Tu, Nan Chen, 2024, International Journal of Geographical Information Science)
- Cross-turbine fault diagnosis for wind turbines with SCADA data: A spatio-temporal graph network with multi-task learning(Xinhua Zhang, Guoqian Jiang, Wenyue Li, Die Bai, Qun He, Ping Xie, 2024, 2024 39th Youth Academic Annual Conference of Chinese Association of Automation (YAC))
- An IoMT Based Epileptic Seizure Predictor Using Dual-Path Temporal-Spectral Encoded Spatial Dynamic Graph Convolution Network(Weidong Yan, Mingyi Sun, Yang Li, 2024, 2024 4th International Conference on Industrial Automation, Robotics and Control Engineering (IARCE))
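The shared-trunk, per-task-head structure common to this group boils down to a few lines. The flow/speed task names, the single linear trunk standing in for a spatio-temporal GCN encoder, and the auxiliary weight lam are illustrative assumptions, not drawn from any one paper.

```python
import numpy as np

rng = np.random.default_rng(3)
B, F, H = 16, 10, 8
x = rng.standard_normal((B, F))
y_flow = rng.standard_normal((B, 1))    # main task target (e.g. traffic flow)
y_speed = rng.standard_normal((B, 1))   # auxiliary task target (e.g. speed)

# Shared trunk: stand-in for a spatio-temporal graph encoder.
W_shared = rng.standard_normal((F, H))
h = np.maximum(x @ W_shared, 0.0)

# One lightweight head per task on top of the shared representation.
W_flow = rng.standard_normal((H, 1))
W_speed = rng.standard_normal((H, 1))
pred_flow, pred_speed = h @ W_flow, h @ W_speed

# Joint objective: weighted sum of per-task losses; gradients through the
# shared trunk let the auxiliary task regularize the main one.
lam = 0.3
loss = np.mean((pred_flow - y_flow) ** 2) + lam * np.mean((pred_speed - y_speed) ** 2)
print(float(loss) > 0)  # True
```

Task-weighting schemes vary across the group (fixed weights, uncertainty weighting, automated search as in AutoSTL), but the shared-trunk/multi-head skeleton is common to all.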
Multi-Scale Temporal Modeling and Temporal Dependency Enhancement (TCN / Causal Convolution / Memory / Long- and Short-Term Coverage)
The shared focus of this group is the temporal modeling strategy: multi-scale temporal/dilated convolutions, bidirectional or gated temporal modeling, TCN/causal-convolution-style structures, and long-term dependency plus memory-style fusion are used to cover both short- and long-range temporal relationships. The core improvements typically lie in temporal processing rather than in changing the spatial graph structure or the task formulation.
- MSA-GCN: Exploiting Multi-Scale Temporal Dynamics With Adaptive Graph Convolution for Skeleton-Based Action Recognition(Kowovi Comivi Alowonou, Ji-Hyeong Han, 2024, IEEE Access)
- MTDA-STGCN: Modern Temporal and Dual-Attention-Based Spatiotemporal Graph Convolutional Network for 4D Trajectory Prediction(Yuheng Kuang, Shuxuan Yuan, Xiang Zou, Jianping Zhang, Zhenyu Shi, Yuding Zhang, Fanman Meng, Zhengning Wang, 2025, IEEE Transactions on Intelligent Transportation Systems)
- Spatio–Temporal Bidirectional Gated Graph Convolutional Network for Skeleton Action Recognition in Dynamic Complex Environments(Lifeng Yin, Yunhan Wang, Qianxi Zhou, Miao Wang, Maohua Sun, Wu Deng, 2025, IEEE Internet of Things Journal)
- PTP-STGCN: Pedestrian Trajectory Prediction Based on a Spatio-temporal Graph Convolutional Neural Network(J. Lian, Weiwei Ren, Linhui Li, Yafu Zhou, Bin Zhou, 2022, Applied Intelligence)
- D-STGCN: Dynamic Pedestrian Trajectory Prediction Using Spatio-Temporal Graph Convolutional Networks(Bogdan Ilie Sighencea, I. Stanciu, C. Căleanu, 2023, Electronics)
- LGTCN: A Spatial–Temporal Traffic Flow Prediction Model Based on Local–Global Feature Fusion Temporal Convolutional Network(Wei Ye, Haoxuan Kuang, Kunxiang Deng, Dongran Zhang, Jun Li, 2024, Applied Sciences)
- Lightweight Multiscale Spatio-Temporal Graph Convolutional Network for Skeleton-Based Action Recognition(Zhiyun Zheng, Qilong Yuan, Huaizhu Zhang, Yizhou Wang, Junfeng Wang, 2025, Big Data Mining and Analytics)
- Spatial-Temporal Graph Fusion with Dual-Scale Convolution for Traffic Flow Prediction(Dan Wang, Meng Cui, Zhenhua Yu, Yukang Liu, 2026, Computers, Materials & Continua)
- Stacked Spatio-Temporal Graph Convolutional Networks for Action Segmentation(P. Ghosh, Yi Yao, L. Davis, Ajay Divakaran, 2018, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV))
- Graph Multi-Head Convolution for Spatio-Temporal Attention in Origin Destination Tensor Prediction(M. Bhanu, Rahul Kumar, Saswata Roy, João Mendes-Moreira, Joydeep Chandra, 2022, Lecture Notes in Computer Science)
- TT-GCN: Temporal-Tightly Graph Convolutional Network for Emotion Recognition From Gaits(Ieee Tong Zhang Member, Yelin Chen, Shuzhen Li, Xiping Hu, Ieee C. L. Philip Chen Fellow, C. L. Philip, 2024, IEEE Transactions on Computational Social Systems)
- Spatio-temporal Dual Graph Neural Networks for Travel Time Estimation(G. Jin, Huan Yan, Fuxian Li, Jincai Huang, Yong Li, 2021, ACM Transactions on Spatial Algorithms and Systems)
- Spatio‐temporal adaptive graph convolutional networks for traffic flow forecasting(Qiwei Ma, Weiran Sun, Junbo Gao, Pengwei Ma, Mengjie Shi, 2022, IET Intelligent Transport Systems)
- Spatial-temporal graph neural network based on gated convolution and topological attention for traffic flow prediction(Dewei Bai, Dawen Xia, Dan Huang, Yang Hu, Yantao Li, Huaqing Li, 2023, Applied Intelligence)
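The multi-scale temporal backbone many of these models build on — stacked dilated causal convolutions, as in TCNs — can be sketched directly. The shared 3-tap kernel w and the dilation schedule (1, 2, 4) are illustrative; real models learn separate kernels per layer and channel.

```python
import numpy as np

rng = np.random.default_rng(4)
T = 32
x = rng.standard_normal(T)

def dilated_causal_conv(x, w, dilation):
    # Causal: output at time t depends only on x[t], x[t-d], x[t-2d], ...
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return sum(w[i] * xp[pad - i * dilation : pad - i * dilation + T] for i in range(k))

w = np.array([0.5, 0.3, 0.2])

# Stacking dilations 1, 2, 4 grows the receptive field exponentially,
# covering short- and long-range dependencies with few layers.
h = x
for d in (1, 2, 4):
    h = np.maximum(dilated_causal_conv(h, w, d), 0.0)
print(h.shape)  # (32,)
```

With kernel size k and L layers of dilation 2^l, the receptive field is 1 + (k - 1)(2^L - 1), which is why a handful of layers suffices for long sequences.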
Efficiency, Robustness, Training-Friendliness, and Engineering-Oriented Design (Lightweighting, Error Awareness, Performance Prediction)
The papers in this group place relatively more emphasis on trainability, computational efficiency, and robustness: lightweight multi-scale spatio-temporal GCNs, structures oriented toward performance prediction/evaluation, and error-aware or practically oriented engineering modules. Compared with the core changes to attention, graph structure, tasks, or temporal modeling above, this group focuses on deployment and robust training strategies.
- STDGFN: A spatio-temporal dual-graph fusion network for traffic flow prediction(Ruotian Ye, Yitong Tao, Qingjian Ni, 2026, Applied Intelligence)
- Lightweight Multiscale Spatio-Temporal Graph Convolutional Network for Skeleton-Based Action Recognition(Zhiyun Zheng, Qilong Yuan, Huaizhu Zhang, Yizhou Wang, Junfeng Wang, 2025, Big Data Mining and Analytics)
- A Performance Prediction Method Based on Multi-task Spatio-Temporal Convolution Network for SDN Heterogeneous Network(Zongping Zhou, Dajun Du, Zheyi Chen, Junlin Yang, Yi Min Zhang, 2024, Communications in Computer and Information Science)
- A Spatio-Temporal Error-Aware Multi-Path Graph Convolutional Network for Iterative Forecasting of Wind Turbine Clusters(Baogen Fu, Qiuhui Xia, Rong-Ping Wang, Min Wu, 2026, Iranian Journal of Science and Technology, Transactions of Electrical Engineering)
- Correlation Adaptive Dynamic Graph Convolutional Networks for Traffic Flow Prediction(Yan Chen, Dawen Xia, Yang Hu, Wenyong Zhang, Fuchu Zhang, 2025, Lecture Notes in Computer Science)
- Traffic Flow Prediction for Highway Based on Multi-Task Spatiotemporal Graph Network(Jinyong Gao, Sheng Luo, Junshan Tian, Cheng Zhou, Lianhua An, 2025, Transportation Safety and Environment)
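One common lightweighting trick behind such designs is replacing a full temporal convolution with a depthwise-separable one. The arithmetic below, with illustrative channel and kernel sizes, shows the parameter saving; the specific numbers are assumptions, not taken from any surveyed model.

```python
# Parameter count: depthwise-separable temporal convolution vs. a full one.
C_in, C_out, k = 64, 64, 9

full = C_in * C_out * k                 # standard temporal conv: every (in, out) pair has k taps
separable = C_in * k + C_in * C_out     # depthwise (per-channel k taps) + 1x1 pointwise mixing

print(full, separable, round(full / separable, 1))  # 36864 4672 7.9
```

The ratio is C_out * k / (k + C_out), so for large channel counts the saving approaches a factor of k.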
The merged research landscape revolves around the key elements of a Dual-Path Spatio-Temporal Multi-Task GCN: (1) dual-branch/dual-path designs that explicitly decouple space and time (or multiple graphs/modalities) in parallel with cross-path interaction; (2) dynamic/adaptive graph learning and multi-view/multi-graph structures that enrich spatial dependency modeling; (3) attention and adaptive/gating mechanisms that dynamically reweight spatio-temporal dependency strengths; (4) multi-task joint prediction on top of a shared graph representation; (5) multi-scale temporal modeling (gating, TCN, memory, causal convolution, etc.) that strengthens short- and long-term dependencies; and (6) a smaller set of engineering-oriented lightweight, robust, training-friendly, and error-aware designs. Together they form a combined framework of graph structure learning + dynamic spatio-temporal reweighting + dual-path decoupling with interaction + multi-task outputs + multi-scale temporal modeling.
A total of 79 related papers.
Accurate air quality prediction can help cope with air pollution and improve quality of life. As low-cost air quality sensors are increasingly deployed, the growing volume of air quality data offers opportunities to develop more accurate prediction methods. Air quality is affected by many external factors such as station position, wind, and other meteorological conditions. These factors are spatio-temporally dynamic, and there are many dynamic contextual relationships among them. Many air quality prediction methods do not consider these complex spatio-temporal correlations and dynamic contextual relationships. In this paper, we propose a dual-path dynamic directed graph convolutional network (DP-DDGCN) for air quality prediction. We first create a dual-path transposed dynamic directed graph from the static distance relationships between stations and the dynamic relationships generated by wind speed and direction. Based on this dual-path dynamic directed graph, we can capture the dynamic spatial dependencies more comprehensively. We then apply gated recurrent units (GRUs), augmented with future meteorological features, to extract the complex temporal dependencies in historical air quality data. Combining the dual-path dynamic directed graph blocks with the GRUs, we finally construct a dynamic spatio-temporal gated recurrent block to capture the dynamic spatio-temporal contextual correlations. On real-world datasets containing a large amount of PM2.5 concentration data, we compare the proposed model with benchmark models. The experimental results show that our model achieves the best performance in predicting PM2.5 concentrations.
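The wind-driven dynamic directed graph described in this abstract can be illustrated with a toy construction: the edge weight from station i to station j grows when j lies downwind of i and nearby. The cosine-alignment/distance formula below is a hypothetical stand-in, not the paper's actual edge definition.

```python
import numpy as np

# Toy station coordinates and a shared wind vector for one time step.
pos = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
wind = np.array([1.0, 0.0])   # wind blowing toward +x

N = len(pos)
A = np.zeros((N, N))
for i in range(N):
    for j in range(N):
        if i == j:
            continue
        d = pos[j] - pos[i]
        # Directed edge i -> j: strong when j is downwind of i (positive
        # cosine alignment with the wind) and the stations are close.
        align = max(np.dot(d, wind) / (np.linalg.norm(d) * np.linalg.norm(wind)), 0.0)
        A[i, j] = align / (1.0 + np.linalg.norm(d))

# Station 1 is directly downwind of station 0, so pollution transport
# flows 0 -> 1 but not 1 -> 0: the adjacency is genuinely directed.
print(A[0, 1] > A[1, 0])  # True
```

Recomputing A at each time step from the current wind field is what makes the graph both dynamic and directed.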
A spatio-temporal traffic flow prediction model based on a dynamic graph convolution network (DGCN) is proposed to address a limitation of existing traffic flow prediction methods: modeling the spatial features of the road network with a fixed Laplacian matrix cannot fully exploit the traffic data and leads to poor prediction performance. The model combines a global Laplacian matrix learning layer, a spatial attention layer, and a gated recurrent unit layer into a matrix learning module, which extracts spatio-temporal features from the given traffic flow data and predicts the future Laplacian matrix; this matrix is then applied in a spatio-temporal prediction module consisting of graph convolutional network (GCN), gated recurrent unit, and Transformer layers. The experimental results show that the proposed model not only models spatial features in real time from the given data but also effectively improves the accuracy of traffic flow prediction.
Traffic flow prediction is of great importance in traffic planning, road resource management, and congestion mitigation. However, existing prediction methods have significant limitations in modeling multi-scale spatial-temporal features, particularly in capturing temporal periodicity and spatial dependency in dynamically evolving traffic networks. This paper proposes a novel traffic flow prediction framework, referred to as the Adaptive Graph Fusion Dual-scale Convolutional Network (AGFDCN), which integrates spatial-temporal dynamic graphs with dual-scale convolutional networks. Specifically, we introduce a Dual-Scale Temporal Network, which combines long- and short-term dilated causal convolutions with a temporal decay-aware attention mechanism to efficiently capture traffic patterns across multiple temporal scales. Furthermore, we design a Dynamic Adaptive Graph Module, which models complex spatial dependencies in traffic networks through an adaptive graph fusion mechanism and a dual-path attention-gated module. Finally, the temporal and spatial representations are integrated via a gated fusion mechanism, enhancing overall prediction performance. Experimental results on three highway datasets (PEMS04, PEMS07, and PEMS08) verify that the proposed model outperforms several state-of-the-art baselines on various evaluation metrics. Compared to AGCRN, the best-performing spatial-temporal graph baseline, the proposed model shows significant improvements across all datasets: it reduces MAE by 42.07% and RMSE by 35.43% on PEMS04, MAE by 28.35% and RMSE by 29.28% on PEMS07, and MAE by 30.52% and RMSE by 30.73% on PEMS08, validating its effectiveness in modeling complex spatial-temporal traffic data and its robustness to sudden traffic changes.
Skeleton-based action recognition has achieved remarkable advances with graph convolutional networks (GCNs). However, most existing models process spatial and temporal information within a single coupled stream, which often obscures the distinct patterns of joint configuration and motion dynamics. This paper introduces the Dual-Path Cross-Attention Graph Convolutional Network (DPCA-GCN), an architecture that explicitly separates spatial and temporal modeling into two specialized pathways while maintaining rich bidirectional interaction between them. The spatial branch integrates graph convolution and spatial transformers to capture intra-frame joint relationships, whereas the temporal branch combines temporal convolution and temporal transformers to model inter-frame dependencies. A bidirectional cross-attention mechanism facilitates explicit information exchange between both paths, and an adaptive gating module balances their respective contributions according to the action context. Unlike traditional approaches that process spatial–temporal information sequentially, our dual-path design enables specialized processing while maintaining cross-modal coherence through memory-efficient chunked attention mechanisms. Extensive experiments on the NTU RGB+D 60 and NTU RGB+D 120 datasets demonstrate that DPCA-GCN achieves competitive joint-only accuracies of 88.72%/94.31% and 82.85%/83.65%, respectively, with exceptional top-5 scores of 96.97%/99.14% and 95.59%/95.96%, while maintaining significantly lower computational complexity compared to multi-modal approaches.
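The bidirectional cross-attention exchange described above can be sketched as two cross-attention calls, one in each direction, followed by a gate. The single-head formulation and the dot-product gate logits are simplifications, not DPCA-GCN's exact modules.

```python
import numpy as np

rng = np.random.default_rng(5)
L, d = 4, 6
S = rng.standard_normal((L, d))    # spatial-path tokens
Tm = rng.standard_normal((L, d))   # temporal-path tokens

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(Q, KV):
    # Queries from one path attend over keys/values from the other path.
    scores = Q @ KV.T / np.sqrt(d)
    return softmax(scores) @ KV

# Bidirectional exchange: each path is enriched by the other (residual form),
# then an adaptive gate balances the two enriched paths.
S2 = S + cross_attend(S, Tm)
T2 = Tm + cross_attend(Tm, S)
g = 1.0 / (1.0 + np.exp(-(S2 * T2).sum(axis=-1, keepdims=True)))
fused = g * S2 + (1 - g) * T2
print(fused.shape)  # (4, 6)
```

The key property is symmetry: spatial queries read temporal context and vice versa, so neither path has to encode the other's information implicitly.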
Dementia typically results from damage to neural pathways and the consequent degeneration of neuronal connections. Graph neural networks (GNNs) have been widely employed to model complex brain networks. However, leveraging complementary temporal, spatial, and spectral features for diagnosing neurocognitive disorders remains challenging. To address this issue, we propose a Bi-path Multi-scale Graph Neural Network (Bi-MCGNN), which integrates two paths: one focusing on temporal and spatial relationships, and the other on spatial and frequency patterns. By unifying these pathways, Bi-MCGNN integrates diverse brain features into a single framework. To represent brain networks more effectively, we designed specialized correlation matrices to reinforce the constructed graph. We then performed multi-scale graph convolution to analyze brain connectivity at varying resolutions, from fine-grained to broader patterns, and ultimately employed an attention mechanism to enhance features across different domains. Extensive experiments on two real-world datasets demonstrate that our model outperforms state-of-the-art baselines.
Transformer-based methods have achieved outstanding performance in 3D human pose estimation. However, existing approaches typically process spatial and temporal features within a single pathway, which leads to mutual interference and information loss. Moreover, they show clear deficiencies in modeling local information, making it difficult to capture subtle joint movements and short-term dependencies between frames in a motion sequence. To address these challenges, we propose a novel Dual-Path Spatial-Temporal Separation (DP-STS) framework. By modeling spatial and temporal self-attention mechanisms in parallel, we enhance the feature extraction specificity. To improve local information modeling, we introduce a Local Graph Convolution Module (LGCM) and a Local Temporal Convolutional Module (LTCM) into our framework. Additionally, we design a Graph-guided Keypoint Attention Module (GKAM) and a Time-guided Frame Attention Module (TFAM) to strengthen the framework’s understanding of the human skeletal structure and the sequential order of frames. Extensive experiments on the Human3.6M and MPI-INF-3DHP benchmark datasets demonstrate that DP-STS achieves state-of-the-art performance in 2D-to-3D human pose estimation on video data.
Graph convolutional networks (GCNs) have achieved remarkable success in skeleton-based action recognition. However, most existing studies rely mainly on historical and current data in Internet of Things (IoT) systems, failing to fully exploit the potential of future information. To address this issue, an innovative spatio-temporal bidirectional gated GCN (STBiG-GCN) is designed to enhance the accuracy of action recognition by integrating future information in dynamic and complex environments. First, STBiG-GCN employs bidirectional convolution to enhance the representation of time series features, effectively capturing the dependencies between past and future actions. Second, a gating mechanism is designed to dynamically adjust the output, selectively focusing on the most critical features. Finally, by combining skip connections with an efficient multiscale attention (EMA) mechanism, feature fusion and information flow are optimized, effectively alleviating the vanishing-gradient problem in deep networks. On the NTU RGB+D dataset with the X-sub and X-view partition standards, STBiG-GCN achieves accuracies of 84.6% and 91.7%, respectively, which are 3.1% and 3.4% higher than those of the ST-GCN model. On the Kinetics-Skeleton dataset, the Top-1 and Top-5 accuracies reach 32.9% and 55.2%, respectively. Additionally, to verify the robustness and generalization of the model under data-scarce conditions, experiments were conducted on the small-sample Northwestern-UCLA dataset, achieving an accuracy of 92.7% and outperforming existing methods. These significant improvements demonstrate the superiority of STBiG-GCN in capturing action features and provide new solutions for time series analysis.
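The idea of combining past-looking and future-looking temporal aggregation under a gate can be illustrated as follows; the 3-tap mean filters and the (fwd - bwd) gate logits are illustrative stand-ins for STBiG-GCN's learned bidirectional convolutions and gating.

```python
import numpy as np

rng = np.random.default_rng(6)
T, C = 10, 4
x = rng.standard_normal((T, C))

def shift(x, k):
    # Zero-padded shift along time: positive k brings in past frames,
    # negative k brings in future frames.
    out = np.zeros_like(x)
    if k > 0:
        out[k:] = x[:-k]
    elif k < 0:
        out[:k] = x[-k:]
    else:
        out = x.copy()
    return out

# Forward branch aggregates past context, backward branch future context.
fwd = (x + shift(x, 1) + shift(x, 2)) / 3.0
bwd = (x + shift(x, -1) + shift(x, -2)) / 3.0

# Gate decides, per element, how much past vs. future context to keep.
g = 1.0 / (1.0 + np.exp(-(fwd - bwd)))
y = g * fwd + (1.0 - g) * bwd
print(y.shape)  # (10, 4)
```

Bidirectional aggregation requires the full clip to be available, which is why this style of model targets offline recognition rather than streaming prediction.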
Video surveillance systems are crucial tools for enhancing security and technological prevention. Behavior recognition plays a vital role in violence prevention and the identification of hazardous activities within video surveillance systems. To address challenges such as redundant backgrounds and multi-angle shooting environments, which degrade data processing and lower recognition accuracy in surveillance videos, this paper proposes a novel behavior detection algorithm named BiST-GCN-CBAM (Bottleneck Residual Dual-Path Spatiotemporal Graph Convolutional Network with Attention Mechanism). The model integrates bottleneck residual connections and the Convolutional Block Attention Module (CBAM) into a spatiotemporal graph convolution framework. The proposed model enhances the representation capability of human behavior features, facilitating efficient recognition. Additionally, the integrated attention mechanism allows the network to focus on relevant information while disregarding unnecessary details, effectively addressing the challenge of distinguishing key nodes for different behaviors and improving recognition accuracy. Experimental results on the Kinetics-Skeleton and NTU-RGB+D datasets demonstrate that the algorithm effectively reduces network computational complexity while enhancing performance, leading to accurate human behavior recognition.
Epilepsy is a neurological disorder characterized by recurrent and uncontrolled seizures, severely endangering patients' safety. Accurately predicting seizures before they occur and alerting patients in time to take appropriate measures is a challenging task. This work proposes an accurate and smart seizure warning predictor, named Seiz-Predictor, which is designed on the basis of the Internet of Medical Things (IoMT) framework and consists of a seizure prediction algorithm coupled with a mobile warning application. Seiz-Predictor employs a dual-path temporal-spectral encoded spatial dynamic graph convolution network (DPTSE-SDGCNet) to enhance seizure prediction accuracy. Additionally, a compatible seizure warning mobile application is developed for seizure alarming, location recording, and interacting with cloud-based databases for further reference by doctors. The designed Seiz-Predictor offers several key benefits: (1) a novel lightweight network structure better suited to epileptic seizure warning predictors, achieving more accurate prediction and shorter identification time on a public dataset; and (2) a seizure warning mobile application with cloud-based database connectivity, enabling computation of patients' epileptic states for doctors' reference. Experimental results on the public CHB-MIT dataset demonstrate a sensitivity of 96.1% and a false prediction rate of 0.086/h, outperforming the most recent studies.
Four-dimensional (4D) trajectory prediction plays a critical role in modern air traffic management, enabling applications such as conflict detection, anomaly monitoring, and congestion mitigation. However, existing methods have limited information sources when modeling potential spatial correlations between aircraft in complex airspace scenarios, and their trajectory inference ability is weak, resulting in lower prediction accuracy. Faced with these challenges, we propose the Modern Temporal and Dual Attention based Spatiotemporal Graph Convolutional Network (MTDA-STGCN), which employs a self-attention mechanism to reconstruct the adjacency matrix and enhance the ability to capture global node correlations. This reconstructed adjacency matrix is dynamically optimized throughout the training of the network, offering a more nuanced reflection of inter-node relationships than traditional algorithms. Subsequently, our model uses graph attention to extract additional global features, accurately modeling the interactions between aircraft. Finally, the output is fed into the Modern Temporal Prediction Network (MTPN) to obtain the predicted trajectory probability distribution. Experiments on real-world ADS-B datasets demonstrate that MTDA-STGCN outperforms existing 4D trajectory prediction algorithms on all datasets. The proposed dual-attention framework significantly enhances the capture of node spatial correlations, while the MTPN module effectively improves the accuracy of the predicted results.
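The core idea of reconstructing an adjacency matrix with self-attention can be sketched as scaled dot-product attention over node feature vectors, with a row-wise softmax producing a dense, learnable "adjacency". This is a hypothetical numpy illustration of that general construction, not MTDA-STGCN's exact formulation; the projections wq and wk stand in for learned parameters.

```python
import numpy as np

def attention_adjacency(x, wq, wk):
    """Dense adjacency from node features via scaled dot-product attention.

    x      : (N, F) node features (e.g. aircraft state vectors)
    wq, wk : (F, d) query/key projections
    Returns a row-stochastic (N, N) attention adjacency.
    """
    q, k = x @ wq, x @ wk
    scores = q @ k.T / np.sqrt(k.shape[1])        # (N, N) pairwise affinities
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)       # softmax over neighbors

rng = np.random.default_rng(1)
x = rng.standard_normal((6, 4))                   # 6 aircraft, 4-dim features
wq = rng.standard_normal((4, 3))
wk = rng.standard_normal((4, 3))
a = attention_adjacency(x, wq, wk)
print(a.shape)                                    # (6, 6); each row sums to 1
```

Because wq and wk are trained with the rest of the network, the resulting adjacency adapts during training, unlike a fixed distance-based matrix.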
… To address these challenges, we propose the spatiotemporal dual adaptive graph convolutional network (ST-DAGCN) model for spatiotemporal traffic prediction, which utilizes a dual-…
Network traffic forecasting is essential for efficient network management and planning. Accurate long-term forecasting models are also essential for proactive control of upcoming congestion events. Due to the complex spatial-temporal dependencies between traffic flows, traditional time series forecasting models are often unable to fully extract the spatial-temporal characteristics of the traffic. To address this issue, we propose a novel dual-channel based graph convolutional network (DC-STGCN) model. The proposed model consists of two temporal components that characterize the daily and weekly correlation of the network traffic. Each of these components contains a spatial-temporal characteristics extraction module consisting of a dual-channel graph convolutional network (DCGCN) and a gated recurrent unit (GRU). The DCGCN further consists of an adjacency feature extraction module (AGCN) and a correlation feature extraction module (PGCN) to capture the connectivity between nodes and the proximity correlation, respectively. The GRU then extracts the temporal characteristics of the traffic. Experimental results based on real network datasets show that DC-STGCN outperforms the existing baselines in prediction accuracy and is capable of making long-term predictions.
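The GCN-plus-GRU pattern inside each spatial-temporal extraction module can be sketched compactly: at every time step, a graph convolution mixes node features over the (normalized) adjacency, and a GRU cell carries the temporal state. The sketch below is a minimal single-channel illustration of that generic design, not the paper's dual-channel DCGCN; parameter names and shapes are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def norm_adj(a):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A+I) D^-1/2."""
    a = a + np.eye(a.shape[0])
    d = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d[:, None] * d[None, :]

def gcn_gru(x_seq, a, w_g, params):
    """One GCN+GRU spatial-temporal block.

    x_seq  : (T, N, F) node features over T time steps
    a      : (N, N) adjacency
    w_g    : (F, H) graph-convolution weights
    params : dict of GRU matrices wz, wr, wh, each (2H, H)
    Returns the final hidden state (N, H).
    """
    a_hat = norm_adj(a)
    h = np.zeros((x_seq.shape[1], w_g.shape[1]))
    for x in x_seq:
        s = np.tanh(a_hat @ x @ w_g)             # spatial features via GCN
        zin = np.concatenate([s, h], axis=1)
        z = sigmoid(zin @ params["wz"])          # update gate
        r = sigmoid(zin @ params["wr"])          # reset gate
        hc = np.tanh(np.concatenate([s, r * h], axis=1) @ params["wh"])
        h = (1 - z) * h + z * hc                 # GRU state update
    return h

rng = np.random.default_rng(2)
T, N, F, H = 4, 5, 3, 6
x_seq = rng.standard_normal((T, N, F))
a = (rng.random((N, N)) > 0.5).astype(float)
a = np.maximum(a, a.T)                           # undirected toy graph
params = {k: rng.standard_normal((2 * H, H)) * 0.1 for k in ("wz", "wr", "wh")}
h = gcn_gru(x_seq, a, rng.standard_normal((F, H)) * 0.1, params)
print(h.shape)  # (5, 6)
```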
D-STGCN: Dynamic Pedestrian Trajectory Prediction Using Spatio-Temporal Graph Convolutional Networks
Predicting pedestrian trajectories in urban scenarios is a challenging task with a wide range of applications, from video surveillance to autonomous driving. The task is difficult because pedestrian behavior is affected by their individual path history, their interactions with others, and the environment. For predicting pedestrian trajectories, an attention-based interaction-aware spatio-temporal graph neural network is introduced. This paper introduces an approach based on two components: a spatial graph neural network (SGNN) for interaction modeling and a temporal graph neural network (TGNN) for motion feature extraction. The SGNN uses an attention method to periodically collect spatial interactions between all pedestrians. The TGNN employs an attention method as well, this time to collect each pedestrian's temporal motion pattern. Finally, a time-extrapolator convolutional neural network (CNN) is applied to the temporal dimension of the graph's characteristics to predict the trajectories. With smaller data and model sizes and better accuracy, the proposed method is more compact and efficient than Social-STGCNN. Moreover, on three video surveillance datasets (ETH, UCY, and SDD), D-STGCN achieves better experimental results on the average displacement error (ADE) and final displacement error (FDE) metrics, in addition to predicting more social trajectories.
Artificial intelligence and Automatic Identification Systems (AIS) play pivotal roles in intelligent maritime navigation for the modern maritime industry. Many artificial intelligence maritime applications based on AIS data have dramatically benefited traditional operations and management in the maritime industry, and also provide state-of-the-art predictive analytics for vessel collisions and route optimization. However, the problem of modeling the interactions of vessels in complex waters still needs to be adequately addressed. In this paper, we focus on using spatiotemporal AIS data to model and forecast multiple vessel trajectories amid dynamic interaction patterns, and we propose a forecast model based on a novel neural network, namely a spatiotemporal multi-graph fusion network (STMGF-Net). The innovative STMGF-Net comprises three crucial modules. First, a spatiotemporal graph construction module generates interaction graphs of various navigation modes, such as motions, risks, and attributes of vessels. Second, a multi-mode fusion module embeds and fuses the above interaction graphs into STMGF-Net. Finally, squeeze-and-excitation and temporal convolutional networks are introduced as squeeze-and-excitation temporal convolutional modules to significantly enhance the overall efficiency of the model. Overall, STMGF-Net can recognize complex spatiotemporal interaction patterns among neighboring vessels, capturing and integrating these interaction features to achieve high-precision prediction performance in intelligent maritime navigation. In numerical experiments, three water areas of Zhoushan Islands, Yangshan Waters, and Yangtze River Waters are used as training and testing datasets. The results show that STMGF-Net reduced average and final displacement errors by 49.637% and 50.622%, respectively, compared with classic and state-of-the-art graph neural networks.
… In this paper, we propose a dual-branch spatio-temporal graph neural network to … area, and a spatio-temporal graph convolutional network (STGCN) branch is designed to model group …
Pedestrian intent prediction is critical for autonomous driving, as accurately predicting crossing intentions helps prevent collisions and ensures the safety of both pedestrians and passengers. Recent research has focused on vision-based deep neural networks for this task, but challenges remain. First, current methods suffer from low efficiency in multi-feature fusion and unreliable predictions under challenging conditions. Second, real-time performance is essential in practical applications, so the efficiency of the algorithm is crucial. To address these issues, we propose a novel architecture, Dual-STGAT, which uses a dual-level spatio-temporal graph network to extract pedestrian pose and scene interaction features, reducing information loss and improving feature fusion efficiency. The model captures key features of pedestrian behavior and the surrounding environment through two modules: the Pedestrian Module and the Scene Module. The Pedestrian Module extracts pedestrian motion features using a spatio-temporal graph attention network, while the Scene Module models interactions between pedestrians and surrounding objects by integrating visual, semantic, and motion information through a graph network. Extensive experiments conducted on the PIE and JAAD datasets show that Dual-STGAT achieves over 90% accuracy in pedestrian crossing intention prediction, with inference latency close to 5 ms, making it well-suited for large-scale production autonomous driving systems that demand both performance and computational efficiency.
… Meanwhile, STGCN [36] established the classic paradigm of spatiotemporal convolutional blocks, in which Chebyshev polynomial–approximated spectral graph convolutions (GCN) are …
… hypergraph obtained via dual transformations and embedding-based association learning to capture high-order group interactions; and (iii) a spatial-temporal dual-graph interactive …
… To address this challenge, we propose the Structural-Adaptive Spatio-Temporal GCN (SA-STGCN), which relies on an innovative spatiotemporal feature extraction mechanism …
Accurate traffic flow forecasting (TFF) is a prerequisite for urban traffic control and guidance, which has become the key to avoiding traffic congestion and improving traffic management …
Predicting traffic accidents can help traffic management departments respond to sudden traffic situations promptly, improve drivers' vigilance, and reduce losses caused by traffic accidents. However, the causality of traffic accidents is complex and difficult to analyze. Most existing traffic accident prediction methods do not consider the dynamic spatio-temporal correlation of traffic data, which leads to unsatisfactory prediction accuracy. To address this issue, we propose a multi-task learning framework (TAP) based on Spatio-temporal Variational Graph Auto-Encoders (ST-VGAE) for traffic accident profiling. We first capture the dynamic spatio-temporal correlation of traffic conditions through a spatio-temporal graph convolutional encoder and embed it as a low-dimensional vector. Then, we use a multi-task learning scheme that combines external factors to generate the traffic accident profile. Furthermore, we propose a traffic accident profiling application framework based on edge computing, which speeds up computation by offloading the calculation of traffic accident profiles to edge nodes. Finally, experimental results on real datasets demonstrate that TAP outperforms other state-of-the-art baselines.
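A variational graph auto-encoder of the kind ST-VGAE builds on follows the standard VGAE recipe: a graph-convolutional encoder produces a mean and log-variance per node, the reparameterization trick samples a latent vector, and an inner-product decoder reconstructs the adjacency. The sketch below illustrates that generic recipe in numpy, not the paper's spatio-temporal encoder; single-layer encoders and all shapes are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def vgae_forward(x, a_hat, w_mu, w_logvar, rng):
    """Minimal variational graph auto-encoder forward pass.

    x              : (N, F) node features
    a_hat          : (N, N) normalized adjacency
    w_mu, w_logvar : (F, D) encoder weights for mean / log-variance
    Returns latent z (N, D) and reconstructed adjacency (N, N).
    """
    mu = a_hat @ x @ w_mu                 # mean of q(z | X, A)
    logvar = a_hat @ x @ w_logvar         # log-variance of q(z | X, A)
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps   # reparameterization trick
    a_rec = sigmoid(z @ z.T)              # inner-product decoder
    return z, a_rec

rng = np.random.default_rng(3)
x = rng.standard_normal((6, 4))
a_hat = np.full((6, 6), 1.0 / 6.0)        # toy normalized adjacency
z, a_rec = vgae_forward(x, a_hat, rng.standard_normal((4, 2)),
                        rng.standard_normal((4, 2)), rng)
print(z.shape, a_rec.shape)               # (6, 2) (6, 6)
```

The latent z is the "low-dimensional vector" the abstract mentions; downstream multi-task heads would consume it together with external factors.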
… Therefore, we propose a multi-view multi-task spatiotemporal graph convolutional network (M2… Meanwhile, M2 chooses a multi-task learning paradigm that includes a classification task (…
Spatio-temporal prediction plays a critical role in smart city construction. Jointly modeling multiple spatio-temporal tasks can further promote intelligent city life by exploiting their inseparable relationships. However, existing studies fail to address this joint learning problem well; they generally solve tasks individually or only in a fixed combination. The challenges lie in the tangled relations between different properties, the demand for supporting flexible combinations of tasks, and the complex spatio-temporal dependency. To cope with these problems, we propose an Automated Spatio-Temporal multi-task Learning (AutoSTL) method to handle multiple spatio-temporal tasks jointly. First, we propose a scalable architecture consisting of advanced spatio-temporal operations to exploit the complicated dependency. Shared modules and a feature fusion mechanism are incorporated to further capture the intrinsic relationships between tasks. Furthermore, our model automatically allocates the operations and fusion weights. Extensive experiments on benchmark datasets verify that our model achieves state-of-the-art performance. To the best of our knowledge, AutoSTL is the first automated spatio-temporal multi-task learning method.
… spatio-temporal correlation information based on the adjacency matrix of the localized spatiotemporal graph. … Yu, HT Yin, and ZX Zhu, Spatio-temporal graph convolutional networks: A deep …
… To solve this problem, we propose a heterogeneous spatio-temporal multi-task learning network (STL Net) for distinguishing between ASD patients and normal controls (NCs). …
Abnormal traffic incidents such as traffic accidents have become a significant health and development threat with the rapid urbanization of many countries. Thus it is critically important to accurately forecast the traffic accident risks of different areas in a city, which has attracted increasing research interest in urban computing. The challenges of accurate traffic risk forecasting are three-fold. First, traffic accident data in some areas of a city is sparse, especially for a fine-grained prediction, which may cause the zero-inflation problem during model training. Second, the spatio-temporal correlations of the traffic accidents occurring in different areas are rather complex and non-linear, which is difficult to capture with existing shallow models like regression. Third, the occurrence of traffic accidents can be significantly affected by various context features including weather, POI, and road network features. It is non-trivial to capture the complex associations between the diverse context features and traffic accident risks for building an accurate prediction model. To address the above challenges, this paper proposes a Multi-View Multi-Task Spatio-Temporal Networks (MVMT-STN) model to forecast fine- and coarse-grained traffic accident risks of a city simultaneously. Specifically, to address the data sparsity issue in fine-grained prediction, we adopt a multi-task learning framework to jointly forecast both fine- and coarse-grained traffic accident risks by considering their spatial associations. For each granularity of prediction, we design a channel-wise CNN and a multi-view GCN to capture the local geographic dependency and global semantic dependency, respectively. To obtain the diverse impacts of the context features on traffic accidents, we also introduce a fusion learning module that integrates the channel-wise and multi-view features learned from the different types of external factors.
We conduct extensive experiments over two large real traffic accident datasets. The results show that MVMT-STN improves the performance of traffic accident risk prediction in both fine- and coarse-grained prediction by a large margin compared with existing state-of-the-art methods.
Previous traffic flow prediction studies have utilized spatio-temporal neural networks combined with the multi-task learning framework to seek complementary information for enhancing prediction performance. However, existing methods still face two challenges: they fail to capture global interaction patterns between regions and lack consideration of inter-correlations within interaction patterns. To solve these issues, we propose a novel multi-task spatio-temporal fully convolutional model named MSTFCM. First, the model takes the interaction tensor and raster tensor as task inputs, where the interaction tensor extends the raster tensor by incorporating global interaction patterns between regions. Second, a multi-task framework combined with spatio-temporal convolutional blocks is used to learn generalized features and interaction features. A channel spatio-temporal attention is added to adaptively adjust feature weights and capture inter-correlations. To train MSTFCM, an uncertainty loss with learnable weights was designed to capture various flow fluctuations and facilitate multi-task optimization. The proposed model was validated on two real-world traffic datasets collected in Xiamen, China. Experimental results showed that MSTFCM outperformed nine baselines in one-step and multi-step prediction, with slower performance degradation as the predicted time intervals and steps increased. We further validated the model's effectiveness through designed variants and visualization results.
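Learnable multi-task loss weights of the kind this abstract alludes to are commonly implemented as homoscedastic-uncertainty weighting (Kendall et al.): each task loss is scaled by a learned exp(-s_i) and pays a regularization cost s_i. The sketch below shows that standard form, which may differ from MSTFCM's exact loss.

```python
import numpy as np

def uncertainty_loss(task_losses, log_vars):
    """Homoscedastic-uncertainty multi-task weighting:
    L = sum_i exp(-s_i) * L_i + s_i, with learnable s_i.

    task_losses : per-task loss values
    log_vars    : the learnable parameters s_i
    """
    task_losses = np.asarray(task_losses, dtype=float)
    log_vars = np.asarray(log_vars, dtype=float)
    return float(np.sum(np.exp(-log_vars) * task_losses + log_vars))

# With s_i = 0 the weighting reduces to a plain sum of the task losses.
print(uncertainty_loss([1.0, 2.0], [0.0, 0.0]))        # 3.0
# A larger s_i down-weights a noisier task but pays a regularization cost.
print(uncertainty_loss([1.0, 2.0], [0.0, np.log(2)]))
```

During training the s_i are optimized together with the network weights, so the balance between tasks is learned rather than hand-tuned.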
… To achieve this, we apply 3 graph convolution kernels \(q_s \in \mathbb{R}^{f \times d}\), \(k_s\)… Multi-task adversarial spatial-temporal networks for crowd flow prediction. In: Proceedings of …
Traffic spatial-temporal prediction is of great significance to traffic management and urban construction. In this paper, we propose a multi-task synchronous graph neural network (MTSGNN) to synchronously predict spatial-temporal data at regions and at the transitions between regions. A method of constructing a "multi-task graph representation" is proposed to retain information about regions and transitions that existing works cannot reflect. Our model then synchronously captures multiple types of dynamic spatial correlations, models dynamic temporal dependencies, and re-weights different time steps to address long-term temporal modeling. Experiments on three real-world datasets verify the validity of the proposed model.
… Spatio-temporal graph convolutional networks: a deep learning framework for traffic forecasting. In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial …
Multi-step highway traffic flow prediction is crucial for intelligent transportation systems, and existing works have made significant advancements in this field. However, the physical structure, including path, distance, and node degree, is critical information in traffic networks and is often overlooked when encoding spatiotemporal dependencies. Meanwhile, the problem of prediction error propagation in multi-step flow forecasting is challenging to mitigate and can significantly impact overall forecasting performance. Moreover, traffic flows between toll and gantry stations in the highway network exhibit notable differences, leading to heterogeneous flow distributions. To overcome these issues, a novel multi-task spatiotemporal network for highway traffic flow prediction (MT-STNet) is proposed, consisting of an encoder-decoder structure, a generative inference system, and multi-task learning. A spatiotemporal block with physical transformation is developed to construct both the encoder and decoder, integrating the physical structure information into modeling the highway network's spatiotemporal dependencies. Additionally, the generative inference architecture is designed to extract the correlation between the historical and target sequences and generate the target hidden representations directly, rather than decoding dynamically, thereby avoiding multi-step prediction error propagation. Furthermore, because of traffic flow heterogeneity in the highway network, multi-task learning divides highway traffic flow prediction into three tasks that share the underlying traffic network and learned knowledge, thereby enhancing the prediction performance of each subtask. The evaluation experiments used monitoring data from a highway in Yinchuan City, Ningxia Province, China. The experimental results demonstrate that the performance of our proposed prediction model is better than that of the baseline methods.
Taxi demand prediction is of great importance, enabling the building of intelligent systems and smart cities. It is necessary to predict taxi demand accurately to schedule the taxi fleet in a reasonable and efficient way and to reduce traffic congestion. However, taxi demand involves complex and non-linear spatial-temporal impacts. The superiority of deep learning has led people to explore its application to traffic prediction. State-of-the-art methods on taxi demand prediction only capture static spatial correlations between regions (e.g., using static graph embedding) and only take taxi demand data into consideration. We propose a Multi-Task Spatial-Temporal Graph Attention Network (MSTGAT-Net) framework which models the correlations between regions dynamically with a graph attention network and captures the correlation between taxi pick-up and drop-off with multi-task training. To the best of our knowledge, this is the first paper to address the taxi demand prediction problem with a graph attention network and multi-task learning. Experiments on real-world taxi data show that our model is superior to state-of-the-art methods.
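A single attention head of the kind a graph attention network uses can be written compactly: pairwise scores come from a learned vector applied to concatenated node features, masked to the graph's edges, and softmax-normalized per node. This follows the standard GAT formulation that MSTGAT-Net builds on; the shapes below are toy stand-ins.

```python
import numpy as np

def gat_layer(x, adj, w, a_vec, alpha=0.2):
    """Single graph-attention head (standard GAT formulation).

    x     : (N, F) node features
    adj   : (N, N) 0/1 adjacency mask (self-loops included)
    w     : (F, H) feature projection, a_vec : (2H,) attention vector
    """
    h = x @ w                                          # (N, H) projected features
    n = h.shape[0]
    # e[i, j] = LeakyReLU(a^T [h_i || h_j]) for every ordered pair (i, j)
    e = (np.concatenate([np.repeat(h, n, 0), np.tile(h, (n, 1))], 1) @ a_vec)
    e = e.reshape(n, n)
    e = np.where(e > 0, e, alpha * e)                  # LeakyReLU
    e = np.where(adj > 0, e, -1e9)                     # mask non-neighbors
    e -= e.max(axis=1, keepdims=True)
    att = np.exp(e)
    att /= att.sum(axis=1, keepdims=True)              # softmax per node
    return att @ h                                     # attention-weighted aggregation

rng = np.random.default_rng(4)
x = rng.standard_normal((5, 3))
adj = np.eye(5) + np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)  # path graph
out = gat_layer(x, adj, rng.standard_normal((3, 4)), rng.standard_normal(8))
print(out.shape)  # (5, 4)
```

Because the attention coefficients depend on the current node features, the effective region-to-region correlations change from input to input, which is what "modeling correlations dynamically" refers to.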
Efficient and precise traffic flow prediction holds great importance in effective traffic management. This research presents a novel prediction model that integrates highway spatial changes and flow-related information (speed and vehicle composition). The highway is divided into segments using key reference points like tunnels, toll stations, and ramps. An adaptive graph convolutional network is employed to capture relationships between these segments. The network automatically adjusts adjacency matrix weights, facilitating the extraction of spatial flow features. Incorporating flow-related information, a multi-task module attention fusion network is introduced. The main task is traffic flow prediction, with average travel speed and vehicle composition as auxiliary tasks. This approach enhances feature acquisition and improves prediction accuracy. In experiments using Fuzhou–Jingtan Expressway data, the model enhances prediction accuracy by at least 55%. Ablation experiments validate the effectiveness of the designed modules, which improve the model's accuracy by 20% to 45%.
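An adjacency matrix whose weights adjust automatically is often built from learnable node embeddings, as in Graph WaveNet: A = softmax(ReLU(E1 E2^T)). The sketch below shows that common construction; the paper's exact mechanism may differ, and the random embeddings stand in for learned ones.

```python
import numpy as np

def adaptive_adjacency(e1, e2):
    """Self-adaptive adjacency from learnable segment embeddings:
    A = softmax(ReLU(E1 E2^T)), row-normalized.
    """
    s = np.maximum(e1 @ e2.T, 0.0)           # keep only positive affinities
    s -= s.max(axis=1, keepdims=True)        # numerical stability
    a = np.exp(s)
    return a / a.sum(axis=1, keepdims=True)  # normalize each row

rng = np.random.default_rng(5)
n_segments, d = 7, 3                         # highway segments, embedding dim
e1 = rng.standard_normal((n_segments, d))
e2 = rng.standard_normal((n_segments, d))
a = adaptive_adjacency(e1, e2)
print(a.shape)                               # (7, 7); each row sums to 1
```

Training the embeddings end-to-end lets the model discover segment relationships (e.g. between a ramp and a downstream toll station) that a purely distance-based adjacency would miss.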
Travel time estimation is one of the core tasks for the development of intelligent transportation systems. Most previous works model the road segments or intersections separately by learning their spatio-temporal characteristics to estimate travel time. However, due to the continuous alternation of road segments and intersections in a path, their dynamic features are coupled and interactive, so modeling only one of them limits further improvement in travel time estimation accuracy. To address these problems, a novel graph-based deep learning framework for travel time estimation is proposed in this article, namely Spatio-temporal Dual Graph Neural Networks (STDGNN). Specifically, we first establish the node-wise and edge-wise graphs to characterize the adjacency relations of intersections and of road segments, respectively. To extract the joint spatio-temporal correlations of the intersections and road segments, we adopt a spatio-temporal dual graph learning approach that incorporates multiple spatial-temporal dual graph learning modules with multi-scale network architectures for capturing multi-level spatial-temporal information from the dual graph. Finally, we employ a multi-task learning approach to estimate the travel time of a given whole route and of each road segment and intersection simultaneously. We conduct extensive experiments to evaluate our proposed model on three real-world trajectory datasets, and the experimental results show that STDGNN significantly outperforms several state-of-the-art baselines.
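The edge-wise graph described here is essentially the line graph of the road network: each segment (edge) becomes a node, and two segments are linked when they share an intersection. A small sketch of that dual-graph construction follows; STDGNN's exact adjacency definition may differ.

```python
import numpy as np

def line_graph(edges, n_nodes):
    """Build the edge-wise (line) graph from a node-wise graph.

    edges   : list of (u, v) intersection pairs, one per road segment
    n_nodes : number of intersections
    Returns (node_adj, edge_adj) as 0/1 numpy arrays.
    """
    m = len(edges)
    node_adj = np.zeros((n_nodes, n_nodes))
    for u, v in edges:
        node_adj[u, v] = node_adj[v, u] = 1.0
    edge_adj = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            if set(edges[i]) & set(edges[j]):      # segments share an intersection
                edge_adj[i, j] = edge_adj[j, i] = 1.0
    return node_adj, edge_adj

# A 4-intersection path 0-1-2-3 gives 3 segments chained in a line.
node_adj, edge_adj = line_graph([(0, 1), (1, 2), (2, 3)], 4)
print(edge_adj)  # [[0. 1. 0.] [1. 0. 1.] [0. 1. 0.]]
```

Running graph convolutions on both node_adj and edge_adj, with cross-graph interaction, is what lets the model couple intersection and segment dynamics instead of modeling one in isolation.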
… Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial …
Traffic flow prediction based on vehicle trajectories collected from installed GPS devices is critically important to Intelligent Transportation Systems (ITS). One limitation of existing traffic prediction models is that they mostly focus on predicting road-segment-level traffic conditions, which can be considered a fine-grained prediction. In many scenarios, however, a coarse-grained prediction, such as predicting the traffic flows among different urban areas covering multiple road links, is also required to help governments better understand traffic conditions from a macroscopic point of view. This is especially useful in the applications of urban planning and public transportation planning. Another limitation is that the correlations among different types of traffic-related features are largely ignored. For example, traffic flow and traffic speed are usually negatively correlated. Existing works regard these traffic-related features as independent, without considering their correlations. In this article, we for the first time study the novel problem of multivariate correlation-aware multi-scale traffic flow prediction, and we propose a feature correlation-aware spatio-temporal graph convolutional network named MC-STGCN to effectively address it. Specifically, given a road graph, we first construct a coarse-grained road graph based on both the topology closeness and the traffic flow similarity among the nodes (road links). Then a cross-scale spatial-temporal feature learning and fusion technique is proposed for dealing with both the fine- and coarse-grained traffic data. In the spatial domain, a cross-scale GCN is proposed to learn the multi-scale spatial features jointly and fuse them together. In the temporal domain, a cross-scale temporal network composed of a hierarchical attention is designed for effectively capturing intra- and inter-scale temporal correlations.
To effectively capture the feature correlations, a feature correlation learning component is also designed. Finally, a structural constraint is introduced to make the predictions on the two scale traffic data consistent. We conduct extensive evaluations over two real traffic datasets, and the results demonstrate the superior performance of the proposal on both fine- and coarse-grained traffic predictions.
Supervisory Control and Data Acquisition (SCADA) data-based fault diagnosis for wind turbines is gaining attention due to its accessibility and affordability. However, existing deep learning methods struggle to effectively model SCADA data because of its complex spatio-temporal correlations. In addition, both the variability among wind turbines and the poor generalization of models make it difficult to port a model trained on existing data to other turbines. To tackle these issues, a spatio-temporal graph convolutional network with multi-task learning named MTSTGCN has been proposed for cross-turbine fault diagnosis. MTSTGCN includes a graph data construction module, a graph learning module to learn spatio-temporal features, a domain adaptive module to learn invariant features across turbine domains, and a deep metric learning module for feature discrimination enhancement. The model is trained end-to-end, enabling collaborative multi-task training. The results of experiments conducted on two SCADA datasets indicate that the proposed method outperforms other baseline approaches.
Graph convolution network-based approaches have been recently used to model region-wise relationships in region-level prediction problems in urban computing. Each relationship represents a kind of spatial dependency, such as region-wise distance or functional similarity. To incorporate multiple relationships into a spatial feature extraction, we define the problem as a multi-modal machine learning problem on multi-graph convolution networks. Leveraging the advantage of multi-modal machine learning, we propose to develop modality interaction mechanisms for this problem in order to reduce the generalization error by reinforcing the learning of multi-modal coordinated representations. In this work, we propose two interaction techniques for handling features in lower layers and higher layers, respectively. In lower layers, we propose a grouped GCN to combine the graph connectivity from different modalities for a more complete spatial feature extraction. In higher layers, we adapt multi-linear relationship networks to GCN by exploring the dimension transformation and freezing part of the covariance structure. The adapted approach, called multi-linear relationship GCN, learns more generalized features to overcome the train–test divergence induced by time shifting. We evaluated our model on a ride-hailing demand forecasting problem using two real-world datasets. The proposed technique outperforms state-of-the-art baselines in terms of prediction accuracy, training efficiency, interpretability, and model robustness.
Accurate short-term load forecasting is important for the safe and effective functioning of modern power systems. Seasonal-trend decomposition based on LOESS (STL) is an efficient method for handling the intricacy and fluctuation of load data. However, different types of components reflect different levels of information and have different importance in terms of capturing temporal features. Moreover, graph convolutional networks (GCNs) are often utilized to capture the non-Euclidean spatial features in load data, but as the number of network nodes increases, the generalization capacity of the GCN decreases. Therefore, a novel spatiotemporal model, namely, multitask GCN with attention-based STL (MG-ASTL), is proposed for accurate short-term load forecasting. First, a new attention-based STL method is proposed, which utilizes an attention mechanism to weight different components, making the proposed model focus on more important components for more effective temporal feature extraction. Second, a new multitask GCN method is proposed, which utilizes density-based spatial clustering of applications with noise (DBSCAN) to divide load data into different groups for multitask learning, so that simple spatial patterns with fewer nodes can be learned to increase the generalization capacity. The effectiveness of the proposed model is validated on the basis of experimental results under different conditions.
… The parallel learning of these features uses the Multi-Task Learning Network (MTLN) [16], … network (GCN) & temporal convolution network (TCN) blocks. The GCN block incorporates a …
The human gait reflects substantial information about individual emotions. Current gait emotion recognition methods focus on capturing gait topology information and ignore the importance of fine-grained temporal features. This article proposes the temporal-tightly graph convolutional network (TT-GCN) to extract temporal features. TT-GCN comprises three significant mechanisms: the causal temporal convolution network (causal-TCN), the walking direction recognition auxiliary task, and the feature mapping layer. To obtain tight temporal dependencies and enhance the relevance among gait periods, the causal-TCN is introduced. Based on the assumption of emotional consistency within walking directions, the auxiliary task is proposed to enhance the ability of fine-grained feature extraction. Through the feature mapping layer, affective features can be mapped into an appropriate representation and fused with deep learning features. TT-GCN shows the best performance across five comprehensive metrics. All experimental results verify the necessity and feasibility of exploring fine-grained temporal feature extraction.
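The defining property of a causal TCN is that the output at time t depends only on inputs at or before t, which is achieved by padding the convolution on the left (the past) and never on the right (the future). A minimal single-channel sketch of that building block:

```python
import numpy as np

def causal_conv1d(x, kernel, dilation=1):
    """Causal (left-padded) dilated 1-D convolution.

    x        : (T,) input signal
    kernel   : (K,) filter taps; y[t] = sum_j kernel[j] * x[t - j*dilation]
    dilation : gap between taps, enlarging the receptive field
    """
    k = len(kernel)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])   # pad the past, never the future
    return np.array([
        sum(kernel[j] * xp[t + pad - j * dilation] for j in range(k))
        for t in range(len(x))
    ])

x = np.arange(1.0, 7.0)                       # [1, 2, 3, 4, 5, 6]
y = causal_conv1d(x, np.array([1.0, 1.0]))    # y[t] = x[t] + x[t-1]
print(y)                                      # [ 1.  3.  5.  7.  9. 11.]
```

Stacking such layers with exponentially growing dilation (1, 2, 4, ...) gives a large temporal receptive field while preserving causality, which is why the causal-TCN can model dependencies across gait periods without leaking future frames.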
… graph convolutional network (AAST-GCN), aiming to achieve … are aggregated within a multi-task learning framework, with … proposed AAST-GCN outperforms other GCN-based methods…
High-precision traffic flow prediction facilitates intelligent traffic control and refined management decisions. Previous research has built a variety of exquisite models with good prediction results. However, they ignore the reality that traffic flows can propagate backwards on road networks when modeling spatial relationships, as well as associations between distant nodes. In addition, more effective model components for modeling temporal relationships remain to be developed. To address the above challenges, we propose a local–global features fusion temporal convolutional network (LGTCN) for spatio-temporal traffic flow prediction, which incorporates a bidirectional graph convolutional network, probabilistic sparse self-attention, and a multichannel temporal convolutional network. To extract the bidirectional propagation relationship of traffic flow on the road network, we improve the traditional graph convolutional network so that information can be propagated in multiple directions. In addition, in spatial global dimensions, we propose probabilistic sparse self-attention to effectively perceive global data correlations and reduce the computational complexity caused by the finite perspective graph. Furthermore, we develop a multichannel temporal convolutional network. It not only retains the temporal learning capability of temporal convolutional networks, but also corresponds each channel to a node, and it realizes the interaction of node features through output interoperation. Extensive experiments on four open access benchmark traffic flow datasets demonstrate the effectiveness of our model.
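Bidirectional graph convolution of the kind described, propagating along the directed road graph and along its transpose to capture backward propagation of traffic, can be sketched as two parallel propagation terms whose outputs are concatenated. This is an illustrative sketch of the idea, not LGTCN's exact layer; row-normalized transitions and the parameter shapes are assumptions.

```python
import numpy as np

def bidirectional_gcn(x, a, w_f, w_b):
    """Bidirectional graph convolution over a directed road network.

    x        : (N, F) node features
    a        : (N, N) directed adjacency (a[i, j] = 1 means i -> j)
    w_f, w_b : (F, H) weights for the forward / backward directions
    Returns concatenated forward and backward features, (N, 2H).
    """
    def row_norm(m):
        d = m.sum(axis=1, keepdims=True)
        return np.divide(m, d, out=np.zeros_like(m), where=d > 0)
    fwd = row_norm(a) @ x @ w_f           # information flowing with traffic
    bwd = row_norm(a.T) @ x @ w_b         # information flowing against it
    return np.concatenate([fwd, bwd], axis=1)

rng = np.random.default_rng(6)
x = rng.standard_normal((5, 3))
a = np.triu(np.ones((5, 5)), k=1)         # a one-way chain of road nodes
out = bidirectional_gcn(x, a, rng.standard_normal((3, 2)),
                        rng.standard_normal((3, 2)))
print(out.shape)  # (5, 4)
```

Using A and A^T with separate weights lets upstream nodes receive signals from downstream congestion (e.g. a queue spilling backwards), which a single-direction GCN cannot represent.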
… As the number of GCN layers increases, the receptive fields … With this approach, the spatio-temporal GCN eventually … AS-GCN[7] focus on the topology design of spatiotemporal …
Multi-step traffic speed prediction is a challenging issue due to the multiple spatial-temporal dependencies among roads. Some spatial dependencies, especially those formed by different traffic modes, are not fully exploited, and how to simultaneously consider spatial and temporal dependencies and effectively integrate them within a single prediction framework needs further exploration. To tackle the above issues, we propose a multi-view spatial-temporal graph convolutional framework MVSTG, which adequately exploits the multi-view spatial-temporal dependencies and their interactions to improve the accuracy of traffic prediction. Multi-view temporal learning captures the multiple temporal trends by temporal convolution from multi-granularity historical data, and multi-view spatial learning handles the multiple spatial correlations by graph convolution from multiple graphs. In addition, view-wise attention-based fusion is proposed to adaptively identify the importance of each upstream view, fuse the multi-view information, and generate integrated results for downstream views. The experiments on two real-world urban traffic datasets demonstrate that the multi-view data and the proposed model framework enhance performance on the accuracy of speed prediction, especially in mid-term and long-term prediction.
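View-wise attention fusion can be sketched as a softmax-weighted sum of per-view feature vectors: each upstream view gets an importance score, and the fused result feeds the downstream views. The `fuse_views` helper, the score values, and the toy views are illustrative assumptions, not MVSTG's implementation.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_views(views, scores):
    """Attention-weighted elementwise sum of per-view feature vectors."""
    weights = softmax(scores)
    dim = len(views[0])
    return [sum(w * v[i] for w, v in zip(weights, views)) for i in range(dim)]

# Two hypothetical upstream views (e.g. short- vs. long-term temporal trends).
fused = fuse_views([[1.0, 0.0], [0.0, 1.0]], scores=[0.0, 0.0])
```

With equal scores the views contribute equally; raising one score lets that view dominate, which is how attention "adaptively identifies the importance of each upstream view."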
… have designed a multi-task objective function within our model. Overall, the proposed model is a multi-task model based on Spatio Temporal Graph Convolutional Networks (ST-GCN), …
Gait recognition is considered crucial for controlling Lower Limb Exoskeletons (LLEs). IMUs are widely utilized due to their portability and lightweight nature; however, existing methods often fail to capture the spatial connections among sensors. To address this issue, the Multi-attention Augmented Spatio Temporal Graph Convolution Network (MA-ST-GCN) is proposed for IMU-based gait recognition. This approach constructs a spatial graph based on human skeleton information, refined through a Spatial Attention mechanism that captures dependencies among joint nodes. A Temporal Attention mechanism is employed to identify key gait phases, thereby enhancing spatial graph convolution, while Channel Attention is leveraged to selectively modulate different sensor channels, improving overall performance. MA-ST-GCN was compared with RNN, LSTM, TCN, TST, ST-GCN, DGNN, and CA-MSN, demonstrating superior performance in integrating IMU data with skeleton information and confirming its effectiveness in gait recognition.
… a spatio-temporal interaction graph of pedestrian features is constructed by the STGCN composed of the S-GCN … and introduces the TCN instead of the previous recurrent architecture. …
Urban traffic flow forecasting is a critical issue in intelligent transportation systems. It is quite challenging due to the complicated spatiotemporal dependency and essential uncertainty brought about by dynamic urban traffic conditions. In most existing methods, the spatial correlation is captured by utilizing graph neural networks (GNNs) over a fixed graph based on local spatial proximity. However, urban road conditions are complex and changeable, which means the interactions between roads should also be dynamic over time. In addition, the global contextual information of roads is also crucial for accurate forecasting. In this paper, we exploit the spatiotemporal correlation of urban traffic flow and construct a dynamic weighted graph by seeking both spatial neighbors and semantic neighbors of road nodes. A multi-head self-attention temporal convolution network is utilized to capture local and long-range temporal dependencies across historical observations. Besides, we propose an adaptive graph gating mechanism to extract selective spatial dependencies within multi-layer stacking and correct information deviations caused by artificially defined spatial correlation. Extensive experiments on a real-world urban traffic dataset from the Didi Chuxing GAIA Initiative have verified the effectiveness, and the multi-step forecasting performance of our proposed models outperforms the state-of-the-art baselines. The source code of our model is publicly available at https://github.com/RobinLu1209/STAG-GCN.
We propose novel Stacked Spatio-Temporal Graph Convolutional Networks (Stacked-STGCN) for action segmentation, i.e., predicting and localizing a sequence of actions over long videos. We extend the Spatio-Temporal Graph Convolutional Network (STGCN) originally proposed for skeleton-based action recognition to enable nodes with different characteristics (e.g., scene, actor, object, action), feature descriptors with varied lengths, and arbitrary temporal edge connections to account for large graph deformation commonly associated with complex activities. We further introduce the stacked hourglass architecture to STGCN to leverage the advantages of an encoder-decoder design for improved generalization performance and localization accuracy. We explore various descriptors such as frame-level VGG, segment-level I3D, RCNN-based object, etc. as node descriptors to enable action segmentation based on joint inference over comprehensive contextual information. We show results on CAD120 (which provides pre-computed node features and edge weights for fair performance comparison across algorithms) as well as a more complex real-world activity dataset, Charades. Our Stacked-STGCN in general achieves improved performance over the state-of-the-art for both CAD120 and Charades. Moreover, due to its generic design, Stacked-STGCN can be applied to a wider range of applications that require structured inference over long sequences with heterogeneous data types and varied temporal extent.
… spatio-temporal prediction model based on optimally weighted graph convolutional network (GCN) and gated … to construct an optimally weighted graph for different turbine sites, which …
Enhancing the prediction of volatile and intermittent electric loads is one of the pivotal elements that contributes to the smooth functioning of modern power grids. However, conventional deep learning-based forecasting techniques fall short in simultaneously taking into account both the temporal dependencies of historical loads and the spatial structure between residential units, resulting in subpar prediction performance. Furthermore, the representation of the spatial graph structure is frequently inadequate and constrained, which, along with the complexities inherent in Spatial-Temporal data, impedes effective learning across different households. To alleviate these shortcomings, this article proposes a novel framework: Spatial-Temporal fusion adaptive gated graph convolution networks (STFAG-GCNs), tailored for residential short-term load forecasting (STLF). Spatial-Temporal fusion graph construction is introduced to compensate for existing correlations whose additional information is unknown or not reflected in advance. Through an innovative gated adaptive fusion graph convolution (AFG-Conv) mechanism, the Spatial-Temporal fusion graph convolution network (STFGCN) dynamically models the Spatial-Temporal correlations implicitly. Meanwhile, by integrating a gated temporal convolutional network (Gated TCN) and multiple STFGCNs into a unified Spatial-Temporal fusion layer, STFAG-GCN handles long sequences by stacking layers. Experimental results on real-world datasets validate the accuracy and robustness of STFAG-GCN in forecasting short-term residential loads, highlighting its advancements over state-of-the-art methods. Ablation experiments further reveal its effectiveness and superiority.
… the Gated-Memory Convolutional Neural Network (GMCNN), which combines the advantages of 1D Causal-CNN and GRU. The GMCNN can execute convolutional computations in …
Predicting future motion based on a historical motion sequence is a fundamental problem in computer vision, with wide applications in autonomous driving and robotics. Some recent works have shown that Graph Convolutional Networks (GCN) are instrumental in modeling the relationship between different joints. However, considering the variants and diverse action types in human motion data, the cross-dependency of the spatio-temporal relationships is difficult to depict under a decoupled modeling strategy, which may also exacerbate the problem of insufficient generalization. Therefore, we propose the Spatio-Temporal Gating-Adjacency GCN (GAGCN) to learn the complex spatio-temporal dependencies over diverse action types. Specifically, we adopt gating networks to enhance the generalization of GCN via a trainable adaptive adjacency matrix obtained by blending the candidate spatio-temporal adjacency matrices. Moreover, GAGCN addresses the cross-dependency of space and time by balancing the weights of spatio-temporal modeling and fusing the decoupled spatio-temporal features. Extensive experiments on Human3.6M, AMASS, and 3DPW demonstrate that GAGCN achieves state-of-the-art performance in both short-term and long-term predictions.
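The gating-adjacency idea — blending candidate spatio-temporal adjacency matrices with trainable gate weights — can be sketched as follows. The `gated_adjacency` helper, the logit values, and the toy candidate graphs are assumptions for illustration, not GAGCN's actual implementation.

```python
import math

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def gated_adjacency(candidates, gate_logits):
    """Blend candidate adjacency matrices into one mixture graph.

    In the real model the gate logits would be produced by a trainable
    gating network; here they are plain numbers.
    """
    w = softmax(gate_logits)
    n = len(candidates[0])
    return [[sum(w[k] * candidates[k][i][j] for k in range(len(candidates)))
             for j in range(n)] for i in range(n)]

spatial = [[0, 1], [1, 0]]            # candidate: spatial (skeleton) edges
temporal = [[1, 0], [0, 1]]           # candidate: self/temporal edges
A_mixed = gated_adjacency([spatial, temporal], gate_logits=[0.0, 0.0])
```

Shifting the logits toward one candidate lets the learned graph emphasize spatial or temporal structure per action type, which is the balancing act the abstract describes.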
Traffic forecasting is a challenging problem because of the irregular and complex road network in space and the dynamic, non-stationary traffic flow in time. To solve this problem, recently proposed temporal graph convolution models abstract the spatial and temporal features of the traffic system and obtain considerable improvement. However, most current methods use empirical graphs to represent the road network, which do not fully extract the spatial and temporal features. This paper proposes an Optimized Temporal-Spatial Gated Graph Convolution Network (OTSGGCN) for traffic forecasting, in which the spatial-temporal traffic feature is captured by an innovative graph convolution network with a graph constructed in a data-driven way. Experiments on two real-world traffic datasets show that the proposed method outperforms state-of-the-art traffic forecasting methods.
Multi-step traffic forecasting has always been extremely challenging due to constantly changing traffic conditions. Advanced Graph Convolutional Networks (GCNs) are widely used to extract spatial information from traffic networks. Existing GCNs for traffic forecasting are usually shallow networks that only aggregate two- or three-order node neighbor information. When deeper neighborhood information is aggregated, an over-smoothing phenomenon occurs, leading to degraded forecast performance. In addition, most existing traffic forecasting graph networks are based on fixed nodes and therefore lack flexibility. To address these problems, we propose Dynamic Adaptive Deeper Spatio-Temporal Graph Convolutional Networks (ADSTGCN), a new traffic forecasting model. The model addresses over-smoothing due to network deepening by using dynamic hidden layer connections and adaptively adjusting the hidden layer weights to reduce model degradation. Furthermore, the model can adaptively learn the spatial dependencies in the traffic graph by building a parameter-sharing adaptive matrix, and it can also adaptively adjust the network structure to discover unknown dynamic changes in the traffic network. We evaluated ADSTGCN using real-world traffic data from highway and urban road networks, and it shows good performance.
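One common realization of a parameter-shared self-adaptive adjacency — popularized by Graph WaveNet-style models, and assumed here only as a stand-in for ADSTGCN's adaptive matrix — derives the graph from learnable node embeddings via row-wise softmax(relu(E1·E2ᵀ)). The embedding values below are arbitrary stand-ins for random initialization.

```python
import math

def relu(x):
    return x if x > 0 else 0.0

def row_softmax(row):
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]

def adaptive_adjacency(E1, E2):
    """Row-wise softmax(relu(E1 @ E2^T)): a learned, data-driven graph.

    Every entry is non-negative and each row sums to 1, so the matrix can
    be used directly for diffusion without a pre-defined road topology.
    """
    n, d = len(E1), len(E1[0])
    scores = [[relu(sum(E1[i][k] * E2[j][k] for k in range(d)))
               for j in range(n)] for i in range(n)]
    return [row_softmax(row) for row in scores]

# Hypothetical 2-D embeddings for 3 traffic nodes.
E1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
A = adaptive_adjacency(E1, E1)
```

Because the embeddings are trained end-to-end, the induced adjacency can drift away from the empirical road graph and pick up the "unknown dynamic changes" the abstract mentions.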
… [1] proposed spatio-temporal graph convolutional networks (STGCN), which uses graph convolution to extract spatial features and temporal gated convolution to extract temporal …
… Additionally, we designed a spatial static–dynamic graph learning layer, which integrates static adaptive graph learning, dynamic graph learning, and a spatial gated fusion module to …
The rapid growth of vehicles as countries become more developed has brought great challenges to traffic prediction. Recent works model only local or global spatial-temporal features via graph neural networks (GNNs). Furthermore, the explicit graph structure information may contain bias, in particular, the lack of connections among multiple nodes when, in fact, they are interdependent. This results in the inability to accommodate information interaction and the underutilization of high-quality information. In this article, we design adaptive spatial-temporal graph convolution networks (ASTGCNs) to collaboratively learn local-global spatial-temporal information for traffic prediction. Specifically, we obtain different local spatial-temporal information (i.e. spatial-temporal information of each temporal point) by dividing the global spatial-temporal information along the temporal dimension. For local spatial-temporal information, we establish an adaptive graph convolution to enhance the ability of graph convolution networks (GCNs) to manage bias in the explicit graph structure. We then employ an attention mechanism to learn the local summarization of dynamic node neighborhoods to obtain high-quality information. For global spatial-temporal information, a temporal convolution network (TCN) block and the ordinary differential equation (ODE) are utilized in our model. In essence, our proposed ASTGCNs integrates adaptive graph convolution, an attention mechanism, a TCN block, and an ODE to collaboratively learn local-global spatial-temporal information. Experimental results show that our ASTGCNs is superior to state-of-the-art (SOTA) methods when applied to four real-world datasets.
… multi-view feature graphs. We then introduce an adaptive graph convolution method to … information from both the topology graph and multi-view feature graphs, which are capable of …
… To achieve efficient and accurate traffic flow prediction, we propose a correlation adaptive dynamic GCN (CADGCN) method. As shown in Fig. 1, CADGCN consists of a encoder-…
Traffic forecasting is one of the most fundamental components in many applications from urban computing to intelligent transportation. Recently, graph convolutional networks (GCNs), which model traffic data as a spatiotemporal graph, have attracted much attention. However, existing GCNs mainly focus on a pre-defined graph structure that is fixed over the entire network. These methods of traffic forecasting cannot capture the complex spatial correlations, especially for higher-level features. To address these problems, we propose a novel Multi-Adaptive Graph Convolutional Network (MAGCN) for traffic forecasting in this work. Our model can dynamically learn the topology of the graph through multi-range GCNs in an end-to-end manner. This data-driven method makes graph construction more flexible and increases the generality of the model to adapt to various data samples. Moreover, in the proposed framework, we design a novel Differential Temporal Graph Convolutional Network (DTGCN) to capture the periodic and immediate temporal correlations of traffic data, which is integrated in MAGCN to effectively capture the dynamic spatial-temporal dependencies. Besides, we adopt a multi-subgraph encoding mechanism to enhance the representation of complex spatial dependency. Extensive experiments on two real-world datasets demonstrate that the performance of our MAGCN exceeds the state-of-the-art baselines.
… Dynamic Adaptive Graph Convolutional Network (SILDAGCN… Moreover, it employs a dynamic adaptive graph … feature interaction learning mechanism designed to capture and learn the …
Spatiotemporal prediction plays a critical role in traffic forecasting, environmental monitoring, and social network analysis. However, existing methods still face challenges in modeling complex spatiotemporal dependencies. Conventional temporal models struggle to capture long-term dependencies, and traditional GNNs fail to dynamically adapt to changing spatial relationships. Additionally, many approaches lack sufficient multi-level feature fusion, limiting their ability to extract deep spatiotemporal representations. To address these issues, we propose STAE-GCN, a novel Spatio-Temporal Adaptive Embedding Graph Convolutional Network. STAE-GCN enhances temporal modeling through even-odd sequence decomposition instead of dilated convolution and improves feature representation via spatiotemporal adaptive embedding. It further integrates multi-layer GCNs to learn spatial features across depths and employs learnable fusion to enable richer spatiotemporal interaction. Final predictions are made by aggregating all outputs followed by ReLU activation and an MLP. Extensive experiments on real-world datasets show that STAE-GCN outperforms state-of-the-art methods, particularly in long-term forecasting tasks, demonstrating improved robustness and generalization. This work offers a new perspective on effective spatiotemporal data modeling.
As a crucial issue for the advancement of intelligent urban centers, air pollution in urban areas requires urgent resolution. Accurate air pollution forecasting hinges on aptly capturing the spatiotemporal (ST) interdependencies of air pollution data. In recent years, numerous models grounded in the Spatial-Temporal Graph Neural Networks (STGNNs) framework have been devised to encapsulate intricate ST dependencies within air pollution data. However, traditional STGNN models are ineffective in capturing long-range temporal trends at lower complexity and in accounting for hidden spatial dependencies due to the explicit graph structure. To address this issue, the Mixed Temporal Adaptive Graph Convolutional Networks (MTAGCN), composed of stacked ST blocks, is proposed for effectively acquiring spatial and temporal dependencies within air pollution data. More specifically, each spatial-temporal block contains two major parts: 1) the mixed temporal modules are proposed to capture the dynamic temporal patterns within the input data, learning the global temporal dependencies at lower complexity; 2) the graph generation layers produce a learnable self-adaptive graph structure that captures the hidden spatial dependencies between nodes, while the adaptive spatial layers dynamically learn the spatial patterns of the input data using both pre-defined and learnable graph structures. The exceptional performance and robustness of our method over other baseline models are illustrated by comprehensive experiments on two nationwide public datasets.
Traffic flow forecasting is the basis for the dynamic control and application of Intelligent Transportation Systems (ITS) and is of significant practical importance in reducing road congestion. The intricate spatial and temporal connections of traffic flow remain a significant obstacle to accurate traffic flow forecasting. To capture the dynamic spatio-temporal features of traffic flow simultaneously, this paper proposes a novel Spatiotemporal Probability Sparse Adaptive Hybrid Graph Convolution Network (STPASHG) for traffic flow prediction, mainly composed of a gated temporal convolution network (Gated TCN), an adaptive hybrid graph convolution module (AHGCM), a spatio-temporal convolution block (ST-Conv block), and a probability sparse self-attention mechanism (ProbSSAtt block). Among them, the Gated TCN utilizes dilated causal convolutional networks at different granularity levels to capture the temporal dependence of traffic flow, and the AHGCM utilizes static adaptive graph learning (SAGL), dynamic graph learning (DGL), and spatial gate fusion mechanisms to synchronously and adequately capture dynamic spatiotemporal features. The ProbSSAtt block combines the dynamic temporal and spatial features and enables the STPASHG model to make effective medium- and long-term predictions. The experimental results demonstrate that the proposed STPASHG model can successfully extract the dynamic spatio-temporal characteristics of traffic flow and achieves the best prediction performance compared to widely used baseline approaches.
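A dilated causal convolution, the building block inside such a Gated TCN, can be sketched in a few lines. The kernel values and toy flow series are illustrative assumptions, and the real Gated TCN additionally applies tanh/sigmoid gating to two parallel convolutions, which this sketch omits.

```python
def dilated_causal_conv(x, kernel, dilation):
    """Causal convolution with gaps of `dilation` steps between taps.

    The receptive field grows to (len(kernel) - 1) * dilation + 1 steps,
    which is how stacked TCN layers see long histories cheaply.
    """
    k = len(kernel)
    pad = (k - 1) * dilation
    padded = [0.0] * pad + list(x)
    return [sum(kernel[j] * padded[t + j * dilation] for j in range(k))
            for t in range(len(x))]

flow = [1.0, 2.0, 3.0, 4.0]
lagged = dilated_causal_conv(flow, [1.0, 0.0], dilation=2)  # picks x[t-2]
```

Doubling the dilation at each stacked layer (1, 2, 4, ...) makes the receptive field grow exponentially with depth while each layer stays cheap.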
… (AT-GCN) for gait-based recognition and motion prediction. … We also present a multi-task learning architecture, which can … tasks benefit from the auxiliary prediction task. Furthermore, we …
Predicting accurate and realistic future human poses from historically observed sequences is a fundamental task at the intersection of computer vision, graphics, and artificial intelligence. Recently, continuous efforts have been devoted to addressing this issue, achieving remarkable progress. However, existing work is seriously limited by the assumption of complete observation: once the historical motion sequence is incomplete (with missing values), it can only produce unexpected predictions or even deformities. Furthermore, due to inevitable factors such as occlusion and limited equipment precision, incompleteness of motion data occurs frequently, which hinders the practical application of current algorithms. In this work, we first address this challenging problem, i.e., how to generate high-fidelity human motion predictions from incomplete observations. To solve it, we propose a novel multi-task graph convolutional network (MTGCN). Specifically, the model involves two branches: the primary task focuses on accurately forecasting future 3D human actions, while the auxiliary one repairs the missing values of the incomplete observation. Both are integrated into a unified framework that shares the spatio-temporal representation, collaboratively improving the final performance of each. On three large-scale datasets, for various real-world data-missing scenarios, extensive experiments demonstrate that our approach is consistently superior to state-of-the-art methods that do not explicitly analyze the missing values of incomplete observations.
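The two-branch repair-plus-forecast idea can be caricatured without any graph machinery. The sketch below is a drastic simplification under assumed last-observation imputation and linear extrapolation; MTGCN's actual branches are GCNs sharing a spatio-temporal representation, which this does not reproduce.

```python
def repair_and_forecast(seq):
    """Auxiliary branch: impute missing values (None) by the last observation.
    Primary branch: linear extrapolation from the repaired sequence.

    The point of the toy: the forecast consumes the repaired sequence, so
    better repair directly improves the prediction -- the collaboration
    between the two tasks that the abstract describes.
    """
    repaired, last = [], 0.0
    for v in seq:
        last = v if v is not None else last
        repaired.append(last)
    forecast = repaired[-1] + (repaired[-1] - repaired[-2])
    return repaired, forecast

# A 1-D joint coordinate with one occluded frame.
rep, pred = repair_and_forecast([1.0, None, 3.0])
```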
Cellular traffic prediction enables operators to adapt to traffic demand in real-time for improving network resource utilization and user experience. To predict cellular traffic, previous studies either applied Recurrent Neural Networks (RNN) at individual base stations or adapted Convolutional Neural Networks (CNN) to work at grid-cells in a geographically defined grid. These solutions do not consider explicitly the effect of handover on the spatial characteristics of the traffic, which may lead to lower prediction accuracy. Furthermore, RNN solutions are slow to train, and CNN-grid solutions do not work for cells and are difficult to apply to base stations. This paper proposes a new prediction model, STGCN-HO, that uses the transition probability matrix of the handover graph to improve traffic prediction. STGCN-HO builds a stacked residual neural network structure incorporating graph convolutions and gated linear units to capture both spatial and temporal aspects of the traffic. Unlike RNN, STGCN-HO is fast to train and simultaneously predicts traffic demand for all base stations based on the information gathered from the whole graph. Unlike CNN-grid, STGCN-HO can make predictions not only for base stations, but also for cells within base stations. Experiments using data from a large cellular network operator demonstrate that our model outperforms existing solutions in terms of prediction accuracy.
Accurate air quality forecasting is essential in managing outdoor activity risk and responding to pollution emergencies. However, effectively modeling complex underlying spatiotemporal dependencies among monitoring stations remains a challenging task. Most existing methods deeply rely on local features to model dynamic spatial correlations and on RNNs to model temporal evolution. In this paper, we propose a novel multi-task deep spatiotemporal graph neural network, named MTGnet, for air quality prediction. MTGnet's main advantage is its ability to adaptively capture complex correlations among different stations, represented as an adjacency matrix, using both local features and global patterns. MTGnet consists of multiple convolutional layers for aggregating information about nearby stations and extracting essential spatial and temporal features for future air quality prediction. Using this architecture, we implement a multi-task learning scheme that trains the model to predict air quality both finely, at the station level, and coarsely, at the city level. Experiments on multiple real datasets demonstrate that MTGnet outperforms state-of-the-art methods.
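The fine/coarse multi-task training scheme can be sketched as a weighted sum of station-level and city-level errors. The `multitask_loss` helper, the mean aggregation as the "city level," and the weight `alpha` are assumptions for illustration, not MTGnet's actual objective.

```python
def multitask_loss(station_pred, station_true, city_true, alpha=0.5):
    """Weighted sum of fine (per-station) and coarse (city-level) squared error.

    The coarse target is compared against the mean of the station
    predictions, so both heads pull on the same shared predictions.
    """
    n = len(station_pred)
    fine = sum((p - t) ** 2 for p, t in zip(station_pred, station_true)) / n
    city_pred = sum(station_pred) / n          # coarse view: city average
    coarse = (city_pred - city_true) ** 2
    return alpha * fine + (1 - alpha) * coarse

# Two monitoring stations; toy AQI values.
loss = multitask_loss([50.0, 70.0], [55.0, 65.0], city_true=62.0)
```

Training against both terms regularizes the station-level head with the smoother city-level signal, which is one plausible reading of why the joint scheme helps.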
3D skeleton data has been widely used in action recognition, as skeleton-based methods achieve good performance in complex dynamic environments. The rise of spatio-temporal graph convolutions has drawn much attention to using graph convolution to extract spatial and temporal features together in skeleton-based action recognition. However, due to the large difference in the focus of spatial and temporal features, it is difficult to improve the efficiency of extracting spatiotemporal features. In this paper, we propose a channel attention and multi-scale neural network (CA-MSN) for skeleton-based action recognition with a series of spatio-temporal extraction modules. We exploit the relationship of body joints hierarchically through two modules: a spatial module, which uses a residual GCN network with a channel attention block to extract high-level spatial features, and a temporal module, which uses a multi-scale TCN network to extract temporal features at different scales. We perform extensive experiments on both the NTU-RGBD60 and NTU-RGBD120 datasets to verify the effectiveness of our network. The comparison results show that our method achieves state-of-the-art performance with competitive computing speed. To test the application effect of our CA-MSN model, we design a multi-task tandem network consisting of 2D pose estimation, 2D-to-3D pose regression, and skeleton action recognition models. The end-to-end (RGB video-to-action type) recognition effect is demonstrated. The code is available at https://github.com/Rh-Dang/CA-MSN-action-recognition.git.
Facial micro-expression (ME) recognition has attracted much attention recently. However, because MEs are spontaneous, subtle, and transient, recognizing them is a challenging task. In this paper, we first use transfer learning to apply learning-based video motion magnification to magnify MEs and extract shape information, aiming to address the low muscle movement intensity of MEs. Then, we design a novel graph-temporal convolutional network (Graph-TCN) to extract features of the local muscle movements of MEs. First, we define a graph structure based on the facial landmarks. Second, the Graph-TCN processes the graph structure in dual channels with a TCN block: one channel for node feature extraction and the other for edge feature extraction. Last, the edges and nodes are fused for classification. The Graph-TCN can automatically train the graph representation to distinguish MEs without using a hand-crafted graph representation. To the best of our knowledge, we are the first to use the learning-based video motion magnification method to extract features of shape representations from the intermediate layer while magnifying MEs. Furthermore, we are also the first to use deep learning to automatically train the graph representation for MEs.
Using a graph convolution network (GCN) for constructing and aggregating node features has been helpful for skeleton-based action recognition. The strength of the nodes' relations in an action sequence distinguishes it from other actions. This work proposes a novel spatial module called Multi-scale self-relational graph convolution (MS-SRGC) for dynamically modeling joint relations of action instances. Modeling the joints' relations is crucial in determining the spatial distinctiveness between skeleton sequences; hence MS-SRGC shows effectiveness for activity recognition. We also propose a Hybrid multi-scale temporal convolution network (HMS-TCN) that captures different ranges of time steps along the temporal dimension of the skeleton sequence. In addition, we propose a Spatio-temporal blackout (STB) module that randomly zeroes several consecutive frames for selected strategic joint groups. We sequentially stack our spatial (MS-SRGC) and temporal (HMS-TCN) modules to form a Self-relational graph convolution network (SR-GCN) block, which we use to construct our SR-GCN model. We append the STB on top of the SR-GCN model for the randomized operation. Leveraging the effectiveness of ensemble networks, we perform extensive experiments on single and multiple ensembles. Our results beat the state-of-the-art methods on the NTU RGB-D, NTU RGB-D 120, and Northwestern-UCLA datasets.
Recent research has shown that modeling the dynamic joint features of the human body with a graph convolutional network (GCN) is a groundbreaking approach for skeleton-based action recognition, especially for recognizing body motion, human-object, and human-human interactions. Nevertheless, how to model and utilize coherent skeleton information comprehensively is still an open problem. To capture rich spatiotemporal information and utilize features more effectively, we introduce a spatial residual layer and a dense connection block enhanced spatial temporal graph convolutional network. More specifically, our work introduces three aspects. Firstly, we extend spatial graph convolution to spatial temporal graph convolution with cross-domain residuals to extract more precise and informative spatiotemporal features, and reduce training complexity by feature fusion in the so-called spatial residual layer. Secondly, instead of simply superimposing multiple similar layers, we use dense connections to take full advantage of global information. Thirdly, we combine the above-mentioned two components to create a spatial temporal graph convolutional network (ST-GCN), referred to as SDGCN. The proposed graph representation has a new structure. We perform extensive experiments on two large datasets, Kinetics and NTU-RGB+D. Our method achieves a great improvement in performance compared to mainstream methods. We evaluate our method quantitatively and qualitatively, proving its effectiveness.
The ability to capture joint connections in complicated motion is essential for skeleton-based action recognition. However, earlier approaches may not be able to fully explore this connection in either the spatial or temporal dimension due to fixed or single-level topological structures and insufficient temporal modeling. In this paper, we propose a novel multilevel spatial-temporal excited graph network (ML-STGNet) to address the above problems. In the spatial configuration, we decouple the learning of the human skeleton into general and individual graphs by designing a multilevel graph convolution (ML-GCN) network and a spatial data-driven excitation (SDE) module, respectively. ML-GCN leverages joint-level, part-level, and body-level graphs to comprehensively model the hierarchical relations of a human body. Based on this, SDE is further introduced to handle the diverse joint relations of different samples in a data-dependent way. This decoupling approach not only increases the flexibility of the model for graph construction but also enables the generality to adapt to various data samples. In the temporal configuration, we apply the concept of temporal difference to the human skeleton and design an efficient temporal motion excitation (TME) module to highlight the motion-sensitive features. Furthermore, a simplified multiscale temporal convolution (MS-TCN) network is introduced to enrich the expression ability of temporal features. Extensive experiments on the four popular datasets NTU-RGB+D, NTU-RGB+D 120, Kinetics Skeleton 400, and Toyota Smarthome demonstrate that ML-STGNet gains considerable improvements over the existing state of the art.
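The temporal-difference excitation idea — letting frame-to-frame motion gate the features — can be sketched per joint channel as follows. The sigmoid gate on |Δ| is an assumed simplification of the TME module, not its actual formulation.

```python
import math

def motion_excitation(seq):
    """Gate each frame by a sigmoid of its frame-to-frame difference,
    so motion-rich frames pass through more strongly than static ones.
    """
    out, prev = [], seq[0]
    for v in seq:
        gate = 1.0 / (1.0 + math.exp(-abs(v - prev)))  # in (0.5, 1) when moving
        out.append(gate * v)
        prev = v
    return out

# A perfectly static joint: every gate stays at the sigmoid midpoint 0.5.
static = motion_excitation([2.0, 2.0, 2.0])
```

The effect is that static joints are attenuated uniformly while moving joints are emphasized, which is the "motion-sensitive feature" highlighting described in the abstract.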
Graph convolutional networks (GCNs) have been widely used and have achieved remarkable results in skeleton-based action recognition. We note that existing GCN-based approaches rely on local context information of the skeleton joints to construct adaptive graphs for feature aggregation, limiting their ability to understand actions that involve coordinated movements across various parts of the body. An adaptive graph built upon the global context information of the joints can help move beyond this limitation. Therefore, in this paper, we propose a novel approach to skeleton-based action recognition named Multi-stage Adaptive Graph Convolution Network (MSA-GCN). It consists of two modules: Multi-stage Adaptive Graph Convolution (MSA-GC) and Temporal Multi-Scale Transformer (TMST). These two modules work together to capture complex spatial and temporal patterns within skeleton data effectively. Specifically, MSA-GC explores both local and global context information of the joints across all sequences to construct the adaptive graph and facilitates the understanding of complex and nuanced relationships between joints. On the other hand, the TMST module integrates a Gated Multi-stage Temporal Convolution (GMSTC) with a Temporal Multi-Head Self-Attention (TMHSA) to capture global temporal features and accommodate both long-term and short-term dependencies within action sequences. Through extensive experiments on multiple benchmark datasets, including NTU RGB+D 60, NTU RGB+D 120, and Northwestern-UCLA, MSA-GCN achieves state-of-the-art performance and verifies its effectiveness in skeleton-based action recognition.
The merged research thread centers on the key ingredients of a Dual-Path Spatio-Temporal Multi-Task GCN: (1) explicit parallel decoupling of space and time (or multiple graphs/modalities) via dual branches/paths with cross-path interaction; (2) dynamic/adaptive graph learning and multi-view/multi-graph structures to strengthen the modeling of spatial dependencies; (3) attention and adaptive/gating mechanisms that dynamically re-weight the strength of spatio-temporal dependencies; (4) multi-task joint prediction on top of a shared graph representation; (5) multi-scale temporal modeling (gated/TCN/memory/causal convolutions, etc.) to reinforce both short- and long-term dependencies; (6) a small number of more engineering-oriented designs for lightweight, robust, training-friendly structures and error awareness. Together these form a combined framework of graph structure learning + spatio-temporal dynamic re-weighting + dual-path decoupled interaction + multi-task output + multi-scale temporal modeling.
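The combined framework can be caricatured end to end in a few lines: a spatial graph-diffusion path, a temporal causal-convolution path, a gated fusion, and two task heads sharing the fused representation. Every helper name and toy value below is an illustrative assumption, not any single paper's method.

```python
import math

def propagate(A, x):
    """Spatial path: one graph-diffusion hop (node i gathers over A[i][j])."""
    n = len(x)
    return [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]

def causal_smooth(history, kernel):
    """Temporal path: causal convolution output at the latest time step."""
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(history)
    return sum(kernel[j] * padded[len(history) - 1 + j] for j in range(k))

def dual_path_step(A, histories, kernel, gate_bias=0.0):
    """Gated fusion of the two paths feeding two shared-representation heads."""
    x_now = [h[-1] for h in histories]
    spatial = propagate(A, x_now)
    temporal = [causal_smooth(h, kernel) for h in histories]
    g = 1.0 / (1.0 + math.exp(-gate_bias))     # scalar fusion gate
    fused = [g * s + (1.0 - g) * t for s, t in zip(spatial, temporal)]
    forecast = fused                           # primary head: next-step values
    level = sum(fused) / len(fused)            # auxiliary head: global level
    return forecast, level

# Two nodes that exchange information, each with a 2-step history.
A = [[0, 1],
     [1, 0]]
forecast, level = dual_path_step(A, [[1.0, 1.0], [3.0, 3.0]], kernel=[1.0])
```

The sketch shows the shape of the framework only: in the surveyed models each path is a deep stack, the gate is learned per feature, and A itself may be adaptive as in items (2) and (3) above.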