Multi-Source Heterogeneous Information Fusion - Acadwrite

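Through a structured review of the literature, this report divides the field of multi-source heterogeneous information fusion into seven dimensions, ranging from foundational surveys to deep multimodal joint representation and covering knowledge graphs and semantic integration, privacy-preserving integration, industrial and physical sensor diagnostics, remote sensing and geospatial applications, and domain-specific recommendation and decision systems. Together, these studies show an end-to-end progression from low-level data processing and algorithmic modeling to vertical applications, and they highlight the core research value of resolving heterogeneity, improving robustness, and achieving semantic interoperability.

119 papers in total, 7 research directions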

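Foundational Theory and Surveys of Multi-Source Heterogeneous Data Fusion

Collects surveys and theoretical studies that systematically summarize the field of multi-source heterogeneous data fusion, define standards, design frameworks, and discuss general challenges. Related papers: Jing Gao et al., 2020 and others (8 papers)

Deep Multimodal Joint Representation and Cross-Modal Learning

Focuses on using deep learning architectures (such as attention, GNNs, and contrastive learning) to extract joint features from multimodal data, perform cross-modal translation, and build a unified feature space. Related papers: Fei Zhao et al., 2024 and others (33 papers)

Heterogeneous Networks, Knowledge Graphs, and Semantic Integration

Emphasizes the use of knowledge graphs, semantic mapping, and entity alignment to resolve structural heterogeneity and semantic and logical inconsistencies across multi-source knowledge. Related papers: Huan Zhao et al., 2021 and others (22 papers)

Privacy Protection and Secure Integration of Distributed Data

Examines secure alignment and fusion computation schemes for multi-source heterogeneous data in distributed environments, supported by techniques such as federated learning and differential privacy. Related papers: Seonghyeon Gong et al., 2025 and others (6 papers)

Multi-Source Sensor Fusion Diagnostics for Industrial and Physical Systems

Concentrates on processing heterogeneous signals produced by physical sensors and monitoring equipment, with applications in mechanical fault diagnosis, energy management, mining safety, and real-time monitoring and prediction in complex environments. Related papers: Qianqian Shi et al., 2017 and others (19 papers)

Multi-Source Geospatial Remote Sensing Applications

Targets multi-source information such as remote sensing imagery and geospatial data, using dedicated fusion algorithms to improve the accuracy of land-cover classification, geological interpretation, and disaster monitoring. Related papers: Bin Chen et al., 2017 and others (5 papers)

Domain-Specific Recommendation and Decision Support

Combines vertical industry scenarios (such as transportation, recommendation, and healthcare) to design decision support systems, behavior prediction models, and engineering deployment solutions that fuse multi-source heterogeneous data. Related papers: Dominik Cvetek et al., 2021 and others (26 papers)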

120 related papers in total

Multi-source heterogeneous data fusion

doi.org - Lili Zhang, Yuxiang Xie, Luan Xidao et al., 2018 - 2018 International Conference on Artificial Intelligence and Big Data (ICAIBD)

… data fusion. This paper introduces big data fusion and methods for heterogeneous data fusion, … learning methods in multisource heterogeneous data fusion. Challenges of dealing with …

Cited by 98

A multi-source heterogeneous data fusion method for intelligent systems in the Internet of Things

doi.org - Rongrong Sun, Yuemei Ren, 2024 - Intelligent Systems with Applications, IF 4.3

… Simulation tests confirm the superiority of our method, demonstrating a remarkable improvement in performance in the fusion of dynamic, multi-source heterogeneous data compared to …

Cited by 25

Bearing fault diagnosis method based on multi-source heterogeneous information fusion

doi.org - K Zhang, T Gao, H Shi, 2022 - Measurement Science and Technology, Zone 3, IF 3.4

… However, most existing multi-source fusion methods are in … data redundancy, fuzzy multi-source signal fusion strategy, and insufficient accuracy. As a result, a new multi-source fusion …

Cited by 79

A semantics-based approach to multi-source heterogeneous information fusion in the internet of things

doi.org - Feng Wang, Liang Hu, Jin Zhou et al., 2015 - Soft Computing, Zone 4, IF 2.5

… of IoT information fusion is required. We compare features of IoT data and information with an … Then, we design a framework for multi-source heterogeneous information fusion in the IoT …

Cited by 32

Credibility Assessment Method of Sensor Data Based on Multi-Source Heterogeneous Information Fusion

doi.org - Yanling Feng, Jixiong Hu, Rui Duan et al., 2021 - Sensors, Zone 3, IF 3.5

The credibility of sensor data is essential for security monitoring. High-credibility data are the precondition for utilizing data and data analysis, but the existing data credibility evaluation methods rarely consider the spatio-temporal relationship between data sources, which usually leads to low accuracy and low flexibility. In order to solve this problem, a new credibility evaluation method is proposed in this article, which includes two factors: the spatio-temporal relationship between data sources and the temporal correlation between time series data. First, the spatio-temporal relationship was used to obtain the credibility of data sources. Then, the combined credibility of data was calculated based on the autoregressive integrated moving average (ARIMA) model and back propagation (BP) neural network. Finally, the comprehensive data reliability for evaluating data quality can be acquired based on the credibility of data sources and combined data credibility. The experimental results show the effectiveness of the proposed method.

Cited by 16

Vehicle Heterogeneous Multi-Source Information Fusion Positioning Method

doi.org - Chengkai Tang, Chen Wang, Lingling Zhang et al., 2024 - IEEE Transactions on Vehicular Technology, Zone 2, IF 7.1

With the development of vehicle applications such as intelligent transportation and autonomous driving, the application fields based on location services have increasingly higher requirements for vehicle positioning reliability and real-time accuracy. However, the existing single navigation source of vehicles makes it difficult to realize real-time and high-precision positioning in different scenarios. The current multi-source information fusion methods have the problems of low generalization ability, poor expansibility, and high computational complexity, so it is challenging to apply in the field of vehicle positioning. To solve the above problems, this paper proposes a vehicle heterogeneous multi-source information fusion positioning method (MIFP) based on information probability, which converts the multiple heterogeneous navigation sources into information probability models to realize the unification of the time-frequency parameter format and designs an information fusion algorithm to realize the rapid fusion based on the theory of relative entropy. Through simulation tests and experimental verification by comparing with mainstream information fusion methods, such as the UKF method, the FGA method, and the NNA method, the MIFP method has high positioning accuracy and strong real-time performance. It can effectively solve the problems of weak expansion ability and large calculation amounts of current vehicle fusion positioning models. In the case of interference or mutation, the MIFP method can also suppress the influence of sudden errors on vehicle positioning.

Cited by 28

Deep learning based multi-source heterogeneous information fusion framework for online monitoring of surface quality in milling process

doi.org - Xiaofeng Wang, Jihong Yan, 2024 - Engineering Applications of Artificial Intelligence, Zone 1 (Top), IF 8.0

… the structural heterogeneity of various sensor data imposes barriers to information fusion as … This study developed a novel multi-source heterogeneous information fusion framework …

Cited by 20

An efficient hierarchical model for multi-source information fusion

doi.org - Ismaïl Saadi, B. Farooq, Ahmed M. Mustafa et al., 2018 - Expert Systems with Applications, Zone 1 (Top), IF 7.5

In urban and transportation research, important information is often scattered over a wide variety of independent datasets which vary in terms of described variables and sampling rates. As activity-travel behavior of people depends particularly on socio-demographics and transport/urban-related variables, there is an increasing need for advanced methods to merge information provided by multiple urban/transport household surveys. In this paper, we propose a hierarchical algorithm based on a Hidden Markov Model (HMM) and an Iterative Proportional Fitting (IPF) procedure to obtain quasi-perfect marginal distributions and accurate multi-variate joint distributions. The model allows for the combination of an unlimited number of datasets. The model is validated on the basis of a synthetic dataset with 1,000,000 observations and 8 categorical variables. The results reveal that the hierarchical model is particularly robust as the deviation between the simulated and observed multivariate joint distributions is extremely small and constant, regardless of the sampling rates and the composition of the datasets in terms of variables included in those datasets. Besides, the presented methodological framework allows for an intelligent merging of multiple data sources. Furthermore, heterogeneity is smoothly incorporated into micro-samples with small sampling rates subjected to potential sampling bias. These aspects are handled simultaneously to build a generalized probabilistic structure from which new observations can be inferred. A major impact in terms of expert systems is that the outputs of the hierarchical model (HM) serve as a basis for qualitative and quantitative analyses of integrated datasets.

Cited by 28
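
The abstract above names Iterative Proportional Fitting (IPF) as one of its two building blocks. The following is a minimal sketch of textbook two-dimensional IPF only, not the authors' hierarchical HMM + IPF algorithm; the function name `ipf`, the seed table, and the target marginals are all hypothetical values chosen for illustration.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Textbook 2-D Iterative Proportional Fitting on a list-of-lists table.

    Rescales `seed` until its row and column sums match the target
    marginals. Illustrative only; not the algorithm from the paper.
    """
    table = [row[:] for row in seed]  # work on a copy of the seed table
    for _ in range(max_iter):
        # Step 1: scale every row so it sums to its target marginal.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [v * target / total for v in table[i]]
        # Step 2: scale every column so it sums to its target marginal.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns match exactly after step 2; stop once rows match too.
        err = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if err < tol:
            break
    return table


# Hypothetical example: a 2x2 seed fitted to new marginals (both sum to 100).
fitted = ipf([[1.0, 2.0], [3.0, 4.0]],
             row_targets=[40.0, 60.0],
             col_targets=[30.0, 70.0])
```

The fitted table keeps the association structure of the seed (its cross-product ratio) while matching the requested marginals, which is why IPF is commonly used to reconcile samples with known population totals.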

Multi-source heterogeneous data fusion prediction technique for the utility tunnel fire detection

doi.org - Bin Sun, Yan Li, Yangyang Zhang et al., 2024 - Reliability Engineering & System Safety, Zone 1 (Top), IF 11.0

… Then, the multi-source heterogeneous data fusion fire detection is implemented for fire source localization and ceiling temperature distribution prediction based on Gauss model and the …

Cited by 46

Multi-source information fusion based heterogeneous network embedding

doi.org - Bentian Li, D. Pi, Yunxia Lin et al., 2020 - Information Sciences, Zone 2, IF 6.8

Heterogeneous network embedding aims to learn a mapping between network data in original topological space and vectored data in low dimensional latent space, while encoding valuable information, such as structural and semantic information. The resulting vector representation has shown promising performance for extensive real-world applications, such as node classification and node clustering. However, most of existing methods merely focus on modeling network structural information, ignoring the rich multi-source information of different types of nodes. In this paper, we propose a novel Multi-source Information Fusion based Heterogeneous Network Embedding (MIFHNE) approach. We first capture the semantic information using the strategy of meta-graph based random walk. Subsequently, we jointly model the structural proximity, attribute information and label information in the framework of Nonnegative Matrix Factorization (NMF). Theoretical proofs and comprehensive experiments on two real-world heterogeneous network datasets demonstrate the feasibility and effectiveness of our approach.

Cited by 39

FEV-Swin: Multi-source heterogeneous information fusion under a variant swin transformer framework for intelligent cross-domain fault diagnosis

doi.org - Keyi Zhou, N. Lu, Bin Jiang et al., 2025 - Knowledge-Based Systems, Zone 1 (Top), IF 7.6

… use of multi-source heterogeneous data to monitor the … information from multi-source heterogeneous data, this paper proposes a novel multi-source heterogeneous information fusion …

Cited by 37

Rotor unbalance fault diagnosis using DBN based on multi-source heterogeneous information fusion

doi.org - Jihong Yan, Yuanyuan Hu, Chaozhong Guo, 2019 - Procedia Manufacturing

In the age of Internet of Things and Industrial 4.0, new advanced methods need to be proposed to analyse massive multi-source heterogeneous data from rotating machinery since traditional data analysis methods are difficult to mine features effectively and provide accurate fault results automatically. This paper proposes a rotor unbalance fault diagnosis method using deep belief network (DBN) to learn the representative features automatically and accurately identify fault states. Multi-source heterogeneous information composed with vibration signal and shaft orbit plots generated by raw displacement signals can fully exploit multi-sensor information in fault diagnosis. And multi-DBN model was introduced to deal with multi-source heterogeneous information fusion problem containing all fault information which could adaptively learn useful features through multiple nonlinear transformations compared with traditional approaches depending on time-consuming and labour-intensive manual feature extraction. The results indicate that the accuracy of classifying rotor unbalance fault states is up to 100% under proper parameters of DBN which significantly improves the effect of fault recognition and validates effectiveness using the proposed method.

Cited by 45

Leakage diagnosis of natural gas pipeline based on multi-source heterogeneous information fusion

doi.org - X. Miao, Hong Zhao, 2024 - International Journal of Pressure Vessels and Piping, Zone 2, IF 3.5

… In this paper, we propose a multi-source heterogeneous information fusion method for the complementary fusion of laser optical sensing and weak magnetic technologies. Firstly, the …

Cited by 17

Multi-source heterogeneous information fusion fault diagnosis method based on deep neural networks under limited datasets

doi.org - Dongying Han, Yu Zhang, Yue Yu et al., 2024 - Applied Soft Computing, Zone 2 (Top), IF 6.6

… single monitoring data hinder the engineering application and generalization of diagnostic models to some extent. To this end, a novel multi-source heterogeneous information fusion (…

Cited by 38

Analysis of Substation Joint Safety Control System and Model Based on Multi-Source Heterogeneous Data Fusion

doi.org - Bo Wu, Yifan Hu, 2023 - IEEE Access, Zone 4, IF 3.6

As the number of substations continues to increase globally and the market demand continues to rise, the current workload of maintenance and daily operation of substations in power grids cannot meet the current demand if only relying on manual work, and the design and implementation of intelligent safety control solutions for substations is imperative. Therefore, this paper proposes a joint safety control system and model analysis for substations based on multi-source heterogeneous data fusion. Firstly, a three-dimensional visualization substation efficient interactive operation platform is realized, which realizes the functions of substation scene roaming, system login, information management, equipment parameters, status viewing and operation ticket pushing; after that, a variety of intelligent hardware devices for data collection, such as multi-dimensional terminal sensors, intelligent wearable devices, intelligent pre-built positioning installation measure rod, and substation intelligent inspection robots are designed to greatly improve the substation inspection efficiency and realize real-time monitoring and data interaction in the inspection process. Finally, we propose an Attention-LSTM-based prediction model for substation multidimensional data, which can predict power equipment spatio-temporal data in the short term, and the prediction results can be combined with intelligent devices for joint diagnosis. The Attention-LSTM prediction model is well-trained in transformer oil temperature experiments, and the experimental results show that this model can provide early warning for the abnormal state of substation power equipment. In summary, this thesis describes a set of complete and practically feasible intelligent safety control methods for substations. 
The joint safety control system and model analysis of the substation based on multi-source heterogeneous data fusion designed in this paper is mainly oriented to the substation as an electric power workplace, which has quite a vast application prospect for energy equipment.

Cited by 27

Multisource Heterogeneous Information Fusion Based on Graph Convolutional Network for Gearbox Fault Diagnosis

doi.org - Siyuan Gao, Khandaker Noman, Gang Mao et al., 2025 - IEEE Transactions on Instrumentation and Measurement, Zone 2, IF 5.9

Achieving information fusion of multisensor data plays an important role in improving the performance of gearbox fault diagnosis. However, this fusion process is hindered by the heterogeneity problem caused by the different data dimensions of various sensors. To solve this problem, exploitation of the complementary nature of multisource heterogeneous data to provide more accurate fault information is necessary. Thus, a multisource heterogeneous information fusion method-based graph convolutional network (MHIF-GCN) is proposed in this article. In this framework, a convolutional autoencoder (CAE) is used to extract deep features corresponding to different types of sensors as graph node features for solving data heterogeneity problems. Second, the graph convolutional network (GCN) model based on K-nearest neighbor graph (KNNGraph) is introduced to establish the connection between different sensor data in the graph structure for realizing the feature-level fusion of sensor data and mining deeper fault data features. The results of two gearbox experiments validate the excellent fault diagnosis performance of the proposed MHIF-GCN. In Experiment I, the MHIF-GCN can accurately recognize six structural and nonstructural fault types. With the support of the complementary fusion mechanism, the proposed MHIF-GCN has the highest average diagnostic accuracy of 99.00% when compared with the other six methods. Even with a small number of training samples, the MHIF-GCN still performs very favorably compared to other methods with an accuracy of 88.87%. In Experiment II, the MHIF-GCN has the highest diagnostic accuracy of 94.00%, and the recall, precision, and the F-score for each fault state remain above 85%, and the proposed MHIF-GCN maintains a stable diagnostic performance.

Cited by 3
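
The abstract above combines a K-nearest-neighbor graph over per-sensor features with a graph convolutional network. The sketch below illustrates only those two generic ingredients: a symmetric KNN adjacency matrix and a single GCN propagation step in the widely used normalized form ReLU(D^-1/2 (A + I) D^-1/2 H W). It is not the MHIF-GCN implementation; the helper names, the toy feature vectors (standing in for the autoencoder features the abstract mentions), and the weight matrix are assumptions.

```python
import math


def knn_adjacency(features, k):
    """Symmetric 0/1 adjacency linking each node to its k nearest neighbors."""
    n = len(features)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Sort the other nodes by Euclidean distance to node i.
        dists = sorted(
            (math.dist(features[i], features[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            adj[i][j] = adj[j][i] = 1.0  # symmetrize the edge
    return adj


def gcn_layer(adj, features, weight):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n, d_in, d_out = len(adj), len(weight), len(weight[0])
    # Add self-loops so every node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbor features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][m] * features[m][d] for m in range(n))
            for d in range(d_in)] for i in range(n)]
    return [[max(0.0, sum(agg[i][d] * weight[d][o] for d in range(d_in)))
             for o in range(d_out)] for i in range(n)]


# Hypothetical node features: two tight clusters of two nodes each.
nodes = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
graph = knn_adjacency(nodes, k=1)
fused = gcn_layer(graph, nodes, weight=[[1.0, 0.0], [0.0, 1.0]])
```

With an identity weight matrix, each output row is the degree-normalized average of a node and its neighbor, which makes the feature-level mixing across graph neighbors easy to inspect by hand.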

Multi-source heterogeneous data fusion technology for electric power based on big data mining

doi.org - Zhongjian Liu, Ruixin Qian, Xianing Jin et al., 2024 - Journal of Computational Methods in Sciences and Engineering, Zone 4, IF 0.4

With the rapid development of smart grid technology, a large amount of multi-source heterogeneous data has been generated in the power system, and its effective utilization is crucial for the optimization operation, demand prediction, and anomaly detection of the power system. However, the fusion processing of multi-source heterogeneous data faces many challenges, such as inconsistent data format, granularity, and quality, and direct fusion can easily lead to information redundancy and contradictions. A multi-source heterogeneous data fusion technology based on big data mining has been proposed to address the above issues. This method combines the advantages of convolutional neural networks and gated recurrent units to automatically extract features from image and sequence data and handle long-term dependency issues in time series data. Meanwhile, the K-means clustering algorithm is used to preprocess the data and train a specialized ConvGRU model. The results showed that in short-term load forecasting and abnormal electricity consumption behaviour detection tasks, the accuracy of this method reached 96.3% and 98.7%, respectively, with AUC values of 0.994 and 0.996. Compared to models that use only CNN or GRU, the performance is significantly improved. This method effectively solves the problem of integrating and processing multi-source heterogeneous power data, improves the accuracy and efficiency of power system data analysis, and provides strong support for the optimized operation of smart grids.

Cited by 14

Information Fusion for Multi-Source Material Data: Progress and Challenges

doi.org - Jingren Zhou, Xin Hong, Peiquan Jin, 2019 - Applied Sciences, Zone 4, IF 2.5

The development of material science in the manufacturing industry has resulted in a huge amount of material data, which are often from different sources and vary in data format and semantics. The integration and fusion of material data can offer a unified framework for material data representation, processing, storage and mining, which can further help to accomplish many tasks, including material data disambiguation, material feature extraction, material-manufacturing parameters setting, and material knowledge extraction. On the other side, the rapid advance of information technologies like artificial intelligence and big data, brings new opportunities for material data fusion. To the best of our knowledge, the community is currently lacking a comprehensive review of the state-of-the-art techniques on material data fusion. This review first analyzes the special properties of material data and discusses the motivations of multi-source material data fusion. Then, we particularly focus on the recent achievements of multi-source material data fusion. This review has a few unique features compared to previous studies. First, we present a systematic categorization and comparison framework for material data fusion according to the processing flow of material data. Second, we discuss the applications and impact of recent hot technologies in material data fusion, including artificial intelligence algorithms and big data technologies. Finally, we present some open problems and future research directions for multi-source material data fusion.

Cited by 46

A multi-source heterogeneous data fusion framework for fault diagnosis in industrial processes with missing image data

doi.org - Liang Ma, Qikai Yang, O. Llanes-Santiago et al., 2025 - Measurement, Zone 2, IF 5.6

… the comprehensive analysis of multi-source heterogeneous data and fault … multi-source heterogeneous data fusion framework is designed for fault diagnosis with missing image data…

Cited by 4

Multimodal Representation Learning: Advances, Trends and Challenges

doi.org - Sufang Zhang, Jun-Hai Zhai, Bo-Jun Xie et al., 2019 - 2019 International Conference on Machine Learning and Cybernetics (ICMLC)

Representation learning is the base and crucial for consequential tasks, such as classification, regression, and recognition. The goal of representation learning is to automatically learn good features with deep models. Multimodal representation learning is a special representation learning, which automatically learns good features from multiple modalities. These modalities are not independent: there are correlations and associations among modalities. Furthermore, multimodal data are usually heterogeneous. Due to these characteristics, multimodal representation learning poses many difficulties: how to combine multimodal data from heterogeneous sources; how to jointly learn features from multimodal data; how to effectively describe the correlations and associations, etc. These difficulties triggered great interest among researchers and, along with the upsurge of deep learning, many deep multimodal learning methods have been proposed by different researchers. In this paper, we present an overview of deep multimodal learning, especially the approaches proposed within the last decades. We provide potential readers with advances, trends and challenges, which can be very helpful to researchers in the field of machine, especially for the ones engaging in the study of multimodal deep machine learning.

Cited by 44

Deep Multimodal Representation Learning from Temporal Data

doi.org - Xitong Yang, Palghat Ramesh, Radha Chitta et al., 2017 - 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

In recent years, Deep Learning has been successfully applied to multimodal learning problems, with the aim of learning useful joint representations in data fusion applications. When the available modalities consist of time series data such as video, audio and sensor signals, it becomes imperative to consider their temporal structure during the fusion process. In this paper, we propose the Correlational Recurrent Neural Network (CorrRNN), a novel temporal fusion model for fusing multiple input modalities that are inherently temporal in nature. Key features of our proposed model include: (i) simultaneous learning of the joint representation and temporal dependencies between modalities, (ii) use of multiple loss terms in the objective function, including a maximum correlation loss term to enhance learning of cross-modal information, and (iii) the use of an attention model to dynamically adjust the contribution of different input modalities to the joint representation. We validate our model via experimentation on two different tasks: video-and sensor-based activity classification, and audio-visual speech recognition. We empirically analyze the contributions of different components of the proposed CorrRNN model, and demonstrate its robustness, effectiveness and state-of-the-art performance on multiple datasets.

Cited by 108
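
The abstract above mentions a maximum correlation loss term that encourages the representations of two modalities to agree. One common way to express such a term, sketched below as an assumption rather than the CorrRNN formulation, is the negative sum of per-dimension Pearson correlations computed over a batch. The function names and the toy batches are hypothetical.

```python
import math


def pearson(x, y, eps=1e-12):
    """Pearson correlation of two equal-length sequences (eps avoids 0/0)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y + eps)


def correlation_loss(h1, h2):
    """Negative sum of per-dimension correlations between two batches.

    h1 and h2 are batch x dim lists holding the representations of two
    modalities; minimizing the loss maximizes their correlation.
    """
    dims = len(h1[0])
    total = 0.0
    for d in range(dims):
        total += pearson([row[d] for row in h1], [row[d] for row in h2])
    return -total


# Hypothetical batches: modality B is a scaled copy of modality A,
# so each of the two dimensions is perfectly correlated.
batch_a = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
batch_b = [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0]]
loss = correlation_loss(batch_a, batch_b)
```

In a training setup this term would typically be added to the task loss with a weighting coefficient, so that the joint representation is shaped by both objectives.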

Deep Multimodal Data Fusion

doi.org - Fei Zhao, Chengcui Zhang, Baocheng Geng, 2024 - ACM Computing Surveys, Zone 1 (Top), IF 28.0

Multimodal Artificial Intelligence (Multimodal AI), in general, involves various types of data (e.g., images, texts, or data collected from different sensors), feature engineering (e.g., extraction, combination/fusion), and decision-making (e.g., majority vote). As architectures become more and more sophisticated, multimodal neural networks can integrate feature extraction, feature fusion, and decision-making processes into one single model. The boundaries between those processes are increasingly blurred. The conventional multimodal data fusion taxonomy (e.g., early/late fusion), based on which the fusion occurs in, is no longer suitable for the modern deep learning era. Therefore, based on the main-stream techniques used, we propose a new fine-grained taxonomy grouping the state-of-the-art (SOTA) models into five classes: Encoder-Decoder methods, Attention Mechanism methods, Graph Neural Network methods, Generative Neural Network methods, and other Constraint-based methods. Most existing surveys on multimodal data fusion are only focused on one specific task with a combination of two specific modalities. Unlike those, this survey covers a broader combination of modalities, including Vision + Language (e.g., videos, texts), Vision + Sensors (e.g., images, LiDAR), and so on, and their corresponding tasks (e.g., video captioning, object detection). Moreover, a comparison among these methods is provided, as well as challenges and future directions in this area.

Cited by 351
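
The abstract above contrasts its new taxonomy with the conventional early/late fusion split and mentions majority voting as a decision step. As a rough illustration of that conventional split only, the sketch below shows early fusion as feature concatenation and late fusion as a majority vote over per-modality decisions. The function names and sample inputs are hypothetical and are not taken from the survey.

```python
from collections import Counter


def early_fusion(*modality_features):
    """Early fusion: concatenate per-modality feature vectors into one."""
    fused = []
    for feats in modality_features:
        fused.extend(feats)
    return fused


def late_fusion_majority(decisions):
    """Late fusion: majority vote over per-modality decisions.

    Ties resolve to the label that appears first in `decisions`.
    """
    return Counter(decisions).most_common(1)[0][0]


# Hypothetical inputs: image and LiDAR feature vectors, then three
# independent per-modality class predictions.
joint_vector = early_fusion([0.2, 0.7], [1.5])
final_label = late_fusion_majority(["car", "truck", "car"])
```

Early fusion lets a single model see all modalities at once, while late fusion keeps one model per modality and only combines their outputs; the survey argues that modern architectures blur this boundary.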

Deep Multimodal Representation Learning: A Survey

doi.org - Wenzhong Guo, Jianwen Wang, Shiping Wang, 2019 - IEEE Access, Zone 4, IF 3.6

Multimodal representation learning, which aims to narrow the heterogeneity gap among different modalities, plays an indispensable role in the utilization of ubiquitous multimodal data. Due to the powerful representation ability with multiple levels of abstraction, deep learning-based multimodal representation learning has attracted much attention in recent years. In this paper, we provided a comprehensive survey on deep multimodal representation learning which has never been concentrated entirely. To facilitate the discussion on how the heterogeneity gap is narrowed, according to the underlying structures in which different modalities are integrated, we category deep multimodal representation learning methods into three frameworks: joint representation, coordinated representation, and encoder-decoder. Additionally, we review some typical models in this area ranging from conventional models to newly developed technologies. This paper highlights on the key issues of newly developed technologies, such as encoder-decoder model, generative adversarial networks, and attention mechanism in a multimodal representation learning perspective, which, to the best of our knowledge, have never been reviewed previously, even though they have become the major focuses of much contemporary research. For each framework or model, we discuss its basic structure, learning objective, application scenes, key issues, advantages, and disadvantages, such that both novel and experienced researchers can benefit from this survey. Finally, we suggest some important directions for future work.

Cited by 463

A Survey on Deep Learning for Multimodal Data Fusion

doi.org - Jing Gao, Peng Li, Zhikui Chen et al., 2020 - Neural Computation, Zone 4, IF 2.1

With the wide deployments of heterogeneous networks, huge amounts of data with characteristics of high volume, high variety, high velocity, and high veracity are generated. These data, referred to multimodal big data, contain abundant intermodality and cross-modality information and pose vast challenges on traditional data fusion methods. In this review, we present some pioneering deep learning models to fuse these multimodal big data. With the increasing exploration of the multimodal big data, there are still some challenges to be addressed. Thus, this review presents a survey on deep learning for multimodal data fusion to provide readers, regardless of their original community, with the fundamentals of multimodal deep learning fusion method and to motivate new multimodal data fusion techniques of deep learning. Specifically, representative architectures that are widely used are summarized as fundamental to the understanding of multimodal deep learning. Then the current pioneering multimodal data fusion deep learning models are summarized. Finally, some challenges and future topics of multimodal data fusion deep learning models are described.

Cited by 696

Transfer Representation Learning Meets Multimodal Fusion Classification for Remote Sensing Images

doi.org - Mengru Ma, Wenping Ma, Licheng Jiao et al., 2022 - IEEE Transactions on Geoscience and Remote Sensing, Zone 1 (Top), IF 8.6

To maximize the complementary advantages of synergistic multimodal, a transfer representation learning fusion network (TRLF-Net) is proposed for multisource remote sensing images collaborative classification in this article. First, with respect to the feature encoding, we design a dual-branch attention sparse transfer module (DAST-Module), which combines the spatial and channel attention (CA) masks to migrate the advantage attributes of the panchromatic (PAN) and the MS images mutually. This not only enhances their respective image advantages but also facilitates the sparse fusion of low-level features. Second, for the separation of multiscale information, a deep dual-scale decomposition module (DDSD-Module) is designed, which allows the decompose of high-frequency and low-frequency components. Then it uses the decomposed information to make the essential difference as small as possible, and the surrounding contour difference is as large as possible of the complementary multimodal image through the design of the loss function. Finally, to address the problem of large intraclass and small interclass differences, we develop a representation fusion of the global and local features’ module (RFGAL-Module). It mainly adopts global features to sort local features within classes, and then outputs them in a cascade. Thus, the characterization ability of features is improved, and the global and local features are used in a coordinated manner to accomplish the sample classification tasks. In particular, the experimental results demonstrate that TRLF-Net can obtain much improved accuracy and efficiency. The code is accessible in: https://github.com/ru-willow/SRLF-Net.

Cited by 21

Molecular representation learning via multimodal fusion and decoupling

doi.org - Xuan Zang, Junjie Zhang, Buzhou Tang, 2026 - Information Fusion, Zone 1 (Top), IF 15.5

… a multimodal fusion-then-decoupling self-supervised molecular representation learning … First, we use a unified encoder to fuse 2D and 3D molecular structural information by …

Cited by 4

Multimodal Data Fusion: An Overview of Methods, Challenges, and Prospects

doi.org - D. Lahat, T. Adalı, C. Jutten, 2015 - Proceedings of the IEEE, Zone 1 (Top), IF 25.9

In various disciplines, information about the same phenomenon can be acquired from different types of detectors, at different conditions, in multiple experiments or subjects, among others. We use the term “modality” for each such acquisition framework. Due to the rich characteristics of natural phenomena, it is rare that a single modality provides complete knowledge of the phenomenon of interest. The increasing availability of several modalities reporting on the same system introduces new degrees of freedom, which raise questions beyond those related to exploiting each modality separately. As we argue, many of these questions, or “challenges,” are common to multiple domains. This paper deals with two key issues: “why we need data fusion” and “how we perform it.” The first issue is motivated by numerous examples in science and technology, followed by a mathematical framework that showcases some of the benefits that data fusion provides. In order to address the second issue, “diversity” is introduced as a key concept, and a number of data-driven solutions based on matrix and tensor decompositions are discussed, emphasizing how they account for diversity across the data sets. The aim of this paper is to provide the reader, regardless of his or her community of origin, with a taste of the vastness of the field, the prospects, and the opportunities that it holds.

Cited by 1112

Multimodal deep learning for biomedical data fusion: a review

doi.org - S. Stahlschmidt, B. Ulfenborg, Jane Synnergren, 2022 - Briefings in Bioinformatics, Zone 2, IF 7.7

Biomedical data are becoming increasingly multimodal and thereby capture the underlying complex relationships among biological processes. Deep learning (DL)-based data fusion strategies are a popular approach for modeling these nonlinear relationships. Therefore, we review the current state-of-the-art of such methods and propose a detailed taxonomy that facilitates more informed choices of fusion strategies for biomedical applications, as well as research on novel methods. By doing so, we find that deep fusion strategies often outperform unimodal and shallow approaches. Additionally, the proposed subcategories of fusion strategies show different advantages and drawbacks. The review of current methods has shown that, especially for intermediate fusion strategies, joint representation learning is the preferred approach as it effectively models the complex interactions of different levels of biological organization. Finally, we note that gradual fusion, based on prior biological knowledge or on search strategies, is a promising future research path. Similarly, utilizing transfer learning might overcome sample size limitations of multimodal data sets. As these data sets become increasingly available, multimodal DL approaches present the opportunity to train holistic models that can learn the complex regulatory dynamics behind health and disease.

Cited by 589

Multimodal deep representation learning for video classification

doi.org - Haiman Tian, Yudong Tao, Samira Pouyanfar et al., 2018 - World Wide Web, Zone 4, IF 3.4

… learning models ignore some data types and only focus on a single modality. This paper presents a new multimodal deep learning … Multimodal data fusion is critical yet challenging for a …

Cited by 78

Representation Learning and Nature Encoded Fusion for Heterogeneous Sensor Networks

doi.org - Longwei Wang, Q. Liang, 2019 - IEEE Access, Zone 4, IF 3.6

Target detection based on heterogeneous sensor networks is considered in this paper. The fusion problem is investigated to take full advantage of the information in multi-modal data. The sensing data may not be compatible with each other due to heterogeneous sensing modalities, and the joint PDF of the sensors is not easily available. A two-stage fusion method is proposed to solve the heterogeneous data fusion problem. First, the multi-modality data is transformed into the same representation form by a certain linear or nonlinear transformation. Since there is a model mismatch among the different modalities, each modality is trained by an individual statistical model. In this way, the information of different modalities is preserved. Then, the representation is used as the input of the probabilistic fusion. The probabilistic framework allows data from different modalities to be processed in a unified information fusion space. The inherent inter-sensor relationship is exploited to encode the original sensor data on a graph. Iterative belief propagation is used to fuse the local sensing belief. The more general correlation case is also considered, in which the relation between two sensors is characterized by the correlation factor. Numerical results are provided to validate the effectiveness of the proposed method in heterogeneous sensor network fusion.

Cited by 15

Effective Techniques for Multimodal Data Fusion: A Comparative Analysis

doi.org - Maciej Pawłowski, Anna Wróblewska, S. Sysko-Romańczuk, 2022 - Sensors, Zone 3, IF 3.5

Data processing in robotics is currently challenged by the effective building of multimodal and common representations. Tremendous volumes of raw data are available and their smart management is the core concept of multimodal learning in a new paradigm for data fusion. Although several techniques for building multimodal representations have been proven successful, they have not yet been analyzed and compared in a given production setting. This paper explored three of the most common techniques, (1) the late fusion, (2) the early fusion, and (3) the sketch, and compared them in classification tasks. Our paper explored different types of data (modalities) that could be gathered by sensors serving a wide range of sensor applications. Our experiments were conducted on Amazon Reviews, MovieLens25M, and MovieLens1M datasets. Their outcomes allowed us to confirm that the choice of fusion technique for building multimodal representation is crucial to obtain the highest possible model performance resulting from the proper modality combination. Consequently, we designed criteria for choosing this optimal data fusion technique.

Cited by 156
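The abstract above contrasts early fusion and late fusion. As a rough illustration only (not the authors' implementation), the following sketch assumes early fusion means concatenating per-modality feature vectors before a single model, and late fusion means averaging the class scores produced by separate per-modality models. The function names and toy values are hypothetical, and the "sketch" technique from the paper is not covered here.

```python
from typing import List, Optional, Sequence


def early_fusion(modalities: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate per-modality feature vectors into one joint vector."""
    fused: List[float] = []
    for features in modalities:
        fused.extend(features)
    return fused


def late_fusion(
    scores: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Combine per-modality class scores with a weighted average."""
    if weights is None:
        weights = [1.0] * len(scores)  # assumption: equal trust in each modality
    total = sum(weights)
    n_classes = len(scores[0])
    return [
        sum(w * s[c] for w, s in zip(weights, scores)) / total
        for c in range(n_classes)
    ]


# Early fusion: a single downstream model would consume the joint vector.
text_features = [0.2, 0.7]
image_features = [0.9, 0.1, 0.4]
joint = early_fusion([text_features, image_features])

# Late fusion: each modality has already been scored by its own model.
text_scores = [0.6, 0.4]
image_scores = [0.2, 0.8]
fused_scores = late_fusion([text_scores, image_scores])
predicted_class = max(range(len(fused_scores)), key=fused_scores.__getitem__)
```

Early fusion lets one model learn cross-modal interactions but requires all modalities at once; late fusion keeps the per-modality models independent, which is simpler when a modality may be missing.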

Modality to Modality Translation: An Adversarial Representation Learning and Graph Fusion Network for Multimodal Fusion

doi.org - Sijie Mai, Haifeng Hu, Songlong Xing, 2019 - Proceedings of the AAAI Conference on Artificial Intelligence

Learning joint embedding space for various modalities is of vital importance for multimodal fusion. Mainstream modality fusion approaches fail to achieve this goal, leaving a modality gap which heavily affects cross-modal fusion. In this paper, we propose a novel adversarial encoder-decoder-classifier framework to learn a modality-invariant embedding space. Since the distributions of various modalities vary in nature, to reduce the modality gap, we translate the distributions of source modalities into that of target modality via their respective encoders using adversarial training. Furthermore, we exert additional constraints on embedding space by introducing reconstruction loss and classification loss. Then we fuse the encoded representations using hierarchical graph neural network which explicitly explores unimodal, bimodal and trimodal interactions in multi-stage. Our method achieves state-of-the-art performance on multiple datasets. Visualization of the learned embeddings suggests that the joint embedding space learned by our method is discriminative.

Cited by 234

A Comprehensive Survey on Multimodal Data Representation and Information Fusion Algorithms

doi.org - Apeksha Gaonkar, Yogya Chukkapalli, P. J. Raman et al., 2021 - 2021 International Conference on Intelligent Technologies (CONIT)

A contemporary survey on recent advancements in the field of multimodal signal processing, with a focus on multimodal data representation and information fusion, is presented in this paper. Multimodal data representation is of critical importance in many signal processing applications, and information fusion algorithms aim at narrowing the heterogeneity gap among the different modalities. First, we start with a brief overview of techniques for some of the commonly used unimodal signals such as text, speech and image, which serves as a fundamental requirement for multimodal representation. Next, we discuss multimodal data representation with audio-video, iris, fingerprint, face, LiDAR scanning and images. Later, we provide details on information fusion, broadly classified into model-agnostic and model-based approaches, and mention some applications. Further, we discuss some of the challenges associated with multimodal signal processing, in terms of uncertainties, mismatches and inaccuracies in data representation and fusion.

Cited by 24

A multimodal differential privacy framework based on fusion representation learning

doi.org - Chaoxin Cai, Yingpeng Sang, Hui Tian, 2022 - Connection Science, Zone 3, IF 3.4

Differential privacy mechanisms vary in modalities, and there have been many methods implementing differential privacy on unimodal data. Few studies focus on unifying them to protect multimodal data, though privacy protection of multimodal data is of great significance. In our work, we propose a multimodal differential privacy protection framework. Firstly, we use multimodal representation learning to fuse different modalities and map them to the same subspace. Then based on this representation, we use the Local Differential Privacy (LDP) mechanism to protect data. We propose two protection methods for low-dimensional and high-dimensional fusion tensors respectively. The former is based on Binary Encoding, and the latter is based on multi-dimensional Fourier Transform. To the best of our knowledge, we are the first to propose LDP-based methods for the representation learning of multimodal fusion. Experimental results demonstrate the flexibility of our framework where both approaches show efficient performance as well as high data utility.

Cited by 5
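The abstract above applies Local Differential Privacy to a binary-encoded fused representation but does not spell out the mechanism. As a generic illustration, the sketch below uses classic binary randomized response, which is a standard LDP primitive and is swapped in here as an assumption; it is not claimed to be the paper's Binary Encoding method. The function names, the seed, and the toy data are hypothetical.

```python
import math
import random
from typing import List


def randomize_bits(bits: List[int], epsilon: float, rng: random.Random) -> List[int]:
    """Report each bit truthfully with probability e^eps / (1 + e^eps), else flip it.

    Each bit satisfies eps-LDP on its own; releasing d bits of one record this
    way consumes d * eps of privacy budget under basic composition.
    """
    keep_prob = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return [b if rng.random() < keep_prob else 1 - b for b in bits]


def estimate_mean(reported: List[int], epsilon: float) -> float:
    """Debias the observed fraction of ones to estimate the true fraction."""
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reported) / len(reported)
    # E[observed] = f * p + (1 - f) * (1 - p), solved for the true fraction f.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


# Toy population: one private bit per user, 30% of which are ones.
rng = random.Random(0)
true_bits = [1] * 3000 + [0] * 7000
reported_bits = randomize_bits(true_bits, epsilon=1.0, rng=rng)
estimated_fraction = estimate_mean(reported_bits, epsilon=1.0)
```

No individual report can be trusted, yet the aggregate estimate stays close to the true fraction; a smaller epsilon gives stronger privacy at the cost of a noisier estimate.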

Multimodal Representation Learning by Alternating Unimodal Adaptation

doi.org - Xiaohui Zhang, Jaehong Yoon, Mohit Bansal et al., 2023 - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Multimodal learning, which integrates data from diverse sensory modes, plays a pivotal role in artificial intelligence. However, existing multimodal learning methods often struggle with challenges where some modalities appear more dominant than others during multimodal learning, resulting in suboptimal performance. To address this challenge, we propose MLA (Multimodal Learning with Alternating Uni-modal Adaptation). MLA reframes the conventional joint multimodal learning process by transforming it into an alternating unimodal learning process, thereby minimizing interference between modalities. Simultaneously, it captures cross-modal interactions through a shared head, which undergoes continuous optimization across different modalities. This optimization process is controlled by a gradient modification mechanism to prevent the shared head from losing previously acquired information. During the inference phase, MLA utilizes a test-time uncertainty-based model fusion mechanism to integrate multimodal information. Extensive experiments are conducted on five diverse datasets, encompassing scenarios with complete modalities and scenarios with missing modalities. These experiments demonstrate the superiority of MLA over competing prior approaches. Our code is available at https://github.com/Cecile-hi/MLA.

Cited by 100

Manifold alignment for heterogeneous single-cell multi-omics data integration using Pamona

doi.org - Kai Cao, Yiguang Hong, Lin Wan, 2021 - Bioinformatics, Zone 3, IF 5.4

MOTIVATION: Single-cell multi-omics sequencing data can provide a comprehensive molecular view of cells. However, effective approaches for the integrative analysis of such data are challenging. Existing manifold alignment methods demonstrated the state-of-the-art performance on single-cell multi-omics data integration, but they are often limited by requiring that single-cell datasets be derived from the same underlying cellular structure. RESULTS: In this study, we present Pamona, a partial Gromov-Wasserstein distance-based manifold alignment framework that integrates heterogeneous single-cell multi-omics datasets with the aim of delineating and representing the shared and dataset-specific cellular structures across modalities. We formulate this task as a partial manifold alignment problem and develop a partial Gromov-Wasserstein optimal transport framework to solve it. Pamona identifies both shared and dataset-specific cells based on the computed probabilistic couplings of cells across datasets, and it aligns cellular modalities in a common low-dimensional space, while simultaneously preserving both shared and dataset-specific structures. Our framework can easily incorporate prior information, such as cell type annotations or cell-cell correspondence, to further improve alignment quality. We evaluated Pamona on a comprehensive set of publicly available benchmark datasets. We demonstrated that Pamona can accurately identify shared and dataset-specific cells, as well as faithfully recover and align cellular structures of heterogeneous single-cell modalities in a common space, outperforming the comparable existing methods. AVAILABILITY AND IMPLEMENTATION: Pamona software is available at https://github.com/caokai1073/Pamona. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Cited by 98

Heterogeneous Data Integration: A Literature Scope Review

doi.org - S. Borowicc, S. N. Alves-Souza, 2024 - Proceedings of the 26th International Conference on Enterprise Information Systems

Data have been collected by communities for analysis, visualization, predictions and other activities to support data-driven decisions. Obtaining value from data assets directly depends on the data integration task. However, Big Data poses new challenges to integration due to data heterogeneity. It is essential to understand the main problems and to know technologies and techniques that have been employed to improve the ability to obtain value by heterogeneous data integration. This paper presents a literature scope review that highlights the main techniques applied to heterogeneous data integration. The literature reviewed presents solutions mostly focusing on a specific purpose or part of the integration process instead of a clear understanding of how the techniques can be used in a complete integration process. Therefore, this work shows a whole picture of a data integration process, organizing the techniques according to their functionalities, and presents a workflow with tasks associated with techniques and resources, focusing on semantic mediation, such as mapping and matching tasks. Ontologies and semantic web technologies are promising to address data heterogeneity and have been used in the semantic enrichment of data and semantic mediation between data sources and the global model. However, some aspects remain to be further investigated, such as ontology and terminology construction, data processing scalability and semantic mediation, especially for mapping definition.

Cited by 7

Data Integration for Heterogenous Datasets

doi.org - J. Hendler, 2014 - Big Data, Zone 4

… data” area, in which the variety of heterogeneous data being used, rather than the scale of the data being analyzed, is the limiting factor in data … guarantee that terms align or even that …

Cited by 74

Heterogeneous Data Fusion via Space Alignment Using Nonmetric Multidimensional Scaling

doi.org - J. Choo, S. Bohn, Grant C. Nakamura et al., 2012 - Proceedings of the 2012 SIAM International Conference on Data Mining

… For instance, when integrating multi-lingual data, we can match them in a feature level by comparing the terms between different languages [2] or even use off-the-shelf translation …

Cited by 25

Towards Heterogeneous Network Alignment: Design and Implementation of a Large-Scale Data Processing Framework

doi.org - Marianna Milano, P. Veltri, M. Cannataro et al., 2018 - Lecture Notes in Computer Science

… mine heterogeneous networks. We propose a two-step alignment strategy that receives as input two heterogeneous … For the sake of the simplicity we consider only the integration of two …

Cited by 2

Pattern fusion analysis by adaptive alignment of multiple heterogeneous omics data

doi.org - Qianqian Shi, Chuanchao Zhang, Minrui Peng et al., 2017 - Bioinformatics, Zone 3, IF 5.4

Motivation: Integrating different omics profiles is a challenging task, which provides a comprehensive way to understand complex diseases in a multi-view manner. One key for such an integration is to extract intrinsic patterns in concordance with data structures, so as to discover consistent information across various data types even with noise pollution. Thus, we proposed a novel framework called ‘pattern fusion analysis’ (PFA), which performs automated information alignment and bias correction, to fuse local sample-patterns (e.g. from each data type) into a global sample-pattern corresponding to phenotypes (e.g. across most data types). In particular, PFA can identify significant sample-patterns from different omics profiles by optimally adjusting the effects of each data type to the patterns, thereby alleviating the problems of processing different platforms and different reliability levels of heterogeneous data. Results: To validate the effectiveness of our method, we first tested PFA on various synthetic datasets, and found that PFA can not only capture the intrinsic sample clustering structures from the multi-omics data in contrast to the state-of-the-art methods, such as iClusterPlus, SNF and moCluster, but also provide an automatic weight-scheme to measure the corresponding contributions by data types or even samples. In addition, the computational results show that PFA can reveal shared and complementary sample-patterns across data types with distinct signal-to-noise ratios in Cancer Cell Line Encyclopedia (CCLE) datasets, and outperforms other works at identifying clinically distinct cancer subtypes in The Cancer Genome Atlas (TCGA) datasets. Availability and implementation: PFA has been implemented as a Matlab package, which is available at http://www.sysbio.ac.cn/cb/chenlab/images/PFApackage_0.1.rar. Supplementary information: Supplementary data are available at Bioinformatics online.

Cited by 68

Heterogeneous data integration methods for patient similarity networks

doi.org - J. Gliozzo, M. Mesiti, M. Notaro et al., 2022 - Briefings in Bioinformatics, Zone 2, IF 7.7

Patient similarity networks (PSNs), where patients are represented as nodes and their similarities as weighted edges, are being increasingly used in clinical research. These networks provide an insightful summary of the relationships among patients and can be exploited by inductive or transductive learning algorithms for the prediction of patient outcome, phenotype and disease risk. PSNs can also be easily visualized, thus offering a natural way to inspect complex heterogeneous patient data and providing some level of explainability of the predictions obtained by machine learning algorithms. The advent of high-throughput technologies, enabling us to acquire high-dimensional views of the same patients (e.g. omics data, laboratory data, imaging data), calls for the development of data fusion techniques for PSNs in order to leverage this rich heterogeneous information. In this article, we review existing methods for integrating multiple biomedical data views to construct PSNs, together with the different patient similarity measures that have been proposed. We also review methods that have appeared in the machine learning literature but have not yet been applied to PSNs, thus providing a resource to navigate the vast machine learning literature existing on this topic. In particular, we focus on methods that could be used to integrate very heterogeneous datasets, including multi-omics data as well as data derived from clinical information and medical imaging.

Cited by 13
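The abstract above describes patient similarity networks as graphs with patients as nodes and similarities as weighted edges, built from several data views. The following sketch shows one very simple way such a network could be assembled, assuming a Gaussian kernel on Euclidean distance as the similarity measure and an unweighted element-wise mean as the fusion step. Both choices are assumptions for illustration; the review surveys many more elaborate measures and fusion methods. All names and values are hypothetical.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def similarity_network(view: Sequence[Sequence[float]], sigma: float = 1.0) -> Matrix:
    """Gaussian-kernel similarity between every pair of patients in one data view."""
    n = len(view)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(view[i], view[j]))
            sim[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return sim


def fuse_networks(networks: Sequence[Matrix]) -> Matrix:
    """Element-wise mean of per-view similarity matrices (same patient order)."""
    n = len(networks[0])
    return [
        [sum(net[i][j] for net in networks) / len(networks) for j in range(n)]
        for i in range(n)
    ]


# Three patients described by two heterogeneous views with different dimensions.
omics_view = [[0.0, 0.0], [0.1, 0.0], [3.0, 3.0]]
clinical_view = [[1.0], [1.2], [5.0]]
fused = fuse_networks(
    [similarity_network(omics_view), similarity_network(clinical_view)]
)
```

Because each view is reduced to a patient-by-patient matrix before fusion, views with different feature dimensions and scales can be combined without aligning their feature spaces.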

Integrating Heterogeneous Data: A Systematic Review of Challenges and Evolution Solution

doi.org - Meriem Bensaci, Mohammed Charaf Eddine Meftah, Elahe Meftah et al., 2025 - Proceedings of the 9th International Conference on Future Networks and Distributed Systems

Data integration has become a cornerstone of modern data-driven systems, enabling organizations to combine heterogeneous, distributed data sources into unified, actionable forms. Despite substantial advancements, challenges such as semantic heterogeneity, scalability, data quality, and automation continue to limit the efficiency and reliability of integration techniques. This paper presents a comprehensive systematic literature review that investigates the major challenges, existing techniques, and emerging trends in data integration research. Following a rigorous four-stage selection process, high-quality studies published were analyzed to synthesize both theoretical frameworks and practical solutions. The reviewed literature reveals an evolution from traditional rule-based and ontology-driven approaches toward AI-assisted, machine learning-based, and cloud-enabled integration architectures. The study identifies ongoing research gaps and highlights the need for scalable, intelligent data integration frameworks, supported by reported improvements such as a 13.2% increase in precision and a 30% reduction in performance costs achieved by modern methods.

An approach for semantic integration of heterogeneous data sources

doi.org - Giuseppe Fusco, L. Aversano, 2020 - PeerJ Computer Science, Zone 4, IF 2.5

Integrating data from multiple heterogeneous data sources entails dealing with data distributed among heterogeneous information sources, which can be structured, semi-structured or unstructured, and providing the user with a unified view of these data. Thus, in general, gathering information is challenging, and one of the main reasons is that data sources are designed to support specific applications. Very often their structure is unknown to the large part of users. Moreover, the stored data is often redundant, mixed with information only needed to support enterprise processes, and incomplete with respect to the business domain. Collecting, integrating, reconciling and efficiently extracting information from heterogeneous and autonomous data sources is regarded as a major challenge. In this paper, we present an approach for the semantic integration of heterogeneous data sources, DIF (Data Integration Framework), and a software prototype to support all aspects of a complex data integration process. The proposed approach is an ontology-based generalization of both Global-as-View and Local-as-View approaches. In particular, to overcome problems due to semantic heterogeneity and to support interoperability with external systems, ontologies are used as a conceptual schema to represent both data sources to be integrated and the global view.

Cited by 18

Federated Learning for Heterogeneous Data Integration and Privacy Protection

doi.org - Chenwei Gong, Xuyang Zhang, Yuzhen Lin et al., 2025 - … Cooperative Work in …

Federated learning (FL) represents a promising approach that enables the collaborative training of machine learning models without compromising data privacy. This approach is particularly advantageous when handling heterogeneous data dispersed across numerous institutions or devices, as centralized data aggregation is often constrained by privacy concerns and data regulations. In order to address the challenges posed by heterogeneous data, we have devised an adaptive data integration mechanism. This mechanism maps the features of disparate data sources to a unified feature space through the use of feature alignment technology, thereby facilitating the effective fusion of data. This fusion is achieved through the application of statistical alignment and multi-perspective learning technology. Furthermore, in order to safeguard the confidentiality of data, we integrate differential privacy and homomorphic encryption techniques, thereby preventing the disclosure of information during model updates and data transfers. Furthermore, a multi-level privacy protection strategy is proposed, which employs de-identification, secure multi-party computation, and federated averaging technologies at the three stages of data preprocessing, model training, and result aggregation, respectively. This approach ensures data security and facilitates effective model updates. The experimental results demonstrate that the proposed framework exhibits enhanced model performance and robustness in comparison to traditional federated learning methods on a multitude of real-world heterogeneous datasets.

Cited by 5
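The abstract above names federated averaging as the technique used at the result aggregation stage. A minimal sketch of that aggregation step alone is given below, assuming each client reports a flat list of model parameters and its local sample count. The differential privacy, homomorphic encryption, and secure multi-party computation layers mentioned in the abstract are deliberately omitted, and the function name and toy values are hypothetical.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client parameters, weighting each client by its sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(n * w[i] for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two clients: the second holds three times as much data, so it dominates.
global_model = federated_average(
    client_weights=[[1.0, 2.0], [3.0, 4.0]],
    client_sizes=[1, 3],
)
```

Only parameters and sample counts leave each client in this scheme; the raw records stay local, which is the property the privacy mechanisms in the abstract then strengthen.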

Creating Embeddings of Heterogeneous Relational Datasets for Data Integration Tasks

doi.org - Riccardo Cappuzzo, Paolo Papotti, Saravanan Thirumuruganathan, 2020 - Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data

Deep learning based techniques have been recently used with promising results for data integration problems. Some methods directly use pre-trained embeddings that were trained on a large corpus such as Wikipedia. However, they may not always be an appropriate choice for enterprise datasets with custom vocabulary. Other methods adapt techniques from natural language processing to obtain embeddings for the enterprise's relational data. However, this approach blindly treats a tuple as a sentence, thus losing a large amount of contextual information present in the tuple. We propose algorithms for obtaining local embeddings that are effective for data integration tasks on relational databases. We make four major contributions. First, we describe a compact graph-based representation that allows the specification of a rich set of relationships inherent in the relational world. Second, we propose how to derive sentences from such a graph that effectively "describe" the similarity across elements (tokens, attributes, rows) in the two datasets. The embeddings are learned based on such sentences. Third, we propose effective optimization to improve the quality of the learned embeddings and the performance of integration tasks. Finally, we propose a diverse collection of criteria to evaluate relational embeddings and perform an extensive set of experiments validating them against multiple baseline methods. Our experiments show that our framework, EmbDI, produces meaningful results for data integration tasks such as schema matching and entity resolution both in supervised and unsupervised settings.

Cited by 135
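The abstract above describes turning relational data into a graph over tokens, attributes, and rows, and then deriving "sentences" from that graph for embedding training. The sketch below is one plausible reading of that idea, not the EmbDI implementation: every cell value becomes a token node linked to a row node and a column node, and random walks over the graph yield the sentences. The node-naming scheme, the walk length, and the sample table are all hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List


def build_graph(rows: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Link each cell token to the node of its row and the node of its column."""
    graph: Dict[str, List[str]] = defaultdict(list)
    for idx, row in enumerate(rows):
        row_node = f"row__{idx}"
        for column, value in row.items():
            token = f"tok__{value}"
            for other in (row_node, f"col__{column}"):
                if other not in graph[token]:  # avoid duplicate edges
                    graph[token].append(other)
                    graph[other].append(token)
    return dict(graph)


def random_walk(
    graph: Dict[str, List[str]], start: str, length: int, rng: random.Random
) -> List[str]:
    """Generate one sentence by hopping between neighbouring nodes."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(graph[walk[-1]]))
    return walk


table = [
    {"title": "Alien", "director": "Scott"},
    {"title": "Blade Runner", "director": "Scott"},
]
graph = build_graph(table)
rng = random.Random(42)
sentences = [random_walk(graph, node, length=5, rng=rng) for node in graph]
```

The resulting sentences could be passed to a word-embedding trainer (not shown). Because the shared token `tok__Scott` connects both rows, walks can cross from one record to the other, which is how co-occurrence across tuples ends up in the training corpus.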

Integration of heterogeneous geospatial data in a federated database

doi.org - M. Butenuth, G. V. Goesseln, M. Tiedge et al., 2007 - ISPRS Journal of Photogrammetry and Remote Sensing, Zone 1 Top, IF 12.2

The integration of heterogeneous geospatial data offers possibilities to manually and automatically derive new information, which are not available when using only a single data source. Furthermore, it allows for a consistent representation and the propagation of updates from one data set to the other. However, different acquisition methods, data schemata and updating cycles of the content can lead to discrepancies in geometric and thematic accuracy and correctness which hamper the combined integration. To overcome these difficulties, appropriate methods for the integration and harmonization of data from different sources and of different types are needed. In this paper we describe two generic cases including novel integration algorithms, namely the integration of two heterogeneous vector data sets, and the integration of raster and vector data. Both algorithms are linked to a federated database which allows for automatic object matching and for managing n:m relationships. We describe and illustrate our work using vector data from topography and the geosciences, as well as multi-spectral imagery. © 2007 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.

Cited by 134

Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection

doi.org - Hao Chen, Youfu Li, Dan Su, 2019 - Pattern Recognition, Zone 1 Top, IF 7.6

Paired RGB and depth images are becoming popular multi-modal data adopted in computer vision tasks. Traditional methods based on Convolutional Neural Networks (CNNs) typically fuse RGB and depth by combining their deep representations in a late stage with only one path, which can be ambiguous and insufficient for fusing large amounts of cross-modal data. To address this issue, we propose a novel multi-scale multi-path fusion network with cross-modal interactions (MMCI), in which the traditional two-stream fusion architecture with single fusion path is advanced by diversifying the fusion path to a global reasoning one and another local capturing one and meanwhile introducing cross-modal interactions in multiple layers. Compared to traditional two-stream architectures, the MMCI net is able to supply more adaptive and flexible fusion flows, thus easing the optimization and enabling sufficient and efficient fusion. Concurrently, the MMCI net is equipped with multi-scale perception ability (i.e., simultaneously global and local contextual reasoning). We take RGB-D saliency detection as an example task. Extensive experiments on three benchmark datasets show the improvement of the proposed MMCI net over other state-of-the-art methods.

Cited by 333

CMBF: Cross-Modal-Based Fusion Recommendation Algorithm

doi.org - Xi Chen, Yang Lu, Yuehai Wang et al., 2021 - Sensors, Zone 3, IF 3.5

A recommendation system is often used to recommend items that may be of interest to users. One of the main challenges is that the scarcity of actual interaction data between users and items restricts the performance of recommendation systems. To solve this problem, multi-modal technologies have been used for expanding available information. However, the existing multi-modal recommendation algorithms all extract the features of a single modality and simply splice the features of different modalities to predict the recommendation results. This fusion method cannot fully mine the relevance of multi-modal features and loses the relationships between different modalities, which affects the prediction results. In this paper, we propose a Cross-Modal-Based Fusion Recommendation Algorithm (CMBF) that can capture both the single-modal features and the cross-modal features. Our algorithm uses a novel cross-modal fusion method to fuse the multi-modal features completely and learn the cross information between different modalities. We evaluate our algorithm on two datasets, MovieLens and Amazon. Experiments show that our method has achieved the best performance compared to other recommendation algorithms. We also design an ablation study to show that our cross-modal fusion method improves the prediction results.

Cited by 17

Multimodal Industrial Anomaly Detection via Uni-Modal and Cross-Modal Fusion

doi.org - Hao Cheng, Jiaxiang Luo, Xianyong Zhang, 2025 - IEEE Transactions on Industrial Informatics, Zone 1 Top, IF 9.9

Constructing comprehensive multimodal feature representations from RGB images (RGB) and point clouds (PT) in 2D–3D multimodal anomaly detection (MAD) methods is very important to reveal various types of industrial anomalies. For multimodal representations, most of the existing MAD methods consider the explicit spatial correspondence between the modality-specific features extracted from RGB and PT through space-aligned fusion, while overlooking the implicit interaction relationships between them. In this study, we propose a uni-modal and cross-modal fusion (UCF) method, which comprehensively incorporates the implicit relationships within and between modalities in multimodal representations. Specifically, UCF first establishes uni-modal and cross-modal embeddings to capture intramodal and intermodal relationships through uni-modal reconstruction and cross-modal mapping. Then, an adaptive nonequal fusion method is proposed to develop fusion embeddings, with the aim of preserving the primary features and reducing interference of the uni-modal and cross-modal embeddings. Finally, uni-modal, cross-modal, and fusion embeddings are used together to reveal anomalies existing in different modalities. Experiments conducted on the MVTec 3D-AD benchmark and the real-world surface mount inspection demonstrate that the proposed UCF outperforms existing approaches, particularly in precise anomaly localization.

Cited by 10

Cross-Modal Retrieval: A Systematic Review of Methods and Future Directions

doi.org - Lei Zhu, Tianshi Wang, Fengling Li et al., 2023 - Proceedings of the IEEE, Zone 1 Top, IF 25.9

With the exponential surge in diverse multimodal data, traditional unimodal retrieval methods struggle to meet the needs of users seeking access to data across various modalities. To address this, cross-modal retrieval has emerged, enabling interaction across modalities, facilitating semantic matching, and leveraging complementarity and consistency between heterogeneous data. Although prior literature has reviewed the field of cross-modal retrieval, it suffers from numerous deficiencies in terms of timeliness, taxonomy, and comprehensiveness. This article conducts a comprehensive review of cross-modal retrieval’s evolution, spanning from shallow statistical analysis techniques to vision-language pretraining (VLP) models. Commencing with a comprehensive taxonomy grounded in machine learning paradigms, mechanisms, and models, this article delves deeply into the principles and architectures underpinning existing cross-modal retrieval methods. Furthermore, it offers an overview of widely used benchmarks, metrics, and performances. Lastly, this article probes the prospects and challenges that confront contemporary cross-modal retrieval, while engaging in a discourse on potential directions for further progress in the field. To facilitate the ongoing research on cross-modal retrieval, we develop a user-friendly toolbox and an open-source repository at https://cross-modal-retrieval.github.io.

Cited by 104

Advancing Multi-Modal Beam Prediction With Cross-Modal Feature Enhancement and Dynamic Fusion Mechanism

Qihao Zhu, Yu Wang, Wenmei Li et al., 2025. IEEE Transactions on Communications. Zone 2 (Top), IF 8.3

In millimeter-wave and terahertz band communication systems, precise beam prediction is crucial for optimizing network performance and enhancing signal transmission efficiency. Traditional beam prediction methods have primarily relied on single-modal data, which often fails to capture the comprehensive environmental information necessary for optimal accuracy. In contrast, multi-modal data-based approaches offer a more promising solution by leveraging the strengths of diverse data sources. However, many existing fusion methods are static, inadequately accounting for variations in information content across different modalities, which can hinder the full utilization of each modality’s advantages. To address these limitations, this paper proposes an advanced multi-modal beam prediction method that integrates multipath-like data augmentation (MLDA), cross-modal feature enhancement (CMFE), and an uncertainty-aware dynamic fusion mechanism. Our approach combines image and radar data to predict beam indices, dynamically adjusting the weights of different modalities to accommodate varying information densities. The proposed method employs ResNet34 for feature extraction from the multi-modal data, followed by a cross-modal feature enhancement module that aggregates complementary information from the image and radar data. Finally, the dynamic fusion mechanism integrates the predictions from the single-modal data. Experimental results demonstrate that our method significantly improves the accuracy and robustness of beam prediction, achieving an overall accuracy of 89.72%. The performance of the proposed method is further validated through comparisons with various existing methods and comprehensive ablation studies, highlighting its superiority in multi-modal assisted beam prediction scenarios.

Cited by 22
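The abstract above describes an uncertainty-aware dynamic fusion step that reweights the per-modality predictions. A minimal sketch of one way such a late fusion could work is given here, assuming entropy as the uncertainty measure; the function names `entropy` and `dynamic_fusion` and the weighting rule are illustrative assumptions, not the paper's actual mechanism.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def dynamic_fusion(pred_image, pred_radar):
    """Fuse two per-modality beam distributions (hypothetical helper).

    Each modality is weighted by its confidence, defined here (an assumption)
    as the gap between the maximum possible entropy and its own entropy, so a
    near-uniform (uninformative) prediction receives almost no weight.
    """
    max_h = math.log(len(pred_image))
    conf = [max(0.0, max_h - entropy(p)) for p in (pred_image, pred_radar)]
    total = sum(conf)
    # Fall back to equal weights when both predictions are uninformative.
    weights = [0.5, 0.5] if total == 0.0 else [c / total for c in conf]
    fused = [weights[0] * a + weights[1] * b
             for a, b in zip(pred_image, pred_radar)]
    return fused, weights


# Example: the image branch is confident, the radar branch is uniform.
fused, weights = dynamic_fusion([0.7, 0.1, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25])
```

In this example nearly all of the weight goes to the image branch, because the uniform radar prediction carries no information about the beam index.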

XKanFuse: A novel cross-modal fusion method based on Kolmogorov-Arnold Network for multi-modal medical image fusion

Xinjian Wei, Yafei Xiong, Haotian Lu et al., 2025. Knowledge-Based Systems. Zone 1 (Top), IF 7.6

… image fusion is proposed to improve fusion performance, … ) in XKanFuse enables effective cross-modal exchange and … facilitating precise cross-modal interaction and fusion. Extensive …

Cited by 6

Cross-Modal Fusion Convolutional Neural Networks With Online Soft-Label Training Strategy for Mechanical Fault Diagnosis

Yadong Xu, Ke Feng, Xiaoan Yan et al., 2024. IEEE Transactions on Industrial Informatics. Zone 1 (Top), IF 9.9

Convolutional neural network (CNN)-based fault detection approaches based on multisource signals have attracted increasing interest from the research community and industrial practices, thanks to the powerful feature representation capability of CNN and the rapid development of sensor technology. Various strategies have been applied in existing CNN-based diagnostic models to learn features from 1-D real-valued multivariate data. However, the distribution gap and the intrinsic correlations among multisource mechanical signals during the learning process have been rarely considered, which may lead to suboptimal fault identification results. To tackle this issue, this article proposes a cross-modal fusion convolutional neural network (CMFCNN) for mechanical fault diagnosis, which performs modality-specific and cross-modal feature representation on multisource data. Specifically, CMFCNN adopts two parallel modality-specific networks and a cross-modal knowledge-sharing network to fully explore independent and shared features from the multisource mechanical signals. To achieve effective feature propagation and fusion, a cross-modal fusion module is introduced to integrate cross-modal features and pass the fused information to the next layer. Moreover, to alleviate overfitting and achieve a better diagnostic performance of the framework, an online soft-label training algorithm is adopted in the CMFCNN training phase. Extensive experimental results on the cylindrical rolling bearing dataset and the planetary gearbox dataset validate that the proposed CMFCNN outperforms seven state-of-the-art methods significantly, especially under strong noise conditions.

Cited by 74

Kernel Cross-Modal Factor Analysis for Information Fusion With Application to Bimodal Emotion Recognition

Yongjin Wang, L. Guan, A. Venetsanopoulos, 2012. IEEE Transactions on Multimedia. Zone 1 (Top), IF 9.7

… Abstract—In this paper, we investigate kernel based methods for multimodal information analysis and fusion. We introduce a novel approach, kernel cross-modal factor analysis, which …

Cited by 135

A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives

Shuyu Li, Shulei Ji, Zihao Wang et al., 2025. ACM Computing Surveys. Zone 1 (Top), IF 28.0

With the rapid development of artificial intelligence, music generation has evolved from single-modal to cross-modal approaches and is gradually moving toward multi-modal fusion. This survey systematically reviews this developmental trajectory. The discussion begins with the representation methods for key modalities, including audio, symbolic, text, and visual data. Music generation techniques are then organized across single-modal, cross-modal, and multi-modal settings. In addition, key datasets and evaluation methodologies relevant to these tasks are compiled. Finally, the survey discusses core challenges in the field, including modal fusion, data scarcity, and evaluation frameworks, and outlines potential directions for future research.

Cited by 4

Deep Multiscale Fusion Hashing for Cross-Modal Retrieval

Xiushan Nie, Bowei Wang, Jiajia Li et al., 2021. IEEE Transactions on Circuits and Systems for Video Technology. Zone 1 (Top), IF 11.1

Owing to the rapid development of deep learning and the high efficiency of hashing, hashing methods based on deep learning models have been extensively adopted in the area of cross-modal retrieval. In general, in existing deep model-based methods, modality-specific features play an important role during the hash learning. However, most existing methods only use the modality-specific features from the final fully connected layer, ignoring the semantic relevance among modality-specific features with different scales in multiple layers. To address this issue, in this study, we put forward an end-to-end deep hashing method called deep multiscale fusion hashing (DMFH) for cross-modal retrieval. For the proposed DMFH, we first design different network branches for two modalities and then adopt multiscale fusion models for each branch network to fuse the multiscale semantics, which can be used to explore the semantic relevance. Furthermore, the multi-fusion models also embed the multiscale semantics into the final hash codes, making the final hash codes more representative. In addition, the proposed DMFH can learn common hash codes directly without a relaxation, thereby avoiding a loss in accuracy during hash learning. Experimental results on three benchmark datasets prove the relative superiority of the proposed method.

Cited by 83

The State of the Art for Cross-Modal Retrieval: A Survey

Kun Zhou, F. H. Hassan, Gan Keng Hoon, 2023. IEEE Access. Zone 4, IF 3.6

Cross-modal retrieval, which aims to search for semantically relevant data across different modalities, has received increasing attention in recent years. Deep learning, with its ability to extract high-level representations from multimodal data, has become a popular approach for cross-modal retrieval. In this paper, we present a comprehensive survey of deep learning techniques for cross-modal retrieval including 37 papers published in recent years. The review is organized into four main sections, covering traditional subspace learning methods, deep learning, and machine learning-based approaches, techniques based on large multi-modal models, and an analysis of datasets used in the field of cross-modal retrieval. We compare and analyze the performance of different deep learning methods on benchmark datasets, the result shows that although a large number of innovative methods have been proposed, there are still some problems that need to be solved, such as multi-modal feature alignment, multi-modal feature fusion, and subspace learning, as well as specialized datasets.

Cited by 16

AlignMamba: Enhancing Multimodal Mamba with Local and Global Cross-Modal Alignment

Yan Li, Yifei Xing, Xiangyuan Lan et al., 2024. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Cross-Modal alignment is crucial for multimodal representation fusion due to the inherent heterogeneity between modalities. While Transformer-Based methods have shown promising results in modeling inter-modal relationships, their quadratic computational complexity limits their applicability to long-sequence or large-scale data. Although recent Mamba-Based approaches achieve linear complexity, their sequential scanning mechanism poses fundamental challenges in comprehensively modeling cross-modal relationships. To address this limitation, we propose Align-Mamba, an efficient and effective method for multimodal fusion. Specifically, grounded in Optimal Transport, we introduce a local cross-modal alignment module that explicitly learns token-level correspondences between different modalities. Moreover, we propose a global cross-modal alignment loss based on Maximum Mean Discrepancy to implicitly enforce the consistency between different modal distributions. Finally, the unimodal representations after local and global alignment are passed to the Mamba backbone for further cross-modal interaction and multimodal fusion. Extensive experiments on complete and incomplete multimodal fusion tasks demonstrate the effectiveness and efficiency of the proposed method. For instance, on the CMU-MOSI dataset, AlignMamba improves classification accuracy by 0.9%, reduces GPU memory usage by 20.3%, and decreases inference time by 83.3%.

Cited by 25
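The global alignment loss mentioned in the AlignMamba abstract is based on Maximum Mean Discrepancy (MMD). As a rough illustration of that quantity only, the sketch below computes the standard biased estimate of squared MMD between two sets of feature vectors. The Gaussian kernel, the bandwidth `sigma`, and the function names are assumptions made for this example; the paper's actual loss may differ.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD between sample sets xs and ys.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)],
    with each expectation replaced by a mean over all sample pairs.
    """
    def mean_kernel(a, b):
        return sum(rbf_kernel(u, v, sigma) for u in a for v in b) / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical token features from two modalities.
text_feats = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
audio_near = [[0.1, 0.1], [1.1, 0.1], [0.1, 1.1]]
audio_far = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]

near_gap = mmd_squared(text_feats, audio_near)
far_gap = mmd_squared(text_feats, audio_far)
```

A smaller value indicates that the two feature distributions are closer, which is why a term of this kind can serve as an alignment penalty during training.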

A cross modal hierarchical fusion multimodal sentiment analysis method based on multi-task learning

Lan Wang, Junjie Peng, Cangzhi Zheng et al., 2024. Information Processing & Management. Zone 1 (Top), IF 6.9

… fusion of heterogeneous data is one of the core problems of multimodal sentiment analysis. Most cross-modal fusion … propose a cross-modal hierarchical fusion method for multimodal …

Cited by 78

Cross-Modal Fusion and Attention Mechanism for Weakly Supervised Video Anomaly Detection

Ayush Ghadiya, P. Kar, Vishal M. Chudasama et al., 2024. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Recently, weakly supervised video anomaly detection (WS-VAD) has emerged as a contemporary research direction to identify anomaly events like violence and nudity in videos using only video-level labels. However, this task has substantial challenges, including addressing imbalanced modality information and consistently distinguishing between normal and abnormal features. In this paper, we address these challenges and propose a multi-modal WS-VAD framework to accurately detect anomalies such as violence and nudity. Within the proposed framework, we introduce a new fusion mechanism known as the Cross-modal Fusion Adapter (CFA), which dynamically selects and enhances highly relevant audio-visual features in relation to the visual modality. Additionally, we introduce a Hyperbolic Lorentzian Graph Attention (HLGAtt) to effectively capture the hierarchical relationships between normal and abnormal representations, thereby enhancing feature separation accuracy. Through extensive experiments, we demonstrate that the proposed model achieves state-of-the-art results on benchmark datasets of violence and nudity detection.

Cited by 20

RGBD Salient Object Detection via Disentangled Cross-Modal Fusion

Hao Chen, Yongjian Deng, Youfu Li et al., 2020. IEEE Transactions on Image Processing. Zone 1 (Top), IF 13.7

Depth is beneficial for salient object detection (SOD) for its additional saliency cues. Existing RGBD SOD methods focus on tailoring complicated cross-modal fusion topologies, which although achieve encouraging performance, are with a high risk of over-fitting and ambiguous in studying cross-modal complementarity. Different from these conventional approaches combining cross-modal features entirely without differentiating, we concentrate our attention on decoupling the diverse cross-modal complements to simplify the fusion process and enhance the fusion sufficiency. We argue that if cross-modal heterogeneous representations can be disentangled explicitly, the cross-modal fusion process can hold less uncertainty, while enjoying better adaptability. To this end, we design a disentangled cross-modal fusion network to expose structural and content representations from both modalities by cross-modal reconstruction. For different scenes, the disentangled representations allow the fusion module to easily identify and incorporate desired complements for informative multi-modal fusion. Extensive experiments show the effectiveness of our designs and a large outperformance over state-of-the-art methods.

Cited by 87

Embedding-based entity alignment between multi-source temporal knowledge graphs

Lin Zhu, Nan Li, Luyi Bai, 2024. Engineering Applications of Artificial Intelligence. Zone 1 (Top), IF 8.0

… The goal of entity alignment is to identify entities in two multi-source knowledge graphs (KGs… Recent researches on multi-source entity alignment mainly concentrate on static KGs. In fact…

Cited by 4

LMKG: A large-scale and multi-source medical knowledge graph for intelligent medicine applications

Peiru Yang, Hongjun Wang, Yingzhuo Huang et al., 2023. Knowledge-Based Systems. Zone 1 (Top), IF 7.6

… knowledge triplets from medical texts. Then we propose a hierarchical entity alignment framework for further knowledge … -scale, high-quality, multi-source, and multi-lingual medical KG …

Cited by 42

Collective Multi-type Entity Alignment Between Knowledge Graphs

Qi Zhu, Hao Wei, Bunyamin Sisman et al., 2020. Proceedings of The Web Conference 2020

Knowledge graph (e.g. Freebase, YAGO) is a multi-relational graph representing rich factual information among entities of various types. Entity alignment is the key step towards knowledge graph integration from multiple sources. It aims to identify entities across different knowledge graphs that refer to the same real world entity. However, current entity alignment systems overlook the sparsity of different knowledge graphs and can not align multi-type entities by one single model. In this paper, we present a Collective Graph neural network for Multi-type entity Alignment, called CG-MuAlign. Different from previous work, CG-MuAlign jointly aligns multiple types of entities, collectively leverages the neighborhood information and generalizes to unlabeled entity types. Specifically, we propose novel collective aggregation function tailored for this task, that (1) relieves the incompleteness of knowledge graphs via both cross-graph and self attentions, (2) scales up efficiently with mini-batch training paradigm and effective neighborhood sampling strategy. We conduct experiments on real world knowledge graphs with millions of entities and observe the superior performance beyond existing methods. In addition, the running time of our approach is much less than the current state-of-the-art deep learning methods.

Cited by 59

MMIEA: Multi-modal Interaction Entity Alignment model for knowledge graphs

Bin Zhu, Meng-Sheng Wu, Yunpeng Hong et al., 2023. Information Fusion. Zone 1 (Top), IF 15.5

… entity alignment for knowledge graphs and proposed the Multi-Modal Interaction Entity Alignment … INT model for the entity alignment task in multi-modal knowledge graphs. Experimental …

Cited by 27

Multi-source knowledge fusion: a survey

Xiaojuan Zhao, Yan Jia, Aiping Li et al., 2020. World Wide Web. Zone 4, IF 3.4

Multi-source knowledge fusion is one of the important research topics in the fields of artificial intelligence, natural language processing, and so on. The research results of multi-source knowledge fusion can help computer to better understand human intelligence, human language and human thinking, effectively promote the Big Search in Cyberspace, effectively promote the construction of domain knowledge graphs (KGs), and bring enormous social and economic benefits. Due to the uncertainty of knowledge acquisition, the reliability and confidence of KG based on entity recognition and relationship extraction technology need to be evaluated. On the one hand, the process of multi-source knowledge reasoning can detect conflicts and provide help for knowledge evaluation and verification; on the other hand, the new knowledge acquired by knowledge reasoning is also uncertain and needs to be evaluated and verified. Collaborative reasoning of multi-source knowledge includes not only inferring new knowledge from multi-source knowledge, but also conflict detection, i.e. identifying erroneous knowledge or conflicts between knowledges. Starting from several related concepts of multi-source knowledge fusion, this paper comprehensively introduces the latest research progress of open-source knowledge fusion, multi-knowledge graphs fusion, information fusion within KGs, multi-modal knowledge fusion and multi-source knowledge collaborative reasoning. On this basis, the challenges and future research directions of multi-source knowledge fusion in a large-scale knowledge base environment are discussed.

Cited by 94

Cross-knowledge-graph entity alignment via relation prediction

Hongren Huang, Chen Li, Xutan Peng et al., 2021. Knowledge-Based Systems. Zone 1 (Top), IF 7.6

… -parameter to balance embedding loss and alignment loss, the other is the … entity alignment framework named RpAlign (Relation prediction based cross-knowledge-graph entity Align…

Cited by 22

MultiJAF: Multi-modal joint entity alignment framework for multi-modal knowledge graph

Bo Cheng, Jia Zhu, Meimei Guo, 2022. Neurocomputing. Zone 2, IF 6.5

… with the same real-world identity from different Knowledge Graphs (KGs). Existing methods … Joint entity Alignment Framework (MultiJAF), which can effectively utilize the knowledge of …

Cited by 30

Leveraging Multi-source knowledge for Chinese clinical named entity recognition via relational graph convolutional network

Ying Xiong, Hao Peng, Yang Xiang et al., 2022. Journal of Biomedical Informatics. Zone 2, IF 4.5

OBJECTIVE: External knowledge, such as lexicon of words in Chinese and domain knowledge graph (KG) of concepts, has been recently adopted to improve the performance of machine learning methods for named entity recognition (NER) as it can provide additional information beyond context. However, most existing studies only consider knowledge from one source (i.e., either lexicon or knowledge graph) in different ways and consider lexicon words or KG concepts independently with their boundaries. In this paper, we focus on leveraging multi-source knowledge in a unified manner where lexicon words or KG concepts are well combined with their boundaries for Chinese Clinical NER (CNER). MATERIAL AND METHODS: We propose a novel method based on relational graph convolutional network (RGCN), called MKRGCN, to utilize multi-source knowledge in a unified manner for CNER. For any sentence, a relational graph based on words or concepts in each knowledge source is constructed, where lexicon words or KG concepts appearing in the sentence are linked to the containing tokens with the boundary information of the lexicon words or KG concepts. RGCN is used to model all relational graphs constructed from multi-source knowledge, and the representations of tokens from multi-source knowledge are integrated into the context representations of tokens via an attention mechanism. Based on the knowledge-enhanced representations of tokens, we deploy a conditional random field (CRF) layer for named entity label prediction. In this study, a lexicon of words and a medical knowledge graph are used as knowledge sources for Chinese CNER. RESULTS: Our proposed method achieves the best performance on CCKS2017 and CCKS2018 in Chinese with F1-scores of 91.88% and 89.91%, respectively, significantly outperforming existing methods. The extended experiments on NCBI-Disease and BC2GM in English also prove the effectiveness of our method when only considering one knowledge source via RGCN. CONCLUSION: The MKRGCN model can integrate knowledge from the external lexicon and knowledge graph effectively for Chinese CNER and has the potential to be applied to English NER.

Cited by 26

Temporal Knowledge Graph Entity Alignment via Representation Learning

Xiuting Song, Luyi Bai, Rongke Liu et al., 2022. Lecture Notes in Computer Science

… graph (KG) by matching the same entities in multi-source KGs. … for entity alignment between such temporal knowledge graphs (TKGs). In this paper, we propose a novel entity alignment …

Cited by 14

Concept-Aware Entity Alignment Network for Industrial Knowledge Graph

Shuai Wu, W. Tong, Yuhong Hou et al., 2025. IEEE Transactions on Industrial Informatics. Zone 1 (Top), IF 9.9

The industrial knowledge graph (IKG) can improve the cognitive intelligence of the manufacturing system and is recognized as one of the cores of the next-generation industrial management information system. Due to the multisource heterogeneous nature of industrial data, aligning entities with the same semantics (entity alignment) is the core technology for building large-scale, high-coverage IKGs. Existing approaches show that embedded learning of IKGs performs well for this task. However, most advanced methods ignore concept information when learning topological information about IKGs. Inspired by the ontology matching theory, in this article, we realize the importance of entity concepts in alignment. The conceptual semantics of entities can usually be obtained through the is–a relation. However, the IKG is usually constructed by triples (entity, relation, entity) automatically extracted from a large text corpus. This will lead to entities in the IKG having problems such as lacking conceptual information, belonging to multiple concepts, or having different concept granularities. To solve the two problems of lacking conceptual information and different concept granularity, we propose the concept-aware entity alignment network (CAEA), aggregating bidirectional relations and attributes to get the entity concept semantics by a novel concept-aware graph attention mechanism. The excellent performance of the CAEA can better support the construction of large and complete IKGs and support downstream applications such as industrial knowledge recommendation and assisted decision-making. To verify the performance of the CAEA on the IKG, we construct a new entity alignment benchmark using industrial control network security data and verify the effectiveness of the CAEA on the new benchmark and several mainstream datasets. Experimental results show that our method outperforms other state-of-the-art (SOTA) methods and promotes the development of IKGs.

Cited by 3

Informed Multi-context Entity Alignment

Kexuan Xin, Zequn Sun, Wen Hua et al., 2022. Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining

Entity alignment is a crucial step in integrating knowledge graphs (KGs) from multiple sources. Previous attempts at entity alignment have explored different KG structures, such as neighborhood-based and path-based contexts, to learn entity embeddings, but they are limited in capturing the multi-context features. Moreover, most approaches directly utilize the embedding similarity to determine entity alignment without considering the global interaction among entities and relations. In this work, we propose an Informed Multi-context Entity Alignment (IMEA) model to address these issues. In particular, we introduce Transformer to flexibly capture the relation, path, and neighborhood contexts, and design holistic reasoning to estimate alignment probabilities based on both embedding similarity and the relation/entity functionality. The alignment evidence obtained from holistic reasoning is further injected back into the Transformer via the proposed soft label editing to inform embedding learning. Experimental results on several benchmark datasets demonstrate the superiority of our IMEA model compared with existing state-of-the-art entity alignment methods.

Cited by 24

Attribute-Consistent Knowledge Graph Representation Learning for Multi-Modal Entity Alignment

Qian Li, Shu Guo, Yang Luo et al., 2023. Proceedings of the ACM Web Conference 2023

The multi-modal entity alignment (MMEA) aims to find all equivalent entity pairs between multi-modal knowledge graphs (MMKGs). Rich attributes and neighboring entities are valuable for the alignment task, but existing works ignore contextual gap problems that the aligned entities have different numbers of attributes on specific modality when learning entity representations. In this paper, we propose a novel attribute-consistent knowledge graph representation learning framework for MMEA (ACK-MMEA) to compensate the contextual gaps through incorporating consistent alignment knowledge. Attribute-consistent KGs (ACKGs) are first constructed via multi-modal attribute uniformization with merge and generate operators so that each entity has one and only one uniform feature in each modality. The ACKGs are then fed into a relation-aware graph neural network with random dropouts, to obtain aggregated relation representations and robust entity representations. In order to evaluate the ACK-MMEA facilitated for entity alignment, we specially design a joint alignment loss for both entity and attribute evaluation. Extensive experiments conducted on two benchmark datasets show that our approach achieves excellent performance compared to its competitors.

Cited by 56

Question-answering model based on multi-source heterogeneous data fusion and knowledge graph

Qian Deng, Yi Qiu, 2026. IET Conference Proceedings

… multiple sources are aligned to the same canonical entity v, this paper merges their representations via confidence-weighted averaging or maximum-confidence selection: …

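The snippet above names two merge strategies for representations aligned to the same canonical entity, but the paper's own formula is elided. The sketch below is therefore only a generic illustration of what confidence-weighted averaging and maximum-confidence selection usually mean; the function names and the `(vector, confidence)` input format are assumptions, not the paper's notation.

```python
def merge_weighted(candidates):
    """Confidence-weighted average of entity representations.

    candidates: list of (vector, confidence) pairs from different sources
    (a hypothetical input format chosen for this example).
    """
    total = sum(conf for _, conf in candidates)
    if total <= 0:
        raise ValueError("at least one candidate needs positive confidence")
    dim = len(candidates[0][0])
    return [sum(vec[i] * conf for vec, conf in candidates) / total
            for i in range(dim)]


def merge_max_confidence(candidates):
    """Keep only the representation from the most confident source."""
    best_vec, _ = max(candidates, key=lambda pair: pair[1])
    return list(best_vec)


# Two sources describe the same entity with different confidence.
sources = [([1.0, 0.0], 3.0), ([0.0, 1.0], 1.0)]
averaged = merge_weighted(sources)        # [0.75, 0.25]
selected = merge_max_confidence(sources)  # [1.0, 0.0]
```

Averaging blends evidence from all sources, while selection avoids diluting a reliable source with noisy ones; which is preferable depends on how well calibrated the confidence scores are.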

A comprehensive survey of entity alignment for knowledge graphs

Kaisheng Zeng, Chengjiang Li, Lei Hou et al., 2021. AI Open. IF 14.8

… However, current multi-source KGs have heterogeneity and complementarity, and it is … almost all the latest knowledge graph representations learning and entity alignment methods and …

Cited by 144

Trustworthy Knowledge Graph Completion Based on Multi-sourced Noisy Data

Jiacheng Huang, Yao Zhao, Wei Hu et al., 2022. Proceedings of the ACM Web Conference 2022

Knowledge graphs (KGs) have become a valuable asset for many AI applications. Although some KGs contain plenty of facts, they are widely acknowledged as incomplete. To address this issue, many KG completion methods are proposed. Among them, open KG completion methods leverage the Web to find missing facts. However, noisy data collected from diverse sources may damage the completion accuracy. In this paper, we propose a new trustworthy method that exploits facts for a KG based on multi-sourced noisy data and existing facts in the KG. Specifically, we introduce a graph neural network with a holistic scoring function to judge the plausibility of facts with various value types. We design value alignment networks to resolve the heterogeneity between values and map them to entities even outside the KG. Furthermore, we present a truth inference model that incorporates data source qualities into the fact scoring function, and design a semi-supervised learning way to infer the truths from heterogeneous values. We conduct extensive experiments to compare our method with the state-of-the-arts. The results show that our method achieves superior accuracy not only in completing missing facts but also in discovering new facts.

Cited by 14

Study on Multi-source Heterogeneous Data Fusion and Knowledge Graph Construction Techniques in Higher Education Institutions

Chengbo Wang, 2025. Proceedings of the 2025 3rd International Conference on Educational Knowledge and Informatization

Data resources in universities are increasingly abundant, yet data silos hinder their effective utilization. This research addresses multi-source heterogeneous data fusion and knowledge graph construction in universities. We propose a deep learning-based data fusion model with entity alignment and relationship extraction techniques, design a knowledge extraction method for educational contexts, and develop an integrated knowledge graph management system. The results show an entity alignment accuracy of 91.7% and response times below 120ms. The system successfully processes over 23 million entities, enabling intelligent data applications across university departments.

Multi-source remote sensing data fusion: status and trends

Jixian Zhang, 2010. International Journal of Image and Data Fusion. Zone 4, IF 1.3

… multi-source data fusion within varying spatial and temporal resolutions. This article reviews current techniques of multi-source remote sensing data fusion …, i.e., pixel/data level, feature …

Cited by 756

Data Fusion for Multi-Source Sensors Using GA-PSO-BP Neural Network

Jiguo Liu, Jian Huang, Rui Sun et al., 2021. IEEE Transactions on Intelligent Transportation Systems. Zone 2 (Top), IF 8.4

The development of real-time road condition systems will better monitor road network operation status. However, the weak point of all these systems is their need for comprehensive and reliable data. For traffic data acquisition, two sources are currently available: 1) floating vehicles and 2) remote traffic microwave sensors (RTMS). The former uses mobile probe vehicles as mobile sensors, and the latter consists of a set of fixed-point detectors installed in the roads. First, the structure of a three-layer BP neural network is designed to fuse the floating car data (FCD) and the fixed detector data (FDD) efficiently. Second, to improve the accuracy of traffic speed estimation, a multi-source data fusion model is proposed that combines information from floating vehicles and microwave sensors using a GA-PSO-BP neural network. The proposed model combines GA and PSO. The hybrid model not only overcomes the estimation inaccuracy of the traditional fusion model but also compensates for the shortcomings of the traditional BP algorithm. Finally, the system has been tested and implemented on actual roads, and the simulation results show that the accuracy of the data has reached 98%.

Cited by 71

A framework for multi-source data fusion

doi.org - R. Yager, 2004 - Information Sciences - Zone 2, IF 6.8

… go into the development of a multi-source data fusion algorithm are described. Features that … for data fusion based on a voting like process that tries to adjudicate conflict among the data…

Cited by 97

A Survey of Methods and Technologies for Congestion Estimation Based on Multisource Data Fusion

doi.org - Dominik Cvetek, M. Mustra, Niko Jelusic et al., 2021 - Applied Sciences - Zone 4, IF 2.5

Traffic congestion occurs when traffic demand is greater than the available network capacity. It is characterized by lower vehicle speeds, increased travel times, arrival unreliability, and longer vehicular queueing. Congestion can also impose a negative impact on the society by decreasing the quality of life with increased pollution, especially in urban areas. To mitigate the congestion problem, traffic engineers and scientists need quality, comprehensive, and accurate data to estimate the state of traffic flow. Various types of data collection technologies have different advantages and disadvantages as well as data characteristics, such as accuracy, sampling frequency, and geospatial coverage. Multisource data fusion increases the accuracy and provides a comprehensive estimation of the performance of traffic flow on a road network. This paper presents a literature overview related to the estimation of congestion and prediction based on the data collected from multiple sources. An overview of data fusion methods and congestion indicators used in the literature for traffic state and congestion estimation is given. Results of these methods are analyzed, and a disseminative analysis of the advantages and disadvantages of surveyed methods is presented.

Cited by 36

Multi-source remotely sensed data fusion for improving land cover classification

doi.org - Bin Chen, Bin Chen, Bo Huang et al., 2017 - ISPRS Journal of Photogrammetry and Remote Sensing - Zone 1 (Top), IF 12.2

… We proposed to improve land cover classification accuracy by integrating multi-source RS features through data fusion. We further investigated the effect of different RS features on …

Cited by 205

Mechanical fault diagnosis and prediction in IoT based on multi-source sensing data fusion

doi.org - Min Huang, Zhen Liu, Yang Tao, 2020 - Simulation Modelling Practice and Theory - Zone 3, IF 4.6

Using multi-source sensing data based on the Internet of Things (IoT) with artificial intelligence and big data processing technology to achieve predictive maintenance of mechanical equipment can remarkably improve the service life of the machine and reduce labor costs when diagnosing mechanical faults, and it has become a highly relevant research topic. In this paper, the multi-source sensing data fusion models and fusion algorithms are studied and discussed. First, the Joint Directors of Laboratories (JDL) fusion model and the Hierarchical fusion model are compared and analyzed. Then, various types of fusion algorithms based on Neural Networks and Deep Learning, including Dempster-Shafer (D-S) evidence theory and their applications in mechanical fault diagnosis and fault prediction, are studied and compared. The findings reveal that exploring and designing a more intelligent fusion model incorporating the beneficial characteristics of different fusion algorithms are challenging and have a certain value for promoting the development of mechanical fault diagnosis and prediction.

Cited by 149

Data fusion and multisource image classification

doi.org - D. Amarsaikhan, T. Douglas, 2004 - International Journal of Remote Sensing - Zone 4, IF 2.6

… In remote sensing applications, the most widely used multisource classification techniques … The aim of this research is (a) to compare different data fusion techniques for the …

Cited by 103

Cross-Scale Mixing Attention for Multisource Remote Sensing Data Fusion and Classification

doi.org - Yunhao Gao, Mengmeng Zhang, Junjie Wang et al., 2023 - IEEE Transactions on Geoscience and Remote Sensing - Zone 1 (Top), IF 8.6

Hyperspectral and multispectral images (HS/MS) fusion and classification as an important branch of data quality improvement and interpretation have attracted increasing attention in recent years. However, the unavailable sensor prior still limits the performance of many traditional fusion methods, consequently deteriorating the classification results. Despite the unsupervised methods based on convolutional neural network (CNN) making a lot of attempts to mitigate the limitations, challenges with extracting the long-range dependencies hamper the performance. To address these impediments, a transformer-based baseline constructed by the cross-scale mixing attention transformer (CSMFormer) is designed for HS/MS fusion and classification. Especially, the spatial–spectral mixer (SSMixer) is utilized to extract the long-range dependencies at a large scale. Simultaneously, cross-scale feature calibration is achieved by combining information from the original scale. After that, the nonlinear enhancement module (NLEM) is designed to encourage feature discrimination. Note that the spatial and spectral mixers can be replaced by any spatial–spectral feature extractors. Therefore, the proposed CSMFormer is flexible in data fusion, land-covers’ classification, segmentation, and so on. Experiments about data fusion and land-covers’ classification on two HS/MS wetland remote sensing scenes demonstrate the superiority of the proposed CSMFormer baseline, improving the data quality and classification precision.

Cited by 69

A novel multi-source sensing data fusion driven method for detecting rolling mill health states under imbalanced and limited datasets

doi.org - Peiming Shi, Yue Yu, Hao Gao et al., 2022 - Mechanical Systems and Signal Processing - Zone 1 (Top), IF 8.9

… , in this paper, multi-source sensors are mounted on the rolling mill to collect various data. … monitoring with multi-source sensing data, compared to the other states of the art DL methods. …

Cited by 66

Forest Types Classification Based on Multi-Source Data Fusion

doi.org - Ming Lu, Bin Chen, X. Liao et al., 2017 - Remote Sensing - Zone 2, IF 4.1

Forest plays an important role in global carbon, hydrological and atmospheric cycles and provides a wide range of valuable ecosystem services. Timely and accurate forest-type mapping is an essential topic for forest resource inventory supporting forest management, conservation biology and ecological restoration. Despite efforts and progress having been made in forest cover mapping using multi-source remotely sensed data, fine spatial, temporal and spectral resolution modeling for forest type distinction is still limited. In this paper, we proposed a novel spatial-temporal-spectral fusion framework through spatial-spectral fusion and spatial-temporal fusion. Addressing the shortcomings of the commonly-used spatial-spectral fusion model, we proposed a novel spatial-spectral fusion model called the Segmented Difference Value method (SEGDV) to generate fine spatial-spectra-resolution images by blending the China environment 1A series satellite (HJ-1A) multispectral image (Charge Coupled Device (CCD)) and Hyperspectral Imager (HSI). A Hierarchical Spatiotemporal Adaptive Fusion Model (HSTAFM) was used to conduct spatial-temporal fusion to generate the fine spatial-temporal-resolution image by blending the HJ-1A CCD and Moderate Resolution Imaging Spectroradiometer (MODIS) data. The spatial-spectral-temporal information was utilized simultaneously to distinguish various forest types. Experimental results of the classification comparison conducted in the Gan River source nature reserves showed that the proposed method could enhance spatial, temporal and spectral information effectively, and the fused dataset yielded the highest classification accuracy of 83.6% compared with the classification results derived from single Landsat-8 (69.95%), single spatial-spectral fusion (70.95%) and single spatial-temporal fusion (78.94%) images, thereby indicating that the proposed method could be valid and applicable in forest type classification.

Cited by 37

Geological Remote Sensing Interpretation Using Deep Learning Feature and an Adaptive Multisource Data Fusion Network

doi.org - Wei Han, Jun Li, Shengte Wang et al., 2022 - IEEE Transactions on Geoscience and Remote Sensing - Zone 1 (Top), IF 8.6

Geological remote sensing interpretation can extract elements of interest from multiple types of images, which is vital in geological survey and mapping, especially in inaccessible regions. However, due to numerous classes, high interclass similarities, complex distributions, and sample imbalances of geological elements, the interpretation results of machine learning (ML)-based methods are understandably worse than manual visual interpretation. In addition, scholars in remote sensing have mainly carried out their works to interpret a single geological element category, such as mineral, lithological, soil, and structure. The interpretation of multiple geological elements is missing, which is more in line with the open world. To improve the interpretation results of ML-based methods and reduce the labor cost in geological survey and mapping, we propose a deep learning (DL)-feature-based adaptive multisource data fusion network (AMSDFNet) for the efficient interpretation of multiple geological remote sensing elements. The AMSDFNet has two branches for learning valuable spatial and spectral information from two kinds of data sources, in which the atrous spatial pyramid pooling (ASPP) operation and an attention block are applied to adaptively extract and fuse multiscale informative features. A hard example mining algorithm was also added to select important training examples to address sample imbalance. A large-scale region in western China with sufficient geological elements was set as the research area. The proposed model improved the two critical metrics by about 2% in the experiment section. As far as we know, this research work is the first time DL features and multisource remote sensing images have been utilized to simultaneously interpret geological elements of lithology, soil, surface water, and glaciers. The extensive experimental results demonstrated the superiority of DL features and our model in geological remote sensing interpretation.

Cited by 77

Multisensor Data Fusion

doi.org - E Waltz, J Llinas, 2001 - Multisensor Data Fusion

… encompasses theory, techniques and tools conceived and employed for exploiting the synergy in information acquired from multiple sources (sensor, databases, information gathered …

Cited by 1

Multi-source data fusion method for structural safety assessment of water diversion structures

doi.org - Sherong Zhang, Liu Ting, Wang Chao, 2021 - Journal of Hydroinformatics - Zone 3

Building safety assessment based on single sensor data has the problems of low reliability and high uncertainty. Therefore, this paper proposes a novel multi-source sensor data fusion method based on Improved Dempster–Shafer (D-S) evidence theory and Back Propagation Neural Network (BPNN). Before data fusion, the improved self-support function is adopted to preprocess the original data. The process of data fusion is divided into three steps: Firstly, the feature of the same kind of sensor data is extracted by the adaptive weighted average method as the input source of BPNN. Then, BPNN is trained and its output is used as the basic probability assignment (BPA) of D-S evidence theory. Finally, Bhattacharyya Distance (BD) is introduced to improve D-S evidence theory from two aspects of evidence distance and conflict factors, and multi-source data fusion is realized by D-S synthesis rules. In practical application, a three-level information fusion framework of the data level, the feature level, and the decision level is proposed, and the safety status of buildings is evaluated by using multi-source sensor data. The results show that compared with the fusion result of the traditional D-S evidence theory, the algorithm improves the accuracy of the overall safety state assessment of the building and reduces the MSE from 0.18 to 0.01%.

Cited by 15
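
The abstract above feeds BPNN outputs into D-S evidence theory as basic probability assignments (BPAs) and fuses them with the D-S synthesis rule. As a rough illustration of that final step only, the sketch below implements the classical Dempster rule of combination. It does not reproduce the paper's Bhattacharyya-distance improvement, and the hypothesis names and mass values are made up for the example.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two BPAs with the classical Dempster rule.

    Each BPA maps a frozenset of hypotheses to its mass; masses sum to 1.
    Returns the normalized combined BPA and the conflict coefficient K.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence accumulates on the common hypothesis set.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets contribute to the conflict mass K.
            conflict += mass_a * mass_b
    if 1.0 - conflict < 1e-12:
        raise ValueError("total conflict: the sources cannot be combined")
    fused = {h: m / (1.0 - conflict) for h, m in combined.items()}
    return fused, conflict


# Hypothetical BPAs from two sensor groups (illustrative values only).
SAFE = frozenset({"safe"})
WARN = frozenset({"warning"})
EITHER = SAFE | WARN  # mass left on the whole frame expresses ignorance

sensor_a = {SAFE: 0.6, WARN: 0.1, EITHER: 0.3}
sensor_b = {SAFE: 0.7, WARN: 0.2, EITHER: 0.1}

fused, k = combine_dempster(sensor_a, sensor_b)
```

With these illustrative inputs the conflict coefficient is 0.19 and the fused mass on "safe" is about 0.85, higher than either source alone. Improved variants such as the one the abstract describes typically reweight or discount the BPAs before this step so that highly conflicting evidence does not dominate.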

A Comprehensive Review of Multi-Source Data Fusion Processing Methods

doi.org - Xiaping Ma, Peimin Zhou, Xiaoxing He et al., 2025 - Preprints.org

In recent years, significant progress has been made in multi-source navigation data fusion methods, driven by rapid advancements in multi-sensor technology, artificial intelligence (AI) algorithms, and computational capabilities. On one hand, fusion methods based on filtering theory, such as Kalman Filtering (KF), Particle Filtering (PF), and Federated Filtering (FF), have been continuously optimized, enabling effective handling of non-linear and non-Gaussian noise issues. On the other hand, the introduction of AI technologies like deep learning and reinforcement learning has provided new solutions for multi-source data fusion, particularly enhancing adaptive capabilities in complex and dynamic environments. Additionally, methods based on Factor Graph Optimization (FGO) have also demonstrated advantages in multi-source data fusion, offering better handling of global consistency problems. In the future, with the widespread adoption of technologies such as 5G, the Internet of Things, and edge computing, multi-source navigation data fusion is expected to evolve towards real-time processing, intelligence, and distributed systems. So far, fusion methods mainly include optimal estimation methods, filtering methods, uncertain reasoning methods, Multiple Model Estimation (MME), AI, and so on. To analyze the performance of these methods and provide a reliable theoretical reference and basis for the design and development of a multi-source data fusion system, this paper summarizes the characteristics of these fusion methods and the corresponding adaptation scenarios. These results can provide references for theoretical research, system development, and application in the fields of autonomous driving, unmanned vehicle navigation, and intelligent navigation.

Cited by 2
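
Of the filtering methods the review above lists, Kalman filtering is the simplest to show in a few lines. The sketch below is a scalar measurement update applied to a static quantity observed by two sensors; it is an illustration of the general idea, not code from the review, and the readings, variances, and the near-uninformative prior are assumed values.

```python
def kalman_update(estimate, variance, measurement, meas_variance):
    """One scalar Kalman measurement update.

    The gain weights the new measurement by how uncertain the current
    estimate is relative to the sensor noise.
    """
    gain = variance / (variance + meas_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance


def fuse_static(measurements, prior=(0.0, 1e6)):
    """Fuse (value, variance) readings of one static quantity in sequence.

    The default prior has a very large variance, so it carries almost no
    information and the result is driven by the readings.
    """
    estimate, variance = prior
    for value, meas_variance in measurements:
        estimate, variance = kalman_update(estimate, variance, value, meas_variance)
    return estimate, variance


# Hypothetical readings: a noisy sensor (variance 4.0) and a better one (1.0).
readings = [(10.2, 4.0), (9.8, 1.0)]
position, uncertainty = fuse_static(readings)
```

For these inputs the fused estimate is close to 9.88 with variance near 0.8, lower than either sensor's own variance, which is the basic benefit fusion is meant to deliver. A full navigation filter would add a prediction step with a motion model and process noise, which this sketch omits.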

Multisource Geospatial Data Fusion via Local Joint Sparse Representation

doi.org - Yuhang Zhang, S. Prasad, 2016 - IEEE Transactions on Geoscience and Remote Sensing - Zone 1 (Top), IF 8.6

… In this paper, the ALWMJ-SRC algorithm is proposed for multisource remote sensing data fusion and classification. The proposed algorithm, based on the multitask joint SR framework, …

Cited by 34

Heterogeneous Information Fusion and Visualization for a Large-Scale Intelligent Video Surveillance System

doi.org - Ching-Tang Fan, Yuan-Kai Wang, Caiyun Huang, 2017 - IEEE Transactions on Systems, Man, and Cybernetics: Systems - Zone 1 (Top), IF 8.7

… fusing multimodal information for a large-scale intelligent video surveillance … , data fusion, and sensor tasking. The visualization not only displays 2-D, 3-D, and geographical information …

Cited by 68

A Large Scale Video Surveillance System with Heterogeneous Information Fusion and Visualization for Wide Area Monitoring

doi.org - Yuan-Kai Wang, Ching-Tang Fan, Caiyun Huang, 2012 - 2012 Eighth International Conference on Intelligent Information Hiding and Multimedia Signal Processing

… Based on these premises, we believe that a more advanced large-scale system should play the role of active assistance to help security work, instead of a new mechanism to replace …

Cited by 8

Research on Multi-Source Heterogeneous Big Data Fusion Method Based on Feature Level

doi.org - Yanyan Chen, Chenxi Wang, Yuchen Zhou et al., 2024 - International Journal of Pattern Recognition and Artificial Intelligence - Zone 4, IF 1.1

With the development of research on multi-modal data fusion and its combination with online data management, the application of multi-modal big data fusion in the information management systems is more and more extensive. How to integrate multi-modal big data effectively is the key technology to building an efficient information management system. In this paper, based on the combination of a multi-support vector machine and convolutional neural network, the feature-level data fusion of multi-source heterogeneous big data is implemented, and it is applied to the real data set to test the relevant model. Experimental results show that this method can not only realize heterogeneous integration of big data, but also has high accuracy and reliability.

Cited by 9

iFusion: Towards efficient intelligence fusion for deep learning from real-time and heterogeneous data

doi.org - Kehua Guo, Tao Xu, Xiaoyan Kui et al., 2019 - Information Fusion - Zone 1 (Top), IF 15.5

Deep learning has shown great strength in many fields and has allowed people to live more conveniently and intelligently. However, deep learning requires a considerable amount of uniform training data, which introduces difficulties in many application scenarios. On the one hand, in real-time systems, training data are constantly generated, but users cannot immediately obtain this vast amount of training data. On the other hand, training data from heterogeneous sources have different data formats. Therefore, existing deep learning frameworks are not able to train all data together. In this paper, we propose the iFusion framework, which achieves efficient intelligence fusion for deep learning from real-time data and heterogeneous data. For real-time data, we train only newly arrived data to obtain a new discrimination model and fuse the previously trained models to obtain the discrimination result. For heterogeneous data, different types of data are trained separately; then, we fuse the different discrimination models so that it is not necessary to consider heterogeneous data formats. We use a method based on Dempster-Shafer theory (DST) to fuse the discrimination models. We apply iFusion to the deep learning of medical image data, and the results of the experiments show the effectiveness of the proposed method.

Cited by 32

A General Embedding Framework for Heterogeneous Information Learning in Large-Scale Networks

doi.org - Xiao Huang, Jundong Li, Na Zou et al., 2018 - ACM Transactions on Knowledge Discovery from Data - Zone 3, IF 4.8

Network analysis has been widely applied in many real-world tasks, such as gene analysis and targeted marketing. To extract effective features for these analysis tasks, network embedding automatically learns a low-dimensional vector representation for each node, such that the meaningful topological proximity is well preserved. While the embedding algorithms on pure topological structure have attracted considerable attention, in practice, nodes are often abundantly accompanied with other types of meaningful information, such as node attributes, second-order proximity, and link directionality. A general framework for incorporating the heterogeneous information into network embedding could be potentially helpful in learning better vector representations. However, it remains a challenging task to jointly embed the geometrical structure and a distinct type of information due to the heterogeneity. In addition, the real-world networks often contain a large number of nodes, which put demands on the scalability of the embedding algorithms. To bridge the gap, in this article, we propose a general embedding framework named Heterogeneous Information Learning in Large-scale networks (HILL) to accelerate the joint learning. It enables the simultaneous node proximity assessing process to be done in a distributed manner by decomposing the complex modeling and optimization into many simple and independent sub-problems. We validate the significant correlation between the heterogeneous information and topological structure, and illustrate the generalizability of HILL by applying it to perform attributed network embedding and second-order proximity learning. A variation is proposed for link directionality modeling. Experimental results on real-world networks demonstrate the effectiveness and efficiency of HILL.

Cited by 12

Heterogeneous Data Fusion: A Scalable Approach to Intrusion Detection

doi.org - Seonghyeon Gong, Jake Cho, K. Choi, 2025 - IEEE Access - Zone 4, IF 3.6

Machine Learning-based Intrusion Detection Systems (ML-IDS) are core functionalities in responding to today’s cyber-attacks by learning, detecting, and classifying various attack patterns. However, despite achieving high overall accuracy, existing ML-IDS approaches suffer from high false positive and false negative rates for certain attack patterns due to limited generalization performances. This research proposes a novel dataset construction method that enhances the performance of ML-IDS by integrating heterogeneous security data to expand feature representations. Our approach integrates data collected from heterogeneous domains based on timestamps and evaluates the expanded feature space regarding information gain and entropy difference. The proposed method dynamically adjusts the time window for data fusion based on the evaluation of the feature space, thereby generating an optimal dataset. Our approach leverages multiple security data sources to enhance dataset quality and improve the classification performance of ML-IDS models. Experimental results demonstrate that the proposed dataset fusion mechanism enhances learning and generalization performance. Experimental results of the dataset reconstruction demonstrate improved performance of multiple baseline models on the CIC-IDS-2018 dataset, particularly in detecting attack patterns with previously high false positive rates. Notably, base models trained on the reconstructed dataset achieved a macro F1-score of 0.9968, surpassing state-of-the-art baselines. These results demonstrate that our approach to improving dataset quality can effectively enhance the performance of existing ML-IDS.

Cited by 1
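
The abstract above describes joining heterogeneous security data on timestamps, scoring the expanded feature space by information gain, and adjusting the fusion time window accordingly. The toy sketch below shows one way such a loop could look. Everything in it is an assumption made for illustration: the data layout, the "count of host events near each flow" feature, the candidate windows, and the function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def fuse_by_window(flows, host_events, window):
    """For each flow, count host events within +/- window seconds of it."""
    return [sum(1 for t in host_events if abs(t - ts) <= window) for ts, _ in flows]


def best_window(flows, host_events, candidates):
    """Pick the candidate window whose fused feature has the highest gain."""
    labels = [label for _, label in flows]
    gains = {w: information_gain(fuse_by_window(flows, host_events, w), labels)
             for w in candidates}
    chosen = max(candidates, key=lambda w: gains[w])
    return chosen, gains[chosen]


# Hypothetical data: (timestamp, label) network flows and host alert timestamps.
flows = [(0, "benign"), (10, "attack"), (20, "benign"), (30, "attack")]
host_events = [11, 12, 31]

window, gain = best_window(flows, host_events, candidates=[0.5, 2, 15])
```

On this toy data a window that is too narrow attaches no host events and one that is too wide blurs neighbouring flows together, so the middle candidate is selected. Information gain on raw counts favours high-cardinality features, so a practical version would likely bin the counts or use a normalized criterion.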

Side Information Fusion for Recommender Systems over Heterogeneous Information Network

doi.org - Huan Zhao, Quanming Yao, Yangqiu Song et al., 2021 - ACM Transactions on Knowledge Discovery from Data - Zone 3, IF 4.8

Collaborative filtering (CF) has been one of the most important and popular recommendation methods, which aims at predicting users’ preferences (ratings) based on their past behaviors. Recently, various types of side information beyond the explicit ratings users give to items, such as social connections among users and metadata of items, have been introduced into CF and shown to be useful for improving recommendation performance. However, previous works process different types of information separately, thus failing to capture the correlations that might exist across them. To address this problem, in this work, we study the application of heterogeneous information network (HIN), which offers a unifying and flexible representation of different types of side information, to enhance CF-based recommendation methods. However, we face challenging issues in HIN-based recommendation, i.e., how to capture similarities of complex semantics between users and items in a HIN, and how to effectively fuse these similarities to improve final recommendation performance. To address these issues, we apply metagraph to similarity computation and solve the information fusion problem with a “matrix factorization (MF) + factorization machine (FM)” framework. For the MF part, we obtain the user-item similarity matrix from each metagraph and then apply low-rank matrix approximation to obtain latent features for both users and items. For the FM part, we apply FM with Group lasso (FMG) on the features obtained from the MF part to train the recommending model and, at the same time, identify the useful metagraphs. Besides FMG, a two-stage method, we further propose an end-to-end method, hierarchical attention fusing, to fuse metagraph-based similarities for the final recommendation. Experimental results on four large real-world datasets show that the two proposed frameworks significantly outperform existing state-of-the-art methods in terms of recommendation performance.

Cited by 16
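
In the "MF + FM" framework described above, latent user and item features from each metagraph are passed to a factorization machine. The sketch below shows only the standard second-order FM scoring function, using the usual linear-time rewriting of the pairwise term. It is not the authors' FMG: the group lasso regularizer and the training procedure are left out, and the feature vector and parameters are invented for the example.

```python
def fm_predict(x, w0, w, v):
    """Second-order factorization machine score for one feature vector.

    x  : list of n feature values (e.g. concatenated latent features)
    w0 : global bias
    w  : list of n linear weights
    v  : n latent vectors of length k, one per feature

    The pairwise term sum_{i<j} <v_i, v_j> x_i x_j is computed per latent
    dimension as 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2).
    """
    n, k = len(x), len(v[0])
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_predict_naive(x, w0, w, v):
    """Reference implementation with the explicit double loop over pairs."""
    n = len(x)
    score = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            score += dot * x[i] * x[j]
    return score


# Hypothetical parameters and input (illustrative values only).
x = [1.0, 2.0, 0.0]
w0 = 0.1
w = [0.2, -0.1, 0.3]
v = [[1.0, 0.5], [0.5, 1.0], [2.0, 2.0]]

score = fm_predict(x, w0, w, v)
```

Both functions give the same score; the rewritten form costs O(kn) per example instead of O(kn^2), which matters when features from many metagraphs are concatenated.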

Heterogeneous Large-Scale Data Fusion Mechanism of Energy Storage Power Station Based on Neural Network

doi.org - Yimin Deng, Zhoubo Weng, Tianlong Zhang, 2023 - Journal of Multimedia Information System

… information of decision-making tasks. To achieve the efficient integration of heterogeneous large-scale data from energy storage power stations, this study presents a novel data fusion …

Cited by 1

Large scale heterogeneous monitoring system with decentralized sensor fusion

doi.org - G. Stamatescu, I. Stamatescu, Cristian Dragana et al., 2015 - 2015 IEEE 8th International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS)

… architecture of a large scale heterogeneous monitoring system and the application of decentralized sensor fusion mechanisms for efficient information extraction and data reduction. …

Cited by 12

Fusing Heterogeneous Data: A Case for Remote Sensing and Social Media

doi.org - Han Wang, E. Skau, H. Krim et al., 2018 - IEEE Transactions on Geoscience and Remote Sensing - Zone 1 (Top), IF 8.6

Data heterogeneity can pose a great challenge to process and systematically fuse low-level data from different modalities with no recourse to heuristics and manual adjustments and refinements. In this paper, a new methodology is introduced for the fusion of measured data for detecting and predicting weather-driven natural hazards. The proposed research introduces a robust theoretical and algorithmic framework for the fusion of heterogeneous data in near real time. We establish a flexible information-based fusion framework with a target optimality criterion of choice, which for illustration, is specialized to a maximum entropy principle and a least effort principle for semisupervised learning with noisy labels. We develop a methodology to account for multimodality data and a solution for addressing inherent sensor limitations. In our case study of interest, namely, that of flood density estimation, we further show that by fusing remote sensing and social media data, we can develop well founded and actionable flood maps. This capability is valuable in situations where environmental hazards, such as hurricanes or severe weather, affect very large areas. Relative to the state of the art working with such data, our proposed information-theoretic solution is principled and systematic, while offering a joint exploitation of any set of heterogeneous sensor modalities with minimally assuming priors. This flexibility is coupled with the ability to quantitatively and clearly state the fusion principles with very reasonable computational costs. The proposed method is tested and substantiated with the multimodality data of a 2013 Boulder Colorado flood event.

Cited by 48

CSF: Crowdsourcing semantic fusion for heterogeneous media big data in the internet of things

doi.org - Kehua Guo, Yayuan Tang, Peiyun Zhang, 2017 - Information Fusion - Zone 1 (Top), IF 15.5

… The other is the automatic method, which mainly involves low-level data to fuse semantic information without human intervention, and can be used for large-scale data. However, there …

Cited by 43

Research on Heterogeneous Network Data Fusion based on Deep Learning

doi.org - Zengyun Hu, Minghao Liu, Lipeng Liu et al., 2024 - 2024 4th International …

The advent of the era of big data has led to the emergence of heterogeneous network data fusion as a prominent area of research. Heterogeneous network data is characterised by multi-modality, multi-source, and high dimensionality, which presents significant challenges for traditional data fusion methods. These methods often encounter difficulties in processing such data, including issues such as information redundancy, data inconsistency, and high computational complexity. This paper proposes a heterogeneous network data fusion model based on a deep neural network. The model employs the Multi-Layer Perceptron (MLP) as its fundamental framework, utilising the deep neural network to facilitate joint feature representation learning on data from disparate modalities. The Adaptive Feature Reconstruction Module enables the model to learn the interrelationships between different modalities and to balance the importance of different modal features in the fusion process in a dynamic manner. Furthermore, we introduce an innovative cross-modal attention mechanism, which is capable of effectively capturing the coupling relationship between deep features in heterogeneous data, thereby enhancing the expressiveness and data fusion efficacy of the model. The experimental results demonstrate that the proposed model markedly enhances the accuracy of classification and regression tasks in comparison to traditional methodologies.

Cited by 3

A recommendation model with multi-scale semantic fusion on heterogeneous information network

doi.org - H Zhang, X Wang, X Li et al., 2023 - … International Conference on …

… utilization of heterogeneous data and have the problem of information loss in the process of semantic information fusion. In this paper, we propose a Multi-scale Semantic Fusion …

Cited by 1

Federated fault diagnosis using data fusion in large-scale heterogeneous unmanned systems

doi.org - Runze Li, Bin Jiang, Yan Zong et al., 2025 - Control Engineering Practice - Zone 2, IF 4.6

… reliability of heterogeneous unmanned systems. This paper proposes a federated fault diagnosis … based on data fusion, which combines visual images and multi-sensor information to …

Cited by 4

A clustering and fusion method for large group decision making with double information and heterogeneous experts

doi.org - Xiang-yu Zhong, Xuan-hua Xu, Xiao-hong Chen, 2021 - Soft Computing - Zone 4, IF 2.5

… and preference relation information, and heterogeneous … an expert clustering and information fusion method. First, … is proposed to classify the large-scale experts into several clusters. …

Cited by 15

Deep well construction of big data platform based on multi-source heterogeneous data fusion

doi.org - Yu Zhang, Yange Wang, Hongwei Ding et al., 2019 - International Journal of Internet Manufacturing and Services

At present, energy saving and emission reduction have become a matter of great concern. At the same time, the mining industry faces problems such as waste of resources, low efficiency, and frequent industrial accidents. Therefore, this paper designs a big data platform for deep well construction. High-precision, pressure-resistant sensors are added to the system to solve the difficulty of collecting information in deep wells with ordinary sensors. A multi-source heterogeneous data fusion algorithm is added to the system to reconcile the differing formats of the acquired data. The completed platform enables data monitoring throughout the mining process. It not only helps to enhance the safety of mine construction but also provides data analysis tools for further theoretical research on mine construction.

Cited by 5

A MAS approach to fusion of heterogeneous information

doi.org - G. Pavlin, P. D. Oude, J. Nunnink, 2005 - The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05)

… large scale fusion of heterogeneous and noisy information. DPN agents can establish meaningful information filtering channels between the relevant information … -level information fusion…

Cited by 13

Joint Representation Learning for Multi-Modal Transportation Recommendation

doi.org-Hao Liu, Ting Li, Renjun Hu et al., 2019-Proceedings of the AAAI Conference on Artificial Intelligence

Multi-modal transportation recommendation has a goal of recommending a travel plan which considers various transportation modes, such as walking, cycling, automobile, and public transit, and how to connect among these modes. The successful development of multi-modal transportation recommendation systems can help to satisfy the diversified needs of travelers and improve the efficiency of transport networks. However, existing transport recommender systems mainly focus on unimodal transport planning. To this end, in this paper, we propose a joint representation learning framework for multi-modal transportation recommendation based on a carefully-constructed multi-modal transportation graph. Specifically, we first extract a multi-modal transportation graph from large-scale map query data to describe the concurrency of users, Origin-Destination (OD) pairs, and transport modes. Then, we provide effective solutions for the optimization problem and develop an anchor embedding for transport modes to initialize the embeddings of transport modes. Moreover, we infer user relevance and OD pair relevance, and incorporate them to regularize the representation learning. Finally, we exploit the learned representations for online multimodal transportation recommendations. Indeed, our method has been deployed into one of the largest navigation Apps to serve hundreds of millions of users, and extensive experimental results with real-world map query data demonstrate the enhanced performance of the proposed method for multimodal transportation recommendations.

Cited by 88

Joint Representation Learning and Novel Category Discovery on Single- and Multi-modal Data

doi.org-Xu Jia, Kai Han, Yukun Zhu et al., 2021-2021 IEEE/CVF International Conference on Computer Vision (ICCV)

This paper studies the problem of novel category discovery on single- and multi-modal data with labels from different but relevant categories. We present a generic, end-to-end framework to jointly learn a reliable representation and assign clusters to unlabelled data. To avoid over-fitting the learnt embedding to labelled data, we take inspiration from self-supervised representation learning by noise-contrastive estimation and extend it to jointly handle labelled and unlabelled data. In particular, we propose using category discrimination on labelled data and cross-modal discrimination on multi-modal data to augment instance discrimination used in conventional contrastive learning approaches. We further employ Winner-Take-All (WTA) hashing algorithm on the shared representation space to generate pairwise pseudo labels for unlabelled data to better predict cluster assignments. We thoroughly evaluate our framework on large-scale multi-modal video benchmarks Kinetics-400 and VGG-Sound, and image benchmarks CIFAR10, CIFAR100 and ImageNet, obtaining state-of-the-art results.

Cited by 78

Molecular Joint Representation Learning via Multi-Modal Information of SMILES and Graphs

doi.org-Tianyu Wu, Yang Tang, Qiyu Sun et al., 2022-IEEE/ACM Transactions on Computational Biology and Bioinformatics (Zone 3, IF 3.4)

In recent years, artificial intelligence has played an important role in accelerating the whole process of drug discovery. Various molecular representation schemes of different modalities (e.g., textual sequence or graph) have been developed. By digitally encoding them, different chemical information can be learned through corresponding network structures. Molecular graphs and the Simplified Molecular Input Line Entry System (SMILES) are currently popular means for molecular representation learning. Previous works have attempted to combine both of them to solve the problem of specific information loss in single-modal representation on various tasks. To further fuse such multi-modal information, the correspondence between learned chemical features from different representations should be considered. To realize this, we propose a novel framework of molecular joint representation learning via Multi-Modal information of SMILES and molecular Graphs, called MMSG. We improve the self-attention mechanism by introducing bond-level graph representation as attention bias in Transformer to reinforce feature correspondence between multi-modal information. We further propose a Bidirectional Message Communication Graph Neural Network (BMC GNN) to strengthen the information flow aggregated from graphs for further combination. Numerous experiments on public property prediction datasets have demonstrated the effectiveness of our model.

Cited by 26

Relation-Induced Multi-Modal Shared Representation Learning for Alzheimer’s Disease Diagnosis

doi.org-Zhenyuan Ning, Qing Xiao, Qianjin Feng et al., 2021-IEEE Transactions on Medical Imaging (Zone 1 Top, IF 9.8)

The fusion of multi-modal data (e.g., magnetic resonance imaging (MRI) and positron emission tomography (PET)) has been prevalent for accurate identification of Alzheimer’s disease (AD) by providing complementary structural and functional information. However, most of the existing methods simply concatenate multi-modal features in the original space and ignore their underlying associations which may provide more discriminative characteristics for AD identification. Meanwhile, how to overcome the overfitting issue caused by high-dimensional multi-modal data remains appealing. To this end, we propose a relation-induced multi-modal shared representation learning method for AD diagnosis. The proposed method integrates representation learning, dimension reduction, and classifier modeling into a unified framework. Specifically, the framework first obtains multi-modal shared representations by learning a bi-directional mapping between original space and shared space. Within this shared space, we utilize several relational regularizers (including feature-feature, feature-label, and sample-sample regularizers) and auxiliary regularizers to encourage learning underlying associations inherent in multi-modal data and alleviate overfitting, respectively. Next, we project the shared representations into the target space for AD diagnosis. To validate the effectiveness of our proposed approach, we conduct extensive experiments on two independent datasets (i.e., ADNI-1 and ADNI-2), and the experimental results demonstrate that our proposed method outperforms several state-of-the-art methods.

Cited by 81

Graph Embedding Contrastive Multi-Modal Representation Learning for Clustering

doi.org-Wei Xia, Tianxiu Wang, Quanxue Gao et al., 2023-IEEE Transactions on Image Processing (Zone 1 Top, IF 13.7)

Multi-modal clustering (MMC) aims to explore complementary information from diverse modalities to improve clustering performance. This article studies challenging problems in MMC methods based on deep neural networks. On one hand, most existing methods lack a unified objective to simultaneously learn the inter- and intra-modality consistency, resulting in a limited representation learning capacity. On the other hand, most existing processes are modeled for a finite sample set and cannot handle out-of-sample data. To handle the above two challenges, we propose a novel Graph Embedding Contrastive Multi-modal Clustering network (GECMC), which treats the representation learning and multi-modal clustering as two sides of one coin rather than two separate problems. In brief, we specifically design a contrastive loss by benefiting from pseudo-labels to explore consistency across modalities. Thus, GECMC shows an effective way to maximize the similarities of intra-cluster representations while minimizing the similarities of inter-cluster representations at both inter- and intra-modality levels. So, the clustering and representation learning interact and jointly evolve in a co-training framework. After that, we build a clustering layer parameterized with cluster centroids, showing that GECMC can learn the clustering labels with given samples and handle out-of-sample data. GECMC yields results superior to those of 14 competitive methods on four challenging datasets. Codes and datasets are available: https://github.com/xdweixia/GECMC.

Cited by 71

MMKRL: A robust embedding approach for multi-modal knowledge graph representation learning

doi.org-Xinyu Lu, Lifang Wang, Zejun Jiang et al., 2021-Applied Intelligence (Zone 3, IF 3.5)

… (KGs); however, there is still much multi-modal (textual, visual) … solution called multi-modal knowledge representation learning (… Instead of simply integrating multi-modal knowledge with …

Cited by 69

Self-Adaptive Representation Learning Model for Multi-Modal Sentiment and Sarcasm Joint Analysis

doi.org-Yazhou Zhang, Yang Yu, Mengyao Wang et al., 2023-ACM Transactions on Multimedia Computing, Communications, and Applications (Zone 3, IF 6.0)

Sentiment and sarcasm are intimate and complex, as sarcasm often deliberately elicits an emotional response in order to achieve its specific purpose. Current challenges in multi-modal sentiment and sarcasm joint detection mainly include multi-modal representation fusion and the modeling of the intrinsic relationship between sentiment and sarcasm. To address these challenges, we propose a single-input stream self-adaptive representation learning model (SRLM) for sentiment and sarcasm joint recognition. Specifically, we divide the image into blocks to learn its serialized features and fuse textual feature as input to the target model. Then, we introduce an adaptive representation learning network using a gated network approach for sarcasm and sentiment classification. In this framework, each task is equipped with its dedicated expert network responsible for learning task-specific information, while the shared expert knowledge is acquired and weighted through the gating network. Finally, comprehensive experiments conducted on two publicly available datasets, namely Memotion and MUStARD, demonstrate the effectiveness of the proposed model when compared to state-of-the-art baselines. The results reveal a notable improvement on the performance of sentiment and sarcasm tasks.

Cited by 15

Understanding and Constructing Latent Modality Structures in Multi-Modal Representation Learning

doi.org-Qian Jiang, Changyou Chen, Han Zhao et al., 2023-2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Contrastive loss has been increasingly used in learning representations from multiple modalities. In the limit, the nature of the contrastive loss encourages modalities to exactly match each other in the latent space. Yet it remains an open question how the modality alignment affects the downstream task performance. In this paper, based on an information-theoretic argument, we first prove that exact modality alignment is sub-optimal in general for downstream prediction tasks. Hence we advocate that the key of better performance lies in meaningful latent modality structures instead of perfect modality alignment. To this end, we propose three general approaches to construct latent modality structures. Specifically, we design 1) a deep feature separation loss for intra-modality regularization; 2) a Brownian-bridge loss for inter-modality regularization; and 3) a geometric consistency loss for both intra- and inter-modality regularization. Extensive experiments are conducted on two popular multi-modal representation learning frameworks: the CLIP-based two-tower model and the ALBEF-based fusion model. We test our model on a variety of tasks including zero/few-shot image classification, image-text retrieval, visual question answering, visual reasoning, and visual entailment. Our method achieves consistent improvements over existing methods, demonstrating the effectiveness and generalizability of our proposed approach on latent modality structure regularization.

Cited by 77

MJPR: Multi-Modal Joint Predictive Representation in Deep Reinforcement Learning

doi.org-Zehan Wang, Ziming He, Zijia Wang et al., 2025-2025 IEEE International Conference on Robotics and Automation (ICRA)

Multi-modal reinforcement learning (RL) has been brought into focus due to its ability to provide complementary information from different sensors, enriching observations of agents. However, the introduction of multi-modal high-dimensional observations brings challenges to sample efficiency. There is a lack of research on how to efficiently obtain multi-modal latent states while encouraging them to generate complementary information. To address this, we propose a representation learning method, Multi-modal Joint Predictive Representation (MJPR), which utilizes multi-modal interactive information to predict future latent states. The joint prediction method achieves the representation training for modalities and promotes each modality to generate complementary information related to predictions of each other. In addition, we introduce multi-modal loss balancing to prompt training equilibrium and cross-modal contrastive learning (CMCL) to align the modalities for effective modal interaction. We establish the multi-modal environments in the Deepmind Control suite (DMC) and Webots and compare our method with current RL representation methods. Experimental results show that MJPR outperforms state-of-the-art methods by an average of 12.0% on six subtasks in DMC environments. It outperforms advanced methods by 16.7% and 55.4% in simple tasks and complex tasks of Webots environment, respectively. Moreover, ablation experiments are established in the DMC environment to verify the importance of each module to MJPR.

Enhancing Classification with Joint Representation Learning on Multimodal Data

doi.org-Neha Dhirendra Sirur, Padmashree Desai, Sujatha C et al., 2026-Lecture Notes in Networks and Systems

… Collectively, these works reinforce the importance of multimodal learning for classification and fusion. Building on these approaches, our work explores joint representation learning …
