Patient Representation Learning

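Research on patient representation learning has coalesced around three technical pillars: self-supervised contrastive learning, sequential trajectory modeling, and graph-structured fusion. The focus is shifting from simple single-modality representations toward cross-modal consistency, long-range dependencies in clinical time series, and relationships among heterogeneous entities. The field is also paying increasing attention to model robustness, bias control, and the evaluation of clinical interpretability.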

67 papers in total, 4 research directions

Representation learning based on contrastive learning and self-supervised pretraining

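These papers use contrastive learning, self-supervised proxy tasks, and pretraining paradigms to address the scarcity of labeled data in healthcare, producing general-purpose, robust patient representations by aligning different modalities, views, or samples. Related papers: Hongxu Yuan et al., 2025, among 17 papers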
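As a rough illustration of the alignment idea described above, the following minimal sketch computes a one-directional InfoNCE-style contrastive loss over two "views" of the same patients. It is not taken from any of the listed papers: the function names (`cosine`, `info_nce_loss`), the `temperature` value, and the toy embeddings are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(view_a, view_b, temperature=0.1):
    """Mean InfoNCE loss, assuming view_a[i] and view_b[i] embed the same patient.

    For each anchor in view_a, the matching row of view_b is the positive
    and every other row of view_b acts as a negative.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability assigned to the true pairing.
        total += log_sum - logits[i]
    return total / n


# Hypothetical toy embeddings: two patients, two views each
# (for instance, one from structured codes and one from notes).
aligned_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_b = [[0.9, 0.1], [0.1, 0.9]]
shuffled_b = [aligned_b[1], aligned_b[0]]  # deliberately mismatched pairs

loss_aligned = info_nce_loss(aligned_a, aligned_b)
loss_shuffled = info_nce_loss(aligned_a, shuffled_b)
```

In this toy setup the correctly paired views give a much smaller loss than the mismatched ones, which is the signal an encoder would be trained to exploit. Practical systems typically compute this in both directions and in batches with a deep learning framework.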

Clinical trajectory analysis based on sequence modeling and Transformers

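These studies treat patient records as time series and use deep architectures such as Transformers, RNNs, and VAEs to capture long-term dependencies and temporal dynamics in the records, yielding deep representations of diagnosis and treatment trajectories. Related papers: Su Xian et al., 2025, among 23 papers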

Embedding complex structures with graph neural networks and multimodal fusion

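These papers focus on using graph neural networks and multimodal integration to explicitly model the complex relationships among heterogeneous medical entities (drugs, diagnoses, services), as well as the cross-modal dependencies between text and structured data. Related papers: Suparna Ghanvatkar et al., 2023, among 15 papers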

Patient similarity mining and systematic reviews and evaluations

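This group covers methods that use similarity among patient cohorts to enhance representations, as well as review papers that systematically summarize the field, evaluate its methodology, and discuss challenges such as bias and interpretability. Related papers: Chaohe Zhang et al., 2021, among 12 papers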

All 67 related papers

Deep Representation Learning of Patient Data from Electronic Health Records (EHR): A Systematic Review

Yuqi Si, Jingcheng Du, Zhao Li et al., 2020. Journal of Biomedical Informatics (Zone 2, IF 4.5)

OBJECTIVES: Patient representation learning refers to learning a dense mathematical representation of a patient that encodes meaningful information from Electronic Health Records (EHRs). This is generally performed using advanced deep learning methods. This study presents a systematic review of this field and provides both qualitative and quantitative analyses from a methodological perspective.

METHODS: We identified studies developing patient representations from EHRs with deep learning methods from MEDLINE, EMBASE, Scopus, the Association for Computing Machinery (ACM) Digital Library, and the Institute of Electrical and Electronics Engineers (IEEE) Xplore Digital Library. After screening 363 articles, 49 papers were included for a comprehensive data collection.

RESULTS: Publications developing patient representations almost doubled each year from 2015 until 2019. We noticed a typical workflow starting with feeding raw data, applying deep learning models, and ending with clinical outcome predictions as evaluations of the learned representations. Specifically, learning representations from structured EHR data was dominant (37 out of 49 studies). Recurrent Neural Networks were widely applied as the deep learning architecture (Long short-term memory: 13 studies, Gated recurrent unit: 11 studies). Learning was mainly performed in a supervised manner (30 studies) optimized with cross-entropy loss. Disease prediction was the most common application and evaluation (31 studies). Benchmark datasets were mostly unavailable (28 studies) due to privacy concerns of EHR data, and code availability was assured in 20 studies.

DISCUSSION & CONCLUSION: The existing predictive models mainly focus on the prediction of single diseases, rather than considering the complex mechanisms of patients from a holistic review. We show the importance and feasibility of learning comprehensive representations of patient EHR data through a systematic review. Advances in patient representation learning techniques will be essential for powering patient-level EHR analyses. Future work will still be devoted to leveraging the richness and potential of available EHR data. Reproducibility and transparency of reported results will hopefully improve. Knowledge distillation and advanced learning techniques will be exploited to assist the capability of learning patient representation further.