Welcome!
I am a fourth-year Ph.D. candidate in Informatics at the Pennsylvania State University, where I am fortunate to be advised by Dr. Fenglong Ma. My research lies at the intersection of multimodal learning, healthcare informatics, and knowledge-enhanced AI. I am particularly interested in developing robust, scalable methods that integrate heterogeneous health data, including clinical text, medical images, time-series signals, and structured knowledge, to support reliable and interpretable decision-making. My work has been published at top-tier venues, including NeurIPS, KDD, ACL, EMNLP, NAACL, CIKM, and AMIA.
Prior to joining Penn State, I received my Master’s degree in Information Science from the University of North Carolina at Chapel Hill under the guidance of Dr. Yue Wang, and my Bachelor’s degree in Automation from Central South University in China, where I was mentored by Dr. Fan Guo. I am deeply grateful to both advisors for introducing me to the fields of data science and artificial intelligence, which laid the foundation for my ongoing research.
Research Interests
My research interests include, but are not limited to:
- Multimodal Self-supervised Learning [ACL’24] [EMNLP’23] [Preprint’25]
- Multimodal Healthcare Applications [EMNLP’24] [NeurIPS’24] [KDD’24] [ACL’24] [EMNLP’23] [Preprint’25]
- Knowledge-enhanced Healthcare Applications [EMNLP’24] [NeurIPS’24] [COLING’24] [AMIA’22] [Preprint’25]
- Natural Language Processing & Large Language Models [KDD’24] [COLING’24] [NAACL’22] [Preprint’25]
- Information Retrieval [KDD’24] [CIKM’22] [Preprint’25]
Experience
Machine Learning Research Intern, Livestreaming Recommendation Team, TikTok
Mentor: Mr. Song Wang
May 2025 – Aug. 2025
Machine Learning Research Intern, Search ML Team, Instacart
Mentor: Dr. Taesik Na
May 2023 – Aug. 2023
Research Specialist, Renaissance Computing Institute, University of North Carolina at Chapel Hill
Mentor: Dr. Yue Wang
May 2022 – Aug. 2022
Selected Publications
📄 Peer-reviewed Papers
MedDiTPro: A Prompt-Guided Diffusion Transformer for Multimodal Longitudinal Medical Data Synthesis
Yuan Zhong, Xiaochen Wang, Jiaqi Wang, Xiaokun Zhang, Fenglong Ma
KDD’25
Developing Multimodal Healthcare Foundation Model: From Data-driven to Knowledge-enhanced
Xiaochen Wang
AAAI’25
FEDKIM: Adaptive Federated Knowledge Injection into Medical Foundation Models
Xiaochen Wang*, Jiaqi Wang*, Houping Xiao, Jinghui Chen, Fenglong Ma
EMNLP’24
FEDMEKI: A Benchmark for Scaling Medical Foundation Models via Federated Knowledge Injection
Jiaqi Wang*, Xiaochen Wang*, Lingjuan Lyu, Jinghui Chen, Fenglong Ma
NeurIPS’24 (Spotlight)
Unity in Diversity: Collaborative Pre-training Across Multimodal Medical Sources
Xiaochen Wang, Junyu Luo, Jiaqi Wang, Yuan Zhong, et al.
ACL’24
Mitigating Pooling Bias in E-commerce Search via False Negative Estimation
Xiaochen Wang, Xiao Xiao, Ruhan Zhang, Xuan Zhang, Taesik Na, Tejaswi Tenneti, Haixun Wang, Fenglong Ma
KDD’24
Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models
Yuan Zhong, Xiaochen Wang, Jiaqi Wang, Xiaokun Zhang, et al.
KDD’24
Hierarchical Pretraining on Multimodal Electronic Health Records
Xiaochen Wang, Junyu Luo, Jiaqi Wang, Ziyi Yin, Suhan Cui, Yuan Zhong, et al.
EMNLP’23
pADR: Towards Personalized Adverse Drug Reaction Prediction
Junyu Luo, Cheng Qian, Xiaochen Wang, Lucas Glass, Fenglong Ma
CIKM’23
Sentence-Level Resampling for Named Entity Recognition
Xiaochen Wang, Yue Wang
NAACL’22
Enabling Scientific Reproducibility through FAIR Data Management
Xiaochen Wang, Yue Wang, José-Luis Ambite, Abhishek Appaji, et al.
AMIA’22
Extreme Systematic Reviews: A Large Literature Screening Dataset
Jingwen Hou*, Xiaochen Wang*, Jean-Jacques Dubois, R. Byron Rice, Amanda Haddock, Yue Wang
CIKM’22
📝 Preprints
MedMKG: Benchmarking Medical Knowledge Exploitation with Multimodal Knowledge Graph
Xiaochen Wang, Yuan Zhong, Lingwei Zhang, Lisong Dai, Ting Wang, Fenglong Ma
Submitted to NeurIPS’25
GPR: Empowering Generation with Graph-Pretrained Retriever
Xiaochen Wang, Zongyu Wu, Yuan Zhong, Xiang Zhang, Suhang Wang, Fenglong Ma
Submitted to EMNLP’25