About Me
I am Zehan Wang (王泽寒), a second-year PhD student in the College of Computer Science at Zhejiang University, supervised by Prof. Zhou Zhao. Before that, I obtained a Bachelor's degree in Biological System Engineering from Zhejiang University, supervised by Prof. Haiyan Cen and Prof. Tao Lin.
My research interests lie broadly in Multi-modal Learning, e.g., Multi-modal Representation Learning, Multi-modal Generation, and Multi-modal LLMs. I have published several first-author papers at top international AI conferences such as NeurIPS, ICCV, and ACL.
I am actively looking for academic collaborations. Feel free to drop me an email.
🔥 News
- 2023.10: 🎉🎉 Ex-MCR comes out!
- 2023.10: One paper is accepted by EMNLP 2023!
- 2023.09: 🎉🎉 C-MCR is accepted by NeurIPS 2023!
- 2023.08: Chat-3D comes out!
- 2023.06: Two papers are accepted by ICCV 2023!
- 2023.05: One paper is accepted by ACL 2023!
📝 Representative Publications
Multi-modal Representation Learning
- Unified Representations: C-MCR (NeurIPS 2023), Ex-MCR
- Audio-Video Representations: LiMo
- Visual-Language Representations: DG-NLVL (ACL 2023)
Multi-modal LLM & Understanding
- Large Language Model for 3D: Chat-3D
- 3D Visual Grounding: 3DRP-Net (EMNLP 2023), Weakly supervised 3DVG (ICCV 2023)
- Extending Multi-modal Contrastive Representations. Zehan Wang, Ziang Zhang, Luping Liu, Yang Zhao, Haifeng Huang, Tao Jin, Zhou Zhao. arXiv, 2023
- Learns unified multi-modal contrastive representations for more than three modalities in a paired-data-free and training-efficient way.
- Academic / Industry Impact: Our code and pre-trained models are released at , which provides state-of-the-art unified 3D-image-text-audio representations.
- Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes. Zehan Wang, Haifeng Huang, Yang Zhao, Ziang Zhang, Zhou Zhao. arXiv, 2023
- LLM for 3D: the first universal dialogue system for the 3D world.
- Academic / Industry Impact: Our code is released at .
- Connecting Multi-modal Contrastive Representations. Zehan Wang, Yang Zhao, Xize Cheng, Haifeng Huang, Jiageng Liu, Li Tang, Linjun Li, Yongqi Wang, Aoxiong Yin, Ziang Zhang, Zhou Zhao. NeurIPS 2023
- A paired-data-free and training-efficient multi-modal contrastive representation learning method.
- Academic / Industry Impact: Our work is reported by PaperWeekly. Our code and pre-trained models are released at , which provides state-of-the-art audio-visual and 3D-language representations.
- Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding. Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao. ICCV 2023
- The first weakly-supervised 3D visual grounding method.
Full Publication List
2023
- Extending Multi-modal Contrastive Representations. Zehan Wang, Ziang Zhang, Luping Liu, Yang Zhao, Haifeng Huang, Tao Jin, Zhou Zhao. arXiv, 2023
- Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes. Zehan Wang, Haifeng Huang, Yang Zhao, Ziang Zhang, Zhou Zhao. arXiv, 2023
- Connecting Multi-modal Contrastive Representations. Zehan Wang, Yang Zhao, Xize Cheng, Haifeng Huang, Jiageng Liu, Li Tang, Linjun Li, Yongqi Wang, Aoxiong Yin, Ziang Zhang, Zhou Zhao. NeurIPS 2023
- Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding. Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao. ICCV 2023
- MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition. Xize Cheng, Tao Jin, Rongjie Huang, Linjun Li, Wang Lin, Zehan Wang, Ye Wang, Huadai Liu, Aoxiong Yin, Zhou Zhao. ICCV 2023
- 3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding. Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao. EMNLP 2023
- Scene-robust Natural Language Video Localization via Learning Domain-invariant Representations. Zehan Wang, Yang Zhao, Haifeng Huang, Yan Xia, Zhou Zhao. ACL 2023
📖 Education
- 2022.09 - Present, Ph.D. Student, Zhejiang University, Hangzhou.
- 2018.09 - 2022.06, Undergraduate, Zhejiang University, Hangzhou.
🎖 Honors and Awards
- Excellent Graduate, Zhejiang Province (2022)
- National Scholarship (2020)
- National Scholarship (2019)
- Zhejiang University First-class Scholarships (2019, 2020, 2021)
💬 Professional Services
- Conference Reviewer: ICCV 2023, ACM-MM 2023, EMNLP 2023
- Assistant Reviewer: NeurIPS 2023, CVPR 2023, CVPR 2022, ACM-MM 2022, SIGIR 2022, TMM, TNNLS
💻 Internships & Projects
- August 2023: Research Intern, Tencent TEG.
Hosted by Hongfa Wang and Wei Liu.
Working on the HunYuan Text-to-Video and Text-to-Audio projects.