📝 Representative Publications

Unified Multimodal Perception

  • Unified Representations: C-MCR (NeurIPS 2023), Ex-MCR (NeurIPS 2024), FreeBind (ICML 2024), OmniBind (ICLR 2025)

Spatial Intelligence in Visual Content

  • Point Cloud Understanding: Chat-3D (NAACL 2024) / Chat-Scene (NeurIPS 2024) for 3D multimodal LLMs; 3DRP-Net (EMNLP 2023) / WS-3DVG (ICCV 2023) for 3D visual grounding

  • Spatial-aware Image Understanding: Orient Anything (ICML 2025), SpatialCLIP (CVPR 2025)

  • Spatial-aware Image Generation: 6DoF-Gen (in progress), GenSpace (in progress)

arXiv 2025
  • Depth Anything with Any Prior. Zehan Wang, Siyu Chen, Lihe Yang, Jialei Wang, Ziang Zhang, Hengshuang Zhao, Zhou Zhao. arXiv, 2025
  • A state-of-the-art zero-shot depth estimation model that can incorporate any form of depth measurement as a prior.

ICML 2025

CVPR 2025

NeurIPS 2024

ICLR 2025

ICML 2024

NeurIPS 2023
  • Connecting Multi-modal Contrastive Representations. Zehan Wang, Yang Zhao, Xize Cheng, Haifeng Huang, Jiageng Liu, Li Tang, Linjun Li, Yongqi Wang, Aoxiong Yin, Ziang Zhang, Zhou Zhao. NeurIPS, 2023
  • Learning multimodal contrastive representations without requiring paired data.
