

Hi!
Intro [CV]
I completed my M.Sc. at the University of California, Irvine and my B.Eng. at Xidian University.
I am fortunate to have worked at CVMI Lab under the guidance of Prof. Xiaojuan Qi and at SIAT-MMLab under the supervision of Prof. Yu Qiao.
During that time, my research focused on 3D point cloud analysis:
a) GDANet (AAAI 2021): A representative work on robust 3D shape recognition.
b) PAConv (CVPR 2021): A representative work on generic point cloud representation.
Currently, I am a Ph.D. candidate at GAP Lab, the Chinese University of Hong Kong (Shenzhen), advised by Prof. Xiaoguang Han.
Throughout my Ph.D. studies, my research has primarily revolved around a central question:
Can we break the barrier of painstaking real-world 3D data acquisition to train intelligent algorithms/agents that perceive, model, represent, and interact with 3D objects/scenes in the real world?
I attempt to tackle this challenge from two different perspectives:
(1) Up-stream Data
i) Data Collection => simplify or eliminate real-world data collection:
a) TO-Scene (ECCV 2022): Combines synthetic and real-world data to avoid scanning tabletop objects.
b) MVImgNet (CVPR 2023): Uses multi-view videos, which are easier to capture, to represent the real 3D world.
c) VC-Agent: Extracts clips from Internet videos for customized video dataset collection.
ii) Data Generation/Simulation => generate or simulate real-world 3D data:
Stable-Sim2Real: Uses a diffusion model to simulate real-world 3D captures given synthetic input.
(2) Downstream Algorithms
i) Label-Efficient => learning without real-world 3D labels:
MM-3DScene (CVPR 2023): Applies masked modeling to self-supervised pretraining on 3D scenes.
ii) Data-Efficient => learning without real-world 3D data:
SAMPro3D (3DV 2025): Employs 2D SAM for zero-shot 3D scene segmentation without additional training.
I have also led or participated in various projects related to generative models:
a) Free-ATM (ECCV 2024, as project lead): Harnesses diffusion models for representation learning.
b) TASTE-Rob (CVPR 2025, as project lead): Hand-object interaction video generation.
c) RichDreamer (CVPR 2024): Text-to-3D generation.
I am seeking a Postdoctoral or Faculty position for the Summer/Fall of 2025. If you have any openings or are interested, please feel free to contact me.
News
- [03/2025] One paper (TASTE-Rob) is accepted to CVPR2025. The code and data will be available.
- [11/2024] One paper (SAMPro3D) is accepted to 3DV2025. Paper and code are available.
- [08/2024] One paper (survey) is accepted to TPAMI. Paper is available.
- [07/2024] One paper (Free-ATM) is accepted to ECCV2024. The updated paper and code will be available.
- [07/2024] MVImgNet is awarded the WAIC Youth Outstanding Paper Nomination, 2024 (2024世界人工智能大会青年优秀论文提名奖).
- [03/2024] One paper (RichDreamer) is accepted to CVPR2024 as Highlight (2.8%). Paper and code are available.
- [11/2023] I was recognized as one of the NeurIPS2023 Top Reviewers (9.9%).
- [08/2023] MVImgNet is awarded the CCF Outstanding Graphics Open-Source Dataset, 2023 (CCF-2023年度优秀图形开源数据集奖).
- [05/2023] I was named one of the CVPR2023 Outstanding Reviewers (3.3%).
- [03/2023] Three papers (MVImgNet, MM-3DScene, REC-MV) are accepted to CVPR2023. Papers, dataset, and code are all available.
- [07/2022] One paper (TO-Scene) is accepted to ECCV2022 for Oral Presentation (2.7%). Paper and dataset are available.
- [03/2021] One paper (PAConv) is accepted to CVPR2021. Paper and code are available.
- [12/2020] One paper (GDANet) is accepted to AAAI2021. Paper and code are available.
Publications
(* indicates equal contribution, † denotes project lead, ‡ means corresponding author)
Selected Representatives

Learning Geometry-Disentangled Representation for Complementary Understanding of 3D Object Point Cloud
Mutian Xu*, David Junhao Zhang*, Zhipeng Zhou, Mingye Xu, Xiaojuan Qi, Yu Qiao‡.
(AAAI, 2021, BEST performance on OmniObject 3D robust perception) [paper][code]


Simulation of Real-Captured 3D Data via Depth Diffusion
Mutian Xu, Chongjie Ye, Haolin Liu, Yushuang Wu, Jiahao Chang, Xiaoguang Han‡.
(Under submission, 2024)

An Agent for Video Data Collection
Yidan Zhang*, Mutian Xu*, Yiming Hao, Kun Zhou, Jiahao Chang, Xiaoguang Han‡.
(Under submission, 2024)
Others


Free-ATM: Harnessing Free Attention Masks for Representation Learning on Diffusion-Generated Images
David Junhao Zhang, Mutian Xu†, Jay Zhangjie Wu, Chuhui Xue, Wenqing Zhang, Xiaoguang Han, Song Bai, Mike Zheng Shou‡. (†project lead)
(ECCV, 2024) [paper]



A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective
Chaoqi Chen*, Yushuang Wu*, Qiyuan Dai*, Hong-Yu Zhou*, Mutian Xu, Sibei Yang‡, Xiaoguang Han‡, Yizhou Yu‡.
(TPAMI, 2024) [paper]
Activities & Certificates
WAIC Youth Outstanding Paper Nomination Award (世界人工智能大会青年优秀论文提名奖, MVImgNet), 2024 |
Top Reviewer, NeurIPS 2023 (9.9%) |
Outstanding Reviewer, CVPR 2023 (3.3%) |
CCF Outstanding Graphics Open-Source Dataset (CCF年度优秀图形开源数据集奖, MVImgNet), 2023 |
Outstanding Teaching Assistant Award of CUHKSZ, 2022/24 |
Journal Reviewer: TIP, IJCV, TVCG, NEUCOM, TMM, MVAP |
Conference Reviewer: CVPR 23/24/25, ICCV 21/23/25, ECCV 24, ICLR 24/25, ICML 24/25, NeurIPS 23/25, IJCAI 24, WACV 24/25, ACCV 24 |
Experience
-
06/2020 – 02/2021
Research Assistant
Topic: 3D Point Cloud Convolution
Advisor: Prof. Xiaojuan Qi
-
07/2019 – 11/2019
Visiting Student
Topic: 3D Object Point Cloud Classification and Segmentation
Advisor: Prof. Yu Qiao & Dr. Zhipeng Zhou
Talks
"我要这paper有何用？(Why Do I Need Papers?)", Valse Webinar 2024 |
Outstanding Student Forum of Valse 2023 |
Outstanding Student Forum of China 3DV 2023 |
Youth PhD Talk - ECCV 2022, invited by AI-TIME |
Teaching
CUHKSZ-CSC1001: Introduction to Computer Science: Programming Methodology (Leading TA) |
CUHKSZ-CSC1002: Computational Laboratory |
CUHKSZ-CSC3002: Introduction to Computer Science: Programming Paradigms |
Miscellaneous
3rd place in the 31st School Singer Contest, Xidian University |
Piano Professional Certificate Level 10 |