Hang Zhao

Google Scholar

Hey, I am Hang Zhao, an Assistant Professor at IIIS, Tsinghua University, Principal Investigator of MARS Lab. My research interests are multi-modal machine learning, autonomous driving and robotics.

I was a Research Scientist at Waymo (formerly Google's self-driving car project). Before that, I got my Ph.D. degree at MIT under Professor Antonio Torralba (the Great Torralba!), and my M.S. under Professor Ramesh Raskar. Before MIT, I received my B.S. from CKC Honors College, Zhejiang University.

Check out our MARS Lab Website for a full list of research projects and publications.

I am actively looking for PostDoc/PhD/BS students and engineers with a CS/EE background to join my team. If you would like to work with me, feel free to drop me an email with your resume.

Contact Info

Selected Projects

VCAD: Vision-Centric Autonomous Driving Hot
Hang Zhao, Yilun Wang, Yue Wang, Justin Solomon, Vitor Guizilini,
Tianyuan Zhang, Qi Li, Xuanyao Chen, Yicheng Liu
"A research effort that pushes the frontiers of camera-centered autonomous driving technology."

InterSim: Interactive Traffic Simulation via Explicit Relation Modeling
Qiao Sun, Xin Huang, Brian C Williams, Hang Zhao
IROS 2022
"Towards closed-loop behavior simulation."

Paper Project
M2I: From Factored Marginal Trajectory Prediction to Interactive Prediction Hot
Qiao Sun, Xin Huang, Junru Gu, Brian C Williams, Hang Zhao
CVPR 2022
"Towards interactive motion prediction."

Paper Project
FUTR3D: A Unified Sensor Fusion Framework for 3D Detection
Xuanyao Chen, Tianyuan Zhang, Yue Wang, Yilun Wang, Hang Zhao
Preprint 2022
Paper Project
MUTR3D: A Multi-camera Tracking Framework via 3D-to-2D Queries
Tianyuan Zhang, Xuanyao Chen, Yue Wang, Yilun Wang, Hang Zhao
Preprint 2022
Paper Project
HDMapNet: An Online HD Map Construction and Evaluation Framework Hot
Qi Li, Yue Wang, Yilun Wang, Hang Zhao
ICRA 2022
"HD map learning from onboard sensors!"

Paper Project
SEMI: Self-supervised Exploration via Multisensory Incongruity
Jianren Wang*, Ziwen Zhuang*, Hang Zhao
ICRA 2022
"Multi-sensory incongruity incentivizes RL agents to explore!"

Neural Dubber: Dubbing for Videos According to Scripts
Chenxu Hu, Qiao Tian, Tingle Li, Yuping Wang, Yuxuan Wang, Hang Zhao
NeurIPS 2021
"Automatic video dubbing driven by a neural network!"

Paper Project
DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries Hot
Yue Wang, Vitor Campagnolo Guizilini, Tianyuan Zhang, Yilun Wang, Hang Zhao, Justin Solomon
CoRL 2021
"A new paradigm of 3D object detection from 2D images!"

On Feature Decorrelation in Self-Supervised Learning Hot
Tianyu Hua, Wenxiao Wang, Zihui Xue, Yue Wang, Sucheng Ren, Hang Zhao
ICCV 2021 Oral
"It reveals the connection between model collapse and feature correlations!"

Paper Project
Large Scale Interactive Motion Forecasting for Autonomous Driving: The Waymo Open Motion Dataset
Scott Ettinger, et al.
ICCV 2021 Oral

Paper Waymo Blog
Multimodal Knowledge Expansion
Zihui Xue, Sucheng Ren, Zhengqi Gao, Hang Zhao
ICCV 2021
"Multimodal data brings knowledge for free!"

Paper Project
DenseTNT: End-to-end Trajectory Prediction from Dense Goal Sets
Junru Gu, Chen Sun, Hang Zhao
ICCV 2021
"A SOTA anchor-free and end-to-end multi-trajectory prediction model"

Paper Challenge Report Project
TNT: Target-driveN Trajectory Prediction Hot
Hang Zhao*, Jiyang Gao*, Tian Lan, Chen Sun, Benjamin Sapp,
Balakrishnan Varadarajan, Yue Shen, Yi Shen, Yuning Chai,
Cordelia Schmid, Congcong Li, Dragomir Anguelov
Conference on Robot Learning (CoRL) 2020
"A new motion prediction framework for self-driving!"

VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation Hot
Jiyang Gao*, Chen Sun*, Hang Zhao, Yi Shen,
Dragomir Anguelov, Congcong Li, Cordelia Schmid
In Proc. Computer Vision and Pattern Recognition (CVPR) 2020

Paper Waymo Blog
Scalability in Perception for Autonomous Driving: Waymo Open Dataset
Pei Sun et al.
In Proc. Computer Vision and Pattern Recognition (CVPR) 2020
Seattle (virtual), June 2020

Project Page Challenges Paper
Through-Wall Human Mesh Recovery Using Radio Signals
Mingmin Zhao, Yingcheng Liu, Aniruddh Raghu, Hang Zhao, Tianhong Li,
Antonio Torralba, Dina Katabi
In Proc. International Conference on Computer Vision (ICCV)
Seoul, Korea, Oct. 2019

HACS: Human Action Clips and Segments Dataset for Recognition and Temporal Localization
Hang Zhao, Zhicheng Yan, Lorenzo Torresani, Antonio Torralba
In Proc. International Conference on Computer Vision (ICCV)
Seoul, Korea, Oct. 2019
"A large-scale dataset for temporal action localization and recognition."

Paper (arXiv) Project Page GitHub Page
The Sound of Pixels Hot
Hang Zhao, Chuang Gan, Andrew Rouditchenko, Carl Vondrick, Josh McDermott, Antonio Torralba
In Proc. European Conference on Computer Vision (ECCV)
Munich, Germany, Sep. 2018
"Listen to the sound of pixels!"
Paper (arXiv) Project Page Code News Coverage
RF-Based 3D Skeletons
Mingmin Zhao, Yonglong Tian, Hang Zhao, Mohammad Alsheikh,
Tianhong Li, Rumen Hristov, Zachary Kabelac, Dina Katabi, Antonio Torralba
Special Interest Group on Data Communications (SIGCOMM)
Budapest, Hungary, Aug. 2018
Paper News Coverage
Through-Wall Human Pose Estimation Using Radio Signals Hot
Mingmin Zhao, Tianhong Li, Mohammad Alsheikh, Yonglong Tian, Hang Zhao,
Antonio Torralba, Dina Katabi
In Proc. Computer Vision and Pattern Recognition (CVPR)
Salt Lake City, Utah, June 2018
Paper Project Page News Coverage
Scene Parsing through ADE20K Dataset Hot
Bolei Zhou, Hang Zhao, Xavier Puig, Sanja Fidler, Adela Barriuso, Antonio Torralba
In Proc. Computer Vision and Pattern Recognition (CVPR)
Honolulu, Hawaii, July 2017
Paper Dataset Code Online Demo
Semantic Understanding of Scenes through the ADE20K Dataset
Bolei Zhou, Hang Zhao, Xavier Puig, Tete Xiao, Sanja Fidler, Adela Barriuso, Antonio Torralba
International Journal of Computer Vision 2018 (IJCV)

ILSVRC'16 MIT Scene Parsing Challenge
"I co-organized the scene parsing challenge at ILSVRC'16. Check out our dataset now!"
Paper (arXiv) Dataset Benchmark Page Challenge Page GitHub Page Online Demo

Loss Functions for Neural Networks for Image Processing Hot
Hang Zhao, Orazio Gallo, Iuri Frosio and Jan Kautz
IEEE Transactions on Computational Imaging 2017 (TCI)

"How important are loss functions for image processing tasks in deep neural nets?"
Paper (Journal) Paper (arXiv) Project Page Code

Duckietown: an Open, Inexpensive and Flexible Platform for Autonomy Education and Research
IEEE International Conference on Robotics and Automation (ICRA)
Singapore, May 2017
"We are building an open-source education and research platform for autonomous driving."
Paper Video Project Page Code News Coverage

Unbounded High Dynamic Range Photography using a Modulo Camera Hot
Hang Zhao, Boxin Shi, Christy Fernandez-Cull, Sai-Kit Yeung and Ramesh Raskar
In Proc. International Conference on Computational Photography (ICCP)
Houston, USA, Apr. 2015 (Acceptance Rate: 24%)
Oral Presentation [Best Paper runner-up]

Paper Poster Video Project Page News Coverage


  • [Tsinghua] Advances in Autonomous Driving and Intelligent Vehicles (Lecturer)
  • [Tsinghua] Introduction to Multimedia Computing (Lecturer)
  • [MIT 6.869] Advances in Computer Vision (Teaching Assistant)
  • [MIT 2.166] Autonomous Vehicles, also known as "Duckietown" (Course Developer, Teaching Assistant)
  • [MIT 6.870] Smartphone Vision (Teaching Assistant)
  • [MIT 2.007] Design and Manufacturing I (Teaching Assistant)

Professional Activities

  • Co-organizer of Workshop on Sight and Sound at CVPR 2022.
  • Co-organizer of HACS Temporal Action Localization Challenge at Workshop on International Challenge on Activity Recognition at CVPR 2020.
  • Co-organizer of Workshop on Sight and Sound at CVPR 2020.
  • Co-organizer of Workshop on Multi-modal Video Analysis and Moments in Time Challenge at ICCV 2019.
  • Co-organizer of Weakly Supervised Learning for Real-World Computer Vision Applications and the 1st Learning from Imperfect Data (LID) Challenge at CVPR 2019.
  • Co-organizer of Places Challenge 2017.
  • Co-organizer of Joint COCO and Places Recognition Challenge Workshop at ICCV 2017.
  • Co-organizer of MIT Scene Parsing Challenge 2016.
  • Co-organizer of ILSVRC'16 challenge workshop at ECCV 2016.
  • Journal reviewer for TPAMI, IJCV, TIP, CVIU, TCI, OE, etc.
  • Conference reviewer for CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, etc.
  • Co-chair of MIT Vision Seminar.

Talks

  • Invited talk at IROS Workshop on Behavior-driven Autonomous Driving in Unstructured Environments, Oct 2022.
  • Invited talk at CVPR Tutorial on OpenMMLab, June 2022.
  • Invited talk at ICCV Workshop on Benchmarking Trajectory Forecasting Models, Oct 2021.
  • Invited talk at CVPR Workshop on Autonomous Driving, June 2021.
  • Invited talk at Samsung Research Lab, UK, April 2021.
  • Invited talk at Amazon Alexa, May 2019.
  • Invited talk at Samsung Workshop at MIT, April 2019.
  • Invited talk at Machine Intelligence Conference, March 2019.
  • Invited talk at PHILIPS, February 2019.
  • Invited talk at VALSE, June 2018.
  • Invited talk at Harvard vision seminar, May 2018.
  • Invited talk at Google Cambridge, April 2018.
  • Invited talk at MIT graphics seminar, September 2015.
In Chinese:

  • Invited talk at TechBeat on BEV Perception for Vision-Centric Autonomous Driving, March 2022. [Recording]
  • Invited talk at TechBeat on Motion Prediction for Autonomous Driving, December 2020. [Recording]
  • Invited talk at TechBeat on Cross-modal Audio-visual Self-supervised Learning, June 2018. [Recording]
Current and Past Affiliations