Hang Zhao




Google Scholar


Hey, I am Hang Zhao, an Assistant Professor at IIIS, Tsinghua University, and Principal Investigator of MARS Lab. My research interests are multi-modal machine learning, autonomous driving, and robot learning. Check out our MARS Lab Website for a full list of research projects and publications.

I was a Research Scientist at Waymo (known as Google's self-driving project) from 2019 to 2020. Before that, I got my Ph.D. degree at MIT in 2019 under the supervision of Professor Antonio Torralba (the Great Torralba!). Before MIT, I received my B.S. from Zhejiang University in 2013.

I am actively looking for PostDoc/PhD/BS students and engineers with CS/EE background to join my team. If you would like to work with me, feel free to drop me an email with your resume.

Contact Info


Selected Projects

Humanoid Parkour Learning Hot
Ziwen Zhuang, Shenzhe Yao, Hang Zhao
CoRL 2024
"The first humanoid robot that learns to parkour!"
Paper Project
DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models Hot
Xiaoyu Tian, Junru Gu, Bailin Li, Yicheng Liu, Yang Wang, Zhiyong Zhao, Kun Zhan
Peng Jia, Xianpeng Lang, Hang Zhao
CoRL 2024
"Slow-Fast Dual System for autonomous driving!"
Paper Project
Latent Consistency Models: Synthesizing High-Resolution Images With Few-Step Inference Hot
Simian Luo, Yiqin Tan, Longbo Huang, Jian Li, Hang Zhao
"Generating high-resolution images in only 2-4 steps!"
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module
Simian Luo, Yiqin Tan, Suraj Patil, Daniel Gu, Patrick von Platen, Apolinário Passos,
Longbo Huang, Jian Li, Hang Zhao
"Accelerating your LoRA model by 5x without training!"
LCM Paper LCM-LoRA Report Demo Project
Robot Parkour Learning Hot
Ziwen Zhuang, Zipeng Fu, Jianren Wang, Christopher G Atkeson, Sören Schwertfeger,
Chelsea Finn, Hang Zhao
CoRL 2023 Oral Best System Paper Finalist (Top 3)
"Robot parkour skills empowered by onboard vision and a neural network!"

Paper Code Project
Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
Simian Luo, Chuanhao Yan, Chenxu Hu, Hang Zhao
NeurIPS 2023

Paper Project
Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving
Xiaoyu Tian, Tao Jiang, Longfei Yun, Yucheng Mao, Huitong Yang,
Yue Wang, Yilun Wang, Hang Zhao
NeurIPS Dataset Track 2023

Paper Project
ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory
Chenxu Hu, Jie Fu, Chenzhuang Du, Simian Luo, Junbo Zhao, Hang Zhao
LLM@IJCAI 2023

Paper Project
VCAD: Vision-Centric Autonomous Driving Hot
Hang Zhao, Yue Wang, Yilun Wang, Justin Solomon, Vitor Guizilini, et al.
"A research effort pushing the frontiers of camera-centric autonomous driving technology."

Project Workshop
Neural Map Prior for Autonomous Driving
Xuan Xiong, Yicheng Liu, Tianyuan Yuan, Yue Wang, Yilun Wang, Hang Zhao
CVPR 2023
"A neural representation of HD maps to improve local map inference."

Paper Project
ViP3D: End-to-end Visual Trajectory Prediction via 3D Agent Queries
Junru Gu*, Chenxu Hu*, Tianyuan Zhang, Xuanyao Chen, Yilun Wang, Yue Wang, Hang Zhao
CVPR 2023
"Vision-based trajectory prediction for autonomous driving."
Paper Project
VectorMapNet: End-to-end Vectorized HD Map Learning
Yicheng Liu, Tianyuan Yuan, Yue Wang, Yilun Wang, Hang Zhao
ICML 2023
"Vectorized mapping from onboard sensors!"
Paper Project
InterSim: Interactive Traffic Simulation via Explicit Relation Modeling
Qiao Sun, Xin Huang, Brian C Williams, Hang Zhao
IROS 2022
"Towards closed-loop behavior simulation."

Paper Project
M2I: From Factored Marginal Trajectory Prediction to Interactive Prediction
Qiao Sun, Xin Huang, Junru Gu, Brian C Williams, Hang Zhao
CVPR 2022
"Towards interactive motion prediction."

Paper Project
HDMapNet: An Online HD Map Construction and Evaluation Framework
Qi Li, Yue Wang, Yilun Wang, Hang Zhao
CVPR 2021 Workshop best paper, ICRA 2022
"HD map learning from onboard sensors!"

Paper Project
Neural Dubber: Dubbing for Videos According to Scripts
Chenxu Hu, Qiao Tian, Tingle Li, Yuping Wang, Yuxuan Wang, Hang Zhao
NeurIPS 2021
"Automatic video dubbing driven by a neural network!"

Paper Project
DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries Hot
Yue Wang, Vitor Campagnolo Guizilini, Tianyuan Zhang, Yilun Wang, Hang Zhao, Justin Solomon
CoRL 2021
"A new paradigm of 3D object detection from 2D images!"

Paper
On Feature Decorrelation in Self-Supervised Learning Hot
Tianyu Hua, Wenxiao Wang, Zihui Xue, Yue Wang, Sucheng Ren, Hang Zhao
ICCV 2021 Oral
"It reveals the connection between model collapse and feature correlations!"

Paper Project
Large Scale Interactive Motion Forecasting for Autonomous Driving: The Waymo Open Motion Dataset
Scott Ettinger, et al.
ICCV 2021 Oral

Paper Waymo Blog
DenseTNT: End-to-end Trajectory Prediction from Dense Goal Sets
Junru Gu, Chen Sun, Hang Zhao
ICCV 2021
"A SOTA anchor-free and end-to-end multi-trajectory prediction model"

Paper Challenge Report Project
TNT: Target-driveN Trajectory Prediction Hot
Hang Zhao, Jiyang Gao, Tian Lan, Chen Sun, Benjamin Sapp,
Balakrishnan Varadarajan, Yue Shen, Yi Shen, Yuning Chai,
Cordelia Schmid, Congcong Li, Dragomir Anguelov
Conference on Robot Learning (CoRL) 2020
"A new motion prediction framework for self-driving!"

Paper
VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation Hot
Jiyang Gao, Chen Sun, Hang Zhao, Yi Shen,
Dragomir Anguelov, Congcong Li, Cordelia Schmid
In Proc. Computer Vision and Pattern Recognition (CVPR) 2020

Paper Waymo Blog
Scalability in Perception for Autonomous Driving: Waymo Open Dataset
Pei Sun et al.
In Proc. Computer Vision and Pattern Recognition (CVPR) 2020
Seattle (virtual), June 2020

Project Page Challenges Paper
HACS: Human Action Clips and Segments Dataset for Recognition and Temporal Localization
Hang Zhao, Zhicheng Yan, Lorenzo Torresani, Antonio Torralba
In Proc. International Conference on Computer Vision (ICCV)
Seoul, Korea, Oct. 2019
"A large-scale dataset for temporal action localization and recognition."

Paper (arXiv) Project Page GitHub Page
The Sound of Pixels Hot
Hang Zhao, Chuang Gan, Andrew Rouditchenko, Carl Vondrick, Josh McDermott, Antonio Torralba
In Proc. European Conference on Computer Vision (ECCV)
Munich, Germany, Sep. 2018
"Listen to the sound of pixels!"
Paper (arXiv) Project Page Code News Coverage
Through-Wall Human Pose Estimation Using Radio Signals Hot
Mingmin Zhao, Tianhong Li, Mohammad Alsheikh, Yonglong Tian, Hang Zhao,
Antonio Torralba, Dina Katabi
In Proc. Computer Vision and Pattern Recognition (CVPR)
Salt Lake City, Utah, June 2018
Paper Project Page News Coverage
Scene Parsing through ADE20K Dataset Hot
Bolei Zhou, Hang Zhao, Xavier Puig, Sanja Fidler, Adela Barriuso, Antonio Torralba
In Proc. Computer Vision and Pattern Recognition (CVPR)
Honolulu, Hawaii, July 2017
Paper Dataset Code Online Demo
Semantic Understanding of Scenes through the ADE20K Dataset
Bolei Zhou, Hang Zhao, Xavier Puig, Tete Xiao, Sanja Fidler, Adela Barriuso, Antonio Torralba
International Journal of Computer Vision 2018 (IJCV)

ILSVRC'16 MIT Scene Parsing Challenge
"I co-organized the scene parsing challenge at ILSVRC'16. Check out our dataset now!"
Paper (arXiv) Dataset Benchmark Page Challenge Page GitHub Page Online Demo

Loss Functions for Neural Networks for Image Processing Hot
Hang Zhao, Orazio Gallo, Iuri Frosio and Jan Kautz
arXiv:1511.08861
IEEE Transactions on Computational Imaging 2017 (TCI)

"How important are loss functions for image processing tasks in deep neural nets?"
Paper (Journal) Paper (arXiv) Project Page Code

Duckietown: an Open, Inexpensive and Flexible Platform for Autonomy Education and Research
IEEE International Conference on Robotics and Automation (ICRA)
Singapore, May 2017
"We are building an open-source education and research platform for autonomous driving."
Paper Video Project Page Code News Coverage

Unbounded High Dynamic Range Photography using a Modulo Camera Hot
Hang Zhao, Boxin Shi, Christy Fernandez-Cull, Sai-Kit Yeung and Ramesh Raskar
In Proc. International Conference on Computational Photography (ICCP)
Houston, USA, Apr. 2015 (Acceptance Rate: 24%)
Oral Presentation [Best Paper runner-up]

Paper Poster Video Project Page News Coverage


Teaching

  • [Tsinghua] Advances in Autonomous Driving and Intelligent Vehicles (Lecturer)
  • [Tsinghua] Introduction to Multimedia Computing (Lecturer)
  • [MIT 6.869] Advances in Computer Vision (Teaching Assistant)
  • [MIT 2.166] Autonomous Vehicles, also known as "Duckietown" (Course Developer, Teaching Assistant)
  • [MIT 6.870] Smartphone Vision (Teaching Assistant)
  • [MIT 2.007] Design and Manufacturing I (Teaching Assistant)

Professional Activities

  • Co-organizer of Workshop on Vision-Centric Autonomous Driving (VCAD) at ECCV 2024.
  • Co-organizer of Workshop on Vision and Language for Autonomous Driving and Robotics at CVPR 2024.
  • Co-organizer of Workshop on Sight and Sound at CVPR 2024.
  • Co-organizer of Workshop on Vision-Centric Autonomous Driving (VCAD) at CVPR 2023.
  • Co-organizer of Workshop on Sight and Sound at CVPR 2023.
  • Workshop co-chair (organizing committee) of ICLR 2023.
  • Co-organizer of Workshop on Sight and Sound at CVPR 2022.
  • Co-organizer of HACS Temporal Action Localization Challenge at Workshop on International Challenge on Activity Recognition at CVPR 2020.
  • Co-organizer of Workshop on Sight and Sound at CVPR 2020.
  • Co-organizer of Workshop on Multi-modal Video Analysis and Moments in Time Challenge at ICCV 2019.
  • Co-organizer of Weakly Supervised Learning for Real-World Computer Vision Applications and the 1st Learning from Imperfect Data (LID) Challenge at CVPR 2019.
  • Co-organizer of Places Challenge 2017.
  • Co-organizer of Joint COCO and Places Recognition Challenge Workshop at ICCV 2017.
  • Co-organizer of MIT Scene Parsing Challenge 2016.
  • Co-organizer of ILSVRC'16 challenge workshop at ECCV 2016.
  • Journal reviewer for TPAMI, IJCV, TIP, CVIU, TCI, OE, etc.
  • Conference reviewer for CVPR, ICCV, ECCV, NIPS, ICML, ICLR, etc.
  • Co-chair of MIT Vision Seminar.

Talks

  • Invited talk at ECCV Workshop on Autonomous Vehicles meet Multimodal Foundation Models, Oct 2024.
  • Invited talk at ICCV Workshop on Visual Learning of Sounds in Spaces (AV4D), Oct 2023.
  • Invited talk at CVPR Workshop on Autonomous Driving (WAD), June 2023.
  • Invited talk at VALSE Workshop on Autonomous Driving, June 2023.
  • Invited talk at ICLR Workshop on Representation for Autonomous Driving, May 2023.
  • Invited talk at NeurIPS Workshop on Machine Learning for Autonomous Driving (ML4AD), December 2022.
  • Invited talk at IROS Workshop on Behavior-driven Autonomous Driving in Unstructured Environments, Oct 2022.
  • Invited talk at CVPR Tutorial on OpenMMLab, June 2022.
  • Invited talk at VALSE APR on Autonomous Driving, June 2022.
  • Invited talk at VALSE Workshop on Multimodal Learning, June 2022.
  • Invited talk at ICCV Workshop on Benchmarking Trajectory Forecasting Models, Oct 2021.
  • Invited talk at CVPR Workshop on Autonomous Driving, June 2021.
  • Invited talk at Samsung Research Lab, UK, April 2021.
  • Invited talk at Amazon Alexa, May 2019.
  • Invited talk at Samsung Workshop at MIT, April 2019.
  • Invited talk at Machine Intelligence Conference, March 2019.
  • Invited talk at PHILIPS, February 2019.
  • Invited talk at VALSE, June 2018.
  • Invited talk at Harvard vision seminar, May 2018.
  • Invited talk at Google Cambridge, April 2018.
  • Invited talk at MIT graphics seminar, September 2015.
In Chinese:

  • Invited talk at TechBeat on BEV Perception for Vision-Centric Autonomous Driving, March 2022. [Recording]
  • Invited talk at TechBeat on Motion Prediction for Autonomous Driving, December 2020. [Recording]
  • Invited talk at TechBeat on Cross-modal Audio-visual Self-supervised Learning, June 2018. [Recording]
Current and Past Affiliations