Wenliang Zhao
I am a fifth-year Ph.D. student in the Department of Automation at Tsinghua University, advised by Prof. Jiwen Lu and Prof. Jie Zhou. In 2020, I obtained my B.Eng. from the Department of Automation, Tsinghua University.
I am broadly interested in computer vision and deep learning. My current research focuses on model architectures and generative models.
Email / Google Scholar / GitHub
Publications
* equal contribution † project leader
DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation
Wenliang Zhao, Haolin Wang, Jie Zhou, Jiwen Lu
European Conference on Computer Vision (ECCV), 2024
[arXiv] [Code]
DC-Solver is designed to improve alignment in predictor-corrector diffusion samplers, and it is also applicable to predictor-only samplers. With negligible search cost, DC-Solver supports sampling with as few as 5 steps (NFE).
FlowIE: Efficient Image Enhancement via Rectified Flow
Yixuan Zhu*, Wenliang Zhao*†, Ao Li, Yansong Tang, Jie Zhou, Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Oral Presentation
[arXiv] [Code]
FlowIE is the first flow-based image enhancement framework that supports various tasks and is efficient in both training (simulation-free) and inference (<5 sampling steps).
UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models
Wenliang Zhao*, Lujia Bai*, Yongming Rao, Jie Zhou, Jiwen Lu
Conference on Neural Information Processing Systems (NeurIPS), 2023
[arXiv] [Code] [Project Page]
UniPC is a training-free framework for fast sampling of diffusion models. It consists of a corrector (UniC) and a predictor (UniP) that share a unified analytical form and support arbitrary orders.
Unleashing Text-to-Image Diffusion Models for Visual Perception
Wenliang Zhao*, Yongming Rao*, Zuyan Liu*, Benlin Liu, Jie Zhou, Jiwen Lu
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
[arXiv] [Code] [Project Page] [Rank 1st on NYUv2 Depth Estimation]
VPD (Visual Perception with Pre-trained Diffusion Models) is a framework that leverages the high-level and low-level knowledge of a pre-trained text-to-image diffusion model for downstream visual perception tasks.
HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
Yongming Rao*, Wenliang Zhao*, Yansong Tang, Jie Zhou, Ser-Nam Lim, Jiwen Lu
Conference on Neural Information Processing Systems (NeurIPS), 2022
[arXiv] [Code] [Project Page] [Explainer in Chinese]
HorNet is a family of generic vision backbones that perform explicit high-order spatial interactions based on Recursive Gated Convolution.
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
Yongming Rao*, Wenliang Zhao*, Guangyi Chen, Yansong Tang, Jie Zhou, Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[arXiv] [Code] [Project Page] [Explainer in Chinese]
DenseCLIP is a new framework for dense prediction that implicitly and explicitly leverages the pre-trained knowledge from CLIP.
Global Filter Networks for Image Classification
Yongming Rao*, Wenliang Zhao*, Zheng Zhu, Jiwen Lu, Jie Zhou
Conference on Neural Information Processing Systems (NeurIPS), 2021
[arXiv] [Code] [Project Page] [Explainer in Chinese (by HappyAIWalker)]
Global Filter Networks is a transformer-style architecture that learns long-term spatial dependencies in the frequency domain with log-linear complexity.
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
Yongming Rao, Wenliang Zhao, Benlin Liu, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh
Conference on Neural Information Processing Systems (NeurIPS), 2021
[arXiv] [Code] [Project Page] [Zhihu]
We present a dynamic token sparsification framework to prune redundant tokens in vision transformers progressively and dynamically based on the input.
Towards Interpretable Deep Metric Learning with Structural Matching
Wenliang Zhao*, Yongming Rao*, Ziyi Wang, Jiwen Lu, Jie Zhou
IEEE/CVF International Conference on Computer Vision (ICCV), 2021
[arXiv] [Code]
We present a framework (DIML) to add interpretability to metric learning and improve the performance of deep metric learning models.
Group-aware Contrastive Regression for Action Quality Assessment
Xumin Yu*, Yongming Rao*, Wenliang Zhao, Jiwen Lu, Jie Zhou
IEEE/CVF International Conference on Computer Vision (ICCV), 2021
We propose a new contrastive regression (CoRe) framework that learns relative scores through pairwise comparison, which highlights the differences between videos and guides the model to learn the key cues for assessment.
Honors and Awards
2020 Outstanding Undergraduate, Tsinghua University
2018 Tang Lixin Scholarship, Tsinghua University
2019 Tsinghua Presidential Award Nomination, Tsinghua University
2018 Zheng Weimin Scholarship, Tsinghua University
2018 Jiang Nanxiang Scholarship, Tsinghua University
2018 1st Prize in the 36th Challenge Cup, Tsinghua University
2017 Qualcomm Scholarship
2017 National Scholarship, Tsinghua University
Academic Services
Conference Reviewer: CVPR 2022, ECCV 2022, ICME 2022, ICCV 2023, NeurIPS 2023, ICLR 2024
Journal Reviewer: IEEE Transactions on Image Processing (T-IP)