Liwei Wang

I am an Assistant Professor in the Department of Computer Science and Engineering at The Chinese University of Hong Kong (CUHK). Before coming to Hong Kong, I worked for more than two years as a Senior Researcher at Tencent America in Bellevue, WA, US.

I received my PhD from the Department of Computer Science at the University of Illinois Urbana-Champaign (UIUC), where I was advised by Prof. Svetlana Lazebnik. The Language and Vision (LaVi) Lab, which I founded in the Department of Computer Science and Engineering at CUHK, conducts research at the intersection of language and vision.

If you want to join LaVi Lab, please send an email to lwwang@cse.cuhk.edu.hk.

Email / Google Scholar / Publications / Lab website (coming soon)



News
  • We are hiring interns, postdocs, and PhD students to work on Vision+Language and multi-modal LLMs.
  • 2025/08: I will give a keynote talk at the MFMSI Workshop at ACM Multimedia 2025.
Recent Research Highlights

My students / interns / postdocs are indicated by '*'. See the full publication list.

Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors

Duo Zheng*, Shijia Huang*, Yanyang Li, Liwei Wang
arXiv 2025     Code

Learning to Reason from Feedback at Test-Time

Yanyang Li*, Michael R. Lyu, Liwei Wang
ACL 2025     Code

C2LEVA: Toward Comprehensive and Contamination-Free Language Model Evaluation

Yanyang Li, Tin Long Wong, Cheung To Hung, Jianqiao Zhao, Duo Zheng, Ka Wai Liu, Michael R. Lyu, Liwei Wang
ACL 2025 Findings     Project

Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding

Duo Zheng*, Shijia Huang*, Liwei Wang
CVPR 2025     Code

AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning

Yiwu Zhong*, Zhuoming Liu, Yin Li, Liwei Wang
ICCV 2025     Code

Fine-grained Spatiotemporal Grounding on Egocentric Videos

Shuo Liang, Yiwu Zhong, Zi-Yuan Hu, Yeyao Tao, Liwei Wang
ICCV 2025     Code

Towards Learning a Generalist Model for Embodied Navigation

Duo Zheng*, Shijia Huang*, Lin Zhao, Yiwu Zhong, Liwei Wang
CVPR 2024 (Poster Highlight)     Code

A Mutual Supervision Framework for Referring Expression Segmentation and Generation

Shijia Huang*, Feng Li, Hao Zhang, Shilong Liu, Lei Zhang, Liwei Wang
IJCV 2024

Making Long-Context Language Models Better Multi-Hop Reasoners

Yanyang Li*, Shuo Liang*, Michael R. Lyu, Liwei Wang
ACL 2024     Code

Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models

Yiwu Zhong*, Ziyuan Hu*, Michael R. Lyu, Liwei Wang
EMNLP 2024     Code

Enhancing Temporal Modeling of Video LLMs via Time Gating

Zi-Yuan Hu*, Yiwu Zhong, Shijia Huang, Michael R. Lyu, Liwei Wang
EMNLP 2024 Findings     Code

Learning Preference Model for LLMs via Automatic Preference Data Generation

Shijia Huang*, Jianqiao Zhao, Yanyang Li, Liwei Wang
EMNLP 2023 Long Paper

CLEVA: Chinese Language Models EVAluation Platform

Yanyang Li*, Jianqiao Zhao, Duo Zheng, Zi-Yuan Hu, Zhi Chen, Xiaohui Su, Yongfeng Huang, Shijia Huang, Dahua Lin, Michael R. Lyu, Liwei Wang
EMNLP 2023 System Demonstration     Project

VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control

Ziyuan Hu*, Yanyang Li*, Michael R. Lyu, Liwei Wang
ICCV 2023     Code

Multi-View Transformer for 3D Visual Grounding

Shijia Huang*, Yilun Chen, Jiaya Jia, Liwei Wang
CVPR 2022     Code

Eliciting Knowledge from Large Pre-Trained Models for Unsupervised Knowledge-Grounded Conversation

Yanyang Li*, Jianqiao Zhao*, Michael R. Lyu, Liwei Wang
EMNLP 2022 Long Paper     Code

FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows

Jianqiao Zhao*, Yanyang Li*, Wanyu Du*, Yangfeng Ji, Dong Yu, Michael R. Lyu, Liwei Wang
EMNLP 2022 Long Paper     Code and Dataset

Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency

Yanyang Li*, Fuli Luo, Runxin Xu, Songfang Huang, Fei Huang, Liwei Wang
ACL 2022 Long Paper

SAT: 2D Semantics Assisted Training for 3D Visual Grounding

Zhengyuan Yang, Songyang Zhang, Liwei Wang, Jiebo Luo
ICCV 2021 (Oral Presentation)     Code

Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation

Liwei Wang, Jing Huang, Yin Li, Kun Xu, Zhengyuan Yang, Dong Yu
CVPR 2021     Code

Robust Dialogue Utterance Rewriting as Sequence Tagging

Jie Hao, Linfeng Song, Liwei Wang, Kun Xu, Zhaopeng Tu, Dong Yu
EMNLP 2021     Code

Comprehensive Image Captioning via Scene Graph Decomposition

Yiwu Zhong*, Liwei Wang, Jianshu Chen, Dong Yu, Yin Li
ECCV 2020     Code

Improving One-stage Visual Grounding by Recursive Sub-query Construction

Zhengyuan Yang, Tianlang Chen, Liwei Wang, Jiebo Luo
ECCV 2020     Code

MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning

Jie Lei*, Liwei Wang, Yelong Shen, Dong Yu, Tamara Berg, Mohit Bansal
ACL 2020     Code

A Fast and Accurate One-Stage Approach to Visual Grounding

Zhengyuan Yang*, Boqing Gong, Liwei Wang, Wenbing Huang, Dong Yu, Jiebo Luo
ICCV 2019 (Oral Presentation)     Code

Fast, Diverse and Accurate Image Captioning Guided By Part-of-Speech

Aditya Deshpande, Jyoti Aneja, Liwei Wang, Alexander Schwing, D. A. Forsyth
CVPR 2019 (Oral Presentation)

Learning Two-Branch Neural Networks for Image-Text Matching Tasks

Liwei Wang, Yin Li, Jing Huang, Svetlana Lazebnik
TPAMI 2018     Code

Learning structural motif representations for efficient protein structure search

Yang Liu, Qing Ye, Liwei Wang, Jian Peng
Bioinformatics 2018     Code

Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space

Liwei Wang, Alex Schwing, Svetlana Lazebnik
NeurIPS 2017

Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models

Bryan Plummer, Liwei Wang, Chris M. Cervantes, Juan C. Caicedo, Julia Hockenmaier, Svetlana Lazebnik
IJCV 2016     Project

Learning Deep Structure-Preserving Image-Text Embeddings

Liwei Wang, Yin Li, Svetlana Lazebnik
CVPR 2016     Code

(Template adapted from Arun's webpage.)