Liwei Wang

I am an Assistant Professor in the Department of Computer Science and Engineering at The Chinese University of Hong Kong (CUHK). Before coming to Hong Kong, I worked for more than two years as a Senior Researcher at Tencent America in Bellevue, WA, US.

I received my PhD from the Department of Computer Science at the University of Illinois Urbana-Champaign (UIUC), where I was advised by Prof. Svetlana Lazebnik. The Language and Vision (LaVi) Lab, which I founded in the Department of Computer Science and Engineering at CUHK, conducts research at the intersection of language and vision.

If you want to join LaVi Lab, please send an email to lwwang@cse.cuhk.edu.hk.

Email / Google Scholar / Publications / Lab website (coming soon)



News
  • We are hiring interns, postdocs, and PhD students to work on Vision+Language and multi-modal LLMs.
  • 2025/08: I will give a keynote talk at the MFMSI Workshop at ACM Multimedia 2025.
Recent Research Highlights

My students / interns / postdocs are indicated by '*'. See the full publication list.

Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors

Duo Zheng*, Shijia Huang*, Yanyang Li, Liwei Wang
arXiv 2025     Code

Learning to Reason from Feedback at Test-Time

Yanyang Li*, Michael R. Lyu, Liwei Wang
ACL 2025     Code

C2LEVA: Toward Comprehensive and Contamination-Free Language Model Evaluation

Yanyang Li, Tin Long Wong, Cheung To Hung, Jianqiao Zhao, Duo Zheng, Ka Wai Liu, Michael R. Lyu, Liwei Wang
ACL 2025 Findings     Project

Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding

Duo Zheng*, Shijia Huang*, Liwei Wang
CVPR 2025     Code

AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning

Yiwu Zhong*, Zhuoming Liu, Yin Li, Liwei Wang
ICCV 2025     Code

Fine-grained Spatiotemporal Grounding on Egocentric Videos

Shuo Liang, Yiwu Zhong, Zi-Yuan Hu, Yeyao Tao, Liwei Wang
ICCV 2025     Code

Towards Learning a Generalist Model for Embodied Navigation

Duo Zheng*, Shijia Huang*, Lin Zhao, Yiwu Zhong, Liwei Wang
CVPR 2024 (Poster Highlight)     Code

A Mutual Supervision Framework for Referring Expression Segmentation and Generation

Shijia Huang*, Feng Li, Hao Zhang, Shilong Liu, Lei Zhang, Liwei Wang
IJCV 2024

Making Long-Context Language Models Better Multi-Hop Reasoners

Yanyang Li*, Shuo Liang*, Michael R. Lyu, Liwei Wang
ACL 2024     Code

Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models

Yiwu Zhong*, Ziyuan Hu*, Michael R. Lyu, Liwei Wang
EMNLP 2024     Code

Enhancing Temporal Modeling of Video LLMs via Time Gating

Zi-Yuan Hu*, Yiwu Zhong, Shijia Huang, Michael R. Lyu, Liwei Wang
EMNLP 2024 Findings     Code

Learning Preference Model for LLMs via Automatic Preference Data Generation

Shijia Huang*, Jianqiao Zhao, Yanyang Li, Liwei Wang
EMNLP 2023 Long Paper

CLEVA: Chinese Language Models EVAluation Platform

Yanyang Li*, Jianqiao Zhao, Duo Zheng, Zi-Yuan Hu, Zhi Chen, Xiaohui Su, Yongfeng Huang, Shijia Huang, Dahua Lin, Michael R. Lyu, Liwei Wang
EMNLP 2023 System Demonstration     Project

VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control

Ziyuan Hu*, Yanyang Li*, Michael R. Lyu, Liwei Wang
ICCV 2023     Code

Multi-View Transformer for 3D Visual Grounding

Shijia Huang*, Yilun Chen, Jiaya Jia, Liwei Wang
CVPR 2022     Code

Eliciting Knowledge from Large Pre-Trained Models for Unsupervised Knowledge-Grounded Conversation

Yanyang Li*, Jianqiao Zhao*, Michael R. Lyu, Liwei Wang
EMNLP 2022 Long Paper     Code

FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows

Jianqiao Zhao*, Yanyang Li*, Wanyu Du*, Yangfeng Ji, Dong Yu, Michael R. Lyu, Liwei Wang
EMNLP 2022 Long Paper     Code and Dataset

Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency

Yanyang Li*, Fuli Luo, Runxin Xu, Songfang Huang, Fei Huang, Liwei Wang
ACL 2022 Long Paper

SAT: 2D Semantics Assisted Training for 3D Visual Grounding

Zhengyuan Yang, Songyang Zhang, Liwei Wang, Jiebo Luo
ICCV 2021 (Oral Presentation)     Code

Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation

Liwei Wang, Jing Huang, Yin Li, Kun Xu, Zhengyuan Yang, Dong Yu
CVPR 2021     Code

Robust Dialogue Utterance Rewriting as Sequence Tagging

Jie Hao, Linfeng Song, Liwei Wang, Kun Xu, Zhaopeng Tu, Dong Yu
EMNLP 2021     Code

Comprehensive Image Captioning via Scene Graph Decomposition

Yiwu Zhong*, Liwei Wang, Jianshu Chen, Dong Yu, Yin Li
ECCV 2020     Code

Improving One-stage Visual Grounding by Recursive Sub-query Construction

Zhengyuan Yang, Tianlang Chen, Liwei Wang, Jiebo Luo
ECCV 2020     Code

MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning

Jie Lei*, Liwei Wang, Yelong Shen, Dong Yu, Tamara Berg, Mohit Bansal
ACL 2020     Code

A Fast and Accurate One-Stage Approach to Visual Grounding

Zhengyuan Yang*, Boqing Gong, Liwei Wang, Wenbing Huang, Dong Yu, Jiebo Luo
ICCV 2019 (Oral Presentation)     Code

Fast, Diverse and Accurate Image Captioning Guided By Part-of-Speech

Aditya Deshpande, Jyoti Aneja, Liwei Wang, Alexander Schwing, D. A. Forsyth
CVPR 2019 (Oral Presentation)

Learning Two-Branch Neural Networks for Image-Text Matching Tasks

Liwei Wang, Yin Li, Jing Huang, Svetlana Lazebnik
TPAMI 2018     Code

Learning structural motif representations for efficient protein structure search

Yang Liu, Qing Ye, Liwei Wang, Jian Peng
Bioinformatics 2018     Code

Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space

Liwei Wang, Alex Schwing, Svetlana Lazebnik
NeurIPS 2017

Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models

Bryan Plummer, Liwei Wang, Chris M. Cervantes, Juan C. Caicedo, Julia Hockenmaier, Svetlana Lazebnik
IJCV 2016     Project

Learning Deep Structure-Preserving Image-Text Embeddings

Liwei Wang, Yin Li, Svetlana Lazebnik
CVPR 2016     Code

(Template adapted from Arun's webpage.)