Wangxuan Institute of Computer Technology
Peking University
No. 128 Zhongguancun North Street,
Haidian District, Beijing, 100871, China

E-mail: yangliu "at" pku.edu.cn

I am a tenure-track Assistant Professor (Ph.D. supervisor) at the Wangxuan Institute of Computer Technology, Peking University, and a Peking University Boya Young Fellow. I am also a member of the MIPL Group (led by Prof. Yuxin Peng) at Peking University.

Before joining Peking University, I was a Postdoctoral Researcher in the Visual Geometry Group (VGG) at the University of Oxford, supervised by Prof. Andrew Zisserman. I received my PhD and MPhil in Advanced Computer Science from the University of Cambridge, and my B.Eng. in Telecommunication Engineering from Beijing University of Posts and Telecommunications (BUPT).

My research interests include computer vision, natural language processing, and machine learning, with an emphasis on how these areas can best work together to perform real-world tasks. Below are some of my recent research topics:

  • Intersection of vision and language (retrieval, captioning, visual grounding, visual question answering)
  • Image and video semantic analysis (image and video classification, detection, segmentation)
  • Machine learning (deep learning, multi-modal learning, transfer learning, self-supervised learning)

We are always actively recruiting postdocs, prospective graduate students, and interns!

You are welcome to contact me with your detailed CV. Please read this Note first!


    *: equal contribution, : corresponding author

    Scalable Object Relation Encoding for Better 3D Spatial Reasoning in Large Language Models
    Shengli Zhou, Minghang Zheng, Feng Zheng, Yang Liu
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026  
    [PDF]   [Project Page]   [Code]   [Bibtex]


    OmniVTG: A Large-Scale Dataset and Training Paradigm for Open-World Video Temporal Grounding
    Minghang Zheng, Zihao Yin, Yi Yang, Yuxin Peng, Yang Liu
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026  
    [PDF]   [Project Page]   [Code]   [Model]   [Bibtex]


    Confidence-Aware Pseudo-Label Self-Correction for Weakly Supervised Visual Grounding
    Yang Liu, Jiahua Zhang, Yue Wu, Zijing Zhao, Qingchao Chen, Yuxin Peng
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2026  
    [PDF]   [Code]   [WeChat Official Account]   [Bibtex]


    Identity-Preserving Text-to-Video Generation via Training-Free Prompt, Image, and Guidance Enhancement
    Winner of ACM-MM 2025 Identity-Preserving Video Generation Challenge
    Jiayi Gao, Changcheng Hua, Qingchao Chen, Yuxin Peng, Yang Liu
    ACM International Conference on Multimedia (ACM-MM), 2025  
    [PDF]   [Project Page]   [Code]   [Challenge]   [Bibtex]


    ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools
    Winner of CVPR-2025 Cultural VQA challenge
    Shaofeng Yin, Ting Lei, Yang Liu
    International Conference on Computer Vision (ICCV), 2025  
    [PDF]   [Project Page]   [Code]   [Challenge]   [WeChat Official Account]   [Bibtex]


    Weakly and Single-Frame Supervised Temporal Sentence Grounding With Gaussian-Based Contrastive Proposal Learning
    Minghang Zheng, Yanjie Huang, Qingchao Chen, Yuxin Peng, Yang Liu
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025  
    [PDF]   [Bibtex]


    Large-Scale Pre-trained Models Empowering Phrase Generalization in Temporal Sentence Localization
    Yang Liu, Minghang Zheng, Qingchao Chen, Shaogang Gong, Yuxin Peng
    International Journal of Computer Vision (IJCV), 2025  
    [PDF]   [Bibtex]

    Hierarchical Event Memory for Accurate and Low-latency Online Video Temporal Grounding
    Minghang Zheng, Yuxin Peng, Benyuan Sun, Yi Yang, Yang Liu
    International Conference on Computer Vision (ICCV), 2025  
    [PDF]   [Project Page]   [Code]   [Bibtex]


    Open-Vocabulary HOI Detection with Interaction-aware Prompt and Concept Calibration
    Ting Lei, Shaofeng Yin, Qingchao Chen, Yuxin Peng, Yang Liu
    International Conference on Computer Vision (ICCV), 2025  
    [PDF]   [Project Page]   [Code]   [WeChat Official Account]   [Bibtex]


    Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced In-domain Knowledge Transferring
    Zhu Xu, Ting Lei, Zhimin Li, Guan Wang, Qingchao Chen, Yuxin Peng, Yang Liu
    International Conference on Computer Vision (ICCV), 2025  
    [PDF]   [Project Page]   [Code]   [Bibtex]


    AR-VRM: Imitating Human Motions for Visual Robot Manipulation with Analogical Reasoning
    Dejie Yang, Zijing Zhao, Yang Liu
    International Conference on Computer Vision (ICCV), 2025  
    [PDF]   [Project Page]   [Code]   [WeChat Official Account]   [Bibtex]


    VideoLLaMB: Long Streaming Video Understanding with Recurrent Memory Bridges
    Yuxuan Wang, Yiqi Song, Cihang Xie, Yang Liu, Zilong Zheng
    International Conference on Computer Vision (ICCV), 2025  
    [PDF]   [Project Page]   [Code]   [WeChat Official Account]   [Bibtex]


    DisTime: Distribution-based Time Representation for Video Large Language Models
    Yingsen Zeng, Zepeng Huang, Yujie Zhong, Chengjian Feng, Jie Hu, Lin Ma, Yang Liu
    International Conference on Computer Vision (ICCV), 2025  
    [PDF]   [Dataset]   [Code]   [Bibtex]


    Balancing Preservation and Modification: A Region and Semantic Aware Metric for Instruction-Based Image Editing
    Zhuoying Li, Zhu Xu, Yuxin Peng, Yang Liu
    International Conference on Machine Learning (ICML), 2025  
    [PDF]   [Project Page]   [Code]   [Video]   [Bibtex]


    ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer
    Jiayi Gao, Zijin Yin, Changcheng Hua, Yuxin Peng, Kongming Liang, Zhanyu Ma, Jun Guo, Yang Liu
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025  
    [PDF]   [Project Page]   [Code]   [Video]   [Bibtex]


    Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation
    Yiming Qin, Zhu Xu, Yang Liu
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025  
    [PDF]   [Project Page]   [Code]   [Video]   [WeChat Official Account]   [Bibtex]


    Customized Human Object Interaction Image Generation
    Zhu Xu, Zhaowen Wang, Yuxin Peng, Yang Liu
    ACM International Conference on Multimedia (ACM-MM), 2025  
    [PDF]   [Project Page]   [Code]   [Bibtex]


    InteractMove: Text-Controlled Human-Object Interaction Generation in 3D Scenes with Movable Objects
    Xinhao Cai, Minghang Zheng, Xin Jin, Yang Liu
    ACM International Conference on Multimedia (ACM-MM), 2025  
    [PDF]   [Project Page]   [Code]   [Bibtex]


    Advancing 3D Scene Understanding with MV-ScanQA Multi-View Reasoning Evaluation and TripAlign Pre-training Dataset
    Wentao Mo, Qingchao Chen, Yuxin Peng, Siyuan Huang, Yang Liu
    ACM International Conference on Multimedia (ACM-MM) Dataset Track, 2025 
    [PDF]   [Project Page]   [Code]   [Bibtex]


    Investigating Domain Gaps for Indoor 3D Object Detection
    Zijing Zhao, Zhu Xu, Qingchao Chen, Yuxin Peng, Yang Liu
    ACM International Conference on Multimedia (ACM-MM) Dataset Track, 2025  
    [PDF]   [Project Page]   [Code]   [Bibtex]


    Learn 3D VQA Better with Active Selection and Reannotation
    Shengli Zhou, Yang Liu, Feng Zheng
    ACM International Conference on Multimedia (ACM-MM), 2025  
    [PDF]   [Code]   [Bibtex]


    PlanLLM: Video Procedure Planning with Refinable Large Language Models
    Dejie Yang, Zijing Zhao, Yang Liu
    Conference on Artificial Intelligence (AAAI), 2025  
    [PDF]   [Project Page]   [Code]   [WeChat Official Account]   [Bibtex]


    Generative Video Diffusion for Unseen Cross-Domain Video Moment Retrieval
    Dezhao Luo, Shaogang Gong, Jiabo Huang, Hailin Jin, Yang Liu
    Conference on Artificial Intelligence (AAAI), 2025 
    [PDF]   [Project Page]   [Bibtex]


    3D Weakly Supervised Visual Grounding at Category and Instance Levels
    Xiaoqi Li, Jiaming Liu, Yandong Guo, Hao Dong, Yang Liu
    International Conference on Robotics and Automation (ICRA), 2025 
    [PDF]   [Bibtex]


    Zero Shot Domain Adaptive Semantic Segmentation by Synthetic Data Generation and Progressive Adaptation
    Jun Luo, Zijing Zhao, Yang Liu
    International Conference on Intelligent Robots and Systems (IROS), 2025
    [PDF]   [Code]   [Bibtex]


    Hierarchical Sub-action Tree for Continuous Sign Language Recognition
    Dejie Yang, Zhu Xu, Xinjie Gao, Yang Liu
    IEEE International Conference on Multimedia and Expo (ICME), 2025
    [PDF]   [Project Page]   [Code]   [Bibtex]


    Semantic-Aware Human Object Interaction Image Generation
    Zhu Xu, Qingchao Chen, Yuxin Peng, Yang Liu
    International Conference on Machine Learning (ICML), 2024  
    [PDF]   [Project Page]   [Video]   [Code]   [WeChat Official Account]   [Bibtex]


    Training Free Video Temporal Grounding using Large-scale Pre-trained Models
    Minghang Zheng, Xinhao Cai, Qingchao Chen, Yuxin Peng, Yang Liu
    European Conference on Computer Vision (ECCV), 2024  
    [PDF]   [Project Page]   [Video]   [Code]   [Bibtex]


    Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection
    Ting Lei, Shaofeng Yin, Yuxin Peng, Yang Liu
    European Conference on Computer Vision (ECCV), 2024  
    [PDF]   [Project Page]   [Video]   [Code]   [WeChat Official Account]   [Bibtex]


    Active Object Detection with Knowledge Aggregation and Distillation from Large Models
    Dejie Yang, Yang Liu
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024  
    [PDF]   [Project Page]   [Video]   [Code]   [Bibtex]


    Exploring the Potential of Large Foundation Models for Open-Vocabulary HOI Detection
    Ting Lei, Shaofeng Yin, Yang Liu
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024  
    [PDF]   [Project Page]   [Video]   [Code]   [Bibtex]


    OED: Towards One-stage End-to-End Dynamic Scene Graph Generation
    Guan Wang, Zhimin Li, Qingchao Chen, Yang Liu
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024  
    [PDF]   [Project Page]   [Video]   [Code]   [Bibtex]


    Diff-BGM: A Diffusion Model for Video Background Music Generation
    Sizhe Li, Yiming Qin, Minghang Zheng, Xin Jin, Yang Liu
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024  
    [PDF]   [Project Page]   [Video]   [Code]   [Bibtex]


    TeachText: CrossModal Text-Video Retrieval through Generalized Distillation
    Ioana Croitoru, Simion-Vlad Bogolin, Marius Leordeanu, Hailin Jin, Andrew Zisserman, Yang Liu, Samuel Albanie
    Artificial Intelligence Journal (AIJ), 2024  
    [PDF]   [Project Page]   [Code]


    3D Vision and Language Pretraining with Large-Scale Synthetic Data
    Dejie Yang, Zhu Xu, Wentao Mo, Qingchao Chen, Siyuan Huang, Yang Liu
    International Joint Conference on Artificial Intelligence (IJCAI), 2024  
    [PDF]   [Project Page]   [Video]   [Code]   [Bibtex]


    Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA
    Wentao Mo, Yang Liu
    Conference on Artificial Intelligence (AAAI), 2024  
    [PDF]   [Project Page]   [Code]   [Bibtex]


    Semantic-Guided Novel Category Discovery
    Weis