
All Publications

AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
Team AgiBot-World

arXiv 2025

DriveLM: Driving with Graph Visual Question Answering
Chonghao Sima, Katrin Renz, Kashyap Chitta, Li Chen, Hanxue Zhang, Chengen Xie, Ping Luo, Andreas Geiger, Hongyang Li

ECCV 2024 Oral

Generalized Predictive Model for Autonomous Driving
Jiazhi Yang, Shenyuan Gao, Yihang Qiu, Li Chen, Tianyu Li, Bo Dai, Kashyap Chitta, Penghao Wu, Jia Zeng, Ping Luo, Jun Zhang, Andreas Geiger, Yu Qiao, Hongyang Li

CVPR 2024 Highlight

OpenLane-V2: A Topology Reasoning Benchmark for Unified 3D HD Mapping
Huijie Wang, Tianyu Li, Yang Li, Li Chen, Chonghao Sima, Zhenbo Liu, Bangjun Wang, Peijin Jia, Yuting Wang, Shengyin Jiang, Feng Wen, Hang Xu, Ping Luo, Junchi Yan, Wei Zhang, Hongyang Li

NeurIPS 2023 Track Datasets and Benchmarks

PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark
Li Chen, Chonghao Sima, Yang Li, Zehan Zheng, Jiajie Xu, Xiangwei Geng, Hongyang Li, Conghui He, Jianping Shi, Yu Qiao, Junchi Yan

ECCV 2022 Oral

A large-scale car dataset for fine-grained categorization and verification
Linjie Yang, Ping Luo, Chen Change Loy, Xiaoou Tang

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Ablation Capability for Large Vision-Language Models
Shuo Liu, Kaining Ying, Hao Zhang, Yuqi Lin, Tianle Zhang, Chuanhao Li, Yu Qiao, Ping Luo, Wenqi Shao, Kaipeng Zhang

Advances in Neural Information Processing Systems

ConvBench: A multi-turn conversation evaluation benchmark with hierarchical capability for large vision-language models
Shuo Liu, Kaining Ying, Hao Zhang, Yue Yang, Yuqi Lin, Tianle Zhang, Chuanhao Li, Yu Qiao, Ping Luo, Wenqi Shao, Kaipeng Zhang

arXiv preprint arXiv:2403.20194

DeepAccident: A Large-Scale Accident Dataset for Multi-Vehicle Autonomous Driving
Tianqi Wang, Wenxuan Ji, Shoufa Chen, Chongjian Ge, Enze Xie, Ping Luo


DeepAccident: A motion and accident prediction benchmark for V2X autonomous driving
Tianqi Wang, Sukmin Kim, Ji Wenxuan, Enze Xie, Chongjian Ge, Junsong Chen, Zhenguo Li, Ping Luo

Proceedings of the AAAI Conference on Artificial Intelligence

Forensics-Bench: A Comprehensive Forgery Detection Benchmark Suite for Large Vision Language Models
Jin Wang, Chenghui Lv, Xian Li, Shichao Dong, Huadong Li, Chao Li, Wenqi Shao, Ping Luo

CVPR 2025

GUI Odyssey: A comprehensive dataset for cross-app GUI navigation on mobile devices
Quanfeng Lu, Wenqi Shao, Zitao Liu, Fanqing Meng, Boxuan Li, Botong Chen, Siyuan Huang, Kaipeng Zhang, Yu Qiao, Ping Luo

arXiv preprint arXiv:2406.08451

KET-QA: A Dataset for Knowledge Enhanced Table Question Answering
Mengkang Hu, Haoyu Dong, Ping Luo, Shi Han, Dongmei Zhang

arXiv preprint arXiv:2405.08099

Large-scale CelebFaces Attributes (CelebA) dataset
Ziwei Liu, Ping Luo, Xiaogang Wang, Xiaoou Tang

Retrieved August

LVLM-eHub: A comprehensive evaluation benchmark for large vision-language models
Peng Xu, Wenqi Shao, Kaipeng Zhang, Peng Gao, Shuo Liu, Meng Lei, Fanqing Meng, Siyuan Huang, Yu Qiao, Ping Luo

IEEE Transactions on Pattern Analysis and Machine Intelligence

MMT-Bench: A comprehensive multimodal benchmark for evaluating large vision-language models towards multitask AGI
Kaining Ying, Fanqing Meng, Jin Wang, Zhiqian Li, Han Lin, Yue Yang, Hao Zhang, Wenbo Zhang, Yuqi Lin, Shuo Liu, Jiayi Lei, Quanfeng Lu, Runjian Chen, Peng Xu, Renrui Zhang, Haozhe Zhang, Peng Gao, Yali Wang, Yu Qiao, Ping Luo, Kaipeng Zhang, Wenqi Shao

arXiv preprint arXiv:2404.16006

OmniMedVQA: A new large-scale comprehensive evaluation benchmark for medical LVLM
Yutao Hu, Tianbin Li, Quanfeng Lu, Wenqi Shao, Junjun He, Yu Qiao, Ping Luo

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

PhyBench: A physical commonsense benchmark for evaluating text-to-image models
Fanqing Meng, Wenqi Shao, Lixin Luo, Yahong Wang, Yiran Chen, Quanfeng Lu, Yue Yang, Tianshuo Yang, Kaipeng Zhang, Yu Qiao, Ping Luo

arXiv preprint arXiv:2406.11802

Plot2Code: A comprehensive benchmark for evaluating multi-modal large language models in code generation from scientific plots
Chengyue Wu, Yixiao Ge, Qiushan Guo, Jiahao Wang, Zhixuan Liang, Zeyu Lu, Ying Shan, Ping Luo

arXiv preprint arXiv:2405.07990

Towards world simulator: Crafting physical commonsense-based benchmark for video generation
Fanqing Meng, Jiaqi Liao, Xinyu Tan, Wenqi Shao, Quanfeng Lu, Kaipeng Zhang, Yu Cheng, Dianqi Li, Yu Qiao, Ping Luo

arXiv preprint arXiv:2410.05363

V2X-Seq: A large-scale sequential dataset for vehicle-infrastructure cooperative perception and forecasting
Haibao Yu, Wenxian Yang, Hongzhi Ruan, Zhenwei Yang, Yingjuan Tang, Xu Gao, Xin Hao, Yifeng Shi, Yifeng Pan, Ning Sun, Juan Song, Jirui Yuan, Ping Luo, Zaiqing Nie

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

T2I-CompBench++: An Enhanced and Comprehensive Benchmark for Compositional Text-to-Image Generation
Kaiyi Huang, Chengqi Duan, Kaiyue Sun, Enze Xie, Zhenguo Li, Xihui Liu

TPAMI 2025

T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation
Kaiyue Sun, Kaiyi Huang, Xian Liu, Yue Wu, Zihan Xu, Zhenguo Li, Xihui Liu

CVPR 2025

T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation
Lijun Li, Zhelun Shi, Xuhao Hu, Bowen Dong, Yiran Qin, Xihui Liu, Lu Sheng, Jing Shao

CVPR 2025

LVD-2M: A Long-take Video Dataset with Temporally Dense Captions
Tianwei Xiong, Yuqing Wang, Daquan Zhou, Zhijie Lin, Jiashi Feng, Xihui Liu

NeurIPS 2024

BEACON: Benchmark for Comprehensive RNA Tasks and Language Models
Yuchen Ren, Zhiyuan Chen, Lifeng Qiao, Hongtai Jing, Yuchen Cai, Sheng Xu, Peng Ye, Xinzhu Ma, Siqi Sun, Hongliang Yan, Dong Yuan, Wanli Ouyang, Xihui Liu

NeurIPS 2024

PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines
ZiDong Wang, Zeyu Lu, Di Huang, Tong He, Xihui Liu, Wanli Ouyang, Lei Bai

ECCV 2024

EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning
Yi Chen, Yuying Ge, Yixiao Ge, Mingyu Ding, Bohao Li, Rui Wang, Ruifeng Xu, Ying Shan, Xihui Liu

arXiv 2024

Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao

CVPR 2024

T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation
Kaiyi Huang, Kaiyue Sun, Enze Xie, Zhenguo Li, Xihui Liu

NeurIPS 2023

The ArtBench Dataset: Benchmarking Generative Models with Artworks
Peiyuan Liao*, Xiuyu Li*, Xihui Liu, Kurt Keutzer

arXiv 2022

Benchmark for Compositional Text-to-Image Synthesis
Dong Huk Park, Samaneh Azadi, Xihui Liu, Trevor Darrell, Anna Rohrbach

NeurIPS Datasets and Benchmarks 2021