Bingyu Li · Ph.D. Student at USTC

Bingyu Li

I am a Ph.D. student at the University of Science and Technology of China (USTC), supervised by Prof. Xuelong Li. My research focuses on open-world visual perception, world action models, and generative video models. The long-term goal behind these directions is simple: building intelligence for an open and dynamic world.

Hefei, China
Open-world · Dynamic · Generative

My research explores how machines can perceive open-ended scenes, associate semantic entities through time, model action-conditioned dynamics, and generate plausible futures.

I am particularly interested in connecting these capabilities into a coherent framework for visual intelligence, spanning fine-grained perception, temporal reasoning, world modeling, and controllable generation.

Perceive the environment, associate entities through time, model spatial dynamics, and generate plausible futures.
4 × CVPR 2026: Two first-author papers among four accepted works.
AAAI 2026 Oral: Research on efficient open-vocabulary segmentation.
National Scholarship: Awarded in 2025.

* denotes equal contribution. A complete list is available on Google Scholar.

Open-World Perception

Selected publications
MTRefSeg
An Open-Source Benchmark and Baseline for Multi-temporal Referring Segmentation · arXiv 2026
Bingyu Li*, Da Zhang*, Tao Huo, Zhiyuan Zhao, Junyu Gao, Xuelong Li

MTRefSeg introduces Multi-temporal Referring Segmentation (MTRS), a task in which models compare temporally related images and segment the changed region described by a language expression. The work builds MTRefSeg-21K, a benchmark with 21K bi-image-text-mask annotations, and proposes MTRefSeg-R1, a change-aware LVLM trained with vision-only temporal pretraining followed by referring multi-temporal fine-tuning.

Generative Video Models

Selected publications

World Action Models

Ongoing direction

National Scholarship

National Scholarship for Graduate Students, 2025.

Oral Presentations

AAAI 2026 Oral and ACM MM 2025 Oral.

Journal Reviewer

IEEE TPAMI, IEEE TNNLS, IEEE TGRS, Pattern Recognition, and related journals.

Conference Reviewer

CVPR, NeurIPS, ICLR, and other major conferences.