Delin Qu - Homepage

Delin Qu 0010 屈德林

dlqu22 at m dot fudan dot edu dot cn

I am a fourth-year Ph.D. candidate at the School of Computer Science, Fudan University (FDU) and Shanghai AI Laboratory, Shanghai, China. I'm fortunate to be advised by Prof. Xuelong Li and to be part of the EO-Robotics Team. My research focuses on Embodied AI and 3D Computer Vision, with a long-term vision of achieving L2-level Physical Intelligence. I'm excited about the prospect of a "GPT moment" in Embodied AI, where AI systems can learn to interact with the physical world in a more human-like way.

I was a research intern at Shanghai AI Laboratory with Prof. Xuelong Li. In Dec 2024, I secured the National Natural Science Foundation of China (NSFC) grant to support my research.

I will finish my PhD in fall 2027, and I am actively looking for research internships or exciting startup opportunities.

News

Dec 2025 🏆 Our Team Comet wins 2nd place in the BEHAVIOR Challenge led by Dr Fei-Fei Li and the Stanford University team. 👏 Check our solution here! Openpi-Comet.
Dec 2025 Our paper FreeGaussian is accepted to AAAI 2026 as an Oral paper.
Sep 2025 We introduce EO-1, an open-source unified embodied foundation model trained on interleaved embodied multimodal data. It integrates discrete auto-regressive decoding with continuous flow matching for multimodal embodied reasoning and robot control. EO-Robotic Team: http://eo-robotics.ai.
Apr 2025 Our paper SpatialVLA is accepted to Robotics: Science and Systems (RSS) 2025. I’ll be presenting a short Spotlight Talk at RSS in LA. Meanwhile, I’m actively exploring internships or visiting opportunities. Open to exciting collaborations; DM or email me!
Mar 2025: I was awarded the Top Outstanding Ph.D. Student Scholarship of Fudan University (top 1%).
Feb 2025 Our Lifelong Robot paper Think Small, Act Big is accepted at CVPR 2025.
Mar 2025 I will also be a research intern at AgiBot this summer, working on foundation vision-language-action models.
Nov 2024 I have secured the National Natural Science Foundation of China (NSFC) grant to support my research. Deeply grateful to my supervisor and research team @IPEC.
Dec 2024 Our paper Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding is accepted to AAAI 2025 as an Oral paper (top 4.6%).
Sep 2024 Our paper LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control is accepted at NeurIPS 2024.
May 2024: We integrated KAN into NeRF and conducted a preliminary evaluation 😆: Hands-On NeRF with KAN!
Feb 2024 Our papers GS-SLAM and EN-SLAM are both accepted to CVPR 2024 as Highlight papers (top 2.6% × 2).
Oct 2023: I was awarded the Tencent Scholarship (top 0.1%).
Jun 2023: Our paper Towards Nonlinear-Motion-Aware and Occlusion-Robust Rolling Shutter Correction got accepted to ICCV 2023.
Jun 2023: The extension of my undergraduate thesis, Fast Rolling Shutter Correction in the Wild, got accepted to TPAMI!
Mar 2023: Our paper Revisiting Rolling Shutter Bundle Adjustment: Toward Accurate and Fast Solution got accepted to CVPR 2023.
Sep 2022: Started my PhD journey at Fudan University!


Research

	EO-1: Interleaved Vision-Text-Action Pretraining for General Robot Control Delin Qu, Haoming Song, Qizhi Chen, Zhaoqin Chen, Xianqiang Gao, Guanghui Ren, Maoqing Yao, Bin Zhao, Dong Wang, et al. EO-Robotic Team An open-source unified embodied foundation model that integrates discrete auto-regressive decoding with continuous flow matching for embodied reasoning and robot control.
	Hume: Introducing System-2 Thinking in Visual-Language-Action Model Haoming Song, Delin Qu, Yuanqi Yao, Qizhi Chen, Xinyi Ye, Qi Lv, Xianqiang Gao*, Guanghui Ren, Maoqing Yao, Bin Zhao, Dong Wang, Xuelong Li, paper \| project page \| video \| code \| model A spatial-enhanced vision-language-action model trained on 1.1 Million real robot episodes, purely huggingFace-based, concise code with efficient performance.
	SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model Delin Qu, Haoming Song, Qizhi Chen, Yuanqi Yao, Xinyi Ye, Jiayuan Gu, Bin Zhao, Dong Wang, Xuelong Li, Robotics: Science and Systems (RSS), 2025 (Spotlight) paper \| project page \| video \| code \| model A spatial-enhanced vision-language-action model trained on 1.1 million real robot episodes, purely HuggingFace-based, with concise code and efficient performance.
	GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting Chi Yan, Delin Qu, Dan Xu, Bin Zhao, Dong Wang, Zhigang Wang, Xuelong Li, Conference on Computer Vision and Pattern Recognition (CVPR), 2024, (Spotlight, top 2.6%) paper \| project page \| video \| code The first work to use a 3D Gaussian representation in a Simultaneous Localization and Mapping (SLAM) system.
	Implicit Event-RGBD Neural SLAM Delin Qu, Chi Yan, Yin Jie, Qizhi Chen, Bin Zhao, Dong Wang, Zhigang Wang, Dan Xu, Xuelong Li, Conference on Computer Vision and Pattern Recognition (CVPR), 2024, (Spotlight, top 2.6%) paper \| project page \| video \| code \| dataset The first event-RGBD implicit neural SLAM that leverages event stream and RGBD to overcome challenges in motion blur and lighting variation scenes.
	Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding Xianqiang Gao, Pingrui Zhang, Delin Qu, Zhigang Wang, Yan Ding, Dong Wang, Bin Zhao, Xuelong Li, the Association for the Advancement of Artificial Intelligence (AAAI Oral, top 4.6%), 2025, paper \| project page \| code A Multi-Image Guided Invariant Feature Aware 3D Affordance Grounding (MIFAG) framework.
	Fast Rolling Shutter Correction in the Wild Delin Qu, Bangyan Liao, Yifei Xue, Huiqing Zhang, Omar Ait Aider, Yizhen Lao. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023 paper \| project page \| video \| code \| dataset A pixel-wise varying direct RS correction framework that handles locally varying distortion caused by various sources, such as camera motion, moving objects, and even highly varying depth scenes.
	FreeGaussian: Guidance-free Controllable 3D Gaussian Splats with Flow Derivatives Qizhi Chen, Delin Qu, Haoming Song, Yiwen Tang, Dong Wang, Bin Zhao, Xuelong Li, the Association for the Advancement of Artificial Intelligence (AAAI Oral, top 4.6%), 2026, paper \| project page \| video \| code An annotation guidance-free method, dubbed FreeGaussian, that mathematically derives dynamic Gaussian motion from optical flow and camera motion using novel dynamic Gaussian constraints.
	LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control Delin Qu, Qizhi Chen, Pingrui Zhang, Xianqiang Gao, Dong Wang, Xuelong Li, Conference on Neural Information Processing Systems (NeurIPS), 2024 paper \| project page \| video \| code \| dataset Embeds language features into interactive scenes, grounding and manipulating interactable objects with language instructions.
	Towards Nonlinear-Motion-Aware and Occlusion-Robust Rolling Shutter Correction Delin Qu, Yizhen Lao, Bin Zhao, Zhigang Wang, Dong Wang, Xuelong Li, Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023 paper \| project page \| video \| code A geometry-based Quadratic Rolling Shutter (QRS) motion solver, which precisely estimates the high-order correction field of individual pixels.
	Revisiting Rolling Shutter Bundle Adjustment: Toward Accurate and Fast Solution Bangyan Liao, Delin Qu, Yifei Xue, Huiqing Zhang, Yizhen Lao. Conference on Computer Vision and Pattern Recognition (CVPR), 2023 paper \| project page \| video \| code An accurate and fast bundle adjustment solution that estimates the 6-DoF pose with an independent RS model of the camera and the geometry of the environment based on measurements from a rolling shutter camera.

Invited Talks

Exploring Spatial Representations for Visual-Language-Action Model
Institute of Artificial Intelligence (TeleAI), China Telecom, hosted by Chenjia Bai, Mar 2025

A spatial-enhanced vision-language-action model trained on 1.1 million real robot episodes, toward a more generalist agent system. slides

Selected Projects

	🏆 Openpi Comet: Competition Solution For 2025 BEHAVIOR Challenge Delin Qu, Qizhi Chen, Shangkun Sun*, Zhaoshuo Li†, Yu-Wei Chao, Xiaohui Zeng, Xuan Li, Junjie Bai, Tsung-Yi Lin, Ming-Yu Liu†, paper \| video \| code \| 🏆 Award We build our solution systematically by studying the effects of training techniques and data, and show how scaling in the pre-training and post-training phases yields competitive performance in the Behavior 1K Challenge.
	FastUMI: A Scalable and Hardware-Independent Universal Manipulation Interface with Dataset Zhaxizhuoma, Kehui Liu, et al., Delin Qu, Dong Wang, Yan Ding, Bin Zhao, Xuelong Li paper \| project page \| video \| code A substantial redesign of the Universal Manipulation Interface system, enabling rapid deployment and delivering robust performance in real-world data acquisition.
	Optics-driven drone Xuelong Li, Guan Huang, Zhigang Wang, Delin Qu, Bin Zhao Science China. Information Sciences, 67(2), 124201, 2024 paper \| project page \| video \| code A remote charging technology for drones to enhance their autonomy and intelligence during mission execution
	Large Model Heterogeneous Intelligent Agent Systems Kehui Liu, Zixin Tang, et al., Delin Qu, Dong Wang, Zhigang Wang, Bin Zhao, Xuelong Li International Conference on Intelligent Robots and Systems (IROS), 2025 paper \| project page \| video \| code A novel LLM-based task planning framework for collaboration of heterogeneous multi-robot systems including quadrotors, robotic dogs, and robotic arms.
	Any4LeRobot: A tool collection for LeRobot paper \| project page \| video \| code A curated collection of utilities for LeRobot projects, including data conversion scripts, preprocessing tools, and training workflow helpers.

Honors & Awards

Sep 2022 - Now: National Natural Science Foundation of China (NSFC) grant in 2024, Second place in the Behavior 1K Challenge in 2025, Top Outstanding PhD Student Scholarship of Fudan University in 2025, Tencent Scholarship in 2023, Fudan University Master's Excellence Scholarship in 2022, Outstanding Student Award in 2023, Fudan University's Outstanding Youth League Member in 2024.
Sep 2018 - Jun 2022: National Scholarship in 2021, National Scholarship in 2020, National Inspirational Scholarship in 2019, Finalist Prize of Mathematical Contest in Modeling, Second Prize of Asia-Pacific Mathematical Contest in Modeling, Second Prize in National Internet of Things Design Contest, Second Prize in Internet Competition of Hunan Province, Excellence Award in the Huawei AI Cloud Cup, Huawei College Scholarship, Huawei Smart Base Future Star, Excellent Graduation Thesis.

Academic Services

Reviewer: TPAMI, CVPR, ICCV, ECCV, ICLR, ICML, and NeurIPS.
2023 Spring: COMP130135.04 Object Oriented Programming, Teaching Assistant.
