Implicit Event-RGBD Neural SLAM

1Fudan University,   2Shanghai AI Laboratory,   3Shanghai Jiao Tong University

4Northwestern Polytechnical University,   5Hong Kong University of Science and Technology


TL;DR: EN-SLAM is the first event-RGBD implicit neural SLAM framework. It leverages the event stream together with RGBD input to overcome the challenges of scenes with motion blur and lighting variation. A differentiable CRF rendering technique maps a shared radiance field to both RGB and event data, and a temporal aggregating optimization exploits the consecutive difference constraints of the event stream.

Overview

Implicit neural SLAM has achieved remarkable progress recently. Nevertheless, existing methods face significant challenges in non-ideal scenarios, such as motion blur or lighting variation, which often lead to convergence failures, localization drift, and distorted mapping. To address these challenges, we propose EN-SLAM, the first event-RGBD implicit neural SLAM framework, which effectively leverages the high rate and high dynamic range of event data for tracking and mapping. Specifically, EN-SLAM introduces a differentiable CRF (Camera Response Function) rendering technique that generates distinct RGB and event camera data from a shared radiance field, which is optimized by learning a unified implicit representation under the captured event and RGBD supervision. Moreover, based on the temporal difference property of events, we propose a temporal aggregating optimization strategy for event joint tracking and global bundle adjustment. It capitalizes on the consecutive difference constraints of events and significantly enhances tracking accuracy and robustness. Finally, we construct the simulated dataset DEV-Indoors and the real captured dataset DEV-Reals, containing 6 scenes and 17 sequences with practical motion blur and lighting changes, for evaluation. Experimental results show that our method outperforms state-of-the-art (SOTA) methods in both tracking ATE and mapping ACC while running in real time at 17 FPS in various challenging environments.

Contributions:

  • We present EN-SLAM, the first event-RGBD implicit neural SLAM framework that efficiently leverages the event stream and RGBD to overcome challenges in scenes with extreme motion blur and lighting variation.
  • A differentiable CRF rendering technique is proposed to map a unified representation in the shared radiance field to RGB and event camera data, addressing the significant distinction between event and RGB data. A temporal aggregating optimization strategy that capitalizes on the consecutive difference constraints of the event stream is presented and significantly improves camera tracking accuracy and robustness.
  • We construct the simulated DEV-Indoors dataset and the real captured DEV-Reals dataset, containing 17 sequences with practical motion blur and lighting changes. A wide range of evaluations demonstrates competitive real-time performance under various challenging environments.
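
To make the idea of rendering both modalities from one shared radiance field concrete, here is a minimal NumPy sketch. It is not the EN-SLAM implementation: it assumes the standard event-camera model (a pixel responds to changes in log brightness, with each event representing one contrast-threshold step), and it replaces the learned differentiable CRF with a fixed gamma curve. All function names, the gamma value, and the contrast threshold are hypothetical and chosen only for illustration.

```python
import numpy as np

def apply_crf(radiance, gamma=2.2):
    """Hypothetical stand-in for a learned CRF: a fixed gamma curve
    mapping linear radiance in [0, 1] to image intensity in [0, 1]."""
    return np.clip(radiance, 0.0, 1.0) ** (1.0 / gamma)

def luminance(rgb):
    """Convert (..., 3) RGB values to luminance with Rec. 601 weights."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def predicted_log_change(radiance_prev, radiance_curr, eps=1e-5):
    """Log-brightness change between two renders of the shared field.
    This is the quantity an event camera responds to (assumed model)."""
    log_prev = np.log(luminance(radiance_prev) + eps)
    log_curr = np.log(luminance(radiance_curr) + eps)
    return log_curr - log_prev

def accumulate_events(polarities, threshold=0.2):
    """Sum signed event polarities per pixel over a time window and
    scale by the contrast threshold (assumed value) to obtain the
    measured log-brightness change. polarities has shape (N, H, W)."""
    return threshold * polarities.sum(axis=0)

def event_loss(predicted, measured):
    """Mean squared error between rendered and measured brightness change."""
    return float(np.mean((predicted - measured) ** 2))

# Two renders of the same 2x2 patch at consecutive timestamps.
prev = np.full((2, 2, 3), 0.25)
curr = np.full((2, 2, 3), 0.50)

rgb_image = apply_crf(curr)                  # supervised by the RGB frame
log_change = predicted_log_change(prev, curr)  # supervised by the events

# Three positive events per pixel within the window.
measured = accumulate_events(np.ones((3, 2, 2)))
loss = event_loss(log_change, measured)
```

The point of the sketch is that the RGB term and the event term are both functions of the same rendered radiance, so gradients from either sensor update one representation. The difference between consecutive renders is what ties the event stream to the camera motion between timestamps.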

Citation

@inproceedings{anonymous2024implicit,
    title={Implicit Event-{RGBD} Neural {SLAM}},
    author={Anonymous},
    booktitle={Conference on Computer Vision and Pattern Recognition 2024},
    year={2024},
    url={https://openreview.net/forum?id=OjQP10Pgnp}
}

