Visual Odometry in Rust (vors)

Matthieu Pizenberg
Pure Rust visual odometry algorithm for RGB-D camera tracking
2019

Code

Currently, visual-odometry-rs, abbreviated vors, provides a framework for direct RGB-D camera tracking. A research paper is intended but not yet finished. The code is available on GitHub at github.com/mpizenberg/visual-odometry-rs. Self-contained examples showing usage of the API live in the examples/ directory, together with a readme explaining them in more detail. Have a look at mpizenberg/rgbd-tracking-evaluation for more info about the dataset requirements of the binary program vors_track. The library is organized around four base namespaces:

  • core:: Core modules for computing gradients, candidate points, camera tracking, etc.
  • dataset:: Helper modules for handling specific datasets. Currently only provides a module for TUM RGB-D compatible datasets.
  • math:: Basic math modules for functionalities not already provided by nalgebra, like Lie algebra for so3, se3, and an iterative optimizer trait.
  • misc:: Helper modules for interoperability, visualization, and other things that did not fit elsewhere yet.
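To give a feel for what the math:: namespace covers, here is a self-contained sketch of the so(3) exponential map (Rodrigues' formula), the kind of Lie-algebra primitive used for parameterizing rotations in camera tracking. This is an illustrative example with my own names and plain arrays, not the vors API (which builds on nalgebra):

```rust
// Illustrative sketch of the so(3) exponential map (Rodrigues' formula).
// Not the vors API: types and names here are mine, for demonstration only.

type Mat3 = [[f64; 3]; 3];

/// Hat operator: map a rotation vector w to its skew-symmetric matrix.
fn hat(w: [f64; 3]) -> Mat3 {
    [
        [0.0, -w[2], w[1]],
        [w[2], 0.0, -w[0]],
        [-w[1], w[0], 0.0],
    ]
}

/// 3x3 matrix product.
fn mat_mul(x: Mat3, y: Mat3) -> Mat3 {
    let mut z = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            for k in 0..3 {
                z[i][j] += x[i][k] * y[k][j];
            }
        }
    }
    z
}

/// Exponential map so(3) -> SO(3) via Rodrigues' formula:
/// R = I + sin(t)/t * W + (1 - cos(t))/t^2 * W^2, with t = |w| and W = hat(w).
fn exp_so3(w: [f64; 3]) -> Mat3 {
    let t2 = w[0] * w[0] + w[1] * w[1] + w[2] * w[2];
    let t = t2.sqrt();
    let (a, b) = if t < 1e-8 {
        // Taylor expansions near zero to avoid dividing by ~0.
        (1.0 - t2 / 6.0, 0.5 - t2 / 24.0)
    } else {
        (t.sin() / t, (1.0 - t.cos()) / t2)
    };
    let wx = hat(w);
    let wx2 = mat_mul(wx, wx);
    let mut r = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            let id = if i == j { 1.0 } else { 0.0 };
            r[i][j] = id + a * wx[i][j] + b * wx2[i][j];
        }
    }
    r
}

fn main() {
    // A rotation of pi/2 around the z axis maps the x axis to the y axis.
    let r = exp_so3([0.0, 0.0, std::f64::consts::FRAC_PI_2]);
    let x = [1.0, 0.0, 0.0];
    let y: Vec<f64> = (0..3)
        .map(|i| (0..3).map(|k| r[i][k] * x[k]).sum())
        .collect();
    println!("{:?}", y); // ≈ [0.0, 1.0, 0.0]
}
```

The se3 counterpart in vors additionally handles translation, but the structure (hat operator, closed-form exponential, small-angle branch) is the same.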

Initially, this repository served as a personal experimental sandbox for computer vision in Rust. See, for example, my original questions on the Rust discourse and reddit channels. It turns out I struggled a bit at first, but then really came to like the Rust way of doing things, compared to C++.

As the name suggests, the focus is now on visual odometry, specifically on the recent research field of direct visual odometry. A reasonable introduction is available in these lecture slides by the Waterloo Autonomous Vehicles lab. This project initially aimed at improving on the work of DSO by J. Engel et al., while benefiting from the advantages of the Rust programming language.

I didn't intend to rewrite everything from scratch. I spent literally months dissecting DSO's code, trying to add improvements I had in mind, only to face memory crashes, unpredictable side effects of my additions to the existing code, and unanswered questions. That is when I decided to start from a clean slate in Rust. Building all this from the ground up took a lot of time and effort, but I think it is now mature enough to be shared as is. Beware, however, that the API might still evolve a lot (and irregularly). My hope is that in the near future we can improve the reach of this project by working both on research extensions and on platform availability.
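To make "direct" concrete: instead of matching sparse feature descriptors, direct methods align images by minimizing a photometric error over the camera pose. A standard formulation (notation mine, not from vors) is:

```latex
% For a reference image I_ref, a pixel p with known depth d_p, a current
% image I, the camera projection \pi, and a candidate pose T \in SE(3),
% the per-pixel photometric residual is
r_p(T) = I\big(\pi(T\,\pi^{-1}(p, d_p))\big) - I_{\mathrm{ref}}(p),
% and the pose is estimated by minimizing the weighted sum of squares
E(T) = \sum_p w_p \, r_p(T)^2
% with an iterative scheme such as Gauss-Newton or Levenberg-Marquardt,
% which is what the optimizer trait in math:: abstracts over.
```

Candidate points are typically chosen at pixels with strong image gradient, since flat regions contribute nothing to the residual.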

Example research extensions:

  • Using disparity search for depth initialization, to be compatible with RGB (no depth) cameras.
  • Adding a photometric term to the residual to account for automatic exposure variations (some work happening in the photometric branch).
  • Adding automatic photometric and/or geometric camera calibration.
  • Building a sliding window of keyframes optimization as in DSO to reduce drift.
  • Integrating loop closure and pose graph optimization to obtain a robust vSLAM system.
  • Fusion with IMU for improved tracking and reducing scale drift.
  • Modeling rolling shutter (present in most cameras) in the optimization problem.
  • Extension to stereo cameras.
  • Extension to omnidirectional cameras.
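As a concrete illustration of the photometric-term idea above: automatic exposure is commonly modeled as an affine brightness change, I_cur ≈ a · I_ref + b, with (a, b) estimated alongside the pose. The sketch below just fits (a, b) in closed form by least squares on matched intensities; names and structure are mine, not from the photometric branch:

```rust
// Hypothetical sketch of the affine brightness model I_cur ≈ a * I_ref + b
// used to compensate automatic exposure changes. In a real tracker (a, b)
// would be estimated jointly with the pose; here we fit them alone by
// ordinary least squares on matched pixel intensities.

/// Least-squares fit of (a, b) in i_cur ≈ a * i_ref + b.
fn fit_affine_brightness(i_ref: &[f64], i_cur: &[f64]) -> (f64, f64) {
    let n = i_ref.len() as f64;
    let mean_r = i_ref.iter().sum::<f64>() / n;
    let mean_c = i_cur.iter().sum::<f64>() / n;
    // Covariance of (ref, cur) and variance of ref intensities.
    let cov: f64 = i_ref
        .iter()
        .zip(i_cur)
        .map(|(r, c)| (r - mean_r) * (c - mean_c))
        .sum();
    let var: f64 = i_ref.iter().map(|r| (r - mean_r).powi(2)).sum();
    let a = cov / var;
    let b = mean_c - a * mean_r;
    (a, b)
}

fn main() {
    // Simulated exposure change: i_cur = 1.5 * i_ref + 10.
    let i_ref = [10.0, 50.0, 120.0, 200.0];
    let i_cur: Vec<f64> = i_ref.iter().map(|x| 1.5 * x + 10.0).collect();
    let (a, b) = fit_affine_brightness(&i_ref, &i_cur);
    println!("a = {a:.3}, b = {b:.3}"); // a = 1.500, b = 10.000
}
```

Once (a, b) are known (or jointly optimized), the photometric residual becomes r_p = I_cur(p') - (a · I_ref(p) + b), which is robust to exposure variations.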

Example platform extensions:

  • Making a C FFI to be able to run on systems with C drivers (Kinect, RealSense, ...).
  • Porting to the web with WebAssembly (some work happening in the interactive vors repository).
  • Porting to ARM for running in embedded systems and phones.
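On the C FFI point, the usual shape is a small `extern "C"` surface with C-compatible types. The sketch below is entirely hypothetical (the struct, function name, and signature are illustrative, not vors' API); a real binding would hold tracker state behind an opaque pointer, while this stub just writes an identity pose:

```rust
// Hypothetical sketch of a C FFI surface for an RGB-D tracker.
// All names and signatures here are illustrative, not vors' actual API.

/// A pose returned to C callers: 4x4 row-major rigid-body transform.
#[repr(C)]
pub struct CPose {
    pub mat: [f64; 16],
}

/// Track one RGB-D frame; buffers are row-major, `width * height` long.
/// Returns 0 on success, -1 on invalid arguments. This stub does not read
/// the buffers and simply writes the identity pose to `out_pose`.
#[no_mangle]
pub extern "C" fn vors_track_frame(
    _gray: *const u8,
    _depth: *const u16,
    width: u32,
    height: u32,
    out_pose: *mut CPose,
) -> i32 {
    if out_pose.is_null() || width == 0 || height == 0 {
        return -1;
    }
    let mut mat = [0.0; 16];
    for i in 0..4 {
        mat[i * 4 + i] = 1.0; // identity transform
    }
    unsafe { (*out_pose).mat = mat };
    0
}

fn main() {
    // Exercise the FFI entry point from Rust, as a C caller would.
    let mut pose = CPose { mat: [0.0; 16] };
    let status = vors_track_frame(std::ptr::null(), std::ptr::null(), 640, 480, &mut pose);
    println!("status = {status}, diag head = {:?}", &pose.mat[0..5]);
}
```

With such a surface compiled as a `cdylib`, camera drivers written in C (Kinect, RealSense, ...) could feed frames directly into the tracker.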