API Reference

This module provides Python bindings for the cuVSLAM library.

Data Structures

class cuvslam.Pose

Transformation from one frame to another.

Consists of rotation (quaternion in x, y, z, w order) and translation (3-vector). cuVSLAM uses OpenCV coordinate system convention: x is right, y is down, z is forward.

property rotation

Rotation (quaternion in x, y, z, w order)

property translation

Translation (3-vector)
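
The quaternion order and frame convention above can be illustrated with a plain-numpy sketch (standalone, not part of the cuvslam API): applying a pose's rotation (quaternion in x, y, z, w order) and translation to a 3D point.

```python
import numpy as np

def quat_to_matrix(q):
    """Convert a quaternion in (x, y, z, w) order to a 3x3 rotation matrix."""
    x, y, z, w = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def apply_pose(rotation, translation, point):
    """Map a point from the pose's source frame into its target frame."""
    return quat_to_matrix(rotation) @ np.asarray(point) + np.asarray(translation)

# Identity rotation, translated 1 m forward (z is forward in the OpenCV convention)
p = apply_pose([0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0])
```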

class cuvslam.Distortion

Camera distortion model with parameters.

Supports Pinhole (no distortion), Brown (radial and tangential), Fisheye (equidistant), and Polynomial distortion models.

Common definitions used below:

  • principal point \((c_x, c_y)\)

  • focal length \((f_x, f_y)\)

Supported values of distortion_model:

Pinhole (0 parameters): No distortion

Fisheye (4 parameters): Also known as the equidistant distortion model for pinhole cameras. Coefficients k1, k2, k3, k4 are fully compatible with ethz-asl/kalibr pinhole-equi and OpenCV::fisheye. See ‘A Generic Camera Model and Calibration Method for Conventional, Wide-Angle, and Fish-Eye Lenses’ by Juho Kannala and Sami S. Brandt for further information. Note that this approach (pinhole + undistort) has a limitation: it works only for fields of view below 180° (for the TUMVI dataset, the FOV is ~190°). EuRoC and ORB_SLAM3 use a different approach (direct project/unproject without a pinhole model) that supports FOV above 180°, so their coefficients are incompatible with this model.

  • 0-3: fisheye distortion coefficients \((k_1, k_2, k_3, k_4)\)

Each 3D point \((x, y, z)\) is projected in the following way: \((u, v) = (c_x, c_y) + diag(f_x, f_y) \cdot (distortedR(r) \cdot (x_n, y_n) / r)\)

where:

\(distortedR(r) = \arctan(r) \cdot (1 + k_1 \cdot \arctan^2(r) + k_2 \cdot \arctan^4(r) + k_3 \cdot \arctan^6(r) + k_4 \cdot \arctan^8(r))\)

\(x_n = x/z\)

\(y_n = y/z\)

\(r = \sqrt{x_n^2 + y_n^2}\)
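
The fisheye projection above can be written out directly in numpy (a standalone sketch of the formulas, not part of the cuvslam API):

```python
import numpy as np

def project_fisheye(point, focal, principal, k):
    """Project a 3D point with the equidistant (fisheye) model described above.

    focal = (fx, fy), principal = (cx, cy), k = (k1, k2, k3, k4).
    """
    x, y, z = point
    xn, yn = x / z, y / z
    r = np.hypot(xn, yn)
    if r < 1e-12:                       # point on the optical axis
        return np.asarray(principal, dtype=float)
    t = np.arctan(r)
    k1, k2, k3, k4 = k
    distorted_r = t * (1 + k1*t**2 + k2*t**4 + k3*t**6 + k4*t**8)
    u = principal[0] + focal[0] * distorted_r * xn / r
    v = principal[1] + focal[1] * distorted_r * yn / r
    return np.array([u, v])
```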

Brown (5 parameters):

  • 0-2: radial distortion coefficients \((k_1, k_2, k_3)\)

  • 3-4: tangential distortion coefficients \((p_1, p_2)\)

Each 3D point \((x, y, z)\) is projected in the following way: \((u, v) = (c_x, c_y) + diag(f_x, f_y) \cdot (radial \cdot (x_n, y_n) + tangential)\)

where:

\(radial = 1 + k_1 \cdot r^2 + k_2 \cdot r^4 + k_3 \cdot r^6\)

\(tangential.x = 2 \cdot p_1 \cdot x_n \cdot y_n + p_2 \cdot (r^2 + 2 \cdot x_n^2)\)

\(tangential.y = p_1 \cdot (r^2 + 2 \cdot y_n^2) + 2 \cdot p_2 \cdot x_n \cdot y_n\)

\(x_n = x/z\)

\(y_n = y/z\)

\(r = \sqrt{x_n^2 + y_n^2}\)
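
A plain-numpy sketch of the Brown projection formulas (standalone, not part of the cuvslam API); with all coefficients zero it reduces to the pinhole projection:

```python
import numpy as np

def project_brown(point, focal, principal, coeffs):
    """Project a 3D point with the Brown model above.

    coeffs = (k1, k2, k3, p1, p2), matching parameter indices 0-4.
    """
    k1, k2, k3, p1, p2 = coeffs
    x, y, z = point
    xn, yn = x / z, y / z
    r2 = xn*xn + yn*yn
    radial = 1 + k1*r2 + k2*r2**2 + k3*r2**3
    tan_x = 2*p1*xn*yn + p2*(r2 + 2*xn*xn)
    tan_y = p1*(r2 + 2*yn*yn) + 2*p2*xn*yn
    u = principal[0] + focal[0] * (radial*xn + tan_x)
    v = principal[1] + focal[1] * (radial*yn + tan_y)
    return np.array([u, v])
```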

Polynomial (8 parameters): Coefficients are compatible with first 8 coefficients of OpenCV distortion model.

  • 0-1: radial distortion coefficients \((k_1, k_2)\)

  • 2-3: tangential distortion coefficients \((p_1, p_2)\)

  • 4-7: radial distortion coefficients \((k_3, k_4, k_5, k_6)\)

Each 3D point \((x, y, z)\) is projected in the following way: \((u, v) = (c_x, c_y) + diag(f_x, f_y) \cdot (radial \cdot (x_n, y_n) + tangential)\)

where:

\(radial = \frac{1 + k_1 \cdot r^2 + k_2 \cdot r^4 + k_3 \cdot r^6}{1 + k_4 \cdot r^2 + k_5 \cdot r^4 + k_6 \cdot r^6}\)

\(tangential.x = 2 \cdot p_1 \cdot x_n \cdot y_n + p_2 \cdot (r^2 + 2 \cdot x_n^2)\)

\(tangential.y = p_1 \cdot (r^2 + 2 \cdot y_n^2) + 2 \cdot p_2 \cdot x_n \cdot y_n\)

\(x_n = x/z\)

\(y_n = y/z\)

\(r = \sqrt{x_n^2 + y_n^2}\)
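
The polynomial model differs from Brown only in the rational radial factor. A standalone numpy sketch (not part of the cuvslam API), with the coefficient tuple ordered as parameter indices 0-7 above:

```python
import numpy as np

def project_polynomial(point, focal, principal, coeffs):
    """Project a 3D point with the 8-parameter rational model above.

    coeffs = (k1, k2, p1, p2, k3, k4, k5, k6), matching parameter indices 0-7.
    """
    k1, k2, p1, p2, k3, k4, k5, k6 = coeffs
    x, y, z = point
    xn, yn = x / z, y / z
    r2 = xn*xn + yn*yn
    radial = (1 + k1*r2 + k2*r2**2 + k3*r2**3) / (1 + k4*r2 + k5*r2**2 + k6*r2**3)
    tan_x = 2*p1*xn*yn + p2*(r2 + 2*xn*xn)
    tan_y = p1*(r2 + 2*yn*yn) + 2*p2*xn*yn
    return np.array([principal[0] + focal[0]*(radial*xn + tan_x),
                     principal[1] + focal[1]*(radial*yn + tan_y)])
```

With identical numerator and denominator coefficients (k1 = k4, etc.) the radial factor is 1 and the projection is pure pinhole.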

class Model(value)

Distortion model types for camera calibration

Pinhole = 0
Fisheye = 1
Brown = 2
Polynomial = 3
property model

Distortion model type, see Distortion.Model

property parameters

Array of distortion parameters depending on model

class cuvslam.Camera

Camera calibration parameters.

Describes intrinsic and extrinsic parameters of a camera and per-camera settings. For camera coordinate system, top left pixel has (0, 0) coordinate (y is down, x is right).

property border_bottom

Bottom border to ignore in pixels (0 to use full frame)

property border_left

Left border to ignore in pixels (0 to use full frame)

property border_right

Right border to ignore in pixels (0 to use full frame)

property border_top

Top border to ignore in pixels (0 to use full frame)

property distortion

Distortion parameters, see Distortion

property focal

Focal length (fx, fy)

property principal

Principal point (cx, cy)

property rig_from_camera

Transformation from the camera coordinate frame to the rig coordinate frame

property size

Size of the camera (width, height)

class cuvslam.ImuCalibration

IMU Calibration parameters

property accelerometer_noise_density

Accelerometer noise density in \(m/(s^2 \sqrt{Hz})\)

property accelerometer_random_walk

Accelerometer random walk in \(m/(s^3 \sqrt{Hz})\)

property frequency

IMU frequency in \(Hz\)

property gyroscope_noise_density

Gyroscope noise density in \(rad/(s \sqrt{Hz})\)

property gyroscope_random_walk

Gyroscope random walk in \(rad/(s^2 \sqrt{Hz})\)

property rig_from_imu

Transformation from IMU coordinate frame to the rig coordinate frame
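
The units above are continuous-time noise densities. Under the commonly used Kalibr-style convention (an assumption — this API does not state it explicitly), they relate to per-sample standard deviations through the IMU sampling frequency:

```python
import math

def discrete_noise_sigma(noise_density, frequency_hz):
    """Per-sample white-noise standard deviation from a continuous-time
    noise density: sigma_d = density * sqrt(frequency)."""
    return noise_density * math.sqrt(frequency_hz)

def discrete_walk_sigma(random_walk, frequency_hz):
    """Per-sample bias random-walk standard deviation:
    sigma_b = random_walk / sqrt(frequency)."""
    return random_walk / math.sqrt(frequency_hz)

# e.g. a 200 Hz gyroscope with noise density 1e-3 rad/(s*sqrt(Hz))
sigma_g = discrete_noise_sigma(1e-3, 200.0)
```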

class cuvslam.Rig

Rig consisting of cameras and 0 or 1 IMU sensors

property cameras

List of cameras in the rig, see Camera

property imus

List of IMU sensors in the rig (0 or 1 only), see ImuCalibration

class cuvslam.PoseEstimate

Rig pose estimate from the tracker. The pose is world_from_rig where:

The rig coordinate frame is user-defined and depends on the extrinsic parameters of the cameras. The world coordinate frame is an arbitrary 3D coordinate frame, corresponding to the rig frame at the start of tracking.

property timestamp_ns

Timestamp of the pose estimate in nanoseconds

property world_from_rig

Rig pose in the world coordinate frame

class cuvslam.ImuMeasurement
property angular_velocities

Angular velocities in \(rad/s\)

property linear_accelerations

Linear accelerations in \(m/s^2\)

property timestamp_ns

Timestamp of the IMU measurement in nanoseconds

class cuvslam.Landmark

3D landmark point

property coords

3D coordinates of the landmark in the world coordinate frame

property id

Unique ID of the landmark

class cuvslam.Observation

2D observation of a landmark in an image

property camera_index

Index of the camera that made this observation

property id

Unique ID of the observed landmark

property u

Horizontal pixel coordinate of the observation

property v

Vertical pixel coordinate of the observation

class cuvslam.core.Odometry.MulticameraMode(value)
Performance = 0
Precision = 1
Moderate = 2
class cuvslam.core.Odometry.OdometryMode(value)
Multicamera = 0
Inertial = 1
RGBD = 2
Mono = 3
class cuvslam.core.Odometry.Config
property async_sba

Enable to run bundle adjustment asynchronously

property debug_dump_directory

Directory for debug data dumps. If empty, no debug data will be dumped

property debug_imu_mode

Enable IMU debug mode

property enable_final_landmarks_export

Enable to export final landmarks. Also sets enable_landmarks_export and enable_observations_export.

property enable_landmarks_export

Enable to export landmarks during tracking

property enable_observations_export

Enable to export landmark observations in images during tracking

property max_frame_delta_s

Maximum time difference between frames in seconds

property multicam_mode

See Odometry.MulticameraMode

property odometry_mode

See Odometry.OdometryMode

property rectified_stereo_camera

Enable if stereo cameras are rectified and horizontally aligned

property rgbd_settings

Settings for RGB-D odometry mode. See Odometry.RGBDSettings

property use_denoising

Enable to apply denoising to input images

property use_gpu

Enable to use GPU acceleration

property use_motion_model

Enable to use motion model for pose prediction

class cuvslam.core.Odometry.RGBDSettings

Settings for RGB-D odometry mode

property depth_camera_id

ID of the camera that the depth image is aligned with

property depth_scale_factor

Scale factor for depth measurements

property enable_depth_stereo_tracking

Whether to enable stereo tracking between depth-aligned camera and other cameras

class cuvslam.core.Slam.Config

SLAM configuration parameters

property enable_reading_internals

Enable reading internal data from SLAM

property gt_align_mode

Special mode for visual map building with ground truth

property map_cache_path

If empty, map is kept in memory only. Else, map is synced to disk (LMDB) at this path, allowing large-scale maps; if the path already exists it will be overwritten. To load an existing map, use LocalizeInMap(). To save map, use SaveMap().

property map_cell_size

Size of map cell (0 to auto-calculate from camera baseline)

property max_landmarks_distance

Maximum distance from camera to landmark for inclusion in map

property max_map_size

Maximum number of poses in SLAM pose graph (0 for unlimited)

property planar_constraints

Modify poses so camera moves on a horizontal plane

property sync_mode

If true, localization and mapping run in the same thread as visual odometry

property throttling_time_ms

Minimum time between loop closure events in milliseconds

property use_gpu

Whether to use GPU acceleration

class cuvslam.core.Slam.Metrics

SLAM metrics

property lc_good_landmarks_count

Count of landmarks in loop closure

property lc_pnp_landmarks_count

Count of landmarks in PnP for loop closure

property lc_selected_landmarks_count

Count of landmarks selected for loop closure

property lc_status

Loop closure status

property lc_tracked_landmarks_count

Count of landmarks tracked in loop closure

property pgo_status

Pose graph optimization status

property timestamp_ns

Timestamp of measurements in nanoseconds

class cuvslam.core.Slam.LocalizationSettings

Localization settings

property angular_step_rads

Angular step around vertical axis in radians

property horizontal_search_radius

Horizontal search radius in meters

property horizontal_step

Horizontal step in meters

property vertical_search_radius

Vertical search radius in meters

property vertical_step

Vertical step in meters

Tracker class

class cuvslam.Tracker(rig: Rig, odom_config: Config | None = None, slam_config: Config | None = None)[source]

A wrapper that combines cuVSLAM Odometry and SLAM functionality.

This class automatically manages both the Odometry and SLAM instances, providing a simplified interface for common use cases.

Parameters:
  • rig (Rig) – Rig configuration, see Rig

  • odom_config (Config | None) – Odometry configuration; if None, the default configuration is used

  • slam_config (Config | None) – SLAM configuration; if None, SLAM is disabled

OdometryConfig

alias of Config

OdometryRGBDSettings

alias of RGBDSettings

SlamConfig

alias of Config

SlamMetrics

alias of Metrics

SlamLocalizationSettings

alias of LocalizationSettings

SlamDataLayer

alias of DataLayer

class PoseGraphNode

Node in a pose graph

property id

Node identifier

property node_pose

Node pose

class PoseGraphEdge

Edge in a pose graph

property covariance

Covariance matrix of the transformation

property node_from

Source node ID

property node_to

Target node ID

property transform

Transformation from source to target node

class PoseGraph

Pose graph structure

property edges

List of edges in the graph

property nodes

List of nodes in the graph

SlamLandmark

alias of Landmark

SlamLandmarks

alias of Landmarks

class LocalizerProbe

Localizer probe used during map localization.

Contains input guess pose, result pose and weights, and success flag.

property exact_result_pose

Exact pose if localization succeeded

property exact_result_weight

Weight of the resulting solution

property guess_pose

Input guess pose

property id

Probe identifier

property solved

True if the probe was solved successfully

property weight

Input weight for the probe

class LocalizerProbes

Collection of localizer probes for a localization attempt.

property probes

List of localizer probes, see Slam.LocalizerProbe

property size

Size of search area

property timestamp_ns

Timestamp of localizer try in nanoseconds

track(timestamp: int, images: List[Any], masks: List[Any] | None = None, depths: List[Any] | None = None, gt_pose: Pose | None = None) Tuple[PoseEstimate, Pose | None][source]

Track a rig pose using current image frame.

This method combines odometry and SLAM processing in a single call.

In inertial mode, if the visual odometry tracker fails to compute a pose, this function returns a position calculated from user-provided IMU data. If visual odometry cannot recover after several calls to track(), an invalid pose is returned. Odometry outputs poses in the same coordinate frame until tracking is lost.

To get SLAM poses, SLAM must be enabled in the constructor by providing a non-null slam_config. SLAM poses may have loop closure (LC) jumps when LC is detected and pose graph is optimized. SLAM poses cannot be adjusted retroactively, so use get_all_slam_poses method to get smooth trajectory up to the latest frame. Also, in asynchronous mode, LC is done in a separate work thread to keep track call fast, so SLAM poses are not updated immediately.

All cameras must be synchronized. If a camera rig provides “almost synchronized” frames, the timestamps should be within 1 millisecond.

Images (masks, depth images, etc.) can be numpy arrays or tensors, either GPU (CUDA) or CPU, but all data must be of the same type (all GPU or all CPU). This is independent of odom_config.use_gpu: even if odometry uses the GPU for computation, the input images can be either CPU or GPU arrays/tensors.

Images, masks, and depth images must be in the same order as the cameras in the rig. If data for a camera is not available, pass an empty array or tensor for that camera.

Parameters:
  • timestamp (int) – Images timestamp in nanoseconds

  • images (List[Any]) – List of numpy arrays or tensors containing the camera images

  • masks (List[Any] | None) – Optional list of numpy arrays or tensors containing masks for the images

  • depths (List[Any] | None) – Optional list of numpy arrays or tensors containing depth images

  • gt_pose (Pose | None) – Optional ground truth pose. Should be provided if gt_align_mode is enabled, otherwise should be None.

Returns:

A tuple (PoseEstimate, Pose | None): the pose estimate computed by Odometry (on failure its world_from_rig will be None), and the pose estimate computed by SLAM if SLAM is enabled, otherwise None.

Return type:

Tuple[PoseEstimate, Pose | None]

Raises:

ValueError – If data checks fail (e.g. timestamps are out of order, images sizes are inconsistent, etc.).

register_imu_measurement(sensor_index: int, imu_measurement: ImuMeasurement) None[source]

Register an IMU measurement with the tracker.

Requires Inertial odometry mode.

If visual odometry loses the camera position, it briefly continues execution using user-provided IMU measurements while trying to recover the position. register_imu_measurement() should be called between image acquisitions, once for each IMU measurement, for example:

  • tracker.track

  • tracker.register_imu_measurement

  • tracker.register_imu_measurement

  • tracker.track

  • tracker.register_imu_measurement

Higher frequency IMU measurements are recommended.

IMU sensors and cameras clocks must be synchronized. track() and register_imu_measurement() must be called in strict ascending order of timestamps.

Parameters:
  • sensor_index (int) – Sensor index; must be 0, as only one sensor is supported now

  • imu_measurement (ImuMeasurement) – IMU measurement to register

Raises:

ValueError – If IMU fusion is disabled or if called out of the order of timestamps.

Return type:

None

get_last_observations(camera_index: int) list[Observation][source]

Get observations from the last frame for specified camera. See Observation. Requires enable_observations_export=True in OdometryConfig.

Parameters:

camera_index (int)

Return type:

list[Observation]

get_last_landmarks() list[Landmark][source]

Get landmarks from the last frame. See Landmark. Requires enable_landmarks_export=True in OdometryConfig.

Return type:

list[Landmark]

get_last_gravity() list[float] | None[source]

Get gravity vector from the last frame. Returns None if gravity is not yet available. Requires Inertial mode (odometry_mode=OdometryMode.Inertial in OdometryConfig)

Return type:

list[float] | None

get_final_landmarks() dict[int, list[float]][source]

Get all final landmarks from all frames. Landmarks are 3D points in the odometry start frame. Requires enable_final_landmarks_export=True in OdometryConfig.

Return type:

dict[int, list[float]]

get_all_slam_poses(max_poses_count: int = 0) List[PoseStamped][source]

Get all SLAM poses for each frame.

Returns all SLAM poses after optimization with the latest pose graph. SLAM poses from track() method will have jumps after loop closures. With this method, on the other hand, you will have smooth output because it recalculates past poses using the current pose graph.

Parameters:

max_poses_count (int) – Maximum number of poses to return (0 for all)

Returns:

List of poses with timestamps

Return type:

List[PoseStamped]

get_slam_landmarks(layer: DataLayer) Landmarks | None[source]

Get landmarks for a given data layer of SLAM. See SlamLandmarks, SlamDataLayer.

Parameters:

layer (DataLayer)

Return type:

Landmarks | None

get_pose_graph() PoseGraph | None[source]

Get pose graph consisting of all keyframes and their connections including loop closures. See PoseGraph.

Return type:

PoseGraph | None

get_localizer_probes() LocalizerProbes | None[source]

Get localizer probes from the most recent localization attempt. See LocalizerProbes.

Return type:

LocalizerProbes | None

get_slam_metrics() Metrics | None[source]

Get SLAM metrics. See SlamMetrics

Return type:

Metrics | None

get_loop_closure_poses() List[PoseStamped] | None[source]

Get list of last 10 loop closure poses with timestamps.

Return type:

List[PoseStamped] | None

Functions

cuvslam.get_version() tuple[str, int, int, int]

Get the version of cuVSLAM library.

Returns a tuple with the semantic version string and major, minor and patch values.

cuvslam.set_verbosity(arg: int, /) None

Set the verbosity level of the library.

Available values: 0 (default) for no output, 1 for error messages, 2 for warnings, 3 for info messages.

cuvslam.warm_up_gpu() None

Warm up the GPU (CUDA runtime).

It is not necessary to call it before the first call to cuvslam, but it can help reduce first-call latency. Throws an exception if CUDA, cuSOLVER or cuBLAS initialization fails.