API Reference

This module provides Python bindings for the cuVSLAM library.

Data Structures

class cuvslam.Pose

Transformation from one frame to another.

Consists of rotation (quaternion in x, y, z, w order) and translation (3-vector). cuVSLAM uses OpenCV coordinate system convention: x is right, y is down, z is forward.

property rotation

Rotation (quaternion in x, y, z, w order)

property translation

Translation (3-vector)
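
The quaternion order and frame convention above can be illustrated with a plain-numpy sketch (standalone, not part of the cuvslam API): applying a pose's rotation (quaternion in x, y, z, w order) and translation to a 3D point.

```python
import numpy as np

def quat_to_matrix(q):
    """Convert a quaternion in (x, y, z, w) order to a 3x3 rotation matrix."""
    x, y, z, w = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def apply_pose(rotation, translation, point):
    """Map a point from the pose's source frame into its target frame."""
    return quat_to_matrix(rotation) @ np.asarray(point) + np.asarray(translation)

# Identity rotation, translated 1 m forward (z is forward in the OpenCV convention)
p = apply_pose([0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0])
```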

class cuvslam.Distortion

Camera distortion model with parameters.

Supports Pinhole (no distortion), Brown (radial and tangential), Fisheye (equidistant), and Polynomial distortion models.

Common definitions used below:

  • principal point \((c_x, c_y)\)

  • focal length \((f_x, f_y)\)

Supported values of distortion_model:

Pinhole (0 parameters): No distortion

Fisheye (4 parameters): Also known as the equidistant distortion model for pinhole cameras. Coefficients k1, k2, k3, k4 are fully compatible with ethz-asl/kalibr pinhole-equi and OpenCV::fisheye. See ‘A Generic Camera Model and Calibration Method for Conventional, Wide-Angle, and Fish-Eye Lenses’ by Juho Kannala and Sami S. Brandt for further information. Note that this approach (pinhole + undistort) has a limitation: it works only for fields of view below 180° (for the TUMVI dataset, the FOV is ~190°). EuRoC and ORB_SLAM3 use a different approach (direct project/unproject without a pinhole model) that supports FOV above 180°, so their coefficients are incompatible with this model.

  • 0-3: fisheye distortion coefficients \((k_1, k_2, k_3, k_4)\)

Each 3D point \((x, y, z)\) is projected in the following way: \((u, v) = (c_x, c_y) + diag(f_x, f_y) \cdot (distortedR(r) \cdot (x_n, y_n) / r)\)

where:

\(distortedR(r) = \arctan(r) \cdot (1 + k_1 \cdot \arctan^2(r) + k_2 \cdot \arctan^4(r) + k_3 \cdot \arctan^6(r) + k_4 \cdot \arctan^8(r))\)

\(x_n = x/z\)

\(y_n = y/z\)

\(r = \sqrt{x_n^2 + y_n^2}\)
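
The fisheye projection above can be written out directly in numpy (a standalone sketch of the formulas, not part of the cuvslam API):

```python
import numpy as np

def project_fisheye(point, focal, principal, k):
    """Project a 3D point with the equidistant (fisheye) model described above.

    focal = (fx, fy), principal = (cx, cy), k = (k1, k2, k3, k4).
    """
    x, y, z = point
    xn, yn = x / z, y / z
    r = np.hypot(xn, yn)
    if r < 1e-12:                       # point on the optical axis
        return np.asarray(principal, dtype=float)
    t = np.arctan(r)
    k1, k2, k3, k4 = k
    distorted_r = t * (1 + k1*t**2 + k2*t**4 + k3*t**6 + k4*t**8)
    u = principal[0] + focal[0] * distorted_r * xn / r
    v = principal[1] + focal[1] * distorted_r * yn / r
    return np.array([u, v])
```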

Brown (5 parameters):

  • 0-2: radial distortion coefficients \((k_1, k_2, k_3)\)

  • 3-4: tangential distortion coefficients \((p_1, p_2)\)

Each 3D point \((x, y, z)\) is projected in the following way: \((u, v) = (c_x, c_y) + diag(f_x, f_y) \cdot (radial \cdot (x_n, y_n) + tangential)\)

where:

\(radial = 1 + k_1 \cdot r^2 + k_2 \cdot r^4 + k_3 \cdot r^6\)

\(tangential.x = 2 \cdot p_1 \cdot x_n \cdot y_n + p_2 \cdot (r^2 + 2 \cdot x_n^2)\)

\(tangential.y = p_1 \cdot (r^2 + 2 \cdot y_n^2) + 2 \cdot p_2 \cdot x_n \cdot y_n\)

\(x_n = x/z\)

\(y_n = y/z\)

\(r = \sqrt{x_n^2 + y_n^2}\)
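
A plain-numpy sketch of the Brown projection formulas (standalone, not part of the cuvslam API); with all coefficients zero it reduces to the pinhole projection:

```python
import numpy as np

def project_brown(point, focal, principal, coeffs):
    """Project a 3D point with the Brown model above.

    coeffs = (k1, k2, k3, p1, p2), matching parameter indices 0-4.
    """
    k1, k2, k3, p1, p2 = coeffs
    x, y, z = point
    xn, yn = x / z, y / z
    r2 = xn*xn + yn*yn
    radial = 1 + k1*r2 + k2*r2**2 + k3*r2**3
    tan_x = 2*p1*xn*yn + p2*(r2 + 2*xn*xn)
    tan_y = p1*(r2 + 2*yn*yn) + 2*p2*xn*yn
    u = principal[0] + focal[0] * (radial*xn + tan_x)
    v = principal[1] + focal[1] * (radial*yn + tan_y)
    return np.array([u, v])
```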

Polynomial (8 parameters): Coefficients are compatible with first 8 coefficients of OpenCV distortion model.

  • 0-1: radial distortion coefficients \((k_1, k_2)\)

  • 2-3: tangential distortion coefficients \((p_1, p_2)\)

  • 4-7: radial distortion coefficients \((k_3, k_4, k_5, k_6)\)

Each 3D point \((x, y, z)\) is projected in the following way: \((u, v) = (c_x, c_y) + diag(f_x, f_y) \cdot (radial \cdot (x_n, y_n) + tangential)\)

where:

\(radial = \frac{1 + k_1 \cdot r^2 + k_2 \cdot r^4 + k_3 \cdot r^6}{1 + k_4 \cdot r^2 + k_5 \cdot r^4 + k_6 \cdot r^6}\)

\(tangential.x = 2 \cdot p_1 \cdot x_n \cdot y_n + p_2 \cdot (r^2 + 2 \cdot x_n^2)\)

\(tangential.y = p_1 \cdot (r^2 + 2 \cdot y_n^2) + 2 \cdot p_2 \cdot x_n \cdot y_n\)

\(x_n = x/z\)

\(y_n = y/z\)

\(r = \sqrt{x_n^2 + y_n^2}\)
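
The polynomial model differs from Brown only in the rational radial factor. A standalone numpy sketch (not part of the cuvslam API), with the coefficient tuple ordered as parameter indices 0-7 above:

```python
import numpy as np

def project_polynomial(point, focal, principal, coeffs):
    """Project a 3D point with the 8-parameter rational model above.

    coeffs = (k1, k2, p1, p2, k3, k4, k5, k6), matching parameter indices 0-7.
    """
    k1, k2, p1, p2, k3, k4, k5, k6 = coeffs
    x, y, z = point
    xn, yn = x / z, y / z
    r2 = xn*xn + yn*yn
    radial = (1 + k1*r2 + k2*r2**2 + k3*r2**3) / (1 + k4*r2 + k5*r2**2 + k6*r2**3)
    tan_x = 2*p1*xn*yn + p2*(r2 + 2*xn*xn)
    tan_y = p1*(r2 + 2*yn*yn) + 2*p2*xn*yn
    return np.array([principal[0] + focal[0]*(radial*xn + tan_x),
                     principal[1] + focal[1]*(radial*yn + tan_y)])
```

With identical numerator and denominator coefficients (k1 = k4, etc.) the radial factor is 1 and the projection is pure pinhole.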

class Model(value)

Distortion model types for camera calibration

Pinhole = 0
Fisheye = 1
Brown = 2
Polynomial = 3
property model

Distortion model type, see Distortion.Model

property parameters

Array of distortion parameters depending on model

class cuvslam.Camera

Camera calibration parameters.

Describes intrinsic and extrinsic parameters of a camera and per-camera settings. For camera coordinate system, top left pixel has (0, 0) coordinate (y is down, x is right).

property border_bottom

Bottom border to ignore in pixels (0 to use full frame)

property border_left

Left border to ignore in pixels (0 to use full frame)

property border_right

Right border to ignore in pixels (0 to use full frame)

property border_top

Top border to ignore in pixels (0 to use full frame)

property distortion

Distortion parameters, see Distortion

property focal

Focal length (fx, fy)

property principal

Principal point (cx, cy)

property rig_from_camera

Transformation from the camera coordinate frame to the rig coordinate frame

property size

Size of the camera (width, height)

class cuvslam.ImuCalibration

IMU Calibration parameters

property accelerometer_noise_density

Accelerometer noise density in \(m/(s^2 \sqrt{Hz})\)

property accelerometer_random_walk

Accelerometer random walk in \(m/(s^3 \sqrt{Hz})\)

property frequency

IMU frequency in \(Hz\)

property gyroscope_noise_density

Gyroscope noise density in \(rad/(s \sqrt{Hz})\)

property gyroscope_random_walk

Gyroscope random walk in \(rad/(s^2 \sqrt{Hz})\)

property rig_from_imu

Transformation from IMU coordinate frame to the rig coordinate frame
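
The units above are continuous-time noise densities. Under the commonly used Kalibr-style convention (an assumption — this API does not state it explicitly), they relate to per-sample standard deviations through the IMU sampling frequency:

```python
import math

def discrete_noise_sigma(noise_density, frequency_hz):
    """Per-sample white-noise standard deviation from a continuous-time
    noise density: sigma_d = density * sqrt(frequency)."""
    return noise_density * math.sqrt(frequency_hz)

def discrete_walk_sigma(random_walk, frequency_hz):
    """Per-sample bias random-walk standard deviation:
    sigma_b = random_walk / sqrt(frequency)."""
    return random_walk / math.sqrt(frequency_hz)

# e.g. a 200 Hz gyroscope with noise density 1e-3 rad/(s*sqrt(Hz))
sigma_g = discrete_noise_sigma(1e-3, 200.0)
```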

class cuvslam.Rig

Rig consisting of cameras and 0 or 1 IMU sensors

property cameras

List of cameras in the rig, see Camera

property imus

List of IMU sensors in the rig (0 or 1 only), see ImuCalibration

class cuvslam.PoseEstimate

Rig pose estimate from the tracker. The pose is world_from_rig where:

The rig coordinate frame is user-defined and depends on the extrinsic parameters of the cameras. The world coordinate frame is an arbitrary 3D coordinate frame, corresponding to the rig frame at the start of tracking.

property timestamp_ns

Timestamp of the pose estimate in nanoseconds

property world_from_rig

Rig pose in the world coordinate frame

class cuvslam.ImuMeasurement
property angular_velocities

Angular velocities in \(rad/s\)

property linear_accelerations

Linear accelerations in \(m/s^2\)

property timestamp_ns

Timestamp of the IMU measurement in nanoseconds

class cuvslam.Landmark

3D landmark point

property coords

3D coordinates of the landmark in the world coordinate frame

property id

Unique ID of the landmark

class cuvslam.Observation

2D observation of a landmark in an image

property camera_index

Index of the camera that made this observation

property id

Unique ID of the observed landmark

property u

Horizontal pixel coordinate of the observation

property v

Vertical pixel coordinate of the observation

class cuvslam.core.Odometry.MulticameraMode(value)
Performance = 0
Precision = 1
Moderate = 2
class cuvslam.core.Odometry.OdometryMode(value)
Multicamera = 0
Inertial = 1
RGBD = 2
Mono = 3
class cuvslam.core.Odometry.Config
property async_sba

Enable to run bundle adjustment asynchronously

property debug_dump_directory

Directory for debug data dumps. If empty, no debug data will be dumped

property debug_imu_mode

Enable IMU debug mode

property enable_final_landmarks_export

Enable to export final landmarks. Also sets enable_landmarks_export and enable_observations_export.

property enable_landmarks_export

Enable to export landmarks during tracking

property enable_observations_export

Enable to export landmark observations in images during tracking

property max_frame_delta_s

Maximum time difference between frames in seconds

property multicam_mode

See Odometry.MulticameraMode

property odometry_mode

See Odometry.OdometryMode

property rectified_stereo_camera

Enable if stereo cameras are rectified and horizontally aligned

property rgbd_settings

Settings for RGB-D odometry mode. See Odometry.RGBDSettings

property use_denoising

Enable to apply denoising to input images

property use_gpu

Enable to use GPU acceleration

property use_motion_model

Enable to use motion model for pose prediction

class cuvslam.core.Odometry.RGBDSettings

Settings for RGB-D odometry mode

property depth_camera_id

ID of the camera that the depth image is aligned with

property depth_scale_factor

Scale factor for depth measurements

property enable_depth_stereo_tracking

Whether to enable stereo tracking between depth-aligned camera and other cameras

class cuvslam.core.Slam.Config

SLAM configuration parameters

property enable_reading_internals

Enable reading internal data from SLAM

property gt_align_mode

Special mode for visual map building with ground truth

property map_cache_path

If empty, map is kept in memory only. Else, map is synced to disk (LMDB) at this path, allowing large-scale maps; if the path already exists it will be overwritten. To load an existing map, use LocalizeInMap(). To save map, use SaveMap().

property map_cell_size

Size of map cell (0 to auto-calculate from camera baseline)

property max_landmarks_distance

Maximum distance from camera to landmark for inclusion in map

property max_map_size

Maximum number of poses in SLAM pose graph (0 for unlimited)

property planar_constraints

Modify poses so camera moves on a horizontal plane

property sync_mode

If true, localization and mapping run in the same thread as visual odometry

property throttling_time_ms

Minimum time between loop closure events in milliseconds

property use_gpu

Whether to use GPU acceleration

class cuvslam.core.Slam.Metrics

SLAM metrics

property lc_good_landmarks_count

Count of landmarks in loop closure

property lc_pnp_landmarks_count

Count of landmarks in PnP for loop closure

property lc_selected_landmarks_count

Count of landmarks selected for loop closure

property lc_status

Loop closure status

property lc_tracked_landmarks_count

Count of landmarks tracked in loop closure

property pgo_status

Pose graph optimization status

property timestamp_ns

Timestamp of measurements in nanoseconds

class cuvslam.core.Slam.LocalizationSettings

Localization settings

property angular_step_rads

Angular step around vertical axis in radians

property horizontal_search_radius

Horizontal search radius in meters

property horizontal_step

Horizontal step in meters

property vertical_search_radius

Vertical search radius in meters

property vertical_step

Vertical step in meters

Tracker class

class cuvslam.Tracker(rig: Rig, odom_config: Config | None = None, slam_config: Config | None = None)[source]

A wrapper that combines cuVSLAM Odometry and SLAM functionality.

This class automatically manages both the Odometry and SLAM instances, providing a simplified interface for common use cases.

Parameters:
  • rig (Rig) – Rig configuration, see Rig

  • odom_config (Config | None) – Odometry configuration; if None, the default configuration is used

  • slam_config (Config | None) – SLAM configuration; if None, SLAM is disabled

OdometryConfig

alias of Config

OdometryRGBDSettings

alias of RGBDSettings

SlamConfig

alias of Config

SlamMetrics

alias of Metrics

SlamLocalizationSettings

alias of LocalizationSettings

SlamDataLayer

alias of DataLayer

class PoseGraphNode

Node in a pose graph

property id

Node identifier

property node_pose

Node pose

class PoseGraphEdge

Edge in a pose graph

property covariance

Covariance matrix of the transformation

property node_from

Source node ID

property node_to

Target node ID

property transform

Transformation from source to target node

class PoseGraph

Pose graph structure

property edges

List of edges in the graph

property nodes

List of nodes in the graph

SlamLandmark

alias of Landmark

SlamLandmarks

alias of Landmarks

class LocalizerProbe

Localizer probe used during map localization.

Contains input guess pose, result pose and weights, and success flag.

property exact_result_pose

Exact pose if localization succeeded

property exact_result_weight

Weight of the resulting solution

property guess_pose

Input guess pose

property id

Probe identifier

property solved

True if the probe was solved successfully

property weight

Input weight for the probe

class LocalizerProbes

Collection of localizer probes for a localization attempt.

property probes

List of localizer probes, see Slam.LocalizerProbe

property size

Size of search area

property timestamp_ns

Timestamp of localizer try in nanoseconds

track(timestamp: int, images: List[Any], masks: List[Any] | None = None, depths: List[Any] | None = None, gt_pose: Pose | None = None) Tuple[PoseEstimate, Pose | None][source]

Track a rig pose using current image frame.

This method combines odometry and SLAM processing in a single call.

In inertial mode, if the visual odometry tracker fails to compute a pose, this function returns a position calculated from user-provided IMU data. If visual odometry cannot recover after several calls to track(), an invalid pose is returned. Odometry outputs poses in the same coordinate frame until tracking is lost.

To get SLAM poses, SLAM must be enabled in the constructor by providing a non-null slam_config. SLAM poses may have loop closure (LC) jumps when LC is detected and pose graph is optimized. SLAM poses cannot be adjusted retroactively, so use get_all_slam_poses method to get smooth trajectory up to the latest frame. Also, in asynchronous mode, LC is done in a separate work thread to keep track call fast, so SLAM poses are not updated immediately.

All cameras must be synchronized. If a camera rig provides “almost synchronized” frames, the timestamps should be within 1 millisecond.

Images (masks, depth images, etc.) can be numpy arrays or tensors, either GPU (CUDA) or CPU, but all data must be of the same type (all GPU or all CPU). This is independent of odom_config.use_gpu: even if odometry uses the GPU for computation, the input images can be either CPU or GPU arrays/tensors.

Images, masks, and depth images must be in the same order as the cameras in the rig. If data for a camera is not available, pass an empty array or tensor for that camera.

Parameters:
  • timestamp (int) – Images timestamp in nanoseconds

  • images (List[Any]) – List of numpy arrays or tensors containing the camera images

  • masks (List[Any] | None) – Optional list of numpy arrays or tensors containing masks for the images

  • depths (List[Any] | None) – Optional list of numpy arrays or tensors containing depth images

  • gt_pose (Pose | None) – Optional ground truth pose. Should be provided if gt_align_mode is enabled, otherwise should be None.

Returns:

A tuple (PoseEstimate, Pose | None): the pose estimate computed by Odometry (on failure its world_from_rig will be None), and the pose estimate computed by SLAM if SLAM is enabled, otherwise None.

Return type:

Tuple[PoseEstimate, Pose | None]

Raises:

ValueError – If data checks fail (e.g. timestamps are out of order, images sizes are inconsistent, etc.).

register_imu_measurement(sensor_index: int, imu_measurement: ImuMeasurement) None[source]

Register an IMU measurement with the tracker.

Requires Inertial odometry mode.

If visual odometry loses the camera position, it briefly continues execution using user-provided IMU measurements while trying to recover the position. register_imu_measurement() should be called between image acquisitions, once for each IMU measurement, for example:

  • tracker.track

  • tracker.register_imu_measurement

  • tracker.register_imu_measurement

  • tracker.track

  • tracker.register_imu_measurement

Higher frequency IMU measurements are recommended.

IMU sensors and cameras clocks must be synchronized. track() and register_imu_measurement() must be called in strict ascending order of timestamps.

Parameters:
  • sensor_index (int) – Sensor index; must be 0, as only one sensor is supported now

  • imu_measurement (ImuMeasurement) – IMU measurement to register

Raises:

ValueError – If IMU fusion is disabled or if called out of the order of timestamps.

Return type:

None

get_last_observations(camera_index: int) list[Observation][source]

Get observations from the last frame for specified camera. See Observation. Requires enable_observations_export=True in OdometryConfig.

Parameters:

camera_index (int)

Return type:

list[Observation]

get_last_landmarks() list[Landmark][source]

Get landmarks from the last frame. See Landmark. Requires enable_landmarks_export=True in OdometryConfig.

Return type:

list[Landmark]

get_last_gravity() list[float] | None[source]

Get gravity vector from the last frame. Returns None if gravity is not yet available. Requires Inertial mode (odometry_mode=OdometryMode.Inertial in OdometryConfig)

Return type:

list[float] | None

get_final_landmarks() dict[int, list[float]][source]

Get all final landmarks from all frames. Landmarks are 3D points in the odometry start frame. Requires enable_final_landmarks_export=True in OdometryConfig.

Return type:

dict[int, list[float]]

get_all_slam_poses(max_poses_count: int = 0) List[PoseStamped][source]

Get all SLAM poses for each frame.

Returns all SLAM poses after optimization with the latest pose graph. SLAM poses from track() method will have jumps after loop closures. With this method, on the other hand, you will have smooth output because it recalculates past poses using the current pose graph.

Parameters:

max_poses_count (int) – Maximum number of poses to return (0 for all)

Returns:

List of poses with timestamps

Return type:

List[PoseStamped]

get_slam_landmarks(layer: DataLayer) Landmarks | None[source]

Get landmarks for a given data layer of SLAM. See SlamLandmarks, SlamDataLayer.

Parameters:

layer (DataLayer)

Return type:

Landmarks | None

get_pose_graph() PoseGraph | None[source]

Get pose graph consisting of all keyframes and their connections including loop closures. See PoseGraph.

Return type:

PoseGraph | None

get_localizer_probes() LocalizerProbes | None[source]

Get localizer probes from the most recent localization attempt. See LocalizerProbes.

Return type:

LocalizerProbes | None

get_slam_metrics() Metrics | None[source]

Get SLAM metrics. See SlamMetrics

Return type:

Metrics | None

get_loop_closure_poses() List[PoseStamped] | None[source]

Get list of last 10 loop closure poses with timestamps.

Return type:

List[PoseStamped] | None

Functions

cuvslam.get_version() tuple[str, int, int, int]

Get the version of cuVSLAM library.

Returns a tuple with the semantic version string and major, minor and patch values.

cuvslam.set_verbosity(arg: int, /) None

Set the verbosity level of the library.

Available values: 0 (default) for no output, 1 for error messages, 2 for warnings, 3 for info messages.

cuvslam.warm_up_gpu() None

Warm up the GPU (CUDA runtime).

It is not necessary to call it before the first call to cuvslam, but it can help reduce first-call latency. Throws an exception if CUDA, cuSOLVER or cuBLAS initialization fails.