API Reference
This module provides Python bindings for the cuVSLAM library.
Data Structures
- class cuvslam.Pose
Transformation from one frame to another.
Consists of rotation (quaternion in x, y, z, w order) and translation (3-vector). cuVSLAM uses the OpenCV coordinate-system convention: x is right, y is down, z is forward.
- property rotation
Rotation (quaternion in x, y, z, w order)
- property translation
Translation (3-vector)
- class cuvslam.Distortion
Camera distortion model with parameters.
Supports Pinhole (no distortion), Brown (radial and tangential), Fisheye (equidistant), and Polynomial distortion models.
Common definitions used below:
principal point \((c_x, c_y)\)
focal length \((f_x, f_y)\)
Supported values of distortion_model:
Pinhole (0 parameters): No distortion
Fisheye (4 parameters): Also known as the equidistant distortion model for pinhole cameras. Coefficients k1, k2, k3, k4 are fully compatible with ethz-asl/kalibr (pinhole-equi) and OpenCV::fisheye. See ‘A Generic Camera Model and Calibration Method for Conventional, Wide-Angle, and Fish-Eye Lenses’ by Juho Kannala and Sami S. Brandt for further information. Note that this approach (pinhole plus undistortion) works only for fields of view below 180°; for the TUMVI dataset the FOV is ~190°. EuRoC and ORB_SLAM3 use a different approach (direct project/unproject without a pinhole model) and support FOVs above 180°, so their coefficients are incompatible with this model.
0-3: fisheye distortion coefficients \((k_1, k_2, k_3, k_4)\)
Each 3D point \((x, y, z)\) is projected in the following way: \((u, v) = (c_x, c_y) + diag(f_x, f_y) \cdot (distortedR(r) \cdot (x_n, y_n) / r)\)
where:
\(distortedR(r) = \arctan(r) \cdot (1 + k_1 \cdot \arctan^2(r) + k_2 \cdot \arctan^4(r) + k_3 \cdot \arctan^6(r) + k_4 \cdot \arctan^8(r))\) \(x_n = x/z\) \(y_n = y/z\) \(r = \sqrt{x_n^2 + y_n^2}\)
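The fisheye projection above can be sketched in plain Python. This is illustrative only, not part of the cuvslam module; the function name and argument layout are invented for the example:

```python
import math

def project_fisheye(point, focal, principal, k):
    """Project a 3D point with the equidistant (fisheye) model above.

    point: (x, y, z) in the camera frame (z > 0)
    focal: (fx, fy); principal: (cx, cy); k: (k1, k2, k3, k4)
    """
    x, y, z = point
    x_n, y_n = x / z, y / z
    r = math.hypot(x_n, y_n)
    if r == 0.0:  # point on the optical axis projects to the principal point
        return principal
    t = math.atan(r)
    # distortedR(r) = atan(r) * (1 + k1*atan^2(r) + k2*atan^4(r) + k3*atan^6(r) + k4*atan^8(r))
    dist_r = t * (1 + k[0]*t**2 + k[1]*t**4 + k[2]*t**6 + k[3]*t**8)
    scale = dist_r / r
    return (principal[0] + focal[0] * scale * x_n,
            principal[1] + focal[1] * scale * y_n)

# With all coefficients zero the model reduces to pure equidistant projection:
u, v = project_fisheye((1.0, 0.0, 1.0), (400.0, 400.0), (320.0, 240.0), (0, 0, 0, 0))
# u = 320 + 400 * atan(1) = 320 + 100*pi ≈ 634.16, v = 240
```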
Brown (5 parameters):
0-2: radial distortion coefficients \((k_1, k_2, k_3)\)
3-4: tangential distortion coefficients \((p_1, p_2)\)
Each 3D point \((x, y, z)\) is projected in the following way: \((u, v) = (c_x, c_y) + diag(f_x, f_y) \cdot (radial \cdot (x_n, y_n) + tangential)\)
where:
\(radial = (1 + k_1 \cdot r^2 + k_2 \cdot r^4 + k_3 \cdot r^6)\) \(tangential.x = 2 \cdot p_1 \cdot x_n \cdot y_n + p_2 \cdot (r^2 + 2 \cdot x_n^2)\) \(tangential.y = p_1 \cdot (r^2 + 2 \cdot y_n^2) + 2 \cdot p_2 \cdot x_n \cdot y_n\) \(x_n = x/z\) \(y_n = y/z\) \(r = \sqrt{x_n^2 + y_n^2}\)
Polynomial (8 parameters): Coefficients are compatible with the first 8 coefficients of the OpenCV distortion model.
0-1: radial distortion coefficients \((k_1, k_2)\)
2-3: tangential distortion coefficients \((p_1, p_2)\)
4-7: radial distortion coefficients \((k_3, k_4, k_5, k_6)\)
Each 3D point \((x, y, z)\) is projected in the following way: \((u, v) = (c_x, c_y) + diag(f_x, f_y) \cdot (radial \cdot (x_n, y_n) + tangential)\)
where:
\(radial = \frac{1 + k_1 \cdot r^2 + k_2 \cdot r^4 + k_3 \cdot r^6}{1 + k_4 \cdot r^2 + k_5 \cdot r^4 + k_6 \cdot r^6}\) \(tangential.x = 2 \cdot p_1 \cdot x_n \cdot y_n + p_2 \cdot (r^2 + 2 \cdot x_n^2)\) \(tangential.y = p_1 \cdot (r^2 + 2 \cdot y_n^2) + 2 \cdot p_2 \cdot x_n \cdot y_n\) \(x_n = x/z\) \(y_n = y/z\) \(r = \sqrt{x_n^2 + y_n^2}\)
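The rational polynomial projection above can likewise be sketched in plain Python (illustrative only; not part of the cuvslam module). Note that with \(k_4 = k_5 = k_6 = 0\) it reduces to the 5-parameter Brown model:

```python
def project_polynomial(point, focal, principal, coeffs):
    """Project a 3D point with the 8-parameter rational model above.

    coeffs: (k1, k2, p1, p2, k3, k4, k5, k6), matching the parameter
    indices listed above. With k4 = k5 = k6 = 0 this is the Brown model.
    """
    k1, k2, p1, p2, k3, k4, k5, k6 = coeffs
    x, y, z = point
    x_n, y_n = x / z, y / z
    r2 = x_n*x_n + y_n*y_n
    radial = (1 + k1*r2 + k2*r2**2 + k3*r2**3) / (1 + k4*r2 + k5*r2**2 + k6*r2**3)
    tan_x = 2*p1*x_n*y_n + p2*(r2 + 2*x_n*x_n)
    tan_y = p1*(r2 + 2*y_n*y_n) + 2*p2*x_n*y_n
    return (principal[0] + focal[0] * (radial*x_n + tan_x),
            principal[1] + focal[1] * (radial*y_n + tan_y))

# With all coefficients zero the model reduces to an ideal pinhole:
u, v = project_polynomial((1.0, 1.0, 2.0), (400.0, 400.0), (320.0, 240.0), (0,)*8)
# u = 320 + 400*0.5 = 520, v = 240 + 400*0.5 = 440
```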
- class Model(value)
Distortion model types for camera calibration
- Pinhole = 0
- Fisheye = 1
- Brown = 2
- Polynomial = 3
- property model
Distortion model type, see
Distortion.Model
- property parameters
Array of distortion parameters depending on model
- class cuvslam.Camera
Camera calibration parameters.
Describes intrinsic and extrinsic parameters of a camera and per-camera settings. In the camera image coordinate system, the top-left pixel has coordinate (0, 0) (y is down, x is right).
- property border_bottom
Bottom border to ignore in pixels (0 to use full frame)
- property border_left
Left border to ignore in pixels (0 to use full frame)
- property border_right
Right border to ignore in pixels (0 to use full frame)
- property border_top
Top border to ignore in pixels (0 to use full frame)
- property distortion
Distortion parameters, see
Distortion
- property focal
Focal length (fx, fy)
- property principal
Principal point (cx, cy)
- property rig_from_camera
Transformation from the camera coordinate frame to the rig coordinate frame
- property size
Size of the camera (width, height)
- class cuvslam.ImuCalibration
IMU Calibration parameters
- property accelerometer_noise_density
Accelerometer noise density in \(m/(s^2 \cdot \sqrt{Hz})\)
- property accelerometer_random_walk
Accelerometer random walk in \(m/(s^3 \cdot \sqrt{Hz})\)
- property frequency
IMU frequency in \(Hz\)
- property gyroscope_noise_density
Gyroscope noise density in \(rad/(s \cdot \sqrt{Hz})\)
- property gyroscope_random_walk
Gyroscope random walk in \(rad/(s^2 \cdot \sqrt{Hz})\)
- property rig_from_imu
Transformation from IMU coordinate frame to the rig coordinate frame
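A common sanity check on these units, under the standard Kalibr-style IMU noise model (not a cuVSLAM API call; the values below are illustrative placeholders):

```python
import math

# Typical MEMS IMU values in the units listed above (illustrative only)
frequency = 200.0                   # Hz
gyro_noise_density = 1.6e-4         # rad/(s*sqrt(Hz))
accel_noise_density = 2.0e-3        # m/(s^2*sqrt(Hz))

# Under the standard (Kalibr-style) noise model, the white-noise standard
# deviation of a single discrete sample is density * sqrt(frequency):
gyro_sigma = gyro_noise_density * math.sqrt(frequency)      # rad/s per sample
accel_sigma = accel_noise_density * math.sqrt(frequency)    # m/s^2 per sample
```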
- class cuvslam.Rig
Rig consisting of cameras and zero or one IMU sensor
- property imus
List of IMU sensors in the rig (0 or 1 only), see
ImuCalibration
- class cuvslam.PoseEstimate
Rig pose estimate from the tracker. The pose is world_from_rig where:
The rig coordinate frame is user-defined and depends on the extrinsic parameters of the cameras. The world coordinate frame is an arbitrary 3D coordinate frame, corresponding to the rig frame at the start of tracking.
- property timestamp_ns
Timestamp of the pose estimate in nanoseconds
- property world_from_rig
Rig pose in the world coordinate frame
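Applying a world_from_rig pose to a point follows the quaternion convention documented for Pose (x, y, z, w order). A plain-Python sketch, independent of the cuvslam module; the helper function is invented for the example:

```python
import math

def apply_pose(rotation, translation, point):
    """Apply a world_from_rig transform to a point in the rig frame.

    rotation: quaternion (x, y, z, w) as in cuvslam.Pose
    translation: (tx, ty, tz); returns the point in the world frame.
    """
    qx, qy, qz, qw = rotation
    px, py, pz = point
    # Rotate via the standard expansion p' = p + 2*qw*(v x p) + 2*(v x (v x p)),
    # where v = (qx, qy, qz) is the quaternion's vector part.
    cx = qy*pz - qz*py
    cy = qz*px - qx*pz
    cz = qx*py - qy*px
    rx = px + 2*(qw*cx + qy*cz - qz*cy)
    ry = py + 2*(qw*cy + qz*cx - qx*cz)
    rz = pz + 2*(qw*cz + qx*cy - qy*cx)
    return (rx + translation[0], ry + translation[1], rz + translation[2])

# 90° rotation about z (in x, y, z, w order) plus a translation:
q = (0.0, 0.0, math.sin(math.pi/4), math.cos(math.pi/4))
world_point = apply_pose(q, (1.0, 0.0, 0.0), (1.0, 0.0, 0.0))
# Rotating (1,0,0) by 90° about z gives (0,1,0); plus the translation → (1,1,0)
```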
- class cuvslam.ImuMeasurement
- property angular_velocities
Angular velocities in \(rad/s\)
- property linear_accelerations
Linear accelerations in \(m/s^2\)
- property timestamp_ns
Timestamp of the IMU measurement in nanoseconds
- class cuvslam.Landmark
3D landmark point
- property coords
3D coordinates of the landmark in the world coordinate frame
- property id
Unique ID of the landmark
- class cuvslam.Observation
2D observation of a landmark in an image
- property camera_index
Index of the camera that made this observation
- property id
Unique ID of the observed landmark
- property u
Horizontal pixel coordinate of the observation
- property v
Vertical pixel coordinate of the observation
- class cuvslam.core.Odometry.Config
- property async_sba
Enable to run bundle adjustment asynchronously
- property debug_dump_directory
Directory for debug data dumps. If empty, no debug data will be dumped
- property debug_imu_mode
Enable IMU debug mode
- property enable_final_landmarks_export
Enable to export final landmarks. Also sets enable_landmarks_export and enable_observations_export.
- property enable_landmarks_export
Enable to export landmarks during tracking
- property enable_observations_export
Enable to export landmark observations in images during tracking
- property max_frame_delta_s
Maximum time difference between frames in seconds
- property multicam_mode
See
Odometry.MulticameraMode
- property odometry_mode
See
Odometry.OdometryMode
- property rectified_stereo_camera
Enable if stereo cameras are rectified and horizontally aligned
- property rgbd_settings
Settings for RGB-D odometry mode. See
Odometry.RGBDSettings
- property use_denoising
Enable to apply denoising to input images
- property use_gpu
Enable to use GPU acceleration
- property use_motion_model
Enable to use motion model for pose prediction
- class cuvslam.core.Odometry.RGBDSettings
Settings for RGB-D odometry mode
- property depth_camera_id
ID of the camera that the depth image is aligned with
- property depth_scale_factor
Scale factor for depth measurements
- property enable_depth_stereo_tracking
Whether to enable stereo tracking between depth-aligned camera and other cameras
- class cuvslam.core.Slam.Config
SLAM configuration parameters
- property enable_reading_internals
Enable reading internal data from SLAM
- property gt_align_mode
Special mode for visual map building with ground truth
- property map_cache_path
If empty, map is kept in memory only. Else, map is synced to disk (LMDB) at this path, allowing large-scale maps; if the path already exists it will be overwritten. To load an existing map, use LocalizeInMap(). To save map, use SaveMap().
- property map_cell_size
Size of map cell (0 to auto-calculate from camera baseline)
- property max_landmarks_distance
Maximum distance from camera to landmark for inclusion in map
- property max_map_size
Maximum number of poses in SLAM pose graph (0 for unlimited)
- property planar_constraints
Modify poses so camera moves on a horizontal plane
- property sync_mode
If true, localization and mapping run in the same thread as visual odometry
- property throttling_time_ms
Minimum time between loop closure events in milliseconds
- property use_gpu
Whether to use GPU acceleration
- class cuvslam.core.Slam.Metrics
SLAM metrics
- property lc_good_landmarks_count
Count of landmarks in loop closure
- property lc_pnp_landmarks_count
Count of landmarks in PnP for loop closure
- property lc_selected_landmarks_count
Count of landmarks selected for loop closure
- property lc_status
Loop closure status
- property lc_tracked_landmarks_count
Count of landmarks tracked in loop closure
- property pgo_status
Pose graph optimization status
- property timestamp_ns
Timestamp of measurements in nanoseconds
- class cuvslam.core.Slam.LocalizationSettings
Localization settings
- property angular_step_rads
Angular step around vertical axis in radians
- property horizontal_search_radius
Horizontal search radius in meters
- property horizontal_step
Horizontal step in meters
- property vertical_search_radius
Vertical search radius in meters
- property vertical_step
Vertical step in meters
Tracker class
- class cuvslam.Tracker(rig: Rig, odom_config: Config | None = None, slam_config: Config | None = None)[source]
A wrapper that combines cuVSLAM Odometry and SLAM functionality.
This class automatically manages both the Odometry and SLAM instances, providing a simplified interface for common use cases.
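A minimal usage sketch. The attribute names follow the classes documented above, but the calibration values are placeholders and `frame_source` stands in for your own image loader; treat the construction details as a sketch rather than a definitive recipe:

```python
import cuvslam

def make_camera(size, focal, principal):
    # Placeholder calibration; real values come from your calibration file.
    cam = cuvslam.Camera()
    cam.size = size
    cam.focal = focal
    cam.principal = principal
    return cam

# Stereo rig; a real setup must also set rig_from_camera for each camera.
rig = cuvslam.Rig()
rig.cameras = [make_camera((640, 480), (400.0, 400.0), (320.0, 240.0)),
               make_camera((640, 480), (400.0, 400.0), (320.0, 240.0))]

tracker = cuvslam.Tracker(rig)  # odometry only; pass slam_config to enable SLAM

for timestamp_ns, images in frame_source:  # frame_source is your own loader
    odom_pose, slam_pose = tracker.track(timestamp_ns, images)
    if odom_pose.world_from_rig is not None:
        print(timestamp_ns, odom_pose.world_from_rig.translation)
```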
- OdometryRGBDSettings
alias of
RGBDSettings
- SlamLocalizationSettings
alias of
LocalizationSettings
- SlamDataLayer
alias of
DataLayer
- class PoseGraphNode
Node in a pose graph
- property id
Node identifier
- property node_pose
Node pose
- class PoseGraphEdge
Edge in a pose graph
- property covariance
Covariance matrix of the transformation
- property node_from
Source node ID
- property node_to
Target node ID
- property transform
Transformation from source to target node
- class PoseGraph
Pose graph structure
- property edges
List of edges in the graph
- property nodes
List of nodes in the graph
- SlamLandmark
alias of
Landmark
- SlamLandmarks
alias of
Landmarks
- class LocalizerProbe
Localizer probe used during map localization.
Contains input guess pose, result pose and weights, and success flag.
- property exact_result_pose
Exact pose if localization succeeded
- property exact_result_weight
Weight of the resulting solution
- property guess_pose
Input guess pose
- property id
Probe identifier
- property solved
True if the probe was solved successfully
- property weight
Input weight for the probe
- class LocalizerProbes
Collection of localizer probes for a localization attempt.
- property probes
List of localizer probes, see
Slam.LocalizerProbe
- property size
Size of search area
- property timestamp_ns
Timestamp of localizer try in nanoseconds
- track(timestamp: int, images: List[Any], masks: List[Any] | None = None, depths: List[Any] | None = None, gt_pose: Pose | None = None) Tuple[PoseEstimate, Pose | None][source]
Track a rig pose using current image frame.
This method combines odometry and SLAM processing in a single call.
In inertial mode, if the visual odometry tracker fails to compute a pose, the function returns a position calculated from user-provided IMU data. If visual odometry cannot recover after several calls to track(), an invalid pose is returned. Odometry outputs poses in the same coordinate frame until tracking is lost.
To get SLAM poses, SLAM must be enabled in the constructor by providing a non-null slam_config. SLAM poses may have loop closure (LC) jumps when LC is detected and pose graph is optimized. SLAM poses cannot be adjusted retroactively, so use get_all_slam_poses method to get smooth trajectory up to the latest frame. Also, in asynchronous mode, LC is done in a separate work thread to keep track call fast, so SLAM poses are not updated immediately.
All cameras must be synchronized. If a camera rig provides “almost synchronized” frames, the timestamps should be within 1 millisecond.
Images (masks, depth images, etc.) can be numpy arrays or tensors, either GPU (CUDA) or CPU, but all data must be of the same type (all GPU or all CPU). This is independent of odom_config.use_gpu: even if odometry runs its computations on the GPU, the input images can still be either CPU or GPU arrays/tensors.
The images (and masks, depths) must be in the same order as the cameras in the rig. If data for a camera is not available, pass an empty array or tensor for that camera.
- Parameters:
timestamp (int) – Images timestamp in nanoseconds
images (List[Any]) – List of numpy arrays or tensors containing the camera images
masks (List[Any] | None) – Optional list of numpy arrays or tensors containing masks for the images
depths (List[Any] | None) – Optional list of numpy arrays or tensors containing depth images
gt_pose (Pose | None) – Optional ground truth pose. Should be provided if gt_align_mode is enabled, otherwise should be None.
- Returns:
A tuple of two elements. PoseEstimate: the computed pose estimate from Odometry; on failure its world_from_rig is None. Pose: if SLAM is enabled, the computed pose estimate from SLAM, otherwise None.
- Return type:
Tuple[PoseEstimate, Pose | None]
- Raises:
ValueError – If data checks fail (e.g. timestamps are out of order, image sizes are inconsistent, etc.).
- register_imu_measurement(sensor_index: int, imu_measurement: ImuMeasurement) None[source]
Register an IMU measurement with the tracker.
Requires Inertial odometry mode.
If visual odometry loses camera position, it briefly continues execution using user-provided IMU measurements while trying to recover the position.
register_imu_measurement() should be called between image acquisitions, for however many IMU measurements arrive:
tracker.track
tracker.register_imu_measurement
…
tracker.register_imu_measurement
tracker.track
tracker.register_imu_measurement
…
Higher-frequency IMU measurements are recommended.
IMU sensor and camera clocks must be synchronized.
track() and register_imu_measurement() must be called in strict ascending order of timestamps.
- Parameters:
sensor_index (int) – Sensor index; must be 0, as only one sensor is supported now
imu_measurement (ImuMeasurement) – IMU measurement to register
- Raises:
ValueError – If IMU fusion is disabled or if called out of the order of timestamps.
- Return type:
None
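The required strict timestamp ordering can be illustrated without the library. Given frame and IMU timestamps, the calls must interleave in ascending order (a plain-Python sketch, not cuVSLAM API):

```python
def call_order(frame_ts, imu_ts):
    """Merge frame and IMU timestamps into the strict ascending call order.

    Returns a list of ("track", t) and ("register_imu_measurement", t) pairs.
    """
    tagged = [(t, "track") for t in frame_ts] + \
             [(t, "register_imu_measurement") for t in imu_ts]
    return [(name, t) for t, name in sorted(tagged)]

# 30 Hz frames with 200 Hz IMU: several IMU registrations land between tracks.
order = call_order([0, 33_000_000],
                   [5_000_000, 10_000_000, 15_000_000, 40_000_000])
# → track(0), three register_imu_measurement calls, track(33 ms), one more
#   register_imu_measurement — all in strictly ascending timestamp order
```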
- get_last_observations(camera_index: int) list[Observation][source]
Get observations from the last frame for specified camera. See
Observation. Requires enable_observations_export=True in OdometryConfig.
- Parameters:
camera_index (int)
- Return type:
list[Observation]
- get_last_landmarks() list[Landmark][source]
Get landmarks from the last frame. See
Landmark. Requires enable_landmarks_export=True in OdometryConfig.
- Return type:
list[Landmark]
- get_last_gravity() list[float] | None[source]
Get gravity vector from the last frame. Returns None if gravity is not yet available. Requires Inertial mode (odometry_mode=OdometryMode.Inertial in
OdometryConfig).
- Return type:
list[float] | None
- get_final_landmarks() dict[int, list[float]][source]
Get all final landmarks from all frames. Landmarks are 3D points in the odometry start frame. Requires enable_final_landmarks_export=True in
OdometryConfig.
- Return type:
dict[int, list[float]]
- get_all_slam_poses(max_poses_count: int = 0) List[PoseStamped][source]
Get all SLAM poses for each frame.
Returns all SLAM poses after optimization with the latest pose graph. SLAM poses from the track() method will have jumps after loop closures. This method, by contrast, returns a smooth trajectory because it recalculates past poses using the current pose graph.
- Parameters:
max_poses_count (int) – Maximum number of poses to return (0 for all)
- Returns:
List of poses with timestamps
- Return type:
List[PoseStamped]
- get_slam_landmarks(layer: DataLayer) Landmarks | None[source]
Get landmarks for a given data layer of SLAM. See
SlamLandmarks, SlamDataLayer.
- Parameters:
layer (DataLayer)
- Return type:
Landmarks | None
- get_pose_graph() PoseGraph | None[source]
Get pose graph consisting of all keyframes and their connections including loop closures. See
PoseGraph.
- Return type:
PoseGraph | None
- get_localizer_probes() LocalizerProbes | None[source]
Get localizer probes from the most recent localization attempt. See
LocalizerProbes.
- Return type:
LocalizerProbes | None
- get_slam_metrics() Metrics | None[source]
Get SLAM metrics. See
SlamMetrics.
- Return type:
Metrics | None
Functions
- cuvslam.get_version() tuple[str, int, int, int]
Get the version of the cuVSLAM library.
Returns a tuple with the semantic version string and the major, minor, and patch values.
- cuvslam.set_verbosity(arg: int, /) None
Set the verbosity level of the library.
Available values: 0 (default) for no output, 1 for error messages, 2 for warnings, 3 for info messages.
- cuvslam.warm_up_gpu() None
Warm up the GPU (CUDA runtime).
It is not necessary to call this before the first call to cuVSLAM, but it can help reduce first-call latency. Throws an exception if CUDA, cuSOLVER, or cuBLAS initialization fails.