Visual Sensors#

nvblox supports both Camera and LiDAR sensors. It is also possible to extend the library to support other sensors; see Extending nvblox to support other sensors below.

Camera#

The camera sensor is modeled as a pinhole camera operating in the computer vision coordinate frame, defined as:

  • Z-axis: Forward (optical axis)

  • X-axis: Right (along the image width)

  • Y-axis: Down (along the image height)

The corresponding depth images store Z-values, i.e. distances along the camera’s Z-axis.

A 3D point \(P = (X, Y, Z)\) in camera coordinates projects to pixel coordinates \((u, v)\) using the pinhole model:

\[\begin{split}u = f_x \frac{X}{Z} + c_x \\ v = f_y \frac{Y}{Z} + c_y\end{split}\]

where \(Z\) is the depth value stored in the depth image, and the camera intrinsics are defined in the table below:

| Notation | Parameter name in camera.h | Description |
|---|---|---|
| \(f_x\) | fu | Focal length in the u (width) direction [px] |
| \(f_y\) | fv | Focal length in the v (height) direction [px] |
| \(c_x\) | cu | Principal point offset in the u (width) direction [px] |
| \(c_y\) | cv | Principal point offset in the v (height) direction [px] |
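The projection above can be sketched numerically. This is an illustrative standalone sketch, not nvblox code; the intrinsic values (fu = fv = 500 px, cu = 320 px, cv = 240 px) are made up for the example, and the parameter names follow the table above.

```python
def project_pinhole(X, Y, Z, fu, fv, cu, cv):
    """Project a 3D point in the camera frame (Z forward, X right, Y down)
    to pixel coordinates using the pinhole model."""
    u = fu * X / Z + cu
    v = fv * Y / Z + cv
    return u, v

# A point 2 m ahead, 1 m to the right, 0.5 m down,
# with illustrative intrinsics fu = fv = 500, cu = 320, cv = 240.
u, v = project_pinhole(1.0, 0.5, 2.0, 500.0, 500.0, 320.0, 240.0)
# u = 500 * (1.0 / 2.0) + 320 = 570.0
# v = 500 * (0.5 / 2.0) + 240 = 365.0
```

Note that the depth value stored in the depth image for this pixel would be \(Z = 2.0\), the distance along the optical axis, not the Euclidean distance to the point.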

LiDAR#

The LiDAR sensor is parameterized using spherical coordinates with range \(r\), azimuth \(\phi\) (horizontal angle), and polar angle \(\alpha\) (vertical angle measured from the +Z axis).

The LiDAR sensor operates in the standard robotics coordinate frame, defined as:

  • X-axis: Forward (\(\phi = 0^\circ\), \(\alpha = 90^\circ\))

  • Y-axis: Left

  • Z-axis: Up

A 3D point \(P = (X, Y, Z)\) in LiDAR coordinates converts to spherical coordinates using:

\[\begin{split}r = \sqrt{X^2 + Y^2 + Z^2} \\ \phi = \operatorname{atan2}(Y, X) \\ \alpha = \arccos\!\left(\frac{Z}{r}\right)\end{split}\]

where \(r\) is the radial distance stored as depth at image coordinate \((u,v)\). The depth image coordinates are obtained by applying the LiDAR intrinsics to the spherical coordinates:

\[\begin{split}u = (\phi - \phi_{\text{start}}) \cdot \frac{N_\phi}{2\pi} \\ v = (\alpha - \alpha_{\text{start}}) \cdot \frac{N_\alpha - 1}{\text{vfov}}\end{split}\]

with intrinsic parameters defined in the table below:

| Notation | Parameter name in lidar.h | Description |
|---|---|---|
| \(N_\phi\) | num_azimuth_divisions | Number of azimuth divisions (range image width) |
| \(N_\alpha\) | num_elevation_divisions | Number of elevation divisions (range image height) |
| \(\phi_{\text{start}}\) | start_azimuth_angle_rad | Starting azimuth angle for image mapping [rad] |
| \(\alpha_{\text{start}}\) | start_polar_angle_rad | Starting polar angle (from +Z) for image mapping [rad] |
| \(\text{vfov}\) | vertical_fov_rad | Vertical field of view [rad] |
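The two conversion steps above can be sketched together. This is an illustrative standalone sketch, not nvblox code; the parameter names mirror the table above, and the concrete values (a 1024x32 range image with a 30° vertical field of view centered on the horizon) are assumptions chosen for the example.

```python
import math

def lidar_point_to_pixel(X, Y, Z,
                         num_azimuth_divisions, num_elevation_divisions,
                         start_azimuth_angle_rad, start_polar_angle_rad,
                         vertical_fov_rad):
    """Convert a 3D point in the LiDAR frame (X forward, Y left, Z up)
    to range-image coordinates (u, v) and the stored range r."""
    r = math.sqrt(X**2 + Y**2 + Z**2)
    phi = math.atan2(Y, X)          # azimuth
    alpha = math.acos(Z / r)        # polar angle measured from +Z
    u = (phi - start_azimuth_angle_rad) * num_azimuth_divisions / (2.0 * math.pi)
    v = (alpha - start_polar_angle_rad) * (num_elevation_divisions - 1) / vertical_fov_rad
    return u, v, r

# A point 10 m straight ahead on the X axis: r = 10, phi = 0, alpha = 90 deg.
# Assumed sensor: 1024x32 image, 30 deg vfov centered on the horizon,
# so start_polar_angle_rad = pi/2 - vfov/2.
u, v, r = lidar_point_to_pixel(10.0, 0.0, 0.0,
                               1024, 32,
                               0.0, math.pi / 2 - math.pi / 12, math.pi / 6)
# r = 10.0, u = 0.0, and v = (pi/12) * 31 / (pi/6) = 15.5 (mid-height row)
```

In contrast to the camera, the value stored in the range image is the full Euclidean distance \(r\), not a distance along a single axis.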

Extending nvblox to support other sensors#

For advanced users, nvblox can be extended to support other sensors by implementing the interface defined in sensor.h; the existing implementations in camera.h and lidar.h serve as reference. Your sensor class must inherit from SensorBase and implement all methods checked by the is_sensor_interface trait. This can be done without modifying the core library.

Template Instantiations#

After implementing your sensor class, you must provide explicit template instantiations for the functions that are templated on the sensor type. Follow the pattern in camera_inst.cu and link the resulting instantiation file into your project.

Implementation Notes#

  • Depth Representation: Implement getDepth() to return the appropriate distance metric for your sensor (e.g., Euclidean distance for LiDAR, Z-distance for cameras).

  • Interpolation: The interpolateDepthImage() method should handle sensor-specific interpolation logic, including discontinuity detection and ray-distance validation.

  • CUDA Compatibility: All methods must be callable from both host and device code using __host__ __device__ qualifiers.

  • Build Integration: Add your instantiation file to the appropriate build target to ensure it gets compiled and linked.