A high-performance tool for converting semantic segmentation images to 3D point cloud data, specifically designed for autonomous driving mapping and localization applications. The system processes camera images with vehicle pose data to generate accurate 3D point clouds with real-time visualization capabilities.
- Semantic Image Processing: Convert 2D semantic segmentation images to 3D point clouds
- Vehicle Pose Integration: Incorporate real-time vehicle trajectory and orientation data
- Multi-layer Height Support: Handle different elevation levels for complex environments
- Point Cloud Clustering: Advanced clustering and filtering algorithms for data optimization
- Real-time Visualization: Interactive 3D visualization with PCL viewer
- PCD Files: Point Cloud Data format for 3D processing
- TXT Files: Human-readable point cloud coordinates
- Web Visualization: Interactive HTML-based 3D viewer with Plotly
- Trajectory Visualization: Vehicle path and pose visualization
- Data Classification: 13-class semantic labeling system
- Coordinate Transformation: Multiple coordinate system support
- Voxel Grid Filtering: Configurable resolution for point cloud optimization
- Line Fitting: RANSAC-based line detection and fitting
- Multi-threaded Processing: Optimized performance with OpenMP
- OS: Ubuntu 18.04+ / Debian 10+ (Linux recommended)
- Compiler: GCC 7.0+ with C++14 support
- Memory: 4GB+ RAM recommended
- Display: X11 support for GUI visualization
# Build tools
cmake (≥3.16)
make
g++
pkg-config
# Computer Vision & Point Cloud Libraries
libopencv-dev # OpenCV 4.x
libpcl-dev # Point Cloud Library 1.12
libeigen3-dev # Eigen3 linear algebra
libyaml-cpp-dev # YAML configuration parsing
# Visualization (VTK)
libvtk7-dev # VTK for PCL visualization
# or libvtk9-dev on newer systems
# Clone the repository
git clone https://github.com/xixu-me/image2pcd.git
cd image2pcd
# Install dependencies automatically
chmod +x scripts/install_deps.sh
./scripts/install_deps.sh
# Build the project
chmod +x scripts/build.sh
./scripts/build.sh
# Install system dependencies
sudo apt update
sudo apt install -y build-essential cmake pkg-config
sudo apt install -y libopencv-dev libpcl-dev libeigen3-dev libyaml-cpp-dev
sudo apt install -y libvtk7-dev # or libvtk9-dev
# Configure and build
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j$(nproc)
# For web-based visualization
pip3 install open3d plotly numpy
test_data/0619/1/Location_1750044182929507/
├── LocationImg/
│ ├── 1750044182929507.png # Semantic segmentation image
│ ├── 1750044182929507.jpg # Original camera image
│ ├── 1750044183031123.png # Next frame semantic image
│ └── ...
└── car_pose.txt # Vehicle pose data
# config/jpg_map.txt
test_data/0619/1/Location_1750044182929507/LocationImg
test_data/0619/1/Location_1750044182929507/car_pose.txt
1 # Mapping mode flag (0/1)
1 # Save rotated PCD flag (0/1)
1 # Save semantic 01 flag (0/1)
# car_pose.txt format: x,y,z,yaw,pitch,roll,timestamp
10.245,-5.678,0.125,1.234,0.056,-0.023,1750044182929507
10.267,-5.689,0.127,1.236,0.058,-0.025,1750044183031123
...
Label | Description | Color Coding |
---|---|---|
0 | Parking lines | Yellow |
1 | Lane lines | White |
2 | Lane center lines | Blue |
3 | Direction arrows | Green |
4 | Crosswalks | White |
5 | No parking signs | Red |
6 | Speed bumps | Orange |
7 | Pillars | Gray |
8 | Vehicles | Cyan |
9 | Limiters | Purple |
10 | Walls | Brown |
11 | Ground | Black |
12 | Roadside rocks | Dark Gray |
# Configure input paths
echo "test_data/0619/1/Location_1750044182929507/LocationImg" > config/jpg_map.txt
echo "test_data/0619/1/Location_1750044182929507/car_pose.txt" >> config/jpg_map.txt
echo "1" >> config/jpg_map.txt # Enable mapping mode
echo "1" >> config/jpg_map.txt # Save rotated PCD
echo "1" >> config/jpg_map.txt # Save semantic data
# Run the converter
./image2pcd
# Build the project
./scripts/build.sh
# Run the application
./scripts/run.sh
# Launch web visualization
./scripts/visualize.sh
# Clean build artifacts
./scripts/clean.sh
// Modify processing parameters in source
int image_height = 800; // Input image dimensions
int image_width = 800;
float pixel_scale_x = 0.02; // Meters per pixel
float pixel_scale_y = 0.02;
float voxel_leaf_size = 0.05; // Voxel grid resolution
// Vehicle parameters
float g_car_length = 4.8; // Vehicle length (meters)
float g_car_width = 2.0; // Vehicle width (meters)
float g_back_to_rear = 1.3; // Rear axle to back distance
# Interactive controls during visualization:
# SPACE - Pause/Resume processing
# Arrow keys - Navigate point cloud
# Mouse wheel - Zoom in/out
# Mouse drag - Rotate view
# Key build settings
set(CMAKE_CXX_STANDARD 14)
set(CMAKE_BUILD_TYPE Release)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -O3 -fopenmp")
# Required libraries
find_package(OpenCV REQUIRED)
find_package(Eigen3 REQUIRED)
find_package(yaml-cpp REQUIRED)
pkg_check_modules(PCL REQUIRED pcl_common-1.12 pcl_io-1.12)
Script | Purpose |
---|---|
scripts/build.sh |
Configure and build the project |
scripts/clean.sh |
Clean build artifacts |
scripts/install_deps.sh |
Install system dependencies |
scripts/run.sh |
Execute the application |
scripts/visualize.sh |
Launch web visualization |
- Image Loading: Multi-threaded PNG/JPG loading
- Semantic Classification: 13-class pixel classification
- Coordinate Transformation: Image → Vehicle → World coordinates
- Point Cloud Generation: 3D point creation with intensity values
- Clustering & Filtering: Voxel grid and outlier removal
- Visualization: Real-time PCL rendering
- Processing Speed: ~10-20 FPS (800x800 images)
- Memory Usage: ~2-4GB for typical datasets
- Point Density: ~50,000-200,000 points per frame
- Accuracy: Sub-centimeter precision with proper calibration
// Adjust voxel grid size for performance vs quality
voxel_grid_filter.setLeafSize(0.1, 0.1, 0.1); // Faster
voxel_grid_filter.setLeafSize(0.05, 0.05, 0.05); // Higher quality
// Enable OpenMP for multi-threading
export OMP_NUM_THREADS=4
# Run web visualization tool
python3 tools/pcd_web.py [input.pcd] [output_dir]
# Example
python3 tools/pcd_web.py test_data/map_only23456.pcd web_visualization/
- 3D Interactive Viewer: Rotate, zoom, pan
- Point Cloud Statistics: Count, bounds, density
- Color Coding: By intensity or semantic class
- Cross-platform: Works in any modern browser
- Export Options: Screenshots, point data
# Automatically opens in default browser
firefox web_visualization/index.html
# Or manually navigate to:
file:///path/to/project/web_visualization/index.html
- HD Map Generation: Create high-definition maps from camera data
- Lane Detection: Extract lane markings and road boundaries
- Parking Space Mapping: Identify and map parking areas
- Infrastructure Mapping: Map poles, signs, and road furniture
- SLAM Integration: Provide semantic landmarks for SLAM
- Visual Odometry: Support camera-based navigation
- Map Matching: Compare real-time data with stored maps
- Pose Estimation: Vehicle position and orientation tracking
// Point cloud output format
struct PointXYZI {
float x, y, z; // 3D coordinates (meters)
float intensity; // Semantic class or confidence
};
// Vehicle pose format
struct CarPose {
float x, y, z; // Position (meters)
float yaw, pitch, roll; // Orientation (radians)
int64_t timestamp; // Time (microseconds)
};
# PCL not found
sudo apt install libpcl-dev
export PCL_ROOT=/usr
# VTK version conflicts
sudo apt install libvtk7-dev
# or try: sudo apt install libvtk9-dev
# CMake version too old
sudo apt install cmake
# Check: cmake --version (should be ≥3.16)
# Segmentation fault
export DISPLAY=:0 # For X11 forwarding
ulimit -c unlimited # Enable core dumps
# No display available
export LIBGL_ALWAYS_INDIRECT=1
export LIBGL_ALWAYS_SOFTWARE=1
# Memory issues
# Reduce image resolution or enable swap
free -h # Check available memory
# Invalid pose file format
# Ensure comma-separated values: x,y,z,yaw,pitch,roll,timestamp
head -5 test_data/*/car_pose.txt
# Missing semantic images
# Verify PNG files exist in LocationImg/
ls test_data/*/LocationImg/*.png
# Build with debug symbols
cmake -DCMAKE_BUILD_TYPE=Debug ..
make -j$(nproc)
# Run with gdb
gdb ./image2pcd
(gdb) run
(gdb) bt # Show backtrace on crash
// Enable verbose output
std::cout << "Processing frame: " << timestamp << std::endl;
std::cout << "Point cloud size: " << cloud->size() << std::endl;
std::cout << "Pose: " << pose.x << "," << pose.y << "," << pose.yaw << std::endl;
test_data/
├── map_only23456.pcd # Final merged point cloud (binary)
├── map_only23456.txt # Point cloud in text format
└── intermediate/ # Temporary processing files
web_visualization/
├── index.html # Main visualization page
├── pointcloud_3d.html # 3D interactive viewer
└── assets/ # Static resources
# PCD format (binary)
# VERSION 0.7
# FIELDS x y z intensity
# SIZE 4 4 4 4
# TYPE F F F F
# COUNT 1 1 1 1
# WIDTH [point_count]
# HEIGHT 1
# VIEWPOINT 0 0 0 1 0 0 0
# POINTS [point_count]
# DATA binary
# TXT format (ASCII)
# x y z intensity
10.245 -5.678 0.125 1.0
10.267 -5.689 0.127 1.0
...
This project is licensed under the MIT License - see the LICENSE file for details.
- Point Cloud Library (PCL): 3D processing framework
- OpenCV: Computer vision and image processing
- Eigen3: Linear algebra and transformations
- Plotly: Interactive web visualizations
- Open3D: Point cloud processing and visualization