Image2PCD: Semantic Image to Point Cloud Converter


A high-performance tool for converting semantic segmentation images to 3D point cloud data, specifically designed for autonomous driving mapping and localization applications. The system processes camera images with vehicle pose data to generate accurate 3D point clouds with real-time visualization capabilities.

🚀 Features

Core Functionality

  • Semantic Image Processing: Convert 2D semantic segmentation images to 3D point clouds
  • Vehicle Pose Integration: Incorporate real-time vehicle trajectory and orientation data
  • Multi-layer Height Support: Handle different elevation levels for complex environments
  • Point Cloud Clustering: Advanced clustering and filtering algorithms for data optimization
  • Real-time Visualization: Interactive 3D visualization with PCL viewer

Output Formats

  • PCD Files: Point Cloud Data format for 3D processing
  • TXT Files: Human-readable point cloud coordinates
  • Web Visualization: Interactive HTML-based 3D viewer with Plotly
  • Trajectory Visualization: Vehicle path and pose visualization

Advanced Features

  • Data Classification: 13-class semantic labeling system
  • Coordinate Transformation: Multiple coordinate system support
  • Voxel Grid Filtering: Configurable resolution for point cloud optimization
  • Line Fitting: RANSAC-based line detection and fitting
  • Multi-threaded Processing: Optimized performance with OpenMP

📋 Prerequisites

System Requirements

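  • OS: Ubuntu 18.04+ / Debian 10+ (Linux recommended). The build pins PCL 1.12 and CMake ≥3.16; Ubuntu 22.04 packages satisfy both, while older releases may need newer versions than their default repositories provide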
  • Compiler: GCC 7.0+ with C++14 support
  • Memory: 4GB+ RAM recommended
  • Display: X11 support for GUI visualization

Core Dependencies

# Build tools
cmake (≥3.16)
make
g++
pkg-config

# Computer Vision & Point Cloud Libraries
libopencv-dev     # OpenCV 4.x
libpcl-dev        # Point Cloud Library 1.12
libeigen3-dev     # Eigen3 linear algebra
libyaml-cpp-dev   # YAML configuration parsing

# Visualization (VTK)
libvtk7-dev       # VTK for PCL visualization
# or libvtk9-dev on newer systems

🛠️ Installation

Quick Install (Ubuntu/Debian)

# Clone the repository
git clone https://github.com/xixu-me/image2pcd.git
cd image2pcd

# Install dependencies automatically
chmod +x scripts/install_deps.sh
./scripts/install_deps.sh

# Build the project
chmod +x scripts/build.sh
./scripts/build.sh

Manual Installation

# Install system dependencies
sudo apt update
sudo apt install -y build-essential cmake pkg-config
sudo apt install -y libopencv-dev libpcl-dev libeigen3-dev libyaml-cpp-dev
sudo apt install -y libvtk7-dev  # or libvtk9-dev

# Configure and build
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j$(nproc)

Python Visualization Tools (Optional)

# For web-based visualization
pip3 install open3d plotly numpy

📁 Data Structure

Input Directory Structure

test_data/0619/1/Location_1750044182929507/
├── LocationImg/
│   ├── 1750044182929507.png    # Semantic segmentation image
│   ├── 1750044182929507.jpg    # Original camera image
│   ├── 1750044183031123.png    # Next frame semantic image
│   └── ...
└── car_pose.txt                # Vehicle pose data

Configuration File Format

# config/jpg_map.txt
test_data/0619/1/Location_1750044182929507/LocationImg
test_data/0619/1/Location_1750044182929507/car_pose.txt
1    # Mapping mode flag (0/1)
1    # Save rotated PCD flag (0/1)
1    # Save semantic 01 flag (0/1)
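
The five-value layout above could be read with a small helper like the one below. This is only a sketch: `MapConfig`, `strip_comment`, and `parse_map_config` are hypothetical names, and the handling of `#` comments is an assumption made for illustration, since the project's actual parser may expect bare values only.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for the five values in config/jpg_map.txt.
struct MapConfig {
    std::string image_dir;
    std::string pose_file;
    bool mapping_mode;
    bool save_rotated_pcd;
    bool save_semantic;
};

// Drop a trailing "# comment" (assumed syntax) and surrounding whitespace.
inline std::string strip_comment(const std::string& line) {
    const std::string s = line.substr(0, line.find('#'));
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Read the first five non-empty values in order.
// Throws std::runtime_error if a value is missing or a flag is not 0 or 1.
inline MapConfig parse_map_config(std::istream& in) {
    std::vector<std::string> values;
    std::string line;
    while (values.size() < 5 && std::getline(in, line)) {
        const std::string v = strip_comment(line);
        if (!v.empty()) values.push_back(v);
    }
    if (values.size() < 5) throw std::runtime_error("expected 5 config values");

    auto flag = [](const std::string& v) {
        if (v != "0" && v != "1") throw std::runtime_error("flag must be 0 or 1: " + v);
        return v == "1";
    };
    MapConfig cfg;
    cfg.image_dir = values[0];
    cfg.pose_file = values[1];
    cfg.mapping_mode = flag(values[2]);
    cfg.save_rotated_pcd = flag(values[3]);
    cfg.save_semantic = flag(values[4]);
    return cfg;
}
```

Passing a `std::ifstream` opened on `config/jpg_map.txt` would fill the struct; a `std::istringstream` works the same way for in-memory text.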

Vehicle Pose Data Format

# car_pose.txt format: x,y,z,yaw,pitch,roll,timestamp
10.245,-5.678,0.125,1.234,0.056,-0.023,1750044182929507
10.267,-5.689,0.127,1.236,0.058,-0.025,1750044183031123
...
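
A line in this format could be parsed roughly as follows. This is a minimal sketch rather than the project's own reader: `parse_pose_line` is a hypothetical helper, and the `CarPose` layout simply mirrors the struct shown under Data Formats.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Mirrors the CarPose struct from the Data Formats section.
struct CarPose {
    float x, y, z;           // Position (meters)
    float yaw, pitch, roll;  // Orientation (radians)
    std::int64_t timestamp;  // Time (microseconds)
};

// Parse one "x,y,z,yaw,pitch,roll,timestamp" line.
// Throws std::runtime_error on a wrong field count; std::stof/std::stoll
// throw std::invalid_argument on non-numeric fields.
inline CarPose parse_pose_line(const std::string& line) {
    std::vector<std::string> fields;
    std::istringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ',')) fields.push_back(field);
    if (fields.size() != 7) {
        throw std::runtime_error("expected 7 comma-separated fields: " + line);
    }
    CarPose p;
    p.x = std::stof(fields[0]);
    p.y = std::stof(fields[1]);
    p.z = std::stof(fields[2]);
    p.yaw = std::stof(fields[3]);
    p.pitch = std::stof(fields[4]);
    p.roll = std::stof(fields[5]);
    p.timestamp = std::stoll(fields[6]);  // 64-bit: microsecond stamps overflow int
    return p;
}
```

The timestamp is parsed as a 64-bit integer because values such as 1750044182929507 do not fit in 32 bits.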

Semantic Class Labels

Label  Description        Color Coding
0      Parking lines      Yellow
1      Lane lines         White
2      Lane center lines  Blue
3      Direction arrows   Green
4      Crosswalks         White
5      No parking signs   Red
6      Speed bumps        Orange
7      Pillars            Gray
8      Vehicles           Cyan
9      Limiters           Purple
10     Walls              Brown
11     Ground             Black
12     Roadside rocks     Dark Gray
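
The labels in the table above could be mirrored in code as an enumeration. The identifier names here (`SemanticClass`, `is_valid_label`, `class_name`) are illustrative assumptions; only the numeric labels and descriptions come from the table, and how pixel values in the PNG files map to these labels is not covered by this sketch.

```cpp
#include <cassert>
#include <string>

// Labels 0-12 as listed in the table; enumerator names are illustrative.
enum class SemanticClass : int {
    ParkingLine = 0,
    LaneLine,
    LaneCenterLine,
    DirectionArrow,
    Crosswalk,
    NoParkingSign,
    SpeedBump,
    Pillar,
    Vehicle,
    Limiter,
    Wall,
    Ground,
    RoadsideRock
};

constexpr int kNumClasses = 13;

// True if an integer (e.g. read from an intensity field) is a known label.
inline bool is_valid_label(int label) {
    return label >= 0 && label < kNumClasses;
}

// Human-readable description for a class, matching the table.
inline std::string class_name(SemanticClass c) {
    static const char* const kNames[kNumClasses] = {
        "Parking lines", "Lane lines", "Lane center lines", "Direction arrows",
        "Crosswalks", "No parking signs", "Speed bumps", "Pillars", "Vehicles",
        "Limiters", "Walls", "Ground", "Roadside rocks"};
    const int idx = static_cast<int>(c);
    assert(is_valid_label(idx));  // guards against a bad cast from an arbitrary int
    return kNames[idx];
}
```

Checking `is_valid_label` before casting an integer to `SemanticClass` keeps out-of-range values from reaching the name lookup.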

🎛️ Usage Examples

Basic Usage

# Configure input paths
echo "test_data/0619/1/Location_1750044182929507/LocationImg" > config/jpg_map.txt
echo "test_data/0619/1/Location_1750044182929507/car_pose.txt" >> config/jpg_map.txt
echo "1" >> config/jpg_map.txt  # Enable mapping mode
echo "1" >> config/jpg_map.txt  # Save rotated PCD
echo "1" >> config/jpg_map.txt  # Save semantic data

# Run the converter
./image2pcd

Using Build Scripts

# Build the project
./scripts/build.sh

# Run the application
./scripts/run.sh

# Launch web visualization
./scripts/visualize.sh

# Clean build artifacts
./scripts/clean.sh

Advanced Configuration

// Modify processing parameters in source
int image_height = 800;         // Input image dimensions
int image_width = 800;
float pixel_scale_x = 0.02;     // Meters per pixel
float pixel_scale_y = 0.02;
float voxel_leaf_size = 0.05;   // Voxel grid resolution

// Vehicle parameters
float g_car_length = 4.8;       // Vehicle length (meters)
float g_car_width = 2.0;        // Vehicle width (meters)
float g_back_to_rear = 1.3;     // Distance from rear axle to back of vehicle (meters)

Real-time Visualization

# Interactive controls during visualization:
# SPACE - Pause/Resume processing
# Arrow keys - Navigate point cloud
# Mouse wheel - Zoom in/out
# Mouse drag - Rotate view

🔧 Build Configuration

CMake Options

# Key build settings
set(CMAKE_CXX_STANDARD 14)
set(CMAKE_BUILD_TYPE Release)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -O3 -fopenmp")

# Required libraries
find_package(PkgConfig REQUIRED)   # provides pkg_check_modules
find_package(OpenCV REQUIRED)
find_package(Eigen3 REQUIRED)
find_package(yaml-cpp REQUIRED)
pkg_check_modules(PCL REQUIRED pcl_common-1.12 pcl_io-1.12)

Build Scripts

Script                    Purpose
scripts/build.sh          Configure and build the project
scripts/clean.sh          Clean build artifacts
scripts/install_deps.sh   Install system dependencies
scripts/run.sh            Execute the application
scripts/visualize.sh      Launch web visualization

📊 Performance & Optimization

Processing Pipeline

  1. Image Loading: Multi-threaded PNG/JPG loading
  2. Semantic Classification: 13-class pixel classification
  3. Coordinate Transformation: Image → Vehicle → World coordinates
  4. Point Cloud Generation: 3D point creation with intensity values
  5. Clustering & Filtering: Voxel grid and outlier removal
  6. Visualization: Real-time PCL rendering
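
Step 3 of the pipeline (Image → Vehicle → World) might look roughly like this for a flat ground plane. The sketch rests on several assumptions that the README does not confirm: the image is a top-down view centered on the vehicle origin, image rows decrease toward vehicle-forward (+x), image columns decrease toward vehicle-left (+y), and only yaw is applied (pitch and roll are ignored). `ImageGeometry`, `pixel_to_vehicle`, and `vehicle_to_world` are hypothetical names; the default values mirror those listed under Advanced Configuration.

```cpp
#include <cassert>
#include <cmath>

struct Point2 {
    float x, y;
};

// Hypothetical parameter bundle; defaults follow the Advanced Configuration values.
struct ImageGeometry {
    int width = 800;
    int height = 800;
    float scale_x = 0.02f;  // meters per pixel along image columns
    float scale_y = 0.02f;  // meters per pixel along image rows
};

// Pixel (u = column, v = row) -> vehicle frame, under the ASSUMED convention
// that the image center is the vehicle origin, "up" in the image is forward
// (+x), and "left" in the image is vehicle-left (+y).
inline Point2 pixel_to_vehicle(float u, float v, const ImageGeometry& g) {
    assert(g.scale_x > 0.0f && g.scale_y > 0.0f);
    Point2 p;
    p.x = (g.height / 2.0f - v) * g.scale_y;  // forward distance in meters
    p.y = (g.width / 2.0f - u) * g.scale_x;   // leftward distance in meters
    return p;
}

// Vehicle frame -> world frame: planar rotation by yaw (radians) plus translation.
inline Point2 vehicle_to_world(const Point2& p, float pose_x, float pose_y, float yaw) {
    const float c = std::cos(yaw);
    const float s = std::sin(yaw);
    Point2 w;
    w.x = pose_x + c * p.x - s * p.y;
    w.y = pose_y + s * p.x + c * p.y;
    return w;
}
```

With these values, an 800-pixel image side spans about 16 m of ground, and a pixel 50 rows above the image center lands about 1 m ahead of the vehicle origin.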

Performance Metrics

  • Processing Speed: ~10-20 FPS (800x800 images)
  • Memory Usage: ~2-4GB for typical datasets
  • Point Density: ~50,000-200,000 points per frame
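  • Accuracy: Sub-centimeter precision is claimed with proper calibration; in practice, output resolution is bounded by the pixel scale (0.02 m per pixel in the Advanced Configuration values) and the voxel leaf size (0.05 m)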

Optimization Tips

// Adjust voxel grid size for performance vs quality
voxel_grid_filter.setLeafSize(0.1, 0.1, 0.1);  // Faster
voxel_grid_filter.setLeafSize(0.05, 0.05, 0.05); // Higher quality

# Shell: set the OpenMP thread count (OpenMP itself is enabled at build time by -fopenmp)
export OMP_NUM_THREADS=4
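
To illustrate what the leaf size controls, here is a plain C++ sketch of voxel-grid downsampling: every point that falls into the same cubic cell is replaced by the centroid of that cell's points, which is the idea behind PCL's `VoxelGrid` filter. It is not PCL's implementation, and `Point3` and `voxel_downsample` are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

struct Point3 {
    float x, y, z;
};

// Replace all points sharing a cubic voxel of edge `leaf` with their centroid.
inline std::vector<Point3> voxel_downsample(const std::vector<Point3>& pts, float leaf) {
    assert(leaf > 0.0f);
    struct Accum {
        double sx = 0.0, sy = 0.0, sz = 0.0;
        std::size_t n = 0;
    };
    using Key = std::tuple<std::int64_t, std::int64_t, std::int64_t>;
    std::map<Key, Accum> cells;

    for (const Point3& p : pts) {
        // Integer voxel index along each axis; floor keeps negatives consistent.
        const Key key = std::make_tuple(
            static_cast<std::int64_t>(std::floor(p.x / leaf)),
            static_cast<std::int64_t>(std::floor(p.y / leaf)),
            static_cast<std::int64_t>(std::floor(p.z / leaf)));
        Accum& a = cells[key];
        a.sx += p.x;
        a.sy += p.y;
        a.sz += p.z;
        ++a.n;
    }

    std::vector<Point3> out;
    out.reserve(cells.size());
    for (const auto& kv : cells) {
        const Accum& a = kv.second;
        out.push_back({static_cast<float>(a.sx / a.n),
                       static_cast<float>(a.sy / a.n),
                       static_cast<float>(a.sz / a.n)});
    }
    return out;
}
```

A larger leaf merges more points per cell, so the output is smaller and faster to process but carries less detail, which is the trade-off between the 0.1 and 0.05 settings above.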

🌐 Web Visualization

Generate Interactive HTML

# Run web visualization tool
python3 tools/pcd_web.py [input.pcd] [output_dir]

# Example
python3 tools/pcd_web.py test_data/map_only23456.pcd web_visualization/

Web Features

  • 3D Interactive Viewer: Rotate, zoom, pan
  • Point Cloud Statistics: Count, bounds, density
  • Color Coding: By intensity or semantic class
  • Cross-platform: Works in any modern browser
  • Export Options: Screenshots, point data

Opening Visualization

# Open in the default browser
xdg-open web_visualization/index.html

# Or open in a specific browser, e.g. Firefox
firefox web_visualization/index.html

# Or manually navigate to:
file:///path/to/project/web_visualization/index.html

🚗 Autonomous Driving Applications

Mapping Applications

  • HD Map Generation: Create high-definition maps from camera data
  • Lane Detection: Extract lane markings and road boundaries
  • Parking Space Mapping: Identify and map parking areas
  • Infrastructure Mapping: Map poles, signs, and road furniture

Localization Support

  • SLAM Integration: Provide semantic landmarks for SLAM
  • Visual Odometry: Support camera-based navigation
  • Map Matching: Compare real-time data with stored maps
  • Pose Estimation: Vehicle position and orientation tracking

Data Formats

#include <cstdint>  // for int64_t

// Point cloud output format
struct PointXYZI {
    float x, y, z;    // 3D coordinates (meters)
    float intensity;  // Semantic class or confidence
};

// Vehicle pose format
struct CarPose {
    float x, y, z;           // Position (meters)
    float yaw, pitch, roll;  // Orientation (radians)
    int64_t timestamp;       // Time (microseconds)
};
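
Because image file names and pose records both carry microsecond timestamps, a frame can be paired with a pose by timestamp. The sketch below picks the nearest pose timestamp; whether the project requires exact matches or tolerates nearest-neighbor matching is not stated here, so treat that choice as an assumption. `timestamp_from_filename` and `nearest_timestamp_index` are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Extract the numeric stem from a path such as "LocationImg/1750044182929507.png".
// std::stoll throws std::invalid_argument if the stem is not numeric.
inline std::int64_t timestamp_from_filename(const std::string& path) {
    const std::size_t slash = path.find_last_of('/');
    const std::size_t start = (slash == std::string::npos) ? 0 : slash + 1;
    const std::size_t dot = path.find('.', start);
    const std::size_t len = (dot == std::string::npos) ? std::string::npos : dot - start;
    return std::stoll(path.substr(start, len));
}

// Index of the timestamp closest to `query` in a sorted, non-empty list.
// Ties resolve to the earlier timestamp.
inline std::size_t nearest_timestamp_index(const std::vector<std::int64_t>& sorted_ts,
                                           std::int64_t query) {
    assert(!sorted_ts.empty());
    const auto it = std::lower_bound(sorted_ts.begin(), sorted_ts.end(), query);
    if (it == sorted_ts.begin()) return 0;
    if (it == sorted_ts.end()) return sorted_ts.size() - 1;
    const std::size_t hi = static_cast<std::size_t>(it - sorted_ts.begin());
    const std::size_t lo = hi - 1;
    return (query - sorted_ts[lo] <= sorted_ts[hi] - query) ? lo : hi;
}
```

The binary search keeps each lookup logarithmic in the number of poses, provided the pose timestamps are sorted in ascending order.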

🐛 Troubleshooting

Common Issues

Build Errors

# PCL not found
sudo apt install libpcl-dev
export PCL_ROOT=/usr

# VTK version conflicts
sudo apt install libvtk7-dev
# or try: sudo apt install libvtk9-dev

# CMake version too old
sudo apt install cmake
# Check: cmake --version (should be ≥3.16)

Runtime Issues

# Segmentation fault
export DISPLAY=:0  # Use the local X display (with ssh -X, keep the DISPLAY that ssh sets)
ulimit -c unlimited  # Enable core dumps

# No display available
export LIBGL_ALWAYS_INDIRECT=1
export LIBGL_ALWAYS_SOFTWARE=1

# Memory issues
# Reduce image resolution or enable swap
free -h  # Check available memory

Data Format Issues

# Invalid pose file format
# Ensure comma-separated values: x,y,z,yaw,pitch,roll,timestamp
head -5 test_data/*/car_pose.txt

# Missing semantic images
# Verify PNG files exist in LocationImg/
ls test_data/*/LocationImg/*.png

Debug Mode

# Build with debug symbols
cmake -DCMAKE_BUILD_TYPE=Debug ..
make -j$(nproc)

# Run with gdb
gdb ./image2pcd
(gdb) run
(gdb) bt  # Show backtrace on crash

Logging & Diagnostics

// Enable verbose output
std::cout << "Processing frame: " << timestamp << std::endl;
std::cout << "Point cloud size: " << cloud->size() << std::endl;
std::cout << "Pose: " << pose.x << "," << pose.y << "," << pose.yaw << std::endl;

📈 Output Files

Generated Files

test_data/
├── map_only23456.pcd       # Final merged point cloud (binary)
├── map_only23456.txt       # Point cloud in text format
└── intermediate/           # Temporary processing files

web_visualization/
├── index.html              # Main visualization page
├── pointcloud_3d.html      # 3D interactive viewer
└── assets/                 # Static resources

File Formats

# PCD format (binary)
VERSION 0.7
FIELDS x y z intensity
SIZE 4 4 4 4
TYPE F F F F
COUNT 1 1 1 1
WIDTH [point_count]
HEIGHT 1
VIEWPOINT 0 0 0 1 0 0 0
POINTS [point_count]
DATA binary

# TXT format (ASCII)
# x y z intensity
10.245 -5.678 0.125 1.0
10.267 -5.689 0.127 1.0
...
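
For reference, a file with the same header fields can be produced without PCL. The sketch below writes the ASCII variant (`DATA ascii`) so the result stays human-readable; the binary variant described above would instead write raw 4-byte floats after the header. `write_pcd_ascii` and `to_pcd_ascii_string` are hypothetical helpers, not part of this project or of PCL.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

struct PointXYZI {
    float x, y, z;    // 3D coordinates (meters)
    float intensity;  // Semantic class or confidence
};

// Write an unorganized cloud (HEIGHT 1) as an ASCII PCD v0.7 file.
inline void write_pcd_ascii(std::ostream& out, const std::vector<PointXYZI>& pts) {
    out << "# .PCD v0.7 - Point Cloud Data file format\n"
        << "VERSION 0.7\n"
        << "FIELDS x y z intensity\n"
        << "SIZE 4 4 4 4\n"
        << "TYPE F F F F\n"
        << "COUNT 1 1 1 1\n"
        << "WIDTH " << pts.size() << "\n"
        << "HEIGHT 1\n"
        << "VIEWPOINT 0 0 0 1 0 0 0\n"
        << "POINTS " << pts.size() << "\n"
        << "DATA ascii\n";
    // One point per line: x y z intensity, matching the TXT layout.
    for (const PointXYZI& p : pts) {
        out << p.x << ' ' << p.y << ' ' << p.z << ' ' << p.intensity << '\n';
    }
}

// Convenience wrapper that returns the file contents as a string.
inline std::string to_pcd_ascii_string(const std::vector<PointXYZI>& pts) {
    std::ostringstream os;
    write_pcd_ascii(os, pts);
    return os.str();
}
```

Passing a `std::ofstream` to `write_pcd_ascii` writes the file to disk. For an unorganized cloud, `WIDTH` and `POINTS` both equal the number of points.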

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Point Cloud Library (PCL): 3D processing framework
  • OpenCV: Computer vision and image processing
  • Eigen3: Linear algebra and transformations
  • Plotly: Interactive web visualizations
  • Open3D: Point cloud processing and visualization
