Radar-Camera Fusion System

Overview

Radar-Camera Fusion is an early-stage research project from my master’s studies, exploring multi-sensor fusion techniques for enhanced target detection and tracking. This project implements a late fusion strategy that combines the complementary strengths of mmWave radar and camera sensors to achieve robust object localization and tracking in various scenarios.

Note: This was one of my initial research explorations during my master’s program. While I may not recall every implementation detail, the core methodology and system architecture remain documented here.

System Architecture

The system implements a late fusion strategy where radar and camera data are processed independently before being fused at the decision level. This approach leverages the strengths of both sensors:

  • Camera: Provides rich visual information for object classification
  • mmWave Radar: Offers accurate range and velocity measurements and robust performance in adverse conditions

Fusion Pipeline

Camera Stream           Range/Azimuth Extraction
      ↓                            ↓
   ROI Crop               Point Cloud Processing
      ↓                            ↓
YOLO Detection              Radar Point Cloud
      ↓                            ↓
DeepSORT Tracking                  │
      ↓                            │
Monocular Ranging                  │
      └───────→  Data Fusion  ←────┘
                      ↓
             Fused Target Info

Core Components

1. Camera Processing Pipeline

ROI Extraction

  • Image Preprocessing: Frame capture and region of interest cropping
  • Adaptive ROI: Dynamic region selection based on scene context
  • Resolution Optimization: Balances processing speed against detection accuracy (a cropping sketch follows below)
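
A minimal sketch of the preprocessing step, assuming OpenCV for image handling; the ROI coordinates and target width below are placeholders, since the actual region selection was scene-dependent:

```python
import cv2

# Hypothetical fixed ROI as (x, y, width, height) in pixels; the project
# selected the region dynamically from scene context.
ROI = (320, 180, 1280, 720)

def preprocess_frame(frame, roi=ROI, target_width=640):
    """Crop the frame to the ROI, then downscale to speed up detection."""
    x, y, w, h = roi
    cropped = frame[y:y + h, x:x + w]
    scale = target_width / cropped.shape[1]
    resized = cv2.resize(cropped, None, fx=scale, fy=scale)
    return resized, scale  # scale maps detections back to full resolution
```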

YOLO Object Detection

  • Real-time Detection: YOLOv5 for fast and accurate object detection
  • Multi-Class Recognition: Supports pedestrians, vehicles, and other objects
  • Bounding Box Output: Precise object localization in image coordinates (see the inference sketch below)
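
A minimal inference sketch using the public YOLOv5 release via torch.hub; the specific weights and pre/post-processing used in the project are not recorded, so the details here are illustrative:

```python
import torch

# Load pretrained YOLOv5-small weights (an assumption; any YOLOv5 variant works).
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

def detect(frame_bgr):
    """Run detection on an OpenCV (BGR) frame.
    Returns rows of [x1, y1, x2, y2, confidence, class]."""
    results = model(frame_bgr[..., ::-1])   # the hub model expects RGB
    return results.xyxy[0].cpu().numpy()    # detections for the first image
```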

DeepSORT Tracking

  • Multi-Object Tracking: Maintains consistent IDs across frames
  • Appearance Features: Deep learning-based appearance descriptor
  • Kalman Filtering: Smooth trajectory prediction and association
  • ID Management: Handles occlusions and re-identification (see the tracking sketch below)
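
A tracking sketch built on the open-source deep-sort-realtime package; whether the project used this exact implementation rather than another DeepSORT port is an assumption:

```python
from deep_sort_realtime.deepsort_tracker import DeepSort

# max_age controls how long an occluded track survives without detections.
tracker = DeepSort(max_age=30)

def track(detections, frame):
    """detections: list of ([left, top, width, height], confidence, class).
    Returns (track_id, [x1, y1, x2, y2]) for every confirmed track."""
    tracks = tracker.update_tracks(detections, frame=frame)
    return [(t.track_id, t.to_ltrb()) for t in tracks if t.is_confirmed()]
```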

Monocular Distance Estimation

  • Pinhole Camera Model: Geometry-based distance calculation
  • Calibration: Camera intrinsic parameter calibration
  • Height-Based Ranging: Estimates distance from the object's pixel height, its known physical height, and the camera focal length
  • Azimuth Calculation: Computes the horizontal angle from image coordinates (both steps are sketched below)
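
A sketch of both steps under the pinhole model; the intrinsic values below are placeholders, not the calibrated parameters:

```python
import math

FX = 1000.0   # focal length in pixels, horizontal (placeholder)
FY = 1000.0   # focal length in pixels, vertical (placeholder)
CX = 960.0    # principal point x for a 1920-wide image (placeholder)

def estimate_range(bbox_height_px, real_height_m):
    """Height-based ranging: Z = f * H / h for known object height H."""
    return FY * real_height_m / bbox_height_px

def estimate_azimuth(bbox_center_x_px):
    """Horizontal angle of the target relative to the optical axis."""
    return math.atan2(bbox_center_x_px - CX, FX)

# Example: a pedestrian (H ≈ 1.7 m) spanning 170 px is estimated at 10 m.
```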

2. Radar Processing Pipeline

Point Cloud Generation

  • mmWave Radar: Processes raw radar data to extract point clouds
  • Range-Azimuth-Elevation: 3D spatial information for each detection
  • Velocity Information: Doppler-based velocity measurement
  • SNR Filtering: Discards low-quality points below a signal-to-noise threshold (see the conversion sketch below)
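
A sketch of the spherical-to-Cartesian conversion combined with SNR gating; the axis convention (y along the radar boresight) and the threshold value are assumptions:

```python
import numpy as np

def points_to_cartesian(rng, az, el, snr, snr_min=10.0):
    """Convert range/azimuth/elevation detections (angles in radians)
    to x, y, z and drop points below the SNR threshold."""
    keep = snr >= snr_min
    rng, az, el = rng[keep], az[keep], el[keep]
    x = rng * np.cos(el) * np.sin(az)   # lateral
    y = rng * np.cos(el) * np.cos(az)   # forward, along boresight
    z = rng * np.sin(el)                # height
    return np.stack([x, y, z], axis=1)
```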

Target Extraction

  • Clustering: Groups radar points into distinct targets
  • Centroid Calculation: Computes target center position
  • Velocity Estimation: Extracts the radial velocity of each target (a clustering sketch follows)
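
A clustering sketch using DBSCAN from scikit-learn; the eps and min_samples values are illustrative, and the project's exact clustering parameters are not documented here:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def extract_targets(points_xyz, radial_vel, eps=0.5, min_samples=3):
    """Group radar points into targets and summarize each cluster."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points_xyz)
    targets = []
    for label in set(labels) - {-1}:        # label -1 marks noise points
        mask = labels == label
        targets.append({
            'centroid': points_xyz[mask].mean(axis=0),
            'velocity': float(radial_vel[mask].mean()),
            'n_points': int(mask.sum()),
        })
    return targets
```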

3. Late Fusion Strategy

Spatial Alignment

  • Coordinate Transformation: Converts camera and radar coordinates to common reference frame
  • Calibration Matrix: Extrinsic calibration between camera and radar
  • Temporal Synchronization: Aligns timestamps between the sensor streams (the spatial mapping is sketched below)
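
A projection sketch assuming a calibrated rotation R, translation t, and intrinsic matrix K (the values below are placeholders): radar points are first mapped into the camera frame, then projected onto the image plane:

```python
import numpy as np

# R must also account for the axis change between radar coordinates
# (y forward, as above) and camera coordinates (z forward); the identity
# and the translation here are placeholders for the calibrated extrinsics.
R = np.eye(3)
t = np.array([0.0, -0.1, 0.0])
K = np.array([[1000.0,    0.0, 960.0],   # camera intrinsics (placeholder)
              [   0.0, 1000.0, 540.0],
              [   0.0,    0.0,   1.0]])

def radar_to_pixel(points_radar):
    """X_cam = R @ X_radar + t, then pinhole projection with K."""
    p_cam = points_radar @ R.T + t
    uv = (K @ p_cam.T).T
    return uv[:, :2] / uv[:, 2:3]        # divide by depth to get pixels
```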

Data Association

  • Position Matching: Associates camera detections with radar points based on spatial proximity
  • Gating Threshold: Defines acceptable matching distance
  • Confidence Weighting: Combines detection confidences from both sensors (the association step is sketched below)
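
An association sketch using gated nearest-neighbor matching solved globally with the Hungarian algorithm (SciPy's linear_sum_assignment); the gate value is an assumed parameter:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(cam_xy, radar_xy, gate=2.0):
    """Match camera-derived positions to radar targets on the ground plane.
    gate is the maximum accepted matching distance in metres."""
    cost = np.linalg.norm(cam_xy[:, None, :] - radar_xy[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)         # globally optimal pairs
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] <= gate]
```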

Fused Output

  • Enhanced Localization: Improved position accuracy by combining both sensors
  • Velocity Information: Radar-provided velocity enriches camera detections
  • Robust Detection: Maintains tracking even when one sensor degrades or fails (a simple range-fusion sketch follows)
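
One simple way to realize the confidence weighting above is inverse-variance fusion of the two range estimates; the uncertainty values below are illustrative, not measured:

```python
def fuse_range(cam_range, radar_range, cam_sigma=1.0, radar_sigma=0.2):
    """Inverse-variance weighting: the sensor with the smaller range
    uncertainty (here the radar) dominates the fused estimate."""
    w_cam = 1.0 / cam_sigma ** 2
    w_radar = 1.0 / radar_sigma ** 2
    return (w_cam * cam_range + w_radar * radar_range) / (w_cam + w_radar)

# Example: fuse_range(10.8, 10.1) ≈ 10.13 m, close to the radar measurement.
```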

Technical Highlights

Advantages of Late Fusion

  • Modularity: Independent processing allows easy sensor replacement or upgrade
  • Robustness: System continues functioning if one sensor fails
  • Flexibility: Can adjust fusion weights based on environmental conditions
  • Interpretability: Clear understanding of each sensor’s contribution

Challenges Addressed

  • Calibration Complexity: Precise spatial and temporal alignment between sensors
  • Coordinate Transformation: Accurate mapping between different sensor coordinate systems
  • Association Ambiguity: Resolving which radar points correspond to which camera detections
  • Computational Efficiency: Real-time processing of dual sensor streams

Applications

The radar-camera fusion system is particularly well-suited for scenarios requiring both visual recognition and precise ranging:

Autonomous Driving

  • Pedestrian Detection: Enhanced detection accuracy in various lighting conditions
  • Vehicle Tracking: Robust tracking with velocity information
  • Obstacle Avoidance: Reliable distance measurement for path planning

Intelligent Surveillance

  • Perimeter Security: Combined visual identification and range verification
  • Intrusion Detection: Multi-modal confirmation reduces false alarms
  • Crowd Monitoring: Track multiple targets with unique identities

Smart Transportation

  • Traffic Flow Analysis: Vehicle counting and speed measurement
  • Parking Management: Occupancy detection with vehicle classification
  • Intersection Monitoring: Multi-target tracking in complex intersection scenarios

Robotics

  • Navigation: Obstacle detection and avoidance
  • Human-Robot Interaction: Safe distance maintenance and gesture recognition
  • Object Manipulation: Precise localization for grasping tasks

System Demonstration

Demo video: Radar-Camera Fusion System in Action

The demonstration shows the system’s ability to:

  • Detect and classify objects using camera-based YOLO
  • Track multiple targets with DeepSORT
  • Fuse camera-derived positions with radar point clouds
  • Provide accurate range and velocity information

Technical Specifications

Camera Module

  • Resolution: 1920×1080 (Full HD)
  • Frame Rate: 30 FPS
  • Field of View: 60° horizontal
  • Detection Network: YOLOv5
  • Tracking Algorithm: DeepSORT

Radar Module

  • Frequency: 77 GHz mmWave
  • Range Resolution: 0.1-0.2 m
  • Velocity Range: ±50 m/s
  • Angular Resolution: 15° (azimuth)
  • Update Rate: 10-20 Hz

Fusion Performance

  • Latency: < 100 ms (end-to-end)
  • Position Accuracy: ±0.3 m (fused)
  • Tracking Range: 0.5-50 m
  • Max Targets: 10+ simultaneous objects

Lessons Learned

This early research project provided valuable insights into multi-sensor fusion:

  • Complementary Strengths: Camera excels at classification, radar at ranging
  • Calibration is Critical: Accurate sensor alignment is essential for fusion
  • Late Fusion Trade-offs: Simpler implementation but may miss low-level correlations
  • Real-time Challenges: Synchronization and computational efficiency are key

Future Directions

While this project explored late fusion, modern approaches might consider:

  • Early Fusion: Fusing raw sensor data for richer feature extraction
  • Deep Learning Fusion: End-to-end neural networks for joint processing
  • Adaptive Fusion: Dynamic weighting based on environmental conditions
  • Additional Sensors: Incorporating LiDAR or thermal cameras