Deep Racer
Inspired by AWS DeepRacer, we developed a reinforcement-learning-driven car that could compete with a PID controller's lap times and safety.
This project explores the application of reinforcement learning (RL) for optimizing vehicle speed in autonomous navigation. Using a Unity-based simulator integrated with ROS, we developed a system that balances speed and safety while improving lap times.
The research focuses on integrating Deep Q-Network (DQN) and CNN-based neural networks to dynamically optimize speed in real-time, enabling more efficient and adaptable autonomous navigation.
Prerequisites
Ensure you have the following dependencies installed before running this project:
- ROS Noetic
- Unity-based F1/10th Simulator
- Python 3.8+
- Ubuntu 20.04
Installation guides:
The simulator includes built-in sensors for LiDAR, IMU, and odometry.
ROS provides a powerful visualization tool called Rviz, which allows real-time visualization and interaction with standard ROS data types, such as point clouds and markers.
Autonomous Vehicle Simulation
The simulator provides a realistic testing environment with dynamic lighting, race tracks, and real-time sensor data.
Technology Stack
- Simulation: Unity-based Audubon-Unity Simulator
- Framework: ROS (Robot Operating System)
- Deep Learning: Deep Q-Network (DQN), CNNs
- Sensors: LiDAR, Odometry
- Control: PID Controller
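The PID controller serves as the hand-tuned baseline the RL agent is measured against. As a minimal sketch (the gains and the simple integrator plant below are illustrative assumptions, not the project's tuned values):

```python
# Minimal PID speed controller sketch. Gains and the toy plant are
# illustrative assumptions, not the project's tuned configuration.
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, target, measured):
        error = target - measured
        self.integral += error * self.dt                    # accumulate I term
        derivative = (error - self.prev_error) / self.dt    # finite-difference D term
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Example: drive a crude plant (speed integrates the command) toward 3.0 m/s.
pid = PID(kp=0.8, ki=0.1, kd=0.05, dt=0.05)
speed = 0.0
for _ in range(600):
    speed += 0.05 * pid.step(3.0, speed)
```

In the simulator, `measured` would come from odometry and the output would be published as a drive command; here the loop just shows the controller converging on a setpoint.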
Reinforcement Learning Model
We employed a Deep Q-Network (DQN) to train the vehicle to adjust its speed dynamically based on sensor input.
- Observation Space: LiDAR scan data (1080 points) and previous actions.
- Action Space: Discretized speed control (35 actions).
- Reward System: Balanced incentives for speed efficiency and crash avoidance.
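The spaces and reward above can be sketched as follows. The downsampling stride, speed range, and reward weights are assumptions for illustration; the project's exact preprocessing and reward coefficients are not reproduced here.

```python
import numpy as np

N_BEAMS = 1080   # raw LiDAR points per scan
N_ACTIONS = 35   # discretized speed commands

# Map each discrete action index to a target speed (assumed 0.5-4.0 m/s range).
SPEEDS = np.linspace(0.5, 4.0, N_ACTIONS)

def make_observation(scan, prev_action, stride=10):
    """Downsample the 1080-beam scan and append the previous action index."""
    scan = np.clip(np.asarray(scan, dtype=np.float32), 0.0, 10.0)  # cap range
    return np.concatenate([scan[::stride], [float(prev_action)]])

def reward(speed, min_clearance, crashed, crash_penalty=100.0, w_speed=1.0):
    """Balance a speed incentive against crash avoidance (weights assumed)."""
    if crashed:
        return -crash_penalty
    # Encourage speed, scaled down when the nearest obstacle is close.
    return w_speed * speed * min(1.0, min_clearance)

scan = np.full(N_BEAMS, 5.0)
obs = make_observation(scan, prev_action=17)  # 108 beams + previous action
```

Appending the previous action lets the value network condition on recent commands, which helps smooth speed transitions.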
Neural Network Architecture
To process sensor inputs effectively, we used a combination of fully connected layers and CNN-based architectures for improved feature extraction.
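A forward pass through such a network can be sketched with a 1D convolution over the scan followed by fully connected layers producing one Q-value per speed action. Layer sizes, filter counts, and weight scales below are illustrative assumptions, not the project's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernels, stride):
    """Valid 1D convolution with ReLU: x (L,), kernels (n_filters, k)."""
    n_f, k = kernels.shape
    out_len = (len(x) - k) // stride + 1
    out = np.empty((n_f, out_len))
    for i in range(out_len):
        window = x[i * stride : i * stride + k]
        out[:, i] = kernels @ window
    return np.maximum(out, 0.0)  # ReLU

def q_values(scan_obs, n_actions=35):
    # Randomly initialized weights stand in for learned parameters.
    k1 = rng.standard_normal((8, 9)) * 0.1      # 8 filters of width 9
    h = conv1d(scan_obs, k1, stride=4).ravel()  # conv feature extraction
    w1 = rng.standard_normal((64, h.size)) * 0.05
    h = np.maximum(w1 @ h, 0.0)                 # hidden fully connected layer
    w2 = rng.standard_normal((n_actions, 64)) * 0.05
    return w2 @ h                               # one Q-value per speed action

q = q_values(np.random.rand(108))  # 108-point downsampled scan
```

The convolutional front end exploits the spatial ordering of LiDAR beams, so nearby-obstacle patterns are detected regardless of where they appear in the scan.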
Training Process
The training process involved curriculum learning:
- Phase 1: Maximizing speed and optimizing lap times.
- Phase 2: Introducing safety measures and collision avoidance.
- Early Stopping: Prevented overfitting while balancing speed vs. safety.
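The curriculum and early-stopping logic can be sketched as a training skeleton. The environment interface, epsilon schedule, and patience value are assumptions; the real loop interacts with the Unity/ROS simulator rather than a stub.

```python
import random

def train(env, select_action, update_q, episodes=500, patience=50):
    """Epsilon-greedy DQN loop with a two-phase curriculum and early stopping."""
    eps, eps_min, eps_decay = 1.0, 0.05, 0.995
    best_return, stale = float("-inf"), 0
    for ep in range(episodes):
        # Phase 1 focuses on speed; Phase 2 enables collision penalties.
        env.safety_on = ep >= episodes // 2
        obs, done, ep_return = env.reset(), False, 0.0
        while not done:
            if random.random() < eps:
                action = random.randrange(env.n_actions)  # explore
            else:
                action = select_action(obs)               # exploit learned Q
            next_obs, r, done = env.step(action)
            update_q(obs, action, r, next_obs, done)      # DQN update step
            obs, ep_return = next_obs, ep_return + r
        eps = max(eps_min, eps * eps_decay)               # decay exploration
        if ep_return > best_return:
            best_return, stale = ep_return, 0
        else:
            stale += 1
            if stale >= patience:   # early stopping: no recent improvement
                break
    return best_return
```

Stopping once episode returns plateau keeps the agent from over-optimizing one objective (raw speed) at the expense of the other (safety).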
Results and Evaluation
Our RL model achieved lap times nearly competitive with a manually tuned PID controller.
- Lap Completion Time: RL model 36 s vs. PID controller 32 s.
- Performance: Adaptive braking and turn anticipation improved driving efficiency.
Challenges and Solutions
- Implemented CNNs for better feature extraction.
- Introduced reward shaping to prevent reckless speed prioritization.
- Trained the RL model with real-time simulation updates for improved decision-making.
Limitations and Future Work
The current model requires significant training time (~23 hours). Future work will focus on:
- Exploring advanced RL techniques such as Advantage Actor-Critic (A2C) and Deep Deterministic Policy Gradient (DDPG).
- Enhancing reward mechanisms for better speed-safety trade-offs.
- Expanding the dataset with more diverse training environments.
Conclusion
This project demonstrated the feasibility of using Deep Q-Networks for speed optimization in autonomous vehicles. The results highlight the potential of RL in improving navigation efficiency, making it a promising approach for real-world deployment in self-driving applications.
Project Architecture
RL-based Speed Optimization in Action