Deep Racer
Inspired by AWS DeepRacer, we developed a reinforcement-learning-driven car that could compete with a PID controller's lap times and safety.
This project explores the application of reinforcement learning (RL) for optimizing vehicle speed in autonomous navigation. Using a Unity-based simulator integrated with ROS, we developed a system that balances speed and safety while improving lap times.
The research focuses on integrating Deep Q-Network (DQN) and CNN-based neural networks to dynamically optimize speed in real-time, enabling more efficient and adaptable autonomous navigation.
Prerequisites
Ensure you have the following dependencies installed before running this project:
- ROS Noetic
- Unity-based F1/10th Simulator
- Python 3.8+
- Ubuntu 20.04
Installation guides:
The simulator includes built-in sensors for LiDAR, IMU, and odometry.
ROS provides a powerful visualization tool called Rviz, which allows real-time visualization and interaction with standard ROS data types, such as point clouds and markers.
Autonomous Vehicle Simulation
The simulator provides a realistic testing environment with dynamic lighting, race tracks, and real-time sensor data.
Technology Stack
- Simulation: Unity-based Audubon-Unity Simulator
- Framework: ROS (Robot Operating System)
- Deep Learning: Deep Q-Network (DQN), CNNs
- Sensors: LiDAR, Odometry
- Control: PID Controller
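The PID controller serves as the hand-tuned baseline the RL agent is measured against. As a minimal sketch (the gains and the simple integrator plant below are illustrative assumptions, not the project's tuned values):

```python
# Minimal PID speed controller sketch. Gains and the toy plant are
# illustrative assumptions, not the project's tuned configuration.
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, target, measured):
        error = target - measured
        self.integral += error * self.dt                    # accumulate I term
        derivative = (error - self.prev_error) / self.dt    # finite-difference D term
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Example: drive a crude plant (speed integrates the command) toward 3.0 m/s.
pid = PID(kp=0.8, ki=0.1, kd=0.05, dt=0.05)
speed = 0.0
for _ in range(600):
    speed += 0.05 * pid.step(3.0, speed)
```

In the simulator, `measured` would come from odometry and the output would be published as a drive command; here the loop just shows the controller converging on a setpoint.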
Reinforcement Learning Model
We employed a Deep Q-Network (DQN) to train the vehicle to adjust its speed dynamically based on sensor input.
- Observation Space: LiDAR scan data (1080 points) and previous actions.
- Action Space: Discretized speed control (35 actions).
- Reward System: Balanced incentives for speed efficiency and crash avoidance.
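The spaces and reward above can be sketched as follows. The downsampling stride, speed range, and reward weights are assumptions for illustration; the project's exact preprocessing and reward coefficients are not reproduced here.

```python
import numpy as np

N_BEAMS = 1080   # raw LiDAR points per scan
N_ACTIONS = 35   # discretized speed commands

# Map each discrete action index to a target speed (assumed 0.5-4.0 m/s range).
SPEEDS = np.linspace(0.5, 4.0, N_ACTIONS)

def make_observation(scan, prev_action, stride=10):
    """Downsample the 1080-beam scan and append the previous action index."""
    scan = np.clip(np.asarray(scan, dtype=np.float32), 0.0, 10.0)  # cap range
    return np.concatenate([scan[::stride], [float(prev_action)]])

def reward(speed, min_clearance, crashed, crash_penalty=100.0, w_speed=1.0):
    """Balance a speed incentive against crash avoidance (weights assumed)."""
    if crashed:
        return -crash_penalty
    # Encourage speed, scaled down when the nearest obstacle is close.
    return w_speed * speed * min(1.0, min_clearance)

scan = np.full(N_BEAMS, 5.0)
obs = make_observation(scan, prev_action=17)  # 108 beams + previous action
```

Appending the previous action lets the value network condition on recent commands, which helps smooth speed transitions.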
Neural Network Architecture
To process sensor inputs effectively, we used a combination of fully connected layers and CNN-based architectures for improved feature extraction.
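A forward pass through such a network can be sketched with a 1D convolution over the scan followed by fully connected layers producing one Q-value per speed action. Layer sizes, filter counts, and weight scales below are illustrative assumptions, not the project's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernels, stride):
    """Valid 1D convolution with ReLU: x (L,), kernels (n_filters, k)."""
    n_f, k = kernels.shape
    out_len = (len(x) - k) // stride + 1
    out = np.empty((n_f, out_len))
    for i in range(out_len):
        window = x[i * stride : i * stride + k]
        out[:, i] = kernels @ window
    return np.maximum(out, 0.0)  # ReLU

def q_values(scan_obs, n_actions=35):
    # Randomly initialized weights stand in for learned parameters.
    k1 = rng.standard_normal((8, 9)) * 0.1      # 8 filters of width 9
    h = conv1d(scan_obs, k1, stride=4).ravel()  # conv feature extraction
    w1 = rng.standard_normal((64, h.size)) * 0.05
    h = np.maximum(w1 @ h, 0.0)                 # hidden fully connected layer
    w2 = rng.standard_normal((n_actions, 64)) * 0.05
    return w2 @ h                               # one Q-value per speed action

q = q_values(np.random.rand(108))  # 108-point downsampled scan
```

The convolutional front end exploits the spatial ordering of LiDAR beams, so nearby-obstacle patterns are detected regardless of where they appear in the scan.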
Training Process
The training process involved curriculum learning:
- Phase 1: Maximizing speed and optimizing lap times.
- Phase 2: Introducing safety measures and collision avoidance.
- Early Stopping: Prevented overfitting while balancing speed vs. safety.
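The curriculum and early-stopping logic can be sketched as a training skeleton. The environment interface, epsilon schedule, and patience value are assumptions; the real loop interacts with the Unity/ROS simulator rather than a stub.

```python
import random

def train(env, select_action, update_q, episodes=500, patience=50):
    """Epsilon-greedy DQN loop with a two-phase curriculum and early stopping."""
    eps, eps_min, eps_decay = 1.0, 0.05, 0.995
    best_return, stale = float("-inf"), 0
    for ep in range(episodes):
        # Phase 1 focuses on speed; Phase 2 enables collision penalties.
        env.safety_on = ep >= episodes // 2
        obs, done, ep_return = env.reset(), False, 0.0
        while not done:
            if random.random() < eps:
                action = random.randrange(env.n_actions)  # explore
            else:
                action = select_action(obs)               # exploit learned Q
            next_obs, r, done = env.step(action)
            update_q(obs, action, r, next_obs, done)      # DQN update step
            obs, ep_return = next_obs, ep_return + r
        eps = max(eps_min, eps * eps_decay)               # decay exploration
        if ep_return > best_return:
            best_return, stale = ep_return, 0
        else:
            stale += 1
            if stale >= patience:   # early stopping: no recent improvement
                break
    return best_return
```

Stopping once episode returns plateau keeps the agent from over-optimizing one objective (raw speed) at the expense of the other (safety).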
Results and Evaluation
Our RL model achieved lap times nearly competitive with a manually tuned PID controller.
- Lap Completion Time: RL model 36 s vs. PID controller 32 s.
- Performance: Adaptive braking and turn anticipation improved driving efficiency.
Challenges and Solutions
- Implemented CNNs for better feature extraction.
- Introduced reward shaping to prevent reckless speed prioritization.
- Trained the RL model with real-time simulation updates for improved decision-making.
Limitations and Future Work
The current model requires significant training time (~23 hours). Future work will focus on:
- Exploring advanced RL techniques such as Advantage Actor-Critic (A2C) and Deep Deterministic Policy Gradient (DDPG).
- Enhancing reward mechanisms for better speed-safety trade-offs.
- Expanding the dataset with more diverse training environments.
Conclusion
This project demonstrated the feasibility of using Deep Q-Networks for speed optimization in autonomous vehicles. The results highlight the potential of RL in improving navigation efficiency, making it a promising approach for real-world deployment in self-driving applications.
Project Architecture
RL-based Speed Optimization in Action