Synthetic Data Generation using Isaac Sim
A simulation pipeline built in NVIDIA Isaac Sim to generate a synthetic warehouse dataset using domain randomization and train an object detection model using the TAO Toolkit.
In this project, I implemented an end-to-end synthetic data generation and training pipeline using NVIDIA Isaac Sim and the TAO Toolkit. The goal was to train a high-performance pallet jack detection model using entirely synthetic data.
I began by cloning NVIDIA's official synthetic_data_generation_training_workflow repository and thoroughly analyzing its scripts and domain randomization logic.
Warehouse Environment Setup
The core Python script, standalone_palletjack_sdg.py, initializes a simple warehouse environment using the Isaac Sim asset at:
ENV_URL = "/Isaac/Environments/Simple_Warehouse/warehouse.usd"
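In Isaac Sim, a relative asset URL like this is typically resolved against the Nucleus assets root before the stage is loaded. The sketch below shows that resolution step in pure Python; the hard-coded `assets_root` is a stand-in (an assumption) for the value Isaac Sim would return from its asset-server lookup (e.g. `get_assets_root_path()`), so the logic runs outside the simulator.

```python
# Sketch: resolving the warehouse USD path against a Nucleus assets root.
# The assets_root value below is an illustrative stand-in; inside Isaac Sim
# it would come from the asset server rather than being hard-coded.

ENV_URL = "/Isaac/Environments/Simple_Warehouse/warehouse.usd"

def resolve_asset_path(assets_root: str, relative_url: str) -> str:
    """Join the assets root with a relative asset URL, avoiding '//'."""
    return assets_root.rstrip("/") + relative_url

# Stand-in for the asset-server root (assumption, not the real value).
assets_root = "omniverse://localhost/NVIDIA/Assets/Isaac"
full_url = resolve_asset_path(assets_root, ENV_URL)
print(full_url)
```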
Using the Isaac Asset Server, I loaded the environment and then added assets from the SimReady Pallet Jack Library.
Adding Pallet Jacks and Camera
I loaded multiple pallet jack USDs (e.g., Scale_A, Heavy_Duty_A) into the warehouse. A camera was positioned in the scene using the Replicator API to support randomized viewpoint generation for data diversity.
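The viewpoint-randomization idea can be sketched as sampling a camera position on a ring around the scene plus a look-at point near the pallet jacks. The ranges below are illustrative assumptions, not the project's actual values; with Replicator this sampling is normally expressed through distributions passed to a pose-randomization call rather than computed by hand.

```python
import math
import random

def sample_camera_pose(rng: random.Random):
    """Sample an illustrative camera position and look-at target.

    Ranges are assumptions for demonstration: the camera sits on a ring
    4-8 m from the scene center at 1-3 m height, looking at a point
    near the floor where the pallet jacks are placed.
    """
    radius = rng.uniform(4.0, 8.0)          # distance from scene center (m)
    angle = rng.uniform(0.0, 2 * math.pi)   # azimuth around the scene
    height = rng.uniform(1.0, 3.0)          # camera height (m)
    position = (radius * math.cos(angle), radius * math.sin(angle), height)
    look_at = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0), 0.5)
    return position, look_at

rng = random.Random(0)
pos, target = sample_camera_pose(rng)
```

Sampling both the position and the look-at point (rather than always aiming at the scene origin) gives off-center framings, which adds diversity to where objects land in the image.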
Applying Domain Randomization
To make the dataset robust, I applied a variety of domain randomization techniques:
- Randomized camera positions and look-at points.
- Color jittering of pallet jack components such as SteerAxles.
- Pose variation of pallet jacks (position, rotation, scale).
- Dynamic lighting conditions using randomized color, intensity, and visibility of light sources.
- Injected distractors like cones, barrels, and wet floor signs using custom pose variation.
Annotation and Data Export
I used rep.WriterRegistry.get("KittiWriter") to annotate the dataset in KITTI format, suitable for object detection tasks. Images and annotations were organized under three categories:
- distractors_warehouse: Warehouse props (cones, bins, etc.)
- distractors_additional: Non-warehouse props (bags, furniture, wheelchairs)
- no_distractors: Clean scenes containing only pallet jacks
Each output contained RGB images and matching KITTI-style annotations.
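For reference, each KITTI annotation file contains one whitespace-separated line per object with 15 fields; for a 2D detection task like this one, the 3D fields (dimensions, location, rotation) are simply zeroed. The helper below is a sketch of that line format, with the class name "palletjack" and the box coordinates chosen for illustration.

```python
def kitti_label_line(name: str, bbox: tuple) -> str:
    """Format one KITTI object line; 3D fields are zeroed for 2D detection.

    bbox is (xmin, ymin, xmax, ymax) in pixel coordinates.
    """
    xmin, ymin, xmax, ymax = bbox
    truncated, occluded, alpha = 0.0, 0, 0.0
    fields = [name, f"{truncated:.2f}", str(occluded), f"{alpha:.2f}",
              f"{xmin:.2f}", f"{ymin:.2f}", f"{xmax:.2f}", f"{ymax:.2f}"]
    fields += ["0.00"] * 7  # h w l, x y z, rotation_y (unused in 2D)
    return " ".join(fields)

print(kitti_label_line("palletjack", (100.0, 200.0, 340.0, 410.0)))
```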
Dataset Generation
I modified the generate_data.sh script to generate the following:
- 2000 images with warehouse distractors
- 2000 images with additional distractors
- 1000 clean images without any distractors
The image resolution was set to 960x544, and output data was saved to:
synthetic_data_generation_training_workflow/palletjack_sdg/palletjack_data/
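The three generation runs can be summarized as a simple split configuration; this sketch only mirrors the counts and directory names above (the actual generate_data.sh invokes the SDG script once per split with the corresponding arguments, which are omitted here).

```python
from pathlib import Path

# Split configuration mirroring the three generation runs described above.
OUTPUT_ROOT = Path("palletjack_data")
SPLITS = {
    "distractors_warehouse": 2000,
    "distractors_additional": 2000,
    "no_distractors": 1000,
}
WIDTH, HEIGHT = 960, 544

total = sum(SPLITS.values())
print(f"Generating {total} images at {WIDTH}x{HEIGHT}")
for name, count in SPLITS.items():
    # In the real pipeline, generate_data.sh launches the SDG script
    # once per split, writing into OUTPUT_ROOT / name.
    print(f"{OUTPUT_ROOT / name}: {count} frames")
```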
Model Training with TAO Toolkit
Once the dataset was generated, I set up the NVIDIA TAO Toolkit for training. I used the DetectNet_v2 architecture with a ResNet backbone for object detection. Training ran on an RTX 3080 GPU for 200 epochs with a batch size of 16.
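In TAO, these hyperparameters live in the DetectNet_v2 experiment spec (a protobuf-style text file). The fragment below sketches how the epoch count and batch size above would appear; the learning-rate schedule values are illustrative assumptions, and the rest of the spec (dataset, augmentation, and model sections) is omitted.

```
training_config {
  batch_size_per_gpu: 16
  num_epochs: 200
  learning_rate {
    soft_start_annealing_schedule {
      # Illustrative values, not the ones used in this project.
      min_learning_rate: 5e-06
      max_learning_rate: 5e-04
      soft_start: 0.1
      annealing: 0.7
    }
  }
}
```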
The pretrained model originally achieved a mean Average Precision (mAP) of 78%. After fine-tuning on my custom synthetic dataset, I improved the performance to 84% mAP.
Conclusion and Learnings
This project validated that synthetic data alone can be used to train a performant object detection model. I was able to generate diverse training data using Isaac Sim, annotate it using Replicator tools, and fine-tune a deep learning model with strong results.
Through this experience, I gained expertise in:
- Isaac Sim scripting and Replicator API
- Domain randomization and synthetic data strategies
- KITTI annotation pipeline and dataset structuring
- Model fine-tuning using NVIDIA TAO Toolkit