Case Study: Using Synthetic Data to Improve Traffic Sign Classification and Achieve Regulatory Compliance

This case study explores whether synthetic traffic sign data can improve a perception model’s traffic sign classification performance. The results show that synthetic data can reduce the need for real training data by 90%.
Apr 20, 2023

Advanced driver-assistance systems (ADAS) and autonomous vehicles (AVs) rely on perception models to accurately detect and classify traffic signs in compliance with required safety technology such as Intelligent Speed Assistance (ISA). To train these perception models, ADAS and AV programs require significant amounts of diverse, accurately labeled data. 

Given the large variations in traffic sign appearance, illumination, weather (Figure 1), and occlusion, it is often expensive and sometimes impossible to collect real-world data that covers all possible scenarios. Using Applied Intuition’s Synthetic Datasets, such as its Traffic Sign Datasets, ADAS and AV programs can accelerate perception model development while ensuring that the resulting models are robust and perform well in the real world.

Applied’s perception simulation team has conducted a case study to explore how synthetic data can complement real data in machine learning (ML) model training. The study specifically examines whether synthetic traffic sign data can improve a perception model’s traffic sign classification performance. 

Figure 1: Traffic sign detection is difficult in extreme weather conditions

Synthetic data reduces the need for real training data by 90%

The results of our study show that synthetic data can reduce the need for labeled real-world data while improving perception model performance on the traffic sign classification task. Our experiments demonstrate that a model trained on both synthetic and real data outperforms a baseline model trained on real data exclusively. We also show that synthetic data reduces the need for real training data by 90% (Figure 2). Specifically, a model trained on synthetic data and ten real-world images per class matched the performance of a model trained on at least 100 real-world images per class.
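The ablation described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the class names, image counts, and the `build_training_set` helper are hypothetical, and real training would feed these file lists into an ML framework rather than just counting them. The idea is simply to show how each ablation condition pairs a shrinking subset of real images per class with a fixed pool of synthetic images.

```python
import random

def build_training_set(real_by_class, synthetic_by_class, real_per_class, use_synthetic):
    """Assemble one ablation condition: `real_per_class` real images per class,
    optionally combined with the full synthetic pool for that class."""
    training_set = []
    for cls, real_images in real_by_class.items():
        # Randomly subsample the real data down to the ablation budget.
        sample = random.sample(real_images, min(real_per_class, len(real_images)))
        training_set.extend((img, cls) for img in sample)
        if use_synthetic:
            # Synthetic images are cheap, so the full pool is always used.
            training_set.extend((img, cls) for img in synthetic_by_class.get(cls, []))
    return training_set

# Illustrative data: 3 sign classes, 100 real and 50 synthetic images each.
classes = ("stop", "yield", "speed_30")
real = {c: [f"real_{c}_{i}.png" for i in range(100)] for c in classes}
synth = {c: [f"synth_{c}_{i}.png" for i in range(50)] for c in classes}

# Ablation sweep: real-only baselines vs. synthetic plus shrinking real subsets.
for n_real in (100, 50, 10):
    baseline = build_training_set(real, synth, n_real, use_synthetic=False)
    mixed = build_training_set(real, synth, n_real, use_synthetic=True)
    print(f"{n_real:>3} real/class: baseline={len(baseline)}, mixed={len(mixed)}")
```

Each condition would then train an identical model, so that any performance difference is attributable to the training-data mix alone.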

Figure 2: Perception model performance in our data ablation experiment compared to the baseline model as a function of real data size

Continue reading

Enter your email below to read the full-length case study. Contact our team if you have any questions or to learn more about our Traffic Sign Datasets.
