This blog post is the second in a three-part series highlighting different aspects of Applied’s verification and validation (V&V) handbook. Read part 1 for an introduction to V&V and an overview of the best practices that autonomy programs can follow in different stages of their advanced driver-assistance systems (ADAS) and automated driving systems (ADS) development. Part 2 of our series shows how autonomy programs typically approach scenario creation and test execution depending on their development stage and how they can address common challenges. Keep reading to learn more about this topic, or access the full-length V&V handbook below.
In ADAS and ADS development, a scenario is a description of a scene, including each actor and its behavior over a period of time. The PEGASUS method provides a model for systematically describing scenarios based on six independent layers (Figure 1): environment topology, traffic infrastructure, environment state, objects and agents, environmental conditions, and digital information.
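To make the six-layer description concrete, a scenario can be thought of as a record with one field per layer. The following is a minimal sketch only: the field names mirror the layers listed above, while the `Scenario` class, the field types, and the example values are illustrative assumptions and not part of PEGASUS or any specific tool.

```python
from dataclasses import dataclass, field

# Layer names follow the six layers listed above. Everything else
# (types, example values) is a hypothetical illustration.
LAYER_NAMES = (
    "environment_topology",
    "traffic_infrastructure",
    "environment_state",
    "objects_and_agents",
    "environmental_conditions",
    "digital_information",
)


@dataclass
class Scenario:
    name: str
    environment_topology: dict = field(default_factory=dict)
    traffic_infrastructure: dict = field(default_factory=dict)
    environment_state: dict = field(default_factory=dict)
    objects_and_agents: list = field(default_factory=list)
    environmental_conditions: dict = field(default_factory=dict)
    digital_information: dict = field(default_factory=dict)

    def layers(self) -> dict:
        """Return the six layers keyed by layer name."""
        return {layer: getattr(self, layer) for layer in LAYER_NAMES}


# Hypothetical cut-in scenario on a three-lane highway.
cut_in = Scenario(
    name="highway_cut_in",
    environment_topology={"road_type": "highway", "lanes": 3},
    objects_and_agents=[
        {"type": "car", "behavior": "cut_in", "speed_mps": 25.0},
    ],
    environmental_conditions={"time_of_day": "day", "rainfall": "none"},
)
```

Keeping the layers as separate fields makes it straightforward to vary one layer (for example, environmental conditions) while holding the others fixed.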
Scenarios play an essential role in every autonomy program’s V&V efforts. They allow teams to test and evaluate an autonomous system’s performance in specific situations. Scenario-based testing also helps teams systematically build coverage. Coverage measures how much of the operational design domain (ODD) the autonomous system has been tested on so far. The “Defining and measuring coverage” section in our V&V handbook lays out in more detail how autonomy programs in different development stages can define and measure coverage.
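One simple way to picture coverage is as the fraction of ODD attribute combinations that at least one executed test has exercised. The sketch below assumes a small, discretized ODD; the attribute names, values, and the `odd_coverage` helper are hypothetical, and the handbook describes more complete ways to define and measure coverage.

```python
from itertools import product


def odd_coverage(odd_attributes, tested):
    """Fraction of ODD attribute combinations hit by at least one test."""
    names = sorted(odd_attributes)
    all_cells = set(product(*(odd_attributes[n] for n in names)))
    tested_cells = {tuple(t[n] for n in names) for t in tested}
    # Ignore tests that fall outside the declared ODD.
    return len(tested_cells & all_cells) / len(all_cells)


# Hypothetical ODD with 2 * 2 * 3 = 12 attribute combinations.
odd = {
    "road_type": ["highway", "urban"],
    "time_of_day": ["day", "night"],
    "rainfall": ["none", "light", "heavy"],
}

# Three executed tests, each covering a distinct combination.
tested = [
    {"road_type": "highway", "time_of_day": "day", "rainfall": "none"},
    {"road_type": "highway", "time_of_day": "night", "rainfall": "none"},
    {"road_type": "urban", "time_of_day": "day", "rainfall": "light"},
]

coverage = odd_coverage(odd, tested)  # 3 of 12 combinations covered
```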
The following table shows how autonomy programs typically approach scenario creation depending on their development stage (Figure 2).
As seen in Figure 2, early-stage teams usually focus on building broad coverage across requirements and scenario categories. Once they have built broad coverage, later-stage teams focus on collecting and generating edge case scenarios and expanding into new domains.
Throughout their V&V efforts, autonomy programs need to build a comprehensive scenario library that covers the entire ODD for the intended deployment. Using this library, teams can test their autonomous system against key performance and safety benchmarks for the scenarios that could occur in the ODD. The “Building a comprehensive scenario library” section in the handbook lays out different approaches and techniques that programs can leverage to build out their scenario library.
Based on their scenario library and system requirements, autonomy programs should define evaluation criteria and metrics that test the system’s performance. These evaluation criteria turn a scenario into a test case. Autonomy programs should track a measurable, overall pass/fail outcome for each test case. This outcome is a composite of key competency, safety, and comfort factors: a test case passes only if all non-optional evaluation rules pass, and teams should be able to drill into each rule and its underlying metrics. The V&V handbook’s “Defining evaluation criteria and metrics” section lists specific metrics and evaluation criteria that teams should assess for their test cases.
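The composite pass/fail logic described above can be sketched as follows. This is an illustration under stated assumptions: the `Rule` and `CaseOutcome` types, the rule names, the thresholds, and the metric names are all hypothetical and do not come from the handbook.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    category: str  # "competency", "safety", or "comfort"
    check: Callable[[Dict[str, float]], bool]
    optional: bool = False


@dataclass
class CaseOutcome:
    passed: bool                  # overall pass/fail for the test case
    rule_results: Dict[str, bool]  # per-rule results for drill-down


def evaluate(rules: List[Rule], metrics: Dict[str, float]) -> CaseOutcome:
    """A test case passes only if every non-optional rule passes."""
    rule_results = {rule.name: rule.check(metrics) for rule in rules}
    passed = all(rule_results[r.name] for r in rules if not r.optional)
    return CaseOutcome(passed, rule_results)


# Hypothetical rules and thresholds.
rules = [
    Rule("no_collision", "safety", lambda m: m["min_distance_m"] > 0.0),
    Rule("reaches_goal", "competency", lambda m: m["goal_reached"] == 1.0),
    Rule("smooth_braking", "comfort",
         lambda m: m["max_decel_mps2"] <= 3.0, optional=True),
]

# Hypothetical metrics recorded from one scenario run.
metrics = {"min_distance_m": 1.8, "goal_reached": 1.0, "max_decel_mps2": 4.2}
result = evaluate(rules, metrics)
```

Here the optional comfort rule fails, but the test case still passes overall because both non-optional rules pass. The per-rule results remain available for inspection.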
The following table lays out which test methods autonomy programs typically use at each stage in their development and what role real-world tests play at each stage (Figure 3).
Autonomy teams can prevent scaling and cost issues by ramping up simulation usage as soon as possible. It can also be beneficial to transition vehicle tests to focus less on core testing and more on final validation and edge case discovery. The “Test execution” section in our handbook explains how autonomy programs can use each test method effectively depending on the team’s development stage while considering each method’s strengths and weaknesses.
One of the main challenges of test execution is the problem of combinatorial explosion. Autonomy programs must bias their resources towards safety-critical scenarios, as those provide the most information to validation, safety, and development teams. However, scenario libraries continually increase in size as the overall testing program matures. The number of scenarios teams need to test usually increases linearly with the number of new requirements. In contrast, the volume of the scenario space and the total number of scenarios teams need to execute increase exponentially with the number of ODD attributes and parameters they need to cover.
For example, an autonomy program might need to run 1.6 million variations to exhaustively test all the possible parameter combinations of a specific test case (Figure 4). This example does not include different environmental conditions (e.g., time of day, rainfall), map locations, and higher granularity of behavioral parameters, which would increase the required number of tests even further. On top of that, these 1.6 million variations only pertain to a single test case, while autonomy programs need to run thousands of test cases for each software release.
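The arithmetic behind this growth is a product over parameter value counts: every added parameter multiplies the total. The sketch below uses hypothetical parameters and value ranges that are not taken from Figure 4, so the counts differ from the 1.6 million figure above. It only illustrates how quickly the total grows.

```python
from math import prod


def variation_count(parameter_values):
    """Exhaustive variations = product of the number of values per parameter."""
    return prod(len(values) for values in parameter_values.values())


# Hypothetical behavioral parameters for a single cut-in test case.
params = {
    "ego_speed_mps": range(5, 35, 5),      # 6 values
    "actor_speed_mps": range(5, 35, 5),    # 6 values
    "cut_in_gap_m": range(5, 55, 5),       # 10 values
    "actor_type": ["car", "truck", "motorcycle"],  # 3 values
}
base = variation_count(params)  # 6 * 6 * 10 * 3 = 1080

# Adding two environmental attributes with 3 values each multiplies by 9.
extended_params = {
    **params,
    "time_of_day": ["day", "dusk", "night"],
    "rainfall": ["none", "light", "heavy"],
}
extended = variation_count(extended_params)  # 1080 * 9 = 9720
```

Halving the step size of the three numeric parameters would roughly multiply the total by another factor of eight, which is why finer granularity is so costly.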
Applied recommends a scalable simulation-first testing strategy. Unfortunately, even with simulation, teams might still need to execute hundreds of millions of scenarios in each release. To supplement their scalable simulation strategy, autonomy programs should leverage intelligent sampling techniques to identify the important scenarios to spend testing resources on. Depending on their development stage, teams should optimize for one of two goals: 1) testing for coverage and gaining new information about the ODD; or 2) finding safety-critical scenarios to drive development forward (Figure 5).
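The two optimization goals lead to different sampling strategies. The sketch below contrasts them using deliberately simple stand-ins: uniform random sampling for the coverage goal and a greedy local search for the safety-critical goal. These are illustrative choices, not the techniques from the handbook, and `simulate_min_gap` is a hypothetical toy function standing in for a real simulation run.

```python
import random


def simulate_min_gap(ego_speed, cut_in_gap):
    """Toy stand-in for a simulation: a smaller result is more critical."""
    return cut_in_gap - 0.5 * ego_speed


def sample_for_coverage(space, n, rng):
    """Goal 1: spread samples across the parameter space uniformly at random."""
    return [{k: rng.choice(v) for k, v in space.items()} for _ in range(n)]


def search_for_critical(space, n, rng):
    """Goal 2: greedy local search that keeps the most critical point found."""
    best = {k: rng.choice(v) for k, v in space.items()}
    best_score = simulate_min_gap(**best)
    for _ in range(n - 1):
        candidate = dict(best)
        key = rng.choice(sorted(space))          # perturb one parameter
        candidate[key] = rng.choice(space[key])
        score = simulate_min_gap(**candidate)
        if score < best_score:                   # keep only improvements
            best, best_score = candidate, score
    return best, best_score


# Hypothetical discretized parameter space.
space = {
    "ego_speed": [10, 15, 20, 25, 30],
    "cut_in_gap": [5, 10, 20, 40],
}

rng = random.Random(0)  # fixed seed for repeatable runs
coverage_samples = sample_for_coverage(space, 20, rng)
best, best_score = search_for_critical(space, 20, rng)
```

With the same budget of 20 runs, the first strategy yields information across the space, while the second concentrates runs near the most critical region it has found so far.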
The handbook’s “Combating combinatorial explosion in scenario-based testing” section lists techniques that autonomy programs can use to speed up their testing, development, and information gathering.
Autonomy programs in all development stages can leverage best practices for scenario creation and test execution to advance their V&V efforts. Programs should build a comprehensive scenario library, define evaluation criteria and metrics to turn scenarios into test cases, leverage different test methods effectively, and use simulation and intelligent sampling techniques.
Applied Intuition’s V&V handbook discusses these and many other topics in more detail. Download the full-length handbook today, and stay tuned for part 3 of our blog post series, which will explore how autonomy programs can define and measure coverage and analyze their system’s performance.