How to Find Undiscovered Bugs With Modality

# Summary

Modality generative testing creates a vast number of distinct test conditions and then runs them on your system to find new bugs.

Pro: This guide makes use of features available in the full version of Modality.

# Use Cases

1. If you're looking for new bugs, Modality can explore your entire system and all its operations for undiscovered issues.

2. If you're tracking an unusual behavior, Modality can target specific inputs and trigger nuanced system changes.

3. If you're hunting bugs in a subsystem, Modality can narrow your field of interest as much or as little as you'd like.

Once you find a bug, Modality generative testing returns results that make investigating and troubleshooting bugs much easier.

Diagram: Modality's generative testing process introduces mutations into your system, observes many conditions (including possible bugs), and returns sophisticated analysis.

# Compared to Traditional Bug Prevention

Traditional bug prevention requires many cycles of writing specific test cases, running tests, and then repeatedly writing new tests whenever an error occurs. With Modality, you write a generative test once in advance: you describe what region of your system you're interested in, whether that's a component or a chain of events. Modality dynamically generates a slew of different test conditions, executes all of the tests, and then returns traceable results.

Traditional bug prevention demands repeatedly writing custom interfaces to expose behavior. With Modality, your system's existing instrumentation already gives fine-grained visibility into your components.

Traditional bug prevention requires you to write different tests for simulations and for field tests. Modality probes introduce the same mutations into simulation components and live systems. The same Modality tests work for both environments, with results in the same queryable format.

When traditional bug prevention triggers a bug, the error message often just reports that an error was noticed at a certain surface. When Modality finds a failing measurement, it can describe the entire system state and provide developers with a specific trace query to investigate the actual cause of the bug.

Modality generates actionable tests that expand your coverage and eliminate the slow, reactive slogs of traditional test-writing.

# Step 1: Make your Objective

Your Modality objective sets the overall terms for your generative tests, obviating the need for hundreds of traditional, individually-written tests.

Writing your objective's mutation constraints lets you describe what types of changes Modality will apply to your system under test, but you don't have to drill down into particular parameters unless you'd like to.

Writing your objective's measurements lets you describe what system properties you actually care about in each test case. Measurements can be as simple as checking for failure events, or they can carefully track complex event sequences. Either way, Modality's reports will still provide the same tools for troubleshooting bugs that get observed.
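
An objective file combines a name, the mutation constraints and measurements described above, and optional stopping conditions (covered below). Here is a rough sketch of the overall shape, using only keys and example values that appear later in this guide; treat the values as placeholders.

# Objective name
name = "sample objective"

# Optional: constrain which mutations Modality may generate
[mutation_constraints]
max_concurrent_mutations = 1

# One or more measurements: what to check in every generated test case
[[measurement]]
name = 'No critical events occur'
check = 'NEVER *@*(_.severity > 8)'

# Optional: conditions under which generative testing stops
[stopping_conditions]
time = "2h 15min"
attempts = 50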

# 1. Create a TOML file to write your objective in

Create a text file with a .toml extension and give your objective a name. If you want to work from a finished <objective-definition-file.toml>, here's an example.

# Objective name
name = "sample objective"

# 2. Define your mutation constraints

Modality mutators can change your system, forcing internal state and simulating external conditions. By default, Modality generative tests will invent test cases across the entire possible range of ways that mutators can change your system. This is extremely useful, but if you want more specificity, add mutation constraints into your objective file to narrow your scope of inquiry.

  1. If you're looking for new bugs, leave this nearly blank and Modality will explore the entire parameter space of your system.

    • If you want to explore more complex system interactions, increase your max_concurrent_mutations so that more than one mutation can be run on your system during each test case.

# Global mutation constraints
[mutation_constraints]
max_concurrent_mutations = 1

  2. If you're tracking an unusual behavior, use global mutation constraints to explore many conditions quickly.

    • If you're only interested in certain types of changes to your system, use mutator_tags and only mutators with one of those tags will trigger mutations.

    • If you're only interested in changes to your system after certain conditions have been met, use mutation_precondition to set a universal precondition for mutations to trigger. For example, "only generate mutations that happen after this system state reports ready:"

# Global mutation constraints
[mutation_constraints]

# Only consider mutators which have at least one of the specified tags
mutator_tags = ['drone', 'environmental-forces', 'simulator']

# This precondition trace expression is a causal prefix for any mutation
# "Only generate mutations after this system's state is ready"
mutation_precondition = 'SYSTEM_STATE_READY@PX4_COMMANDER'

  3. If you're hunting bugs in a subsystem, you can get even more specific by fine-tuning constraints on individual mutators.

    • Add an array of mutators that you'd like to add specific rules to.

    • For name, use the individual mutator's official name. You can see all of your mutators with the command modality sut mutator list. Every other field besides name is optional.

    • If you want more specificity, use mutator.param_constraints to limit the range of values that a specific mutator can change for one of its specific parameters.

    • If you use mutation_precondition inside a specific mutator, you have to remove any mutation_precondition from the global mutation constraints above.

# Per-mutator constraints
[[mutator]]
name = 'simulation-environment-mutator'

# Restrict mutations to the specific probe GAZEBO_MUTATOR
probe = 'GAZEBO_MUTATOR'

# This mutator-specific precondition trace expression is a
# causal prefix for triggering a mutation
# "Wait until the drone's altitude is at least 20 meters
#  before performing mutations"
mutation_precondition = 'SIM_ALTITUDE@GAZEBO_MUTATOR(_.payload > 20)'

  # "Constrain the impact force between 0.1 Newtons and 30.0 Newtons"
  [[mutator.param_constraints]]
  name = 'impact-force-magnitude'
  range = [0.1, 30.0]

  # "Constrain the impact force link to either ROTOR_0 or ROTOR_1"
  [[mutator.param_constraints]]
  name = 'impact-force-location-link'
  range = ['ROTOR_0', 'ROTOR_1']

# 3. Define your measurements

Measurements are Modality queries that check what you want to know about your system for each of the test cases that will be generated. You can observe everything from straightforward expectations to complex chains of events. Your objective can have as many measurements as you'd like, so it's easy to start simple and add more as you go.

  1. If you're looking for new bugs, start with a simple pass/fail measurement, like checking whether a critical failure event occurs.

    • Every measurement needs a name.

    • Every measurement uses check to query your system.

    • Use NEVER to check that some pattern does not occur anywhere.

# This measurement simply fails if any events 
# with severity greater than 8 occur
[[measurement]]
name = 'No critical events occur'

# The trace query expression to check
check = 'NEVER *@*(_.severity > 8)'

  2. If you're tracking an unusual behavior, write a measurement to check system performance when an unusual condition occurs.

    • Every measurement needs a name.

    • Every measurement uses check to query your system.

    • This example uses FOR EACH / VERIFY to check that every occurrence of some event pattern is followed by the appropriate remediation.

# This measurement checks that the unsafe tilt detection and reaction 
# mechanisms occur as expected

[[measurement]]
name = 'The system successfully detects unsafe tilt and terminates flight'

check = '''
FOR EACH
    # Either roll or pitch is out of bounds (beyond ±45 degrees) for the tilt detector
    *@*((_.event = "IMU_ROLL" OR _.event = "IMU_PITCH") AND (_.payload > 45 OR _.payload < -45)) AS ExcessiveTilt

VERIFY
    ExcessiveTilt -> UNSAFE_TILT_DETECTED@PX4_COMMANDER -> FLIGHT_TERMINATED@PX4_COMMANDER
'''

  3. If you're hunting bugs in a subsystem, measure whether a particular component is returning acceptable results during all of the test cases.

    • Every measurement needs a name.

    • Every measurement uses check to query your system.

    • This example uses FOR AGGREGATE to check whether this component's outputs are within normal bounds.

# This measurement checks that the IMU gyroscope's
# instability metric is nominal
[[measurement]]
name = 'IMU gyroscope instability is acceptable'

check = '''
FOR AGGREGATE
    IMU_GYRO_INSTABILITY@PX4_VEHICLE_IMU(_.payload != 0.0) AS GyroInstability

VERIFY
    max(GyroInstability.payload) < 0.45,
    mean(GyroInstability.payload) < 0.2,
    stddev(GyroInstability.payload) < 0.1
'''

# 4. Define any stopping conditions

Finally, you can set the circumstances where generative testing will stop. Stopping conditions are useful for saving resources and getting a response as soon as you have enough interesting results to act upon.

  1. As soon as a single stopping condition is met, all generative testing stops.

  2. If you want Modality to keep generating tests forever until you stop it manually, just leave [stopping_conditions] empty.

  3. If you have time or resource constraints, you can set a maximum amount of time or a maximum number of attempts.

# General stopping conditions, connected by OR
[stopping_conditions]

# Maximum wall-clock time of related mutation-epoch attempts
time = "2h 15min"

# Maximum number of related mutation-epoch attempts
attempts = 50

  4. If you expect to have gathered enough useful results after a certain number of overall passing or failing measurements, you can set limits with the global passing_measurements or failing_measurements.

# General stopping conditions, connected by OR
[stopping_conditions]

# Maximum number of times all the measurements passed
passing_measurements = 10

# Maximum number of times not all the measurements passed
failing_measurements = 5

  5. If you would like to stop all testing when a particular measurement passes or fails too many times, you can insert passing_measurements and failing_measurements into a specific measurement.

# Same as before: This measurement simply fails if any
# failure-like events with severity greater than 8 occur
[[measurement]]
name = 'No critical events occur'

# Same as before: The trace query expression to check
check = 'NEVER *@*(_.severity > 8)'

# New: Measurement-specific stopping condition,
# stop immediately if this check fails
failing_measurements = 1

# Step 2: Run your Generative Tests

Once you've defined your objective, you'll import it into Modality and then write two scripts: one that handles your system's per-test-case behavior, and one that brings your system online and calls the Modality commands that run your generative tests.

While you're elsewhere, Modality will generate as many diverse system-state conditions as your objective's constraints allow, run them through your system, and record the results. In the final step, you'll analyze the session, with Modality highlighting issues in the measurements you care about.

# 1. Import your objective into Modality

  1. Save your completed <objective-definition-file.toml> file. Here's that example objective again.
  2. In your terminal, run modality objective create <path/to/objective-definition-file.toml>

# 2. Write your per-test-case script

Now that you have a new objective in Modality, add it to a script that handles your system under test's behavior for each of the many test cases that will be run.

  1. In your tool of choice, write a script that does the following (a minimal shell sketch appears after this list):

  2. Run per-test-case start up behaviors: get your system ready to take commands.

  3. Next, call modality mutate --objective-name <objective-name>

  4. If you want some of your test cases to pass outside the predefined safety ranges of your mutators, also pass the --allow-outside-safety-range flag to your modality mutate command.

  5. Add per-test-case behaviors: command your system to induce whatever behaviors you'd like to observe during each test case. These are normal commands to your system, not Modality mutations.

  6. Add per-test-case tear down behaviors: order your system to do whatever you'd like after each individual test case is run.

  7. Save your per-test-case script. We'll call it <your per-test-case script> below.
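
Here is a minimal sketch of what such a per-test-case script might look like in a shell environment. The helper scripts and the example-objective name are hypothetical placeholders for your own tooling; only the modality mutate call and its --allow-outside-safety-range flag come from this guide.

#!/usr/bin/env bash
# Hypothetical per-test-case script: adapt the helper commands to your own system.
set -euo pipefail

# Per-test-case start up: get the system ready to take commands
./scripts/arm_and_takeoff.sh                      # hypothetical helper

# Ask Modality to inject this test case's mutation(s)
modality mutate --objective-name example-objective
# Add --allow-outside-safety-range above if some test cases may go
# outside your mutators' predefined safety ranges.

# Per-test-case behaviors: normal commands to your system, not Modality mutations
./scripts/fly_sample_mission.sh                   # hypothetical helper

# Per-test-case tear down
./scripts/land_and_reset.sh                       # hypothetical helper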

# 3. Execute your generative tests lifecycle

Next, you'll use Modality to run the generative tests on your live system. We recommend you write a command script in your tool of choice to automate these steps; a minimal sketch follows the list below.

  1. Spin up your system so that it is ready for your per-test-case script.

  2. Create a Modality session by calling modality session open <session name> <system under test name>.

  3. Start generative testing by calling modality execute <your per-test-case script> --objective-name <objective-name>.

  4. Your generative tests will run until one of the stopping conditions you specified, like a time limit, is hit. If you don't specify any stopping conditions, Modality will keep generating test cases indefinitely until you stop it manually.

  5. After the tests are complete, close the session by calling modality session close <session name>.

    • All of these steps can be done manually through the command line, but the benefit of using a script is that your Modality observation session closes right after all of your tests are complete.
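
As a sketch, a lifecycle script along these lines might look like the following, again assuming a shell environment. The start_system.sh helper and the example-session, example-sut, and example-objective names are hypothetical placeholders; the modality session and modality execute commands come from this guide.

#!/usr/bin/env bash
# Hypothetical lifecycle script: open a session, run the generative tests, close the session.
set -euo pipefail

# 1. Spin up the system so it's ready for the per-test-case script
./scripts/start_system.sh                         # hypothetical helper

# 2. Open a Modality session
modality session open example-session example-sut

# 3. Run generative tests; Modality drives the per-test-case script for each
#    generated test case until a stopping condition is reached
modality execute ./per-test-case.sh --objective-name example-objective

# 4. Close the session as soon as testing completes
modality session close example-session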

# Step 3: Analyze your Results

At this point, Modality has run tests across more distinct conditions than any human could write by hand. You can now query the results for meaningful measurements that point directly to bugs.

# 1. Use objective inspect for a big-picture overview

  1. When your objective measurements accurately describe unwanted system behaviors, every generated test case that forces a measurement to fail is pointing to a potential bug.

  2. To start with a high-altitude view of measurement results in the generative testing session, call modality objective inspect <objective name> <objective instance id>.

  
  $ modality objective inspect example-objective 1
  Name: example-objective
  SUT: example-sut
  Session: example-session
  Created at: 2021-05-26 09:33:31 UTC
  Created by: example-user
  Stopped at: 2021-05-26 09:35:04 UTC
  Stopping Conditions Reached: true
  Stopping Conditions Progress:
    Time: 00:01:33
    Passing Measurements: 10
    Failing Measurements: 3
  Passing Measurements: 8
    Consumer heartbeat timing: 10
    Consumer to monitor heartbeat: 13
    Monitor checks consumer heartbeats: 13
    Monitor checks producer heartbeats: 13
    Monitor heartbeat timeouts: 10
    Producer lifecycle: 13
    Producer to consumer communications: 13
    Producer to monitor heartbeat: 13
  Failing Measurements: 2
    Consumer heartbeat timing: 3
    Monitor heartbeat timeouts: 3
  Mutations: 15

  3. The Failing Measurements section shows the name of each failing measurement and the number of times it failed.

  4. If you'd like additional detail before forwarding the results to troubleshooters, add verbosity flags. modality objective inspect <objective name> <objective instance id> -vv will give a breakdown of all mutations that were injected and all measurement results per execution run.

  5. Equipped with these failing measurements, a troubleshooter can dig into the uncovered system behavior as part of investigating the cause of bugs with Modality.

# 2. Use expectations to further search for bugs

  1. If your instrumentation includes expectations to indicate important failures, you can easily query Modality for all test cases that caused an expectation failure: modality query '*@*(_.outcome = fail)'

  2. Modality will return all failing expectations, along with their exact trace coordinates. These are precise starting points for your follow-up troubleshooting.

$ modality query '*@*(_.outcome = fail)'
Result 1:
═════════
*@*((_.outcome = FAIL))(78091057:327686:7:4, CONSUMER_TIMEOUT @ MONITOR, outcome=FAIL)

Result 2:
═════════
*@*((_.outcome = FAIL))(78091057:327687:9:4, CONSUMER_TIMEOUT @ MONITOR, outcome=FAIL)
...

  3. If you'd like additional detail before forwarding the results to troubleshooters, your query results can be made much more verbose or specific; see the reference for modality query.

  4. Most importantly, these same queries can be directly applied by troubleshooters when they investigate the cause of bugs with Modality.

# Next steps

  1. If at first your generative tests don't cause any measurement or expectation failures:

    • Add objective measurements that describe unwanted system conditions.

    • Write your existing system requirements as measurement check queries. This is an excellent way to expand coverage and validate your system requirements at the same time.

    • Reduce the number of constraints and preconditions on your mutators.

    • Consider adding the --allow-outside-safety-range flag to your script's modality mutate command, at least for simulations.

    • Add new mutations that wouldn't be possible in traditional system tests.

  2. Once a potential bug has been identified, developers and other troubleshooters can use these same Modality tools to dig deeper and investigate the cause of bugs with Modality.

  3. If you add new features to your system, simply apply instrumentation to any new components and execute the same generative tests again. Modality will test, trace, and report new interactions that suggest potential bugs.

  4. Once your generative tests include measurements for all of your system requirements, passing all of the tests provides confidence that your system performs to specification under a vast array of circumstances.