How to Reproduce a Bug with Modality

# Summary

Modality generates a flood of different test cases and then runs them on your system to automatically reproduce the bug behavior you've described.

Pro: This guide uses features available in the full version of Modality.

# Use Cases

You've received a traditional bug report, describing an issue in natural language, and:

  1. The bug report includes conditions that are difficult to reproduce in a test environment.

  2. Following the bug report's description doesn't reproduce bug behavior as expected.

  3. The bug report is missing the necessary details to reproduce bug behavior.

Once Modality reproduces a bug, it returns results that make investigating and troubleshooting bugs much easier.

[Figure: Modality's generative testing process: introducing mutations into your system, observing many conditions including possible bugs, and returning sophisticated analysis.]

# Compared to Traditional Bug Replication

Traditional bug replication is an ordeal of manually futzing with different system parameters one at a time. Modality tests out a vast number of system parameters automatically, without your manual oversight.

Traditional bug replication requires you to start with a best guess about the cause of the bug, which has to be constantly updated as you try different inputs. With Modality, you describe the bug behavior once, and then let generative testing catch the conditions that cause it.

Traditional bug replication is constrained by conditions that are tough to reproduce in a test environment. Modality mutations allow direct changes inside your system that are beyond the reach of many bench testing environments.

When a bug is reproduced with traditional methods, you often only know one narrow set of inputs that induced the bug. Modality collects all of the many test cases with a failing measurement, catching intermittent behavior and counterintuitive conditions.

Traditional bug replication only informs you that certain inputs resulted in bug behavior. Modality records your system's entire event history and provides developers with a trace query to investigate the bug's cause.

# Make your Replication Objective

Your Modality objective sets the overall terms for your generative tests, obviating the need for hundreds of traditional system parameter adjustments.

Writing your objective's mutation constraints lets you describe what types of changes Modality will apply to your system under test, which can be helpful in reproducing bug report conditions.

Writing your objective's measurements lets you describe the bug behavior from your bug report in terms of system properties. Measurements can be as simple as checking for failure events, or they can carefully track complex event sequences.

# 1. Create a TOML file to write your objective in

Create a text file with a .toml extension and give your objective a name. If you want to work off of a finished <objective-definition-file.toml>, here's an example.

# Objective name
name = "sample replication objective"

# 2. Define your mutation constraints

Modality mutators can change your system state. By default, Modality generative tests will invent test cases across the entire possible range of ways that mutators can change your system. This is extremely useful for replicating an intermittent bug, but if you want more specificity, add mutation constraints into your <objective-definition-file.toml> to narrow your scope of inquiry.

  1. If the bug report is missing the necessary details to reproduce bug behavior, leave this nearly blank and Modality will explore the entire parameter space of your system.

    • If you want to explore more complex system interactions, increase your max_concurrent_mutations so that more than one mutation can be run on your system during each test case.

    # Global mutation constraints
    max_concurrent_mutations = 1

  2. If following the bug report's description doesn't reproduce bug behavior as expected, use global mutation constraints to explore many conditions quickly.

    • If you're only interested in certain types of changes to your system, use mutator_tags and only mutators with one of those tags will trigger mutations.

    • If you're only interested in changes to your system after certain conditions have been met, use mutation_precondition to set a universal precondition for mutations to trigger. For example, "only generate mutations that happen after this system state reports ready:"

    # Global mutation constraints
    # Only consider mutators which have at least one of the specified tags
    mutator_tags = ['drone', 'environmental-forces', 'simulator']
    # This precondition trace expression is a causal prefix for any mutation
    # "Only generate mutations after this system's state is ready"
    mutation_precondition = 'SYSTEM_STATE_READY@PX4_COMMANDER'

  3. If the bug report includes conditions that are difficult to reproduce in a test environment, you can get even more specific by fine-tuning constraints on individual mutators.

    • Add an array of mutators that you'd like to add specific rules to.

    • For name, use the individual mutator's official name. You can see all of your mutators with the command modality sut mutator list. Every other field besides name is optional.

    • If you want more specificity, use mutator.param_constraints to have an individual mutator only work within your given range of values for one of its parameters.

    • If you use mutation_precondition inside a specific mutator, you have to remove any mutation_precondition from the global mutation constraints above.

# Per-mutator constraints
[[mutator]]
name = 'simulation-environment-mutator'

# Restrict mutations to the specific probe GAZEBO_MUTATOR

# This mutator-specific precondition trace expression is a
# causal prefix for triggering a mutation
# "Wait until the drone's altitude is at least 20 meters
#  before performing mutations"
mutation_precondition = 'SIM_ALTITUDE@GAZEBO_MUTATOR(_.payload > 20)'

# "Constrain the impact force between 0.1 Newtons and 30.0 Newtons"
[[mutator.param_constraints]]
name = 'impact-force-magnitude'
range = [0.1, 30.0]

# "Constrain the impact force link to either ROTOR_0 or ROTOR_1"
[[mutator.param_constraints]]
name = 'impact-force-location-link'
range = ['ROTOR_0', 'ROTOR_1']
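
To make the effect of these constraints concrete, here is a minimal Python sketch of the selection logic described above. It is not Modality's implementation: the Mutator class, the tags assigned to each mutator, and the battery-drain-mutator are hypothetical, and the sketch assumes that a numeric range is an inclusive pair of bounds while a list of strings is a set of allowed values.

```python
from dataclasses import dataclass, field


@dataclass
class Mutator:
    """Hypothetical stand-in for a mutator: a name plus its tags."""
    name: str
    tags: set = field(default_factory=set)


def eligible_mutators(mutators, mutator_tags):
    """Keep only mutators that carry at least one of the requested tags."""
    wanted = set(mutator_tags)
    return [m for m in mutators if m.tags & wanted]


def satisfies_constraint(value, allowed):
    """Check one parameter value against a param_constraints range.

    Assumption: two numbers are inclusive [low, high] bounds; a list of
    strings means the value must be one of the listed names.
    """
    if all(isinstance(a, str) for a in allowed):
        return value in allowed
    low, high = allowed
    return low <= value <= high


# The tags below are invented for illustration only.
mutators = [
    Mutator("simulation-environment-mutator", {"simulator", "environmental-forces"}),
    Mutator("battery-drain-mutator", {"power"}),
]
selected = eligible_mutators(mutators, ["drone", "environmental-forces", "simulator"])
print([m.name for m in selected])
print(satisfies_constraint(12.5, [0.1, 30.0]))
print(satisfies_constraint("ROTOR_2", ["ROTOR_0", "ROTOR_1"]))
```

Narrowing by tag shrinks the set of mutators that can fire, while param_constraints shrink the values a single mutator may choose, so the two can be combined.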

# 3. Define your bug behavior as Modality measurements

Measurements are Modality queries that check what you want to know about your system for each of the test cases that will be generated. Next, you'll use measurements to describe the bug behavior you're trying to reproduce. You can observe everything from straightforward expectations to complex chains of events. Your objective can have as many measurements as you'd like, so it's easy to start simple and add more as you go.

  1. Measurements describe the way your system should work, but doesn't when the bug occurs. Depending on the nature of your bug, it could be described simply as the existence of some event, like checking whether a critical failure event occurs.

    • Every measurement needs a name

    • Every measurement uses check to query your system.

    • Use NEVER to check that some pattern does not occur anywhere.

# This measurement simply fails if any failure-like events 
# with severity greater than 8 occur
name = 'No critical events occur'

# The trace query expression to check
check = 'NEVER *@*(_.severity > 8)'

  2. If your bug is triggered by an unusual system condition, write a measurement to check system behavior when that unusual condition occurs.

    • Every measurement needs a name

    • Every measurement uses check to query your system.

    • This example uses FOR EACH / VERIFY to check that every occurrence of some event pattern is followed by the appropriate remediation.

# This measurement checks that the tilt detection and reaction mechanism 
# happens within the specified amount of time
name = 'Tilt detection works in a timely fashion'

check = '''
    # Either roll or pitch is in bounds for the tilt detector
    *@*((_.event = "IMU_ROLL" OR _.event = "IMU_PITCH") AND (_.payload > 45 OR _.payload < -45)) AS ExcessiveTilt

  3. If your bug involves problematic parameter values, use measurement aggregates to verify conditions about those values.

    • Every measurement needs a name

    • Every measurement uses check to query your system.

    • This example uses FOR AGGREGATE to check whether this component's outputs are within normal bounds.

# This measurement checks that the IMU gyroscope's
# instability metric is nominal
name = 'IMU gyroscope instability is acceptable'

check = '''
    IMU_GYRO_INSTABILITY@PX4_VEHICLE_IMU(_.payload != 0.0) AS GyroInstability

    max(GyroInstability.payload) < 0.45,
    mean(GyroInstability.payload) < 0.2,
    stddev(GyroInstability.payload) < 0.1
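
If it helps to reason about what these three measurements check, the following Python sketch restates their conditions over plain dictionaries. It is only an illustration, not Modality's query engine: the event dictionaries are a hypothetical representation, the sketch uses the population standard deviation (which variant Modality's stddev uses is not stated here), and it assumes an aggregate over zero samples passes.

```python
from statistics import mean, pstdev


def no_critical_events(events):
    """Mirror of NEVER *@*(_.severity > 8): pass only if no event exceeds 8."""
    return not any(e.get("severity", 0) > 8 for e in events)


def is_excessive_tilt(event):
    """Mirror of the ExcessiveTilt pattern: roll or pitch beyond +/-45."""
    return event["event"] in ("IMU_ROLL", "IMU_PITCH") and (
        event["payload"] > 45 or event["payload"] < -45
    )


def gyro_instability_ok(payloads):
    """Mirror of the aggregate check over nonzero instability payloads."""
    samples = [p for p in payloads if p != 0.0]
    if not samples:
        return True  # assumption: nothing to aggregate counts as passing
    return max(samples) < 0.45 and mean(samples) < 0.2 and pstdev(samples) < 0.1


print(no_critical_events([{"severity": 3}, {"severity": 9}]))
print(is_excessive_tilt({"event": "IMU_PITCH", "payload": -46}))
print(gyro_instability_ok([0.1, 0.15, 0.0, 0.2]))
```

Note that a set of samples can stay under the max and mean limits and still fail on spread alone, which is why the aggregate example checks all three.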

# 4. Define any stopping conditions

Finally, you can set the circumstances where generative testing will stop. Stopping conditions are useful for saving resources and getting a response as soon as you have enough interesting results to act upon.

  1. As soon as a single stopping condition is met, all generative testing stops.

  2. If you want Modality to keep generating tests forever until you stop it manually, just leave [stopping_conditions] empty.

  3. If you have time or resource constraints, you could set a maximum amount of time or attempts.

# General stopping conditions, connected with OR
[stopping_conditions]

# Maximum wall-clock time of related mutation-epoch attempts
time = "2h 15min"

# Maximum number of related mutation-epoch attempts
attempts = 50

  4. If you expect to have gathered useful results after a certain number of overall passing or failing measurements, you can set limits on global passing_measurements or failing_measurements.

# General stopping conditions, connected with OR
[stopping_conditions]

# Maximum number of times all the measurements passed
passing_measurements = 10

# Maximum number of times not all the measurements passed
failing_measurements = 5

  5. If you would like to stop all testing when a particular measurement passes or fails too many times, you can insert passing_measurements and failing_measurements into a specific measurement.

    • If you have an intermittent bug, we recommend you let generative testing continue until you have a meaningful number of passing/failing measurements for its behavior.

# Same as before: This measurement simply fails if any failure-like events with severity greater than 8 occur
name = 'No critical events occur'

# Same as before: The trace query expression to check
check = 'NEVER *@*(_.severity > 8)'

# New: measurement-specific stopping condition.
# Stop all generative testing as soon as this check fails
failing_measurements = 0
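
The OR relationship between stopping conditions can be sketched in a few lines of Python. This is a hedged illustration rather than Modality's logic: the time_seconds key is a hypothetical stand-in for the time setting, and the sketch assumes a limit counts as met once its counter exceeds it, which is consistent with failing_measurements = 0 stopping on the first failure. Modality's exact boundary behavior may differ.

```python
def should_stop(limits, progress):
    """Return True as soon as any configured limit is exceeded (OR semantics).

    limits:   stopping conditions, e.g. {"attempts": 50}
    progress: counters observed so far, using the same keys
    Assumption: a limit counts as met once its counter exceeds it.
    """
    return any(progress.get(name, 0) > limit for name, limit in limits.items())


# "2h 15min" expressed in seconds, plus a cap of 50 attempts.
limits = {"time_seconds": 2 * 3600 + 15 * 60, "attempts": 50}

print(should_stop(limits, {"time_seconds": 600, "attempts": 12}))
print(should_stop(limits, {"time_seconds": 600, "attempts": 51}))
# With no limits configured, testing never stops on its own.
print(should_stop({}, {"time_seconds": 10**6, "attempts": 10**6}))
```

Because any single condition ends the run, it is worth setting a generous time or attempts limit alongside a measurement-based one so an intermittent bug has room to show up.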

# Run Generative Tests

Once you've defined your objective, you'll want to import your objective into Modality, then write two scripts to bring your system online and call the Modality commands that run your generative tests.

While you're elsewhere, Modality will proactively generate as many diverse conditions of system state as it can fit into your objective's constraints, run them through your system, and record the results. In the final step, you'll analyze the session, with Modality highlighting cases where the measurements you used to describe bug behavior caught something interesting.

# 1. Import your objective into Modality

  1. Save your completed <objective-definition-file.toml>. Here's that example objective again.

  2. In your terminal, run modality objective create <path/to/objective-definition-file.toml>

# 2. Write your per-test-case script

Now that you have a new objective in Modality, add it to a script that handles your system under test's behavior for each of the many test cases that will be run.

  1. In your tool of choice, write a script that does the following:

  2. Add per-test-case start up behaviors: spin up your system and get it ready to take commands.

  3. Next, call modality mutate --objective-name <objective-name>

  4. If you want some of your test cases to use values outside the predefined safety ranges of your mutators, also pass the --allow-outside-safety-range flag to your modality mutate command.

  5. Add per-test-case behaviors: command your system to induce whatever system behaviors you'd like to observe during each test case. These are normal commands to your system, not Modality mutations.

  6. Add per-test-case tear down behaviors: order your system to do whatever you'd like after each individual test case is run.

  7. Save your per-test-case script. We'll call it <your per-test-case script> below.
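
As one possible shape for this script, here is a Python sketch of the steps above. The start_up, exercise, and tear_down callables are hypothetical placeholders for your own system's behavior, and the run parameter exists only so the sketch can be dry-run without Modality installed. The modality mutate command and its flags are the ones named in the steps.

```python
import subprocess


def mutate_command(objective_name, allow_outside_safety_range=False):
    """Build the modality mutate call for one test case."""
    cmd = ["modality", "mutate", "--objective-name", objective_name]
    if allow_outside_safety_range:
        cmd.append("--allow-outside-safety-range")
    return cmd


def run_test_case(objective_name, start_up, exercise, tear_down,
                  run=subprocess.run, allow_outside_safety_range=False):
    """Start the system, request mutations, exercise it, then tear down."""
    start_up()
    try:
        run(mutate_command(objective_name, allow_outside_safety_range), check=True)
        exercise()  # normal commands to your system, not mutations
    finally:
        tear_down()  # runs even if mutate or exercise raises


# Dry run: record what would happen instead of touching a real system.
log = []
run_test_case(
    "sample replication objective",
    start_up=lambda: log.append("start up"),
    exercise=lambda: log.append("exercise"),
    tear_down=lambda: log.append("tear down"),
    run=lambda cmd, check: log.append(" ".join(cmd)),
)
print(log)
```

Putting tear-down in a finally block is a design choice in this sketch: it keeps one failed test case from leaving the system in a state that spoils the next one.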

# 3. Execute your generative tests lifecycle

Next, you'll use modality to run the generative tests on your live system. We recommend you write a command script in your tool of choice to automate these steps.

  1. Perform system start-up so that it is ready for your per-test-case script.

  2. Create a Modality session by calling modality session open <session name> <system under test name>

  3. Start generative testing by calling modality execute <your per-test-case script> --objective-name <objective-name>.

  4. Your generative tests will run until one of the stopping conditions you specified, like a time limit, is hit.

  5. After the tests are complete, close the session by calling modality session close <session name>

    • All of these steps can be done manually through the command line, but the benefit of using a script is that your Modality observation session closes right after all of your tests are complete.
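
A command script for this lifecycle might look like the following Python sketch. It assumes your system start-up has already happened, uses only the three modality commands listed above, and takes a run parameter so it can be dry-run without Modality installed. The session, system, and script names in the example call are hypothetical.

```python
import subprocess


def lifecycle_commands(session, sut, script, objective):
    """Return the open, execute, and close commands in order."""
    return [
        ["modality", "session", "open", session, sut],
        ["modality", "execute", script, "--objective-name", objective],
        ["modality", "session", "close", session],
    ]


def run_lifecycle(session, sut, script, objective, run=subprocess.run):
    """Open a session, run generative tests, and always close the session."""
    open_cmd, execute_cmd, close_cmd = lifecycle_commands(session, sut, script, objective)
    run(open_cmd, check=True)
    try:
        run(execute_cmd, check=True)  # returns once a stopping condition is hit
    finally:
        run(close_cmd, check=True)


# Dry run with hypothetical names: print the commands instead of running them.
run_lifecycle(
    "replication-session",
    "example-sut",
    "./per-test-case.sh",
    "sample replication objective",
    run=lambda cmd, check: print(" ".join(cmd)),
)
```

Closing the session in a finally block means the session ends right after testing even when modality execute exits with an error.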

# Check your Results

Modality has now run many test cases across more distinct conditions than you could practically write tests for by hand. You can now check the results for failing measurements that point to your bug.

  1. Call modality objective inspect <objective-name> <objective-instance-number> to see which of your bug measurements passed and failed.

     $ modality objective inspect demo-objective 1
     Name: demo-objective
     SUT: example-sut
     Session: long-running-experiment
     Created at: 2021-05-24 12:50:22 UTC
     Created by: example-user
     Stopped at: 2021-05-24 13:02:26 UTC
     Stopping Conditions Reached: true
     Stopping Conditions Progress:
       Time: 00:12:04
       Passing Measurements: 80
       Failing Measurements: 18
     Passing Measurements: 4
       Consumer lifecycle: 98
       Heartbeat message receive interval: 80
       Heartbeat sender IDs are checked: 98
       Producer to consumer communications: 98
     Failing Measurements: 1
       Heartbeat message receive interval: 18
     Mutations: 100

  2. Measurements are written to describe the way your system is normally expected to behave, but didn't in the case of your bug. If there are failing measurements, then it's likely you reproduced your bug! Move on to Investigating the Cause of a Bug Found with Modality Generative Testing.

  3. If at first your generative test results don't return any failing measurements:

    • Add additional measurements to your replication objective that describe the bug with greater specificity. For example, if bug behavior is occurring at unexpected times, make a measurement to specify the proper order of relevant events.

    • Run a broader session of generative testing to reproduce the issue and gather a wider set of conditions for analysis. Do this by reducing the number of constraints and preconditions on your mutators.

    • Run a longer session of generative testing to collect more data for Modality to crunch. You can do this by simply loosening your stopping conditions to generate more tests.

    • Consider adding the --allow-outside-safety-range flag to your script's modality mutate --objective-name <objective-name> command, at least for simulations.

    • Call modality sut mutator list -v to see all of the mutators in your system, and consider instrumenting additional mutators on components that are related to this bug.

    • Run modality metrics coverage --using <session-name> to see if your test scripts are exercising enough of your system to produce meaningful results in your generative tests. See how to write better tests with Modality.

  4. Once you've reproduced your bug, you can use your Modality replication objective to dig deeper and investigate the cause of bugs with Modality.
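
If you run many sessions, you may want to pull the failing measurements out of the modality objective inspect text automatically. The Python sketch below parses an excerpt of the sample output from step 1. It assumes the indentation-based layout shown there stays stable, which is an assumption about the tool's text output rather than a documented format.

```python
SAMPLE = """\
Name: demo-objective
Stopping Conditions Reached: true
Stopping Conditions Progress:
  Time: 00:12:04
  Passing Measurements: 80
  Failing Measurements: 18
Passing Measurements: 4
  Consumer lifecycle: 98
  Heartbeat message receive interval: 80
  Heartbeat sender IDs are checked: 98
  Producer to consumer communications: 98
Failing Measurements: 1
  Heartbeat message receive interval: 18
Mutations: 100
"""


def failing_measurements(inspect_output):
    """Return {measurement name: failure count} from the inspect text."""
    failing = {}
    section_indent = None  # indent of the "Failing Measurements" line we are under
    for line in inspect_output.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        key, _, value = line.strip().rpartition(": ")
        if section_indent is not None:
            if indent > section_indent and value.isdigit():
                failing[key] = int(value)  # an indented child of the section
                continue
            section_indent = None  # a dedent ends the section
        if key == "Failing Measurements":
            section_indent = indent
    return failing


print(failing_measurements(SAMPLE))
```

An empty result means no measurement failed in that objective instance, which is the cue to try the suggestions in step 3.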