# How to Find Undiscovered Bugs With Modality Generative Testing

# Summary

Modality generative testing creates a vast number of distinct test conditions and then runs them on your system to find new bugs.

# Use Cases

1. If you're looking for new bugs, Modality can explore your entire system and all its operations for undiscovered issues.

2. If you're tracking an unusual behavior, Modality can target specific inputs and trigger nuanced system changes.

3. If you're hunting bugs in a subsystem, Modality can narrow your field of interest as much or as little as you'd like.

Once you find a bug, Modality generative testing returns results that make investigating and troubleshooting bugs much easier.

A diagram of Modality's generative testing process: introducing mutations into your system, observing many conditions (including possible bugs), and returning sophisticated analysis.

# Compared to Traditional Bug Prevention

Traditional bug prevention requires many cycles of writing specific test cases, running tests, and then repeatedly writing new tests whenever an error occurs. With Modality, you write a generative test once in advance: you describe what region of your system you're interested in, whether that's a component or a chain of events. Modality dynamically generates a slew of different test conditions, executes all of the tests, and then returns traceable results.

Traditional bug prevention demands repeatedly writing custom interfaces to expose behavior. With Modality, your system's existing instrumentation already gives fine-grained visibility into your components.

Traditional bug prevention requires you to write different tests for simulations and for field tests. Modality probes introduce the same mutations into simulation components and live systems. The same Modality tests work for both environments, with results in the same queryable format.

When traditional bug prevention triggers a bug, the error message often just reports that an error was noticed at a certain surface. When Modality finds a failing measurement, it can describe the entire system state and provide developers with a specific trace query to investigate the actual cause of bugs with Modality.

Modality generates actionable tests that expand your coverage and eliminate the slow, reactive slogs of traditional test-writing.

# Step 1: Make your Objective

Your Modality objective sets the overall terms for your generative tests, obviating the need for hundreds of traditional, individually-written tests.

Writing your objective's mutation constraints lets you describe what types of changes Modality will apply to your system under test, but you don't have to drill down into particular parameters unless you'd like to.

Writing your objective's measurements lets you describe what system properties you actually care about in each test case. Measurements can be as simple as checking for failure events, or they can carefully track complex event sequences. Either way, Modality's reports will still provide the same tools for troubleshooting bugs that get observed.
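
Before walking through each part, here is a minimal sketch of how those pieces fit together in a single objective file. Every section and key name below is taken from the examples later in this guide; the values are just placeholders:

```toml
# Objective name
name = "sample objective"

# Global constraints on what kinds of changes Modality may apply (section 2)
[mutation_constraints]
max_concurrent_mutations = 1

# Optional per-mutator rules (section 2)
[[mutator]]
name = 'simulation-environment-mutator'

# What to check about your system in each generated test case (section 3)
[[measurement]]
name = 'No critical events occur'
check = 'MATCH (severity > 8) AS AnyCriticalFailures'
should_pass = false

# When to stop generating test cases (section 4)
[stopping_conditions]
attempts = 50
```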

# 1. Create a TOML file to write your objective in

Create a file with a .toml extension and provide a name. If you want to work off of a finished objective file, here's an example.

```toml
# Objective name
name = "sample objective"
```

# 2. Define your mutation constraints

Modality mutators can change your system, forcing internal state and simulating external conditions. By default, Modality generative tests will invent test cases across the entire possible range of ways that mutators can change your system. This is extremely useful, but if you want more specificity, add mutation constraints into your objective file to narrow your scope of inquiry.

  1. If you're looking for new bugs, leave this nearly blank and Modality will explore the entire parameter space of your system.

    • If you want to explore more complex system interactions, increase your max_concurrent_mutations so that more than one mutation can be run on your system during each test case.

```toml
# Global mutation constraints
[mutation_constraints]
max_concurrent_mutations = 1
```

  2. If you're tracking an unusual behavior, use global mutation constraints to explore many conditions quickly.

    • If you're only interested in certain types of changes to your system, use mutator_tags and only mutators with one of those tags will trigger mutations.

    • If you're only interested in changes to your system after certain conditions have been met, use mutation_precondition to set a universal precondition for mutations to trigger. For example, "only generate mutations that happen after this system state reports ready:"

```toml
# Global mutation constraints
[mutation_constraints]

# Only consider mutators which have at least one of the specified tags
mutator_tags = ['drone', 'environmental-forces', 'simulator']

# This precondition trace expression is a causal prefix for any mutation
# "Only generate mutations after this system's state is ready"
mutation_precondition = 'MATCH (name = "SYSTEM_STATE_READY") AS ReadyState'
```

  3. If you're hunting bugs in a subsystem, you can get even more specific by fine-tuning constraints on individual mutators.

    • Add an array of mutators that you'd like to add specific rules to.

    • For name, use the individual mutator's official name. You can see all of your mutators with the command modality sut mutator list. Every other field besides name is optional.

    • If you want more specificity, use mutator.param_constraints to limit the range of values that a specific mutator can change for one of its specific parameters.

    • If you use mutation_precondition inside a specific mutator, you have to remove any mutation_precondition from the global mutation constraints above.

```toml
# Per-mutator constraints
[[mutator]]
name = 'simulation-environment-mutator'

# Restrict mutations to the specific probe SIM_MUTATOR_PLUGIN
probe = 'SIM_MUTATOR_PLUGIN'

# This mutator-specific precondition trace expression is a
# causal prefix for triggering a mutation
# "Wait until the drone's altitude is at least 20 meters
#  before performing mutations"
mutation_precondition = 'MATCH (SIM_ALTITUDE @ SIM_MUTATOR_PLUGIN AND payload > 20) AS HoverAltitudeReached'

  # "Constrain the impact force between 0.1 Newtons and 30.0 Newtons"
  [[mutator.param_constraints]]
  name = 'impact-force-magnitude'
  range = [0.1, 30.0]

  # "Constrain the impact force link to either ROTOR_0 or ROTOR_1"
  [[mutator.param_constraints]]
  name = 'impact-force-location-link'
  range = ['ROTOR_0', 'ROTOR_1']
```

# 3. Define your measurements

Measurements are Modality queries that check what you want to know about your system for each of the test cases that will be generated. You can observe everything from straightforward expectations to complex chains of events. Your objective can have as many measurements as you'd like, so it's easy to start simple and add more as you go.

  1. If you're looking for new bugs, start with a simple pass/fail measurement, like checking whether a critical failure event occurs.

    • Every measurement needs a name.

    • Every measurement uses check to query your system.

    • MATCH is the way to select an event; AS is the way to label an event.

    • should_pass is optional. should_pass = false indicates that the check is expected to fail under normal circumstances.

```toml
# This measurement simply fails if any failure-like events
# with severity greater than 8 occur
[[measurement]]
name = 'No critical events occur'

# The trace query expression to check
check = 'MATCH (severity > 8) AS AnyCriticalFailures'

# This check is expected to fail because we should
# not see any of these "AnyCriticalFailures" events
should_pass = false
```

  2. If you're tracking an unusual behavior, write a measurement to check system performance when an unusual condition occurs.

    • Every measurement needs a name.

    • Every measurement uses check to query your system.

    • This example uses AS to group several bad conditions into a single event label: ExcessiveTilt

    • This example uses WHEN to specify the expected relationship between labeled events:

```toml
# This measurement checks that the tilt detection and reaction mechanism
# happens within the specified amount of time
[[measurement]]
name = 'Tilt detection works in a timely fashion'

check = '''
MATCH
  # Either roll or pitch is in the range the tilt detector should flag
  ((name = "IMU_ROLL" OR name = "IMU_PITCH") AND (payload > 45 OR payload < -45)) AS ExcessiveTilt,

  # The tilt detection mechanism detected the unsafe conditions
  (name = "UNSAFE_TILT_DETECTED") AS UnsafeTiltDetected,

  # Commander module should react to the condition by terminating the flight
  (name = "FLIGHT_TERMINATED") AS FlightTerminated

WHEN
  # It is a requirement that unsafe tilt (either roll or pitch) is detected and reacted to
  ExcessiveTilt -> UnsafeTiltDetected AND UnsafeTiltDetected -> FlightTerminated
'''
```

  3. If you're hunting bugs in a subsystem, measure whether a particular component is returning acceptable results during all of the test cases.

    • Every measurement needs a name.

    • Every measurement uses check to query your system.

    • This example check uses FILTER to prune out uninteresting results.

    • This example check uses AGGREGATE to check whether this component's outputs are within normal bounds.

```toml
# This measurement checks that the IMU gyroscope's
# instability metric is nominal
[[measurement]]
name = 'IMU gyroscope instability is acceptable'

check = '''
MATCH
    (name = "IMU_GYRO_INSTABILITY" AND probe = "DRONE_IMU") AS Metric

FILTER
    Metric.payload != 0.0

AGGREGATE
    max(Metric.payload) < 0.45,
    mean(Metric.payload) < 0.2,
    stddev(Metric.payload) < 0.1
'''
```

# 4. Define any stopping conditions

Finally, you can set the circumstances where generative testing will stop. Stopping conditions are useful for saving resources and getting a response as soon as you have enough interesting results to act upon.

  1. As soon as a single stopping condition is met, all generative testing stops.

  2. If you want Modality to keep generating tests forever until you stop it manually, just leave [stopping_conditions] empty.

  3. If you have time or resource constraints, you could set a maximum amount of time or attempts.

```toml
# General stopping conditions, OR'd together
[stopping_conditions]

# Maximum wall-clock time of related mutation-epoch attempts
time = "2h 15min"

# Maximum number of related mutation-epoch attempts
attempts = 50
```

  4. If you expect to have gathered useful results after so many overall passing or failing measurements, you can set limits on global passing_measurements or failing_measurements.

```toml
# General stopping conditions, OR'd together
[stopping_conditions]

# Maximum number of times all the measurements passed
passing_measurements = 10

# Maximum number of times not all the measurements passed
failing_measurements = 5
```

  5. If you would like to stop all testing when a particular measurement passes or fails too many times, you can insert passing_measurements and failing_measurements into a specific measurement.

```toml
# Same as before: This measurement simply fails if any failure-like
# events with severity greater than 8 occur
[[measurement]]
name = 'No critical events occur'

# Same as before: The trace query expression to check
check = 'MATCH (severity > 8) AS AnyCriticalFailures'

# Same as before: The check is expected to fail
should_pass = false

# New: measurement-specific stopping condition, stop all generative
# testing as soon as this measurement fails
failing_measurements = 0
```

# Step 2: Run your Generative Tests

Once you've defined your objective, you'll want to import your objective into Modality, then write two scripts to bring your system online and call the Modality commands that run your generative tests.

While you're elsewhere, Modality will proactively generate as many diverse system-state conditions as fit within your objective's constraints, run them through your system, and record the results. In the final step, you'll analyze the session, with Modality highlighting issues in the measurements you care about.

# 1. Import your objective into Modality

  1. Save your completed objective TOML file. Here's that example objective again.
  2. In your terminal, run modality objective create <path to your objective file>
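
For example, if your objective file is named sample-objective.toml (a placeholder name), the import command looks like this:

```
$ modality objective create sample-objective.toml
```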

# 2. Write your per-test-case script

Now that you have a new objective in Modality, add it to a script that handles your system under test's behavior for each of the many test cases that will be run.

  1. In your tool of choice, write a script that does the following (a minimal sketch appears after this list):

  2. Run per-test-case start-up behaviors: get your system ready to take commands.

  3. Next, call modality mutate --objective-name <your objective>

  4. If you want some of your test cases to go outside the predefined safety ranges of your mutators, also pass the --allow-outside-safety-range flag to your modality mutate command.

  5. Add per-test-case behaviors: command your system to induce whatever system behaviors you'd like to observe during each test case. These are normal commands to your system, not Modality mutations.

  6. Add per-test-case tear-down behaviors: order your system to do whatever you'd like after each individual test case is run.

  7. Save your per-test-case script. We'll call it <your per-test-case script> below.
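
Here is a minimal bash sketch of what such a per-test-case script might look like. The modality mutate call matches the command described above; the your-system-cli commands are hypothetical placeholders for however you command your own system under test:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Per-test-case start-up: get the system ready to take commands
# (hypothetical placeholder command for your own system)
your-system-cli arm --wait-until-ready

# Ask Modality to introduce a mutation for this test case.
# Add --allow-outside-safety-range here if you want test cases that go
# outside your mutators' predefined safety ranges.
modality mutate --objective-name "sample objective"

# Per-test-case behaviors: normal commands to your system, not Modality
# mutations (hypothetical placeholders)
your-system-cli takeoff --altitude 30
your-system-cli hover --seconds 60

# Per-test-case tear-down (hypothetical placeholders)
your-system-cli land
your-system-cli disarm
```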

# 3. Execute your generative tests lifecycle

Next, you'll use modality to run the generative tests on your live system. We recommend you write a command script in your tool of choice to automate these steps; a sketch of such a script follows the list below.

  1. Spin up your system so that it is ready for your per-test-case script.

  2. Create a Modality session by calling modality session open <session name> <system under test name>

  3. Start generative testing by calling modality execute <your per-test-case script> --objective-name <your objective>.

  4. Your generative tests will run until one of the stopping conditions you specified, like a time limit, is hit. If you don't specify any stopping conditions, Modality will keep generating test cases indefinitely until you stop it manually.

  5. After the tests are complete, close the session by calling modality session close <session name>

    • All of these steps can be done manually through the command line, but the benefit of using a script is that your modality observation session closes right after all of your tests are complete.
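
Here is a minimal bash sketch of such a lifecycle script, built from the commands above. The session name, system-under-test name, per-test-case script path, and the simulator start/stop commands are hypothetical placeholders:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Bring the system under test online (hypothetical placeholder)
your-system-cli simulator start

# Open a Modality session against your system under test
modality session open drone-generative-session drone-sut

# Run the generative tests; Modality invokes your per-test-case script
# for each generated test case until a stopping condition is hit
modality execute ./per-test-case.sh --objective-name "sample objective"

# Close the session as soon as the tests are complete
modality session close drone-generative-session

# Shut the system under test back down (hypothetical placeholder)
your-system-cli simulator stop
```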

# Step 3: Analyze your Results

Modality has now run countless tests over more distinct conditions than any human could write tests for. You can now query the results for meaningful measurements that point directly to bugs.

# 1. Use pass/fail expectations to quickly point to bugs

  1. When your objective measurements accurately describe unwanted system behaviors, every generated test case that forces a measurement to fail is pointing to a potential bug.

  2. If your system instrumentation includes simple pass/fail expectations, you can easily query modality for all test cases that caused an expectation failure: modality query 'MATCH outcome = fail' --using <session name>

  3. Modality will return all failing expectations, along with their exact trace coordinates. These are precise starting points for your follow-up troubleshooting.


```
$ modality query 'MATCH outcome = fail' --using session-name
Result 1:
═══════════
(outcome = FAIL)(473396343:393277:49:7, MEASUREMENT_CHECK @ CONSUMER, outcome=FAIL)

Result 2:
═══════════
(outcome = FAIL)(139490477:458762:9:8, CONSUMER_TIMEOUT @ MONITOR, outcome=FAIL)
...
```

  4. If you'd like additional detail before forwarding the results to troubleshooters, your query results can be made much more verbose or specific; see the modality query reference.

  5. Most importantly, these same queries can be directly applied by troubleshooters when they investigate the cause of bugs with Modality.

# 2. Use metrics summary for a big-picture overview

  1. To start with a high-altitude view of the generative testing session, call modality metrics summary 'session="<your session name>"'

  2. If you want additional high-level details, apply the -v flag to modality metrics summary

   
```
$ modality metrics summary 'session="simple-session"' -v
Interactions: 556
Passing Expectation Instances: 524
Failure Instances: 21
Scopes Begun: 8
Sessions With Mutations: 1
Injected Mutations: 5
Active Mutations: 5
Mutation Epochs: 17
Overlapping Mutations By Session:
  Session: 2021-05-12T11-21-30Z
    Injected Mutations:
      ...
      Instance: 7
        Mutator: heartbeat-delay-mutator
        Probe: CONSUMER_PROBE
        Parameters: [heartbeat-delay=1007]
        Associated Failures: 10
        Distinct Failure Tags: 5
        Probe Breadth: 1
        Probe Reach: 0
        Causal Reach: 0
      ...
Probes With Passing Expectations: 2
Probes With Failing Expectations: 1
Distinct Events: 15
Distinct Passing Expectations: 2
Distinct Failing Expectations: 1
Distinct Events By Component:
  consumer-component: 5
  monitor-component: 5
  producer-component: 5
```

  3. Scan the results for Associated Failures and Failing Expectations. For example, the modality metrics summary results above include an instance where heartbeat-delay-mutator has 10 Associated Failures.

  4. If you are looking for new areas of concern, this summary query could be enough to hand off to a troubleshooter.

  5. Equipped with this highlight, a troubleshooter can run follow-up queries into the specifically-named probe, mutator, and parameters as part of investigating the cause of bugs with Modality.

# Next steps

  1. If at first your queries don't return any failing expectations or problematic conditions:

    • Add objective measurements that describe unwanted system conditions.

    • Write your existing system requirements as measurement check queries. This is an excellent way to expand coverage and validate your system requirements at the same time.

    • Reduce the number of constraints and preconditions on your mutators.

    • Consider adding the --allow-outside-safety-range flag to your script's modality mutate --objective-name <your objective> command, at least for simulations.

    • Add new mutations that wouldn't be possible in traditional system tests.

  2. Once a potential bug has been identified, developers and other troubleshooters can use these same Modality tools to dig deeper and investigate the cause of bugs with Modality.

  3. If you add new features to your system, simply apply instrumentation to any new components and execute the same generative tests again. Modality will test, trace, and report new interactions that suggest potential bugs.

  4. Once your generative tests include measurements for all of your system requirements, passing all of the tests provides confidence that your system performs to specification under a vast array of circumstances.

An abstract depiction of one computer connected to many different types of objects, conveying that Modality can generate a vast number of different conditions from a single command.