Specification and Query Language Reference

# Conceptual Overview

The Modality Specification Language is a tool for specifying the expected behavior of a system that is composed of multiple interacting components. You can use it to specify both local behavior and interactions which span multiple components. The modality check command described below will evaluate such specifications over observed system behavior. When appropriate, Modality can call out counterexamples to your specification or help you narrow down which parts of a non-passing specification are not fulfilled by the system.

The closely related Modality Trace Query Language is a tool for directly interacting with system trace data acquired with Modality. It can be used to identify portions of the trace which match a certain pattern, to compute aggregations over part of the trace, or to check that a property holds over the collected trace data.

# What's in a trace?

In Modality, a system trace consists of a collection of related events. These events are logged using the modality-probe library, and collected from many probes into a session. The session data is stored by Modality, which can evaluate specifications and trace queries over it.

Causality is central to Modality's tracing system - it's how Modality can tell if one event is before or after another. Unlike other systems you may be used to, event ordering in Modality is not derived from a common time source. Instead, Modality uses the snapshot data passed between modality-probe instances to establish before-and-after relationships between events on different probes. This means that event ordering in Modality is partial: given two events, one may be before or after the other, or they may have no known ordering at all (that is, they are logically concurrent).

This causal ordering is the backbone of the Modality Specification Language.

# What's a specification?

Modality stores all the collected trace data for a session in a causal graph, where each node is an event, and each edge represents a causal (happened-before, happened-after) relationship between the events. You can inspect this graph directly using the modality log command.

Modality Specifications give you a way to use that graph to check things about your system. You might use a specification to:

  • Assert that all values logged at a particular point in your software fall within a certain desired range.
  • Assert that all the faults of a certain type are followed by an automatic remediation.
  • Assert that some error conditions should never happen.
  • Assert that a complex, distributed initialization sequence always occurs as specified.

# A basic example

Here's an example of a Modality Specification:

FOR EACH radar_on@control AS radar_on
  VERIFY radar_on -> WITHIN 250 ms radar_reading@radar(0.2 <= _.payload <= 0.8)

In English, you might describe this as:

Every time the radar turns on, there should be a reading logged within 250 milliseconds, where the payload is between 0.2 and 0.8.

# A more sophisticated example

Here we specify a non-deterministic, distributed initialization sequence.

FOR EACH
  begin_startup@controller AS start
VERIFY
  start
  -> WITHIN 1ms broadcast_init@controller
  -> (
       (REPEAT(1..1000, poll_completion@controller) !-> poll_completion@controller) CONCURRENT WITH
       (recv_init@radar -> WITHIN 500ms send_completion@radar -> recv_radar_init_complete@controller) CONCURRENT WITH
       (recv_init@imu -> WITHIN 250ms send_completion@imu -> recv_imu_init_complete@controller)
     )
  -> finish_startup@controller AS done
AND
  start -> WITHIN 1s done

This describes a distributed initialization sequence, starting with the begin_startup event at the controller probe, and interacting with the radar and imu probes. It specifies time limits using the local timekeeping available at each probe.

When used with modality check, it will verify that all matching start events are followed by the initialization sequence; if any are not, it will tell you which ones.

# Specifications

Modality supports several forms of specifications. These can be used with the modality check command, or inside an objective definition.

  • FOR EACH / VERIFY: Verify that some conditions hold for every match in the defined domain.
  • EXISTS and NEVER: Check that some pattern of events occurred, or did not occur.
  • FOR AGGREGATE / VERIFY: Verify that some conditions hold for an aggregate computed over a defined domain.

# FOR EACH / VERIFY

FOR EACH specifications say that some property should hold for every event that matches a given event pattern (called the domain).

FOR EACH a@probe as a
  VERIFY a -> b@probe

All labels defined in the domain may be used in the body of VERIFY.

VERIFY can contain another pattern, in which case Modality will check that events matching the pattern exist. It can also contain an expression:

FOR EACH a@probe AS a
  VERIFY a.payload > 12.0

You can even use both together, combining a pattern and an expression using AND.
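
For example, a minimal sketch combining the two (the event and probe names are placeholders):

FOR EACH a@probe AS a
  VERIFY a -> b@probe AS b
     AND a.payload > 12.0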

# EXISTS and NEVER

If you just want to check that some pattern occurred at all, or did not occur, use EXISTS or NEVER.

EXISTS important_check@probe
NEVER fatal_error@probe

# FOR AGGREGATE / VERIFY

If you want to specify that some property holds for an aggregation over all the events matched by a pattern, you can use the FOR AGGREGATE form. A common use for this is checking the distribution of values for an event's payload.

FOR AGGREGATE a@probe as a
  VERIFY 12.0 < mean(a.payload) < 13.0
     AND stddev(a.payload) < 2.1

See the Aggregation section for details.

# Nested specifications

FOR EACH can accommodate any other specification as its body. This includes NEVER (quite useful), another FOR EACH, and FOR AGGREGATE. The inner specification is evaluated for each of the matches from the FOR EACH pattern.

FOR EACH a@probe AS a
  FOR EACH a -> b@probe as b
    VERIFY a.payload = b.payload + 2

It is often the case that such nested specifications could be more simply written in a non-nested form, but you may find that it's more convenient or more readable to write them this way.
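
For instance, a hedged sketch of nesting NEVER inside FOR EACH (the event names are illustrative):

FOR EACH begin_shutdown@controller AS s
  NEVER s -> *@*(tag='error')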

# Queries

For the most part, queries are nothing more than event patterns describing the events of interest. Patterns can be given directly to the modality query command, which will find events which match the pattern. For example:

*@probe1 AS a -> *@probe2 AS b AND a.payload = b.payload

This is a pattern that you could use inside a FOR EACH specification, but it's also a query that you can pass to modality query. The Event patterns section details how to write event patterns.

When writing a query, you also have access to the aggregation facilities using the AGGREGATE keyword:

*@* AS a
AGGREGATE stddev(a.payload) AS payload_stddev

See the Aggregation section for details.

# Mutation preconditions

When injecting mutations into a SUT using the modality mutate command, or in an automated fashion using an objective, you can specify when the mutation should be injected as a 'mutation precondition' query. This is a special form of trace query with limited features, as it is executed dynamically by the Modality Probe as the events are logged. A sketch of a conforming precondition follows the list below.

  • Each event predicate must specify both an event and a probe, either by id or by name.
  • Each event predicate may specify up to two constraints on the event's payload, as a comparison with a constant value. These must be ANDed together.
  • No other event predicate structures are allowed.
  • Only FOLLOWED BY relationships are supported, without any extra constraints.
  • Relationships which cross from one probe to another are allowed, but they will only work if you have fully integrated modality-probe into the SUT and are sending snapshots across the probe boundary. The connection can be indirect, via an intermediate probe.
  • AGGREGATE is not allowed, nor are the FOR EACH or NEVER forms.
  • The entire SUT has a limit on the number of event predicates which can be under consideration at once. This applies across the preconditions for all active mutations. modality mutate will return an error if this limit is exceeded.
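
As an illustrative sketch (event names borrowed from the earlier initialization example), a precondition that satisfies these constraints might look like:

recv_init@radar(_.payload > 0.0 AND _.payload < 1.0) -> send_completion@radar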

# Event patterns

Specifications are built of one or more event patterns. These can be used to specify a precondition domain for the specification, to specify the system behavior after a precondition is met, or even to simply assert that some behavior must always (or never) occur.

# Basic Structure

An event pattern describes the events it should match, and how those events are connected.

event@probe AS a -> event@probe -> event@probe

Each event is described using the event name and the probe where it should occur. A wildcard can be used in either position, in case you don't want to specify one (or both). Additionally, the event and probe ids may be directly used, in case you don't have the name or the id is more convenient.

The event may be optionally labeled using AS, for use elsewhere in the pattern or specification. It may also specify additional constraints beyond the event name and probe; see below for details.

Events are joined using relationship connectives, such as -> or FOLLOWED BY (which are synonyms). These may also be further modified, for example to require the two events to occur within a given time window. See below for details.

# Example Pattern

Here's an example event pattern:

12@* AS ev1 -> my_event@*(_.payload > 12) AS ev2

In this example, 12@* is used to identify all nodes in the causal graph where an event with id 12 was logged. Similarly, my_event@* is used to identify nodes where the logged event has the specified name. It has an additional predicate which restricts matches to only events with a payload greater than 12. These matched nodes are associated with their respective labels, ev1 and ev2.

# Event relationships (paths)

A pattern can specify multiple events, constraining the matches using structural constraints between them. All relationship constraints are based on these basic causal relationships:

  • FOLLOWED BY (or ->)
  • PRECEDED BY (or <-)

These operate on labeled events to constrain the query results to those where the events occur in the given causal order. For example:

event@probe FOLLOWED BY event@probe

# Time-limited relationships

If you have added wall-clock time information to your events, you can use that in a relationship constraint. Constraints can be specified as WITHIN <time> or AFTER <time>:

ev1@probe -> WITHIN 1 ms ev2@probe

The supported time units are MINUTES, SECONDS, MILLISECONDS, MICROSECONDS, and NANOSECONDS, or their abbreviations m, s, ms, us, and ns.
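
AFTER is the complementary constraint; presumably it requires at least the given amount of time to elapse between the two events, rather than at most. A hedged sketch:

ev1@probe -> AFTER 100 ms ev2@probe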

Note that this will only work for events which share a common time domain, which is specified when you initialize the probe. This should always be the case for events on the same probe. It can work across probes if you have made special accommodations for time synchronization between them (in a way that meets your system requirements) and have appropriately initialized the relevant probes with the same time domain id.

# Step Limits

You can limit the distance between matches in the causal graph using WITHIN <n> STEPS and AFTER <n> STEPS. For example, this will match any two causally connected events within three steps of each other:

ev1@probe -> WITHIN 3 STEPS ev2@probe
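
AFTER <n> STEPS presumably works in the opposite sense, requiring at least the given number of steps between the matched events. A hedged sketch:

ev1@probe -> AFTER 3 STEPS ev2@probe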

Note: While step limits are useful for narrowing down a query during debugging, they are inherently somewhat fragile. When you add additional instrumentation to a SUT, you could easily invalidate the structural assumptions made by a query that uses step limits. As such, you should avoid using step limits when writing specifications or test cases.

# Negative lookahead

A negative relationship with nothing else after it behaves like a lookahead operator. For example, suppose you want to identify errors which are not followed by a remediation event inside a suitable time window:

*@*(tag='error') !-> WITHIN 5 ms *@*(tag='remediation')

The negative relationship operators are:

  • NOT FOLLOWED BY (or !->)
  • NOT PRECEDED BY (or <-!)

Note that the event matched by the right-hand side of the relationship cannot be labeled; there's actually nothing there to find, since this is specifying what should not be present.

# Negative path constraints

You can similarly specify a negative constraint to be applied on the path between two events. For example, you might want to look for pairs of startup and shutdown events that don't have an error signaled on the path between them:

startup@p AS a !-> *@*(tag='error') -> shutdown@p AS b

The syntax is similar to negative lookahead, but the subsequent -> means that it will apply to any nodes found between the events a and b. There don't have to be any events there: this query will match even if a is directly followed by b, with nothing in between.

# CROSSING ANY

By default, each relationship matches only the nearest events in either direction. More precisely, it will not match a pair of events if any of the events on the path between them could have been a match. This is the most intuitive behavior most of the time; we refer to it as local reasoning.

There may be times when you want to indicate all possible pairs of matching events regardless of the events on the path between them. This is done using CROSSING ANY.

For example, if you want to match ALL the possible bs for every a, you could write:

a@probe AS a -> CROSSING ANY b@probe AS b

# Adding multi-event predicates

Additional predicate expressions may be added to an event pattern to further constrain matches, especially based on some predicate involving multiple matched events. These expressions use the label assigned to the event using AS, instead of the _ character.

This gives the filter expression access to multiple events at the same time. For example, you might want to find causal pairs of events with the same payload as each other:

*@* AS a -> *@* AS b
AND a.payload = b.payload

# External predicates with local reasoning

When using AND to add additional predicates to an event pattern, you should be aware of the local reasoning that is used for all relationships by default (see the CROSSING ANY section). You can use this with external predicates to express some unique queries.

This is best explained by example. Suppose you wanted to find ground_contact events where the most recent accelerometer reading is above a certain threshold:

ground_contact@control AS a <- acceleration@control AS b
AND b.payload > 12.0

This will first match pairs of ground_contact and acceleration events as described, and then limit those matches to only the pairs where the acceleration event has the correct payload.

If the payload constraint were instead inside the event predicate for acceleration, it would have an entirely different meaning:

ground_contact@control AS a <- acceleration@control(_.payload > 12.0) AS b

This would match the most recent acceleration event with the desired payload, with respect to the ground contact event, but there could be other more recent acceleration events (with payload <= 12) on the path between them.

# CONCURRENT paths

Events and relationships alone are sufficient to match trace-like patterns, which consist of a single path through the causal graph. But many systems have behaviors which cannot be described this simply.

A very common kind of complex system behavior is parallel execution with synchronization points. Events logged by such systems can be matched using CONCURRENT WITH event patterns.

This is best described by example:

start@common -> (a1@a -> a2@a CONCURRENT WITH b1@b -> b2@b) -> end@common

In this example, the two event patterns a1@a -> a2@a and b1@b -> b2@b are causally independent of each other. But they must both be causally preceded by a single event which matches start@common, and followed by a single event which matches end@common.

If you want to add additional restrictions to the edges going into or out of the CONCURRENT WITH block, you can do so by adding labels and using AND:

start@common as common_start -> (a1@a as a_start -> a2@a CONCURRENT WITH b1@b -> b2@b) -> end@common
AND common_start -> WITHIN 10 ms a_start

# Combining paths

If you want to match complex patterns which do not fit the CONCURRENT WITH model, you can do so by describing them as multiple paths which AND with each other, using common labels to indicate intersection points. For example:

source_1@p1 -> meet@center AS meet
AND source_2@p2 -> meet

Note that the meet label is used in both paths: this means that the two nodes are unified. meet has to be the same node in both paths; graphically it's a "V" pattern. Modality will search for ways to choose three nodes such that ALL the given relationships are satisfied.

One common misinterpretation is to view reused labels as 'shortcuts' for matching multiple events with the same condition. This is not correct. Each time you reuse an event label, you're applying a new constraint to the same event. So each time you talk about event a, it's always the same a within a single query result.

# Grouping

A long or complex path may benefit from some structure. You can define groups in an event pattern using parentheses and give the group a name:

(power_on@p AS a -> init@p AS b -> finished_start@p AS c) AS startup -> interesting_error@p

When using grouping, the results are accessed using group.item. For example, the payload of a above could be accessed in a subsequent predicate AND clause with startup.a.payload.
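
As a hedged sketch of that access pattern (the err label is added here purely for illustration):

(power_on@p AS a -> init@p AS b -> finished_start@p AS c) AS startup -> interesting_error@p AS err
AND startup.a.payload < err.payload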

# Repetition

The repetition of a single event or a pattern of events can be described using REPEAT(...). This works a lot like repetition does in a regular expression:

REPEAT(2..3, loop_start@p AS start -> loop_end@p AS end) AS loop

Each repetition is treated like a group for the purpose of results labeling, except the repetition number is addressed as a zero-indexed array. For example, results in the second match of the above repetition named loop can be addressed as loop[1].start or loop[1].end.
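
For example, a hedged sketch that constrains the first two repetitions to carry the same payload, assuming repetition results can be referenced from an AND clause this way:

REPEAT(2..3, loop_start@p AS start -> loop_end@p AS end) AS loop
AND loop[0].start.payload = loop[1].start.payload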

# Pattern Negation

NOT may be used at the pattern level to ensure that a match can only happen if a sub-pattern is not present. This is best used in conjunction with AND. For example:

precondition@probe -> target@probe AS t AND NOT prohibited@probe

NOT may also be nested inside a path, as any other pattern element, but the results are somewhat non-intuitive. We do not recommend using it that way.

# Event Predicates

When matching an event, an additional predicate may be provided to further restrict the matches beyond just the event name/id and probe.

event@probe(<predicate>)

An event predicate is an expression that is evaluated against a candidate event in the causal graph to determine if it should be included as a query result. It should evaluate to a boolean value; true means the event is included, while false means it is not. In the event predicate, the symbol _ is bound to each candidate event; the various fields are available using object-style . notation, such as _.payload.

# Operators

Event predicates use typical operators:

  • Boolean operators: AND, OR, NOT
  • Comparison: =, !=, <, >, <=, >=
  • Arithmetic: +, -, *, /

The equality operator "=" is also used for glob-matching: all strings can be compared with a glob pattern that uses the "*" character as a wildcard, such as "prefix*suffix".

When a set of values is available, such as the tag, scope, and scope_instance fields, equality tests for membership in that set. Sets of strings support glob matching as well.
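
As an illustrative sketch combining several of these operators (the event name glob is hypothetical):

*@*(_.payload / 2 >= 6.0 AND NOT _.event = "debug_*")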

# Event Fields

The following named fields may be used in event predicates:

  • event: The event that was logged. This may be used either as a string, giving the event name, or a number, giving the event id.
  • probe: The probe where the event was logged. This may be used either as a string, giving the probe name, or a number, giving the probe id.
  • payload: The payload provided when the event was logged.
  • component: The name of the component in which the event was defined.
  • component_id: The UUID of the component in which the event was defined, formatted as a string.
  • coordinates: The event's coordinates, which identify it by position in the causal graph. This value is found in query results, and in the output of modality log.
  • tag: The event's tags. To match a tag, use string equality: tag = "debug" will match any event with the debug tag. Glob wildcards may also be used with tags.
  • file: The path to the file in which the event was defined.
  • scope: The names of all scopes active at the time the event was logged.
  • scope_instance: The numerical ids of all scope instances active at the time the event was logged.
  • mutation_epoch: The numerical id assigned by Modality to a section of causal history, used to track the effects of injected mutations.
  • severity: The numerical severity of an event, valued 1-10, derived from the presence of SEVERITY_N tags in the event definition.
  • expectation: The name given to an expectation event. For expectations logged from a probe, this is the same as the event name. For expectations logged with the modality expect command, this is the name given at the command line.
  • is_expectation: A boolean indicating whether an event is an expectation event.
  • outcome: If the event is an expectation, this gives its outcome. This evaluates to the special values PASS or FAIL.
  • time: The nanosecond time associated with the event, if logged.
  • session: The name of the session in which the event was logged.
  • is_local_min: If this event has a numeric payload, tests if this instance is less than the payloads of both the previous and next event with the same id on the same probe.
  • is_local_max: If this event has a numeric payload, tests if this instance is greater than the payloads of both the previous and next event with the same id on the same probe.

Some of these fields are only available on some events. If an event predicate that uses an unavailable field is evaluated against an event that does not have it, the event is simply skipped.
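
For instance, a hedged sketch using a couple of these fields together, assuming fields can be referenced bare inside the predicate as in the tag examples above:

*@*(tag = 'error' AND severity >= 7)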

# Aggregation

# In queries

Aggregation works very similarly to aggregation in other query languages. You can specify aggregations over previous query results, do some computations with them, and give them names.

For example:

*@* AS a
AGGREGATE stddev(a.payload) AS payload_stddev

This will compute the standard deviation of the payload value for all events, naming it payload_stddev in the aggregation result. Any events that do not have a payload will be skipped when computing the aggregation.

The parameter to each aggregate function is an expression with access to all matched events.

*@* AS a -> *@* AS b
AGGREGATE stddev(a.payload - b.payload) AS payload_diff_stddev

You can also do computations with the results of multiple aggregations:

*@* AS a -> *@* AS b
AGGREGATE (min(a.payload) - min(b.payload)) / 2 AS min_payload_avg

# In specifications

Aggregation can be used in a specification with the FOR AGGREGATE / VERIFY form.

FOR AGGREGATE *@* AS a -> *@* AS b
VERIFY stddev(a.payload - b.payload) < 2.1

# Aggregate functions

  • count(): Count the number of results. Unlike the other aggregates, this one doesn't have an expression argument.
  • min(...), max(...): Compute the minimum or maximum value of the expression.
  • sum(...), mean(...): Compute the sum or mean of the expression.
  • distinct(...): Return the set of distinct values taken on by the expression.
  • count_distinct(...): Return the number of distinct values taken on by the expression.
  • variance(...), stddev(...): Calculate the variance or standard deviation of the expression.
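
For instance, a hedged sketch counting error-tagged events in a query (the tag value is illustrative):

*@*(tag = 'error') AS a
AGGREGATE count() AS error_count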