# Query Language Reference
# Conceptual Overview
The Modality Trace Query Language is a tool for interacting with system trace data acquired with Modality. It can be used to identify portions of the trace which match a certain pattern, to compute aggregations over part of the trace, or to check that a property holds over the collected trace data.
It can also be used as a specification or executable requirement
mechanism; using the
modality check command described below will
evaluate such specifications over actually observed system
behavior. When appropriate, Modality can call out counterexamples to
your specification or help you narrow down which parts of a
non-passing specification are not fulfilled by the system.
# What's in a trace?
In Modality, a system trace consists of a collection of related
events. These events are logged using the
and collected from many probes into a session. The session data is
stored by Modality, which can execute trace queries over it.
Causality is central to Modality's tracing system - it's how
Modality can tell if one event is before or after another. Unlike from
other systems you may be used to, event ordering in Modality is not
derived from a common time source. Instead, Modality uses the
snapshot data passed between
modality-probe instances to establish
before-and-after relationships between events on different probes.
This means that event ordering in Modality is partial: Given two
events, one may be before or after the other, or they may have no
known ordering at all (that is, they are logically concurrent).
This causal ordering is the backbone of the Modality Trace Query Language.
# What's a trace query?
Modality stores all the collected trace data for a session in a
causal graph, where each node is an event, and each edge represents
a causal (happened-before, happened-after) relationship between the
events. You can inspect this graph directly using the
Trace queries give you a way to use that graph to learn or check things about your system. You might use a trace query to:
- Find occurrences of an error event
- Check that all values logged at a particular point in your software are within the desired range
- Check that all the faults of a certain type are followed by an automatic remediation.
- Specify that some error conditions should never happen
- A complex, distributed initialization sequence always occurs as specified.
# A basic example
Here's an example of a trace query:
MATCH radar_power_on@control AS radar_on -> (radar_reading@radar AND (payload < 0.2 OR payload > 0.8)) AS invalid_reading
This will look for a sequence of two events:
radar_on on the
control probe, and
invalid_reading on the
radar probe. The
invalid_reading event is defined as any event named
whose payload is outside the range
[0.2, 0.8]. These matches will be
limited to pairs of events where
invalid_reading is causally after
There may be many pairs of events which satisfy the constraints set
down in this query. When running the query with the
command, each of these pairs of events comprises a single
modality query will show you many different pairs of events
that satisfy the constraints.
# A more sophisticated example
This trace query matches a non-deterministic, distributed initialization sequence. It's written in a specification style, so it can be used to check that the specified sequence is always followed.
FOR ALL begin_startup@controller AS start MATCH start -> broadcast_init@controller AS bcast WITHIN 1ms -> finish_startup@controller AS done WITHIN 1s INTERSECT bcast -> poll_completion@controller REPEAT 1..1000 !-> poll_completion@controller -> done INTERSECT bcast -> recv_init@radar -> send_completion@radar WITHIN 500ms -> recv_radar_init_complete@controller -> done INTERSECT bcast -> recv_init@imu -> send_completion@imu WITHIN 250ms -> recv_imu_init_complete@controller -> done
This describes four intersecting chains of events, starting with the
begin_startup event at the
controller probe, and interacting with
imu probes. It specifies time limits using the local
timekeeping available at each probe.
When used with the
modality query command, this query will find all
the named events for the matching initialization sequences. When used
modality check, it will verify that all matching
events are followed by the initialization sequence; if there are any
which don't, it will tell you which ones.
# Matching events (MATCH clause)
# Basic structure
MATCH clause is used to specify local properties of the events
you're looking for. An event predicate is used to match against the
nodes in the graph, and a label is used to refer to the matched event
throughout the rest of the query.
MATCH (id = 12) AS ev1 -> (name = "MY_EVENT") AS ev2
In this example, the
id = 12 event predicate is used to identify all
nodes in the causal graph where an event with id 12 was
logged. Similarly, the
name = "MY_MATCH" predicate is used to
identify nodes in the causal graph where the logged event has the
specified name. These matched nodes are associated with their
ev2, which are available in further
Many events may be matched in the
MATCH clause. If more than one is
present, their relationships must be specified in the
# Shortcut syntax
# Implicit labels
If you're writing a query that just matches a single event, you don't need to label it:
MATCH name="radar_on" AND probe="ecu"
A label will be automatically generated based on the content of the event predicate.
It is very common to write predicates which match on event name and probe name. There is a shortcut syntax for this purpose:
Event and probe IDs can also be used in this way, if you don't have the names available:
This can be combined with constraints as part of a larger event predicate:
MATCH supply_voltage_measurement@ecu AND payload > 12.0
You can also use wildcards here, in either position:
MATCH *@ecu -> another_event@*
# Event Predicates
An event predicate is an expression that is evaluated against a
candidate event in the causal graph to determine if it should be
included as a query result. It should evaluate to a boolean value;
true means the event is included, while
false means it is not.
Event predicates use typical operators:
- Boolean operators:
The equality operator "
=" is also used for glob-matching: all
strings can be compared with a glob pattern that use the "
character as a wildcard, such as
When a set of values is available, such as the
scope_instance fields, equality tests for membership in that
set. Sets of strings support glob matching as well.
# Event Fields
The following named fields may be used in event predicates:
id: The numerical event id
name: The event's name, as a string
payload: The payload provided when the event was logged.
probe: The probe where the event was logged. This may be compared with either the numeric probe id or the probe's name.
coordinates: The event's coordinates, which identify it by position in the causal graph. This value is found in query results, and in the output of
component: The name of the component in which the event was defined.
component_id: The UUID of the component in which the event was defined, formatted as a string.
tag: The events tags. To match a tag, use string equality:
tag = "debug"will match any event with the
debugtag. Glob wildcards may also be used here.
scope: The names of all scopes active at the time the event was logged.
scope_instance: The numerical ids of all scope instances active at the time the event was logged.
file: The path to the file in which the event was defined.
mutation_epoch: The numerical id assigned by modality to a section of causal history, used to track the effects of injected mutations.
severity: The numerical severity of an event, valued 1-10, derived from the presence of
SEVERITY_Ntags in the event definition.
expectation: The name given to an expectation event. For expectations logged from a probe, this is the same as the event name. For expectations logged with the
modality expectcommand, this is the name given at the command line.
is_expectation: A boolean indicating whether an event is an expectation event.
outcome: If the event is an expectation, this gives its outcome. This evaluates to the special values
Some of these fields are only available on some events. If an event predicate that uses an unavailable field is evaluated against an event that does not have it, the event is simply skipped.
If you need a predicate that can match any event, use the special
MATCH first@p -> ANY -> second@p
A long or complex path may benefit from some structure. You can define
groups in a
MATCH clause by parentheses and apply a name:
MATCH (power_on@p AS a -> init@p AS b -> finished_start@p AS c) AS startup -> interesting_error@p
When using grouping, the results are accessed using
example, the payload of
a above could be accessed in a subsequent
FILTER clause with
The repetition of a single event or a pattern of events can be
REPEAT ... This works a lot like repetition does in
a regular expression:
MATCH (loop_start@p AS start -> loop_end AS end) AS loop REPEAT 2..3
Each repetition is treated like a group for the purpose of results
labeling, except the repetition number is addressed as an index. For
example, results in the second match of the above repetition named
loop can be addressed as
# Event relationships (Paths in MATCH)
MATCH clause can specify multiple events, constraining the
matches using structural constraints betwen them. All relationship
constraints are based on these basic causal relationships:
These operate on labeled events to constrain the query results to those where the events occur in the given causal order. For example:
MATCH (ANY AS a) FOLLOWED BY (ANY as b)
The parentheses are optional; we use them here for clarity. You could equivalently write:
MATCH ANY AS a -> ANY as b
# Time-limited relationships
If you have added wall-clock time information to your events, you can use that in a relationship constraint:
MATCH ANY AS a -> ANY AS b WITHIN 1 ms
The supported time units are
NANOSECONDS, or their abbreviations
Note that this will only work for events which share a common time domain, which is specified when you initialize the probe. This should always be the case for events on the same probe. It can work across probes if you have made special accomodations for time synchronization between them (in a way that meets your system requirements) and have appropriately initialized the relevant probes with the same time domain id.
# Step Limits
You can limit the distance between matches in the causal graph using
WITHIN <n> STEPS and
AFTER <n> STEPS. For example, this will match
any two causally connected events within three steps of each other:
MATCH ANY AS a -> ANY AS b WITHIN 3 STEPS
Note While step limits are useful for narrowing down a query during debugging, they are inherently somewhat fragile. When you add additional instrumentation to a SUT, you could easily invalidate the structural assumptions made by some query using step limits. As such, you should avoid using step limits when writing specifications or test cases.
# Negative lookahead
A negative relationship with nothing else after it behaves like a lookahead operator. For example, suppose you want to identify errors which are not followed by a remediation event inside a suitable time window:
MATCH tag='error' NOT FOLLOWED BY tag='remediation' WITHIN 5 ms
The negative relationship operators are:
NOT FOLLOWED BY(or
NOT PRECEDED BY(or
Note that the event matched by the right-hand side of the relationship may not be labeled; there's actually nothing there to find, since this is specifying what should /not/ be present.
# Negative path constraints
You can similarly specify a negative constraint to be applied on the path between two nodes. For example, you might want to look for pairs of startup and shutdown events that don't have an error signaled on the path between them:
MATCH startup@p AS a !-> tag='error' -> shutdown@p AS b
The syntax is similar to negative lookahead, but the subsequent
means that it will apply to any nodes found between the events
b. There don't have to be any events there: this query will match
a is directly followed by
b, with nothing in between.
By default, each relationship matches only the first (nearest) event
in the specified direction. There may be times when you want more
matches, or even to find out all the matches. The trace query language
CUT relationship operator to describe this; this is meant to
be reminiscent of 'cut' in Prolog, as it stops exploring a particular
branch of potential results once some have been found.
The default behavior is to limit to just one result per relationship. So, these two queries are equivalent.
MATCH ANY AS a -> ANY AS b MATCH ANY AS a -> ANY AS b CUT 1
If you want to see up to 3
bs for every
a, you can say:
MATCH ANY AS a -> ANY AS b CUT 3
If you want to see ALL the possible
bs for every
MATCH ANY AS a -> ANY AS b NO CUT
# Multiple paths (INTERSECT)
MATCH clause is sufficent to match trace-like patterns,
which consist of a single path through the causal graph. If you want
to match more complicated patterns, you can do so by describing them
as multiple paths which
INTERSECT with each other. For example:
MATCH source_1@p1 -> meet@center AS meet INTERSECT source_2@p2 -> meet
Note that the
meet label is used in both paths: this means that the
two nodes are unified.
meet has to be the same node in both paths;
graphically it's a "V" pattern. Modality will search for ways to
choose three nodes such that ALL the given relationships are satisfied.
One common misinterpretation of this is to view them as 'shortcuts', to
match multiple events using the same condition. This is not correct. Each time you reuse an event label in an
INTERSECT clause, you're
applying a new constraint to the same event. So each time you talk
a, it's always the same
a within a single query
# Filtering over multiple events (FILTER clause)
The filter clause consists of a single boolean expression that is
nearly identical to the expressions used for event
predicates. The only difference is that all event
fields must be label-qualified, using
Thus gives the filter expression access to multiple events at the same
time. This is the primary use case for
FILTER. For example, you
might want to find events with the same payload as each other:
MATCH ANY AS a -> ANY AS b FILTER a.payload = b.payload
FILTER, you should be aware of the fact that
relationships only give you the first answer for any relationship by
default. You can use this fact to express some unique
queries. This is best explained by example. Suppose you wanted to find
ground_contact events where the most recent accelerometer reading is
above a certain threshold:
MATCH ground_contact@control AS a <- acceleration@control AS b FILTER b.payload > 12.0
Note that putting the payload constraint in the event predicate for
b would have an entirely different meaning; it would find the
nearest event to
a whose payload is greater than 12.
Aggregation in the Modality Trace Query Language works very similarly to aggregation in other query languages. You can specify aggregations over previous query results, do some computations with them, and give them names.
MATCH ANY AS a AGGREGATE stddev(a.payload) AS payload_stddev
Will compute the standard deviation of the payload value for all
events, naming it
payload_stddev in the aggregation result. Any
events that do not have a payload will be skipped, when computing the
The parameter to each aggregate function is an expression with access
to all matched events, just like the ones used in
MATCH ANY AS a, ANY AS b AGGREGATE stddev(a.payload - b.payload) AS payload_diff_stddev
You can also do computations with the results of multiple aggregations:
MATCH ANY AS a, ANY AS b AGGREGATE (min(a.payload) - min(b.payload)) / 2 AS min_payload_avg
# Aggregate functions
count(): Count the number of results. Unlike the other aggregates, this one doesn't have an expression argument.
max(...): Compute the minimum or maximum value of the expression.
mean(...): Compute the sum or mean of the expression.
distinct(...): Return the set of distinct values taken on by the expression.
count_distinct(...): Return the number of distinct values taken on by the expression.
stddev(...): Calculate the variance or standard deviation of the expression.
# Extended query structures
The basic MATCH/FILTER/AGGREGATE query form is the core of all trace queries. However, there are some extensions to this syntax available.
NEVER MATCH queries
Regular modality queries are great for specifying what should
happen. But what if you want to specify what shouldn't happen?
NEVER MATCH queries do the job.
NEVER MATCH supply_sample_voltage@expensive_sensor AND payload > 1.25
You can prepend
NEVER to any query that doesn't use aggregation to
invert its meaning.
When you execute a
NEVER MATCHquery with the
modality checkcommand, it does what you'd expect: if modality can't find the thing that shouldn't exist, the check succeeds. Otherwise, it fails.
When you execute a
NEVER MATCHquery with the
modality query, command, a single synthetic query result is returned, where the
successlabel is bound to the value
FOR ALL queries
It is common to think of system specification in terms of
conditionals: "if this thing occurs, then that other thing should
respond in this way." You can verify these behaviors with Modality by
FOR ALL query structure.
FOR ALL this_thing@over_here AS antecedent MATCH antecedent -> that_thing@over_there AS consequent
This will start by looking for all the different
events. For each of those, it will check that there's a
event which follows it. If there isn't, the query failures.
There are two logical sections to these queries, the
FOR ALL part,
MATCH part. You can put a whole query in each of them
AGGREGATE), allowing for very sophisticated antecedents
and consequents. Any event labels declared in the
MATCH section of
the antecedent may be used in the consequent.
In addition to
FOR ALL / MATCH, you can also do
FOR ALL / NEVER MATCH.
FOR ALL queries works the same as
# Mutation preconditions
When injecting mutations into a SUT using the
command, or in an automated fashion using an
objective you can
specify when then mutation should be injected as a 'mutation
preconditon' query. This is a special form of a trace query that has
limited features, as it is executed dynamically by the Modality Probe
as the events are logged.
Each event predicate must specify both an event and a probe, either by id or by name.
Each event predicate may specify up to two constraints on the event's payload, as a comparision with a constant value. These must be
No other event predicate structures are allowed.
FOLLOWED BYrelationships are supported, without any extra constraints.
Relationships which cross from one probe to another are allowed, but they will only work if you have fully integrated
modality-probeinto the SUT and are sending snapshots across the probe boundary. The connection can be indirect, via an intermdiate probe.
AGGREGATEare not allowed, nor are the
The entire SUT has a limit on the number of event predicates which can be under consideration at once. This applies across the preconditions for all active mutations.
# Trace query structure
# Query clauses
MATCH is present in all trace queries, but there are some additional
clauses available as well.
MATCH <event path> [INTERSECT <event path>]* [FILTER <filter predicate>] [AGGREGATE <aggregate definitions>]
MATCH clause is used to match events in the trace
based on event-local predicates. They can use properties such as the
event name or id, the probe where the event was logged, the event's
payload, and so on.
A 'path' of related events may be specified in the
using the event relationship syntax. For example,
you can limit matches to only cases where one event is causally
followed by another. The match may be additionally constrained by time
elapsed, and by various constraints on the path between the events.
INTERSECT clause may be added to match
patterns of events that do not lie along a single causal path. You can
add as many of these as you need.
FILTER clause is used to provide an
additional level of filtering that applies after everything in the
INTERSECT clauses is taken into account. It may apply
conditions across multiple matched events: for example, you can use a
FILTER to look for two events that have the same payload.
AGGREGATE clause is used to compute
aggregations over the results matched by the trace query preceding
it. You might use this to compute the standard deviation of a certain
event payload, or to check that an important startup event happens
# Evaluation model
A clear understanding of how each query clause is evaluated is very helpful for writing good trace queries. It is helpful to break it up into three stages.
# Evaluating Events and Relationships (
Modality starts by identifying all the ways that events can be
assigned to labels that are consistent with the predicates and
relationships in the
INTERSECT clauses. Each way this
assignment can happen is a distinct query result.
For example, consider this query:
MATCH ANY AS a -> ANY AS b
Each result from this query is a map which binds the labels in the
a = some trace event, b = some trace event. The
events will be chosen so that
b is always causally following
There are of course many different ways to choose
b to make
this work. By Modality will try show you the single nearest
CUT for ways to control this). The result
of query evaluation is a list of ways to bind its events that match
the event predicates and relationship constraints.
INTERSECT doesn't change this model; it just adds additional
constraints to the matches.
# Filtering Results (
From an evaluation perspective, the
FILTER clause is applied to each
query result after it comes out of the
MATCH/INTERSECT section. Its
result is used to either keep each result, or throw it out.
FILTER clause can be used in a way that could be equivalently
represented in a
MATCH predicate. But much of its unique power
lies in the fact that it is evaluated after event relationships are
taken into account.
# Aggregation (
Aggregation accumulates data from the output of the filtering stage.
# Special syntax rules
Keywords are not case sensitive. For example,
matchare treated identically. We use the upper-case form in the documentation for clarity.
Spaces, tabs, and newlines may be used freely; all whitespace is equivalent, and indentation is not syntactically significant.
# Query evaluation context
There are three main ways to use trace queries with modality, the
query command, the
check command, and as a mutation precondition.
query command is used when you are interested in the events
matched by the query, or the values computed by an aggregation.
check command evaluates the query in a pass/fail
fashion. It behaves slightly differently depending on the content of the
- For a regular
checkwill succeed if the query has any results.
- If the query has an
checkwill succeed if all the boolean-valued aggregates equal
- Some of the extended query behaviors
behave differently in a
# Mutation preconditions
A subset of the Modality Trace Query Language may be used to specify preconditions to mutation injection; it is described in detail in the mutation preconditions section.