# Modality Concepts
# Events and Timelines
Modality's data model is based on events and timelines. An event simply records that something happened. A timeline is a sequence of events which happened in order. Timelines can represent different things in your system, depending on the implementation. In a desktop application, each thread could be a timeline, whereas in an embedded system you might have a timeline per microcontroller.
You can think of an event as something happening in your system, and the timeline as being the 'place' that it happened.
```
imu  o-------o--------o----------o--------o------->
   start   init    measure    measure   report
```
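The picture above can be sketched as a tiny data model. This is an illustrative sketch only, not Modality's actual API; the names (`imu`, `measure`, and so on) mirror the example diagram:

```python
class Timeline:
    """A named, ordered sequence of events from one 'place' in a system."""

    def __init__(self, name):
        self.name = name
        self.events = []  # events in the order they happened

    def record(self, event_name):
        """An event simply records that something happened."""
        self.events.append(event_name)

imu = Timeline("imu")
for e in ["start", "init", "measure", "measure", "report"]:
    imu.record(e)

print(imu.name, imu.events)
# imu ['start', 'init', 'measure', 'measure', 'report']
```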
# Event and Timeline Names
Events and timelines are identified in two ways. The first is by name: events have names, and timelines have names. They aren't necessarily unique, so this isn't a great way to identify a specific event occurrence at a specific point in time. But they are very useful for understanding the collected trace, and for categorization.
Some example event names might be:
- `startup`
- `shutdown`
- `send_motor_control_command`
- `receive_motor_control_command`
- `read_imu`
- `over_voltage_detected`
Some example timeline names might be:
- `ui_thread`
- `imu`
- `gateway_connection`
# `event@timeline` notation
It is very common in Modality to deal with events occurring on timelines, by name. We do this when printing recorded trace information using `modality log`, when composing SpeQTr queries for use in `modality query`, and in many other places. For this purpose, we use `@` in a fashion similar to an email address. For example, events named `measure` on timelines named `imu` can be referred to as `measure@imu`.
TIP
When used in a query context, `event@timeline` is automatically expanded to the query `_.name = 'event' AND _.timeline.name = 'timeline'`.
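The expansion described in the TIP can be sketched as a small helper. This mirrors the documented rewrite rule; it is a hypothetical function, not Modality's actual query parser:

```python
def expand(shorthand: str) -> str:
    """Rewrite 'event@timeline' into the equivalent SpeQTr predicate."""
    event, timeline = shorthand.split("@", 1)
    return f"_.name = '{event}' AND _.timeline.name = '{timeline}'"

print(expand("measure@imu"))
# _.name = 'measure' AND _.timeline.name = 'imu'
```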
# Event Coordinates
Names are useful for referring to some kind of thing happening on some kind of timeline, but we often need to be more specific. For this, we use event coordinates. Event coordinates are composed of two parts: a timeline ID and an event ID. They look like this:
```
60a20a3f243a439baa6211d4969a065e:03a6e0
-------------------------------- ------
        \                          \
         -- Timeline ID             -- Event ID
```
The first part of this is the Timeline ID. This is used to identify a specific timeline instance stored in Modality. Each timeline has a name, but many timelines may share the same name (for example, if you record multiple traces from the same system). The Timeline ID is used to identify a specific event trace from the part of the system you've identified with that name.
TIP
Be careful not to confuse timeline names with timeline IDs! Many timeline instances can share the same name, but each timeline ID identifies exactly one timeline instance.
The second part of the event coordinate (after the `:`) identifies the specific event on that timeline.
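Splitting a coordinate into its two parts is mechanical: everything before the `:` is the timeline ID, everything after is the event ID. A minimal sketch (hypothetical helper, not part of Modality's tooling), using the example coordinate from above:

```python
def parse_coordinate(coord: str):
    """Split an event coordinate into (timeline_id, event_id)."""
    timeline_id, event_id = coord.split(":", 1)
    return timeline_id, event_id

tid, eid = parse_coordinate("60a20a3f243a439baa6211d4969a065e:03a6e0")
print(tid)  # 60a20a3f243a439baa6211d4969a065e
print(eid)  # 03a6e0
```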
# Event and Timeline Attributes
In addition to their names, events and timelines can have arbitrary key/value metadata attached to them, called attributes. Attribute keys are always strings; they use `.` characters as a namespacing mechanism. Attribute values can be a scalar value of a number of types: strings, numbers, timestamps, etc.
Some example attribute keys and values might be:
- `event.name = "startup"`
- `event.timestamp = 1798789234ns`
- `timeline.name = "imu"`
- `timeline.mycorp.build_number = 157`
All attributes attached to events start with `event.`, and all attributes attached to timelines start with `timeline.`. But this can be skipped in places where the context is clear. For example, if you're writing an event pattern, instead of `event.name` you can just write `name`.
You can attach any attribute keys to events and timelines, but some attribute keys are special. Modality looks for these special keys to understand various things about the trace. This includes event timestamps, and information for causality tracking (see below).
For specific attribute semantics, see Special Attributes.
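The prefix-shorthand rule can be sketched as a small helper: in a context where the prefix is implied, a bare key gets the contextual prefix, while already-qualified keys are left alone. This is a hypothetical illustration of the convention, not Modality's implementation:

```python
def qualify(key: str, context: str = "event") -> str:
    """Expand a shorthand attribute key to its fully qualified form.

    In an event pattern, 'name' is shorthand for 'event.name'. Keys
    already carrying an 'event.' or 'timeline.' prefix are untouched.
    """
    if key.startswith(("event.", "timeline.")):
        return key
    return f"{context}.{key}"

print(qualify("name"))           # event.name
print(qualify("timeline.name"))  # timeline.name
```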
# Connections Between Timelines (Causality)
Timelines can represent a linear sequence of events, from a single 'place' in your system. But actual systems are made of many interacting components; this is what makes them powerful, but also makes them difficult to analyze and test.
This is where Modality is different from other event databases. Modality analyzes the stored events and timelines to derive connections between the timelines. For example, if one timeline has a 'sent message' event, and another has a 'received message' event, and those share the same id (written as event attributes), Modality can create an inter-timeline connection between those two timelines.
```
             start   config_imu                   recv_imu_report
control  o-----o---------o---------------------------------o---->
                          \                               /
                           \                             /
imu            o------------o--------o---------o--------o------->
             start        init    measure  measure   report
```
This is useful in a lot of different ways. The most important, and the reason this is built into Modality, is to have a basis of 'happens-before' and 'happens-after' across different components of your system. This idea is called causality. Since a system's current state and inputs dictate what it will do next, events that come before could have contributed to causing events that come later. By recording system execution and runtime parameters with events and their attributes, and keeping those events in order, you can see which conditions ultimately contribute to causing good or bad behavior.
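The message-matching idea behind these derived connections can be sketched in a few lines: a sent event and a received event sharing the same nonce attribute are linked into one interaction, establishing a happens-before edge between their timelines. This is an illustrative sketch, not Modality's actual implementation; the event names and data are made up, though `nonce` mirrors the attribute shown in the query results later in this document:

```python
sent = [
    {"timeline": "producer", "name": "sent_measurement", "nonce": 41},
    {"timeline": "producer", "name": "sent_heartbeat", "nonce": 42},
]
received = [
    {"timeline": "consumer", "name": "received_measurement", "nonce": 41},
]

def derive_interactions(sent_events, received_events):
    """Link send/receive event pairs that share the same nonce."""
    by_nonce = {e["nonce"]: e for e in sent_events}
    links = []
    for r in received_events:
        s = by_nonce.get(r["nonce"])
        if s is not None:
            # The send happens-before the receive, connecting the timelines.
            links.append((s["timeline"], s["name"], r["timeline"], r["name"]))
    return links

print(derive_interactions(sent, received))
# [('producer', 'sent_measurement', 'consumer', 'received_measurement')]
```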
This example from the `modality log` command shows recorded causality information. The white arrows between the vertical colored timelines denote interactions. We see a complete interaction from `Producer sending measurement message` on the `producer` timeline to `Received measurement message` on the `consumer` timeline. This establishes a causal link, meaning that events on the `consumer` timeline after it received the measurement message may have been caused by events on the `producer` timeline before it sent the message.
```
○───╮ ║ ║ ║ [Interaction i0001] "Producer sending measurement message" @ producer [df2571efa4ee4186b4926ed0b31b0ec7:160c0b]
║ │ ║ ║ ║ destination = consumer
║ │ ║ ║ ║ name = Producer sending measurement message
║ │ ║ ║ ║ sample = 1
║ │ ║ ║ ║ severity = info
║ │ ║ ║ ║ source.file = tracing-modality/examples/monitored_pipeline.rs
║ │ ║ ║ ║ source.line = 251
╟─────╮ ║ ║ ║ [Interaction i0002]
║ │ │ ║ ║ ║
■ │ │ ║ ║ ║ "Sending heartbeat message" @ producer [df2571efa4ee4186b4926ed0b31b0ec7:18bd47]
║ │ │ ║ ║ ║ destination = monitor
║ │ │ ║ ║ ║ name = Sending heartbeat message
║ │ │ ║ ║ ║ severity = info
║ │ │ ║ ║ ║ source.file = tracing-modality/examples/monitored_pipeline.rs
║ │ │ ║ ║ ║ source.line = 436
║ │ │ ║ ║ ║
║ ╰──────▶○ ║ [Interaction i0001] "Received measurement message" @ consumer [8ab93cb714214fe38d4c7656dc14dc47:19510f]
║ │ ║ ║ ║ name = Received measurement message
║ │ ║ ║ ║ sample = 1
║ │ ║ ║ ║ severity = info
║ │ ║ ║ ║ source.file = tracing-modality/examples/monitored_pipeline.rs
║ │ ║ ║ ║ source.line = 309
```
# The SpeQTr Language
Once you have collected data about your running system, you can use Auxon's SpeQTr as a query language to ask nuanced questions about what happened. You can look for local or system-wide patterns of events, filter on event and timeline attributes, and calculate aggregate statistics. This lets you confirm that your system is doing exactly what it's supposed to, or pinpoint the place where things went wrong. In addition, Modality has tools to help you understand the general structure of your system and find areas of risk.
Here you can see an example of a simple query, finding all of the places in the collected data where the `producer` sends a measurement and then the `consumer` receives it.
```
"Producer sending measurement message"@producer FOLLOWED BY
"Received measurement message"@consumer
```
Running this query gives the below results:
```
Result 1:
═════════
■ ║ "Producer sending measurement message" @ producer [bd40e6ad2b4747a58536cb3c850ffb14:096cb1]
║ ║ destination=consumer
║ ║ nonce=-4882720374935248381
║ ║ sample=-1
║ ║ severity=info
║ ║ source.file=tracing-modality/examples/monitored_pipeline.rs
║ ║ source.line=251
║ ║ source.module=monitored_pipeline::producer
║ ║ timestamp=1663319617641549835ns
║ ║ query.label='Producer sending measurement message'@producer
║ ║
╚═»╗ producer interacted with consumer at 5be148ebc1b84ceba6defedf4667a007:0ba86c
║ ║
║ ■ "Received measurement message" @ consumer [5be148ebc1b84ceba6defedf4667a007:0ba86c]
║ ║ interaction.remote_nonce=-4882720374935248381
║ ║ interaction.remote_timeline_id=bd40e6ad-2b47-47a5-8536-cb3c850ffb14
║ ║ interaction.remote_timestamp=1663319617641282787ns
║ ║ sample=-1
║ ║ severity=info
║ ║ source.file=tracing-modality/examples/monitored_pipeline.rs
║ ║ source.line=309
║ ║ source.module=monitored_pipeline::consumer
║ ║ timestamp=1663319617641830705ns
║ ║ query.label='Received measurement message'@consumer
║ ║
Result 2:
═════════
■ ║ "Producer sending measurement message" @ producer [bd40e6ad2b4747a58536cb3c850ffb14:087c563e]
║ ║ destination=consumer
║ ║ nonce=-6868605139615718718
║ ║ sample=-1
║ ║ severity=info
║ ║ source.file=tracing-modality/examples/monitored_pipeline.rs
║ ║ source.line=251
║ ║ source.module=monitored_pipeline::producer
║ ║ timestamp=1663319617783789766ns
║ ║ query.label='Producer sending measurement message'@producer
║ ║
╚═»╗ producer interacted with consumer at 5be148ebc1b84ceba6defedf4667a007:08cb793d
║ ║
║ ■ "Received measurement message" @ consumer [5be148ebc1b84ceba6defedf4667a007:08cb793d]
║ ║ interaction.remote_nonce=-6868605139615718718
║ ║ interaction.remote_timeline_id=bd40e6ad-2b47-47a5-8536-cb3c850ffb14
║ ║ interaction.remote_timestamp=1663319617782985545ns
║ ║ sample=-1
║ ║ severity=info
║ ║ source.file=tracing-modality/examples/monitored_pipeline.rs
║ ║ source.line=309
║ ║ source.module=monitored_pipeline::consumer
║ ║ timestamp=1663319617788634110ns
║ ║ query.label='Received measurement message'@consumer
║ ║
...
```
These results allow you to quickly find events that match your query, including attribute values for the matching events. To continue to build a better understanding of what happened, you can use the `modality log` command to explore the surrounding trace.
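The intuition behind a FOLLOWED BY pattern can be sketched over a single linear trace: find each occurrence of the first event that is later followed by the second. This is a loose approximation for illustration only; SpeQTr's actual semantics operate over causal ordering across timelines, and the trace data here is made up:

```python
def followed_by(events, first, second):
    """Return (i, j) index pairs where `first` at position i is later
    followed by the nearest `second` at position j (i < j)."""
    pairs = []
    for i, a in enumerate(events):
        if a != first:
            continue
        for j in range(i + 1, len(events)):
            if events[j] == second:
                pairs.append((i, j))
                break  # take only the nearest following match
    return pairs

trace = ["send", "other", "recv", "send", "recv"]
print(followed_by(trace, "send", "recv"))  # [(0, 2), (3, 4)]
```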
# Workspaces and Segmentation
In Modality, all collected traces are stored together. In order to split the data into useful pieces, for different users and different kinds of analysis, Modality provides the Workspace and Segmentation features.
# Workspaces
Workspaces are like views onto the data lake. They provide top-level filtering to decide which timelines should be included. This works on the basis of timeline attributes; this includes timeline names, but can also include any custom timeline attributes which you may have added. For example, you might have a workspace which looks at the value of the `fleet.geo` custom attribute and selects only those timelines with the value `Europe`.
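Workspace-style filtering amounts to keeping only the timelines whose attributes match a predicate. A minimal sketch, where the `fleet.geo` attribute comes from the example above and the timeline data is invented for illustration:

```python
timelines = [
    {"timeline.name": "imu", "timeline.fleet.geo": "Europe"},
    {"timeline.name": "ui_thread", "timeline.fleet.geo": "Americas"},
    {"timeline.name": "gateway_connection", "timeline.fleet.geo": "Europe"},
]

def workspace(timelines, key, value):
    """Keep only timelines whose attribute `key` equals `value`."""
    return [t for t in timelines if t.get(key) == value]

europe = workspace(timelines, "timeline.fleet.geo", "Europe")
print([t["timeline.name"] for t in europe])
# ['imu', 'gateway_connection']
```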
Workspaces also contain additional configuration that is used to further split up the data for analysis (see 'Segmentation' below).
Nearly every operation you do in Modality will be in the context of some workspace. When you install `modalityd`, it comes pre-configured with a workspace called `default` that includes all data.
# Segmentation
Inside of a workspace, you can split up the timelines into different segments. This is also done on the basis of timeline attributes, similar to workspace filtering. The difference is that for segments, we use the attribute values to split the data into chunks. For example, inside a workspace you might segment the timelines based on the value of the `fleet.model` custom attribute. This would give you a number of segments, named after the model number, each containing the timelines collected from systems with that model number.
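Where a workspace filters, segmentation groups: every distinct value of the chosen attribute becomes its own segment. A minimal sketch, with `fleet.model` taken from the example above and the timeline data invented for illustration:

```python
from collections import defaultdict

timelines = [
    {"timeline.name": "imu-a", "timeline.fleet.model": "m100"},
    {"timeline.name": "imu-b", "timeline.fleet.model": "m200"},
    {"timeline.name": "imu-c", "timeline.fleet.model": "m100"},
]

def segment(timelines, key):
    """Group timeline names into segments keyed by the value of `key`."""
    segments = defaultdict(list)
    for t in timelines:
        segments[t[key]].append(t["timeline.name"])
    return dict(segments)

print(segment(timelines, "timeline.fleet.model"))
# {'m100': ['imu-a', 'imu-c'], 'm200': ['imu-b']}
```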
You can have multiple segmentation methods in a single workspace, so you can cut up the data in different ways.
Many operations in Modality (and the applications built on it) work with the data from a single selected segment (the 'active' one). Others can be configured to work with a single segment, or in an aggregation mode across multiple segments.
The `default` workspace comes pre-configured with a single segmentation method based on the `run_id` timeline attribute; this attribute is provided by all of the Modality collectors as a way to easily split up the data on the basis of infrastructure cycles. The most recently collected segment will be active by default.
# How workspaces and segments are meant to be used
Workspaces and segments provide a very generic and dynamic way to split up your collected trace data. You can use them in a lot of different ways, but this is how we designed them to be used:
Workspaces should be used as a coarse, top-level filtering mechanism. You'll probably have a small handful of workspaces, each for a different use case. Most users shouldn't have to deal with more than 1 or 2 workspaces on a daily basis.
Segmentation methods should be configured to reflect something that exists in your workflow: maybe a test or CI run, or a field test. They could align with a physical piece of infrastructure, like a specific drone or a test rig in your lab. You can have multiple segmentation methods in a workspace, so you can do all of these together.
# Operational Architecture
Modality is a client-server application. The server is called `modalityd`; the command-line client is just called `modality`. There is also an event conversion and routing application called `modality-reflector`.
# modalityd
`modalityd` is the database server component of Modality. It is deployed as a single process with local storage, which is split into multiple files based on storage class. It implements a token-based authentication and authorization system to manage data ingest and client connections.
`modalityd` is typically deployed on a central server, which is accessed by all users who want to use it. It is suitable for both cloud-based and on-premises deployment.
# modality-reflector
The reflector provides a few key functions in a Modality deployment:
- It manages configuration and execution of plugins for collectors, importing, and mutations.
- It connects back to `modalityd`, or to another reflector.
- It can add additional attributes to the timelines which pass through it.
Some simple Modality deployments may only need a single reflector, or may not need one at all (if you are using a tracing framework which directly supports the Modality event ingest protocol). Other deployments may use multiple reflectors.
Here are some common scenarios where you might want to deploy additional reflectors:
- Some embedded collector and mutation plugins require direct access to peripheral hardware, like a JTAG probe. For these scenarios, you can run `modality-reflector` on the computer with the hardware installed, configured to run those plugins.
- For network-based tracing systems, the network topology may not allow incoming connections from collection infrastructure. In this case, `modality-reflector` can be deployed in the inner network; all Modality network connections are client-initiated, so many different network topologies can be supported.
- If you have a system-of-systems, you may want to deploy an intermediate `modality-reflector` for routing or annotation purposes. Each reflector process can add its own metadata, allowing clean separation between the information known to a system about itself and information known about its operational context.
# `modality` CLI
The Modality command-line client application (executed as just `modality`) is the primary user interface for most Modality users. It can be used to view logs, evaluate queries, set up and select workspaces and segments, and perform administrative tasks.