Modality Concepts

# Events and Timelines

Modality's data model is based on events and timelines. An event simply records that something happened. A timeline is a sequence of events which happened in order. Timelines can represent different things in your system, depending on the implementation. In a desktop application, each thread could be a timeline, whereas in an embedded system you might have a timeline per microcontroller.

You can think of an event as something happening in your system, and the timeline as being the 'place' that it happened.

imu     o-------o--------o----------o--------o------->
      start    init   measure    measure   report
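One way to picture this model is a timeline as an ordered list of named events. Here is a minimal, hypothetical sketch of that idea in Python (these types are illustrative only, not part of any Modality API):

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    # An event records that something happened; attrs holds its metadata.
    name: str
    attrs: dict = field(default_factory=dict)

@dataclass
class Timeline:
    # A timeline is the 'place' events happened, kept in the order they occurred.
    name: str
    events: list = field(default_factory=list)

imu = Timeline("imu")
for name in ["start", "init", "measure", "measure", "report"]:
    imu.events.append(Event(name))

print([e.name for e in imu.events])
# ['start', 'init', 'measure', 'measure', 'report']
```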

# Event and Timeline names

Events and timelines are identified in two ways. The first is by name: events have names, and timelines have names. Names aren't necessarily unique, so a name alone isn't a great way to identify a specific event occurrence at a specific point in time. But names are very useful for understanding the collected trace, and for categorization.

Some example event names might be:

  • startup / shutdown
  • send_motor_control_command / receive_motor_control_command
  • read_imu
  • over_voltage_detected

Some example timeline names might be:

  • ui_thread
  • imu
  • gateway_connection

# event@timeline notation

In Modality, it is very common to refer to events occurring on timelines by name. We do this when printing recorded trace information using modality log, when composing SpeQTr queries for use in modality query, and in many other places. For this purpose, we use @ in a fashion similar to an email address.

For example, events named measure on timelines named imu could be referred to as measure@imu.

TIP

When used in a query context, event@timeline is automatically expanded to the query _.name = 'event' AND _.timeline.name = 'timeline'.
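That expansion can be sketched as a small string transformation. The helper below is purely illustrative (Modality performs this expansion internally; this is not its implementation):

```python
def expand(pattern: str) -> str:
    """Expand event@timeline shorthand into a SpeQTr-style predicate.

    Illustrative sketch only, not part of the Modality tooling.
    """
    # Split on the last '@' so event names containing '@' still work.
    event, timeline = pattern.rsplit("@", 1)
    return f"_.name = '{event}' AND _.timeline.name = '{timeline}'"

print(expand("measure@imu"))
# _.name = 'measure' AND _.timeline.name = 'imu'
```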

# Event Coordinates

Names are useful for referring to some kind of thing happening on some kind of timeline, but we often need to be more specific. For this, we use event coordinates. Event coordinates are composed of two parts: a timeline id, and an event id. They look like this:

60a20a3f243a439baa6211d4969a065e:03a6e0
-------------------------------- ------
        \                          \
         -- Timeline Id             -- Event Id

The first part is the Timeline ID, which identifies a specific timeline instance stored in Modality. Each timeline has a name, but many timelines may share the same name (for example, if you record multiple traces from the same system). The Timeline ID picks out one specific event trace among all the timelines sharing that name.

TIP

Be careful not to confuse timeline names with timeline IDs! Many timeline instances can share the same name, but each timeline ID is used only once.

The second part of the event coordinate (after the :) identifies the specific event on the timeline.
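Splitting an event coordinate into its two parts is straightforward. A small illustrative parser (not part of the Modality client libraries):

```python
def parse_coordinate(coord: str):
    """Split an event coordinate of the form <timeline id>:<event id>.

    Illustrative sketch only; Modality's own tools handle coordinates for you.
    """
    timeline_id, event_id = coord.split(":")
    return timeline_id, event_id

tid, eid = parse_coordinate("60a20a3f243a439baa6211d4969a065e:03a6e0")
print(tid)  # 60a20a3f243a439baa6211d4969a065e
print(eid)  # 03a6e0
```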

# Event and Timeline Attributes

In addition to their names, events and timelines can have arbitrary key/value metadata attached to them, called attributes. Attribute keys are always strings and use . characters as a namespacing mechanism. Attribute values are scalars of a number of types: strings, numbers, timestamps, etc.

Some example attribute keys and values might be:

  • event.name = "startup"
  • event.timestamp = 1798789234ns
  • timeline.name = "imu"
  • timeline.mycorp.build_number = 157

All attributes attached to events start with event., and all attributes attached to timelines start with timeline.. However, the prefix can be omitted in places where the context is clear. For example, if you're writing an event pattern, instead of event.name you can just write name.
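The prefix-omission rule can be sketched as a normalization step. This hypothetical helper (not a real Modality function) shows how a bare key resolves against its implicit context:

```python
def normalize(key: str, context: str = "event") -> str:
    """Resolve a bare attribute key against its implicit prefix.

    In an event pattern, a bare 'name' means 'event.name'.
    Hypothetical helper, for illustration only.
    """
    # Keys that already carry a full prefix are left untouched.
    if key.startswith(("event.", "timeline.")):
        return key
    return f"{context}.{key}"

print(normalize("name"))                          # event.name
print(normalize("timeline.mycorp.build_number"))  # timeline.mycorp.build_number
```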

You can attach any attribute keys to events and timelines, but some attribute keys are special. Modality looks for these special keys to understand various things about the trace. This includes event timestamps, and information for causality tracking (see below).

For specific attribute semantics, see Special Attributes.

# Connections Between Timelines (Causality)

Timelines can represent a linear sequence of events, from a single 'place' in your system. But actual systems are made of many interacting components; this is what makes them powerful, but also makes them difficult to analyze and test.

This is where Modality is different from other event databases. Modality analyzes the stored events and timelines to derive connections between the timelines. For example, if one timeline has a 'sent message' event, and another has a 'received message' event, and those share the same id (written as event attributes), Modality can create an inter-timeline connection between those two timelines.

          start  config_imu                     recv_imu_report
control     o-------o----------------------------------o---->
                     \                                /
                      \                              /
imu         o----------o--------o----------o--------o------->
          start       init   measure    measure   report

This is useful in a lot of different ways. The most important, and the reason this is built into Modality, is to have a basis of 'happens-before' and 'happens-after' across different components of your system. This idea is called causality. Since a system's current state and inputs dictate what it will do next, events that come before could have contributed to causing events that come later. By recording system execution and runtime parameters with events and their attributes, and keeping those events in order, you can see which conditions ultimately contribute to causing good or bad behavior.
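The message-id matching described above amounts to a join over event attributes. This simplified, hypothetical sketch pairs a 'sent' event with the 'received' event that carries the same id (real Modality derives these connections from special attributes during ingest):

```python
# Hypothetical recorded events: (timeline, event name, attributes).
events = [
    ("producer", "send_msg", {"nonce": 42}),
    ("consumer", "recv_msg", {"nonce": 42}),
    ("producer", "send_msg", {"nonce": 43}),  # never received
]

# Index 'sent' events by their message id, then link matching 'received' events.
sent_by_nonce = {e[2]["nonce"]: e for e in events if e[1] == "send_msg"}
links = [
    (sent_by_nonce[e[2]["nonce"]], e)
    for e in events
    if e[1] == "recv_msg" and e[2]["nonce"] in sent_by_nonce
]

for sent, recv in links:
    print(f"{sent[0]} -> {recv[0]} (nonce {sent[2]['nonce']})")
# producer -> consumer (nonce 42)
```

Each link establishes a happens-before relationship between the two timelines: everything before the send on the producer timeline happened before everything after the receive on the consumer timeline.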

This example from the modality log command shows recorded causality information. The white arrows between the vertical colored timelines denote interactions. We see a complete interaction from "Producer sending measurement message" on the producer timeline to "Received measurement message" on the consumer timeline. This establishes a causal link: events on the consumer timeline after it received the measurement message may have been caused by events on the producer timeline before it sent the message.

───╮                 [Interaction i0001] "Producer sending measurement message" @ producer   [df2571efa4ee4186b4926ed0b31b0ec7:160c0b]
                destination = consumer
                name = Producer sending measurement message
                sample = 1
                severity = info
                source.file = tracing-modality/examples/monitored_pipeline.rs
                source.line = 251
─────╮               [Interaction i0002]
   │ │       
   │ │               "Sending heartbeat message" @ producer   [df2571efa4ee4186b4926ed0b31b0ec7:18bd47]
   │ │                 destination = monitor
   │ │                 name = Sending heartbeat message
   │ │                 severity = info
   │ │                 source.file = tracing-modality/examples/monitored_pipeline.rs
   │ │                 source.line = 436
   │ │       
   ╰──────▶           [Interaction i0001] "Received measurement message" @ consumer   [8ab93cb714214fe38d4c7656dc14dc47:19510f]
                name = Received measurement message
                sample = 1
                severity = info
                source.file = tracing-modality/examples/monitored_pipeline.rs
                source.line = 309

# The SpeQTr Language

Once you have collected data about your running system, you can use Auxon's SpeQTr query language to ask nuanced questions about what happened. You can look for local or system-wide patterns of events, filter on event and timeline attributes, and calculate aggregate statistics. This lets you confirm that your system is doing exactly what it's supposed to, or pinpoint the place where things went wrong. In addition, Modality has tools to help you understand the general structure of your system and find areas of risk.

Here you can see an example of a simple query, finding all of the places in the collected data where the producer sends a measurement and then the consumer receives it.

"Producer sending measurement message"@producer FOLLOWED BY
"Received measurement message"@consumer

Running this query gives the following results:

Result 1:
═════════
■    "Producer sending measurement message" @ producer  [bd40e6ad2b4747a58536cb3c850ffb14:096cb1]
      destination=consumer
      nonce=-4882720374935248381
      sample=-1
      severity=info
      source.file=tracing-modality/examples/monitored_pipeline.rs
      source.line=251
      source.module=monitored_pipeline::producer
      timestamp=1663319617641549835ns
      query.label='Producer sending measurement message'@producer
  
╚═»  producer interacted with consumer at 5be148ebc1b84ceba6defedf4667a007:0ba86c
  
"Received measurement message" @ consumer  [5be148ebc1b84ceba6defedf4667a007:0ba86c]
      interaction.remote_nonce=-4882720374935248381
      interaction.remote_timeline_id=bd40e6ad-2b47-47a5-8536-cb3c850ffb14
      interaction.remote_timestamp=1663319617641282787ns
      sample=-1
      severity=info
      source.file=tracing-modality/examples/monitored_pipeline.rs
      source.line=309
      source.module=monitored_pipeline::consumer
      timestamp=1663319617641830705ns
      query.label='Received measurement message'@consumer
  

Result 2:
═════════
■    "Producer sending measurement message" @ producer  [bd40e6ad2b4747a58536cb3c850ffb14:087c563e]
      destination=consumer
      nonce=-6868605139615718718
      sample=-1
      severity=info
      source.file=tracing-modality/examples/monitored_pipeline.rs
      source.line=251
      source.module=monitored_pipeline::producer
      timestamp=1663319617783789766ns
      query.label='Producer sending measurement message'@producer
  
╚═»  producer interacted with consumer at 5be148ebc1b84ceba6defedf4667a007:08cb793d
  
"Received measurement message" @ consumer  [5be148ebc1b84ceba6defedf4667a007:08cb793d]
      interaction.remote_nonce=-6868605139615718718
      interaction.remote_timeline_id=bd40e6ad-2b47-47a5-8536-cb3c850ffb14
      interaction.remote_timestamp=1663319617782985545ns
      sample=-1
      severity=info
      source.file=tracing-modality/examples/monitored_pipeline.rs
      source.line=309
      source.module=monitored_pipeline::consumer
      timestamp=1663319617788634110ns
      query.label='Received measurement message'@consumer
  
...

These results allow you to quickly find events that match your query, including attribute values for the matching events. To continue to build a better understanding of what happened you can use the modality log command to explore the surrounding trace.
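Conceptually, a FOLLOWED BY query finds a match of the first pattern and then a later match of the second. The sketch below is a deliberate simplification over a single merged event stream, where list order stands in for causal order (real SpeQTr evaluation also accounts for cross-timeline causality):

```python
def followed_by(events, first, second):
    """Yield (a, b) pairs where an event named `first` is later
    followed by an event named `second`.

    Simplified illustration: list order stands in for causal order.
    """
    pending = []
    for e in events:
        if e["name"] == first:
            pending.append(e)
        elif e["name"] == second and pending:
            # Pair the oldest unmatched 'first' event with this one.
            yield pending.pop(0), e

stream = [
    {"name": "send", "ts": 1},
    {"name": "recv", "ts": 2},
    {"name": "send", "ts": 3},
    {"name": "recv", "ts": 4},
]
matches = list(followed_by(stream, "send", "recv"))
print(len(matches))  # 2
```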

# Workspaces and Segmentation

In Modality, all collected traces are stored together. To split the data into useful pieces, for different users and different kinds of analysis, Modality provides the Workspace and Segmentation features.

# Workspaces

Workspaces are like views onto the data lake: they provide a top-level filter that decides which timelines are included. Filtering works on the basis of timeline attributes; this includes timeline names, but also any custom timeline attributes you may have added. For example, you might have a workspace that looks at the value of the fleet.geo custom attribute and selects only those timelines where the value is Europe.
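That filtering step can be sketched as a simple predicate over timeline attributes. The timelines and attribute values below are hypothetical:

```python
# Hypothetical timelines, each with its attributes flattened into a dict.
timelines = [
    {"timeline.name": "imu", "timeline.fleet.geo": "Europe"},
    {"timeline.name": "ui_thread", "timeline.fleet.geo": "NorthAmerica"},
    {"timeline.name": "gateway_connection", "timeline.fleet.geo": "Europe"},
]

# A workspace filter selects timelines by attribute value.
europe_workspace = [
    t for t in timelines if t.get("timeline.fleet.geo") == "Europe"
]
print([t["timeline.name"] for t in europe_workspace])
# ['imu', 'gateway_connection']
```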

Workspaces also contain additional configuration that is used to further split up the data for analysis (see 'Segmentation' below).

Nearly every operation you do in Modality will be in the context of some workspace. When you install modalityd, it comes pre-configured with a workspace called default that includes all data.

# Segmentation

Inside of a workspace, you can split up the timelines into different segments. This is also done on the basis of timeline attributes, similar to workspace filtering. The difference is that for segments, we use the attribute values to split the data into chunks. For example, inside a workspace, you might segment the timelines based on the value of the fleet.model custom attribute. This would give you a bunch of segments, named after the model number, each containing the timelines collected from systems with that model number.

You can have multiple segmentation methods in a single workspace, so you can cut up the data in different ways.

Many operations in Modality (and the applications built on it) work with the data from a single selected segment (the 'active' one). Others can be configured to work with a single segment, or in an aggregation mode across multiple segments.

The default workspace comes pre-configured with a single segmentation method based on the run_id timeline attribute; this attribute is provided by all of the Modality collectors as a way to easily split up the data on the basis of infrastructure cycles. The most recently collected segment will be active by default.
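Where a workspace filter keeps or discards timelines, a segmentation method partitions the kept timelines by an attribute value. A hypothetical sketch of the fleet.model example above:

```python
from collections import defaultdict

# Hypothetical timelines already admitted into a workspace.
timelines = [
    {"timeline.name": "imu", "timeline.fleet.model": "M100"},
    {"timeline.name": "ui_thread", "timeline.fleet.model": "M200"},
    {"timeline.name": "gateway_connection", "timeline.fleet.model": "M100"},
]

# Segmentation splits the timelines into chunks keyed by the attribute value.
segments = defaultdict(list)
for t in timelines:
    segments[t["timeline.fleet.model"]].append(t["timeline.name"])

print(dict(segments))
# {'M100': ['imu', 'gateway_connection'], 'M200': ['ui_thread']}
```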

# How workspaces and segments are meant to be used

Workspaces and segments provide a very generic and dynamic way to split up your collected trace data. You can use them in a lot of different ways, but this is how we designed them to be used:

  • Workspaces should be used as a coarse, top-level filtering mechanism. You'll probably have a small handful of workspaces, each for a different use case. Most users shouldn't have to deal with more than one or two workspaces on a daily basis.

  • Segmentation methods should be configured to reflect something that exists in your workflow: maybe a test or CI run, or a field test. They could align with a physical piece of infrastructure, like a specific drone or a test rig in your lab. You can have multiple segmentation methods in a workspace, so you can do all of these together.

# Operational Architecture

Modality is a client-server application. The server is called modalityd; the command-line client is just called modality. There is also an event conversion and router application called modality-reflector.

# modalityd

modalityd is the database server component of Modality. It is deployed as a single process with local storage, which is split into multiple files based on the storage class. It implements a token-based authentication and authorization system to manage data ingest and client connections.

modalityd is typically deployed on a central server, which is accessed by all users who want to use it. It is suitable for both cloud-based and on-premises deployment.

# modality-reflector

The reflector provides a few key functions in a Modality deployment:

  • It manages the configuration and execution of plugins for collection, importing, and mutation.
  • It connects back to modalityd, or to another reflector.
  • It can add additional attributes to the timelines which pass through it.

Some simple Modality deployments may only need a single reflector, or may not need one at all (if you are using a tracing framework which directly supports the Modality event ingest protocol). Other deployments may use multiple reflectors.

Here are some common scenarios where you might want to deploy additional reflectors:

  • Some embedded collector and mutation plugins require direct access to peripheral hardware, like a JTAG probe. For these scenarios, you can run modality-reflector on the computer with the hardware installed, configured to run those plugins.

  • For network-based tracing systems, the network topology may not allow incoming connections from collection infrastructure. In this case, modality-reflector can be deployed in the inner network; all Modality network connections are client-initiated, so many different network topologies can be supported.

  • If you have a system-of-systems, you may want to deploy an intermediate modality-reflector for routing or annotation purposes. Each reflector process can add its own metadata, allowing clean separation between the information known to a system about itself, and information known about its operational context.

# modality CLI

The Modality command-line client application (executed as just modality) is the primary user interface for most Modality users. It can be used to view logs, evaluate queries, set up and select workspaces and segments, and perform administrative tasks.