Handling Time

The Three Major Problems

Any solution for representing time must be able to solve the following three problems with appropriate precision and efficiency:

Converting between time domains

Converting between audio and musical time domains is imprecise. If this only happens in one direction, then the nature of the imprecision is well-understood and can be controlled for.

Consider a given musical time Tm. Given the current (and, for the purposes of this example, fixed) state of the tempo map, this maps to an audio domain time Ta. Ta is measured in samples, represented as an integer, a rational, or a floating point value. If Tm corresponds to a position precisely halfway between two samples, then the nature of the representation of Ta is important. If integral, we must round the position up or down to the nearest integral Ta. If rational or floating point, we are still effectively quantizing the representation of Ta, just in different ways.

This is not an issue if we only ever convert in one direction. For example, we might decide to use integral samples for audio time, and always round fractional values down, up or to the nearest integer. However, the moment we ever try to convert this value back to Tm, we will get the wrong answer.
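As a sketch of this one-directional case, assume an illustrative fixed mapping (the numbers and names here are hypothetical, not any DAW's actual values): 120 BPM at 48 kHz gives 24000 samples per beat, and musical time is stored as ticks at 1920 per beat, so one tick corresponds to 12.5 samples.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Illustrative fixed mapping: 120 BPM at 48 kHz gives 24000 samples per
// beat; musical time is stored as ticks (1920 per beat), so one tick
// corresponds to 12.5 samples.
constexpr int64_t ticks_per_beat   = 1920;
constexpr double  samples_per_beat = 24000.0;

// One-directional conversion: always round to the nearest integral sample.
// As long as we never convert back, the quantization error is bounded and
// consistent (at most half a sample).
int64_t musical_to_samples (int64_t ticks)
{
    return std::llround ((double (ticks) / ticks_per_beat) * samples_per_beat);
}
```

A position of 1 tick lands at 12.5 samples and is consistently rounded to 13; since the rounding rule never varies, no event can flip between cycles.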

The canonical example of where this matters is when Tm lies close to the boundary (beginning or end) of a process cycle. The boundary is defined in audio time. On one cycle, we decide that Tm should be considered to be in the following cycle. On the next cycle, we decide that it should be in the previous cycle. The result is that we effectively never handle events at Tm.

As mentioned above, this will not happen if we only ever convert in a single direction (e.g. musical -> audio or audio -> musical), since the rounding will be consistent. But limiting conversions to a single direction is hard in a DAW. We may wish to add values from two different time domains and get the result in either time domain, for example.
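Using the same kind of illustrative fixed mapping (one tick == 12.5 samples, i.e. 120 BPM at 48 kHz with 1920 ticks per beat; names hypothetical), the round-trip loss can be sketched directly. Ticks are coarser than samples here, so an audio position converted to musical time and back does not survive:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Illustrative fixed mapping: 1 tick == 12.5 samples.
constexpr double samples_per_tick = 12.5;

int64_t to_samples (int64_t ticks)   { return std::llround (ticks * samples_per_tick); }
int64_t to_ticks   (int64_t samples) { return std::llround (samples / samples_per_tick); }
```

An audio position of 6 samples maps to 0 ticks, which maps back to 0 samples: the position has silently moved, even though each individual conversion rounded "correctly".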

Consequently, any time representation must be able to minimize (or preferably eliminate) any real-world penalties that might occur from bidirectional conversions between time domains.

Dealing with changes to the tempo map

Whatever form musical time is represented with, there will be times when the tempo map is changed. When this happens, potentially each and every musical domain time maps to a different audio domain time. This must be handled efficiently and completely.

If musical time is lazily converted to audio time (i.e. there is a canonical form that is recognizable as the musical time domain, and it remains available at all times), then we can simply recompute the mapping between Tm and Ta when necessary.
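A minimal sketch of this lazy approach (names hypothetical): the canonical tick value is the only stored state, and the audio-domain value is derived on demand from whatever tempo map state is current, so a tempo change requires no per-position rewrite at all.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// The canonical musical position is all that is stored; nothing needs to
// be updated when the tempo map changes.
struct MusicalPos {
    int64_t ticks;

    // Recompute the audio-domain value against the current map state,
    // reduced here to a single samples-per-tick ratio for illustration.
    int64_t samples (double samples_per_tick) const {
        return std::llround (ticks * samples_per_tick);
    }
};
```

The same position yields different sample values under different map states, without any stored audio-domain value ever going stale.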

However, if musical time is converted into some other representation (audio time domain is the most obvious), then when the tempo map changes there is no way to recover the canonical position (see the section above on bi-directional conversion errors).

In addition, if there is a "time type" that does not convey the time domain it is using, there is no way to identify those values that may need updating after a tempo map change, and those that do not.
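A domain-tagged time type addresses this. In this sketch (hypothetical names), every value carries its domain, so code walking stored positions after a tempo map change can tell which ones hold a canonical musical value whose audio-domain interpretation has changed, and which are pure audio positions that need nothing:

```cpp
#include <cassert>
#include <cstdint>

enum class TimeDomain { AudioTime, BeatTime };

struct timepos {
    TimeDomain domain;
    int64_t    value;     // samples if AudioTime, ticks if BeatTime

    bool is_beats () const { return domain == TimeDomain::BeatTime; }
};

// After a tempo map change, only values whose canonical form is musical
// have a changed audio-domain interpretation.
bool needs_remap (timepos const& p) { return p.is_beats (); }
```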

Consistency when performing time arithmetic

Imagine that we have two time positions (or durations), T1 and T2, and that we need to perform some arithmetic using them (for example, adding them together, or computing the distance between them).

Let's additionally stipulate that we want the syntax for this arithmetic to be as simple as possible: we would like to be able to simply write, for example, T1 - T2 or T2 + T1. And finally, let's stipulate that we do not want to care about the time domains of T1 or T2, but do want to be able to get the result of the computation in any time domain we want.

So, now let's consider an example of where we want to add T2 (time domain: beats/musical) to T1 (time domain: samples/audio), but we want the result in beats.

Let's suppose that our expression looks like: (T1 + T2).beats(). Nice and simple. But wait: T1 uses the audio domain, so we need to convert either T2 to the audio domain or T1 to the musical (beats) domain in order to perform the arithmetic. That means there are potentially two time domain conversions taking place here (T2 music -> audio, then the result audio -> music). There is no fundamental problem with that; hopefully our abstractions will make it simple. And they do: (T1 + T2) will do whatever is required without any explicit syntax.
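A sketch of how operator overloading can hide these conversions (hypothetical names, and a fixed map of 1 tick == 12.5 samples for illustration):

```cpp
#include <cassert>
#include <cstdint>

enum class Domain { Audio, Beats };

// Fixed illustrative map state: 1 tick == 12.5 samples.
constexpr double samples_per_tick = 12.5;

struct Time {
    Domain  domain;
    int64_t value;          // samples or ticks, depending on domain

    int64_t samples () const {
        return domain == Domain::Audio
            ? value
            : static_cast<int64_t> (value * samples_per_tick + 0.5);
    }
    int64_t beats_ticks () const {
        return domain == Domain::Beats
            ? value
            : static_cast<int64_t> (value / samples_per_tick + 0.5);
    }
};

// Addition implicitly converts the right operand into the left operand's
// domain; the caller writes no conversion syntax at all.
Time operator+ (Time const& a, Time const& b)
{
    if (a.domain == Domain::Audio) {
        return Time{Domain::Audio, a.value + b.samples ()};
    }
    return Time{Domain::Beats, a.value + b.beats_ticks ()};
}
```

Here (t1 + t2).beats_ticks() performs up to two hidden conversions: t2 into the audio domain for the addition, then the sum back into the musical domain.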

But what happens if the tempo map (which connects the audio and music time domains in a complex, non-linear way) changes between the two conversions? The first conversion will use the "old" map state, the second will use the "new" map state, and our answer will (potentially) be wrong due to the inconsistent map state.

A traditional answer might be to explicitly lock the tempo map while we do this arithmetic, possibly using an RAII style based on some local scope. While this works fine for simple examples, it becomes extremely complex given that we want domain conversions to be transparent, implicit and essentially cost-free. We don't want to have to define the precise scope in which a possible series of time domain conversions must be atomic, since this may reach back up a call stack.
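For reference, the rejected RAII approach looks something like this (hypothetical names). It protects only the lexical scope that holds the guard, which is exactly why it breaks down once the series of conversions that must be consistent spans a call stack:

```cpp
#include <cassert>
#include <mutex>

// A single lock serializing tempo map access (illustrative only).
std::mutex tempo_map_mutex;

struct TempoMapGuard {
    std::lock_guard<std::mutex> guard{tempo_map_mutex};
};

void do_some_arithmetic ()
{
    TempoMapGuard hold;   // map cannot change within this scope...
    // ... e.g. (T1 + T2).beats() ...
}                         // ...but related conversions made by our callers,
                          // outside this scope, are unprotected
```

Every caller that participates in a related series of conversions would need to hold the same guard, forcing the locking scope back up the call stack.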

Instead, we need to ensure that we have a local copy of the tempo map that we are guaranteed is constant across any possible scope that contains a set of time domain conversions that are intended to be consistent.

We choose to define that scope as either of:

  1. for threads that handle the work of processing data: a single process cycle
  2. for other threads that use a glib event loop (GUI, control surfaces, etc.): a single thread wakeup+dispatch cycle

This means that whatever time arithmetic we perform in a single process callback, no matter how distributed or nested, will all be carried out with a constant, consistent version of the tempo map, meaning that all domain conversions will be consistent.

It additionally means that whenever e.g. the GUI thread is handling user interaction (e.g. mouse motion, keyboard events), all time arithmetic performed during a single event dispatch wakeup will also be consistent with each other - the tempo map will not magically change during the event dispatch process. Of course, the GUI may explicitly modify the tempo map that it is using, but that's not a problem since it is single-threaded and presumably understands what it is doing.

We can accomplish this by ensuring that each thread has a thread-local/thread-private shared_ptr<TempoMap>, and accesses the tempo map exclusively via this reference. Even if some other thread modifies the tempo map asynchronously, a thread carrying out time arithmetic will have a constant, consistent tempo map object to use for domain conversions, within the scopes defined above.
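A minimal sketch of the mechanism (names hypothetical): writers swap the canonical map pointer atomically, while each thread pins its own copy once per process cycle or dispatch and performs all conversions through that pinned copy.

```cpp
#include <cassert>
#include <memory>

struct TempoMap {
    double samples_per_beat;
};

// The most recently published map; writers replace it atomically.
std::shared_ptr<TempoMap> canonical_map =
    std::make_shared<TempoMap> (TempoMap{24000.0});

// Each thread's pinned copy for the current process/dispatch cycle.
thread_local std::shared_ptr<TempoMap> thread_map;

// Called once at the top of each process cycle or event-loop wakeup.
void fetch_map () { thread_map = std::atomic_load (&canonical_map); }

// All domain conversions within the cycle go through the pinned copy.
TempoMap const& map () { return *thread_map; }
```

Even if another thread publishes a new map mid-cycle via std::atomic_store, the pinned shared_ptr keeps the old map object alive and unchanged until the next fetch, so every conversion within the cycle sees one consistent map.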