Is it real-time?

Several people have asked whether the Gaborator is suitable for real-time applications. There is no simple yes or no answer to this question, because there are many different definitions of "real-time", and the answer depends on the definition. Below are some answers to the question "is it real-time?" rephrased in terms of different definitions.

Can it process a recording in less time than its duration?

Yes. For example, at 48 frequency bands per octave, a single core of a 2.5 GHz Intel Core i5 CPU can analyze some 10 million samples per second, which is more than 200 times faster than real time for a single channel of 44.1 kHz audio.
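As a back-of-envelope check on that claim (the throughput figure is the one quoted above, not a new measurement):

```python
# Sanity-check the real-time speedup quoted in the text.
# Both figures below are taken from the text, not measured here.
throughput = 10_000_000   # samples analyzed per second (figure from the text)
sample_rate = 44_100      # sample rate of a single channel of CD audio

speedup = throughput / sample_rate
print(f"about {speedup:.0f}x faster than real time")  # about 227x, i.e. "more than 200"
```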

Does it have bounded latency? Can it start producing output before consuming the entire input? Will it stream?

Yes. See the streaming example.

Does it have low latency?

Probably not low enough for applications such as live musical effects. The exact latency depends on factors such as the frequency range and number of bands per octave, but tends to range between "high" and "very high". For example, with the parameters used in the online demo, 48 frequency bands per octave down to 20 Hz, the latency of the analysis side alone is some 3.5 seconds, and if you do analysis followed by resynthesis, the total latency will be almost 13 seconds.

This can be reduced by choosing the analysis parameters for low latency. For example, if you decrease the number of frequency bands per octave to 12, and increase the minimum frequency to 200 Hz, the latency will be about 85 milliseconds for analysis only, and about 300 milliseconds for analysis + resynthesis, but this is still too much for a live effect.

Any constant-Q spectrum analysis involving low frequencies will inherently have rather high latency (at least for musically useful values of Q), because the lowest-frequency analysis filters will have narrow bandwidths, which lead to long impulse responses. Furthermore, the Gaborator uses symmetric Gaussian analysis filters that were chosen for properties such as linear phase and accurate reconstruction, not for low latency, so the latency will be higher than what might be achievable with a constant-Q filter bank specifically designed for low latency.
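The bandwidth/latency tradeoff can be made concrete with a rough estimate. The sketch below is not the Gaborator's actual support computation; the proportionality constants (the frequency-domain standard deviation as half the band spacing, and a three-sigma support) are illustrative assumptions, chosen only to land in the right ballpark:

```python
import math

def rough_analysis_latency(f_min_hz, bands_per_octave, n_sigma=3.0):
    """Crude latency estimate for a Gaussian constant-Q filter bank.

    Assumes the lowest band's frequency-domain standard deviation is
    about half the spacing to the next band, and that the filter's
    one-sided time-domain support is n_sigma time-standard-deviations.
    These constants are guesses, not the library's actual values.
    """
    spacing_hz = f_min_hz * (2 ** (1 / bands_per_octave) - 1)
    sigma_f = spacing_hz / 2                 # assumed frequency-domain std dev
    sigma_t = 1 / (2 * math.pi * sigma_f)    # Fourier uncertainty relation
    return n_sigma * sigma_t                 # seconds of one-sided support

# Parameters from the online demo: 48 bands/octave down to 20 Hz
print(f"{rough_analysis_latency(20, 48):.1f} s")   # on the order of seconds

# The lower-latency parameters: 12 bands/octave down to 200 Hz
print(f"{rough_analysis_latency(200, 12):.3f} s")  # on the order of tens of ms
```

Even this crude model reproduces the pattern described above: seconds of latency for narrow bands at 20 Hz, and tens of milliseconds once the minimum frequency is raised and the bands widened.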

The latency only affects causal applications, and arises from the need to wait for the arrival of future input samples needed to calculate the present output, and not from the time it takes to perform the calculations. In a non-causal application, such as applying an effect to a recording, the latency does not apply, and performance is limited only by the speed of the calculations. This can lead to the somewhat paradoxical situation that applying an effect to a live stream causes a latency of several seconds, but applying the same effect to an entire one-minute recording runs in a fraction of a second.

In analysis and visualization applications that don't need to perform resynthesis, it may be possible to partly hide the latency by taking advantage of the fact that the coefficients for the higher frequencies exhibit lower latency than those for low frequencies. For example, a live spectrogram display could update the high-frequency parts of the display before the corresponding low-frequency parts. Alternatively, low-frequency parts of the spectrogram could be drawn multiple times, effectively animating the display of the low-frequency coefficients as they converge to their final values.
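Because each octave's filters are twice as wide as those an octave below, the waiting time roughly halves per octave, which is what makes this progressive updating possible. A sketch of that scaling, using the same assumed constants as above (half-spacing standard deviation, three-sigma support; not the library's actual numbers):

```python
import math

BANDS_PER_OCTAVE = 48  # as in the online demo

def rough_band_latency(f_hz, n_sigma=3.0):
    # One-sided support of a Gaussian band, assuming its frequency-domain
    # std dev is half the spacing to the neighboring band (a guessed constant).
    sigma_f = f_hz * (2 ** (1 / BANDS_PER_OCTAVE) - 1) / 2
    return n_sigma / (2 * math.pi * sigma_f)

for f in (20, 40, 80, 160, 320):  # one frequency per octave
    print(f"{f:4d} Hz band: ~{rough_band_latency(f):.2f} s behind real time")
```

The 320 Hz coefficients settle an order of magnitude sooner than the 20 Hz ones, so the top of a live spectrogram can be drawn long before the bottom is final.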

Does it support small block sizes?

Yes, but there is a severe performance penalty. The Gaborator works most efficiently when the signal is processed in large blocks, preferably 2^17 samples or more, corresponding to several seconds of signal at typical audio sample rates. A real-time application aiming for low latency will want to use smaller blocks, for example 2^5 to 2^10 samples. The CPU time it takes to process such a small block is almost constant regardless of its size, so the total CPU time consumed rises in proportion to the inverse of the block size. For sufficiently small blocks, it will exceed the duration of the signal, at which point the system can no longer be considered real-time. For example, analyzing a 44.1 kHz audio stream on a 2.5 GHz Intel Core i5 CPU, this happens at block sizes below about 2^7 = 128 samples.
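This scaling can be sketched with a simple model. The per-block CPU cost below is an assumed constant, picked only so the break-even point lands near the ballpark figure mentioned above (about 2^7 samples at 44.1 kHz); it is not a measured value:

```python
SAMPLE_RATE = 44_100          # Hz
PER_BLOCK_CPU_S = 0.0029      # assumed near-constant CPU cost per small block
                              # (illustrative; roughly 128 samples' duration)

def is_real_time(block_size):
    """True if a block's duration exceeds the CPU time needed to process it."""
    block_duration = block_size / SAMPLE_RATE
    return block_duration > PER_BLOCK_CPU_S

for exp in range(5, 11):      # block sizes 2^5 .. 2^10
    n = 2 ** exp
    status = "real-time" if is_real_time(n) else "too slow"
    print(f"2^{exp:2d} = {n:4d} samples: {status}")
```

Because the cost per block is nearly flat, halving the block size doubles the CPU time per second of signal, and below the break-even size the processing can no longer keep up.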

Can it process a signal stream of any length?

Not in practice — the length is limited by floating point precision. At typical audio sample rates, roundoff errors start to become significant after some hours.

Does it avoid dynamic memory allocation in the audio processing path?

Currently, no — it dynamically allocates both the coefficient data structures and various temporary buffers.