Several people have asked whether the Gaborator is suitable for real-time applications. There is no simple yes or no answer to this question, because there are many different definitions of "real-time", and the answer will depend on the definition. Below are some answers to the question "is it real-time?" rephrased in terms of different definitions.
Yes. For example, at 48 frequency bands per octave, a single core of a 2.5 GHz Intel Core i5 CPU can analyze some 10 million samples per second, which is more than 200 times faster than real time for a single channel of 44.1 kHz audio.
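As a rough way to reproduce this kind of figure on your own hardware, the sketch below times a single large-block analysis using the documented parameters/analyzer/coefs/analyze() interface; the 48 bands per octave and 20 Hz minimum frequency mirror the example above, and the numbers it prints are of course machine-dependent.

```
#include <gaborator/gaborator.h>
#include <chrono>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <vector>

int main() {
    const double fs = 44100.0;
    // 48 bands per octave, 20 Hz minimum frequency
    // (frequencies are given as fractions of the sample rate).
    gaborator::parameters params(48, 20.0 / fs);
    gaborator::analyzer<float> analyzer(params);
    gaborator::coefs<float> coefs(analyzer);

    // Ten seconds of a 440 Hz test tone is enough for a rough speed figure.
    std::vector<float> signal(10 * 44100);
    for (size_t i = 0; i < signal.size(); i++)
        signal[i] = (float)std::sin(6.283185307179586 * 440.0 * i / fs);

    auto t0 = std::chrono::steady_clock::now();
    analyzer.analyze(signal.data(), 0, (int64_t)signal.size(), coefs);
    auto t1 = std::chrono::steady_clock::now();

    double elapsed = std::chrono::duration<double>(t1 - t0).count();
    double sps = signal.size() / elapsed;
    std::printf("%.2e samples/s, %.0fx real time at %.0f Hz\n",
                sps, sps / fs, fs);
    return 0;
}
```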
Yes. See the streaming example.
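The full details are in the streaming example that ships with the library; the following is a condensed sketch of the same pattern, assuming the documented analyze() and forget_before() calls. read_block() and use_coefficients() are hypothetical application hooks, not Gaborator functions.

```
#include <gaborator/gaborator.h>
#include <cstdint>
#include <vector>

// Hypothetical application hooks, not part of the Gaborator:
bool read_block(float *buf, int64_t n);             // fill buf with n samples, false at end of stream
void use_coefficients(gaborator::coefs<float> &c);  // e.g. update a display

void stream_analysis(double fs) {
    gaborator::parameters params(48, 20.0 / fs);
    gaborator::analyzer<float> analyzer(params);
    gaborator::coefs<float> coefs(analyzer);

    const int64_t block_size = 1 << 14;
    std::vector<float> block(block_size);
    int64_t t = 0;                       // sample time of the next input sample

    while (read_block(block.data(), block_size)) {
        analyzer.analyze(block.data(), t, t + block_size, coefs);
        t += block_size;
        use_coefficients(coefs);

        // In a pure analysis application, coefficients more than one analysis
        // filter support in the past can no longer change and are no longer
        // needed, so let the library discard them to keep memory use bounded.
        gaborator::forget_before(analyzer, coefs,
                                 t - (int64_t)analyzer.analysis_support());
    }
}
```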
Probably not low enough for applications such as live musical effects. The exact latency depends on factors such as the frequency range and number of bands per octave, but tends to range between "high" and "very high". For example, with the parameters used in the online demo, 48 frequency bands per octave down to 20 Hz, the latency of the analysis side alone is some 3.5 seconds, and if you do analysis followed by resynthesis, the total latency will be almost 13 seconds.
This can be reduced by choosing the analysis parameters for low latency. For example, if you decrease the number of frequency bands per octave to 12, and increase the minimum frequency to 200 Hz, the latency will be about 85 milliseconds for analysis only, and about 300 milliseconds for analysis + resynthesis, but this is still too much for a live effect.
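These figures can be estimated programmatically for any parameter set. The sketch below assumes that analysis_support() and synthesis_support() return the one-sided time-domain support of the analysis and reconstruction filters in samples, as described in the reference documentation, and treats their sum as an estimate of the analysis + resynthesis latency.

```
#include <gaborator/gaborator.h>
#include <cstdio>

int main() {
    const double fs = 44100.0;
    // The low-latency choice from the text: 12 bands per octave, 200 Hz minimum.
    gaborator::parameters params(12, 200.0 / fs);
    gaborator::analyzer<float> analyzer(params);

    double a = analyzer.analysis_support();    // one-sided support, in samples
    double s = analyzer.synthesis_support();
    std::printf("analysis latency: ~%.0f ms\n", 1000.0 * a / fs);
    std::printf("analysis + resynthesis latency: ~%.0f ms\n", 1000.0 * (a + s) / fs);
    return 0;
}
```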
Any constant-Q spectrum analysis involving low frequencies will inherently have rather high latency (at least for musically useful values of Q), because the lowest-frequency analysis filters will have narrow bandwidths, which lead to long impulse responses. Furthermore, the Gaborator uses symmetric Gaussian analysis filters that were chosen for properties such as linear phase and accurate reconstruction, not for low latency, so the latency will be higher than what might be achievable with a constant-Q filter bank specifically designed for low latency.
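As a back-of-envelope consistency check (a rough estimate that assumes each filter's bandwidth is on the order of the band spacing, not the exact formula the library uses), the lowest band of the 48-bands-per-octave, 20 Hz example works out to

$$Q \approx \frac{1}{2^{1/48}-1} \approx 69,\qquad \Delta f \approx \frac{20\ \mathrm{Hz}}{69} \approx 0.29\ \mathrm{Hz},\qquad T \sim \frac{1}{\Delta f} \approx 3.4\ \mathrm{s},$$

which is of the same order as the 3.5-second analysis latency quoted above.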
The latency only affects causal applications: it arises from the need to wait for future input samples that are needed to calculate the present output, not from the time it takes to perform the calculations. In a non-causal application, such as applying an effect to a recording, there is no such latency, and performance is limited only by the speed of the calculations. This can lead to the somewhat paradoxical situation that applying an effect to a live stream incurs a latency of several seconds, while applying the same effect to an entire one-minute recording runs in a fraction of a second.
In analysis and visualization applications that don't need to perform resynthesis, it is possible to partly hide the latency by taking advantage of the fact that the coefficients for the higher frequencies exhibit lower latency than those for low frequencies. For example, a live spectrogram display could update the high-frequency parts of the display before the corresponding low-frequency parts. Alternatively, low-frequency parts of the spectrogram may be drawn multiple times, effectively animating the display of the low-frequency coefficients as they converge to their final values. This approach can be seen in action in the Spectrolite iOS app.
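A minimal sketch of that progressive-update idea follows. None of it is Gaborator API: draw_cell(), the per-band latency estimates, and the display hop size are hypothetical application-side choices.

```
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical renderer: draw (or redraw) the spectrogram cell for the given
// band and time; "final_value" means its coefficient can no longer change.
void draw_cell(int band, int64_t t, bool final_value);

// Called after input up to sample time t_in has been analyzed.
// latency[b] is a per-band latency estimate in samples (larger for lower
// bands); hop is the display's time step in samples; final_up_to[b] tracks
// how far each band has already been drawn in final form (initially 0).
void update_display(int64_t t_in, int64_t hop,
                    const std::vector<int64_t> &latency,
                    std::vector<int64_t> &final_up_to)
{
    for (size_t b = 0; b < latency.size(); b++) {
        // Cells older than t_in - latency[b] can no longer change.
        int64_t settled = std::max<int64_t>(0, t_in - latency[b]);
        for (int64_t t = final_up_to[b]; t + hop <= settled; t += hop)
            draw_cell((int)b, t, true);
        final_up_to[b] = std::max(final_up_to[b], settled - settled % hop);

        // Newer cells are drawn provisionally and redrawn on later calls,
        // which animates the low-frequency coefficients as they converge.
        for (int64_t t = final_up_to[b]; t < t_in; t += hop)
            draw_cell((int)b, t, false);
    }
}
```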
Yes, but there is a significant performance penalty. The Gaborator works most efficiently when the signal is processed in large blocks, preferably 2^17 samples or more, corresponding to several seconds of signal at typical audio sample rates.
A real-time application aiming for low latency will want to use smaller blocks, such as 2^5 to 2^10 samples, and processing these is significantly slower. For example, as of version 1.4, analyzing a signal in blocks of 2^10 samples takes roughly five times as much CPU as analyzing it in blocks of 2^20 samples.
For very small blocks, the processing time will exceed the duration of the signal, at which point the system can no longer be considered real-time. For example, analyzing a 48 kHz audio stream on a 2.5 GHz Intel Core i5 CPU, this happens at block sizes below about 2^4 = 16 samples.
The resynthesis code is currently less optimized for small block sizes than the analysis code, so the performance penalty for resynthesizing small blocks is even greater than for analyzing small blocks.
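The penalty can be measured with a sketch along these lines, which analyzes the same amount of signal in 2^20-sample and 2^10-sample blocks and compares the CPU time. The exact ratio will vary with the machine, the parameters, and the library version; forget_before() is used to keep memory bounded, as a streaming application would.

```
#include <gaborator/gaborator.h>
#include <chrono>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <vector>

static double time_blocked(int64_t total, int64_t block_size) {
    const double fs = 44100.0;
    gaborator::parameters params(48, 20.0 / fs);
    gaborator::analyzer<float> analyzer(params);
    gaborator::coefs<float> coefs(analyzer);

    // A 440 Hz test tone, reused for every block.
    std::vector<float> block(block_size);
    for (int64_t i = 0; i < block_size; i++)
        block[i] = (float)std::sin(6.283185307179586 * 440.0 * i / fs);

    auto start = std::chrono::steady_clock::now();
    for (int64_t t = 0; t < total; t += block_size) {
        analyzer.analyze(block.data(), t, t + block_size, coefs);
        // Discard coefficients that can no longer change, as a streaming
        // application would, so that memory use stays bounded.
        gaborator::forget_before(analyzer, coefs,
                                 t + block_size - (int64_t)analyzer.analysis_support());
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

int main() {
    const int64_t total = int64_t(1) << 22;   // about 95 seconds of 44.1 kHz audio
    double big = time_blocked(total, int64_t(1) << 20);
    double small = time_blocked(total, int64_t(1) << 10);
    std::printf("2^20-sample blocks: %.2f s; 2^10-sample blocks: %.2f s (%.1fx slower)\n",
                big, small, small / big);
    return 0;
}
```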
As of version 2, the length of a stream is limited only by the 64-bit signed integers used to represent sample times, allowing for more than a million years of 192 kHz audio both before and after t=0. This assumes the local phase convention, which is the default in version 2. When using the global phase convention (as in version 1), the floating-point precision of the phase values used in internal calculations decreases as the distance from t=0 increases, causing a noticeable drop in the signal-to-noise ratio after some hours.
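For scale, the back-of-envelope arithmetic behind the million-years figure:

$$\frac{2^{63}\ \text{samples}}{192\,000\ \text{samples/s}} \approx 4.8 \times 10^{13}\ \mathrm{s} \approx 1.5\ \text{million years in each direction from}\ t = 0.$$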
Currently, no — it dynamically allocates both the coefficient data structures and various temporary buffers.