This example demonstrates how to synthesize a signal by creating spectrogram coefficients from scratch rather than by analyzing an existing signal. It creates a random pentatonic melody of decaying sine waves as spectrogram coefficients and then synthesizes audio from them.
This example program takes a single command line argument, the name of the output file.
#include <memory.h>
#include <iostream>
#include <sndfile.h>
#include <gaborator/gaborator.h>

int main(int argc, char **argv) {
    if (argc < 2) {
        std::cerr << "usage: synth output.wav\n";
        exit(1);
    }
Although this example does not perform any analysis, we nonetheless need to create an analyzer object, as it is used for both analysis and synthesis purposes. To generate the frequencies of the 12-note equal-tempered scale, we need 12 bands per octave. A multiple of 12 would also work, but here we don't need the added frequency resolution that would bring, and the time resolution would be worse.
To simplify converting MIDI note numbers to band numbers, we choose the frequency of MIDI note 0 as the reference frequency. This is approximately 8.18 Hz, which lies below the 20 Hz lower edge of the bandpass filter bank, but that doesn't matter.
double fs = 44100;
gaborator::log_fq_scale scale(12, 20.0 / fs, 8.18 / fs);
gaborator::parameters params(scale);
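The 8.18 Hz figure can be checked with the usual equal-tempered tuning formula. The following is a minimal sketch that assumes the common convention of A4 = MIDI note 69 = 440 Hz; the helper name `midi_note_to_hz` is hypothetical and is not part of the Gaborator API.

```cpp
#include <cassert>
#include <cmath>

// Frequency in Hz of a MIDI note in 12-tone equal temperament,
// assuming the common A4 = MIDI note 69 = 440 Hz tuning.
// (Hypothetical helper; not part of the Gaborator API.)
double midi_note_to_hz(int midi_note) {
    assert(midi_note >= 0 && midi_note <= 127);
    return 440.0 * std::pow(2.0, (midi_note - 69) / 12.0);
}
```

Under that assumption, `midi_note_to_hz(0)` is about 8.18 Hz, and `midi_note_to_hz(57)`, the lowest note of the pentatonic scale used in this example, is 220 Hz.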
As we create new complex coefficients from scratch, we need to be careful about their phases. For any given frequency, the portions of the output signal contributed by the coefficients at different points in time need to be in phase so that they combine constructively into a single (co)sine wave rather than interfering destructively. The easiest way to achieve this is to select the global phase convention in the synthesis parameters. With this setting, the coefficient phases will be interpreted relative to the common reference point of t=0 rather than the time of each individual coefficient:
params.phase = gaborator::coef_phase::global;
gaborator::analyzer<float> analyzer(params);
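To see why a common phase reference matters, consider two equal-amplitude cosines at the same frequency, each given a nominal phase of zero. The sketch below is a generic illustration of constructive versus destructive interference, not a model of the library's internals; the function `summed_peak` and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cmath>

// Peak magnitude of the sum of two unit-amplitude cosines at frequency
// freq_hz, both given a nominal phase of zero. The second one belongs
// to a (hypothetical) coefficient centered at time t1 seconds.
//   global_phase == true:  phase is measured from t = 0 for both.
//   global_phase == false: phase is measured from each one's own time.
double summed_peak(double freq_hz, double t1, bool global_phase) {
    const double two_pi = 6.283185307179586;
    double ref1 = global_phase ? 0.0 : t1;
    double peak = 0.0;
    const int steps = 1000;
    for (int i = 0; i < steps; i++) {
        double t = i / (freq_hz * steps);  // scan one full period
        double sum = std::cos(two_pi * freq_hz * t) +
                     std::cos(two_pi * freq_hz * (t - ref1));
        peak = std::max(peak, std::fabs(sum));
    }
    return peak;
}
```

With a 220 Hz tone and `t1` set to half a period (1/440 s), the shared reference gives a peak of 2 (the two contributions add), while the per-coefficient reference gives a peak of essentially zero (they cancel). Intermediate values of `t1` give partial cancellation.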
We will use the A minor pentatonic scale, which contains the following notes (using the MIDI note numbering):
static int pentatonic[] = { 57, 60, 62, 64, 67 };
The melody will consist of 64 notes, at a tempo of 120 beats per minute:
int n_notes = 64;
double tempo = 120.0;
double beat_duration = 60.0 / tempo;
The variable volume determines the amplitude of each note, and has been chosen such that there will be no clipping of the final output.
float volume = 0.2;
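A rough way to sanity-check the headroom is to add up the envelopes of all notes that can sound at once. The sketch below assumes that a coefficient of magnitude `volume` yields an output sine of roughly that peak amplitude, which this example does not state explicitly, and it ignores the smoothing of note onsets by the synthesis filters. The helper `worst_case_peak` is hypothetical; the decay rate, note length and beat duration are the ones used elsewhere in this example.

```cpp
#include <cmath>

// Upper bound on the summed envelope of overlapping notes, assuming
// each note contributes an output sine of peak amplitude `volume`
// decaying as exp(-decay_rate * age) and lasting note_duration seconds,
// with a new note starting every beat_duration seconds.
// (Hypothetical helper for illustration only.)
double worst_case_peak(double volume, double decay_rate,
                       double note_duration, double beat_duration) {
    double peak = 0.0;
    // Sum the envelopes of all notes still sounding at the instant a
    // new note starts, which is when the total is largest.
    for (double age = 0.0; age < note_duration; age += beat_duration)
        peak += volume * std::exp(-decay_rate * age);
    return peak;
}
```

For `worst_case_peak(0.2, 2.0, 3.0, 0.5)`, six notes overlap and the bound comes out at roughly 0.32, comfortably below the full-scale value of 1.0.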
We start with an empty coefficient set:
gaborator::coefs<float> coefs(analyzer);
Each note is chosen randomly from the pentatonic scale and added to the coefficient set by calling the function fill(). The fill() function is similar to the process() function used in previous examples, except that it can be used to create new coefficients rather than just modifying existing ones.
Each note is created by calling fill() on a region of the time-frequency plane that covers a single band in the frequency dimension and the duration of the note in the time dimension. Each coefficient within this region is set to a complex number whose magnitude decays exponentially over time, like the amplitude of a plucked string. The phase is arbitrarily set to zero by using an imaginary part of zero. Since notes can overlap, the new coefficients are added to any existing ones using the += operator rather than overwriting them.
Note that band numbers increase towards lower frequencies but MIDI note numbers increase towards higher frequencies, hence the minus sign in front of midi_note.
for (int i = 0; i < n_notes; i++) {
    int midi_note = pentatonic[rand() % 5];
    double note_start_time = beat_duration * i;
    double note_end_time = note_start_time + 3.0;
    int band = analyzer.band_ref() - midi_note;
    fill([&](int, int64_t t, std::complex<float> &coef) {
            float amplitude =
                volume * expf(-2.0f * (float)(t / fs - note_start_time));
            coef += std::complex<float>(amplitude, 0.0f);
        },
        band, band + 1,
        note_start_time * fs, note_end_time * fs,
        coefs);
}
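The band arithmetic and the decay envelope can be examined in isolation. The following is a minimal sketch using plain arithmetic only; both helpers are hypothetical, and `band_ref` stands in for whatever value `analyzer.band_ref()` returns for a given set of parameters.

```cpp
#include <cassert>
#include <cmath>

// Center frequency in Hz of a band on a logarithmic scale where band
// numbers grow toward LOWER frequencies and band_ref sits at ref_hz.
// (Hypothetical helper; band_ref stands in for analyzer.band_ref().)
double band_center_hz(int band, int band_ref, double ref_hz,
                      int bands_per_octave) {
    assert(bands_per_octave > 0);
    return ref_hz * std::pow(2.0, double(band_ref - band) / bands_per_octave);
}

// Envelope of one note: `volume` at onset, then exponential decay at
// decay_rate per second, matching the expression in the lambda above.
float note_amplitude(float volume, float decay_rate, float age_seconds) {
    return volume * std::exp(-decay_rate * age_seconds);
}
```

With a made-up `band_ref` of 100 and a reference frequency of 8.1758 Hz, MIDI note 57 maps to band 43, whose center is about 220 Hz. At the end of the 3 second note duration, `note_amplitude(0.2f, 2.0f, 3.0f)` has fallen to about 0.0005, roughly 52 dB below its onset value, so cutting the note off there should be barely audible.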
We can now synthesize audio from the coefficients by calling synthesize(). Audio will be generated starting half a second before the first note to allow for pre-ringing of the synthesis filters, and ending a few seconds after the last note to give the note time to decay.
double audio_start_time = -0.5;
double audio_end_time = beat_duration * n_notes + 5.0;
int64_t start_frame = audio_start_time * fs;
int64_t end_frame = audio_end_time * fs;
size_t n_frames = end_frame - start_frame;
std::vector<float> audio(n_frames);
analyzer.synthesize(coefs, start_frame, end_frame, audio.data());
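As a worked check of the frame arithmetic, the sketch below converts a time interval to frame numbers in the same way as the code above. It treats the range as half-open, as implied by `n_frames = end_frame - start_frame`; the `frame_range` type and the `seconds_to_frames` function are hypothetical names introduced for illustration.

```cpp
#include <cstdint>

// A range of sample frames [start, end); start may be negative.
struct frame_range {
    int64_t start;
    int64_t end;
};

// Convert a time interval in seconds to frame numbers, truncating
// toward zero like the implicit double-to-int64_t conversions in the
// example. (Hypothetical helper; not part of the Gaborator API.)
frame_range seconds_to_frames(double start_s, double end_s, double fs) {
    frame_range r;
    r.start = static_cast<int64_t>(start_s * fs);
    r.end = static_cast<int64_t>(end_s * fs);
    return r;
}
```

For this example, 64 notes at 0.5 seconds per beat plus 5 seconds of tail give an end time of 37 seconds, so `seconds_to_frames(-0.5, 37.0, 44100.0)` yields frames -22050 to 1631700, or 1653750 frames (37.5 seconds) of audio in total.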
Since there is no input audio file to inherit a file format from, we need to choose a file format for the output file by filling in the sfinfo structure:
SF_INFO sfinfo;
memset(&sfinfo, 0, sizeof(sfinfo));
sfinfo.samplerate = fs;
sfinfo.channels = 1;
sfinfo.format = SF_FORMAT_WAV | SF_FORMAT_PCM_16;
The rest is identical to Example 2:
SNDFILE *sf_out = sf_open(argv[1], SFM_WRITE, &sfinfo);
if (! sf_out) {
    std::cerr << "could not open output audio file: "
        << sf_strerror(sf_out) << "\n";
    exit(1);
}
sf_command(sf_out, SFC_SET_CLIPPING, NULL, SF_TRUE);
sf_count_t n_written = sf_writef_float(sf_out, audio.data(), n_frames);
if (n_written != n_frames) {
    std::cerr << "write error\n";
    exit(1);
}
sf_close(sf_out);
return 0;
}
Like Example 1, this example can be built using a one-line build command:
c++ -std=c++11 -I.. -O3 -ffast-math $(pkg-config --cflags sndfile) synth.cc $(pkg-config --libs sndfile) -o synth
Or using the vDSP FFT on macOS:
c++ -std=c++11 -I.. -O3 -ffast-math -DGABORATOR_USE_VDSP $(pkg-config --cflags sndfile) synth.cc $(pkg-config --libs sndfile) -framework Accelerate -o synth
Or using PFFFT (see Example 1 for how to download and build PFFFT):
c++ -std=c++11 -I.. -Ipffft -O3 -ffast-math -DGABORATOR_USE_PFFFT $(pkg-config --cflags sndfile) synth.cc pffft/pffft.o pffft/fftpack.o $(pkg-config --libs sndfile) -o synth
The example program can be run using the command:
./synth melody.wav
The resulting audio will be in melody.wav.