Note: This documentation is for an old version of the Gaborator. The current documentation is that of version 2.1.

Example 5: Synthesis from Scratch

Introduction

This example demonstrates how to synthesize a signal by creating spectrogram coefficients from scratch rather than by analyzing an existing signal. It creates a random pentatonic melody of decaying sine waves as spectrogram coefficients and then synthesizes audio from them.

Preamble

This example program takes a single command line argument, the name of the output file.

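#include <math.h>      // expf()
#include <stdint.h>    // int64_t
#include <stdlib.h>    // exit(), rand()
#include <string.h>    // memset()
#include <complex>
#include <iostream>
#include <vector>
#include <sndfile.h>
#include <gaborator/gaborator.h>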

int main(int argc, char **argv) {
    if (argc < 2) {
        std::cerr << "usage: synth output.wav\n";
        exit(1);
    }

Synthesis Parameters

Although this example does not perform any analysis, we nonetheless need to create an analyzer object, as it is used for both analysis and synthesis purposes. To generate the frequencies of the 12-note equal-tempered scale, we need 12 bands per octave; a multiple of 12 would also work, but here we don't need the added frequency resolution that would bring, and the time resolution would be worse.

To simplify converting MIDI note numbers to band numbers, we choose the frequency of MIDI note 0 as the reference frequency. This is approximately 8.18 Hz, which is below the 20 Hz lower limit of the bandpass filter bank and therefore outside its frequency range, but that doesn't matter.

    double fs = 44100;
    gaborator::parameters params(12, 20.0 / fs, 8.18 / fs);
    gaborator::analyzer<float> analyzer(params);
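
As a quick check of the 8.18 Hz figure, the following standalone sketch computes the frequency of a MIDI note. It is not part of the example program: the helper names midi_note_to_hz and hz_to_normalized are hypothetical, and the code assumes the common tuning in which MIDI note 69 (A4) is 440 Hz.

```cpp
#include <cmath>

// Frequency in Hz of a MIDI note in 12-note equal temperament.
// Assumption: MIDI note 69 (A4) is tuned to 440 Hz.
double midi_note_to_hz(int midi_note) {
    return 440.0 * std::pow(2.0, (midi_note - 69) / 12.0);
}

// Convert a frequency in Hz to a fraction of the sample rate, the unit
// used for the frequency arguments in the parameters above.
double hz_to_normalized(double hz, double fs) {
    return hz / fs;
}
```

Under that assumption, midi_note_to_hz(0) is about 8.176 Hz, which rounds to the 8.18 Hz used as the reference frequency.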

Melody Parameters

We will use the A minor pentatonic scale, which contains the following notes (using the MIDI note numbering):

    static int pentatonic[] = { 57, 60, 62, 64, 67 };
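
To see which pitches these numbers stand for, a small helper can turn a MIDI note number into a note name. This is an illustrative sketch only: midi_note_name is a hypothetical function, and it assumes the octave-numbering convention in which MIDI note 60 is called C4 (some software uses C3 instead).

```cpp
#include <string>

// Name of a non-negative MIDI note number.
// Assumption: MIDI note 60 is named C4; other conventions shift the octave.
std::string midi_note_name(int midi_note) {
    static const char *names[] = {
        "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"
    };
    int octave = midi_note / 12 - 1;
    return std::string(names[midi_note % 12]) + std::to_string(octave);
}
```

With this convention, the five notes above come out as A3, C4, D4, E4, and G4, the notes of the A minor pentatonic scale.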

The melody will consist of 64 notes, at a tempo of 120 beats per minute:

    int n_notes = 64;
    double tempo = 120.0;
    double beat_duration = 60.0 / tempo;

The variable volume determines the amplitude of each note, and has been chosen such that there will be no clipping of the final output.

    float volume = 0.2;

Composition

We start with an empty coefficient set:

    gaborator::coefs<float> coefs(analyzer);

Each note is chosen randomly from the pentatonic scale and added to the coefficient set by calling the function fill(). The fill() function is similar to the process() function used in previous examples, except that it can be used to create new coefficients rather than just modifying existing ones.

Each note is created by calling fill() on a region of the time-frequency plane that covers a single band in the frequency dimension and the duration of the note, three seconds, in the time dimension. To each coefficient within this region, we add a complex number whose magnitude decays exponentially over time, like the amplitude of a plucked string. The phase is arbitrarily set to zero by using an imaginary part of zero. Since notes can overlap, the new values are added to any existing coefficients using the += operator rather than overwriting them.

Note that band numbers increase towards lower frequencies but MIDI note numbers increase towards higher frequencies, hence the minus sign in front of midi_note.

    for (int i = 0; i < n_notes; i++) {
        int midi_note = pentatonic[rand() % 5];
        double note_start_time = beat_duration * i;
        double note_end_time = note_start_time + 3.0;
        int band = analyzer.band_ref() - midi_note;
        fill([&](int, int64_t t, std::complex<float> &coef) {
                float amplitude =
                    volume * expf(-2.0f * (float)(t / fs - note_start_time));
                coef += std::complex<float>(amplitude, 0.0f);
            },
            band, band + 1,
            note_start_time * fs, note_end_time * fs,
            coefs);
    }
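
The envelope arithmetic in the loop above can be explored separately from the library. The sketch below uses two hypothetical helpers, note_envelope and peak_envelope_sum, to estimate how large the sum of overlapping note envelopes can get. It assumes, as a simplification, that every sounding note contributes its full coefficient magnitude at the same instant; it is not a statement about how the library scales its output.

```cpp
#include <cmath>

// Envelope of one note `elapsed` seconds after its start, mirroring the
// expression volume * exp(-2 * elapsed) used in the fill() callback.
double note_envelope(double volume, double elapsed) {
    return volume * std::exp(-2.0 * elapsed);
}

// Sum of the envelopes of all notes sounding at the instant a new note
// starts, when one note starts every beat_duration seconds and each
// lasts note_duration seconds.
double peak_envelope_sum(double volume, double beat_duration,
                         double note_duration) {
    double sum = 0.0;
    // The note started k beats ago still sounds while
    // k * beat_duration < note_duration.
    for (int k = 0; k * beat_duration < note_duration; k++)
        sum += note_envelope(volume, k * beat_duration);
    return sum;
}
```

With the values used in this example (volume 0.2, one note every 0.5 seconds, each lasting 3 seconds), six notes overlap and the sum is roughly 0.32, which is consistent with the choice of volume leaving headroom below full scale.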

Synthesis

We can now synthesize audio from the coefficients by calling synthesize(). Audio will be generated starting half a second before the first note to allow for the pre-ringing of the synthesis filter, and ending a few seconds after the last note to allow for its decay.

    double audio_start_time = -0.5;
    double audio_end_time = beat_duration * n_notes + 5.0;
    int64_t start_frame = audio_start_time * fs;
    int64_t end_frame = audio_end_time * fs;
    size_t n_frames = end_frame - start_frame;
    std::vector<float> audio(n_frames);
    analyzer.synthesize(coefs, start_frame, end_frame, audio.data());
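
The frame arithmetic above can be checked in isolation. The following sketch reproduces it with a hypothetical frame_range struct and two hypothetical helpers; like the example code, it truncates toward zero when converting seconds to frames.

```cpp
#include <stddef.h>
#include <stdint.h>

// Half-open range of sample frames [start, end).
struct frame_range {
    int64_t start;
    int64_t end;
};

// Convert a time range in seconds to frames at sample rate fs,
// truncating toward zero as the implicit conversions above do.
frame_range seconds_to_frames(double start_time, double end_time, double fs) {
    return { (int64_t)(start_time * fs), (int64_t)(end_time * fs) };
}

// Number of frames in the range; assumes r.end >= r.start.
size_t frame_count(const frame_range &r) {
    return (size_t)(r.end - r.start);
}
```

For this example, the range runs from -0.5 seconds to 0.5 * 64 + 5 = 37 seconds, so at 44100 Hz it spans frames -22050 to 1631700, for a total of 1653750 frames or 37.5 seconds of audio.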

Writing the Audio

Since there is no input audio file to inherit a file format from, we need to choose a file format for the output file by filling in the sfinfo structure:

    SF_INFO sfinfo;
    memset(&sfinfo, 0, sizeof(sfinfo));
    sfinfo.samplerate = fs;
    sfinfo.channels = 1;
    sfinfo.format = SF_FORMAT_WAV | SF_FORMAT_PCM_16;

The rest is identical to Example 2:

    SNDFILE *sf_out = sf_open(argv[1], SFM_WRITE, &sfinfo);
    if (! sf_out) {
        std::cerr << "could not open output audio file: "
            << sf_strerror(sf_out) << "\n";
        exit(1);
    }
    sf_command(sf_out, SFC_SET_CLIPPING, NULL, SF_TRUE);
    sf_count_t n_written = sf_writef_float(sf_out, audio.data(), n_frames);
    if (n_written != (sf_count_t)n_frames) {
        std::cerr << "write error\n";
        exit(1);
    }
    sf_close(sf_out);
    return 0;
}

Compiling

Like Example 1, this example can be built using a one-line build command:

c++ -std=c++11 -I.. -O3 -ffast-math `pkg-config --cflags sndfile` synth.cc `pkg-config --libs sndfile` -o synth

Or using the vDSP FFT on macOS:

c++ -std=c++11 -I.. -O3 -ffast-math -DGABORATOR_USE_VDSP `pkg-config --cflags sndfile` synth.cc `pkg-config --libs sndfile` -framework Accelerate -o synth

Or using PFFFT (see Example 1 for how to download and build PFFFT):

c++ -std=c++11 -I.. -Ipffft -O3 -ffast-math -DGABORATOR_USE_PFFFT `pkg-config --cflags sndfile` synth.cc pffft/pffft.o pffft/fftpack.o `pkg-config --libs sndfile` -o synth

Running

The example program can be run using the command:

./synth melody.wav

The resulting audio will be in melody.wav.