Generating Audio

Music Generation

This chapter and the next discuss state-of-the-art techniques for audio generation. One approach, which is surprisingly effective, translates the problem into the visual domain by computing spectrograms from sound data and treating them as images. Another approach treats audio generation as a sequence modeling problem, which lends itself naturally to the use of transformers.
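To make the "audio as an image" idea concrete, here is a minimal sketch of how a waveform can be turned into a 2D magnitude spectrogram using only NumPy. The function name `magnitude_spectrogram`, the frame size, the hop length, the 16 kHz sample rate, and the synthetic 440 Hz test tone are all assumptions chosen for illustration; real pipelines typically rely on an audio library and often apply a mel scale afterwards.

```python
import numpy as np


def magnitude_spectrogram(signal, frame_size=1024, hop=256):
    """Return a (freq_bins, frames) array of short-time Fourier magnitudes.

    Hypothetical helper for illustration: slices the signal into
    overlapping frames, applies a Hann window, and takes the FFT of each.
    """
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_size] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    return np.abs(np.fft.rfft(frames, axis=1)).T


# Assumed setup: one second of a pure 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

spec = magnitude_spectrogram(tone)

# Each row is a frequency bin and each column a time frame, so the array
# can be handled like a single-channel image.
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * sample_rate / 1024
print(spec.shape, peak_hz)
```

The resulting array has frequency on one axis and time on the other, which is what allows image models to be applied to it. The strongest row sits at the bin closest to 440 Hz, within the resolution allowed by the frame size.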

Brief introduction to audio data
    Spectrograms, mel, etc.
Generating music with Jukebox
    How it works: VQ-VAE + transformers
Summary