Historians will look back at 2023 as the year text-to-music software made its big debut. AudioCipher published version 3.0 of the text-to-MIDI VST in January, followed by AI services like MusicLM, MusicGen and Chirp in May, June and July.
But what if a developer flipped the script and published an AI music-to-text application instead? AI music companies like Moises have explored this previously, but these were lyric-to-text transcription services, like karaoke or the closed captions that appear on a YouTube video.
The kind of music-to-text app I have in mind would need to listen to the music and generate written descriptions about the instrumental arrangement itself. That's exactly what a small team of developers from South Korea just accomplished this month.
The program, dubbed LP-MusicCaps, is available on Hugging Face for everyone to try. All you have to do is upload a music file and hit the submit button. Behind the scenes, a large language model operates in concert with Google's MusicCaps dataset to generate a list of suggestions about what it thinks the music might sound like.
You might be wondering: are there any practical use cases for software like this, or is it just a novelty? For readers who enjoy experimenting with text-to-music software, there's at least one entertaining use case we can think of: infinite music.
You can use LP-MusicCaps to generate descriptions of a song, and then plug those into other text-to-music apps like MusicGen. The adventurous readers among us can take it a step further and bounce between these programs endlessly, to create a near-infinite regression of AI music.
I've recorded and shared a walkthrough of that technique in the video above. For the remainder of this article, I'll try to explain why I think this process is valuable, including the historical precedent for creating infinite music with artificial intelligence.
LP-MusicCaps: Why Music-to-Text is useful
There are at least two ways to turn words into music, aside from writing lyrics. It just depends on what kind of concepts you're trying to work with.
Option 1 - Music Inspiration in the DAW: If the words or phrases you want to use are abstract, like a person's name or a location, AudioCipher uses music cryptograms to encode those words into MIDI melodies and chord progressions.
This can be a great way to break through creative block and get some quick inspiration. From there, you apply layers of sound design in the DAW and mess with the notes until they sound the way you want them to.
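To make the idea of a music cryptogram concrete, here is a minimal sketch in Python. It is not AudioCipher's actual algorithm. It assumes a simple cyclic cipher where the letters A through G keep their note names and every later letter wraps around (H becomes A, I becomes B, and so on). The function names and the choice of octave are hypothetical, picked only for illustration.

```python
# Hypothetical cyclic cipher for illustration only.
# This is NOT AudioCipher's actual encoding scheme.
NOTE_NAMES = "ABCDEFG"

# MIDI note numbers for the octave that starts at middle C (C4 = 60).
MIDI_BY_NAME = {"C": 60, "D": 62, "E": 64, "F": 65, "G": 67, "A": 69, "B": 71}


def text_to_notes(text):
    """Map each letter a-z to a note name, wrapping around after G."""
    notes = []
    for ch in text.lower():
        if "a" <= ch <= "z":  # skip spaces, digits and punctuation
            index = (ord(ch) - ord("a")) % len(NOTE_NAMES)
            notes.append(NOTE_NAMES[index])
    return notes


def text_to_midi(text):
    """Convert the note names into MIDI note numbers in a single octave."""
    return [MIDI_BY_NAME[name] for name in text_to_notes(text)]


if __name__ == "__main__":
    print(text_to_notes("Bach"))
    print(text_to_midi("Bach"))
```

The resulting list of MIDI numbers is just raw material. Rhythm, octave choices and chords are still up to you once the notes are in the DAW.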
Option 2 - Descriptions-to-music: If your goal is to create a complete song based on a written description of music, like the instruments or genre you're going for, apps like Riffusion, MusicLM or MusicGen are a second option.
I'm going to be focusing on the description-to-music use case in this article, because that's where LP-MusicCaps really shines.
The problem we're solving here is very common. A lot of us lack the vocabulary to describe a song with enough detail to retrieve a similar kind of song from an AI music generator.
A description of the mood, instruments and genre, like "upbeat rock song with vocals, guitar, bass and drums", might be the best we can do.
Instead of stressing ourselves out trying to come up with the right words to use, an AI music-to-text application can listen to reference music and write the prompts for us. Best of all, we can plug those same reference tracks into a text-to-music app like MusicGen, combined with the music description, and generate an entirely new track.
LP-MusicCaps + MusicGen: Creating an Infinite Song
At the top of this article, I shared a video explaining how to use LP-MusicCaps with MusicGen to create an infinite song. Here's a summary of that process.
Step 1. Head over to the LP-MusicCaps Hugging Face Demo page.
Step 2. Upload a music file. Most formats are accepted, including MP3, WAV and M4A.
Step 3. Copy a description from LP-MusicCaps.
Step 4. Open up Fffiloni's version of MusicGen on Hugging Face.
Step 5. Load the music file, then paste the LP-MusicCaps output into the text prompt.
Step 6. Extend the duration to 30 seconds and hit the submit button.
That's all there is to it. If you're on Chrome, you can click the three-dot menu on the audio player to download the file. Otherwise, right-click on the audio player itself and click "Save audio as".
Now here's where we get experimental. When you download the music from MusicGen and feed it back into LP-MusicCaps, you'll get a new and slightly different description every time. Run that generation and copy the text.
Clear MusicGen's input audio. Load the new MusicGen track as your input and paste the new description from LP-MusicCaps into the text prompt area. Hit submit and continue this process as long as you'd like. Cycling between music-to-text and text-to-music produces ongoing novelty.
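If you'd rather think about this cycle as code, the manual steps above boil down to a small feedback loop. The sketch below is only an outline under stated assumptions: `describe` and `generate` are hypothetical stand-ins for whatever you use to call LP-MusicCaps and MusicGen (neither is a real API from those projects), and the toy versions at the bottom just pass strings around so the loop can run on its own.

```python
from typing import Callable, List, Tuple, TypeVar

Audio = TypeVar("Audio")


def caption_generate_loop(
    seed_audio: Audio,
    describe: Callable[[Audio], str],
    generate: Callable[[str, Audio], Audio],
    rounds: int,
) -> List[Tuple[str, Audio]]:
    """Alternate music-to-text and text-to-music for a fixed number of rounds.

    describe: hypothetical wrapper that captions audio (e.g. via LP-MusicCaps).
    generate: hypothetical wrapper that turns a caption plus reference audio
              into a new track (e.g. via MusicGen).
    Returns the (caption, new_audio) pair produced in each round.
    """
    history = []
    audio = seed_audio
    for _ in range(rounds):
        caption = describe(audio)         # music-to-text
        audio = generate(caption, audio)  # text-to-music, conditioned on the last track
        history.append((caption, audio))
    return history


if __name__ == "__main__":
    # Toy stand-ins so the loop runs without any model or network access.
    def toy_describe(audio: str) -> str:
        return "description of " + audio

    def toy_generate(caption: str, audio: str) -> str:
        return "track from (" + caption + ")"

    for caption, track in caption_generate_loop("seed.mp3", toy_describe, toy_generate, 3):
        print(caption, "->", track)
```

Keeping every caption and track in a history list makes it easy to go back and pick the round you liked best, since each pass drifts a little further from the original reference.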
To produce truly infinite music with AudioCraft, you'll need to write some custom code using the library's generate_continuation method. Kevin at Collabage Patch helped us out and shared this code snippet:
output = model_continue.generate_continuation(prompt_waveform, prompt_sample_rate=sr, progress=True)
If you're writing your own code, this will allow you to automate extensions beyond the 30-second limit of Fffiloni's space and create truly infinite music. You can listen below for a 12-minute excerpt from Gary, the AudioCraft remix bot designed by Collabage Patch. Find more of their work on YouTube here.
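For anyone curious how a continuation call could be chained, here is a rough sketch of the surrounding loop. It is not Collabage Patch's code. It assumes you wrap the `generate_continuation` call shown above in your own function, called `continue_fn` here (a hypothetical name), which takes the most recent samples plus the sample rate and returns only the newly generated samples. Plain Python lists stand in for audio tensors so the example runs without AudioCraft installed.

```python
from typing import Callable, List, Sequence


def extend_track(
    samples: Sequence[float],
    sample_rate: int,
    continue_fn: Callable[[List[float], int], List[float]],
    rounds: int,
    context_seconds: float = 10.0,
) -> List[float]:
    """Grow a track by repeatedly continuing from its most recent audio.

    continue_fn is a hypothetical wrapper around a continuation model.
    Assumption: it returns ONLY the new samples. If your model returns the
    prompt followed by the new audio, trim the prompt inside the wrapper.
    """
    window = int(context_seconds * sample_rate)
    if window <= 0:
        raise ValueError("context_seconds * sample_rate must be at least 1 sample")

    track = list(samples)
    for _ in range(rounds):
        prompt = track[-window:]  # last few seconds of audio so far
        new_samples = continue_fn(prompt, sample_rate)
        track.extend(new_samples)  # append, then continue from the new tail
    return track


if __name__ == "__main__":
    # Toy continuation: count upward from the last sample of the prompt.
    def toy_continue(prompt: List[float], sr: int) -> List[float]:
        return [prompt[-1] + 1, prompt[-1] + 2]

    print(extend_track([0, 1, 2, 3], sample_rate=2, continue_fn=toy_continue,
                       rounds=3, context_seconds=1.0))
```

Prompting with only the last few seconds keeps each request short, at the cost of the model forgetting how the track began. Swap the fixed `rounds` count for a `while True` loop that writes audio to disk or a stream, and the track never has to end.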
3 Classic Examples of Infinite Songs in Music History
I was first introduced to the idea of an infinite song by the children's TV show Lamb Chop's Play-Along. There was an episode that closed with the song shown above.
The lyrics were: "This is the song that doesn't end, yes it goes on and on my friend. Some people started singing it not knowing what it was, and they will keep on singing it forever just because this is the song that never ends, yes it goes on and on my friend..."
Much later in life, while studying 20th-century classical music, I came across the experimental American composer John Cage. He had all sorts of weird ideas, including one piece titled As Slow as Possible. A church organ in Germany began performing it in 2001 and it's currently due to end in 2640. The next note will be played on February 5, 2024.
Centuries before John Cage, Johann Sebastian Bach wrote an infinite song called the Crab Canon. One person plays the melody from start to finish while the other plays the same line backward, from the last note to the first (like a crab-walk). When the end of the music is reached, each performer can turn around and continue. The video above shows how the sheet music can be folded into itself on a Möbius strip.
The underlying principles of infinite music, math, and computer science were detailed in the classic book Gödel, Escher, Bach, by Douglas Hofstadter.
Dadabots Pizzafire: Infinite AI Music Documentary
While we're exploring infinite audio, we have to touch on extreme AI musicians CJ Carr and Zack Zukowski of Dadabots. They have composed several of their own infinite songs that play 24/7 on YouTube.
Dadabots have also created infinite funk (No Soul), bass solos, and mathcore content too. Check out the full collection here on their YouTube channel.
We've been fans of Dadabots for years, which is why we were thrilled to learn of their brand new AI music documentary, Pizzafire, released earlier this week in August 2023.
The film goes into the history of their machine learning and generative music techniques, including easy-to-understand explanations for the non-technical viewers.
Watch the trailer below and click here to watch the full film on Ensemble.art.
Weird as all of this stuff may seem, Dadabots aren't alone in their infinite AI music experiments. Karen Allen's award-winning company Infinite Album aims to create never-ending, adaptive AI music for video games. Other companies like WarpSound have explored similar approaches to adaptive music for live Twitch streamers.
We're likely to see more examples of this in the coming years, as generative AI music reaches its crescendo. So don't be shy - carve out some time to experiment with recursive AI music and keep an open mind. You never know what might come from it!
---------------------
This post was written by Ezra Sandzer-Bell, founder of AudioCipher Technologies.