By now you're probably already familiar with OpenAI's tool ChatGPT.
As a text-to-MIDI DAW plugin company, AudioCipher was built for music producers to break through creative block. ChatGPT can be used in a similar way, to create chord progressions, melodies, lyrics, tablature and even machine-level code. However, you'll have less control over MIDI output than with AudioCipher.
All you need to get started is a free OpenAI account. The solutions outlined here will run in your web browser instead of a DAW, and they are powered by neural networks instead of musical cryptograms.
Be aware, if you're expecting something comparable to other text-to-music and lyrics-to-song generators, you'll need to adjust your expectations. ChatGPT does not generate audio files or singing vocals.
So what can ChatGPT do, musically speaking? Let's find out...
The first wave of ChatGPT music experiments
The first wave of ChatGPT music came from social media influencers. They were releasing meme videos of famous rappers using Uberduck for vocal impersonation. ChatGPT was only being used to generate lyrics at this stage. The technique was mostly for fun and parody - nobody mistook the music for the actual artists. Here's how the method works:
Come up with a lyric prompt for ChatGPT, like "write a lyrical verse in the style of [artist] about [topic]"
Find a section of the lyric output that you like and plug it into Uberduck
Export the audio from Uberduck and bring it into your DAW
Use an autotune plugin to apply a melody. If you're looking for a free option, try GVST's GSnap VST.
Since then, a great number of AI voice generator services have been released, specifically for musicians to use. Uberduck is still the most popular for celebrity voices, but other apps provide better support for adding melodies to the voices.
As time went on, people began publishing additional experiments with ChatGPT music prompts focused on chords, melodies, and MIDI generation.
WavTool: An AI DAW powered by GPT4
WavTool is a browser-based AI DAW with an AI assistant that uses GPT4 to understand your music requests and take actions in the workstation. It can generate MIDI melodies, along with many other music production tasks.
So if you want a tool that uses GPT and creates actual music, cut to the chase and try WavTool first. Just be aware that this is not one of the free OpenAI ChatGPT methods. It's a third-party product.
Some tips on prompting the GPT service in WavTool:
You'll find that vague prompts like "create a catchy melody" tend to produce underwhelming results, like a C major scale that repeats twice. The more specific you get with your prompts, the better your results will be. Here are some prompt components that I recommend using (or not using) to improve your outcomes:
Request "a combination of quarter, eighth, and sixteenth notes" when asking for a melody. It understands these instructions and will usually comply.
Tell it what scale or key signature you want it to use, if that matters to you.
You can ask it for specific chord progressions using roman numerals.
Avoid asking for a "chord progression in <key signature>" because it will usually create an ascending line of chords in that key (i.e. I - ii - iii - IV etc).
You can try asking for chord progressions that fit a genre's style, like a jazzy chord progression or a punk rock progression. Expect mixed results.
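As a rough illustration of the tips above, here is a minimal sketch that assembles a specific prompt from its components. The function name `buildMelodyPrompt` and the exact wording of the prompt are hypothetical, not something WavTool requires.

```javascript
// Hypothetical helper: combines a key, a roman numeral progression,
// and a list of note values into one specific melody prompt.
function buildMelodyPrompt({ key, progression, noteValues }) {
  const parts = [`Write a melody in ${key}`];
  if (progression) {
    parts.push(`over the chord progression ${progression}`);
  }
  if (noteValues && noteValues.length > 0) {
    parts.push(`using a combination of ${noteValues.join(', ')} notes`);
  }
  return parts.join(' ') + '.';
}

const prompt = buildMelodyPrompt({
  key: 'A minor',
  progression: 'i - VI - III - VII',
  noteValues: ['quarter', 'eighth', 'sixteenth'],
});

console.log(prompt);
```

Keeping the components separate makes it easy to change one thing at a time (the key, the rhythm, the progression) and compare the results.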
If you don't want to sign up for WavTool then you can use ChatGPT to run your experiments within the OpenAI system. Let's move on to that next.
ChatGPT Extensions: SongMaker, Music Composer
There are several ChatGPT music extensions available as of 2024. They can only be accessed by users with a Pro plan, which means paying OpenAI $20/month to try them out.
I don't recommend upgrading to the pro plan if your only goal is to use these extensions. Here's a detailed overview on why you're better off on the free plan.
Song Maker is the most popular GPT extension currently. It supports a diverse range of prompts, from music creation to lyrics and album art. At the end of the process, it builds a MIDI file for you. There's even a beta MIDI editor that loads in your browser so you can make changes when it's done.
Here's a snapshot of the first phase, where it recommends some general ideas:
Lyric generation: SongMaker begins by requesting a theme and then creating lyrics. It asks for your approval before moving on to instrumentals.
Music ideation: Get ideas for chord progressions, rhythm, and melody. The recommendations are fairly generic, so you may want to suggest changes before moving forward.
MIDI file generation: Based on the lyrics and music ideation, SongMaker will analyze everything and generate a MIDI or MusicXML file for you to download.
Here's what the final MIDI track looks like when imported into Logic Pro X. As you can see, it's a multi track file with two piano lines and a bass. The chord, melody and rhythmic patterns from the ideation phase were almost entirely ignored, unfortunately:
The MIDI is painfully slow, consisting mostly of quarter and half notes, so we had to speed the track up to 200BPM to hear the structure. The chords are clashing and full of dissonant tone clusters.
Other ChatGPT extensions recommended by Google are typically even worse.
Music Composer generates output that looks like this. It looks like the numbers are supposed to represent their pitch value on the piano. Frankly, this is unusable.
Music Creator is not really a ChatGPT app. It's a lead generation tool that integrates with a MusicGen app called CassetteAI.
FlowGPT music generator is at the top of Google's search results but is completely broken. They state that the model "will be trained" to do a number of things, but currently it doesn't even attempt to create anything.
So with that said, let's have a look at how you can do the same thing for free, and fine tune your prompts to get better results.
ChatGPT Chord Progressions
Chord progressions are one of the easiest places to get started with ChatGPT music prompts. You can request a progression in a particular key and it will perform quite well. As you drill down further, asking for progressions in the style of a particular composer or genre, the cracks begin to show.
GPT 4.0 requires a paid account but it also represents the latest capabilities of their most powerful model, so we used GPT 4.0 for these experiments.
ChatGPT Music Prompt 1: Write a chord progression in [key signature].
Result (✅): We get a simple four chord progression, including the notes in each chord and a short description of why it chose those chords. The roman numeral format (I - vi - ii - V) accurately corresponds to the chords it provided.
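If you want to double-check that the roman numerals match the chords ChatGPT gives you, a small script can do the lookup. This is a minimal sketch that assumes a major key, builds plain triads, and spells every note with sharps. The names `triadForNumeral`, `NOTE_NAMES`, and `MAJOR_STEPS` are made up for this example.

```javascript
// Sharps-only spelling is a simplification (flat keys will show enharmonic names).
const NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];
// Semitone offsets of the seven major scale degrees from the tonic.
const MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11];
const NUMERALS = ['i', 'ii', 'iii', 'iv', 'v', 'vi', 'vii'];

// Returns the root, third, and fifth of the triad built on a scale degree.
function triadForNumeral(tonic, numeral) {
  const tonicIndex = NOTE_NAMES.indexOf(tonic);
  const degree = NUMERALS.indexOf(numeral.toLowerCase());
  if (tonicIndex < 0 || degree < 0) {
    throw new Error(`Unsupported tonic or numeral: ${tonic} ${numeral}`);
  }
  // Stack every other scale degree: root, third, fifth.
  return [0, 2, 4].map(offset => {
    const step = MAJOR_STEPS[(degree + offset) % 7];
    return NOTE_NAMES[(tonicIndex + step) % 12];
  });
}

const progression = ['I', 'vi', 'ii', 'V'].map(n => triadForNumeral('C', n));
console.log(progression);
// [ [ 'C', 'E', 'G' ], [ 'A', 'C', 'E' ], [ 'D', 'F', 'A' ], [ 'G', 'B', 'D' ] ]
```

Compare the printed notes with the ones in ChatGPT's answer. If they differ, either the numerals or the chord spellings are off.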
ChatGPT Music Prompt 2: Write a complex chord progression in [key signature].
Result (✅): We chose a more obscure key signature (F# minor) and ChatGPT had no problem handling it. All of the notes and roman numeral progressions are accurate within the context of the key.
ChatGPT Music Prompt 3: Write a chord progression in the style of JS Bach.
Result (✅): We started with a classical composer because ChatGPT would have plenty of access to written information about his chord writing style. The progression included two trademark characteristics of Bach's music: the use of secondary dominants and the use of the numeral 6 to represent a first inversion. This distinguishes the notation from contemporary use of the number 6, where it would imply a 6th chord.
ChatGPT Music Prompt 4: Write a guitar chord progression for a jazzy lofi hip hop beat.
Result (✅/❌): Here we find some of the first errors in ChatGPT's output for chords.
The first three chords in this progression do have the characteristic lofi hip hop sound. But the roman numeral notation doesn't make sense. It would be more accurate to call it a ii-V-I in D major.
The last two chords create an unresolved feeling that would not loop well. As a loop-based genre, this is a problem that would take some creativity on the musician's behalf to make it work.
ChatGPT incorrectly identifies G# as the third scale degree of E minor, which is actually G natural. It makes the same mistake with the last chord, calling C# minor the sixth scale degree when the sixth degree is C natural. As a result, the last two chords don't match the first three.
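Scale degree mistakes like these are easy to verify with a few lines of code. The sketch below assumes the natural minor scale and sharps-only note spelling. The function name `naturalMinorScale` is hypothetical.

```javascript
const NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];
// Semitone offsets of the natural minor scale degrees from the tonic.
const NATURAL_MINOR_STEPS = [0, 2, 3, 5, 7, 8, 10];

// Returns the seven notes of the natural minor scale starting on the tonic.
function naturalMinorScale(tonic) {
  const tonicIndex = NOTE_NAMES.indexOf(tonic);
  if (tonicIndex < 0) {
    throw new Error(`Unknown tonic: ${tonic}`);
  }
  return NATURAL_MINOR_STEPS.map(step => NOTE_NAMES[(tonicIndex + step) % 12]);
}

const eMinor = naturalMinorScale('E');
const thirdDegree = eMinor[2]; // degrees are 1-based, arrays are 0-based
const sixthDegree = eMinor[5];

console.log(eMinor.join(' ')); // E F# G A B C D
console.log(thirdDegree, sixthDegree); // G C
```

The third degree comes out as G and the sixth as C, so chords rooted on G# and C# fall outside the key.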
ChatGPT Music Prompt 5: Write an 8-chord progression in [key signature].
Result (✅/❌): The results here are technically in F minor, but there are a few problems:
The final chord resolves to the tonic prematurely. Normally the return to the tonic would happen after the completion of the 8 chord progression. This could easily be solved by extending the length of the 7th chord for two bars.
The progression is basically just going through the circle of fifths (VII - III - VI - ii - v - i). While this is fairly common, it makes for a jumpy progression unless you use chord inversions. Those inversions aren't indicated here.
The output features triads exclusively, so the chord progression lacks color.
ChatGPT Music Prompt 6: Write an 8-chord progression in [key signature] using inversions for closed harmony.
Result (❌): Here we can really see the limitations of ChatGPT music output. On the one hand, it understands the notion of inversions and closed harmony. This is evident by its written response, but if you actually play the chords here, there are a few glaring problems.
The second chord (G-Bb-Db) was written over D natural in the bass. That note does not belong to the key or chord and is about as ugly as you could ever imagine. It's a terrible second chord for an F minor progression.
The fourth, fifth and sixth chords in these progressions contain large leaps, which violates the prompt's instructions to use closed harmony. ChatGPT even stated its intention to use inversions to create smooth stepwise motion, but violated this rule in its output.
Here is a screenshot of the piano roll to highlight the problems with this progression. The chord highlighted in pink shows the D-G-Bb-Db chord that made our skin crawl. You can see the big chord leaps highlighted in green, that missed the mark for smooth voice leading in the bass.
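Both problems can be caught without listening to the MIDI. The sketch below checks a chord for notes outside F natural minor and measures how far the bass moves between chords. The D-G-Bb-Db chord comes from the output described above, but its octave placement and the `exampleChords` voicings are assumptions made for illustration.

```javascript
// Pitch classes of F natural minor: F G Ab Bb C Db Eb.
const F_MINOR_PITCH_CLASSES = new Set([5, 7, 8, 10, 0, 1, 3]);

// Returns the MIDI notes in a chord that do not belong to the key.
function notesOutsideKey(chord, keyPitchClasses) {
  return chord.filter(note => !keyPitchClasses.has(note % 12));
}

// Returns the distance in semitones between the lowest notes of consecutive chords.
function bassLeaps(chords) {
  const leaps = [];
  for (let i = 1; i < chords.length; i++) {
    const previousBass = Math.min(...chords[i - 1]);
    const currentBass = Math.min(...chords[i]);
    leaps.push(Math.abs(currentBass - previousBass));
  }
  return leaps;
}

// D-G-Bb-Db voiced as D3-G3-Bb3-Db4 (the octave is an assumption).
const secondChord = [50, 55, 58, 61];
const outside = notesOutsideKey(secondChord, F_MINOR_PITCH_CLASSES);
console.log(outside); // [ 50 ] -> the D natural in the bass

// Hypothetical voicings: Fm, G diminished, then Ab major an octave lower.
const exampleChords = [[53, 56, 60], [55, 58, 61], [44, 48, 51]];
const leaps = bassLeaps(exampleChords);
console.log(leaps); // [ 2, 11 ]
```

A leap of a few semitones suggests smooth voice leading, while a value like 11 points to the kind of jump highlighted in the piano roll.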
This concludes the section on generating chord progressions with ChatGPT music prompts. As you can see, it's capable of creating something, but the more specific you get with your requests, the more likely it is to make errors due to its lack of training in the mathematics of music composition.
ChatGPT Guitar Tablature
Generating guitar tabs with ChatGPT is fairly simple. You just ask for what you want and it will spit it out in an ASCII format. Unfortunately, it tends to struggle with getting the key signature and chords right for even the most simple prompts. You can find examples of that in the video above, and watch how he resorted to asking for written descriptions of a guitar solo to eventually get his desired results.
ChatGPT Guitar Tablature Prompt 1: Write a chord progression in guitar tablature
Result (❌): ChatGPT returns an arpeggio in E minor and A minor, but then states that it's a chord progression (it's not) with the chords GM, DM, and CM. It's none of those things, so this is a complete miss.
ChatGPT Guitar Tablature Prompt 2: Create guitar tabs for a ii7 V I chord progression in FMajor.
Result (✅): This time we get the correct response, albeit in a clunky vertical format. By instructing it on the progression we wanted, it seems to have an easier time. We get a rootless Gm7 chord, followed by a C7 chord (we didn't ask for the extension but that's okay) and an F major chord.
ChatGPT Guitar Tablature Prompt 3: Write a metalcore guitar solo in tablature.
Result (❌): Here we asked for metalcore, implying that it should be in a minor or diminished scale. Instead it gave us a rudimentary major scale. It understood that we wanted a melody, but that's about it. It also incorrectly said that this was in E minor, but the solo contains an F natural (10th fret, G string), so the key signature is wrong.
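To check a tab against its claimed key, you can convert each fret position to a note name. This is a minimal sketch assuming standard tuning and sharps-only spelling. The names `OPEN_STRINGS` and `noteAtFret` are hypothetical.

```javascript
const NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

// Open strings in standard tuning as MIDI note numbers (low E to high E).
const OPEN_STRINGS = { lowE: 40, A: 45, D: 50, G: 55, B: 59, highE: 64 };

// Each fret raises the open string by one semitone.
function noteAtFret(openStringMidi, fret) {
  return NOTE_NAMES[(openStringMidi + fret) % 12];
}

const E_NATURAL_MINOR = ['E', 'F#', 'G', 'A', 'B', 'C', 'D'];

const note = noteAtFret(OPEN_STRINGS.G, 10);
const inKey = E_NATURAL_MINOR.includes(note);

console.log(note);  // F
console.log(inKey); // false
```

The 10th fret of the G string is F natural, which is not in E natural minor (the scale has F# instead).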
We've found other problems with ChatGPT guitar tabs. For example, in tabs the '/' character indicates a slide up and '\' for a slide down. ChatGPT tends to use the wrong kind of slash and doesn't seem to understand what the symbol means.
ChatGPT Lyric and Chord Generation
As a text generator, ChatGPT can perform AI lyric generation all day. Combining the lyrics with chord charts works, but once you add melodies, you start to run into trouble. Here are a couple of examples to showcase these limitations.
ChatGPT Lyrics + Chords Prompt 1: Write lyrics for a folk song and add chord charts to it.
Result (✅): The chords are added to the beginning of each word and color coded to make it easily readable. We still need to write our own melody with this prompt.
ChatGPT Lyrics + Chords Prompt 2: Using the same chart, include melodic notation for these chords and lyrics
Result (❌): ChatGPT has created an arpeggiated finger picking guitar riff for the melody. There's no connection between those arpeggios and the lyrics.
ChatGPT Melodies
Next on the list is melody generation. Unlike chord notation, melodies are a bit trickier to express through plaintext. For this reason, I started by asking ChatGPT for a list of the best formats to use when generating melodies. Here's a list of options that it provides. You can reference them to experiment with your own music prompts:
Standard Notation: Naming notes (e.g., "C-E-G") along with their durations (e.g., "quarter note C, half note E, quarter note G").
Scale Degrees: Naming a melody in relation to the key it's in using scale degrees (e.g., "In the key of C Major: 1-3-5").
Guitar Tablature: You'll get melodies as they appear on a guitar neck, but rhythm will be missing. We already highlighted its problems previously.
Piano Roll Notation: Describe the melody using MIDI note numbers, representing the values on a digital audio workstation's piano roll. For example, a C major arpeggio would be represented as 60, 64, 67. We used this format in the next section, using ChatGPT to write JavaScript code and create MIDI files. This works surprisingly well with the right prompts.
Solfège: Describe simple melodies using the do-re-mi system. This lacks rhythmic input and will keep you fixed to a simple diatonic scale.
Alphanumeric Pitch Notation: This system uses the letters A-G in combination with numbers 0-9 to represent pitches, with C4 often being set as "middle C". For example, a melody might go "C4-E4-G4-C5".
Pitch Frequencies: Name each musical note as a frequency (Hz) rather than using standard notation concepts like note names.
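Several of these formats describe the same pitches in different ways, so it helps to be able to convert between them. The sketch below turns alphanumeric pitch names into MIDI note numbers and MIDI numbers into frequencies. It assumes C4 is middle C (MIDI 60) and equal temperament with A4 at 440 Hz. The function names are hypothetical.

```javascript
const SEMITONES_FROM_C = { C: 0, D: 2, E: 4, F: 5, G: 7, A: 9, B: 11 };

// Converts a name like "C4", "F#3", or "Bb2" to a MIDI note number.
function noteNameToMidi(name) {
  const match = /^([A-G])([#b]?)(-?\d+)$/.exec(name);
  if (!match) {
    throw new Error(`Unrecognized note name: ${name}`);
  }
  const [, letter, accidental, octave] = match;
  const accidentalOffset = accidental === '#' ? 1 : accidental === 'b' ? -1 : 0;
  // With C4 = 60, octave n starts at MIDI number 12 * (n + 1).
  return 12 * (Number(octave) + 1) + SEMITONES_FROM_C[letter] + accidentalOffset;
}

// Converts a MIDI note number to a frequency in Hz (A4 = MIDI 69 = 440 Hz).
function midiToFrequency(midi) {
  return 440 * Math.pow(2, (midi - 69) / 12);
}

const melody = ['C4', 'E4', 'G4', 'C5'].map(noteNameToMidi);
console.log(melody); // [ 60, 64, 67, 72 ]
console.log(midiToFrequency(69)); // 440
```

The first three numbers match the C major arpeggio (60, 64, 67) from the piano roll example in the list above.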
To get started, we went with the alphanumeric + rhythm notation.
ChatGPT Melody Prompt 1: Write a melody in <key signature> using alphanumeric pitch notation. Include indicators of rhythm.
Result (✅): The melody we retrieved is a simple C major pentatonic melody, with a combination of quarter and half notes. It's not a particularly interesting tune, but there is some variation, with the arpeggio turning around on the sixth scale degree and returning stepwise with the 2nd scale degree passing tone from E-D-C.
ChatGPT Melody Prompt 2: Use this chord progression and give each chord four beats. Write a catchy melody in the same key signature using quarter and eighth notes. Be sure to indicate the duration of each note in the melody. If the melody contains notes outside of the chord, then it should only be used as passing tones to a note that is in the chord. Here is the progression: Dm7 - G7 - CM9 - Em7
Result (❌): As we saw previously, when the musical mathematics become too complex, ChatGPT starts to lose the plot. It understands what we want it to do, but quickly makes mistakes.
It calls F a passing tone, despite being set over the Dm7 chord.
E is a passing tone, but ChatGPT fails to label it as such.
It makes the same mistake with the G note, calling it a passing tone while playing over the G7 chord. Then the next note, A, is not labeled a passing tone. This illustrates that it is not properly checking the melody against the chords, because it lacks the training to do so effectively.
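The check ChatGPT skipped here is simple to do in code. This minimal sketch only tests whether a melody note belongs to the chord underneath it. It does not decide whether a non-chord tone is a true passing tone, which also depends on the notes before and after it. The chord spellings are standard, and the function name `classifyNote` is hypothetical.

```javascript
// Chord tones for the progression in the prompt, spelled as note names.
const CHORD_TONES = {
  Dm7: ['D', 'F', 'A', 'C'],
  G7: ['G', 'B', 'D', 'F'],
  CM9: ['C', 'E', 'G', 'B', 'D'],
  Em7: ['E', 'G', 'B', 'D'],
};

// Labels a melody note by checking membership in the current chord.
function classifyNote(note, chordName) {
  const tones = CHORD_TONES[chordName];
  if (!tones) {
    throw new Error(`Unknown chord: ${chordName}`);
  }
  return tones.includes(note) ? 'chord tone' : 'non-chord tone';
}

console.log(classifyNote('F', 'Dm7')); // chord tone
console.log(classifyNote('E', 'Dm7')); // non-chord tone
console.log(classifyNote('G', 'G7'));  // chord tone
console.log(classifyNote('A', 'G7'));  // non-chord tone
```

F over Dm7 and G over G7 are chord tones, so neither should have been labeled a passing tone.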
Using Javascript to make ChatGPT MIDI music
The note-and-rhythm notation in the examples above is annoying to use and not very practical. It's more effective to have ChatGPT write code that generates MIDI melodies, which you can run from a free code editor like VSCode. We'll explore how to do that here.
First, you'll need to download Visual Studio Code and install Node.js. Both of these are free. Once they're installed, create a new folder on your computer and use VSCode's terminal to cd into it. Run "npm install midi-writer-js" in that folder to install the MIDI library used in the examples. Then save a simple file, let's say "testmidi.js", and you'll be able to use this technique within that file.
ChatGPT Visual Studio MIDI Prompt 1: Please write javascript code to generate a C major scale in MIDI notation, using the MIDIWriter javascript Node library.
Result (✅): ChatGPT needed a few rounds of debugging to get this right, but all I had to do was tell it the error message and eventually it produced a working piece of code. Here it is:
const MidiWriter = require('midi-writer-js');
const fs = require('fs');
// Create a new track
let track = new MidiWriter.Track();
// Define the C Major scale (middle C to the next C) in MIDI numbers
let CMajorScale = [60, 62, 64, 65, 67, 69, 71, 72];
// Loop through the notes in the scale and add them to the track
for (let i = 0; i < CMajorScale.length; i++) {
// Adding a note event for each note in the scale, with a duration of '4' (quarter note)
track.addEvent(new MidiWriter.NoteEvent({pitch: [CMajorScale[i]], duration: '4'}));
}
// Create a writer
let writer = new MidiWriter.Writer(track);
// Save the MIDI output to a file
let base64String = writer.dataUri().split(',')[1];
let midiData = Buffer.from(base64String, 'base64');
fs.writeFileSync('C_Major_Scale.mid', midiData);
We're using a package called MidiWriter.js. Just paste the code above into your testmidi.js file and when you run "node testmidi.js" in your VSCode terminal, it will automatically save the MIDI file to that folder. You can use ChatGPT to elaborate and create melodies with actual rhythm and character. We'll get to that next.
ChatGPT Chord + Melody MIDI Prompt 2: Using the javascript code I've provided, create a ii7 - V7 - IM7 chord progression and then write a melody on top that uses a combination of quarter and eighth notes. The melody should match the chords being played so that they don't clash. Try to use a fluid, natural sounding melody, and give the chord progression some rhythm too.
Result (✅): Wow! ChatGPT created a melody that goes perfectly with the chord progression we provided. All we had to do was switch over to a programming language and the results improve dramatically.
Here's the code so you can test it yourself:
const MidiWriter = require('midi-writer-js');
const fs = require('fs');
// Create two new tracks
let chordTrack = new MidiWriter.Track();
let melodyTrack = new MidiWriter.Track();
// Define the chords - Dm7 (D-F-A-C), G7 (G-B-D-F), and Cmaj7 (C-E-G-B)
let chords = [
{ pitch: [62, 65, 69, 72], duration: '4' }, // Dm7 quarter note
{ pitch: [62, 65, 69, 72], duration: '4' }, // Dm7 quarter note
{ pitch: [67, 71, 74, 77], duration: '4' }, // G7 quarter note
{ pitch: [67, 71, 74, 77], duration: '4' }, // G7 quarter note
{ pitch: [60, 64, 67, 71], duration: '2' }, // Cmaj7 half note
{ pitch: [60, 64, 67, 71], duration: '2' } // Cmaj7 half note
];
// Add chords to the chordTrack
chords.forEach(chord => {
chordTrack.addEvent(new MidiWriter.NoteEvent(chord));
});
// Define the melody - a simple melody that fits over the chords
let melody = [
{ pitch: [74], duration: '8' }, // D eighth note
{ pitch: [72], duration: '8' }, // C eighth note
{ pitch: [74], duration: '4' }, // D quarter note
{ pitch: [76], duration: '8' }, // E eighth note
{ pitch: [77], duration: '8' }, // F eighth note
{ pitch: [76], duration: '4' }, // E quarter note
{ pitch: [72], duration: '2' }, // C half note
{ pitch: [71], duration: '2' } // B half note
];
// Add melody to the melodyTrack
melody.forEach(note => {
melodyTrack.addEvent(new MidiWriter.NoteEvent(note));
});
// Create a writer with both tracks
let writer = new MidiWriter.Writer([chordTrack, melodyTrack]);
// Save the MIDI output to a file
let base64String = writer.dataUri().split(',')[1];
let midiData = Buffer.from(base64String, 'base64');
fs.writeFileSync('ii7_V7_IM7_Progression_With_Melody.mid', midiData);
Here's a visual of what that MIDI file looks like when imported into a DAW like Logic Pro X. As you can see, the melody has a nice shape and it worked perfectly with the chord progression.
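One quick sanity check before opening the file in a DAW is to confirm that both tracks last the same number of beats. The sketch below copies the duration values from the code above and adds them up. It assumes plain numeric duration strings ('2', '4', '8'), where '4' is one beat, and does not handle dotted or triplet values. The helper names are hypothetical, and the script does not need the MIDI library to run.

```javascript
// Converts a duration string to beats: '4' (quarter) = 1, '8' = 0.5, '2' = 2.
function beatsFor(duration) {
  return 4 / Number(duration);
}

// Sums the length of a list of note or chord events in beats.
function totalBeats(events) {
  return events.reduce((sum, event) => sum + beatsFor(event.duration), 0);
}

// Durations copied from the chords and melody arrays above.
const chordDurations = ['4', '4', '4', '4', '2', '2'].map(duration => ({ duration }));
const melodyDurations = ['8', '8', '4', '8', '8', '4', '2', '2'].map(duration => ({ duration }));

const chordBeats = totalBeats(chordDurations);
const melodyBeats = totalBeats(melodyDurations);

console.log(chordBeats, melodyBeats); // 8 8
```

Both tracks come out to 8 beats, which is why the melody and chords end together in the piano roll.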
ChatGPT Chord + Melody MIDI Prompt 3:
Result (✅/❌): Here we begin to run up against the limitations of ChatGPT's capabilities as a general purpose AI tool. Four-part Bach chorales require melodies to move in some combination of parallel and contrary motion.
After iterating through another five rounds of instructions we eventually got ChatGPT output that's constrained to a single key signature and exports a MIDI file with four voices. There's a good amount of rhythmic variation, but it sounds more like Debussy than Bach. Here's a screenshot of what that final MIDI looks like:
Here's the code so you can try it out yourself:
const MidiWriter = require('midi-writer-js');
const fs = require('fs');
// Create four new tracks
let sopranoTrack = new MidiWriter.Track();
let altoTrack = new MidiWriter.Track();
let tenorTrack = new MidiWriter.Track();
let bassTrack = new MidiWriter.Track();
// Define the A minor scale in three octaves
let aMinorScale = [57, 59, 60, 62, 64, 65, 67, 69, 71, 72, 74, 76, 77, 79, 81, 83, 84, 86, 88, 89, 91];
// Define four melodies using the indexes of the scale
let sopranoMelody = [
{ index: 20, duration: '4' },
{ index: 18, duration: '8' },
{ index: 16, duration: '8' },
{ index: 14, duration: '16' },
// Contrary motion starts here
{ index: 16, duration: '16' },
{ index: 18, duration: '4' },
{ index: 20, duration: '8' },
{ index: 18, duration: '8' },
// Back to parallel motion
{ index: 16, duration: '4' },
{ index: 14, duration: '8' },
{ index: 12, duration: '8' },
{ index: 10, duration: '4' }
];
let altoMelody = [
{ index: 15, duration: '4' },
{ index: 13, duration: '8' },
{ index: 11, duration: '8' },
{ index: 9, duration: '16' },
// Contrary motion starts here
{ index: 11, duration: '16' },
{ index: 13, duration: '4' },
{ index: 15, duration: '8' },
{ index: 13, duration: '8' },
// Back to parallel motion
{ index: 11, duration: '4' },
{ index: 9, duration: '8' },
{ index: 7, duration: '8' },
{ index: 5, duration: '4' }
];
let tenorMelody = [
{ index: 10, duration: '8' },
{ index: 12, duration: '8' },
{ index: 14, duration: '4' },
{ index: 12, duration: '16' },
// Contrary motion starts here
{ index: 10, duration: '16' },
{ index: 8, duration: '4' },
{ index: 6, duration: '8' },
{ index: 4, duration: '8' },
// Back to parallel motion
{ index: 2, duration: '4' },
{ index: 0, duration: '8' },
{ index: 2, duration: '8' },
{ index: 4, duration: '4' }
];
let bassMelody = [
{ index: 5, duration: '8' },
{ index: 7, duration: '8' },
{ index: 9, duration: '4' },
{ index: 7, duration: '16' },
// Contrary motion starts here
{ index: 5, duration: '16' },
{ index: 3, duration: '4' },
{ index: 1, duration: '8' },
{ index: 3, duration: '8' },
// Back to parallel motion
{ index: 1, duration: '4' },
{ index: 3, duration: '8' },
{ index: 5, duration: '8' },
{ index: 7, duration: '4' }
];
// Add the melodies to the tracks
sopranoTrack.addEvent(sopranoMelody.map(note => new MidiWriter.NoteEvent({ pitch: [aMinorScale[note.index]], duration: note.duration })));
altoTrack.addEvent(altoMelody.map(note => new MidiWriter.NoteEvent({ pitch: [aMinorScale[note.index]], duration: note.duration })));
tenorTrack.addEvent(tenorMelody.map(note => new MidiWriter.NoteEvent({ pitch: [aMinorScale[note.index]], duration: note.duration })));
bassTrack.addEvent(bassMelody.map(note => new MidiWriter.NoteEvent({ pitch: [aMinorScale[note.index]], duration: note.duration })));
// Combine all tracks
let writer = new MidiWriter.Writer([sopranoTrack, altoTrack, tenorTrack, bassTrack]);
// Save the MIDI output to a file
let base64String = writer.dataUri().split(',')[1];
let midiData = Buffer.from(base64String, 'base64');
fs.writeFileSync('Four_Part_Fugue4.mid', midiData);
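To see how the voices actually move against each other, you can compare their scale indexes step by step. The sketch below counts parallel, similar, contrary, and oblique motion between the soprano and alto lines, using index values copied from the code above. Comparing note by note only works here because those two voices share the same rhythm. The function name `classifyMotion` is hypothetical.

```javascript
// Counts the type of motion between two voices, given as scale indexes.
function classifyMotion(voiceA, voiceB) {
  const counts = { parallel: 0, similar: 0, contrary: 0, oblique: 0 };
  const length = Math.min(voiceA.length, voiceB.length);
  for (let i = 1; i < length; i++) {
    const moveA = voiceA[i] - voiceA[i - 1];
    const moveB = voiceB[i] - voiceB[i - 1];
    if (moveA === 0 && moveB === 0) {
      continue; // neither voice moves, so there is nothing to classify
    } else if (moveA === 0 || moveB === 0) {
      counts.oblique++; // one voice holds while the other moves
    } else if (Math.sign(moveA) !== Math.sign(moveB)) {
      counts.contrary++; // voices move in opposite directions
    } else if (moveA === moveB) {
      counts.parallel++; // same direction, same number of scale steps
    } else {
      counts.similar++; // same direction, different number of scale steps
    }
  }
  return counts;
}

// Scale indexes copied from sopranoMelody and altoMelody above.
const soprano = [20, 18, 16, 14, 16, 18, 20, 18, 16, 14, 12, 10];
const alto = [15, 13, 11, 9, 11, 13, 15, 13, 11, 9, 7, 5];

const counts = classifyMotion(soprano, alto);
console.log(counts); // { parallel: 11, similar: 0, contrary: 0, oblique: 0 }
```

The soprano and alto stay exactly five scale steps apart the whole way through, so despite the "contrary motion" comments in the generated code, these two voices never move in opposite directions.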
That's as far as we're going to take the MIDI JavaScript code generation prompts. You can plug any of those snippets into GPT4 as a starting point and ask for additional variations.
For a technical deep dive further into the musical potential and limitations of GPT-4, check out this video from creative coder and composer Marc Evanstein:
If you want a simpler text-to-MIDI experience for your DAW workflow, be sure to check out the AudioCipher VST.