
12 Best AI Text To Music Apps for People of All Skill Levels

2023 was the "Midjourney moment" for AI music. For the first time ever, people could describe music and hear original ideas played back to them almost instantly.


Text to music software lets us generate music, voices and sound effects from the thoughts in our head. Even descriptions of movie scenes can become music.


Four of the world's biggest tech companies have rallied behind AI text-to-music prompts as the defining feature of their new generative audio models. Those are Google (MusicLM/MusicFX), Meta (MusicGen, AudioCraft), Stability AI (Stable Audio), and Microsoft (Muzic).


In the first half of 2024, we've already seen Suno release their powerful V3 model and a major competitor, Udio, step forward to challenge them. Spotify and Amazon both released AI text-to-playlist generators for their audio libraries.


In this article we'll walk you through the best text to music software. Most of these tools are powered by AI, but some are not. These resources can help you begin exploring this interesting new landscape for yourself.


Table of Contents


9 Best text to music apps in 2023

Other text to music apps (for non-musicians)


AudioCipher DAW plugin


Website: AudioCipher

Format: VST/AU/Standalone for Mac and PC


AudioCipher is a text-to-MIDI generator built around text prompts and musical parameters. Version one launched in 2020 and although it doesn't use AI, it's still the only plugin that offers this experience within a DAW.


The goal with AudioCipher is to transform words into melodies and chord progressions as a source of inspiration. AI generated music apps focus on written descriptions of music, but this VST lets you type in any kind of text including names and other abstract ideas.


Drag the MIDI output to your DAW and edit it in the piano roll until you arrive at something you like. Then choose your instruments and apply sound design to make it fully your own. In the end, the music you come up with will always stem from the words you chose at the beginning.


The main takeaway with AudioCipher is that text-to-music doesn't have to mean turning your creativity over to artificial intelligence. By focusing on the meaning or core idea behind the words you type in, you can break through writer's block while maintaining an active role in the song you're creating.


Version 4.0 of AudioCipher is currently in development and due for release mid-2024. Customers get lifetime free upgrades, so pick up a copy today to save on future versions of the app.


Suno AI song generator

Website: Suno AI / Udio

Format: Web app, Discord app

Price: $10-30 per month


After a year of steady engagement on Discord, Suno AI has migrated to a dedicated browser app. At the end of December 2023, they partnered with Microsoft Bing, delivering AI song generation to people of all skill levels.


As of this year, version 3 of Suno's AI song generator allows users to type in original lyrics and describe the style of music they want to hear. Within a minute, two two-minute AI songs are delivered, complete with AI vocals in the genre of choice.


Each song can be downloaded and is cleared for commercial use, which has caused some ethicists to question the fairness of their training data. Nevertheless, the app has been extremely popular among professional musicians digging for samples as well as ordinary people looking to make a funny meme.


Udio is the first serious text to song app to challenge Suno. They have an almost identical web application and are backed by some big investors. The engineering team includes former Google employees who worked on AI music at DeepMind, and rap icons Will.i.am and Common are also backing the company.


As far as features are concerned, the app generates two 30-second clips and comes with 600 prompts (1200 audio clips) per month. Users can extend those clips to make them longer or modify prompts to get closer to the target sound.
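To put that allowance in perspective, here is a quick back-of-the-envelope calculation. It only uses the figures quoted above (600 prompts, two clips per prompt, 30 seconds per clip); the function and constant names are made up for illustration, and clip extensions are ignored.

```python
# Figures taken from the plan described above; names are illustrative only.
PROMPTS_PER_MONTH = 600
CLIPS_PER_PROMPT = 2
CLIP_SECONDS = 30


def monthly_audio_budget(prompts=PROMPTS_PER_MONTH,
                         clips_per_prompt=CLIPS_PER_PROMPT,
                         clip_seconds=CLIP_SECONDS):
    """Return (number of clips, hours of raw audio) for one month."""
    clips = prompts * clips_per_prompt
    total_seconds = clips * clip_seconds
    return clips, total_seconds / 3600


clips, hours = monthly_audio_budget()
print(f"{clips} clips, about {hours:.0f} hours of audio per month")
```

In other words, the monthly quota works out to roughly ten hours of raw 30-second clips before any extensions.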


Describe the kind of music you want to hear and provide lyrics to hear them sung over that instrumental track. Then you can publish directly to a social media platform or download the files locally to your computer.


ACE Studio

Website: ACE Studio

Format: Desktop DAW


Suno and Udio are exceptionally good at generating AI vocals from lyrics, but they don't provide any control over the melody and expressive style.


ACE Studio was created specifically for generating AI singing voices and improving their realism. Pick a voice model, draw a melody in the piano roll, and put lyrics to the song. It only takes a moment to transform your text into music, and from there you can continue making changes to the vocal performance.


Here are their four main text-to-music vocal parameters:


  1. Air: This controls the amount of breathiness in the vocals, affecting the softness and intimacy of the sound.

  2. Falsetto: This parameter adjusts the extent to which falsetto is used, allowing for a lighter and more ethereal vocal quality.

  3. Tension: This measures the degree of tension or relaxation in the vocal cords, influencing the strength and emotional intensity of the performance.

  4. Energy: This parameter affects the overall energy and strength of the singing, allowing for dynamic variations in vocal delivery.


In my experience, ACE works well as a prototyping tool for exploring an idea when I hit the limits of what I'm capable of. For example, I don't mind singing over an instrumental track, but I struggle to sing layered four-part harmony without falling out of key.


With ACE, I was able to train a custom model on my own singing voice and run polyphonic harmony through that model to see how the composition sounded. The output has a kind of synthy vocoder effect, with polyphonic expression to control the vibrato on individual notes, much like an MPE MIDI controller.


Most of this article will cover instrumental text-to-music tools, but if you're looking for text-to-singing-vocals, visit the ACE Studio website to learn more.


MusicGen and SoundGen

Website: MusicGen / SoundGen

Format: Hugging Face browser app


One month after MusicLM was released, Meta put out MusicGen. The audio quality is even better than that of Google's model, and in our estimation MusicGen is the only AI music generation tool that could disrupt the music industry in any meaningful way.


Their text-to-song technology includes a melody condition where users can upload a recorded audio file and combine it with written instructions about genre and instrumentation to create an entirely new song.

For the first six months, the best way to get high quality music from MusicGen was to sign up for a Hugging Face account and create your own space. When you add a payment card, you'll be able to level up to their medium and large models. Instead of relying on your local CPU, you pay Hugging Face to provide the computing power as a service.


Since then, a new product called SoundGen has come out that provides a better user interface with additional audio editing features that MusicGen lacks. It also includes unconventional prompting options like image to music.


We experimented with dozens of genres and found that it was particularly good at creating jazz, classical, rock, and chiptunes based on melody conditions. Try inputting a melody from the main soundtrack of a classic arcade game and see how it reinterprets it!


Each generation takes between 30 seconds and 3 minutes, depending on the model you use. Once it's finished, you can take a listen and download it. For a detailed walkthrough on how to use and prompt the models, check out our full length article on MusicGen.


Stable Audio by Stability AI

Website: Stable Audio

Format: Website

Price: Freemium with $12/month


Stable Audio was brought to us by Stability AI, the company that first rivaled Midjourney with their Stable Diffusion model. It is the first audio synthesis tool to go for a commercial model. Trained on nearly 800,000 labeled audio files from the AudioSparx directory, Stable Audio offers a high quality text-to-audio service that covers both music and sound effect generation. What it lacks in the music conditioning and extensibility of MusicGen, it makes up for in sound quality.


MusicFX (Previously MusicLM)

Website: MusicLM

Format: Browser and standalone app


The Google Arts and Culture team has been exploring AI music generation for years, notably with Magenta Studio, but MusicLM was the company's first venture into creating songs from text prompts.


We originally covered MusicLM in January 2023, when it was still just a technical paper published by their developers. In May 2023, they published a fully functional beta version that was free for anyone to use. You could access it in a browser or download AI Test Kitchen from the App Store to open it locally.


Now in 2024, they've made some updates to the app and renamed it to MusicFX.


Google's text-to-song model was a big improvement on Riffusion, producing longer clips at higher fidelity. They accomplished this using three music resources (MusicCaps, AudioSet, and MuLan) that draw on over 40 million YouTube videos.


The music industry hasn't made much of a fuss over AI Test Kitchen's music generator, probably because the quality is still not good enough to disrupt real music recordings.


It's worth noting that Universal Music Group has already started collaborating with Google to train AI models on their music. We may see a much more powerful version of MusicFX drop this year, with artist remunerations built into the system.


Riffusion

Website: Riffusion

Format: Browser app


Riffusion was one of the first to tackle text-to-music in the public eye. In December 2022, they made headlines with a simple web app that turned descriptions of music into short audio samples in that style.

By October 2023, the company had released a new and improved version of the app. Users can log in and build their audio library with text-to-music prompting. As with Suno, users could type in lyrics and hear them sung back by an AI vocalist. The company also announced that they had raised a $4M round.


They remained quiet for the better part of the year before resurfacing in July 2024 with a mobile app that converts videos into songs. Users can doomscroll to their heart's content, hearing custom songs in every genre about their day-to-day life.


Read our full summary of Riffusion here.


WavTool AI DAW (GPT-4)

Website: WavTool

Format: DAW browser app


Ever wondered what it would be like to have a super-intelligent assistant by your side to help generate music? WavTool is an AI DAW that loads in your web browser and comes equipped with a GPT-4 text to music plugin for advanced users on the monthly premium plan. Prompt the assistant to create new tracks, adjust the mix, and even compose new melodies and chord progressions. It's a lot of fun and will have you wondering why other DAWs don't offer the same features!

We've previously covered the strengths and weaknesses of ChatGPT music prompts. WavTool's secret sauce is the ability to translate GPT's output into direct actions within the DAW. So instead of simply printing out a list of chords in plain text, you get a complete MIDI file.

WavTool recently introduced a separate text-to-music sample generator that turns descriptive music prompts into sound effects and short loops. Their artificial intelligence toolset also includes a "continue" feature for generating music based on an initial MIDI input.


Mubert AI text-to-music generator


Format: Browser app


Mubert is an AI music generator that comes with a text to music web app. It's not their primary offering, but it's still a fun piece of tech to explore. You can enter prompts, set your track duration, and hit a generate button. In less than a minute, you'll have a complete song idea with details about the BPM and key signature.


Behind the scenes, your text prompt is encoded into latent space vectors by a transformer neural network and compared against the tag vectors of existing labeled MIDI loop data. The closest tag vectors are chosen and sent to the Mubert API, which generates entirely new music from them. You can find their Python code at this GitHub repo, if you want to learn more. They also offer a Google Colab environment for more nuanced experimentation.
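The "closest tag vectors" step can be pictured with a small nearest-neighbor search. The sketch below is not Mubert's code: the tag names, the three-dimensional toy vectors, and the `closest_tags` helper are all invented for illustration, and a real system would get its vectors from a learned text encoder rather than a hand-written table.

```python
import math

# Toy tag vectors, invented for this example. A real system would compute
# these with a text encoder instead of hard-coding them.
TAG_VECTORS = {
    "ambient": [0.9, 0.1, 0.0],
    "techno": [0.1, 0.9, 0.2],
    "jazz": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def closest_tags(prompt_vector, tag_vectors=TAG_VECTORS, top_k=2):
    """Return the top_k tag names ranked by similarity to the prompt vector."""
    ranked = sorted(
        tag_vectors,
        key=lambda tag: cosine_similarity(prompt_vector, tag_vectors[tag]),
        reverse=True,
    )
    return ranked[:top_k]


# Pretend this vector is the encoded form of a prompt like "calm, spacious pads".
print(closest_tags([0.8, 0.3, 0.1]))
```

Cosine similarity is used here because it compares direction rather than magnitude, which is a common choice for embedding lookups. Whether Mubert uses this exact measure is an assumption.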


Voicemod text-to-song

Website: Voicemod

Format: Browser app


Sometimes you just want to have fun without trying to create serious music. Voicemod's text-to-song app falls into that category. It's closer to a meme generator than a composition tool for musicians, but it's still an impressive piece of tech.


Users choose a genre and an AI voice to get started. Type in your lyrics and the app will create a short pop song. Part of their AI magic is the ability to match the cadence of your words with a melody that fits into the instrumental backing track. You can share the file with friends and have a laugh, but it won't take you much further than that.


Melobytes

Website: Melobytes

Format: Browser app for procedurally generated music


If Voicemod wasn't lo-fi and lowbrow enough for you, Melobytes should do the trick. This web app is great at producing harsh and absurd sound bites based on your text input. It's an AI generated music app, but not the kind of solution that most musicians are looking for. It's more of a crunchy meme generator for internet trolls.

Melobytes includes a number of musical parameters including language, tempo, tonality, and time signature. After experimenting with the site extensively, we're not sure how these attributes are mapped onto the text input. Go into the experience without high expectations and you'll probably have some fun.


Typeatone

Website: Typeatone

Format: Browser app


Typeatone is a simple web app built in 2016 for entertainment purposes. The site lets you use the QWERTY keyboard as a music keyboard. But instead of showing a standard piano interface, it takes your lyrics and turns each letter into a melody sequence. Click the music note icon in the toolbar to switch up your instrument from the default bell tone to a variety of other pleasant sounds. If you hear something you like, use the share icon to send it over to a friend.
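If you're curious how a letter-to-melody mapping might work, here is a minimal sketch. It is not Typeatone's actual algorithm: the pentatonic scale, the root note, and the `text_to_midi_notes` function are assumptions chosen to keep the example short.

```python
# Hypothetical mapping for illustration; not Typeatone's real implementation.
C_MAJOR_PENTATONIC = [0, 2, 4, 7, 9]  # semitone offsets from the root note


def text_to_midi_notes(text, root=60, scale=C_MAJOR_PENTATONIC):
    """Map each letter to a MIDI note number; other characters become rests."""
    notes = []
    for ch in text.lower():
        if "a" <= ch <= "z":
            index = ord(ch) - ord("a")
            # Walk up the scale, moving up an octave each time it wraps around.
            octave, degree = divmod(index, len(scale))
            notes.append(root + 12 * octave + scale[degree])
        else:
            notes.append(None)  # a rest for spaces, digits, and punctuation
    return notes


print(text_to_midi_notes("abc"))  # [60, 62, 64]
```

With the default root of 60 (middle C), the letters a through e climb one octave of the pentatonic scale, and each following group of five letters starts an octave higher. Sticking to a pentatonic scale is one way to keep arbitrary text sounding pleasant, since any combination of its notes tends to be consonant.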


We're excited to see how this space develops as AI generated music becomes more advanced in the coming years. Subscribe to the AudioCipher newsletter for the latest news on this niche!

2 Comments


quinellipe.zorn
Jun 29, 2023

Great list, thanks! But I was quite surprised - even a little shocked - not to see Aiva listed. It's definitely one of the best ones I've seen, for non-musicians and musicians alike, though it is pricey.

Ezra Sandzer-Bell
Jul 03, 2023

Hey there, thanks for reading. AIVA isn't a text-to-music app in the way we're defining it here, but we did include it in our complete guide to AI music apps here: https://www.audiocipher.com/post/ai-music-app
