By Ezra Sandzer-Bell

12 Best AI Text To Music Apps for People of All Skill Levels

Text-to-music apps represent an exciting new frontier for music creation. They range from plugins like AudioCipher to song generators like Suno and Udio.


In 2024, Spotify, Amazon and YouTube released AI text-to-playlist generators to improve content discovery in their audio libraries. Other companies like WavTool released AI co-producers powered by ChatGPT. Users describe the kind of synth they want and WavTool builds custom wavetables accordingly.


In this article we'll walk you through all of the best text to music software available today. Most but not all of them are powered by AI. These resources will help you get a feel for the landscape and decide which tools are best for you.


Table of Contents


9 Best text to music apps in 2024

Other text to music apps (for non-musicians)


AudioCipher Text-to-MIDI generator



Website: AudioCipher

Format: VST/AU/Standalone for Mac and PC

Cost: $59.99 one-time purchase, no subscriptions or hidden fees


AudioCipher is a text-to-MIDI generator built around text prompts and musical parameters. It uses musical cryptograms rather than artificial intelligence and is the only plugin on the market offering this functionality.


How the MIDI generation works: Type in words and transform them into melodies or chord progressions in MIDI format.


  • Text-to-MIDI: Type in any word or phrase as an input

  • Rhythm: Set your BPM, control note duration, or use rhythm randomization

  • Melody: Choose key signature / musical mode for your MIDI cipher

  • Chords: Assign chord extensions in the given key or use chord randomization

  • Join notes: Choose to include repeating notes or merge them

  • Inversions: Enforce better voice leading with the new inversion tool

  • Drag MIDI: Drag ideas to your DAW piano roll and add virtual instruments

  • Save button: Save your favorite ideas as a card in the new MIDI Vault


Drag the MIDI output to your DAW and edit it in the piano roll until you arrive at something you like. Then choose your instruments and apply sound design to make it fully your own. In the end, the music you come up with will always stem from the words you chose at the beginning.
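To make the idea of a musical cryptogram concrete, here is a minimal sketch of one way to turn letters into notes. It is not AudioCipher's actual algorithm. The mapping (each letter's alphabet position wrapped onto the seven degrees of a scale), the function names, and the `join_repeats` option are all assumptions for illustration, loosely modeled on the key, mode, and join-notes controls described above.

```python
# Hypothetical musical cryptogram. NOT AudioCipher's real mapping:
# each letter's alphabet position is wrapped onto a seven-note scale.
SCALES = {
    "major": [0, 2, 4, 5, 7, 9, 11],
    "minor": [0, 2, 3, 5, 7, 8, 10],
}
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def text_to_midi_notes(text, root=60, mode="major", join_repeats=False):
    """Return one MIDI note number per ASCII letter in `text`.

    root is the MIDI number of the tonic (60 = middle C).
    join_repeats drops a note when it repeats the note just before it.
    """
    scale = SCALES[mode]
    notes = []
    for ch in text.lower():
        if not (ch.isascii() and ch.isalpha()):
            continue  # skip spaces, digits, punctuation, accented letters
        degree = (ord(ch) - ord("a")) % len(scale)
        notes.append(root + scale[degree])
    if join_repeats:
        merged = []
        for note in notes:
            if not merged or merged[-1] != note:
                merged.append(note)
        notes = merged
    return notes


def note_name(midi_note):
    """Name a MIDI note using the convention where 60 is C4."""
    return f"{NOTE_NAMES[midi_note % 12]}{midi_note // 12 - 1}"


melody = text_to_midi_notes("cab")
print(melody, [note_name(n) for n in melody])
```

Because the mapping is deterministic, the same word always produces the same melody in a given key and mode. That is the property that ties the music back to the words you started with.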



AudioCipher's new MIDI Vault makes it easy to manage your existing audio and MIDI files in one place. Store them in cards and add meta tags. The vault's search filters and sorting make it easy to retrieve those files in the future and pick up where you left off in previous sessions.


The MIDI Vault still has AudioCipher's classic text-to-MIDI generator, now with a save button that creates a new card in the vault. Your MIDI file and info about BPM and key signature are all carried over to that new card.


Suno AI text-to-music song generator


Website: Suno AI

Format: Web app, Discord app

Price: $10-30 per month


Suno AI is probably the most popular text-to-music generator on the market today. Their meteoric rise began in December 2023 when they partnered with Microsoft Bing, delivering AI song generation alongside AI-powered web searches.


In 2024, Suno raised a $125M round of investment, followed by a tidal wave of groundbreaking new features. Their app can be used by anyone, regardless of musical background. Non-musicians and professionals both use the app, albeit for different purposes.


  • Text-to-song generator: Describe the genre or mood you're going for, with the option to create music with or without lyrics. If you don't have complete lyrics, you can provide a theme and Suno will turn it into full song lyrics.

  • Singing voices: Suno uses your lyrics to generate singing vocals that perfectly match the style of music, key and tempo of the instrumentals generated.

  • Text conditioning for audio extensions: Musicians love Suno's audio extension feature because it builds upon any song you upload, writing new sections according to the text-to-music style descriptions you provide.


Each song can be downloaded and is cleared for commercial use. Ethicists have questioned the fairness of Suno training their model on the world's music, and the RIAA filed a lawsuit in June 2024 with the intent to stop Suno in their tracks.


Udio AI song generator


Website: Udio

Format: Web app, Discord app

Price: $10-30 per month


Udio is the first serious text-to-song app to challenge Suno. They have an almost identical web application and are backed by some big investors. The engineering team includes former Google employees who worked on AI music at DeepMind, and rap icons will.i.am and Common are also backing the company.


As far as features are concerned, each prompt generates two 30-second clips, and the app comes with 600 prompts (1,200 audio clips) per month. Users can extend those clips to make them longer or modify prompts to get closer to the target sound.


Describe the kind of music you want to hear and provide lyrics to hear them sung over that instrumental track. Then you can publish directly to social media platforms or download the files to your computer.


ACE Studio (Lyrics-to-vocals)

Website: ACE Studio

Format: Desktop DAW


Suno and Udio are exceptionally good at generating AI vocals from lyrics, but they don't provide any control over the melody and expressive style.


ACE Studio was created specifically for generating AI singing voices and improving their realism. Pick a voice model, draw a melody in the piano roll, and write lyrics for the song. It only takes a moment to transform your text into music, and from there you can continue making changes to the vocal performance.


Here are their four main text-to-music vocal parameters:


  1. Air: This controls the amount of breathiness in the vocals, affecting the softness and intimacy of the sound.

  2. Falsetto: This parameter adjusts the extent to which falsetto is used, allowing for a lighter and more ethereal vocal quality.

  3. Tension: This measures the degree of tension or relaxation in the vocal cords, influencing the strength and emotional intensity of the performance.

  4. Energy: This parameter affects the overall energy and strength of the singing, allowing for dynamic variations in vocal delivery.


In my experience, ACE works well as a prototyping tool for exploring an idea when I hit the limits of what I'm capable of. For example, I don't mind singing over an instrumental track, but I struggle to sing layered four-part harmony without falling out of key.


With ACE, I was able to train a custom model on my own singing voice and run polyphonic harmony through that model to see how the composition sounded. The output has a kind of synthy vocoder effect, with polyphonic expression to control the vibrato on individual notes, much like an MPE MIDI controller.


Most of this article will cover instrumental text-to-music tools, but if you're looking for text-to-singing-vocals, visit the ACE Studio website to learn more.


MusicGen and SoundGen

Websites: MusicGen / SoundGen 

Format: Hugging Face browser app


One month after MusicLM was released, Meta put out MusicGen. The audio quality is even better than Google's model, and in our estimation it is the only AI music generation tool that could disrupt the music industry in any meaningful way.


Their text-to-song technology includes a melody condition where users can upload a recorded audio file and combine it with written instructions about genre and instrumentation to create an entirely new song.

For the first six months, the best way to get high quality music from MusicGen was to sign up for a Hugging Face account and create your own space. When you add a payment card, you'll be able to level up to their medium and large models. Instead of relying on your local CPU, Hugging Face provides the computing power as a paid service.


Since then, a new product called SoundGen has come out that provides a better user interface with additional audio editing features that MusicGen lacks. It also includes unconventional prompting options like image to music.


We experimented with dozens of genres and found that it was particularly good at creating jazz, classical, rock, and chiptunes based on melody conditions. Try inputting a melody from the main soundtrack of a classic arcade game and see how it reinterprets it!


Each generation takes between 30 seconds and 3 minutes, depending on the model you use. Once you've created it, you can take a listen and download it. For a detailed walkthrough on how to use and prompt the models, check out our full length article on MusicGen.


Stable Audio by Stability AI

Website: Stable Audio

Format: Website

Price: Freemium with $12/month


Stable Audio was brought to us by Stability AI, the company that first rivaled Midjourney with its Stable Diffusion model. It is the first audio synthesis tool to go for a commercial model. Trained on nearly 800,000 labeled audio files from the AudioSparx directory, Stable Audio offers a high quality text-to-audio service that covers both music and sound effect generation. What it lacks in the music conditioning and extensibility of MusicGen, it makes up for in sound quality.


MusicFX (Previously MusicLM)

Website: MusicLM

Format: Browser and standalone app


The Google Arts and Culture team has been exploring AI music generation for years, notably with Magenta Studio, but MusicLM was the company's first venture into creating songs from text prompts.


We originally covered MusicLM in January 2023, when it was still just a technical paper published by its developers. In May 2023, they published a fully functional beta version that was free for anyone to use. You could access it in a browser or download AI Test Kitchen from the App Store to open it locally.


Now in 2024, they've made some updates to the app and renamed it to MusicFX.


Google's text-to-song model was a big improvement on Riffusion, producing longer clips at higher fidelity. They accomplished this with the help of MusicCaps, AudioSet, and MuLan, music datasets and models that draw on over 40 million YouTube videos.


The music industry hasn't made much of a fuss over AI Test Kitchen's music generator, probably because the quality is still not good enough to disrupt real music recordings.


It's worth noting that Universal Music Group has already started collaborating with Google to train AI models on their music. We may see a much more powerful version of MusicFX drop this year, with artist remunerations built into the system.


Riffusion

Website: Riffusion

Format: Browser app


Riffusion was one of the first to tackle text-to-music in the public eye. In December 2022, they made headlines with a simple web app that turned descriptions of music into short audio samples in that style.

By October 2023, the company had released a new and improved version of the app. Users could log in and build their audio library with text-to-music prompting. Like Suno, it let users type in lyrics and hear them sung back by an AI vocalist. The company also announced that they had raised a $4M round.


They remained quiet for the better part of the year, before resurfacing in July 2024 with a mobile app that converts videos into songs. Users can scroll to their heart's content, hearing custom songs in every genre about their day to day life.


Read our full summary of Riffusion here.


WavTool AI DAW (GPT-4)

Website: WavTool

Format: DAW browser app


Ever wondered what it would be like to have a super-intelligent assistant by your side to help generate music? WavTool is an AI DAW that loads in your web browser and comes equipped with a GPT-4 text to music plugin for advanced users on the monthly premium plan. Prompt the assistant to create new tracks, adjust the mix, and even compose new melodies and chord progressions. It's a lot of fun and will have you wondering why other DAWs don't offer the same features!

We've previously covered the strengths and weaknesses of ChatGPT music prompts. WavTool's secret sauce is the ability to translate GPT's output into direct actions within the DAW. So instead of simply printing out a list of chords in plain text, you get a complete MIDI file.
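As a rough illustration of the gap between a plain-text chord list and playable MIDI data, the sketch below parses chord symbols into MIDI note numbers. It is not WavTool's implementation, which we have not seen. The function names, the small set of supported chord qualities, and the default octave are assumptions chosen to keep the example short.

```python
# Hypothetical chord-symbol parser. NOT WavTool's code.
# It only shows the text-to-note-data step in miniature.
NOTE_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Intervals in semitones above the root for a few common chord qualities.
CHORD_INTERVALS = {
    "": [0, 4, 7],          # major triad
    "m": [0, 3, 7],         # minor triad
    "7": [0, 4, 7, 10],     # dominant seventh
    "maj7": [0, 4, 7, 11],  # major seventh
    "m7": [0, 3, 7, 10],    # minor seventh
}


def chord_to_midi(symbol, octave=4):
    """Convert a chord symbol such as 'Am' or 'Bbmaj7' to MIDI note numbers."""
    root = NOTE_TO_SEMITONE[symbol[0].upper()]
    quality = symbol[1:]
    if quality.startswith("#"):
        root += 1
        quality = quality[1:]
    elif quality.startswith("b"):
        root -= 1
        quality = quality[1:]
    if quality not in CHORD_INTERVALS:
        raise ValueError(f"Unsupported chord quality: {symbol!r}")
    base = 12 * (octave + 1) + root  # MIDI convention where C4 is 60
    return [base + interval for interval in CHORD_INTERVALS[quality]]


def parse_progression(text):
    """Turn a comma- or space-separated chord list into lists of MIDI notes."""
    return [chord_to_midi(token) for token in text.replace(",", " ").split()]


print(parse_progression("Am, F, C, G"))
```

A DAW would still need to place these notes on a timeline with durations and velocities before writing a MIDI file. This sketch covers only the pitch step.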

WavTool recently introduced a separate text-to-music sample generator that turns descriptive music prompts into sound effects and short loops. Their artificial intelligence toolset also includes a "continue" feature, for generating music based on an initial MIDI input.


Mubert AI text-to-music generator


Format: Browser app


Mubert is an AI music generator that comes with a text to music web app. It's not their primary offering, but it's still a fun piece of tech to explore. You can enter prompts, set your track duration, and hit a generate button. In less than a minute, you'll have a complete song idea with details about the BPM and key signature.


Behind the scenes, your text prompt is encoded into latent space vectors by a transformer neural network and matched against existing labeled MIDI loop data. The closest tag vectors are chosen and sent to the Mubert API, which generates entirely new music from them. You can find their Python code at this GitHub repo, if you want to learn more. They also offer a Google Colab environment for more nuanced experimentation.
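The "closest tag vectors" step can be pictured as a nearest-neighbor search. The sketch below ranks tags by cosine similarity to a prompt vector. It is an illustration only: the three-dimensional vectors are made up by hand, the function names are our own, and a real system would get its vectors from a transformer encoder rather than a hard-coded table. Cosine similarity is our assumed metric, not one confirmed by Mubert.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_tags(prompt_vector, tag_vectors, top_k=2):
    """Return the top_k tag names whose vectors are most similar to the prompt."""
    ranked = sorted(
        tag_vectors,
        key=lambda tag: cosine_similarity(prompt_vector, tag_vectors[tag]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy, hand-made embeddings (assumed axes: calm, energetic, electronic).
TOY_TAG_VECTORS = {
    "ambient": [0.9, 0.1, 0.4],
    "techno": [0.1, 0.9, 0.9],
    "acoustic": [0.8, 0.2, 0.0],
    "drum and bass": [0.0, 1.0, 0.7],
}

# Pretend this is the encoder's output for a prompt like "driving club track".
prompt_vector = [0.2, 0.8, 0.9]

print(closest_tags(prompt_vector, TOY_TAG_VECTORS))
```

In this toy setup the selected tags would then be passed on as the request to the music generation service.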


Voicemod text-to-song

Website: Voicemod

Format: Browser app


Sometimes you just want to have fun without trying to create serious music. Voicemod's text-to-song app falls into that category. It's closer to a meme generator than a composition tool for musicians, but it's still an impressive piece of tech.


Users choose a genre and an AI voice to get started. Type in your lyrics and the app will create a short pop song. Part of their AI magic is the ability to match the cadence of your words with a melody that fits into the instrumental backing track. You can share the file with friends and have a laugh, but it won't take you much further than that.


Melobytes

Website: Melobytes

Format: Browser app for procedurally generated music


If Voicemod wasn't lo-fi and lowbrow enough for you, Melobytes should do the trick. This web app is great at producing harsh and absurd sound bites based on your text input. It's an AI-generated music app, but not the kind of solution that most musicians are looking for. It's more of a crunchy meme generator for internet trolls.

Melobytes includes a number of musical parameters including language, tempo, tonality, and time signature. After experimenting with the site extensively, we're not sure how these attributes are mapped onto the text input. Go into the experience without high expectations and you'll probably have some fun.


Typeatone

Website: Typeatone

Format: Browser app


Typeatone is a simple web app built in 2016 for entertainment purposes. The site lets you use the QWERTY keyboard as a music keyboard. But instead of showing a standard piano interface, it takes your lyrics and turns each letter into a melody sequence. Click the music note icon in the toolbar to switch up your instrument from the default bell tone to a variety of other pleasant sounds. If you hear something you like, use the share icon to send it over to a friend.


We're excited to see how this space develops as AI-generated music becomes more advanced in the coming years. Subscribe to the AudioCipher newsletter for the latest news on this niche!

2 Comments


quinellipe.zorn
June 29, 2023

Great list, thanks! But I was quite surprised - even a little shocked - not to see Aiva listed. It's definitely one of the best ones I've seen, for non-musicians and musicians alike, though it is pricey.

Ezra Sandzer-Bell
July 3, 2023

Hey there, thanks for reading. AIVA isn't a text-to-music app in the way we're defining it here, but we did include it in our complete guide to AI music apps here: https://www.audiocipher.com/post/ai-music-app
