Text-to-speech apps like Siri have been around for decades, but they were never able to hold a tune. Recent improvements to AI voice models and AI-generated music have created fresh demand for AI singing voice generators. The market is responding with a wide range of tools and solutions.
AI web apps like Suno and Riffusion combine text-to-music with AI singing voices. Users describe the genre or style of music they want to hear and have a song in 30-60 seconds. The AI vocals match the instrumentals perfectly.
Behind the scenes, these kinds of text-to-singing software were trained on large collections of vocal data. Some companies allow users to train their own AI models with existing voice recordings.
In this article, we’ll share a collection of the best AI singing voice generators that we’ve found and include tutorials on how to use them.
Best AI singing voice generators for music producers
There are a lot of AI voice generators out there, but most are not designed for musicians. This article only focuses on the vocal apps that you can use to start creating vocal melodies with artificial intelligence.
If you enjoy text-to-speech apps and plan on designing melodies for your AI voice generator, be sure to check out AudioCipher's text-to-midi VST. You'll be able to type in words and phrases, convert them into melodies, shape them to sound the way you want and apply them to your AI voices using the apps below.
KITS.AI - Voice to Voice audio conversion
Kits AI is a freemium web app that delivers voice-to-voice audio conversion based on a variety of high-quality, royalty-free voice models. Users record vocals directly into the app or upload a clean vocal audio file in mp3 or wav format. During our tests, it took less than a minute to complete the AI voice conversion, and all of the subtleties from the vocal performance were retained.
If you're looking for a sound that the existing voice collection doesn't offer, Kits AI includes an AI voice model creation feature. Upload up to 30 minutes of a cappella audio from a single voice and, with a single click, train your own custom AI model. Check out their voice model creation guide before you get started, to familiarize yourself with best practices.
Kits has partnered directly with artists to co-release tracks created from their AI voice. As the music industry adopts this kind of technology, it's important that artist consent and ethics are taken into consideration. Users can draw from the official artist library and submit their songs for a commercial co-release alongside that artist. The free plan lets you train two AI voice models and access the royalty-free voice library.
Controlla Voice - Train & blend your own voice model
Controlla.XYZ launched as a spatial audio company and has grown into a mature web app where people can train their own AI singing voice models. Founder Rohan Paul announced that the company reached 10,000 ethically trained AI voices before the end of November 2023 (source).
How do you train a Controlla Voice model?
Controlla Voice allows users to train AI singing models from a cappella vocal stems. Vocal takes should ideally include a few different intensity levels and feature melodies that span an octave or more. There are exceptions, like training a rapping or speech model, where the pitch range can be less than an octave.
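As a rough sketch of the pitch-range guideline above, the check below tests whether a vocal take covers at least an octave. It assumes the melody has already been transcribed into MIDI note numbers, where 12 semitones make up one octave. The function name `spans_octave` is hypothetical and is not part of Controlla's product.

```python
SEMITONES_PER_OCTAVE = 12


def spans_octave(midi_notes):
    """Return True if the notes cover a range of at least one octave.

    midi_notes is a list of MIDI note numbers (60 = middle C).
    This is an illustrative helper, not a Controlla API.
    """
    if not midi_notes:
        return False
    return max(midi_notes) - min(midi_notes) >= SEMITONES_PER_OCTAVE


# Example: C4 (60) up to D5 (74) is a 14-semitone range.
wide_take = [60, 64, 67, 74]
# Example: C4 (60) up to E4 (64) is only a 4-semitone range.
narrow_take = [60, 62, 64]

print(spans_octave(wide_take), spans_octave(narrow_take))
```

A take that fails this check may still be fine for a rapping or speech model, where a narrower range is expected.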
When the training is complete, anyone can sing in that vocal style. Or at least, an artificially intelligent approximation of it... Once there are two or more AI voices, things get even more interesting.
Controlla lets you blend voices to create hybrid AI vocalists. Our team enjoyed the blending feature and exploring the latent space between real human voices.
These new, blended vocal avatars represent a completely new and original sound. We can imagine a Controlla audio marketplace where vocalists would create and license their vocal models. Digital music producers who can sing but don't want to use their own voice could purchase access to voices and blend multiple styles together into something new.
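To make the idea of a space between voices more concrete, here is a minimal sketch of one common way to blend two points in a latent space: linear interpolation. It assumes each voice is summarized as a fixed-length list of numbers (an embedding). Controlla has not published its blending method, so the function `blend_voices` and the example vectors are purely illustrative.

```python
def blend_voices(voice_a, voice_b, weight):
    """Linearly interpolate between two voice embeddings.

    weight = 0.0 returns voice_a, weight = 1.0 returns voice_b,
    and values in between return a mix of the two.
    Hypothetical helper for illustration only.
    """
    if len(voice_a) != len(voice_b):
        raise ValueError("embeddings must have the same length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0.0 and 1.0")
    return [(1.0 - weight) * a + weight * b for a, b in zip(voice_a, voice_b)]


# Made-up two-dimensional embeddings, just to show the arithmetic.
singer_one = [0.0, 2.0]
singer_two = [4.0, 6.0]

halfway = blend_voices(singer_one, singer_two, 0.5)
print(halfway)
```

Real voice models use far larger embeddings and may blend them in more sophisticated ways, but the intuition of sliding a weight between two voices is the same.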
Visit their website - Controlla.xyz - to learn more.
ACE Studio - AI singing with an audio-to-MIDI DAW
Serious musicians who want control over their vocal melodies should have a look at ACE Studio. The DAW centers around AI voice changing technology with a granular audio-to-MIDI transcription model. In other words, you can grab individual notes as waveforms and move them around at will. The tone quality stays the same even as you move the pitches up or down.
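For readers curious about what moving a note up or down means numerically, the sketch below uses the standard equal-temperament relationship between MIDI note numbers and frequency, with A4 (MIDI note 69) tuned to 440 Hz. It says nothing about how ACE Studio works internally; the function names are our own.

```python
A4_MIDI = 69
A4_HZ = 440.0


def midi_to_hz(note):
    """Convert a MIDI note number to its frequency in Hz (12-tone equal temperament)."""
    return A4_HZ * 2 ** ((note - A4_MIDI) / 12)


def transpose(notes, semitones):
    """Shift every MIDI note in a melody by the same number of semitones."""
    return [note + semitones for note in notes]


# A C major triad moved up a whole step (two semitones).
melody = [60, 64, 67]
shifted = transpose(melody, 2)

for note in shifted:
    print(note, round(midi_to_hz(note), 2))
```

Every 12 semitones doubles the frequency, which is why dragging a note up an octave sounds like the same pitch class at a higher register.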
Sing directly into ACE or upload an audio track, choose your voice model, and transfer the vocal style in a matter of seconds. If you don't have a vocal track, record MIDI directly into the DAW and enter lyrics. ACE will sing the track for you using the voice of your choice.
The Custom Singer feature lets you blend multiple voices together to create your own vocal tone. Control parameters like timbre, style, and volume levels to get the precise sound you're looking for.
Our biggest issue with voice changing software has been the difficulty in controlling the expressiveness of individual notes in a performance. So we were thrilled to find that ACE Studio includes hand-drawn motion parameters so you can shape the energy, tension and breath of each melodic line.
Everything happens in the cloud, which means processing times are super fast and you don't need to bog down your local CPU with rendering times.
Worried about the ethics of it all? Each of the AI vocal models has been licensed and cleared for commercial use. That means it's safe to release a track with these AI voices. Many of the singers are free to use for digital album sales, advertising, and film or television as shown below.
Find a full video tutorial playlist on ACE Studio's YouTube channel here. You'll be able to learn more about how it works before you hop in and get started. Otherwise, if you're ready to download the app, visit their website and have fun!
Synthesizer V DAW + Vocoflex Voice Morphing Plugin
Synthesizer V is a DAW with a built-in, state of the art AI voice generator. Like ACE Studio, users can sculpt the melody performed by the AI. Drag notes up and down in a hybrid waveform-MIDI editor. Synth V lets you control subtle articulations in the vocal performance, to help maintain a realistic, emotional tone of the voice.
The parent company, Dreamtonics, is based in Tokyo. It's a city where virtual singer Hatsune Miku was embraced and where emerging VST companies like NeuTone have a shot at success. Japanese laws protect AI music generation.
Vocoflex is Dreamtonics' latest vocal morphing plugin, released July 2024. It loads inside of any DAW (including Synthesizer V), making it easy for producers to target a voice profile with samples as short as 10 seconds.
In my opinion, the Vocoflex voice model visualizer shown above is the most innovative feature, setting it apart from every other plugin. Here's how it works:
Open your DAW and create a new, empty audio track. Import or record a clean vocal take and load the Vocoflex plugin on that track.
The interface will prompt you to import a collection of isolated vocal samples of at least 10 seconds in length. The cleaner the audio, the better.
Let Vocoflex analyze these imported voices and generate visualizations. Each vocal sample is represented as a curve, with a few nodes or dots along the curve representing timbral chunks taken from the original sample.
Hover your mouse over these regions to hear realtime timbre transfer. Vocoflex will draw colorful, geometric lines between all of the nodes along the curve that you've selected, to help you visualize and interact with the model's latent space.
Producers who want to take their hands off the keyboard can add waypoints to these vocal curves and map them to a MIDI controller. Use knobs and sliders to explore the movement between voices in a more kinesthetic way.
Explore the space between voices, to blend and transform them into something completely new. This can be particularly interesting for sound designers looking to synthesize new, fictional voices. It might also be helpful for prototyping a verse before sending it over to an artist.
Check out the Vocoflex website to learn more.
Vocaloid - Yamaha's AI Voice Generator for musicians
Vocaloid by Yamaha was also built with music producers in mind. With over 100 voices to choose from, you'll be able to easily test out different vocal types on your track. Vocaloid 6 does include a voice changer, so you can sing a melody and transform it, but we found it to be less feature rich than ACE Studio.
Revocalize - Record and convert your voice
Revocalize skipped over the text-to-speech tools and dove straight into voice changing. That laser focus has given them the bandwidth to become one of the best apps for generating AI vocal tracks. Subtle qualities of your voice, like your accent or the emotion you're feeling when you speak, will transfer to the new voice. Check out their landing page to hear a demo of the voice changer.
The company says that they aim to protect your voice, but it's not clear how they plan to do that. They could follow in Holly Herndon's steps and use a DAO to manage and sell access. Water and Music has been sharing lots of content about how to sell AI-related music materials, including your voice, on Web3 as well.
Emvoice - Choose from a collection of AI singers
Emvoice One has taken a novel approach to AI singing software, combining a MIDI piano roll interface with text boxes for lyrical snippets. Users program a melody manually and for each melodic segment, Emvoice will spawn a dedicated text area. Type in your short phrase and the vocal model will do its best to match the melodic shape to the pattern of your words.
Fans of their software have noted that the point-and-click approach to melodies can be a bit time consuming. If you want to give it a spin before committing to the $69 vocal models (a price tag that's on par with competitors), they do offer a free trial that's limited to melodies with just seven notes.
Uberduck - The O.G. Text to Speech Generator
We've already covered Uberduck in detail here, but thought it would be interesting to point out their new AI Grimes Challenge. In 2023, Grimes took to Twitter and announced that creators have permission to use her AI voice, as long as she gets 50% of royalties.
So if vocal synthesis is exciting to you and you want to experiment with using a celebrity voice to write a hit song, Grimes is a safe bet. You'll likely be able to post the music online without a DMCA takedown.
Google Colab - The AI music underground
There are only half a dozen AI voice apps designed for singing, but you can find many more options from independent developers and hackers on the internet. One good way to find them is to look for a popular text-to-speech tool, like ElevenLabs, and then run a Google search like "ElevenLabs singing". You'll find a number of Reddit, Twitter, and Quora conversations on the topic.
To give a concrete example, one Reddit thread pointed us to the Singing Voice Conversion model on Google Colab. These tools require little more than the ability to upload a file, press Colab's play button one step at a time, and wait patiently while your songs render.
A brief history of singing AI voices
In April 2023, an AI Drake song featuring The Weeknd was published. This track, titled Heart on My Sleeve, had a reported 600,000 Spotify streams, 15 million TikTok views, and 275,000 YouTube views when labels clamped down and ordered a takedown.
On October 18th 2023, Universal Music Group announced in a press release that it would partner with BandLab to begin protecting artist voices, citing Taylor Swift as an example. The RIAA has repeatedly said that it considers AI voice impersonation a credible threat to the industry's bottom line. Labels may have legal precedent to sue artists who monetize these celebrity AI voices.
There are some real dangers with voice transfer technology, and they go beyond music rights. Scammers have started using voice clones to target elderly people and exploit them for money.
It’s not all bad news, though. Some artists are welcoming the new technology and selling direct access to their voice. Grimes, a major pop star and the former partner of Elon Musk, announced in 2023 that anyone could use her AI voice as long as they shared royalties when the AI song becomes a hit. She went on to publish a free AI platform called Elf Tech to distribute direct access.
Back in 2021, the popular indie artist Holly Herndon trained an artificial intelligence model on her own voice and released it under the name Holly+. She sells access to it via a DAO and has an AI music podcast where she discusses these topics in detail.
Web applications are a great starting point, but real artists need music production plugins that fit into their workflow. Early examples of digital voice plugins, like Delay Lama, have propelled some artists to fame.
There you have it - a complete guide to the most popular AI voice generators in 2024. We hope you've found this guide helpful!