10 Machine Learning Music Communities on Discord
AI text and image generators have become remarkably powerful over the past two years. These developments have spurred conversations about machine learning and music. What will it take to bring text-to-music generators into a consumer-ready experience, as we've seen with tools like Midjourney and ChatGPT?
In this article we'll introduce you to ten of the most active Discord communities that specialize in music and machine learning. But before we jump into the deep end, we'll make sure you have some context for what you're getting into.
10 Discord communities that specialize in music and machine learning
How machine learning is shaping music
Can machine learning be used to compose music based on samples?
What is a music machine learning engineer?
How to gather music data for machine learning
10 Discord communities for music machine learning

10 Discord communities for music machine learning
They say the best way to learn a foreign language is through immersion. There are fortunately a number of active communities on Discord that center on music and machine learning. So whether your goal is to meet other people in this space and sharpen your technical chops, or just stay abreast of the latest trends, these channels might be just what you're looking for.
Harmonai Community - 21,600+ members around Harmonai's music model
The Audio Programmer - 8,500+ member community of music & audio devs
Jukebox Community - 4,850+ members, based on OpenAI Jukebox model
Riffusion Community - 1,500 members, on Riffusion text-to-music model
Dadabots Community - 500+ members, focused on generating AI metal
Neutone Community - 450 members, on AI music research and creativity
DDSP Community - 450 members, on Google Magenta's DDSP plugin
Machine Learning with Audio - 250+ member research community
AI x Audio Engineering - 100-member community with a heavy AI focus
MuseNet Community - 100 members, on OpenAI's MuseNet MIDI generator
How machine learning is shaping music
In December 2022, we saw the debut of a tool called Riffusion that sought to turn text prompts into complete music samples. The quality of the output was not great, but the concept was brilliant.
Riffusion's developers used image generators to create spectrograms (visual representations of an audio file's frequency content over time) and then converted those images back into sound.
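To make that second step concrete, here's a minimal sketch of turning a spectrogram image back into audio, assuming a grayscale image whose pixel brightness encodes magnitude in decibels. The file names and parameter values are illustrative assumptions, not Riffusion's actual settings; the missing phase information is estimated with the Griffin-Lim algorithm via librosa.

```python
# Minimal sketch: spectrogram image -> audio.
# Assumes pixel brightness maps to magnitude in dB; all parameters
# here are illustrative, not Riffusion's actual settings.
import numpy as np
from PIL import Image
import librosa
import soundfile as sf

img = np.asarray(Image.open("spectrogram.png").convert("L"), dtype=np.float32)
img = np.flipud(img)                         # image rows usually run high-to-low frequency
db = img / 255.0 * 80.0 - 80.0               # map pixel values to a -80..0 dB range
magnitude = librosa.db_to_amplitude(db)      # dB -> linear magnitude
audio = librosa.griffinlim(magnitude, n_iter=32, hop_length=512)  # estimate phase
sf.write("output.wav", audio, 22050)
```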
Another big innovation arrived in early 2023 with the publication of Google's MusicLM. The team has so far withheld public access to the tool, but they have published a demo page showcasing the quality of audio their machine learning models can create.
A second Google Research team published another paper in February 2023 for a project titled Noise2Music. It can handle complex phrases and fine-grained semantics: the same text prompt can be modified with details about genre, instrument, tempo, mood, vocal traits, and era of music.

Noise2Music's powerful text-to-music prompt system is powered by MuLan, a joint audio-text embedding model that links music recordings to written descriptions of music. MuLan was trained on 44 million music recordings (370,000 hours of audio) paired with free-form text annotations. This means a single prompt can produce different musical output each time, much like Midjourney or DALL-E 2.
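For a rough intuition of how a joint embedding is used, here's a toy sketch: text and audio are mapped into the same vector space by separate encoders (stubbed out with random vectors below, since MuLan's networks aren't public), and cosine similarity scores how well each clip matches a prompt.

```python
# Toy illustration of joint audio-text embedding retrieval.
# Real encoders (like MuLan's) are replaced with random vectors here.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
text_vec = rng.random(128)                        # stand-in for embed_text("upbeat jazz piano")
audio_vecs = [rng.random(128) for _ in range(3)]  # stand-ins for embedded audio clips

scores = [cosine_similarity(text_vec, v) for v in audio_vecs]
best_clip = int(np.argmax(scores))                # index of the best-matching clip
print(scores, best_clip)
```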
Can machine learning be used to compose music based on samples?
Yes, machine learning models can be used to compose music based on samples. Apps like Riffusion provide a simple web interface comparable to DALL-E 2 and MuseTree.
Complex, high-quality music can take as long as five days to render on an Nvidia A6000 graphics card. Engineers currently seem to be running these builds locally with tools like Dance Diffusion and RAVE.
Concerned about privacy and the impact of this technology on artists? Independent organizations like HaveIBeenTrained are rolling out tools that help visual artists opt out of machine learning data sets. In most cases, artists discover their work in data sets they never consented to.
HaveIBeenTrained handles imagery only, and we're not aware of an equivalent for audio yet. Let us know in the comments section if you find a similar resource and we'll update this article.
What is a music machine learning engineer?
Musicians with a background in computer science have a home in machine learning engineering. Their job is to train neural networks to make better predictions from datasets.
So when it comes to AI music generation, an ML music engineer will have an aptitude for math and programming, along with a solid grasp of audio processing concepts. It helps to understand the basics of music theory too.
Engineers who specialize in AI music may work on well-funded teams at companies like Google and Spotify, but many of them are building open-source software independently in Discord communities online.
Machine learning engineer Steve Hiehn is a great example of a solo developer leveraging AI to create his own music apps. He used a Python MIDI library called Mingus to build a free web app and plugin called Signals & Sorcery. It doesn't convert text to music, but it does leverage AI to mix pre-existing audio together.
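If you're curious what Mingus looks like in practice, here's a small generic sketch (not code from Signals & Sorcery) that spells out a four-chord progression and writes it to a MIDI file:

```python
# Generic Mingus usage sketch: build chords and export them as MIDI.
# Illustrative only -- not code from Signals & Sorcery.
from mingus.core import chords
from mingus.containers import Bar, Track
from mingus.midi import midi_file_out

track = Track()
for name in ["Cmaj7", "Am7", "Dm7", "G7"]:
    bar = Bar()                                      # one 4/4 bar per chord
    bar.place_notes(chords.from_shorthand(name), 1)  # whole-note chord
    track.add_bar(bar)

midi_file_out.write_Track("progression.mid", track, bpm=90)
```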
Does AudioCipher use machine learning to generate music?
AudioCipher does not use artificial intelligence to generate audio. It's a MIDI encryption system designed around the need for chord and melody inspiration during DAW sessions. Our software uses a crypto-grammatic algorithm to convert text to MIDI, and introduces two layers of chord and rhythm randomization to spawn infinite variations.
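For intuition only, here's a hypothetical sketch of the general idea behind text-to-MIDI mapping: assign each letter a scale degree and emit MIDI note numbers. This is not AudioCipher's actual algorithm, and it omits the chord and rhythm randomization layers described above.

```python
# Hypothetical text-to-MIDI sketch -- NOT AudioCipher's actual algorithm.
# Letters map onto C-major scale degrees; later letters shift up an octave.
C_MAJOR = [60, 62, 64, 65, 67, 69, 71]  # MIDI note numbers for C4..B4

def text_to_midi_notes(text: str) -> list[int]:
    notes = []
    for ch in text.lower():
        if ch.isalpha():
            index = ord(ch) - ord("a")
            degree = index % len(C_MAJOR)
            octave_shift = 12 * ((index // len(C_MAJOR)) % 2)
            notes.append(C_MAJOR[degree] + octave_shift)
    return notes

print(text_to_midi_notes("audio"))  # [60, 71, 65, 74, 60]
```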
This means you can use AudioCipher with any AI music tool that accepts MIDI or audio file uploads. For example, the MusicLM website demonstrates a text-to-music generator: you upload an audio file and type in a prompt describing a specific genre or set of instruments. Strum a melody, upload the recording, and MusicLM will use it as the lead voice in your text-prompted music.

How do you gather music data for machine learning?
Each AI music model has its own specifications for the types of music data it will accept.
MusicCaps is currently a popular text-to-music dataset, used in Google projects like MusicLM and Noise2Music. Google also uses AudioSet.
PapersWithCode offers a large collection of music data sets for machine learning training. Here are a few of the best options listed there.
MUSAN is a dataset for training models for voice activity detection (VAD) and music/speech discrimination. The dataset consists of music from several genres, speech from twelve languages, and a wide assortment of technical and non-technical noises.
MAESTRO is a dataset from Google Magenta that contains over 200 hours of paired audio and MIDI recordings from ten years of the International Piano-e-Competition. The MIDI data includes key strike velocities and sustain/sostenuto/una corda pedal positions. Audio and MIDI files are annotated with composer, title, and year of performance.
JAMENDO is an open dataset for music auto-tagging. The dataset contains over 55,000 full audio tracks with 195 tag categories (95 genre tags, 41 instrument tags, and 59 mood/theme tags). It is built from music available under Creative Commons licenses. All audio is tagged and distributed in 320kbps MP3 format.
There are many more data sets to explore on the PapersWithCode website.
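Whichever dataset you choose, a common first preprocessing step is converting each audio file into a fixed-rate mel spectrogram for training. Here's a minimal sketch assuming a flat folder of MP3s; the directory name, sample rate, and mel settings are illustrative assumptions.

```python
# Minimal preprocessing sketch: audio files -> mel spectrograms on disk.
# Folder name and parameter values are illustrative assumptions.
from pathlib import Path
import numpy as np
import librosa

SAMPLE_RATE = 22050
N_MELS = 128

for path in Path("my_dataset").glob("*.mp3"):
    audio, _ = librosa.load(path, sr=SAMPLE_RATE, mono=True)
    mel = librosa.feature.melspectrogram(y=audio, sr=SAMPLE_RATE, n_mels=N_MELS)
    mel_db = librosa.power_to_db(mel, ref=np.max)  # log-scale for training
    np.save(path.with_suffix(".npy"), mel_db)
```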
You'll find that many of the people in these machine learning music Discords are building their own training data from scratch. If you need occasional help with questions, or want to help others get their start, check out these communities!