
Spotify AI Tools: Basic Pitch, Enhance, and More

Spotify has been quietly positioning itself in the AI music space for some time.


Founded in 2006, they're the top competitor to Apple Music, with a collection of 70 million songs and 4 million podcasts, 188 million users, and a current valuation of $19.8B.


People tend to think of Spotify as a streaming service. But the company has been investing in other sectors of the music industry. In 2017 they acquired Soundtrap, the maker of Soundtrap Studio, a free and mobile-friendly music creation app that helps people write and record songs quickly.


SoundTrap Studio
SoundTrap, the Spotify DAW. Header image from homepage.

A digital audio workstation may seem strange coming from a playlist app, but their mission statement clarifies things a bit:


“Our mission is to unlock the potential of human creativity—by giving a million creative artists the opportunity to live off their art and billions of fans the opportunity to enjoy and be inspired by it.” - Spotify

As it so happens, Spotify has a number of AI music features and products in their bag of tricks. The music library's metadata alone is a goldmine for machine learning experiments.


In this article we'll cover some of Spotify's most popular artificial intelligence projects, some random offshoot applications, and instructions on how to begin your own experiments with their Web API.





Spotify AI: A Quick Overview


Imagine building a playlist of your favorite songs and sharing it with a friend. You ask them to recommend music they think would go well with it.


Spotify wants to deliver you that highly personalized playlist. But instead of hiring millions of music experts to do it manually, they've acquired and developed machine learning algorithms that can do it automatically (or at least, try to).


MUSIG by Spotify
Spotify's MUSIG model for representing music

There are two major components at play in Spotify's music recommendation engine: content-based filtering and collaborative filtering.

  1. Content-based filters evaluate a track on twelve audio features: acousticness, liveness, speechiness, instrumentalness, energy, loudness, danceability, valence, duration, tempo, key, and mode.

  2. Collaborative filters take a look at Spotify’s user-generated assets (like playlists) to find songs with a similar audience.
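
To make the two filter types a little more concrete, here's a minimal Python sketch. It is not Spotify's actual recommendation code. The track names, feature values, and playlists are all made up for illustration, and I've assumed cosine similarity for the content-based side and simple playlist co-occurrence counts for the collaborative side.

```python
import math
from collections import Counter

# Hypothetical feature vectors (illustrative values, not real Spotify data).
# Order: energy, danceability, valence, acousticness
TRACK_FEATURES = {
    "track_a": [0.90, 0.80, 0.70, 0.10],
    "track_b": [0.85, 0.75, 0.65, 0.15],
    "track_c": [0.20, 0.30, 0.40, 0.90],
}

# Hypothetical user-generated playlists.
PLAYLISTS = [
    ["track_a", "track_b"],
    ["track_a", "track_b", "track_c"],
    ["track_b", "track_c"],
]


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def content_based_neighbors(track_id):
    """Rank other tracks by how similar their audio features are."""
    target = TRACK_FEATURES[track_id]
    scores = {
        other: cosine_similarity(target, features)
        for other, features in TRACK_FEATURES.items()
        if other != track_id
    }
    return sorted(scores, key=scores.get, reverse=True)


def collaborative_neighbors(track_id):
    """Rank other tracks by how often they share a playlist with track_id."""
    counts = Counter()
    for playlist in PLAYLISTS:
        if track_id in playlist:
            counts.update(t for t in set(playlist) if t != track_id)
    return [track for track, _ in counts.most_common()]


print(content_based_neighbors("track_a"))
print(collaborative_neighbors("track_a"))
```

Notice that the content-based ranking needs no listening data at all, while the collaborative ranking never looks at the audio. A real system would presumably blend many more signals than this.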

Together, these filter sets allow the Spotify AI to make predictions about what you might want to hear next. The illustrations below come from a piece written by Ashrith Shetty, titled What Makes a Song Likeable?


The changing slides of the animations depict six Spotify audio features across a hundred individual Spotify tracks.


Spotify audio features
Six audio features across 100 songs
Overlapping audio features


AI Software and Talent Acquired by Spotify


Francois Pachet

Meet Francois Pachet, founder of Sony’s Computer Science Laboratory in 1997. He pioneered the use of metadata to enhance music creation at Sony and went on to build Flow Machines, a generative music app. Eventually he released a full album of AI Music titled Hello World. His AI song Daddy’s Car was widely publicized.


In 2014, Spotify acquired The Echo Nest, a music intelligence and data platform. According to LinkedIn, Francois Pachet left Sony in 2017 to join Spotify’s team as the director of their Creator Technology Research Lab. That same year, Spotify acquired another AI audio processing startup called Niland.


In 2021, Spotify announced the acquisition of Podz for their podcast recommendation system. A few months ago, in June 2022, they released Spotify Karaoke and bought Sonantic, an AI voice company. Forbes ran an article at the end of that month, hinting that AI music creation is coming down the pipeline.

1. Basic Pitch: Spotify’s Audio-To-MIDI Converter


In June of 2022, Spotify announced that their Audio Intelligence Lab had partnered with the Soundtrap team on an open source audio-to-MIDI application called Basic Pitch.


This software uses machine learning to transcribe audio recordings into MIDI notes. We’ve seen similar features from Melodyne and Ableton in the past, but Basic Pitch is computationally faster and capable of impressive polyphonic transcriptions.


I definitely recommend playing with the Basic Pitch app and uploading some tracks to see how it does. If you’re a musician who’s been trying to figure out what the heck your favorite artist played on some song, now’s your opportunity to find out.


Benefits of Basic Pitch

  • Transcribe any instrument: Unlike some of the other note-detection algorithms, Basic Pitch is trained to convert audio from any instrument.

  • Works with chords (polyphonic): For a long time, audio-to-MIDI apps were limited to single-note melodies and had a hard time processing multiple notes played at once. Basic Pitch lets you transcribe full chords and melodies.

  • Pitch bend detection: Live music has expressive subtleties, like pitch bending, that are easily lost in MIDI. Basic Pitch captures and conveys these pitch bends in its MIDI output.

  • Faster than most tools: Basic Pitch requires minimal resources to run, so its output is almost instantaneous.

  • Mobile friendly: Works on iPhone and Android devices.
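
To see why pitch bends matter, it helps to look at how a detected frequency maps onto MIDI. The sketch below is not Basic Pitch's implementation. It only shows the standard conversion from hertz to a MIDI note number, plus a 14-bit pitch bend value for the leftover fraction of a semitone. The ±2 semitone bend range is an assumption (a common synth default), and the function names are my own.

```python
import math

A4_HZ = 440.0               # reference tuning for A4
A4_MIDI = 69                # MIDI note number of A4
BEND_CENTER = 8192          # 14-bit pitch bend midpoint, meaning "no bend"
BEND_RANGE_SEMITONES = 2.0  # assumed bend range of +/- 2 semitones


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a fractional MIDI note number."""
    return A4_MIDI + 12 * math.log2(freq_hz / A4_HZ)


def note_and_bend(freq_hz):
    """Return the nearest MIDI note and the bend value that covers the remainder."""
    exact = hz_to_midi(freq_hz)
    note = round(exact)
    offset = exact - note  # leftover pitch in semitones, roughly -0.5 to 0.5
    bend = BEND_CENTER + round(offset / BEND_RANGE_SEMITONES * 8192)
    bend = max(0, min(16383, bend))  # clamp to the valid 14-bit range
    return note, bend


# A perfectly tuned A4 needs no bend.
print(note_and_bend(440.0))

# A note a quarter of a semitone sharp of A4 keeps the same note number
# but gets a bend value above the center.
print(note_and_bend(440.0 * 2 ** (0.25 / 12)))
```

Without the bend value, both of those inputs would collapse to the same plain note 69, which is exactly the kind of expressive detail that gets lost in a note-only transcription.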

Problems with Basic Pitch

  • Difficulty with delay effects: As you can imagine, the system has some trouble with effects like delay. Are they separate notes or a single note with an effect? To the trained ear it’s obvious, but not for Basic Pitch. It's going to record each delay articulation as its own note.

  • Limited vocabulary for percussion: The MIDI output from Basic Pitch is rudimentary when it comes to percussion. We’ll let this go, since pitch values are easier to map than percussion sound libraries, but it does impact usability.

  • Struggles with dense arrangements: The more instruments you have in a mix, the harder it will be for Basic Pitch to pull them apart and represent them accurately. Instruments that share a similar timbre and pitch frequency may intrude on one another.

If you're a musician who wants to peer behind the curtain of a track and find out what chords the artist is using, now's your chance!


While Basic Pitch isn't flawless yet, it’s a step in the right direction. I personally find it exciting to see this kind of collaboration between departments happening at Spotify.



2. The Spotify Enhance Feature


Most premium Spotify accounts have access to a playlist-building feature called Enhance. Spotify can enhance a collection of tracks and generate personalized recommendations that expand on an existing theme of your choice. The best part is that you can use playlists of any size. So if there’s one song that’s really hitting the spot and you want to hear more like it, Enhance lets you do it.


Spotify Enhance
An enhanced Spotify playlist

The example above showcases how a single track can be turned into its own playlist. When you hit the enhance button, all of the other tracks appear moments later. Prepare to be impressed by the quality of their suggestions. I am regularly surprised by new artists that I discover with this tool.


Enhance your own songs: If your music is published on Spotify, you can expand it and hear other artists similar to you. Even if your music is brand new and has never been added to other user playlists, you’ll still get accurate pairings thanks to Spotify's AI audio feature detection.


Publishing tip: Finished your music but don’t know how to get it online? Check out digital distribution services like DistroKid or CD Baby. These are two great entry points for starting your career in the music business.



3. Roast My Spotify (A fun example of Spotify AI)

Judge My Music by Pudding
Using Spotify AI to analyze and make fun of your music taste

The next time you’re feeling good about your taste in music, strike up a convo with the AI bot over at Pudding’s Judge My Music. You’ll come head to head with the snarky cousin of Spotify Wrapped. Instead of taking you on a nostalgic trip through your listening habits, it mocks you for overdoing it on your favorite artists.


Some impressive tech lies behind the condescending remarks you’ll receive from the Pudding.cool Judge My Spotify experience. They've leveraged the Web API to connect to your Spotify account, review your listening history and make brazen cultural observations about you as a person.


The conversational aesthetic of the app lends itself to a good, immersive user experience. You’ll really feel like you’re being roasted by a know-it-all hipster.


Runner Up: Image-to-Music Discovery App

Mangomoji

Sometimes an emoji conveys your feelings better than words. Music is similar in that way. So this team of independent developers built Mangomoji, a bridge between worlds. Select emojis that describe how you feel and let the app introduce you to corresponding songs. Does it really work? You’ll have to give it a try yourself and be the judge.



So concludes our journey through the world of Spotify's machine learning algorithms and some of the playful apps that have emerged alongside them.


The Dark Side of Spotify and AI Music


If you've heard complaints about Spotify creators not being paid fair royalties, you might find this documentary interesting. It covers Spotify's revenue sharing model and how bad actors set up phone farms to exploit the system. The dark side of Spotify could be summarized as a scenario where the platform strips artists of fair and meaningful compensation for their work.


As a company driven by user-generated content, there's some question as to whether Spotify might eventually connect their AI music creation tools with the streaming platform itself. If, in the future, users could generate songs from custom parameters and then publish them to the Spotify content pool, it could further dilute artist revenue streams.


There's no telling how Spotify's user base would respond if the platform became saturated with AI-generated content. One can imagine a competitor sprouting up that banned AI music entirely and paid fair wages to human songwriters.


Despite the problems with artist revenue and quality control, I'm optimistic about the creative potential of neural networks in music. Audio engineers already use AI tools to augment and enhance their workflows. They've retained their agency over the music and will continue to, even as those tools become more powerful and sophisticated.


Developing your own Spotify AI app



Spotify is releasing solid libraries that developers can leverage to build their own apps. The Spotify Web API is freely available to developers and its documentation includes all of the resources you need to get started. Once you've read the docs, you can sign up for Spotify for Developers and log into your dashboard to begin the authentication process.


You can easily get a track's audio features from the Spotify Developer console. To run the query, you'll just need an OAuth token as described in their authorization guide.
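
If you'd rather run that query from a script than from the console, here's a rough Python sketch using only the standard library. It assumes the Web API's audio features endpoint at `/v1/audio-features/{id}` and a bearer token you've already obtained. The function names are my own, and whether the endpoint is available to your app may depend on your access level, so check the current docs before relying on it.

```python
import json
import urllib.request

API_BASE = "https://api.spotify.com/v1"


def build_audio_features_request(track_id, access_token):
    """Build (but do not send) a GET request for one track's audio features."""
    return urllib.request.Request(
        f"{API_BASE}/audio-features/{track_id}",
        headers={"Authorization": f"Bearer {access_token}"},
    )


def fetch_audio_features(track_id, access_token):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_audio_features_request(track_id, access_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (requires a valid token and network access):
# features = fetch_audio_features("<track id>", "<access token>")
# print(features["danceability"], features["tempo"])
```

Splitting the request builder from the network call makes it easy to inspect the URL and headers before you send anything.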


Generating the Spotify OAuth Token


This is all in the docs, but here's a summary.


First, you'll need to create a new app in the Spotify developer dashboard. You'll also need to install Node.js in order to run your server. Once you've cloned the example repo from the docs, run npm install, update the server.js file with your Spotify client ID and secret, and then run the app.


With the page running locally, you'll be able to auth into your account and generate an OAuth 2.0 access token. This is what you'll use to make your Spotify Web API calls as you develop your application. Pull existing code from their library to save time building search tools and other widgets.
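
If you only need catalog data and don't want to run the Node server, there's a simpler route. The sketch below is a different technique from the walkthrough above: it uses the Client Credentials flow, which trades your client ID and secret for an app-level token without any user login. It assumes the token endpoint at `accounts.spotify.com/api/token`, the function names are my own, and tokens from this flow can't access a user's private data.

```python
import base64
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://accounts.spotify.com/api/token"


def build_token_request(client_id, client_secret):
    """Build (but do not send) a Client Credentials token request."""
    # The ID and secret are sent as an HTTP Basic auth header.
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    body = urllib.parse.urlencode({"grant_type": "client_credentials"})
    return urllib.request.Request(
        TOKEN_URL,
        data=body.encode("ascii"),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def request_access_token(client_id, client_secret):
    """Send the request and return the access token string."""
    request = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]


# Usage (requires real credentials from your dashboard and network access):
# token = request_access_token("<client id>", "<client secret>")
```

Keep the client secret out of source control. Reading it from an environment variable is a reasonable default.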


So that just about sums it up. I hope you found this article inspiring. If you're interested in creating music with neural networks, check out our article outlining the best commercially available AI music apps.
