SPIN: AI-Music Synthesizer

Featured in: Gizmodo, Creative Applications, Yanko Design, Design Boom, Geeky Gadgets, Arduino, Hackster, Gear News, New Atlas, Music Tech, MixMag Asia, EDM, Elektronauts.

The Intro

SPIN is an AI music synthesizer that allows you to co-create compositions with a language model, MusicGen. It is a playful invitation to explore the nuances of algorithmic music, encouraging you to slow down and zoom in on its artifacts. It celebrates the marriage between human and machine creativity through music.

SPIN breaks down the process of co-composing music with an AI using a tangible interface. Enter the desired mood, genre, sounds and bpm to listen to the music come alive on an LP record. A DVS (Digital Vinyl System) allows you to slow down, zoom in, scratch and listen between the notes. Use it to create new compositions, as a simple sound synthesizer, as a playful scratch tool, or to play relaxing music in the background.

SPIN is an artifact from a future where music will be hyper-tailored to people’s tastes and preferences. It is an explorer of musical curiosities that can generate music unlike anything heard before, blending unheard-of combinations of sounds, rhythms and harmonies. This opens up exciting possibilities for pushing the boundaries of music and creating entirely new micro-genres. Who’s ready for some happy death-metal disco?

"The future of creativity belongs to those who can harness the power of AI while staying true to their own unique human perspectives." – Steven Pinker, Cognitive scientist and author

How it Works

Under the hood, SPIN reads the input prompts as button presses through an Arduino Mega and sends them over serial to a Raspberry Pi, which prompts the MusicGen API. The API returns an mp3 file, which is loaded onto a Digital Vinyl System (DVS). A modified Numark PT-01 and a timecoded control vinyl record serve as the turntable. The Xwax DVS package for Raspberry Pi reads the vinyl timecode through a Behringer audio driver, and the output is played through stereo speakers.

SPIN block diagram of components and systems
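
As a rough illustration of the step between the Arduino and the MusicGen API, here is a minimal Python sketch of turning one serial message into a text prompt. It is not SPIN's actual code. The message format (a single line of `mood,genre,sound,bpm`), the `Selection` type, the function names and the prompt wording are all assumptions made for this example. Reading from the serial port and calling the API are left out, since the specifics of both are not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Selection:
    """One set of button choices (hypothetical structure)."""
    mood: str
    genre: str
    sound: str
    bpm: int


def parse_serial_line(line: str) -> Selection:
    """Parse one line in the assumed 'mood,genre,sound,bpm' format."""
    parts = [part.strip() for part in line.strip().split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated fields, got {len(parts)}")
    mood, genre, sound, bpm_text = parts
    bpm = int(bpm_text)  # raises ValueError if the field is not a whole number
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    return Selection(mood, genre, sound, bpm)


def build_prompt(selection: Selection) -> str:
    """Turn the button choices into a text prompt (assumed wording)."""
    return (
        f"{selection.mood} {selection.genre} "
        f"with {selection.sound}, {selection.bpm} bpm"
    )
```

With these assumptions, `build_prompt(parse_serial_line("happy,disco,synth,120"))` gives the string `happy disco with synth, 120 bpm`, which would then be handed to whatever call requests the audio.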

The Process

There are a lot of amazing generative music experiments, from Dadabots’s relentless death metal streaming AI on YouTube to Holly Herndon’s experiments around voice transplantations. But I realized we hit a tipping point when I stumbled upon the Riffusion music model; I was taken aback by its depth and realism, including its new update that adds lyrical voices to the output. Inspired by this, I wanted to build a platform to let me further explore and combine never-before-heard combinations of music and sounds. This laid the seed for building SPIN.

I wanted SPIN to encourage people to be playful; having a scratch interface served this purpose. A DVS (Digital Vinyl System) adds an extra dimension while listening to the generated compositions. It allows us to slow down these synthetic tunes and listen between the notes. So, I decided to combine a DVS with the MusicGen API in the form of an old-school synthesizer.
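
To give a feel for what "slowing down" means for the audio itself, here is a small Python sketch of variable-rate playback using linear interpolation. It is only an illustration of the general idea and not how Xwax is implemented. The function name `play_at_rate` and the list-of-floats representation of the audio are assumptions made for this example, and it only handles forward playback, not the reverse motion of a scratch.

```python
def play_at_rate(samples: list[float], rate: float) -> list[float]:
    """Read through `samples` at `rate` times normal speed.

    A rate of 1.0 returns the samples unchanged, 0.5 plays at half speed
    (twice as many output samples) and 2.0 plays at double speed.
    Values between two input samples are linearly interpolated.
    """
    if rate <= 0:
        raise ValueError("rate must be positive in this sketch")
    output = []
    last_index = len(samples) - 1
    position = 0.0
    while position <= last_index:
        index = int(position)
        fraction = position - index
        next_sample = samples[min(index + 1, last_index)]
        output.append(samples[index] * (1 - fraction) + next_sample * fraction)
        position += rate
    return output
```

In a DVS the rate would follow the speed of the platter as decoded from the timecode, so slowing the record by hand stretches the audio in the same way.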

Below is an abstracted high-level view of the stages I went through for the technical implementation. Check out this tweet thread for the whole process, accompanied by pictures and videos.

Tested the MusicGen API on Raspberry Pi using Python.
Tested the Xwax DVS package with a timecoded LP record on Raspberry Pi.
Prototyped the button input using the keyboard matrix library and tested hardcoded custom animation using a simple LED matrix.
However, I wanted the animations to be more fluid, so I switched to using the FastLED library with WS2812B NeoPixels.

A stop motion animation of a PCB for Spin AI music synthesizer

Designed the PCB for the input and LED interface in KiCad.
Designed the button enclosures and 3D printed different versions to test LED diffusion.
Received the PCB, soldered it together and tested animations.
Tested the whole setup for the first time together: PCB with button input and LED along with the Xwax DVS on the turntable.

Designed and milled the wooden cabin enclosure and assembled it with the help of our carpenter.
Modified the Numark PT-01 and AUX speaker. Assembled the power supply.
Sanded and polished the wooden cabin.
Designed and 3D printed mounting stands for the record player and PCB. Laser cut the acrylic top plate.
Finally assembled with proper mounting for all the components.
Designed, branded and labeled the interface using vinyl.
However, in the final round of testing, the DVS stopped working. So, I had to take everything apart to understand the problem and ended up reverse engineering the output AUX ports of the Numark PT-01.
Final photoshoot and video documentation.

Conclusion

SPIN is part of a series of experiments, alongside Ghostwriter, that try to bring AI-based experiences into the physical world. It allows us to take advantage of all our senses while slowing down our interaction with these experiences. By doing so, it creates a safe space where we can play, experiment, debate and form a subjective understanding of AI at our own pace.

SPIN points to a future where music can be hyper-tailored to people’s tastes. It shows a glimpse of how AI can generate custom micro-genres that didn’t exist before. However, does this come with an ethical cost? As part of my role as an adjunct faculty member for AI at CIID, I have also been surveying AI’s unintended consequences, especially around ownership. SPIN knocks on the door of ethical content creation. As MusicGen is trained on datasets of human-generated music, who really owns the copyright to its output? Ethical questions surrounding ownership, creativity, and potential biases in the algorithms are primary topics for discussion.