I got invited into Google’s AI Test Kitchen to beta test some early versions of their AI apps. The only one available to me at this point was MusicLM, which was fine, since I am curious about how AI might transform text into music. (I’ve done various explorations around AI and music lately. See here and here.)
MusicLM was simple to use: write a text describing a kind of music (instrument, style, etc.), add things like a mood or atmosphere if you want, and it kicks out two sample tracks, with an invitation to choose the better one. This is a trial version of the app and testing platform, so Google is learning from people like me using it. I suspect it may eventually be of use to video makers seeking short musical interludes (but I worry it will put musicians and composers out of work).
I tried out a few prompts. Some were fine, capturing something close to what I might have expected from an AI sound generator. Some were pretty bad, choppy to the point that you could almost hear the music samples being stitched together to make the file. Like I said, it’s learning.
The site does let you download your file, so I grabbed a file, took a screenshot, and created the media piece above (here is a direct link). My prompt here was: “Electronic keys over minor chords.” (An earlier prompt, a solo saxophone, gave me a pretty strange mix, and I think I heard some Charlie Parker in there.)
Here is what the Google folks write about what they are up to with MusicLM:
We introduce MusicLM, a model generating high-fidelity music from text descriptions such as “a calming violin melody backed by a distorted guitar riff”. MusicLM casts the process of conditional music generation as a hierarchical sequence-to-sequence modeling task, and it generates music at 24 kHz that remains consistent over several minutes. Our experiments show that MusicLM outperforms previous systems both in audio quality and adherence to the text description. Moreover, we demonstrate that MusicLM can be conditioned on both text and a melody in that it can transform whistled and hummed melodies according to the style described in a text caption.
I guess Google will be adding new AI-powered apps to the kitchen for testing. I’ll be curious.
Peace (and Sound),