Meta shares suite of AI tools to generate music and audio

3 Aug 2023

The AudioCraft tools are available for researchers and include MusicGen, which is able to generate pieces of music based on text and melody prompts.

Meta has released a suite of generative AI tools, designed to create music and audio clips from text prompts.

These tools – collectively called AudioCraft – have been made available to researchers and “to further people’s understanding of the technology”.

The suite of AI tools includes MusicGen, the music-making AI model that Meta revealed in June. Built on top of Meta’s EnCodec audio tokeniser, MusicGen can be prompted by both text and melody. This means it can generate short pieces of music from text entered by a user and, thanks to its AI transformer model, complete a melody it is made to hear.

AudioCraft also includes an improved version of EnCodec, which is able to generate higher-quality music with fewer artifacts, according to Meta.

The third tool, AudioGen, generates audio clips from text prompts. These include environmental sounds and sound effects, such as a dog barking, cars honking or footsteps on a wooden floor.

Meta said MusicGen was trained using Meta-owned and specifically licensed music, while AudioGen was trained on publicly available sound effects.

“The AudioCraft family of models is capable of producing high-quality audio with long-term consistency and it can be easily interacted with through a natural interface,” Meta said in a blogpost.

“With AudioCraft, we simplify the overall design of generative models for audio compared to prior work in the field – giving people the full recipe to play with the existing models that Meta has been developing over the past several years while also empowering them to push the limits and develop their own models.”

Meta describes AudioCraft as open source, a label the company applies to many of its AI models. But certain groups have taken issue with this description, as these models are being made available for research purposes rather than following the specific rules of an open-source license.

The Voices of Open Source group claims Meta’s license for LLaMA does not meet the open-source standard as it “puts restrictions on commercial use for some users and also restricts the use of the model and software for certain purposes”.

Generative AI has gained notoriety in recent years with the rise of text-to-image generators such as Dall-E. But the rush to apply generative AI to various applications was largely prompted by the popularity of OpenAI’s advanced chatbot ChatGPT, which was released in November 2022.

The spread of this technology has also raised concerns around copyright infringement and plagiarism in various scenarios. In February, William Fry’s Barry Scannell discussed the legal trends expected for the year in relation to generative AI. He felt that – given the breadth of music rights that exist – litigation is “bound to happen”.
