Nvidia announces AI model that generates novel sounds, modifies voices

November 25, 2024 — 12:20 pm EST

In a blog post, Nvidia (NVDA) introduced Fugatto, short for Foundational Generative Audio Transformer Opus 1, an AI model that generates or transforms any mix of music, voices and sounds described with prompts that use any combination of text and audio files. For example, it can create a music snippet based on a text prompt, add instruments to or remove them from an existing song, change the accent or emotion in a voice, and even let people produce sounds never heard before. “We wanted to create a model that understands and generates sound like humans do,” said Rafael Valle, a manager of applied audio research at Nvidia, an orchestral conductor and composer, and one of the dozen-plus people behind Fugatto.

