San Jose, CA (November 16, 2023)—Just two weeks after the release of the Beatles’ “Now and Then” single revealed the aural potential of AI, Adobe has unveiled Project Sound Lift, an AI-powered technology that separates speech from background sound.
Project Sound Lift separates speech recordings into distinct tracks of voices, non-speech sounds and other background noise in a video. According to Adobe, the technology is a one-click solution that helps users manipulate audio recordings across a range of scenarios, leveraging AI to independently enhance, transform, and control speech and sound independently. Listen to samples here.
Adobe’s Enhance Speech technology—now available in Adobe applications like Premiere Pro—is integrated within Project Sound Lift to further affect how creators produce and control audio content.
Developed by speech AI researchers at Adobe Research, the technology was announced on stage at MAX in Japan as part of Adobe’s “Sneaks” showcase, where Adobe engineers and research scientists offer sneak peeks at prototype ideas and technologies, each showing future potential to become important elements of Adobe products.
While prior audio AI models often require clean, distinct input sounds—such as a single speaker or sound event without background noise or echoes—real-world recordings rarely meet these conditions: They can contain noise, reverb, multiple speakers and other sound events that are often impossible to control for. This limitation has hindered audio AI’s application in everyday recordings and made it challenging for non-experts to utilize often complex audio tools.
Project Sound Lift can now separate voices and ambient sounds from daily life scenarios, including splitting speech, applause, laughter, music and other various noises into distinct tracks. Each track can be individually controlled to enhance the quality and content of the video.