5/10 Model Release 27 May 2026, 15:00 UTC

ElevenLabs launches new music generation model featuring mid-track genre switching and localized track regeneration.

The ability to regenerate specific sections of a track addresses a major UX bottleneck in AI audio production by bringing audio inpainting to music generation. By enabling localized edits rather than full-track rerolls, ElevenLabs is moving AI music from a one-shot novelty toward a controllable, iterative production tool. Generation is still stochastic, but the user now decides which part of the track gets rerolled. This kind of granular control is what generative models need before they can fit into professional DAW workflows.

What Happened

ElevenLabs has introduced a new music generation model that gives users finer-grained control over AI audio creation. The standout feature is the ability to regenerate specific sections of a generated song without altering the rest of the track. This allows users to switch a track's genre mid-song, swap out a specific instrument, or fix a flawed vocal segment without discarding the entire generation.

Technical Details

While the exact architectural specifics are still emerging, this capability behaves like audio inpainting. To achieve it, the model presumably conditions its generation on the surrounding audio context (both the preceding and the following frames) alongside the new text prompt. That likely calls for strong temporal-consistency and cross-attention mechanisms so that the newly generated segment blends into the existing (probably latent) audio representation. The regenerated segment has to match the tempo, key, and room acoustics of the rest of the track so that the transitions do not introduce audible artifacts such as phase cancellation or clicks at the seams.
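To make the seam problem concrete, here is a minimal signal-level sketch of splicing a regenerated segment into an existing track with a linear crossfade at each boundary. It is an illustration only, not ElevenLabs' method: a generative model would more plausibly handle continuity inside its own representation, and the function name `splice_with_crossfade`, its parameters, and the toy sample values are all hypothetical.

```python
from typing import List


def splice_with_crossfade(
    original: List[float],
    regenerated: List[float],
    start: int,
    end: int,
    fade: int,
) -> List[float]:
    """Replace original[start:end] with `regenerated`, blending at both seams.

    Hypothetical helper: `regenerated` must cover exactly the edited span.
    Over the first and last `fade` samples of the span, the output is a
    linear mix of old and new audio so the waveform has no abrupt jump.
    """
    n = end - start
    if len(regenerated) != n:
        raise ValueError("regenerated must have length end - start")
    if fade < 0 or 2 * fade > n:
        raise ValueError("fade regions must fit inside the edited span")

    out = list(original)
    for i in range(n):
        if i < fade:
            # Fade in: weight of the new audio rises toward 1.
            w = (i + 1) / (fade + 1)
        elif i >= n - fade:
            # Fade out: weight of the new audio falls back toward 0.
            w = (n - i) / (fade + 1)
        else:
            w = 1.0
        out[start + i] = (1.0 - w) * original[start + i] + w * regenerated[i]
    return out


# Toy example: a constant "track" at 1.0 with samples 5..14 regenerated as 0.0.
track = [1.0] * 20
patch = [0.0] * 10
edited = splice_with_crossfade(track, patch, start=5, end=15, fade=3)
print(edited[4:16])
# [1.0, 0.75, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.25, 0.5, 0.75, 1.0]
```

A hard cut in the same toy example would jump from 1.0 to 0.0 in a single sample, which is the kind of discontinuity that is heard as a click. A linear fade suits material that is correlated across the seam; an equal-power curve is the usual alternative when the two sides are unrelated.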

Why It Matters

Until now, AI music generation platforms have largely operated as stochastic "slot machines." Users prompt the model and hope for a good output, frequently discarding near-perfect generations because of a single bad chorus or a stray artifact. ElevenLabs is shifting the paradigm from one-shot generation to iterative editing. From an engineering and production standpoint, this granular control is the missing link required to treat AI as a viable studio tool. Localized regeneration should also cut wasted compute, since users no longer need to run inference on a full three-minute track just to fix a ten-second bridge, although the actual saving depends on how much surrounding context the model has to process.
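As a rough back-of-the-envelope check on the compute point, the sketch below estimates what share of a track's duration has to be processed for a localized edit. It assumes cost scales linearly with the duration processed and that the model reads a fixed window of context on each side of the edit; neither assumption is confirmed by the source, and `processed_fraction` and `context_seconds` are hypothetical names.

```python
def processed_fraction(
    track_seconds: float,
    edit_seconds: float,
    context_seconds: float = 0.0,
) -> float:
    """Share of the track's duration touched by a localized regeneration.

    Assumes (hypothetically) that cost is proportional to the edited span
    plus `context_seconds` of conditioning audio on each side, capped at
    the full track length.
    """
    processed = min(track_seconds, edit_seconds + 2 * context_seconds)
    return processed / track_seconds


# Ten-second bridge in a three-minute track, no extra context.
print(round(processed_fraction(180, 10), 3))     # 0.056
# Same edit with five seconds of context on each side.
print(round(processed_fraction(180, 10, 5), 3))  # 0.111
```

Under these assumptions the edit costs roughly 6% to 11% of a full reroll. If the model attends over the entire track when regenerating a section, the saving would be smaller.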

What to Watch Next

Watch for API availability and for how quickly this capability is integrated into Digital Audio Workstations (DAWs) via VST plugins. The next technical frontier will be multi-track stem inpainting: for example, changing just the bassline while keeping the original generated vocals intact. Competitors in the generative audio space will likely need to fast-track similar localized editing features to keep pace.

Sources

https://techcrunch.com/2026/05/27/elevenlabss-new-music-generation-model-can-switch-genres-mid-track/

elevenlabs generative-audio music-generation audio-inpainting model-release