DeepAFx-ST: Style Transfer of Audio Effects with Differentiable Signal Processing

  • Published Jul 4, 2024
  • DSP Seminar - November 4, 2022. CCRMA, Stanford
    Abstract: We present a framework that can impose the audio effects and production style from one recording to another by example with the goal of simplifying the audio production process. We train a deep neural network to analyze an input recording and a style reference recording and predict the control parameters of audio effects used to render the output. In contrast to past work, we integrate audio effects as differentiable operators in our framework, perform backpropagation through audio effects, and optimize end-to-end using an audio-domain loss. We use a self-supervised training strategy enabling automatic control of audio effects without the use of any labeled or paired training data. We survey a range of existing and new approaches for differentiable signal processing, showing how each can be integrated into our framework while discussing their trade-offs. We evaluate our approach on both speech and music tasks, demonstrating that our approach generalizes both to unseen recordings and even to sample rates different than those seen during training. Our approach produces convincing production style transfer results with the ability to transform input recordings to produced recordings, yielding audio effect control parameters that enable interpretability and user interaction.
    ArXiv draft: arxiv.org/abs/2207.08759
    Code: github.com/adobe-research/Dee...
    Website: csteinmetz1.github.io/DeepAFx...
    Bio: Christian Steinmetz is a PhD researcher with the Centre for Digital Music at Queen Mary University of London advised by Joshua Reiss. His research focuses on applications of machine learning for audio signal processing with a focus on high fidelity audio and music production. His work has investigated methods for enhancing audio recordings, automatic and assistive systems for audio engineering, as well as applications of machine learning that augment creativity. He has worked as a research scientist intern at Adobe, Meta, Dolby, and Bose. Christian holds a BS in Electrical Engineering and BA in Audio Technology from Clemson University, as well as an MSc in Sound and Music Computing from the Music Technology Group at Universitat Pompeu Fabra.
    Info:
    ccrma.stanford.edu/events/dee...
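
The abstract describes integrating audio effects as differentiable operators, backpropagating through them, and optimizing end-to-end with an audio-domain loss. As a minimal illustrative sketch of that idea (not the actual DeepAFx-ST implementation), the PyTorch snippet below trains a tiny controller network to predict the parameters of a differentiable effect (a gain plus a one-pole lowpass) so that its output matches a reference; all names (`TinyController`, `diff_effect`) and the toy features are hypothetical.

```python
import torch

class TinyController(torch.nn.Module):
    """Toy controller: maps crude input/reference features to effect parameters."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(2, 2)  # 2 features -> 2 effect parameters

    def forward(self, x, ref):
        # Crude loudness-like features of input and style reference
        feats = torch.stack([x.std(-1), ref.std(-1)], dim=-1)
        return self.net(feats)

def diff_effect(x, params):
    """Differentiable 'effect': gain (dB) followed by a one-pole lowpass."""
    gain = 10 ** (params[:, 0:1] / 20.0)   # dB -> linear gain
    a = torch.sigmoid(params[:, 1:2])      # filter coefficient in (0, 1)
    ys, state = [], torch.zeros(x.shape[0], 1)
    for n in range(x.shape[-1]):
        state = a * state + (1 - a) * x[:, n:n + 1]
        ys.append(gain * state)
    return torch.cat(ys, dim=-1)

torch.manual_seed(0)
x = torch.randn(1, 64)          # "input" recording (toy signal)
ref = 2.0 * x                   # "reference": same content, different level

ctrl = TinyController()
opt = torch.optim.Adam(ctrl.parameters(), lr=0.1)
losses = []
for _ in range(100):
    opt.zero_grad()
    y = diff_effect(x, ctrl(x, ref))                 # render with predicted params
    loss = torch.nn.functional.mse_loss(y, ref)      # audio-domain loss
    loss.backward()                                  # backprop *through* the effect
    opt.step()
    losses.append(loss.item())
```

Because the effect is written with differentiable tensor operations, gradients of the audio-domain loss flow through the rendering stage into the controller, which is the key mechanism the talk surveys; the real system replaces these toy pieces with encoders over spectral features and with practical differentiable effects such as a parametric EQ and a compressor.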

COMMENTS • 1

  • @moedemama • 1 year ago

    I believe a dynamic range effect in this case should work both ways, as both a compressor and an expander, especially for the target demographic that's relying on a lot of VST instruments that lack dynamics.