Audio Data Augmentation Is All You Need

  • Published 30 Jul 2024
  • Data scientists know that data augmentation is an important technique for making the machine learning models they train more robust. While there are loads of resources on data augmentation for image data, it’s difficult to find material on audio augmentation.
    In this new series, you can learn audio data augmentation from both a theoretical and an implementation perspective.
    In the first installment of the series, you can learn:
    - what audio data augmentation is
    - which use cases it can be fruitfully applied to
    - the golden rule of audio data augmentation
    A minimal code sketch of a few common augmentations follows the chapter list below.
    Slides:
    github.com/musikalkemist/audi...
    =========================
    Interested in hiring me as a consultant?
    valeriovelardo.com/
    Join The Sound Of AI Slack community:
    valeriovelardo.com/the-sound-...
    Connect with Valerio on Linkedin:
    / valeriovelardo
    Follow Valerio on Facebook:
    / thesoundofai
    Follow Valerio on Twitter:
    / musikalkemist
    =========================
    0:00 Intro
    0:47 What's data augmentation?
    1:38 How does data augmentation work?
    3:03 Why audio data augmentation is important
    5:23 Use cases for audio data augmentation
    7:37 Augmentation chains
    8:56 What data should be augmented?
    9:41 Offline vs online augmentation
    13:23 The golden rule of data augmentation
    14:52 Join the community!
    15:31 What's next?
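    As a taste of what the series covers, here is a minimal sketch of three common waveform augmentations (noise injection, pitch shifting, time stretching). It assumes librosa and NumPy and uses illustrative parameter values; it is not the implementation from the videos.

```python
# Minimal audio augmentation sketch. "example.wav" and all parameter values
# are illustrative; they are not taken from the video.
import numpy as np
import librosa

signal, sr = librosa.load("example.wav", sr=22050)

# Noise injection: add low-amplitude Gaussian noise to the waveform.
noisy = signal + 0.005 * np.random.randn(len(signal))

# Pitch shifting: move the signal up 2 semitones without changing its duration.
pitched = librosa.effects.pitch_shift(signal, sr=sr, n_steps=2)

# Time stretching: play the signal 20% faster without changing its pitch.
stretched = librosa.effects.time_stretch(signal, rate=1.2)
```

    Each transform is meant to leave the sample's label unchanged.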
  • Science & Technology

COMMENTS • 40

  • @jeb4148
    @jeb4148 2 years ago +1

    Thanks for all the content!! Just finished up the audio signal processing series, excited to jump into this one

  • @AbhinayKhoparzi
    @AbhinayKhoparzi 2 years ago

    This is so insightful. Eagerly waiting for more in the series.

  • @CourseWare-xg2wq
    @CourseWare-xg2wq 4 months ago

    This changed my whole view on data augmentation 😮, thanks so much Valerio

  • @SHADABALAM2002
    @SHADABALAM2002 2 years ago

    Much needed and awaited topic

  • @lorenzo87052
    @lorenzo87052 2 years ago

    Already waiting for the next video on this topic! Thank you Valerio!

  • @debabratagogoi9038
    @debabratagogoi9038 2 years ago

    Thank you for the nice explanation

  • @doyourealise
    @doyourealise 2 years ago

    Thank you so much! Because of you, these kinds of videos exist :)

  • @nmirza2013
    @nmirza2013 1 year ago

    I can only say stay blessed for such good work

  • @ignaziogulino
    @ignaziogulino 10 months ago

    Congratulations, Valerio!

  • @heychamp2054
    @heychamp2054 1 year ago

    Thanks, Valerio, for this amazing content!
    Question: how can we use onsets for finding similarities between two audio files?
    I am currently working on a project. Please help me out.
    Any help will be appreciated.

  • @kalkidangazahegn5038
    @kalkidangazahegn5038 2 years ago

    Thanks a lot Valerio, can I apply these audio data augmentation techniques to heart sound signals?

  • @AbdullahOlcay-wh4or
    @AbdullahOlcay-wh4or 1 year ago

    Thanks very much Valerio, but why does using coupled code (online augmentation) make it advantageous or disadvantageous compared to offline augmentation? I think this depends on how flexibly one can use augmentation and the model together without overfitting while still getting high accuracy.
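    For readers unsure what the online/offline distinction in this question refers to (chapter 9:41), here is a rough sketch of the two approaches. The helper names are hypothetical and not from the video; `augment` stands in for any chain of transforms.

```python
import random

def augment(signal):
    """Placeholder for a random transform (noise, pitch shift, ...)."""
    return signal

# Offline augmentation: run once before training and store the augmented
# copies alongside the originals, so training code just reads a larger,
# static dataset (augmentation and training stay decoupled).
def augment_offline(signals, copies_per_signal=3):
    augmented = [augment(s) for s in signals for _ in range(copies_per_signal)]
    return signals + augmented  # keep the original samples as well

# Online augmentation: apply a random transform every time a batch is drawn,
# inside the data-loading loop (augmentation is coupled to the training code).
def online_batches(signals, batch_size=32):
    while True:
        batch = random.sample(signals, batch_size)
        yield [augment(s) for s in batch]
```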

  • @erikgoron5928
    @erikgoron5928 2 years ago +1

    Great video, thanks Valerio!
    Question:
    Do you have any tips on how to improve generalizability, aside from data augmentation, for audio tasks?
    I’m currently working on a speech pathology classifier; it works great on single datasets, but when I use the trained models on unseen data from different datasets (same task) it doesn’t perform well anymore, meaning it would not be a good model in production.
    Any help is appreciated.

    • @ValerioVelardoTheSoundofAI
      @ValerioVelardoTheSoundofAI  2 years ago

      I addressed this topic in a LinkedIn post this week. Here's the link: www.linkedin.com/posts/valeriovelardo_ml-data-ai-activity-6886601744583860225-ymSw

  • @SHADABALAM2002
    @SHADABALAM2002 1 year ago

    Hello Valerio... How can we ensure that semantics are not lost after augmentation?

  • @MLDawn
    @MLDawn 2 years ago

    Really nice. Well done, man!

  • @maryamkhoshkhoo9030
    @maryamkhoshkhoo9030 1 year ago

    Do you have classes one can participate in for deep learning on EEG in Python?

  • @syllacamara5504
    @syllacamara5504 2 years ago

    Thanks a lot for your great work.
    Will you do a video on speech recognition with the CTC loss function?
    Best regards

  • @AdrianFernandezFazio
    @AdrianFernandezFazio 2 years ago

    Amazing video! Greetings from Argentina!

  • @madmike0304
    @madmike0304 1 year ago

    What about audio data augmentation for generative music models?

  • @BRAGAARS
    @BRAGAARS 2 years ago

    Hello Valerio, just wondering, did you ever work on audio event detection? I'm currently trying to build DNNs that detect a specific sound in audio files through image representations such as spectrograms, and I can't find much material, so I am wondering if you ever contemplated doing something similar.

    • @ValerioVelardoTheSoundofAI
      @ValerioVelardoTheSoundofAI  2 years ago +1

      I've worked on audio event detection. If the data is good, both spectrograms and Mel spectrograms should be valuable representations to use. If I had to pick one, I'd suggest the latter (a minimal Mel spectrogram sketch follows this thread).

    • @Jononor
      @Jononor 2 years ago +1

      Sound Event Detection using Machine Learning (EuroPython 2021)
      ua-cam.com/video/JrhsFfCOL-s/v-deo.html
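    Following up on the spectrogram vs Mel spectrogram suggestion above, here is a minimal sketch of computing a log-scaled Mel spectrogram with librosa as CNN input. The file name and parameter values are illustrative, not from the thread.

```python
import numpy as np
import librosa

# "event.wav" is a hypothetical recording containing the target sound.
signal, sr = librosa.load("event.wav", sr=22050)

# Mel spectrogram: a power spectrogram projected onto a Mel filter bank.
mel = librosa.feature.melspectrogram(
    y=signal, sr=sr, n_fft=2048, hop_length=512, n_mels=64
)

# Convert power to decibels; the resulting 2D array (n_mels x n_frames)
# can be fed to an image-style CNN for sound event detection.
log_mel = librosa.power_to_db(mel, ref=np.max)
```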

  • @mahmodaldahol
    @mahmodaldahol 2 years ago

    This is an exciting topic, thank you!

  • @ont7126
    @ont7126 2 years ago

    Hi Valerio! Nice topic, can't wait for your implementation. I am training a CNN with a 7k-word audio dataset. Could I reduce underfitting by using data augmentation?

    • @ValerioVelardoTheSoundofAI
      @ValerioVelardoTheSoundofAI  2 years ago +1

      I don't think augmenting data will resolve underfitting. For that, increasing model complexity should help.

    • @ont7126
      @ont7126 2 years ago

      @@ValerioVelardoTheSoundofAI I am using your CNN implementation for music genre classification. Underfitting tends to decrease after increasing the number of epochs. Do you think I must increase model complexity, or should I find the ideal number of epochs?

    • @ValerioVelardoTheSoundofAI
      @ValerioVelardoTheSoundofAI  2 years ago

      @@ont7126 First increase the # of epochs. If you still underfit, try increasing complexity.

  • @ivanstepanovftw
    @ivanstepanovftw 3 months ago

    Is the audio in this video augmented too? :)

  • @sutarorem8297
    @sutarorem8297 2 years ago

    I love you so much

  • @Sawaedo
    @Sawaedo 2 years ago

    If there were a conversion rate from real_data to augmented_data, how much augmented_data would we need to compensate for a decrease in real_data?

    • @ValerioVelardoTheSoundofAI
      @ValerioVelardoTheSoundofAI  2 years ago +1

      You should keep the original data in the dataset and add the augmented samples to it. As a general rule, I would produce 2-4x the original data; more than that, you start having too much redundancy.
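    As a concrete illustration of the 2-4x guideline above: with 1,000 original clips and 3 augmented copies per clip, the final dataset holds 4,000 samples (originals plus augmented). A hedged sketch of such an offline expansion, with hypothetical paths and a placeholder transform:

```python
from pathlib import Path
import librosa
import soundfile as sf

# Placeholder transform; in practice this would pick randomly from a chain
# of augmentations (noise injection, pitch shift, time stretch, ...).
def random_augment(signal, sr):
    return librosa.effects.pitch_shift(signal, sr=sr, n_steps=2)

COPIES_PER_FILE = 3  # originals + 3 copies each = 4x the original dataset

out_dir = Path("dataset/augmented")  # hypothetical output folder
out_dir.mkdir(parents=True, exist_ok=True)

for path in Path("dataset/original").glob("*.wav"):  # hypothetical input folder
    signal, sr = librosa.load(path, sr=None)
    for i in range(COPIES_PER_FILE):
        sf.write(str(out_dir / f"{path.stem}_aug{i}.wav"), random_augment(signal, sr), sr)
```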

  • @Bindassmanus
    @Bindassmanus 2 years ago

    Sir, can you help with detecting vowel-like regions in speech signals? I have my MTech dissertation project on it.

    • @ValerioVelardoTheSoundofAI
      @ValerioVelardoTheSoundofAI  2 years ago

      I suggest you ask a question with specific details of the problems you're encountering in The Sound of AI Slack.

    • @Bindassmanus
      @Bindassmanus 2 years ago

      @@ValerioVelardoTheSoundofAI OK, thanks for replying