Understanding Audio Signals for Machine Learning

  • Published 11 Jan 2025

COMMENTS • 73

  • @didismit1766 3 years ago +13

    I honestly never take the time to comment on or compliment YouTube videos. You, my man, are simply amazing, and I truly enjoy listening to you. Going all the way to your last video =)

  • @hydraulicgames2493 2 years ago +2

    I was just curious about sound processing and found your lecture series. After I started watching, I binge-watched the whole series! An absolute piece of art! PS: I started watching with absolutely zero knowledge of the subject.

  • @theaihacker777 4 years ago +3

    bro your content is so helpful. very concise and straight to the point.

  • @mudassirkhan9054 5 months ago

    Hey Valerio,
    This is just amazing content. I like the depth and the way you explain things in such simple terms. You satisfied my curiosity about this whole topic.

  • @adityajindal3738 3 years ago +3

    Love the understanding, the clarity of the content, and the excellent examples through applications! Love it

  • @jancooqhedon895 3 years ago +1

    I came here for language classification research, but now I'm amazed by the music thing

  • @avidreader100 4 years ago

    At 15:45, the picture that appears seems to have an error. The numbers of amplitude scale are out of order in binary notation.

  • @user-ky9ur8gm2f 1 year ago

    Bravo! Superb presentation of the material!

  • @joeljoseph26 4 years ago +1

    you are awesome! the best ML tutorial for audio signals

  • @chriskingston1981 2 years ago +2

    Thank you so much for creating these videos, I am really enjoying them! I always worked with computers, music and sound when I was young, and still do. I have all the basic knowledge of programming, ML and music, but this goes so much deeper. I didn't know I liked this stuff. Thank you for creating a new passion for me. AI with music❤️❤️❤️

  • @hersheyscoco1 4 years ago +10

    Great stuff Valerio, this is amazing content - very educational. When you cover the audio features, can you also cover MFCCs in depth, and how they are typically used? I have yet to see a good treatment of MFCCs that gives an intuitive feel for how they work.

  • @nakarosz 4 years ago +3

    Amazing content Valerio! Thank you!

  • @danielgurgel8080 3 years ago +2

    Loving your content. Thanks a lot!

  • @ramportland 11 months ago

    At 9:13 in the video, is it above or below the Nyquist frequency?

  • @fredrickpwol8639 2 years ago

    Thanks a lot for this, it's helping me through a project I'm working on. I'm really grateful

  • @rangiding99 3 years ago +1

    Thanks for your wonderful job, beautifully done!

  • @phosphoricx 3 years ago +4

    I’m not sure your aliasing demo actually has aliasing. Usually you would hear nasty artifacts when downsampling so much without an antialiasing filter. Audacity is likely applying an antialiasing filter to reduce the bandwidth of the signal before downsampling it.

  • @wehbihabli7425 3 years ago +1

    appreciate all the details! great

  • @Drew_7 1 year ago

    Thank you for this amazing walkthrough; this is going to help me SO much with ML. Also, a question about the section at 17:37: do you know why we have to divide the bit depth times the sampling rate by 1048? In other words, why do we divide by 1,048,576 and then again by 8 for bytes? Is there some resource on why this is the default? (I'm assuming this has to do with the way computers work.)

    • @johnnyvishnevskiy8090 1 year ago

      A bit can be either 1 or 0 (on / off).
      A byte is a group of 8 bits. So if there are 8 bits, where each bit can only be 0 or 1, there are a total of 2^8 = 256 different values that a byte can represent.
      So let's consider bit depth * sampling rate = 16 * 44,100 = 705,600 bits per second.
      There are 1,048,576 bits in a megabit (mega means 1 million, but the closest power of two is 2^20 = 1,048,576). So 705,600 / 1,048,576 = 0.6729 megabits per second.
      Remember that 8 bits = 1 byte? Likewise, 8 megabits is 1 megabyte. So we just need to divide by 8.
      0.6729 / 8 = 0.0841 megabytes per second = 5.0468 megabytes per minute. I'm assuming it's a typo in the video.
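
      The arithmetic in this reply can be checked with a few lines of Python. This is only a sketch: the function name `audio_size_megabytes` is made up for illustration, and it follows the thread's convention that 1 megabyte = 2^20 bytes.

```python
def audio_size_megabytes(sample_rate_hz, bit_depth, seconds, channels=1):
    """Size of uncompressed PCM audio, taking 1 MB = 2**20 bytes (assumption)."""
    total_bits = sample_rate_hz * bit_depth * channels * seconds
    total_bytes = total_bits / 8          # 8 bits per byte
    return total_bytes / 2**20            # bytes -> megabytes


# One minute of mono audio at 44.1 kHz and 16 bits per sample.
one_minute_mb = audio_size_megabytes(44_100, 16, 60)
print(f"{one_minute_mb:.3f} MB per minute")
```

      Under those assumptions the result agrees with the 5.0468 figure worked out above rather than the value on the slide.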

    • @Drew_7 1 year ago +1

      @@johnnyvishnevskiy8090 You're a freaking legend, thank you so much. This all makes sense and now I'm gonna read it like 1,048 times. lol

    • @johnnyvishnevskiy8090 1 year ago

      @@Drew_7 np!

  • @vigneshreddyjulakanti7583 1 year ago +2

    Great one thanks ❤️

  • @mattdistad1338 1 year ago +1

    When you resample in Audacity, you are not hearing aliasing. Audacity uses a low-pass filter (as any good downsampler should) to avoid aliasing. What you're hearing is the high frequencies being filtered out.
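
    For comparison, here is a minimal NumPy sketch of what downsampling without an anti-aliasing filter does. It is not how Audacity works; the tone frequency, the sample rates and the helper `dominant_frequency` are illustrative choices. A 6 kHz tone sampled at 16 kHz is decimated by simply dropping every other sample, which halves the rate to 8 kHz and puts the tone above the new Nyquist frequency of 4 kHz.

```python
import numpy as np


def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the strongest bin in the magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)
    return freqs[np.argmax(spectrum)]


sample_rate = 16_000
t = np.arange(sample_rate) / sample_rate      # one second of sample times
tone = np.sin(2 * np.pi * 6_000 * t)          # 6 kHz sine

# Naive downsampling: keep every second sample, no low-pass filter first.
naive = tone[::2]
new_rate = sample_rate // 2                   # 8 kHz, Nyquist = 4 kHz

original_peak = dominant_frequency(tone, sample_rate)
aliased_peak = dominant_frequency(naive, new_rate)
print(original_peak, aliased_peak)
```

    The 6 kHz tone folds back to 8,000 - 6,000 = 2,000 Hz in the decimated signal. A low-pass filter applied before dropping samples would remove the tone instead of folding it, which matches the muffled sound described above.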

  • @parkerhyde_ 1 year ago

    This is invaluable. Thanks so much

  • @mohamadqodosi7057 8 months ago

    I got a little confused. Does aliasing mean we can hear frequencies higher than our hearing range after the signal is digitized?

  • @heecheolcho3246 4 years ago +3

    Thank you for the good work. I have a question. (16x44100x60) / (8x1024x1024) = 5.046844
    Why 5.49MB?

    • @ramkumarkoppu 1 year ago

      I guess he intended to say 5.047MB and made a typing error in the slide which he read out.

  • @subramaniannk3364 4 years ago

    A question: does the music player on the computer assume or know that audio files are sampled at a particular frequency? Would the music player work fine if the audio files were sampled at a different rate?
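
    For WAV files at least, the sampling rate is stored in the file header, so a player can read it rather than assume it. A small sketch with Python's standard `wave` module, writing an in-memory file of silence at an arbitrarily chosen 22,050 Hz and reading the metadata back:

```python
import io
import wave

# Write one second of 16-bit mono silence into an in-memory WAV file.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)                 # 2 bytes per sample = 16-bit
    writer.setframerate(22_050)            # sampling rate goes into the header
    writer.writeframes(b"\x00\x00" * 22_050)

# A reader recovers the rate from the header without being told.
buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    stored_rate = reader.getframerate()
    stored_bits = reader.getsampwidth() * 8
    duration_s = reader.getnframes() / stored_rate

print(stored_rate, stored_bits, duration_s)
```

    Whether a given player or sound card then plays that rate natively or resamples it first depends on the software and hardware, which this sketch does not cover.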

  • @duygua1286 4 months ago

    Amazing content!!!

  • @AhmetAksoy 4 years ago

    Thank you very much. It is a very educational video. ( But in the audio there are some short bass bursts. )

  • @JogosEtudoMais 4 years ago +3

    Valerio, thank you for this amazing work. You are helping me a lot, I am studying audio and you are answering all my questions. Do you have any book recommendation for me?

    • @ValerioVelardoTheSoundofAI 4 years ago +3

      Thank you! Unfortunately, there aren't many resources about AI audio. I'm currently writing a book on the topic.
      A book with "traditional" DSP approaches is "Fundamentals of Music Processing" www.springer.com/gp/book/9783319219448

    • @JogosEtudoMais 4 years ago

      @@ValerioVelardoTheSoundofAI I'll buy yours when it is ready!

    • @ValerioVelardoTheSoundofAI 4 years ago +1

      @@JogosEtudoMais thanks!

  • @sabrinahuda7308 4 years ago

    Hi Valerio! Do you perform the conversion in librosa, or does it need to be coded from scratch? And can I have an example of pseudocode or an algorithm for the ADC conversion?
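
    As a rough illustration of the two steps usually meant by ADC (sampling, then quantization), here is a NumPy sketch. Real conversion happens in hardware before a file ever reaches a library such as librosa, so this only simulates the idea; the function names, the 440 Hz test tone and the 3-bit depth are arbitrary choices for the example.

```python
import numpy as np


def sample_and_quantize(analog_fn, duration_s, sample_rate, bit_depth):
    """Simulate ADC: read a continuous function at discrete times, then
    round each amplitude in [-1, 1] to one of 2**bit_depth integer codes."""
    n_samples = round(duration_s * sample_rate)
    t = np.arange(n_samples) / sample_rate             # sampling instants
    amplitudes = np.clip(analog_fn(t), -1.0, 1.0)      # stay within full scale
    levels = 2 ** bit_depth
    # Map [-1, 1] onto the integer codes 0 .. levels - 1.
    codes = np.round((amplitudes + 1.0) / 2.0 * (levels - 1)).astype(int)
    return t, codes


def sine_440(t):
    """Stand-in for the 'analog' signal: a 440 Hz sine wave."""
    return np.sin(2 * np.pi * 440 * t)


t, codes = sample_and_quantize(sine_440, duration_s=0.01,
                               sample_rate=8_000, bit_depth=3)
print(len(codes), codes.min(), codes.max())
```

    With 3 bits there are only 8 possible codes, so the rounding error is large; raising `bit_depth` to 16 gives the 65,536 levels used for CD-quality audio.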

  • @adamsik1025 3 years ago

    Is there any sense in buying headphones with a frequency range up to 80,000 Hz if humans are capable of hearing sounds "only" up to 20,000 Hz?

  • @maryamashfaq6700 4 years ago

    What is the memory storage format of audio and video files?

  • @michaelmanuel1676 3 years ago

    You're amazing, please keep it up!!!

  • @arunmehta8234 4 years ago

    I really enjoyed it. I couldn't understand the abstractness of the profile photo.

  • @maryamashfaq6700 4 years ago

    Could you list all the digital audio formats that are saved in memory? Please answer.

  • @vijaykhandagale2591 3 years ago

    Would you suggest some reading material to accompany your videos?

    • @ValerioVelardoTheSoundofAI 3 years ago

      Yes, this great book:
      - Music Similarity and Retrieval www.springer.com/gp/book/9783662497203

    • @vijaykhandagale2591 3 years ago +1

      @@ValerioVelardoTheSoundofAI Thanks for the suggestion.

  • @venkatesanr9455 4 years ago

    Hi Valerio, I have one doubt. If an audio file has a sampling rate of 8000 Hz, is it correct to extract audio features after upsampling the audio to 44100 Hz or 32 kHz? Kindly give some suggestions on this.

    • @ValerioVelardoTheSoundofAI 4 years ago

      I would stick with the files at 8 kHz, which has the advantage of resulting in lighter data.

    • @venkatesanr9455 4 years ago

      @@ValerioVelardoTheSoundofAI I believe you are saying that it is better to keep 8000 Hz for audio feature extraction. Would it be wrong, or would there be side effects, if we changed the sampling rate to 44100 Hz and then extracted audio features? Kindly reply.

    • @imamuddin8042 4 years ago +1

      @@venkatesanr9455 If you upsample, that would not hurt your signal, but it would require more data storage to accommodate the extra samples, as @Valerio mentioned. Moreover, if your signal is a bandpass signal on a carrier frequency (not a baseband signal), there is a limit to upsampling as well. I mean you can't upsample your signal to as high a rate as you want.

    • @venkatesanr9455 4 years ago

      @@imamuddin8042 Thanks for your kind response, Sharif. I have recorded speech samples at a sample rate of 8000 Hz. While processing them, I wondered whether upsampling the speech data to 44.1 kHz or 32 kHz and then extracting audio features would have any effect.

    • @imamuddin8042 4 years ago

      @@venkatesanr9455 Since you have recorded a speech signal (which spans 20 Hz to 20 kHz) at a sample rate of 8 kHz, it already includes the aliasing effect of signal components above 4 kHz. So I would recommend putting an anti-aliasing low-pass filter with a cutoff at 4 kHz in the analog domain, before you sample your data. Or you can use a sampling rate of 44.1 kHz from the start, instead of thinking about upsampling later.
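
      One way to see why upsampling does not add information is to upsample a test tone and look at the spectrum. The sketch below uses FFT zero-padding as the interpolation method, which is just one of several possible resamplers and not necessarily what any particular library does; `upsample_fft` is a hypothetical helper, and the 1 kHz tone stands in for real speech.

```python
import numpy as np


def upsample_fft(signal, factor):
    """Upsample by an integer factor by zero-padding the spectrum.
    Sketch only: ignores the special handling of the Nyquist bin."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    new_n = n * factor
    padded = np.zeros(new_n // 2 + 1, dtype=complex)
    padded[: len(spectrum)] = spectrum        # new high-frequency bins stay zero
    return np.fft.irfft(padded, n=new_n) * factor


rate = 8_000
t = np.arange(rate) / rate                    # one second at 8 kHz
speech_like = np.sin(2 * np.pi * 1_000 * t)   # 1 kHz test tone

upsampled = upsample_fft(speech_like, factor=4)   # now at 32 kHz
new_rate = rate * 4

power = np.abs(np.fft.rfft(upsampled)) ** 2
freqs = np.fft.rfftfreq(len(upsampled), d=1 / new_rate)
fraction_above_4k = power[freqs > 4_000].sum() / power.sum()
peak_hz = freqs[np.argmax(power)]
print(peak_hz, fraction_above_4k)
```

      The upsampled signal has four times as many samples, but essentially no energy above the original 4 kHz Nyquist limit, so features computed from it see the same content at a higher storage cost.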

  • @vandanagoyal3037 3 years ago

    Thanks for the video

  • @hossien2843 4 years ago

    can you share the music?

  • @maryamashfaq6700 4 years ago

    What is the digital format in which audio is saved in memory?

  • @ankithooda1536 3 years ago

    It is so so awesome.

  • @SuperLucasGuns 4 years ago

    this is great stuff

  • @meedkal79 4 years ago

    We have a project that recognizes speech. Can you help me with that?

    • @ValerioVelardoTheSoundofAI 4 years ago +1

      If you need general feedback, I suggest you join The Sound of AI community (sign-up link in the description). If you need more involved help, I do consulting.

    • @meedkal79 4 years ago +1

      Ok thank you

  • @juniorsilva5713 9 months ago

    Thanks a lot! =)

  • @abhishek-shrm 4 years ago +1

    Great video. You look like a musician with that long hair. I wonder if you actually play any musical instrument.

  • @KindCrimeRecordings 7 months ago

    Acoustic instruments (like the piano) aren't analog instruments (: