The Spectrogram and the Gabor Transform
- Published 4 Jun 2024
- Here I introduce the spectrogram, which is a moving-window Fourier transform, giving insight into the time-frequency content of a data set.
Book Website: databookuw.com
Book PDF: databookuw.com/databook.pdf
These lectures follow Chapter 2 from:
"Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control" by Brunton and Kutz
Amazon: www.amazon.com/Data-Driven-Sc...
Brunton Website: eigensteve.com
This video was produced at the University of Washington - Science & Technology
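The moving-window Fourier transform described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code from the book or the video: the function name `gabor_spectrogram`, the window width `a`, and the two-tone test signal are all assumptions chosen for the example.

```python
import numpy as np

def gabor_spectrogram(x, fs, centers, a):
    """Power |G(t, f)|^2 of x for Gaussian windows centred at `centers` (seconds).

    `a` is the Gaussian width in seconds (hypothetical parameter name).
    """
    t = np.arange(len(x)) / fs
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)
    power = np.empty((len(freqs), len(centers)))
    for k, tau in enumerate(centers):
        g = np.exp(-((t - tau) ** 2) / a**2)   # Gaussian window centred at tau
        power[:, k] = np.abs(np.fft.rfft(x * g)) ** 2
    return freqs, power

# Assumed test signal: 50 Hz for the first second, 120 Hz for the second.
fs = 1000
t = np.arange(2 * fs) / fs
x = np.where(t < 1, np.sin(2 * np.pi * 50 * t), np.sin(2 * np.pi * 120 * t))

centers = np.array([0.5, 1.5])
freqs, power = gabor_spectrogram(x, fs, centers, a=0.1)
ridge = freqs[np.argmax(power, axis=0)]   # strongest frequency in each window
print(ridge)
```

Each column of `power` is the power spectrum seen through one window position, so the column index plays the role of time and the row index the role of frequency. A plain Fourier transform of the whole signal would show both tones but not when each one occurs.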
Is the Gabor transform a special case of an STFT? What are the tradeoffs of using other window functions in place of the Gaussian?
You never cease to amaze us Dr. Brunton!!! Keep up your magnificent work!!
Thanks for all lectures, I really appreciate your explanations
You're really an amazing teacher! Explained very clearly. Shows you also understand it very well.
I was trying to understand spectrograms recently and finally thanks to this video it clicked! A few clear visuals do a wonder for elucidating the maths and concepts
Thanks a lot for your inspiring lectures !
Zoheir TIR
Algeria
i wish i had this when i was first learning the math required for signal processing. great stuff!
This is just amazing. He can convey the overall concept in 10 minutes better than one can get it from books in an hour. Conceptual understanding is crucial to guide learning. Once you understand what the concept is about, where you are within the topic, and how it can be used in practice, it's much easier to absorb the material and to connect the detailed concepts you learn afterward by digging deeper. Having this overall knowledge is essential, and most books don't give that.
Thank you Professor Brunton, you're really excellent at this.
teaching tech is beyond this gen keep it up prof, very useful and understanding
Best video for understanding the intuition of spectrogram!
Dr. Brunton's concise explanations of all these transform and compression algorithms are first class, and the visual here just incomparable!
In this video, on the topic of Shazam's algorithm, you mentioned that it has the caveat that: when the song is stretched in time it makes it harder to match peaks in the power spectrum. This got me thinking about the other dimension: a slightly transposed (pitch shifted) song also breaks the algorithm given the spectrum was measured in fixed frequency.
This could be me imagining things but: it might be useful to have spectrum that measures relative frequency. That way you can match songs even if it's transposed to different keys.
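The "relative frequency" idea in the comment above can be illustrated with a small sketch. Transposing a song multiplies every frequency by the same factor, so on a logarithmic frequency axis the peaks shift by a constant and their spacing is unchanged. The helper name `semitone_offsets` and the peak values are made up for illustration; this is not a description of Shazam's actual algorithm.

```python
import numpy as np

def semitone_offsets(peak_freqs_hz):
    """Distance of each spectral peak from the lowest peak, in semitones."""
    f = np.sort(np.asarray(peak_freqs_hz, dtype=float))
    return 12 * np.log2(f / f[0])

# Hypothetical peak frequencies (Hz) picked from one spectrogram column.
original = np.array([220.0, 277.18, 329.63, 440.0])
transposed = original * 2 ** (3 / 12)   # same peaks shifted up three semitones

print(semitone_offsets(original))
print(semitone_offsets(transposed))     # same offsets despite the key change
```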
Thank you for clear, concise, organized presentation. Appreciative of how much time and effort such a presentation / explanation takes to create and deliver. Appreciative of the format you use and precision in getting explanation correct. Explanation of terms and where terms originate has always been helpful in your presentations. Thanks again. (Erik Gottlieb)
Thanks, Dr. Brunton. The Gabor transform was very well explained. I needed the code for it.
Thank you Dr. Brunton for your always insightful and inspiring lectures. I may find a way to use this in my research
Wonderful!
Thank you Dr. Brunton! Clear and concise. Liked and subscribed.
The description is great, the ideas are clear and the logic is coherent. Thanks for your work.
Glad it was helpful!
You are such a good lecturer!
Thank you!
I like the whole series very much.
Excellent explanation. I'm particularly happy about the mention of how Shazam works: that's something that's intrigued me for a while now. Thank you!
I really love this video. I am working on audio classification and have learned the basics of the spectrogram (STFT), the mel scale and mel spectrogram, MFCCs, the constant-Q transform, etc., but I still can't figure out which spectral representation to use under which conditions. Apart from the representation, there is the selection of the window length and the hop length (a trade-off between temporal and frequency resolution). At the end of this series I would love to see a comparison of, and your view on, these different representations.
Awesome videos, really great content and great quality, and also a great topic.
This is one of the most beautiful things in engineering. Fundamentally, you compute the Fourier transform over specific time windows, and you can see which frequency components are present at that instant!
This is a wonderful lecture!
The bit about Shazam using the power spectral density property to accurately identify songs was interesting.
Thanks for the content
This is really neat!
Thanks a lot Prof. Steve , please could you upload a video for using Spectrogram on sound classifications and feature extraction , regards
Thank you very much for the clear intuitive explanations Dr. Brunton. I was wondering if there is a Gabor transform analog that uses a data driven approach like the SVD? In the case of SVD would the power spectrum change along with the basis or should one compute the basis with the whole signal and only then with a fixed basis apply the transform?
Excellent quality!
I like the creativity with the transparent wall between the lecturer and the camera, on the other hand the presentation seems strangely surreal due to the fact the presenter is only visible as floating head/shoulder and arms
Beautiful explanation
Wow, that's awesome! Thank you for introducing it to me, Dr. Brunton :)
I work with number theory. The teacher is powerful.
Very nicely explained!! Thanks!
I love your videos and explanation
I am enjoying your trip of learning process
I'm so glad!
Thanks, this video was very helpful for me
Please make a video on the mel spectrogram and why it can't be reversed. Thanks for the book and the videos.
love your work!!
Thank you very much 🎉🎉 you saved my weekend 😂 Have a great day
Great explanation
If the music is shrunk or stretched, it should still be easy to recognize: if the program is adjusted to use the relative sizes (percentages) of the time intervals between sound peaks, then a unique pattern can still be generated.
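A quick sketch of this suggestion, under the assumption that the stretch is uniform over the whole clip: the gaps between peaks, expressed as fractions of the total span, do not change when the recording is sped up, slowed down, or offset in time. The function name and peak times are hypothetical, and this is not how Shazam is documented to work.

```python
import numpy as np

def interval_ratios(peak_times):
    """Gaps between consecutive peaks as fractions of the total span."""
    gaps = np.diff(np.sort(np.asarray(peak_times, dtype=float)))
    return gaps / gaps.sum()

# Hypothetical peak times (seconds) found in a spectrogram.
original = np.array([0.40, 1.10, 1.35, 2.80, 3.05])
stretched = 1.15 * original + 2.0   # played 15% slower, starting 2 s later

print(interval_ratios(original))
print(interval_ratios(stretched))   # identical pattern
```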
This has a great potential in the future…
Thank you …
Super good explanation!! May I ask how we get the power information if the y-axis is frequency and the x-axis is time? That is, how large is the signal at each frequency at a given instant?
Hi Steve, I can't thank you enough for making these beautiful videos. I have purchased the book as a way to say thank you. The book is beautifully printed, and if I can give you some feedback from a reader's perspective, I would like the book to have a larger font. The current one is too small to read for a long time. Anyway, thank you for your work!
Great video sir please keep posting such videos
Beautiful video! I have the following questions. Why does the weight function have to be a Gaussian? What would happen if this function were, for instance, a constant of unit value (so that I'm applying a weight of 1 across the entire window)?
Hello Steven sir,
I went through wavelet transforms back in the day, and I wanted to ask: are they not similar, in the sense that they too were developed because the Fourier transform fails to specify the time at which a certain frequency occurred in the original signal? Also, please do bring out a short video lecture series on wavelet transforms as well.
Thank you.
thank you for your lecture. how to make this kind of video in which the drawings can be shown in front of lecturer?
Thank you so much for your labour. Do you mind to make a video on harmonic distortion?
I can't promise I'll make one, but I will add it to the list.
Wow ! Thanks a lot!
Thanks so much indeed
Perfect! thanks
best explanation ever!
Wow, thanks!
... i'm just silently wondering why this was in my recommendations; i am a social science major and i spend most of my time here on youtube watching cat videos. oh, the yt-algorithm. however: keep up the good work!
SUE YouTube! You've been scarred for life!
How do I locate the fundamental frequency at that particular instant? and what do I do to find the ratio of the harmonics to the fundamental frequency as it evolves with time? :)
Thanks a lot for the series of videos. They are very useful for my projects. How about S-transform? Thank you again.
Thanks! Maybe I'll make one on the S-transform sometime.
Hi Professor Steve, Nice.
Glad you like it!
Thanks a lot ...
what are the HUP implications on time and freq uncertainty for the Gabor Transform?
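For reference, the trade-off this question points at can be written out. The notation below is an assumption for illustration (a window $g$ of width $a$, centred at time $t$), not a transcription of the lecture:

```latex
\mathcal{G}[f](t,\omega) = \int_{-\infty}^{\infty} f(\tau)\, e^{-i\omega\tau}\, \bar{g}(\tau - t)\, d\tau,
\qquad g(\tau) = e^{-\tau^{2}/a^{2}}

% Time-frequency uncertainty for any window g, where \sigma_t and \sigma_\omega
% are the standard deviations of |g|^2 and |\hat{g}|^2:
\sigma_t\,\sigma_\omega \;\ge\; \tfrac{1}{2}

% For the Gaussian window above: \sigma_t = a/2 and \sigma_\omega = 1/a,
% so \sigma_t\,\sigma_\omega = 1/2 and the bound is attained.
```

A narrow window (small $a$) localizes events well in time but smears them in frequency, and a wide window does the opposite. The Gaussian is the window that meets the bound with equality.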
Dear sir
Is W vs t a continuous function? Can we identify the nature of the change in frequency and then invert it back to get that part of 'f'?
Great explanation, can't wait for the next vid. BTW, this is just like a music score: the waves are decomposed by a windowed Fourier transform. I am wondering, in a real control or identification system, how do we update with the real-time signal? We cannot wait a long period, and the size of the window g(x) also matters. How do we choose an appropriate length?
You are right -- and yes, in control applications, the spectrogram will be computed continuously with a sliding window.
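One way to picture the sliding-window update mentioned in this reply is a small streaming class that keeps only the most recent samples and emits a new spectrum column every `hop` samples. This is a sketch under stated assumptions: the class name, the Hann window, and the `window_len` and `hop` values are illustrative choices, not anything specified in the video.

```python
from collections import deque

import numpy as np

class StreamingSpectrogram:
    """Emit one power-spectrum column per `hop` new samples (illustrative)."""

    def __init__(self, window_len=256, hop=64):
        self.window = np.hanning(window_len)
        self.hop = hop
        self.buffer = deque(maxlen=window_len)   # keeps only the latest samples
        self.since_last = 0

    def push(self, samples):
        columns = []
        for s in samples:
            self.buffer.append(s)
            self.since_last += 1
            full = len(self.buffer) == self.buffer.maxlen
            if full and self.since_last >= self.hop:
                frame = np.array(self.buffer) * self.window
                columns.append(np.abs(np.fft.rfft(frame)) ** 2)
                self.since_last = 0
        return columns

# Assumed test input: a 48 Hz tone sampled at 1024 Hz, arriving in chunks of 100.
fs = 1024
signal = np.sin(2 * np.pi * 48 * np.arange(1000) / fs)
stream = StreamingSpectrogram(window_len=256, hop=64)
columns = []
for start in range(0, len(signal), 100):
    columns.extend(stream.push(signal[start:start + 100]))
```

The window length sets the latency and the frequency resolution together: here the first column cannot appear until 256 samples have arrived, and the bin spacing is `fs / window_len` (4 Hz in this example).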
is there any difference between Gabor Transform and STFT? Is it just a particular case with a gaussian window and unitary gain?
Amazing
like your video, especially programming in both python and Matlab
Glad you liked it!
thank you
These videos have incredible quality, really, in both content and graphics. My only real question is: how are you able to explain and write mirrored while making it look so natural?!
You can record the video with backwards writing, then mirror the video in an editor afterwards. :)
Wonderful :)
Thank you! Cheers!
How to find the time resolution? Look at the width of the Gabor function?
Doesn't a Gaussian window alter the frequency content of the signal ??
What about the width of the Gaussian function?
This is behind the uncertainty principle?
Why do we use Gaussian window can't we use a rectangular window ??
So basically it is the short-time Fourier transform-based spectrograms. Please reply if yes or no.
I have a few out-of-this-world examples of spectrograms you just might be interested in.
This is sweeet! How can this be applied to voice recognition? Go dawgs!
I think so. Modern voice recognition uses recurrent neural networks, but the spectrogram can be very useful here too.
I would like to know: what are the advantages of this spectrogram over the STFT algorithm or the mel-spectrogram?
Mr. Brunton's explanation was slightly misleading. The Gabor transform was the first time-frequency representation of signals, and it is a special case of the STFT (short-time Fourier transform): the STFT can use any sliding window, whereas the Gabor transform uses only a Gaussian window. Other choices include the triangular, Hamming, Hann, Blackman, and many other windows. Each has advantages and disadvantages; for example, a wider main lobe in frequency goes with more strongly suppressed side lobes, and vice versa. The right choice always depends on the use and purpose of the analysis. For audio signals, the most common window is the Hann window: it does not have edges as sharp as, e.g., the triangular window because it is built from a harmonic function (a cosine), so it is smoother, but its main frequency lobe is also wider. The most common spectrogram is computed with the STFT (the Fourier transform in general), not the Gabor transform. The mel spectrogram is a little different because it uses mels; the mel is a psychoacoustic unit for subjective pitch (the name comes from "melody"). It also uses the cosine transform, but in short it is another time-frequency representation of a signal, one that tries to imitate human hearing and musical perception. The mel spectrum (and cepstrum) is commonly used in MIR (Music Information Retrieval) research because this representation is usually closer to the subjective aspects of human hearing and is thus better for most applications (so far).
@@MrKrvo Ok, I got it. Thanks for your time.
@@ffelixvideos No problem.
Thanks for the great extra information!
@@MrKrvo Thanks for this !
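The window trade-off discussed in this thread can be checked numerically. The sketch below measures how much energy from a single off-bin tone leaks into frequency bins far from the peak for three windows. The helper name `far_leakage_db`, the tone position, the guard band, and the Gaussian width (`n / 8` samples) are assumptions chosen for illustration.

```python
import numpy as np

def far_leakage_db(window, tone_bins=20.5, guard=8):
    """Largest spectrum value outside a guard band around the peak, in dB
    relative to the peak, for a tone that falls between two FFT bins."""
    n = len(window)
    x = np.sin(2 * np.pi * tone_bins * np.arange(n) / n)
    spec = np.abs(np.fft.rfft(x * window))
    peak = int(np.argmax(spec))
    far = np.ones(spec.shape, dtype=bool)
    far[max(0, peak - guard): peak + guard + 1] = False   # mask out the main lobe
    return 20 * np.log10(spec[far].max() / spec[peak])

n = 256
k = np.arange(n) - (n - 1) / 2          # sample index centred on the window
windows = {
    "rectangular": np.ones(n),
    "hann": np.hanning(n),
    "gaussian": np.exp(-0.5 * (k / (n / 8)) ** 2),
}
leakage = {name: far_leakage_db(w) for name, w in windows.items()}
for name, db in leakage.items():
    print(f"{name:12s} {db:7.1f} dB")
```

The rectangular window keeps the narrowest main lobe but leaks the most into distant bins; the tapered windows trade a wider main lobe for much lower leakage, which is the point made in the reply above.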
i know its late but we ahould definitly have two, since left and right channels/phases can be different . would be a good extension. yYet, even more so 3d phases allow for darn near infinite phases so would lobe a transform that splits that u0 based on color and intensity to see the intricacies. I mean if were not talking polyphonic it should be possible, monophonic yeah . but honestly I dont get why its a problem for multiple phases and poly phonic if we take each initiation of intonation as a phase of a fractional phases that doesn't multiple/algorithmically interact with/ into the greater whole/nor parts (if given the total data first). Further, this is based off of first principle in the idea that we can generate moire patterns even with aperiodic data (Glass patterns) easily yet it is the microstructures (local) that we aren't able to inversely appropriate/segment, yet if the phase data is there we should be able to,..-wrong?
typos will not be dealt with, it to much to read. so guess
So the spectrogram just shows the frequency played at each time, but there is no information about the amplitude of these frequencies ?
I got my answer from the next video. Thank you :)
You rock
Zsa Zsa or Eva Gabor?
Pleased to meet you, the Real Men and Women of culture ;)
@6:45 low 22222 high~~~😁😁😁
Fun fact: the Gabor transform was named after Gábor Dénes, who was a Hungarian physicist and electrical engineer, and he also got the Nobel Prize in Physics for inventing holography. Sorry for the grammatical mistakes.
@11:50 I miss the times when YouTube had dislikes, and we had a chance to avoid the garbage...
Did anyone else realize that Dr. Brunton is writing backwards on a screen?
Did you mention the Gabor transform? I must have dozed off if you did. I've been re-reading his paper and want to understand its relation to the spectrogram.
Music sheets are spectrograms from Gabor transformation LOL
so one can technically write an algorithm to listen to music and generate sheet music.... holy there are plenty after google it
This is indeed one of the big open challenges that people are working on. Can you imagine if researchers could create an algorithm that would generate new Vivaldi?
@@Eigensteve What exactly is the open challenge? Whether you can write an algorithm to write new music from scratch? But it would probably just be random stitching together of notes from existing pieces, wouldn't it?
YouTube University Freshers where you at!