Not a query specifically about this video, but can you suggest a tutorial for learning about speaker verification using i-vectors?
I would suggest this tutorial by Najim Dehak (inventor of the i-vector):
www.superlectures.com/odyssey2016/i-vector-representation-based-on-gmm-and-dnn-for-audio-classification
I want to know how to implement this in my project. Could you also share the dataset, please?
Hi. Is it possible to run this model on Windows? The documentation only mentions Linux.
Can I get the code?
github.com/mravanelli/SincNet
Can you do a video on RawNet?
ua-cam.com/video/9lOkPtilD74/v-deo.html
Hi Krishna, thanks for this video. I have a question:
As mentioned here ua-cam.com/video/sendxu-rHlY/v-deo.html, are you predicting the speaker for each 200 ms of audio?
Not for each 200 ms. They use a 200 ms window with a 10 ms shift to obtain frame-level classification posteriors, then pool those posteriors to obtain an utterance-level classification.
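The windowing-and-pooling step above can be sketched as follows. This is a minimal illustration, not the actual SincNet code: `frame_posterior_fn` is a hypothetical stand-in for whatever per-frame classifier produces speaker posteriors, and the sample rate and average pooling are assumptions for the example.

```python
import numpy as np

def utterance_prediction(waveform, frame_posterior_fn, sr=16000,
                         win_ms=200, hop_ms=10):
    """Slide a 200 ms window with a 10 ms shift over the waveform,
    collect per-frame speaker posteriors from a (hypothetical)
    frame-level classifier, and average-pool them into one
    utterance-level decision."""
    win = int(sr * win_ms / 1000)   # window length in samples
    hop = int(sr * hop_ms / 1000)   # shift in samples
    posteriors = []
    for start in range(0, len(waveform) - win + 1, hop):
        frame = waveform[start:start + win]
        posteriors.append(frame_posterior_fn(frame))  # shape: (n_speakers,)
    pooled = np.mean(posteriors, axis=0)  # average pooling over frames
    return int(np.argmax(pooled)), pooled
```

So the classifier still runs once per 10 ms shift, but only the pooled result is reported per utterance.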