This is the first time I actually understood how embeddings are generated using word2vec. In most other tutorials on word2vec this exact thing was missing.
Excellent series, wonderfully explained mechanisms - you won't see this elsewhere. Thank you!
I love your explanation, can't wait for the next part🤩
Thank you Krish for the wonderful explanation.
I can't wait for the next part, in which the SkipGram approach will be discussed.
45:07 Sir, it is not the window size that represents the output vector dimension. Otherwise, why is the vector size parameter 100 by default and the window size a separate parameter when you train a word2vec model from the gensim lib?
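A minimal sketch, assuming gensim 4.x, showing that window and vector_size really are two independent parameters (the 100-dimensional vectors come from vector_size, not from window; the toy corpus is made up):

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences (hypothetical data).
sentences = [
    ["the", "king", "rules", "the", "country"],
    ["the", "queen", "rules", "the", "country"],
]

# window = how many words are sampled on each side of the centre word;
# vector_size (default 100) = the dimension of the learned embeddings.
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=0)

print(model.wv["king"].shape)  # (100,) regardless of window=5
```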
Great video
The videos are very helpful for me. Thanks Krish. Waiting for some more advanced conceptual videos related to deep learning.
If the embeddings of two words have a cosine similarity close to 1, it implies high similarity or synonymy (depending on the model training).
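A small NumPy sketch of how that cosine similarity can be computed (the two vectors below are made-up stand-ins for word embeddings):

```python
import numpy as np

def cosine_similarity(a, b):
    # Ranges from -1 to 1; values near 1 mean the vectors point in
    # almost the same direction, i.e. the words are used similarly.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

king = np.array([0.90, 0.10, 0.40])   # made-up embedding
queen = np.array([0.85, 0.15, 0.50])  # made-up embedding
print(cosine_similarity(king, queen))
```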
41:49 - What is the point of initializing the weights when all the 0s (which are n-1 in number) multiplied by any number will remain 0 anyway?
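One way to see why those weights still matter: multiplying a one-hot vector by the weight matrix simply selects one row of it, so the single 1 picks out that word's embedding, and it is that row which gets updated during training. A rough NumPy sketch with a vocabulary of 7 and an embedding size of 5 (all values made up):

```python
import numpy as np

vocab_size, emb_dim = 7, 5
W = np.random.rand(vocab_size, emb_dim)  # randomly initialized weights

one_hot = np.zeros(vocab_size)
one_hot[2] = 1                           # one-hot vector for word index 2

hidden = one_hot @ W                     # the zeros drop every row except row 2
print(np.allclose(hidden, W[2]))         # True: the product is word 2's row
```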
Sir, in the Hindi batch there are 5 videos already uploaded, but in the English batch there is only one video. Why is there this difference?
Amazing!!!!
Please teach in FSDS May batch also
thanks a lot
You said that the window size defines the length of the vector to which a word is transformed, but in the next video, while training, you had a window size of 5 and got a vector of 100 dimensions? Please clarify.
I have the same query.
Why should the number of hidden neurons equal the window size? It can be anything, right? The window size decides the number of input word neurons, which is window size - 1. Correct me if I am wrong.
You are correct, there is no error in your reasoning. One note on window size: it is a hyperparameter that can be fine-tuned based on results.
As for the hidden layer size, it is also a hyperparameter. Even in the paper Google published, they used 5 as a typical window size, and for the features they experimented with embedding sizes of 100 to 1000 dimensions.
Great, sir.
I learned so many things from your videos, thanks a lot sir.
Sir, can you show us where to download a pretrained word2vec model?
I think towards the end the explanation of window size is wrong. If you multiply (7x5) * (5x7), your output is basically a 7x7 matrix, so for each vocab word you have one vector of size 1x7 representing it. Also, I believe window size does not mean the feature vector size; it just means how many words you are sampling before and after the context word. It is ultimately the final layer's output dimension that holds the embeddings. For example, if the last hidden layer is of size (7x512), you would get (7x7) * (7x512), which would give you embeddings of 7x512.
I am not sure, but I think it's not a matrix multiplication. If it's analogous to matrix multiplication, then what you said seems to be correct.
@@BenASabu It is always matrix multiplication in deep learning, unlike classical ML algos.
@@mudumbypraveen3308 Bro, could you please explain how the initial 7x5 matrix comes about for each input word, and how the machine is able to attain the concept of feature representation during training?
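To make the shapes discussed in this thread concrete, here is a rough sketch of a single CBOW-style forward pass with a vocabulary of 7 and a hidden size of 5 (random stand-in weights, not trained values):

```python
import numpy as np

vocab_size, hidden_size = 7, 5
W_in = np.random.rand(vocab_size, hidden_size)   # 7x5 input-to-hidden weights
W_out = np.random.rand(hidden_size, vocab_size)  # 5x7 hidden-to-output weights

# Average the one-hot vectors of the context words (indices 1 and 3 here).
context = np.zeros(vocab_size)
context[[1, 3]] = 0.5

hidden = context @ W_in                          # shape (5,), hidden representation
scores = hidden @ W_out                          # shape (7,), one score per vocab word
probs = np.exp(scores) / np.exp(scores).sum()    # softmax over the vocabulary
print(hidden.shape, probs.shape)                 # (5,) (7,)

# After training, each row of W_in (7x5) is usually taken as the embedding of
# the corresponding vocabulary word, so the embedding size here is 5.
```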
Yup, even I didn't understand the last segment; it became a hotchpotch.
The same word can be present in different sentences!! So do we calculate the vector for that word in every sentence and take the average?
Is the number of hidden neurons equal to the window size?
Hope your Dubai tour was good.
I like your videos and earnest style of speaking, but I was confused about king - man + queen = woman. Logically this seems more correct: king - man + woman = queen?
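For what it's worth, the standard analogy is indeed king - man + woman ≈ queen. With a pretrained model loaded through gensim's downloader it can be checked roughly as below (treat the model name and the exact output as assumptions; the download is large and happens on first use):

```python
import gensim.downloader as api

# Load pretrained word2vec vectors (model name assumed; big download on first run).
wv = api.load("word2vec-google-news-300")

# king - man + woman  ->  positive=["king", "woman"], negative=["man"]
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# Typically prints something like [('queen', 0.71...)]
```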
Is this the first video lecture on NLP?
ua-cam.com/play/PLZoTAELRMXVNNrHSKv36Lr3_156yCo6Nn.html
No
@@apurvaxyz Do you have the link for the entire NLP playlist, by any chance?
@@a.chitrranshi Here's the playlist ua-cam.com/play/PLZoTAELRMXVMdJ5sqbCK2LiM0HhQVWNzm.html
Sir, please teach this in the FSDS batch also.
Hello sir, please reply. On every video I put a comment about the data science course, but you don't reply.
Can you please put out videos for computer vision (DL_CV)?
Window size doesn't give the embedding dimension.
The last portion was confusing.
I want the link to that project on making ML models more efficient.
This is not good content for learning word2vec. His conception is not clear.
Too many ads.
Wrong in many ways. Window size and feature dimensions need not be the same. Word2Vec is a 2-layer NN; here only one layer is shown. Overall, poorly explained.
I do agree brother! It is wrongly explained
Deep learning road map