RIP bro. Really sad that we've lost a talent like you. Your paper is now really helping lots of us and thanks for your contribution.
Thanks mate! This is the best explanation of the original word2vec. R.I.P., may your onward journey be peaceful.
Amazingly well done! Your paper, this talk, and the wevi tool have made it MUCH easier for me to understand the word2vec model. You definitely succeeded in your goal of explaining this topic better than the original authors. Thank you!
Thank you!
RIP dear stranger, you've made it so much simpler for all of us.
I have read your paper and after watching your presentation, I've pretty much understood this model. Thanks!
Buddy, you are a saviour... this is all I needed to get started on my project! God bless!
Just 3 minutes into the lecture, it has already caught my attention and cleared off my sleepiness. :D
One of the best AI presentations I've seen. Have a peaceful flight, my friend!
Fantastic talk! You gave me a much clearer understanding of word embeddings! Awesome!
Kevin Zhou Thanks. Glad it helped!
Hi, I am a master's student at Nanjing U. and I'm interested in word embeddings and related NLP technologies. Can I have your WeChat or other social media accounts? Looking forward to knowing you. Thanks.
Wei Tong Uh... just send me an email: ronxin@umich.edu
Thank you for this revealing talk. Good takeaways!
Superb talk, also read your paper before watching this, thanks for helping people understand this great work.
Couldn't help imagining how much he would be able to contribute to the world of NLP if he were still alive...
R.I.P, thank you for your contribution
RIP, Thank you for your contribution
Excellent presentation! I had kind of got the basics of w2v and applied them in a couple of problems and noticed how well they work, but never found a paper or presentation that would really explain what w2v does and how, so that I'd understand. This presentation did. Thank you!
Pitbull Vicious thank you!
Oh my god... I came back to this video because of a great explanation...But now after reading comments, I realize that the tragedy already happened the first time when I was watching this :( RIP
Many thanks for your great presentation and your perfect website!
Really well done, such an improvement over the explanation in the original paper!
thanks!
I would add that this is in no way a replacement for the explanation in the original paper... The original one(s) were written for researchers in the field. For people who've done neural nets, especially neural language modeling, for a while, that original paper was a joy to read and offers a lot more insight into the history and competitors of the model.
True. I'm not saying that it's a bad paper in any way, but I do feel that it could have benefitted from being more explicit or more detailed at some points. In particular, the negative sampling objective function could have been discussed more. Being familiar with neural networks, but not neural language modelling in particular, it took me quite a while to work out what's really happening in word2vec.
agreed.
Great talk! Also, I appreciate the time taken out to put in subtitles! The volume got pretty low at times, and was glad I could rely on the subtitles.
Awesome job! Really straightforward explanation. Thank you very much! :)
Is that a typo at 22:15 (the 2nd chain rule part)? Or am I not following the derivation?
- In the video and the paper: dE/dw'_ij = (dE/du_j)(u_j/dw'_ij)
- Should it be: dE/dw'_ij = (dE/du_j)(du_j/dw'_ij)?
+Yoo Jongheun You are absolutely correct. I will correct this in the paper. Thanks.
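For anyone else checking the derivation: with that fix, and using the paper's notation as I read it (u_j is the net input to output unit j, h_i is the i-th hidden-layer value, and e_j = y_j - t_j is the output error), the step works out to:

\frac{\partial E}{\partial w'_{ij}} = \frac{\partial E}{\partial u_j} \cdot \frac{\partial u_j}{\partial w'_{ij}} = e_j \, h_i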
Most impressive... And excellent presentation.
I know this channel will no longer be updated. RIP bro. It's sad to get to know you this way, and to know you better through your contributions on YouTube. Thanks.
Wow... that was a really great talk!
Awesome talk! Thank you and RIP.
Hey Xin, I've been studying deep learning for about 6 months. I think your slide's description of backprop is the best I've seen yet. I think you've summarized it as concisely as possible. The math finally 'clicked.' Great job. Just for others, I don't believe you called it out in the video, but it's the chain rule that allows you to work backwards toward the input layer around 16:07, correct?
Yes, chain rule is big in ANN
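To make the "work backwards with the chain rule" idea concrete, here is a toy numpy sketch of one skip-gram training step with a full softmax (illustrative only, not the actual word2vec code; names loosely follow the paper):

import numpy as np

V, N = 10, 5                                # toy vocabulary size and embedding size
rng = np.random.default_rng(0)
W  = rng.normal(scale=0.1, size=(V, N))     # input->hidden weights (the word vectors)
Wp = rng.normal(scale=0.1, size=(N, V))     # hidden->output weights (W' in the paper)
lr = 0.1

def train_pair(input_idx, target_idx):
    h = W[input_idx]                        # hidden layer = one row of W (one-hot input)
    u = h @ Wp                              # net input u_j for every output unit
    y = np.exp(u - u.max()); y /= y.sum()   # softmax prediction
    e = y.copy(); e[target_idx] -= 1.0      # e_j = y_j - t_j, i.e. dE/du_j
    h_grad = Wp @ e                         # chain rule again: error pushed back to the hidden layer
    Wp[:] -= lr * np.outer(h, e)            # chain rule: dE/dw'_ij = e_j * h_i
    W[input_idx] -= lr * h_grad             # update only the input word's vector

train_pair(2, 7)                            # one (input, target) pair, by index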
Very nicely explained.....
Very well done!! Good explanation
Great video! By far the best explanation of Word Embeddings so far! Xin Rong - do one for GloVe too!
Thank you so much for clear explanation.
Wonderful job @Xin Rong
Nice tutorial. Thanks.
Outstanding, thank you Xin.
Thank you so much for the video! :)
Interesting Talk ,... Thanks :)
Great presentation Xin. Very informative. Next time, I'd suggest ensuring the volume is adequate. I've got my volume turned up to 100% and it's barely audible.
Thank you for this video and especially for this awesome paper. What I don't fully understand, though, is why, and/or whether, the input words have to be one-hot encoded. What if I used a different representation? Let's go crazy and say I use a pre-trained word2vec model with an arbitrary embedding size. What if I used those vectors as inputs in order to learn the weights?
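For context on why the one-hot encoding matters: multiplying a one-hot vector by the input weight matrix just selects one row of it, so the first layer is effectively a lookup table with one vector per word. A tiny numpy sketch of that equivalence (toy sizes, made-up values), and of what a dense input would do instead:

import numpy as np

V, N = 5, 3                                  # toy vocabulary and embedding sizes
rng = np.random.default_rng(1)
W = rng.normal(size=(V, N))                  # input->hidden weight matrix

x_onehot = np.zeros(V); x_onehot[2] = 1.0
assert np.allclose(x_onehot @ W, W[2])       # one-hot input == plain row lookup of W

x_dense = np.array([0.1, 0.0, 0.7, 0.2, 0.0])
h_mixed = x_dense @ W                        # a dense input blends several rows (word vectors)

So, as I understand it, feeding pre-trained dense vectors would change what is being learned: the hidden layer would become a linear projection of the given embedding rather than one trainable vector per word.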
Awesome video! Just one question about the PCA graph. Do you look at how much variance the first two PCs explain? My concern is that if the first two PCs explain little of the variance, the graph no longer makes much sense, right?
I think that is a great point! For inputs like
a|b,a|c,c|a,c|b,b|a,b|c,d|e,e|d
the PCA would make little sense.
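One practical way to check that is to look at the explained variance ratio before trusting the 2-D plot. A short sketch, where "vectors" stands in for your learned word vectors (one row per word):

import numpy as np
from sklearn.decomposition import PCA

vectors = np.random.default_rng(0).normal(size=(100, 50))   # stand-in for the learned vectors
pca = PCA(n_components=2)
coords = pca.fit_transform(vectors)
print(pca.explained_variance_ratio_)    # e.g. [0.45, 0.31] -> the 2-D plot is informative
                                        # e.g. [0.08, 0.06] -> interpret the plot with caution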
Awesome talk! I'm just starting to learn about this stuff and was wondering if the talk you refer to (during the "Training a Single Neuron" slide) could be found online somewhere?
It feels friendlier to chat in Mandarin! Thank you so much, this is really great work!
Will your work be extended to sentences (sen2vec?), e.g., input a sentence and get its intention?
I tried to connect with you on LinkedIn. Glad to get to know more about each other.
You said the vector of 'drink' will be similar to that of 'milk' after training. That means the vectors of the context and the target will become similar. Then what about the similarity of target words that share a similar context?
I think they would be similar, since we have similar vectors to predict the targets.
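One way to check this after training is to compare cosine similarities between the learned input vectors. A small sketch, where W (one row per word) and the word-to-index map vocab are assumed to come from your own trained model, and the words are just wevi-style examples:

import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# W and vocab are assumed to exist from your own training run (hypothetical here).
print(cosine(W[vocab["juice"]], W[vocab["milk"]]))    # two target words that share contexts
print(cosine(W[vocab["drink"]], W[vocab["milk"]]))    # a context word vs. a target word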
May your journey to the afterlife be peaceful. RIP.
Qijin Liu I hope he is at peace. But what exactly caused the plane accident?
Good questions!!
Thank you so much, but I have a question. If I have 10K words, do I have to train all 10K words one by one?
And what is the meaning of "context"? I don't understand what the context is. Is it a document, or what?
Do you mean you have 10K tokens in the corpus? Yes, you will have to train them one-by-one, and maybe multiple iterations for better performance. The context is also a word, or in CBOW a bag of words.
Wow, ok, I see. Actually, I want to use this method for Bahasa Indonesia, but there is no published pretrained data on the internet, so I must create it myself.
Do I have to create the 10K training data items like you do in your wevi demo, in the form "Training data (context|target):", manually one by one? Is there any method to create the list of training data?
I've read your paper and there is a chapter on "multi-word context". Can you give me an example of a context that has multiple words? Is it like the words "clever" and "smart" in one context?
No, you don't have to. My demo is just for illustration purposes. The word2vec package comes with preprocessing functionality to create context|target pairs from a plain text file.
Multi-word context means considering multiple words in the same sentence as a single context. E.g., using the CBOW model, in the sentence "The quick brown fox jumps over the lazy dog", for the word "jumps", assuming the window size is 3, the context is quick+brown+fox+over+the+lazy, i.e., a multi-word context.
Oh ok, I understand what the context is.
Where can I get the package? Hmm, but if I want to code word2vec by myself without the package, how do I create the 10K context|target pairs?
I'm so sorry for asking so many questions :D
www.tensorflow.org/versions/r0.10/tutorials/word2vec/index.html and github.com/dav/word2vec
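And if you want to roll the preprocessing yourself, the sliding-window pairing can be done in a few lines. A rough sketch (illustrative only; the real word2vec tool also does subsampling and other tricks):

def make_pairs(text, window=3):
    tokens = text.lower().split()
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        pairs.append(("+".join(context), target))    # CBOW-style multi-word context
    return pairs

for context, target in make_pairs("The quick brown fox jumps over the lazy dog"):
    print(context + "|" + target)                    # e.g. quick+brown+fox+over+the+lazy|jumps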
Thank you.
R.I.P. Thank you bro.
RIP Xin Rong aka Saginaw John Doe
thank u so much!
Big thank you & RIP
RIP. Gone too soon
Wait, WTF?
Dear Lord. I didn't understand what you meant until I googled it. That's terrible.
What really happened?
www.dailymail.co.uk/news/article-4900486/Wife-long-missing-PhD-student-wants-declared-dead.html
Sad.
Thank you and RIP
Why RIP? What happened to him?
m.huffingtonpost.ca/2017/03/23/xin-rong-plane-crash_n_15567112.html?guccounter=1&guce_referrer=aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbS8&guce_referrer_sig=AQAAABEjoMpeN7b9fsotLlieE5ozPCYsNlKJwGUd2_KK8Gw0w9lCE3owMkmmqunR_E-033vq8FbU3CmIaOuDdnzJjaLRV_nktW5ZCyqagEbuefYWPfm2OenSZTYgGi5nPslGolgiy3qHBLdLIi-DT4pecXRKW-S777TsCRb-EEuGjk40
He jumped out of his own plane
@@bertmagz8845 hmm the report says how and when he exited is a mystery. Did they ever find his body?
@@Skandawin78 No his body has never been found.
RIP brother
I strongly suggest this brain training game”nonu amazing only” (Google it) for anyone who would like to increase and sharpen their brain. So I have been making use of this game a lot for brain training and it works I`ve been checking more things I remember where I left most of my things.
Thanks and RIP.
RIP.
RIP and big thanks!
RIP!
RIP bro
RIP
R.I.P
Very poor sound quality???
The volume is too low; even my speaker didn't help me.
haha, you are so funny.
Awesome!
Poor sound quality on the video.
RIP