Word2Vec Easily Explained - Data Science
- Published 10 Feb 2025
- If you are looking for career transition advice and real-life data scientist journeys, please check the link below
Spring board India UA-cam url: / channel
Please join as a member in my channel to get additional benefits like materials in Data Science, live streaming for Members and many more
/ @krishnaik06
github url: github.com/kri...
NLP playlist: • Natural Language Proce...
Connect with me here:
Twitter: / krishnaik06
Facebook: / krishnaik06
instagram: / krishnaik06
I was trying to understand word2vec for the past two years with many videos. You made it clear today with just this 20-minute video. You are simply amazing :)
Guys please use "words = model.wv.key_to_index" in place of "words = model.wv.vocab" in code line 60, as per gensim update. Thanks Krish sir for all the efforts you made for data science community.
You're a savior!!!
At 3:05 you are saying that in TF-IDF semantic information is also not stored, but in the TF-IDF video you said it stores words semantically, unlike Bag of Words.
Thanks!
sir, please upload practical videos of Glove and Bert
your tutorials are so good, watching only once is enough to understand the concept. thank u sir
I am looking for some explanation of how the vectors were derived. Most of the other YouTube videos that I have seen did not explain this. I was expecting that in your videos, but here also only the Python implementation is explained; how the vectors are derived mathematically is missing here too. I would appreciate it if you could elaborate on that, since you have a special talent for explaining complex things in a simplified manner.
For that you can go to cs224n Stanford.
I was also looking for the same.
ua-cam.com/video/UqRCEmrv1gQ/v-deo.html
That requires good knowledge of probability.
Hi, did you find the answer ?
Requesting you to upload more videos on BERT, Transformers, LSTM, GRU, etc. in the NLP playlist. It would be of great help. Thanks, Krish, for making such amazing videos.
Following your NLP playlist... I must say you are very good at explaining each and every concept clearly. Thank you so much for the effort that you have put into creating this amazing playlist. I took a course to learn NLP, but your playlist is far better than the course. Thank you, sir!
Superb video once again, Krish. All my doubts about word2vec are now gone. Thanks
Thank you very much for this video, super helpful 👍👍👍
Excellent Explanation
Zabardast Bhai 😎
Love the conceptual videos, Have been searching everywhere.
Thank you krish 🙏
@Krish Naik Sir please make videos on data structures and algorithms...you are a great teacher 🙏🙏🙏🙏
Thankyou sir❤️🔥
Great explanation. You made a complex topic very simple, sir. Thank you very much. One request: please upload all the PPTs you showed in this NLP series.
Krish, please make a video on Glove model and pickle model
Sir, one modification: in Gensim, from version 3.8.0 to 4.0.0, model.wv.vocab has changed to model.wv.key_to_index (model initialized as per your video). Thank you
Thanks!
Nice explanation. But one thing I didn't understand is how the words you showed are similar. I can see they all have different meanings.
Hi Krish! Can you make a video on converting a whole data frame of words to vectors using Word2Vec, as you have not completely explained it?
Sir, Please do a similar video about how to Implement GloVe to vectorize text documents using Python.
Thanks Krish. Why didn't you remove the punctuation from the input text? Are you expecting to get some useful information from it?
Superb job, Krish. Kindly make a video on BERT, which will be very helpful.
Amazing tutorials!
The words "not", "haven't", etc. should be excluded from stopword removal (i.e., kept). These words are very useful when constructing bigrams, as Word2Vec captures semantic meaning. Correct me if I am wrong.
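A minimal sketch of that idea, using a small hand-made stopword set rather than NLTK's full English list:

```python
# Keep negation words out of the stopword set so that bigrams like
# ("not", "good") survive preprocessing. The stopword set below is a
# small illustrative subset, not NLTK's full English list.
stop_words = {"is", "a", "the", "not", "no", "haven't"}
negations = {"not", "no", "never", "haven't", "isn't"}
stop_words_to_remove = stop_words - negations  # {"is", "a", "the"}

tokens = ["he", "is", "not", "a", "good", "boy"]
kept = [w for w in tokens if w not in stop_words_to_remove]
print(kept)  # ['he', 'not', 'good', 'boy']
```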
thank you so much
Great explanation. How do I evaluate the performance of two or more models trained on the same dataset?
Hello, can we use Word2Vec in the same way as Bag of Words and TF-IDF for training a classification model? If yes, how do we do it? If no, then how exactly can we see whether Word2Vec overcomes the drawbacks of TF-IDF or not?
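Yes: a common approach is to average each document's word vectors into one fixed-length feature row ("mean pooling") and train any classifier on the resulting matrix, exactly like a BoW/TF-IDF matrix. A sketch with a tiny made-up embedding table standing in for a trained model's model.wv:

```python
import numpy as np

# Mean-pooled document vectors: average each document's word vectors
# into one fixed-length feature row, then train any classifier on the
# resulting matrix. The tiny table below stands in for model.wv.
embeddings = {
    "good":  np.array([0.9, 0.1]),
    "bad":   np.array([-0.8, 0.2]),
    "movie": np.array([0.1, 0.7]),
}

def doc_vector(tokens, emb, dim=2):
    vecs = [emb[w] for w in tokens if w in emb]  # skip OOV words
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

docs = ["good movie", "bad movie"]
X = np.vstack([doc_vector(d.split(), embeddings) for d in docs])
print(X.shape)  # (2, 2) -- one row per document
```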
Hello sir very much impressed by your video. I wanted to know whether we can have hindi or punjabi corpus too instead of English. Pls reply
Good try, Krish. Even though I got the concept of Word2Vec, it did not connect well with the code. Please create one Python code example for the full implementation.
Awesome, all my doubts are clear now. Please make a video on TF-IDF word2vec.
Check my NLP playlist
Excellent explanation, sir. I have 2 questions: 1) Is word2vec different from word embedding, or is it a form of word embedding? 2) Can we use word2vec in both machine learning and deep learning?
Hi Sudip,
Word embedding is a general technique to represent a document or word in vector form,
like one-hot encoding, dummification, etc.
Some embedding techniques are:
1) Bag of Words
2) TF-IDF
3) Word2Vec (it captures semantic information: word sequence details).
I hope this helps :)
Did you get an answer for this? I have the same question now.
Krish sir, can you show how to create our own Word2Vec?
Can you make a video on how to deal with class imbalance in NLP, active learning, and when to use w2v vs. when to use tfidf? PS: thanks for your content
Hi ..Thanks for making such wonderful videos!!! Small doubt ..NLTK doesn't support Urdu language. Then which library can be used for URDU?
Hi Krish, why did you tokenize the text to sentences rather than the words? Is there a special reason for that? that would give almost the same result.
Plz continue the deployment of ML models series
Hello Mr. Krish.
Please help me with a video tutorial on fake news detection using machine learning algorithms, with word2vec as the feature extraction method.
Sir, please make a video on elastic search engine
Sir, I downloaded the NLTK library, but word2vec_sample is not getting downloaded. It says outdated; how do I get it completed?
Sir, please extend this video by explaining the latest ELMo & BERT (including hands-on).
I clicked on the link - "Career Transition Advice and Real Life Data Scientist Journey" - but it gives the message "This channel do not exist". Please update this.
Thanks
thank you sir
People are afraid of AI taking over humanity.
Also AI: "Vikram also looted Satish" :D
Please make videos on Glove and Bert
Please, make study case: sentiment analysis svm with feature selection word2vec
Hi Krish,
Your videos really help us a lot.
Could you please make a video on skip gram and cbow model of word2vec?
Please explain Drain parser algorithm implemented for parsing log files.
Hi sir, I am your biggest fan. In this video, did you use a pre-trained word2vec from gensim to get the embedding vectors of the sentences, or did you just train this word2vec on the sentences?
Why did we not use lemmatization or stemming here? Won't that make the system smoother?
what is semantic information? do you have any material related to that please tell me.
Thank you very much. Can we use word2vec to predict the 10 most frequent words that come before a specific word and the 10 most frequent words that come after it? And how?
Excellent video. Can we build a text summarizer using word2vec?
Thank you, but if you added subtitles, it would be easier for us, from VIETNAM.
How about using N-grams with bag of words?
Example: sent1: he is good boy.
sent2: he is not good boy
Using stop words, "not" will be removed..
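A quick sketch of how bigrams preserve the negation that unigram stopword removal would drop (sentences from the comment above):

```python
# Bigrams keep word order locally, so "not good" stays distinguishable
# from "good" even in a bag-of-words-style count model.
def bigrams(tokens):
    return list(zip(tokens, tokens[1:]))

sent1 = "he is good boy".split()
sent2 = "he is not good boy".split()
print(bigrams(sent2))
```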
Can we give word2vec vectors as input to machine learning models?
Do we not require either of the stemming or lemmatization while converting words into vectors here?
Sir please provide us with an easy-to-preprocess chatbot dataset....
Sir upload video for Glove and BERT too.
You didn't teach me the types of NLP; please make a video about this....
How does the vector for "war" get 100 dimensions, and what do they indicate?
And what is the logic for building the vocabulary in the algorithm, and how does the algorithm perform that?
Can you please add the video for Topic modelling and Text Summarization?
is there a need to lemmatize or stem before we do word2vec? Thanks!
Please do a video on glove
I'm tasked to implement w2v multicategorical classification from scratch, but I'm confused about what exactly the input to the network is, i.e., x1, x2 and x3. I mean, is x1 the 1st word in a document? Or is it the 1st element in a word-embedded vector? For instance, if cat = [0.1, 0.8, 0.7], then is x1 0.1? I'm really confused about this generally.
How to do information extraction to grab sentences for a particular context from multiple websites ?? Can you point me to the right approach or source
Sir, please upload Glove embeddings and BERT Model
best one
Sir, please also explain how text embedding is done.
make a video on glove
Is it basically following a percentile system in the vectors for finding similar words?
Can you please explain what this "join" is for? I wanted to join but am not sure what these are and how this works. It would be really great if you could explain. :-) Thanks
Sir, actually I am getting an error while executing:
# Training the Word2Vec model
model = Word2Vec(sentences, min_count=1)
words = model.wv.vocab  # the error comes in this line
plz help
Could you please make a video on careers in the NLP domain, and give guidance on where to start, like a curriculum?
During the preparation of the dataset I did this, sir; later, while training the word2vec model, the words in the output show up as individual letters.... could you please help me out with this?
corpus = []
sentences = nltk.sent_tokenize(paragraph)
for i in range(len(sentences)):
    review = re.sub('[^a-zA-Z]', ' ', sentences[i])
    review = review.lower()
    review = review.split()
    review = [word for word in review if word not in set(stopwords.words('english'))]
    review = ' '.join(review)
    corpus.append(review)
After applying w2v, can we proceed to sentiment analysis using the selected words, since the sentences contain a huge number of words?
Hi Krish, if my text data is Vietnamese or Hebrew, which process will be best to convert text data to vector?
can you please do a tutorial on Glove
Do you provide classes?
Can anyone tell me why, in the preprocessing part, white space is removed twice?
why you didn't use stemming and lemmatization instead of regex
sir, why didn't you remove punctuations?
what is the floating number beside most_similar? is it cosine similarity?
Can you please explain POS tagging? thanks
Sir, I am facing a problem while installing gensim; please help me with that.
so there is no stemming or lemmatization in W2V
that's nice that he calls "woman" as "human" because most of the people don't consider them as human.
Sir, How to extract the keyword using Word2vec?
can we use word2vec in sentiment analysis?
how can i construct sent2vec from facebook word2vec model?
Using the same command, I am unable to import the gensim library.
Hi Krish, I tried the same steps and got KeyError: "word 'infosys' not in vocabulary".
Could you please guide me?
That word is not present in the vocabulary. Check the spelling.
Hi @krish naik, can I make a payment of 299/- through GPay to join as a member, to access live videos? The GPay ID is the one you gave in a previous playlist description.
No Mandar... you have to go through the YouTube channel itself... it is handled by YouTube.
@@krishnaik06 Sir, I am facing a problem joining the channel; maybe the bank servers are down. If there is any other mode of payment, please let me know @krish naik
What do you mean by semantic data?
can anybody explain why the vector has exactly 100 dimensions
Sir, can the "Data Protection Act" affect jobs in data science?
how to change number of dimensions of a word in word2vec
Which particular stuff?
Focus on teaching students, not on history, and stop quoting such controversial statements.
U r dumb?