Thanks for all the amazing information you keep sharing over YouTube. Please keep up the excellent work; your data science knowledge is of great help to the community of aspiring data scientists.
I watched this video entirely; it's very useful for me. I paid 55K for a data science course, and I am learning from here. You are much better than anyone there.
Thank you Krish 😍
No one can be as much of a fool as you are 😂😂😂
Hey, do I need prior ML knowledge to understand the concepts in this video?
I am both happy and sad. Happy that I discovered the channel today. Sad that I discovered this channel only today. I wish I discovered this channel 4 years ago. So the overall sentiment of my post is positive :)
sometimes, it's classified as neutral
Nobody has got anything before its time, and nobody ever will.
Who needs institutes when Krish sir is ready to give everyone this many free resources?
Thank you for uploading such nice and comprehensive lectures (not videos) and explaining it so nicely. Your commitment is quite commendable. Please make such one shot videos in the future too.
Thank you for being a guiding light in the vast sea of information, providing clarity and understanding to those who seek knowledge. Your commitment to the betterment of individuals and society as a whole is truly uplifting.
NLTK is a comprehensive and educational toolkit suitable for a wide range of NLP tasks, while SpaCy is a focused and efficient library designed for production use, particularly for tasks like entity recognition and part-of-speech tagging.
Aww, you did an incredible job, Krish! I was fully engaged for the entire 4 hours and didn't get bored once.
Excellent video! Just started with it and got clear on the basics of NLP.
I had no idea about NLP prior to this...made me understand these concepts and made it seem so easy.
Thank you Krish
Fantastic and clear explanation, way better than an institute charging fees of 1 lakh rupees.
Thank you sir.. It was very useful to have it all in a single video.. All concepts were very clearly explained. God bless you sir.. I am 45 and trying to learn AI/ML 😊
Just finished the video. Learned a lot from this.
Thank you so much Krish sir!!
I don't know how to thank you; you are just awesome, and I hope you get all the luxury and peace in life. Thank you so much sir for saving us and helping us understand these complex concepts in a simpler and most effective manner!!
At 2:30:20 you finish the BOW video and transition into the TF-IDF video, and you say that in the previous video you mentioned N-grams. I didn't find the N-grams video. Am I missing something?
Yes! N-gram topic was skipped
@@islamiczone7731 so where can I watch it?
If you want to learn it, just search YouTube for "n-grams NLP by Krish Naik".
Tutorial 8- Ngrams Indepth Intuition In NLP- Krish Naik Hindi
The differences between NLTK and spaCy are:
1. NLTK supports many languages, whereas spaCy ships statistical models for 7 languages (English, German, Spanish, French, Portuguese, Italian, and Dutch). spaCy also supports multi-language named entities.
2. NLTK is a string processing library: it takes strings as input and returns strings or lists of strings as output. spaCy, in contrast, takes an object-oriented approach: when we parse a text, spaCy returns a Doc object whose words and sentences are objects themselves.
3. spaCy has support for word vectors, whereas NLTK does not.
4. As spaCy uses recent, well-optimized algorithms, its performance is usually better than NLTK's. In word tokenization and POS tagging spaCy performs better, but in sentence tokenization NLTK outperforms spaCy. spaCy's poorer showing in sentence tokenization is a result of differing approaches: NLTK simply attempts to split the text into sentences, whereas spaCy constructs a syntactic tree for each sentence, a more robust method that yields much more information about the text.
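The API difference in point 2 can be sketched with a toy example (the `Token` and `Doc` classes below are made up for illustration only, not the real spaCy API): an NLTK-style function maps strings to lists of strings, while a spaCy-style pipeline returns a document object whose tokens are objects with their own attributes.

```python
def word_tokenize(text):
    # NLTK-style: string in, list of plain strings out
    return text.split()

class Token:
    # Illustrative stand-in for a spaCy token: an object with attributes
    def __init__(self, text):
        self.text = text
        self.is_alpha = text.isalpha()

class Doc:
    # Illustrative stand-in for a spaCy Doc: iterating yields Token objects
    def __init__(self, text):
        self.text = text
        self.tokens = [Token(t) for t in text.split()]
    def __iter__(self):
        return iter(self.tokens)

tokens = word_tokenize("spaCy returns objects")
doc = Doc("spaCy returns objects")

print(tokens)                 # a list of plain strings
print([t.text for t in doc])  # tokens are objects carrying attributes
```

The point of the object model is that each token can carry per-token annotations (POS, lemma, entity label) without juggling parallel lists of strings.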
Thanks for sharing and educating us. Keep it up.
Thank you so much Krish Sir for this amazing video.
Such a GREAT video on NLP. I just LOVED your explanation!! Keep up the good work!
Upon reading the documentation and paper on CBOW, I have two questions -
1. You explained that when we choose a window size, for example 3, we take 3 consecutive words from the corpus, take the middle word as the target word, and use the words before and after it (1 each in this case) as context for the target word. However, the documentation says the window size determines the number of words taken before and after the target word. So, for example, if we take window_size = 3, we take 3 words before and 3 words after the target word as context.
2. We can choose the hidden layer to be any size. It does not have to match the window size, since the input layer averages (or sums) the context vectors, and hence its size is always [1 x V], where V is the vocabulary size. The input-to-hidden weight matrix is of size [V x N], where N is the hidden layer size; the hidden-to-output matrix is [N x V]; and finally the output layer is [1 x V].
Can you please clarify my doubts here?
I also have confusion about this step. I can't figure out the output dimension.
In CBOW, I was also confused about the size of the hidden layer. As I understand it, the hidden layer can be of any size.
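The dimension bookkeeping in point 2 of the question can be checked with a quick sketch (random weights and a tiny vocabulary; the values of V and N are arbitrary): averaging the context one-hots keeps the input at [1 x V], so the hidden size N is indeed independent of the window size.

```python
import numpy as np

V, N = 10, 4                        # vocabulary size, hidden layer size
rng = np.random.default_rng(0)

context_ids = [2, 5, 7]             # indices of the context words
one_hots = np.eye(V)[context_ids]   # [3 x V] one-hot rows
x = one_hots.mean(axis=0, keepdims=True)  # averaged input: [1 x V]

W_in = rng.normal(size=(V, N))      # input -> hidden weights  [V x N]
W_out = rng.normal(size=(N, V))     # hidden -> output weights [N x V]

h = x @ W_in                        # hidden activation: [1 x N]
scores = h @ W_out                  # output scores:     [1 x V]

print(x.shape, h.shape, scores.shape)  # (1, 10) (1, 4) (1, 10)
```

Changing the window (the number of context rows averaged) never changes any of these shapes; only V and N do.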
Thank you Krish, a friend introduced me to your videos. Very wonderful and educative.
Thank you; may God bless you 💯
great guy, complete description for every term
Great video Krish much needed
Thank you so much, You are Awesome!!!!
Let's be grateful and give a LIKE to this great resource!
Not sure if you will be reading these comments.. I really, really like the way you teach.. it's very informative, and the examples made it very easy to understand (giving the not-so-good approaches first and then coming to the one that fixes everything; this gives a clear picture and also helps with interview questions). Very good narration, like a movie director narrating a story, which shows how passionate you are about teaching and making others understand what you are explaining. And finally, a very, very good voice. Thanks a lot KRISH NAIK SIR. Subscribed and waiting for more videos from you.
One of the best courses on NLP. Thank you so much sir.
can you tell me if he covered TF-IDF?
@@anshulvairagade1604 yep, he covered it
@@anshulvairagade1604 yes
Helped to revise the concepts. Thank you Krish sir
Thanks so much Professor Naik.
Hello Krish, a small problem: at time 2:30:21 the n-gram explanation video is skipped. Please add the video corresponding to N-grams. Thank you.
Tutorial 8- Ngrams Indepth Intuition In NLP- Krish Naik Hindi
However, tokenization usually goes from a larger unit to a smaller one, not the other way around. So it's more common to tokenize a paragraph into sentences or a sentence into words, rather than tokenizing sentences into a paragraph.
@37:21 your comment says ## sentence to paragraph tokenization; however, you ended up using sent_tokenize(), which accepts a paragraph and breaks it down into smaller sentences.
Yup! There is a mistake over there.
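The direction the comment describes (paragraph in, sentences out) can be mimicked with a toy regex splitter standing in for nltk.sent_tokenize (the real function handles abbreviations, quotes, etc. far better; this is only a sketch of the larger-unit-to-smaller-unit direction):

```python
import re

def toy_sent_tokenize(paragraph):
    # Naive stand-in for nltk.sent_tokenize: split a paragraph into
    # sentences after ., !, or ? followed by whitespace.
    # Larger unit in, smaller units out -- never the other way around.
    parts = re.split(r'(?<=[.!?])\s+', paragraph.strip())
    return [p for p in parts if p]

para = "NLP is fun. Tokenization splits text! Does it go both ways?"
sents = toy_sent_tokenize(para)
print(sents)
# ['NLP is fun.', 'Tokenization splits text!', 'Does it go both ways?']
```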
Looks like a great video; I'm halfway through right now.
Thank you so much for showing us the path in life, Krish…
Thank you for this amazing content
Can you please create an end-to-end project with real-time data, i.e. using Kafka for streaming, Django for the backend, and at least Kubeflow for tracking 😊? I'd appreciate it.
We really want this kind of project.
On iNeuron there is a course for end-to-end data science projects. You can check it out.
Thanks for all you do Krish 🙏
I bought the FSDS course from iNeuron because of your name, but I'm learning from here.
NLTK and spaCy are both NLP libraries, but the main difference is that NLTK is optimized for educational purposes and spaCy for development purposes.
This session is amazing and great. Can someone provide the links for the next modules: RNN, LSTM, BERT, and Transformers?
Thank you sir, such a wonderful and helpful video. Could you please provide part 2 of this?
Excellent roadmap. Really looking ahead.
Hey Krish, I think you can read my mind 😅. Thank you for the video.
NLTK is widely used in research; spaCy focuses on production usage.
Please provide a link to the 2nd part of this video, covering what comes after Word2Vec: the DL part.
spaCy provides very fast and accurate semantic analysis compared to NLTK.
First to comment! The best Data Science, AI, Machine Learning teacher of all time. Straight outta Kenya ❤
Thank you for this one-shot awesome video on NLP.
58:10 Sir, we have to use the '^' symbol to remove the expression from the beginning of the word.
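The '^' anchor mentioned above restricts a regex match to the start of the string; a minimal sketch (the example strings here are mine, not from the video):

```python
import re

# Without the anchor, re.sub removes every occurrence of the pattern;
# with '^', only a match at the very beginning of the string is removed.
all_stripped = re.sub(r"un", "", "unhappy unrest")
lead_stripped = re.sub(r"^un", "", "unhappy unrest")

print(all_stripped)   # 'happy rest'    -- every 'un' removed
print(lead_stripped)  # 'happy unrest'  -- only the leading 'un' removed
```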
Perfect explanation 👏
Great insight, what is the name of that digital board that you're using to capture your illustrated drawings?
Great explanation. Could you please provide the links for the subsequent parts of the pyramid (RNN, Transformers, and BERT)?
Such a great lesson. Thanks man!
Thank you so much for this tutorial.
Godamn that is an incredible speech
Great video Krish. Somehow N-grams got skipped; can you please add it?
i enjoyed this video, thanks Krish
best course, it really helped me!
After explaining every concept,
please provide practical knowledge.
There should be more practical content.
Completed this course end to end and, to be frank, it is super amazing; I wasted a lot of money going to trainings. My only request: I could not find things like how to train a model to recognize our own named entities, how we can use NLP to turn unstructured data into structured data, or how to create a model from scratch to build something similar to Word2Vec with our own corpus; also, some real-world examples would be of great help.
I know you are already doing a lot for free, but if you can help with the above requests it would be of great help.. please see if you can do this, and I appreciate a looooot what you are already doing for free. I have not seen anyone explaining in this much detail and in such simple ways...
Completed! What is the next step?
personal timestamp
day 1- 31:36
That Arabic was "Kayfa Haalak" ("How are you?"),
in case someone was interested in the pronunciation.
Petition to upload the deep learning part of NLP ASAP; I have a college exam next month.
Hello Krish,
Are you starting any new batch for data science? Please let me know.
If possible, please share any link regarding that!!
Thank you
2:30:27 @Krish Sir, the previous lecture was about the bag of words, not N-grams. Where can I find the lecture on N-grams?
Tutorial 8- Ngrams Indepth Intuition In NLP- Krish Naik Hindi
Outstanding series
The n-gram part at timestamp 02:30 is missing.
Amazing videos. Curious to know which app/tool you use for creating notes
Really nice explanation, brother. Please, can you share your notes?
The N-grams topic is missing. Could you kindly check the video again?
n-grams tutorial video is missing
Is there an extension to this playlist leading to introduction of LLMs and how to train generative models?
Thank you so much sir
Great video Krish ❤ much needed
Awesome session
Thanks for this video. N-grams is missing from it. Could you please upload that separately and share the link?
Can I do this without knowing basic ML/AI concepts?
Where can I find the next part? Like a practical implementation of Word2Vec with model training from scratch using gensim or GloVe... also practical implementations of TF-IDF and BOW... Please share those videos as well.
This video was really needed
This video is from his Udemy Paid Course: "Complete Machine Learning,NLP Bootcamp MLOPS & Deployment" - Section 48
I find it old; I can't perform the steps he is doing.
How is part-of-speech tagging going to work for ungrammatical sentences? A word's part of speech may depend on the context and semantics of the sentence as well, right?
Can you please provide notes for this video
Okay, for lemmatization, how would we find whether the POS is a noun, verb, adjective, or anything else for a big corpus? Because we can't explicitly check all the types, right?
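The usual approach for a big corpus is to run a POS tagger first and map its tags to the lemmatizer's categories automatically, rather than checking each word by hand. The helper below sketches the common Penn-Treebank-to-WordNet mapping; the tagged pairs are hard-coded here for illustration (in practice they would come from nltk.pos_tag), and the resulting letters would feed WordNetLemmatizer.lemmatize(word, pos=...):

```python
def penn_to_wordnet(tag):
    # Map a Penn Treebank tag (from a POS tagger) to the single-letter
    # POS codes that WordNet's lemmatizer expects: n, v, a, r.
    if tag.startswith("J"):
        return "a"   # adjective
    if tag.startswith("V"):
        return "v"   # verb
    if tag.startswith("R"):
        return "r"   # adverb
    return "n"       # default to noun

# In practice: tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
tagged = [("striped", "JJ"), ("bats", "NNS"), ("are", "VBP"), ("hanging", "VBG")]
pos_codes = [(w, penn_to_wordnet(t)) for w, t in tagged]
print(pos_codes)
# [('striped', 'a'), ('bats', 'n'), ('are', 'v'), ('hanging', 'v')]
```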
Awesome! Thanks Krish sir!
I have a doubt: why have they taken only cosine similarity in Word2Vec? Why not sine?
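Cosine is used because it falls directly out of the normalized dot product and equals 1 when two vectors point the same way; sin(theta) would be 0 for identical vectors, so it behaves like a distance rather than a similarity. A minimal sketch:

```python
import math

def cosine_similarity(a, b):
    # cos(theta) between two vectors: ~1.0 for the same direction,
    # 0.0 for orthogonal vectors, -1.0 for opposite directions.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

v = [1.0, 2.0, 3.0]
print(cosine_similarity(v, v))            # ~1.0 -> identical vectors
print(cosine_similarity([1, 0], [0, 1]))  # 0.0  -> unrelated directions
# sin(theta) for identical vectors would be 0, the same value as for
# "maximally similar", which is why it is not used as a similarity.
```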
The n-grams video may have been missed in the above tutorial (word encoding) while merging all the other videos. Please check it.
Hello Krish,
How do we handle the curse of dimensionality in a huge corpus?
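One common workaround (an illustration, not necessarily what the video covers) is the hashing trick: fix the vector dimensionality up front and hash tokens into buckets, so vocabulary growth never grows the feature space. A minimal sketch (the bucket count of 8 is arbitrary; real implementations such as scikit-learn's HashingVectorizer use many more buckets):

```python
import zlib

def hashed_bow(tokens, n_buckets=8):
    # Hashing trick: map each token to a fixed bucket, so the feature
    # vector stays n_buckets wide no matter how big the vocabulary gets.
    vec = [0] * n_buckets
    for tok in tokens:
        vec[zlib.crc32(tok.encode("utf-8")) % n_buckets] += 1
    return vec

small = hashed_bow("the cat sat".split())
large = hashed_bow("the cat sat on the very large corpus of words".split())
print(len(small), len(large))   # both 8: dimensionality is fixed
```

The trade-off is that unrelated tokens can collide in a bucket, which is usually acceptable when the bucket count is large.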
What are the software tool and gadgets that you use to present, writing with a pen?
Sir will you also cover deep learning models in nlp?
Please do include the Transformers- and BERT-related part too.
Thankuuu sir❤
Can anybody please tell me how I can enable extension support like code completion in JupyterLab? I have searched Stack Overflow, but all efforts have been in vain.
Sir, the N-grams topic got skipped. So please discuss it again or fix this video.
In NLP, do we use standardization or normalization for text data?
Where are Transformers and BERT?
We want more content
What are the tools used in preparing the video?
Please make same one shot video on Deep Learning ❤
It's already there..
needed one on GenAI
Give some practical use cases.