Folks, here's a link to our bootcamp for learning AI and Data Science in the most practical way: tinyurl.com/395u4mnm
I love the way you explain - other NLP concepts - customizing the pipeline for example !!!
Stemming (removing something) vs lemmatization (mapping to the base word) 4:50
Note: spaCy does not support stemming.
Code : stemming
import nltk
import spacy
from nltk.stem import PorterStemmer
stemmer = PorterStemmer()
words = ["eating","eats","eat","ate","adjustable","rafting","ability","meeting"]
for word in words:
    print(word, "|", stemmer.stem(word))
--------------------------------------------------------------------------------
Code : lemmatization
nlp = spacy.load("en_core_web_sm")
doc = nlp("eating eats eat ate adjustable rafting ability meeting better")
for token in doc:
    # lemma_ is the lemma text; lemma is its integer hash ID
    print(token, "|", token.lemma_, "|", token.lemma)
-----------------------------------------------------------------------------------------
Custom lemmatization
Code :
# Override lemmas through the pipeline's attribute_ruler component
ar = nlp.get_pipe("attribute_ruler")
ar.add([[{"TEXT": "Bro"}], [{"TEXT": "Brah"}]], {"LEMMA": "Brother"})
doc = nlp("Bro, you wanna go ? Brah , don't say no ! I am exhausted")
for token in doc:
    print(token.text, "|", token.lemma_)
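To make the "removing something" vs "mapping to the base word" distinction concrete, here is a minimal pure-Python sketch. It is a toy illustration only and not how NLTK's PorterStemmer or spaCy's lemmatizer are implemented; the suffix list, the lookup table, and the names toy_stem and toy_lemma are all assumptions made up for this example.

```python
# Toy illustration only; real stemmers and lemmatizers are far more elaborate.
SUFFIXES = ("ing", "s")  # hypothetical suffix list, checked in this order

# Hypothetical lookup table mapping inflected forms to a base word.
LEMMA_TABLE = {"eating": "eat", "eats": "eat", "ate": "eat", "better": "good"}


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def toy_lemma(word):
    """Look the word up in the table; fall back to the word itself."""
    return LEMMA_TABLE.get(word, word)


for word in ["eating", "eats", "ate", "better"]:
    print(word, "|", toy_stem(word), "|", toy_lemma(word))
```

The stemmer can only chop characters off, so it leaves "ate" and "better" unchanged, while the lookup can map them to "eat" and "good" because it knows about the words themselves.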
Thanks so much
Very helpful! Looking forward to the rest of the series! Thank you!
you are my teacher and i am proud of you
Thanks 🙏
There is a quiz now!! Thank you for your awesome work♥♥♥
Fantastic ...you make complex NLP topics simple. !!!
This is some quality content.
Thank you!
8:36 I noticed that the prebuilt language pipelines return an unexpected lemma for "ate". I assumed that lg and trf pipelines would produce ate -> eat while the sm and md pipelines would produce ate -> ate, but that doesn't seem to be the case.
import spacy
def eat_lemma(lang_pipeline):
    # Load the given pipeline and print the lemma of the single word "ate"
    nlp = spacy.load(lang_pipeline)
    doc = nlp("ate")
    print(lang_pipeline, '|', doc[0].lemma_)
lp = ["en_core_web_sm", "en_core_web_md", "en_core_web_lg", "en_core_web_trf"]
for lang_pipeline in lp:
    eat_lemma(lang_pipeline)
en_core_web_sm | ['eat']
en_core_web_md | ['ate']
en_core_web_lg | ['eat']
en_core_web_trf | ['ate']
Update: I see that when "ate" is used in the context of a sentence each pipeline produces a lemma of "eat".
doc = nlp("The person ate an apple.")
en_core_web_sm | ['the', 'person', 'eat', 'an', 'apple', '.']
en_core_web_md | ['the', 'person', 'eat', 'an', 'apple', '.']
en_core_web_lg | ['the', 'person', 'eat', 'an', 'apple', '.']
en_core_web_trf | ['the', 'person', 'eat', 'an', 'apple', '.']
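A possible explanation, as far as I understand spaCy's English pipelines: the lemmatizer is rule-based and depends on the part-of-speech tag predicted for each token, and a lone word gives the tagger no context, so different pipelines may guess different tags. The sketch below only imitates that idea in plain Python; the table POS_LEMMAS and the function lemma_for are hypothetical and are not spaCy APIs.

```python
# Hypothetical table keyed on (lowercased word, part-of-speech tag).
POS_LEMMAS = {
    ("ate", "VERB"): "eat",
    ("running", "VERB"): "run",
    ("meeting", "VERB"): "meet",
}


def lemma_for(word, pos):
    """Return the lemma for a (word, tag) pair, or the lowercased word."""
    return POS_LEMMAS.get((word.lower(), pos), word.lower())


# Same surface form, different tag, different lemma.
print("ate as VERB     |", lemma_for("ate", "VERB"))
print("ate as PROPN    |", lemma_for("ate", "PROPN"))
print("running as NOUN |", lemma_for("running", "NOUN"))
```

In a real pipeline you could print token.pos_ next to token.lemma_ to check whether the tag is what changes between the single-word and full-sentence cases.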
Excellent Series👌👌🔥🔥
Thanks a bunch ❤
Very helpful
Sir, it would be very helpful if you made an NLP project, like a chatbot, at the end of the series. Thanks for making this series.
An end-to-end chatbot development project (using Google's Dialogflow framework) has been added to this playlist.
Sir, will you please share the PPTs as well? That would help in clearing up the concepts.
If possible, try to hold live sessions; it would be helpful.
You are excellent. Full stop.
What is Behavioural data science?
amazing videos
❤Nice
thank you, sir
very nice
Hey guys, when we use stemming and lemmatization before training, we just change the words. After training, the model can still generate words that differ from the lemmatized forms. I mean, we teach the model `eat`, yet the model also learns `ate`. How?
Hey!
Firstly, this is a very good series. But for the exercise, in the last part using lemmatization, some of my words such as cooking were converted into cook and playing to play while running stayed as it is. Do you know what could be the issue?
Or do you have any explanation to this?
Thank you.
It might just be how the specific NLP model you used performs. Maybe, idk.
Which one are you? Marc Spector or Steven Grant??
I am Dhaval, Marc and Steven are my alter egos 😎
Hi sir, a request: please make some videos on Python.
I have a Python tutorial playlist with more than 40 videos. Search YouTube for "codebasics python tutorial".
Hello sir, if I want to stem and lemmatize my string at the same time, how would I do that? spaCy doesn't allow stemming, and NLTK doesn't allow lemmatization. Please answer ASAP.
I am unable to install the Ai4bharat package on my PC.
Is there a solution for that error?
How to write a lemmatizer from scratch?
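One common recipe is an exception table for irregular forms plus suffix rules whose results are accepted only if they appear in a known vocabulary. The sketch below is a toy under those assumptions: VOCAB, EXCEPTIONS, RULES, and lemmatize are made-up names, the word lists are tiny, and it ignores part of speech entirely, so treat it as a starting point and not as how any library does it.

```python
# Tiny hypothetical vocabulary of known base forms.
VOCAB = {"eat", "cook", "play", "run", "make", "ability", "meet"}

# Irregular forms that no suffix rule can recover.
EXCEPTIONS = {"ate": "eat", "ran": "run", "better": "good"}

# (suffix, replacement) pairs, tried in order.
RULES = [("ies", "y"), ("ing", ""), ("ing", "e"), ("ed", ""), ("s", "")]


def lemmatize(word):
    """Return a base form using exceptions first, then validated suffix rules."""
    word = word.lower()
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    if word in VOCAB:
        return word
    for suffix, replacement in RULES:
        if not word.endswith(suffix):
            continue
        candidate = word[: -len(suffix)] + replacement
        if candidate in VOCAB:
            return candidate
        # Undo a doubled final consonant, e.g. "runn" -> "run".
        if len(candidate) >= 2 and candidate[-1] == candidate[-2]:
            if candidate[:-1] in VOCAB:
                return candidate[:-1]
    return word  # Unknown word: leave it unchanged.


for w in ["ate", "running", "making", "abilities", "cooking", "rafting"]:
    print(w, "|", lemmatize(w))
```

Checking candidates against a vocabulary is what keeps this from behaving like a stemmer: "rafting" stays as it is because "raft" is not in the toy vocabulary, instead of being blindly truncated.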
🤩
Sir, about a year ago my PC was hacked by the .gujd ransomware. Please, how do I get my data back? 🙏 Please help me, some important data is there.
Hey, aren't you the moon knight?
Ha ha you are the third person to say this 🤣😎😎😎
Please try speaking in Hindi.