Stemming and Lemmatization: NLP Tutorial For Beginners - S1 E10

  • Published 3 Oct 2024
  • Stemming and lemmatization are two popular techniques for reducing a word to its base form. Stemming uses a fixed set of rules to strip suffixes and prefixes, whereas lemmatization uses knowledge of the language to arrive at the correct base word. Stemming is demonstrated in NLTK (spaCy doesn't support stemming), whereas the lemmatization code is written in spaCy.
    NLP platform: www.firstlangu...
    Code: github.com/cod...
    Exercise: github.com/cod...
    Complete NLP Playlist: • NLP Tutorial Python
    Do you want to learn technology from me? Check codebasics.io/... for my affordable video courses.
    Need help building software or data analytics and AI solutions? My company www.atliq.com/ can help. Click on the Contact button on that website.
    ❗❗ DISCLAIMER: All opinions expressed in this video are my own and not those of my employers.
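
The description says stemming applies a fixed set of rules to remove suffixes. The sketch below illustrates that idea with a deliberately tiny rule list. It is not the Porter algorithm that NLTK implements; the names `SUFFIX_RULES`, `MIN_STEM_LENGTH`, and `toy_stem` are hypothetical and exist only for this illustration.

```python
# Toy suffix-stripping stemmer: an illustration of rule-based stemming,
# NOT the Porter algorithm used by NLTK's PorterStemmer.

SUFFIX_RULES = ["ing", "able", "ity", "s"]  # hypothetical rules, checked in order
MIN_STEM_LENGTH = 3  # never strip a suffix if fewer characters would remain


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least MIN_STEM_LENGTH characters."""
    lowered = word.lower()
    for suffix in SUFFIX_RULES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= MIN_STEM_LENGTH:
            return lowered[: -len(suffix)]
    return lowered


words = ["eating", "eats", "eat", "ate", "adjustable", "rafting", "ability", "meeting"]
stems = {word: toy_stem(word) for word in words}
for word, stem in stems.items():
    print(word, "|", stem)
```

Because the rules know nothing about the language, "ate" stays "ate" and "ability" becomes the non-word "abil". Handling cases like these is what lemmatization is for.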

COMMENTS • 38

  • @codebasics
    @codebasics  2 years ago

    Check out our premium machine learning course with 2 Industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced

  • @pphantom5037
    @pphantom5037 1 month ago +1

    There is a quiz now!! Thank you for your awesome work♥♥♥

  • @Breaking_Bold
    @Breaking_Bold 1 year ago +1

    I love the way you explain other NLP concepts, customizing the pipeline for example!!!

  • @amandaahringer7466
    @amandaahringer7466 2 years ago +1

    Very helpful! Looking forward to the rest of the series! Thank you!

  • @Breaking_Bold
    @Breaking_Bold 1 year ago

    Fantastic... you make complex NLP topics simple!!!

  • @belfloretkoriciza5279
    @belfloretkoriciza5279 2 years ago

    You are my teacher and I am proud of you.

  • @aintgonhappen
    @aintgonhappen 1 year ago

    This is some quality content.
    Thank you!

  • @arnavverma8622
    @arnavverma8622 2 years ago

    Excellent Series👌👌🔥🔥

  • @ayushgupta80
    @ayushgupta80 6 months ago +2

    Stemming (stripping affixes by rule) vs. lemmatization (mapping to the base word) 4:50
    Note: spaCy does not support stemming.
    Code: stemming
    import nltk
    import spacy
    from nltk.stem import PorterStemmer
    stemmer = PorterStemmer()
    words = ["eating", "eats", "eat", "ate", "adjustable", "rafting", "ability", "meeting"]
    for word in words:
        print(word, "|", stemmer.stem(word))
    --------------------------------------------------------------------------------
    Code: lemmatization
    nlp = spacy.load("en_core_web_sm")
    doc = nlp("eating eats eat ate adjustable rafting ability meeting better")
    for token in doc:
        # lemma_ is the lemma as a string; lemma is its integer hash ID
        print(token, "|", token.lemma_, "|", token.lemma)
    --------------------------------------------------------------------------------
    Custom lemmatization
    Code:
    ar = nlp.get_pipe("attribute_ruler")
    # map the slang tokens "Bro" and "Brah" to the lemma "Brother"
    ar.add([[{"TEXT": "Bro"}], [{"TEXT": "Brah"}]], {"LEMMA": "Brother"})
    doc = nlp("Bro, you wanna go ? Brah , don't say no ! I am exhausted")
    for token in doc:
        print(token.text, "|", token.lemma_)

  • @sandeepnaik6437
    @sandeepnaik6437 2 years ago +5

    What is Behavioural data science?

  • @rajiv7
    @rajiv7 3 months ago

    You are excellent. Full stop.

  • @MuhammadIBRAHIM-iy3rg
    @MuhammadIBRAHIM-iy3rg 6 months ago

    amazing videos

  • @amandaahringer7466
    @amandaahringer7466 2 years ago +1

    8:36 I noticed that the prebuilt language pipelines return an unexpected lemma for "ate". I assumed that the lg and trf pipelines would produce ate -> eat while the sm and md pipelines would produce ate -> ate, but that doesn't seem to be the case.
    import spacy
    def eat_lemma(lang_pipeline, text="ate"):
        nlp = spacy.load(lang_pipeline)
        doc = nlp(text)
        # print every token's lemma as a list, matching the output below
        print(lang_pipeline, '|', [token.lemma_ for token in doc])
    lp = ["en_core_web_sm", "en_core_web_md", "en_core_web_lg", "en_core_web_trf"]
    for lang_pipeline in lp:
        eat_lemma(lang_pipeline)
    en_core_web_sm | ['eat']
    en_core_web_md | ['ate']
    en_core_web_lg | ['eat']
    en_core_web_trf | ['ate']
    Update: I see that when "ate" is used in the context of a sentence, each pipeline produces a lemma of "eat".
    doc = nlp("The person ate an apple.")
    en_core_web_sm | ['the', 'person', 'eat', 'an', 'apple', '.']
    en_core_web_md | ['the', 'person', 'eat', 'an', 'apple', '.']
    en_core_web_lg | ['the', 'person', 'eat', 'an', 'apple', '.']
    en_core_web_trf | ['the', 'person', 'eat', 'an', 'apple', '.']

  • @jatinnandwani6678
    @jatinnandwani6678 9 months ago

    Thanks so much

  • @aashishmalhotra
    @aashishmalhotra 2 years ago

    If possible, try to do live sessions; it would be helpful.

  • @omarsalam7586
    @omarsalam7586 1 year ago

    thank you, sir

  • @muzaffariqbalraja6464
    @muzaffariqbalraja6464 1 year ago

    very nice

  • @raphayzia9214
    @raphayzia9214 2 years ago

    Sir, it would be very helpful if you made an NLP project, like a chatbot, at the end of the series. Thanks for making this series.

    • @codebasics
      @codebasics  2 years ago +1

      Yes, I will be making a few projects.

  • @codebasics
    @codebasics  2 years ago

    Do you want to learn technology from me? codebasics.io is my website for video courses. First course going live in the last week of May, 2022

  • @berkayates6254
    @berkayates6254 7 months ago

    Hey guys, when we apply stemming and lemmatization before training, we just change the words in the data. After training, the model could still generate words that differ from the lemmatized forms. I mean, we teach the model `eat`, yet the model also learns `ate`. How?

  • @JayShah-m1v
    @JayShah-m1v 1 year ago

    Hey!
    Firstly, this is a very good series. But in the last part of the exercise, which uses lemmatization, some of my words such as "cooking" were converted to "cook" and "playing" to "play", while "running" stayed as is. Do you know what could be the issue?
    Or do you have any explanation for this?
    Thank you.

    • @agastyabose1645
      @agastyabose1645 7 months ago

      It might just be how the specific NLP model you used performs. Maybe, I don't know.

  • @zaytech528
    @zaytech528 2 years ago

    Hello sir, if I want to stem and lemmatize my string at the same time, how would I do that, given that spaCy doesn't allow stemming and NLTK doesn't allow lemmatization? Please answer ASAP.

  • @anaschoudhari511
    @anaschoudhari511 2 years ago

    Hi sir, a request for you to make some videos on Python.

    • @codebasics
      @codebasics  2 years ago +1

      I have a Python tutorial playlist with more than 40 videos. Search YouTube for "codebasics python tutorial".

  • @muradmammedzade2885
    @muradmammedzade2885 1 year ago

    How do you write a lemmatizer from scratch?
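
One possible starting point for a from-scratch lemmatizer is a lookup table for irregular forms plus suffix rules that are accepted only when the result is a known word. The sketch below assumes a tiny hand-made vocabulary; `IRREGULAR`, `VOCABULARY`, `SUFFIX_RULES`, and `toy_lemmatize` are hypothetical names, and this is not how spaCy's lemmatizer is implemented.

```python
# Toy lemmatizer: irregular-form lookup first, then suffix rules validated
# against a vocabulary. All data here is illustrative, not a real lexicon.

IRREGULAR = {"ate": "eat", "better": "good", "went": "go", "mice": "mouse"}
VOCABULARY = {"eat", "raft", "meet", "run", "cook", "play", "adjustable", "ability"}
# (suffix to strip, replacement) pairs, tried in order
SUFFIX_RULES = [("ing", ""), ("ies", "y"), ("es", ""), ("s", "")]


def candidates(word: str):
    """Yield possible base forms produced by the suffix rules."""
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            base = word[: -len(suffix)] + replacement
            yield base
            # undo consonant doubling, e.g. "runn" -> "run"
            if len(base) >= 2 and base[-1] == base[-2]:
                yield base[:-1]


def toy_lemmatize(word: str) -> str:
    """Return a lemma from the toy lexicon, or the lowercased word unchanged."""
    lowered = word.lower()
    if lowered in IRREGULAR:
        return IRREGULAR[lowered]
    if lowered in VOCABULARY:
        return lowered
    for candidate in candidates(lowered):
        if candidate in VOCABULARY:
            return candidate
    return lowered


for word in ["eating", "eats", "ate", "running", "better", "adjustable"]:
    print(word, "|", toy_lemmatize(word))
```

Unlike a stemmer, this returns only words found in its lexicon, so coverage depends entirely on the vocabulary. It also ignores part of speech and context, so it maps "meeting" to "meet" even where "meeting" is a noun.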

  • @firdospathan3700
    @firdospathan3700 1 year ago

    I am unable to install the AI4Bharat package on my PC.
    Is there a solution for that error?

  • @Pride_Of_Ultras
    @Pride_Of_Ultras 2 years ago

    🤩

  • @Telugu-Tech-suport
    @Telugu-Tech-suport 2 years ago

    Sir, about a year ago my PC was hacked by the .gujd ransomware. Please, how can I get my data back? 🙏 Please help me; some important data is there.

  • @GAURAVRAUL95
    @GAURAVRAUL95 2 years ago +1

    Which one are you? Marc Spector or Steven Grant??

    • @codebasics
      @codebasics  2 years ago +6

      I am Dhaval, Marc and Steven are my alter egos 😎

  • @leoxu1299
    @leoxu1299 2 years ago

    Hey, aren't you the moon knight?

    • @codebasics
      @codebasics  2 years ago +1

      Ha ha you are the third person to say this 🤣😎😎😎

  • @thoughtofme8263
    @thoughtofme8263 1 year ago

    Pleeeeeeeeeease try speaking in Hindi.