Python Sentiment Analysis Project with NLTK and 🤗 Transformers. Classify Amazon Reviews!!

  • Published 31 May 2024
  • In this video you will go through a Natural Language Processing Python project, creating a sentiment analysis classifier with NLTK's VADER and Hugging Face RoBERTa Transformers. The project is to classify the sentiment of Amazon customer reviews. 🤗 provides some great open source models for NLP: huggingface.co/models. We will look at the difference between model outputs from the two packages and compare the results. Sentiment analysis is an important tool for data scientists to use in language modeling.
    Link to Kaggle Notebook: www.kaggle.com/robikscube/sen...
    Timeline:
    00:00 Intro
    01:10 Setup + NLTK
    10:44 VADER Model
    23:42 RoBERTa Model
    35:51 Compare Results
    Follow me on twitch for live coding streams: / medallionstallion_
    My other videos:
    Speed Up Your Pandas Code: • Make Your Pandas Code ...
    Speed up Pandas Code: • Make Your Pandas Code ...
    Intro to Pandas video: • A Gentle Introduction ...
    Exploratory Data Analysis Video: • Exploratory Data Analy...
    Working with Audio data in Python: • Audio Data Processing ...
    Efficient Pandas Dataframes: • Speed Up Your Pandas D...
    * YouTube: youtube.com/@robmulla?sub_con...
    * Discord: / discord
    * Twitch: / medallionstallion_
    * Twitter: / rob_mulla
    * Kaggle: www.kaggle.com/robikscube
    #nlp #python #machinelearning #huggingface
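For readers following along without the notebook, the VADER half of the comparison can be sketched roughly as below. `SentimentIntensityAnalyzer.polarity_scores` is NLTK's API and needs the `vader_lexicon` resource downloaded first; the helper names `prefix_keys` and `vader_scores` are made up here for illustration and are not taken from the video.

```python
def prefix_keys(scores, prefix):
    """Return a copy of `scores` with `prefix` prepended to every key."""
    return {f"{prefix}{name}": value for name, value in scores.items()}


def vader_scores(text):
    """Score one review with NLTK's VADER (needs nltk and the vader_lexicon resource)."""
    # Imported lazily so the pure-Python helper above works without NLTK installed.
    from nltk.sentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    # polarity_scores returns a dict with the keys 'neg', 'neu', 'pos' and 'compound'.
    return prefix_keys(analyzer.polarity_scores(text), "vader_")


# Example call (commented out because it needs NLTK and its lexicon):
# vader_scores("I am so happy!")
```

Prefixing the keys keeps the VADER columns distinguishable once the scores from a second model are stored next to them.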

COMMENTS • 389

  • @nixonsebastian2892
    @nixonsebastian2892 1 year ago +233

    great content, this deserves a million views... {'roberta_neg': 0, 'roberta_neu': 0, 'roberta_pos': 100}😀

    • @robmulla
      @robmulla 1 year ago +17

      Haha. Best comment! Pinned.

    • @xBaphometHx
      @xBaphometHx 1 year ago +13

      Pos should be 1, since the maximum value is 1. lol

    • @48-tarunsalgotra81
      @48-tarunsalgotra81 1 year ago

      ​@@robmulla plz give ur what's app no

    • @smi14172
      @smi14172 5 months ago

      Good one!!😅
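On the point about the maximum value: the RoBERTa scores in this kind of setup come out of a softmax, so each one lies between 0 and 1 and the three sum to 1. A minimal pure-Python sketch, with made-up logits used purely for illustration:

```python
import math


def softmax(logits):
    """Convert raw model outputs (logits) into probabilities that sum to 1."""
    # Subtracting the max improves numerical stability without changing the result.
    shifted = [x - max(logits) for x in logits]
    exps = [math.exp(x) for x in shifted]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for (negative, neutral, positive); not real model output.
example_logits = [-2.0, 0.5, 3.0]
probs = softmax(example_logits)
scores = dict(zip(["roberta_neg", "roberta_neu", "roberta_pos"], probs))
print(scores)
```

A score of exactly 1 would require the other two classes to have zero probability, which softmax only approaches in the limit.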

  • @chairjacker
    @chairjacker 8 months ago +3

    I like the pace at which you teach this content; it is relaxed and, for me, very enjoyable to watch.

  • @juan.o.p.
    @juan.o.p. 1 year ago +10

    Really interesting video. I've been following a lot of your tutorials lately and I must say that I really like the way you explain things, it's so easy to understand and follow along. Thank you!

    • @robmulla
      @robmulla 1 year ago +1

      Thanks so much for the feedback Juan. It's always hard to tell when I'm recording these if they are any good, so it's great to hear that it is helpful to you.

  • @user-hk6le3bx4c
    @user-hk6le3bx4c 1 month ago

    Just completed it. I really enjoyed working on it. Your way of teaching is just awesome!

  • @AndrewSeywright
    @AndrewSeywright 1 year ago +1

    Thank you so much for this step-by-step process. It has opened up all sorts of new analysis opportunities for our customer insights. Really well explained and easy to follow.

  • @Thikondrius
    @Thikondrius 1 year ago +42

    I don't often leave comments on YouTube, but finally someone explains everything from scratch... I am a JS developer, and it's really cool that you explain every piece of code. That really helped; I was able to understand everything.

    • @robmulla
      @robmulla 1 year ago +2

      Hey! I really appreciate this comment. Thanks so much.

  • @alexthe2
    @alexthe2 6 months ago +2

    I'll admit I watched this at 2x speed, but those were the best-spent 21 minutes of the day!
    Very helpful and well explained!

  • @sachingupta5155
    @sachingupta5155 7 months ago

    I find the topic really interesting. The way you explain it is articulate and takes a fundamentals-first approach.

  • @farhadnikhashemi8681
    @farhadnikhashemi8681 9 months ago +2

    Thanks for such a wonderful tutorial. I used your shared data on my own with Google Colab and it worked so well. I just had to download a few more libraries for tokenization. Wonderful content and I truly enjoyed it.
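The extra downloads mentioned here are usually NLTK data resources rather than pip packages. A hedged sketch: the resource names below are the ones commonly needed for tokenizing, tagging, chunking and VADER, but the exact set depends on the NLTK version (newer releases may ask for differently named resources such as `punkt_tab`), so treat the list as an assumption. `download_nltk_resources` is a hypothetical helper name.

```python
# Resource names are assumptions; adjust to whatever the LookupError message asks for.
NLTK_RESOURCES = [
    "punkt",                       # sentence/word tokenizer models
    "averaged_perceptron_tagger",  # part-of-speech tagger
    "maxent_ne_chunker",           # named-entity chunker
    "words",                       # word list used by the chunker
    "vader_lexicon",               # lexicon used by VADER
]


def download_nltk_resources(names=NLTK_RESOURCES):
    """Download each resource and return a dict of name -> success flag."""
    import nltk  # imported lazily; requires `pip install nltk` and network access

    # nltk.download returns True on success and False on failure.
    return {name: nltk.download(name, quiet=True) for name in names}


# Example call (needs network access):
# download_nltk_resources()
```

If a lookup still fails after this, the error message names the missing resource, and that name can be added to the list.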

  • @mateusbalotin7247
    @mateusbalotin7247 1 year ago +3

    Amazing content man! Your channel and videos deserve a lot more attention. Hope you have an amazing week!!

    • @robmulla
      @robmulla 1 year ago

      Thanks so much. I really appreciate the feedback. Please consider sharing the video with anyone else you think might learn from it.

  • @atharvpatawar8346
    @atharvpatawar8346 1 year ago +90

    Huge thank you to you!!! I recently participated in a ML hackathon and they had sentiment analysis as one of their problem statements. I had watched your video prior to the competition and used hugging face whereas everyone else used the standard vader. I ended up getting the highest accuracy and placed first, all in my second year of engineering. Genuinely, can’t thank you enough for the information!
    Team random_state42

    • @mohammedmehdi1940
      @mohammedmehdi1940 1 year ago +3

      Found you here

    • @robmulla
      @robmulla 1 year ago +14

      This is so awesome! Thanks for sharing. I posted a screenshot of your comment on twitter, hope that's ok!

    • @bhaumik3118
      @bhaumik3118 1 year ago +4

      Btw huge fan of your statistics' notes Mr. Patawar, didn't expect to find you here.

    • @mohammedmehdi1940
      @mohammedmehdi1940 1 year ago +3

      @@bhaumik3118 i also study statistics from mr patawar

    • @TANISHQTHUSE
      @TANISHQTHUSE 3 months ago

      nice man

  • @louie0187
    @louie0187 1 year ago +2

    This may be the best tutorial on any language/library/app I have ever watched. One part, very concise and well explained. Thank you.

    • @robmulla
      @robmulla 1 year ago

      Glad it was helpful! This comment makes me really happy and excited to make more tutorials!

    • @bazoo513
      @bazoo513 1 year ago

      More of an appetite whetter. To make any use of it, I have to learn Python first 😀 But then, that's valuable by itself.

  • @naderbazyari2
    @naderbazyari2 9 months ago

    I am so happy to have discovered your channel. Many thanks friend.

  • @fabricembida4526
    @fabricembida4526 4 months ago

    Good, very good video! You cannot imagine how valuable this kind of video is for someone like me who is trying to transition to data science...

  • @SaurabhSingh-oi5ev
    @SaurabhSingh-oi5ev 1 year ago +9

    Your videos are like gems to me. I learned a lot; your use of modules and packages is like the cherry on the cake. I'm currently working as a Jr. Data Scientist at KPMG, but man oh man, you taught me many things. Thank you 😊 🙏

    • @robmulla
      @robmulla 1 year ago +1

      Great to hear you enjoyed the video. Data science is a never ending learning journey for all of us!

    • @IndianHacker-hisBest
      @IndianHacker-hisBest 9 months ago

      Bro, I just need to talk to u. I wanted to ask few questions regarding the profile you are working on. I have secured a job with Deloitte but want to switch to KPMG (Gurgaon).

  • @kaifahmedkhan
    @kaifahmedkhan 3 months ago

    Great content. I am doing a project in my uni where I need to do sentiment analysis on book reviews. This helped me a lot. Thanks.

  • @pavlostsoukias8147
    @pavlostsoukias8147 2 years ago +1

    Rob, you are the Best! Thank you for all the quality content you are uploading!
    Greetings from Greece!

    • @robmulla
      @robmulla 2 years ago +1

      Thanks so much Pavlos for watching. Sending a 💙 to Greece.

  • @jerrywang3225
    @jerrywang3225 1 year ago +5

    Your channel is a gem, thanks so much for the free course.

    • @robmulla
      @robmulla 1 year ago

      Glad you enjoyed it. Thanks for watching!

  • @evansala7814
    @evansala7814 3 months ago

    Great video. Your explanations were very clear and concise and easy to follow.

  • @it029-shreyagandhi5
    @it029-shreyagandhi5 4 months ago

    Great work 🎉🎉🎉🎉 Thank you for this amazing video. Your explanation, flow, content, everything is up to the mark 🚩

  • @SuperMjJang
    @SuperMjJang 10 months ago

    I've watched a bunch of ML videos and you are THE TOP! 👍👍👍

  • @brindhaganesan3580
    @brindhaganesan3580 1 year ago +1

    I’m so glad I found this channel!!

  • @dgr8a1
    @dgr8a1 1 year ago +4

    You are my newly found Python mentor. Good content Rob

    • @robmulla
      @robmulla 1 year ago

      Happy to be! There are a lot of good channels out there.

  • @monty510
    @monty510 3 months ago

    Great video, I am starting to understand NLP much more. Thank you so much!

  • @vinitkumarpatel1030
    @vinitkumarpatel1030 6 days ago +1

    Very good explanation. Thanks a lot ❤❤

  • @carlossamperquinto2777
    @carlossamperquinto2777 1 year ago +1

    This video is incredibly helpful! Thanks!

  • @ayushapoorva
    @ayushapoorva 1 year ago +2

    Great content, perhaps the best material I found on sentiment analysis on YouTube!!!

    • @robmulla
      @robmulla 1 year ago

      Thanks for the compliment Ayush! That means a lot to me.

  • @stevebim000
    @stevebim000 1 year ago +1

    Extremely useful, super easy to understand! Thank you so much for a great and valuable video!!

    • @robmulla
      @robmulla 1 year ago +1

      Really appreciate the feedback. Comments like this make me want to keep making more videos!

  • @Nitesh717
    @Nitesh717 1 year ago +4

    Hey brother, you just provided the best NLP sentiment project. Your channel deserves a million+ subscribers, and I am now one more subscriber helping you get there.

    • @robmulla
      @robmulla 1 year ago

      Thank you so much 😀

  • @adityabhatt04
    @adityabhatt04 2 years ago +7

    Thanks for posting the awesome tutorial. Would love to learn more from you.

    • @robmulla
      @robmulla 2 years ago +2

      Thanks for watching and learning!

  • @ColaWen
    @ColaWen 2 months ago

    Awesome! I am shocked that everything is so efficient and amazing. THANKS!

    • @robmulla
      @robmulla 2 months ago

      Glad it was helpful! Share the video with friends.

  • @josiel.delgadillo
    @josiel.delgadillo 2 years ago +3

    Just found your channel through Twitter. Great work, I am doing research in sentiment analysis and related to a lot of the video. Cool stuff! I will have to use the pairplot; I typically use a confusion matrix.

    • @robmulla
      @robmulla 2 years ago

      Awesome Josiel. Glad you find it helpful. Check out some of my other videos if you have time and share the video with friends!

  • @karthiksheggoju738
    @karthiksheggoju738 7 months ago

    I really liked this video a lot. It answered a lot of my questions, thanks a lot.

  • @ngominhhieu6602
    @ngominhhieu6602 2 months ago

    A great video! Many thanks for your valuable content.❤

  • @sindhumatipanigrahi3801
    @sindhumatipanigrahi3801 9 months ago

    Thank you so much. This tutorial helped me in my project. Thanks a lot.

  • @abhishekpadmanabhan3945
    @abhishekpadmanabhan3945 3 months ago

    Excellent video. I started coding with ChatGPT, and this adds a new layer of info. Thank you mate :) Subbed

  • @rajatshukla2605
    @rajatshukla2605 8 months ago

    Extremely helpful! Thanks a bunch!

  • @analysis_maestro_taha
    @analysis_maestro_taha 1 year ago +1

    Thank you very much for this video. I'm new to the field of data analysis and related disciplines, so this sentiment analysis project is pretty insightful for me.

    • @robmulla
      @robmulla 1 year ago

      Glad you found it helpful

  • @priyanshnegi03
    @priyanshnegi03 1 year ago +2

    Really great, helped me a lot in my project!

    • @robmulla
      @robmulla 1 year ago +1

      Glad it helped. Thanks for watching.

  • @sootybuu2963
    @sootybuu2963 2 years ago +5

    This was a good tutorial. I'm trying to get my feet wet in data analytics and found myself overwhelmed while trying to read the NLTK documentation, so thanks for the structured guidance.
    I'm working on analyzing sentiment across a dataset I've gathered myself, so I wasn't following along in kaggle and hit a hiccup as AutoModelForSequenceClassification requires pytorch and I initialized a python 3.10 environment. Oopsy poopsy. All the same, you made my headache significantly less daunting. Thank you. :)

    • @robmulla
      @robmulla 2 years ago

      Thanks so much. I’m glad it helped you get started with NLTK. It can be a lot easier when you see it in action once. Setting up an environment that works with all the packages can also sometimes be frustrating, so I can relate!
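A quick way to catch this kind of environment problem before loading a model is to check that the needed packages are importable. This is only a sketch: the package list is an assumption about what a RoBERTa workflow like this one needs, and `missing_packages` is a made-up helper name.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed requirements for running a Hugging Face PyTorch model.
required = ["torch", "transformers", "scipy"]
missing = missing_packages(required)
if missing:
    print("Install these before loading the model:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Running this at the top of a fresh environment gives a readable message instead of an import error deep inside the model-loading call.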

  • @chrisogonas
    @chrisogonas 1 year ago +1

    Great resource! Thanks Rob.

    • @robmulla
      @robmulla 1 year ago

      Glad you liked it! Thanks for watching.

  • @davv02
    @davv02 1 year ago +1

    Just did all of that as a thesis by myself without knowing you made a video about it lol. Luckily I used a different BERT model from Hugging Face at least. Nice video btw!

  • @blanka_herceg
    @blanka_herceg 11 months ago

    This video was genius and very helpful thank you

  • @patrickonodje1428
    @patrickonodje1428 1 year ago +1

    I found this video immensely helpful, Rob.
    Thanks

    • @robmulla
      @robmulla 1 year ago +1

      So glad you found it helpful!!

  • @kmkushad
    @kmkushad 1 year ago +1

    Thanks for the video, we have a school project to do anything coding related and while my classmates are using scratch I wanted to do something flashier, and some kind of language analysis seemed the way to go. I'll use this video as inspiration.

  • @sebastianbenitez4401
    @sebastianbenitez4401 1 year ago +1

    thank you for this content! Great quality! Now subscribed!

    • @robmulla
      @robmulla 1 year ago

      Thanks so much for watching!

  • @seblewongelawash5891
    @seblewongelawash5891 1 year ago +1

    Thank you! Great content and easy to understand!

  • @srishtikaranth
    @srishtikaranth 1 year ago +1

    I cannot thank you enough, you saved my 6th semester

  • @ahmadnawaz3683
    @ahmadnawaz3683 8 months ago

    Rob, you are the best. Hands down, mate.

  • @jenniferchi2117
    @jenniferchi2117 1 year ago +2

    Thank you so much for this video tutorial! I wanted to ask if you created the Amazon review dataset from scratch or was it already pre-made from somewhere else?

  • @666rony
    @666rony 1 year ago +1

    crystal clear explanation thanks my friend

  • @tusharguys1234
    @tusharguys1234 1 day ago

    🎯 Key points for quick navigation:
    00:00 *🎬 Introduction to Sentiment Analysis*
    - Introduction to natural language processing (NLP) and sentiment analysis.
    - Overview of the project, including using traditional techniques like VADER and more advanced models like RoBERTa.
    - Explanation of the dataset used for sentiment analysis, which consists of Amazon food reviews with ratings.
    03:00 *📊 Data Preprocessing and Exploration*
    - Importing necessary libraries for data analysis and visualization.
    - Reading the dataset and performing basic exploratory data analysis (EDA).
    - Downsampling the dataset for quicker analysis and showcasing the structure of the data.
    05:05 *📈 Exploring Sentiment Distribution*
    - Analyzing the distribution of sentiment scores based on review ratings.
    - Visualizing the distribution of sentiment scores across different star ratings using bar plots.
    - Observing the relationship between review ratings and sentiment scores.
    07:00 *🧠 Introduction to NLTK for Sentiment Analysis*
    - Overview of NLTK (Natural Language Toolkit) and its capabilities for text processing.
    - Demonstrating tokenization and part-of-speech tagging using NLTK.
    - Explaining the process of chunking text into entities using NLTK.
    10:48 *📉 Sentiment Analysis with VADER*
    - Introduction to VADER (Valence Aware Dictionary and sEntiment Reasoner) for sentiment analysis.
    - Understanding how VADER assigns sentiment scores based on individual words.
    - Applying VADER sentiment analysis to example sentences and the food review dataset.
    23:41 *🔍 Advanced Sentiment Analysis with RoBERTa*
    - Introducing RoBERTa, a transformer-based deep learning model for contextual understanding.
    - Preprocessing text and encoding it for analysis using RoBERTa's tokenizer.
    - Applying the pre-trained RoBERTa model to perform sentiment analysis on text data.
    29:05 *📊 Comparing VADER and RoBERTa sentiment analysis models*
    - Demonstrated how to print scores from both the VADER and RoBERTa sentiment analysis models.
    - Created a scores dictionary for both models to store negative, neutral, and positive scores.
    - Illustrated the difference in sentiment analysis results between the VADER and RoBERTa models using a negative review as an example.
    35:52 *📈 Comparing sentiment scores across models and reviewing examples*
    - Utilized Seaborn's pair plot to compare sentiment scores between the VADER and RoBERTa models.
    - Reviewed examples where the sentiment analysis model contradicted the actual review sentiment, showcasing nuances in language understanding.
    - Examined instances where both models misinterpreted the sentiment of reviews, highlighting the limitations of bag-of-words approaches like VADER.
    42:08 *🤖 Simplifying sentiment analysis with Hugging Face Transformers pipeline*
    - Demonstrated how to use Hugging Face Transformers pipeline for sentiment analysis, simplifying the process to just two lines of code.
    - Showcased the ease of changing models and tokenizers within the pipeline for different analysis tasks.
    - Provided examples of sentiment analysis using the pipeline, showcasing its efficiency and accuracy.
    Made with HARPA AI
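To make the bag-of-words limitation in the summary concrete, here is a toy lexicon scorer. It is not VADER's actual algorithm (VADER also handles negation, intensifiers, punctuation and more); the tiny word list and its weights are invented for illustration only.

```python
# Invented mini-lexicon: word -> valence. Not taken from VADER's lexicon.
TOY_LEXICON = {"love": 2.0, "great": 1.5, "good": 1.0, "bad": -1.0, "terrible": -2.0}


def toy_sentiment(text):
    """Sum per-word valences, ignoring word order and context entirely."""
    words = text.lower().replace(".", " ").replace(",", " ").replace("!", " ").split()
    return sum(TOY_LEXICON.get(word, 0.0) for word in words)


plain = "This coffee is great, I love it."
sarcastic = "Oh great, I love paying for stale coffee."

# Both sentences contain the same scored words, so both get the same positive
# score, even though a human reads the second one as negative.
print(toy_sentiment(plain), toy_sentiment(sarcastic))
```

A contextual model like RoBERTa sees the whole sentence at once, which is why it tends to cope better with cases like the second one.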

  • @ademhilmibozkurt7085
    @ademhilmibozkurt7085 1 year ago +1

    What a video! I love this. Please keep making this content. Greetings

    • @robmulla
      @robmulla 1 year ago +1

      Thank you! Will do, Adem!

  • @engmohammedbahanshal5204
    @engmohammedbahanshal5204 1 year ago +1

    Thanks for great model ideas.

  • @nandanhegde532
    @nandanhegde532 1 year ago +2

    Great Content, thanks man

  • @francofmm
    @francofmm 3 months ago

    New viewer and sub!! great work!!!

  • @spicytuna08
    @spicytuna08 8 months ago

    wow. speechless. both you and ml.

  • @marcodigennarobari
    @marcodigennarobari 23 days ago

    great stuff!!

  • @anishshah4850
    @anishshah4850 1 year ago

    Great tutorial. For anyone hitting the error about the tensor size exceeding 514, you need to pass max_length as an argument to the tokenizer:
    # Assumes tokenizer, model and softmax are already defined as in the video's notebook.
    def polarity_scores_roberta(example):
        # Truncate long reviews so the input stays within 512 tokens.
        encoded_text = tokenizer(example, return_tensors='pt', truncation=True, max_length=512)
        output = model(**encoded_text)
        scores = output[0][0].detach().numpy()
        scores = softmax(scores)  # convert logits to probabilities
        scores_dict = {
            'roberta_neg': scores[0],
            'roberta_neu': scores[1],
            'roberta_pos': scores[2],
        }
        return scores_dict
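Roughly what `truncation=True, max_length=512` does to an over-long review can be illustrated in plain Python. This is a conceptual sketch, not the tokenizer's implementation: the token ids are fake, and the start/end ids 0 and 2 are assumed to match RoBERTa's `<s>` and `</s>` tokens.

```python
BOS_ID = 0  # assumed id of RoBERTa's <s> token
EOS_ID = 2  # assumed id of RoBERTa's </s> token


def truncate_with_special_tokens(token_ids, max_length=512):
    """Keep at most `max_length` ids in total, including the two special tokens."""
    body = token_ids[: max_length - 2]  # leave room for <s> and </s>
    return [BOS_ID] + body + [EOS_ID]


# Fake ids standing in for a very long tokenized review.
long_review = list(range(1000, 1700))  # 700 tokens
encoded = truncate_with_special_tokens(long_review)
print(len(long_review), "->", len(encoded))
```

The trade-off is that everything past the cutoff is ignored, so a very long review is scored only on its opening part.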

  • @TugelaCo
    @TugelaCo 1 year ago +1

    I rarely comment on YT videos but this is amazing! +1 subscriber!

    • @robmulla
      @robmulla 1 year ago

      That really means a lot to me. Thanks for leaving a comment.

  • @andreascalenghe8068
    @andreascalenghe8068 10 months ago

    Great content, thanks

  • @NisaRoy-jo2wi
    @NisaRoy-jo2wi 2 months ago

    Great content. Thank you

  • @rishirajmathur07
    @rishirajmathur07 9 months ago

    Great content. Please do more content on models that solve attrition prediction for organizations. It's a very complex subject because it's hard to find ready-made models on such topics. It would be a great help if you could make an attrition prediction model with more than 45-50 variables.

  • @-zak-7048
    @-zak-7048 1 month ago

    what an absolute legend

  • @deepeshrajak3407
    @deepeshrajak3407 1 year ago +1

    your content is goldmine

    • @robmulla
      @robmulla 1 year ago

      Thank you sir! Share the goldmine with others!

  • @analyticswithadam
    @analyticswithadam 1 year ago +1

    This is a great video, thanks a lot.

    • @robmulla
      @robmulla 1 year ago

      Glad you like it. Thanks for watching

  • @merwinjosepha3897
    @merwinjosepha3897 3 months ago

    Thank you so much

  • @rachmanmohammad6210
    @rachmanmohammad6210 1 year ago

    Thank you. Great content

    • @robmulla
      @robmulla 1 year ago

      Glad you enjoyed it! Make sure you sub and share!

  • @usamaarif5763
    @usamaarif5763 22 days ago

    Thanks for this video; it was descriptive, well structured and well explained.
    I have two questions and I would appreciate it if you could give your opinion and guidance on them.
    1. At the end of the day, star reviews and sentiment give the same results, so how can we justify going through this whole process when we already have a very good indication of user sentiment from the star reviews?
    2. How can we get the strengths and weaknesses of the product from the reviews using sentiment analysis?

  • @MoAlarawi
    @MoAlarawi 11 months ago

    Great content.

  • @mohammedkastali7096
    @mohammedkastali7096 4 months ago

    Important lesson, thanks

  • @zikrifisehaye323
    @zikrifisehaye323 6 months ago

    THANK YOU!

  • @mohit_hada
    @mohit_hada 1 year ago +1

    Please make more such videos, that was great. I am a data engineer and want to move to data science; please make videos for guidance too.
    Love from India

    • @robmulla
      @robmulla 1 year ago

      I will! Hope this video was helpful for you in your journey into data science.

  • @user-bc5wf2qq2r
    @user-bc5wf2qq2r 4 months ago

    Amazing!

  • @PriteshRPatel-lr5uh
    @PriteshRPatel-lr5uh 2 months ago

    Loved what you did, but it would be nice to show how you got the Amazon data as well. Plus, do you have any videos on sentiment analysis for company stocks?

  • @CaribouDataScience
    @CaribouDataScience 1 year ago +1

    Very interesting!!

  • @ryrylc
    @ryrylc 1 year ago +1

    Awesome video. Would be great to see you follow the sentiment analysis with a topic analysis. I’ve seen a few different options out there (LDA, Top2Vec and BERTopic), but would love to see your take on it.

    • @robmulla
      @robmulla 1 year ago +1

      Great suggestion! I'll keep that in mind for future videos.

    • @GaurangDave
      @GaurangDave 1 year ago

      @@robmulla Looking forward to that!! :)

  • @gangxaaku
    @gangxaaku 2 years ago +1

    Top-notch 🔥 !!

  • @kalyanijog
    @kalyanijog 10 months ago

    Hey Rob, while running the polarity score on the entire dataset, I only get one value, which iterates after each run, and instead of "id" I see the 0th column,
    with pos, negative and neutral rows. What should I do?

  • @sahilkakkar5628
    @sahilkakkar5628 1 year ago

    Thank you

  • @daredevilxrage
    @daredevilxrage 11 months ago +2

    Does the Hugging Face model require any preliminary dataset when we import it?

  • @savichopra9083
    @savichopra9083 9 months ago

    Very useful!

  • @jbie4590
    @jbie4590 10 months ago

    thanks man

  •  1 year ago +1

    Nice work

  • @sudurimabanerjee4612
    @sudurimabanerjee4612 1 month ago

    Thanks for the video. Very well explained.
    Is there any token limit for the transformer based Roberta model ?

  • @DailyVibz
    @DailyVibz 10 days ago

    WOW! Help me learn some Python at this level! I am now at 0, learning to install it.

  • @kimnhunguyent1489
    @kimnhunguyent1489 1 year ago

    Hi, thank you for the amazing video. Your presentation was informative and insightful. Looking forward to your future content! Btw, I want to ask how I can save my expected result. It seems like I had a good training run and don't want to keep going. What should I do in this situation?
    Thank you

  • @OnLyhereAlone
    @OnLyhereAlone 11 months ago

    @robmulla, great presentation, but I have looked through the videos on your channel and it appears you have not done one on fine-tuning a BERT model with a custom dataset. I particularly want to learn how you would fine-tune a BERT model for multiclass text classification, maybe on Google Colab. I think many of us subscribers would love it. Thanks.

  • @muslumyildiz5694
    @muslumyildiz5694 1 year ago +1

    you are awesome.. thanks a lot..

    • @robmulla
      @robmulla 1 year ago

      Thanks for watching. Share with a friend!

  • @mohan250s
    @mohan250s 1 year ago +2

    you are awesome bro

    • @robmulla
      @robmulla 1 year ago

      No, YOU are awesome. Thanks for watching.

  • @jilanikashif
    @jilanikashif 1 year ago +1

    Great content. We need more tutorials on Transformers, please

    • @robmulla
      @robmulla 1 year ago

      Glad you liked it. Anything specific about transformers you would like to see? Huggingface has so many of them for various NLP tasks.

    • @jilanikashif
      @jilanikashif 1 year ago

      @@robmulla Please explain Transformers and the BERT architecture. Also a tutorial with a use case from current industry

  • @Midhun938
    @Midhun938 1 year ago +1

    Love from India ♥️

  • @breadandcheese1880
    @breadandcheese1880 1 year ago

    Hi Rob! When trying to use the Transformers pipeline function, I keep getting an SSL error. Any idea why I am getting it?

  • @eleonorpatak4698
    @eleonorpatak4698 1 year ago

    Hey sir! Thanks for the tutorial!!
    For an end-to-end project, can we save these models, for example RoBERTa, with pickle to deploy on the web, or is there another method for this kind of model?

  • @revathyarumugam5359
    @revathyarumugam5359 10 months ago

    Super

  • @timdentry9754
    @timdentry9754 1 year ago +1

    One of the best tutorials on Vader and the Huggingface Transformers I have seen. One question I had: How is the confidence score calculated on the Pipeline model and is there a way to evaluate the model's performance on these calculations?

    • @robmulla
      @robmulla 1 year ago

      Thanks so much for the feedback. Glad you found it helpful. Evaluating the model performance is a bit tricky without ground truth labels. The output of the Pipeline model is essentially the probability the model predicts of each class given the dataset it was trained on. Check out the actual model description on the huggingface site here along with the noted limitations: huggingface.co/distilbert-base-uncased-finetuned-sst-2-english
      Specifically this part is interesting:
      ```
      Based on a few experimentations, we observed that this model could produce biased predictions that target underrepresented populations.
      For instance, for sentences like This film was filmed in COUNTRY, this binary classification model will give radically different probabilities for the positive label depending on the country (0.89 if the country is France, but 0.08 if the country is Afghanistan) when nothing in the input indicates such a strong semantic shift. In this colab, Aurélien Géron made an interesting map plotting these probabilities for each country.
      ```

    • @timdentry9754
      @timdentry9754 1 year ago +1

      @@robmulla FWIW - I reached out to the creator of this and what I was told is that the score is calculated using the activation function after the final layer of the neural net. It is used to determine polarity (and is not a confidence score). The model returns an array with the score for each polarity, and the larger is the prediction. The values will always be positive, regardless of the actual sentiment class tagged to the text. This is unlike Vader's model which provides a composite polarity score that could be a positive or negative float based on the inferred sentiment (positive, negative, neutral).

    • @robmulla
      @robmulla 1 year ago

      @@timdentry9754 thanks for clarifying. Cool that you got a response from the creator!
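The distinction drawn in this thread can be sketched as two different decision rules. The ±0.05 cutoffs follow the thresholds commonly recommended for VADER's compound score; the function names and example numbers are invented for illustration.

```python
def label_from_class_scores(scores):
    """Transformer-style output: pick the class with the largest (always positive) score."""
    return max(scores, key=scores.get)


def label_from_compound(compound, threshold=0.05):
    """VADER-style output: a single signed score in [-1, 1], bucketed by thresholds."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Made-up numbers, not real model output.
print(label_from_class_scores({"negative": 0.91, "positive": 0.09}))
print(label_from_compound(-0.62))
```

In the first rule the sign never carries meaning, only which class scores highest; in the second, the sign of the single compound value is the prediction.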

  • @manasghosh3709
    @manasghosh3709 17 days ago

    Excellent explanation and material. Thank you for your efforts in making learning enjoyable. A brief query about reviews that are negative (5 stars) and positive (1 star), where the algorithm is unable to forecast the relevancy score. How would you advise handling these kinds of situations?

  • @thuhuong-it700
    @thuhuong-it700 1 year ago +2

    Great!! I hope you will create more videos!! Thanks!

    • @robmulla
      @robmulla 1 year ago

      Thank you, I will. I appreciate you watching.

  • @user-ls3ds3fn4t
    @user-ls3ds3fn4t 5 months ago

    How did you download the NLTK packages like vader_lexicon and maxent_ne_chunker?

  • @V3geta420
    @V3geta420 4 months ago +1

    Is there another source than Kaggle where you got that CSV from??