Great, thanks a lot ma'am 🤗🤗 Can I ask: what if I want to show the X_test result after TF-IDF, ma'am? I have tried with just the X_test code, but the results are not as desired.
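For anyone with the same question, here is a minimal sketch of inspecting the test split after TF-IDF (the data and variable names are made-up assumptions, not from the video). The key point is to fit the vectorizer on the training split only, then call transform() on the test split so it uses the same vocabulary:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# Made-up example data (assumption, not the tutorial's dataset)
docs = ["stocks rally on tech earnings", "new movie breaks box office records",
        "market dips after rate hike", "award show draws record audience"]
labels = ["Business", "Entertainment", "Business", "Entertainment"]

X_train, X_test, y_train, y_test = train_test_split(
    docs, labels, test_size=0.5, random_state=42)

vectorizer = TfidfVectorizer()
X_train_tfidf = vectorizer.fit_transform(X_train)  # fit only on training text
X_test_tfidf = vectorizer.transform(X_test)        # reuse the same vocabulary

print(X_test_tfidf.shape)      # rows = test docs, cols = training vocabulary size
print(X_test_tfidf.toarray())  # dense view of the tf-idf weights
```

If fit_transform is called on the test split instead, the test matrix gets its own vocabulary and will not line up with the training features.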
Very useful video, easy to understand. But when I tried the same code, I got the error below on this line:
print(df.iloc[:,0].apply(text_cleaning))
AttributeError: 'set' object has no attribute 'words'
Please help me out.
Send me your Jupyter notebook at my mail id so that I can check it.
A year late, but I'm running through this now and encountered the same error.
The reason this happens is that when Aarohi was showing us how stopwords work, we wrote the following:
from nltk.corpus import stopwords
stopwords = set(stopwords.words("english"))
print(stopwords)
This reassigns the name stopwords from the NLTK module to our set, which breaks the code further down.
Rename the assignment here, e.g. stopwords1, and the later uses of stopwords will work, because the name still refers to the module rather than the variable you reassigned. Like so:
from nltk.corpus import stopwords
stopwords1 = set(stopwords.words("english"))
print(stopwords1)
Thanks for the clear explanation, ma'am. But where do we get the dataset for this model that you used in the video?
github.com/AarohiSingla/Multinomial-Naive-Bayes/blob/master/first_batch.csv
What if we don't have a category column in our table, and just have the sentiments? How would we implement such a model?
Hi Mam,
You have explained very well. I have one doubt: how can we save the model and fine-tune it for new input text and category labels?
Waiting for your reply
import joblib
# Save the model to a file
joblib.dump(model, 'multinomial_nb_model.joblib')
# Load the saved model
loaded_model = joblib.load('multinomial_nb_model.joblib')
# Update the model with new data (partial_fit continues learning;
# calling fit() here would retrain from scratch on the new data only)
loaded_model.partial_fit(X_new, y_new)
# If you want to save the model after fine-tuning, follow the saving steps again.
joblib.dump(loaded_model, 'updated_multinomial_nb_model.joblib')
@@CodeWithAarohi Can we fine-tune the existing model and save the changes to the same file (without creating a new updated_multinomial_nb_model.joblib)? How can we do it?
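One way, sketched below with toy data (the feature arrays are made-up stand-ins for the real tf-idf features): scikit-learn's MultinomialNB supports partial_fit for incremental updates, and dumping to the same filename simply overwrites the same file.

```python
import joblib
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Toy stand-ins for the real tf-idf features and labels (assumptions)
X_old, y_old = np.array([[2, 0, 1], [0, 3, 0]]), np.array([0, 1])
X_new, y_new = np.array([[1, 0, 2]]), np.array([0])

model = MultinomialNB().fit(X_old, y_old)
joblib.dump(model, 'multinomial_nb_model.joblib')

# Load, update incrementally, and overwrite the SAME file
loaded_model = joblib.load('multinomial_nb_model.joblib')
loaded_model.partial_fit(X_new, y_new)   # keeps the old counts, adds the new data
joblib.dump(loaded_model, 'multinomial_nb_model.joblib')  # same name -> same file
```

Note that calling fit() instead of partial_fit() would discard the earlier learning and retrain on the new batch only.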
Great! Step-by-step clear explanation... this helped a lot.
Glad it helped!
Hello
I have a question. The dataset that you used was for training; which one did you use for predictions? Or am I missing something?
Thanks !
Split your dataset into train and test sets, and then execute this code.
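For reference, the held-out percentage is controlled by train_test_split's test_size parameter; a minimal sketch with made-up data:

```python
from sklearn.model_selection import train_test_split

# Made-up headlines and categories (assumptions, not the tutorial's dataset)
titles = [f"headline {i}" for i in range(10)]
categories = ["Business"] * 5 + ["Entertainment"] * 5

# test_size sets the percentage held out for testing (here 20%);
# stratify keeps the class proportions the same in both splits
X_train, X_test, y_train, y_test = train_test_split(
    titles, categories, test_size=0.2, random_state=0, stratify=categories)

print(len(X_train), len(X_test))  # 8 2
```

Changing test_size to 0.3 would hold out 30% instead, and so on.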
The explanation was awesome; just one correction @10.20: you said label encoding means converting text data into numerical form, but that is basically word embedding. Label encoding means converting categorical data to numeric.
Thank you!
Good day ma. I am having a problem implementing this line of code in my program: print(df['review'].apply(text_cleaning)). It keeps giving me an error.
Send me your code and a sample of your dataset; I will check.
@@CodeWithAarohi Done ma
Ma'am, I have been following this video throughout my 7th-semester minor project. Now I am stuck at the last phase: I have trained the model and saved it using pickle, but while loading it in another Python file and using it for prediction, it raises a dimension mismatch error. Kindly help.
The dimension mismatch happens because the training data's vocabulary size is different from the test data's vocabulary size.
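A common fix is to pickle the fitted vectorizer together with the model, and in the other file call only transform() (never fit_transform) on new text, so both files share one vocabulary. A sketch with made-up data and names:

```python
import pickle
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up training data (assumption)
train_texts = ["stocks fall sharply", "film wins top award", "markets rebound today"]
train_labels = ["Business", "Entertainment", "Business"]

vectorizer = TfidfVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(train_texts), train_labels)

# Save BOTH objects; the vectorizer carries the training vocabulary
with open('news_model.pkl', 'wb') as f:
    pickle.dump((vectorizer, model), f)

# In the other file: load both, then transform new text with the SAME vectorizer
with open('news_model.pkl', 'rb') as f:
    vectorizer, model = pickle.load(f)
print(model.predict(vectorizer.transform(["award show tonight"])))
```

Fitting a fresh vectorizer on the new text is what produces a feature matrix whose width differs from what the model was trained on.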
Great video, thanks for taking the time creating it
Glad you enjoyed it!
Hey Aarohi, can you provide me with the link for the sample dataset you are using?
github.com/AarohiSingla/Multinomial-Naive-Bayes/blob/master/first_batch.csv
Keep on uploading ML projects, ma'am. Your accent shows that you have great potential in this field.
Thank you, I will
Hey, Your explanation is just awesome.
You made these horrifying terminologies so easy!
Thank you so much!!!
Glad my video helped you.
Thank you so much, exactly what I was searching for.
Glad I could help!
How do I use the attribute feature_log_prob_ in MultinomialNB? Please reply.
This was a very simple explanation, very easy to understand.
One doubt: how do I predict the class of a new title?
sahana prasad thanks ... answer to your doubt is github.com/AarohiSingla/Multinomial-Naive-Bayes/blob/master/news_classifier_unseen_input.ipynb
@@CodeWithAarohi Thanks so much.
How can I obtain a confidence score for each match here?
searching for this everywhere. THANK YOU
Welcome
Very helpful, thank you so much. You are doing a good job, please keep it up.
You are welcome
Hi, I'm having trouble with one part of the code; could you help me?
Sure
Code With Aarohi I’ve sent you an email with the part of the code that I’m struggling with
Hello, I want to build a recommender system that recommends things by keyword matching. What I want to do is take user details in my app and use that data as a collection of keywords. Then I want to recommend similar things (in my case, jobs) on the basis of those keywords. All the recommender systems on YouTube have projects in which they choose the items from the dataset itself. Please help me out.
Thanks
Similar to a search engine
Are you looking something like this : ua-cam.com/video/gUpPXLv00lM/v-deo.html
Hello ma'am!
Thanks for such a simple explanation of the design. I have a few doubts; looking forward to your reply on these:
1. In which step are the training data and testing data segregated? How can we edit the percentage?
2. How do we use this trained model on unlabelled data?
Let me know if any further clarification of the questions is required. Thank you for the video.
Hi Mohina, glad you like the video. All your queries are addressed at this GitHub link; the code and dataset are attached:
github.com/AarohiSingla/Multinomial-Naive-Bayes/blob/master/news_classifier_unseen_input.ipynb
@@CodeWithAarohi Got it. Thanks for the quick response.
@@mohinakharbanda5398 welcome
Thanks, ma'am, for explaining in such an easy way. It really helped me, and I implemented the same approach to identify errors in our project. Currently it identifies whether a message is a valid error message or not. But ma'am, I have the concerns below; please explain:
1. How do I prepare the CSV file (data dictionary file) dynamically for the error handler?
2. Currently, when any new error comes in, it gives a prediction of T or F after the algorithm runs, but I also want a confidence for that particular error message, so that I can display in the UI that it is a 50% or 60% match, like the accuracy on unseen data you mentioned in another comment. Please help with this. Once again, thanks for this video.
Hi Babloo, glad you like the video. As per your query, you can check the accuracy on unseen data. Please refer to this code: github.com/AarohiSingla/Multinomial-Naive-Bayes/blob/master/news_classifier_unseen_input.ipynb
And do let me know if you have any other queries
@@CodeWithAarohi Thanks, ma'am, for the quick reply.
x = ["Nifty IT index down nearly 3% on Infosys weak guidance"]
r = predict_news(x)
print(r)
Entertainment
I want the result as a percentage (or in a 0.75-style format) for the above unseen data. I understand it tells me which category this unseen data falls into based on the algorithm's prediction, but I want the percentage match that put it into that category.
BABLOO KUMAR will send you that code in some time( today)
@@CodeWithAarohi Thanks, but I want output of the type below for the unseen data lying in the Entertainment category.
Printing the accuracy of our model: accuracy_score(result, y_test)
Out[75]: 0.9411764705882353
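Assuming predict_news wraps a fitted vectorizer and a MultinomialNB (as in the tutorial), the per-class match percentage for a single unseen input comes from predict_proba, not from accuracy_score, which needs true labels. A sketch with made-up training data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up training data (assumption, not the tutorial's dataset)
texts = ["Nifty index drops on weak results", "Blockbuster movie tops the charts",
         "Shares slide after guidance cut", "Singer announces world tour"]
labels = ["Business", "Entertainment", "Business", "Entertainment"]

vec = TfidfVectorizer()
clf = MultinomialNB().fit(vec.fit_transform(texts), labels)

x = ["Nifty IT index down nearly 3% on Infosys weak guidance"]
probs = clf.predict_proba(vec.transform(x))[0]
for label, p in zip(clf.classes_, probs):
    print(f"{label}: {p:.2%}")  # per-class match percentage for this one input
```

The probabilities across classes sum to 1, so the highest value is both the predicted category and its confidence.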
I want the dataset for the multinomial classifier. From where do I download the dataset?
you can get that data from github.com/AarohiSingla/Multinomial-Naive-Bayes
I could not find the datasets at this link. Please guide me on how I can get them.
Very helpful ... Thanks for the clear explanation ❤️
welcome
Thanks a lot! You explained it very nicely!
Glad it helped you
What dataset have you used? Can I get that dataset?
Share your email id; I will send you that dataset.
@@CodeWithAarohi shakiralam2017@gmail.com
github.com/AarohiSingla/Multinomial-Naive-Bayes/blob/master/first_batch.csv
can I get this dataset from anywhere?
github.com/AarohiSingla/Multinomial-Naive-Bayes
Nice video! Just wondering why the shape of title_tfidf is (77, 257); in the video it looks like a shape with far more than 77 instances (because there are multiple words in each sentence) and only 2 features: the (*, *) pair and a decimal number.
77 rows and 257 columns (a vocabulary of 257 distinct words across all rows). There are 257 columns because we performed word tokenizing, and each word is stored in a separate column.
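For what it's worth, those "(row, col) value" lines are just the sparse-matrix print format: each line is one non-zero cell, not one instance. The shape is still documents by vocabulary. A small sketch with made-up titles:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up titles (assumption)
titles = ["india wins the match", "new phone launched", "match highlights today"]
tfidf = TfidfVectorizer().fit_transform(titles)

print(tfidf.shape)  # (3 documents, vocabulary-size columns)
print(tfidf)        # each "(row, col)  weight" line is ONE non-zero cell
```

With 77 titles and a 257-word vocabulary, the printout can easily show hundreds of lines even though there are only 77 rows.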
Thanks a lot for such a helpful video ..
Richa Aggarwal welcome
Very helpful video Mam 👍
Thank you
Very well explained.
Thank you
Ma'am, can I perform text classification on unlabeled data?
JOEL MASCARENHAS hi, no, for multinomial naive bayes you need labeled data.
This is generally a supervised learning classification technique (meaning that labels are required, and good-quality data and labels are recommended, even with small samples).
Thanks a lot, nobody has explained MNB() like this.
Glad it helped you
This was so helpful! Can I know how to feed in new unseen data as a CSV and get the results in CSV format as well?
Vela please check this : github.com/AarohiSingla/Multinomial-Naive-Bayes/blob/master/news_classifier_unseen_input.ipynb
Ma'am, please share this notebook in the description box!
github.com/AarohiSingla/Multinomial-Naive-Bayes/blob/master/youtube_multinomial_naive_bayes.ipynb
You did not say anything about Bag of Words model.
You can check this video for bag of words ua-cam.com/video/tBooegCTNXM/v-deo.html
Please, is there a GitHub link for this video?
If so...
github.com/AarohiSingla/Multinomial-Naive-Bayes
Good Work ❤😍
Thanks
Thank you ...
Welcome
Yes, you can use multinomial naive bayes for more than 2-class classification.
Ma'am, can you please share that CSV file?
github.com/AarohiSingla/Multinomial-Naive-Bayes
Ma'am,
Can you please attach the source code so we can understand it more easily?
Thank you
github.com/AarohiSingla/Multinomial-Naive-Bayes/blob/master/youtube_multinomial_naive_bayes.ipynb
Here you can get the code for this example
@@CodeWithAarohi Ma'am, I also need the dataset used for the unseen text. Mail me at shakiralam2017@gmail.com
permission to learn
Kurnia Adi C hi, I didn’t understand what are you trying to say.