I honestly learned more from your videos than from my master's degree. I can't stop watching!
This series keeps me sane during this pandemic, thanks a lot!
Keep following us for more content!
I don't understand why I landed on these videos this late in my life, but I thank God I did. I'm going through all your text analytics videos.
Hey Dave, you have extraordinary teaching skills; the way you explained everything is just awesome. Thanks a lot!
Hi Dave, Thank you very much for posting quality videos. Your tutorials helped me a lot to understand the basic fundamentals of text analytics. I did a certificate course via an online training company, but I could not pick up the topics very well and left it half-learned; at some point I felt my money was wasted. But then I came across your videos, which are up to the mark and quite easy to learn from and apply. Thank you for your great service.
Hope to learn more from you, and I wish you a great life ahead.
Hello Dave,
You’ve made a positive difference in my life.
After I saw your videos I learned many things. Your way of teaching is awesome. I applied much of the code I learned here to my projects and got very good results. Your sacrifices don't go unnoticed.
:) Thank you
Impressive rhetorical abilities. Precise language. Beautiful.
Please post more teaching materials about R programming, you have great demonstration skills, Dave!
Really good lessons!
Thank you for all this free stuff!
Stay tuned with us for more tutorials, Rafael.
Very intuitive and detailed explanation.
Can't wait for the next part. Loving the elementary approach to teaching this stuff!
Good to hear Dave. Thanks again!
@Lord Knight - Glad you liked the video. We will be releasing videos each week until the series is complete. Video #3 will be up early next week. Stay tuned!
Dave
What do you have in mind for next series?
@jonimatix - This is currently still in planning, stay tuned!
Great work, thank you, and congratulations!
People like you make a dent in this world!
Dave, you are doing a great job
Hey, Dave
Your hard work is really paying off in creating content that is of great value to all the users. I am really looking forward to watching all the video lectures on data mining. I would also like a video on how to make sense of data specifically from Facebook pages and groups.
Cheers!
@Sumit Dargan - Thank you for the kind words, glad you liked the videos. Your request is duly noted!
I love how you explain things so clearly! More text analysis videos please! :) I'm going to watch the other ones, but I hope you go through how to use different lexicons and other classifiers like Naive Bayes for theme analysis. Anyway, thank you so much!
@Jayl - Glad you like the videos. There will be a new one each week until the series is complete. This series will focus on using tree-based mechanisms, in particular the mighty random forest. However, the feature engineering will work with any algorithm (e.g., Naive Bayes) that can be trained to perform binary classification.
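A minimal illustrative sketch of that point (not Dave's exact code): the same document-feature matrix can be handed to caret::train() with different methods, e.g. a random forest or Naive Bayes. The toy texts and settings below are assumptions for illustration only.
library(quanteda)
library(caret)

texts  <- c("win a free prize now", "free cash call now",
            "are we still meeting tomorrow", "lunch was great see you soon")
labels <- factor(c("spam", "spam", "ham", "ham"))

toks     <- tokens(texts, remove_punct = TRUE)
train_df <- convert(dfm(toks), to = "data.frame")[, -1]   # drop the document-id column
train_df$Label <- labels

ctrl  <- trainControl(method = "none")                    # skip resampling for this toy example
model <- train(Label ~ ., data = train_df, method = "rf", # swap "rf" for "nb" (klaR) to try Naive Bayes; the tuneGrid changes too
               trControl = ctrl, tuneGrid = data.frame(mtry = 2))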
Just love the way of your teaching!
Simply superb, excellent explanation!
Excellent videos! Thank you so much.
Amazing!
Dave! Great video. I have one question regarding the index function. I don't get the same twenty values as you do after having set the same random seed. This affects the text messages I see when looking at the next video about HTML-escaped ampersand characters. Any suggestions on what I should do?
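One possible cause (a hedged guess, not a confirmed diagnosis): R 3.6.0 changed the default algorithm behind sample(), so the same seed can produce different indices on different R versions, and whatever builds the index (e.g., createDataPartition() or sample()) relies on that RNG. A minimal sketch of reverting to the old behaviour:
RNGkind(sample.kind = "Rounding")   # pre-3.6.0 sampling; R warns that this sampler is slightly non-uniform
set.seed(32984)
sample(1:5574, 20)                  # 5574 is roughly the number of SMS messages; the count here is illustrative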
Hey Dave! I've been watching your videos ever since the Titanic data set, which really got me into using R and data science! I love the way you teach; it's simple and to the point! I love how you apply and acknowledge the 80/20 rule. I was thinking about learning Python to do some data analytics as well, but your videos keep winning out over my decision to switch from R to Python.
I have a question, though. In your opinion, how do you value data science bootcamps with regard to job preparation? Are there any benefits to learning Python over R?
Thanks again for all the work you put into your videos. It's 12am right now and I'm eating oatmeal while watching this haha!
@PSNzzirGrizzHD - First of all, thanx for watching the videos and I am glad that you have found them useful. To answer the common question of Python vs. R, I always tell folks the same thing:
1 - It is better to be awesome at one language rather than OK at both.
2 - If you already know Python, stick with that. If you know R, stick with that. See #1 above.
3 - Certain geographies/industries prefer one over the other. For example, if you are targeting Silicon Valley companies then Python is the way to go. See #1.
Regarding bootcamps, I can only comment on the Data Science Dojo bootcamp where I presently teach as my day job. Our students find our week-long curriculum an excellent way to bootstrap into foundational data science skills and start their journey. However, it is only the start. Becoming a great data scientist (which I do not classify myself as) requires long-term, concerted effort.
HTH,
Dave
Hey, Dave.
I'm starting with text mining and your videos are absolutely genius.
I decided to tackle text data in my native language. Sadly, quanteda doesn't support my language (Polish). I managed to get my hands on Polish stop words and remove them, but I cannot get past the stemming step. I have an array containing the stems and the generally written words, but I have problems applying the stems to my tokens. I tried asking on Stack Overflow but without any success.
@Michał Siarkiewicz - Unfortunately, the quanteda package (as do most R packages) takes a dependency on the Snowball stemmer (i.e., the SnowballC package). The Snowball stemmer doesn't appear to have support for Polish at this time. I performed a quick Google search and found the following document that may assist in finding Polish stemmers:
www.cs.put.poznan.pl/dweiss/site/publications/download/ltc_092_weiss_2.pdf
HTH,
Dave
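A minimal sketch for checking what the Snowball stemmer supports, assuming only the SnowballC and quanteda packages; the token object name below is hypothetical:
library(SnowballC)
getStemLanguages()                                       # lists the languages the Snowball stemmer supports
wordStem(c("running", "runs"), language = "english")     # example of stemming with a supported language
# If Polish support is ever added, quanteda could stem tokens directly, e.g.:
# tokens_wordstem(my_tokens, language = "polish")        # hypothetical; not supported at the time of writing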
This is really, really good. I just wish there was a version of this with Apache Spark and Scala/Java.
Great lesson!
Thank you very much for this. Would it be possible to have a full lesson on social media mining, especially with Facebook? I think companies have a constantly growing interest in knowing about their customers' needs.
Thank you for your quick reply David. Ideally, that would include extracting relevant data from Facebook, sentiment analysis through likes, etc. Besides, an analysis of people's reaction on a certain post/video/photo could bring about useful insights.
Overall, any significant data and learning from Facebook is of utmost importance.
Thank you for your willingness to do that.
@Hamza MIGHRI - Thank you for the suggestion. Could you elaborate on what specific topics would be most useful to you/others in such a tutorial?
Hey! I have been following up your videos for text analytics in R and I am unable to run the index function. Can you please help?
Really good video. Thanks a lot!
Another good video, thanks!
I enjoyed your video :) Keep it up!
@MisterBassBoost - Thanx, glad you liked it!
Great tutorial, really helpful.
I notice that the dataset shown in the video has pre-defined labels (ham and spam). I want to ask how to deal with a situation where a dataset does not have labels, besides manually labeling every text document?
That would call for unsupervised ML techniques.
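A rough sketch of one unsupervised option (k-means clustering on TF-IDF weights); the toy texts and the choice of k = 2 are purely illustrative:
library(quanteda)

texts <- c("free prize waiting claim now", "win cash now text back",
           "meeting moved to friday", "see you at the station")
toks  <- tokens(texts, remove_punct = TRUE)
mat   <- as.matrix(dfm_tfidf(dfm(toks)))    # weighted document-feature matrix

set.seed(123)
clusters <- kmeans(mat, centers = 2)$cluster
clusters                                    # cluster ids can then be inspected and labelled by hand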
I have a question: is the set.seed(32984) number chosen arbitrarily?
I have the same question.
Hey Dave! Excellent. I have learned the complete text analytics series; can we have some lessons on Google Analytics as well?
Hi Dave,
Thank you for the video. May I know if we can use sample.split instead of createDataPartition? I just want to know if there is an advantage of one over the other, or if there is a specific reason you used createDataPartition.
Also wishing you a very Happy New Year! God bless!
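Not an official answer, just a sketch: both functions produce a stratified split, so either should work; the labels below are toy data and the seed is arbitrary.
library(caret)     # createDataPartition()
library(caTools)   # sample.split()

labels <- factor(rep(c("ham", "spam"), times = c(80, 20)))

set.seed(32984)
idx    <- createDataPartition(labels, p = 0.7, list = FALSE)  # row indices of the training set
train1 <- labels[idx]

set.seed(32984)
mask   <- sample.split(labels, SplitRatio = 0.7)              # TRUE/FALSE mask for the training set
train2 <- labels[mask]

prop.table(table(train1)); prop.table(table(train2))          # both preserve the ham/spam ratio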
Hi Dave, you are an excellent teacher. I've been following your video series since R programming for excel users.
I need your help on this video.
While running the code, I'm unable to find the createDataPartition function in RStudio even though I have installed the caret package.
When I execute the command library(caret), I get this error:
Error: package or namespace load failed for ‘caret’ in loadNamespace(j
@Neha Nandwani - You are too kind, glad you like the videos! The error above indicates that you need to install the "SparseM" package. You can use the following code to do so:
install.packages("SparseM")
HTH,
Dave
Oh, yes, I could have guessed that from the error message itself. Thanks, Dave, for replying. I'm tuned in to your YouTube data science video series. Do you have plans to conduct a boot camp in India?
The GitHub link is not working; can you please share the latest one?
The R code link is broken
Error when I use this code with my data - Error in terms.formula(formula, data = data) :
duplicated name 'document' in data frame using '.'
What does this mean, and how can it be solved? Thanks.
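A guess at the cause, not a certain diagnosis: the data frame passed to the formula has two columns named 'document' (often a token literally called "document" colliding with the document-id column from the dfm conversion), and the '.' in the formula refuses duplicated names. A common fix is to force unique, syntactically valid column names; the data frame name below is hypothetical:
names(train_df) <- make.names(names(train_df), unique = TRUE)   # train_df is your modelling data frame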
Hi David, good video! What is the problem statement or objective regarding ham and spam?
@Satish Bhonagiri - If I understand your question correctly, the goal is to create a binary classification model that can accurately predict whether a new (unseen) SMS text message is ham or spam.
Perfect
@Masoud Paydar - Thank you for the compliment and glad you liked the video!