Handling Categorical Data in Machine Learning: Easy Explanation for Data Science Interviews

  • Published Dec 3, 2024

COMMENTS • 17

  • @emma_ding  1 year ago +1

    Many of you have asked me to share my presentation notes, and now… I have them for you! Download all the PDFs of my Notion pages at www.emmading.com/get-all-my-free-resources. Enjoy!

  • @linghaoyi  1 year ago

    Thank you. Merry Christmas and Happy New Year!

  • @junlizhou7167  1 year ago

    Thanks for the informative video, Emma! Love the Notion notes you created.

    • @emma_ding  1 year ago +1

      So glad you enjoyed it! Thank you for watching. 😊

  • @hsuya3925  1 year ago +3

    Hi Emma, very informative video. Thanks for working on all these types of videos and sharing them with us. I wanted to know: is your Notion page public? If not, could you share it?

    • @Doctor_monk  1 year ago

      I have been waiting for these as well. :)

    • @emma_ding  1 year ago

      Of course! I'm working on getting all the notes organized and shareable in one location, and I will let you know as soon as they are ready! :)

    • @emma_ding  1 year ago

      @sukumargv @hsuya3925 Here you go! You can now download all the PDFs of my Notion pages at www.emmading.com/get-all-my-free-resources. Enjoy!

  • @sruthimallarapu7662  1 year ago +1

    Hi Emma, can decision trees handle string categorical values (for example, a "gender" column that takes "M" or "F")? Is it not necessary to convert the strings to numeric values?

    • @georgezevallos  11 months ago

      All ML algorithms require converting strings into numerical values. Even NLP does it. Hope this helps.

  • @rakeshkumarsharma2250  1 year ago +1

    How do I convert a pincode / postal code?

  • @saudiorchestra6443  1 year ago

    How do we deal with a category that appears for the first time in the test data? For example, in the training data I have a column for jobs. The training data contains these jobs:
    Doctor, Nurse, Lab technician, Administrator
    I used one hot encoding for the job column. What if the test data has an additional job Surgeon? How do we handle this situation?
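
One common way to handle the situation described above is to encode any category that was not seen during training as an all-zero vector. The sketch below illustrates that idea in plain Python; the helper names `fit_categories` and `one_hot` are hypothetical and not from the video. scikit-learn's `OneHotEncoder` offers the same behavior through `handle_unknown="ignore"`.

```python
def fit_categories(values):
    """Return the sorted list of distinct categories seen in training."""
    return sorted(set(values))


def one_hot(value, categories):
    """Encode one value; a category unseen in training becomes all zeros."""
    return [1 if value == category else 0 for category in categories]


# Jobs taken from the question above.
train_jobs = ["Doctor", "Nurse", "Lab technician", "Administrator"]
categories = fit_categories(train_jobs)

print(one_hot("Nurse", categories))    # [0, 0, 0, 1]
print(one_hot("Surgeon", categories))  # [0, 0, 0, 0] (unseen in training)
```

An alternative under the same assumptions is to reserve an explicit "other" column for unseen or rare categories, so the model can learn a weight for them.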

  • @Digital_awara  1 year ago

    Hi Emma, thank you so much for this insight. In addition, I would like to know how to handle very large datasets. I was asked about this in an interview and was unable to answer correctly. How would you handle very large datasets, and what steps would you take to load them? If you could make a video on this topic, that would be great.

  • @qingxiawang161  1 year ago

    Hi, Emma, thank you very much for the informative video, I really learned a lot from it! Keep up the good work❤

  • @jet3111  1 year ago

    Hi Emma, thank you for the very informative video. It would be great to discuss embedding methods for handling categorical data.

    • @emma_ding  1 year ago

      Great suggestion! I've added it to my list of content ideas. 😊 Thanks for watching!

  • @ABCEE1000  1 month ago

    Thank you, but you didn't clarify much about how to use the hashing method.
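
For readers with the same question, here is a minimal sketch of the hashing method (often called the hashing trick), assuming a fixed number of buckets. Each category string is hashed and mapped to one of `n_buckets` columns, so no category dictionary has to be stored and unseen categories still get a valid encoding. The function names and the choice of MD5 are illustrative assumptions, not the method shown in the video; scikit-learn provides a production version as `FeatureHasher`.

```python
import hashlib


def hash_bucket(value, n_buckets=8):
    """Map a category string to a stable bucket index in [0, n_buckets)."""
    # A hashlib digest is used because Python's built-in hash() for strings
    # changes between interpreter runs.
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def hash_encode(value, n_buckets=8):
    """Return a fixed-length vector with a single 1 at the hashed bucket."""
    vector = [0] * n_buckets
    vector[hash_bucket(value, n_buckets)] = 1
    return vector


for job in ["Doctor", "Nurse", "Surgeon"]:
    print(job, hash_encode(job))
```

The trade-off is that different categories can collide in the same bucket. A larger `n_buckets` reduces collisions at the cost of a wider feature vector.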