Complete Machine Learning Project with PySpark MLlib Tutorial ❌Logistic Regression with Spark MLlib

  • Published 28 Sep 2024

COMMENTS • 33

  • @DecisionForest
    @DecisionForest  4 years ago +3

    Hi there! If you want to stay up to date with the latest machine learning and big data analysis tutorials please subscribe here:
    ua-cam.com/users/decisionforest
    Also drop your ideas for future videos, let us know what topics you're interested in! 👇🏻

  • @LuciaBukovaLushspaces
    @LuciaBukovaLushspaces 4 years ago +5

    Great that you've put the timelines in the description. Really helpful!

    • @DecisionForest
      @DecisionForest  4 years ago +2

      Glad it was helpful! Wanted to make it easier to scan through the content.

    • @LuciaBukovaLushspaces
      @LuciaBukovaLushspaces 4 years ago

      @DecisionForest Yeah, good thinking! 😁

  • @hilmi8992
    @hilmi8992 3 years ago +2

    Hey, Radu! Thank you for these very informative and practical tutorials. They really helped me to figure out big data preprocessing and building pipelines. Please keep going and keep adding new tutorials.

    • @DecisionForest
      @DecisionForest  3 years ago

      Hi Hilmi, thank you for the kind words, trying my best!

  • @amitkumargangwar8818
    @amitkumargangwar8818 3 years ago +7

    Hi there,
    I think you have made a basic mistake: you are actually using the test data (pp_df) to train the model.

    • @riseshrox
      @riseshrox 3 years ago

      Yeah this seems off to me too.

    • @nickp7526
      @nickp7526 2 years ago

      Good thing you said it, I was wondering about that too 😅

  • @TheLeoncer
    @TheLeoncer 3 years ago

    Liked and sub'd. Incredible. I pay my lecturer thousands of dollars and he can't explain nearly as clearly what you have just showcased.

  • @fatenlouati4325
    @fatenlouati4325 2 years ago

    Thank you for this lesson. I wish you would explain how to do this on a Spark cluster.

  • @shann9404
    @shann9404 3 years ago

    It was so helpful, it will help me with my homework, thanks.

  • @guneetkaur6895
    @guneetkaur6895 2 years ago

    I am getting the error "__init__() got an unexpected keyword argument 'inputCols'" at this step:
    one_hot_encoder = [OneHotEncoder(inputCols=[f" {x}_StringIndexer" for x in catCols], outputCols=[f" {x}_OneHotEncoder" for x in catCols])]
    Please help!

  • @amitsrivastava9152
    @amitsrivastava9152 2 years ago

    Hello there! I am getting this error at the OneHotEncoder step: "TypeError: __init__() got an unexpected keyword argument 'inputCols'". Can you please help?

  • @flamboyantperson5936
    @flamboyantperson5936 4 years ago

    Thank you very much for this topic. I loved it.

  • @emafotolescu860
    @emafotolescu860 4 years ago

    👍🏻👍🏻

  • @saurabrao8920
    @saurabrao8920 3 years ago +1

    It would really help if you could prepare videos on end-to-end pipelines and production-oriented implementations of other ML algorithms using MLlib, such as random forests, linear regression, SVMs, etc. Thank you!

  • @DataScienceGarage
    @DataScienceGarage 1 year ago

    Very rich explanation, thanks for that!

  • @carlvinabonyo9729
    @carlvinabonyo9729 2 years ago

    Great video, subscribed! Do you know, or have a video on, whether PySpark ML has functionality similar to VotingRegressor in scikit-learn, where you can combine multiple regression types such as linear regression and random forest into one ensemble that averages the predictions of these models?

  • @pronoy592
    @pronoy592 3 years ago

    I am using PySpark MLlib for multiclass image classification. Can anyone suggest the stages my deep learning pipeline should have for this task? I am using the latest PySpark version, so things like DeepImageFeaturizer have long been deprecated.

  • @haneulkim4902
    @haneulkim4902 2 years ago

    Great tutorial! What if you want to preprocess using PySpark but then convert back to pandas to use TensorFlow? In that case I would need to expand the one-hot encoded vector into separate columns with the correct column names. Any tips on doing this?

  • @rezahamzeh3736
    @rezahamzeh3736 2 years ago

    Missed your amazing training videos. Hope to see more of them in the near future.

  • @tatidutra
    @tatidutra 2 years ago

    Thank you for this video! It helped me a lot! :)

  • @flamboyantperson5936
    @flamboyantperson5936 4 years ago

    Also make one on the k-means clustering algorithm with PySpark.

  • @alejandrofleitas1055
    @alejandrofleitas1055 3 years ago

    Excellent video, congrats. One question: what's the difference between Spark MLlib and PySpark MLlib? Thanks!

    • @DecisionForest
      @DecisionForest  3 years ago

      Thank you! PySpark is the Python API for Spark, so it's just a language difference.

  • @mayraju.p5591
    @mayraju.p5591 3 years ago

    Hi, can you please explain the last part, the recall and precision table, and how I should interpret it?
    model.summary.pr.show()

    • @SoyeBoy
      @SoyeBoy 2 years ago

      Yes, agreed on this. You should only be getting single values for precision and recall, yet you seem to have one for every instance. You skipped over this without explaining it.
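A possible explanation for the table discussed above: for a binary logistic regression model, `model.summary.pr` is a DataFrame with `recall` and `precision` columns that describes the precision-recall curve, with one row per decision threshold rather than one overall value. A single precision and a single recall only appear once a specific threshold (commonly 0.5) is chosen. The plain-Python sketch below uses made-up scores and labels to show how the pair changes as the threshold moves; it illustrates the idea and is not Spark's own implementation.

```python
def precision_recall_at(threshold, scores, labels):
    """Precision and recall when scores >= threshold are predicted positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up predicted probabilities and true labels for five instances.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]

# One (threshold, precision, recall) point per distinct score: the PR curve.
curve = [
    (t, *precision_recall_at(t, scores, labels))
    for t in sorted(set(scores), reverse=True)
]
for t, p, r in curve:
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f}")

# Fixing one threshold gives the familiar single pair of numbers.
print(precision_recall_at(0.5, scores, labels))
```

Because the candidate thresholds come from the predicted scores themselves, the table can look as if it has a row per instance.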