PySpark Tutorial for Beginners | Apache Spark with Python - Linear Regression Algorithm

  • Published 24 Oct 2024

COMMENTS • 42

  • @bunnyvlogs7647
    @bunnyvlogs7647 3 years ago +1

    Love you.
    I finally found out why the vector implementation is required.

  • @ElhamMirshekari
    @ElhamMirshekari 5 years ago +42

    Your voice goes like a sine function!! periodically up and down!

  • @vimalkrishnan9801
    @vimalkrishnan9801 5 years ago +7

    Hi Krish,
    Can you please rearrange the data science tutorials playlist? It would be a great help for many.
    Thanks in advance.

  • @rahadulhaq6387
    @rahadulhaq6387 4 years ago

    If there had been a final overall explanation, the video would have been better and easier to understand. Thanks for your nice effort.

  • @musthafaofficial4773
    @musthafaofficial4773 3 years ago

    We really understood the overall.

  • @souravbiswas6892
    @souravbiswas6892 4 years ago +1

    Show some data cleaning, preparation, feature engineering, feature selection, normalisation, cross-validation, hyperparameter optimization, and model validation using PySpark.

  • @Ap-dv6lg
    @Ap-dv6lg 5 years ago +2

    Very good explanation!!

  • @pythonwiz8516
    @pythonwiz8516 3 years ago +1

    Hi @Krish, how can we get the coefficient for each independent variable?

  • @viane123456
    @viane123456 3 years ago

    Excellent tutorial, thank you for this.

  • @chd9841
    @chd9841 3 years ago

    Also, please do a neural network lecture in PySpark.

  • @hmcsandy
    @hmcsandy 4 years ago

    Krish Sir, is Spark still used today? How do Spark and TensorFlow compare?

  • @naveenajaykumar675
    @naveenajaykumar675 4 years ago

    Can you make a video on reading data from different sources?

  • @louerleseigneur4532
    @louerleseigneur4532 3 years ago

    Thanks Krish

  • @statsbyindian3842
    @statsbyindian3842 5 years ago

    Hi, can you make more videos on Spark and data engineering concepts?

  • @Melukote_Sriharsha
    @Melukote_Sriharsha 4 years ago

    Can you please share the dataset you are using in this tutorial so that we can try this?

  • @anilgadekar848
    @anilgadekar848 4 years ago

    Great explanation, thanks.

  • @Jamesomnipotent
    @Jamesomnipotent 4 years ago

    Hi @KrishNaik, is it possible to specify the actual size of the training and testing sets during the split in PySpark? I was able to do this in pandas. Thanks.

  • @fawadkhan8905
    @fawadkhan8905 4 years ago

    @Krish Naik Hey Sir, I came across this issue. If possible, kindly help me with it; I have tried but could not figure it out yet.
    20/02/23 22:00:54 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
    20/02/23 22:00:54 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS

  • @saumitravshl
    @saumitravshl 5 years ago

    Can you please make a video on how to work with HDFS files in ML using PySpark?

  • @papachoudhary5482
    @papachoudhary5482 3 years ago

    How can I get the complete sessions with a real-time project? Can you help?

  • @chetanmundhe8619
    @chetanmundhe8619 4 years ago

    Very nice

  • @012akashh
    @012akashh 6 years ago +1

    Nice video 👍

  • @202rupesh
    @202rupesh 5 years ago +3

    nice tutorial :)

  • @YashSharma-xb2os
    @YashSharma-xb2os 4 years ago

    I am not able to call show after transforming the data in PyCharm. What can I do? Please help, I am stuck.

  • @papachoudhary5482
    @papachoudhary5482 3 years ago

    Thanks, Sir!

  • @amitpadhi9021
    @amitpadhi9021 3 years ago

    in
    ----> 1 featureassembler.transform(dataset)
    AnalysisException: Cannot resolve column name "Avg. Session Length" among (Email, Address, Avatar, Avg. Session Length, Time on App, Time on Website, Length of Membership, Yearly Amount Spent); did you mean to quote the `Avg. Session Length` column?;
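
The message above points at the dot in "Avg. Session Length": Spark normally reads a dot in a column name as access to a nested field. One possible workaround, sketched below, is to strip the dots from the column names before calling the assembler. The helper names `strip_dots` and `rename_dotted_columns` are hypothetical, and `df` is assumed to be the DataFrame loaded from the CSV; whether this resolves the error in every Spark version is not confirmed by the thread.

```python
def strip_dots(column_names):
    """Return the column names with every '.' removed."""
    return [name.replace(".", "") for name in column_names]


def rename_dotted_columns(df):
    """Return a DataFrame whose column names contain no dots.

    `df` is assumed to be a pyspark.sql.DataFrame. DataFrame.toDF(*names)
    returns a new DataFrame with the columns renamed by position.
    """
    return df.toDF(*strip_dots(df.columns))


# The pure-Python part can be checked without a Spark session.
columns = ["Email", "Avg. Session Length", "Time on App"]
print(strip_dots(columns))  # ['Email', 'Avg Session Length', 'Time on App']
```

After renaming, the assembler's `inputCols` would refer to "Avg Session Length" without the dot, which is the spelling used in the reply by @krishnaik06 further down.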

  • @yuvrajverma6832
    @yuvrajverma6832 3 years ago +1

    How do I install PySpark?
    Please help me.

  • @bhavitavyashrivastava8600
    @bhavitavyashrivastava8600 4 years ago

    How can I make a web app from this model and run it as a web application?

  • @aksingh6697
    @aksingh6697 4 years ago

    Sir, I am getting this type of error:
    ---------------------------------------------------------------------------
    AttributeError Traceback (most recent call last)
    in
    1 from pyspark.sql import SparkSession
    ----> 2 Spark=SparkSession.builder.appName('customers').getOrcreate()
    AttributeError: 'Builder' object has no attribute 'getOrcreate'

  • @XERXEZITTRAININGANDPROJECTS
    @XERXEZITTRAININGANDPROJECTS 2 years ago

    How do I save a PySpark-based ML model?

  • @ashishnautiyal2620
    @ashishnautiyal2620 2 years ago

    How can I download this CSV file?

  • @chittineediteja7330
    @chittineediteja7330 5 years ago +2

    Sir, please help me with how to install PySpark on Windows.

    • @Melukote_Sriharsha
      @Melukote_Sriharsha 4 years ago +4

      Now you can just open the command line and type "python -m pip install pyspark". That's all. It's simplified now.

    • @chittineediteja7330
      @chittineediteja7330 4 years ago +1

      @Melukote_Sriharsha thank you 😍

  • @tejam1837
    @tejam1837 4 years ago +2

    import pyspark as py
    from pyspark.sql import SparkSession
    spark=SparkSession.builder.appName('Customers').getOrCreate()
    While running these commands, I am getting an error.

  • @constantinosmorfakis940
    @constantinosmorfakis940 6 years ago +1

    Hi, thanks for the video!
    When I call 'linreg.fit(train_data)', I get the error:
    An error occurred while calling o143.fit.
    : java.lang.AssertionError: assertion failed: lapack.dppsv returned 2.
    I am running Spark on a Windows machine; not sure if this is the issue. Any ideas?

    • @krishnaik06
      @krishnaik06 6 years ago +1

      Hi, make sure the output column that you created in the VectorAssembler is the same as the features column used in LinearRegression.
      See the syntax below:
      featureassembler=VectorAssembler(inputCols=["Avg Session Length","Time on App","Time on Website","Length of Membership"],outputCol="Independent features")
      lr = LinearRegression(featuresCol="Independent features",labelCol='Yearly Amount Spent')
      You can download the code from the GitHub link that I have provided in the description section.
      Please let me know if you face any issues.
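
The reply's point, that the assembler's `outputCol` and the regression's `featuresCol` must name the same column, can be sketched as a small helper. This is a minimal sketch rather than the code from the video: the function name `fit_linear_regression`, the 75/25 split, and the seed are assumptions, and `df` is assumed to be a Spark DataFrame with the column names quoted in the reply. It is not claimed to resolve the `lapack.dppsv` error reported above.

```python
FEATURE_COLS = [
    "Avg Session Length",
    "Time on App",
    "Time on Website",
    "Length of Membership",
]
FEATURES_COL = "Independent features"  # shared by the assembler and the regression
LABEL_COL = "Yearly Amount Spent"


def fit_linear_regression(df, seed=42):
    """Assemble the feature vector, split the data, and fit a linear regression.

    `df` is assumed to be a pyspark.sql.DataFrame containing FEATURE_COLS
    and LABEL_COL. The 75/25 split and the seed are illustrative choices.
    """
    # Imported lazily so the constants above can be reused without Spark.
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.regression import LinearRegression

    # outputCol here and featuresCol below refer to the same column name.
    assembler = VectorAssembler(inputCols=FEATURE_COLS, outputCol=FEATURES_COL)
    assembled = assembler.transform(df).select(FEATURES_COL, LABEL_COL)

    # randomSplit takes relative weights, so the resulting sizes are approximate.
    train_data, test_data = assembled.randomSplit([0.75, 0.25], seed=seed)

    lr = LinearRegression(featuresCol=FEATURES_COL, labelCol=LABEL_COL)
    model = lr.fit(train_data)
    return model, test_data
```

On the fitted model, `model.coefficients` and `model.intercept` hold the learned parameters, with one coefficient per entry of `FEATURE_COLS` in the same order, and `model.evaluate(test_data).r2` gives the R² on the held-out rows.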

  • @ArunKumar-sg6jf
    @ArunKumar-sg6jf 3 years ago

    You left out accuracy in the video.

  • @saitulasikolapudi1122
    @saitulasikolapudi1122 4 years ago +1

    Sir, this is clearly not for beginners, lots of things