How to run PySpark in Google colab online on web without installing Jupyter Notebook & Databricks

  • Published Sep 5, 2024
  • This is a PySpark tutorial video on how to run PySpark online using a Google Colab Python notebook. It's super easy to run PySpark on Google Colab in the web browser, just like on other popular platforms such as Kaggle notebooks or Databricks, and without installing PySpark on your local machine. The video quickly shows the steps to install PySpark in Google Colab with a simple Python pip command.
    #googlecolab #pyspark #jupyternotebook #databricks
    pip install pyspark OR
    !pip install pyspark
    Although I am using this command to install PySpark in a Google Colab notebook, the pip command is in fact universal: you can use the same command to install the PySpark package regardless of the platform you are using.
    • Jupyter notebook
    • Databricks
    • PyCharm
    • Google Colaboratory
    • Python Online Compiler
    • Python command line
    The exact same command works on all of the above.
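    Once the install command above has run, a minimal sanity check looks the same on any of these platforms. This is a sketch; the app name and sample data are purely illustrative:

    ```python
    # In a notebook cell, install first:  !pip install pyspark
    from pyspark.sql import SparkSession

    # Start (or reuse) a local Spark session; the app name is arbitrary
    spark = SparkSession.builder.appName("colab-demo").getOrCreate()

    # Sanity check: build a tiny DataFrame and count its rows
    df = spark.createDataFrame([(1, "a"), (2, "b"), (3, "c")], ["id", "letter"])
    row_count = df.count()
    print(row_count)  # 3
    ```

    If this prints without errors, PySpark is installed and working in the notebook.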
    • Google colab vs jupyter notebook
    Why is Google Colab better than Jupyter notebook?
    Installing and running PySpark in a Jupyter notebook is tedious for people who are new to PySpark and just want to run PySpark code for the sake of learning.
    In addition, you need Python or an Anaconda distribution already installed on your system. The Jupyter notebook pip package is also required to launch the notebook in your web browser, and Java is a dependency that must be installed in order to execute PySpark commands.
    • Google colab vs Databricks
    Databricks is also one of the popular platforms for running PySpark online. Databricks is nowadays available on top cloud services such as AWS, Google Cloud Platform & Microsoft Azure.
    However, to try Databricks for free, you need an email address from a company or your employer's organisation.
    • Google colab vs Local Machine
    In order to run PySpark on your local machine, the following must be installed on your system:
    1. Python or Anaconda
    2. Java
    3. IDE (Visual Studio Code or PyCharm)
    4. Apache Spark
    5. winutils.exe
    Timestamps:
    0:00 - Introduction
    0:25 - Google Colab intro
    1:46 - Create new Colab Notebook
    If you are a beginner who just wants to run PySpark syntax and built-in functions and understand the functionality, I would suggest starting with Google Colab, as it skips all of the above installations.
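    As a starting point, here is a sketch of the kind of built-in-function exercise a beginner might run in a fresh Colab notebook (the column names and sample rows are made up for illustration):

    ```python
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("pyspark-basics").getOrCreate()

    # Toy data, purely illustrative
    df = spark.createDataFrame(
        [("alice", 80), ("bob", 90), ("carol", 70)],
        ["name", "score"],
    )

    # Built-in functions: upper() transforms a column, avg() aggregates one
    upper_names = [row[0] for row in df.select(F.upper("name")).collect()]
    avg_score = df.agg(F.avg("score")).first()[0]
    print(upper_names)  # ['ALICE', 'BOB', 'CAROL']
    print(avg_score)    # 80.0
    ```

    Everything here uses only the `pyspark.sql` module, so it runs unchanged on Colab, Kaggle, or any other notebook where the pip install succeeded.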
    Like, share, and subscribe!
