How to read csv file in PySpark dataframe | Read csv in Google colab using pyspark example code

  • Published 5 Dec 2022
  • In this PySpark tutorial for beginners, I explain how to read a csv file in Google Colab using PySpark. The steps and the PySpark syntax for reading a csv file work anywhere: the spark.read.csv example in this video can also be run on other platforms and Python notebooks such as Databricks and Jupyter Notebook.
    #pyspark #googlecolab #pandas #jupyternotebook #databricks
    If you follow the same code, it is enough to read a csv file with PySpark in Databricks and Jupyter Notebook as well. Only the .csv file name and path will vary from user to user.
    PySpark has other ways to read csv files, but in this video I demonstrate the most basic PySpark commands. The same task can also be done in Python with the Pandas library; reading a csv file in Google Colab with Pandas needs a few minor changes, which I will cover in a separate video.
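    For comparison, here is a minimal sketch of the Pandas route mentioned above. It is not the code from the video: the inline csv text and the column names are made up so the snippet runs on its own, and in practice pd.read_csv would usually be given a file path instead.

```python
import io

import pandas as pd

# Hypothetical csv content, used instead of a real file so the sketch is self-contained.
csv_text = "name,age\nAsha,31\nBen,27\n"

# pd.read_csv accepts a file path or a file-like object; the first row becomes the header.
pdf = pd.read_csv(io.StringIO(csv_text))

print(pdf.head())   # first rows of the dataframe
print(pdf.dtypes)   # column types inferred by Pandas
```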
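    • PySpark code to read a csv file
    from pyspark.sql import SparkSession
    # "master_name" and "app_name" are placeholders, e.g. "local[*]" runs Spark locally
    spark = SparkSession.builder.master("master_name").appName("app_name").getOrCreate()
    df = spark.read.csv("file_path")  # replace "file_path" with the path to your .csv file
    df.show()  # display the first 20 rows
    df.printSchema()  # print column names and types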
    Jump directly to a topic using the timestamps below:
    0:00 - Introduction
    0:57 - How to create a SparkSession
    2:44 - How to read csv into a dataframe
    3:50 - Import file in Google Colab
    4:47 - Copy csv file path
    5:15 - Display dataframe created from csv
    6:21 - PySpark dataframe schema
    I am using the exact code from the read csv PySpark example shown in this video.
    For this example, I used a .csv file that Google Colab already provides in the sample data folder under its Files pane. It is also possible to read a csv file in Google Colab from your desktop; see the other video on my YouTube channel @datawithvedant.
    You can also read data from Google Drive in Colab once you mount the drive. A small piece of code mounts the drive so that you can access every file in Google Drive. Mounting can also be done from the UI, which is covered on my channel.
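    The Drive mounting code is not shown in this description, so the following is only a sketch of how it is commonly done with the google.colab package. The helper name mount_google_drive and the file name data.csv are hypothetical; the helper returns None outside Colab so the snippet can run anywhere.

```python
def mount_google_drive(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab.

    Returns the mount point, or None when google.colab is not available.
    """
    try:
        from google.colab import drive  # only importable inside Google Colab
    except ImportError:
        return None
    drive.mount(mount_point)  # asks for authorization in the notebook
    return mount_point


mounted_at = mount_google_drive()
if mounted_at is None:
    print("Not running in Google Colab; skipping Drive mount.")
else:
    # Files in Drive appear under MyDrive; "data.csv" is a hypothetical file name.
    csv_path = mounted_at + "/MyDrive/data.csv"
    print("Drive mounted; a csv file could be read from", csv_path)
```

    The resulting path can then be passed to spark.read.csv like any other file path.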
  • Science and Technology
