Schema Merge | Schema Evolution | Parquet | Spark with Scala | Scenario based questions

  • Published 23 Jun 2021
  • Hi Friends,
    In today's video, I discuss schemas, schema evolution, and the mergeSchema option in Spark, with sample Scala code.
    Please subscribe to my channel and provide your feedback in the comments section.
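
A minimal sketch of the mergeSchema read option mentioned above, not the code from the video. It assumes Spark SQL is on the classpath and a local session; the object name, column names, and the temporary path are hypothetical. Two Parquet batches are written with different column sets, then read back together with mergeSchema enabled.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object MergeSchemaSketch {
  // Writes two Parquet batches under basePath; the second has an extra column.
  def writeSampleBatches(spark: SparkSession, basePath: String): Unit = {
    import spark.implicits._

    // First batch: the original schema (id, name).
    Seq((1, "a"), (2, "b")).toDF("id", "name")
      .write.mode("overwrite").parquet(s"$basePath/batch=1")

    // Second batch: the schema has evolved and now carries "amount" as well.
    Seq((3, "c", 10.5)).toDF("id", "name", "amount")
      .write.mode("overwrite").parquet(s"$basePath/batch=2")
  }

  // Reads every batch under basePath and asks Spark to merge the file schemas.
  def readMerged(spark: SparkSession, basePath: String): DataFrame =
    spark.read.option("mergeSchema", "true").parquet(basePath)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("merge-schema-sketch")
      .master("local[*]") // assumption: local run for illustration only
      .getOrCreate()

    // Hypothetical scratch location for the demo files.
    val basePath = java.nio.file.Files.createTempDirectory("merge_schema_demo").toString

    writeSampleBatches(spark, basePath)
    val merged = readMerged(spark, basePath)

    merged.printSchema() // rows from the first batch carry null in "amount"
    merged.show()

    spark.stop()
  }
}
```

Schema merging is off by default for Parquet reads because it is relatively expensive; it can be enabled per read, as here, or for the whole session through the spark.sql.parquet.mergeSchema setting.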

COMMENTS • 11

  • @ManishSharma-wy2py
    @ManishSharma-wy2py 10 months ago +1

    Very clear explanation, ma'am. Thank you.

  • @abhiganta
    @abhiganta 2 years ago +3

    Clear explanation, ma'am. Thanks for this entire playlist.
    By the way, I believe it's schema evolution, not evaluation?

  • @nareshkumar1919
    @nareshkumar1919 3 years ago

    Clear explanation. Could you please share the dataset? It would be good to practice with.

    • @sravanalakshmipisupati6533
      @sravanalakshmipisupati6533 3 years ago +1

      Thank you, Naresh. Please take the *.parquet files from GitHub: github.com/sravanapisupati/SampleDataSet

    • @nareshkumar1919
      @nareshkumar1919 3 years ago +1

      @sravanalakshmipisupati6533 Thank you 😊

  • @vikastv9593
    @vikastv9593 1 year ago +1

    Hi Sravana,
    Hope you are doing well!
    I am blocked on one scenario in my project, and I hope you can provide some guidance.
    Background of Project:
    There are about 10 base tables. From each table, they bring in about 5 columns and create 5 derived tables as DataFrames (Scala/Spark) by joining the columns from the 10 base tables. After this, they write a Parquet file and publish the data to Snowflake via a DAG run and Databricks.
    Business Requirement:
    From one particular base table, I need to bring in one particular column and add it to all 5 derived tables.
    Current Situation:
    I have brought in that column from the base table and added it to all 5 derived-table DataFrames. After the DAG run, I cannot see that column in the Snowflake tables.
    Issue:
    The DAG run fails with an error: Exception found when writing the partition.
    Observations:
    Scala
    Please provide some guidance. I look forward to your reply.
    Regards,
    Vikas

    • @vikastv9593
      @vikastv9593 1 year ago

      Observations:
      In the Scala/Spark code there is no mention of .option("mergeSchema", "true").

    • @sravanalakshmipisupati6533
      @sravanalakshmipisupati6533 1 year ago +1

      Hi Vikas, please check this video on schema merge: ua-cam.com/video/w2EJATgekUo/v-deo.html

    • @vikastv9593
      @vikastv9593 1 year ago

      @sravanalakshmipisupati6533 Hi Lakshmi, even after adding .option("mergeSchema", "true"), the issue still persists. It throws the same error: Exception found when writing the partition to the table.

    • @sravanalakshmipisupati6533
      @sravanalakshmipisupati6533 1 year ago

      @vikastv9593 Please try writing the data to a new table. If the issue still persists, there might be an issue with the schema.
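
One way to follow up on the "issue with the schema" suggestion is to compare the schema of the DataFrame being written against the schema of the existing table. The sketch below is only an illustration under that assumption; it does not diagnose the specific error in this thread. The object name, helper names, and sample schemas are hypothetical, and it uses only Spark's StructType API.

```scala
import org.apache.spark.sql.types._

object SchemaDiffSketch {
  // Columns present in `source` but absent from `target` (names compared case-insensitively).
  def missingInTarget(source: StructType, target: StructType): Seq[String] = {
    val targetNames = target.fieldNames.map(_.toLowerCase).toSet
    source.fieldNames.filterNot(n => targetNames.contains(n.toLowerCase)).toSeq
  }

  // Columns present in both schemas whose data types differ: (name, sourceType, targetType).
  def typeMismatches(source: StructType, target: StructType): Seq[(String, DataType, DataType)] = {
    val targetByName = target.fields.map(f => f.name.toLowerCase -> f.dataType).toMap
    source.fields.toSeq.flatMap { f =>
      targetByName.get(f.name.toLowerCase)
        .filter(_ != f.dataType)
        .map(t => (f.name, f.dataType, t))
    }
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical schemas standing in for a derived DataFrame and an existing table.
    val derived = StructType(Seq(
      StructField("id", IntegerType),
      StructField("name", StringType),
      StructField("new_col", StringType)))
    val existing = StructType(Seq(
      StructField("id", LongType),
      StructField("name", StringType)))

    println(s"Missing in target: ${missingInTarget(derived, existing).mkString(", ")}")
    typeMismatches(derived, existing).foreach { case (name, src, tgt) =>
      println(s"Type mismatch on $name: source=${src.simpleString}, target=${tgt.simpleString}")
    }
  }
}
```

In a real job the two inputs would typically be df.schema for the DataFrame and the schema of the table as Spark sees it, for example spark.table("some_table").schema, where some_table is a placeholder name.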