Learn Schema Evolution in Apache Hudi Transaction Datalake with hands on labs

Soumil Shah

1,028 views

Published Dec 20, 2022
The complete step-by-step guide can be found here:
www.linkedin.com/pulse/buildi...
Science & Technology

COMMENTS • 12

@balajis4788 1 month ago
Very useful video. It really helped me solve my issue. Thank you!
@SoumilShah 1 month ago
Glad it helped!
@withtheengineer-hamza-3255 1 year ago
Thank you, extremely useful.
@yulinshao8576 1 year ago ⁺¹
Thanks very much for sharing this. Is it possible to drop columns in Hudi tables on AWS?
@harjeetsinghgoldy1 10 months ago
How do I handle deleting an existing column in a table? Hudi throws errors while upserting a batch that does not have the column.
@federicomanueldlouky5231 4 months ago
Link to the notebook is not working! Could you please share the new link?
@vinjitsharma1875 1 year ago
Can we do schema evolution in a MOR-type Hudi table? Also, if we drop a column in our database and dump it to S3 using DMS, will Hudi adjust itself to this change in schema?
@SoumilShah 1 year ago ⁺¹
Hi,
Answer to your first question: yes, you can do schema evolution in MOR.
Answer to question 2: it depends on whether you are using Hive sync or creating tables using DDL.
But you can define the schema and evolve it as shown in the video.
@vinjitsharma1875 1 year ago
@SoumilShah Okay. We tried schema evolution in our MOR Hudi table, and we are able to add a new column, change the datatype of a column, and rename a column. But when we delete a column, it gives this error: "org.apache.parquet.io.InvalidRecordException: Parquet/Avro schema mismatch: Avro field 'emp_salary' not found
at org.apache.parquet.avro.AvroRecordConverter.getAvroField(AvroRecordConverter.java:221)"
No, we are not using Athena. We are doing schema changes on the fly.
df = spark.createDataFrame(data = datalist, schema = df_schema)
(
df.write.format("org.apache.hudi")
.options(**CombinedConfig)
.mode("append")
.save(f"s3a://{hudi_table_bucket}/{hudi_table_path}/{schema}/{table}")
)
where df_schema is in this format: struct
When we deleted the column emp_salary, we removed it from the df_schema struct.
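One workaround that may avoid the "Avro field ... not found" error described above is to keep the dropped column in every incoming batch and fill it with nulls, so the batch schema stays compatible with the schema already stored in the table. The sketch below shows only that padding step in plain Python and does not call Hudi or Spark. The helper name `pad_missing_columns` and the columns `emp_id` and `emp_name` are hypothetical; only `emp_salary` comes from the thread.

```python
from typing import Any, Dict, Iterable, List


def pad_missing_columns(
    records: Iterable[Dict[str, Any]], table_columns: List[str]
) -> List[Dict[str, Any]]:
    """Return copies of the records in which every table column is present.

    Columns missing from a record are filled with None (null). Columns that
    exist only in the record (newly added ones) are kept after the table columns.
    """
    padded = []
    for record in records:
        # Start from the table's column order, using None where the batch has no value.
        row = {column: record.get(column) for column in table_columns}
        # Preserve any extra columns the batch introduces.
        for key, value in record.items():
            if key not in row:
                row[key] = value
        padded.append(row)
    return padded


# Hypothetical table layout; the source system has since dropped emp_salary.
table_columns = ["emp_id", "emp_name", "emp_salary"]
batch = [{"emp_id": 1, "emp_name": "a"}]

padded_batch = pad_missing_columns(batch, table_columns)
print(padded_batch)
```

With a Spark DataFrame, the same idea would be to add the missing column as a typed null before writing, for example with `withColumn` and `lit(None).cast(...)`, using the column type the table already has. Whether this fits depends on the Hudi version and write configuration in use.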
@mennagamea4634 1 year ago
When we apply this, we sometimes get an Athena error saying the column doesn't exist in the schema. The error is inconsistent, though: sometimes it appears, and other times the column is added as expected and we can see it. Any idea why?
@SoumilShah 1 year ago
Switch to Athena engine 3 to resolve the issue 😀😀
@mennagamea4634 1 year ago
@SoumilShah I did, but Athena engine 3 has an error with schema changes: it always gives me an error that the column is not in the schema.
