Delta Live Tables || Change Data Capture (CDC) in DLT || SCD1 and SCD2 || Apply Changes in DLT
- Published Sep 12, 2024
- Delta Live Tables (DLT) Introduction
- Introduction to Lakehouse Architecture
- Challenges with Lakehouse Architecture
- Procedural ETL vs Declarative ETL
- DLT is Declarative ETL
- Features present in DLT
#DLT
#StreamingTable
#MaterializedView
#views
#lineage
#pipeline
#DeclarativeFramework
#ELTFramework
#ETL
#databrickstesting
#dataengineers
#dataengineering
#Databricks
#StreamingETL
#BatchETL
#DataQuality
#DataIntegration
#MergeExpectations
#ELT
#DataProcessing
#BigData
#DataManagement
#SmartContracts
#DataGovernance
#DataAnalytics
#DataScience
#DataEngineering
#ETLProcess
#TechInnovation
Hey man, awesome video and I loved how you got straight to the point, thanks for uploading this!
Crisp and Clear explanation, Thank you.
thank you
Quick question: if a record is dropped from the source table (i.e., a hard delete), how does apply_changes handle it?
We should not drop any records from the source (bronze) layer; ideally we do deduplication at the silver layer, so that if we ever need the source data again we can reprocess it from bronze.
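For the case where delete events do arrive as part of the CDC feed (rather than as silent hard deletes), apply_changes can be told which rows represent deletes via apply_as_deletes. A minimal sketch, assuming a source view "customers_cdc" with illustrative columns "customer_id", "operation", and "sequence_num":

```python
import dlt
from pyspark.sql.functions import expr

# Sketch only: source name and column names are assumptions for illustration.
dlt.create_streaming_table("customers_scd")

dlt.apply_changes(
    target="customers_scd",
    source="customers_cdc",
    keys=["customer_id"],
    sequence_by="sequence_num",
    apply_as_deletes=expr("operation = 'DELETE'"),    # rows flagged as deletes are applied as deletes in the target
    except_column_list=["operation", "sequence_num"], # metadata columns not stored in the target
    stored_as_scd_type=2                              # 2 keeps history rows; 1 overwrites in place
)
```

If the source really does hard-delete rows with no change event at all, there is nothing for apply_changes to consume, which is another reason to keep an append-only bronze layer as suggested above.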
How do we track lineage, given that Apply Changes drops the lineage?
Hi, I was implementing SCD1/SCD2 in DLT. I have batch data in the source, and a new file arrives from the source daily.
After the day-1 run succeeded, I modified the day-2 file so that some new records would insert and some would update.
But I am getting the error below:
Flow '' has FAILED fatally. An error occurred because we detected an update or delete to one or more rows in the source table. Streaming tables may only use append-only streaming sources. If you expect to delete or update rows to the source table in the future, please convert table to a live table instead of a streaming live table. To resolve this issue, perform a Full Refresh to table . A Full Refresh will attempt to clear all data from table and then load all data from the streaming source.
A few questions:
- If you have batch data, why are you using DLT?
- Even if you are, which function are you using to read the file: read, readStream, or Auto Loader?
- If you are using read, you should work with some kind of date-based folder structure, since you are getting one file each day.
- If you are using readStream, make sure you delete the existing checkpoints and then restart your pipeline.
- If you are using Auto Loader, you should not face this issue; just try a fresh load (see the sketch after this list).
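A minimal sketch of the Auto Loader approach, assuming daily CSV files land in a hypothetical path; Auto Loader only picks up new files, so the bronze streaming table stays append-only and the error above should not occur (updates are then handled downstream by apply_changes, not by rewriting the streaming source):

```python
import dlt

# Hypothetical landing path; adjust to your storage layout.
SOURCE_PATH = "/mnt/landing/customers/"

@dlt.table(
    name="customers_bronze",
    comment="Daily source files ingested incrementally with Auto Loader"
)
def customers_bronze():
    return (
        spark.readStream
        .format("cloudFiles")                           # Auto Loader
        .option("cloudFiles.format", "csv")
        .option("cloudFiles.inferColumnTypes", "true")
        .option("header", "true")
        .load(SOURCE_PATH)
    )
```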
If we need to implement any transformation on the silver table, how do we do it?
It is better to use views to do the transformation and then append that data into the DLT table.
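A small sketch of that pattern, with the transformation kept in a DLT view and a table materializing it downstream; the table and column names ("customers_bronze", "email", "customer_id") are assumptions for illustration:

```python
import dlt
from pyspark.sql.functions import col, trim

# The view holds the transformation logic; it is not persisted on its own.
@dlt.view(name="customers_cleaned_vw")
def customers_cleaned_vw():
    return (
        dlt.read("customers_bronze")
        .withColumn("email", trim(col("email")))        # example cleanup transformation
        .filter(col("customer_id").isNotNull())
    )

# The silver table simply reads from the view, so the logic stays in one place.
@dlt.table(name="customers_silver")
def customers_silver():
    return dlt.read("customers_cleaned_vw")
```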