Real time ETL: Integrate Kafka Data Stream with a Data Lake | Kafka | Data Stream | Data Lake

  • Published 6 Sep 2024

COMMENTS • 9

  • @BiInsightsInc
    @BiInsightsInc  3 months ago +1

    Link to Kafka series: ua-cam.com/play/PLaz3Ms051BAkwR7d9voHsflTRmumfkGVW.html
    Link to Data Lake video: ua-cam.com/video/DLRiUs1EvhM/v-deo.html
    Link to data lake GitHub repo: github.com/hnawaz007/pythondataanalysis/tree/main/data-lake
    Link to Kafka GitHub repo: github.com/hnawaz007/pythondataanalysis/tree/main/kafka

  • @streambased
    @streambased 2 months ago

    Love how Kafka is turning into a data lake now that you can have unlimited retention at very low cost (KIP-405). This means you can reduce data movement by bringing analysts directly to where the data was ingested. This opens up a plethora of new data sources and makes a much greater volume of data available for ad hoc analysis. We can finally say goodbye to the cost, complexity, and consistency issues associated with heavy ELT/ETL processes!

  • @patrickblankcassol4354
    @patrickblankcassol4354 3 months ago

    Thank you for the video, excellent explanation.

  • @rafaelg8238
    @rafaelg8238 3 months ago

    Great project, congrats. Keep going!

  • @vladimirborisov3586
    @vladimirborisov3586 3 months ago

    Hi! Thank you for the video. It's a great explanation, as always on your channel!
    I have a question. I have a similar task starting from Kafka, and I'm currently using the Iceberg/Dremio/Nessie stack from your previous video for storing the data. Here you have added Hive. Could you explain the benefits of using Hive with, or instead of, the stack from your previous data lakehouse guide? Thanks!

    • @BiInsightsInc
      @BiInsightsInc  3 months ago

      Thanks. The Nessie catalog is feature rich, with Git integration, and I personally prefer it over Hive. However, Dremio is cloud native and offers an open source option, so I'd read the fine print on what's allowed commercially. That's the only catch. Otherwise, your setup is optimal for streaming and storage. This implementation is fully open source and you can deploy it commercially. Both options offer similar capabilities, and I will cover more, as the current connector limits Iceberg's ACID capabilities. More to come on that.

  • @amymorrison5615
    @amymorrison5615 3 months ago

    🔥🎆😍

  • @MohacelHosen-cd8iz
    @MohacelHosen-cd8iz 3 months ago

    I subscribed. Send my phone to Bangladesh, Uttara, Sector #13, Road #18. I prefer a MacBook.