Modern Data Engineering Workflows, Explained

  • Published Dec 21, 2024

COMMENTS • 12

  • @KahanDataSolutions
    @KahanDataSolutions  1 year ago +2

    Looking for help with your team's data strategy? → www.kahandatasolutions.com
    Looking to improve your data engineering skillset? → bit.ly/more-kds

  • @jacobukokobili6457
    @jacobukokobili6457 1 year ago +1

    Thanks for this Kahan. Please make a video implementing the workflow like you've done with the CI/CD. Thanks again.

  • @dataruncoach
    @dataruncoach 1 year ago +1

    Very clear and concise, thank you

  • @goosetaculous
    @goosetaculous 1 year ago

    I love it. Already doing this, but it's a good reminder.

  • @marcosoliveira8731
    @marcosoliveira8731 1 year ago

    A lot of good ideas from your videos have inspired me to improve my development flow.

  • @vishal_uk
    @vishal_uk 5 months ago

    Hi Mike! Could you please clarify the following:
    After the developer makes some changes in the model and raises a PR so that the changes are reviewed and auto-tested in the QA/CI DB/schema, and later merged to the main branch, is the QA/CI a replica of the Prod DB (warehouse and marts) that reads data from Staging and validates the changes prior to getting merged to main? Thanks in advance!

  • @NicoWright-ly6en
    @NicoWright-ly6en 8 months ago +1

    Hi Kahan, a question I have after watching many of your videos. What about a client's situation makes you think one tool would fit better than another? For example Snowflake vs BigQuery.

  • @MrUbbers
    @MrUbbers 1 year ago

    In our setup we have multiple environments (DEV, QA, PROD), all separate, including the raw sources and the ETL. This at least doubles our costs. The setup that you showed eliminates the extra costs for processing and storage by using one environment, right? How do you deal with upgrades and changes in the raw data source layer? For example, a source system whose database schema changes significantly after an upgrade? Just add another schema in the raw database?

  • @felipecondore4173
    @felipecondore4173 1 year ago

    It's a very clear explanation

  • @EMBrown801
    @EMBrown801 1 year ago

    Would you need separate dev schemas for the staging and marts? Let's say I want to develop a new mart. Would I put all of those models in the same dev schema before going to production?

    • @KahanDataSolutions
      @KahanDataSolutions  1 year ago +1

      I typically will do that. I like to keep all tables/views in a single Dev schema (e.g. all Staging, Warehouse, Marts) to avoid excessive objects and keep it simple. The way I see it, nobody else is really looking at that schema, so perfect separation & organization isn't as important. What's more important is that you can confirm models deploy, check the data, etc. Then once you move to "production", separate things out by specific schemas. Hope that helps!
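
      The reply above describes a routing rule: in development every model lands in one schema, and in production each layer gets its own. A minimal sketch of that rule, assuming hypothetical names throughout (`resolve_schema`, the `dev_yourname` schema, and the `staging`/`warehouse`/`marts` schema names are illustrative, not taken from the video or from any specific tool):

```python
# Hypothetical production schema per layer; the names are assumptions for illustration.
PROD_SCHEMAS = {
    "staging": "staging",
    "warehouse": "warehouse",
    "marts": "marts",
}


def resolve_schema(layer: str, target: str, dev_schema: str = "dev_yourname") -> str:
    """Return the schema a model in `layer` should be built in for `target`."""
    if layer not in PROD_SCHEMAS:
        raise ValueError(f"unknown layer: {layer!r}")
    if target == "prod":
        # Production: separate things out by layer-specific schema.
        return PROD_SCHEMAS[layer]
    # Any non-production target: everything goes into the single dev schema.
    return dev_schema


if __name__ == "__main__":
    for layer in PROD_SCHEMAS:
        print(layer, "| dev:", resolve_schema(layer, "dev"), "| prod:", resolve_schema(layer, "prod"))
```

      Tools that build models for you usually expose a hook for this kind of decision; the function here only shows the logic, not how any particular tool wires it in.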