How to build a sustainable data ecosystem on Google Cloud

  • Published 10 Sep 2024
  • Today, I’m sharing my experience on how to establish a data ecosystem within Google Cloud that addresses some significant challenges such as enhancing the speed of development, improving data management and sharing, ensuring quality, and most importantly, identifying clear methods to evaluate our progress.
    It’s essential to recognise that creating a lasting data system isn’t merely about adhering to the latest trends or buzzwords. It’s about genuinely understanding the complaints, identifying the root causes, and utilising technology to construct a system that tackles these issues while also closely monitoring its adoption.
    Let’s initiate a dialogue on how we can leverage technology to overcome these challenges and monitor our success effectively. What are your thoughts or experiences with enhancing data management and sharing?
    #DataManagement #DataSharing #GoogleCloud #Technology #ProgressMeasurement #DataEcosystem #Adoption #DataMesh
    01:24 - The biggest challenge in today’s data landscape
    02:41 - What your typical day might look like
    04:36 - What do these actually mean
    07:37 - What tooling do you need to help with these challenges
    10:27 - A view of a data ecosystem
    14:27 - The “sustainable” part
    22:03 - Let’s revisit our top issues
    24:46 - Summary
    Slides: docs.google.co...

COMMENTS • 14

  • @alifarah9 4 months ago +1

    Really appreciate these high-quality videos! Seriously, your videos are better than the official GCP videos. What makes them invaluable is that you teach from first principles and talk about problems that will be faced in any cloud environment, not just GCP.

    • @practicalgcp2780 4 months ago +1

      Thank you so much 🙏 You are right, the principles are very much the same no matter which cloud provider it is. My focus is GCP because I believe that, as an ecosystem, it's much more powerful yet remains the easiest to implement and scale compared to other cloud providers.

  • @alexanderpotts8425 5 months ago +1

    Knocking it out of the park as usual. I'm trying to get adoption of some of these in my team already!

    • @practicalgcp2780 5 months ago

      Amazing to see you find it useful. I believe a lot of the things I covered are what we are already doing every day; I was trying to put everything together in a more structured way, hopefully to help a wider crowd adopt these technologies and methods.

  • @WiktorJurek 5 months ago +1

    This is bang on. It would be awesome to see how this works in practice - as in, how all of this looks in the console, how to set it up, and practically how you can oversee/manage this kind of setup.

    • @practicalgcp2780 5 months ago

      The foundation isn't that difficult to set up, although there is quite a lot of effort involved overall. It's not as if there is a single UI where everything can be done. I think the entry point for data management and discovery for a large group of users can be the catalog tool, and a platform team can own the tooling for things like quality scans and Analytics Hub while making them self-service. Some things, especially the data quality check rules, I would prefer to keep in version control so it's much easier to control changes and the quality of the checks, whereas for other things the Analytics Hub UI should be sufficient, as long as there is a way to recover if something goes wrong.
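
      As an illustration of keeping data quality check rules in version control, here is a minimal sketch in plain Python. It does not use any GCP API; the `Rule` class, the rule names and the sample rows are all hypothetical, and a real setup would more likely express the rules in whatever format the chosen quality scanning tool accepts.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """One data quality rule (hypothetical structure, for illustration only)."""
    name: str
    column: str
    check: Callable[[Any], bool]


# The rules live in a reviewed file, so every change goes through a pull request.
RULES = [
    Rule("order_id_not_null", "order_id", lambda v: v is not None),
    Rule("amount_non_negative", "amount", lambda v: v is not None and v >= 0),
]


def failed_rules(rows: list[Row], rules: list[Rule]) -> dict[str, int]:
    """Return the number of failing rows per rule name."""
    failures = {rule.name: 0 for rule in rules}
    for row in rows:
        for rule in rules:
            if not rule.check(row.get(rule.column)):
                failures[rule.name] += 1
    return failures


# Hypothetical sample data: one row with a missing id, one with a negative amount.
sample = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": None, "amount": 5.0},
    {"order_id": 3, "amount": -2.0},
]
result = failed_rules(sample, RULES)
print(result)
```

      Because the rules are ordinary code in a repository, a change to a check shows up as a diff that can be reviewed and rolled back.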

  • @JohnMcclaned 5 months ago +2

    Would love to see a video about how to publish from AlloyDB to an ordered Pub/Sub topic

    • @practicalgcp2780 5 months ago

      I thought about the challenges of event-based data consumption from message queues, but decided not to cover them in this video.
      Event-based data consumption in real time has very different challenges, and I don't believe it's the same pain as we get with data stored in analytic databases. Sure, managing those is important, but from my experience, event-based applications are very bespoke, already have clear data contracts as they are very mission critical, are mostly built by data engineering teams and are well maintained. Unfortunately the same cannot be said for data being consumed in analytic databases.
      As for AlloyDB, I assume you are using it for more bespoke use cases, as it's not typically used to store all data permanently for a large group of teams to consume.

    • @JohnMcclaned 5 months ago

      @practicalgcp2780 I am building an event-sourced event store and I need a way to have ordered changes propagated out. I am defaulting to polling at a 1-second interval, though I am exploring other solutions.
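
      The polling approach described above can be sketched roughly as follows. This is only an assumption about how such a poller might look, not the commenter's implementation: it uses an in-memory SQLite database as a stand-in for AlloyDB, a hypothetical `events` table with a monotonically increasing `seq` column, and a `publish` callback in place of a real Pub/Sub client.

```python
import sqlite3


def fetch_new_events(conn, last_seq, limit=100):
    """Return (seq, payload) rows with seq > last_seq, ordered by seq."""
    cur = conn.execute(
        "SELECT seq, payload FROM events WHERE seq > ? ORDER BY seq LIMIT ?",
        (last_seq, limit),
    )
    return cur.fetchall()


def poll_once(conn, last_seq, publish):
    """Publish one batch in seq order and return the new cursor position."""
    for seq, payload in fetch_new_events(conn, last_seq):
        publish(seq, payload)
        last_seq = seq  # advance the cursor only after publish returns
    return last_seq


# Stand-in database and table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (seq INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    [("created",), ("updated",), ("deleted",)],
)

published = []
cursor_pos = poll_once(conn, 0, lambda seq, payload: published.append((seq, payload)))
print(published, cursor_pos)
```

      A real poller would call `poll_once` in a loop with a one-second sleep and persist the cursor between runs. One caveat when moving this to PostgreSQL-compatible databases: sequence values are assigned before commit, so concurrent transactions can become visible out of `seq` order, and a simple `seq > last_seq` cursor may then skip events unless that is handled separately.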

  • @SwapperTheFirst 5 months ago +1

    I like this format of battle stories/coaching.

    • @practicalgcp2780 5 months ago

      Thanks ☺️ I thought I'd try a different way to present; it feels like more people can relate to this.

  • @SwapperTheFirst 5 months ago +1

    Any examples of such tools for cataloging, certification and lineage? Especially OSS?
    I had some experience with Qlik Catalog, but I'm not sure if it is a good choice for GCP or how well it is integrated with BQ.
    Beyond usual suspects (Collibra, Immuta, ...)

    • @practicalgcp2780 5 months ago +1

      There are a few GCP partners with very good GCP integration, which saves engineers a lot of time on metadata integration. Collibra is one of them, as you already mentioned; you can also look at Atlan, a newer player in the field with some powerful features too. Those are the two I am aware of that, in my view, have pretty good integration and features, but please do your own research: there are pros and cons, and these are not recommendations. By OSS, do you mean support systems like JSM?

    • @SwapperTheFirst 5 months ago

      @practicalgcp2780 Nope, I mean open source software, like Apache Airflow for workflow management, from which you can also build managed solutions, like Astronomer or Cloud Composer. I think something should exist in this space too?