Tybul on Azure
DP-203: 50 - Azure Stream Analytics overview
Hey Data Engineers!
We've successfully ingested streaming data into Azure using Azure Event Hubs, but it's just sitting there, untapped and underutilized. It's time to unlock its potential and dive into the insights it holds!
In the 50th episode of my free DP-203 course, I'll guide you through Azure Stream Analytics, a powerful tool for processing streaming data. We'll explore its windowing functions, which are essential for working with temporal data windows. Don't miss out: let's put that data to work and uncover what's inside.
Enjoy!
▬▬▬▬▬▬ IMPORTANT LINKS ▬▬▬▬▬▬
My LinkedIn profile: www.linkedin.com/in/piotr-tybulewicz-81a8793/
GitHub with my drawings: github.com/TybulOnAzure/DP-203
Stream Analytics documentation: learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-introduction
▬▬▬▬▬▬ MEMBERSHIP ▬▬▬▬▬▬
Join this channel to get access to perks:
ua-cam.com/channels/LnXq-Fr-6rAsCitq9nYiGg.htmljoin
▬▬▬▬▬▬ CHAPTERS ▬▬▬▬▬▬
00:00 Introduction
00:52 Stream analytics overview
11:50 Creating a Stream Analytics Job
14:22 Components of the job
23:03 Creating the passthrough processing
47:21 Introduction to windows
49:59 Tumbling windows
1:02:25 Hopping window
1:17:40 Sliding window
1:28:06 Session window
1:33:34 Snapshots
1:35:36 Summary
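The tumbling and hopping windows listed in the chapters can be illustrated with a small plain-Python sketch. This is not Stream Analytics query code: the function names and sample timestamps are made up for illustration, and the sketch treats every window as start-inclusive and end-exclusive, which may differ from how Stream Analytics handles window boundaries.

```python
from collections import Counter


def tumbling_counts(timestamps, size):
    """Count events per fixed, non-overlapping window of `size` seconds."""
    counts = Counter()
    for ts in timestamps:
        # Each event belongs to exactly one window.
        window_start = (ts // size) * size
        counts[window_start] += 1
    return dict(sorted(counts.items()))


def hopping_counts(timestamps, size, hop):
    """Count events per window of `size` seconds that starts every `hop` seconds."""
    counts = Counter()
    for ts in timestamps:
        # Latest window start at or before the event, then walk backwards
        # through every earlier window that still covers the event.
        start = (ts // hop) * hop
        while start > ts - size and start >= 0:
            counts[start] += 1
            start -= hop
    return dict(sorted(counts.items()))


# Hypothetical event timestamps, in seconds since the start of the stream.
events = [1, 3, 5, 11, 12, 27]
print(tumbling_counts(events, size=10))
print(hopping_counts(events, size=10, hop=5))
```

With a hop smaller than the window size, one event is counted in several overlapping windows, which is the main difference from the tumbling case.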
Views: 339

Videos

DP-203: 49 - Introduction to streaming, Event Hubs
588 views, 1 day ago
Hey Data Engineers! Tired of the same old batch processing? Let's dive into something more exciting: streaming and real-time data processing! In the 49th episode of my free DP-203 course, I'm exploring streaming, sharing real-world examples, outlining a high-level architecture, and taking a deep dive into the ingestion phase using Azure Event Hubs. Enjoy! ▬▬▬▬▬▬ IMPORTANT LINKS ▬▬▬▬▬▬ My LinkedI...
DP-203: 48 - 3rd milestone - Serve phase recapped
436 views, 14 days ago
Hey data engineers! We’ve done it! We’ve just completed another phase of our projects: serving data to the end users/data consumers. Join me in the 48th episode of my free DP-203 course, where I'll walk you through the entire data serving process and present a challenge to help you refine your skills. Enjoy! ▬▬▬▬▬▬ IMPORTANT LINKS ▬▬▬▬▬▬ My LinkedIn profile: www.linkedin.com/in/piotr-tybulewicz...
DP-203: 47 - Azure Synapse Analytics - Serverless SQL Pool
805 views, 1 month ago
Hey data engineers! We've all seen the modern data warehouse architecture that uses Dedicated SQL Pool for storing data and serving it to consumers. But do we really need another copy of the data? What if we could just use the lakehouse approach? Well, we can! The key component for this is the Serverless SQL Pool available in Azure Synapse Analytics. Join me in the 47th episode of my free DP-20...
DP-203: 46 - Dedicated SQL Pool security
574 views, 1 month ago
Hey data engineers! So you've decided to store your data in Synapse Dedicated SQL Pool. Great! But how do you secure that data? How do you make sure that only selected principals are allowed to query the data? Is there a way to restrict which columns and rows users can see? Join me in the 46th episode of my free DP-203 course, where I cover the following topics: • Firewall in Synapse • Transpar...
DP-203: 45 - Azure Synapse Dedicated SQL Pool - extra features
687 views, 1 month ago
Hey data engineers! Imagine we’ve designed our Dedicated SQL Pool using a star schema and allowed users to query it. But now we’re getting complaints that it’s incredibly slow, and users have to wait too long for the data. How can we handle concurrent workloads in a Dedicated SQL Pool? How can we prioritize important queries to ensure they are processed first? How can we reserve resources to en...
DP-203: 44 - Loading data to Dedicated SQL Pool
854 views, 1 month ago
Hey data engineers! If you’ve chosen to use Dedicated SQL Pool for your data storage and serving needs, you might be wondering how to load data into it. What options are available, and what are their pros and cons? Join me for the 44th episode of my free DP-203 course, where I explore various methods for inserting data into Dedicated SQL Pool: • PolyBase • COPY INTO • ADF/Synapse data flows • A...
DP-203: 43 - Azure Synapse Dedicated SQL Pool - Architecture and overview
1K views, 2 months ago
Hey data engineers! We've finished transforming our data - it's now pure, pristine, and ready to be consumed as facts and dimensions. Great! The only issue is that it's located in a data lake using the delta format, and not every consumer can connect to it. Or maybe we just prefer a good, old-fashioned, traditional relational data warehouse instead of this "data lake thing." What should we do? ...
Channel membership announcement
902 views, 2 months ago
▬▬▬▬▬▬ IMPORTANT LINKS ▬▬▬▬▬▬ My LinkedIn profile: www.linkedin.com/in/piotr-tybulewicz-81a8793/ GitHub with my drawings: github.com/TybulOnAzure/DP-203 ▬▬▬▬▬▬ MEMBERSHIP ▬▬▬▬▬▬ Join this channel to get access to perks: ua-cam.com/channels/LnXq-Fr-6rAsCitq9nYiGg.htmljoin ▬▬▬▬▬▬ CHAPTERS ▬▬▬▬▬▬ 00:00 Introduction 00:17 YouTube membership offer 01:09 Joining the club 02:11 Junior Data Engineer lev...
DP-203: 42 - 2nd milestone - transform phase recapped
769 views, 2 months ago
Hey data engineers! We did it! We’ve just wrapped up another phase of our projects: data transformations. Catch me in the 42nd episode of my free DP-203 course, where I’ll give you a rundown of the whole transformation phase and throw in a challenge to sharpen your transformation skills (yes, we’re still using Lego data). Enjoy! ▬▬▬▬▬▬ IMPORTANT LINKS ▬▬▬▬▬▬ My LinkedIn profile: www.linkedin.co...
DP-203: 41 - Transforming data with Data Flows
947 views, 2 months ago
Hey data engineers! In our work, we often need to transform ingested data. So far, I've introduced you to three tools for this purpose: Azure Databricks notebooks, Synapse Analytics notebooks and dbt. However, all of them require you to write code using SQL, Python, Scala, or R. But what if you don't know any of these languages and still need to transform data? Is it possible? Yes, it is! You c...
DP-203: 40 - Extra features in Azure Synapse Analytics Spark Pools
820 views, 3 months ago
Hey, data engineers! You're already aware that Spark Pools in Azure Synapse Analytics can seamlessly integrate with various Azure services, such as ADLSg2. But what about other services? Join me in the 40th episode of my free DP-203 course, where I answer the following questions: • How is it possible for a Spark Pool to retrieve data from ADLSg2? • Will it work out of the box with any other sto...
DP-203: 39 - Azure Synapse Analytics - Spark Pools
1.2K views, 3 months ago
Hey, data engineers! Apache Spark is excellent for data transformations, especially when handling large datasets. However, setting up Spark can be challenging. That's why there are services like Azure Databricks that simplify the process. But what if you prefer not to use third-party solutions like Databricks and want to stick with Microsoft offerings? Is it possible? The answer is yes - we hav...
DP-203: 38 - Transforming data with dbt
1.3K views, 3 months ago
Hey data engineers! Databricks notebooks aren't the only way to transform your data. In the latest episode of my free DP-203 course, I discuss dbt - a widely used data transformation solution that offers several advantages over Databricks: • Simplicity and ease of use • Data lineage • Automatically generated and maintained documentation • Data quality tests • Jinja templating language ▬▬▬▬▬▬ IM...
DP-203: 37 - Orchestrating Databricks notebooks
1.1K views, 3 months ago
Hey, data engineers! We have successfully developed Databricks notebooks for data transformation, and they are functioning smoothly. However, up to this point, we have been executing them manually, cell by cell. How can we automate this process and integrate it into our orchestration workflow? Discover the solution by tuning in to the 37th episode of my free DP-203 series. Enjoy! ▬▬▬▬▬▬ IMPORTA...
DP-203: 36 - Automating the process with Azure Databricks Autoloader
1.3K views, 4 months ago
DP-203: 35 - Writing data to ADLSg2 from Azure Databricks
1.2K views, 4 months ago
DP-203: 34 - Common data transformations
1.5K views, 4 months ago
DP-203: 33 - Connecting to ADLSg2 from Azure Databricks
1.7K views, 4 months ago
DP-203: 32 - A closer look at Databricks notebooks
1.8K views, 5 months ago
DP-203: 31 - Introduction to Azure Databricks
2.5K views, 5 months ago
DP-203: 30 - Slowly changing dimensions, data lake structure
1.7K views, 5 months ago
DP-203: 29 - Introduction to dimensional modeling
1.9K views, 5 months ago
DP-203: 28 - 1st milestone (ingest phase recapped + challenge)
1.4K views, 6 months ago
DP-203: 27 - Azure Synapse Analytics: Overview, Pipelines
2.7K views, 6 months ago
DP-203: 26 - Introduction to Azure Logic Apps
1.5K views, 6 months ago
DP-203: 25 - Selective CI/CD for Azure Data Factory with ADF Tools (part 2)
1.3K views, 6 months ago
DP-203: 24 - Selective CI/CD for Azure Data Factory with ADF Tools (part 1)
1.6K views, 7 months ago
DP-203: 23 - ARM-based CI/CD for Azure Data Factory (part 2)
1.8K views, 7 months ago
DP-203: 22 - ARM-based CI/CD for Azure Data Factory (part 1)
2.3K views, 7 months ago

COMMENTS

  • @mdpdurawix1834
    @mdpdurawix1834 1 day ago

    42:50 Could you please elaborate, Piotr, on what you meant by culture? Personally, I would add a few more questions. What timezone is used in the source dataset? About historical data: 1. For how long should historical data be stored in ADLS? 2. If a full data reload is needed in the future due to some issues, what starting point in time should be set for the initial load? Should it be a static point in time like 01.01.2020, or a dynamic one (today minus 2 years)?

  • @mdpdurawix1834
    @mdpdurawix1834 1 day ago

    Does a private endpoint need to be monitored? Or should authentication keys be rotated after some period of time?

  • @faisalrahman6373
    @faisalrahman6373 1 day ago

    I tried to create a Managed Private Endpoint, but it failed in the provisioning state. Any suggestions on how to troubleshoot it?

    • @TybulOnAzure
      @TybulOnAzure 1 day ago

      Do you have the Microsoft.Network resource provider registered in your Azure subscription? You can check that by opening your subscription in Azure Portal, then going to Settings -> Resource providers.

  • @mdpdurawix1834
    @mdpdurawix1834 1 day ago

    Piotr, you are doing such a great job! Is it possible to verify which IRs are being used in the whole subscription? I guess it will be some command in PowerShell. (If it's explained in another video, you can ignore my question :))

  • @rafaelvieira2003
    @rafaelvieira2003 2 days ago

    watching all episodes! thank you!

  • @KamilNowinski
    @KamilNowinski 2 days ago

    Awesome! Thanks for using and spreading the word about "Real-Time Data Simulator". It's open source for the community, so hopefully many will benefit from it.

  • @PankajMehta101
    @PankajMehta101 4 days ago

    Sending love from Nepal! I have my DP-203 exam tomorrow. I just found this playlist 2 days back. It has been a great resource to revise and further consolidate the concepts. Thank you for your detailed explanations on related topics, not to mention the Whiteboard goldmines😊😊!

    • @TybulOnAzure
      @TybulOnAzure 4 days ago

      Good luck on the exam!

    • @TybulOnAzure
      @TybulOnAzure 4 days ago

      And please let me know your result after the exam.

    • @PankajMehta101
      @PankajMehta101 3 days ago

      @TybulOnAzure Hooray! I passed my exam (914/1000). Thank you for your wonderful content, Piotr!

    • @TybulOnAzure
      @TybulOnAzure 3 days ago

      @PankajMehta101 that's fantastic! Congratulations 🎉

  • @victor7893
    @victor7893 5 days ago

    Thanks a lot! New subscriber here

  • @rimwor
    @rimwor 5 days ago

    Great content - even for a newbie to the topic like myself. Thanks a lot!! 🙏🏻😊 And CONGRATULATIONS on your 50th episode!! 🎉🚀✨

  • @TheMapleSight
    @TheMapleSight 6 days ago

    More than 1.5 hours of great content!

    • @TybulOnAzure
      @TybulOnAzure 5 days ago

      Yup, when I started recording it, I didn't have a clue that it would be that long. Otherwise, I would probably have split it.

  • @veeraaaa1991
    @veeraaaa1991 6 days ago

    Thank you so much, Tybul! After watching 50 of your videos, I passed the exam with a score of 953 today. I couldn't have done this without your help! You are a great teacher! Your explanations of those concepts are the best, and your demonstrations are very easy to follow. All credit goes to you

    • @TybulOnAzure
      @TybulOnAzure 5 days ago

      That's a fantastic score! Congratulations 🎉

  • @lucasalbergaria2
    @lucasalbergaria2 7 days ago

    Thanks for the content! Would there be a correct way to add "tags" to the deployment? For example "v1.0.1"

    • @TybulOnAzure
      @TybulOnAzure 6 days ago

      If you would like to use tags to explicitly mark a specific code version - go for it.

  • @smithapisharath9610
    @smithapisharath9610 7 days ago

    Is dbt part of the DP-203 curriculum?

  • @dmitryzvorikin
    @dmitryzvorikin 7 days ago

    (50 minutes of setting up checkboxes and roles) See how elegant and simple this approach is! : )

    • @TybulOnAzure
      @TybulOnAzure 6 days ago

      It really is simple and elegant :) And once you know how to set it up, it takes way less time than 50 minutes.

    • @dmitryzvorikin
      @dmitryzvorikin 5 days ago

      @TybulOnAzure I'm just laughing, please don't take it too seriously. Thanks a lot for this clear and precise course!

  • @onghuiling6937
    @onghuiling6937 8 days ago

    Thank you Tybul! You are the best teacher!

  • @dmitryzvorikin
    @dmitryzvorikin 8 days ago

    Thank you! It looks a little bit impractical, since at 41:43 Azure tells you that you cannot grant this access and need an extra RBAC role.

  • @ZeeshanKhan-ff9xl
    @ZeeshanKhan-ff9xl 10 days ago

    Thank you for all your efforts on this. Very grateful for it. Does this course, if followed properly, provide enough information to have a career as an Azure data engineer (not just the certificate)? Thank you again

    • @TybulOnAzure
      @TybulOnAzure 10 days ago

      Hi, it should give you solid foundations to start a career as an Azure data engineer - some of my students already got such jobs.

    • @ZeeshanKhan-ff9xl
      @ZeeshanKhan-ff9xl 10 days ago

      @TybulOnAzure Thank you, and I pray for your good health

  • @muhammadzakiahmad8069
    @muhammadzakiahmad8069 11 days ago

    So once we complete the series, what else are you planning to teach? Maybe some end-to-end projects or other data engineering technologies you know.

    • @TybulOnAzure
      @TybulOnAzure 11 days ago

      I have some ideas already in mind. For sure it won't be anything as long as the DP-203 course (at least not for now). Also, Data Engineer members of my channel will have a chance to propose new topics and then vote on their priorities.

  • @TheMapleSight
    @TheMapleSight 12 days ago

    Hi, it's great to start another section of the course! When are you going to upload solutions to the challenges? After completing the last episode of the course or later?

    • @TybulOnAzure
      @TybulOnAzure 12 days ago

      I'm not going to upload them as I want you to implement them on your own.

    • @TheMapleSight
      @TheMapleSight 12 days ago

      @TybulOnAzure Sure! I just wanted to clarify this, because you once answered me that you planned to upload them anyway.

    • @pankajpatil4634
      @pankajpatil4634 9 days ago

      How long is this series gonna be? How many videos are planned in this series?

    • @TybulOnAzure
      @TybulOnAzure 9 days ago

      @pankajpatil4634 54

  • @vlad_badiuc6481
    @vlad_badiuc6481 12 days ago

    Awesome work, more content about streaming is greatly appreciated :) Thanks !!!

    • @TybulOnAzure
      @TybulOnAzure 11 days ago

      Thanks! There will be two more episodes about streaming released quite soon.

  • @alphar85
    @alphar85 13 days ago

    Hey Piotr, you might see my comments every now and then on your channel. I have been through your videos from 1 to 38, as I have a DP-203 exam soon. Your videos are very detailed and you know your stuff very well. I did the Microsoft videos and I was meant to take the exam 2 weeks ago, but when I saw your videos, I felt like I needed to learn more and more, which is why I have postponed it until I cover all of yours. Hopefully I will be in a better place.

    • @TybulOnAzure
      @TybulOnAzure 11 days ago

      Thanks! Please let me know the result after you've taken the exam.

  • @user-uh2sk3vw2r
    @user-uh2sk3vw2r 13 days ago

    Hi Piotr, could you please let us know whether the course for clearing DP-203 is finished, or is it still ongoing? I am asking because I want to set a plan to cover everything, so that I can sit the exam and clear it well.

    • @TybulOnAzure
      @TybulOnAzure 13 days ago

      Hi, the course is almost finished - in total there will be probably 54 episodes.

  • @navneetkummar4811
    @navneetkummar4811 13 days ago

    Hi Piotr. Thank you for the great content. I just found you 2 weeks back and I have finished video no. 35 right now. My question is whether Unity Catalog has been covered or is yet to be covered, as I searched up to video 38 but couldn't find it.

    • @TybulOnAzure
      @TybulOnAzure 13 days ago

      Hi, no - UC is not covered. Maybe I'll do that separately once the DP-203 course is finished.

    • @navneetkummar4811
      @navneetkummar4811 13 days ago

      @TybulOnAzure Thanks for replying, I am enjoying your content and eagerly waiting for the remaining videos :)

    • @navneetkummar4811
      @navneetkummar4811 13 days ago

      @TybulOnAzure I need one piece of advice from you. I am trying to change my domain from Data Scientist to Azure Data Engineer, so after getting the job, should I focus on Azure only, or should I learn AWS and its functionalities like Glue (which is similar to ADF) side by side? What is your take on this?

    • @TybulOnAzure
      @TybulOnAzure 13 days ago

      @navneetkummar4811 learning multiple clouds is a huge effort so I would focus on the one needed in your job.

    • @navneetkummar4811
      @navneetkummar4811 12 days ago

      @TybulOnAzure Thank you so much :)

  • @onghuiling6937
    @onghuiling6937 15 days ago

    Really informative and good channel! Recommend to everyone who would like to kickstart the journey as an Azure Data Engineer!

  • @mgdesire9255
    @mgdesire9255 15 days ago

    Same pinch 42:37😂

  • @onghuiling6937
    @onghuiling6937 15 days ago

    Thank you for such an amazing video!

  • @youngforever9173
    @youngforever9173 15 days ago

    Thank you so much for your great explanation.

  • @jeromedayrit02
    @jeromedayrit02 15 days ago

    Hi guys, I need your recommendation. I'm planning to start my Azure cloud data engineer journey and I'm new to this. Should I start with DP-900 first, or can I continue here with DP-203?

    • @TybulOnAzure
      @TybulOnAzure 13 days ago

      Hi, if you are completely new to data, then I would recommend starting from DP-900 (or even AZ-900 if you know nothing about Azure).

    • @jeromedayrit02
      @jeromedayrit02 13 days ago

      @TybulOnAzure Hi Tybul, I really appreciate you taking the time to respond. I will definitely follow your advice.

  • @forsalemailid6305
    @forsalemailid6305 16 days ago

    Great explanation - I went through all your videos on Azure Data Lake security. However, I'm still unable to solve my use case. I am trying to grant read access to a specific blob within a folder to a user. When the user clicks the blob URL in the browser, it simply says "Resource not found". Yes, I have anonymous access enabled on the storage account. If I don't, I get a "Public access not permitted" error. So how can I make sure that only users who have access/ACL permission can open the URL and others can't?

    • @TybulOnAzure
      @TybulOnAzure 16 days ago

      If you need it to work for any user (even those outside Entra) then use SAS tokens and share them with those users. Or if it is from Entra, then you can use ACLs but please remember about X ones on higher levels of the hierarchy.
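The remark about X (execute) permissions on higher levels can be sketched as a simplified check. This is only a toy model of the ACL rule: the `acls` dictionary, the `demo` user and the paths are hypothetical, and the sketch ignores RBAC roles, groups, masks and the "other" entry that a real storage account also evaluates.

```python
def can_read(path, acls, user):
    """Return True if `user` has x on every parent directory and r on the file."""
    parts = path.strip("/").split("/")
    # Every ancestor directory, starting from the container root "/".
    ancestors = ["/"] + ["/" + "/".join(parts[:i]) for i in range(1, len(parts))]
    for directory in ancestors:
        if "x" not in acls.get(directory, {}).get(user, ""):
            return False  # traversal stops at the first directory without x
    file_path = "/" + "/".join(parts)
    return "r" in acls.get(file_path, {}).get(user, "")


# Hypothetical ACL entries: path -> {user: permissions}.
acls = {
    "/": {"demo": "x"},
    "/images": {"demo": "x"},
    "/images/winnie.png": {"demo": "r"},
}
print(can_read("/images/winnie.png", acls, "demo"))

# Without x on the parent folder, r on the file alone is not enough.
acls["/images"] = {"demo": ""}
print(can_read("/images/winnie.png", acls, "demo"))
```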

    • @forsalemailid6305
      @forsalemailid6305 16 days ago

      @TybulOnAzure Thanks for the response. The user ID is from Entra. The ACLs are working as expected; I validated that using Storage Explorer with the user account. However, trying to open the blob URL in the browser gives a "Resource not found" error. In your example in the video, if you log in as DemoUser, would you be able to access the Winnie image from the URL in the browser?

    • @TybulOnAzure
      @TybulOnAzure 16 days ago

      In the browser - are you already logged in to Azure as that user for which ACLs are set?

    • @TybulOnAzure
      @TybulOnAzure 16 days ago

      You've got me curious with your case so I recreated it on my side, and to my surprise I got the same results as you. I even granted Storage Blob Data Contributor role on the whole storage account, but the result was the same. I found this old article that might shed some light on this case: stackoverflow.com/questions/20672368/acl-access-abilities-for-azure-containers-and-blobs

    • @forsalemailid6305
      @forsalemailid6305 13 days ago

      @TybulOnAzure Thank you! We have moved on and closed this issue as "cannot be accomplished". I appreciate you looking into this!

  • @swathi8273
    @swathi8273 16 days ago

    Hi Tybul, what is the best way to fix data skew?

    • @TybulOnAzure
      @TybulOnAzure 16 days ago

      Are you asking in the context of this episode?

  • @swathi8273
    @swathi8273 16 days ago

    Does Z-ORDER in Delta Lake do a distribution similar to HASH (except for applying a hash function)?

    • @TybulOnAzure
      @TybulOnAzure 16 days ago

      I might cover Z-ORDER in another episode after I finish the DP-203 series.

  • @swathi8273
    @swathi8273 16 days ago

    Great explanation! I appreciate how you reference every topic to relate with production. Thank you very much

  • @swathi8273
    @swathi8273 17 days ago

    I am certified, still watch your videos for preparing for my interviews. Please continue making videos for DE!

  • @swathi8273
    @swathi8273 17 days ago

    I was asked this question in an interview - how to implement SCD in a data lake or data warehouse!! Thanks for addressing this

  • @NIKOS.koukos
    @NIKOS.koukos 18 days ago

    Hello, thank you for your contribution by creating this series of videos. Is it possible to know in which episode the series will end?

    • @TybulOnAzure
      @TybulOnAzure 18 days ago

      Hi, probably there will be 54 episodes in total.

    • @NIKOS.koukos
      @NIKOS.koukos 18 days ago

      @TybulOnAzure The reason I am asking is that I am planning to take the exam in about a month. Based on the existing structure, I'm guessing the rest of the course will be dedicated to streaming data.

    • @TybulOnAzure
      @TybulOnAzure 18 days ago

      There will be 3 episodes about streaming, one about Purview, one for 4th milestone and the final one with exam overview.

    • @NIKOS.koukos
      @NIKOS.koukos 18 days ago

      @TybulOnAzure Thank you very much!

  • @filipposvangelis6053
    @filipposvangelis6053 19 days ago

    Hi Piotr, a big thank you for the great content! I just wanted to ask you if this tool is available for Synapse.

  • @TheMapleSight
    @TheMapleSight 19 days ago

    Is episode 49 going to be uploaded today or next Tuesday? Last Tuesday there was no episode, so that's why I'm asking.

    • @TybulOnAzure
      @TybulOnAzure 19 days ago

      Next week (it is already available in early access). Last Tuesday I was on holidays.

  • @user-py3zb2or6n
    @user-py3zb2or6n 20 days ago

    When I tried to create an external table, I was forced to create a storage credential and an external location.

    • @TybulOnAzure
      @TybulOnAzure 20 days ago

      What was the code you wrote?

    • @user-py3zb2or6n
      @user-py3zb2or6n 20 days ago

      I configured access at the cluster level. When I saved files directly to the data lake using spark.write.format("delta").save("path"), everything was fine. But when I tried to create the external table (using the same CREATE TABLE query as you), I got an error about the external location. When I created the storage credential and external location and then re-executed the cell, it worked.

  • @shilpababu9459
    @shilpababu9459 21 days ago

    Hi Tybul , Can you please make a video for concepts related to end to end jobs for running the pipelines , error handling ,monitoring, debugging and performance scenarios

    • @TybulOnAzure
      @TybulOnAzure 20 days ago

      As a "Data Engineer" member of my channel, you’ll have the special privilege of suggesting topics for new videos and voting on them. If you have a topic in mind, I’d love for you to join as a member. I’ll be setting up the first poll once I complete the DP-203 course.

  • @navneetbali112
    @navneetbali112 23 days ago

    Great, thanks for this lucid delivery <3

  • @Engineering_101_
    @Engineering_101_ 24 days ago

    excellent ❤

  • @mdpdurawix1834
    @mdpdurawix1834 25 days ago

    Hi Piotr, It would be nice to see some video about how to verify the quality of the data in different layers before it reaches end user. Great video like always, please keep it up!

    • @TybulOnAzure
      @TybulOnAzure 20 days ago

      As a "Data Engineer" member of my channel, you’ll have the special privilege of suggesting topics for new videos and voting on them. If you have a topic in mind, I’d love for you to join as a member. I’ll be setting up the first poll once I complete the DP-203 course.

  • @navneetbali112
    @navneetbali112 26 days ago

    thanks a lot!

  • @ValeriuB-c2w
    @ValeriuB-c2w 26 days ago

    Hi Tybul, I have a quick question: when the new file is created in the Conformed container, how does Databricks know to save it in Delta format and not in another format?

    • @TybulOnAzure
      @TybulOnAzure 20 days ago

      Delta Lake is the default format for all operations on Azure Databricks. Unless otherwise specified, all tables on Azure Databricks are Delta tables.

  • @abhijitbaner
    @abhijitbaner 27 days ago

    Do Databricks MANAGED tables have any advantages over EXTERNAL tables, like better performance on read/write etc.?

    • @TybulOnAzure
      @TybulOnAzure 20 days ago

      I'm not aware of any performance differences between those two (or at least I didn't notice any). An important difference between them is that if you drop a managed table, it will also drop your data. In the case of an external table, only the metadata in the metastore is removed, but the actual data stays untouched (as it is stored externally).

  • @mdpdurawix1834
    @mdpdurawix1834 29 days ago

    So if we want to verify whether the data is correct using Databricks, we should go into the transaction log and take only the Parquet files that are next to an "add" statement? (Maybe it's explained later, but I didn't want to lose this question.) 14:05 EDIT: It was of course explained later at 33:38: DESCRIBE HISTORY. Thanks for the great teaching, Piotr!

    • @TybulOnAzure
      @TybulOnAzure 20 days ago

      You shouldn't even have to bother about reading the transaction log on your own and then determining which files have to be processed - that's the job for the tool that understands delta format and "can speak it". All you would need to do is to instruct your tool to read delta folder and it would be the tool's responsibility to figure out how to do it properly.
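To make it concrete what such a tool does on your behalf, here is a heavily simplified sketch of replaying "add" and "remove" actions from a Delta transaction log to find the files that make up the current table version. The commit contents and file names below are invented for illustration, and the sketch ignores checkpoints, metadata and protocol actions, so treat it as a mental model and not as a replacement for a real Delta reader.

```python
import json


def active_files(commits):
    """Replay commits in order; each commit is a list of JSON action lines."""
    files = set()
    for commit in commits:
        for line in commit:
            action = json.loads(line)
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                # A removed file no longer belongs to the current version,
                # even though the Parquet file may still exist in storage.
                files.discard(action["remove"]["path"])
    return files


# Hypothetical commits, as they might appear in _delta_log/*.json files.
commits = [
    ['{"add": {"path": "part-0000.parquet"}}'],
    ['{"add": {"path": "part-0001.parquet"}}'],
    [
        '{"remove": {"path": "part-0000.parquet"}}',
        '{"add": {"path": "part-0002.parquet"}}',
    ],
]
print(sorted(active_files(commits)))
```

Reading every Parquet file in the folder directly would also pick up `part-0000.parquet`, which is exactly the mistake a Delta-aware tool avoids.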

  • @senthilkumarpalanisamy365
    @senthilkumarpalanisamy365 29 days ago

    Very clear and elaborate content, thanks for taking time in preparing the content and knowledge sharing. Please do more.