Azure Data Factory, Azure Databricks, or Azure Synapse Analytics? When to use what.

  • Published 25 Dec 2024

COMMENTS • 57

  • @saivenkateshtummala5576
    @saivenkateshtummala5576 9 months ago +7

    This is really helpful for someone starting new, thank you!

  • @marchelomoratti1
    @marchelomoratti1 10 months ago +3

    Thank you so much for the presentation! It was very informative, it gave me a great picture of those tools!

  • @DistrictGentleman
    @DistrictGentleman 5 months ago +5

    This was definitely helpful for my DP-900 exam

  • @MuhammadUsamaAwan
    @MuhammadUsamaAwan 1 year ago +5

    It was an excellent session regarding all these tools. It helps you a lot to understand when to use what.

  • @leolebron23
    @leolebron23 6 months ago +3

    Lisa is amazing!
    What a cool presentation.

  • @cloudbaud7794
    @cloudbaud7794 8 months ago +1

    Nice info and fun to watch 😊

  • @shanthababu
    @shanthababu 1 year ago +5

    Excellent! Thanks, Lisa Hoving.

  • @sajidsid
    @sajidsid 6 months ago +3

    Thank you for the summary, this is quite helpful.

  • @jaydeep9622
    @jaydeep9622 18 days ago

    Lovely Session Thanks👌

  • @premanandasahoo290
    @premanandasahoo290 1 year ago +2

    Thanks a lot @lisa. I got a whole lot of clarity. Was always confused about which service to use and why.

  • @ranjancse26
    @ranjancse26 5 months ago +1

    Wow! Amazing presentation on Azure Data Factory, Azure Databricks, Azure Synapse Analytics. Love it :)

  • @MrBadGenius
    @MrBadGenius 3 months ago +1

    Amazing work 🎉❤

  • @valliguduru4963
    @valliguduru4963 9 months ago +2

    Thank you for the video. Excellent analysis and presentation!!! Can you please do a comparison video for Azure Fabric vs Azure Databricks?

  • @AliciaMarkoe
    @AliciaMarkoe 3 months ago +1

    Wonderful, thank you 🦋

  • @pauloroncarati
    @pauloroncarati 10 months ago +2

    Great presentation!

    • @SQLBits
      @SQLBits 10 months ago

      Thank you kindly!

  • @psvarada
    @psvarada 1 year ago +2

    very nicely explained. great job!

  • @kirole7381
    @kirole7381 7 months ago +1

    Thank you for the work, Lisa!

  • @CMJTe
    @CMJTe 2 months ago +1

    What would be your recommendation if she was using Delta Lake? Does Synapse integrate well with Delta Lake for data flows and data processing?

    • @LisaHoving
      @LisaHoving 2 months ago

      Delta Lake is totally an option!

    • @CMJTe
      @CMJTe 2 months ago

      @LisaHoving Yes, I know Delta Lake is an option, but I am asking whether Synapse integrates well with Delta Lake compared with using Databricks for Delta Lake.

    • @LisaHoving
      @LisaHoving 2 months ago +1

      @CMJTe In my personal opinion, both integrate well. However, historically speaking, Azure Databricks does come out with newer Delta Lake versions faster than Synapse, or Azure Data Factory for that matter. So, if it is the newest features you are after, go with Azure Databricks. If this is not a priority, both are good.

  • @davidlion4482
    @davidlion4482 1 year ago +16

    Azure Data Factory is similar to SSIS and doesn't have a data store to persist data, whereas Azure Databricks and Azure Synapse have a database engine that supports storing data.
    Azure Data Factory is only an ETL/ELT tool, while the other two offer both ETL/ELT and a database.
    In this case, Azure Data Factory shouldn't be compared to a database.

    • @devarshsanghvi9315
      @devarshsanghvi9315 1 year ago +2

      It's a separate tool, that's true, but since many people use Data Factory for ETL, they do wonder whether they should use Azure Synapse / Azure Databricks for their ETL or continue using Azure Data Factory. Note that those who don't know code can leverage the UI at a little extra cost, and those who do know code can save a little too.

    • @LisaHoving
      @LisaHoving 1 year ago +5

      Migrating to Databricks can offer you a bit more flexibility, but you would have to migrate all the pipelines to code. Alternatively, you could use both tools, and make your new flows in Databricks. Notebooks and packaged code in Databricks can easily be kicked off by ADF, making it a cool orchestrator!
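
The pattern in the reply above, where ADF kicks off a Databricks notebook, can be sketched as a pipeline definition. This is only a minimal illustration, not a configuration shown in the talk: the pipeline name, activity name, notebook path, linked service name, and parameter are all hypothetical, and the JSON shape is assumed to follow ADF's Databricks Notebook activity format. The Python below just builds and prints the definition; it does not deploy anything.

```python
import json


def databricks_notebook_activity(name, notebook_path, linked_service, parameters=None):
    """Build the dict form of an ADF activity that runs a Databricks notebook.

    All argument values passed in below are hypothetical placeholders.
    """
    return {
        "name": name,
        "type": "DatabricksNotebook",
        "linkedServiceName": {
            "referenceName": linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "notebookPath": notebook_path,
            # Values handed to the notebook as parameters at run time.
            "baseParameters": parameters or {},
        },
    }


# Hypothetical pipeline with one activity: ADF orchestrates, Databricks transforms.
pipeline = {
    "name": "run_new_flow",
    "properties": {
        "activities": [
            databricks_notebook_activity(
                name="TransformSales",
                notebook_path="/Shared/transform_sales",
                linked_service="AzureDatabricksLinkedService",
                parameters={"run_date": "2024-12-25"},
            )
        ]
    },
}

print(json.dumps(pipeline, indent=2))
```

Existing ADF pipelines can stay as they are under this setup, with only new flows written as notebooks.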

    • @grahamthomas7821
      @grahamthomas7821 1 year ago +1

      Agreed that ADF seems like an odd comparison here but the Databricks vs Synapse comparison was really helpful

    • @rajeshshetty4685
      @rajeshshetty4685 1 year ago +1

      Why, then, is the speaker saying that there is no data storage (24:36) in all three?

    • @莫奈-s3z
      @莫奈-s3z 3 months ago

      Agree

  • @datadataeverywhere6954
    @datadataeverywhere6954 7 months ago +1

    Eye opening

  • @peterpan-yj4rn
    @peterpan-yj4rn 9 months ago +2

    Why can't ADF be used for Power BI if the target data model is SQL Server?!

    • @LisaHoving
      @LisaHoving 9 months ago +2

      If SQL Server is the target, you can indeed just connect Power BI to SQL Server and do your aggregations/data loading with ADF, no problem! My point was more about connecting ADF to Power BI. In Synapse and Databricks you can create tables and use these definitions directly in Power BI by connecting these tools. ADF has no such thing.

  • @ishankhobare22
    @ishankhobare22 4 months ago +1

    Cheers Lisa! Thank you

  • @CMJTe
    @CMJTe 2 months ago

    Aren't Azure Synapse pipelines based on ADF? If so, how come it's cheaper on Synapse to run data flows?

  • @MauriceBierhuizen
    @MauriceBierhuizen 10 months ago +2

    Very clear. And hilarious when she misspoke sqlbit, and blamed her adhd🤣

  • @nikjojo
    @nikjojo 1 year ago +1

    Great presentation thank you.

  • @sbudama242
    @sbudama242 1 year ago +1

    I am a bit confused: why can't we store data in Databricks? Doesn't Databricks have the Lakehouse for that?

    • @grahamthomas7821
      @grahamthomas7821 1 year ago +1

      I guess it's because it's just Azure data lake storage under the hood? So technically the data isn't actually stored in Databricks

    • @michaszalast6094
      @michaszalast6094 1 year ago

      Lakehouse is just the architectural approach. To my knowledge, every analytical, cloud-based solution is built on top of some kind of cloud data storage (ADLS, Blob Storage, AWS S3, etc.), and this is only the data storage layer.

    • @himondas18
      @himondas18 1 year ago +1

      As I understand it, Databricks and Synapse store data in Azure Blob Storage and give you a database/data-warehouse-like model on top of that, so that you can do analytics and other work more easily. Some projects even build the data integration and pipelines in ADF to trigger Databricks jobs/notebooks, while Synapse does the analytics and serves BI tools over Delta Lake in Databricks.

  • @williamnguyen5771
    @williamnguyen5771 9 months ago +4

    HAHAHAHA 20:12 man she’s so hilarious for keeping it real. ADHD here too

  • @waldchiller4695
    @waldchiller4695 11 months ago +1

    Here still just having on prem projects with SSIS LOL.

  • @YasminS-k9o
    @YasminS-k9o 2 months ago

    Hi, what is the minimum salary we can expect for an Azure Data Factory developer with 5 years of experience, plus 5 years of other experience?

  • @ivanp9222
    @ivanp9222 1 year ago +1

    What about the Java you highlighted earlier? Or did I miss it 😂

    • @UNNIE2363
      @UNNIE2363 1 year ago

      Yes, you kinda missed it. She mentions going with Databricks if your speciality is Java, as the Java language is supported.

    • @DiscobiscuitUK1
      @DiscobiscuitUK1 9 months ago

      ua-cam.com/video/_QtA_492l4k/v-deo.htmlsi=NAXqM24LibEQz4tI&t=1171

  • @steelmilkjug
    @steelmilkjug 1 year ago

    What can DataBricks do that Synapse cannot do better?

    • @danhorus
      @danhorus 1 year ago +7

      Here's a few off the top of my head:
      1. Databricks clusters are more flexible. You can choose the cheaper Compute Optimized VMs for append-only incremental processing, or Storage Optimized VMs to enable caching on the local SSDs, among other VM types. In Synapse, you can only use Memory Optimized and GPU Optimized VMs;
      2. Databricks clusters allow you to use Spot VMs for the workers, which are significantly cheaper as well. Synapse does not support Spot VMs;
      3. Databricks allows for better cluster sharing, as the same cluster can have multiple Spark sessions active at once. Synapse reserves slots for each Spark session, and those slots will sit idle when the developer is not running any code -- they can't be used by other developers while they are reserved;
      4. The notebook file format in Databricks lends itself better to git diffs in Pull Requests, as they are regular code files (e.g., Python code) with some comments for special cells. Synapse notebooks, on the other hand, are saved as JSON files which are much harder to review in a git diff interface;
      5. Databricks has exclusive features such as Auto Loader and identity columns, which are really helpful for data engineering and framework development;
      6. Databricks is the flagship product of the company founded by the creators of Apache Spark, and as such it will always have an edge in supporting new Spark versions and features. Meanwhile, Synapse is a PaaS offering from Microsoft, and Microsoft is now clearly focusing a lot more on their SaaS offering: Microsoft Fabric. If I had to build a data platform on Azure today, I would use Databricks as my transformation engine. Hope this helps! :)
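
Point 4 in the list above, about notebooks being regular code files, can be illustrated with a small sketch. It assumes the Databricks source format for Python notebooks: a `# Databricks notebook source` header line, `# COMMAND ----------` lines between cells, and `# MAGIC` prefixes on non-Python cells. The sample notebook and the `split_cells` helper are hypothetical and are never executed against Spark; the code only parses text.

```python
from typing import List

HEADER = "# Databricks notebook source"   # assumed first line of an exported notebook
CELL_SEPARATOR = "# COMMAND ----------"   # assumed marker between cells


def split_cells(source: str) -> List[str]:
    """Split a source-format notebook into its cells (hypothetical helper)."""
    lines = source.splitlines()
    if lines and lines[0].strip() == HEADER:
        lines = lines[1:]
    cells, current = [], []
    for line in lines:
        if line.strip() == CELL_SEPARATOR:
            cells.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    cells.append("\n".join(current).strip())
    # Drop empty cells left over from leading or trailing separators.
    return [cell for cell in cells if cell]


# Hypothetical notebook text; table names are placeholders.
notebook = """# Databricks notebook source
df = spark.read.table("sales")

# COMMAND ----------

# MAGIC %sql
# MAGIC SELECT COUNT(*) FROM sales

# COMMAND ----------

df.write.mode("append").saveAsTable("sales_clean")
"""

cells = split_cells(notebook)
for number, cell in enumerate(cells, start=1):
    print(f"--- cell {number} ---")
    print(cell)
```

Because each cell is plain text in a `.py` file, a one-line change shows up as a one-line diff in a pull request, rather than as a change buried inside a JSON document.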

  • @roman220220220
    @roman220220220 1 month ago

    I hope you finally found a job)

  • @vyacheslavs5642
    @vyacheslavs5642 2 months ago

    Your choice is Snowflake, actually. :)

  • @tinasheyamaone5435
    @tinasheyamaone5435 1 year ago +1

    WHAT DOES MORE MATURE EVEN MEAN???!!!!

    • @SQLBits
      @SQLBits 1 year ago

      Hi Tinashelyemaone5435, you can get in touch with the speakers directly through LinkedIn and X! They are normally more than happy to help.

    • @kimstuart7989
      @kimstuart7989 11 months ago +2

      It means the amount of work the developer community has put into it. Think of it as beta vs stable. Databricks is way more stable and has already been through iterations of catching bugs and implementing fixes. Synapse Analytics is comparatively newer and is going through that iterative process now, so in time its reliability will catch up to that of Databricks.

    • @bms4654
      @bms4654 8 months ago

      I would say maturity is the level of knowledge and skills an organization has to support these tools. You are not going to give a graphing calculator to a 6-year-old child. You are not going to give Databricks to a company that has everything in spreadsheets.

  • @tinasheyamaone5435
    @tinasheyamaone5435 1 year ago +5

    You Said absolutely Nothing!!!