Azure Synapse Analytics | Data Distribution Strategy and Best Practices

  • Published 12 Sep 2024

COMMENTS • 46

  • @orxanbabashov
    @orxanbabashov 7 months ago

    This is the first time I have ever subscribed to a channel as well. Huge thanks!

  • @VK-ln9vk
    @VK-ln9vk 1 year ago

    I wish there were 100000 LIKE buttons. THE BEST VIDEO on Azure Synapse distribution. I clearly understood the distributions from the demo. Thank you so much 🙏

  • @VirtusRex48
    @VirtusRex48 1 year ago

    One of the best Synapse videos out there; highly recommend!!!

  • @hoanglieuit
    @hoanglieuit 1 year ago

    This is the first time I have ever subscribed to a channel.

  • @vinayak6685
    @vinayak6685 2 years ago +1

    Really happy to find this video. Loved the practical demo of how the distribution happens. Subscribed (500th subscriber 😁). Waiting for more such awesome content 🤩

  • @Zaf567
    @Zaf567 2 years ago

    I have watched many videos on this topic, but yours is awesome.

  • @goelnikhils
    @goelnikhils 1 year ago

    What hard work went into creating this video. Very good content.

  • @husnabanu4370
    @husnabanu4370 1 year ago

    Wow, such a detailed explanation; all the visuals and query examples make it so easy to understand...

  • @donanuradha2162
    @donanuradha2162 3 years ago +1

    Very well explained how data is distributed in Synapse SQL DW

  • @vaibhavvaidya1442
    @vaibhavvaidya1442 3 years ago

    Never saw an explanation like this of Azure Synapse. Amazing :)

  • @julianromero3359
    @julianromero3359 1 year ago

    Amazing explanation, thanks; the concepts are very clear and practical to understand. I hope to find more content from you. 🤗

  • @Farisito
    @Farisito 11 months ago

    Thank you a lot ALI, very useful in my case

  • @gvgnaidu6526
    @gvgnaidu6526 2 years ago

    Amazing explanation and nice representation of all the aspects. Thank you so much Arshad

  • @SQLTalk
    @SQLTalk 2 years ago

    This is a very well done and helpful video. Thank you for making it.

  • @danielveraec
    @danielveraec 2 years ago

    Thanks for sharing this knowledge. Really helpful!!

  • @jubershikalgar4205
    @jubershikalgar4205 2 years ago

    Thank you very much for this video.
    It was very helpful and I learnt a lot about Synapse.

  • @MohammedKhan-np7dn
    @MohammedKhan-np7dn 3 years ago

    Looking forward to the next session

    • @ArshadAliAasTrailblazers
      @ArshadAliAasTrailblazers 2 years ago

      Thanks Mohammed, I just posted a video on CI/CD and am planning to post a few more in the next couple of weeks.

  • @peaceneeded
    @peaceneeded 2 years ago

    Simply Amazing Explanation !

  • @MohammedKhan-np7dn
    @MohammedKhan-np7dn 3 years ago

    Thank you for explaining the concepts in detail.

  • @MohammedKhan-np7dn
    @MohammedKhan-np7dn 3 years ago

    Very good session for understanding the concepts in Synapse Analytics

  • @abc_987
    @abc_987 24 days ago

    JUST GOLD

  • @vivekvishal2500
    @vivekvishal2500 2 years ago +1

    Great Sir 👌

  • @kuldeepgawande9550
    @kuldeepgawande9550 3 years ago

    Excellent explanation. Thank you.

  • @upendarjakkula2561
    @upendarjakkula2561 2 years ago

    Extraordinary 👌

  • @user-yj9rv7us4x
    @user-yj9rv7us4x 1 year ago +1

    👍🏻👍🏻👍🏻

  • @shuaibpantnagar
    @shuaibpantnagar 2 years ago

    Azure Synapse, especially the SQL pool, was very nicely explained. I have a question: both Synapse and Azure Databricks have a Spark engine. How should I choose between them for my project work?

  • @amittyagi9171
    @amittyagi9171 2 years ago

    Thank you so much. You are amazing.

  • @TiffanyMorris123
    @TiffanyMorris123 3 years ago +1

    Thanks for this video! Question: you touched briefly on creating statistics in Synapse before running queries, based on the query patterns. In my case I have a large group of users, from admins to analysts to developers, and I cannot predict the types of queries they will run. Are there best practices I can pass on to the users for planning which stats to create before running their queries? Do you plan future tutorials on this topic? Thanks!

    • @ArshadAliAasTrailblazers
      @ArshadAliAasTrailblazers 2 years ago

      Thanks Tiffany! While creating stats in advance is a proactive way to optimize performance, the engine also learns from first-time submitted queries to optimize performance for future submissions when the AUTO_CREATE_STATISTICS setting is ON (which it is by default). You can find more details about it here: docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-statistics
      To shorten statistics maintenance time, be selective about which columns have statistics or need the most frequent updating. For example, you might want to update date columns where new values may be added daily. Focus on having statistics for columns involved in joins, columns used in the WHERE clause, and columns found in GROUP BY. docs.microsoft.com/en-us/azure/synapse-analytics/sql/best-practices-dedicated-sql-pool#maintain-statistics
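
The advice in this reply (create statistics only on join, WHERE, and GROUP BY columns) could be scripted roughly as below. This is a minimal sketch, not the author's method: the helper `create_statistics_sql`, the table `dbo.FactSales`, and the column names are all hypothetical, and the statistics naming scheme is just one possible convention. It only builds T-SQL `CREATE STATISTICS` text; it does not connect to a pool.

```python
def create_statistics_sql(table: str, columns: list[str]) -> list[str]:
    """Build one single-column CREATE STATISTICS statement per column.

    `table` is a schema-qualified name such as "dbo.FactSales" (hypothetical).
    """
    statements = []
    for col in columns:
        # Hypothetical naming convention: stats_<schema>_<table>_<column>
        stat_name = f"stats_{table.replace('.', '_')}_{col}"
        statements.append(f"CREATE STATISTICS {stat_name} ON {table} ({col});")
    return statements


# Columns assumed to appear in joins, WHERE clauses, or GROUP BY.
for stmt in create_statistics_sql("dbo.FactSales", ["ProductKey", "OrderDateKey"]):
    print(stmt)
```

Keeping the column list short follows the point above about limiting statistics maintenance time.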

  • @HGoIchetan09
    @HGoIchetan09 3 years ago

    Excellent explanation.. Thanks..

  • @samuelrocha9079
    @samuelrocha9079 2 years ago

    Thank you for the video, one of the best I have ever watched in terms of learning about data.
    Just a quick question: for the round-robin table, you said the data will be shuffled when you run a query that groups by ProductKey, and the data will be redistributed by that field. What if I then execute the same query but group by a different field? Will the shuffle happen again, with the data redistributed by this other field?
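
One way to picture this question is the toy simulation below. It is a sketch under stated assumptions, not Synapse's actual engine: it assumes the shuffle writes to a temporary layout and leaves the stored round-robin table untouched, so each GROUP BY on a different column needs its own shuffle. The names `load_round_robin`, `group_sum`, `StoreKey`, and `Qty` are hypothetical, and CRC32 stands in for whatever hash the service really uses.

```python
import zlib
from collections import defaultdict

NUM_DISTRIBUTIONS = 60


def bucket(value) -> int:
    """Stand-in hash: map a value to one of the 60 distributions."""
    return zlib.crc32(str(value).encode("utf-8")) % NUM_DISTRIBUTIONS


def load_round_robin(rows):
    """Spread rows evenly over the distributions, ignoring their values."""
    table = [[] for _ in range(NUM_DISTRIBUTIONS)]
    for i, row in enumerate(rows):
        table[i % NUM_DISTRIBUTIONS].append(row)
    return table


def group_sum(table, key, measure):
    """SUM(measure) GROUP BY key, shuffling into a temporary layout first."""
    temp = [[] for _ in range(NUM_DISTRIBUTIONS)]
    moved = 0
    for src, rows in enumerate(table):
        for row in rows:
            dst = bucket(row[key])      # all rows with the same key meet here
            temp[dst].append(row)
            moved += dst != src         # count rows that changed distribution
    totals = defaultdict(int)
    for rows in temp:                   # each distribution aggregates locally
        for row in rows:
            totals[row[key]] += row[measure]
    return dict(totals), moved


rows = [{"ProductKey": i % 5, "StoreKey": i % 3, "Qty": 2} for i in range(300)]
table = load_round_robin(rows)

by_product, moved_1 = group_sum(table, "ProductKey", "Qty")
by_store, moved_2 = group_sum(table, "StoreKey", "Qty")  # shuffles again
print(moved_1, moved_2)
```

In this model `table` is never modified, so the second query cannot reuse the first query's layout and pays for data movement again.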

  • @user-yj9rv7us4x
    @user-yj9rv7us4x 1 year ago +1

    👍👍👍👍

  • @sumitrauniyar7347
    @sumitrauniyar7347 2 years ago

    How does replicated distribution work when we have 1 compute node?

  • @Mohammad.aarif_222
    @Mohammad.aarif_222 5 months ago

    Where do I need to store the files in Blob Storage?

  • @SSingh-lr2ue
    @SSingh-lr2ue 3 years ago

    Thank you for the clear explanation. However, I am not clear about where the 60 buckets, or 60 distributions, get stored. Is it in Azure Storage? In short, I don't understand the purpose of, or difference between, Azure Storage and the SQL Database instance attached to a compute node. Could you please explain more about it?

    • @ArshadAliAasTrailblazers
      @ArshadAliAasTrailblazers 2 years ago

      For developers, I think the important thing to consider is how it scales out. For example, if you have 2 nodes, each node will have 30 distributions attached to it; likewise, if you have 4 nodes, each node will have 15 distributions. By scaling out from 2 to 4 nodes, each node will now have roughly half of the data (assuming no data skew) and will take roughly half the time to complete processing. docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/memory-concurrency-limits#service-levels
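
The arithmetic in this reply can be written out as a few lines of Python. This is only an illustration: the function name `distributions_per_node` is made up here, and it assumes the 60 distributions are split evenly, so only node counts that divide 60 are accepted.

```python
TOTAL_DISTRIBUTIONS = 60  # fixed number of distributions mentioned in the thread


def distributions_per_node(compute_nodes: int) -> int:
    """Return how many of the 60 distributions each compute node hosts."""
    if compute_nodes < 1 or TOTAL_DISTRIBUTIONS % compute_nodes != 0:
        raise ValueError("node count must be a positive divisor of 60")
    return TOTAL_DISTRIBUTIONS // compute_nodes


for nodes in (1, 2, 4):
    print(nodes, "node(s):", distributions_per_node(nodes), "distributions each")
```

Doubling the node count halves the distributions per node, which is where the "roughly half the time" estimate comes from when the data is not skewed.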

    • @BabatundeAdeleye-mw5ce
      @BabatundeAdeleye-mw5ce 10 months ago

      The 60 distributions are stored in the SQL database instance in the SQL pool. Data from Azure Storage is spread across the distributions in different patterns, depending on the distribution type defined on the SQL pool table at creation. The SQL engine then reads the data from the distributions as instructed by your query, which may or may not require moving data around before executing the aggregate function and sending the output to the control node, which in turn sends it to the user for viewing.
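
The "different patterns" mentioned in this reply can be compared with a small simulation. It is a sketch under assumptions: CRC32 is a stand-in for the real hash function, round-robin is modeled as a simple modulo on the row position, and `ProductKey` with only 7 distinct values is a made-up example chosen to show skew.

```python
import zlib
from collections import Counter

NUM_DISTRIBUTIONS = 60


def hash_distribution(key) -> int:
    """Hash pattern: the same key value always lands in the same distribution."""
    return zlib.crc32(str(key).encode("utf-8")) % NUM_DISTRIBUTIONS


def round_robin_distribution(row_index: int) -> int:
    """Round-robin pattern: rows are dealt out in turn, regardless of content."""
    return row_index % NUM_DISTRIBUTIONS


# 600 hypothetical rows with only 7 distinct ProductKey values.
rows = [{"ProductKey": i % 7, "Qty": 1} for i in range(600)]

hash_counts = Counter(hash_distribution(r["ProductKey"]) for r in rows)
rr_counts = Counter(round_robin_distribution(i) for i, _ in enumerate(rows))

print("hash: distributions used =", len(hash_counts))
print("round-robin: distributions used =", len(rr_counts))
```

With so few distinct keys, the hash pattern can fill at most 7 of the 60 distributions, while round-robin spreads the rows evenly. That is the skew trade-off to weigh when picking a distribution column.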

  • @Mohammad.aarif_222
    @Mohammad.aarif_222 5 months ago

    How do I create an external table?