Olympic Data Analytics | Azure End-To-End Data Engineering Project | Part 2

  • Published 1 Dec 2024

COMMENTS • 91

  • @s2muhamm
    @s2muhamm 1 year ago +4

    Really good project! Two things to point out. 1) You don't need to calculate the total medals won by each country, because each row is unique by country, so the data is already in a usable format for this task. 2) Your query for the average number of entries by gender is not correct. Again, each row is unique (here by discipline), so avg() has nothing to average over. Here is the right code:
    select
    Discipline
    ,cast(female as float) / total as average_female
    ,cast(male as float) / total as average_male
    from entriesgender;
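A minimal sketch of the corrected query above, run against an in-memory SQLite table so it can be tried without Synapse. The three rows are made-up illustrative values, not figures from the dataset, and the column aliases `share_female` and `share_male` are hypothetical names chosen here. SQLite uses `REAL` where the comment uses `float`.

```python
import sqlite3

# Illustrative rows only (assumed values): one row per discipline.
rows = [
    ("Archery", 64, 64, 128),
    ("Artistic Swimming", 105, 0, 105),
    ("Boxing", 100, 200, 300),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entriesgender "
    "(Discipline TEXT, Female INTEGER, Male INTEGER, Total INTEGER)"
)
conn.executemany("INSERT INTO entriesgender VALUES (?, ?, ?, ?)", rows)

# Cast before dividing so the division is not done on integers.
query = """
SELECT Discipline,
       CAST(Female AS REAL) / Total AS share_female,
       CAST(Male AS REAL) / Total AS share_male
FROM entriesgender
ORDER BY Discipline
"""
shares = conn.execute(query).fetchall()

for discipline, share_female, share_male in shares:
    print(f"{discipline}: female={share_female:.2f}, male={share_male:.2f}")
```

Without the cast, integer division would truncate every share below 1 to 0, which is why the comment casts `female` and `male` first.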

  • @HemantKumar-su1qt
    @HemantKumar-su1qt 7 months ago +3

    Hi sir,
    I personally thank you very much for the AWS Data Engineering Project series; we are learning a lot from it. We are all really grateful for your generosity. Our humble request is that you create an Azure Data Engineering project using SQL & SSMS, as there is only one Azure Data Engineering project so far. We want to learn more about Azure data engineering, and there is no project yet that uses Bronze, Silver & Gold layer transformations in PySpark on Databricks and in SSMS respectively.
    I hope you will accept our request soon.
    Thank you so much for your valuable guidance & support.

  • @humayunshahid9433
    @humayunshahid9433 6 months ago +2

    It's really supportive even for beginners. Great Work done. Love from Pakistan

  • @azureenthusiast
    @azureenthusiast 1 year ago +3

    It's an awesome project, clearly explained end to end. I really loved it!! You helped all the way!!

  • @omkarkhurd2460
    @omkarkhurd2460 2 months ago

    Very good project for the understanding. Great work Darshil..

  • @soumikmishra7288
    @soumikmishra7288 1 month ago

    Really nice project to learn, Thanks a lot Darshil

  • @ClayartistRaj
    @ClayartistRaj 11 days ago

    loved the way of teaching

  • @heljava
    @heljava 9 months ago

    Thank you Darshil. It was a very good tutorial on how to work with different tools.

  • @sukoi2113
    @sukoi2113 11 months ago

    Your video is very clear and convenient to understand and gain confidence. Thanks a lot.

  • @deepam.g4558
    @deepam.g4558 1 month ago

    Thank you so much. Great work. Please upload more project videos.

  • @rahulgite875
    @rahulgite875 11 months ago

    The explanation was on point and really good. I liked it.

  • @vinayadimane8839
    @vinayadimane8839 1 year ago

    Thank you very much, Darshil Sir. The way you explained the project is awesome. You made the explanation much simpler for us to understand. I loved it.

  • @ranjitprakash1986
    @ranjitprakash1986 1 year ago

    Exactly what I was looking for, thank you so much!

  • @sivakumarrajabather1140
    @sivakumarrajabather1140 8 months ago

    Great explanation and this helps me a lot.

  • @prachideokar7639
    @prachideokar7639 5 months ago +1

    Very informative and helpful video... Please make more videos related to Azure data engineering with different activities and pipelines.

  • @adilmajeed8439
    @adilmajeed8439 6 months ago

    Thanks for sharing. I'm taking the same basis from Part 1 and converting it to Microsoft Fabric.

  • @Kingshakazulu
    @Kingshakazulu 1 year ago

    Excellent tutorial, I will definitely get in touch for your other courses

  • @freelychanu2086
    @freelychanu2086 2 days ago

    Nice work❤

  • @fmwihler
    @fmwihler 1 year ago

    Thank you so much!! really well explained

  • @kanchanbhattarai2266
    @kanchanbhattarai2266 1 year ago +2

    If we want the query at 19:11 to give us the same result as the notebook, shouldn't the query be
    SELECT Discipline, (CAST(Female AS FLOAT)/CAST(Total AS FLOAT)) AS Fe_Average, (CAST(Male AS FLOAT)/CAST(Total AS FLOAT)) AS Ma_AVG FROM entriesGender;
    Right now you are just grouping by Discipline, and I think Discipline is already unique, which is why it returns the same result as SELECT * FROM entriesGender.
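The point about grouping on an already-unique column can be checked with a small sketch. It uses an in-memory SQLite table with two made-up rows (assumed values, not the real dataset) and compares the grouped `AVG` result with the plain column values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entriesgender "
    "(Discipline TEXT PRIMARY KEY, Female INTEGER, Male INTEGER, Total INTEGER)"
)
# Illustrative rows only (assumed values); Discipline is unique per row.
conn.executemany(
    "INSERT INTO entriesgender VALUES (?, ?, ?, ?)",
    [("Archery", 64, 64, 128), ("Boxing", 100, 200, 300)],
)

plain = conn.execute(
    "SELECT Discipline, Female, Male FROM entriesgender ORDER BY Discipline"
).fetchall()

grouped = conn.execute(
    "SELECT Discipline, AVG(Female), AVG(Male) "
    "FROM entriesgender GROUP BY Discipline ORDER BY Discipline"
).fetchall()

# Each group holds exactly one row, so AVG returns that row's own value.
print(plain == grouped)
```

Because every group contains a single row, the aggregate adds nothing; dividing by `Total`, as in the query above, is what produces a per-discipline share.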

  • @surajkumarjha5757
    @surajkumarjha5757 9 months ago

    Thank you for this project sir.

  • @Iceiam
    @Iceiam 3 months ago

    Brilliant tutorial. If we wanted to post this on our Git for portfolio purposes, would it be possible to keep all of this active, or would charges be incurred even though we aren't using the resources?

  • @narsimhadri
    @narsimhadri 1 year ago +2

    Quick question. In real-world projects, do companies use Synapse Analytics alone for both data processing and storage? If yes, in which scenarios would they use it, and wouldn't that be a challenge for testing?

    • @tejas4054
      @tejas4054 1 year ago

      He won't tell you that, bro; he is giving this away for free.

  • @saroshfaisalkhan3531
    @saroshfaisalkhan3531 3 months ago

    Hello Darshil, I have finally finished this section as well. I sincerely appreciate your efforts. I have a question for you now: will you offer certification upon completion of your course? I emailed you earlier, but I never heard back. I understand that you are very busy, but I really wanted your combo course.

  • @pukitkapoor
    @pukitkapoor 1 year ago

    Can you share how to connect to Synapse Analytics using a serverless SQL pool?

  • @Nikhillllllllllllll
    @Nikhillllllllllllll 5 months ago

    But we can only continue with the visualization part if we have a premium or professional BI account.

  • @sarthakhaldar9107
    @sarthakhaldar9107 10 months ago

    Thank you for this, Darshil. After ETL, if we don't want to do analytics and instead want to implement prediction with an ML library, what would be the best way to achieve that on the Azure platform?

  • @aryamathew2273
    @aryamathew2273 11 months ago

    How is there a new resource group at 0:25?

  • @akashmahapatra
    @akashmahapatra 1 year ago

    19:11
    -- Calculate the average number of entries by gender for each discipline
    SELECT Discipline, AVG(Female) AverageFemale, AVG(Male) AverageMale
    FROM entriesgender
    GROUP BY Discipline;

  • @thammanenisrinivasareddy1246

    Nice tutorial, do you have a similar one for GCP cloud?

  • @diegosalazar3245
    @diegosalazar3245 8 months ago

    In the real world, which service would you choose between Data Factory, Databricks and Synapse?

  • @hugenerretho9151
    @hugenerretho9151 1 year ago +1

    I got "trigger async thread was blocked" when I ran airflow standalone on Ubuntu. I did export pythonasyciodebug=1 but nothing is output. Any tips?

    • @shubhamgupta1632
      @shubhamgupta1632 1 year ago

      Check all dependencies and the versions of the environment and packages. Look for any upgrades. Check the log files too, to see if you can find any errors.

  • @bukunmiadebanjo9684
    @bukunmiadebanjo9684 1 year ago

    Thank you!
    One question, Darshil: is this project, together with the Coursera preparation course, enough for one to sit the Azure Data Engineer certification (Associate)?

  • @sanishthomas2858
    @sanishthomas2858 11 months ago

    Nice. Quick question: after creating Databricks and Synapse, how much would it cost if we keep them paused when not in use?
    Also, is there any pause option for Databricks?

    • @hariprasad3820
      @hariprasad3820 10 months ago

      If you keep the cluster in a terminated state, there won't be any charges in Databricks. Not sure whether that is what you are expecting.

  • @karandoke1134
    @karandoke1134 4 months ago

    good work bro

  • @vinayadimane8839
    @vinayadimane8839 1 year ago

    Sir, I request that you make more project videos on Azure technology.

  • @MiguelTorres-fp7jr
    @MiguelTorres-fp7jr 1 year ago

    Excellent video!

  • @Han-bk7wr
    @Han-bk7wr 4 months ago

    How do I access the PowerBI when it's only available for business users?

    • @srijanbansal6078
      @srijanbansal6078 3 months ago

      Create a new user and give it owner access. Then sign up with that account.

  • @sharans5771
    @sharans5771 4 months ago

    While creating the table in Synapse Studio I am getting: "Review and update the file format settings to allow file schema detection".
    How do I solve this?

    • @teamof2iith728
      @teamof2iith728 4 months ago

      I am getting the same error. Did you find out how to resolve it?

  • @ishansingh163
    @ishansingh163 1 year ago +1

    Sir, at the very end of Part 1, I am not able to create the 'transformed-data' folders for entriesgender and medals; it only creates folders for the other 3 files. :( Did I miss something or make a mistake?

    • @sumanthhabib8028
      @sumanthhabib8028 1 year ago

      can you share the code?

    • @ishansingh163
      @ishansingh163 1 year ago

      @sumanthhabib8028 I did exactly as instructed in the tutorial, and if things were wrong it would not have created 3 of the 5 folders. I don't have the code now, as I deleted the whole resource group after completion. Thanks for showing concern; I will try the whole thing one more time for better understanding and practice ☺️

  • @sachin-b8c4m
    @sachin-b8c4m 23 days ago

    thank you

  • @nawelfardeheb5657
    @nawelfardeheb5657 6 months ago

    Thaankkss!!!

  • @anujjhunjhunwala39
    @anujjhunjhunwala39 11 months ago +1

    Synapse Analytics does not work for student accounts.

    • @abdulrafey7439
      @abdulrafey7439 11 months ago +1

      same issue

    • @vemedia5850
      @vemedia5850 11 months ago

      Did you find an alternative?

    • @abdulrafey7439
      @abdulrafey7439 11 months ago

      @vemedia5850 Bro, you can go into your account details and add Synapse to the list of programs you can use in your free account. It will then show up.

  • @LearnAtHomewithGulshan
    @LearnAtHomewithGulshan 1 year ago

    Good one

  • @enkhjargaldavaadolgor5209
    @enkhjargaldavaadolgor5209 1 year ago +2

    Sir, at 8:45 of Part 2, it won't let me create the external table after I press Continue. It says "Failed to detect schema, Please review and update the file format settings to allow file schema detection". I had no problem in Part 1, where everything went smoothly. When I click Details it says "Failed to execute query. Error: Error encountered while parsing data: 'Invalid: Parquet magic bytes not found in footer. Either the file is corrupted or this is not a parquet file.'. Underlying data description: fil" I even googled it and there seems to be no solution. PLEASE HELP!!!

    • @fmwihler
      @fmwihler 1 year ago +3

      I solved it by changing the format to .parquet:
      athletes.repartition(1).write.mode("overwrite").option("header", "true").parquet("/mnt/tokyoolymic/transformed-data/athletes.parquet")
      coaches.repartition(1).write.mode("overwrite").option("header", "true").parquet("/mnt/tokyoolymic/transformed-data/coaches.parquet")
      entriesgender.repartition(1).write.mode("overwrite").option("header", "true").parquet("/mnt/tokyoolymic/transformed-data/entriesgender.parquet")
      medals.repartition(1).write.mode("overwrite").option("header", "true").parquet("/mnt/tokyoolymic/transformed-data/medals.parquet")
      teams.repartition(1).write.mode("overwrite").option("header", "true").parquet("/mnt/tokyoolymic/transformed-data/teams.parquet")
      Also modify the DB (or create a new one) so that it uses the Parquet format.
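The five near-identical write calls above could be folded into a loop. This is only a sketch under assumptions: `write_all_as_parquet` is a hypothetical helper name, and it expects the DataFrames from the Part 1 notebook (`athletes`, `coaches`, and so on) to be passed in by name. It drops `.option("header", "true")`, since `header` is a CSV option and Parquet files carry their own schema.

```python
def write_all_as_parquet(frames, base_path):
    """Write each DataFrame to <base_path>/<name>.parquet; return the paths.

    `frames` maps a dataset name to a Spark DataFrame (or any object that
    exposes the same repartition/write/mode/parquet call chain).
    """
    written = []
    for name, df in frames.items():
        path = f"{base_path.rstrip('/')}/{name}.parquet"
        # Same call chain as in the comment above, minus the CSV-only option.
        df.repartition(1).write.mode("overwrite").parquet(path)
        written.append(path)
    return written


# Intended use inside the Databricks notebook (not executed here, because
# the DataFrames and the mount point only exist in that workspace):
#
# write_all_as_parquet(
#     {
#         "athletes": athletes,
#         "coaches": coaches,
#         "entriesgender": entriesgender,
#         "medals": medals,
#         "teams": teams,
#     },
#     "/mnt/tokyoolymic/transformed-data",
# )
```

As the comment above notes, the external table (or its file format setting) in Synapse then has to point at Parquet as well.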

    • @minhthanhle1223
      @minhthanhle1223 1 year ago

      @fmwihler many thanks, bro

    • @sanjithaamarathunga9949
      @sanjithaamarathunga9949 1 year ago

      I had the same issue. Switching to .parquet solved it. Thank you @fmwihler

  • @jackmaguina8480
    @jackmaguina8480 1 year ago

    One question: where do you do the data modeling, I mean the star schema? Can I do it in Azure? And can I also define the measures in Azure and use Power BI only as a visualizer?

    • @mithunshet5922
      @mithunshet5922 1 year ago +1

      You can do data modeling in PowerBI too.

  • @fatihyalcin3713
    @fatihyalcin3713 1 year ago

    I have an issue while publishing the data: InternalServerError executing request:

  • @mastershinzo3170
    @mastershinzo3170 1 year ago

    It would have been great to get a tutorial on incremental loading...

  • @mohitxagg
    @mohitxagg 12 days ago

    While creating any Azure resource, a new user might get an error saying "Resource not registered". For example, while creating Synapse I got: "The Azure Synapse resource provider (Microsoft.Synapse) needs to be registered with the selected subscription."
    To fix this you can follow this method - ua-cam.com/video/fvdCWbadIko/v-deo.htmlsi=EXgMD2Eq_RKJYmkV
    Hope it helps!

  • @Iceiam
    @Iceiam 3 months ago

    Has anyone managed to successfully complete the project, and if so, in how much time?

    • @srijanbansal6078
      @srijanbansal6078 3 months ago

      2 hrs

    • @Iceiam
      @Iceiam 3 months ago

      @srijanbansal6078 Thanks! Did you upload it anywhere? How easy is it to make it part of your portfolio via Git etc.?

  • @tokunbochimaobim6582
    @tokunbochimaobim6582 1 year ago

    Encountered this error: "Failed to detect schema
    Please review and update the file format settings to allow file schema detection" ....

    • @tokunbochimaobim6582
      @tokunbochimaobim6582 1 year ago

      Failed to execute query. Error: Error encountered while parsing data: 'Invalid: Parquet magic bytes not found in footer. Either the file is corrupted or this is not a parquet file.'. Underlying data description: file 'tokyoolympicdatasnug.dfs.core.windows.net/tokyo-olympic-data/transformed-data/athletes/part-00000-tid-446593903816833556-9250b5ec-d2a1-4ff8-a184-6e2e4276fe0e-17-1-c000.csv'.
      The batch could not be analyzed because of compile errors.
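The error above names a part file ending in .csv while the reader expected Parquet. A Parquet file begins and ends with the 4-byte marker `PAR1`, which is what "magic bytes not found in footer" refers to. A minimal local check, assuming you have downloaded one of the part files, might look like this; `looks_like_parquet` is a hypothetical helper name, and the demo files are stand-ins created on the fly rather than real project data.

```python
import os
import tempfile

PARQUET_MAGIC = b"PAR1"  # marker at the start and end of every Parquet file


def looks_like_parquet(path):
    """Return True if the file starts and ends with the Parquet magic bytes."""
    if os.path.getsize(path) < 2 * len(PARQUET_MAGIC):
        return False
    with open(path, "rb") as handle:
        head = handle.read(len(PARQUET_MAGIC))
        handle.seek(-len(PARQUET_MAGIC), os.SEEK_END)
        tail = handle.read(len(PARQUET_MAGIC))
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a CSV part file written by the Part 1 notebook.
    csv_path = os.path.join(tmp, "part-00000.csv")
    with open(csv_path, "w", encoding="utf-8") as handle:
        handle.write("Discipline,Female,Male,Total\nArchery,64,64,128\n")

    # Not a valid Parquet file: it only mimics the leading/trailing markers.
    fake_path = os.path.join(tmp, "fake.parquet")
    with open(fake_path, "wb") as handle:
        handle.write(PARQUET_MAGIC + b"\x00" * 16 + PARQUET_MAGIC)

    csv_result = looks_like_parquet(csv_path)
    fake_result = looks_like_parquet(fake_path)

print(csv_result, fake_result)
```

The check only inspects the markers; it says nothing about whether the rest of the file is valid. It is enough to tell a CSV part file apart from a Parquet one when the external table's file format and the data on disk disagree.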

    • @pspc890121
      @pspc890121 1 year ago

      I encountered the same issue...

    • @tokunbochimaobim6582
      @tokunbochimaobim6582 1 year ago

      @pspc890121 Please, how did you go about it?

    • @bukunmiadebanjo9684
      @bukunmiadebanjo9684 1 year ago

      Facing a similar challenge whenever I attempt to create tables.

    • @bukunmiadebanjo9684
      @bukunmiadebanjo9684 1 year ago +1

      As an alternative, I went back to Azure Databricks and wrote my transformed datasets as Parquet instead of CSV, and everything worked out fine. Try this approach instead.

  • @agusherbozo19
    @agusherbozo19 5 months ago

    capo

  • @syedmuhammadraqimalishah5193
    @syedmuhammadraqimalishah5193 1 year ago +6

    Where is Dashboard?😠

    • @geekyprogrammer4831
      @geekyprogrammer4831 1 year ago +5

      This is a free course. Don't use that expression.

    • @karanswamygowda
      @karanswamygowda 1 year ago +5

      Show some respect. He is doing all this for free

    • @jainpratham8217
      @jainpratham8217 1 year ago +1

      How do we create the dashboard? Do you have any idea?

    • @zgeorgem
      @zgeorgem 3 months ago

      Watch the whole video. He mentions it at 14:50.

  • @freetrainingvideos
    @freetrainingvideos 11 months ago

    Excellent!