Master Databricks and Apache Spark Step by Step: Lesson 1 - Introduction

  • Published Jan 1, 2025

COMMENTS • 102

  • @dhwanik02
    @dhwanik02 1 year ago +16

    This is one of the best and clearest explanations about Spark and Databricks on the internet.

  • @faisala1037
    @faisala1037 2 years ago +110

    There are a ton of videos (one for every keyword) on YouTube on this subject. Most fail to deliver any useful knowledge; others are too narrow and/or incomprehensible. I'm so glad to have found this series. Your teaching style took me back to my college classes. Fairly detailed and well explained. So a big thanks to you for it, Bryan 👍.

    • @BryanCafferky
      @BryanCafferky 2 years ago +5

      Thanks, Faisal. If you follow the entire series, you will get a solid foundation.

    • @mansah707
      @mansah707 5 months ago

      @BryanCafferky I intend to go through this whole series. Lessons 0 and 1 completed; on to lesson 2.

  • @arturrizzato1034
    @arturrizzato1034 8 months ago +1

    A very good class, especially for a Databricks virgin like me.

  • @mandarkulkarni7675
    @mandarkulkarni7675 1 year ago +2

    Probably the first video that describes the difference between Spark and Databricks so cleanly, and also where the different components of Spark sit in the whole data engineering ecosystem. Thanks a lot!

  • @MarkFreedmanNY
    @MarkFreedmanNY 11 months ago +2

    Finally, a Databricks YouTube series that makes sense! I'm using DB with AWS, but this all pertains. Thanks!

  • @bibinkunjumon
    @bibinkunjumon 1 year ago +1

    You are my third teacher. You explained everything well, as someone with real experience. At first I wondered what this old man was going to say; now I end up touching your feet. Well done.
    Bibin from Kerala, India

  • @kingawewome3900
    @kingawewome3900 9 days ago

    Love the accent. As a New Englander living abroad, it made me homesick! This intro video is wicked awesome.

  • @boubeniamohamed236
    @boubeniamohamed236 1 year ago +2

    Definitely the best series for learning Databricks.

  • @CodeVeda
    @CodeVeda 10 hours ago

    finally someone is talking clearly

  • @AnandGhosh-m4d
    @AnandGhosh-m4d 9 months ago +1

    Spot on! I really liked how you moved from the broader umbrella of Hadoop to Spark to Databricks. Great job, Bryan!

  • @nathanielsackey2083
    @nathanielsackey2083 2 months ago +1

    Man, I'm supporting this guy on Patreon. What a class, what a breakdown. You hear about all these tools, and 9 times out of 10 I'm drowning in them.

  • @naomilago
    @naomilago 2 years ago +6

    O M G
    I found what I was looking for.
    I've started working at Nestlé as a Data Science Analyst and have been searching for a good Databricks and Spark playlist to get a deeper understanding of this subject. Yours is the one that matches the way I learn and like my lectures. A huge thanks to you 🌟

    • @BryanCafferky
      @BryanCafferky 2 years ago +1

      Thanks so much! It is really great to hear feedback like that! Glad it helps you.

  • @mansah707
    @mansah707 5 months ago

    I have never seen such a straightforward, clear, concise explanation of this concept. To date, I have tried to understand Apache Spark and Databricks, but I've always had a convoluted understanding of them. Thank you so much for this video. It really helped me understand where things stand now.

    • @BryanCafferky
      @BryanCafferky 5 months ago +1

      Thanks. Glad the videos are helpful.

  • @sudhamadhurikandru2708
    @sudhamadhurikandru2708 2 months ago

    I don't know how I did not see your channel earlier. I am now hooked on these videos; please make more and more. I like listening to your tutorials and making notes.

  • @rmj5410
    @rmj5410 3 months ago

    Absolutely the best explanation of Databricks I've ever heard

  • @abhinavkashyapv
    @abhinavkashyapv 2 years ago +7

    This video clearly explains the concepts around Apache Spark, Databricks, and the various offerings. Wonderful explanation, thanks a ton 👏👍

  • @Hamromerochannel
    @Hamromerochannel 7 months ago

    I tried the Databricks Academy and got lost. Thanks to this channel, I understand every nook and cranny. Thumbs up, Bryan!!

    • @BryanCafferky
      @BryanCafferky 6 months ago

      Thank you! Glad my videos are helping you.

  • @dataoil8416
    @dataoil8416 2 years ago

    Exactly what I was looking for!!! Your best teacher is your last mistake! Proved!

  • @ash2ucool
    @ash2ucool 1 year ago

    Thank you, thank you, thank you for explaining it in the simplest way possible. At last I was able to understand what Hadoop, Spark, and Databricks are, and what they actually do.

    • @BryanCafferky
      @BryanCafferky 1 year ago

      So glad to hear that. It's why I do this channel. Thanks

  • @samirks27
    @samirks27 5 months ago

    Thanks, Bryan, for the wonderful video. You kept me engaged and attentive throughout. Your explanation is crystal clear and one of the best on the internet. Thanks, and God bless you with health and energy.

  • @danielejiofor7032
    @danielejiofor7032 4 months ago +1

    Best DB tutorial out there!!!

  • @alexandermedina4950
    @alexandermedina4950 2 years ago +1

    Great content. Thank you for giving this general and historical view; sometimes it is necessary in order to understand the details.

  • @amataratsu006-xs6hv
    @amataratsu006-xs6hv 9 months ago

    Sir thank you so much! You match my learning style and you have a clear voice

    • @BryanCafferky
      @BryanCafferky 9 months ago

      Thanks. Glad the videos are helpful!

  • @animeshmohanty5052
    @animeshmohanty5052 2 years ago

    You are awesome! There's hardly any other material which is as clear and condensed. Thank you for creating this video🙏

  • @brenthackers132
    @brenthackers132 1 year ago

    Guy has two left sides and still manages to make sense. Inspiring. :)

  • @alokhom
    @alokhom 6 months ago

    Your video has cleared things up for me a lot. Now I am going to set up HDFS and the Spark operator on my k8s cluster.

  • @datoalavista581
    @datoalavista581 2 years ago +2

    Thank you Professor Bryan !

  • @samanthamccarthy9765
    @samanthamccarthy9765 1 year ago

    Thanks, a really good summary of all these languages and how they came about.

  • @voliteon
    @voliteon 1 year ago

    Thanks for your videos Bryan - nice work. Really good amount of information clearly explained.

  • @KhalilJolibois
    @KhalilJolibois 3 years ago +1

    Thanks for these videos. I'm finishing up the DataCamp data engineer track and then jumping in on these.

  • @hemalpbhatt
    @hemalpbhatt 4 months ago

    Love your explanation! It is so easy to understand

  • @stefantodorovikj6165
    @stefantodorovikj6165 3 months ago

    Thank you brother you are simply amazing

  • @Mickley0
    @Mickley0 1 month ago

    Fantastic video, thank you Bryan

  • @anandchandrashekhar2933
    @anandchandrashekhar2933 2 years ago

    Great start to the series. Thank you!

  • @andreaceribelli9705
    @andreaceribelli9705 6 months ago

    Incredible quality, thanks!

  • @jamesschoi87
    @jamesschoi87 1 year ago +1

    28:10 You couldn't install external libraries with open source Spark?

    • @BryanCafferky
      @BryanCafferky 1 year ago +1

      You can, but in Databricks you can define libraries for a cluster and Databricks will automatically reinstall them every time the cluster starts. You can even define libraries you want installed on every cluster if you like. Spark itself does not support stopping and starting a cluster. You have to delete and re-create clusters if you want to stop paying for them, and each time you create a cluster, you have to do some work to install the libraries you want.

  • @MeridiusMaximus
    @MeridiusMaximus 2 years ago +2

    such a clean explanation. Thank you!

  • @revidenver5142
    @revidenver5142 1 year ago

    The Best explanation, thank you

  • @ishaqkhan8653
    @ishaqkhan8653 7 months ago +1

    Hey Bryan, thank you for the excellent video. It put my mind at ease. I see that you use Azure Databricks going forward. However, my organization stores data on S3 and works predominantly in the Databricks platform itself. I was wondering whether the knowledge you have shared will carry over to the Databricks platform directly. I am a complete beginner in this field, so apologies for any silly questions.

    • @BryanCafferky
      @BryanCafferky 7 months ago

      Hi Ishaq,
      Databricks is a complete, self-contained service available on AWS, Azure, and GCP. It should work the same on all three, with the only differences being how it integrates with cloud-specific back-end services like S3. Also, Azure integrates Databricks in a way that eliminates the need for the customer to have separate agreements with both Databricks and Microsoft. It appears as if it were an Azure service. I think AWS requires customers to license with both Databricks and AWS when they set it up. So yes, overall, all the Databricks and Spark code and services should be the same on all 3 cloud platforms. Make sense?

  • @carlosramirez-pf1zq
    @carlosramirez-pf1zq 1 year ago

    Thank you for your explanation of what Spark is. These technologies are confusing at first sight for someone who has never used them.

  • @BillusTinnus
    @BillusTinnus 2 years ago +1

    Fantastic video! Really well done, thank you

    • @BryanCafferky
      @BryanCafferky 2 years ago

      Thank you! Glad they help.

    • @mehmetkaya4330
      @mehmetkaya4330 2 years ago

      I would double that! So concise yet comprehensive overview! Thank you so much!

    • @BryanCafferky
      @BryanCafferky 2 years ago

      @mehmetkaya4330 Thanks!

  • @srajv01
    @srajv01 5 months ago

    Clingon !! That's when I subscribed 😅

  • @sehaj778
    @sehaj778 3 years ago +2

    Hi Bryan, I'm currently learning data science on GCP as a beginner. I'm just scratching the surface of the GCP tools and platform. I wanted to learn Spark and that is why I'm here. Would learning Spark and Databricks on the Microsoft Azure platform be the right idea at this time, given I'm focusing on GCP? Thanks for making this course though. I see so much content here and I'm still on the first video!

    • @BryanCafferky
      @BryanCafferky 3 years ago +3

      Databricks is a service owned by the company Databricks that is available on AWS, Azure, and GCP. It should be the same on any of these platforms, with the only differences being how cloud-specific resources are called or integrated, e.g., Azure Synapse vs. Google's BigQuery. You should be fine using Databricks on GCP, but let me know if you find significant differences.
      Make sense?

  • @G47_Code
    @G47_Code 2 years ago

    Thank you so much, Bryan, for the wonderful content!!!

  • @anmolchoudhary3982
    @anmolchoudhary3982 3 years ago +1

    Oh man, such detailed and superbly structured content. I wish I could take you out for beers sometime :)

    • @BryanCafferky
      @BryanCafferky 3 years ago

      Thanks. I appreciate the kind words. It's great to know my work is helpful.

  • @Navinneroth
    @Navinneroth 2 years ago

    Brilliant analogy, sir. The phone books example for distributed compute is too good.

  • @davidk7212
    @davidk7212 6 months ago

    Zank you sir for zis tutorial. It is most very velcome.

  • @bananaboydan3642
    @bananaboydan3642 1 year ago

    This is an amazing video

  • @scxry5597
    @scxry5597 9 months ago

    Thank you so much for your videos. I have been looking for this.

  • @lucassaito1791
    @lucassaito1791 2 years ago

    Outstanding content!

  • @artus198
    @artus198 1 year ago +1

    In general, what I notice is that, compared to the past, they are over-complicating everything. That whole Azure thing especially is unnecessarily complex. At least on-premise was never this much work!

    • @BryanCafferky
      @BryanCafferky 1 year ago +1

      No. I disagree there. In fact, the point is that cloud-based Databricks is tons easier to use and provides much better tools than using open source Spark on-prem. Not sure what you are looking at. Thanks for your comment.

    • @artus198
      @artus198 1 year ago

      @BryanCafferky E.g., in Databricks, if I want to access DBFS files in another resource group, I have to create a "scope", get access to a vault secret, use the scope to mount that DBFS in my workspace Hive metastore, write a script to mount, write a script to create a temp view, and then read the data from that Delta table.
      In SQL Server, I can share a connection string with a user/password with somebody else, and they can connect to the database from SQL Server Management Studio, enter the details, and run as many queries as they want on that database, joining multiple tables, etc.

  • @IvanSedov-i7f
    @IvanSedov-i7f 1 year ago

    Thank you very much, it was very interesting and helpful

  • @JCArtuso
    @JCArtuso 3 years ago

    Great! Let's go!

  • @rydmerlin
    @rydmerlin 2 years ago

    Is your book available in epub format?

  • @gustavonavesdesouza759
    @gustavonavesdesouza759 9 months ago

    Thanks for that

  • @ThEHaCkeR1529
    @ThEHaCkeR1529 1 year ago

    Thanks a lot!

  • @GOONER_FOREVER1989
    @GOONER_FOREVER1989 3 years ago

    How do I drop data that was cached into local storage using the Delta cache? I couldn't find a proper command.

    • @BryanCafferky
      @BryanCafferky 3 years ago

      That's a bit beyond the content of this video.

  • @erkansirin6849
    @erkansirin6849 2 years ago

    Where's Kubernetes as cluster manager?

  • @jay_wright_thats_right
    @jay_wright_thats_right 2 months ago

    I wanted to know why we need to know this. I just felt like I was going through the motions while watching this.

    • @BryanCafferky
      @BryanCafferky 2 months ago +1

      It's more than just coding. You need the background and concepts to be effective. It's a long video series, and if you skip the foundation, you will never gain mastery.

    • @stigmartinsen3359
      @stigmartinsen3359 1 month ago

      @BryanCafferky As someone who's been researching the Apache ecosystem for the last month, trying to make sense of what's what with so much overlapping functionality, I greatly appreciate this video. Thank you for the thorough explanation. I look forward to watching and learning from the rest of the videos in this playlist about Spark and Databricks.
      With that said, since some of these videos are a bit old, would you say any of the information in them is outdated?

    • @BryanCafferky
      @BryanCafferky 1 month ago

      @stigmartinsen3359 The Databricks UI has changed a lot, but the functionality has stayed the same. New functionality has been added, such as Delta Lake, Unity Catalog, and Photon. See this video for an update on these: ua-cam.com/video/9YJby_COOdc/v-deo.html

  • @youssefloukili1785
    @youssefloukili1785 2 years ago

    thanks

  • @sivachagaleti6614
    @sivachagaleti6614 2 years ago

    Awesome

  • @rohitchakravarthi94
    @rohitchakravarthi94 1 year ago +1

    In real life this is something called "I stumbled and found a gold mine"!

  • @shomero8334
    @shomero8334 1 year ago

    Thank you, man! I was lost at first, I needed your Tutorial so so so so much!!

    • @BryanCafferky
      @BryanCafferky 1 year ago

      Glad it helped! I understand. It is a lot to learn.