20. Runtime Architecture of Spark in Databricks

  • Published 27 Aug 2024

COMMENTS • 22

  • @ganeshshinde4905 · 5 hours ago

    Very nice explanation

  • @Dinesh-g1o · 1 month ago

    Great explanation, thank you

  • @shrutikansal9831 · 5 months ago

    You are doing amazing work; I really appreciate your teaching skills and knowledge. Keep it up.

  • @ramaraju3273 · 2 years ago

    Very informative video, please continue to upload more videos on Databricks... Thank you.

  • @ashutoshdeshpande3525 · 1 year ago

    Very nice explanation. I was getting confused between the stages and tasks part, but it's cleared up now. Thanks for this 😊.

  • @Prashanth-yj6qx · 7 months ago

    Your teaching skills are amazing.

  • @Ravi_Teja_Padala_tAlKs · 10 months ago

    Super, after a lot of confusion. Thanks 😊

  • @MohammedKhan-np7dn · 2 years ago

    Nice explanation Bhawana. Thank you!!

  • @Dinesh-g1o · 1 month ago

    I have one question: I am using AWS EMR, and in that cluster one worker node can have more than one executor. In Databricks, is it a hard rule that one worker node = one executor?
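A note on the sizing behind this question: on EMR or standalone/YARN Spark, the number of executors that fit on one worker falls out of the cores and memory you give each executor, whereas Databricks sizes a single executor to take the whole worker. A minimal pure-Python sketch of that standalone-style arithmetic (the node and executor sizes below are illustrative assumptions, not recommendations):

```python
def executors_per_node(node_cores: int, node_mem_gb: float,
                       executor_cores: int, executor_mem_gb: float) -> int:
    """How many executors fit on one worker under standalone/YARN-style
    sizing: bounded by both the core budget and the memory budget."""
    by_cores = node_cores // executor_cores
    by_memory = int(node_mem_gb // executor_mem_gb)
    return min(by_cores, by_memory)

# A 16-core / 64 GB worker with 4-core / 14 GB executors fits 4 executors;
# Databricks would instead run one executor sized to the whole worker.
print(executors_per_node(16, 64, 4, 14))  # 4
```

The same worker at half size (8 cores / 32 GB) would fit 2 such executors, which is why the executors-per-node count on EMR changes as you tune `spark.executor.cores` and `spark.executor.memory`.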

  • @lifewithtarun · 1 year ago

    Thanks ma'am for explaining.

  • @ayushsrivastava6494 · 2 months ago

    Say I have a heavy Parquet file lying in S3 and I want to bring that file (COPY INTO command) into Databricks as a Delta table. What would be the ideal worker and driver type in that case, if I have no transformations at all while moving the data but the dataset is very huge?

  • @deepikasoni7423 · 2 years ago

    Thanks a lot... Very well explained. Please upload videos on optimization techniques in Databricks.

    • @cloudfitness · 2 years ago +1

      ua-cam.com/video/a2ehHq3DJrw/v-deo.html
      here is the link that might help you

  • @nagabadsha · 7 months ago

    Well explained. Thanks

  • @deepjyotimitra1340 · 2 years ago

    Very nice explanation

  • @nagamanickam6604 · 10 months ago

    Great

  • @shivamjha9720 · 2 years ago

    Very in-depth explanation. Keep up the good work. But I have one doubt: where are we defining the partitions? No. of tasks = no. of partitions, so where does the number of partitions come from? Are we defining it somewhere?

    • @cloudfitness · 2 years ago +6

      We can define the number of partitions in code and then choose the cluster configuration as per the number of partitions set up in code (other factors are also taken into consideration while choosing the cluster). If you do not specify partitions in code, Spark in Databricks will create partitions for you by default; usually that's 200 partitions of around 128 MB each.

    • @shivamjha9720 · 2 years ago

      @cloudfitness Thanks a ton!! You've gained a new subscriber. Please upload more videos pertaining to Databricks, PySpark, and SQL; it would be helpful.
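The defaults quoted in the reply above (200 shuffle partitions, roughly 128 MB per input split) can be turned into back-of-the-envelope task counts, since tasks = partitions. A minimal pure-Python sketch, assuming the stock Spark defaults `spark.sql.shuffle.partitions = 200` and `spark.sql.files.maxPartitionBytes = 128 MB` (both are configurable per job):

```python
import math

# Stock Spark defaults (assumptions; override them per job as needed):
SHUFFLE_PARTITIONS = 200                  # spark.sql.shuffle.partitions
MAX_PARTITION_BYTES = 128 * 1024 * 1024   # spark.sql.files.maxPartitionBytes

def input_tasks(file_size_bytes: int) -> int:
    """Rough number of read tasks: one task per ~128 MB input split."""
    return max(1, math.ceil(file_size_bytes / MAX_PARTITION_BYTES))

def shuffle_tasks() -> int:
    """Tasks in a post-shuffle stage default to spark.sql.shuffle.partitions."""
    return SHUFFLE_PARTITIONS

# A 1 GiB file is read in ~8 tasks; any shuffle stage then runs 200 tasks.
print(input_tasks(1024**3), shuffle_tasks())  # 8 200
```

Calling `df.repartition(n)` in code overrides these estimates, which is the "define number of partitions in code" option the reply mentions.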

  • @abhishek310195 · 1 year ago

    What happens if...
    1. In a Databricks cluster, a worker node goes down: what happens to the data which resides on that worker node?
    2. In continuation of the above scenario, if Databricks spins up a new worker node, what happens if a select query goes to that new node, which doesn't have the data (as it was newly added in place of the node which went down and previously held the data)? Will this cause data inconsistency?

    • @billcates4048 · 1 year ago

      We use the metastore for that purpose; it contains all the information about how the data is stored, like which partition is on which node. So if a node fails, the data is recovered automatically, as it would be replicated across nodes.

  • @nagendraprasadreddy353 · 2 years ago

    Super