23. Databricks | Spark | Cache vs Persist | Interview Question | Performance Tuning

  • Published 11 Dec 2024

COMMENTS • 60

  • @omprakashreddy4230 3 years ago +16

    Only a few people have the ability to teach in a way that even a novice can understand. Hats off to you.
    Keep going!!!

  • @rockykefunday2707 2 years ago +4

    you are the real raja bro , super

  • @gulsahtanay2341 9 months ago +2

    Thank you for sharing your knowledge with us!

  • @stepup2me1 2 years ago +5

    You have a very good way of explaining the concepts. Thank you!

  • @joyo2122 2 years ago +3

    your videos are the best

  • @kamalbhallachd 3 years ago +4

    Good 👍

  • @abinaya7704 2 years ago +4

    Your videos are making wonders!!

  • @tanushreenagar3116 9 months ago +2

    Nice content sir

  • @rahulpandit9082 2 years ago +4

    I found many videos on YouTube regarding Cache and Persist, but nobody explains it the way you did...

  • @pavithraeshwar8881 3 months ago +1

    This is the explanation! Thank you for sharing the knowledge, sir 👏

  • @turanfair9364 2 years ago +2

    Best teacher!!! Thank you sir 🙏🏻

  • @iamkiri_ 1 year ago +1

    Raja, I really appreciate your explanation :)

  • @vutv5742 7 months ago +1

    Great explanation 🎉

  • @coolraviraj24 4 months ago +1

    You explained it so simply...
    I hope I will be able to explain it to the interviewer the same way you did 😅

  • @kamalbhallachd 3 years ago +1

    Knowledge session

  • @justvenkyy...3423 1 year ago +3

    This is too good. Please keep going. Can you post a video on handling the small-files problem with Spark?

  • @shakthimaan007 4 months ago +1

    But where and how do we define these? Can you please add a short demo?

  • @ranjithajit4717 1 year ago +2

    Can you add the examples for creating persist in the description?

  • @pankajchikhalwale8769 8 months ago +1

    I guess you have at least an M.Tech. + M.Ed. degrees.
    Expert in Spark and Amazing Teacher.
    Sir, Tussi Grett Ho !

    • @rajasdataengineering7585 8 months ago +1

      Thank you Pankaj! Hope you like the tutorial

    • @pankajchikhalwale8769 8 months ago +1

      @rajasdataengineering7585 So far I have watched 9 of the 22 videos in the "Databricks Performance Optimization" playlist. It is very detailed. I like it.

    • @rajasdataengineering7585 8 months ago

      Glad you like it!

  • @sravanthiyethapu9970 2 years ago +2

    Hi Raja, you said that persist will use both memory and disk. Does memory here mean both on-heap and off-heap memory?

    • @rajasdataengineering7585 2 years ago +3

      By default, it is cached in on-heap memory. But if off-heap memory is enabled and JVM (on-heap) memory is full, off-heap memory would be used for caching the remaining partitions.

  • @swathi6472 4 months ago +1

    Please make Video on Salting in Performance optimization

  • @premsaikarampudi3944 1 year ago +2

    Hi, I was asked to prepare for Spark for my next role in the same company I am working, Is this learning series enough ?

  • @sanjayr3597 1 year ago

    Very good playlist. Could you please provide a practical example? I was watching some videos on this, and I noticed that when we call df.cache(), the default storage level is shown as MEMORY_AND_DISK_SER. There was never just MEMORY_AND_DISK; it was always serialized. I'd like to know the reason for this.

  • @vlogsofsiriii 7 months ago +1

    Hi Raja. I have one doubt.
    Cache: stores the data in memory. Does that mean on-heap memory?
    Persist: stores the data in both on-heap and off-heap memory?
    Is that correct?

    • @rajasdataengineering7585 7 months ago +1

      Yes, that's correct. Cache always stores in memory, but persist has the flexibility of memory or disk.

    • @vlogsofsiriii 7 months ago

      @rajasdataengineering7585 So memory here means on-heap, right, and disk means off-heap?

    • @rajasdataengineering7585 7 months ago +1

      No, on-heap and off-heap are both memory; disk is different. I have already posted a video on on-heap vs off-heap. Please watch that video.

    • @vlogsofsiriii 7 months ago

      @rajasdataengineering7585 Thank you 😊

  • @RamaiahChenna 4 months ago

    Hi Sir, we would like a video on the performance issues that come up while developing notebooks, and their solutions.

  • @aayushdesai532 2 years ago +1

    Great video, sir! One question: is disk the same as off-heap memory?

    • @rajasdataengineering7585 2 years ago +2

      No, off-heap and disk are different. Off-heap memory is part of RAM. On-heap memory is controlled by the JVM, while off-heap memory is controlled by the OS itself.

  • @suresh.suthar.24 1 year ago

    Best explanation. But I have one question: is cache() a transformation or an action?

    • @rajasdataengineering7585 1 year ago +1

      Cache is an action

    • @tunyestark2633 8 months ago

      @rajasdataengineering7585 No, cache is not an action. It is a transformation; please do try it out.

  • @Uda_dunga 1 year ago +1

    Try to make videos under 10 mins sir

  • @MrPerikala 1 year ago +1

    How can we avoid duplicate rows while joining large datasets?