Thanks for the informative video... I have one scenario, could you please go through it and provide your suggestions if any? 1. The application fetches configuration from multiple configuration databases, and the actual data is fetched from big data storage based on that configuration. But the configuration differs for every user and their respective roles... it is like an "access level", something dynamic. Here we want to reduce network calls... We can think of tag-based distributed caching, but at some level we need a cache we can also run queries against.
One approach I use for consistency is lazy updates. On a DB write, instead of pushing the data back to the caches (which may never get read if a second update comes in), the DB writes the ID to invalidate to a message queue that all caches subscribe to. Then you can implement query-then-cache-on-miss semantics. This way load throughout the system is reduced, with some double queries occurring if the cache was cleared after a good query due to latency. (This can be eliminated by using versioning: use the current timestamp in milliseconds at the time of write and broadcast it, so that a cache only clears itself if the cached version differs from the broadcasted version.)
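The versioned-invalidation idea described in the comment above can be sketched roughly like this. This is a minimal single-process illustration, not a real message-queue consumer; the names (`CacheNode`, `on_invalidate`) are illustrative, not from any particular library.

```python
import time

class CacheNode:
    """One cache node that subscribes to (key, version) invalidation broadcasts."""

    def __init__(self):
        self.entries = {}  # key -> (value, version)

    def put(self, key, value, version=None):
        # Record the write's version (here a timestamp) alongside the value.
        self.entries[key] = (value, version if version is not None else time.time())

    def get(self, key):
        entry = self.entries.get(key)
        return entry[0] if entry else None  # miss -> caller queries the DB

    def on_invalidate(self, key, version):
        # Clear the entry only if the broadcast refers to a newer write
        # than the one we cached; a stale or reordered message is ignored.
        entry = self.entries.get(key)
        if entry and entry[1] < version:
            del self.entries[key]
```

With this guard, a delayed invalidation message that arrives after the cache already holds a fresher value does not wrongly evict it.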
Awesome overview thanks. One other possible issue with write-through - it's possible to make the update to the cache then the DB update itself fails. Now your cache and db will be inconsistent.
Gaurav nice video. One comment. Writeback cache refers to writing to cache first and then the update gets propagated to db asynchronously from cache. What you're describing as writeback is actually write-through, since in write through, order of writing (to db or cache first) doesn't matter.
Ah, thanks for the clarification!
Yes, would be great if you can add a comment saying correction about the 'Write back cache'. Thanks for the great video!
I agree.. a comment in the video correcting this would be good update to this.
So Gaurav was also wrong in saying "write-back" is a good policy for distributed systems?
@Gaurav Yes that would be great. That part was confusing, had to read about that separately.
Write-through: data is written in cache & DB; I/O completion is confirmed only when data is written in both places
Write-around: data is written in DB only; I/O completion is confirmed when data is written in DB
Write-back: data is written in cache first; I/O completion is confirmed when data is written in cache; data is written to DB asynchronously (background job) and does not block the request from being processed
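The three policies defined above can be sketched in a few lines, using plain dicts as hypothetical stand-ins for the cache and the database. In a real system the write-back persistence would run asynchronously (a background job); here it is exposed as an explicit `flush()` so the behavior is easy to follow.

```python
class Store:
    def __init__(self):
        self.cache = {}
        self.db = {}
        self.dirty = set()  # keys written to cache but not yet persisted

    def write_through(self, key, value):
        # Write to cache AND DB; confirm only once both have the data.
        self.cache[key] = value
        self.db[key] = value

    def write_around(self, key, value):
        # Write to DB only; drop any stale cache entry so the next
        # read repopulates it from the DB.
        self.db[key] = value
        self.cache.pop(key, None)

    def write_back(self, key, value):
        # Write to cache only and confirm immediately; the DB is
        # updated later by the background flush.
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        # Background job: persist dirty entries in bulk.
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()
```

Note the write-back risk the thread keeps raising: between `write_back()` and `flush()`, the cache holds the only copy of the data.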
Other variants
1. There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.
2. There are only two hard problems in distributed systems: 2. Exactly-once delivery 1. Guaranteed order of messages 2. Exactly-once delivery
Hahahaha!
@@gkcs A humble suggestion, I think you should have a sub-reddit for the channel, because these are such critical topics [not just for cracking interviews], I'm sure they'd definitely encourage healthy discussions. I think YT's comment system is not really ideal to have/track conversations with fellow channel members.
This is an underrated comment .... 😂😂😂
@@gkcs Can you please give some hints on WHY "out-of-order delivery" is a problem in distributed systems, if the application is running on TCP? Please kindly reply.
@goutham Kolluru, can you please give a hint on WHY "out-of-order delivery" is a problem in distributed systems, if the application is running on TCP? Please kindly reply.
I can already hear the interviewer asking, "with the hybrid solution, what happens when the cache node dies before it flushes to the concrete storage?" You said you'd avoid using that strategy for sensitive writes, but you'd still stand to lose up to the size of the buffer you defined on the cache in the event of failure. You'd have to factor that risk into your trade-off. Great video, as always. Thank you!
Hi Gaurav, I really like your videos, thank you for sharing! I need to point out something about this video. Writing directly to the DB and updating the cache after is called write-around, not write-back. The last option you have provided, writing to cache and updating the DB after a while if necessary, is called write-back.
Thanks Zehra 😁
Cache doesn’t stop network calls but does stop slow costly database queries. This is still explained well and I’m being a little pedantic. Good video, great excitement and energy.
Notes:
In Memory Caching
- Save network calls - For commonly accessed data
- Avoid Re-computation - For frequent computation like finding average age
- Reduce DB Load - Hit cache before querying DB
Drawbacks of Cache
- Cache hardware (RAM/SSD) is much more expensive than DB storage
- As we store more data on cache, search time increases (counter productive)
Design
- Database (Infinite information) vs Cache (Relevant information)
Cache Policy
- Least Recently Used (LRU) - Top entries are recent entries; remove the least recently used entries from the cache
Issue with caches
- Extra calls - When we couldn’t find entry in cache, we query from database.
- Thrashing - Loading and evicting cache entries without ever using the results
- Consistency - When update DB, we must maintain consistency between cache and DB
Where to place the cache
- Close to server (in memory)
- Benefit - Fast
- Issue - Maintaining consistency between memory of different servers, especially for sensitive data such as password
- Close to DB (global cache, e.g. Redis)
- Benefit - Accurate, Able to scale independently
Write-through vs Write-back
- Write-through - Update cache, before updating DB
- Not possible for multiple servers
- Write-back - Update DB, before updating cache
- Issue: Performance - When we update the DB, and we keep updating the cache based on that, much of the data in the cache will be fine and invalidating them will be expensive
- Hybrid
- Any update first write to cache
- After a while, persist entries in bulk to database
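The LRU policy from the notes above can be sketched with `collections.OrderedDict`, whose `move_to_end` and `popitem` methods do the recency bookkeeping for free. This is a minimal single-threaded sketch, not a production cache.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # insertion order doubles as recency order

    def get(self, key):
        if key not in self.data:
            return None  # miss: caller falls back to the DB
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used entry
```

For example, with capacity 2, inserting `a` and `b`, reading `a`, then inserting `c` evicts `b`, since `a` was touched more recently.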
nice, but write through and write back notes part is wrong, pls correct it. you can check other comments. thanks
Nice notes
I just can't find a better content on YT than this, thanks man!
The world needs more people like you. Thank you!
Teaching and learning are processes. Gaurav makes it fun to learn about stuff, then let it be systems or the egg dropping problem.
I might just take the InterviewReady course to participate in the interactive sessions.
Take a bow!
This man is literally insane in explanation 🔥
Dude, you are the reason for my system design interest. Thanks, and never stop making system design videos.
I watched this video 3 times because of confusion but ur pinned comment saved my mind
thank you sir
I don't know how people can dislike your video Gaurav, you are a master at explaining the concepts.
Thank you so much for these videos! Using this I was able to pass my system design interview.
Fun part. I was going through 'Grokking The System Design Interview' course, found the term 'Redis', started searching for more on it on youtube, landed here, finished the video and Gaurav is now asking me to go back to the course. Was going to anyway! :)
Hahaha!
I am actually using write-back Redis in our system, but this video really helped me understand what's happening overall. Great video.
Great video. But I wanted to point out that, I think what you are referring to as 'write-back' is termed as 'write-around', as it comes "around" to the cache after writing to the database. Both 'write-around' and 'write-through' are "eager writes" and done synchronously. In contrast, "write-back" is a "lazy write" policy done asynchronously - data is written to the cache and updated to the database in a non-blocking manner. We may choose to be even lazier and play around with the timing however and batch the writes to save network round-trips. This reduces latency, at the cost of temporary inconsistency (or permanent if the cache server crashes - to avoid which we replicate the caches)
If someone explains any concept with confidence & clarity like you in the interview, he/she can rock it seriously. Heavily inspired by you & love your content of system design. Thanks for the effort @Gaurav Sen
Nice video Gaurav, really like your way of explaining. Also, the fast forward when you write on board is great editing, keeps the viewer hooked.
nice quick video to get an overview. thanks Gaurav. you are helping a lot of people.
each of ur videos, i watched at least twice lol, thank you!! WE ALL LOVE U! U R THE BEST!
I also watch his videos many times.
At least 4 times to be precise.
Thanks Gaurav, your lecture helped me crack MS. Keep posting videos.
Congrats!
Are you in the Hyd campus?
Bhai. u r a life saver! Brilliant tutoring. Thank you!
amazing clarity, intuitive explanations
What you explained as write-back cache is actually a write-around cache. In write-back cache...you update only the cache during the write call and update the db later (either while eviction or periodically in the background).
Explained like the candidate I interviewed today.
This is everything I needed. I am really looking forward to learning how to create an online game hosting server. I researched a lot on how to do it and didn't understand what exactly was happening. Your CDN video was really good 👍. Now I understand how exactly a CDN works and why it uses distributed caching 👍💯
Thank you 😁
Good video around basic caching concepts. I was hoping to learn more about Redis (given your video title)!
Gaurav, what you initially described as write-back at around 10:30 I have seen described as write-around. Write-back is where you write to the cache and get confirmation that the update was made, then the system copies from the cache to the database (or whatever authoritative data store you have) later... be it milliseconds or minutes later. Write through is reliable for things that have to be ACID but it is slower than write back. You later describe what I have always heard as write-back at around 12 and a half minutes
Yes, I messed up with the names. Thanks for pointing it out 😁
@@gkcs so does this mean that write-through is good for critical data (financial/passwords) and write-back/write-around is not?
I think simply saying THANK YOU is too little for this help!!! Superb video.
Glad to help :)
I mean you can always do more by becoming a channel member 😄
Description for write back cache is incorrect.
Write-back cache: Under this scheme, data is written to cache alone and completion is immediately confirmed to the client. The write to the permanent storage is done after specified intervals or under certain conditions. This results in low latency and high throughput for write-intensive applications, however, this speed comes with the risk of data loss in case of a crash or other adverse event because the only copy of the written data is in the cache.
Thanks for pointing this out Satvik 😁👍
I believe the description in the video given for write-back cache is actually a write-around cache (according to grokking system design)
What if the cache itself is replicated? Will write-back still have a risk of data loss?
Yes, as per my understanding: write-through cache - when data is written to the cache it is also modified in main memory; write-back cache - when dirty data (data that changed) is evicted from the cache, it is written to main memory, so a write-back cache will be faster. The whole explanation around these two concepts given in this video seems fuzzy.
The way you explained the concepts is AWESOME.
Can you please create a video that describes Docker and containers in your style?
Great content. Would love to hear more about how to solve cached data inconsistencies in distributed systems.
thanks for this quick tutorial :) your English is really good
Gaurav, what you are describing as a write-back cache is actually called a write-around cache. What you describe as the hybrid mechanism is actually called the write-back cache. In both, the assumption is an asynchronous update, unlike write-through where the update is synchronous. Might be worth taking this video offline and uploading a corrected version to avoid misleading folks prepping for interviews.
Very easy understanding Gaurav. Thanks a lot !!!
A lot of information nicely packed into a quick glimpse.. Great work
Very nice presentation. Simple, powerful and fast. Keep up the style.
Thank you!
Thank you for the video. You could have gone a little deeper into how the cache is implemented. What's the underlying data structure of the cache?
Excellent! Great video with tremendous info and design considerations
wonderfully explained. thanks
Thank you so much..! your videos are really valuable. Really appreciate your effort, sir.!!
You articulate these concepts very well. Thanks for the upload.
Very informative and concepts explained clearly. Thanks
Nice explanation Gaurav. This video covers the basics of caching. In one of my interviews, I was asked to design a caching system for a stream of objects having validity. Is it possible for you to make a video on this system design topic?
Excellent info and presentation - thanks!
You have explained it very nicely. Thanks.
Your System Design videos are very good and helpful, thanks!
learned a ton in this video thanks so much
This is my first video on your channel and I must say that you explain very well! You seem professional, knowledgeable, and you researched your topic well!
A few other reasons not to store completely everything in cache (and thereby ditching DBs altogether) are (1) durability since some caches are in-memory only; (2) range lookups, which would require searching the whole cache vs a DB which could at least leverage an index to help with a range query. Once a DB responds to a range query, of course that response could be cached.
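The range-lookup point above can be illustrated with a small sketch: the cache cannot answer an arbitrary range query itself, but once the DB has answered one, the whole response can be cached keyed by the query's bounds. All names here (`db_rows`, `range_cache`, `range_query`) are hypothetical stand-ins.

```python
# Stand-in for an indexed DB table; a real DB would use the index
# to answer the range scan efficiently.
db_rows = {i: f"row{i}" for i in range(100)}

# Cache of whole range-query responses, keyed by (lo, hi).
range_cache = {}

def range_query(lo, hi):
    key = (lo, hi)
    if key in range_cache:
        return range_cache[key]  # cache hit: skip the DB entirely
    result = [db_rows[i] for i in range(lo, hi)]  # DB does the range scan
    range_cache[key] = result    # cache the whole response for reuse
    return result
```

The trade-off is that only the exact same range is a hit; overlapping ranges still go to the DB unless you add smarter interval merging.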
Your explanation is awesome. Keep it up!
Thanks!
My boy looks very energized... keep it up!
😁
Very knowledgeable. Nicely explained
Thanks!
Great video Gaurav!
Thanks code_report 😁
hey Gaurav, for holidays I'll watch your videos day in and day out... So please teach new topics asap.
I love listening to you
Summary
Caching can be used for the following purposes:
Reduce duplication of the same request
Reduce load on DB.
Fast retrieval of already computed things.
Cache runs on faster, more expensive hardware (RAM/SSD)
rather than on commodity hardware.
Don't overload the cache, for obvious reasons:
It is expensive (hardware)
Search time will increase
Think of two things: (You obviously want to keep the data that is going to be used most)
!So predict!
When will you load data in the cache
When will you evict data from the cache
Cache Policy = Cache Performance
Least Recently Used
Least Frequently used
Sliding Window
Avoid thrashing in Cache
Putting data into the cache and removing it without using it again most of the time.
Issues can be of Data Consistency
What if data has changed
Problems with Keeping cache in Server memory(In memory)
-What if the server goes down (the cache goes down with it)
-How to maintain consistency of data across caches
Mechanism
Write through
Always write first in the cache if there is an entry and then write in DB.
The second part can be synchronous.
But if you have an in-memory cache on every server, obviously you will run into data inconsistency again
Write back
Go to the DB, make an update, and check in the cache if you have the entry... evict it.
But if there is no important update and you keep evicting entries from the cache like this, you can again fall into thrashing.
One can use a hybrid approach as per the use case.
Thanks to @GauravSen
Amazing Explanation!! Thanks!!
A label/comment in the video about the corrected usage of write-back and write-through would help future viewers. I never saw the pinned comment until recently. This could have backfired in an interview.
Very well explained !!
It is a really great video. Finally found a detailed video. Thank you for sharing your knowledge!!
Excellent explanation
always watching your videos. topic straight to the point. keep uploading man. thanks always.
I really wish I had watched this video before my interview this week... :(
Great explanation. You are making my revision so much easier. Thanks!!
Great explanation for caching. I believe you'll go far.
Awesome explanation gaurav. You're cool man. We want a lottt more from you. We admire your ability to explain topics with great simplicity.
great video,very helpful to learn english
awesome Gaurav thanks
Great explanation
i think you mixed write-back with write-around cache. write-back is when you just update the cache and the database gets updated at a later point in time. write-around is when the db gets updated first and then the cache gets notified asynchronously about that update.
Thank you Gaurav, it was a really good explanation
Do you implement caching on most systems? It adds complexity; how can you determine whether it is worth the additional effort to develop?
Love the videos by the way. These are a great learning tool, you do a great job.
Please make a full series in Redis or Paid Course.
well explained bhai sahib
The drawback of write-through you explained is equally applicable to write-back, i.e. I null the value in S1 but the value is still not null in S2. The major thing is: Redis is not a distributed cache. Even their own definition does not include the word "distributed" - Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes with radius queries and streams. Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.
this video was gold. studying for my facebook on-site and i need to understand a bit more how backend works. cheers @gaurav sen
One observation: a cache need not run on expensive hardware; for a cache, one would use "memory"-centric instances on the cloud, not SSDs. And caches can be used in place of a database if the data size is relatively small and you require high throughput and efficiency.
Awesome explanation! Thanks
Thank you!
You continue to offer great content. thank you !
Great video, thank you!
Good video. Thank you. From Canada.
Great going, Gaurav. You have a great future!
This one is very helpful for me. Many thanks Gaurav.
Cheers!
I like the explanation, Dada
Hi Gaurav - good video on distributed caching! This expands a bit more on what I learned in my computer architecture class - I didn't recall thrashing the cache too well, or what distinguished write-through vs. write-back. I think learning caching in the context of networks is more interesting, since it was initially introduced as a way to avoid hitting disk ( on a single machine ), but is also a way to reduce network calls invoked from server to databases.
What is the efficiency of such an architecture for rapidly changing data? Not only is write-through required (as Vijay Somasundaram indicated below), but reading from the database is always required in order to get the most up-to-date information, in which case this architecture is almost useless. Am I missing anything?
In other words, it would be better to start by going through the use cases where this architecture has an advantage.
thanks a lot for preparing this video
I have one doubt regarding the cache policy. Gaurav explained that for critical data we use the write-back policy to ensure consistency, whereas in write-through one instance's in-memory cache gets updated and the others can remain stale.
1) My first question: the same can happen in write-back - one instance's in-memory cache entry gets deleted and we update the DB, but other instances still have that entry. So there is inconsistency in write-back as well. Why do we prefer write-back for critical data when the same issue exists there?
If the answer is to invalidate the entry in every instance's in-memory cache, then the same can be done for write-through, which leads to my second question.
2) My second question: we could update every instance's in-memory cache entry and then update the DB. That way consistency is maintained, so why don't we use this for critical data like passwords and financial information?
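For what it's worth, the "touch every instance's cache before the DB" idea in question 2 can be sketched in a few lines of Python (the `CacheNode` class and node names here are hypothetical; a real deployment would do this over the network, which is exactly where the latency and partial-failure cost comes from):

```python
# Stand-in for one server's in-memory cache.
class CacheNode:
    def __init__(self, name):
        self.name = name
        self.local = {}

    def invalidate(self, key):
        # Drop the key if present; a later read will miss and refill.
        self.local.pop(key, None)

nodes = [CacheNode("s1"), CacheNode("s2"), CacheNode("s3")]
database = {}

def consistent_write(key, value):
    # Step 1: invalidate the key on every cache node first, so no node
    # can keep serving a stale value once the DB write lands.
    for node in nodes:
        node.invalidate(key)
    # Step 2: update the source of truth.
    database[key] = value
```

The catch is that step 1 is N network calls per write, and if one node is unreachable you must decide whether to block the write or accept temporary staleness - which is the trade-off behind the question.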
@Gaurav Sen - How are network calls reduced with a distributed cache, given that the cache itself is spread across the network? And why is a distributed cache faster than a database?
Solid explanation
Thanks for the video, Gaurav.
What if the global cache itself fails? What are the different backup strategies for it?
Thanks for the informative video... I have one scenario - could you please go through it and provide your suggestions, if any?
1. The application fetches configuration from multiple configuration databases, and the actual data is fetched from big data storage on the basis of that configuration...
But the configuration is different for every user and their respective roles... It is something like a dynamic "access level". Here we want to reduce network calls...
We can think of tag-based distributed caching, but at some level we need a cache on which we can also perform queries.
Correction: INPUTING and OUTPUTTING -> Adding and Removing 5:46
Awesome video, and informative
Nice, you have good presentation skills. Keep it up!
When recording videos, please use a pen that shows up legibly - we cannot see some of the writing - but overall the explanation is very good.
One approach I use for consistency is lazy updates. On a DB write, instead of pushing the data back to the caches (which may never be read if a second update comes in), the DB writes the ID to invalidate to a message queue that all caches subscribe to. Then you can implement query-then-cache-on-miss semantics. This way load throughout the system is reduced, with some double queries occurring if the cache was cleared after a good query due to latency. That can be eliminated by using versioning: take the current timestamp in milliseconds at the time of the write and broadcast it, so that a cache only clears itself if its cached version number differs from the broadcast version number.
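That scheme can be sketched in Python roughly like this (the `VersionedCache` class is hypothetical, the broadcast loop stands in for the message queue, and a nanosecond timestamp plays the role of the version number):

```python
import time

class VersionedCache:
    def __init__(self):
        self.entries = {}  # key -> (value, version)

    def on_invalidate(self, key, version):
        cached = self.entries.get(key)
        # Clear only if the cached version differs from the broadcast one;
        # if they match, this cache already holds the fresh write.
        if cached is not None and cached[1] != version:
            del self.entries[key]

    def get(self, key, db):
        # Query-then-cache-on-miss: refill from the DB on a miss.
        if key not in self.entries:
            self.entries[key] = db[key]
        return self.entries[key][0]

db = {}
caches = [VersionedCache(), VersionedCache()]

def db_write(key, value):
    version = time.time_ns()  # timestamp-based version, per the comment
    db[key] = (value, version)
    for c in caches:  # stand-in for publishing to the message queue
        c.on_invalidate(key, version)
```

With a real queue the broadcast is asynchronous, so the version check is what protects a cache that was refilled between the write and the invalidation message arriving.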
Useful :)
Awesome overview, thanks. One other possible issue with write-through: it's possible to make the update to the cache and then have the DB update itself fail. Now your cache and DB will be inconsistent.
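One common way to avoid that particular failure mode is to order the write-through as DB-first (a hypothetical sketch; `db_put` stands in for whatever DB client call you use):

```python
cache, db = {}, {}

def write_through_db_first(key, value, db_put):
    # The DB write goes first; if it raises, the cache is never touched,
    # so the cache cannot end up holding a value the DB rejected.
    db_put(key, value)
    cache[key] = value
```

This only closes the "cache updated, DB failed" gap - the reverse (DB updated, then the cache update fails) still needs invalidation or a TTL to heal.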
True 😁
Excellent 👍