System Design: Why is single-threaded Redis so fast?
- Published 9 Aug 2022
- Weekly system design newsletter: bit.ly/3tfAlYD
Check out our bestselling System Design Interview books:
Volume 1: amzn.to/3Ou7gkd
Volume 2: amzn.to/3HqGozy
Other things we made:
Digital version of System Design Interview books: bit.ly/3mlDSk9
Twitter: bit.ly/3HqEz5G
LinkedIn: bit.ly/39h22JK
Animation tools: Illustrator and After Effects
ABOUT US:
Covering topics and trends in large-scale system design, from the authors of the best-selling System Design Interview series.
As a solution architect I really appreciate the brevity and clarity of explanation. Ability to properly set the level of abstraction, explain complex things in plain simple language without missing important pieces is a sign of true deep understanding.
I have been using YouTube for a very long time and this is no doubt one of the absolute best channels I have ever encountered. Every video is a hit and covers interesting, in-depth topics.
Looking forward to more of your content!
Hey, are there any other programming channels like this that you'd recommend?
Appreciate these tiny tidbits of knowledge! Any plans on uploading some in-depth videos for commonly used systems such as image storages, chat apps etc?
They cover that kind of stuff deeper in their books.
9 videos, most are around 3 mins and already 111k subs. That shouts quality man. Good job!
Found this channel a while back and saved it in my back pocket for a deep dive over a weekend. Going through the videos right now and holy crap, I swear I learn more from your 3-4 minute videos than I do from entire courses online. Thank you so much for what you do!
Came here from the HTTP/1 to HTTP/3 history lesson but stayed for the rest of the videos. The quality on this channel is awesome. Great work on these presentations! Very thorough and informative introductions.
Hey there Alex, long time listener- first time caller, I just want to let you know that I absolutely love how well thought out and designed your videos are. I'm somewhat of a novice-intermediate engineer and your videos have been a lifesaver! Thank you.
Easily my new favourite channel on YouTube.
Informative, concise, and so easy to understand.
I've been in Software Engineering for a long time now and these videos are such a great resource
Love your videos! They are to the point.
Great videos, excellent format, outstanding host - I really love your videos!
You're really good at summarising difficult concepts and distilling the information!! ❤🖖
These videos are just so well done. Only a matter of time before this channel has a million subs. Keep up the excellent work!
Videos and texts like never before.....Heavily loaded with knowledge :) Thank you, sir, for providing such deep insights.
Just came across this channel, you guys are doing an amazing job!
Subscribed and ordered the book. Thanks for the great content!
This is some of the highest-quality material on YouTube in the tech domain, presented visually in a very fine-grained manner.
I wish the team all the best. Please keep it up.
Thanks & Regards,
Arun Dhwaj
These videos are awesome. Keep making them.
Thank you so much for sharing such nice informative videos. These have been really helpful for us. Thanks.
I'm so glad I came across your channel!
At 2:47 you could also mention that Redis being single-threaded allows for lock-free, simple and very fast data structures that do not have to worry about thread synchronization. That's the exact same approach (single-threaded, lock-free, non-blocking I/O and garbage-free) we've been taking on CoralBlocks for nearly a decade now.
I am not the kind of person who comments on videos but this one is amazing, keep it up!!
This has to be the highest quality content available at the moment! Most of the channels about programming talk about trendy web stack basics or do opinion vlogs... But here we get a bit more advanced and it's well researched. Easiest subscribe of my life
These days, Javascript is a religion
I would love to see a video where you talk about consistency, HA and fault-tolerance of modern DB systems such as Redis
KeyDB
I am so happy I have found this channel 🙂
Did you create a powerpoint to make this video? It's very fluid and beautiful!
Great videos!
This time I was struggling a little bit with timely pressing pause on smartphone to read infographics :)
Would appreciate some tempo decrease in the future.
This channel is gold and the books are amazing too. I have the first volume and want to get the second.
This was really interesting! Subbed
Addicted to your videos 👍
Appreciate the value of this video
Can I ask you what are you using for making those drawings in your video? 🙏 They are so beautiful!
love this, thank you.
Your content is so good.
Good one. Which tool are you using to draw these system blocks?
This particular video went above my head. But overall great content on your channel
So amazing!!
love these videos
For those who are confused: the "event loop" is a very old design pattern, used since the 70s to allow multitasking on single-threaded CPU systems. It's not related to Node.js.
I see only 2 comments mentioning nodejs, one is this one, and the other is not referring to them as related.
It is, though. It's the exact same thing. The "event loop" of Node and the "event loop" of Redis are just two different implementations of the same pattern.
What I mean is that many newcomers think it's a Node.js invention.
@WurstRELOADED Hold on, doesn't Node.js use kernel threads under the hood to simulate asynchronous processing?
I was also expecting some explanation of the internal processes that Redis does to optimise the data which speeds it up.
Nice video, by the way
There are not a lot of them really, basically just magic with memory allocation and key expirations. Most of the speed comes from being single-threaded, no locks / synchronization, "memory only" and such (much like nginx, actually).
Thanks for the link to Dragonfly and KeyDB. I am developing a database similar to Redis myself (HM4 Redis)
I love your videos. Curious which software is being used to create such animations?
Thanks for this easy to watch and informative video!!
It would be even better if there were some references for us to dig deeper, like the reference section at the end of a paper. Take this video as an example: I am pretty interested in learning more about I/O multiplexing and the corresponding system calls. A reference for that would be tremendous.
Very nice presentation, which tool do you use, please?
interesting! Subscribed
thanks! deep into it
Pure Quality.
Could you share which tool you use to make the effects in your videos?
Thanks for your sharing
I have never seen illustrations that tell a thousand words like these.
Would you mind sharing the name of the software that was used to create your presentation?
Some more excellent content
What kind of tools did you use to edit this video ?
The pyramid of memory access speeds is nice, but it's graphically not proportional, which hides some of the true difference in latency. Might not be fully readable if proportional, but it would be cool to see.
love this
02:08 "With I/O multiplexing, the OS allows a single thread to wait on many socket connections simultaneously."
What does this mean? Why does it help performance? 🤔
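One way to picture the quoted sentence is the minimal Python sketch below, which uses the standard `selectors` module. It is not how Redis itself is written (Redis is a C program with its own event loop), and the names `serve_ready` and `demo` are made up for illustration. The point is that a single thread makes one blocking call, `select()`, and the OS hands back only the sockets that already have data, so the thread never sits stuck on one idle connection while others are ready.

```python
import selectors
import socket


def serve_ready(sel, expected):
    """One thread, one blocking call: select() returns only sockets that have data."""
    handled = []
    while len(handled) < expected:
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing became readable in time
        for key, _mask in events:
            conn = key.fileobj
            data = conn.recv(1024)      # does not block: the OS reported it readable
            conn.sendall(data.upper())  # tiny reply, fits in the socket buffer
            handled.append(key.data)
    return handled


def demo():
    sel = selectors.DefaultSelector()   # picks epoll, kqueue or select for this OS
    clients, servers = [], []
    for client_id in range(3):
        server_side, client_side = socket.socketpair()
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ, data=client_id)
        servers.append(server_side)
        clients.append(client_side)

    # Clients send in arbitrary order; the server thread does not care.
    for client_id in (2, 0, 1):
        clients[client_id].sendall(b"ping %d" % client_id)

    serve_ready(sel, expected=3)
    replies = [client.recv(1024) for client in clients]

    sel.close()
    for sock in servers + clients:
        sock.close()
    return replies


if __name__ == "__main__":
    print(demo())
```

Without multiplexing, the same thread would have to call a blocking `recv()` on one connection at a time, or the server would need one thread per connection.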
I love the theme song :)
but how does it handle a crash?
is all the data now gone if the service or the os crashes since the data is only present in RAM?
Easy answer: yes. Since all the data is in RAM, once the Redis server (or the OS, or the hardware, or whatever) goes down, all data is lost.
Real answer: it depends. Redis has two mechanisms to guard against that: RDB and AOF (which can be used together, by the way). Both save data to disk, and after a restart Redis loads the data from disk and continues to serve requests.
RDB is a "point-in-time backup": you set an interval and tell Redis "every N seconds, save all data to disk". So in case of a crash you lose at most N seconds of data (actually up to 2N in the worst case, since the crash can happen mid-backup). Drawback: you need up to 2x RAM available in the worst case (e.g. if your Redis dataset is 2 GB, plan for 4 GB of RAM so the RDB backup can complete). The idea: when the timer ticks (time to make a new backup), Redis forks itself, and the child process gets a snapshot of the whole dataset in memory. The OS shares the pages copy-on-write, so extra memory is only consumed for pages the parent modifies while the backup runs. That's another example of the simplicity of Redis. It lets a backup start almost instantly, and the backup process barely interferes with the workload. The main process continues to serve requests as if nothing happened, while the child works with a frozen view of memory and pushes it to disk (its view will not change; once the dump to disk is done, the child terminates and its memory is freed).
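The fork idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it runs on Unix-like systems only (`os.fork`), it writes JSON rather than the real RDB format, and `snapshot_in_child` and `demo` are hypothetical names. The child sees memory as it was at the moment of the fork, so the parent can keep changing the data while the dump is written.

```python
import json
import os
import tempfile


def snapshot_in_child(data, path):
    """Fork; the child dumps its frozen view of `data` to `path` and exits."""
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            tmp_path = path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(data, f)
            os.replace(tmp_path, path)  # atomic rename: readers never see a partial dump
            status = 0
        finally:
            os._exit(status)            # leave the child without running parent cleanup
    return pid


def demo():
    data = {"counter": 1}
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "dump.json")
        child_pid = snapshot_in_child(data, path)

        # The parent keeps "serving writes"; the child's copy is unaffected.
        data["counter"] = 2
        data["new_key"] = "written after fork"

        os.waitpid(child_pid, 0)
        with open(path) as f:
            saved = json.load(f)
    return saved, data


if __name__ == "__main__":
    saved, live = demo()
    print("snapshot:", saved)
    print("live:    ", live)
```

The snapshot contains only the state from before the fork, which is exactly why a crash can cost you everything written since the last dump.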
AOF stands for Append Only File. It is a transaction log: each time something changes in Redis memory (basically a set/del operation), the operation is written to the AOF. Another example of Redis simplicity FTW: since Redis is single-threaded, it does not need to worry about locks and synchronization, and since the file is append-only, adding a "line to the end of this file" is fast. This captures a much more up-to-date state of Redis memory before a crash, so you lose only a couple of operations. When Redis restarts, it starts with empty memory and replays the AOF in order from the very beginning, eventually reaching the "before crash" state. Drawback: replaying the AOF takes a while, and the AOF grows effectively without bound.
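The append-and-replay idea can be sketched as a toy store in Python. Everything here is an assumption made for illustration: the class name `TinyStore`, the tab-separated log format (the real AOF uses a different format), and the `demo` helper. Keys and values containing tabs or newlines would break this toy format.

```python
import os
import tempfile


class TinyStore:
    """In-memory dict whose every change is also appended to a log file."""

    def __init__(self, log_path):
        self.data = {}
        self.log_path = log_path

    def _append(self, *fields):
        # Append-only: new records always go to the end of the file.
        with open(self.log_path, "a") as f:
            f.write("\t".join(fields) + "\n")

    def set(self, key, value):
        self._append("SET", key, value)
        self.data[key] = value

    def delete(self, key):
        self._append("DEL", key)
        self.data.pop(key, None)

    @classmethod
    def recover(cls, log_path):
        """Start empty and replay the log from the beginning, in order."""
        store = cls(log_path)
        if os.path.exists(log_path):
            with open(log_path) as f:
                for line in f:
                    op, *args = line.rstrip("\n").split("\t")
                    if op == "SET":
                        store.data[args[0]] = args[1]
                    elif op == "DEL":
                        store.data.pop(args[0], None)
        return store


def demo():
    with tempfile.TemporaryDirectory() as tmp_dir:
        log_path = os.path.join(tmp_dir, "appendonly.log")
        store = TinyStore(log_path)
        store.set("a", "1")
        store.set("b", "2")
        store.delete("a")
        store.set("b", "3")
        recovered = TinyStore.recover(log_path)  # simulates a restart
        return store.data, recovered.data


if __name__ == "__main__":
    print(demo())
```

Note that the log holds four records to describe a final state of one key, which illustrates why replay gets slower and the file keeps growing as writes accumulate.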
And that's why it's common to use RDB + AOF. RDB provides a "backup of everything" but rarely (say, every half an hour), while AOF provides the transaction history for the "missing half an hour". Together they give you a good tradeoff between restart time, data loss, and RAM usage.
For more accurate information, refer to the Redis documentation (search for persistence, RDB, or AOF).
@ThePhilosoft Sounds reasonable. Thanks for the explanation.
Can you please make a video on Memcached vs Redis?
Why not use multithreading and a ring buffer for Redis? Can anyone tell me?
The diagram depicting single-thread process vs multi-threaded process seems to be incorrect. Instead of code and files twice in single thread, it should have been register and stack in the second row.
Would appreciate if you could dive deeper into I/O multiplexing
1. In memory
2. Single-threaded (without locking) with multiplexed I/O
3. Efficient (in-memory) data structures (without worrying about efficient disk storage)
Nice.
Adding to that, one of its biggest drawbacks is that Redis is not suitable for heavy operations like intersectAndStore, diffAndStore, or deleting keys across millions of records. Such an operation takes approximately 10-30 seconds to complete, and all other operations wait for it to finish. If these heavy tasks are lined up one after another, they can cause cascading delays in the application. If you are using Sentinel, it can even lead to a failover: ping will not get a response because the CPU is at 100% for the Redis instance at that time, and the failover can cause key loss.
By deleting, do you mean invalidating the cache contents or removing data from the source database?
@danieljust295 It can be both: removing data that is no longer valid or no longer required.
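The waiting effect described above follows directly from running commands one at a time. The small Python model below uses made-up costs in milliseconds (they are assumptions, not measurements) and a hypothetical helper `completion_times` to show how cheap commands queued behind one heavy command inherit its latency.

```python
def completion_times(commands):
    """Run commands strictly one after another; return when each reply is sent (ms)."""
    clock_ms = 0
    finished = {}
    for name, cost_ms in commands:
        clock_ms += cost_ms  # nothing else runs while this command executes
        finished[name] = clock_ms
    return finished


# Hypothetical queue: one heavy set operation in front of two cheap commands.
queue = [
    ("GET a", 1),
    ("SINTERSTORE big", 10_000),
    ("GET b", 1),
    ("PING", 1),
]
times = completion_times(queue)

for name, done_at in times.items():
    print(f"{name:>16} answered at {done_at} ms")
```

In this model `PING` costs 1 ms of work but is answered after roughly 10 seconds, which is the kind of delay that can make a health check conclude the instance is down.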
awesome
How does it compare to SAP HANA?
please share how you make animations
Another disadvantage of an in-memory database is that it is not persistent.
What if we need to restart the machine? Is all data lost?
I would learn computer engineering from you.
Don't forget about cache locality!!
So running a single instance of Redis on a multi-core server is equivalent to using only one core on that server, and all the other cores are wasted?
yes, exactly. if you want to utilize all cores check out dragonflydb
It's not wasted. All cores are used by Windows. This is why Windows beats a single-core OS such as DOS every day of the week. Claims of single-thread performance are a gimmick, maybe to sell a book.
Speaking of leveraging *multi-core systems* when can we expect a video about #Golang ⚡
How to make this animated presentations?
It takes a team. We have some talented editors for illustration and animation, with the help of tools like Adobe After Effects, Adobe Illustrator, etc. Each video takes many hours to make.
This is how Redis Enterprise comes in the picture to enhance the Redis OSS
Please explain mqtt, rabbitmq
Is it really single threaded though?
somehow your videos make me sleep bruh !!!!
Your voice in the video sounds low. A higher volume would give better clarity (not an issue with my device since other videos are playing fine).
Actually, Redis is not so fast. After some tests I decided to design my own key-value data structure in Java and got 120k RPS per core, while Redis reached only 90k RPS on a similar task on the same hardware. Redis contains many options and useful tools that are sometimes simply not applicable to the requirements of your specific task at a given moment.
Redis has so much more functionality and a great fault tolerance system. It's not just a server with a key-value store.
"Multi-threaded applications require locks or other synchronization mechanisms". Fair enough, multi-threading may introduce more complexity but these days we have "lock-free" techniques that remove most of the need for expensive synchronization protocols. I don't know anything about the internals of Redis, but I'm sceptical that there's no way to improve on the single-threaded performance. Even the desktop PC that I'm writing this on has 12 logical CPUs and it seems a shame to waste all that potential.
When you want to write to a shared DB, a write lock is required if you intend to maintain consistency across threads. If you're willing to accept eventual consistency, then you can relax that requirement, but you have now introduced all the complexity of distributed databases while operating on a single machine. It's important to remember HOW Redis is typically used, i.e. as a process-independent cache. Caches have usage patterns that are somewhat different from what typical relational databases are tuned for. In particular, caches tend to have a lot of hits against the same subset of keys and write/read ratios that are closer to equal. Typical multi-threaded DB designs eke out additional speed because of how they are used. Reads don't have to lock out the block they're pulling from, which allows multiple threads to read from the block at once. A write HAS to lock that block from other writes AND reads until it is done and persisted. That means that typical multi-threaded DB designs rely on having many more read operations than write operations to attain that higher level of performance. If you attempted to use a typical DB the way people use Redis, the performance would take a nosedive.
Some people explain 3-minute concepts in 3 hours, and a few people do the reverse: 3-hour concepts in 3 minutes. I will choose the latter.
Synchronization mechanisms such as locks and mutexes are not needed for a single-threaded program.
Actually, the video raises the question: how does I/O multiplexing solve the "blocking for each request" issue?
ram vs ssd
Single-threaded multiplexing vs multithreaded
efficient data structures
2 of the reasons are basically because it's in-memory!
0 minutes 59 seconds: Looks to be a bug/syntax error in the program possibly. Variable declared as "Path discRoot" later referred to as "diskRoot" on next line.
Redis is web scale 😎
the future could be with io_uring instead of epoll
Multi-threading is easy with Rust
It’s a glorified hashtable with i/rpc interface
basically, thing runs in memory, memory 50x faster than disk there you go
Try it on a couple of million HSET/HGET records. It's very slow
👍👍
Just to highlight that Redis is not just an all in memory database. it can also be used to store data, and still remains pretty fast.
No thread scheduler has ever made anything faster.
In my experience, Redis is good for anything memcached can do.
As soon as you try more than that, it fails utterly.
Do you have specific examples?
We did use redis for its other data structures and found it generally to be okay, as long as we pay close attention to the documentation on the big O costs of different operations.
Meh. "It's fast _because_ of single-threaded design" is a bit of a stretch, especially if you're running multiple instances to overcome a performance bottleneck. Any general-purpose data store in that situation could probably benefit from a more sophisticated "complex" architecture.
@Larry Brin Yep you definitely didn't understand it.
Getting gains out of any kind of concurrency requires having underutilized resources available, with some exceptions, with room to compensate for whatever overhead that may entail. That's true of process parallelism and any threaded architecture. I wouldn't suggest otherwise.
That's the official reason. The unofficial reason: Meth.
*But Redis is not as fast as Dragonfly* 😂😹
noice