The design is neat and crisp, and I think with a little more consideration it would become awesome. I have some suggestions:
1. The notification microservice is still a SPOF. You can run multiple instances behind a multi-LB config.
2. You could have used a DLQ for failed messages, as we can set the replay interval directly. You don't need a scheduler here, as it would introduce additional complexity.
3. The payload could carry the user info, as most notification services are internal services which wouldn't expose any sensitive information outside. And if any sensitive information is required, you can have another internal service provide it. This way we can decouple the services.
4. For priority messages you can have a separate worker group.
One more thing, how would the upstream get to know about the status of the notification?
What would be the api structure?
Those are some great suggestions! Thank you!
@@Imrohanroy We use a unique ID when sending to the email/SMS services, and a status scheduler fetches the message status based on that UID.
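A minimal sketch of the unique-ID idea in the reply above. The names `StatusTracker` and `provider_lookup` are hypothetical, and the provider call is a stand-in for whatever status API the email/SMS vendor actually exposes.

```python
import uuid


class StatusTracker:
    """Tracks notifications by a unique ID and refreshes their status by polling."""

    def __init__(self, provider_lookup):
        # provider_lookup(uid) -> status string; hypothetical stand-in for the vendor API
        self.provider_lookup = provider_lookup
        self.statuses = {}

    def register(self):
        """Create a unique ID for an outgoing notification and mark it as sent."""
        uid = str(uuid.uuid4())
        self.statuses[uid] = "SENT"
        return uid

    def poll(self):
        """One scheduler tick: ask the provider about every notification still in flight."""
        for uid in list(self.statuses):
            if self.statuses[uid] == "SENT":
                self.statuses[uid] = self.provider_lookup(uid)
```

The upstream can then ask for the status by the same UID, which is one possible answer to the question about how it learns the outcome.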
Thanks, very useful video. I know everyone has their own way to design, but I have 2 suggestions:
1) When messages are read from a queue like SQS, you can set a visibility timeout; if the worker fails to send the notification, the message is back in the queue once the visibility timeout expires.
2) Rather than running a cron job for failed notifications and reading the DB again, we can add a DLQ to the main queue, and another microservice or Lambda can reprocess the failed notifications. That way we avoid another trip to the DB and decouple more.
But in the end, very nicely presented, and very good content, not just this video but all the others as well.
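To make the visibility timeout and DLQ suggestion concrete, here is a toy in-memory sketch. It only imitates the behavior described above and is not the SQS API; `VisibilityQueue`, `max_receives`, and the explicit `now` argument are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Message:
    body: str
    receive_count: int = 0
    visible_at: float = 0.0  # time after which the message may be received again


class VisibilityQueue:
    """Toy queue: received messages are hidden for a while, then reappear unless deleted."""

    def __init__(self, visibility_timeout, max_receives):
        self.visibility_timeout = visibility_timeout
        self.max_receives = max_receives
        self.messages = []
        self.dead_letters = []

    def send(self, body):
        self.messages.append(Message(body))

    def receive(self, now):
        for msg in list(self.messages):
            if msg.visible_at > now:
                continue  # still in flight with another worker
            if msg.receive_count >= self.max_receives:
                # Too many failed attempts: park it in the dead-letter queue.
                self.messages.remove(msg)
                self.dead_letters.append(msg)
                continue
            msg.receive_count += 1
            msg.visible_at = now + self.visibility_timeout
            return msg
        return None

    def delete(self, msg):
        """Called by the worker only after the notification was sent successfully."""
        self.messages.remove(msg)
```

A worker that crashes simply never calls `delete`, so the message becomes visible again after the timeout, and after `max_receives` attempts it lands in `dead_letters` for a separate reprocessor.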
Those are some fantastic ideas! Thanks for sharing!
Glad you found the video helpful and checked out some other ones too. Stay tuned for more system design videos coming up : )
I think it's still good to store all the notification history in the database for auditing purposes.
Regarding the second suggestion of using a DLQ: if an exception occurs after the message has been read (say, processing fails later while emailing the user), can that message also be put in the DLQ to be reprocessed?
Nice video. I think using a SQL table to store the state of the notification makes your system more complex and redundant.
The workers can communicate with the queue to acknowledge successful message delivery.
So, if a message was delivered successfully, it can be dropped from the queue, else it would still be retained in the queue.
Updating your table for each message is therefore not necessary.
Might still need to retry later, otherwise, it's clogging up the queue.
You don't want to rely on the queue as a source of truth. Queues are designed for immediate processing, not permanent storage.
This is gold! A lot of pointers included! I'll definitely share it to my team.
Thank you! Glad you found it valuable.
More like Software architecture, less System design, but loved watching it. Thanks
Glad you liked it!
Great video bro, teaching complex things in an easy way.
Glad you liked it
Great video, One suggestion.
Instead of a generic message queue we can use Amazon SQS and handle errors efficiently. When a message is pulled, we can set an in-flight time after which it reappears, and if it is still not processed we can move it to a DLQ.
That's a great suggestion!
I really liked the way you have simplified the things.
great example ❤🔥
Query: for a celebrity user who has millions of followers, will the approach be the same?
Depends on whether you are thinking of "incoming notifications" to the celebrity user or "outgoing" to all the followers of the celebrity.
This video helps me!
I was just about to consider a notification service architecture.
Glad you found value! Let me know if you have any feedback.
Great video pal. I easily related it with my work in my company.
Glad you enjoyed it!
This is very helpful for the development of our Capstone Project. Thank you for the information.
Eyy that's awesome! It always makes me super happy to hear that. Hope the other videos help you too. Good luck on the Capstone project!
Agree with others that a new SQL table for message status might lead to a lot of complications. It seems like it could lead to concurrency issues (how is work being reassigned? what issues might come up if there are races between the workers' updates from the progress table and the service's reads from that table?)
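One common way to avoid the race raised above is a conditional update, so that only one worker can claim a row. This is a sketch using SQLite for illustration; the `notifications` table, its columns, and the `claim` helper are all hypothetical and not from the video.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical status table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notifications (id INTEGER PRIMARY KEY, status TEXT, worker TEXT)"
    )
    return conn


def claim(conn, notification_id, worker):
    """Atomically move a row from PENDING to PROCESSING; True only for the winner."""
    cur = conn.execute(
        "UPDATE notifications SET status = 'PROCESSING', worker = ? "
        "WHERE id = ? AND status = 'PENDING'",
        (worker, notification_id),
    )
    conn.commit()
    # rowcount is 1 if this worker changed the row, 0 if someone else got there first
    return cur.rowcount == 1
```

Because the status check and the update happen in one statement, two workers racing for the same notification cannot both succeed.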
Thanks for keeping these designs simple to understand. When you get a chance, would you please make a video on Zoom system design? (After you settle in at the new place you are moving to.) Thanks a lot!
YES! I do have a Zoom System design in my backlog. I think it’s like the 7th video coming up.
Wow, excellent piece of knowledge. Thanks a lot!
My pleasure!
This video is perfect. Thank you very much!
Glad you found it helpful!
Thank you for a very helpful tutorial. Should there be one queue per user/connection?
No, the queues are not related to users.
You would have one queue in total for your workers. Or you could have one queue per type of message.
@@FreakStyler Thank you, how do I send different messages to each user?
Beautifully explained, thanks!! Subscribed for more such quality content.
Thank you for the feedback!
ur channel is so underrated. liked & subd
Thank you! Glad you found it valuable.
Awesome presentation and easy to understand. I have a question: how do we ensure a notification is not lost when the notification microservice goes down before it pushes to the message queue? Thanks a lot for the video.
Glad you enjoyed the video : )
To do that, you would need to keep track of notifications in some persistent storage before even hitting the notification microservice.
But I would suggest instead just focusing on keeping the notification microservice highly available by scaling it horizontally.
I watched this again.. very good video.
Thank you again!
Thanks for your explanation. I want to know how to implement a message queue using Django. I would appreciate a precise answer with examples.
I can consider making a video about message queues on Django in the future.
Solid tutorial, great lecture
Thank you for this very helpful video
You are welcome! I hope you stay subscribed and get more value out of the channel.
How will we handle notifications related to transactions, such as OTPs? For this scenario, I think we need some prioritization in the message queue.
Yes, we need some kind of prioritization.
@divyanshubajpai2560 @kalpeshmali8498 You can create a separate queue for high-priority jobs.
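A small sketch of the separate high-priority queue idea, assuming a worker that always drains the high-priority queue first. `PriorityDispatcher` and its method names are made up for illustration.

```python
from collections import deque


class PriorityDispatcher:
    """Two queues: time-sensitive jobs such as OTPs never wait behind bulk notifications."""

    def __init__(self):
        self.high = deque()
        self.normal = deque()

    def enqueue(self, notification, high_priority=False):
        target = self.high if high_priority else self.normal
        target.append(notification)

    def next(self):
        """Return the next notification to send, preferring the high-priority queue."""
        if self.high:
            return self.high.popleft()
        if self.normal:
            return self.normal.popleft()
        return None
```

In a real deployment the two queues would more likely have their own worker groups, so that a flood of OTPs cannot starve normal traffic entirely.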
Thank you.
You're welcome!
Man - you're awesome.
19:07 Can we not hold off acknowledging the message until it is consumed properly, instead of using the table?
Please share these slides so it is easy to revise later. Otherwise it is hard to retain the details, and rewatching the same video is not optimal. Thanks!
Hi! Thanks for the feedback. I will go ahead and add the slide deck / notes in the description from my next video.
For now, hopefully the chapters and timestamps will help you move around a bit easier.
Thanks! Please also add the slides for the previous videos, including this notification service, the rate limiter, etc.
Amazing ❤
Hi Irtiza, how will the response be handled for success and when the retry limit is exceeded?
I am not sure I understand. Could you elaborate?
Should we have a message queue for each type of notification? For example, if email service is offline then message queues for SMS or push notifications won't be affected by it.
Yup that's a good abstraction too!
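A sketch of the queue-per-channel idea, assuming one in-memory queue per notification type and a `send` callable that raises `ConnectionError` when the downstream service is offline. `ChannelQueues` is a hypothetical name.

```python
from collections import defaultdict, deque


class ChannelQueues:
    """One queue per notification type, so an outage in one channel stays contained."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def publish(self, channel, payload):
        self.queues[channel].append(payload)

    def drain(self, channel, send):
        """Send queued payloads for one channel; stop and keep the rest if the service is down."""
        delivered = 0
        queue = self.queues[channel]
        while queue:
            try:
                send(queue[0])
            except ConnectionError:
                break  # leave the payload queued for a later retry
            queue.popleft()
            delivered += 1
        return delivered
```

With this layout an offline email service only stalls the `email` queue, while SMS and push workers keep draining their own queues.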
Excellent presentation, but I want to discuss something related to the database. You mentioned MySQL; in this scenario it will scale vertically, which will make things slower as the data grows, so we would have to shard for horizontal scaling, and that leads to a lot of maintenance and increased cost.
If we use a NoSQL database instead, we get horizontal scaling but give up the JOINs of an RDBMS. Because of that, we might end up iterating over unnecessary data, since we can't filter at the DB level, which again makes the process slower.
So which database is best suited for the notification system? Is there any other approach to tackle this issue, or do we just have to trade off one feature against another?
Thanks!
Great video! Btw isn't the notification service a single point of failure?
You will have multiple instances of the notification service running to handle scale.
Why did you choose not to scale the stateless microservice, and instead mention that the workers should be scaled?
Why are the cafe, table, tree and Subscribe button taking up 30% of the screen space? It is difficult to see the diagram.
Appreciate the feedback! I was trying out a few things, but turns out this was a horrible idea, haha. I stopped adding those.
Let's say a script wants to send a push notification to 10k users. Does the notification microservice need to create a batch of messages or single messages?
You want to batch up the messages in that case for better performance.
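A minimal sketch of batching recipients, assuming the user IDs are already in a list. The batch size of 500 used below is an arbitrary illustration, not a recommendation from the video.

```python
def chunk_recipients(user_ids, batch_size):
    """Yield consecutive slices of user_ids, each with at most batch_size entries."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(user_ids), batch_size):
        yield user_ids[start:start + batch_size]
```

Each batch can then be published as one queue message, so 10k recipients become 20 messages at a batch size of 500 instead of 10k individual ones.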
Question - Why do you need the user info in the database? Why can't the client send all the info through the payload?
If you have a user database with the user info, what is the client sending in the payload? Also, do we need a service that stores the templates?
Sure the client can send all the information through the payload. Depends on how complex your notifications are.
Usually, you don't want to transmit the sensitive user information across the network every single time you send a notification. Also, in terms of performance, it makes sense to store some kind of a materialized view of user data in the service if you use it regularly.
You can store the templates in your notification service, unless you have super complex template logic.
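To illustrate the reply above, here is a sketch where the client sends only a template ID, a user ID, and a few parameters, and the service fills in the rest from its own stores. `TEMPLATES`, `USERS`, and `render` are hypothetical, and the in-memory dicts stand in for real storage.

```python
from string import Template

# Hypothetical template store kept inside the notification service.
TEMPLATES = {
    "welcome": Template("Hi $name, welcome to $product!"),
}

# Hypothetical materialized view of user data, keyed by user ID.
USERS = {
    42: {"name": "Ada", "email": "ada@example.com"},
}


def render(template_id, user_id, params):
    """Return (recipient address, message body) for a payload that carries no user PII."""
    user = USERS[user_id]
    body = TEMPLATES[template_id].substitute(name=user["name"], **params)
    return user["email"], body
```

The payload stays small and free of contact details: `render("welcome", 42, {"product": "Acme"})` is all the client needs to send.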
@@irtizahafiz Thanks Much...
If we want to store all notifications, what happens when we send to all users? Let's say we have 10000 users and send 10000 messages to each. How should we keep this in the database? Isn't it too much?
That's not too much. You can easily keep that information in any database you want.
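A rough back-of-the-envelope check of the numbers in the question. The 200 bytes per stored notification row is purely an assumption for illustration; real row sizes depend on the schema.

```python
def estimate_storage_gb(users, messages_per_user, bytes_per_row=200):
    """Estimate raw storage in gigabytes (10**9 bytes), ignoring indexes and overhead."""
    rows = users * messages_per_user
    return rows * bytes_per_row / 1e9


rows = 10_000 * 10_000                     # 100 million notification rows
size_gb = estimate_storage_gb(10_000, 10_000)  # 20.0 GB under the 200-byte assumption
```

Under that assumption the history is 100 million rows and about 20 GB, which fits comfortably on a single database node.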
I'm immediately thinking: "we're storing emails and other personal contact information -- how are we going to address security and legal concerns?"
Yeah, that's something to always think about. I don't think the intention of this design was to dive deep into compliance/legal stuff.
Nice video, but 30% of the screen is wasted.
I appreciate the feedback and suggestions. Will work on it.
Background music is annoying 😭 please remove it
Thank you for the feedback! Many people said the same. I have removed it since.
I requested u on LinkedIn. I need to ask you a question about notification design
Hopefully our discussion helped you out : )