Teaching is an art. He literally said "tell me" … reminded me of my math teacher.
Loved your comment buddy, thanks a lot, it means a lot to me
I have worked in multiple systems with Kafka as a messaging broker. This video is an excellent tutorial ❤. I would love to see more tutorials that explain concepts like timeouts, effects on networking latency in polling, etc.
This is one of the best Kafka videos I have seen on YouTube. You have taught so nicely here that I am looking for a Kafka-only playlist on your channel. A few things that were best -> the format of the video: you started with a small box and kept on adding pieces, separately explained broker, topics, partition, consumer, consumer groups, then cluster, and ZooKeeper ... this gave an idea of which piece is what, and small functional components kept adding up to form the full system. BEST!
A few things I think could be added to make this a true masterpiece with no need to watch any other videos -> some twisted cases maybe.. complex cases with multiple partitions, multiple machines, and different partitions of the same topic on different brokers. Several other cases and a bit more extension of this in part 2 would be great. Interviewers sometimes ask these questions to test the knowledge.
Noted, thanks, will add more videos on Kafka to cover more use cases
THE EASIEST AND MOST CLEAR-CUT EXPLANATION. Thanks a lotttt mannn
RabbitMQ could have been covered in another video with a better explanation
May I confess to you that "You are my hero". Thank you for everything that you are doing. Knowledge empowers the society.
Thanks a lot buddy, means a lot to me
I'm commenting even before watching. I truly believe this guy must have delivered amazing content in this session.
Thank you very much Shams ❤️, please do comment after watching too if you liked the video and found it informative; I really want to know
Thank you for the black background. Really helps the eyes.
Yes, got a lot of feedback for this, so I switched to a black background
Currently on 11th video, all amazingly explained. Thank you.
Thanks buddy, hope that by the time you come to 16, I will have added a couple more videos :)
@@ConceptandCoding LLD and HLD complete ✌️😀
Wow...the explanation is so on point. Thank you so much for this valuable content.
Thank you, Saumya pls do share it with your connections ☺️
I have watched more than ten videos on this; nobody explained it like you. Thanks a lot for making such content.
thanks a lot
Perfect one, you could follow it up by implementing it in a project.
Thanks
love you sir. thanks for giving these classes as free
Welcome buddy, pls do share it with your connections ❤️
Too good ! Saved several hours of reading from the book
Glad it helped!
Best ever i have seen...thnxxx
Thanks Ujjal, pls do share it with your connections too ✌️
👍👍 sure
Amazing video , great explanation
44:29 he means *RabbitMQ* works on the push approach
Thank you so much for sharing the high-level architecture of Kafka. Very informative, enjoyed the video!!
Glad you enjoyed it!
awesome
didn't know about this requeue concept
Sunday sorted😃😃
Thanks buddy do share it with your connections too ☺️
Great content and well explained!!!! Many thanks
Thanks a lot, that was so helpful. It cleared all my doubts😀
Please keep up the good job!!!
Thank you
Concise format, covered all the Kafka topics.
Great video on messaging queues 👍😃
Thank you, pls do share it with your connections Subham ☺️
21:37 isn't partition selection like this: the specified partition is checked first; if none is given, the key hash is used; and if there is no key, round robin?
Sir, in a system design interview should we only know an overview of it (Kafka, RabbitMQ), or should we also know how to integrate it into our projects, i.e. the internal working as well?
No, internal working is not required; in a 45-minute interview that's not possible. Even the interviewer might not know how to integrate it without reading the documentation.
Very valuable content. Thanks a lot🙏
Thanks a lot
Hey Shryansh
In the case of RabbitMQ, if I have two consumers listening to a particular queue, will the message go to one consumer or to both consumers?
As you stated, if a partition is full, data is automatically moved to another partition. How will the hash-function know that a specific partition is full? How does Kafka handle partitions during rebalancing?
How are network latency, data consistency, and data synchronisation handled when Kafka is used to replicate data across multiple data centers or geographic regions?
Hi Akash, in Kafka msgs are also deleted from the queue periodically.
Generally there are 2 criteria:
- Retention time set on the topic (after that time, the msg will be automatically deleted).
- Size limit: once the size limit is reached, it will purge msgs whether they have been read or not.
So the same partition can be reused again.
Also, the publisher can implement an acknowledgement approach to make sure that the msg was actually added to the queue.
Ack=0 means fire and forget: the publisher sent the msg but it might not have been added to the queue.
Ack=1 means the msg has been accepted by the leader partition.
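To make the retention and acknowledgement settings concrete, here is a minimal sketch using the confluent-kafka Python client; the broker address, topic name and limits are assumptions for illustration, not values from the video.

```python
# Sketch: topic-level retention plus producer acks (confluent-kafka).
from confluent_kafka import Producer
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

# Retention criteria: delete messages older than 7 days OR once a
# partition grows past ~1 GiB, whichever limit is hit first.
admin.create_topics([NewTopic(
    "cab-locations",
    num_partitions=3,
    replication_factor=1,
    config={"retention.ms": str(7 * 24 * 60 * 60 * 1000),
            "retention.bytes": str(1024 * 1024 * 1024)},
)])

# acks=0 -> fire and forget, acks=1 -> the leader has written the msg,
# acks=all -> all in-sync replicas have written it.
producer = Producer({"bootstrap.servers": "localhost:9092", "acks": "1"})
producer.produce("cab-locations", key="cab-42", value="12.97,77.59")
producer.flush()
```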
First, great content Shreyansh. Kudos for that.
I would like to add a bit of clarification from the official documentation on offset management.
The offset mechanism where offset information is stored in a ZooKeeper znode is the old architecture.
As per the latest versions, Kafka has moved away from depending on ZooKeeper znodes to maintain offsets; Kafka introduced an internal topic called "__consumer_offsets". Offsets are now stored as messages in this topic, and brokers replicate it for fault tolerance.
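As a small illustration of that, a consumer can commit offsets explicitly and the broker persists them in __consumer_offsets; a rough sketch with confluent-kafka (the group name, topic name and process() helper are assumptions):

```python
# Sketch: manual offset commits. Committed offsets are stored in the
# internal __consumer_offsets topic on the brokers, not in ZooKeeper.
from confluent_kafka import Consumer

def process(value):
    print("processing", value)        # placeholder for real work

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "dashboard-app",      # assumed group name
    "enable.auto.commit": False,      # commit explicitly after processing
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["cab-locations"])
try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        process(msg.value())
        consumer.commit(msg)          # record the offset for this partition
finally:
    consumer.close()
```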
Thanks a lot for such a great video. I have a few doubts:
1. Can Kafka implement both Pub/Sub and point-to-point? Which is available by default with Kafka?
2. At the end you explained RabbitMQ and the same message getting broadcast to multiple queues. How will a consumer know whether the message has already been processed by a different consumer? And what is the business use case where we want the same message to be processed by different consumers?
3. Does Kafka use point-to-point?
Kafka is by default a distributed Pub/Sub architecture.
And it does not care whether the same msg is processed by a different consumer (in some other consumer group) or not.
A use case where multiple consumers need the same msg:
take the same example of cab services (which send their location every 10 sec).
One consumer is the dashboard for all car locations in a particular area.
Another consumer could be a logging application, which keeps logging everything.
Another consumer could be the real-time cab movement.
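A rough sketch of that fan-out across consumer groups (group names are assumptions): each group gets its own copy of every message on the topic.

```python
# Sketch: two independent consumer groups on the same topic each receive
# every message; partitions are only shared out *within* a group.
from confluent_kafka import Consumer

def make_consumer(group_id):
    c = Consumer({"bootstrap.servers": "localhost:9092",
                  "group.id": group_id,
                  "auto.offset.reset": "earliest"})
    c.subscribe(["cab-locations"])
    return c

dashboard = make_consumer("dashboard-app")   # one copy of the stream
audit_log = make_consumer("logging-app")     # an independent second copy

for consumer in (dashboard, audit_log):
    msg = consumer.poll(5.0)
    if msg is not None and not msg.error():
        print(msg.topic(), msg.partition(), msg.value())
```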
Amazing tutorial👏👏👏
Great explanation, but just one question: why in such a hurry? I feel like you could have explained more on RabbitMQ, but I don't know why you didn't. If the video was getting long you have part 2 for that, and I don't think anyone would mind a video longer than an hour, because ultimately you explain in the simplest and easiest way!!!
Thank you for this.
Insightfully explained. Thank you for your contributions. However, I feel you should have discussed the acknowledgement part, which would have provided a more in-depth understanding.
Hi Shrayansh, I wanted to take a moment to commend you: your teaching style is truly phenomenal and highly impactful.
I have a query related to the IoT domain, where I am currently working. As you know, the MQTT protocol, widely used in IoT, operates on the PUB/SUB architecture. I was wondering if integrating a messaging queue system, such as Apache Kafka, would add value to such a system?
Looking forward to your insights.
How do we determine the size of a message coming from the producer? Does the producer need to produce messages within a size limit? Where and how do we configure that in Kafka? What if the size of a message is bigger than that? What is the max message size Kafka can handle?
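For reference, the limit is configurable and defaults to roughly 1 MB per message; oversized sends are rejected with an error. A hedged sketch of where the knobs live (the topic name and value below are assumptions): the broker-wide setting is message.max.bytes, the per-topic override is max.message.bytes, and the clients have their own limits (e.g. max.request.size on the Java producer) that must be raised together.

```python
# Sketch: raising the per-message size limit on one topic to ~5 MB.
# Producer/consumer client limits must be raised to match, otherwise the
# client rejects, or cannot fetch, the larger messages.
from confluent_kafka.admin import AdminClient, ConfigResource

admin = AdminClient({"bootstrap.servers": "localhost:9092"})
resource = ConfigResource(ConfigResource.Type.TOPIC, "large-payloads",
                          set_config={"max.message.bytes": str(5 * 1024 * 1024)})
admin.alter_configs([resource])
```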
I would appreciate it if you came up with a video implementing Kafka using pseudocode.
I have a doubt about the part where the message is stored in the dead message queue after retries in Kafka. You said that after certain retries it goes to the dead msg queue, but a few minutes later you explain that the message is pulled by the consumer itself in Kafka. So if the consumer is down it will not pull, and if it doesn't pull, message retries don't make sense. Can you please explain?
Very useful content. Thanks
Thank you
Please make a series on how multiple microservices talk to each other using Kafka or RabbitMQ.
Very well explained
Nice explanation. A real world practical example would have been great to connect the dots.
Yes i will cover 1 topic on that too
Hi Shrayansh,
Suppose there is one RabbitMQ queue and there are multiple replicas of a service, and the service is a consumer of this queue. In this case, will both replicas consume the same message simultaneously? If yes, then how do we avoid such a scenario?
Excellent! Thanks!
One improvement you can make in your future videos is the use of "read" (present tense) vs "read" (past tense). You are superb regardless.
Very well explained...awesome!!!
Thank you for the valuable videos!
I have a doubt at 27:00 - you mentioned that consumer 2 will start consuming msgs from partition 1, but earlier you said that each consumer of a CG consumes msgs from different partitions. What if C2 is already consuming msgs from P2?
When C1 goes down, the consumer group will pick the next available consumer to resume the processing.
@@brahm_and_coding thanks
Just amazing video. Thanks a lot ❤
Thank you
Wait a sec, RabbitMQ works on the push approach, right? Why did you mention Kafka works on the push approach near 44:30?
Oops sorry my bad.
I wanted to say RabbitMq.
RabbitMq - works on Push approach
Kafka - works on Pull approach.
Thanks for pointing out Yash.
@@ConceptandCoding no worries, mistakes happen. Btw you are doing absolutely godly. Thanks for the content. We
very detailed one , keep going thanks
Thanks
Hey, I have a question. Let's suppose there are 2 partitions for a topic and there are 2 pods running on Kubernetes which are pulling messages, so 1 pod will read from 1 partition. Now let's suppose there is a surge in messages, so new partitions will be created. How will my pods know about this? Is adding partition-based scaling the only solution, since otherwise that new partition will remain idle?
I don't think Kafka supports automatic scaling of partitions for a topic; we have to provide the partition count while creating the topic.
When we have to increase or decrease the partitions, as far as I am aware, it creates a new topic with the new count, streams the data from the original topic to the newly created one, and updates ZooKeeper.
But let me check again in the morning. This is what I know.
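For what it's worth, an existing topic's partition count can be increased in place (never decreased); group consumers pick up the new partitions on their next metadata refresh/rebalance. A sketch with the confluent-kafka admin client (topic name and count are assumed):

```python
# Sketch: growing an existing topic to 6 partitions in place.
# Note: keys may now hash to different partitions than before, so
# per-key ordering guarantees are affected by the change.
from confluent_kafka.admin import AdminClient, NewPartitions

admin = AdminClient({"bootstrap.servers": "localhost:9092"})
futures = admin.create_partitions([NewPartitions("cab-locations", 6)])
for topic, future in futures.items():
    future.result()   # raises if the broker rejected the change
```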
Do we really have a dead letter queue in Kafka?
I guess we can only get this sort of functionality by adding a new topic called a dead letter topic. Please correct me if I am wrong.
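Right, there is no built-in DLQ in core Kafka; a common pattern is to retry in the consumer and then publish failures to a separate dead-letter topic. A rough sketch (topic names, retry count and the handle() function are assumptions):

```python
# Sketch: a hand-rolled dead-letter topic. After MAX_RETRIES failures the
# record is forwarded to "orders.DLT" and the offset is committed anyway.
from confluent_kafka import Consumer, Producer

MAX_RETRIES = 3

def handle(value):
    print("processing", value)    # hypothetical business logic

consumer = Consumer({"bootstrap.servers": "localhost:9092",
                     "group.id": "order-workers",
                     "auto.offset.reset": "earliest"})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["orders"])

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    for attempt in range(MAX_RETRIES):
        try:
            handle(msg.value())
            break
        except Exception:
            if attempt == MAX_RETRIES - 1:
                producer.produce("orders.DLT", key=msg.key(), value=msg.value())
                producer.flush()
    consumer.commit(msg)
```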
Thanks so much. I had only heard the hype around message queues, but now I understand how they work under the hood. Your explanation was excellent.
I have one query: say, for example, our app has a feature of sending push notifications to app users for various events; we can do that with Kafka, right? But say I have 10 applications and each application has the same responsibility of sending push notifications. Can I do this using a single broker, i.e. one Kafka server? Is there any complexity?
Create a consumer group for the 10 notification-sending pods.
Best kafka video
thanks
Did I miss the part where the queue is designed? The title "Design Messaging Queue like Kafka, RabbitMQ" seems misleading to me.
Thanks a lot for this awesome video!
Happy to see your channel growing now!
It would also be helpful if you could guide us on how to decide how many partitions, how much concurrency, what retention, etc. we should have in Kafka.
There are no docs or tutorials on estimating these things; I have struggled a lot to figure this out.
Hope you will pick it up.
Hi Rupesh, first of all thank you.
How many partitions you should have depends upon traffic volume (generally the partition number is odd, like 3, 5, 7, etc.), but how many you need depends upon traffic and it can be increased later as well. So to answer your question, there is no fixed number; you can start with 1 and grow as per need.
One question: if a consumer goes down in RabbitMQ, then how is a new consumer assigned to read from that queue?
What's the point of different brokers if we have replicas of a partition? Ultimately both are in sync, hence both brokers will hold the full queue at the same time??
Awesome explanation, please also explain schema registry and Avro. Thanks.
Thanks, it gives me a good understanding of both of the queues. Can we also cover AWS SQS here?
Thanks and noted
What is the software that you use to create the freehand sketches for the presentation ... I need something like that
When you say push-based approach, and messages are pushed to consumers, does that mean a bidirectional connection is established between the queue and the consumer in RabbitMQ?
If not, what protocol is used underneath for push mechanism ?
I agree, with all this information covered, it would have been nice to talk about how push and pull are implemented.
Also, I think a push-based approach won't account for the pace at which the consumer is consuming messages, which calls the usage of a queue into question.
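On the protocol point: RabbitMQ speaks AMQP 0-9-1 over a long-lived TCP connection; a "push" is the broker delivering to a callback the consumer registered with basic.consume, and the prefetch count is the back-pressure knob that keeps a slow consumer from being flooded. A small pika sketch (queue name and prefetch value are assumptions):

```python
# Sketch: AMQP push delivery with flow control. The broker pushes messages
# to the registered callback, but never more than prefetch_count
# unacknowledged messages at a time.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="orders")
channel.basic_qos(prefetch_count=10)      # back-pressure for slow consumers

def on_message(ch, method, properties, body):
    print("got", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)   # ack frees a slot

channel.basic_consume(queue="orders", on_message_callback=on_message)
channel.start_consuming()                 # blocks; broker keeps pushing
```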
Very easy and well explained. Any practical (code-wise) reference you can share for this messaging would be really helpful.
Thanks and noted
The RabbitMQ structure felt to me more like a multiplexer from computer architecture:
you give an input and only the target is green-lit.
1) Fanout approach: the exchange first filters based on the topic and then pushes to all queues of that topic,
and then the consumer has to decide to either process it or ignore it.
2) Direct is more precise: it can map directly to only the target queue.
So it feels like in the fanout approach the last bit (here, the key) is missing, which is why it lights up (pushes to) all the queues, while in the direct approach it has the last bit (the key) too, so it has a unique path and is pushed to specific queues only.
And similarly, the topic exchange is a bigger version of fanout where some bits are known, like **123, so it can be anything ending with 123.
Correct me if I'm going south...
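That's roughly the right intuition. A small pika sketch of the three exchange types (exchange, queue and key names are assumptions): fanout ignores the routing key, direct needs an exact match, topic matches a pattern.

```python
# Sketch: fanout vs direct vs topic exchanges and their bindings.
import pika

channel = pika.BlockingConnection(pika.ConnectionParameters("localhost")).channel()

# 1) fanout: routing key ignored, every bound queue gets a copy
channel.exchange_declare(exchange="broadcast", exchange_type="fanout")

# 2) direct: exact routing-key match picks the bound queue(s)
channel.exchange_declare(exchange="jobs", exchange_type="direct")
channel.queue_declare(queue="email-jobs")
channel.queue_bind(exchange="jobs", queue="email-jobs", routing_key="email")

# 3) topic: pattern match; '*' = exactly one word, '#' = zero or more words
channel.exchange_declare(exchange="events", exchange_type="topic")
channel.queue_declare(queue="eu-orders")
channel.queue_bind(exchange="events", queue="eu-orders", routing_key="order.eu.*")

channel.basic_publish(exchange="jobs", routing_key="email", body=b"send welcome mail")
```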
Hey Shrayansh, could you do a similar video for GCP Pubsub? Thanks.
noted
Good job mate!
I have a question:
let's say 2 different consumers from different consumer groups are reading partition 0; then they both can read partition 0's data, right? Which should not be happening. But if this functionality is there in Kafka, then is the offset only changed once both consumers have consumed the data from partition 0?
Or is there a gap in my understanding? If anyone can explain, please do.
I am watching it again.
Just one question.
Let's say we have only one consumer group with 2 consumers in it,
and we have 2 partitions inside 1 broker.
You also said that in the same consumer group each consumer is assigned to a different partition.
Now let's say 1 consumer goes down while reading from partition 1 at offset 5,
and assume partition 2 is being read by consumer 2, continuing without failure.
What will happen in this case for partition 1?
In the case where consumers are fewer than partitions, Kafka will rebalance the partitions and consumer 2 will read from partition 1 for some time, and it keeps on rebalancing again and again.
@@sam-um5wo Ok, then what about consumer 2, which is already reading from partition 2?
@@ankitgupta-ph4nk It will do that in a round-robin fashion, sometimes from 1 and sometimes from 2. Read the Kafka documentation; that's what consumer rebalancing in Kafka is.
@@sam-um5wo I got this, thanks man.
Just one more thing, asking out of context:
I want to learn low-level design,
let's say for a pizza store or a parking lot.
How do we structure all the classes and which design patterns can we best use?
Do we have any resources online to thoroughly learn those concepts?
Awesome!
Thanks
Amazing video sir! I have been following you for many months now. Any specific reason why you switched to English instead of Hindi? I liked the Hindi videos more in general.
Hi Aditya, I got 100s of msgs from people who also want to learn but do not understand Hindi, so after thinking a lot I decided to switch to English.
- My English is very basic but workable, so anybody can understand it.
- 2nd point: more than the language, the way of teaching is important, and I kept that the same.
So I moved on.
I'm thinking there is a slight conflict in: what happens when the queue size limit is reached?
Here, if we add multiple brokers then we will be copying the other topics as well, right? But let's say we only faced this overflow issue in topic 1, so we only needed to increase the capacity of topic 1 rather than adding one more broker, which basically creates new instances of topic 1, topic 2, topic 3 ... and so on. We only needed to scale topic 1, right? From what I have learned from you so far, I think we should increase the number of partitions: let's say initially we had 3 partitions of topic 1, each with a 100-message capacity, and now incoming messages are 500, so we need to increase the partitions of topic 1 only.
Yes, obviously in a distributed system we will already have clusters where multiple brokers exist to deal with this, so that is also a solution, but
the reverse question might pop up: hey, why create multiple brokers just because you had more pressure on topic 1 while the rest of the topics are pretty fine? Resource utilisation will be low.
(This is just theoretical thinking 😅;
yes, we will need multiple Kafka servers to ensure that if one server/topic/partition goes down, the system remains available and functioning.)
what happens when the dead letter queue overflows?
Best Explained
Thank you
Hey, could you please share your handwritten material, i.e. the OneNote/iPad link, too for reference.
Thank you so much this is awesome. One quick question: Do you have the notes available anywhere?
Notes are available in description section, attaching below too:
notebook.zohopublic.in/public/notes/u3i1s522a981ed32d48bcbb0b940ee3d58f22
Could you please elaborate the retry mechanisms that should be used while using kafka?
In the video I did; it maintains a sequence number, right?
Where is the video for designing the queue
Hi Srayansh, I have one question.
Suppose Application 1 and Application 2 act as consumer 1 and consumer 2, and both are listening to topic A's messages. Both belong to one consumer group.
Now, when a message is published and suppose Application 1 consumed it, will Application 2 also consume the message?
If 2 different applications have the same group id, it means they belong to the same group.
If both are reading the same topic, then they cannot read the same msg.
Because, as I mentioned, inside a topic there are partitions which consumers read from,
and consumers in the same group cannot read the same partition.
Therefore we can say the same msg cannot be read by both consumers when they belong to 1 consumer group.
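A tiny sketch of that rule (broker, group and topic names are assumed): two consumers created with the same group.id and subscribed to the same topic end up with disjoint partitions, so any given message is delivered to only one of them.

```python
# Sketch: two members of ONE consumer group split the topic's partitions;
# each partition (and therefore each message) goes to exactly one member.
from confluent_kafka import Consumer

conf = {"bootstrap.servers": "localhost:9092",
        "group.id": "billing-service",    # same group id for both members
        "auto.offset.reset": "earliest"}

member_a, member_b = Consumer(conf), Consumer(conf)
for member in (member_a, member_b):
    member.subscribe(["orders"])

for _ in range(10):            # poll a few times so the group can rebalance
    member_a.poll(1.0)
    member_b.poll(1.0)

print(member_a.assignment())   # e.g. partition 0 of "orders"
print(member_b.assignment())   # e.g. partition 1 of "orders"
```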
🔥
Very nice explanation. Could you please tell me about consumer groups: if consumer group 1 has 4 consumers, does it mean that those 4 consumers do the same work, like they are replicas of a particular consumer, or can they be different consumers doing different jobs? Basically I am asking, while creating consumer groups, on what basis do we divide consumers into different consumer groups? Are they grouped as replicas of each other, or can consumers doing different work be grouped into one consumer group?
The consumers are not replicas. Depending upon the company's needs, different consumer applications might be created; let's say one app needs car location data to create a dashboard for users,
but another needs the data for doing something else. If both need the same data, they need to be put into different consumer groups.
@@ConceptandCoding So you are saying that if two consumer applications need the same data then they need to be placed in different consumer groups, since by keeping them in the same consumer group the applications cannot read the message concurrently??
Hi, great videos 👍🏻 What is the software used for the system design interviews?
OneNote and a Wacom tablet.
I have one doubt: Kafka is pull-based, so why will ZooKeeper take care of assigning the topic to consumer 2 when consumer 1 goes down? Isn't it the consumer's responsibility?
This is configuration-based logic. As per my understanding, when a consumer group is created, it configures the strategy by which it decides which partition to assign to which consumer, and which strategy to use when any consumer goes down.
And the strategy, I think, is Kafka platform code (called the partition assignor). Let me double-check where this code logic or strategy resides.
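As far as I know, that strategy is a consumer-side setting called partition.assignment.strategy (range, round robin, sticky/cooperative-sticky), and the group coordinator broker together with the group leader consumer applies it during a rebalance. A hedged sketch of the config knob (broker, group and topic names assumed):

```python
# Sketch: picking the partition assignor for a consumer group. Which
# member gets which partition after a member dies is decided by this
# strategy during the rebalance, not by ZooKeeper.
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "dashboard-app",
    # other librdkafka options include "range" and "roundrobin"
    "partition.assignment.strategy": "cooperative-sticky",
})
consumer.subscribe(["cab-locations"])
```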
is AWS SQS another example of Distributed Messaging Queue?
Right
What's a binding? Is it a logical entity?
Hey,
let's assume that in a consumer group there are two consumers, 1 and 2. If consumer 1 goes down, will consumer 2 start reading from the partition which was assigned to consumer 1?
What if consumer 2 was already assigned a partition, what will the behaviour be now? Also, judging from the design here, does a consumer group consist of a single service (with multiple instances/pods)?
Consumer 2 will now take care of both partitions of the topic.
@@ConceptandCoding Hi, but in the video you taught that at a time only 1 consumer can interact with 1 partition. So is this an exception case, or do we have any configuration through which a consumer can talk to 2 partitions? Can a consumer read from more than 1 partition?
As per the Kafka documentation,
if there are 2 partitions and 1 consumer, then that 1 consumer can read from both partitions.
But if there are 2 or more active consumers and 2 partitions, then the correct configuration is that each consumer is mapped to a different partition.
Suppose there are multiple consumers from the same group and they are reading messages from the same topic A. What if they want to read the same message, msgA, which is present in partition 0? Does the producer push a duplicate message to each partition in the topic? How can this happen?
No, that's not possible as per the Kafka documentation.
2 consumers in one consumer group cannot read from the same partition.
But the same msg can be present in different partitions of a topic. That logic needs to be present at the publisher: publish the same msg once for, say, partition 0 and again for partition 1.
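A sketch of what that publisher-side logic could look like (the topic name and payload are assumptions): the producer explicitly targets each partition with the same payload.

```python
# Sketch: the publisher deliberately writes the same payload to two
# partitions of one topic, so a consumer pinned to either partition sees it.
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})
payload = b"msgA"
for partition in (0, 1):
    producer.produce("topicA", value=payload, partition=partition)
producer.flush()
```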
@@ConceptandCoding Here I am considering a consumer to be, suppose, ServiceA or ServiceB. Both need to work on msgA, so msgA will be pushed to both partitions in the topic. Am I thinking right?
@@Voyager1001 Right.
But generally, if ConsumerA and ConsumerB both need to work on the same partition's data, the correct way is for them to be part of different consumer groups; that makes more sense.
You are right.
Nice!
Possible to share the slide ?
Hello sir, can LLD be done with JavaScript (I have also learned OOP), since it is my first language, or do I need to learn Java?
Hi Animesh, LLD is just OOP concepts.
I am not familiar with JavaScript; if all the OOP fundamentals are possible with JavaScript then you are good to go.
Hi.
Thanks for the wonderful explanation.
Please help me get clarity on the points below:
1. I assume a P2P queue can be achieved via pub/sub (Kafka, and RabbitMQ with the direct exchange technique). If not, kindly help me understand.
2. How does the key (hash) in the sender payload decide which Kafka topic partition the message is pushed to? Is the key's hash generated based on partition information?
Thanks in advance and thanks for the effort ☺️😊
We can design a topic or exchange that is unique to every producer-consumer combination to establish a P2P queue where each producer has a queue to send messages to and each consumer has a queue to receive messages from.
This is possible in RabbitMQ by using a direct exchange, which routes messages to queues based on a routing key. Each producer can have its own queue, and each consumer can use its routing key to consume from a specific queue. This is possible in Kafka by using topics, in which each producer publishes to a specific topic and each consumer subscribes to a specific topic.
The hash value of the key is used to determine which partition the message is assigned to. The MurmurHash2 algorithm is used to generate the hash value, which is not directly based on partition information. The number of partitions in the topic, however, has an effect on the calculation of the partition to which the message is assigned because the hash value is taken modulo the number of partitions.
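In other words, the partition is a pure function of the key hash and the current partition count. A simplified, runnable sketch of the idea (md5 below is only a stand-in so the example has no extra dependencies; Kafka's default partitioner actually uses MurmurHash2):

```python
# Simplified sketch of Kafka's default keyed partitioning:
# partition = hash(key) mod number_of_partitions.
# NOTE: real Kafka uses MurmurHash2; md5 here is only a stand-in.
import hashlib

def choose_partition(key: bytes, num_partitions: int) -> int:
    h = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
    return (h & 0x7FFFFFFF) % num_partitions

print(choose_partition(b"cab-42", 3))   # same key -> same partition,
print(choose_partition(b"cab-42", 3))   # until num_partitions changes
```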
Thanks Akash for the input.
And regarding the first point: can P2P be implemented via Pub/Sub like Kafka?
By putting one consumer in a consumer group, we can get P2P behaviour.
But in general Kafka is a Pub/Sub pattern.
Hi sir, what happens if ZooKeeper is down? Will there be a replica for this?
Yes, ZooKeeper is also a distributed system; if one node goes down another takes over. But I did not cover it, else the video length would become very long.
GREAT VIDEO !! Maybe just some grammatical errors while explaining. If that's improved, it can be world class .. keep it up
sure i will improve, thanks for the feedback
What if the queue goes down in RabbitMQ?
RabbitMQ also has the same concept of broker, leader and follower, so it should behave in the same manner as Kafka.
Hi @shrayansh
Even after repeated requests, you are not sharing the OneNote links for the notes 😢
Hi Suheab. When I convert these OneNote pages to PDF, the formatting is completely lost. No one can read and understand properly from it. Sharing those rough, unformatted notes would not give a good impression.
One way for me is to write them up properly and then share, but again that is going to take some time.
Let me reach out to some team members and see if anyone makes notes while watching the video. I will try to get that.
@@ConceptandCoding You can directly share this OneNote link by making it read-only.
perfect
Why can't two consumers within a consumer group read the same partition?
This is necessary to maintain message order and prevent duplication buddy. This design ensures sequential processing within a partition, avoiding complexities associated with concurrent consumption from the same partition by multiple consumers.
Why can't we use a DB? Why a message queue?
Actually both are used: queues are used to store the message temporarily, and when all retries are finished, the failed msgs are stored in a DB.
Multiple distributed workers can easily work off a queue, and it's easy to build an event-driven architecture using a queue.
A DB is mostly for storing permanent data and doing complex operations or queries.
Awesome content, but the video could have been longer; 45:12 is a very short period of time to understand complex topics like Kafka and RabbitMQ. Sometimes I felt you were in a hurry to wrap it up; please take it in a positive way :)
Thanks for the feedback Ratnakar.
Taken this feedback
Kafka itself does not have a built-in dead-letter queue
Sir can you please provide this document for reference
Noted, will try to upload on gitlab
One thing is wrong here: the partition is not the leader, it's the broker which is the leader in the cluster. Don't delete the comments.
Leader and follower are concepts of the partition: one replica is the leader and one or many are followers.
To clarify, there is a leader broker machine for a particular topic-and-partition combo.
The broker is not the leader; it is elected as the Controller (which is done by ZooKeeper), and the leader is one of the partition replicas, which is elected by the Controller.
Could you please share notes on the same?
Please check the description once; if it's not there, I will add it, buddy.
👍👍
Bro, be concise please. Too much repetition of words