Want to take more Big Data technology courses? You can get lifetime access to our courses on the Udemy platform. Visit the link below for discounts and coupon codes.
www.learningjournal.guru/courses/
You're a natural born teacher.
Thanks.
The summary in this video shows how beautifully this course is structured. Great work, thanks very much, sir!
I’m afraid there is a small mistake in part 16, “Consumer group”, at 5:54. It says that the first coordinator to participate in a group becomes the leader, but what was meant is that the first consumer becomes the group leader.
thanks bro
Yep, I spotted that too. I was about to post about it, but you saved me the trouble. Cheers.
I am very pleased with how easily you explain things, thanks a lot for that. However, I have a question; could you please help me with it?
Q) What if the group leader crashes or wants to exit the group? Which member will be picked to become the leader, and who makes it the leader?
5:55 Is that a slip of the tongue? Did you mean the first consumer to participate in a group becomes the leader?
A correction: at 5:50, it is not the first coordinator but the first consumer to join the group that is called the LEADER; the rest are members/followers.
The best explanation of how groups work.
The simplest way of explaining it. Love it.
At 5:44, you say there is one group coordinator per consumer group. Does that mean that if there are, say, 100 consumer groups, there will be 100 group coordinators? For one consumer group it all makes sense, but I'm confused when I consider more than one consumer group.
Yes, that's correct. For each Consumer Group, one of the brokers is selected as the group coordinator. A Broker acts as a broker in general and also takes care of group coordination as an extra responsibility.
@ScholarNest Thanks for the reply. Just one more question: does this hold when there are many consumer groups and only a handful of brokers? In that case, each broker would be the group coordinator for many consumer groups.
Yes. However, a Kafka cluster can be scaled by adding more brokers.
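To make the "many groups, few brokers" case concrete, here is a minimal sketch. As far as I understand it, Kafka chooses a group's coordinator by hashing the group id onto one partition of the internal __consumer_offsets topic (50 partitions by default), and the broker leading that partition coordinates the group. The Python below only imitates that idea: the broker names, the round-robin leader layout, and all helper functions are assumptions made for the example, not Kafka APIs.

```python
from collections import Counter


def java_string_hash(s: str) -> int:
    """Imitate Java's String.hashCode() (signed 32-bit) for plain BMP text."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def offsets_partition_for(group_id: str, num_partitions: int = 50) -> int:
    """Map a group id onto one offsets-topic partition (assumed rule)."""
    h = java_string_hash(group_id)
    h = 0 if h == -0x80000000 else abs(h)
    return h % num_partitions


def coordinator_for(group_id: str, partition_leaders: list) -> str:
    """The broker leading the chosen offsets partition coordinates the group."""
    index = offsets_partition_for(group_id, len(partition_leaders))
    return partition_leaders[index]


# Hypothetical cluster: 3 brokers lead the 50 offsets partitions in turn.
brokers = ["broker-0", "broker-1", "broker-2"]
partition_leaders = [brokers[p % len(brokers)] for p in range(50)]

# 100 consumer groups are spread over the same 3 brokers.
groups = [f"group-{i}" for i in range(100)]
load = Counter(coordinator_for(g, partition_leaders) for g in groups)
print(load)
```

Each broker in this toy cluster ends up coordinating a share of the 100 groups, which matches the reply above: a broker can be the coordinator for many groups at once.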
Very simple and precise. Thank you.
Great material. Thanks, sir. Is it possible to get access to the slides for the whole series, so that if we want to refer to something in the future we don't have to watch the whole video again?
Thanks again.
I guess you can visit the website and check for what you want.
Is it a mistake when you say "The first coordinator to participate in the group becomes a leader. All other Consumers join later becomes the member of the group" from 5:53 to 6:06 in this video? From your drawing, I think the first consumer to participate in the group is the leader, and those that join later are members.
Yes, you are correct. I intended to say "consumer"; I am not sure how I ended up saying "coordinator."
Firstly, thanks a lot for the good work on this! It would be great if you could add a small annotation around 5:55 mentioning that you meant 'consumer' instead.
I came back to review your video again and learn more each time. May I ask a question here?
I am using a kafka-mongodb-sink connector to pass data from 3 Kafka brokers to a MongoDB in my system. I am planning to insert into MongoDB in parallel, so on each host I have a broker + connector + MongoDB primary shard.
My question is: should the connector be in distributed mode or standalone mode? Right now I use distributed mode. I hope this is the correct setup.
Thanks, your demo is very advanced and very clear.
Robin
Thanks for your feedback. If your data is distributed on multiple brokers then distributed mode makes sense.
Wonderful. Your help is greatly appreciated.
My pleasure.
1) I have a question here. When we say the replication factor is 3, the same message will be there in different partitions, right? In that case, is there any chance that a consumer can read the same message from different partitions?
2) Is there any chance that the group leader can be overloaded while a rebalance is executing?
A replication factor of 3 makes three copies of the entire partition. The extra copies are managed by the followers. By default, consumers read from the leader, so the follower copies are not used by consumers.
Thank you for the immediate explanation; this makes it very clear. So there is always one leader associated with each partition. Even if the data is replicated into multiple copies, it doesn't matter for the consumers, because a consumer reads from the leader and not from the followers.
Exactly.
Great lecture and content very well explained.
Very nicely explained.. thanks 😊
Awesome tutorials. Are there any videos on Kafka security? I mean SASL_SSL or SSL with authentication and authorization.
A very, very good tutorial! I like the presentation!
Excellent work, thanks
Nice video sir. Thanks.
Very good explanation on consumer groups architecture
This lecture was about dealing with multiple consumers within the same consumer group, running as instances of the same application on a single machine. But what would change if I wanted to configure multiple consumers in the same consumer group but running on different machines, to achieve greater horizontal scalability?
Thanks for your lectures, you've done such an awesome job!
The method is the same. You can run consumers on different machines and keep them in the same group.
Thanks. It's really good. Could you please also explain how multiple messages get sent by multiple producers using one Java program? All these programs run with a single thread. Do we need to use threads, or run multiple Java programs with the same code that send different messages? Or does the same program you've written work if we set a config on the producer API?
Very helpful thank you!
Could you please provide the slideshow you are presenting in the videos? It would be very helpful for future reference.
The best method is to learn and make your own notes for future reference. I don't use PowerPoint for making these videos, so it's very difficult to share them.
Awesome session, sir.
Awesome tutorial series! I'm wondering if there's a Python version of the codebase or any resources that I can look up? Also, what are some major differences between Confluent Kafka and Kafka?
Hello sir, I want to know if there is a way to get a list of all producers that have sent messages to a topic.
Please can you suggest how we can create multiple consumers in the same application for parallel reads?
You mention there is a worry about reading the same messages - why not have asynchronous reads to multiple consumers per partition? Let's say in a simple case, 2 consumers read asynchronously from partition 0. The reason I bring this up is what if a certain partition receives more messages compared to the others? Doesn't it make sense to assign another (or more) consumers to it?
Kafka is designed to assign each partition to only one consumer within the same group.
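A small sketch of what that rule implies, assuming a plain round-robin strategy. The function and consumer names are made up for illustration, and Kafka's real assignors are more involved than this.

```python
def assign_partitions(partitions, consumers):
    """Toy round-robin: every partition gets exactly one owner in the group."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment


# Three partitions, four consumers in the same group.
result = assign_partitions([0, 1, 2], ["c0", "c1", "c2", "c3"])
print(result)  # {'c0': [0], 'c1': [1], 'c2': [2], 'c3': []}

# No partition has two owners, so the fourth consumer simply sits idle.
owners = [c for c, parts in result.items() for _ in parts]
print(len(owners))  # 3
```

Because a busy partition cannot be shared inside one group, the usual way to spread its load is to add partitions or choose a message key that distributes records more evenly, rather than to add consumers.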
I have a question I hope someone will answer. Why is parallel reading needed, and how do you handle the data from a consumer group if, for example, I want to store it in my database in order?
Does your database preserve the order? We always use an ORDER BY query when reading from the database. What is the use case for keeping records in the database in order?
Can you explain what happens to the data in a partition when the leader for that partition fails, and how reassignment happens when the node joins back?
When a leader dies a new leader is elected. :-)
Excellent
Sir, can't we configure Kafka on Windows instead of Linux?
The leader that we are talking about here: is this the partition leader, or the first consumer to join the group, which becomes the leader?
That's the group leader, not the partition leader.
Hi,
I didn't understand how the election of the group coordinator takes place. I tried to explore and only found details about the election of the leader. Would you please explain?
Hi sir, the way you explained this is awesome, but it gives only book knowledge. If you show this at the code level, that will be really helpful for us.
To be frank, we don't know how to create and test multiple consumers in IntelliJ or Eclipse, and we are struggling a lot while testing multiple consumers. Please, sir, could you cover these topics?
I need clarification regarding acks and asynchronous send. Let us say I have implemented an async producer with a callback and set acks in the producer config. What will the behavior be?
I think I have covered it in one of the videos.
I have a doubt. What is the need for a consumer group if a single consumer can read an individual partition? I.e., if we have 3 partitions in a topic, then we can start 3 consumer scripts, mapping 1 partition to each consumer. How is that different from creating a group? Kindly explain.
You can start 3 consumers, assign them three different partitions, and manage failures and other things yourself. But this approach is manual, and you have to write a lot of code. Creating a consumer group is an easy way to achieve parallel processing of messages. If one consumer fails for some reason, its partition is automatically assigned to someone else in the same group.
thanks, got it.
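The automatic reassignment described in that reply can be imitated with a toy model. This is only a sketch: the ToyGroup class, the round-robin strategy, and the consumer names are assumptions for illustration, and none of this is Kafka client code.

```python
class ToyGroup:
    """Toy consumer group that reassigns partitions when membership changes."""

    def __init__(self, partitions):
        self.partitions = sorted(partitions)
        self.members = []
        self.assignment = {}

    def _rebalance(self):
        # Assumed strategy: plain round-robin over the current members.
        self.assignment = {m: [] for m in self.members}
        for i, p in enumerate(self.partitions):
            if self.members:
                owner = self.members[i % len(self.members)]
                self.assignment[owner].append(p)

    def join(self, member):
        self.members.append(member)
        self._rebalance()

    def leave(self, member):
        # Stands in for a crash or a clean shutdown of one consumer.
        self.members.remove(member)
        self._rebalance()


group = ToyGroup([0, 1, 2])
for name in ["c0", "c1", "c2"]:
    group.join(name)
print(group.assignment)  # {'c0': [0], 'c1': [1], 'c2': [2]}

group.leave("c1")
print(group.assignment)  # {'c0': [0, 2], 'c2': [1]}
```

With manually assigned partitions, the bookkeeping inside leave() is what you would have to write and operate yourself; with a consumer group, Kafka handles the equivalent step for you.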
What if a single consumer in a consumer group wants to read the data from all partitions? How is that possible?
About the leader that we are talking about: from your drawing, the first consumer to participate in the group is the leader, and those that join later are members. My question is how a consumer becomes a leader that does the rebalancing task. A consumer has the code to read from the topic, but how does it do the rebalancing? Shouldn't there be some node that we can elect as a leader?
Good question. Watch other videos, I think I covered some more details about it.
I watched all 23 videos but did not come across this explanation. Sorry if I missed it.
Can you please shed some light on "Multiple instances of the same application read data from the same topic"?
beautiful explanation thank you!
For a consumer group, I created 3 consumers with the same group id, so logically these 3 consumers are in one group, right? But I am polling each consumer individually. Is there any way to poll the consumer group, so that I need not write code for every consumer? What I am asking is: if I call consumer.poll(), it returns a batch of records from the topics that the consumer subscribes to. Is there any way that I can directly call consumergroup.poll()?
No, only an individual consumer polls. Generally, we don't write separate code for each consumer if the requirement is just to consume in parallel. It's the same code, and you just execute multiple instances.
Very nice !
What if the leader crashes?
What happens when a Kafka broker goes down?
I have a separate video on Fault tolerance in Kafka. Have you checked that? Hope that answers your doubt.
Hi sir, wonderful video. I want to have 2 consumers with different consumer groups in my application using Spring Boot autoconfiguration. Could you please guide me on how to configure it? With the config below I can configure just one, but I want something like consumer 1 and consumer 2.
spring:
  kafka:
    consumer:
      bootstrap-servers: kafka-0.broker.kafka.svc.cluster.local:9092,kafka-1.broker.kafka.svc.cluster.local:9092,kafka-2.broker.kafka.svc.cluster.local:9092
      group-id: digitaltwin
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
What happens if the group leader crashes?
A new leader is elected by the Group Coordinator.
Awesome
Hi,
I have a doubt. Let's say I have 2 topics. One topic has two partitions, with two consumers in a group (the reason for two consumers is that if one fails, its partition moves to the other consumer).
The other topic has 4 partitions. Here, two consumers are in one consumer group and reading two of the partitions, and another two consumers are in a different consumer group for the other two partitions, so I have two consumers in each group.
One consumer group is writing to disk or a data capture server, and the other consumer group is writing to InfluxDB (the data is split across different destinations).
So the question is: can I maintain 4 partitions with two consumer groups, with two consumers in each?
I am keeping two partitions per group of two consumers as a best practice.
I am sorry, I am not sure if I got your question correctly. Let me rephrase it. You have two topics (T1 and T2) and three consumer groups (G1, G2, and G3). G1 is reading T1. G2 and G3 are reading T2. T1 has 2 partitions, so the two consumers in G1 will read one partition each. T2 has 4 partitions, so the two consumers in G2 will read 2 partitions each, and in total they will read all the data in T2. Similarly, the two consumers in G3 will read 2 partitions each, and in total they will read all the data in T2. The basic concept is that each group reads the complete data from a subscribed topic. Since you have 2 groups reading T2, you are reading the data twice, irrespective of the number of partitions.
Hope this clarifies your doubt.
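The point that each group reads the complete data can be sketched with a toy partition that keeps a separate read position per group. The ToyPartition class, the read_all helper, and the record names are hypothetical; in real Kafka the committed offsets are tracked per group by the brokers, not by a class like this.

```python
class ToyPartition:
    """One partition's log plus an independent read position per group."""

    def __init__(self, records):
        self.records = list(records)
        self.offsets = {}  # group id -> next offset to read

    def poll(self, group_id, max_records=2):
        start = self.offsets.get(group_id, 0)
        batch = self.records[start:start + max_records]
        self.offsets[group_id] = start + len(batch)
        return batch


def read_all(partition, group_id):
    """Poll until the group has caught up with the end of the log."""
    seen = []
    while True:
        batch = partition.poll(group_id)
        if not batch:
            return seen
        seen.extend(batch)


t2_partition = ToyPartition(["m0", "m1", "m2", "m3"])
g2_seen = read_all(t2_partition, "G2")
g3_seen = read_all(t2_partition, "G3")
print(g2_seen)  # ['m0', 'm1', 'm2', 'm3']
print(g3_seen)  # ['m0', 'm1', 'm2', 'm3']
```

G2 finishing the log does not move G3's position, so both groups see every record. That is the "reading data twice" effect described in the reply, and it is what you want when one group writes to disk and the other to InfluxDB.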