Want to learn more about Big Data technologies? You can get lifetime access to our courses on the Udemy platform. Visit the link below for discounts and coupon codes.
www.learningjournal.guru/courses/
This is an awesome tutorial. I got the entire concept I needed to start a project in a couple of hours. Thank you!
The way you explain it makes such a complex topic much easier to understand.
Sorry to ask so many questions. Your way of explaining is simply awesome. If you covered all the Big Data skills like Hive, HBase, Sqoop, Pig, Spark SQL, etc., it would be very helpful for a lot of people.
Really, your presentation is so simple to understand.
No paid course has such a good and brief explanation.
Thank you!
What about web server?
On failure of an asynchronous call, should I write retry logic inside the onFailure method? Or will it be taken care of by the producer retry config parameter?
Can you explain the linger.ms property?
For an ordering guarantee you gave two options:
1. Synchronous send
2. max.in.flight.requests.per.connection=1
In the example, you explained that if the first batch of 5 messages fails, the second batch of 5 messages will still be tried, and if it succeeds there will be a problem with ordering.
In a similar way, when max in-flight requests per connection is set to 1 (async send) and the first message fails, the second message will be tried, and if it succeeds there will also be ordering issues. Am I correct? If I'm correct, please explain how this is a solution for the ordering guarantee.
Correct me if I am missing something here.
I have the same question. Could you please answer, sir?
max.in.flight.requests.per.connection means the producer cannot (asynchronously) send more requests than this value (default 5 in the current version) until at least one of those requests has been acknowledged.
max.in.flight.requests.per.connection=1 means that after sending one request, the producer has to wait for its response before sending the next one.
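To make the reply above concrete, here is a toy simulation in plain Java. It is not Kafka client code: the class and method names are made up for illustration, and it assumes that a failed batch is retried before any new batch is sent and that a batch fails at most once. It only shows why a limit of 1 keeps a retried batch ahead of the batches behind it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of in-flight requests; not part of the Kafka client API.
class InFlightOrderingDemo {

    // Returns the order in which batches end up in the partition log.
    // failOnce holds the ids of batches whose first attempt fails and is retried.
    static List<Integer> simulate(int batchCount, int maxInFlight, Set<Integer> failOnce) {
        Deque<Integer> pending = new ArrayDeque<>();
        for (int i = 0; i < batchCount; i++) {
            pending.add(i);
        }
        Set<Integer> alreadyFailed = new HashSet<>();
        List<Integer> log = new ArrayList<>();

        while (!pending.isEmpty()) {
            // Send up to maxInFlight batches without waiting for any response.
            List<Integer> inFlight = new ArrayList<>();
            while (inFlight.size() < maxInFlight && !pending.isEmpty()) {
                inFlight.add(pending.poll());
            }
            // Responses arrive: successes are appended to the log, failures are collected.
            List<Integer> retries = new ArrayList<>();
            for (int batch : inFlight) {
                if (failOnce.contains(batch) && alreadyFailed.add(batch)) {
                    retries.add(batch);
                } else {
                    log.add(batch);
                }
            }
            // Failed batches go back to the front of the queue, keeping their relative order.
            for (int i = retries.size() - 1; i >= 0; i--) {
                pending.addFirst(retries.get(i));
            }
        }
        return log;
    }

    public static void main(String[] args) {
        // Batch 0 fails once while batches 1 and 2 are already in flight.
        System.out.println(simulate(3, 5, Set.of(0))); // prints [1, 2, 0]
        // With a limit of 1, batch 1 is not sent until batch 0 has succeeded.
        System.out.println(simulate(3, 1, Set.of(0))); // prints [0, 1, 2]
    }
}
```

So with a limit of 1 the second message is never "tried" while the first is still failing; it waits behind the retry, which is the point of the setting.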
Hi, thanks for such a nice explanation. Can we set the producer configuration per topic?
Where can we set these producer configs?
Since the bootstrap servers, the key and value serializers, and the custom serializer were passed in a Properties object, I am assuming these configurations are set the same way.
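A minimal sketch of that assumption, using only java.util.Properties so it runs without the Kafka client on the classpath. The config keys are the documented producer settings discussed in this thread; the broker address, the String serializers, and the chosen values are placeholders, and the class and method names are hypothetical.

```java
import java.util.Properties;

// Hypothetical helper; in a real producer these Properties would be
// passed to the KafkaProducer constructor.
class ProducerConfigSketch {

    static Properties orderedProducerProps() {
        Properties props = new Properties();
        // Connection and serializers (placeholder values).
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // The settings discussed in the video go into the same object.
        props.put("acks", "all");
        props.put("retries", "3");
        props.put("max.in.flight.requests.per.connection", "1");
        return props;
    }

    public static void main(String[] args) {
        Properties props = orderedProducerProps();
        System.out.println(props.getProperty("max.in.flight.requests.per.connection")); // prints 1
    }
}
```

These settings apply to the producer instance as a whole, so one way to get different settings for different topics would be to create separate producer instances.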
Hello sir,
your explanation is very helpful. One question:
if acks=0 and at the same time
RecordMetadata r = producer.send().get(); gets called, will r still get the record metadata information, given that acks=0?
I believe it will not.
Does Kafka guarantee message ordering for messages within a specified batch.size?
I have a basic question: does a given Kafka cluster always have one leader, or can it have many leaders?
One leader. However, I think you are missing one important point. It is not the leader of the cluster. It is the leader of the partition. So in fact there are many leaders in a cluster, one for each partition.
Thanks for this excellent collection. I have a question about the offsets.topic.replication.factor parameter, which does not make sense to me. Could you please provide some explanation? Also, how does it differ from the in-sync replicas (ISR)? Thank you.
The replication factor tells how many copies you want. The ISR tells which replicas have the latest copy, the same as the partition leader.
Sir, in my college project I am moving data from MySQL to Elasticsearch using Logstash. But now I am putting Kafka in between. So can you kindly tell me how to transfer data from Logstash to Kafka, so that I can later fetch it from there into Elasticsearch?
Hi, I've recently started working with Spark. I want to post all my messages to Kafka in one shot, but I have a challenge: I have about 1M messages to post, and it is only able to post a few before the timeout.
Can you help me with
posting all or nothing before a retry?
What would be the best configs for me?
1. No ordering guarantee needed
2. Should not post duplicate messages on retry
Max in-flight is still not clear. Could you please elaborate? Why is it limited to 5?
They consume memory. You can increase the number if you have enough memory to buffer more messages.
Learning Journal, by memory do you mean the producer should keep a buffer of all the messages until all the in-flight requests are resolved?
Also, once the max for in-flight requests is reached, do the other messages become synchronous? This was not too clearly explained, I think. I appreciate your patience in answering, and the awesome tutorials.
I thought the problem of ordering could also be solved if we send the timestamp as a parameter to the ProducerRecord object. Please correct me if I am wrong.
With your approach, the consumer would have to pull the entire data set from the queue and then sort it (based on the timestamp parameter) at its end. But this is not feasible: we would not know the range in which the messages are out of order, so we could not pull just that range and sort it at the consumer end. Anyway, that's a design decision. In theory, a consumer can *insert* each message at the appropriate position as it pulls messages from Kafka (which again, as I said, depends on the problem you are solving).
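The "insert at the appropriate position" idea could be sketched roughly like this. It is plain Java with no Kafka dependency; the Message type and the buffer class are hypothetical stand-ins for whatever the consumer actually receives, and the sketch assumes everything fits in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical consumer-side buffer that keeps messages sorted by timestamp.
class TimestampOrderedBuffer {

    static final class Message {
        final long timestamp;
        final String value;

        Message(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    private final List<Message> messages = new ArrayList<>();

    // Binary search for the first position whose timestamp is greater than
    // the new message's, then insert there. Equal timestamps keep arrival order.
    void insert(Message m) {
        int lo = 0;
        int hi = messages.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (messages.get(mid).timestamp <= m.timestamp) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        messages.add(lo, m);
    }

    List<String> values() {
        List<String> out = new ArrayList<>();
        for (Message m : messages) {
            out.add(m.value);
        }
        return out;
    }

    public static void main(String[] args) {
        TimestampOrderedBuffer buffer = new TimestampOrderedBuffer();
        // Messages arrive out of order, as they might after a producer retry.
        buffer.insert(new Message(30L, "c"));
        buffer.insert(new Message(10L, "a"));
        buffer.insert(new Message(20L, "b"));
        System.out.println(buffer.values()); // prints [a, b, c]
    }
}
```

Whether this is practical depends on how far out of order messages can arrive and how long the consumer can afford to hold them before processing, which is the design decision mentioned above.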
If you set the acks config to 0, you would not get the acks, and it won't call any callbacks in the async case or complete future.get() in the sync case. But when I set this to 0, in both cases it invokes the callback (async) and future.get() returns. What does this mean?
+Charls Joseph Did I say that? Or did you read it somewhere?
My understanding is that if you set acks to 0, the producer wouldn't get the acks back from the Kafka broker. But when I tried with acks=0 for async, the callback is getting called. Ideally this is not supposed to happen if the broker is not acknowledging the message, correct?
I didn't get the actual use case of acks=0.
+Charls Joseph Well, your understanding is incorrect. Watch the video once again. A bit carefully this time. acks controls the acknowledgement from followers for replication, not the acknowledgement from broker to producer.
@@ScholarNest Sir, at 3:42 you mentioned that the producer gets an acknowledgement from the broker as metadata or an exception.
Hi sir, please correct the video if what you said in the above message is correct.
Can you do something about the sound and picture at the start of the video? It doesn't give a good vibe and makes it look like childish content, though it's not.
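For anyone else puzzled by this thread: as far as I understand the Kafka producer documentation, acks sets how many replica acknowledgements the partition leader must collect before the send is treated as complete. With acks=0 the producer does not wait for the broker at all, so the callback and future.get() still complete, but the returned offset is always -1. The toy model below only encodes that reading; the class and method names are made up and are not part of the Kafka API.

```java
// Hypothetical model of the documented acks values; not Kafka client code.
class AcksModel {

    // Number of replica acknowledgements needed before a send counts as complete.
    static int requiredAcks(String acks, int inSyncReplicas) {
        switch (acks) {
            case "0":
                return 0; // producer does not wait for any broker response
            case "1":
                return 1; // leader has written the record to its own log
            case "all":
            case "-1":
                return inSyncReplicas; // leader waits for the full in-sync replica set
            default:
                throw new IllegalArgumentException("unknown acks value: " + acks);
        }
    }

    public static void main(String[] args) {
        System.out.println(requiredAcks("0", 3));   // prints 0
        System.out.println(requiredAcks("1", 3));   // prints 1
        System.out.println(requiredAcks("all", 3)); // prints 3
    }
}
```

That would explain the observation above: the callback fires with acks=0 because the producer considers the record sent as soon as it is handed to the network, not because the broker confirmed it.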
Poor explanation of the third parameter