It looks like the broker keeps state about each record being produced.
1. How does this scale? That seems like a lot of memory pressure for this feature.
2. If a broker crashes, how would another broker dedup a record that has already been produced?
Thanks a lot.
I would guess that "seq" is just a "producer offset", so the broker only needs to hold a map of the latest seq for each producer to know whether a record has already been written to that partition. It's not memory intensive.
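To make that guess concrete, here is a rough sketch of the idea in Python. It is a toy model, not Kafka's actual broker code: `PartitionLog`, `append`, and `recover` are names made up for illustration, and it tracks one sequence number per record, whereas Kafka assigns sequence numbers per batch. The `recover` method assumes the producer id and seq are stored in the log alongside each record, which, as far as I understand, is how a new leader could rebuild the map after a crash.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy model of one partition on one broker (hypothetical, not Kafka's code)."""
    records: list = field(default_factory=list)   # (producer_id, seq, value) tuples
    last_seq: dict = field(default_factory=dict)  # producer_id -> latest seq appended

    def append(self, producer_id, seq, value):
        # The only per-producer state is one integer: the last seq we accepted.
        expected = self.last_seq.get(producer_id, -1) + 1
        if seq < expected:
            return "duplicate"      # a retry of something already written
        if seq > expected:
            return "out_of_order"   # a gap: an earlier record is missing
        self.records.append((producer_id, seq, value))
        self.last_seq[producer_id] = seq
        return "appended"

    @classmethod
    def recover(cls, replicated_records):
        # Assumption: ids and seqs live in the replicated log, so a new
        # leader can rebuild the map simply by scanning its copy of the log.
        log = cls()
        for producer_id, seq, value in replicated_records:
            log.records.append((producer_id, seq, value))
            log.last_seq[producer_id] = seq
        return log


leader = PartitionLog()
first = leader.append("p1", 0, "a")    # new record
retry = leader.append("p1", 0, "a")    # producer retried the same record
gap = leader.append("p1", 2, "c")      # seq 1 was never seen

# Simulate a crash: another broker takes over using its replica of the log.
new_leader = PartitionLog.recover(list(leader.records))
retry_after_failover = new_leader.append("p1", 0, "a")
```

The map grows with the number of active producers per partition, not with the number of records, which is why the memory cost stays small.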
I wanted a lower-level discussion with a little more theory. I'm a Kafka noob and I don't think I really learned anything.
There are other videos with a higher-level explanation of Kafka. This talk is specific to one particular Kafka feature.
This can't be explained in just 15 minutes... come on, we need at least 60 minutes to learn it in detail...