NATS & Kafka Compared: Part 1 | Rethink Connectivity
- Published Oct 2, 2024
- In this episode, Jeremy and Jean-Noel compare NATS and Kafka from an architectural perspective, outlining the design differences between both technologies.
02:35 Biggest Differences between NATS & Kafka
06:13 Technical details & tradeoffs around distributed logs vs NATS
12:06 Kafka topics vs NATS JetStream stream
15:55 Subject-based addressing in streams
17:35 NATS JetStream consumers vs Kafka consumer groups
22:28 Data storage
27:35 JetStream Data stores as Object store & KV store
28:18 CRUD and concurrency access control
32:30 JetStream Rollups
34:16 Throughput, batching, & latency, Oh my!
To download a Total Cost of Ownership report on NATS and Kafka:
www.synadia.co...
This video is a follow-up from our RethinkConn talk on Kafka and NATS: • Comparing and contrast...
NATS is a connective technology powering modern distributed systems, unifying Cloud, On-Premise, Edge, and IoT.
Join the NATS Community on Slack: slack.nats.io
Learn More about NATS at docs.nats.io/
Thank you for the video and for such a thorough comparison. It was very insightful!
However, I noticed a few technical inaccuracies regarding Kafka that I’d like to clarify.
1. You mentioned that Kafka isn’t a “proper” messaging system and is mainly a distributed log platform. While Kafka did start as a distributed log, it has evolved significantly and is now widely used as a messaging system. It handles pub-sub effectively.
2. There’s a point about Kafka topics being less flexible because they lack the hierarchical structure of NATS subjects. Well, Kafka topics are designed to be simple and efficient, with key-based partitioning that supports powerful message routing. This simplicity is key to Kafka’s scalability.
3. You suggest that Kafka requires clients to receive all messages and filter them locally, which can be inefficient. In reality, Kafka consumers can use offsets and keys to retrieve only the messages they need, especially with compacted topics or Kafka Streams.
4. You also noted that Kafka lacks CRUD operations compared to NATS JetStream. While Kafka doesn’t offer traditional CRUD like a database, it has strong mechanisms like compacted topics, transactional messaging, and exactly-once semantics that handle many data management needs effectively.
5. You mentioned that Kafka isn’t “real real-time” and focuses more on throughput than latency. Kafka does use batching for throughput, and it’s also capable of low-latency processing with the right configuration.
6. Finally, you suggest that Kafka’s partitioning is a workaround for its single-consumer-per-topic design. In fact, partitioning is a deliberate design choice that enables Kafka to scale horizontally and handle massive data volumes efficiently.
Thank you again for the video 👍
Cheers ;)
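To make the key-based partitioning mentioned in points 2 and 6 of the comment above concrete, here is a minimal sketch of the idea: a record's key is hashed and reduced modulo the partition count, so all records with the same key land in the same partition and keep their relative order. The function names `pick_partition` and `route` and the `car-*` keys are hypothetical, and `zlib.crc32` is only a stand-in hash (Kafka's default partitioner uses murmur2), so this illustrates the technique rather than reproducing Kafka's actual placement.

```python
import zlib


def pick_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition index; the same key always maps to the same index."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # crc32 is a stand-in hash for illustration; Kafka's default uses murmur2.
    return zlib.crc32(key) % num_partitions


def route(records, num_partitions):
    """Distribute (key, value) records across partitions, preserving per-key order."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[pick_partition(key, num_partitions)].append((key, value))
    return partitions


if __name__ == "__main__":
    # Hypothetical records keyed by a car identifier.
    records = [(b"car-1", "t=20"), (b"car-2", "t=21"), (b"car-1", "t=22")]
    for index, partition in enumerate(route(records, 4)):
        print(index, partition)
```

One consequence of this design is that changing the partition count changes where keys land, which is why the number of partitions is usually chosen with future scale in mind.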
OMG! Those "realtime" diagrams. If you created them on the spot while the speaker is talking, personally my mind ... BLOWN 🤯
Created them on the spot! I’ve spent far too much time drawing things in excalidraw, just second nature now :)
I'd call this first nature TBH, it's too good to be second 😆
Great video, can't wait to watch more. As a data platformer I don't need to know better or worse, just the trades, which is what this video provides.
As someone who engineered financial services front-office trading and risk systems atop various technologies including both Tibco RVCM/EMS and Kafka stacks, Jean Noel is very much a legend! This is a great talk, thank you!
I prefer NATS over Kafka for different reasons. Kafka is built on Java, and it's a resource hog. It easily requires 32 GB of RAM, and even then, the OS will swap. In contrast, NATS/JetStream can be deployed on an embedded system or at the edge; I have run NATS with < 512 MB on K3s. I believe NATS is also more developer friendly: it's built by some of the smartest developers, and with its low footprint it can run on any laptop.
Now, to be fair, I tried to engage in a commercial discussion for one of my projects, and I was shocked by Synadia's lack of maturity from operational readiness, resourcing, and sales standpoints. Just good for startups, if you ask me. Also, the price point was so high that I could have bought 4 Kafka clusters with 24/7 support.
See, in the end it's not just tech.
We will definitely be discussing the deployment and operational differences in future episodes! You are spot on that NATS is great at the edge
Helpful comment. I would also not want to use Kafka for the Java resource-hogging problem. It's ok if you deploy on bare metal, but not so cool if you work with VMs with limited memory resources.
I don't like how Kafka is the first thing that shows up in my searches when it's so limited in terms of use cases.
Great video! So nice to see the explanation and at the same time the drawing of the diagram of what his partner said.
Glad it helped!
Hello, great video. I'm trying to find a strategy for handling transactions across microservices that communicate with each other via Kafka. Do you think that is possible? Thanks for your help.
25:55 "Logs", "data stores", ring a bell? That's right, these messaging/streaming systems look a lot like special use cases for databases, and not something totally different. So databases could conceivably start offering similar features, in which case the data consumers on these systems would simply become database event listeners. 13:18 Indeed, the data selection mechanisms that both systems are offering to data consumers seem to be fairly limited. The hierarchical data selection in NATS may be somewhat better than Kafka, but a tagging system would be more general for example. Again, using a full-featured database would remove these limitations.
it was very useful, thank you
Glad it was helpful!
Hi Papa!
Are there any limitations on the cardinality of subjects? As in the sensors example, can there be millions of sensors, such as sensors in millions of cars?
No hard limit on the number of subjects. The resource usage of servers maintaining the interest graph for millions of subjects depends on a couple of factors, such as sustained throughput (per subject) and the number of active clients either publishing or showing interest at any given time. So the practical limit will depend largely on the use case. It's worth noting that we have observed use cases with millions of subjects, and optimizing for both the number of subjects and the number of subscriptions is a constant area of focus to support even larger scale.
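For readers wondering how a subscription selects from millions of per-sensor subjects, here is a rough sketch of NATS-style subject matching. Subjects are dot-separated tokens, `*` matches exactly one token, and `>` matches one or more trailing tokens. The function `subject_matches` and the `sensors.<car>.<metric>` naming scheme are hypothetical, used only for illustration; a real server matches subjects with far more efficient structures than this token-by-token loop.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a dot-separated subject matches a wildcard pattern."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, token in enumerate(p_tokens):
        if token == ">":
            # '>' is only valid as the last token and needs at least one token to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # the pattern has more tokens than the subject
        if token != "*" and token != s_tokens[i]:
            return False  # literal token mismatch
    # Without a trailing '>', the token counts must be equal.
    return len(p_tokens) == len(s_tokens)


if __name__ == "__main__":
    # Hypothetical subject layout: sensors.<car id>.<metric>
    print(subject_matches("sensors.*.temp", "sensors.car42.temp"))
    print(subject_matches("sensors.car42.>", "sensors.car42.temp"))
    print(subject_matches("sensors.*", "sensors.car42.temp"))
```

With a layout like this, one subscriber can follow a single car (`sensors.car42.>`) while another follows one metric across every car (`sensors.*.temp`), without either receiving unrelated messages.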
curious what tool Jeremy used in this video for whiteboarding?
Excalidraw
Excellent explanation
Right
Right)
Great one
awesome video !
Thanks! And thanks for the sub
Great talk
Thanks!