Want to learn more Big Data Technology courses? You can get lifetime access to our courses on the Udemy platform. Visit the link below for discounts and coupon codes.
www.learningjournal.guru/courses/
I've been watching the playlist right from the start. The method of delivery is concise, succinct and clear. Way to go Sir. Thanks a lot.
Dear Sir, you have a brilliant gift for teaching the right content!
Excellent tutorials.. Sir. Clear and Concise...
Nicely explained
Excellent tutorials
awesome sir
Nice video
Great explanation. Thank you so much
very useful videos
Thanks a lot for the wonderful share.
Sir, how do you configure the schema registry?
Could you please explain how to set up the schema registry on Windows? I understand that the Confluent Schema Registry is used to register Avro schemas, but how does it differentiate between versions of one schema (based on the schema file name?) when we are using lower and higher versions of the same schema? Apart from Avro schemas, is this registry useful for any other tools or frameworks, or is it specific to Avro? Ideally a schema registry shouldn't be specific to Avro.
Please make a tutorial on Elasticsearch.
Excellent !
Thanks for the great tutorial, Very well explained.
Excellent!!
Where is the link? I am not able to download it; it shows "download from Maven Central".
awesome
Very well explained. Thanks :)
Great tutorial! I am looking forward to seeing how to make old and new producers/consumers work together, because right now I can't see how that could happen...
After watching it a second time, I got it :)
I still didn't get that. Is it covered in any other video?
I'm curious: isn't it simpler if we always serialize our object to a string (using Gson) before sending it to Kafka? Then on the consumer side, once we receive the string, we can simply deserialize it back into the object.
+chen hau khoo Yes, we can do that easily. In fact, JSON is quite popular in simple scenarios, and JSON support is built into Kafka. However, when you have an evolving schema, Avro could be a better option. I have covered schema evolution in a video.
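For illustration (not from the video), a minimal sketch of this string approach could look like the following. The Click POJO and the "clicks" topic name are made up here; both producer and consumer just use Kafka's built-in StringSerializer/StringDeserializer.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import com.google.gson.Gson;

public class JsonStringProducer {
    // Tiny stand-in POJO, purely for illustration
    static class Click {
        String session = "S1";
        String url = "/home";
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        Gson gson = new Gson();
        String json = gson.toJson(new Click());   // object -> JSON string

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("clicks", json));
        }
        // Consumer side: Click back = gson.fromJson(json, Click.class);
    }
}

The catch, as the reply above notes, is that nothing in this setup tracks or enforces schema changes; that is where Avro plus a registry helps.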
What is the advantage of using AvroSerializer/Deserializer over the following approach? I created a Google protobuf object, converted it into a ByteString, sent it as the message, and used org.apache.kafka.common.serialization.ByteArraySerializer and org.apache.kafka.common.serialization.ByteArrayDeserializer.
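To make the approach in this question concrete, a rough sketch might look like this. ClickMessage and its setSession builder method stand in for a hypothetical protobuf-generated class (from a .proto file); the topic name is illustrative.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ProtobufBytesProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");

        // ClickMessage is a hypothetical protobuf-generated class
        byte[] payload = ClickMessage.newBuilder()
                .setSession("S1")
                .build()
                .toByteArray();                    // protobuf binary encoding

        try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("clicks", payload));
        }
        // Consumer side: ClickMessage back = ClickMessage.parseFrom(record.value());
    }
}

The main difference from the Avro/registry setup is that both sides must already agree on the exact .proto definition out of band, and there is no central place that tracks or validates schema versions.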
What if your producer changes and now adds one new field to the message record? Can you use the same consumer without changing it?
Thank you so much.
Hi, it's a nice tutorial on schemas in Kafka. Just one clarification: you told us earlier that the schema is embedded in the data, and the deserializer extracts the schema and deserializes the data. So what is the need for a schema registry when the data already has the schema embedded in it?
Embedding the schema in each record would increase the size of every record and ultimately impact performance. So the schema is stored in the registry, and only an ID is embedded in the message record.
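To see where the registry fits in code, here is a minimal producer configuration sketch. KafkaAvroSerializer ships with Confluent's kafka-avro-serializer artifact (not Apache Kafka itself), and the registry URL assumes a default local setup.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroRegistryProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // Confluent's Avro serializer, for both keys and values
        props.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        // Assumed local registry; point this at wherever your registry runs
        props.put("schema.registry.url", "http://localhost:8081");

        ClickRecord click = new ClickRecord();  // the Avro-generated class from the tutorial
        try (KafkaProducer<String, ClickRecord> producer = new KafkaProducer<>(props)) {
            // The serializer registers the schema once, caches the returned ID,
            // and embeds only that small ID in every record, not the schema text
            producer.send(new ProducerRecord<>("clicks", click));
        }
    }
}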
Same question here as well. But the ClickRecord.java class has the schema (variable name: SCHEMA$) along with the data, and while writing the producer/consumer code you are using ClickRecord.java, so you have the schema embedded in the Java file. Why do we need the schema registry?
@@sonunitjsr223 Same question. Since ClickRecord.java is generated from the schema, the consumer side already knows how to deserialize the message, so why do we need the schema registry?
@@DagangWei This approach, as @Learning Journal mentioned above, can be costly in terms of network, storage, and other processing costs, so it's better to use a schema registry. You only ship the full schema if you don't use a schema registry; otherwise, just a schema ID is sent in the message.
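For the curious, the framing Confluent documents for registry-encoded messages is small enough to show in a few lines. This sketch just reads the schema ID back out of a raw record value:

import java.nio.ByteBuffer;

public class WireFormatPeek {
    // Confluent's documented wire format: 1 magic byte (0), a 4-byte schema ID,
    // then the Avro-encoded payload. No schema text travels with the record.
    public static int schemaIdOf(byte[] recordValue) {
        ByteBuffer buf = ByteBuffer.wrap(recordValue);
        if (buf.get() != 0) {
            throw new IllegalArgumentException("Not a registry-framed record");
        }
        return buf.getInt();   // the ID the consumer uses to fetch the schema
    }
}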
Nice tutorial. Is there a .NET/C# equivalent of the Java SDK for Kafka, including all the advanced topics you covered like custom partitioning, commits, schema evolution, etc.?
You can use the Kafka REST Proxy if you want to use Kafka from C#.
Nicely defined... Good Job.
As we see, Avro schemas are defined in JSON. So is there any requirement that the data must also be in JSON, Avro, or ORC format, or can a simple flat file or CSV also be processed with an Avro/JSON schema?
Avro is itself a data format. If your data is in another format, your producer needs to encode it into an Avro object, as we have done in the example code.
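As an illustration of that encoding step (this exact helper is not from the video), a CSV line could be turned into an Avro GenericRecord like this. The two-field schema is hypothetical:

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

public class CsvToAvro {
    // Hypothetical two-field schema, just to show the encoding step;
    // the video uses a generated ClickRecord class instead
    private static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Click\",\"fields\":["
      + "{\"name\":\"session\",\"type\":\"string\"},"
      + "{\"name\":\"url\",\"type\":\"string\"}]}");

    public static GenericRecord fromCsvLine(String line) {
        String[] parts = line.split(",");
        GenericRecord record = new GenericData.Record(SCHEMA);
        record.put("session", parts[0]);
        record.put("url", parts[1]);
        return record;   // ready to send through an Avro serializer
    }
}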
Thank you for the tutorial, but I have a question about the schema registry: who sets it up, and where?
The schema registry is an optional component of the Kafka ecosystem. If you need it, the cluster admin should set it up on a dedicated host machine.
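Once the admin has it running, any client can sanity-check it over HTTP, since the registry is just a REST service. This sketch assumes the registry's default listener on port 8081 and requires Java 11+ for java.net.http:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegistryCheck {
    public static void main(String[] args) throws Exception {
        // 8081 is the registry's default listener; adjust host/port to your setup
        HttpRequest request = HttpRequest
            .newBuilder(URI.create("http://localhost:8081/subjects"))
            .GET()
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());  // JSON array of registered subjects
    }
}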
@@ScholarNest The "ClickRecord" class can serve as the schema for serializing and deserializing, right? Why is a schema registry required when we use ClickRecord with the value serializer... Please clarify this part. Thank you.