🚨 Since this video was published, Confluent have released their Oracle CDC connector - see this blog for more details: www.confluent.io/blog/introducing-confluent-oracle-cdc-connector/?.devx_ch.rmoff_LAoepZTapMM&
Congratulations on the presentation.
Great presentation 👍👍
Thanks, glad you liked it!
I have configured Oracle as a source with Kafka using Debezium and LogMiner. It takes the snapshot but doesn't stream any changes made to the database afterwards, and I also can't find any consumer offsets related to the data taken through the snapshot. What am I missing here? Can anyone tell me please @Robin Moffat
That was a great demo. Thank you for doing this.
Glad you liked it!
Full of information, but the speaker's speech was too fast.
You can adjust the playback speed on YouTube ;-)
(but glad you like the content, thanks)
Hi Rmoff, when you say that query-based CDC will only pull the records at the first poll and the last poll and not the ones in between, does that mean that for an Oracle table with high throughput (crores of records per day, i.e. tens of millions), the JDBC source connector (query-based) might not pull all records into Kafka? I am facing this problem: records are getting missed during the day, with no other major configuration difference in the source connector.
Also, when the connector polls (suppose timestamp-based), it does something like select * from table where timestampcol=last_poll_time.
So how does it lose records while polling?
Anyway, it's a great video.
Yes, exactly that, which is why log-based CDC is better in many situations. I cover this also here: rmoff.dev/no-more-silos
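For anyone else wondering how records can slip through, here is a rough sketch of a timestamp-mode JDBC source connector, assuming the Confluent kafka-connect-jdbc connector; the connection details, table name, and column names are placeholders, not from the video:

# hypothetical standalone connector properties (all names are illustrative)
name=oracle-jdbc-source-orders
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1
connection.user=connect_user
connection.password=connect_password
table.whitelist=ORDERS
# timestamp mode means each poll roughly runs:
#   SELECT * FROM ORDERS WHERE UPDATE_TS > <last stored offset> AND UPDATE_TS <= <now>
# rows that are committed late with an already-seen timestamp, or rows whose
# timestamp column never changes, are where records can get missed
mode=timestamp
timestamp.column.name=UPDATE_TS
# a unique, monotonically increasing numeric key makes the offset tracking more robust:
# mode=timestamp+incrementing
# incrementing.column.name=ORDER_ID
poll.interval.ms=10000
topic.prefix=oracle-

Even then, the connector only captures the state of a row at poll time, so intermediate updates between polls are still lost; log-based CDC is the way around that.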
So Amazing
Does this work with Oracle Cloud as well? We are using Informatica CDC today, but our source Oracle system is now moving to the cloud and Informatica CDC does not seem to work with Oracle Cloud. How about Confluent Kafka?
10:40 Kafka does NOT use the push model. Actually, the S3 sink pulls the data from Kafka, as every Kafka consumer does.
I think we're both right ;-)
The Kafka Connect worker uses the Consumer API under the covers to consume (pull) data from Kafka, and then pushes the data to S3.
Do you have any sample without using containers?
Most of what I do is in Docker as it's just easier for creating and sharing demos. If you've got a particular question about an element of it that you need help with outside of a container then feel free to head over to forum.confluent.io/ and ask there :)
Hi! Thanks for the good explanation. When I use the JDBC source connector for Oracle, the table's column names come through wrapped in double quotes, which causes an insert error at Postgres due to a column mismatch (the tables are already created in Postgres). Can we avoid wrapping the column names in double quotes? Do we need to set any other configuration parameter to avoid this? Thanks for your help in advance.
A good place to ask this is at forum.confluent.io/
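If it is the sink side generating the quoted identifiers, a connector-level setting may help. A minimal sketch, assuming the Confluent JDBC sink connector and its quote.sql.identifiers option (check the docs for your version; the connection details and topic name below are placeholders):

# hypothetical JDBC sink connector properties (illustrative values)
name=postgres-jdbc-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
connection.url=jdbc:postgresql://postgres:5432/mydb
connection.user=postgres
connection.password=postgres
topics=ORDERS
# never quote identifiers in the generated SQL, so "ORDERS"."ORDER_ID"
# becomes ORDERS.ORDER_ID and matches the pre-created Postgres columns
quote.sql.identifiers=never
# write into the existing tables rather than creating new ones
auto.create=false
insert.mode=insert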
Hi rmoff, thanks for your input. I managed to set up an Oracle Kafka connector, but I get the following error when I try to import a big table:
"The message is 1320916 bytes when serialized which is larger than 1048576, which is the value of the max.request.size configuration."
I have been struggling to set this "max.request.size" the whole day but never managed. Where can I set this value? I am not using Docker and have confluent-5.5.1.
Thanks in advance.
Hi, the best place to ask this is on:
→ Slack group: cnfl.io/slack
or
→ Mailing list: groups.google.com/forum/#!forum/confluent-platform
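For anyone else hitting this: the error comes from the Kafka producer that the Connect worker runs for source connectors, so the setting lives on the Connect side rather than in the JDBC connection properties. A rough sketch, with example values only:

# in the worker config (connect-standalone.properties / connect-distributed.properties):
# raise the request size for the producers of all connectors on this worker
producer.max.request.size=5242880
# optional: allow per-connector overrides (available from Confluent Platform 5.3 onwards)
connector.client.config.override.policy=All

# or, with the override policy enabled, in an individual connector's own config:
producer.override.max.request.size=5242880

# the broker's message.max.bytes (and the topic's max.message.bytes) may also
# need raising before the broker will accept messages of that size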
Hi rmoff, since a savepoint creates an Oracle SCN, it gets translated to a CSN in OGG. Is there a way to filter out savepoint events in the Kafka handler? Again, thanks a lot for this great demo.
Hi, I've not worked with the Kafka handler in OGG much so I don't know the answer, sorry. If it's a message on a topic you could always filter it out post-ingest with Kafka Streams or ksqlDB.
@rmoff Thanks a lot. Even the OGG Kafka Connect adapter has this issue, as it's an abstraction on top of the OGG Java add-ons. It's really tough to differentiate a savepoint transaction from a committed transaction in a Kafka topic, as both of them look similar from the op_type. I'll try raising an SR with Oracle 😊 (I saw that Debezium has already fixed this issue for MySQL).
Do you have all the databases installed in Docker?
Yes - github.com/oracle/docker-images/blob/master/OracleDatabase/SingleInstance/README.md#building-oracle-database-docker-install-images
Hi, thanks a lot for your effort on this. I tried to do the same but got stuck on the Oracle Docker image. I could pull the Docker images from Docker Hub after logging in and start the rest from your docker-compose.yml file, except the Oracle one, which doesn't seem to work. I tried to build the Docker image for Oracle as you pointed out, but I wanted to do it on an AWS EC2 instance and got stuck there, as you cannot wget the installation file because it requires authentication. Can you point me to a solution here? How did you create your Docker image?
Hi, I built my Docker image for the Oracle database per instructions here github.com/oracle/docker-images/blob/master/OracleDatabase/SingleInstance/README.md#building-oracle-database-docker-install-images
@rmoff Hi, thanks for your response. Now I can connect Oracle to Kafka, but I face two main problems.
1) The Oracle DB table has more than a million entries, and when I use bulk mode I get only 800k entries in Kafka; when I use timestamp mode I get only about 200k.
2) The incrementing ID in the Oracle DB table is a string like "AB1234" and cannot be used. Can this somehow be cast to an integer?
Do you have any suggestions for these cases?
Thanks a lot for the videos and documentation you are providing. They helped me so much to get started with this topic. Keep it up. Your presentation was so clear and so helpful.
@@arada123 Hi, the best place to ask this is on:
→ Slack group: cnfl.io/slack
or
→ Mailing list: groups.google.com/forum/#!forum/confluent-platform
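On the second problem, one pattern worth trying is to give the JDBC source connector a custom query that derives a numeric key from the string ID and then use that derived column for incrementing mode. This is only a sketch: the table and column names, and the assumption that the numeric part of the ID is unique and increasing, are all hypothetical.

# hypothetical JDBC source connector properties with a derived numeric key
name=oracle-jdbc-source-custom-query
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1
connection.user=connect_user
connection.password=connect_password
# strip the alphabetic prefix from IDs like 'AB1234' and expose it as a number
query=SELECT T.*, TO_NUMBER(REGEXP_REPLACE(ID, '[^0-9]', '')) AS ID_NUM FROM MYTABLE T
mode=timestamp+incrementing
incrementing.column.name=ID_NUM
timestamp.column.name=UPDATE_TS
topic.prefix=oracle-mytable
poll.interval.ms=10000

Note that when query is set you drop table.whitelist, and the topic name is simply the topic.prefix value.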
Hi @rmoff,
I have a question. I use Avro to sink into my Oracle DB, but I don't know how to specify the database schema. I tried putting it in table.name.format, but it failed. Any suggestions to solve this? Btw, great demo 👍
Thanks in advance.
Hi, the best place to ask this is on:
→ Slack group: cnfl.io/slack
or
→ Mailing list: groups.google.com/forum/#!forum/confluent-platform
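For reference, the usual way to target a specific schema with the JDBC sink is to qualify it in table.name.format. A minimal sketch with placeholder names (the schema, topic, and Schema Registry URL are assumptions, not from the video):

# hypothetical JDBC sink connector properties writing into a named Oracle schema
name=oracle-jdbc-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
connection.url=jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1
connection.user=connect_user
connection.password=connect_password
topics=ORDERS
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://schema-registry:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://schema-registry:8081
# qualify the target table with the schema; ${topic} is replaced by the topic name
table.name.format=MYSCHEMA.${topic}
auto.create=false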
@rmoff Thank you for your fast reply and the group links.