If you found this video helpful for learning about NiFi and ingesting text files, then consider subscribing to the channel to get notified about more videos.
Also feel free to leave a comment!
This is great, Steve. A few questions:
1. Can you please make this template available on your GitHub?
2. If we want to add additional attributes into the SQL via another constant CSV file (e.g. systemName="ABC"), or replace a parameter in the JSON before it's converted to SQL, how can we do that? An advanced version of this tutorial would be very helpful.
Thanks a lot for this!
Does it only support row-by-row insert or update? How can bulk inserts be implemented through DB-specific load utilities?
Would you please help?
The tutorial helped me fix a trivial bug in my processor configuration. Thanks, Steven!!
That's great to hear.
I love this tutorial, it is so well explained and detailed :) I would love to see a video on how to read a fixed-width file using NiFi, as I'm struggling with this assignment at work!
What a great video! Thank you!
So So So helpful. Thanks for making this.
You're welcome. Thank you!
Good work. But is it possible to group multiple SQL operations in one single database transaction? Example: first delete all records from the table, then insert multiple records. If any insert fails, roll back the delete and the previous inserts?
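NiFi won't group separate flowfiles into one transaction out of the box, but the all-or-nothing pattern described here is exactly what a single database transaction gives you. A minimal sketch using Python's built-in sqlite3 module (the table and rows are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'old')")
conn.commit()

rows = [(1, "alice"), (2, "bob"), (1, "dupe")]  # third row violates the PK
try:
    with conn:  # one transaction: commit on success, roll back on any error
        conn.execute("DELETE FROM users")
        conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
except sqlite3.IntegrityError:
    pass  # the bad insert rolled back the DELETE and the earlier inserts

# the original row survives because the whole transaction rolled back
print(conn.execute("SELECT * FROM users").fetchall())  # → [(1, 'old')]
```

The same idea applies with any JDBC-backed database NiFi talks to: as long as the delete and the inserts run on one connection inside one transaction, a failed insert undoes everything.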
It was an amazing tutorial. Thanks Steven !!
Thank you! It's been great to see it so helpful to everyone.
Thanks a lot, Steven! You are doing a great job.
Thank you for your videos!
Thanks for the tutorial, I'm just starting with NiFi. Could you please share a link to pull all the CSV files? Regards from Argentina
Sorry about this. That was one of the first videos I did, before I started to include the resources in my Git repo. Let me see about doing a different video on txt files, and I will make sure to include the resources and the NiFi template for download.
Hello Steven, your videos on Apache NiFi are very interesting and helpful. I was trying to use Apache NiFi with AWS via the API request processor InvokeAWSGatewayApi, but I can't send a file in a POST request. Can you make a video on sending API requests to AWS, especially covering sending files via an API request?
Amazing content Steven 👌
Thank you!
Awesome demo! Thanks for sharing!
It's great that the video was helpful.
First of all thanks for this video!
And I also had a question.
Why not use the "PutDatabaseRecord" processor for the same purpose? It seems to me much more logical and easier to understand.
I use "PutDatabaseRecord" all the time in my dataflows. It just depends on what I'm doing with the flowfiles. I thought I had already covered it in a video; I'll look to get it into one of my next couple of videos. Thanks for asking about it.
Thanks for the tutorial, but I have a question. I have an ExecuteSQL processor that gets records from a table with a timestamp within the latest 10 minutes. This outputs n flowfiles (1 row per flowfile). I run this processor every 5 minutes (so I get an overlap of records, intended). After that, I pick up those flowfiles, convert them from Avro to JSON, then JSON to SQL, and send them to a PutSQL processor pointed at another database. I was expecting PutSQL to treat each flowfile as a transaction: if a flowfile's data was already in the destination database, it would be discarded, and PutSQL would move on to the next flowfile, extract the data, and try the INSERT INTO tablename. But if I have 3 flowfiles with different data, and only 1 flowfile's data is already in the destination database, all 3 flowfiles' inserts fail. Any hint/tip?
Nuno Silva, if possible I would check /logs/nifi-app.log and see what reason it gives for why the flowfiles were rejected during the PutSQL, since the PutSQL processor constructs normal SQL statements. It sounds to me like there could be a constraint on the table that isn't allowing the flowfiles to insert. If there is a unique ID on the table you are inserting into, then your timestamp would not help. I hope I understood the problem correctly, but in any case I think you would find the most help by checking the logs and seeing what the error was for those flowfiles that didn't write.
@@StevenKoon It says I'm violating a primary key and gives me the PK value. I select all records from the table with that PK and it returns no data (no record with that PK). And this happens because the previous flowfile, with one row, did indeed have a record with a PK already in the database.
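For what it's worth, that symptom is consistent with PutSQL's batching: it groups statements up to its configured Batch Size into one transaction, so a single primary-key violation can fail the whole batch. Two common workarounds are setting Batch Size to 1, or making the insert idempotent so a duplicate key becomes a no-op. A sketch of the idempotent-insert idea with Python's sqlite3 (MySQL has the analogous INSERT IGNORE / INSERT ... ON DUPLICATE KEY UPDATE; the table and values here are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("INSERT INTO events VALUES (1, 'already-there')")
conn.commit()

incoming = [(1, "duplicate"), (2, "new"), (3, "new")]
for row in incoming:
    # OR IGNORE turns a duplicate-key insert into a no-op instead of an error,
    # so one stale row no longer poisons the rest of the batch
    conn.execute("INSERT OR IGNORE INTO events VALUES (?, ?)", row)
conn.commit()

print(conn.execute("SELECT id FROM events ORDER BY id").fetchall())
# → [(1,), (2,), (3,)]
```

With this shape of INSERT coming out of the flow, overlapping records from the 5-minute window are silently skipped instead of failing the whole PutSQL batch.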
Excellent explanation.
Do these CSV files need to be bound in a volume to the Docker container?
Dear sir, why do we need to convert the flowfile into JSON before making the INSERT SQL statement?
Good question. You don't need to. Another way to do the flow in this video is GetFile > PutDatabaseRecord. In PutDatabaseRecord you can configure the Record Reader for the schema of your CSV files and insert directly into the table, with only two processors total.
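In other words, the two-processor version would look roughly like this (property names follow the GetFile and PutDatabaseRecord processors; the paths, table name, and reader service are placeholders):

```
GetFile
  Input Directory: /path/to/csv/drop/folder

PutDatabaseRecord
  Record Reader:                        CSVReader   (controller service configured with the CSV schema)
  Database Type:                        MySQL
  Statement Type:                       INSERT
  Database Connection Pooling Service:  DBCPConnectionPool
  Table Name:                           my_table    (placeholder)
```

The CSVReader controller service replaces the ConvertRecord/ConvertJSONToSQL steps, since PutDatabaseRecord reads the records and builds the inserts itself.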
Can you make a video on how to use Docker with NiFi and SQL?
Excellent tutorial. Thank you for creating/posting it. I like that I can inspect the queue at each step. For a given queue entry, is it possible to determine the line number in the CSV that produced it?
Thank you for watching it.
Well done dude 👍👏👏🌶🔥
Thank you.
great tutorial, thanks Steven !!
That's great to hear. Thanks!
Greetings,
First, I wanted to thank you for the tutorial.
I've been trying to connect to an OrientDB database the same way you connect to MySQL, and it fails. Do you know how to connect to OrientDB, with emphasis on filling in the fields: Connection URL, Class Name, and the path?
Thank you very much!
Let me get OrientDB set up in a Docker container, and I'll see if I can successfully insert data into it so I can share the connection configuration with you.
Hi, I saw that at the GetFile processor, after some time, if you have let's say 10 out and it does not do anything (stopped, or the queue on the connector is full), the count begins to go down until it gets back to 0. What does this mean and why does it happen? Also, is this useful for something? And how long is the interval at which it reduces them (for example, every 10 seconds? every 4-5 seconds?)? Thank you
Thanks for the video. Maybe you can guide me on how to download the JDBC driver so I can enable the DBCP connection pool; I can't make the status change to "Enabled".
If it is stuck as "Enabling" rather than "Enabled", then it is having issues establishing a connection. Edit your controller service and ensure you point to the driver file itself: /path/to/nifi-install/lib/mssql-jdbc-12.8.1.jre11.jar
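A quick way to double-check that the jar at that path really contains the class named in Driver Class Name: a .jar is just a zip archive, so Python's standard library can inspect it (the jar path and class name below are just examples, not required values):

```python
import zipfile


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Check whether a JDBC driver jar contains the named driver class.

    class_name uses dotted form, e.g. "com.microsoft.sqlserver.jdbc.SQLServerDriver".
    """
    # a class com.foo.Bar lives inside the jar as the entry com/foo/Bar.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

If this returns False for your jar and class, the Driver Class Name and Driver Location properties don't match, which would keep the controller service stuck in "Enabling".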
Hi Steven,
Thanks for the video. I have a question regarding ConvertJSONToSQL & PutSQL: can you use them against MS SQL?
Yes, absolutely. I do it all the time at work. Let me get an example set up for you to look at for the connection configuration.
@@StevenKoon Thank you, Steven. I'm kind of new to NiFi, and it's been a while since I used Java, since I come from a .NET environment, so forgive me if my questions seem trivial. I was finally able to figure it out, but it took me a while to learn how to populate values for the DB Connection URL, Driver Class Name, and Driver Location fields. The documentation on this is very lacking, so an example would definitely be helpful for people coming from the Microsoft world. Here is how I populated those fields in my case, with [Place Holders]:
DB Conn. Url: jdbc:sqlserver://[ServerName];databaseName=[DbName]
DB Driver Class Name: com.microsoft.sqlserver.jdbc.SQLServerDriver
DB Driver Location: [Installation path]\sqljdbc_8.2\enu\mssql-jdbc-8.2.2.jre8.jar
One problem I faced is when I tried to use "IntegratedSecurity=true" instead of providing credentials, but I got an error on that. So if you can explain how to do it, that would be greatly appreciated. Thanks
@@samsal073 Hi Samer Saleh! Have you got your problem solved? I face the same problem.
@@eliaskamyabify Yes, if you follow my answer you'll see how to set up the DBCPConnectionPool service. Also, Steven has a video that shows how to connect to SQL Server.
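For reference, a sketch of the DBCPConnectionPool settings for SQL Server with integrated security; the bracketed values are placeholders, and note the property is spelled integratedSecurity=true in the URL:

```
DBCPConnectionPool
  Database Connection URL:     jdbc:sqlserver://[ServerName];databaseName=[DbName];integratedSecurity=true
  Database Driver Class Name:  com.microsoft.sqlserver.jdbc.SQLServerDriver
  Database Driver Location(s): [Installation path]\enu\mssql-jdbc-8.2.2.jre8.jar
```

As far as I know, the Microsoft driver's integrated security additionally requires the mssql-jdbc_auth DLL (shipped in the driver download) to be on the NiFi JVM's java.library.path, and it only works when NiFi itself runs on Windows, which is likely why it errors when only the jar is configured.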
More content please?!??
I'd gladly pay for a full revitalized tutorial on udemy.
Apache Airflow/NiFi: from zero to hero!
Hi man, I'm learning to use NiFi for my work, and I'm having an issue with the put JSON to SQL processor: I can't find the folder where I have my MySQL drivers. I'm trying to find them to download, but it seems more difficult than I thought. Is there any other way to put the JSON into my database without declaring the driver?
BTW excellent job
If you check out this video ua-cam.com/video/0YROsMuqpFo/v-deo.html, you will see in the comments that there is a link to the MySQL JDBC drivers. All you need to do is place the drivers into a directory on your NiFi server that NiFi has access to, and use that to configure the MySQL connection for your database if you are trying to place the data into a table in the database.
What do the controller services really do? What would have happened if you didn't use CSVReader? I can't get the meaning of controller services.
Thank you
I tend to think of them as a connection string to your destination. For example, you might have one that points to database ABC on server 1, and another that points to database XYZ on server 1 as well. Now your processors can use these ready-made connection strings from a drop-down menu. No need to add connection details to each component you put on the canvas, and there's only one place to update the password when it needs changing, too.
Thank you very much, great content! Your video helped me a lot in learning the tool!
Thank you.
How do I get the path for the Database Driver Location in Docker?
When running NiFi in a Docker container, I set up a volume in my NiFi docker-compose file. So I bind a folder from the host machine to a folder in the container. I can then place files into the host-machine folder, and they will appear inside the container. This makes it possible to get database drivers, and other files, to NiFi.
You can see an example of a docker-compose that does this here: github.com/skoonData/docker-compose/blob/main/nifi_cluster.yaml
Look at the volumes section where the type is "bind".
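The relevant part of such a compose file looks roughly like this (the image tag and both paths are just examples, not values from the linked file):

```yaml
services:
  nifi:
    image: apache/nifi:latest
    volumes:
      # bind: the host folder appears inside the container, so jars dropped
      # on the host are immediately visible to NiFi at the container path
      - type: bind
        source: ./drivers            # host folder holding the JDBC jars
        target: /opt/nifi/drivers    # container path to use as Driver Location
```

Inside NiFi you would then point the Database Driver Location property at the container-side path (here /opt/nifi/drivers/your-driver.jar), not the host path.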
@@StevenKoon thank you Steven!
Thank you very much
Please zoom in a bit when making the videos; it's difficult to watch!
How do I get the URL for the driver and the JDBC connection?
A web search.
👍👍
cool👍