Hi Amit, you may not know me, but I have encountered you so many times in the Power BI Community. I started to learn Fabric recently, and suddenly you appear everywhere when I search for Fabric tutorials. What a happy reunion!
Hi there! Thanks for the kind words! 😊 I'm glad to hear we've crossed paths in the Power BI Community and now with Fabric too! Welcome to the Fabric journey - happy to be part of your learning experience!
My friend, I would like to thank you for all the effort and time that you've been dedicating to this series. Thanks!
Thanks 🙏
Hope you will like the full series
Mastering Microsoft Fabric 35+ Videos: ua-cam.com/video/p-v0I5S-ybs/v-deo.html
Thank you Amit, very nicely done. Fabric is the future 😎
Thank you for your kind words! I completely agree with you. Microsoft Fabric is an incredible technology that holds immense potential for the future. Its innovative features and capabilities make it a game-changer in the industry. I'm excited to see how it will continue to evolve and shape the future of computing. Kudos to the Microsoft team for their outstanding work!
Hope you will like the full series
ua-cam.com/video/p-v0I5S-ybs/v-deo.html
Fantastic. Exactly the video I needed. I highly appreciate you, Amit, for doing this.
I'm glad to hear that the video was exactly what you needed! Microsoft Fabric is truly an impressive technology, and I'm happy that the video provided you with the information you were looking for. Thank you for sharing your positive feedback!
One more is coming soon - Incremental load for Warehouse using only a data pipeline.
Thanks so much. Please make more videos on Dataflow Gen2.
Sure. Hope you will like more videos in the series
Mastering Microsoft Fabric 40+ Videos: ua-cam.com/video/p-v0I5S-ybs/v-deo.html
Hi Amit - in this demo, you are basically pulling all the data from the source and then doing a lookup against the fact table to identify which records need modification. Can't you use a proc on the source that takes a date parameter, so it only returns records after the last data sync?
There are many ways you can do this. In some cases, I have even used a SQL statement to get the max date over a SQL Server connection (for both Lakehouse and Warehouse).
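For illustration, a rough sketch of that max-date approach from local Python (not the exact code from the video; the connection strings, table, and column names are placeholders you would replace, and the authentication options depend on your environment):

import pyodbc

# Placeholder connection strings - point them at your Warehouse SQL endpoint
# and at the source SQL Server; adjust authentication to your setup.
warehouse_conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=your-warehouse-sql-endpoint;Database=YourWarehouse;"
    "Authentication=ActiveDirectoryInteractive")
source_conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=source-server;Database=SourceDb;Trusted_Connection=yes")

# 1. Read the last loaded date from the fact table in the Warehouse.
max_date = warehouse_conn.execute(
    "SELECT MAX(order_date) FROM dbo.fact_sales").fetchval()

# 2. Pull only the rows that arrived after that date from the source.
rows = source_conn.execute(
    "SELECT * FROM dbo.sales_orders WHERE order_date > ?", max_date).fetchall()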
Good. But PostgreSQL is on-premise, right? So how have we used a self-hosted IR or gateway to access it?
You can use the on-premise gateway:
Microsoft Fabric: Load local SQL server data in Lakehouse using on-premise Gateway| Dataflow Gen2 - ua-cam.com/video/oEF-jHVmvdo/v-deo.html
Now Integrate On-Premise Data with Microsoft Fabric Using Data Pipelines: Gateway Update- ua-cam.com/video/sVePvZOjtoo/v-deo.html
Thanks for the detailed explanation here. What would happen in case there is no incremental data? For example, no new sales orders over the weekend. It'll be great if you can share the process to handle that.
If there is no new data coming in, add an additional condition: update only when the date is not null, or use COALESCE.
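A rough sketch of that guard from a notebook (table and column names are placeholders, and it assumes Delta tables; the same COALESCE idea works inside a Warehouse procedure):

# Only move the watermark forward when something actually arrived;
# an empty weekend run then becomes a harmless no-op.
new_max = spark.sql("SELECT MAX(order_date) AS d FROM staging_sales").first()["d"]
if new_max is not None:
    # Assumes last_loaded_date is a date/string column; adjust the literal to your type.
    spark.sql(f"UPDATE etl_watermark SET last_loaded_date = '{new_max}' "
              "WHERE table_name = 'fact_sales'")
# else: no new rows - keep the previous watermark untouched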
Thanks again for such a nice blog. Is it possible for you to tag the videos with numbers so that it will be easy to follow them instead of looking at the publishing date? Thanks for understanding.
Please follow the series link. I will see if I can rename them and add numbers
ua-cam.com/video/p-v0I5S-ybs/v-deo.html
Hi, is it possible to use a WHERE clause in the Copy data activity when the source is a Microsoft Fabric SQL analytics endpoint, as in ADF?
Please check the incremental load videos; I have given an example of using the WHERE clause (a rough sketch also follows the links below):
Microsoft Fabric: Incremental ETL for Warehouse using Data Pipeline, SQL Procedure: ua-cam.com/video/qsOIfTzjCSQ/v-deo.html
Microsoft Fabric: Incremental ETL for Warehouse using Dataflow Gen 2, SQL Procedure, Data Pipeline: ua-cam.com/video/mpFRnZBXsvQ/v-deo.html
#microsoftfabric- How to use Append data option of Data Flow Gen 2 and Data Pipeline| Incremental ua-cam.com/video/E5axX6auBN4/v-deo.html
Microsoft Fabric: How to append only Incremental data using Data Pipeline in Lakehouse: ua-cam.com/video/xzhmaQnB2fM/v-deo.html
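For reference, a rough sketch of the kind of source query you can use in the Copy activity (entered via dynamic content); LastLoadDate is an assumed pipeline parameter, and the table and column names are placeholders. A Lookup activity output can drive the filter the same way.

# The string below is what you would paste into the Copy activity's source
# "Query" box as dynamic content.
source_query = (
    "SELECT * FROM dbo.sales_orders "
    "WHERE order_date > '@{pipeline().parameters.LastLoadDate}'"
)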
Hi Amit, good morning! I have to convert existing SQL Server stored procedures to the Fabric environment. My stored procedures contain CURSOR commands, but Fabric doesn't support CURSOR commands. How do I proceed in this case - is there any alternative?
I have tried something very similar with T-SQL loops:
learn.microsoft.com/en-us/azure/synapse-analytics/sql/develop-loops
Search for the Azure Synapse solution without using a cursor.
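A rough sketch of the cursor-free pattern that article describes (it targets Synapse, so adapt the DDL, e.g. the temp-table creation, to what your Warehouse supports; all names here are placeholders): number the rows you would have cursored over, then step through them with WHILE.

# Fold the T-SQL below into the converted procedure body, or run it from the SQL editor.
cursor_free_tsql = """
CREATE TABLE #work AS
SELECT ROW_NUMBER() OVER (ORDER BY item_id) AS seq, item_id
FROM dbo.work_items;

DECLARE @n INT = (SELECT COUNT(*) FROM #work);
DECLARE @i INT = 1;
WHILE @i <= @n
BEGIN
    DECLARE @item INT = (SELECT item_id FROM #work WHERE seq = @i);
    -- ...do the per-row work the cursor body used to do...
    SET @i += 1;
END;
"""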
Hi Amit,
One question: what if we update some records - will they also come through, or will only newly added records come?
Thanks!
It depends on the type of incremental load you run: if you use the update date as the basis, it will bring in both new and previously updated data, which you will need to handle in the ETL process.
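For example, a rough way to handle it in a notebook (placeholder names, and it assumes incoming_raw holds the rows pulled with WHERE last_modified > the last sync date): if the same key was updated more than once since the last sync, keep only the latest version before the upsert step.

spark.sql("""
CREATE OR REPLACE TEMP VIEW incoming_latest AS
SELECT * FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY order_id
                              ORDER BY last_modified DESC) AS rn
    FROM incoming_raw
) AS x
WHERE rn = 1
""")
# incoming_latest can then feed a MERGE like the one shown further down in this thread.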
Hi Amit, sorry if it looks like I'm not making the proper effort, but I quite don't get how you can use DML in the SQL endpoint. Whenever I try to use INSERT, UPDATE, DELETE, etc., I get an error that those commands are not supported, and I don't get what I'm doing wrong. On the other hand, I really appreciate all your material.
Hi Joan,
Hope you are using that on the Warehouse SQL endpoint, not the Lakehouse one. On the Lakehouse you can run DML using Spark SQL -
Managing Microsoft Fabric Lakehouse using Spark SQL: Data Manipulation Insert, Update, Delete, Alter- ua-cam.com/video/PunKbz4iCEg/v-deo.html
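For the Lakehouse side, a couple of Spark SQL DML statements you can run from a notebook against a Delta table (table and column names are just placeholders):

spark.sql("UPDATE sales SET status = 'shipped' WHERE order_id = 1001")
spark.sql("DELETE FROM sales WHERE order_date < '2020-01-01'")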
@@AmitChandak You're so kind. Yes, I kept watching your videos and found out the issue was exactly that - I was using the Lakehouse. Thanks, your videos are excellent.
Sir,
How can we build a JDBC/pyodbc connection between a Fabric Data Warehouse and a Fabric Notebook?
I have been looking for this for a long time, but unsuccessfully.
I have not tried it from a notebook, but I have tried from local Python to the Warehouse. Check if that code can help:
Microsoft Fabric: Load local SQL server data in Warehouse using Python, Pandas, and sqlalchemy - ua-cam.com/video/P0o-a-8rFH0/v-deo.html
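A rough sketch of that kind of connection from local Python (not the exact code from the video; the endpoint, database name, and authentication option are placeholders and depend on your tenant and driver setup):

import sqlalchemy as sa

engine = sa.create_engine(
    "mssql+pyodbc://@your-endpoint.datawarehouse.fabric.microsoft.com/YourWarehouse"
    "?driver=ODBC+Driver+18+for+SQL+Server"
    "&authentication=ActiveDirectoryInteractive"  # interactive sign-in; adjust to your auth setup
)
with engine.connect() as conn:
    # Simple smoke test against a placeholder table.
    print(conn.execute(sa.text("SELECT COUNT(*) FROM dbo.fact_sales")).scalar())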
❤❤
It will definitely solve my scenario
Thank you so much sir 😊😊
The tutorial uses SQL commands in the data warehouse to create/alter tables; I can only load data into the lakehouse via Copy data. Dataflow Gen2 won't work due to our firewall setup. How can I implement incremental loading in the data lakehouse? I understand one can't change data through the SQL endpoint of a lakehouse.
@zhiyingwang1234, you can use Spark SQL or PySpark. In Spark SQL, try MERGE:
spark.sql("""
MERGE INTO target_table AS target
USING source_table AS source
ON target.id = source.id
WHEN MATCHED THEN
UPDATE SET target.value = source.value
WHEN NOT MATCHED THEN
INSERT (id, value) VALUES (source.id, source.value)
""")
Can this process handle millions of rows? I have a table which has 13 million rows and keeps increasing every month. I am not able to use incremental refresh on Dataflows because of the large data volume. I am hopeful this can work out.
I successfully managed to load up to 4 million records without incremental load. Let me share a method to achieve this. Create a table that has two dates: "Start Date" and "End Date." Customize the date range to correspond to either a year or a month, depending on your specific requirements. After each loading cycle, once the data processing concludes, adjust the "Start Date" and "End Date" to the subsequent year or month, as dictated by your business needs. This can be achieved by performing an update within a procedure. Execute this process multiple times, or create a loop within a data pipeline, to ensure a seamless and efficient data loading operation.
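A rough sketch of that control-table idea from a notebook (placeholder names, assuming Delta tables): load one window, then slide the Start/End dates forward so the next iteration of the pipeline loop picks up the following slice.

# Load only the rows that fall inside the current window.
spark.sql("""
INSERT INTO fact_sales
SELECT s.*
FROM staging_sales s
JOIN load_window w
  ON s.order_date >= w.start_date AND s.order_date < w.end_date
""")

# Advance the window by one month for the next iteration of the loop.
spark.sql("""
UPDATE load_window
SET start_date = add_months(start_date, 1),
    end_date   = add_months(end_date, 1)
""")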
Hi Amit, do you know if it is possible to connect directly to a pipeline and use the upsert option for a SQL Server? I have tried it, but I cannot connect to the server. I have connected successfully using Dataflows, but the pipeline gives the following error: "The server was not found or was not accessible." This would make incremental ETL for a SQL Server so much easier, but I think the option is still not available in Fabric.
Hi Ilse,
It seems you're having issues connecting directly to an on-premise SQL Server using a pipeline and getting the error "The server was not found or was not accessible." An on-premise SQL Server requires a gateway to establish a connection, and currently Dataflow Gen 2 is the only option supporting on-premise sources. Unfortunately, the option to connect directly via a pipeline might not be available in Fabric at the moment.
Thanks,
Amit
Thank you Mr. Amit, great job sir.
May I ask you a question? I have MS Fabric, Power BI, and MS SQL Server on my laptop.
The database I work with includes more than 4000 tables. It is for JD Edwards EnterpriseOne, it is called JDEData, and my server name is, for example, Localhost. For the last 2 weeks I have tried to connect to my localhost, but I can't connect.
Note, sir, that I can connect in Power Query by just giving the server name and database name and a simple SELECT statement for the table and the fields I need.
By the way, I have never worked with Power BI service before.
Can you please guide me on connecting to my localhost through whatever works - Dataflow Gen2, a pipeline, or anything else - into a Fabric lakehouse?
I am very sorry my comment is very long, but I have tried hundreds of times and could not do it, and I have no networking experience.
Again, I am sorry, and thank you.
To connect with a local source, you first need to install the on-premise gateway, and in the dataflow you have to use that gateway while creating the connection. Last time I tried the flow on Fabric, it was giving some issues, and I reported the same. I will check and get back on this.
I am able to complete the flow using a local SQL Server and the latest on-premise gateway - ua-cam.com/video/oEF-jHVmvdo/v-deo.html
I'm wondering - your PostgreSQL database is local, so how can you connect to Fabric without using a gateway?
No, it is in the cloud; I have used an IP.
@@AmitChandak I tried again using PostgreSQL in the cloud and successfully ran each step: Dataflow and stored procedure. However, when I created a pipeline with these 2 steps like in your video, it failed when running the Dataflow in the pipeline. Could you please explain the reason?
The details of the error:
Error code: 20302
Failure type: User configuration issue
Details: Microsoft.DataTransfer.Execution.Core.ExecutionException: Refresh Dataflow failed with status: NotFound message: Not Found
I am trying to call a dataflow and execute it from a pipeline, but I get a 'User Configuration issue' error. Any idea how to fix this?
I am able to call Dataflow Gen2 from a pipeline; I tested it a few minutes ago. Please provide more details so I can reproduce it.