Hey everyone, thanks for watching! How is your DP-600 studying going? 🤓 Please leave a LIKE and a COMMENT if you are finding this series useful in your preparation!
Great quality content! I'm following the whole DP-600 exam preparation series. Thanks for sharing your knowledge; it's really helpful.
I've booked my DP-600 exam seat for 10th May. This video came at the perfect time for my DP-600 exam prep.
Oh nice, best of luck for the exam, I should have a few more videos released before then too :)
Very helpful video! I especially loved the clear way you explained the differences between the different ETL/ELT methods.
Do you have a video in which you go over how to implement file partitioning?
Will's material is the best I have found so far. Thanks for all the effort!
In this video: ua-cam.com/video/rHaq9ysFpnE/v-deo.html
Hi Will, thanks for the video. You mentioned that a shortcut gives "real-time" access to the source, but I tried using Dataverse as the source and created a shortcut in Fabric, and it seems to take about half an hour to sync to Fabric...
Hey, is that just for the first sync? Or every sync? Not unreasonable for a first sync, if you have a lot of data to sync
Amazing. Thanks for sharing your knowledge.
Thanks for watching!
Hello Will, thanks for your video.
Does Power Query have any data model size limitation when importing data?
Hi Will, nice presentation. I think the answer to practice question 4 should be B, so that you are able to create a shortcut to access the table in Lakehouse B. With Viewer permission you only have access to the SQL endpoint. ViewAll access to the lakehouse would be sufficient, but that was not one of the options. Curious if I missed anything there.
Thanks for commenting, I believe A is the correct answer, see the Shortcut creation permissions from this table here: learn.microsoft.com/en-us/fabric/onelake/onelake-shortcuts#workspace-roles
You actually can load data from within a data pipeline to a data store located in a different workspace; it's just that the straightforward option isn't implemented in the UI for some reason. But if you get the destination Workspace ID and Item ID parameters and put them into the appropriate fields, it gets the job done.
That is correct, yes. They released an article yesterday showing this method, which is helpful! They are working on adding it to the UI 👍
Here's the link for those that want to read more: blog.fabric.microsoft.com/en-US/blog/copy-data-from-lakehouse-in-another-workspace-using-data-pipeline/
The answer D (Data Pipeline (Web Activity)) is not correct for practice question 2, since the Web activity is not meant for saving data. It is meant for making web requests and getting data from APIs for pipeline-level logic, and those responses can't be directly saved to a Lakehouse.
But still good work and great content like always! Keep it up! :)
Good point, thanks Aleksi, yes I should have been clearer here
Thank you for the content. I am looking for a way to efficiently copy data from an on-prem database to a 'bronze' layer. Is there a workaround for the fact that parameterization of dataflows is not possible (yet)?
Hey there, thanks for the comment! I don't think the lack of (external) parameterization in dataflows is a blocker for what you describe? You just have to set it up manually, which is a bit more effort to set up (and also to maintain, if your on-prem DB changes structure regularly).
Hi Will, what's your opinion on exam dumps? Do you think they're viable or outdated?
Sorry what do you mean by exam dumps?
Will, what's your view on using mixed methods across a Fabric estate, possibly over different stages of a medallion architecture? I have a team with mixed skills and I'd like the whole team to be able to serve themselves to a greater extent than they can right now. Many of the features and benefits are pushing me to be notebook-led, but that will likely exclude part of my team from being able to interact with, or create, additional ingestion/flows/transformations. I'm not sure what the right way to approach this is. Do we use notebooks for a foundational structure from source through Bronze to Gold, but use another workspace with shortcuts off Gold to allow the less code-savvy to use dataflows for certain purposes? I'm not sure and would welcome anyone's thoughts.
Hi Simon, probably best to ask this kind of question in your Skool community: Skool.com/Microsoft-fabric
@@LearnMicrosoftFabric Will do
16:55 Where is it mentioned that the transformations must be done in the dataflow, and that it can't leave the source data alone? Can't we use a dataflow with no transformations, or hack in some useless int->float->int transforms if it must have some steps?
You can have a dataflow with no transformations, but I don't think it's possible to export a JSON file from a dataflow
✅ High quality content. Highly recommended.
Thanks! And Thanks for watching!
Quality stuff. Good work, Will.
Thanks! Thanks for watching, hope you found it useful 👍
Hello Will, imagine that I have historic JSON files (thousands of them, adding up to a couple of hundred GBs). I need to append them and save them to a lakehouse for later consumption in Power BI. I believe that a notebook is the way to go, as pipelines can't get data from local files and a dataflow will struggle with that amount of data, am I right? Another question is about the ability of Power BI to connect to that amount of data in a lakehouse: will the report work, and will it be fast, taking into account that the connection would be Direct Lake?
Yeah, sounds like a job for a notebook 👍 and yes, it should be pretty quick with Direct Lake. 200GB of JSON will compress a lot by the time it's in a Lakehouse delta table.
Give it a try and find out 👍
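For anyone wanting to try this, here's a minimal PySpark sketch of that bulk load in a Fabric notebook. The folder path and table name are placeholders, assuming the JSON files have already been uploaded to the lakehouse Files area:

```python
# Minimal sketch: bulk-load historic JSON files into a Lakehouse delta table.
# Assumes the files are already in the lakehouse Files area; the folder path
# and table name are placeholders. `spark` is the notebook's built-in session.
from pyspark.sql import functions as F

raw_path = "Files/historic_json/*.json"   # hypothetical landing folder

df = (
    spark.read
    .json(raw_path)                                       # reads and unions all matching files
    .withColumn("ingested_at", F.current_timestamp())     # simple audit column
)

# Write everything out once as a managed delta table in the lakehouse.
df.write.format("delta").mode("overwrite").saveAsTable("historic_events")
```

The parquet/delta compression is what shrinks the couple of hundred GBs of raw JSON considerably once it lands in the table.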
@@LearnMicrosoftFabric Maybe more of a Spark question: consider that the user has an incoming file every week. Logically, he will schedule the notebook to run every week to append the new file to the delta table. My question is: will appending to the delta table require reading the delta table in the notebook, or can he append to the delta table without reading it first? I am concerned about the appending time every week; will it be too long?
Thanks.
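On the weekly append question: a delta append only writes new data files plus a transaction-log entry, so it doesn't need to scan or rewrite the existing table, and the run time should scale with the size of the new file rather than the size of the table. A rough sketch, again with placeholder names:

```python
# Weekly incremental load: append the new file to the existing delta table.
# Append mode adds new data files and a log entry; it does not read back the
# data already in the table (only the table's schema/metadata is checked).
new_file = "Files/incoming/week_19.json"   # hypothetical weekly drop

weekly_df = spark.read.json(new_file)

weekly_df.write.format("delta").mode("append").saveAsTable("historic_events")
```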
Hi Will
Thanks for the wonderful video.
Can you please upload a video on fetching data from a given API and storing it in a Fabric Warehouse?
I am trying to use this as a substitute for Informatica, where the newly generated data from the API should be merged into the Fabric Warehouse on a daily schedule.
Please explain this with a live API so that I can create a proper flow.
Thank you
Hi, if you go through some of my older videos on the channel, I talk through a REST API example 👍
@@LearnMicrosoftFabric Thank you Will for the update
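In the meantime, here is a hedged sketch of the general shape of that flow from a notebook: pull from a REST API, then upsert into a delta table on a schedule. The endpoint, key column and table names are invented for illustration, and landing the result in a Warehouse proper would still be a separate step (e.g. COPY INTO or a pipeline), so treat this as a starting point rather than a full Informatica replacement:

```python
# Rough sketch only: fetch new records from a hypothetical REST API and
# merge (upsert) them into a delta table on a daily schedule. The endpoint,
# key column and table names are placeholders, not a real API.
import requests
from delta.tables import DeltaTable

API_URL = "https://example.com/api/orders?since=yesterday"   # hypothetical endpoint

records = requests.get(API_URL, timeout=30).json()           # expecting a list of records
incoming = spark.createDataFrame(records)

target = DeltaTable.forName(spark, "orders")                 # existing delta table

# Upsert: update rows that already exist, insert the rest.
(
    target.alias("t")
    .merge(incoming.alias("s"), "t.order_id = s.order_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```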
In practice question 3, for option D, does the Warehouse support directly reading data from ADLS Gen2? I thought COPY INTO could only be used if the file is present in a lakehouse or somewhere within Fabric.
Yes, like this: learn.microsoft.com/en-us/fabric/data-warehouse/tutorial-load-data
@@LearnMicrosoftFabric ohh got it, it is similar to how it was in Synapse! Thanks Will
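For reference, the pattern in that tutorial is a T-SQL COPY INTO statement run against the Warehouse, pointing straight at the ADLS Gen2 URL. Below is a hedged sketch of driving it from Python with pyodbc; the connection string, table, storage path and credential are all placeholders (the linked tutorial uses its own public sample files, so no credential is needed there):

```python
# Hedged sketch: run COPY INTO against a Fabric Warehouse SQL endpoint.
# Connection details, table name, storage path and credential are placeholders.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<warehouse-sql-endpoint>;"
    "Database=<warehouse-name>;"
    "Authentication=ActiveDirectoryInteractive;"
)

copy_sql = """
COPY INTO dbo.sales_raw
FROM 'https://<storage-account>.dfs.core.windows.net/<container>/sales/*.parquet'
WITH (
    FILE_TYPE = 'PARQUET',
    CREDENTIAL = (IDENTITY = 'Shared Access Signature', SECRET = '<sas-token>')
);
"""

cursor = conn.cursor()
cursor.execute(copy_sql)   # the Warehouse reads the files directly from ADLS Gen2
conn.commit()
cursor.close()
conn.close()
```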
Big ups!!
Good work, Will. Thanks!
Thanks for watching!!
Could you do a video on database mirroring with Snowflake? Thanks
Hi Karen! I hope to cover database mirroring in more detail in the future, but full transparency it won't be for at least another month!
I know other UA-camrs have videos on it though might be worth a search!
Thanks!
Thanks so much Lean, I appreciate it! but honestly you don't have to!!
Where did you find that it is not possible to write to the DWH from a notebook? Thank you.
Through using the platform. It's a fairly well-known issue, although Microsoft are taking steps to bring the DWH and Spark engines closer together. Recently they announced this Spark connector (for Spark Scala): learn.microsoft.com/en-us/fabric/data-engineering/spark-data-warehouse-connector
@@LearnMicrosoftFabric Thank you! I found some kinds of workarounds for Spark code, but I was not able to find a specific mention of this issue.
@@alisabenesova A workaround to write data into a Data Warehouse from a Spark notebook? What was the workaround? Via JDBC?
@@LearnMicrosoftFabric There was a shortcut link :) I haven't tried it yet, but I googled for a solution for writing to the DWH 😂 a few approaches came up, though I haven't tested them. It might be crucial for our customers though.
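For anyone following this thread, one workaround pattern that often gets mentioned (a sketch, not an officially blessed route; the Spark connector linked above is the more direct option) is to stage the data from the notebook into a lakehouse delta table and then pull it into the Warehouse with T-SQL. Table and lakehouse names below are placeholders:

```python
# Workaround sketch: the notebook writes to a lakehouse delta table, and the
# Warehouse then ingests it with T-SQL (e.g. a cross-database INSERT...SELECT).
# Table and lakehouse names are placeholders.
df = spark.table("silver_orders")   # whatever DataFrame you want to land in the Warehouse

df.write.format("delta").mode("overwrite").saveAsTable("wh_staging_orders")

# Then, on the Warehouse side (T-SQL, run in the Warehouse, not in this notebook):
#   INSERT INTO dbo.orders
#   SELECT * FROM <lakehouse-name>.dbo.wh_staging_orders;
```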