Insights & Outliers
ELT 250M rows in less than 20 Minutes with Fabric Python Spark Notebooks
This video walks through importing 250M rows of data from a public source, landing the raw data in the bronze layer of a Fabric OneLake Lakehouse, combining it into a single flattened table in the silver layer, and then building a star schema in the gold layer. The entire process takes less than 20 minutes on a Microsoft Fabric F64 node, and you do not need to know how to write code to deploy it. Three PySpark Notebooks run the three respective steps, and a Fabric Pipeline orchestrates them. The source data is the CMS Medicare Part D database, and the GitHub repo you can use to recreate the entire solution is at this link: github.com/isinghrana/fabric-samples-healthcare . The repo was co-created by Inder Rana and Greg Beaumont, both of whom work at Microsoft at the time of this recording.
MENU
00:00 Intro
01:22 Starting at the GitHub landing page
02:34 Create Lakehouse and Import Spark Notebooks
04:52 Choose Spark Notebook Lakehouse Bronze Layer
06:10 Create and Configure Pipeline
10:00 Run Fabric Pipeline
11:10 Outro and Summary
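The silver-to-gold step described above (one flattened table split into a star schema) can be sketched in a few lines. This is only an illustration in plain Python, not the PySpark code from the repo: the column names, the sample rows, and the helper functions `build_dimension` and `build_star_schema` are all hypothetical.

```python
# Hypothetical flattened "silver" rows; the column names are illustrative only
# and are not taken from the CMS Medicare Part D files or the repo notebooks.
silver_rows = [
    {"prescriber": "A. Smith", "drug": "Drug X", "year": 2021, "total_claims": 12},
    {"prescriber": "A. Smith", "drug": "Drug Y", "year": 2021, "total_claims": 7},
    {"prescriber": "B. Jones", "drug": "Drug X", "year": 2021, "total_claims": 30},
]


def build_dimension(rows, column):
    """Assign a surrogate integer key to each distinct value of `column`."""
    keys = {}
    for row in rows:
        # setdefault only inserts when the value has not been seen before.
        keys.setdefault(row[column], len(keys) + 1)
    return keys


def build_star_schema(rows):
    """Split flattened rows into dimension lookups and a narrow fact table."""
    dim_prescriber = build_dimension(rows, "prescriber")
    dim_drug = build_dimension(rows, "drug")
    fact = [
        {
            "prescriber_key": dim_prescriber[r["prescriber"]],
            "drug_key": dim_drug[r["drug"]],
            "year": r["year"],
            "total_claims": r["total_claims"],
        }
        for r in rows
    ]
    return {"dim_prescriber": dim_prescriber, "dim_drug": dim_drug}, fact


dims, fact_rows = build_star_schema(silver_rows)
```

The fact rows keep only keys and measures, while the descriptive text lives once in each dimension. That is the general idea behind a gold-layer star schema, whatever engine actually builds it.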
Views: 125

Videos

Synthetic Podcast with an AI Simulation of Benjamin Franklin
74 views, 2 months ago
What if students and lifelong learners could have an AI chat bot simulating a historical figure for educational conversations? This video is a synthetic podcast using Microsoft Azure OpenAI and Azure AI text to speech to create an AI simulation of Benjamin Franklin in the form of a podcast. With Azure AI, tools exist for anyone to easily create an educational AI simulation of a historical figu...
Moving Fabric (Power BI) Workspace from Free Trial or P SKU to F SKU
1.5K views, 4 months ago
Do you have Microsoft Fabric (Power BI) Workspaces that need to be moved from either 1) the Fabric Free Trial capacity or 2) a Power BI Premium capacity (P SKU) to a new Fabric capacity (F SKU)? This video quickly walks you through reassigning a Workspace to a new F SKU. As long as your Premium capacity or Fabric Free Trial capacity is in the same Azure Region as your new ...
Create a Microsoft Fabric Node in Azure and Attach to a Workspace
659 views, 4 months ago
This video details the process of creating a PAYGO (Pay as You Go) Microsoft Fabric F64 node in a Resource Group for an Azure Subscription, adding Node Administrators in the Fabric Admin Portal, and then creating a Workspace for using the Node. The F64 node, at the time of recording, is the minimum size for Fabric Copilots and Power BI components of Fabric.
Connect SSMS (SQL Server Management Studio) to Fabric Warehouse
926 views, 6 months ago
The Microsoft Fabric Warehouse provides a familiar interface for users with SQL skills, even though it is based on Data Lakehouse technology. This video demonstrates how SQL Server Management Studio (SSMS) can be connected to the Fabric Warehouse to provide a user experience similar to that of SQL Server. Microsoft Entra ID is used to ensure that inbound traffic to Fabric is secure. You can rec...
Import and Flatten 1400+ json files using Microsoft Fabric Notebooks
809 views, 7 months ago
Microsoft Fabric Spark Notebooks and Pipelines are used to import over 1400 json files into the Fabric Lakehouse, flatten them out, and then make them available for queries in delta parquet format. Data is 400GB of real OpenFDA open source data. The presenter, Inder Rana, has published these steps for you to reproduce at this link: github.com/isinghrana/fabric-samples-healthcare . MENU 00:00 - ...
Evolutionary History of Microsoft Fabric - Spreadsheets to Lakehouse
10K views, 8 months ago
While some may believe that Microsoft Fabric was created in May 2023, this presentation reviews over two decades of the products leading up to Fabric. You've all heard of SQL Server and Excel, but what roles did ProClarity, Panorama, and Sharepoint play in the evolution of Power BI and Fabric? How did Microsoft analytic tools emerge from on-premises and begin existing in the Cloud? In order to ...
Power BI Copilot Narrative Visual adds Azure OpenAI to your Fabric Reports
2.8K views, 9 months ago
The new Power BI Copilot Narrative visual in Microsoft Fabric gives you the power to build Azure OpenAI prompts into your Power BI reports to query Large Language Models for insights about your data. Report builders can test and design prompts that can be rerun against data as new data is added and existing data is filtered by the users. The Narrative visual works as a SaaS product, you just dr...
Copilot for Data Science & Data Engineering in Microsoft Fabric
1.4K views, 10 months ago
Copilot for Data Science and Data Engineering is a new capability in Microsoft Fabric which is in Preview at the time of this recording by Inder Rana. This Copilot will generate Python code for a Spark Notebook in Fabric. The demo uses components of a free Git repo that anyone can deploy to recreate the steps in this demo: github.com/isinghrana/fabric-samples-healthcare . The Copilot is ...
Copilot for Data Factory in Microsoft Fabric for a Fiscal Date Table
891 views, 10 months ago
At the time of this recording, Copilot for Data Factory is a preview capability for Microsoft Fabric Dataflows Generation 2. Copilot for Data Factory in Fabric enables no code data transformation such as joins, metadata changes, filtering, and more. Natural language entered into the Copilot is interpreted using OpenAI LLM technology and used to generate code within Dataflows. This example uses ...
220M+ row Microsoft Fabric demo using Direct Lake, Lakehouse, Warehouse, Spark and Pipelines
8K views, 11 months ago
This is a demo of Microsoft Fabric using 220 million rows of data that anyone can recreate using the Git Repo at this link: github.com/isinghrana/fabric-samples-healthcare/tree/main/analytics-bi-directlake-starschema . Tour OneLake and the Fabric Lakehouse, Spark Notebooks, Warehouse with SQL queries, Pipelines, a Direct Lake Semantic Model, and a Power BI report. The entire solution is impleme...
Create a Direct Lake Power BI Dataset for a Microsoft Fabric Lakehouse
5K views, 1 year ago
This video walks through the process of creating a Power BI dataset in Direct Lake mode with a delta parquet table in Microsoft Fabric Data Lakehouse as the source. The video is the third in a series that documents an end-to-end solution from a Github repo for ingesting, transforming, and reporting on 220 million rows of CMS Medicare Part D data. A link to the repo is here: github.com/isinghran...
Load Delta Parquet Table from CSV files using Microsoft Fabric Spark
2.1K views, 1 year ago
This video walks through the process of creating a delta parquet format table in Microsoft Fabric Data Lakehouse using a Spark Notebook. The video is the second in a series that documents an end-to-end solution from a Github repo for ingesting, transforming, and reporting on 220 million rows of CMS Medicare Part D data. A link to the repo is here: github.com/isinghrana/fabric-samples-healthcare...
Manually Upload Large CSV files to a Microsoft Fabric Lakehouse
769 views, 1 year ago
This video walks through the process of manually ingesting CSV files into a Microsoft Fabric Data Lakehouse. The video is the first in a series that documents an end-to-end solution from a Github repo for ingesting, transforming, and reporting on 220 million rows of CMS Medicare Part D data. A link to the Git repo is here: github.com/isinghrana/fabric-samples-healthcare/tree/main/analytics-bi-d...
Azure OpenAI ChatGPT for Cryptic Error Messages from Power BI, SQL Server, and Power Apps!
642 views, 1 year ago
Cryptic error messages have been bottlenecks that waste time and cost money for almost anyone who has ever used software. Nothing is more frustrating than an error message filled with GUIDs and vague error codes that lack proper explanations. ChatGPT within Azure OpenAI provides a secure and easy new way to quickly get suggestions as to what caused cryptic error messages. Instead of combing thr...
A Power Apps Turbo Button for Azure SQL DB to Reduce Costs & Keep Business Users Happy
303 views, 2 years ago
Connect Power Apps with Azure ML to make Predictions in Microsoft Teams
3.1K views, 2 years ago
Azure Data LakeHouse in an Hour Virtual Workshop
6K views, 2 years ago
Use a Keyword in Microsoft Teams with a Power Automate Flow to Resume & Pause Azure Synapse
1.8K views, 2 years ago
Pause & Resume Azure Synapse Dedicated SQL Pools with Data Factory Pipelines
2.3K views, 2 years ago
Planning for a Secure and Scalable Power BI Enterprise Architecture
2.2K views, 3 years ago
Create Power BI DataFlows using an existing M Script
1.4K views, 3 years ago
Power BI Custom Power Query using M Code
2.2K views, 3 years ago
Deploy an Azure ARM Template to an Azure Data Factory
5K views, 3 years ago
Deploy Azure Data Factory V2
166 views, 3 years ago
Deploy Azure Data Lake Storage Gen2
921 views, 3 years ago
Quick Demo - Power BI Small Multiples with Azure Synapse
28 views, 3 years ago
Quick Demo - Power BI Decomposition Tree with Azure Synapse
595 views, 3 years ago
Quick Demo - Power BI Q&A with Azure Synapse for Natural Language Queries
146 views, 3 years ago
Azure Updates for Synapse, Power BI and CMS Medicare Part D End-to-End Solution February 2021
96 views, 3 years ago

COMMENTS

  • @fabric-with-felmo
    9 days ago

    Hello sir, I want to learn Fabric completely, is it possible? Can Fabric alone get me a job?

    • @insightsoutliers
      7 days ago

      Here's a great place to get started, these skills should help boost any resume: learn.microsoft.com/en-us/training/paths/get-started-fabric/

  • @SQLTalk
    22 days ago

    Very helpful. Thank you!

  • @unbrokenultimate6118
    1 month ago

    Thank you! This was really helpful in my learning

  • @markskiles462
    1 month ago

    Love it, Greg - how creative, injecting current context into a historical figure - great work!

    • @insightsoutliers
      1 month ago

      Thank you Mark, I could've learned much more as a kid with the tools we have available today. It's amazing how things change in just a few decades.

  • @cupicapbleew1460
    2 months ago

    How do I create a new table on the dataset? The new table option is disabled.

    • @insightsoutliers
      2 months ago

      If you add a new table to the underlying Lakehouse, you should be able to add it to the custom Semantic Model afterwards.

  • @ralphsparkle
    3 months ago

    The sections at 3:32 and beyond seem to suggest that Copilot can pull in data from external sources - is this correct?

    • @insightsoutliers
      3 months ago

      Copilot connects to an Azure OpenAI model to generate additional context to enrich the report. The data is external to the Power BI Semantic Model, but nothing is being pulled from the public internet for the responses.

  • @ΑθανάσιοςΣουλιώτης-θ2γ

    How can I edit the ARM template after deploying it? I have created a linked integration runtime for the target data factory and I want to configure all my resources to use this integration runtime.

    • @insightsoutliers
      3 months ago

      Is this thread helpful? stackoverflow.com/questions/57505831/how-to-update-and-redeploy-arm-template

  • @Francesca-yu5cy
    6 months ago

    omg you are a genius - thank you!

  • @sphscholz
    6 months ago

    Liked this very much. Even though I have been involved with BI for a long time, I am relatively new to the MS tool stack and this is a really helpful overview, thnx!

  • @paulpurington8637
    6 months ago

    What a great presentation! Excellent stroll down memory lane. I wrote a book with Dan English about PowerView back in the day and PowerView got scrapped within days of the book's release. LOL!

    • @insightsoutliers
      6 months ago

      Thank you, it was fun to put it all together. It's unreal the velocity of change with product improvements and changes. The right side of my primary diagram would even be a little bit different today versus when I put it together a few months ago.

  • @majeedadil777
    6 months ago

    Thanks for sharing. Once the data is loaded into the delta parquet table, how can we partition it on the Year column?

    • @insightsoutliers
      6 months ago

      Adding a step using Data Pipelines should be a straightforward way to add partitions: learn.microsoft.com/en-us/fabric/data-factory/tutorial-lakehouse-partition

  • @nalinkhosla587
    7 months ago

    Thanks for the insights, how much of the Fabric promised land is real vs roadmap/vision?

    • @insightsoutliers
      7 months ago

      It's all there but some parts are more mature than others. This roadmap should help with any questions on this topic: aka.ms/fabricroadmap

  • @sunnysaneeth6693
    7 months ago

    Why did you use the date.csv file in this?

    • @insightsoutliers
      7 months ago

      At the time it was a simple way to add a Date table to the Gold layer. It also shows that you can strategically unite data from different sources with a Lakehouse architecture. I'd probably do it differently today with Fabric.

  • @blendcloud
    7 months ago

    great presentation and thank you for providing this visual roadmap. even though i've experienced all of these technologies first hand since sql 7, it's actually a good reminder of how the individual components have evolved and now converged into fabric.

  • @tripathifamily2465
    7 months ago

    nice one

  • @sanjay98317
    7 months ago

    Superb! Thanks so much for connecting all the dots, and making the Fabric strategy and roadmap so clear.

  • @kdcapparelli
    7 months ago

    Great presentation. Congratulations. I've always liked to "sense" the evolution of IT offerings (software products and services)

  • @angelost1467
    8 months ago

    Very nice summary, thank you for taking the time to share it!

  • @ankeet1
    8 months ago

    Things just went berserk around 2022, in a good way! I can totally relate to everything from SQL 7 until it had just started going Azure and then the advancement is exponential each year! Very informative reference video, Greg. Thank you!

  • @shantanumathur1059
    8 months ago

    Very insightful article

  •  8 months ago

    Great job! Thank you!

  • @adin6429
    8 months ago

    I know only Excel, Power Pivot, Power Query, Power BI and very little SQL. What should I learn to understand and work in Fabric?

    • @insightsoutliers
      8 months ago

      The Fabric Career Hub would be a great place to start: aka.ms/fabriccareerhub

    • @adin6429
      8 months ago

      Thanks

  • @Andrewskji
    8 months ago

    Crazy to think about how fast this has happened

  • @daustonian9331
    8 months ago

    Oh how I wanted the decomp tree to have colors... but alas MS locked the code and even the old Proclarity people were not permitted to implement it for us! I may still have a performance point book or two around. It was a tough use case for sure. Thanks for walking us older folks along the trail.

    • @insightsoutliers
      8 months ago

      Agreed! I think the newer Decomp Tree in PBI can change colors based on KPIs, but I agree that the old PerformancePoint version was a step down from ProClarity.

  • @rifkiamil
    8 months ago

    Great job on compiling everything. I'd like to offer a small suggestion, if that's all right.
    Engine - VertiPaq
    1. A box for VertiPaq, connected with:
    1.1 PowerPivot 2010 Add-In for Excel 2010
    1.2 Analysis Services Tabular Mode
    1.3 SQL Server 2012 - Columnstore storage
    VertiPaq was renamed to xVelocity in 2012.
    Acquired
    2. Maximal Innovative Intelligence, connected with:
    2.1 Microsoft Data Analyzer (2002) (for OLAP)

    • @insightsoutliers
      8 months ago

      Excellent info that I didn't find anywhere on the web. Thank you! At some point I plan to update the presentation and I'll include this content.

  • @OMAKEM
    8 months ago

    Great presentation, thanks

  • @barrysinni7556
    8 months ago

    great presentation that cuts through all of the marketing nonsense and shows real lineage

  • @alexanderjacobs817
    8 months ago

    Great content here. Filled in a lot of gaps that I have thought about from the last few years.

  • @aamoody81
    8 months ago

    Good stuff. Thank you.

  • @bcrivatutub
    8 months ago

    Loved your presentation, thank you for creating and sharing it! It may be worth mentioning that PowerPivot (the vertipaq engine) did not come from the blue (or green), but really from Analysis Services. It shipped initially only in Excel, but it was 100% analysis services.

    • @insightsoutliers
      8 months ago

      Thank you! I've added a note to the description and the related LinkedIn post.

  • @CaribouDataScience
    8 months ago

    Well, the problem is that you have to have a business email account, which I don't.

    • @insightsoutliers
      8 months ago

      I agree it would be nice to see a "Fabric per User" license in the future for small business owners and personal use.

  • @tanveerakhtar7213
    8 months ago

    Excellent and precise presentation. Well done.

  • @MrSith-yp3yq
    8 months ago

    Should you blur out the name in the bottom right visual?

    • @insightsoutliers
      8 months ago

      Thank you for the suggestion, I'll see if I can edit the video. I tried to blur out names to be respectful of anyone who showed up on the screen by random chance, but the data is 100% public so there are no PII concerns. Anyone can download and/or search all of the data from this link: data.cms.gov/provider-summary-by-type-of-service/medicare-part-d-prescribers/medicare-part-d-prescribers-by-provider-and-drug

    • @MrSith-yp3yq
      8 months ago

      @insightsoutliers Thanks for the information 🙂

  • @khalidjaradat
    8 months ago

    Thank you very much 🤩

  • @angelost1467
    8 months ago

    In a world where everyone is pushing flashy over the top videos to grab attention, this video is refreshing. Thank you for taking the time to produce something that adds so much value in a 5 minute video.

  • @NewsSports-ey1ke
    9 months ago

    nice video

  • @ThePilli41
    9 months ago

    Amazing work! I can't wait to see more of your content.

  • @gulhermepereira249
    9 months ago

    Hi, Greg. What Fabric Capacity did you use for this example? It ran really smoothly.

  • @rakshadilip2277
    9 months ago

    Hi, I have run into a situation where my Power BI dataset takes 2 hours to refresh, which includes data extraction using the REST API from Jira (Atlassian) cloud and transformations to meet business requirements. Are there any alternatives where I can extract the Jira data in Power BI and push the data to OneLake, create transformations in OneLake, and build analytics in Power BI by connecting to OneLake? Or any other suggestions?

    • @insightsoutliers
      9 months ago

      You can try connecting to the Jira API using Pipelines in Fabric, and then landing the data in OneLake. If the Jira connection doesn't yet work in Fabric (I have no way to test it), you can use Pipelines in Azure Data Factory to land the data in ADLSv2 and then shortcut it over to Fabric: learn.microsoft.com/en-us/azure/data-factory/connector-jira?tabs=data-factory .

  • @ashutoshnehete2397
    9 months ago

    Thanks for posting, well explained!

  • @sahilkothekar9927
    9 months ago

    Great demo. One question: does the query made to OpenAI have the context of the complete data model, or of the visual on the canvas along with the data in it?

    • @insightsoutliers
      9 months ago

      For the Narrative visual, at the time of this reply, it is for data on canvas. This is why it can work with very large data models. There is another natural language visual called Q&A that will query the whole data model, but that one doesn't allow for LLM prompts.

    • @terryliu3635
      9 months ago

      Thanks for sharing! Quick question…does the refresh have to be manual?

    • @insightsoutliers
      9 months ago

      @terryliu3635 No, the refresh could be scheduled with Fabric Pipelines. The instructions use manual refresh since it is a one-time load.

  • @valentinloghin4004
    10 months ago

    Amazing job !! Congrats !

  • @mrgodai01
    11 months ago

    How come the manage roles security is greyed out in my semantic model? Could it be because my lakehouse is using shortcuts to a Gen2 storage?

    • @insightsoutliers
      10 months ago

      Right now this is only available using the XMLA endpoint. I was told the GUI version will be available soon.

  • @PalaniKumar-b7e
    11 months ago

    What other activities can we achieve through Azure DevOps for Fabric, other than Fabric Git integration?

    • @insightsoutliers
      11 months ago

      Here's a great article on that topic I'm reading from Reza Rad: radacad.com/version-control-in-power-bi-and-fabric

  • @muhammadrahman3510
    11 months ago

    Very well explained demo. Please keep it up in future videos about Fabric. Thanks so much 😊

  • @trading_with_fish
    11 months ago

    This is very interesting......does a solution like this work well in a data centric architecture like a UNS?

    • @insightsoutliers
      11 months ago

      Fabric does have IoT components, but they are not part of this video.

  • @maggietheutubedog
    11 months ago

    Great job. No music. No BS. Just a huge amount of knowledge transferred in 5 minutes.

  • @utilars
    1 year ago

    Is there a way to activate version control for all uploaded files? What happens in the lakehouse if someone accidentally replaces MUP_DPR_RY21_P04_V10_DY19_NPIBN_1.csv with a blank file?

    • @insightsoutliers
      1 year ago

      Fabric will have Git integration, and we should know more about version control options when it goes GA: blog.fabric.microsoft.com/en-us/blog/introducing-git-integration-in-microsoft-fabric-for-seamless-source-control-management

  • @saurabhshrigadi
    1 year ago

    I came across your video and loved it. What tips would you give to a person who is a beginner in Power BI?

    • @insightsoutliers
      1 year ago

      Thank you. For beginners, I would recommend starting with the free online learning: learn.microsoft.com/en-us/training/powerplatform/power-bi?WT.mc_id=powerbi_landingpage-marketing-page

  • @azurelearner4055
    1 year ago

    Thanks for the video !!