Extract and Load from External API to Lakehouse using Data Pipelines (Microsoft Fabric)

  • Published Oct 4, 2024

COMMENTS • 49

  • @KurtJ-r8w
    @KurtJ-r8w 1 month ago

    Really hope you do more Fabric content.
    You were clear, structured and concise in your teaching.

  • @jampeauk
    @jampeauk 1 year ago +4

    Just want to say a massive thank you for your Fabric videos; they have been amazing. Keep up the great work.

    • @LearnMicrosoftFabric
      @LearnMicrosoftFabric 1 year ago

      Hi, thanks for watching! Don't worry, there are plenty more videos to come!

    • @jampeauk
      @jampeauk 1 year ago

      @@LearnMicrosoftFabric I may have missed this in your videos, but do you have a section on how to show the contents of a folder directly and load the most recent file (my files all have date stamps in them)?
      I have not had any luck with os.listdir().

    • @LearnMicrosoftFabric
      @LearnMicrosoftFabric 1 year ago +1

      @@jampeauk Hi James, for file-system searching you probably want to use mssparkutils, which has that kind of list-files-in-a-directory functionality. I plan to cover this in my upcoming video on mssparkutils 👍

    • @jampeauk
      @jampeauk 1 year ago

      @@LearnMicrosoftFabric Awesome, thanks Will, looking forward to this.
      To provide a little extra context, I would like to list the files located in my S3 bucket, which I have added as a Shortcut.
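The mssparkutils approach suggested above could be sketched roughly like this in a Fabric notebook. The shortcut path, file-name pattern and date format are hypothetical; `mssparkutils.fs.ls` is available in the Fabric Spark runtime:

```python
import re
from datetime import datetime

def pick_latest(file_names, pattern=r"(\d{4}-\d{2}-\d{2})"):
    """Return the file name carrying the most recent YYYY-MM-DD date stamp.

    Files without a recognisable stamp sort last (datetime.min)."""
    def stamp(name):
        m = re.search(pattern, name)
        return datetime.strptime(m.group(1), "%Y-%m-%d") if m else datetime.min
    return max(file_names, key=stamp)

# In a Fabric notebook, list the shortcut folder and read the newest file:
# files = mssparkutils.fs.ls("Files/my_s3_shortcut")      # hypothetical path
# latest = pick_latest([f.name for f in files])
# df = spark.read.format("csv").load(f"Files/my_s3_shortcut/{latest}")
```

For example, `pick_latest(["sales_2024-01-01.csv", "sales_2024-03-05.csv"])` returns `"sales_2024-03-05.csv"`.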

  • @chescov
    @chescov 1 year ago +1

    Much appreciated my good sir 👏👏

  • @stevengarcia7277
    @stevengarcia7277 3 months ago

    Thanks mate, well explained.

  • @peternguynguyen5208
    @peternguynguyen5208 10 months ago

    Nice instructions, thank you

  • @WillOSullivan-k1q
    @WillOSullivan-k1q 10 months ago

    Good explanations mate, keep up the good work.

  • @chetan2309
    @chetan2309 1 year ago +2

    Hey! Massive thanks! Do you have plans to cover any OAuth-based API? Also, how would you parallelise these API calls for massive data loads? Say you want to fetch data for 100 cities on a daily basis, and trigger a run when a 101st is added: all those scenarios.

    • @LearnMicrosoftFabric
      @LearnMicrosoftFabric 1 year ago

      Hi,
      Great questions! Absolutely yes, I plan to do more videos about handling different auth scenarios, and also loading very big datasets with parallel reads. Watch this space :)

  • @samirsahin5653
    @samirsahin5653 1 year ago

    I came here with the same question that some people already asked:
    how to call this API for multiple cities.
    I watched your other videos where you used a notebook to transform data, and another where you scheduled it in a pipeline. If you could show how to call this API for multiple cities, it would make a great project; you could create a playlist as an end-to-end project.
    I really like your channel and follow your daily Spark videos.
    I believe this channel will be one of the main Fabric YouTube channels.

    • @samirsahin5653
      @samirsahin5653 1 year ago

      Just saw you already have a playlist :)

    • @LearnMicrosoftFabric
      @LearnMicrosoftFabric 1 year ago

      Hey! Yes, I plan on continuing this series and going a bit deeper on data pipelines very soon! Thanks for watching and for your kind words 💪🙏

  • @KAshIf0o7
    @KAshIf0o7 1 year ago

    Waiting for the next part.

  • @FranciscoRodriguezFabric
    @FranciscoRodriguezFabric 8 months ago +1

    Thanks!

  • @gguuyypp
    @gguuyypp 3 months ago

    Thanks, can you make a video about extracting a file from SFTP?

  • @sreekanth0112
    @sreekanth0112 4 months ago

    Hi,
    Please make a video on extracting files from SharePoint to the lakehouse through a Data Pipeline (Data Factory) in Fabric.

  • @anushav3342
    @anushav3342 10 months ago

    Great content. Thanks for explaining the different options available in Fabric. I need to load fact data (bookings data) through a REST API call. How do I set up loading into the lakehouse for ingesting weekly updates? Do I need to start with a pipeline, or is there a way to start with a notebook directly to load data into the lakehouse?

    • @LearnMicrosoftFabric
      @LearnMicrosoftFabric 10 months ago

      Thanks for watching! It depends on the complexity of your API call, really! If it's simple, then you can use Dataflows or Data Pipelines; more complex authentication or transformation will require a notebook.
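For the notebook route mentioned in the reply, a weekly ingest could be sketched like this. The endpoint, query parameter, column names and table name are all hypothetical placeholders:

```python
# import requests  # available in the Fabric notebook runtime

def to_rows(payload, cols=("booking_id", "city", "amount")):
    """Keep only the expected columns from each API record.

    The column names are placeholders for whatever your API returns."""
    return [{c: rec.get(c) for c in cols} for rec in payload]

# Weekly ingest in a Fabric notebook (hypothetical endpoint and table):
# resp = requests.get("https://api.example.com/bookings",
#                     params={"week": "2024-W40"})
# resp.raise_for_status()
# spark.createDataFrame(to_rows(resp.json())) \
#      .write.mode("append").saveAsTable("bookings_fact")
```

Scheduling the notebook from a Data Pipeline then gives you the weekly trigger without any extra code.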

  • @hotrung5469
    @hotrung5469 10 months ago

    Thank you so much Will for your detailed instructions!!! Could you make an instruction video on loading Excel files in OneLake (specifically stored in a lakehouse) into tables in a Data Warehouse?

    • @LearnMicrosoftFabric
      @LearnMicrosoftFabric 10 months ago

      hey thanks for watching! to read excel into a lakehouse table, you can either use pandas to load into a pandas df and convert to spark df (and then lakehouse table) or you can use the pyspark.pandas library (pandas within spark) - good luck!
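The pandas route from the reply could look roughly like this. The file path and table name are hypothetical, and it assumes a default lakehouse is attached (mounted under /lakehouse/default/ in the notebook):

```python
import pandas as pd

def spark_safe_columns(pdf: pd.DataFrame) -> pd.DataFrame:
    """Rename columns so they are valid Spark/Delta identifiers
    (lowercase, trimmed, underscores instead of spaces)."""
    out = pdf.copy()
    out.columns = [str(c).strip().lower().replace(" ", "_") for c in out.columns]
    return out

# In a Fabric notebook:
# pdf = pd.read_excel("/lakehouse/default/Files/input/report.xlsx")
# spark.createDataFrame(spark_safe_columns(pdf)) \
#      .write.mode("overwrite").saveAsTable("report")
```

The column cleanup step matters because Excel headers often contain spaces, which Delta tables reject by default.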

  • @fnplazatuc
    @fnplazatuc 5 months ago

    Hi, how are you? After data extraction, what is the next step to transform the data and visualize it in Power BI?

    • @LearnMicrosoftFabric
      @LearnMicrosoftFabric 5 months ago

      Hi there, good thanks, you? In this video I go right from end to end, talking about extraction, storage and then visualization. Hope it helps 👍 ua-cam.com/video/hwwU8V48g-4/v-deo.html

    • @fnplazatuc
      @fnplazatuc 5 months ago

      @@LearnMicrosoftFabric Will, how are you? Your videos are useful! I have a question: is it possible to obtain data from a JSON REST API and transform it into a table in a data lake? I can't get this to work; I can only transform it in a Warehouse! Thanks!

  • @alex24tech
    @alex24tech 6 months ago

    How do I run a pipeline for data copying? I have an API that uses two authentication systems: token and basic authentication (user and password).
    The first call to the API (via the POST method) retrieves the token, which is then used by the second request to fetch the data itself. Is it possible to create a pipeline that can do the job? Should I use notebooks, or is there another solution?
    The result of the second query will of course be stored in a lakehouse table.

    • @LearnMicrosoftFabric
      @LearnMicrosoftFabric 6 months ago +1

      Yes, should be possible either in Data Pipeline, or Notebook. You can make the post request, then pass the token to your next activity.

    • @alex24tech
      @alex24tech 6 months ago +1

      @@LearnMicrosoftFabric Thanks sir. Please, do you have any resource that can help me?
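A notebook version of the two-step flow described in this thread might look like the sketch below. The endpoints, credential values and JSON field names are hypothetical; check your API's documentation for the real ones:

```python
# import requests  # available in the Fabric notebook runtime

def bearer_header(token: str) -> dict:
    """Build the Authorization header for the second (data) request."""
    return {"Authorization": f"Bearer {token}"}

# Step 1: POST with basic auth (user/password) to obtain a token:
# auth_resp = requests.post("https://api.example.com/auth",
#                           auth=("my_user", "my_password"))
# token = auth_resp.json()["access_token"]
#
# Step 2: call the data endpoint with the token and land it in the lakehouse:
# data = requests.get("https://api.example.com/data",
#                     headers=bearer_header(token)).json()
# spark.createDataFrame(data).write.mode("append").saveAsTable("api_data")
```

In a Data Pipeline the same pattern works with a Web activity for step 1 and a Copy activity for step 2, passing the token between them as a pipeline expression.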

  • @matask23
    @matask23 8 months ago

    Amazing video, thanks for this Will! I wanted to ask if PySpark would be the most optimal choice to achieve this or if I could use SQL to achieve the same goal?

    • @LearnMicrosoftFabric
      @LearnMicrosoftFabric 8 months ago +1

      Yes, you could also use SQL! The good thing about Fabric is that you're free to use whichever language you are comfortable with! (Well, as long as it's T-SQL, Python, R, Scala or KQL.)

    • @matask23
      @matask23 8 months ago

      @@LearnMicrosoftFabric Thanks for that, that's really useful to know! I guess my follow-up would be whether there are any compatibility issues or limitations that I might encounter if I were to use SQL within MS Fabric?

  • @dineshreddy2207
    @dineshreddy2207 4 months ago

    Hi, I have an XML file and want to ingest this file into MS Fabric without using a notebook. Can you help me?

    • @LearnMicrosoftFabric
      @LearnMicrosoftFabric 4 months ago

      You should be able to use either a Dataflow or a Data Pipeline, but if it's horribly nested XML, a notebook will probably be necessary.

    • @itversityitversity7690
      @itversityitversity7690 4 months ago

      I used the Copy activity but ran into some problems; please suggest another way.
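If the notebook fallback mentioned above becomes necessary, a minimal sketch using only the standard library could look like this. The file path, record tag and table name are hypothetical:

```python
import xml.etree.ElementTree as ET

def xml_to_rows(xml_text, record_tag):
    """Flatten one level of XML records into a list of dicts,
    mapping each child element's tag to its text."""
    root = ET.fromstring(xml_text)
    return [{child.tag: child.text for child in rec}
            for rec in root.iter(record_tag)]

# In a Fabric notebook (default lakehouse mounted under /lakehouse/default/):
# with open("/lakehouse/default/Files/input/orders.xml") as f:
#     rows = xml_to_rows(f.read(), "order")
# spark.createDataFrame(rows).write.mode("overwrite").saveAsTable("orders")
```

This only handles flat records; deeply nested XML needs a custom flattening step, which is exactly why the reply above suggests a notebook for that case.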

  • @mshparber
    @mshparber 1 year ago

    Thanks. Please explain what is the best practice to make nested API calls and merge the results back into one JSON file. For example, the first API call /students gives me a list of all students; then for each I need to make another call /{student_id}/courses to get their courses information. I need to save the results of all students' courses as one JSON file. It's easy to do in Dataflow, but it cannot save the results as JSON, only as a table. So what is the right way to do it in Pipeline? Thanks!

    • @LearnMicrosoftFabric
      @LearnMicrosoftFabric 1 year ago

      Hey, it's not something I've done with Data Pipelines tbh, but it might be possible with the ForEach activity? If you know how to use Python, I would recommend doing this in Fabric Notebooks with the requests library; it's much easier to manage this kind of logic in a notebook.

    • @mshparber
      @mshparber 1 year ago +1

      @@LearnMicrosoftFabric Thanks. One of the main advantages of the Power BI toolset is low-code/no-code. I know Python, but we need a simple low-code GUI experience, like Power Query / Dataflow. I hope Pipeline can provide it.

    • @jampeauk
      @jampeauk 1 year ago

      @@mshparber If it helps, there is now a GUI which should do what you are after; do some watching/reading on "Data Wrangler". It is currently only available for pandas in Notebooks, but it should be useful.
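The notebook approach suggested earlier in this thread could be sketched as follows. The base URL, JSON shapes and output path are hypothetical; `mssparkutils.fs.put` is available in the Fabric Spark runtime:

```python
import json
# import requests  # available in the Fabric notebook runtime

def merge_courses(students, fetch_courses):
    """Attach each student's courses (from a per-student call) to the record."""
    return [dict(s, courses=fetch_courses(s["student_id"])) for s in students]

# In a Fabric notebook:
# BASE = "https://api.example.com"
# students = requests.get(f"{BASE}/students").json()
# merged = merge_courses(
#     students, lambda sid: requests.get(f"{BASE}/{sid}/courses").json())
# mssparkutils.fs.put("Files/raw/students_courses.json",
#                     json.dumps(merged), overwrite=True)
```

Separating the merge logic from the HTTP calls (passing `fetch_courses` in as a function) keeps the nested-call pattern testable without hitting the API.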

  • @DinoAMAntunes
    @DinoAMAntunes 7 months ago

    Hello, very good, thanks very much. My ERP is 100% online but I can't connect to it. I think I have all the necessary details: URL, DB name, username, password or API.

    • @LearnMicrosoftFabric
      @LearnMicrosoftFabric 7 months ago +1

      Hey, if it's 100% online and an ERP system, it's likely to have an API to connect to. Google "{ERP NAME} API documentation" and find out how to connect to it. Or if it's one of the big ERP systems, you could use a Dataflow, because there might be a pre-built connector for your ERP system available. Good luck!

  • @rashane1000
    @rashane1000 10 months ago

    Awesome video, keep it coming! How about the OAuth2 protocol? New subscriber here, thanks very much!

    • @LearnMicrosoftFabric
      @LearnMicrosoftFabric 10 months ago +1

      Hey, thanks for watching! I haven't covered this yet, but I should make something about OAuth2 because it's such a common use case.

    • @rashane1000
      @rashane1000 10 months ago

      @@LearnMicrosoftFabric Thanks heaps. Looking forward to your next vids 🔥🔥🔥

  • @rdeheld
    @rdeheld 3 months ago

    That's not complicated. Would like to see it the other way around.