How to Build Data Pipelines for ML Projects (w/ Python Code)

  • Published 14 Jun 2024
  • 👉 Need help with Data & Analytics? Reach out: shawhintalebi.com/
    This is the 3rd video in a series on Full Stack Data Science. Here, I discuss key aspects of building data pipelines for machine learning and share Python code for pulling transcripts from all my YouTube videos.
    🎥 Series Playlist: • Full Stack Data Science
    More Resources:
    📰 Read more: towardsdatascience.com/how-to...
    💻 Example Code: github.com/ShawhinT/UA-cam-B...
    References:
    [1] How Data Engineering Works: • How Data Engineering W...
    [2] ETL vs ELT: aws.amazon.com/compare/the-di...
    [3] YouTube Search API: developers.google.com/youtube...
    --
    Book a call: calendly.com/shawhintalebi
    Homepage: shawhintalebi.com/
    Socials
    / shawhin
    / shawhintalebi
    / shawhint
    / shawhintalebi
    The Data Entrepreneurs
    🎥 YouTube: / @thedataentrepreneurs
    👉 Discord: / discord
    📰 Medium: / the-data
    📅 Events: lu.ma/tde
    🗞️ Newsletter: the-data-entrepreneurs.ck.pag...
    Support ❤️
    www.buymeacoffee.com/shawhint
    Introduction - 0:00
    Data Engineering - 0:34
    Data Pipelines - 1:19
    2 Types of Pipelines (ETL vs ELT) - 2:18
    Extract - 4:30
    Transform - 6:07
    Load - 7:22
    Orchestration - 9:06
    Example Code: ETL of My YouTube Video Transcripts - 10:47
    What's Next? - 21:34
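
A minimal sketch of the extract, transform, and load stages named in the chapter list above. It is not the code from the video or the linked repository: the sample records, the field names (`video_id`, `title`, `transcript`), the `transcripts` table, and the choice of SQLite as the load target are all assumptions made for illustration. The extract step returns hard-coded placeholder records where a real pipeline would call an API.

```python
import html
import sqlite3

# Hypothetical raw records standing in for an API response. The field names
# and values are placeholders, not data from the video.
RAW_RECORDS = [
    {"video_id": "abc123", "title": "Intro &amp; Overview ", "transcript": "hello   world"},
    {"video_id": "abc123", "title": "Intro &amp; Overview ", "transcript": "hello   world"},
    {"video_id": "def456", "title": "ETL vs ELT", "transcript": None},
]


def extract():
    """Extract: return raw records (a real pipeline would call an API here)."""
    return list(RAW_RECORDS)


def transform(records):
    """Transform: drop duplicates and missing transcripts, tidy the text."""
    seen, cleaned = set(), []
    for rec in records:
        if rec["video_id"] in seen or not rec["transcript"]:
            continue
        seen.add(rec["video_id"])
        cleaned.append({
            "video_id": rec["video_id"],
            "title": html.unescape(rec["title"]).strip(),
            "transcript": " ".join(rec["transcript"].split()),
        })
    return cleaned


def load(records, conn):
    """Load: write the cleaned records to a SQLite table (assumed target)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transcripts "
        "(video_id TEXT PRIMARY KEY, title TEXT, transcript TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO transcripts VALUES (:video_id, :title, :transcript)",
        records,
    )
    conn.commit()


def run_pipeline(conn):
    """Run extract -> transform -> load, then read back what was stored."""
    load(transform(extract()), conn)
    return conn.execute(
        "SELECT video_id, title, transcript FROM transcripts"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(connection))
    connection.close()
```

Because the transform happens before the load, this is the ETL ordering. An ELT variant would store the raw records first and clean them inside the destination.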

COMMENTS • 16

  • @ShawhinTalebi
    @ShawhinTalebi  1 month ago +3

    More on Full Stack Data Science 👇
    🔗Series Playlist: ua-cam.com/play/PLz-ep5RbHosWmAt-AMK0MBgh3GeSvbCmL.html
    📰 Read more: towardsdatascience.com/how-to-build-data-pipelines-for-machine-learning-b97bbef050a5?sk=4823c18cab0a6225b0be8773c5427704
    💻 Example Code: github.com/ShawhinT/UA-cam-Blog/tree/main/full-stack-data-science/data-engineering

  • @tylernardone3788
    @tylernardone3788 1 month ago +3

    Probably the clearest and best video on what ETL and Data Engineering are that I have seen. Thank you!

    • @ShawhinTalebi
      @ShawhinTalebi  1 month ago

      Thanks Tyler! Glad it was clear 😁

  • @brianmorin5547
    @brianmorin5547 1 month ago

    You did such a great job explaining the data pipeline and gave a great example. Subscribed. Can't wait to see more vids

  • @michaelpihosh5904
    @michaelpihosh5904 1 month ago

    Great content Shaw, thank you

  • @berndkaufmann6934
    @berndkaufmann6934 1 month ago

    Best video. The mix between theory and hands-on practice is genius!

  • @ajalipio1
    @ajalipio1 1 month ago

    Crystal, as always. Great vid Shaw! 😘

  • @coolworship6704
    @coolworship6704 1 month ago +1

    How would you automate the entire process?

    • @Ryuko-gf1fd
      @Ryuko-gf1fd 1 month ago

      He can't explain that. I'd check out Paul Iutzin, Nana Janashia, and others for real explanations. No one scrapes YouTube stuff directly.

    • @ShawhinTalebi
      @ShawhinTalebi  1 month ago

      This is where an orchestration tool like Airflow can help. I'll get to this in an upcoming video on ML engineering.

    • @ShawhinTalebi
      @ShawhinTalebi  1 month ago

      Not sure what you mean. But Paul and Nana are great!

    • @ShawhinTalebi
      @ShawhinTalebi  15 days ago

      I ended up using GitHub Actions to automate this pipeline: ua-cam.com/video/wJ794jLP2Tw/v-deo.html