Designing a Data Pipeline | What is Data Pipeline | Big Data | Data Engineering | SCALER

  • Published 26 Sep 2024

COMMENTS • 57

  • @SCALER
    @SCALER  2 years ago +3

    Check out our FREE masterclasses by leading industry experts now: bit.ly/3Apojjv

    • @ankitKumar-js1ow
      @ankitKumar-js1ow 2 years ago +2

      I think Scaler should have a separate Data Engineering course covering DSA and system design at an industry level, since more people work as data engineers than as data scientists.
      Waiting for such a quality course so I can move into a product-based company.

    • @sandeepdash5652
      @sandeepdash5652 6 months ago

      @ankitKumar-js1ow So far they do not have a plan or module for Data Engineering. They are simply not interested. And what they do have for DE is just not digestible.

  • @akhilcoder
    @akhilcoder 2 years ago +26

    Regular content. It can easily be found on the internet.

  • @ArunSingh-rk7mm
    @ArunSingh-rk7mm 2 years ago +3

    Thank you for talking about a demo pipeline, this could come in handy in interviews.

  • @NasimKhan-vu8oi
    @NasimKhan-vu8oi 2 months ago +1

    Excellent presentation. Presented very nicely, concisely, and to the point.

  • @arunsundar3739
    @arunsundar3739 5 months ago +1

    helps to see the big picture, thank you very much :)

  • @shaistaqureshi8408
    @shaistaqureshi8408 2 years ago +1

    I just wanna say thank you for this video

  • @NehaSingh-wp4mf
    @NehaSingh-wp4mf 7 months ago

    Very well explained, and all the important topics were covered. Thank you for your efforts. Very helpful.

    • @SCALER
      @SCALER  7 months ago

      Thanks! Glad this was helpful! 😃

  • @daniyaqureshi6201
    @daniyaqureshi6201 2 years ago +1

    Thank you for the brilliant video

  • @umakantyadav9972
    @umakantyadav9972 2 years ago +1

    Thanks, Shashank, for explaining this in a very understandable manner.
    But I have one question: why was the staging area not discussed?

  • @TheSoumyakole
    @TheSoumyakole 10 months ago +1

    How can NoSQL (specifically Cassandra and MongoDB) be good for ad-hoc analytical queries, as mentioned at 12:05?

  • @AmitSharma-xv6sh
    @AmitSharma-xv6sh 10 months ago

    This is a really detailed and great explanation of end-to-end data pipeline architecture. Hats off to your hard work in putting this video out there for us, brother. It will definitely clear up doubts about how pipelines work in data migration, ingestion, and integration projects.
    Thanks a lot. 🙏

    • @SCALER
      @SCALER  10 months ago

      Thanks! Glad this was helpful! 😃

  • @Rk-mv8sz
    @Rk-mv8sz 2 years ago +1

    Good content. Thank you 🙏

  • @MarkyGoldstein
    @MarkyGoldstein 1 month ago

    Well presented, thanks

  • @marksun6420
    @marksun6420 1 year ago +1

    Thanks

  • @shrutiikarla1055
    @shrutiikarla1055 2 years ago

    Thank you, Scaler

  • @endpermia
    @endpermia 1 year ago

    Thank you! This was really helpful and well-explained.

    • @SCALER
      @SCALER  1 year ago

      Happy to hear that! 🙌🏼

  • @asishjoshi5774
    @asishjoshi5774 2 years ago

    Very nice. Thanks a ton!

  • @tamannamam3563
    @tamannamam3563 2 years ago

    I understood this video easily

  • @FaizanKhan-ct7pc
    @FaizanKhan-ct7pc 2 years ago +1

    As a data engineer, should you know all of this tech before getting a job, or is it acquired on the job?

    • @Watson22j
      @Watson22j 1 year ago

      You can easily get an entry-level job in data engineering if you know SQL well, plus basic Python, basic cloud, and Hadoop architecture.

  • @panktikhurana8906
    @panktikhurana8906 2 years ago +1

    Awesome content 🙂

  • @krishnasaksena2364
    @krishnasaksena2364 2 years ago

    Thanks, Scaler! 🔥

  • @obiradaniel
    @obiradaniel 2 years ago

    Thank you.

  • @divyanshtayal5077
    @divyanshtayal5077 2 years ago

    Make more videos, Gurudev. Thank you very much

  • @healthificteam8465
    @healthificteam8465 2 years ago

    Can't wait!

  • @saniyasharif9861
    @saniyasharif9861 2 years ago

    Brilliant video again

  • @it3374
    @it3374 1 year ago +1

    Please demonstrate one pipeline hands-on... There isn't a single video on YouTube that shows a big data pipeline actually being built...

  • @ramangupta6159
    @ramangupta6159 2 years ago +1

    Grafana is a really good monitoring tool

  • @ruthmk
    @ruthmk 6 months ago

    Double like 👍🏽
    Thank you

  • @abhisekchowdhury8584
    @abhisekchowdhury8584 2 years ago

    Awesome Video

  • @StartDataLate
    @StartDataLate 4 months ago

    Here is a summary:
    00:57 - Understand the data domain (example: finance data terminology, relationships, primary keys, foreign keys; give the business side a clear picture of what data engineers can provide)
    02:57 - Choose the data sources (examples: SQL database, distributed file system, API, sensor data, web-application-generated data)
    04:43 - Determine the data ingestion strategy (full load or incremental load)
    08:37 - Design the data processing plan (real-time or batch processing)
    11:11 - Set up storage for the pipeline output (Amazon S3 or HDFS for a data lake; Amazon Redshift or Hive for a data warehouse; or write back to transactional databases)
    13:19 - Plan the data workflow (schedulers: Apache Airflow, Apache NiFi, Azkaban)
    14:42 - Monitoring and governance tools (alerts for pipeline failures; tools: Kibana, Grafana, Datadog, PagerDuty)
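
For the 04:43 point, here is a rough sketch of how a full load differs from an incremental load. It is not taken from the video: the `orders` table, its columns, and the `updated_at` watermark are made-up assumptions, and two in-memory SQLite databases stand in for the real source and target.

```python
import sqlite3


def make_db():
    # Hypothetical schema, used only for this illustration.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
    )
    return db


def full_load(source, target):
    """Full load: wipe the target and copy every source row."""
    rows = source.execute("SELECT id, amount, updated_at FROM orders").fetchall()
    target.execute("DELETE FROM orders")
    target.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    target.commit()
    return len(rows)


def incremental_load(source, target, watermark):
    """Incremental load: copy only rows changed after the last watermark."""
    rows = source.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # Upsert on the primary key so an updated row replaces its old version.
    target.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    target.commit()
    # Advance the watermark to the newest change seen, if any.
    new_watermark = rows[-1][2] if rows else watermark
    return len(rows), new_watermark


source, target = make_db(), make_db()
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10.0, "2024-01-01"), (2, 20.0, "2024-01-02")],
)

# First run: copy everything and remember the newest timestamp seen.
first_count = full_load(source, target)
watermark = "2024-01-02"

# Later, the source gets one new row and one updated row.
source.execute("INSERT INTO orders VALUES (3, 30.0, '2024-01-03')")
source.execute(
    "UPDATE orders SET amount = 15.0, updated_at = '2024-01-04' WHERE id = 1"
)

# Second run: only the two changed rows are moved.
copied, watermark = incremental_load(source, target, watermark)
target_rows = target.execute(
    "SELECT id, amount, updated_at FROM orders ORDER BY id"
).fetchall()
```

The watermark comparison works here only because the timestamps are ISO-formatted strings, which sort in date order. A real pipeline would also need to handle deleted rows and late-arriving updates, which this sketch ignores.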

  • @saibabatelagamsetty2538
    @saibabatelagamsetty2538 2 years ago

    Really good content

  • @cutipy433
    @cutipy433 2 years ago

    Very nice content

  • @justdataengineer3138
    @justdataengineer3138 2 years ago

    When will a complete Data Engineering course be launched by Scaler?

  • @nandlaljaiswal7217
    @nandlaljaiswal7217 2 years ago +1

    Need a full course for data engineers

  • @shanayakhan839
    @shanayakhan839 2 years ago

    Redshift is already set up on the cloud; what about Hive?

  • @saniyapoetry8386
    @saniyapoetry8386 2 years ago

    Very nice 🙂

  • @parisreview4651
    @parisreview4651 2 years ago

    You guys did a great job.

  • @PankajKumar-vv5db
    @PankajKumar-vv5db 2 years ago

    Here the data source is MySQL. What if data were coming in from multiple sources?

  • @piyushjain419
    @piyushjain419 2 years ago +1

    Scaler knows what we students search for on Google before an exam lol

  • @AkashKumar-kx9vj
    @AkashKumar-kx9vj 2 years ago +1

    Shashank just makes everything so easy to understand

  • @ashutoshrai5342
    @ashutoshrai5342 1 year ago +1

    Bumb explanation. What he is explaining is based on his own experience. It's not at all generic. He himself needs to improve.

  • @nemodbuniversity
    @nemodbuniversity 1 year ago

    Half-baked knowledge

  • @sheenagupta896
    @sheenagupta896 2 years ago +1

    Thank you for talking about a demo pipeline, this could come in handy in interviews.

  • @fazaila2047
    @fazaila2047 2 years ago

    Grafana is a really good monitoring tool

  • @bangalibangalore2404
    @bangalibangalore2404 1 year ago

    The data modelling part was missed, I guess

  • @Sameerkhan-kt5jj
    @Sameerkhan-kt5jj 2 years ago

    More data-engineering-related content, please

  • @avshekraj
    @avshekraj 1 year ago

    Thank you for the nice explanation

    • @SCALER
      @SCALER  1 year ago +1

      Happy to hear that! 🙌🏼

  • @prachiipandeyy
    @prachiipandeyy 2 years ago

    🔥🔥🔥