Check out our FREE masterclasses by leading industry experts now: bit.ly/3Apojjv
I think Scaler should have a separate course for Data Engineering, with DSA and system design at an industry level, since more people work in data engineering than in data science.
Waiting for such a quality course to move into a product-based company.
@@ankitKumar-js1ow So far they have no plan/module for Data Engineering. They're simply not interested. And what DE content they do have is just not digestible.
Regular content. It can easily be found on the internet.
Haha
The paid content is terrible.
Thank you for talking about a demo pipeline, this could come in handy in interviews.
Excellent presentation. Presented very nicely, concisely, and to the point.
helps to see the big picture, thank you very much :)
I just wanna say thank you for this video
Very well explained, and all the important topics were covered. Thank you for your efforts. Very helpful.
Thanks! Glad this was helpful! 😃
Thank you for brilliant video
Thanks Shashank for explaining in a very understandable manner.
But I have one question: you haven't discussed the staging area?
How can NoSQL (specifically Cassandra, MongoDB) be good for ad-hoc analytical queries, as mentioned at 12:05?
This is a really detailed and great explanation of end-to-end data pipeline architecture. Hats off to your hard work and for putting this video out there for us, brother. It will definitely clear up how pipelines work for data migration/ingestion/integration projects.
Thanks a lot. 🙏
Thanks! Glad this was helpful! 😃
Good content . Thank you🙏
Well presented, thanks
Thanks
Thank you scaler
Thank you! This was really helpful and well-explained.
Happy to hear that! 🙌🏼
very nice.. thanks a ton!
I easily understood this video.
As a data engineer, should you know all of these technologies before getting a job, or are they learned on the job?
You can easily get an entry-level data engineering job if you know SQL well, plus basic Python, basic cloud, and Hadoop architecture.
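To make the "good SQL plus basic Python" bar concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names are hypothetical; the join-plus-aggregate pattern it shows is the kind of query that comes up constantly in entry-level data engineering work.

```python
import sqlite3

# In-memory database standing in for a real warehouse (hypothetical schema)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 80.0), (3, 2, 50.0);
""")

# Revenue per customer: the join + group-by pattern interviews love
rows = conn.execute("""
    SELECT c.name, SUM(o.amount) AS revenue
    FROM orders o JOIN customers c ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY revenue DESC
""").fetchall()
print(rows)  # [('Asha', 200.0), ('Ravi', 50.0)]
```

The same query would run unchanged against MySQL or Redshift; only the connection object differs.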
Awesome content 🙂
Thanks scaler! 🔥
Thank you.
Make more videos, Gurudev. Thank you very much!
Can't wait!
Brilliant video again
Please show one pipeline built practically... there isn't a single video on YouTube that demonstrates creating a big data pipeline end to end...
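In the spirit of the practical demo requested here, a toy end-to-end pipeline can be sketched in plain Python: extract, transform, load. The source rows and field names are made up for illustration (a real pipeline would pull from MySQL and write to S3/HDFS, as the video describes).

```python
import csv
import io

def extract():
    # Stand-in for rows pulled from a MySQL source (hypothetical data)
    return [
        {"id": 1, "amount": "120.5", "country": "IN"},
        {"id": 2, "amount": "bad",   "country": "IN"},   # dirty row
        {"id": 3, "amount": "99.0",  "country": "US"},
    ]

def transform(rows):
    # Type-cast amounts; drop rows that fail validation
    clean = []
    for row in rows:
        try:
            row["amount"] = float(row["amount"])
        except ValueError:
            continue
        clean.append(row)
    return clean

def load(rows):
    # Write to an in-memory CSV standing in for the S3/HDFS sink
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["id", "amount", "country"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

output = load(transform(extract()))
print(output)
```

Swapping `extract` for a real database cursor and `load` for an S3 upload turns this skeleton into the pipeline shape the video walks through.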
Grafana is a really good monitoring tool
Double like 👍🏽
Thank you
Awesome Video
Here is a summary:
00:57 - Understanding the data domain (example: finance data terminology; what the relationships are, primary keys, foreign keys; give the business side a clear picture of what data engineers can provide)
02:57 - Choosing data sources (examples: SQL database, distributed file system, API, sensor data, web-application generated)
04:43 - Determining the data ingestion strategy (full load or incremental load)
08:37 - Designing the data processing plan (pipeline design: real-time processing or batch processing)
11:11 - Setting up storage for the pipeline output (Amazon S3 or HDFS for a data lake, AWS Redshift or Hive for a data warehouse, or dump back into transactional databases)
13:19 - Planning the data workflow (scheduler: Apache Airflow, Apache NiFi, Azkaban)
14:42 - Monitoring and governance tools (alerts for pipeline failures; tools: Kibana, Grafana, Datadog, PagerDuty)
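The ingestion step in the summary above (04:43, full load vs incremental load) is worth a small sketch. A common incremental pattern is the high-watermark: only pull rows newer than the last successful run. The `updated_at` column and sample rows below are hypothetical stand-ins for a real source table.

```python
# Hypothetical source table with an ISO-date change column
source = [
    {"id": 1, "updated_at": "2024-01-01"},
    {"id": 2, "updated_at": "2024-01-05"},
    {"id": 3, "updated_at": "2024-01-09"},
]

def incremental_load(rows, watermark):
    # Select only rows changed since the last run (ISO dates sort lexically)
    new_rows = [r for r in rows if r["updated_at"] > watermark]
    # Advance the watermark only after the batch succeeds
    new_watermark = max((r["updated_at"] for r in new_rows), default=watermark)
    return new_rows, new_watermark

batch, wm = incremental_load(source, "2024-01-03")
print([r["id"] for r in batch], wm)  # [2, 3] 2024-01-09
```

A full load would simply return every row each run; the watermark is what makes repeated runs cheap on large tables.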
Really good Content
Very nice content
When will the complete Data Engineering course be launched by Scaler?
Need a full course for Data Engineering.
Redshift is already set up on the cloud; what about Hive?
Very nice 🙂
You guys did a great job.
Here the data source is MySQL; what if data were coming in from multiple sources?
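On the multi-source question: a common answer is to normalize each feed into a shared schema and then merge on a business key. A toy sketch, with made-up field names standing in for a MySQL table and an API feed:

```python
# Two hypothetical feeds with different schemas
mysql_rows = [{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ravi"}]
api_rows = [{"user_id": 1, "plan": "pro"}, {"user_id": 3, "plan": "free"}]

def normalize_api(rows):
    # Map the API's field names onto the warehouse schema
    return [{"id": r["user_id"], "plan": r["plan"]} for r in rows]

def merge_on_id(left, right):
    # Full outer merge keyed on id: rows present in either feed survive
    by_id = {r["id"]: dict(r) for r in left}
    for r in right:
        by_id.setdefault(r["id"], {"id": r["id"]}).update(r)
    return [by_id[k] for k in sorted(by_id)]

merged = merge_on_id(mysql_rows, normalize_api(api_rows))
print(merged)
```

At scale this normalize-then-join step is what Spark or a warehouse query does; the shape of the problem is the same.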
Scaler knows what us students are searching for on google before an exam lol
Shashank just makes everything so easy to understand
Dumb explanation. What he's explaining is based on his own experience; it's not at all generic. He himself needs to improve.
Half-baked knowledge.
Data Modelling part was missed I guess
More Data engineering related content please
Thank you for the nice explanation.
Happy to hear that! 🙌🏼
🔥🔥🔥