What "REALLY" is Data Engineering? By a Data Engineer
- Published 8 Jul 2024
- What is Data Engineering? What is the difference between a Data Engineer, a Data Scientist and a Data Analyst? What skills are required to become a Data Engineer? What is the need of a Data Engineer?
In this video, I break all these questions down while showcasing the past, present and the future of Data Engineering. If you find it helpful, don't forget to leave a like and subscribe to the channel 😊
Book 1:1 sessions with me to fast track your journey to becoming a successful data engineer here:
topmate.io/jash_radia
Follow me on social media here:
LinkedIn: / jashradia
Instagram: / jash.radia
#technology #cloud #google #dataengineering
Roadmap video: • God Tier Data Engineer...
System Design video: • How to NOT Fail a Syst...
Sources:
trends.google.com/trends/expl...
www.educative.io/blog/what-is...
/ 20130306150901-25760-t...
trends.google.com/trends/expl...
quanthub.com/what-is-data-eng...
Thank you for watching this video! I have tried to cover Data Engineering as a whole concept, which should be useful for everyone from beginners to experts! Btw, thanks for 4K subs! Our community is growing strongly ♥
I have been following your channel from the beginning and I cant thank you enough for the valuable information/perspective you put into the world. I am extremely excited for our meeting on Monday through Topmate! -Chris
Hi Chris, sure! I can see your appointment. Looking forward to it 😊
Thank you very much for clearing fundamental concepts of data engineering very comprehensively. Your videos are far and far better than other youtubers like darshil parmar, learning bridge etc who have never explained in layman's term and always tried to explain as fast as possible (from my point of view).
Keep it up
Thank you so much! These guys have been on YouTube for a long time, and it kind of feels like an achievement to even be compared with them! I generally try to focus on improving the quality of my own videos with every upload 😊 Thanks for watching and liking this content!
Thank you for sharing this article on DE Jash. Much appreciated.
Thanks, Tushar 😊
Thank you Jash. Really a clear cut & well explained video.
This was informative and I can totally relate it. I worked as a report programmer and now as an analyst and also involved in ad hoc model development request. I thought algorithms can do magic😅I spent mostly 6 months to learn basic ML concepts but when I got my hands wet on model building I realised that it’s the data which does the magic mostly and without a quality data no model can survive.
Now I’m learning DE concepts which I neglected earlier although I use hive every day for my work.😢
Can you please make a separate video on different job families inside DE ? and also any tips for analysts or some one who is already in data field and wants to become cloud data engineer?
Thanks Suraj! Since you have already worked with Hive, learning tech like Spark, Snowflake, BQ etc. in your free time will give you a good kickstart 😊 And yes, it is a great suggestion. I will add it to the backlog!
Jash the good thing about u is that u explain thing's well and also put the links for us reference in all video's..
Ur DE roadmap is my fav video...
Love your content 🙂
Thank you so much! 😊 Getting these kinds of positive comments is the reason I go the extra mile to put more quality into the content!
@@JashRadia Google is my dream company... And u r one of the reason I'm switching to data engineering...
Presently I'm working as ETL MDM datastage developer from 1 year...
Hope I get a chance to crack google interview..
Keep posting such great insightful videos jash... U r motivating many like me..
This is awesome to know! I wish you all the best for your goals. Hope to see you at Google one day!
Cool video, this was really helpful
This is so informative. Keep up with the good work 😄👍
Thank you so much! 😀
Very Detailed 👌🏻👌🏻 Your hard work reflects in each video 👏🏼
Thank you! 😊😊
This is a really clear and informative video. Thanks Jash♥
Great to know! Thanks for watching 🫡
Thank you very much for your precise and informative videos💛
Pleasure! 😊
Liked and Subscribed, liked the guy who got enlightened, add such things more often at the right time and place, for the audiences to remember apart from the excellent analysis of data definitions across the board, Liked and Subscribed @Jash
Thank you! 😃
Pretty much great video a about DE and history.....DE has been there since long time and hope you start the separate playlist for DATA Engineering and where you can explain and few practical stuff and also You can come up with the industry live use cases for the same.
Thanks, and yes, I am already thinking about doing this. That's why I created a separate DE playlist right after uploading this video!
Wow that was a great eye opener, loved your content coz as u mentioned for me DE was like pipelines, tools, etc... I would really like to hear about the points u mentioned here. How can I build my approach trying to achieve them? Any roadmap, some content that helps me understand it deep dive?
1.Figuring out important metrics in data.
2. Finding out insights.
3. Making recommendations based on historical data points.
Thank you, Kunal! 😊 These skills are much harder to master than technical skills. For them, we also have to understand the business context and the goal behind the data pipeline: why are we building it, and what question are we trying to answer? You have to think through the lens of an analyst or a product owner to deliver this business value. Your experience will also help you understand this viewpoint over time. I have a technical roadmap on the channel, but it will only help you on the technical side. Business domain knowledge is also a must no matter where you work, be it healthcare, finance or something else.
thank you!
thanks broo. ur videos are really helpful.
Thank you so much! 😄
Hi Jash.. Thanks for the very informative content.
Thanks 😊 glad you liked it!
Great content Jash! This clarity was really important!
Thanks, Chetan. And yes indeed. 😊
Very nicely explained... If possible, can you please make a video on how to approach Data Engineering career for Database developers and DBAs (Oracle, Postgres, Sybase or any other) ?
Thank you and good suggestion! For DBAs, learning DE concepts becomes a little easier since they might have experience with SQL and DBMS or DWH concepts. It can give you a boost. You can skip those sections in my roadmap..
@@JashRadia hello sir, I am really confused at my career path.please please kindly help me. I am in cognizant. Jr software engineer. And doing basic devops pipeline work. I don't want programming and coding like development team , where should I go in devops or data engineering as I feel DE is hectic , more heavy duties, tough to learn. I know power bi, tableau ,SQL ,python. Where should I head.. please help me
Great video
Thank you!
Thanks for informative video
Please make video on how fresher can get into this field
Thank you and sure, thanks for the suggestion. I will do that too. That's why I did this video first to understand DE as a concept.
I feel like what you described in the intro is more about what data analysts do
Most people believe that data engineering is all about tech, but I think we also have to be business-centric. In the intro, I mentioned:
Finding out important metrics: this helps in designing the data model before building the data pipeline.
Transforming and aggregating data: this is one of the core skills for DE.
Finding out insights: I agree this point falls more on the analyst side, but if we understand the data well enough to figure out what went drastically wrong when it did, it can reduce the debugging time in the pipeline, too.
Again, these are only my views.
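Editor's note: the reply above ties together metrics, aggregation, and spotting when something "went drastically wrong". A minimal sketch of that idea in plain Python follows. The event data, the function names `daily_totals` and `flag_drops`, and the 50% threshold are all illustrative assumptions, not anything taken from the video.

```python
from collections import defaultdict

# Hypothetical raw events: (day, order amount). Purely illustrative data.
EVENTS = [
    ("2024-07-01", 100.0),
    ("2024-07-01", 50.0),
    ("2024-07-02", 140.0),
    ("2024-07-03", 30.0),
]


def daily_totals(events):
    """Aggregate raw events into one metric per day (total amount)."""
    totals = defaultdict(float)
    for day, amount in events:
        totals[day] += amount
    # Sort by day so that consecutive entries are consecutive dates.
    return dict(sorted(totals.items()))


def flag_drops(totals, threshold=0.5):
    """Return the days whose total fell below `threshold` times the previous day's total."""
    days = list(totals)
    flagged = []
    for prev, cur in zip(days, days[1:]):
        if totals[prev] > 0 and totals[cur] < threshold * totals[prev]:
            flagged.append(cur)
    return flagged


if __name__ == "__main__":
    totals = daily_totals(EVENTS)
    print(totals)
    print(flag_drops(totals))
```

A check like this is only a starting point: a sudden drop may be a real business event or a broken upstream load, and knowing which metric matters is what lets you tell the two apart quickly.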
Man ! I love your videos. so crisp , point to point and informative.
I am trying to switch into DE from Mainframe Analyst. While preparing I figured out from my own that there are different kind of DE roles, as you mentioned in this video as well.
In 7 years of my IT expreience, I have worked on SQL, Power BI,Visuzulations, Data Analysis, Databasees, python. Apart from this in past 4 months i learend Snowflake, Airflow, Big Query,basics of pyspark,GCP certification,GCP hands on Labs.
Now i am little bit stuck and confused when i see JD of jobs. My skills are matching only 50% of what i have learned or sometimes companies are looking for extra skills as well. What to do in this condition? Where to focus more or what mistake i am doing? Would be great if you can guide me man!
Thank you and 50% is fine. Try figuring out what is the most common skill missing. Work on that. Even if JD matches 70%, you should apply. Don't wait for 100% match.
@@JashRadia thanks for your feedback, any place where I can try actual hands on or make real time projects ?
@@shivendrakhare1583 Brother, in IT we need to upgrade ourselves every now and then. Also, does anyone need to remember the first work he/she did at the start of his/her career? I am afraid I am not great at learning and remembering things.
sir i am fresher in college ,i wanted to pursue my career as data engineer ,what are the online platform that you suggest me to take courses,i tried ibm's data engineering but it was boring and i need a course which is interactive
is data engineering tasks repetitive? I find the cleaning data exercises mundane.Maybe DE is more than just cleaning data. What are the interesting DE tasks in your opinion?
How we can start side projects in data engineering. Where we can connect to extract raw data except web scrapping??
How we can design near real time data pipelines same as we use in projects in companies??
Checkout websites like project pro for such projects.
Try the websites like data.world/ and kaggle for datasets. You can also search for standard datasets like TPC-H from relational dataset repository relational.fit.cvut.cz/dataset/TPCH
For real time data, there are multiple Open source APIs available here:
www.programmableweb.com/category/real-time/api
what your point on 'python' or 'java+scala' as i heard 2nd option is widely used in building projects compared to 1st one.
I believe Python has more scope outside of the big data world too. Also, its data libraries make Python simple to use. In terms of job openings, Python is way ahead of Java plus Scala.
Hi Jash,
That was truly informative and cleared the things which one should bag to start applying for Data Engineer roles. However, I have a question.
In what hierarchy should one learn the following topics to become good in the data domain :- Python (Beginner to Advance), Machine Learning, Cloud Computing, Deep Learning, AWS/GCP/Azure, AI/Deep Learning, System Design, DBMS/Distributed Systems, DSA (if needed)
Thank you.
Thank you and for hierarchy, we should follow this: SQL -> Python (DSA) -> Spark -> DWH and other data concepts -> Cloud -> System design -> Docker -> ML
Thank you @@JashRadia
Can you discuss DSA and system design needed for Data engineer profiles in big product companies ? Or make content like you create for roadmap.
I have already created a video about system design. And DSA is nothing special. Only intermediate level skills or problem solving questions are required from leetcode etc.
Sir I had learned fundamental but have some series projects in mind based on data Science A. I predictable application using cogintive service based such cv. Which way i sould go through.
Hi Chirag. I don't understand the question.
please make video about how data engineers start there career as an intern or fresher as mostly companies apply experience criterias for this role?
Hi bro
For someone who doesn’t really like coding which role would you suggest
Data engineering or data analyst .
Please help . Thank you
Data analyst
What's the difference between data scientist and data engineer?
Thanks alot for this video. It was really informative.
Just one question-
As a data engineer which language shall i pick for spark..
Is it python or scala.
As people say python is good if you want to go in ML related things.
But scala is good for hard core data engineering work.
Just wanted to know your thoughts on this.
Thanks in advance.
Thank you so much and I prefer Python because
1) libraries related to Data
2) Use cases outside of Spark
3) More job postings
4) Easily integrated with cloud connectors like Snowflake which is not available in scala
Yes, Scala is faster in terms of performance, but PySpark is getting better with every version. Overall, Python's use cases outside Spark will beat Scala any day. So I'd prefer Python.
Thanks for such a detailed answer
@@sarthakverma5921 You're welcome!
hi this is a very informative video .can you say any websites to practise the real time projects and daily how many hours we need to practise to become a data engineer can NON IT student can become a data engineer
Thank you for watching it! For all this, checkout my data engineer roadmap with links of resources and projects.
For a Non IT student, you need to spend more time in the prerequisites section mentioned in that video.
ua-cam.com/video/WgCavqDntlQ/v-deo.html
Please suggest a good source for dwh/ data modeling.
You can refer to this course for DWH. www.udemy.com/course/data-warehouse-fundamentals-for-beginners/
I have seen a lack of courses on Data Modeling especially related to questions that are generally asked in the interviews. I am creating such course and will soon launch it when it is done.
Hi Josh, what would be the best degree to pursue a career in Data Engineering, is it Computer Science or Data Analytics?.
Computer science all the way. Data analytics is a topic that shouldn't have its own degree. You should learn it from websites like Coursera, Udemy or YouTube. Doing a specific degree in it doesn't make sense. CS will be useful no matter where your career takes you, including the data domain.
I am trying to get into data engineering , should I focus on python for data science or python in general ?
Learning Python basics is key, because as a DE you will be using it a lot. Then gain some knowledge of basic data libraries like pandas, numpy etc. Learning ML-related Python is optional but good to have. Scikit-learn would fall into this category.
I am from Mechanical Engineering background and I have an interest for this role. what are the roadmaps for different field engineers?
Checkout the roadmap on my channel. It will make you good in all those areas from basic. Then you have to double down on the field you are interested in
Career after becoming a data engineer at big tech companies like is there more to explore
Yes, there are always new things to learn. For example, I mostly worked on AWS and Azure but am now working in GCP.
Slowly, I am also doing a little bit of ML engineering work even as a DE. Learning is great in this field.
New concepts like data mesh and tools like Immuta and Starburst are hot topics to learn.
@@JashRadia Yes because I've graduated this year so i was not having idea about it. I joined cognizant as a big data and pyspark role. Slowly learning tools and technologies.
Btw thanks for answering the question.
No problem. All the best for your journey 😊
Hi Jash, I have started working on spark, and I want to learn about internals of spark , like how executors, cores, partitions, jobs, stages, tasks and how they are created when I run a spark job (with several joins and aggregation). I am able to see these in Spark UI but not able to understand how the no.of jobs,stages created each time.
I would appreciate if you could suggest any blog, video or courses for the same because the only example i find on the internet is the word count problem. I would also suggest you to write a blog or make a video on this topic because it is not explored much. Great video btw.. i was able to relate to most of what you said. #YNWA.
Thanks a lot Kaushik and yes, I get your question. I am creating a course on DE for a website which will cover all these things. You can also book some time with me on topmate to get even better idea.
Apart from that, you can refer to these courses/articles:
medium.com/analytics-vidhya/spark-ui-c7f2ca9ef97f
www.databricks.com/session/deep-dive-into-monitoring-spark-applications-using-web-ui-and-sparklisteners
@@JashRadia Thank you . Hope we make top 4 this time 😄. #YNWA
So true man. Looks so hard now. And we got Madrid in UCL again 🥲🥲 #YNWA
Is it a good career for a fresher and how difficult it could be to become a data engineer as a fresher
Yes and it is easy to get a data engineering job at a mid scale or service based company as a fresher. After some time you can switch to MAANG level companies too.
I am currently studying my 2nd year of BCA so can I start my career as a data engineer
Yes, you can start learning the basics and then start applying for internships as well. I have seen folks with BCA or MCA a lot in data field.
I work on building etl pipelines (sql,excel,python) on raw data(excel, databases). Can I call myself as a data engineer?
Yes! 100%
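Editor's note: for readers wondering what "building ETL pipelines on raw data" looks like at its smallest, here is a hedged sketch using only the Python standard library. The CSV contents, the `orders` table, and the function names are made up for illustration, and an in-memory SQLite database stands in for whatever database the commenter actually uses.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this might come from an Excel or CSV file.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,alice,4.25
"""


def extract(text):
    """Extract: parse the raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and cast fields to proper types."""
    clean = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        clean.append(
            (int(row["order_id"]), row["customer"].strip().lower(), float(row["amount"]))
        )
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into a table; re-runs overwrite by primary key."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn, text=RAW_CSV):
    """Run extract, transform, load, then return total spend per customer."""
    load(conn, transform(extract(text)))
    return conn.execute(
        "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    print(run_pipeline(sqlite3.connect(":memory:")))
```

Using `INSERT OR REPLACE` keyed on `order_id` is one simple way to make the load step safe to re-run; real pipelines usually add scheduling, logging, and data quality checks on top.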
@@JashRadia thank you for the quick response. Please let me know if you can mentor me in landing at Google or other product based organization. I have 3 years of experience in aforementioned work at ADP Hyderabad
@@pavankumar-ni3my yes I can. I have been doing this with about 15 more people on topmate on regular basis. Feel free to book a slot with me. Link is mentioned in the description.
Please suggest some good resources to read data pipe system design.
Try websites like ProjectPro and then learn the different services on a cloud platform. Every data pipeline system design question in an interview can be different. You just have to figure out when to use which service. This comes with practice and with knowing the use cases of cloud services.
@@JashRadia Thank you very much for reply. I lost my job this month. Giving lot of interviews. I have 3.8 years of experience in sql ,Python ,spark , airflow ,bigquery , snowflake , dbt etc. Still I am getting blank in system design questions. Hope you will make video of this topic as well.
Thanks for the suggestion. Added in the backlog. And all the best for your job search!
What certifications can you purchase as a data engineer ?
Depends on the cloud platform you are working. For GCP, Professional Data engineer is recommended.
Also, for general spark certification, databricks certification has a lot of value. Developer associate one.
how much time required for learning data engineer? To land a internship or job (at least 5lpa),
I know python, SQL, and dbms.
I am in 3rd year information Technology branch.
It can take about 4 months if you already know Python and SQL
@@JashRadia thank you! One more question how much DSA i have to know for data engineering ? And which level of problem solving (Easy or medium of leetcode).
@@somnathdutta6311 welcome! And leetcode easy and medium will do. checkout my roadmap video I have mentioned these things in detail there.
@@JashRadia ok thank you so much.
@@somnathdutta6311 no problem! 😊
Please make video for bsc student to get into data engineering
Companies take bsc students or not or they need masters ?
They take bsc students but it is usually service based companies. Then you can switch to product based after some experience.
@@JashRadia okay
What will be the future of data engineering? Will it be a good career?
Yes, 100%. DE is and always will be a prerequisite for DS. Only the tools and technologies will change. Check out this post I did on LinkedIn on this question recently.
www.linkedin.com/posts/jashradia_bigdata-dataengineering-data-activity-6995981582578642944-36SG?
Thanks Jash
@@mohanprasath1781 no problem!
@@JashRadia yeah but data access will be a problem in web 3 because of user permission
And may be there is a chance of structure data in web 3
@@Nick-du9ss We don't need to store or process PII data as-is. Proper anonymization with masking and tokenization needs to be done. This is handled in today's environments as well, and the same can be implemented in web3. I don't see what's so different about that.
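Editor's note: the reply above mentions masking and tokenization without showing either. The sketch below is one possible way to illustrate the two ideas in Python; the record layout, the helper names, and the hard-coded key are assumptions for the example only, and a keyed hash (HMAC-SHA256) is just one of several tokenization approaches.

```python
import hashlib
import hmac

# Hypothetical key for illustration; a real one would live in a secrets manager.
SECRET_KEY = b"example-key-not-for-production"


def mask_email(email: str) -> str:
    """Masking: keep the first character and the domain, hide the rest of the local part."""
    local, _, domain = email.partition("@")
    if not local or not domain:
        return "***"  # not a recognizable address, hide it entirely
    return f"{local[0]}{'*' * (len(local) - 1)}@{domain}"


def tokenize(value: str) -> str:
    """Tokenization via a keyed hash: the same input always maps to the same token,
    so joins and counts still work, but the original value is not stored."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def anonymize(record: dict) -> dict:
    """Return a copy of the record with direct identifiers replaced."""
    return {
        "user_token": tokenize(record["user_id"]),
        "email_masked": mask_email(record["email"]),
        "amount": record["amount"],  # non-identifying field passes through
    }


if __name__ == "__main__":
    print(anonymize({"user_id": "u-1001", "email": "chris@example.com", "amount": 42.5}))
```

Because the token is deterministic, two tables anonymized with the same key can still be joined on `user_token`, which is typically why tokenization is chosen over simply dropping the column.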
Can we add javascript with react js in data engineering
Please let me know ??
Very little use for either, I would say. JavaScript can be used in a few places, like writing stored procs or some container-based apps; otherwise it is not required. You don't need React JS at all.
Bro I am studying in Artificial intelligence.
How is the view from the top of that pyramid? 😌
Hello sir I am a teacher of English age 30 l have been trying to make a transition into this domain is it really possible if I devote 1 year for building my skills on python Hadoop SQL etc
Yes, it is. 1 year is sufficient time.
Brother, please make a detailed video on non-tech placements.
Sure, thanks for the suggestions. For which profile are you specifically looking for?
What is road map to learn data enginnering
It is on my channel. Check it out.
@@JashRadia ok sir ..
Hi iam a etl developer, ple let me know how to become data engineer,
Hi Rahul, please check out my DE roadmap video; the link is in the description. It applies to everyone. You can skip the parts you already know. If you need more 1:1 help, feel free to book a time with me on Topmate. That link is in the description too.
I am from non tech background i am learning data engineering how do i crack google
Start with the roadmap that I have posted on the channel and start getting DE experience in any company. Once you have enough skills and experience, your non tech degree won't matter and then you can apply to google.
Cleaning data
🇱🇰❤️
Where you see data engineering after 10 years.....
Can b.a graduate become data engineer
Yes!
Jash brother which is better in salary at entry level data engineer or data analyst
@@longliveindia1637 Data engineers and Data scientists generally earn significantly more than Data Analysts. But data analytics is a good way to enter data world from a non tech background.
@@JashRadia thanks buddy
@@JashRadia after data analyst what should i opt data engineer or data scientist in terms of salary
You mean, data scienist..
You said definition of what data science folk does.
Simply Wasting time video
Plez help me on how to contact you Jash ?
You can use the topmate link added in the description of the video.