Top AWS Services A Data Engineer Should Know

  • Published Jun 6, 2024
  • This is a high-level overview of the top AWS services a data engineer should know in order to solve their data engineering challenges. I explain it using an example of integrating two different data sources to create a central data repository that enables our hypothetical analytics team to perform their own self-service analytics. This video is broken down into Data Ingestion, Data Lake, Transformation, Data Warehouse, Data Analytics, Application Integration, Data Pipeline Orchestration, and Monitoring.
    Timeline
    00:00 Introduction
    01:05 Data Ingestion
    04:24 Storage - S3
    05:10 Transformation
    05:47 Data Catalog
    06:44 Data Warehouse
    07:13 Data Analytics
    09:08 Application Integration
    10:28 Orchestration
    11:57 Monitoring
    buy me a coffee: www.buymeacoffee.com/dataengu
    useful links:
    AWS Serverless Data Lake Architecture: • AWS Serverless Data La...
    Optimize Data Lake: • 3 Tips To Optimize You...
    SNS vs SQS: • SNS vs SQS Comparison?...
    #AWS
    #dataengineering

COMMENTS • 167

  • @cringe6006
    @cringe6006 1 year ago +52

    Woah woah
    I know nothing about AWS
    Then why the heck did i totally totally understand this video
    It was crystal clear
    I usually don't subscribe to channel but this time it was not even a question 👍🏾
    Man I wish you could teach me all about data engineer

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago +4

      Thanks for the wonderful feedback! I'm glad the way I explained it was helpful. Thanks for subscribing! More AWS related content to come.

    • @cringe6006
      @cringe6006 1 year ago

      @@DataEngUncomplicated thank you
      Eagerly waiting for more content 😁

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago +1

      Thanks, I try to strike a balance between overview videos and technical tutorials on AWS. Fun fact, I think you are my 5,000th subscriber!

    • @cringe6006
      @cringe6006 1 year ago

      @@DataEngUncomplicated 👏🥳🎉🎊

  • @rahulsrivastava9787
    @rahulsrivastava9787 2 days ago +1

    The concepts in this video went inside my brain like a hot knife going in butter. Great video for someone like me who comes from a functional background. Great work...really appreciated.

  • @renzcarillo7277
    @renzcarillo7277 2 years ago +76

    As a self taught data engineering student, figuring out what services to start with aws is very hard - this indeed uncomplicates everything!

    • @DataEngUncomplicated
      @DataEngUncomplicated 2 years ago +6

      Thank you for the kind words Renz! I'm glad it was helpful.

    • @francismagnusson378
      @francismagnusson378 1 year ago +1

      hey there, could you give a link to a resource for data engineering? im about to start a job in DE and im kinda intimidated with the various skills needed for the job. I already know Python and SQL (which is why i was hired, or so im told) but i know nothing about DE. im about to start this udemy course on Python, SQL, and Pyspark, but im afraid it might not be enough. any help would be appreciated, thanks!

    • @samb23692
      @samb23692 1 year ago +1

      @@francismagnusson378 Hi, how did you proceed? How's your job going on?

  • @jihanzhang5527
    @jihanzhang5527 1 year ago +2

    Great video. Something to add here: S3 Select can be used for quick, ad hoc querying of a single S3 file. Athena can also work directly with S3 files if you just need some quick data understanding and investigation. EMR Serverless can address the headache of managing an EMR cluster and, in the meantime, gives you more power for ML.

  • @go556
    @go556 1 year ago +2

    It is so far the most helpful video I saw about aws services for DE. I hope there are more likewise. Thanks a lot for sharing!

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago

      Thanks for the comment! Yes, new videos related to data engineering and AWS every week!

  • @vibhavmishra2002
    @vibhavmishra2002 1 year ago

    I am glad I found this video. Brilliant overview. cheers !!

  • @mehmetkaya4330
    @mehmetkaya4330 1 year ago

    Such a great video! Summarized basic AWS services for data engineering very nicely! One of the best! Thanks!

  • @chriscrocker438
    @chriscrocker438 1 year ago +2

    I'm currently working on my AWS certification and will be referencing the diagram from this video often. Thanks for the clear and concise walk through of the context of each of these services!

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago

      You're welcome Chris, I'm glad it was helpful. Good luck on your AWS certification!

  • @LittleDetours
    @LittleDetours 1 year ago +1

    Love your clarity on the topic. Subscribed! Can't wait to explore all your videos👀

  • @shreyaroraa2234
    @shreyaroraa2234 1 year ago +1

    One of the best AWS explanations I have seen so far

  • @ricardolizano8851
    @ricardolizano8851 8 months ago +1

    This is pure gold. Thanks!

  • @victoriwuoha3081
    @victoriwuoha3081 2 years ago +2

    @DataEng Uncomplicated. This has to be one of the best explanations of how I can use AWS for my data analytics engineering workloads. Thank you for the detailed summary of the various services.

  • @jasminew7573
    @jasminew7573 2 years ago +2

    Great video Adriano! It helped me understand all the AWS services better.

  • @obiebbw6630
    @obiebbw6630 2 years ago +3

    As someone else commented. I'm learning to be a Data engineer and learning what each application is used for has been a struggle. I'm learning the Azure system, but seeing this visual helped. New sub.

  • @user-bq9ph5im1q
    @user-bq9ph5im1q 9 months ago +1

    Thanks for creating this video. You explained the concepts very clearly.

  • @Ghillieye
    @Ghillieye 1 year ago +1

    Great overview and I think your method of slowly explaining the diagram section by section is brilliant! A follow up video of a real use case would be even better. Subbed!

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago

      Hi Adrian, Thanks for the feedback and subscribing! Can you elaborate on your suggestion? Are you thinking of an actual hands on tutorial or overview use case type video?

  • @saiduluchintha3766
    @saiduluchintha3766 2 years ago

    Excellent video.. the sequence you have covered this in is seamless.
    I am surely having this for quick reference.

    • @DataEngUncomplicated
      @DataEngUncomplicated 2 years ago

      Thanks so much for your kind words Saidulu. I really appreciate it.

  • @tiisetsomokhesi5206
    @tiisetsomokhesi5206 2 years ago +5

    I am starting as Data Engineer with a company that uses AWS ( I am from Azure background), this video has been really helpful with the architecture and services.

    • @DataEngUncomplicated
      @DataEngUncomplicated 2 years ago

      Thanks Tiisetso! I'm glad this video was helpful. Thank you for leaving me a comment

  • @sushilamahato
    @sushilamahato 2 years ago +3

    I found this video very useful as a learner. Thank you!

  • @user-ij4ih8qp3e
    @user-ij4ih8qp3e 24 days ago +1

    Thank u so much. Your tutorial helps me a lot.

  • @Omer698
    @Omer698 3 months ago +2

    Your channel is a god send. Data Engineering channels are rare on youtube and those that do exist are tailored towards Indian Students. Thank you for the content and you've got a new subscriber.

  • @SlimmDrea
    @SlimmDrea 2 years ago +2

    This was perfect! Exactly what I was looking for lol.

  • @naraendrareddy273
    @naraendrareddy273 9 months ago +2

    Hands down the number 1 video for beginner Data Engineers

  • @cludianobre
    @cludianobre 7 months ago +1

    fantastic video. Thanks for this.

  • @nareshs7710
    @nareshs7710 1 year ago

    simply well articulated

  • @nikeating
    @nikeating 2 years ago +1

    Great video. Super well summarised

  • @kennedysigauke953
    @kennedysigauke953 1 year ago

    Very informative, thanks!

  • @tamasensei550
    @tamasensei550 3 months ago +2

    This video is really great. As an ETL developer, I aspire to become a data engineer in the next few years. Your explanation is very clear!

  • @endpermia
    @endpermia 9 months ago

    Thank you so much for this video! I have an interview tomorrow and this boosted my understanding and confidence. Great explanations!

  • @clovisfilho93
    @clovisfilho93 1 year ago

    Great video!! Thanks for sharing, it really helped me better understand AWS tools

  • @melimesesan5786
    @melimesesan5786 4 months ago +1

    Awesome job!

  • @BeABetterDev
    @BeABetterDev 2 years ago +1

    Amazing Video!

  • @ajprasad6865
    @ajprasad6865 1 month ago +1

    Clear and concise

  • @networkfreddy2000
    @networkfreddy2000 4 months ago

    Quickly subscribed. Currently an AWS Cloud Engineer for an AI company, so I've been upskilling in Data Engineering. Planning to take the DEA-C01 exam. Great information and your presentation style is perfect!

    • @DataEngUncomplicated
      @DataEngUncomplicated 4 months ago

      Thanks so much for the kind words! I'm glad it was helpful. Good luck on the exam!

  • @oluwatobitobias
    @oluwatobitobias 2 years ago

    God bless the works of your hand....great job

  • @brozkeeper
    @brozkeeper 10 months ago +1

    You sir, are the main man. Thank you.

  • @ravivarma8988
    @ravivarma8988 1 year ago

    It works! Thanks a lot.

  • @senthilsds
    @senthilsds 2 years ago +1

    I am looking for hands on experience. This video helps me understand concepts better

  • @sailpawar6164
    @sailpawar6164 1 year ago +1

    i had watched so many other videos on same topic..this is the one i was looking for even though i didn't know what exactly i was looking for as everything was new

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago

      Thanks sail! I'm glad it was what you were looking for. What were you searching for on YouTube exactly?

    • @sailpawar6164
      @sailpawar6164 1 year ago

      @@DataEngUncomplicated i am familiar with hadoop environment..i wanted to know how to do all of it in aws..now i know! thanks

  • @groundingtiming
    @groundingtiming 4 months ago

    wow this is awesome!

  • @sergiojulio
    @sergiojulio 2 years ago +1

    Great video thanks

  • @gilang6128
    @gilang6128 10 days ago +1

    love this

  • @rofu37
    @rofu37 1 year ago

    Channel name checks out

  • @saquib513
    @saquib513 2 years ago +7

    This is such a great video. Any chance you will be doing a full-fledged video on implementing these tools together? And I love your teaching style, I would love to know if you offer any courses that I can take.

    • @DataEngUncomplicated
      @DataEngUncomplicated 2 years ago +3

      Hi Nazz, thank you for your kind words! Yes, I plan on making a playlist of technical tutorials on implementing each component, so if you subscribe to my channel, you will get notified when those videos are released! Unfortunately I don't offer any courses at this time, I'm just focusing on making YouTube videos to help data engineers on AWS!

  • @johndanson4427
    @johndanson4427 2 months ago

    10/10 in 2022, although in the last year or two Apache Iceberg, Spark and Kafka have been added into the mix, surfacing as "need-to-haves" rather than 'also useful'.
    Still the best Data Engineering overview demo on YT.

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 month ago

      Thanks John, yea I was thinking at some point to update this for 2024 or 2025. You are right, there are also new additions, such as lakehouse-architecture table formats like Iceberg, Delta or Hudi, that are now supported.

  • @jwtsfj
    @jwtsfj 2 years ago +1

    You are a legend sir

  • @demohub
    @demohub 2 years ago +1

    Thanks for sharing

  • @andrewting3081
    @andrewting3081 1 year ago

    Bruh, thank you SO MUCH!

  • @ososummer88
    @ososummer88 1 year ago

    Great video!

  • @techtransform
    @techtransform 2 years ago +1

    You are awesome

  • @jamespaz4333
    @jamespaz4333 1 year ago

    Wowww excellent video. Thank you very much. Is there any course that you could recommend to learn these specific tools?

  • @mwaqze
    @mwaqze 1 year ago +1

    Hi there, thanks for such a wonderful explanation of a complex topic. Can you share the diagram picture through a link please?

  • @Larry21924
    @Larry21924 3 months ago +1

    This is pure perfection. I read a book with similar content, and it was pure perfection. "Mastering AWS: A Software Engineers Guide" by Nathan Vale

  • @Bill0102
    @Bill0102 5 months ago +1

    Your work is truly impressive; it reminds me of a book I read that had a similar impact. "AWS Unleashed: Mastering Amazon Web Services for Software Engineers" by Harrison Quill

  • @309-baby7
    @309-baby7 1 year ago

    Great introduction to these services. Is there a specific data integration service to get data from a Salesforce (CRM) source?

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago

      AWS Glue appears to have Salesforce connectors, so that would be an option. I'm sure you could also do it in Lambda functions if your data is small enough.

  • @sandeepsingavarapu3839
    @sandeepsingavarapu3839 2 years ago

    Very informative video, thank you. I am trying to learn data engineering and to do some real-world projects. Could you create a few videos on end-to-end data engineering projects, and also suggest some real-world projects/ideas to try?

    • @DataEngUncomplicated
      @DataEngUncomplicated 2 years ago +1

      Hi Sandeep, yes, this is high on my video list! Thanks for the suggestion!

  • @chandrabhatt
    @chandrabhatt 1 year ago

    Subscribed

  • @samb23692
    @samb23692 1 year ago

    Hi, Great video. Can you give suggestions on how to start learning these services?

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago

      Sure, I can provide some suggestions on how to start learning AWS services:
      1. Start with the AWS Free Tier: AWS offers a free tier for many of its services, which allows you to explore and experiment with them without incurring any charges. This is a great way to get started and familiarize yourself with the AWS platform.
      2. Take online courses and tutorials: AWS provides a wealth of resources for learning, including online courses, tutorials, and documentation. You can start with the AWS Training and Certification website, which provides a range of free and paid courses on various AWS services.
      3. Join AWS user groups and forums: Joining user groups and forums can be a great way to learn from other AWS users and get answers to your questions. AWS provides an official forum, as well as many user groups around the world.
      4. Practice with real-world scenarios: Once you have a basic understanding of AWS services, try to apply what you have learned to real-world scenarios. This will help you understand how the services work together and how they can be used to solve real-world problems.
      5. Get certified: AWS offers a range of certifications for different roles and levels of expertise. Getting certified can be a great way to demonstrate your skills and knowledge to potential employers.

  • @pankajjagdale2005
    @pankajjagdale2005 9 months ago

    Great video! Amazon AppFlow is missing... Thank you

    • @DataEngUncomplicated
      @DataEngUncomplicated 9 months ago +1

      Great point, I know this service is being used more recently

  • @jamisonlewis4884
    @jamisonlewis4884 1 year ago

    Very well done! Don't forget the importance of data lineage though. Big time clients always want the capability to visually track data lineage.

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago +1

      Thanks Jamison! Yes! This is important, but there isn't a dedicated data lineage or governance service released yet in AWS. Amazon DataZone was announced at re:Invent, which should hopefully fill this gap.

  • @helovesdata8483
    @helovesdata8483 1 year ago +1

    I'm preparing for a data engineer interview. The company is looking for someone good at creating pipelines in AWS. I'm going to use your videos. I have read so many different definitions of "ingestion". Ingestion comes right after extraction in the ETL process, right?

  • @mercantilism954
    @mercantilism954 1 year ago

    Thank you for the great video.
    I have one question. Wouldn't it be very costly to use all of the AWS services? I store lots of data in S3 and it costs $100-150 a month.

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago +2

      Hi, it all depends on your use case for your data and access patterns. For example although you have 10 TB of data, it doesn't mean you are querying all 10 TB in every query and rather only doing queries on subsets of your data.
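To make the access-pattern point above concrete, here is a rough back-of-the-envelope sketch. It assumes a scan-priced query engine and a placeholder rate of 5 USD per TB scanned (check the current price list for your service and region), and it assumes the 10 TB is split evenly across 365 daily partitions. The function name `scanned_query_cost` is made up for illustration.

```python
TB = 1024 ** 4  # bytes in one tebibyte


def scanned_query_cost(bytes_scanned: int, price_per_tb: float = 5.0) -> float:
    """Estimate the cost in USD of one query billed by bytes scanned.

    price_per_tb is an assumed placeholder rate, not an official quote.
    """
    return (bytes_scanned / TB) * price_per_tb


# A query that scans the whole 10 TB data set.
full_scan = scanned_query_cost(10 * TB)

# A query that only touches one of 365 evenly sized daily partitions.
one_partition = scanned_query_cost(10 * TB // 365)

print(f"full scan:     ${full_scan:.2f}")
print(f"one partition: ${one_partition:.2f}")
```

Under these assumptions the storage bill stays the same either way, but the per-query cost depends almost entirely on how much of the data each query actually reads.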

  • @Draco-pu4ro
    @Draco-pu4ro 1 year ago

    Hi, What Services should I use if I have a source which sends CSV files and the schema changes every week? The column names are different and new columns were added each time. Ideally need to expose the data from these files into tables. Any suggestions as to which services should I use?

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago

      Hi Draco, It sounds like using the glue catalog would be a good choice to register your data in as it handles drifting schema. You can use a crawler to automatically scan and identify the changes in the schema
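As a rough illustration of what "schema drift" means for the weekly CSV files in this question, the sketch below compares the header rows of two files using only the Python standard library. It is not how a Glue crawler works internally; the helper names `read_header` and `diff_schema` and the sample data are made up for this example.

```python
import csv
import io


def read_header(csv_text: str) -> list[str]:
    """Return the column names from the first row of a CSV document."""
    return next(csv.reader(io.StringIO(csv_text)))


def diff_schema(old_cols: list[str], new_cols: list[str]) -> dict[str, list[str]]:
    """Report which columns were added or removed between two headers."""
    old, new = set(old_cols), set(new_cols)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


# Two hypothetical weekly deliveries from the same source.
week1 = "id,name,amount\n1,a,10\n"
week2 = "id,full_name,amount,currency\n1,a,10,USD\n"

drift = diff_schema(read_header(week1), read_header(week2))
print(drift)
```

Note that a renamed column shows up as one removal plus one addition, so a check like this can flag the change but cannot tell on its own that `name` became `full_name`.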

  • @gabrieljeca
    @gabrieljeca 2 years ago +1

    Good content! But how about AWS Managed Workflows for Apache Airflow for orchestration? Wouldn’t it be better to orchestrate lambdas and glue jobs with MWAA?

    • @DataEngUncomplicated
      @DataEngUncomplicated 2 years ago +1

      Hi Gabriel, thank you! Yea this is a great point, this could have been a service added to the orchestration component of the diagram. It's a good option, but I don't think it's necessarily "better", since it's another server you have to pay to keep running 24/7, whereas Step Functions and Glue orchestration are serverless and you only pay per number of invocations.

    • @gabrieljeca
      @gabrieljeca 2 years ago

      @@DataEngUncomplicated Thanks for the answer and the great insight there. I guess going serverless is always the best option. But are the execution logs from both Glue orchestration and Step Functions accessible in CloudTrail?

    • @DataEngUncomplicated
      @DataEngUncomplicated 2 years ago

      The logs for Glue and Step Functions are actually accessible in CloudWatch Logs.

  • @user-ym9hn4km8l
    @user-ym9hn4km8l 7 months ago

    Why is there an arrow from the AWS Glue Catalog to the data warehouse (Redshift)?

    • @DataEngUncomplicated
      @DataEngUncomplicated 7 months ago

      Glue catalog works on databases as well as data lakes so you can define your redshift datasets in AWS glue to keep track of them

  • @suleimanumar258
    @suleimanumar258 1 year ago

    Can you do the same but for Azure services?

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago

      Hi Suleiman, sorry, I'm not as familiar with Azure services. AWS is what I am currently focused on.

  • @iamdare
    @iamdare 2 years ago

    thanks for this. do you have a course on Udemy on Data Engineering?

  • @helovesdata8483
    @helovesdata8483 2 years ago +1

    If we clean the data after loading it into S3, this would be ELT right?

  • @danpefok3793
    @danpefok3793 1 year ago

    Interesting! Great job, but the author did not speak about security. I think we need security too.

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago +1

      Thanks Dan, you are right, I left out security. I would throw up KMS in the security section if individuals wanted to encrypt their data with kms keys

  • @josecarlossantos7673
    @josecarlossantos7673 1 year ago

    What's the alternative for loading data from the curated zone to Redshift and Athena? Lambda + Glue, or isn't it necessary?

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago

      When you say alternatives, do you mean AWS native alternatives? AWS announced an auto load feature from s3 to redshift I guess assuming that the schemas are the same.

    • @josecarlossantos7673
      @josecarlossantos7673 1 year ago

      @@DataEngUncomplicated Yes. For example: lambda and EMR before curated layer and after from curated to redshift and Athena do you recommend any aws service to load the data?

  • @BenOgorek
    @BenOgorek 1 year ago

    Great video! 2 questions:
    1) Any reason you didn’t mention DMS?
    2) What services help you out with database changes (deltas)?

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago +2

      Hey Ben, good call out on DMS. DMS is a good service for data engineers to learn if their focus is on data migration. For database changes, I have used both aws glue or lambda functions depending on the size of the data and building the delta logic in python.
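"Building the delta logic in Python" can take many forms. One minimal sketch, assuming each snapshot fits in memory as a dict keyed by primary key, is shown below. The function name `compute_delta` and the sample rows are invented for illustration and are not taken from the video.

```python
def compute_delta(previous: dict, current: dict) -> dict:
    """Compare two snapshots keyed by primary key.

    Returns new rows, rows whose content changed, and keys that disappeared.
    """
    inserts = [row for key, row in current.items() if key not in previous]
    updates = [
        row
        for key, row in current.items()
        if key in previous and previous[key] != row
    ]
    deletes = [key for key in previous if key not in current]
    return {"inserts": inserts, "updates": updates, "deletes": deletes}


# Hypothetical snapshots of an orders table taken on two consecutive days.
previous = {1: {"id": 1, "status": "new"}, 2: {"id": 2, "status": "new"}}
current = {2: {"id": 2, "status": "shipped"}, 3: {"id": 3, "status": "new"}}

delta = compute_delta(previous, current)
print(delta)
```

For small tables this fits comfortably in a Lambda function; for larger ones the same comparison is usually expressed as a join in a Spark or Glue job instead.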

  • @DaveThomson
    @DaveThomson 2 months ago

    Question: If you are pulling data from external APIs, would you use Glue to do this, or would you use something else to get this information and store it in S3 first, and then use Glue to transform the data in S3?

    • @DataEngUncomplicated
      @DataEngUncomplicated 2 months ago

      Great question Dave! I would recommend using lambda functions to ingest the data in S3. Glue is for processing large amounts of data and has a bit of a start up time. You will probably want a lambda function pulling data from your API frequently so the data load size would probably be relatively low.
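A minimal sketch of the Lambda approach described in this reply, under several assumptions: the API returns JSON, the event carries hypothetical `url`, `bucket` and `prefix` fields, and the S3 client is passed in so the core logic can be exercised without AWS. Only `put_object` with `Bucket`, `Key` and `Body` is used on the client.

```python
import json
import urllib.request
from datetime import datetime, timezone


def fetch_json(url: str, timeout: float = 10.0):
    """Download and decode a JSON document from an HTTP endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def ingest(url, bucket, prefix, s3_client, fetch=fetch_json, now=None):
    """Fetch one API response and store it as a timestamped JSON object in S3."""
    now = now or datetime.now(timezone.utc)
    payload = fetch(url)
    # Date-based key so each run lands in its own object (layout is an assumption).
    key = f"{prefix}/{now:%Y/%m/%d}/{now:%H%M%S}.json"
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(payload).encode("utf-8"),
    )
    return key


def lambda_handler(event, context):
    """Lambda entry point; the event field names are hypothetical."""
    import boto3  # imported here so the module also loads where boto3 is absent

    key = ingest(event["url"], event["bucket"], event["prefix"], boto3.client("s3"))
    return {"key": key}
```

Passing the client and the fetch function as parameters is a design choice for testability; a scheduled trigger would then invoke `lambda_handler` at whatever frequency the API data needs.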

    • @DaveThomson
      @DaveThomson 2 months ago

      Thanks, that's the direction I went.
      I have a meta lambda, a datasource lambda (1 for each data source) and an S3 upload lambda, using Step Functions.
      This way the meta lambda gets the customer information required and spawns parallel datasource lambdas, which all pass data to S3 upload lambdas.
      My only concern is how to best structure it in S3 for Glue.
      End goal here is Athena / QuickSight for BI purposes.
      I looked at AppFlow for some of this but hated it, since I couldn't get all the objects at one time and had to build an object per flow. So if a single data source has a lot of objects, that's a lot of flows, which seems annoying.

    • @DaveThomson
      @DaveThomson 2 months ago +1

      @@DataEngUncomplicated Thanks. I went with Step Functions.
      A master function that gets customer metadata and spawns functions for different datasources that all end up calling an S3 upload function.
      Now my only concern is whether I am storing the data properly in S3 for Glue to make use of.
      Something like
      # Format the file name based on the current date and data type
      file_name = f"{data_type}_{year}-{month}-{day}.json"
      # Update the S3 key (path) to use the 'year=YYYY/month=MM/day=DD' partitioning convention
      s3_key = f"{customer_id}/year={year}/month={month}/day={day}/{data_type}/{file_name}"

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 month ago

      @@DaveThomson I hope you figured this structure out by now, but if you want to use Athena, you need to have your datasets separated into different objects (folders) in S3. I would add a partitioning strategy as well, which will save you in query costs if you know how your data will be queried.
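One way to read the advice above is to put the dataset (data type) at the top of the key, so each dataset gets its own folder, and to express everything else as `name=value` partition folders. The sketch below is only one possible layout; the function name `build_s3_key`, the choice of `customer_id` as a partition, and the sample values are assumptions.

```python
from datetime import date


def build_s3_key(data_type: str, customer_id: str, day: date, extension: str = "json") -> str:
    """Build a key with one top-level folder per dataset and name=value partitions."""
    file_name = f"{data_type}_{day:%Y-%m-%d}.{extension}"
    return (
        f"{data_type}/customer_id={customer_id}/"
        f"year={day:%Y}/month={day:%m}/day={day:%d}/{file_name}"
    )


# Hypothetical example values.
key = build_s3_key("orders", "c42", date(2024, 6, 6))
print(key)
```

With a layout like this, a table can be defined per dataset folder, and queries that filter on `customer_id`, `year`, `month` or `day` only need to read the matching partitions.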

    • @DaveThomson
      @DaveThomson 1 month ago

      @@DataEngUncomplicated Thanks!

  • @makhus8337
    @makhus8337 9 months ago

    can you do entire project for this?

    • @DataEngUncomplicated
      @DataEngUncomplicated 9 months ago

      Yes, I have done projects using most of these services in the past.

  • @playingneutral
    @playingneutral 1 year ago

    AWS is not a career but just a cloud platform, right? Somewhere we can apply our skills and start working in a cloud-based environment, right or not? Please clear this up for me: if I come from a non-tech background with no data analytics experience and directly pursue the AWS data analytics certification, preparing with the learning material provided by AWS plus hands-on practice, would I get a job easily? Or do I need to specialize in all 200 services, and also Python etc.? Please guide me, I'm not getting an answer to this anywhere.

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago

      Hello, these are good questions that lots of people starting with AWS might have! Yes, AWS is just a cloud platform. I would say you should still have the foundational data analytics skillset in order to be successful. You definitely don't need to specialize in 200 services to get a job. I would focus on learning the services that are relevant for a particular role. Nobody knows every single AWS service; there are just too many. As for your question about whether an AWS certification is enough to get a job, it all depends on the role, the employer and what they are looking for. I would say it can't hurt your chances of getting a job if you are looking for a role that involves AWS.

  • @EthanDeng
    @EthanDeng 1 year ago

    Use AWS Batch to Batch Data Ingestion

  • @hamalishah
    @hamalishah 4 months ago

    interesting

  • @nainaarabha9186
    @nainaarabha9186 11 months ago

    I have one doubt. Can we host multiple kafka producers in one ec2 instance?

    • @DataEngUncomplicated
      @DataEngUncomplicated 11 months ago

      Are you talking about using Amazon Managed Streaming for Apache Kafka?

    • @nainaarabha9186
      @nainaarabha9186 11 months ago

      @@DataEngUncomplicated yes!

  • @anoopkumar2142
    @anoopkumar2142 2 months ago +1

    hopefully zero ETL is going to change a major chunk of dependencies when managing the data within the aws ecosystem.

  • @hunnidkray534
    @hunnidkray534 1 year ago

    Is there a pdf file to print out the diagrams

  • @rememberthename911g
    @rememberthename911g 2 years ago

    How would I incorporate AWS into my project if I am using a website's API as the source of my data?

    • @rememberthename911g
      @rememberthename911g 2 years ago

      It's not much data. A max of a couple hundred lines, but I still want to be able to show an employer I can use different services

    • @DataEngUncomplicated
      @DataEngUncomplicated 2 years ago

      I think you're asking how to ingest data from an API into AWS? There are many ways to do this, but for your purpose you can write a Lambda function that uses the requests library to read data from the API and use the Python library AWS Data Wrangler to write the data to S3.

    • @rememberthename911g
      @rememberthename911g 2 years ago

      @@DataEngUncomplicated That's exactly what I was asking, thanks. Can you make sure this sounds correct though?
      1. AWS lambda to ingest data from API call and write that data to an s3 bucket
      2. Read data from s3 using Python notebook file (that is using PySpark package) or read data from s3 using AWS EMR

    • @DataEngUncomplicated
      @DataEngUncomplicated 2 years ago

      @@rememberthename911g yup this works, you might want to define your data source in a glue catalog table so it will be more easily ingested into a glue job or pyspark job.

  • @surendhirankrishnamoorthy6689
    @surendhirankrishnamoorthy6689 3 months ago +1

    As the channel name says you're making things uncomplicated. 🎉😅

  • @Naveen-hk3yh
    @Naveen-hk3yh 9 months ago

    Great video. I have been working as an AWS data engineer for the past two years; my overall experience is 11 years.
    Could you recommend which certification I should do as a data engineer? I'm confused because so many different types of AWS certification exist.

    • @DataEngUncomplicated
      @DataEngUncomplicated 9 months ago

      Hi Naveen! Yea, it is confusing because there isn't really a specific data engineering certification. The Developer Associate and the AWS Data Analytics Specialty are the best ones. I would also go after the Database Specialty if you think you will be working a lot with databases.

  • @DaveThomson
    @DaveThomson 1 month ago

    Do you do any consulting?

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 month ago

      Hey David, I'm actually a full-time AWS D&A consultant for a company that is an AWS partner. Let me know if you want to chat.

    • @DaveThomson
      @DaveThomson 1 month ago

      @@DataEngUncomplicated I would like to chat. I too work full time for a partner.

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 month ago

      @@DaveThomson Great, feel free to contact me through the email I have posted on my channel.

    • @DaveThomson
      @DaveThomson 1 month ago

      @@DataEngUncomplicated sent you an email.

  • @nj6553
    @nj6553 1 year ago

    Millions or billions...

    • @DataEngUncomplicated
      @DataEngUncomplicated 1 year ago

      I have no context for what this means, but I'm going to respond with: we can process millions or billions of records in data engineering with AWS 😉

  • @user-br6oe3kf9k
    @user-br6oe3kf9k 6 months ago

    Hi sir, I am currently learning SQL and Python and should start learning big data, but I don't know the proper way to start. The way you explained things was so good that I felt like asking. I would be really grateful if you could help me with this. How can I contact you, sir?

    • @DataEngUncomplicated
      @DataEngUncomplicated 5 months ago

      Hello I know there are a lot of concepts and technologies to learn! you can reach me at dataenguncomplicated@gmail.com