YouTube Data Analysis | END TO END DATA ENGINEERING PROJECT

  • Published 2 Aug 2024
  • Join My Data Engineering Courses - datavidhya.com/courses
    In this video, you will build an end-to-end data engineering project using the Kaggle YouTube Trending dataset.
    If you are someone who wants to learn Data Engineering by doing hands-on projects then this video is for you!
    👉🏻Part 2 of the video - • YouTube Data Analysis ...
    ✨Visit ProjectPro for more projects - bit.ly/3iIGfM5
    Useful links mentioned in the video -
    1. Create Your AWS Account - aws.amazon.com/premiumsupport...
    2. Download AWS CLI - aws.amazon.com/cli/
    3. Download Data From Here - www.kaggle.com/datasnaek/yout...
    Find Commands and Code Used in the video here - github.com/darshilparmar/data...
    👉🏻 Learn Basics (If you don't know) 👈🏻
    1. Learn Python From Here - bit.ly/3t9pFLx
    2. Learn SQL here - bit.ly/3KNmUW8
    3. Learn Spark and Big Data - bit.ly/3telBcW
    4. Learn AWS Lambda - aws.amazon.com/lambda/
    Join Data With Darshil Discord Server: / discord
    Timestamps
    0:00 Project Introduction (Very Important)
    2:22 What You Will Learn
    4:57 Understand Dataset That We Are Going To Use
    6:16 On-Premise vs Cloud Data Processing
    8:50 Create Your AWS Account
    9:19 Best Practices To Follow On AWS
    9:59 Create IAM Account For Admin
    13:15 Install AWS CLI
    16:05 Create S3 Bucket and Upload Our Data
    22:05 What is Data Lake?
    23:40 Understand Data That Got Uploaded
    24:45 Build Glue Crawler and Catalog
    30:14 Use Athena and SQL to Query Data
    32:20 Solving Error and Preprocessing Data
    34:57 Writing ETL Job In Lambda and Cleaning Data
    42:33 What is AWS Lambda and Layers?
    47:51 Querying Our Cleaned Data on Athena
    👦🏻 My Linkedin - / darshil-parmar
    📷 Instagram - / darshilparmarr
    🎯Twitter - / parmardarshil07
    🌟 Please leave a LIKE ❤️ and SUBSCRIBE for more AMAZING content! 🌟
    👉Data Engineering Complete Roadmap: • Data Engineer Complete...
    👉Data Engineering Project Series: • Data Engineering Proje...
    👉Become Full-Time Freelancer: • Best Freelancer Series...
    👉Data With Darshil Podcast: • Podcast Series - Data ...
    ✨ Tags ✨
    data engineering projects, big data project, data engineering project hands-on, hands-on data engineering projects, learn data engineering, data engineering roadmap, how to become data engineer, data engineering free projects, big data engineering, big data
    ✨ Hashtags ✨
    #dataengineer #project #darshil

COMMENTS • 696

  • @kpyoutuber4671
    @kpyoutuber4671 1 year ago +57

    Thank you, Darshil Parmar!
    Please note that you deployed the Lambda function at 39:00 in the video, but the explanation does not mention it.
    If the function is not deployed, a test only runs the default code, which still succeeds and returns the hello-world response.

    • @DarshilParmar
      @DarshilParmar 1 year ago +15

      Yes, I made a mistake while editing the video; a lot of people faced this error.

    • @haziq7885
      @haziq7885 1 year ago +1

      thanks for the solution! i was stuck here for a long time 😅
      Darshil Parmar thanks so much for the video! hopefully this solution can be pinned for others to refer to :)

    • @fahadbakshi5449
      @fahadbakshi5449 1 year ago +1

      @@DarshilParmar can you please provide a solution for the error at 39:00? I am getting this output:
      "{
      "statusCode": 200,
      "body": "\"Hello from Lambda!\""
      }"

    • @DarshilParmar
      @DarshilParmar 1 year ago +7

      @@fahadbakshi5449 It's not an error. You need to click the Deploy button.

    • @theniyota
      @theniyota 1 year ago

      This post needs to be pinned. I wasted today trying to figure out how to make the Lambda function work until I decided to go through the comments.
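
For context, the response quoted in this thread is what an undeployed function returns. The sketch below reproduces the starter code the console creates for a new Python function (written from memory of the Lambda console, so treat the exact template text as an assumption). If a test returns this body, the edited code has not been deployed yet.

```python
import json


def lambda_handler(event, context):
    # Starter code the Lambda console generates for a new Python function.
    # Until Deploy is clicked, a test invocation still runs this version,
    # no matter what has been typed into the editor.
    return {
        "statusCode": 200,
        "body": json.dumps("Hello from Lambda!"),
    }
```

Note that the body is a JSON-encoded string, which is why the quoted output shows escaped quotation marks around the greeting.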

  • @DarshilParmar
    @DarshilParmar 2 years ago +339

    It takes a lot of effort and energy to execute the entire project and record it! I hope you find this useful and make sure you Like this video :)

    • @aritra1414
      @aritra1414 2 years ago +1

      What according to you will be the best resource to understand lambda in depth? I need help on that. I am working on bigdata project, but this was not my domain, learning new things and I need to learn faster. Any leads will be helpful for me. Thanks in advance. Also, please keep producing such awesome contents. Thanks a lot!!

    • @DarshilParmar
      @DarshilParmar 2 years ago +3

      @@aritra1414 check out the AWS re:Invent videos on YouTube about Lambda, and read the Lambda white paper to understand more

    • @vivekpuurkayastha1580
      @vivekpuurkayastha1580 2 years ago

      Hi Darsheel .. great video as always and yes i did click on like 😀 ... Can you please make a video on how to create a project in Dev environment and then switch to production environment in AWS ..... Basically how to manage the code Lifecycle in AWS from Dev to Production.... or may be you can point to a resource ... Thanks

    • @shashibhushansingh1628
      @shashibhushansingh1628 2 years ago

      By following these steps, will I be able to build this project in Azure?

    • @tanmayshinde7853
      @tanmayshinde7853 2 years ago +1

      I respect your efforts, but to be honest I didn't understand anything. Please make it simpler or break it down into smaller chunks if you can.

  • @snehakadam16
    @snehakadam16 9 months ago +48

    Thank you, Darshil.
    This is for those who are facing issues:
    1) Replace awswrangler with awssdkpandas in the code. The rest of the code remains the same.
    2) Add layer: replace AWSDataWrangler-Python3.8 with AWSSDKPandas-Python3.8, version 10.
    3) Create the db_youtube_cleaned database using Glue or Athena before running the code.
    4) For the "Task timed out" issue, increase the memory along with the timeout, e.g. time = 5 min, memory = 512 MB.
    Hope this helps :)
    Tip: Guys, please go through the comments if you are stuck. You will be able to find a solution for sure.

    • @DarshilParmar
      @DarshilParmar 9 months ago

      Thanks for putting this in one comment

    • @snehakadam16
      @snehakadam16 9 months ago

      Thank you for the amazing tutorial and putting a lot of effort @@DarshilParmar. Looking forward to more projects :)

    • @vamsivenna
      @vamsivenna 9 months ago

      @snehakadam16 @Darshilparmar facing issues like
      "errorMessage": "Unable to import module 'lambda_function': No module named 'awssdkpandas'",
      "errorType": "Runtime.ImportModuleError",
      "stackTrace": []
      help me out

    • @vamsivenna
      @vamsivenna 9 months ago

      It worked. I think the database name should be "db_youtube_cleansed".
      Even awswrangler with AWSSDKPandas-Python3.8 version 10 and 256 MB of memory works fine for me. Thank you.
      But as per the video, the database should get created automatically.

    • @prathmeshsinha4705
      @prathmeshsinha4705 9 months ago

      How did you solve this? @@vamsivenna
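
The "No module named 'awssdkpandas'" error reported in these replies is consistent with how the library is named: the Lambda layer was renamed to AWSSDKPandas, but the Python package is still imported as `awswrangler`, which matches the report above that the unchanged import works with the new layer. Below is a minimal sketch for checking from inside the function whether the layer is attached. The helper name `wrangler_layer_attached` is hypothetical.

```python
import importlib.util


def wrangler_layer_attached() -> bool:
    # The AWSSDKPandas layer still ships the package under its original
    # import name, so this is the module name to look for.
    return importlib.util.find_spec("awswrangler") is not None


def lambda_handler(event, context):
    if not wrangler_layer_attached():
        # Fail with a readable message instead of Runtime.ImportModuleError.
        raise RuntimeError(
            "awswrangler is not importable: attach the AWSSDKPandas layer "
            "and keep 'import awswrangler as wr' in the code"
        )
    import awswrangler as wr  # import name unchanged after the rename

    return {"awswrangler_version": wr.__version__}
```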

  • @shrirajpawar7817
    @shrirajpawar7817 9 months ago +50

    For people watching this tutorial now:
    AWS Data Wrangler has been renamed to AWS SDK for pandas. The name changed, but the core functionality remains the same.

    • @vishnuvardhan9082
      @vishnuvardhan9082 6 months ago +1

      thank you so much man, was going nuts on this! how did you know about this?

    • @TheAINoobxoxo
      @TheAINoobxoxo 4 months ago +1

      thanks man much appreciated

    • @nikitha_sirka
      @nikitha_sirka 4 months ago

      @@vishnuvardhan9082 hii,
      When I changed the layer to AWS SDKPandas and modified the code I found the same error
      Error :
      {
      "errorMessage": "Unable to import module 'lambda_function': No module named 'AWSSDKPandas'",
      "errorType": "Runtime.ImportModuleError",
      "stackTrace": []
      }

    • @user-wi6fk8hu6x
      @user-wi6fk8hu6x 4 months ago +1

      Thank you very much

    • @iamayuv
      @iamayuv 2 months ago

      thanks bhai

  • @vitoriagarcia9876
    @vitoriagarcia9876 1 year ago +3

    These tutorials are so helpful for me! And also, they show how much effort on production you put into them. Thank you so much, Darshil!

  • @fabriciomiriani
    @fabriciomiriani 1 year ago +2

    Amazing job - I'm just starting to use AWS because I would like to become a Cloud Engineer and this just incredible. Thank you a lot for your effort !!

  • @teja_surya
    @teja_surya 2 years ago +1

    This project and you explaining it in a simple and elaborate way was awesome. Keep them coming!

  • @bfkgod
    @bfkgod 2 years ago +24

    Darshil, you have made one of the most valuable DE learning channels on youtube. Keep up the amazing work! Thank you.

    • @DarshilParmar
      @DarshilParmar 2 years ago

      Thanks, will do!

    • @Punithan-rj5ng
      @Punithan-rj5ng 4 months ago

      @@DarshilParmar Function Logs
      START RequestId: 2b020a60-532e-4b33-9933-7cc87b5406cc Version: $LATEST
      An error occurred (EntityNotFoundException) when calling the CreateTable operation: Database db_youtube_cleaned not found.
      Error getting object youtube/raw_statistics_reference_data/CA_category_id.json from bucket de-on-youtube-raw-useast1dev. Make sure they exist and your bucket is in the same region as this function.
      LAMBDA_WARNING: Unhandled exception. The most likely cause is an issue in the function code. However, in rare cases, a Lambda runtime update can cause unexpected function behavior. For functions using managed runtimes, runtime updates can be triggered by a function change, or can be applied automatically. To determine if the runtime has been updated, check the runtime version in the INIT_START log entry. If this error correlates with a change in the runtime version, you may be able to mitigate this error by temporarily rolling back to the previous runtime version. For more information, see docs.aws.amazon.com/lambda/latest/dg/runtimes-update.html
      [ERROR] EntityNotFoundException: An error occurred (EntityNotFoundException) when calling the CreateTable operation: Database db_youtube_cleaned not found.

  • @jerichocruz29
    @jerichocruz29 1 year ago +2

    Darshil, this is an amazingly executed project and it was easy to follow. Thanks for taking the time to put this together. Great channel.

  • @mananyadav6401
    @mananyadav6401 1 year ago +7

    Amazing @darshil ....It is clearly visible how much effort you have put into the PPT, video recording, storyboarding, and covering the small nuances and errors that could potentially be faced.
    I can't express in words how valuable it is and how much information you are providing for the community. Really inspiring and motivating.
    Someone in another comment rightly mentioned it is a pure gem on YouTube.

  • @imsdengineer
    @imsdengineer 1 year ago

    Great Stuff, You Rock boey! The entire video was very much intuitive and I must say that without a shadow of doubt that all the nitty gritties of Data is discussed in this, heading for the second part now. Worth a ⌚

  • @ajitagalawe8028
    @ajitagalawe8028 1 year ago

    Best video I have seen so far for the end to end project in big data. Thanks!

  • @ajtam05
    @ajtam05 1 year ago +5

    Just wanted to say thank you to Darshil Parmar for these projects. It's hard to find anything online that helps to this extent from end-to-end. This is great stuff! Cheers! :)

    • @mcaddit6802
      @mcaddit6802 1 year ago

      I am unable to proceed after clicking Test. I am getting this error: "errorMessage": "'s3_cleansed_layer'",
      "errorType": "KeyError". Can anyone please tell me what the problem is?

    • @vinaydhande9926
      @vinaydhande9926 9 months ago

      Did you solve the error? @@mcaddit6802
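
A `KeyError` whose message is just a name, as in the error above, is what Python raises when a dictionary-style lookup such as `os.environ['s3_cleansed_layer']` finds no such key. One likely cause (an assumption, since the comment shows neither the code nor the configuration) is that the function's environment variables were not set under Configuration. Below is a minimal sketch that reports every missing variable at once. Only `s3_cleansed_layer` comes from the comment; the other names and the helper `load_settings` are hypothetical.

```python
import os

# Only "s3_cleansed_layer" appears in the thread; the rest are placeholders.
REQUIRED_VARS = (
    "s3_cleansed_layer",
    "glue_catalog_db_name",
    "glue_catalog_table_name",
)


def load_settings(env=os.environ):
    # Collect every missing name so one test run reveals the full list.
    missing = [name for name in REQUIRED_VARS if name not in env]
    if missing:
        raise RuntimeError(
            "Missing Lambda environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` at the top of the handler turns a bare `KeyError` into a message that names what still has to be configured.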

  • @janwienke6479
    @janwienke6479 1 year ago +4

    Great Project Documentation to try for yourself. One little thing to add would be a rough aws cost estimate. Definitely a thing I would be looking for if I was starting.

  • @shreeyajoshi9771
    @shreeyajoshi9771 1 year ago

    Thanks a loads for this video Darshil! Very much appreciated! 👏👏👏👏

  • @ShubhamKumar-fs9wi
    @ShubhamKumar-fs9wi 9 months ago

    Thank you Darshil for this amazing video, it was very helpful. I just completed the whole project and did some extra work as well: moving the data to Redshift using a Glue job, which involved creating a connection and enabling a VPC endpoint. :)

  • @castlemonohunter3019
    @castlemonohunter3019 2 years ago

    Thank you so much for doing this.... The only one on youtube with curated data related content ❤💫

    • @DarshilParmar
      @DarshilParmar 2 years ago

      Thank you for your support and kind words

  • @lloydwang8108
    @lloydwang8108 2 years ago +1

    hey Darshil, I rarely comment but just wanted to say a big thank u. This helped me out a lot! Looking forward to more of such content in the future :D

  • @dntking40
    @dntking40 1 year ago

    Excellent work. Thank you for this amazing workshop.

  • @trishasingh8832
    @trishasingh8832 2 years ago

    Great work Darshil! 🔥🔥

  • @rohitsaha08
    @rohitsaha08 11 months ago

    this is a great project with your excellent guidance Darshil. Thank you!😀

  • @penninahgathu7956
    @penninahgathu7956 1 year ago

    Thank you so much for teaching us such valuable content! Be blessed

  • @skateforlife3679
    @skateforlife3679 1 year ago +21

    (edited) Important note on missing libraries:
    - AWSDataWrangler-Python3.8 is no longer available
    - I replaced it with AWSSDKPandas-Python3.8 version 1

    • @anupammathur918
      @anupammathur918 1 year ago +3

      I am getting a "database db_youtube_cleaned not found" error. Can you please check?

    • @soumyaranjandash3597
      @soumyaranjandash3597 1 year ago

      @@anupammathur918 same here

    • @anupammathur918
      @anupammathur918 1 year ago +2

      @@soumyaranjandash3597 Because it has not been created. Go to Athena and create a database with that name.

    • @mackshonayi943
      @mackshonayi943 1 year ago +1

      @@anupammathur918 Thanks this helped. I created the database in Glue and it worked

    • @sahityamamillapalli6735
      @sahityamamillapalli6735 1 year ago

      @@anupammathur918 can you please elaborate? In Athena, data sources are there.
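
The fix described in this thread, creating the database before the function runs, can also be scripted. Below is a minimal sketch that assumes a boto3 Glue client is passed in. `ensure_glue_database` is a hypothetical helper, while `get_database`, `create_database` and `EntityNotFoundException` are the Glue client calls as I understand them. The database name has to match whatever the function's configuration expects.

```python
def ensure_glue_database(glue_client, name):
    """Create the Glue database if it is missing; return True if created."""
    try:
        glue_client.get_database(Name=name)
        return False  # already there, nothing to do
    except glue_client.exceptions.EntityNotFoundException:
        glue_client.create_database(DatabaseInput={"Name": name})
        return True


# Usage (needs boto3 and AWS credentials, so it is left commented out):
# import boto3
# ensure_glue_database(boto3.client("glue"), "db_youtube_cleaned")
```

Taking the client as a parameter keeps the helper easy to try against a stub before pointing it at a real account.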

  • @HemantSharma-fw2gx
    @HemantSharma-fw2gx 2 years ago

    Thank you so much for this darshil!...keep up the good work🙌

  • @ArnavMondal14
    @ArnavMondal14 1 year ago +1

    Great video. Loved it and helped me build my resume. Would love to do what you do today and freelance

  • @revathil8986
    @revathil8986 3 months ago

    Excellent explanation. Each and every step is easy to follow and understandable.

  • @mohammedjouhar6363
    @mohammedjouhar6363 2 years ago

    Thank you, man.. Keep up the good work!

  • @ashutoshprakash9468
    @ashutoshprakash9468 2 years ago +1

    Great Work, Darshil!!! Next level Data Engineering knowledge provided by you in this content. ✌️ Industry level project.

  • @ritikkeshari1463
    @ritikkeshari1463 2 years ago

    Thanks alot brother..needed such kind of lecture ..really helped in enhancing my skills..please make more such videos

  • @adityaanand835
    @adityaanand835 2 years ago

    Really appreciate the hard work you bring to the table!!!

    • @DarshilParmar
      @DarshilParmar 2 years ago

      Thank you for making my hard work pay off by watching the video

  • @gomes8335
    @gomes8335 2 years ago +1

    Just love your content man. Keep it up !!!

  • @idhwanibhatt
    @idhwanibhatt 2 years ago +7

    Thank you so much Darshil for this video. We need more such project based learning in data engineering instead of just cliche theory. 😆

    • @DarshilParmar
      @DarshilParmar 2 years ago +2

      Yes, more videos like this are coming

    • @mcaddit6802
      @mcaddit6802 1 year ago

      I am unable to proceed after clicking Test. I am getting this error: "errorMessage": "'s3_cleansed_layer'",
      "errorType": "KeyError". Can anyone please tell me what the problem is?

  • @youraverageguide
    @youraverageguide 2 years ago

    Thanks for all your efforts! This is perfect.

  • @arpansatpathi9645
    @arpansatpathi9645 2 years ago +11

    Thank you Darshil for this wonderful video. One thing I would like to point out that might help others following this tutorial is whenever you update your lambda function, click on deploy first to actually test your changes. In my case I wasn't getting any errors and later realized that the default hello world code was still running.

    • @DarshilParmar
      @DarshilParmar 2 years ago +4

      Yes, I might have made a mistake while editing the video. I did click Deploy, but a lot of people missed it.
      I will keep this in mind.
      Thank you for the feedback

    • @arpansatpathi9645
      @arpansatpathi9645 2 years ago +2

      @@DarshilParmar while we're at it, could you please give a solution for the EntityNotFoundException that somebody else also pointed out. I'm also getting the same error and haven't been able to resolve it. Tried creating the cleansed database in glue manually but still it is not working. Hope to get a reply.
      Thanks in advance :)

    • @angelnadar1209
      @angelnadar1209 1 year ago

      Thanks Ashutosh, I faced the same thing. Thanks for posting this comment, it's helpful.

    • @angelnadar1209
      @angelnadar1209 1 year ago

      @@arpansatpathi9645 was it resolved? If yes, could you post the solution?

    • @thesevenacoustics
      @thesevenacoustics 1 year ago

      @@arpansatpathi9645 rename 'db_youtube_cleaned' to 'de_youtube_cleaned' in the environment variable

  • @ataurrehman3664
    @ataurrehman3664 2 years ago +1

    Thank you so much for making this video!!! This would be 6-7th video of yours which I've added to my playlist. I request you to post more such project videos in different domains.

    • @DarshilParmar
      @DarshilParmar 2 years ago +2

      Thank you, yes I will try to post such videos

  • @zendr0
    @zendr0 2 years ago +1

    Absolute gem! Thank you for making this video. Learned a lot today.
    And if possible, Although I know you have your job, please try to make more of such content in future.
    Lots of love💛💛

    • @DarshilParmar
      @DarshilParmar 2 years ago +1

      I will try my best to provide as much as I can

    • @mcaddit6802
      @mcaddit6802 1 year ago

      I am unable to proceed after clicking Test. I am getting this error: "errorMessage": "'s3_cleansed_layer'",
      "errorType": "KeyError". Can anyone please tell me what the problem is?

    • @ishan358
      @ishan358 1 year ago

      How do you solve runtime error

    • @drishtihingar2160
      @drishtihingar2160 8 months ago

      @@mcaddit6802 yeah, I am also getting the same error. How did you solve it? Can you help me out?

  • @prikshitbatta
    @prikshitbatta 1 year ago +3

    Hi, Darshil thanks for this project. Faced a lot of errors but took two days to complete the project. In the end, it is satisfying.😀

    • @vanadin8009
      @vanadin8009 10 months ago

      can you give the estimated cost of aws services used in this project it will be of so much help and thank you

    • @vasudevreddy3527
      @vasudevreddy3527 10 months ago

      @@vanadin8009 we can do it basically for free with a free-tier AWS account

  • @intrepidm8753
    @intrepidm8753 2 years ago

    its a great one, very useful n resourceful for aspirants like me👍🏼

  • @ahmedmohiuddin1866
    @ahmedmohiuddin1866 2 years ago

    WOWW. This is amazingggg. Thanks Darshil. I have just started watching the video and looking at the content got me excited.

    • @DarshilParmar
      @DarshilParmar 2 years ago +1

      This is the type of comment I wait for, thanks for supporting my work!

    • @ahmedmohiuddin1866
      @ahmedmohiuddin1866 2 years ago

      @@DarshilParmar you’re welcome

  • @takisally
    @takisally 1 year ago

    Thank you for this particular video

  • @iamdare
    @iamdare 1 year ago

    Hi Darshill, good video and thanks very much. I learned a lot. Please in your subsequent videos, do try to zoom in more often so we can get to see what you’re doing on the screen. Thanks.

  • @SankarJankoti
    @SankarJankoti 2 years ago +2

    Your content on data is pure! No match.

  • @trytrybutdontcry1-zf9yu
    @trytrybutdontcry1-zf9yu 3 months ago +1

    Great Video, Thank you so much!!

  • @mackshonayi943
    @mackshonayi943 1 year ago

    Thank you Darshil, I completed this part successfully. Your content is invaluable may God bless you

    • @fahadbakshi5449
      @fahadbakshi5449 1 year ago

      bro i got stuck at 30:00 minute can u plz help me

    • @ishan358
      @ishan358 1 year ago

      How did you solve runtime error

  • @ijaj.datanerd
    @ijaj.datanerd 2 years ago

    Thanks, man.....appreciate your hard work and effort

  • @Bijuthtt
    @Bijuthtt 1 year ago

    Awesome tutorial project. I could complete this session.

  • @shyam96105
    @shyam96105 2 years ago

    Hi Darshil, please make video on how do you deliver the project to your clients after completing it.

  • @youraverageguide
    @youraverageguide 2 years ago +1

    Part1 done. It was really informative. Waiting for part 2!

    • @DarshilParmar
      @DarshilParmar 2 years ago

      Check link in the description for that

    • @harshalshende69
      @harshalshende69 1 year ago

      Bro, can you please help me with this? I have been stuck in part 1 at the data catalog step for a week. If you can tell me my mistakes there, I can move forward. 🙏 It would be a big help for me.

    • @ishan358
      @ishan358 1 year ago

      @@harshalshende69 did you find a solution?

  • @paytmoffers7794
    @paytmoffers7794 1 year ago +2

    I hope you have more such projects for us in your pipeline 😍 Please do it

  • @raghuboyapati7311
    @raghuboyapati7311 2 years ago

    Great video man.
    AWS has increased Lambda's ephemeral storage limit to 10 GB

  • @dimlight1172
    @dimlight1172 2 years ago +3

    This is the first video I'm watching from your channel sir. Even though I didn't understand much, I watched till the end. Your explanation of the stuff just kept me going. I'm a 2nd year engineering student. I have done web development earlier and have basic knowledge of SQL. This data engineering field seems to be very interesting field to dive in. Can you please guide me through sir? So that I can learn the topics and build some good projects using necessary and relevant technology.

    • @DarshilParmar
      @DarshilParmar 2 years ago

      I have complete roadmap on my channel for Data Engineering and study plan, you can check that and start working

  • @rocky6517
    @rocky6517 2 years ago

    Thanks Darshil for Creating this Video

  • @pranjalipandey4233
    @pranjalipandey4233 2 years ago

    Thanks Darshil for such a valuable learning...

  • @sudhanshuupadhyay458
    @sudhanshuupadhyay458 1 year ago

    Great and helpful tutorial.

  • @huzaifa_2590
    @huzaifa_2590 11 months ago

    It was an amazing project. I learned a lot from this video. Thank you so much, I appreciate the work.

  • @yichiz9389
    @yichiz9389 6 months ago

    Thank you so much!

  • @ririraman7
    @ririraman7 2 years ago

    Thank you very much brother.

  • @nadianizam6101
    @nadianizam6101 1 year ago

    Excellent.plz upload more project

  • @mrutyunjay7877
    @mrutyunjay7877 2 years ago

    Thank you so much Darshil !

  • @AnandSharma-kz2bs
    @AnandSharma-kz2bs 2 years ago

    That is pro level content ❤️💥

    • @DarshilParmar
      @DarshilParmar 2 years ago +1

      Pro level viewer who watches entire 1 hour thing in the world of reels/shorts

  • @ueeabhishekkrsahu
    @ueeabhishekkrsahu 1 year ago +1

    After consistently working for 2 days, finally done with the project.

    • @sembrueldorinvil4167
      @sembrueldorinvil4167 1 year ago

      How did you solve the runtime problem?

    • @ishan358
      @ishan358 1 year ago

      @@sembrueldorinvil4167 same question bro

    • @ishan358
      @ishan358 1 year ago

      How did you solve the Lambda runtime error?

    • @sembrueldorinvil4167
      @sembrueldorinvil4167 1 year ago +1

      Found the answer. Assign memory to your function. I think 500 MB should work.

    • @ishan358
      @ishan358 1 year ago

      @@sembrueldorinvil4167 even then I get the same error. It is so frustrating.
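
The two settings mentioned in this thread and in the earlier fix list, memory and timeout, live in the function's configuration. Below is a sketch of changing them with boto3 instead of the console, assuming a Lambda client is passed in. `raise_lambda_limits` is a hypothetical helper, the function name in the usage line is a placeholder, and the default values are the ones commenters reported rather than tuned recommendations.

```python
def raise_lambda_limits(lambda_client, function_name,
                        timeout_seconds=300, memory_mb=512):
    # Timeout is given in seconds and MemorySize in MB in the Lambda API.
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=timeout_seconds,
        MemorySize=memory_mb,
    )


# Usage (needs boto3 and AWS credentials, so it is left commented out):
# import boto3
# raise_lambda_limits(boto3.client("lambda"), "my-etl-function")
```

If the error persists after raising both values, the cause is probably elsewhere, for example an undeployed function or a missing database, both of which are discussed in other threads here.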

  • @nguyentiensu4088
    @nguyentiensu4088 1 year ago

    Thank you very much!

  • @shubhamdeshmukh6339
    @shubhamdeshmukh6339 2 years ago

    Hi Darshil! You r doing great job
    Please make more video on end to end project

  • @yassaryelurkar3631
    @yassaryelurkar3631 1 year ago

    Hey great video. I wanted to ask whether I will be charged for using AWS Athena coz it mentioned additional charges for using athena query when I opened it. Thanks for the video.

  • @abhisheknakate9347
    @abhisheknakate9347 2 years ago

    thank you darshil very informative content........please upload 2nd part of video

    • @DarshilParmar
      @DarshilParmar 2 years ago

      It is uploaded, check link in the description

  • @pankajchandel1000
    @pankajchandel1000 11 months ago

    among this and covid project ..which one should i try building first as a beginner ?

  • @ankitchilkalwar8410
    @ankitchilkalwar8410 2 years ago

    Please create a similar end-to-end project using GCP services; there is no such content available on YouTube.
    It would be very helpful.
    Thanks👍

  • @siddhideshmukh6424
    @siddhideshmukh6424 1 year ago

    Hey Darshil, I rarely comment but just wanted to tell you that you are awesome and your content is just amazing ❤️

  • @agnimitram340
    @agnimitram340 1 year ago

    Thanks Darshil big help.

    • @meetpatel1873
      @meetpatel1873 1 year ago

      Did you get charged while using AWS services under free tier?🤔

  • @aman6646
    @aman6646 2 years ago

    Great Video 🔥🔥💯 Thanks

  • @satishmajji481
    @satishmajji481 2 years ago +1

    @Darshil Parmar - In part-2 of this video, "region=us/" folder is not created for me; only ca and gb folders are created upon running the ETL job. PS: I added "predicate_pushdown = "region in ('ca','gb','us')" as well but folder is missing for "us" region. Can you please take a look at this?

  • @rishav144
    @rishav144 2 years ago

    best project 🔥

  • @surrealsoupuniverse
    @surrealsoupuniverse 1 year ago

    Hello what should i learn before doing this project? What are the prerequisites? Thanks

  • @sunny60035
    @sunny60035 2 years ago

    Hi Darshil, I followed your end-to-end data engineering project on COVID data analysis. It helped me learn about different AWS services and what exactly a data engineer does. Can you please make an end-to-end data engineering project using MS Azure and Databricks? Thanks again 👍

    • @DarshilParmar
      @DarshilParmar 2 years ago

      Maybe in future I will do it, if you want then check out ProjectPro, not saying because I am sponsored or anything but they really have good projects.

    • @charudattbelsare2903
      @charudattbelsare2903 2 years ago

      @@DarshilParmar Do they provide Azure or AWS platform access?

  • @shubhamgargade4045
    @shubhamgargade4045 2 years ago

    Hi Darshil, I'm confused about choosing a career between backend developer and data engineer; both seem interesting to me. I like coding more than SQL queries. Which will be better?

    • @MDARUN-ph1dw
      @MDARUN-ph1dw 2 years ago +2

      backend then because u like coding heavy

    • @DarshilParmar
      @DarshilParmar 2 years ago +2

      As Arun said go with Backend because Data Engineering is more SQL heavy

  • @danala5963
    @danala5963 2 years ago +1

    This was a great help..one question though..when you executed this project using different AWS services S3, Athena, Glue etc.. what was the approx. cost you got after full project execution...Thanks

    • @DarshilParmar
      @DarshilParmar 2 years ago +1

      Most likely there won't be any charge if you are under the free trial, but even if they charge you it will be $3-5 at most.
      You can raise a support ticket stating that you were just trying to learn about the service, and they won't charge you.

  • @Watson22j
    @Watson22j 9 months ago +1

    Hey Darshil, if you were a beginner, how would you write such long code in Lambda? I am a beginner and wondering how I will be able to write such code.

  • @dipankrdey1850
    @dipankrdey1850 1 year ago

    Thank you for your effort. How much amount AWS billed to build this project ? Or all services are under free tier ?

  • @ranjansrivastava9256
    @ranjansrivastava9256 6 months ago

    Dear Darshil, could you please let us know which architecture you used in the demo: Lambda architecture or Kappa architecture? I want to understand more from an architecture perspective. Please share your thoughts.

  • @yeggadisaisiddharth7877
    @yeggadisaisiddharth7877 4 months ago

    Superrr bro

  • @mandardeshpande2246
    @mandardeshpande2246 9 months ago +1

    Hi Darshil, while running the Athena query I get this error: HIVE_CURSOR_ERROR: Row is not a valid JSON Object - JSONException: A JSONObject text must end with '}' at 2 [character 3 line 1]
    This query ran against the "de_database_raw" database, unless qualified by the query.
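
One likely cause of this error (an assumption, since the comment does not show the file) is that the crawled JSON file is a single pretty-printed document spanning many lines, while Athena's JSON reader expects one complete object per line. The sketch below uses only the standard library to pull the nested records out and write them one per line. The `items` key and the sample fields are assumptions about the file's layout, and the video itself takes a different route by cleaning the data in Lambda.

```python
import json


def flatten(record, parent_key=""):
    # Turn nested dicts into a single level with underscore-joined keys,
    # e.g. {"snippet": {"title": "x"}} -> {"snippet_title": "x"}.
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}_{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat


def to_json_lines(document, list_key="items"):
    # One compact JSON object per line, which line-oriented readers accept.
    return "\n".join(json.dumps(flatten(item)) for item in document[list_key])


# Made-up sample that imitates a nested reference file.
sample = {
    "kind": "example",
    "items": [
        {"id": "1", "snippet": {"title": "Film & Animation"}},
        {"id": "2", "snippet": {"title": "Autos & Vehicles"}},
    ],
}
print(to_json_lines(sample))
```

Each printed line is a self-contained object, so a reader that splits the file on newlines never sees a record cut off before its closing brace.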

  • @rohitagarwal5319
    @rohitagarwal5319 11 months ago

    @DarshilParmar I tried using the flatten transform in the ETL job, but it didn't work.
    Is it because the JSON contains an array?
    Can you suggest in a few words how to proceed with the ETL so that I can work on it?

  • @wajidturi
    @wajidturi 2 years ago

    wow amazing man

  • @paytmoffers7794
    @paytmoffers7794 1 year ago

    You are amazing 💯

  • @mairios521
    @mairios521 9 months ago +1

    Hi Darshil! First of all, I would like to say "Thank you" for this tutorial.
    I need to mention something: I was following each step, but AWS is different now, and some options are no longer available or look very different.
    I can't believe that the AWS platform changed so much in just one year.
    My question is: will you update this tutorial in the future?

    • @DarshilParmar
      @DarshilParmar 9 months ago +1

      Everything is the same; you just have to find the right options in the new UI

    • @mairios521
      @mairios521 9 months ago

      @@DarshilParmar Thanks for your quick response!!

  • @ishikapatel3318
    @ishikapatel3318 6 months ago

    hello thank you for this video. I am having a problem while configuring, every time i configure I get exited out.

  • @rohanchoudhary672
    @rohanchoudhary672 1 year ago

    Took me 5 days to complete this video with hands on.
    But these were all worth it.
    - Complete noob me

    • @ishan358
      @ishan358 1 year ago

      Bro help me to solve run time error @rohanchoudhary672

  • @AdityaRaj-dp1no
    @AdityaRaj-dp1no 1 year ago

    Hey Darshil!
    I have followed the part 2 video till where we have to create an ETL job but I am finding it a bit difficult to create the job as the GUI has changed and the AWS glue studio is updated. Can you please tell me the steps to create the job in similar manner in glue studio GUI?
    PS: I tried few things but I had to delete the jobs as they were not doing the same function as your job in the video.

    • @TheAINoobxoxo
      @TheAINoobxoxo 4 months ago

      You can revert the AWS GUI to the old one

  • @vivekpuurkayastha1580
    @vivekpuurkayastha1580 2 years ago

    Hi Darshil, great video as always, and yes, I did click on like 😀. Can you please make a video on how to create a project in a dev environment and then switch to a production environment in AWS? Basically, how to manage the code lifecycle in AWS from dev to production. Or maybe you can point to a resource. Thanks

    • @DarshilParmar
      @DarshilParmar 2 years ago

      Hey Vivek,
      Thank you. Moving from dev to prod works the same way, but you need to follow best practices: for example, instead of creating the Lambda from the UI, you write code that creates it, push that code to GitHub, and have it run automatically from there.
      There is no definitive guide; you have to figure it out based on your requirements, but here is a blog on that:
      blog.gruntwork.io/how-to-build-an-end-to-end-production-grade-architecture-on-aws-part-1-eae8eeb41fec

  • @ashishkumarg5
    @ashishkumarg5 3 months ago

    What is the equivalent GCP service for AWS Glue Crawler?

  • @percyjackson1662
    @percyjackson1662 1 year ago +6

    For those trying it now:
    1. AWS Wrangler has been renamed to AWS SDK for pandas (the Lambda layer is now listed as AWSSDKPandas). The code stays the same, and the package is still imported as awswrangler.
    2. You need to create the Glue database beforehand; otherwise it throws an error.
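Point 2 above can be sketched in code. This is a minimal sketch, not the code from the video: the helper name `ensure_glue_database` is hypothetical, and it assumes a boto3 Glue client is passed in. It uses the Glue `get_database` and `create_database` calls to create the database only when it is missing.

```python
def ensure_glue_database(glue_client, name):
    """Create the Glue database `name` if it does not exist yet.

    `glue_client` is assumed to be a boto3 Glue client, i.e. boto3.client("glue").
    Returns True if the database was created, False if it already existed.
    """
    try:
        # get_database raises EntityNotFoundException when the database is missing.
        glue_client.get_database(Name=name)
        return False
    except glue_client.exceptions.EntityNotFoundException:
        glue_client.create_database(DatabaseInput={"Name": name})
        return True
```

With AWS credentials configured, it could be called as `ensure_glue_database(boto3.client("glue"), "my_cleaned_db")`, where `my_cleaned_db` is a placeholder for whatever database name the Lambda function is configured to write to.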

    • @satyabratadey3898
      @satyabratadey3898 1 year ago

      Hi Percy, while trying to add AWS layers, I only get 3 options: AppConfig Extension, Lambda Insights Extension, and Parameters and Secrets Lambda Extension. Not sure what I am missing. Please help.

    • @satyabratadey3898
      @satyabratadey3898 1 year ago

      @Darshil

  • @____prajwal____
    @____prajwal____ 1 year ago +6

    FYI: AWS Wrangler has been renamed to AWS SDK for pandas.

    • @reypaulobae4895
      @reypaulobae4895 1 year ago +1

      Lifesaver! Thanks

    • @mayurkumar23
      @mayurkumar23 5 months ago

      prajwal, I am getting this error:
      {
      "errorMessage": "Glue table does not exist in the catalog. Please pass the `path` argument to create it.",
      "errorType": "InvalidArgumentValue",
      "stackTrace": [
      " File \"/var/task/lambda_function.py\", line 40, in lambda_handler
      raise e
      ",
      " File \"/var/task/lambda_function.py\", line 27, in lambda_handler
      wr_response = wr.s3.to_parquet(
      ",
      " File \"/opt/python/awswrangler/_config.py\", line 735, in wrapper
      return function(**args)
      ",
      " File \"/opt/python/awswrangler/_utils.py\", line 178, in inner
      return func(*args, **kwargs)
      ",
      " File \"/opt/python/awswrangler/s3/_write_parquet.py\", line 719, in to_parquet
      return strategy.write(
      ",
      " File \"/opt/python/awswrangler/s3/_write.py\", line 313, in write
      raise exceptions.InvalidArgumentValue(
      "
      ]
      }
      Please help.
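The error message above says that `wr.s3.to_parquet` was asked to write to a Glue table that is not in the catalog yet and was given no `path` to create it from. A minimal sketch of a call that supplies it, assuming the settings come from Lambda environment variables: the variable names and the helper names `parquet_write_args` and `write_cleaned` are assumptions for illustration, not confirmed from the video.

```python
import os


def parquet_write_args(df, env=os.environ):
    """Collect keyword arguments for wr.s3.to_parquet from environment variables.

    The variable names used here are assumptions; use whatever names are
    configured on your own Lambda function.
    """
    path = env.get("s3_cleansed_layer", "")
    if not path.startswith("s3://"):
        # An empty path plus a missing Glue table is what triggers the error above.
        raise ValueError("s3_cleansed_layer must be set to an s3:// URI")
    return {
        "df": df,
        "path": path,
        "dataset": True,  # needed when database/table are passed
        "database": env["glue_catalog_db_name"],
        "table": env["glue_catalog_table_name"],
        "mode": env.get("write_data_operation", "append"),
    }


def write_cleaned(df):
    """Write the DataFrame as Parquet and register the table in the Glue catalog."""
    import awswrangler as wr  # provided by the AWS SDK for pandas Lambda layer

    return wr.s3.to_parquet(**parquet_write_args(df))
```

Checking the path before the call turns a missing environment variable into an explicit message instead of the catalog error.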

  • @ericalbertobernal101
    @ericalbertobernal101 1 year ago

    Good job!

  • @shouryanagpal5813
    @shouryanagpal5813 6 months ago

    Hi Darshil, I always think of starting your project videos, but I always get stuck on whether the AWS cloud services will be charged or are free, or whether there are any alternatives.

  • @surajpatra8705
    @surajpatra8705 2 years ago

    Thank you bhaiya for this video

    • @DarshilParmar
      @DarshilParmar 2 years ago

      You are welcome

    • @meetpatel1873
      @meetpatel1873 1 year ago

      Did you get charged while using AWS services under the free tier? 🤔

  • @imshivamchoudhary
    @imshivamchoudhary 2 years ago +1

    Thanks for such a great video. I would really buy a course of yours if you have one.

    • @divyaj7062
      @divyaj7062 2 years ago

      True, Darshil. Consider creating a Udemy course too, so that it reaches a larger audience and is affordable as well.

    • @DarshilParmar
      @DarshilParmar 2 years ago +2

      I am planning to do this in the future; I will keep everyone updated.

  • @rizbasamalah5326
    @rizbasamalah5326 1 year ago +3

    Thank you so much, Darshil, for the video! I am having an issue when trying to create a crawler. I get this error: "The following crawler failed to create: 'name of the crawler'.
    Here is the most recent error message: Account 'Number of account' is denied access." I checked the IAM roles I created, then deleted and recreated them, but I am still receiving the same message. Would you have an idea what the issue could be?

    • @shubhambhadra8196
      @shubhambhadra8196 1 month ago

      Raise a ticket with AWS Support; they will help you get access.

  • @snehakadam16
    @snehakadam16 9 months ago +1

    @DarshilParmar Hi again, can you please share the PySpark code for the ETL job in the latest Glue catalog? It would be a great help. (part 2)

  • @ram.FxTrading
    @ram.FxTrading 1 year ago

    thanks

  • @hassannasr7736
    @hassannasr7736 3 months ago

    If you are getting a runtime error when running the Lambda function even after 3 minutes, make sure to add
    import pandas as pd
    This will solve the issue, as AWS Wrangler changed to AWS SDK for pandas.