Big Data Processing Using Distributed Maps and AWS Step Functions (S3 + Lambda)

  • Published Jan 16, 2025

COMMENTS • 17

  • @Langstonrocks · 1 year ago +1

    Thanks pal, I've been loving your videos for years and this one helped me to quickly solve a current task at my job!

  • @andyweeks2216 · 1 year ago

    Can't wait for your Step Functions course, Daniel. Thanks a bunch for this video.

  • @haiderh1339 · 1 year ago

    I was just researching this topic for a project and saw you uploaded a new video about it 4 hours ago :D

  • @87messibarca · 3 months ago

    From 20:15 onwards you mix up inline vs distributed with standard vs express. Great vid tho!

  • @mariumbegum7325 · 1 year ago

    Great content!

  • @dianad150 · 3 months ago +1

    It's so complicated to use Step Functions for relatively simple tasks that could be done in other ways.

  • @deepak.rocks. · 1 year ago

    Great 👍

  • @WiredMartian · 9 months ago

    Is there a way to preserve order of execution here? Suppose I need to aggregate results from the CSV and I need to maintain the original order of items from the input CSV.
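
    Distributed Map doesn't guarantee the order in which child executions complete, so a common workaround is to carry an explicit row index on each item (for example an index column in the input CSV) and sort on it when aggregating. Below is a minimal sketch of that idea, assuming a placeholder convention in which each worker writes its outputs to S3 as a JSON array and echoes a "row_index" field on every record; the bucket and prefix names are assumptions, not the video's setup.

    ```python
    # Hedged sketch: restore original CSV order while aggregating Distributed Map results.
    # Assumes (placeholder convention) that each worker wrote a JSON array of records to
    # S3 under RESULTS_PREFIX, and that every record echoes back its "row_index".
    import json

    import boto3

    s3 = boto3.client("s3")

    RESULTS_BUCKET = "my-results-bucket"   # assumption: bucket the workers wrote to
    RESULTS_PREFIX = "map-run-output/"     # assumption: key prefix for worker output files


    def aggregate_in_order(event, context):
        """Collect the per-item results from S3 and sort them back into CSV order."""
        records = []
        paginator = s3.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=RESULTS_BUCKET, Prefix=RESULTS_PREFIX):
            for obj in page.get("Contents", []):
                body = s3.get_object(Bucket=RESULTS_BUCKET, Key=obj["Key"])["Body"].read()
                records.extend(json.loads(body))  # each file holds a JSON array of records

        # Sorting by the echoed index restores the original CSV ordering.
        records.sort(key=lambda record: record["row_index"])
        return records
    ```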

  • @tamaskiss3237 · 6 months ago

    Is there a way to overcome the overhead that the map run adds to the overall state machine execution? The execution time seems to be around 8s in your video, but the individual Lambda executions seem to finish in around 200-300 ms. Is there any recommendation for an alternative solution if latency is critical (around 5s)?
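
    Not from the video, but if the map run's startup overhead is the bottleneck, one latency-oriented alternative is to skip the state machine and fan out from a single coordinator Lambda with concurrent synchronous invocations; the trade-off is losing Step Functions' built-in retries, batching, and run history. A rough sketch, where "csv-row-worker" and the payload shape are placeholder assumptions:

    ```python
    # Hedged sketch of a latency-focused alternative: fan out directly from one
    # coordinator Lambda instead of a Distributed Map run. Names are placeholders.
    import json
    from concurrent.futures import ThreadPoolExecutor

    import boto3

    lambda_client = boto3.client("lambda")

    WORKER_FUNCTION = "csv-row-worker"  # assumption: name of the per-batch worker Lambda


    def invoke_worker(batch):
        """Synchronously invoke the worker Lambda for one batch of CSV rows."""
        response = lambda_client.invoke(
            FunctionName=WORKER_FUNCTION,
            InvocationType="RequestResponse",
            Payload=json.dumps({"Items": batch}),
        )
        return json.loads(response["Payload"].read())


    def handler(event, context):
        """Split the incoming rows into batches and process them in parallel threads."""
        rows = event["rows"]
        batch_size = 250
        batches = [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]

        with ThreadPoolExecutor(max_workers=16) as pool:
            results = list(pool.map(invoke_worker, batches))
        return results
    ```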

  • @vinodreddy1722 · 1 year ago

    How do you store the data back in a CSV after modification?
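
    One way to do it (a minimal sketch with placeholder bucket and key names, not the video's code) is to have a step after the map serialize the modified rows back into a CSV and upload it to S3:

    ```python
    # Hedged sketch: persist the modified rows as a new CSV object in S3.
    # The bucket, key, and the assumption that rows are dicts are placeholders.
    import csv
    import io

    import boto3

    s3 = boto3.client("s3")


    def write_modified_csv(rows, bucket="my-output-bucket", key="processed/output.csv"):
        """Serialize a list of dicts (the modified rows) to CSV and upload it to S3."""
        if not rows:
            return None  # nothing to write

        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)

        s3.put_object(Bucket=bucket, Key=key, Body=buffer.getvalue().encode("utf-8"))
        return f"s3://{bucket}/{key}"
    ```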

  • @tello9504 · 1 year ago

    How could we show this on GitHub? I know the first step would be to create a design doc about the architecture, but I'd like to know if you have any examples. I want to put together a portfolio to showcase my work, and I'd like to explain it effectively on my GitHub.

  • @MrAbdel776 · 1 year ago

    A great channel, thank you! I have tried this code on a large CSV file with half a million records. Unfortunately, it takes forever. I am not sure what is wrong. I hope someone can provide some help.

    • @BeABetterDev · 1 year ago +1

      Hi there! Are you sure you enabled the "Distributed Map" mode and aren't using inline?

    • @MrAbdel776 · 1 year ago

      @BeABetterDev Thank you for your response. Yes, I used the "Distributed" mode, as you indicated in the video. The code runs reasonably fast with a small number of records. However, with a large file, it takes more than an hour.

    • @MrAbdel776 · 1 year ago

      I tried batching and it makes a big difference. With batches of 250 items, I'm able to read 50,000 records in 17 seconds. I will try the 500,000 soon. Thank you!
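
      For reference, batching is configured on the Map state with an ItemBatcher (e.g. "MaxItemsPerBatch": 250), and each child execution then receives the whole batch in an "Items" array. A minimal sketch of a batched worker, with a placeholder column name and transformation:

      ```python
      # Minimal sketch of a per-batch worker. With ItemBatcher enabled, the Distributed
      # Map passes each child execution an "Items" array holding the batch of CSV rows.
      # The "name" column and the uppercase transformation are placeholders.

      def handler(event, context):
          """Process one batch of CSV rows delivered by the Distributed Map."""
          results = []
          for row in event["Items"]:
              row["name"] = row.get("name", "").upper()  # placeholder per-row change
              results.append(row)
          return results
      ```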

  • @artbart9080 · 7 months ago

    Hi. I tried to reproduce this and got stuck on the error States.ExceedToleratedFailureThreshold ("The specified tolerated failure threshold was exceeded"). The CSV file was the issue: I had initially saved the Excel test data as "CSV UTF-8"; after the error I re-saved it as plain "CSV" and the execution succeeded.
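
    The likely culprit is the byte-order mark (BOM) that Excel's "CSV UTF-8" option prepends to the file, which can end up attached to the first column header when the rows are parsed. If you want to keep UTF-8, a small pre-processing step can strip it; the file names below are placeholders:

    ```python
    # Hedged sketch: rewrite a CSV exported as "CSV UTF-8" without the leading BOM.
    # Decoding with "utf-8-sig" drops the BOM if present and is harmless otherwise.

    def strip_bom(input_path="data-utf8.csv", output_path="data.csv"):
        """Re-encode a CSV file as plain UTF-8 with no byte-order mark."""
        with open(input_path, "r", encoding="utf-8-sig") as src:
            content = src.read()
        with open(output_path, "w", encoding="utf-8", newline="") as dst:
            dst.write(content)
    ```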

  • @rishiraj2548 · 1 year ago

    👍