Hey Gary, found you through your Medium articles and now I'm watching your YouTube videos. Excellent content!
Excellent hands-on video covering building a data lake in AWS.
I always enjoyed the 'down to earth' practical business cases from @Gary Stafford. This one is really good. Thanks for sharing. I've learned a lot with this tutorial.
Gary, that was just perfect! Fast and straight to the point. 👏👏. Thank you!
Great video. Many complex concepts have been explained using simple language and examples.
Really great presentation, thanks for that.
This was excellent 👏 Top marks my man!
Very Useful 🌟
Thanks a lot! This video deserves more views. It’s the first concise, to-the-point video I’ve found where actual data and actual results are shown end to end.
I have a question I hope you could answer. How would you handle data that changes? E.g. in a couple of days a customer cancels a ticket with id 1234, concert id 321.
Now the calculations need to take this into account, no?
ua-cam.com/video/25StasmCVSw/v-deo.html
Hi, thanks so very much for this. Is it possible to do an incremental load into S3 from RDS with Glue?
What's the use case?
Great video
Hi, thanks for this video. Very interesting!
I'd be curious to know how you would handle incremental updates of these aggregated tables through Athena SQL queries. With that architecture, would you rerun the full calculation over the entire data set at each execution?
Here is the documentation on the current level of integration possible with Amazon Athena (docs.aws.amazon.com/athena/latest/ug/querying-hudi.html):
"Currently, Athena supports snapshot queries and read optimized queries, but not incremental queries. On MoR tables, all data exposed to read optimized queries are compacted. This provides good performance but does not include the latest delta commits. Snapshot queries contain the freshest data but incur some computational overhead, which makes these queries less performant."
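To make the quoted trade-off concrete, here is a minimal toy sketch in plain Python. It is not the Hudi or Athena API: the `ToyMorTable` class, its methods, and the `tickets_sold` helper are hypothetical names made up for illustration, and real MoR tables store base files and delta logs rather than dicts and lists. It reuses the cancelled-ticket example from earlier in the thread (ticket id 1234, concert id 321) to show why a read optimized query can lag behind a snapshot query until compaction runs.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMorTable:
    """Toy stand-in for a Merge-on-Read table (hypothetical, not the Hudi API)."""
    base: dict = field(default_factory=dict)    # compacted records keyed by ticket id
    deltas: list = field(default_factory=list)  # uncompacted (key, record-or-None) commits

    def upsert(self, key, record):
        self.deltas.append((key, record))

    def delete(self, key):
        self.deltas.append((key, None))  # None marks a delete

    def read_optimized(self):
        # Compacted data only: cheap to read, but misses the latest delta commits.
        return dict(self.base)

    def snapshot(self):
        # Merge the deltas over the base at read time: fresh, but more work per query.
        merged = dict(self.base)
        for key, record in self.deltas:
            if record is None:
                merged.pop(key, None)
            else:
                merged[key] = record
        return merged

    def compact(self):
        # Fold the deltas into the base so both query types agree again.
        self.base = self.snapshot()
        self.deltas = []


def tickets_sold(rows, concert_id):
    # Count the tickets currently recorded for one concert.
    return sum(1 for r in rows.values() if r["concert_id"] == concert_id)


# Assumed sample data: two tickets for concert 321, then ticket 1234 is cancelled.
table = ToyMorTable(base={
    1234: {"concert_id": 321},
    1235: {"concert_id": 321},
})
table.delete(1234)

stale = tickets_sold(table.read_optimized(), 321)  # still counts the cancelled ticket
fresh = tickets_sold(table.snapshot(), 321)        # reflects the cancellation
table.compact()
after_compaction = tickets_sold(table.read_optimized(), 321)
```

In this toy model the aggregate only picks up the cancellation through the snapshot view, or through the read optimized view once compaction has run. Either way the whole table is re-read on each query, which is the cost an incremental query would avoid.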
How much did all of this cost you per month?