Build your Data-Lake with AWS S3 and Athena using the Glue crawler | correct S3 Folder Structure

COMMENTS • 7

  • @aacasd
    @aacasd 1 year ago +1

    good demo man

  • @rankena
    @rankena 2 years ago +1

    errr, what if you want to filter by FolderA? :D

  • @josemanuelgutierrez4095
    @josemanuelgutierrez4095 1 year ago

    I have a question, my friend. In the last part, when you show all your data in Athena, what are the benefits of using it for a company, for example? Can you tell me, please? Many companies are using the same service, but to be honest I don't know exactly what the right use of those services is. Thank you.

  • @rushikeshkadam5282
    @rushikeshkadam5282 2 years ago

    Brother please help
    So I have created a custom endpoint URL for Amazon Elasticsearch (OpenSearch). The certificate is issued by AWS itself, and I have configured Route 53 with a CNAME. But it still can't load my custom URL, while the default URL provided by AWS is working.
    I don't know what's happening. I think Elasticsearch is not accepting my SSL certificate.
    Is there any solution for how I can connect to my Elasticsearch and Kibana via the custom URL?

  • @navinsai5726
    @navinsai5726 2 years ago

    Good video brother
    Great video, Soumil... a couple of questions: I could not relate your Case A and Case B on folder structure. What is the difference between Folder A vs Projectfiles, or Folder B vs ProjectFiles 1? Aren't they the same, with just a different name, so that Folder A refers to Projectfiles and Folder B refers to ProjectFiles 1? Can you give a practical example of the Case A and Case B folder structures?
    Where do you get the values of yyyy/mm/dd for your folder structure? Are those the load year, month, and day values, or the date when the event or sale occurred?
    There are a lot of things that can be done in the AWS console, but none of the videos teaches complete code that is deployable from one environment to another; that is the crux of data engineering principles with agile development...
    Also, is it practical to ask your data analysts to keep querying with year/dd/mm all the time? My users just want to do "select * from table"... that's all they know :-0
    Assume Tableau/QuickSight connects to Athena and doesn't generate the right partitioning values: how do these queries behave? Do you get a FAT BILL at the end of the month?

  • @adityasunny99
    @adityasunny99 2 years ago

    I think the bucket structure will be determined by the use case; in your case you want all of this by year. Suppose I want it by project files and then by year; then your 2nd architecture would be right. Please correct me if I am wrong.
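
A rough sketch of the point raised in the comments above, that the folder layout follows the queries you expect to run. It compares a date-first and a project-first S3 key layout using Hive-style `key=value` folder names, which Glue crawlers can pick up as partition columns. The function names, the `project`/`year`/`month`/`day` key names, and the sample values are assumptions for illustration only and are not taken from the video.

```python
from datetime import date


def date_first_key(project: str, day: date, filename: str) -> str:
    """Date-first layout: suits workloads that mostly filter by date."""
    return (
        f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"
        f"project={project}/{filename}"
    )


def project_first_key(project: str, day: date, filename: str) -> str:
    """Project-first layout: suits workloads that mostly filter by project."""
    return (
        f"project={project}/year={day.year:04d}/month={day.month:02d}/"
        f"day={day.day:02d}/{filename}"
    )


def partition_values(key: str) -> dict:
    """Extract the key=value folder segments from an object key."""
    folders = key.split("/")[:-1]  # drop the file name
    return dict(part.split("=", 1) for part in folders if "=" in part)


if __name__ == "__main__":
    sample_day = date(2021, 3, 7)  # hypothetical event date
    for build in (date_first_key, project_first_key):
        key = build("FolderA", sample_day, "data.csv")
        print(key, partition_values(key))
```

In both layouts the project name ends up as a partition value, so a filter on a single project such as FolderA remains possible; the order of the folders mainly reflects which filter is expected to be most common.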