Corrections -
62 - C
A. VACUUM FULL Orders: Reclaims disk space and sorts the data according to the sort key but does not specifically analyze interleaved sort keys.
B. VACUUM DELETE ONLY Orders: Recovers space from deleted rows without sorting the table.
C. VACUUM REINDEX Orders: Reclaims space, sorts data, and analyzes interleaved sort key columns, providing thorough optimization for tables with interleaved sort keys.
D. VACUUM SORT ONLY Orders: Sorts data according to the sort key without reclaiming space from deleted rows.
44 - C
50 - C
67 - C
54 - C
Question 50 - it's option C, because in Query Editor v2 there is an option to run a query on a schedule; that is the one with the least effort.
Apache Airflow comes with a lot of overhead, so it's not the least-effort option.
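For reference, a minimal sketch of the statement you would schedule in Query Editor v2, assuming the materialized view is named sales_mv (the name is a placeholder):
-- Statement saved as a scheduled query in Redshift Query Editor v2; sales_mv is an assumed name.
REFRESH MATERIALIZED VIEW sales_mv;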
Q54 - The answer should be B: Hadoop to EMR (Hive metastore on EMR), with the data catalog handled via the Glue Data Catalog.
Hi sthithapragna. Thank you so much. I cleared the AWS DE certification; almost 40 questions were from this playlist. Thank you, team.
74) The answer is A, because we need to process the data first and then load it.
You can use streaming ingestion to process it as well:
docs.aws.amazon.com/redshift/latest/dg/materialized-view-streaming-ingestion.html
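A rough sketch based on that doc, assuming a Kinesis stream named orders-stream and a placeholder IAM role ARN (both are assumptions):
-- External schema mapped to Kinesis Data Streams (the role ARN is a placeholder).
CREATE EXTERNAL SCHEMA kinesis_schema
FROM KINESIS
IAM_ROLE 'arn:aws:iam::111122223333:role/RedshiftStreamingRole';

-- Materialized view over the stream; Redshift parses records as they arrive.
CREATE MATERIALIZED VIEW orders_stream_mv AUTO REFRESH YES AS
SELECT approximate_arrival_timestamp,
       JSON_PARSE(kinesis_data) AS payload
FROM kinesis_schema."orders-stream";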
44) The default setting for the root EBS volume of an EC2 instance is DeleteOnTermination set to true.
Agree, for me the right answer would be "C".
I agree, it's definitely C.
@sthithapragna what is your answer here?
Corrected my answer in the pinned comment. Thank you all for keeping an eye on this.
@@sthithapragnakk I passed DEA-C01 last week, thanks a lot for your valuable contribution! :D
44) By default, the root EBS volume's data is deleted after EC2 instance termination.
The answer is C
Correct, it's C.
Thank you, I was looking for this playlist. Please upload more questions for practice.
Sure, will do
For question 56, I think that option D (Lambda + Step Functions) is more cost-effective than using Glue (option A). For me, A would be more suited to a least-operational-overhead type of question. Since the question doesn't specify the data size being processed, I'd go with option D (for large data processing, Glue seems like the best and/or only option).
Very useful playlist for the exam. Thanks.
62 seems wrong - the REINDEX option analyzes the interleaved sort keys mentioned in the question and then does a full vacuum.
You are correct, thank you for pointing it out to me. A sketch of the VACUUM REINDEX command follows the option breakdown below.
A. VACUUM FULL Orders: Reclaims disk space and sorts the data according to the sort key but does not specifically analyze interleaved sort keys.
B. VACUUM DELETE ONLY Orders: Recovers space from deleted rows without sorting the table.
C. VACUUM REINDEX Orders: Reclaims space, sorts data, and analyzes interleaved sort key columns, providing thorough optimization for tables with interleaved sort keys.
D. VACUUM SORT ONLY Orders: Sorts data according to the sort key without reclaiming space from deleted rows.
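A minimal sketch of the option C command, assuming the table is named Orders as in the options:
-- Analyzes the interleaved sort key distribution, then reclaims space and re-sorts the rows.
VACUUM REINDEX Orders;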
Q50 - If we need to use Apache Airflow to refresh the Redshift materialized view, we still need to write a DAG for it, so it's more effort. Not sure if C is the right answer, as we need to automate the refresh schedule.
Question 48: I think it's A, because if there are some void or null values in the sales_amount column, COUNT(*) will still count them. But it still doesn't calculate sales amounts! Weird question.
The question's options have a typo; they meant the date is in YYYY-MM-DD format, so you extract the year from it. It should be option B if the question were asked correctly.
A only gives the count, not the sales amount; you already pointed that out, so it cannot be A at all.
@@vreddyc3608 There is no EXTRACT function in Athena.
I think it's B
@@vreddyc3608 It's B. It has a typo: instead of the sales_date column they mentioned the table name, I believe (where extract(year from sales_date) = '2023').
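If that is the typo, the intended query presumably looks something like this (the table and column names are assumptions from the discussion):
-- Total sales for 2023; sales, sales_amount, and sales_date are assumed names.
SELECT SUM(sales_amount)
FROM sales
WHERE EXTRACT(YEAR FROM sales_date) = 2023;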
Q67 - how is it D? There is no partition concept in Redshift, only distribution. Shouldn't it be C?
Yes, C looks like the right option. The distribution style for the large tables was already given as EVEN in the question, so for better performance we can leave the large tables on EVEN distribution and change the distribution of the small tables to ALL.
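For reference, a minimal sketch of that change, assuming a small dimension table named dim_region (the name is a placeholder):
-- Replicate the small dimension table to every node; the large fact tables stay on EVEN.
ALTER TABLE dim_region ALTER DISTSTYLE ALL;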
48: B. If the year is stored in ddmmyyyy format, then option B will help.
Question 65 - I think the correct answer is A and D, as they complement each other and form a consistent pipeline
66) Option B - "Use the query result reuse feature of Amazon Athena for the SQL queries": not suitable because the data is refreshed every hour, and using cached results might lead to outdated insights. Can someone please explain this to me?
Agree. Could anyone please explain this?
Question 54: AWS DMS does not support migration of the Hive metastore, so option A is not the correct one.
I think option B is correct
Option A doesn't mention EMR either.
Yes, I would go with B as well.
Question 73: the options are close. Agreed, DataBrew is drag-and-drop and anyone can use it, but anyone can do B as well, and nowhere in the question is it mentioned that this is needed on a regular, ongoing basis. So do we assume that's the case unless the question states it's one-time?
You can see in the first sentence that they receive a DAILY file. That means you need to do this operation every day.
I couldn't understand the answer to question 50. How would Airflow do that? The materialized view refreshes automatically in Redshift, but that option isn't available either.
I'll be writing my exam tomorrow; I'll be sure to update you guys on my experience.
All the best
Hey guys, I got my cert and I super endorse this channel! Thank you so much @sthithapragna
@@GodfreyAmaechi-j9x Did you get any questions from these dumps on your exam?
@@GodfreyAmaechi-j9x Are all the exam questions the same?
@@GodfreyAmaechi-j9x Can you let us know how you prepared for it?
Thank you
You're welcome! All the best
Question 43: why would the Step Function need IAM permission to access the S3 bucket?
Can I get a PDF version? What's the chance these questions come up on the actual exam?
Could you please share a link for the Databricks Data Engineer Associate exam as well?
76: why B and not A? You're concerned with operational overhead, but the question says they already have Redshift, not Athena. Why do you choose Athena rather than Redshift?
Using Amazon Redshift Spectrum to query the data would involve more operational overhead, since you need to spin up a Redshift cluster to get Redshift Spectrum to work.
You can run Athena queries on DynamoDB using the PartiQL language. Federated query pass-through is also useful when you want to run SELECT queries that aggregate, join, or invoke functions of your data source that are not available in Athena.
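For example, a hedged sketch of such a federated query, assuming a DynamoDB connector registered in Athena as the data source dynamo_ddb with a table named orders (both names are assumptions):
-- Athena federated query through a DynamoDB connector; the connector exposes
-- DynamoDB tables under the "default" schema.
SELECT customer_id, SUM(order_total) AS total_spent
FROM "dynamo_ddb"."default"."orders"
GROUP BY customer_id;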
Redshift Spectrum can query from S3, not from other DB sources.
Hi @sthithapragna, as a member of your channel, can I access a PDF of the Solutions Architect Professional?
send me an email
@@sthithapragnakk How can I join and get the PDF?
When will the next update be?
I have scheduled the exam.
Can you also share the PDF file for DEA-C01?
How can I become a member of this channel? I am not seeing any option to join.
There should be a button that says Join next to Subscribe.
Why is 46 C? Shouldn't it be D?
Dear sir,
I have purchased your membership.
Please provide me a PDF for the AWS Solutions Architect Associate (SAA) exam.
Sent
Hello @sthithapragna, I have subscribed; can I receive the dumps please?
Hi sir, I have a membership. Where can I get the PDF of these questions, from question 1-40 and 41-80?
Send me an email if you haven't already done so.
Q50 is probably C, not A.
I agree; there is a feature in Query Editor v2 called "scheduled queries" that makes it easy to schedule a refresh query.
This was my opinion as well
C is actually a better answer here
I agree with C.
Even while creating the materialized view, we can pass the AUTO REFRESH parameter, so C is the more appropriate solution.
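For reference, a minimal sketch with placeholder names (the view, table, and columns are assumptions):
-- AUTO REFRESH asks Redshift to keep the view current without a scheduled job;
-- it is not available for every view definition, which is why the scheduled query option still matters.
CREATE MATERIALIZED VIEW sales_mv AUTO REFRESH YES AS
SELECT region, SUM(sales_amount) AS total_sales
FROM sales
GROUP BY region;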