AWS Certified Data Engineer Associate Exam Practice Questions - ANALYSIS P2 (DEA-C01)

  • Published 11 Dec 2024

COMMENTS • 80

  • @sthithapragnakk
    @sthithapragnakk  4 months ago +14

    Corrections -
    62 - C
    A. VACUUM FULL Orders: Reclaims disk space and sorts the data according to the sort key but does not specifically analyze interleaved sort keys.
    B. VACUUM DELETE ONLY Orders: Recovers space from deleted rows without sorting the table.
    C. VACUUM REINDEX Orders: Reclaims space, sorts data, and analyzes interleaved sort key columns, providing thorough optimization for tables with interleaved sort keys.
    D. VACUUM SORT ONLY Orders: Sorts data according to the sort key without reclaiming space from deleted rows.
    44 - C
    50 - C
    67 - C
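
For reference, the four options in question 62 correspond to modes of the Redshift `VACUUM` command. A minimal sketch that only builds the statement text; the helper name `vacuum_statement` and the table name `orders` are illustrative assumptions, and nothing here connects to a cluster:

```python
# Hypothetical helper: builds a Redshift VACUUM statement as a plain string.
# It does not talk to a cluster; run the resulting SQL with your own client.
VACUUM_MODES = ("FULL", "DELETE ONLY", "SORT ONLY", "REINDEX")


def vacuum_statement(table: str, mode: str = "FULL") -> str:
    """Return 'VACUUM <mode> <table>;' after basic validation."""
    mode = mode.upper()
    if mode not in VACUUM_MODES:
        raise ValueError(f"unsupported VACUUM mode: {mode}")
    # Reject anything that is not a simple identifier (assumed naming rule).
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    return f"VACUUM {mode} {table};"


# REINDEX is the mode aimed at tables with interleaved sort keys.
print(vacuum_statement("orders", "REINDEX"))
```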

  • @susmitatripathi3569
    @susmitatripathi3569 2 months ago +5

    Question 50: the answer is option C, because query editor v2 has an option to run a query on a schedule, which takes the least effort.
    Apache Airflow comes with a lot of overhead, so it is not the least-effort option.
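
The statement such a schedule would run is a single line of SQL. A minimal sketch, assuming a materialized view named `sales_mv` (a made-up name); the helper only produces the text you would save as a scheduled query:

```python
# Hypothetical helper: builds the Redshift statement that refreshes a
# materialized view. The view name below is an assumption for illustration.
def refresh_statement(view_name: str) -> str:
    """Return 'REFRESH MATERIALIZED VIEW <view_name>;'."""
    # Allow schema-qualified simple identifiers such as "public.sales_mv".
    if not view_name.replace("_", "").replace(".", "").isalnum():
        raise ValueError(f"unexpected view name: {view_name!r}")
    return f"REFRESH MATERIALIZED VIEW {view_name};"


print(refresh_statement("sales_mv"))
```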

  • @rizzu13riz
    @rizzu13riz 1 month ago +2

    Q54 - The answer should be B. Hadoop moves to EMR (Hive metastore in EMR), and the data catalog is handled by the Glue Data Catalog.

  • @MsGeethaa
    @MsGeethaa 2 months ago +1

    Hi sthithapragna. Thank you so much. I cleared the AWS DE certification. Almost 40 questions were from this playlist. Thank you, team.

  • @db1542
    @db1542 5 months ago +4

    74) The answer is A, because we need to process the data first and then load it.

    • @sthithapragnakk
      @sthithapragnakk  4 months ago +2

      You can use streaming ingestion to process the data as well.
      docs.aws.amazon.com/redshift/latest/dg/materialized-view-streaming-ingestion.html

  • @kirankapadia5551
    @kirankapadia5551 7 months ago +8

    44: for an EC2 root EBS volume, the delete-on-termination setting defaults to true.

    • @mrgabrielsrangel
      @mrgabrielsrangel 7 months ago +1

      Agree, for me the right answer would be "C".

    • @danielopanubi2525
      @danielopanubi2525 6 months ago +1

      I agree, it's definitely C.

    • @elvisbrahi2523
      @elvisbrahi2523 4 months ago

      @sthithapragna what is your answer here?

    • @sthithapragnakk
      @sthithapragnakk  4 months ago +3

      Corrected my answer in the pinned comment. Thank you all for keeping an eye.

    • @minntheinaung
      @minntheinaung 4 months ago

      @sthithapragnakk I passed DEA-C01 last week, thanks a lot for your valuable contribution! :D

  • @db1542
    @db1542 5 months ago +9

    44) By default, the root EBS volume is deleted when the EC2 instance is terminated.
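
To keep a root volume after termination, that default has to be overridden. A minimal sketch of the block device mapping you could pass as `BlockDeviceMappings` to boto3's `run_instances` or `modify_instance_attribute`; the device name `/dev/xvda` is an assumption and depends on the AMI, and the function makes no AWS call itself:

```python
# Hypothetical helper: builds a BlockDeviceMappings value that controls the
# DeleteOnTermination flag of one volume. No AWS API is called here.
def root_volume_mapping(device_name: str = "/dev/xvda",
                        delete_on_termination: bool = False) -> list:
    """Return a one-element block device mapping list."""
    return [
        {
            "DeviceName": device_name,  # assumed root device name
            "Ebs": {"DeleteOnTermination": delete_on_termination},
        }
    ]


mapping = root_volume_mapping()
print(mapping)
# Possible use (not executed here; the instance ID is a placeholder):
# boto3.client("ec2").modify_instance_attribute(
#     InstanceId="i-...", BlockDeviceMappings=mapping)
```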

  • @siddhi106
    @siddhi106 8 months ago +1

    Thank you, I was looking for this playlist. Please upload more questions for practice.

  • @fbrcode
    @fbrcode 5 days ago

    For question 56, I think option D (Lambda + Step Functions) is more cost-effective than using Glue (option A). For me, A would be better suited to a least-operational-overhead type of question. Since the question doesn't specify the size of the data being processed, I go with option D (for large data processing, Glue seems like the best and/or only option).

  • @c.danielhabibtodegnon6204
    @c.danielhabibtodegnon6204 2 months ago

    Very useful playlist for the exam. Thanks.

  • @yogeshbendre716
    @yogeshbendre716 4 months ago +3

    62 seems wrong: the REINDEX option analyzes the interleaved sort keys mentioned in the question and then does a full vacuum.

    • @sthithapragnakk
      @sthithapragnakk  4 months ago

      You are correct. Thank you for pointing it out to me.
      A. VACUUM FULL Orders: Reclaims disk space and sorts the data according to the sort key but does not specifically analyze interleaved sort keys.
      B. VACUUM DELETE ONLY Orders: Recovers space from deleted rows without sorting the table.
      C. VACUUM REINDEX Orders: Reclaims space, sorts data, and analyzes interleaved sort key columns, providing thorough optimization for tables with interleaved sort keys.
      D. VACUUM SORT ONLY Orders: Sorts data according to the sort key without reclaiming space from deleted rows.

  • @sprak8711
    @sprak8711 3 months ago

    Q50 - If we use Apache Airflow to refresh the Redshift materialized view, we still need to write a DAG for it, so it takes more effort. Not sure if C is the right answer, as we need to automate the refresh schedule.

  • @youssefbenlhabib3019
    @youssefbenlhabib3019 7 months ago +2

    Question 48: I think it's A, because if there are empty or null values in the sales_amount column, count(*) will still count them. But it still doesn't calculate sales amounts! Weird question.

    • @vreddyc3608
      @vreddyc3608 7 months ago +2

      The question's options have a typo. They meant that YEAR is in YYYY-MM-DD format, so you extract the year from that format. It should be option B if the question had been asked correctly.

    • @sthithapragnakk
      @sthithapragnakk  4 months ago

      A only gives the count, not the sales amount. You already pointed that out, so it cannot be A at all.

    • @sthithapragnakk
      @sthithapragnakk  4 months ago +1

      @vreddyc3608 There is no EXTRACT function in Athena.

    • @luanvan83
      @luanvan83 4 months ago

      I think it's B

    • @priyankakandaswamy2456
      @priyankakandaswamy2456 3 months ago

      @vreddyc3608 It's B. It has a typo: instead of the sales_date column they mentioned the table name, I believe (where extract(year from sales_date) = '2023')
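
On the EXTRACT point: Athena's SQL engine is based on Trino/Presto, whose date and time functions include `extract(field FROM x)` as well as `year(x)`. A minimal sketch of the kind of query this thread is describing, assuming a table `sales` with a DATE column `sales_date` and a numeric column `sales_amount` (the table and column types are assumptions, not taken from the exam question):

```python
# Hypothetical helper: builds an Athena (Trino/Presto SQL) query string that
# totals sales for one year. Table and column names are assumptions.
def yearly_sales_query(year: int, table: str = "sales") -> str:
    """Return a query that sums sales_amount for the given year."""
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    # extract() yields a number, so the year is compared as an integer
    # rather than as the quoted string '2023'.
    return (
        "SELECT SUM(sales_amount) AS total_sales\n"
        f"FROM {table}\n"
        f"WHERE extract(year FROM sales_date) = {int(year)}"
    )


print(yearly_sales_query(2023))
```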

  • @malcolmjohntamang9544
    @malcolmjohntamang9544 6 months ago +7

    Q67: how is it D? There is no partition concept in Redshift, only distribution. Shouldn't it be C?

    • @kiranp2808
      @kiranp2808 4 months ago +2

      Yes. C looks like the right option. The question already states that the large tables use EVEN distribution. So, for better performance, we can leave the large tables on EVEN distribution and change the distribution of the small tables to ALL.
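
Switching a small table to ALL distribution is one `ALTER TABLE` per table in Redshift. A minimal sketch that builds those statements; the table names `dim_region` and `dim_product` are invented for illustration:

```python
# Hypothetical helper: builds Redshift statements that change a table's
# distribution style. Only the keyword-only styles are handled here.
DIST_STYLES = ("ALL", "EVEN", "AUTO")


def alter_diststyle(table: str, style: str = "ALL") -> str:
    """Return 'ALTER TABLE <table> ALTER DISTSTYLE <style>;'."""
    style = style.upper()
    if style not in DIST_STYLES:
        raise ValueError(f"unsupported distribution style: {style}")
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    return f"ALTER TABLE {table} ALTER DISTSTYLE {style};"


# Small lookup tables (assumed names) get a full copy on every node.
for small_table in ("dim_region", "dim_product"):
    print(alter_diststyle(small_table, "ALL"))
```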

  • @abhijeetsingh3027
    @abhijeetsingh3027 13 days ago

    48: B. If the year is stored in ddmmyyyy format, then option B will help.

  • @patriciacafundo1626
    @patriciacafundo1626 1 month ago

    Question 65 - I think the correct answer is A and D, as they complement each other and form a consistent pipeline

  • @AMM2012
    @AMM2012 4 months ago +1

    66) Option B - Use the query result reuse feature of Amazon Athena for the SQL queries: Not suitable because the data is refreshed every hour, and using cached results might lead to outdated insights. Please explain this to me.

    • @OACisco
      @OACisco 18 days ago

      Agree. Could anyone please explain this?
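
One point that may help here: Athena lets the caller cap how old a reused result may be, which bounds how stale a cached answer can get. A minimal sketch, assuming boto3's `start_query_execution` and its `ResultReuseConfiguration` parameter; the workgroup name, query text, and the 60-minute cap are placeholders, and the function only builds the keyword arguments:

```python
# Hypothetical helper: builds keyword arguments for Athena's
# start_query_execution with query result reuse enabled. No AWS call is made.
def reuse_query_kwargs(sql: str, workgroup: str = "primary",
                       max_age_minutes: int = 60) -> dict:
    """Return kwargs that allow reuse of results up to max_age_minutes old."""
    if max_age_minutes < 0:
        raise ValueError("max_age_minutes must not be negative")
    return {
        "QueryString": sql,
        "WorkGroup": workgroup,  # placeholder workgroup name
        "ResultReuseConfiguration": {
            "ResultReuseByAgeConfiguration": {
                "Enabled": True,
                "MaxAgeInMinutes": max_age_minutes,
            }
        },
    }


kwargs = reuse_query_kwargs("SELECT 1")
print(kwargs["ResultReuseConfiguration"])
# Possible use (not executed here):
# boto3.client("athena").start_query_execution(**kwargs)
```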

  • @MrKuljas
    @MrKuljas 2 months ago +2

    Question 54: AWS DMS does not support migration of a Hive metastore, so option A is not the correct one.

  • @kirankapadia5551
    @kirankapadia5551 7 months ago +1

    73: the question and options are close. Agreed, DataBrew is drag-and-drop and anyone can use it, but anyone can do B as well, and nowhere does the question say this is needed on a regular, ongoing basis. Do we assume that is the case unless the question states it is one-time?

    • @truongchidien3810
      @truongchidien3810 5 months ago +1

      You can see in the first sentence, they receive a DAILY file. It means you need to do this operation every day

  • @MohammadBilal-c2w
    @MohammadBilal-c2w 3 months ago +1

    I couldn't understand the answer to question 50. How would Airflow do that? An MV can refresh automatically in Redshift, but that option is not listed either.

  • @GodfreyAmaechi-j9x
    @GodfreyAmaechi-j9x 7 months ago +2

    I will be writing my exam tomorrow and will be sure to update you guys on my experience.

    • @sthithapragnakk
      @sthithapragnakk  7 months ago +1

      All the best

    • @GodfreyAmaechi-j9x
      @GodfreyAmaechi-j9x 7 months ago +4

      Hey guys, I got my cert and I'm super endorsing this channel! Thank you so much @sthithapragna

    • @aswaths9857
      @aswaths9857 7 months ago

      @GodfreyAmaechi-j9x did you get any questions from these dumps in your exam?

    • @divingcicada
      @divingcicada 7 months ago

      @GodfreyAmaechi-j9x are all the exam questions the same?

    • @elvisbrahi2523
      @elvisbrahi2523 5 months ago

      @GodfreyAmaechi-j9x can you let us know how you prepared for it?

  • @easylondon3334
    @easylondon3334 8 months ago +1

    Thank you

  • @mohamedeldessouky7044
    @mohamedeldessouky7044 3 months ago

    Question 43: why would the step function need IAM permission to access the S3 bucket?

  • @atishh100
    @atishh100 8 months ago

    Can I get a PDF version? What are the chances of these questions coming up in the actual exam?

  • @kumarnishant6037
    @kumarnishant6037 2 months ago

    Could you please share a link for the Databricks data engineering associate exam as well?

  • @schwarzkelloggs
    @schwarzkelloggs 3 months ago

    76: why B and not A? You're concerned with operational overhead, but the question says it already has Redshift, not Athena. Why choose Athena rather than Redshift?

    • @amir_ob
      @amir_ob 3 months ago

      Using Amazon Redshift Spectrum to query the data would involve more operational overhead, since you need to spin up a Redshift cluster to get Redshift Spectrum to work.

    • @amir_ob
      @amir_ob 3 months ago

      You can run Athena queries on DynamoDB using the PartiQL language. Federated query pass-through is also useful when you want to run SELECT queries that aggregate, join, or invoke functions of your data source that are not available in Athena.

    • @LathaIyer-y4p
      @LathaIyer-y4p 2 months ago

      Redshift Spectrum can query data in S3, not other database sources.

  • @fernandocarrillo1995
    @fernandocarrillo1995 9 months ago

    Hi @sthithapragna, as a member of your channel can I access a pdf of the Solution Architect Professional?

    • @sthithapragnakk
      @sthithapragnakk  9 months ago +1

      Send me an email.

    • @nikiniki17
      @nikiniki17 1 month ago

      @sthithapragnakk how can I join and get the PDF?

  • @divingcicada
    @divingcicada 7 months ago

    When will the next update be?

  • @satyamsinha5429
    @satyamsinha5429 4 months ago

    I have scheduled the exam.

  • @kavyagottumukkala3466
    @kavyagottumukkala3466 5 months ago

    Can you also share the PDF file for DEA-C01?

  • @subhamaybhattacharyya
    @subhamaybhattacharyya 6 months ago

    How can I become a member of this channel? I am not seeing any option to join.

    • @sthithapragnakk
      @sthithapragnakk  6 months ago

      There should be a button that says Join next to Subscribe.

  • @MohammadBilal-c2w
    @MohammadBilal-c2w 3 months ago

    Why is 46 C? Shouldn't it be D?

  • @pshashank2411
    @pshashank2411 9 months ago

    Dear sir,
    I have purchased your membership.
    Please provide me a pdf for AWS solutions architect associate exam. SAA

  • @chrismes3780
    @chrismes3780 5 months ago

    Hello @sthithapragna, I have subscribed. Can I receive the dumps, please?

  • @SportsSpectraa
    @SportsSpectraa 7 months ago

    Hi sir, I have a membership. Where can I get the PDF of these questions, 1-40 and 41-80?

    • @sthithapragnakk
      @sthithapragnakk  7 months ago

      Send me an email if you haven't already done so.

  • @Vilayat_Khan
    @Vilayat_Khan 8 months ago +10

    Q50 is probably C, not A.

    • @youssefbenlhabib3019
      @youssefbenlhabib3019 7 months ago +11

      I agree, there is a feature in query editor v2 called "scheduled queries" making it easy to schedule a refresh query

    • @danielopanubi2525
      @danielopanubi2525 7 months ago +1

      This was my opinion as well

    • @praisetaiwo2530
      @praisetaiwo2530 4 months ago +1

      C is actually a better answer here

    • @amir_ob
      @amir_ob 3 months ago

      I agree with C.

    • @AbhishekChauhan-ur9tj
      @AbhishekChauhan-ur9tj 2 months ago

      Even when creating the materialized view we can pass the AUTO REFRESH parameter, so C is the more appropriate solution.
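
The AUTO REFRESH clause mentioned here is part of Redshift's `CREATE MATERIALIZED VIEW` statement. A minimal sketch that assembles such a statement; the view name `daily_sales_mv` and the SELECT are invented for illustration. With auto refresh, Redshift decides when to refresh, so it is not the same as a fixed schedule:

```python
# Hypothetical helper: assembles a Redshift CREATE MATERIALIZED VIEW
# statement with the AUTO REFRESH clause. View name and query are assumptions.
def create_auto_refresh_mv(view_name: str, select_sql: str) -> str:
    """Return the DDL for a materialized view with AUTO REFRESH YES."""
    if not view_name.replace("_", "").isalnum():
        raise ValueError(f"unexpected view name: {view_name!r}")
    body = select_sql.strip().rstrip(";")  # avoid a doubled semicolon
    return (
        f"CREATE MATERIALIZED VIEW {view_name}\n"
        "AUTO REFRESH YES\n"
        "AS\n"
        f"{body};"
    )


print(create_auto_refresh_mv(
    "daily_sales_mv",
    "SELECT sales_date, SUM(sales_amount) AS total FROM sales GROUP BY sales_date;",
))
```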