How to NOT Fail a System Design Interview (By a Data Engineer)

  • Published 23 Dec 2024

COMMENTS • 77

  • @JashRadia
    @JashRadia  2 years ago +5

    We have just reached 1000 subs! And it's all thanks to you super awesome subscribers and viewers! ❤️
    If you have not subscribed already, what are you waiting for?
    Drop your questions or suggestions below! Let's talk :)

  • @yao780607
    @yao780607 7 days ago

    As an SWE preparing for a DE interview, I found the mock interview and resources extremely helpful. Thank you!

  • @brownwolf05
    @brownwolf05 2 years ago +7

    Awesome video, Jash. Generally nobody talks about pipeline design and system design for the DE domain. Great sharing of resources, big applause to you.

    • @JashRadia
      @JashRadia  2 years ago

      Thanks a lot, Arnav! And yes, there is a lack of videos on such topics in DE, that was my main reason to create one! Really glad you liked it 😊

  • @ankit_in_munich
    @ankit_in_munich 8 months ago +5

    Awesome video, really helpful.
    I found these system design questions, which are commonly asked in interviews. If you could share some insights on them, it would be really helpful.
    🌊 Data Pipeline Design: How would you design a data pipeline to handle large volumes of streaming data? (e.g., IoT devices or website clickstreams)
    ⚖ Batch vs. Stream Processing: Explain the differences between batch and stream processing and when to use each in data systems.
    🏢 Data Warehousing: Design a data warehousing system for e-commerce. Discuss storage tech, data modeling, and querying methods.
    🧩 Data Partitioning & Sharding: How to improve performance and scalability by partitioning and sharding a large database? Discuss trade-offs.
    📦 Data Serialization Formats: Compare JSON, Avro, Parquet, and ORC. When to use each in data processing?
    📜 Data Compression: Discuss data compression techniques in big data systems and choosing the right algorithm.
    🌐 Distributed Data Processing: Explain distributed data processing with Hadoop, Spark, or Flink, emphasizing fault tolerance and data locality.
    🔄 Data ETL: Design an ETL process to migrate data from a relational DB to a data lake. Discuss tools and frameworks.
    ⚙ Resource Configuration: Handling 100GBs of data per spark-submit - How to configure the cluster?
    🧐 Data Quality & Validation: Ensuring data quality and validation in data pipelines, handling missing or erroneous data.
    🔒 Data Security: Best practices for securing sensitive data in big data environments - encryption, access control, auditing.
    📈 Scalability: Scaling data systems horizontally and vertically to meet growing data volumes and workloads.
    📊 Monitoring & Logging: Importance of monitoring and logging in data systems, tools, and metrics for system health.
    🗃 Data Archiving & Retention: Data archiving and retention strategy for a data warehouse, handling historical data.
    💰 Cost Optimization: Strategies to optimize data storage and processing costs in cloud-based data architectures (AWS, Azure, GCP).
    📜 Data Governance: Role of data governance in data engineering - ensuring compliance with data regulations.
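
For the resource configuration question in the list above, one common way to start is back-of-the-envelope arithmetic. The sketch below is only a rule of thumb, not an official sizing method: the 128 MB partition size matches Spark's default for `spark.sql.files.maxPartitionBytes`, while the 5 cores per executor, 4 task waves, and 4 GB of memory per core are assumptions to tune for the actual workload. The function name `size_spark_job` is hypothetical.

```python
import math


def size_spark_job(input_gb, partition_mb=128, cores_per_executor=5,
                   waves=4, memory_per_core_gb=4):
    """Rule-of-thumb sizing; every default here is an assumption to tune."""
    # Number of input partitions (and therefore tasks in the first stage).
    partitions = math.ceil(input_gb * 1024 / partition_mb)
    # Let each core process `waves` tasks in sequence instead of one core per task.
    total_cores = math.ceil(partitions / waves)
    executors = math.ceil(total_cores / cores_per_executor)
    return {
        "partitions": partitions,
        "num_executors": executors,
        "executor_cores": cores_per_executor,
        "executor_memory_gb": cores_per_executor * memory_per_core_gb,
    }


plan = size_spark_job(100)
print(plan)
```

The resulting numbers would map onto the `--num-executors`, `--executor-cores`, and `--executor-memory` options of `spark-submit`; in an interview, the reasoning behind each assumption matters more than the exact values.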

  • @salmansayyad4522
    @salmansayyad4522 1 year ago +2

    hey, really cleared my concepts. thanks bro, keep posting such videos.

  • @jaladhithakur7567
    @jaladhithakur7567 2 years ago +6

    Great video. This really explains how to solve system design questions. Also, thanks for sharing the resources 😃

    • @JashRadia
      @JashRadia  2 years ago

      Really glad you liked it, this one took a lot of effort :D

  • @snaekboi
    @snaekboi 2 years ago +1

    That was excellent!
    Well structured, and well explained. Even my undergraduate self was able to understand quite a bit of it :)

    • @JashRadia
      @JashRadia  2 years ago

      Thank you! Really glad you liked it 😊

  • @shubhambhandari431
    @shubhambhandari431 3 months ago +2

    In an interview I was asked to design a pipeline to get data from an API and store it in a DB for dashboarding. Each region has its own data, and all data should be collected centrally. My answer was to put the data in Kafka and from there use Airflow or Flink to process it and store it in a data warehouse for each region, with every region's incremental data then moved to a central data warehouse. The interviewer's next question was to write the code for the Airflow component. I said that right now I would need to use Google to write the code, as I don't like to memorize syntax and function names, but I know what logic should be used. That's how I got rejected in the last round.😢

    • @MrMLBson09
      @MrMLBson09 1 month ago

      They're retarded. Everyone uses Google and if I'm the hiring manager I'd appreciate your honesty. Fuck em.
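
The "incremental data to a central warehouse" step described in this thread can be sketched without any scheduler-specific syntax. The example below is a minimal illustration in plain Python, assuming a per-region watermark on an `updated_at` column; `Row`, `CentralWarehouse`, and `load_increment` are hypothetical names, and in an Airflow setup a function like this would simply be the body of one scheduled task.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Row:
    region: str
    record_id: int
    updated_at: datetime
    value: float


@dataclass
class CentralWarehouse:
    # Rows keyed by (region, record_id) so reloading a record is an upsert.
    rows: dict = field(default_factory=dict)
    # Per-region watermark: the latest updated_at already loaded.
    watermarks: dict = field(default_factory=dict)

    def load_increment(self, region, regional_rows):
        """Copy only rows newer than the region's watermark; return the count."""
        last_seen = self.watermarks.get(region, datetime.min)
        new_rows = sorted(
            (r for r in regional_rows if r.updated_at > last_seen),
            key=lambda r: r.updated_at,
        )
        for r in new_rows:
            self.rows[(r.region, r.record_id)] = r  # newest version wins
        if new_rows:
            self.watermarks[region] = new_rows[-1].updated_at
        return len(new_rows)


eu_rows = [
    Row("eu", 1, datetime(2024, 1, 1), 10.0),
    Row("eu", 2, datetime(2024, 1, 2), 20.0),
]
central = CentralWarehouse()
first_run = central.load_increment("eu", eu_rows)

# Record 1 is updated in the regional store; only that row moves on the next run.
eu_rows.append(Row("eu", 1, datetime(2024, 1, 3), 15.0))
second_run = central.load_increment("eu", eu_rows)
print(first_run, second_run, central.rows[("eu", 1)].value)
```

Keeping the watermark per region means each regional load can run and fail independently, which is one reason this pattern is often chosen over a full reload.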

  • @priyankapandey9122
    @priyankapandey9122 2 years ago +1

    Thanks, Jash, for creating the video. Really helpful.

    • @JashRadia
      @JashRadia  2 years ago

      Awesome to know! Thanks for watching 😊

  • @sirajansari2848
    @sirajansari2848 2 years ago +3

    Excellent video!!
    I have a question, though. What would have changed in the data pipeline if the source had been a streaming source? Where should we put Kafka/PubSub in the data pipeline?

    • @JashRadia
      @JashRadia  2 years ago +1

      Two options: between source and landing, or between landing and processing. It can also be put in both places.
      And thanks btw 😊
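
The two placements mentioned in this reply can be pictured with a toy pipeline. This is only a sketch: `Topic` is a hypothetical in-memory stand-in for a Kafka or Pub/Sub topic, not a real client, and the event fields are made up for illustration.

```python
from collections import deque


class Topic:
    """In-memory stand-in for a Kafka or Pub/Sub topic (hypothetical)."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def drain(self):
        # Yield messages in arrival order until the topic is empty.
        while self._messages:
            yield self._messages.popleft()


def run_pipeline(events):
    ingest_topic = Topic()   # option 1: buffer between source and landing
    process_topic = Topic()  # option 2: buffer between landing and processing
    landing_zone = []
    processed = []

    # The source only publishes; it never waits for downstream stages.
    for event in events:
        ingest_topic.publish(event)

    # The landing stage stores raw events and hands them to processing.
    for event in ingest_topic.drain():
        landing_zone.append(event)
        process_topic.publish(event)

    # The processing stage consumes at its own pace and transforms events.
    for event in process_topic.drain():
        processed.append({**event, "amount_cents": round(event["amount"] * 100)})

    return landing_zone, processed


landing, processed = run_pipeline(
    [{"id": 1, "amount": 2.5}, {"id": 2, "amount": 4.0}]
)
print(len(landing), processed)
```

Using only the first topic decouples a bursty source from storage; adding the second lets processing fall behind or be replayed without touching ingestion.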

  • @financial_cycle
    @financial_cycle 10 months ago +2

    the intro is so accurate

  • @larissaarreola2448
    @larissaarreola2448 7 months ago +1

    this was so helpful and somehow reassuring, I really appreciate all your work and effort! Keep it up (:

  • @abhinavpandey5306
    @abhinavpandey5306 10 months ago

    Can you share the detailed list of topics, from basic to advanced, for system design for data engineers?

  • @1godfrey
    @1godfrey 1 year ago

    this is amazing please do more system design vids like this for data engineers where you go in depth like this

  • @ajr1791ze
    @ajr1791ze 1 year ago +1

    Any good book to learn system design for data engineering, not SDE?

  • @puneetnaik8719
    @puneetnaik8719 2 years ago +2

    Great video on system design. Just wanted to know what tool/software you use for creating data flow diagrams during interviews.

    • @JashRadia
      @JashRadia  2 years ago +1

      Draw.io is my go-to tool. In some cases, I also just draw it on the screen using OneNote etc.

    • @puneetnaik8719
      @puneetnaik8719 2 years ago

      @@JashRadia thank you so much

  • @pradhyumansinghmandloi8240
    @pradhyumansinghmandloi8240 1 year ago +2

    Is software engineering system design the same as data engineering system design?

  • @nibu6868
    @nibu6868 3 months ago

    Thanks Jash, Great content! I guess it is implicit in your design that EU data will only be in EU database and US data will only be in US database. Is that right? What about certain global application data, would it be a challenge to sync it in both regions?

  • @gurumoorthysivakolunthu9878
    @gurumoorthysivakolunthu9878 5 months ago +1

    Thank you....

  • @Nick-du9ss
    @Nick-du9ss 2 years ago +1

    Where can we find projects for data engineering for targeting product based companies ?

    • @JashRadia
      @JashRadia  2 years ago

      Go to this link for GCP and pick projects you're interested in. They also let you create a qwiklabs environment for hands on.
      cloud.google.com/architecture
      For AWS, you'll find some projects here in the projects section.
      thirdeyedata.ai/projects/data-engineering/?_gl=1*72axvl*_ga*MTU0Mjg4MzYxLjE2NjI1MjQ1NjY.*_ga_DPYTFQ0MMC*MTY2MjUyNDU2Ni4xLjEuMTY2MjUyNDU3MS4wLjAuMA..&_ga=2.184481186.54541546.1662524567-154288361.1662524566
      And I have also mentioned a data engineering course from acloudguru in the description. It has a lot of hands on for GCP

    • @Nick-du9ss
      @Nick-du9ss 2 years ago +1

      @@JashRadia thanks

  • @artofheart2891
    @artofheart2891 2 years ago +1

    Excellent Information.
    Thank you for sharing..

    • @JashRadia
      @JashRadia  2 years ago

      Thanks for watching! 😀

  • @SidharthanPV
    @SidharthanPV 2 years ago +1

    Thank you for your video.

  • @vrohan07
    @vrohan07 2 years ago +1

    Thank you

  • @hamidomar350
    @hamidomar350 2 years ago +1

    Appreciate the effort, the content is very clear and presented in a way that it is easy to absorb.
    I would love to know if beginners can actually configure and test these systems, even if to a limited extent, without having to pay for any cloud service?
    If not, what would be the next best thing one can do to learn these concepts practically?

    • @JashRadia
      @JashRadia  2 years ago

      Thank you so much for watching and liking the video! 😊
      For hands-on practice, every cloud has some free-tier usage. For example, Snowflake gives a 30-day trial, and AWS is always free for certain services. Just make sure you check whether a service is free before spinning it up.

  • @meetpatel9690
    @meetpatel9690 2 years ago +1

    Could you tell us what the job of data engineering will be in web 3.0?

    • @JashRadia
      @JashRadia  2 years ago

      Interesting question..
      I had a job offer from a web3/DeFi company as a data engineer before joining Google..
      The job of data engineering is going to remain pretty much what it has already been. The difference is that the data is going to be captured from web3 services and products rather than the traditional way.
      Web3 also has a ton of information already available, especially on smart contracts. This will be the starting point.

    • @meetpatel9690
      @meetpatel9690 2 years ago +1

      @@JashRadia So data engineering jobs are going to remain in the web 3.0 space?

    • @JashRadia
      @JashRadia  2 years ago

      @@meetpatel9690 100%. Any firm needs to handle data and learn from it to grow. Web2 or web3.

    • @meetpatel9690
      @meetpatel9690 2 years ago +1

      @@JashRadia thanks jash

  • @asktostranger8296
    @asktostranger8296 1 year ago +1

    Can you explain the difference between the system design round for a software engineer vs. a data engineer?
    If I cover LLD and HLD for software engineers, will I be able to answer the system design questions for data engineering?
    Please clarify 🙏🙏🙏

    • @JashRadia
      @JashRadia  1 year ago +1

      I would suggest you check out the system design video on this channel. I explain this very thing with two examples.

    • @hritikapal683
      @hritikapal683 1 year ago

      @@JashRadia A video other than this one? If yes, can you provide me the link? Thank you!

  • @angelnadar6451
    @angelnadar6451 2 years ago +2

    Hello Jash,
    The podcast with Shashank was helpful.
    Can you please let me know what resources you used for DSA using Python?

    • @JashRadia
      @JashRadia  2 years ago

      Hi Angel, thanks for watching it. I am sure you'll find the content you like on this channel, too.
      As for DSA preparation, I would recommend HackerRank. There is a specific Python path and a DSA path that you can solve using Python. It starts from the basics and goes up to advanced.

    • @angelnadar6451
      @angelnadar6451 2 years ago +1

      @@JashRadia thank you so much Jash !!! Have a great day !!!😊

  • @video-hs5no
    @video-hs5no 2 years ago +1

    Isn't Datalab deprecated?

    • @JashRadia
      @JashRadia  2 years ago +1

      Yes, it should be replaced by Vertex AI.

    • @video-hs5no
      @video-hs5no 2 years ago

      @@JashRadia vertex AI workspace

  • @nobodyinparticula100
    @nobodyinparticula100 8 months ago

    Crisp and concise. Super helpful.
    Is this an entry-level question? Does the L5 level have the same depth? And is it mostly for India-based companies or US-based ones?
    Will a Data Engineer mostly get the second example type of system design? Or is it the same as for SDE?
    What is a good site to practice DE-specific system designs?
    Thank you!

    • @JashRadia
      @JashRadia  8 months ago

      Questions are similar no matter the level. It's just that at L5 you are expected to go into more depth in your answer. India or US doesn't matter, it's common.
      DEs will mostly get the 2nd type of system design unless it's a startup.

  • @joyjitpal
    @joyjitpal 2 years ago +1

    Excellent explanation

    • @JashRadia
      @JashRadia  2 years ago +1

      Really glad it was helpful, thanks for watching 😁

    • @joyjitpal
      @joyjitpal 2 years ago +1

      Please make more of these kinds of videos

    • @JashRadia
      @JashRadia  2 years ago +1

      @@joyjitpal will definitely do!

  • @WelcomeToDataverse
    @WelcomeToDataverse 2 years ago +2

    Are data engineers asked system design questions?

    • @JashRadia
      @JashRadia  2 years ago

      Yes, we are. More often than not it's the 2nd type of question that I covered rather than the 1st one.

    • @WelcomeToDataverse
      @WelcomeToDataverse 2 years ago +2

      @@JashRadia Yes, I went through the whole video. I posted this comment during the intro.

  • @jacksmith7160
    @jacksmith7160 2 years ago +1

    How much time is needed to become a data engineer if we start learning data engineering from scratch?

    • @JashRadia
      @JashRadia  2 years ago +1

      Anywhere between 4-6 months. You can refer to this post I made on LinkedIn.
      www.linkedin.com/posts/jashradia_cloud-bigdata-technology-activity-6960846154028728320-e_Dc?

    • @jacksmith7160
      @jacksmith7160 2 years ago

      @@JashRadia it will help a lot to get into data engineering

  • @paragradia
    @paragradia 2 years ago +1

    Cool 😎 Awesome information 👌

  • @ritwikverma2463
    @ritwikverma2463 2 months ago

    This system design is more focused on the backend, not on the DE side.

  • @thedailyepochs338
    @thedailyepochs338 1 year ago +1

    Awesome

  • @ganeshtaware9883
    @ganeshtaware9883 2 years ago +1

    Hi Jash... I am a regular follower of your videos. I would like to connect with you if possible... Are Data Engineer openings still available at Google? How should I prepare for them?
    I have 12 years of experience, mainly in ETL and data warehousing... What path should I follow to get into Google or similar companies?
    Sorry for putting many questions in a single comment, but it would be great if we could have a one-to-one...

    • @JashRadia
      @JashRadia  2 years ago

      Currently there is a hiring freeze for most positions at Google, so I wouldn't recommend applying now. As for getting selected, it's about getting your basics right. Prepare SQL, data modeling, DSA, pipeline design, distributed systems, and cloud, in the order I mentioned here. You will be good.

    • @ganeshtaware9883
      @ganeshtaware9883 2 years ago +1

      @@JashRadia Thanks.

  • @sathyamanikantabk4483
    @sathyamanikantabk4483 2 years ago +3

    Do an unboxing video of the perks that you got from Google; maybe it will inspire others. All the best, brother, hope to see Big Data content on your channel. I have just subscribed to your channel. Congratulations, and wish me the same luck for Google.

    • @JashRadia
      @JashRadia  2 years ago +1

      While I would love to do that, I have already unboxed everything out of excitement 😂 but yes, I can create a separate video showing all the things we get. Thanks for the suggestion!
      And yes, best of luck to you!

    • @sathyamanikantabk4483
      @sathyamanikantabk4483 2 years ago +1

      @@JashRadia lol...I understand

  • @tusharhatwar
    @tusharhatwar 2 years ago +1

    Hey Jash!
    Just went through your video on Shashank's channel,
    landed here, and subscribed to your channel.
    I would request that you bring more content related to Big Data technologies to this channel.
    By the way, I loved your calm and composed way of explaining.
    Sent you a LinkedIn connection request. It would be great if you could accept it :)

    • @JashRadia
      @JashRadia  2 years ago

      Thanks Tushar! I'll also keep your suggestion in mind about the content. Glad to know you like it so far 😊