Facebook System Design Interview: Design an Analytics Platform (Metrics & Logging)

  • Published Jun 16, 2024
  • Don't leave your system design interview to chance. Sign up for Exponent's system design interview course today: bit.ly/3K0lTtS
    Watch our mock Facebook system design interview. Neamah asks Hozefa (Facebook, Wealthfront EM) a system design question on building a metrics and logging service.
    Watch more videos here:
    - Amazon SDE answers binary tree question: • Amazon Software Engine...
    - Google SWE answers algorithms interview question: • Google Software Engine...
    - Google TPM answers Tiktok system design interview question: • System Design Mock Int...
    - Microsoft SWE answers algorithms interview question: • Microsoft Software Eng...
    👉 Subscribe to our channel: bit.ly/exponentyt
    🕊️ Follow us on Twitter: bit.ly/exptweet
    💙 Like us on Facebook for special discounts: bit.ly/exponentfb
    📷 Check us out on Instagram: bit.ly/exponentig
    ABOUT US:
    Did you enjoy this interview question and answer? Want to land your dream career? Exponent is an online community, course, and coaching platform to help you ace your upcoming interview. Exponent has helped people land their dream careers at companies like Google, Microsoft, Amazon, and high-growth startups. Exponent is currently licensed by Stanford, Yale, UW, and others.
    Our courses include interview lessons, questions, and complete answers with video walkthroughs. Get access to hours of real interview videos, where we analyze what went right or wrong, as well as our community of 1000+ expert coaches and industry professionals, to help you get your dream job and more!
    #systemdesign #facebook #software #engineeringmanagement #tech #entrepreneurship #exponent #tpm
    Chapters -
    00:00:00 - Introduction
    00:00:33 - Clarifying questions
    00:01:52 - Requirements
    00:07:34 - API
    00:10:12 - Design
    00:21:02 - Interview analysis
  • Entertainment

COMMENTS • 75

  • @tryexponent
    @tryexponent  2 years ago +2

    Don't leave your system design interview to chance. Sign up for Exponent's system design interview course today: bit.ly/2Nl5Bn5

  • @heterodyned
    @heterodyned 2 years ago +24

    Metrics and logging sound like two “separate” design questions 🤷🏻‍♂️

  • @sivaranjansahu7427
    @sivaranjansahu7427 1 year ago +4

    Great video. Very natural and realistic, unlike the rehearsed and phony ones in many other videos on YT.

  • @yuanzhang3393
    @yuanzhang3393 2 years ago +26

    Whether the system needs to be near real time is a clarifying question to ask the interviewer, not something to assume.

  • @rituraj889
    @rituraj889 1 year ago +26

    I have a suggestion for all the videos on this platform.
    First, they are super helpful for learning how to carry out the whole design, like how to estimate or where to begin.
    But all of them lack cross-questioning of the interviewee. In real life we can be bombarded with minute, detail-level questions.
    For example, in this video: how does data enrichment work, or how can data collection be made configurable without knowing all the business use cases?
    Bottom line: make it tougher :)

    • @cricket4671
      @cricket4671 7 months ago +1

      A candidate doing what they showed in the video would get downleveled in the best-case scenario 😅

  • @mrsiddhu2012
    @mrsiddhu2012 2 years ago +27

    Great video. Kudos to the interviewer for making the environment so comfortable. A few things:
    1. I could not understand the choice of database. Ideally, shouldn't this be a combination of a time-series DB and a data warehouse?
    2. Certain key components, like a rule engine (for acting on events) and a notification system (for notifying interested subscribers), were missing.
    3. 90 days is a very short retention period. Down-sampling the data to lower the volume and storing it long term could have been discussed.
    4. I thought the interviewer wanted to go beyond just visualization, to automated actions (alarms etc.) and analytics too.
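
The down-sampling idea in point 3 can be sketched in a few lines of Python. This is a minimal illustration and not something shown in the video: the function name `downsample`, the `(timestamp, value)` tuple layout, and the choice of averaging as the rollup are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets.

    `points` is an iterable of (unix_timestamp_seconds, numeric_value).
    Returns one (bucket_start, average_value) pair per bucket, sorted by time.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical raw samples taken every 30 seconds.
raw = [(0, 10.0), (30, 20.0), (60, 30.0), (90, 50.0)]
rollup = downsample(raw, 60)  # one averaged point per minute
```

Raw points could then be dropped after the short retention window while the much smaller rollups are kept long term. Averaging is only one option; min, max, or percentiles may matter more for alerting.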

    • @pavananantharama3762
      @pavananantharama3762 1 year ago +5

      Good observation. It felt like this interview was going in the wrong direction the moment a NoSQL DB was chosen.
      I would say a time-series DB or an Elasticsearch system would have been a good choice.
      The key takeaway for me was how well Hozefa communicated his thoughts and solutions. Very good communicator.

    • @schan263
      @schan263 1 year ago +3

      This interview is simply too short. It needs at least another 10 minutes for the design discussion.

  • @bentchow
    @bentchow 2 years ago +3

    Introducing a Data Catalog would really help with managing PII and auditing where and how sensitive data is being used through data lineage.

  • @dicksonchibuzor7625
    @dicksonchibuzor7625 2 years ago +20

    I heard the interviewer mention, as part of the requirements, that the system could scale up to a billion users, meaning events could be at least double that (depending on the metrics being tracked). In that case, NoSQL may not be the best choice for persistent storage. Maybe an OLAP database (like ClickHouse) should be used. That would have a drastic positive impact on query time for both visualization (retrieval) and event inserts, and would also help in creating aggregates much faster.
    Another possible improvement: the queue could come after the validation/scrubbing service rather than before it. That could save space in the queues and keep them from being overwhelmed, because only validated data gets into the queue and invalid data is discarded. The only case where I see the queue coming first is when we validate a very large batch of payloads at a time; then we can stick with this design, because validation might take some time for extremely large batches.
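
The "validate first, then enqueue" ordering suggested above could look roughly like the following sketch. It uses Python's in-process `queue.Queue` purely as a stand-in for a real message broker; the field names in `REQUIRED_FIELDS` and the helper names `is_valid` and `ingest` are hypothetical and chosen only for illustration.

```python
import queue

# Hypothetical minimal schema: every event must carry these keys.
REQUIRED_FIELDS = {"event_name", "timestamp", "user_id"}


def is_valid(event):
    """Return True if the event is a dict containing all required fields."""
    return isinstance(event, dict) and REQUIRED_FIELDS.issubset(event)


def ingest(events, event_queue):
    """Enqueue only valid events; return how many were rejected."""
    rejected = 0
    for event in events:
        if is_valid(event):
            event_queue.put(event)
        else:
            rejected += 1
    return rejected


q = queue.Queue()
batch = [
    {"event_name": "click", "timestamp": 1700000000, "user_id": "u1"},
    {"event_name": "click"},  # missing fields, will be discarded
]
rejected_count = ingest(batch, q)
```

The trade-off raised in the comment still applies: if validation is slow, doing it before the queue puts that latency on the request path, which is one reason to enqueue first for very large batches.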

    • @implemented2
      @implemented2 2 years ago

      It will be beneficial to use a queue in case clients are the trusted ones and validation is not required or minimal. This applies to internal services, when you are building an infrastructure solution for internal usage within the company.

    • @opencompare
      @opencompare 2 years ago +2

      I would probably write all non-real-time events straight to a data lake with high throughput, and later ingest them using distributed data processing platforms like Spark or Hadoop. Only gold- or silver-state data should be stored in the DB for analytical purposes.

    • @jasonwakeman
      @jasonwakeman 10 months ago

      @opencompare This exactly. A distributed schemaless DB would require hundreds of instances just to handle the writes. Querying this much data would take a lot of CPU and time. So, query the whole thing once per hour/day/etc. and aggregate it into many tables of a relational DB where the aggregates can be queried quickly or cached.
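
The periodic rollup described in this reply can be sketched as below. It is only an illustration under stated assumptions: an in-memory SQLite database stands in for the relational store, raw events are plain `(timestamp, event_name)` tuples, and the table name `hourly_counts` and both function names are made up for the example.

```python
import sqlite3
from collections import Counter


def hourly_rollup(events):
    """Count events per (hour_start, event_name) from (timestamp, name) pairs."""
    counts = Counter()
    for ts, name in events:
        hour_start = ts - (ts % 3600)  # align to the start of the hour
        counts[(hour_start, name)] += 1
    return counts


def store_rollup(conn, counts):
    """Upsert the aggregated counts into a small relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hourly_counts ("
        "hour_start INTEGER, event_name TEXT, n INTEGER, "
        "PRIMARY KEY (hour_start, event_name))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO hourly_counts VALUES (?, ?, ?)",
        [(hour, name, n) for (hour, name), n in counts.items()],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
raw_events = [(10, "click"), (20, "click"), (3700, "view")]
store_rollup(conn, hourly_rollup(raw_events))
rows = conn.execute(
    "SELECT hour_start, event_name, n FROM hourly_counts ORDER BY hour_start"
).fetchall()
```

Dashboards would then read from the small aggregate table instead of scanning raw events. At the scale discussed in the thread, the counting step would run in a batch framework rather than in a single process.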

  • @RM-bg5cd
    @RM-bg5cd 10 months ago +2

    Event Sourcing and projections for visualization would have been amazing here

  • @seenu007
    @seenu007 2 years ago +19

    Not satisfied with the discussion. The scope of the question is not clear. Are we building a system for analytics computed from logging data, or a system that has logging and analytics as separate components?
    The interviewee could have discussed:
    1. The grain of the data sent by the logging system. Is it individual events or aggregated counts?
    2. A database design optimized for analyzing time-series data.
    3. Machine-generated versus user-generated events, and treating those datasets differently down the line.

  • @saip009
    @saip009 1 year ago +14

    Question for people proficient in designing backend systems: is this a good example of an interview or of a design? I personally found this to be ticking checkboxes in an interview. There isn't enough discussion of trade-offs or building towards a solution. This seems like an inconsistent brain dump of a known solution.

    • @jasonwakeman
      @jasonwakeman 10 months ago +1

      I agree. The interviewer did a great job (real interviews wouldn't ask this broad a question, tbh), but I doubt this candidate would make it to the next level. If the question were specifically about real-time user events, the answer might pass, but this is not a valid solution for big data. An actual solution requires many services and multiple databases to aggregate the data for various use cases, not one giant DB that handles all writes and all queries. Storage is cheap, so a solution with a single DB doesn't really make sense for big data/analytics.

    • @konradte
      @konradte 9 months ago +5

      From my experience, this is not a real-world interview. It was only drawing circles and rectangles, without talking about the data model, time-series databases, database schema, failure detection, or monitoring. The candidate would be bombarded with questions right away. This is a "nice" design to draw, but it doesn't take you through an onsite with any serious company.

  • @schan263
    @schan263 1 year ago +6

    This interview is too short, so there is not enough time to talk about some of the details. A design interview should be at least 40 minutes; the candidate only had 21. There's not enough time for a deep dive, and it seemed rushed. There is no time to talk about scaling the individual components. Sampling is not scaling. The interview should be longer so that the candidate can talk about how many servers are needed, how much disk space is required for X years/months, how many requests can be served per second, etc.
    The requirements list is too short. I feel we didn't spend enough time on the requirements.
    How would you determine the level of the candidate based on this interview performance?

  • @gdbroman
    @gdbroman 2 years ago +3

    Hozefa is a beast!!!!

  • @rohitparthasarathy6671
    @rohitparthasarathy6671 5 months ago

    I think for time-series data we should be using an RDBMS with sharding or, even better, generate the graphs from an in-memory DB.

  • @riit1564
    @riit1564 1 year ago +3

    Feedback:
    1. Should have gone into more detail about data storage and how the storage would support faster queries. Some sample queries should have been shown, along with how those queries are served.
    2. No mention of how logs are stored and indexed for faster search.
    3. Didn't justify the use of a queue.

    • @tryexponent
      @tryexponent  1 year ago

      Hey Rahul, thanks for watching and leaving your feedback! Appreciate it!

  • @user-sy8ny3vx6m
    @user-sy8ny3vx6m 3 months ago

    It's surprising that there was no discussion of OLAP storage solutions, since analysing these metrics is the end product.

  • @psychoprincess8920
    @psychoprincess8920 1 month ago

    Small correction at 6:00:
    For a money/banking system, consistency should be prioritized over availability.

  • @amazingabhay
    @amazingabhay 4 months ago

    What does the archiving process look like? How is data moved from the main DB to the archive DB, and by whom? And what would happen to the precomputed visualisation data?

  • @tapanparida3176
    @tapanparida3176 2 years ago +7

    Very good... both interviewer and interviewee did an excellent job... a lot to learn from this video... I have an interview tomorrow with Amazon, hope this helps....

    • @designpathy
      @designpathy 2 years ago

      How did it go? I have mine in the second week of Jan.

    • @eaf207
      @eaf207 2 years ago

      @designpathy Good luck y'all. Can I chat with you? I have one coming up soon.

    • @sachinmalik9574
      @sachinmalik9574 2 years ago

      @designpathy How did yours go?

  • @profkg6613
    @profkg6613 1 year ago +2

    This could be a case for Kafka as the message-processing queue, with an event-driven API in mind.

  • @BuyCarsTVPakistan
    @BuyCarsTVPakistan 1 year ago +2

    Ok good interview with Imran Hashmi :p

  • @t3ntube357
    @t3ntube357 2 years ago

    May I know the name of the tool they used?

  • @karthikr5884
    @karthikr5884 2 years ago

    Nice:)

  • @downshiftturbo8974
    @downshiftturbo8974 24 days ago

    Low latency as an NFR didn't make sense to me. Nothing high-priority or transactional, like money, is involved. This is something passive that will be used later to make business decisions.

  • @sagarchoudhury56
    @sagarchoudhury56 1 year ago +7

    I think this interview will not fly. Lots of flaws.

  • @amitkumarsrivastava9261
    @amitkumarsrivastava9261 1 year ago +2

    A NoSQL DB for time-series data. What a joke!!! Can't believe an FB EM is giving this sort of design.

  • @gufengmsa
    @gufengmsa 7 months ago

    The design is a little superficial. In the context of monitoring systems, the crucial "dive deep" question pertains to data aggregation and the trade-offs between storage capacity and performance.
    Real-world monitoring systems like CloudWatch and Prometheus (push vs. pull) should have been mentioned during the interview as well.

  • @CommanderShepard05
    @CommanderShepard05 2 years ago +2

    Dear team, please provide the name of the tool that the candidate is using to draw the architecture.

  • @arunsatyarth9097
    @arunsatyarth9097 2 years ago +10

    Not in depth at all

  • @davezhang8314
    @davezhang8314 2 years ago +5

    Load balancer is redundant if you're using a queue. Events should be published to the queue right away and available consumers (validation service) will handle events as they become available.

    • @gsb22
      @gsb22 2 years ago +11

      I believe you don't expose the queue directly; it has to sit behind a service that actually pushes the data onto the queue. And since this service needs to scale up and down, we need LBs in front of the front-end servers.

    • @KevindraSingh
      @KevindraSingh 2 years ago +1

      Yup, exposing an implementation detail like the queue directly to the client will hurt the system in the long term when a requirement to modify the design comes up.

    • @dicksonchibuzor7625
      @dicksonchibuzor7625 2 years ago +1

      You shouldn't expose the queue directly to the event payload. It will also help with "load balancing" 😃, especially at the scale the interviewer mentioned (there should definitely be multiple queues).

    • @japanboy31415
      @japanboy31415 2 years ago

      wrong.

    • @jcaliz
      @jcaliz 2 years ago +1

      Queues usually benefit from fast protocols like TCP and UDP (if you don't care about data loss); exposing these protocols to the end user is not safe.

  • @kartech4592
    @kartech4592 1 year ago

    Can a load balancer directly insert to a queue?

    • @rituraj889
      @rituraj889 1 year ago

      Yeah, good point. Isn't an LB by default part of an MQ?
      I mean, the number of partitions or consumers can do the same thing.

  • @neerajkhanna3024
    @neerajkhanna3024 1 year ago +4

    Why was visualization such a big piece of the discussion? The design was about metrics and logging, and it lacked depth. It's a whole blob of logging data coming in; it could be stored in a time-series DB or even an object store like S3, then moved to a DW like Redshift. Why is a NoSQL DB needed in this case?

    • @jasonwakeman
      @jasonwakeman 10 months ago

      Not sure where you get the idea that it is a whole blob. Imagine a web app: you would want to log individual events so that you don't lose any if the browser is closed. S3 would work but is not the best choice: imagine having a Lambda for every single user event that wrote to S3.

  • @AnushkaVijay-cv7tk
    @AnushkaVijay-cv7tk 10 months ago

    Does Meta ask system design questions for the SDE1 role?

    • @tryexponent
      @tryexponent  10 months ago

      Hey AnushkaVijay-cv7tk! Typically SDE1 candidates will not be asked system design questions

  • @OneMillionDollars-tu9ur
    @OneMillionDollars-tu9ur 21 days ago +1

    I am a system design interviewer and a hiring manager and I will probably give him a NO.

  • @iamworstgamer
    @iamworstgamer 2 years ago

    18:32 this question had no clear answer given

  • @HEKTO3
    @HEKTO3 1 year ago

    Not a very successful interview.

  • @jjc5258
    @jjc5258 2 years ago +12

    This interview is gonna fail, bad example.

    • @ramgamery
      @ramgamery 4 months ago

      Why? Can you please explain?

    • @ashwin81088
      @ashwin81088 16 days ago +1

      This interview went well imo. The system he described is what we use in my org. They took 3 years to develop but our boss designed it in 20 mins.

  • @gsb22
    @gsb22 2 years ago +4

    17:10
    WOW. I mean, there were nitpicks before this point, but this is a big NO. The analytics platform HAS to save each and every event no matter what. It doesn't matter if this is being used by 1 user or trillions of users; you have to store each and every event. The response to the scale problem would be to scale out the queue and ingestion service as the number of events increases.

    • @gsriram7
      @gsriram7 2 years ago +2

      @joed5714 Actually, it's not. We rely heavily on sampling to keep up with upstream, and it's standard practice when the event rate is exorbitantly high (like 10000+ B events per second). We have a data pipeline that ingests packet headers from all the routers. A router can process 5 Gbps and there are 1000+ routers, so it is impossible to ingest all those events without sampling. Unless, of course, you provision 10000+ 32-core instances.

    • @nagoorshaik8025
      @nagoorshaik8025 2 years ago +3

      I think whether it is a BIG NO or an absolute YES should be decided by the use case. Depending on the metrics and the purpose for which we collect the data, it might not be necessary to collect metrics from every user. Sampling is an important statistical method that gives the expected results without going through each and every input. I know we have tools, methods, and frameworks to collect every input without a miss, but whether we need to do this has to be decided first; otherwise you are jumping in to solve a problem that doesn't exist.

    • @andrew3
      @andrew3 2 years ago +1

      Sampling is done by all the major tech players for large applications. It is completely valid to suggest.

    • @daryaarbuzova3315
      @daryaarbuzova3315 10 months ago

      Stopped the video at the same timestamp to process what was said :\ Agree that it's a big NO. For example, sampling user conversions for ads analytics is not acceptable.
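
For readers following the sampling debate in this thread, here is one way event sampling is commonly sketched. It is not the approach shown in the video or used by any commenter: hashing a key to decide whether to keep an event, and scaling sampled counts back up by the sample rate, are assumptions made for illustration, as are the names `keep_event` and `estimate_total`.

```python
import hashlib


def keep_event(key, sample_rate):
    """Deterministically decide whether to keep events for a given key.

    The key (for example a user or session ID) is hashed to a 64-bit
    integer, and the event is kept if that integer falls in the lowest
    `sample_rate` fraction of the range. The same key always gets the
    same decision, so a kept user keeps all of their events.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < sample_rate * 2**64


def estimate_total(sampled_count, sample_rate):
    """Scale a count observed in the sample back up to an estimated total."""
    return sampled_count / sample_rate


# Hypothetical usage: keep roughly a quarter of users.
kept = [uid for uid in ("u1", "u2", "u3", "u4") if keep_event(uid, 0.25)]
```

This fits aggregate metrics where an estimate is acceptable. As the replies point out, it does not fit cases like ad conversions or billing, where every event must be stored.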

  • @wuaaron662
    @wuaaron662 2 years ago +7

    uhm...uhm...uhm...uhm...uhm...
    ....................................................is it how it works in real system???????

    • @tsjoshi
      @tsjoshi 2 years ago

      "What if..." that's what happens all the time in reality.

  • @craigslist1323
    @craigslist1323 1 year ago +3

    How is this guy a manager? Probably will fail intern interviews.

  • @yashmishra3900
    @yashmishra3900 2 years ago +3

    Who is the interviewer? Please include her LinkedIn ID.