No one should start building with scale in mind; it's called over-engineering, and 99.99% of the time you will redo it within a few months. 95% of developers/companies are not doing large-scale software development. They don't need super-duper setups, and neither should they buy into serverless/microservices marketing gimmicks. Devs just need to write better, more optimized code and spend more time learning about networks and databases.
If you are starting out, you will be better off building a monolith with an off-the-shelf CSS theme and a SaaS database. You don’t need to worry about scaling, CI/CD, or even downtime. You need to validate your hypothesis, not worry about imaginary things. If you have the time and the knowledge, or if you want to use the occasion to learn… then why not… otherwise you are making it harder for yourself to succeed.
There are a lot of factors that come into play. What’s the problem, what’s the likelihood of needing to scale for increasing workload or larger code base/more developers? What would it look like to put off designing for scale? Is there an approach that lets us put off this decision until we have a better understanding, or that puts the system in a better position to evolve if it needs to? When planning these things it can help to do some level of POC. Over engineering exists, but this is just another factor that should be considered when analyzing the problem.
Also, scale is such an overloaded concept. What is scale? RPS? Hundreds of millions of rows of data being processed? People always forget that computers and databases are ridiculously fast. Your application with a couple thousand users does not have scale and will likely be fine on a single instance.
Wrong, building with a microservices architecture is not over-engineering and is far from premature optimization. If you build with scale in mind you don’t have to decide what exact architecture to use if you do it well.
The codebase I work with at work was originally built with this mentality: that it'll never get big or complicated. Now the source archive is bigger than Windows' and Linux's combined. We're still trying to migrate it and our workflows to Git.
I agree that you should build things with scale in mind, however for side projects a lot of the cloud "pieces" in your diagram are cost-prohibitive. I think it's perfectly acceptable to stand up a tiny vps and deploy your software that way (either via docker or whatever) and tackle migrating to more scalable solutions as needed. The key here is knowing what to avoid (like storing stuff in memory, or relying on the local filesystem) and migrating shouldn't be very difficult.
I think we're mostly in agreement here - an important part is to avoid making the backend stateful. I'd also add that the database schema should be designed with scale in mind - perhaps the trickiest and most overlooked part. Regarding cost - I think ECS, Elastic Beanstalk are pretty cost efficient at small scale. Lambda and S3 are even more so.
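To make the "avoid a stateful backend" point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the names `SessionStore`, `InMemoryStore`, and `handle_add_to_cart` are made up, and the dict-backed store only stands in for an external service such as Redis or a database.

```python
from typing import Protocol


class SessionStore(Protocol):
    """Hypothetical interface for state that lives outside the app process."""

    def get(self, key: str) -> list[str]: ...
    def put(self, key: str, value: list[str]) -> None: ...


class InMemoryStore:
    """Demo stand-in for an external store (Redis, a database, ...)."""

    def __init__(self) -> None:
        self._data: dict[str, list[str]] = {}

    def get(self, key: str) -> list[str]:
        return list(self._data.get(key, []))

    def put(self, key: str, value: list[str]) -> None:
        self._data[key] = list(value)


def handle_add_to_cart(store: SessionStore, user_id: str, item: str) -> list[str]:
    """Stateless handler: all state arrives via arguments or the shared store."""
    cart = store.get(user_id)
    cart.append(item)
    store.put(user_id, cart)
    return cart


# Two "instances" behind a load balancer are just two calls sharing one store.
shared = InMemoryStore()
handle_add_to_cart(shared, "user-1", "book")           # served by instance A
cart = handle_add_to_cart(shared, "user-1", "laptop")  # served by instance B
```

Because the handler keeps nothing in process memory between calls, any instance can serve any request, which is what makes adding or removing instances safe.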
Local filesystem also has its place. I assume you mean to not rely on it for durability, and that is absolutely true. It doesn't hurt to use it as a cache for some workloads though.
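A rough Python sketch of that idea, treating local disk purely as a read-through cache. The function names and the fake "durable store" are hypothetical; in practice the fetch function would call S3, a database, or similar.

```python
import tempfile
from pathlib import Path
from typing import Callable


def cached_fetch(cache_dir: Path, key: str, fetch: Callable[[str], bytes]) -> bytes:
    """Read-through cache: local disk is a shortcut, never the source of truth."""
    path = cache_dir / key
    if path.exists():
        return path.read_bytes()   # cache hit: no call to the durable store
    data = fetch(key)              # cache miss: go to the durable store
    path.write_bytes(data)         # best-effort copy that is safe to lose
    return data


calls: list[str] = []


def fetch_from_durable_store(key: str) -> bytes:
    """Hypothetical stand-in for S3, a database, etc."""
    calls.append(key)
    return f"contents of {key}".encode()


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    first = cached_fetch(cache, "report.txt", fetch_from_durable_store)
    second = cached_fetch(cache, "report.txt", fetch_from_durable_store)
```

If the instance is replaced and the cache directory disappears, the only cost is one extra trip to the durable store.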
A lot of the things I build are administration panels and tooling for corp users, with SSO and generally no demand for large scaling, we know demand is
My experience is similar. We have some servers that are load-balanced but the vast majority of projects are hosted on a 2vCore, 4GB RAM instance. Most of the businesses need an internal system to automate some processes. YouTubers always make it seem like every project is the next Netflix or something
Great comment. Similar experience. Simple Stack will always win until it doesn't. Then you will need to solve the next problem but there is no need to solve a problem that is not there and probably will not be there.
I used to manage a Kubernetes cluster on AWS via EKS. But ngl, it was pretty rough. Don't get me wrong, I love containerized workloads and prefer it over "serverless", but K8s in AWS comes with a ton of overhead. Also, something to consider is that EKS and ECS have their own version of the cold start problem. It's not as granular as Lambda, but if you don't have enough spare capacity to handle a spike in traffic, you have to wait for additional underlying EC2 or Fargate instances to spin up, which can take minutes, whereas Lambda can spin up new instances in seconds.
Very good point - for the reasons you mentioned I think serverless might be the king in scenarios where traffic is extremely spiky and unpredictable. This can be mitigated somewhat by configuring your ECS scaling to be more proactive, i.e. scale up long before that extra capacity is needed, but that means extra cost. Plus, given a sufficient level of unpredictability, it might also be ineffective. I personally haven't worked with applications fielding these sorts of traffic patterns, but I'm sure they do exist.
Totally agree. I work on AKS (Azure's EKS equivalent) and it also has a ton of other problems and overheads. All of these rely on underlying VMs (EC2s), which take time to spin up and get added to the node pool. These underlying VMs sometimes get into a bad state. Clusters get into a bad state, etc. Also, if we use Kubernetes, we have to manage things like mTLS for pod-to-pod communication if we need encryption in transit, whereas with Lambda functions it's handled by the cloud provider. But I agree that they are better than the nightmare of Lambda functions calling other functions.
Lambda is always going to be vastly more responsive to scaling than EKS, but it's worth noting that you can rectify a lot of these scaling concerns with Karpenter, which afaik has first-class support in EKS (because they built it for exactly this reason)
FWIW at my work we compared vitess and citus and ended up going with citus. Vitess is basically still mysql 5.7 in terms of features while citus is generally 2 weeks behind major postgres versions and supports all features. We also needed a SLA that hosting vitess ourselves inside k8s couldn't give us, but azure cosmos for postgres (which is just citus) could. You can also start at 1 node and scale out later, which is useful for some of our smaller datacenters. We couldn't use planetscale because of data residency requirements of some countries. We also didn't want to have to get a lot of people to learn vitess, which has a more complicated cli. Citus would be harder to host entirely inside k8s if you wanted to though. I think there's a k8s operator but it's 3rd party commercial.
interesting, this is a great anecdote for those thinking about scaling an SQL database. I hadn't actually heard of Citus, thanks for putting it on my radar!
Love the video. You work through architectures step by step with clear reasoning of trade offs. Will definitely be recommending your video to anyone looking for an introduction to systems architecture. A+
Great video! Would've loved to see one last scenario, the enterprise-scale multi-region deployment. Not that any individual developer would use or need it for their own projects, but many of the devs in your audience work/want to work for some of these larger orgs, and it would be great to see what this "mega architecture" would look like from your perspective
Building for scale definitely takes way longer. If you are using serverless, suddenly you have to worry about IAM and configuring each individual lambda and passing messages around through SQS, and in general your whole system is now asynchronous, which comes with its own set of problems and requires specific event-driven patterns to manage, which adds complexity and time. Not to mention this setup makes debugging more difficult and local development isn't possible (yes, I know about LocalStack, but it's unreliable and limited). My approach is the same way I approach programming: KISS. Just make a single container for your application and address scaling issues when they arise, if any. I can't tell you how many times I've worked on a serverless project that is just for some internal tool with fewer than 1000 users.
A small nitpick: what you've called Docker containers are usually OCI containers and don't require Docker runtime to operate. I.e. in k8s there's no Docker runtime or you can run a container built in Docker in Podman.
I wish you would've talked about vendor lock-ins as well. I'm still a bit hesitant and still drawn to images/containers rather than server-less functions for that particular reason. Or am I missing something?
My philosophy for the jump to serverless is that I need every piece of the pie to be self-hostable. That way I know that even if it's a pain, I always have the door open to leave. I'll tell you in 10 years how this theory plays out
I usually pipe the lambdas written in Node.js through a REST API gateway. Consolidating them into a monolithic Express app for containers isn’t a huge effort if I ever decide to go that route, but at present it seems unlikely. One of my clients works on an extremely small budget with low traffic, and for them Lambda is a godsend due to the minuscule operational and maintenance costs - the cold start time with the lambdas makes it less snappy but they are happy with the overall performance.
Keep everything containerized and you'll always have an exit path. Cloud providers actually don't want their customers to feel trapped. You make more money from customers that actually like working with you, not captives.
I admit that the "cloud" and "microservices" look like shiny toys that I need to apply to all of my problems at first, but the simplicity of a modular monolith wins in 99% of my projects. You scale and transform as needed, not on day 1.
you don't need the cloud or microservices to make your application scalable! your application just needs to be stateless (that's the easy part) and think about how to design your database schema such that it can scale (that's the harder part). I personally think monoliths are completely fine in many cases. Re: cloud though, it's hard to argue against the cloud when things are so inexpensive (or completely free)
Continuous integration is actually just pushing changes to trunk/master/main frequently. For example, if you're using feature branches that last longer than about a day, you're not using CI regardless of how the software is built.
I think scaling the database is the deepest subject here. Sharding can absolutely be a pain. There are actually some options for relational (SQL) DBs that scale out; but it is quite advanced tech so there aren't a ton of good options. It's also hard to know exactly what you're getting from a scale-out RDBMS if you don't already know the relevant concepts like consensus, replication, transactionality, etc. TiDB is one of the most noteworthy open source examples I've come across. It's built on top of a distributed KV store called TiKV. It's written in Rust and Go. The authors maintain the most mature implementation of RAFT consensus in the Rust ecosystem that I'm aware of. Under the hood it's using RocksDB as an embedded KV store for the replicated state machine. It just seems very smartly designed to me, so I hope I get to actually use it one of these days. It would be my first choice for a SQL DB that I need to scale.
My only concern about TiKV/TiDB is the latency in distributed scenarios, and I don't think we have any benchmarks. The only one I managed to find was for single-node instances. For now, I feel like the only serious self-hosted solution (for me) would be Vitess + MySQL (MyRocks)
7:00 - cockroach db is from the cloud spanner guys. it's a beast. it has a serverless offering. use that basically always unless you need something specific.
@codetothemoon Netflix. The biggest CockroachDB cluster at Netflix is a 60-node, single-region cluster with 26.5 terabytes of data and they have 100 production clusters running atm.
One Multi-master concession can be eventual consistency. Instead of fully atomic transactions, you get pseudo atomic on the node, then other nodes have eventual consistency guarantees. Meaning you have eventual consistency of data reads. You would manage writes and synchronization with something like zookeeper.
yeah I definitely see what they were going for, but I do suspect the visceral reaction most probably have from the name offsets the sense of resilience that they are actually trying to convey. branding is hard...
Key takeaway: be prepared to scale, learn how to scale an architecture because it's useful, avoid bloating your architecture with microservices and serverless functions, and acknowledge the fine line between monolithic and modular architecture for each application
my dev group didn’t plan for scale. our db was json because that’s what we were most comfortable with (we’re amateurs). our api was all one file because we were just trying to get it done. now we have 200k users a week and i’m desperately trying to rewrite the backend.
@codetothemoon I would argue the opposite. They went the quick route and validated their ideas in the market. This trumps any ideal architecture. Now they have the luxury of having enough users to be forced to care about scale. Thinking too much about scale early would make them lose steam, time, energy, and money while not getting to market. Ideally you do both, but realistically there's always overhead in thinking about scale early.
Schema is not the only reason you use a SQL database. What you really want is transaction consistency. If you use a key-value store, you will have to do that yourself in the application. That is hard and if you are able to implement in the application, it would probably be much slower than the database doing it since the database engineers (the people who build the database software) are skilled in that area and also there is a lot of effort put in just for that.
transaction consistency and the broader "distributed systems" concept of consistency (I think you are referring more to the former though) are actually both supported by many NoSQL databases
All of the cost factors could be thrown out of the window if you just host the entire backend stack yourself (Kubernetes, databases, APIs, etc). Once you set up your own private "backend as a service", every single future project can be easily integrated by either adding another container, or adding another node to the cluster. The nodes are pretty cheap as well and you can get second-hand mini desktops for less than the price of a Raspberry Pi with 10x the performance.
I agree with most of the content. The only addition I would do is the use of a queue (usually Kafka) to decouple between nodes, which seems to be very popular.
I would just throw in that if you ARE using Kubernetes, you can add an autoscaler to your containers (Deployment or StatefulSet) to get a lot of the same benefit of serverless, without having to redesign your app
That's a great breakdown! Love your presentation style. I think I've encountered every pattern already over the course of my career ;-) Those diagrams look very elegant. Glad you had the URL of the site in there. What didn't really work out was using an ORM for talking to the database. I was working for a pretty big e-commerce shop in Germany, and when you don't have control over your own queries, bad things happen to your database (N+1 cascading queries). At another job I had to work with Cassandra. It was OK in the beginning, but I was really happy once we migrated to Postgres. The pain of not being able to query the data more flexibly was just too high. So I'd always prefer a relational DB over NoSQL as long as the dataset fits in there.
thank you, really happy you liked it. Agree with your perspective on ORMs. They seem great on paper but in practice they tend to cause issues - to add to what you mentioned, I think there's a lot of value making your queries explicit such that you have fine grained control over them. Also because they are explicit you can just copy and paste them directly into a SQL tool and iterate on them until you get exactly what you're looking for. Then copy back to the code, replace tokens with bindings, etc. Definitely understand the relief in going back to SQL after being on Cassandra. Hard to go without that power unless you absolutely have to 😎
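For anyone who hasn't run into the N+1 problem mentioned above, here is a small self-contained Python sketch using the standard `sqlite3` module. The tables, data, and the `run` helper are invented for illustration; the first function mimics what a lazily loading ORM tends to generate, the second is the explicit query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE items  (id INTEGER PRIMARY KEY, order_id INTEGER, name TEXT);
    INSERT INTO orders VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO items  VALUES (1, 1, 'pen'), (2, 1, 'ink'), (3, 2, 'cup'), (4, 3, 'mat');
""")

query_log: list[str] = []


def run(sql: str, params: tuple = ()) -> list[tuple]:
    """Execute a query and log it so the two approaches can be compared."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


def items_n_plus_one() -> dict[str, list[str]]:
    """One query for the orders, then one more query per order."""
    result: dict[str, list[str]] = {}
    for order_id, customer in run("SELECT id, customer FROM orders ORDER BY id"):
        rows = run("SELECT name FROM items WHERE order_id = ? ORDER BY id", (order_id,))
        result[customer] = [name for (name,) in rows]
    return result


def items_single_join() -> dict[str, list[str]]:
    """One explicit JOIN, no matter how many orders exist."""
    result: dict[str, list[str]] = {}
    rows = run(
        "SELECT o.customer, i.name FROM orders o "
        "JOIN items i ON i.order_id = o.id ORDER BY o.id, i.id"
    )
    for customer, name in rows:
        result.setdefault(customer, []).append(name)
    return result


slow = items_n_plus_one()
n_plus_one_queries = len(query_log)   # 1 query for orders + 3 for items

query_log.clear()
fast = items_single_join()
join_queries = len(query_log)         # a single query
```

With three orders the difference is 4 queries versus 1; with 10,000 orders it is 10,001 versus 1, which is where the database starts to hurt.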
I'm thinking of a mix of containers and serverless. Well it's an Instagram clone and thinking of using serverless for the image stuff like processing, storing. Bad idea?
I think this is a great approach - assuming the image stuff is async such that the user's browser won't be actively waiting for the results. I think serverless is a great fit for those sorts of things, while serving all the APIs consumed by the browser in the containers to reduce latency
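A toy Python sketch of that split, assuming the image work is asynchronous. The in-process `queue.Queue` and worker thread are only stand-ins for a managed queue and a serverless consumer, and all names here are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()               # stand-in for SQS or a similar managed queue
thumbnails: dict[str, str] = {}    # stand-in for wherever results get stored


def worker() -> None:
    """Stand-in for the serverless consumer that does the slow image work."""
    while True:
        image_id = jobs.get()
        if image_id is None:       # sentinel: shut the worker down
            jobs.task_done()
            break
        thumbnails[image_id] = f"thumb-{image_id}.jpg"   # pretend to resize
        jobs.task_done()


def handle_upload(image_id: str) -> dict[str, str]:
    """API handler in the container: enqueue the job and return immediately."""
    jobs.put(image_id)
    return {"status": "accepted", "image_id": image_id}


t = threading.Thread(target=worker, daemon=True)
t.start()
response = handle_upload("img-42")
jobs.join()      # only this demo waits; a real client would poll or be notified
jobs.put(None)
t.join()
```

The browser-facing handler never waits on the image processing, so its latency stays low even when the consumer is cold or busy.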
please show more of the glove80 keyboard. I only come here to see you're still using it. Doesn't have to be crazy...just like 1 more degree of tilt. kthanks bye.
funny, I was using it during filming but the camera angle on this one happened to cut off my desk and most of the keyboard (I think that's what you're referring to). I'll make sure it's more in frame next time :)
the hardest part is scaling the db, for that reason i use a mix of cockroach/postgres, scylladb and redis all at the same time depending on what my data needs are.
I feel like you could even skip using a proper database and just stand up a small sqlite database as proxy for a potential later solution. I just updated a minecraft plugin and I have no users and hooked up a mysql database that has absolutely terrible latency and would've been better off just doing the local sqlite (but it's also made me realise I should be doing the db updates async instead of sync)
potentially. but if you're planning for scalability, you'd just need to think about designing your schema for scale right from the start. then switching from sqlite to another database theoretically shouldn't be too bad.
There is no problem scaling once it's needed, if while developing your system, you are always aware of the two most important things in software development: coupling and cohesion.
Not sure I agree with “never use lambdas that call each other”, that really depends on your traffic pattern. We have a main GraphQL lambda that handles most requests, so it’s almost always “warm” and it infrequently invokes other lambdas for document and report generation, which call back to the GraphQL lambda to query the data they need to generate the documents. It works quite well and has never had a cascading cold start issue
Super video. I like your watered-down approach to building with scale in mind. 99% of projects will never make it to the scale-up stage, and that’s okay, but being aware of the trappings of your architecture is important; I’ve gone through the process of moving from serverless to containers and it was a pain but very worth it as the user base grew. Nice to see a shout out for EBS, which is a lovely way to step into containers. Another place serverless is seen a lot is with ML runtimes (which tend to be pretty expensive)… I’m curious, have you come across a good architecture for these, particularly with caching?
for me the best way to scale the system is to tune its performance. scaling the server is a linear gain, but tuning system performance can be an exponential gain, e.g. going from O(n) to O(log n)
great point - I should have touched on it in the video. the horizontal scaling techniques discussed are not an alternative for optimizing the time complexity of your business logic. We assume your application logic is optimized already, and I should have called this out explicitly.
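As a small illustration of the O(n) to O(log n) point, here is a Python sketch comparing a linear scan with a binary search over sorted data. The data set and function names are made up; the only library call is `bisect_left` from the standard library.

```python
from bisect import bisect_left


def linear_contains(sorted_ids: list[int], target: int) -> tuple[bool, int]:
    """O(n): scan element by element, counting how many items were visited."""
    steps = 0
    for value in sorted_ids:
        steps += 1
        if value == target:
            return True, steps
    return False, steps


def binary_contains(sorted_ids: list[int], target: int) -> bool:
    """O(log n): bisect halves the remaining search range on every step."""
    i = bisect_left(sorted_ids, target)
    return i < len(sorted_ids) and sorted_ids[i] == target


ids = list(range(0, 2_000_000, 2))   # one million sorted even numbers
found_linear, steps = linear_contains(ids, 1_999_998)
found_binary = binary_contains(ids, 1_999_998)
```

The linear scan visits all one million entries to find the last one, while the binary search needs roughly 20 comparisons. No amount of extra servers buys that kind of improvement for a single request.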
I suggest that if you are using serverless functions to fetch data from a SQL database, you use a proxy like PgBouncer for Postgres. Otherwise the connection count will not decrease even after a function is done (connections may linger for up to 15 mins) and you will quickly reach the connection limit.
where does something like pocketbase or a SQLite server fit into all this? I've found SQLite on one machine is super simple and easy to set up, rather than having a load balancer or multiple db servers.
SQLite is fantastic for getting started, but it may be very difficult to make it a long term scalable solution. to be fault tolerant and horizontally scalable, your application hosts can't be used as durable data stores as you may need to add and remove them as your traffic volume fluctuates. I think there is some narrative around replication with SQLite, but that seems a bit like going against the grain.
they can definitely help, and I've used them when cornered by cold start latency. But it always felt like a band aid to cover up the fact that we really should have gone with ECS for latency sensitive use cases.
Thank you for the brevity and clarity with which you share this information. Brevity and clarity are rare these days. I don't do development but I design complex systems, and there's just too much vague theoretical BS out there on this. Please make more such videos on system design or consider monetizing your gift through a Udemy course.
Going multi-region unlocks another layer of complexity. Often multi-master isn't enough, and you need to go full CQRS or something to segregate reads and writes while simultaneously propagating event logs across regions.
There are serverless functions now without much of a cold start, at least for JavaScript code, which is run in custom JS runtimes that don't take too long to spin up (but not all APIs / libraries are supported)
I think they're called edge runtimes (separate from the concept of an edge location). I don't think AWS supports these yet. I think the big one that does is Cloudflare Workers. I am just going off the top of my head, I could be getting some details wrong, but it's something to look into
Very good explanation for the most common patterns in use nowadays. And the cautionary section on serverless is excellent. I haven't used them, and I won't 😀
The problem is actually not over-design. It's the promised services, which are usually not up to par. For example, Kubernetes offerings that do not run on bare metal usually have much lower performance, while the bare-metal ones cost far more than maintaining the machines ourselves. So the real priority is always to make sure that our first design conforms to the goals of the clients who will be using it. If it's small, then a simple monolith is fine, but if we have clients on standby with a lot of users, then focusing on scaling first is a good idea.
Great video. We run a monolithic architecture here, and it's largely the same setup as you've shown - we use HAProxy+ProxySQL in front of our application and database stacks (Percona XtraDB Cluster). Long shot but is there any chance you'd share those Architecture diagrams?
thank you! I have no problem sharing the diagrams themselves, but I don't think I can do so without sharing my personal email address. Here's the eraser.io code for my favorite architecture though, this should get you pretty close to anything you'd like to replicate from the video!
// Define groups and nodes
User {
  Web Browser [icon: browser]
  Person [icon: user]
}
Application {
  CDN [icon: aws-cloudfront]
  Frontend {
    HTML [icon: code]
    JavaScript [icon: javascript]
  }
  Backend {
    Container Platform [icon: aws-ecs] {
      Load Balancer [icon: aws-elastic-load-balancing]
      Instance 1 [icon: aws-ec2]
      Instance 2 [icon: aws-ec2]
      Instance 3 [icon: aws-ec2]
    }
    Database Cluster [icon: database] {
      Node 1 [icon: database]
      Node 2 [icon: database]
      Node 3 [icon: database]
      Node 4 [icon: database]
    }
  }
}
// Define connections
Person > Web Browser > CDN > Frontend > Load Balancer > Instance 1 > Database Cluster
Load Balancer > Instance 2 > Database Cluster
Load Balancer > Instance 3 > Database Cluster
for me yes - the services you've cited are kind of my go-tos just because i'm so used to AWS. But I realize other cloud providers have equivalents of these that are probably viable options as well.
What do you think of serverless based on v8 isolates such as cloudflare workers? There are some limitations and security concerns, but if you can deal with the limitations and aren't handling super sensitive data, it solves a lot of the cost/cold start issues of microvm/firecracker based serverless. Do you know why AWS doesn't have such an offering yet?
Thank you! I haven't personally tried using Go for lambdas, but typically smaller bundle size means preferable cold start times, so I suspect it's got something to do with that
You didn't mention that lambda to lambda requests mean you're being charged for both lambdas at the same time, since the first one is still running while waiting for the second... Another issue is that you would need to set up the lambda to run within VPC peering, otherwise invoking another lambda will cause another trip through the Internet. But if you have to, it may be worth looking into Step Functions.
that's a great point as well, thanks for pointing it out! I think you can have the one lambda invoke the other without going through the internet simply by using the AWS SDK to do the invocation - though I could be wrong about that. Also, Step Functions are great! Definitely a much better option than having Lambdas call each other directly - assuming it's for non-realtime operations of course.
Designing for scale on side projects or small ones is a good habit for when you need to scale. You also can't know if your small project might become big later on. I disagree with "don't allow serverless to invoke serverless" though - offloading large dependencies to their own lambda and invoking that only when needed can help a ton in keeping normal start times down. You do need to be careful to not invoke more from there
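The double-charging point can be shown with a bit of arithmetic. This Python sketch assumes billing is proportional to running time and ignores per-invocation, queue, and orchestration charges as well as cold starts; the durations and function names are made up.

```python
def billed_ms_sync_chain(own_work_ms: list[int]) -> int:
    """Total billed duration when each function synchronously invokes the next.

    own_work_ms[i] is the time function i spends on its own work. Function i
    also keeps running (and being billed) while every function after it runs.
    """
    total = 0
    downstream = 0
    for work in reversed(own_work_ms):
        duration = work + downstream   # own work plus time spent waiting
        total += duration
        downstream = duration
    return total


def billed_ms_decoupled(own_work_ms: list[int]) -> int:
    """Total billed duration if each step hands off its work and exits."""
    return sum(own_work_ms)


two_step_sync = billed_ms_sync_chain([100, 300])         # caller waits on callee
two_step_decoupled = billed_ms_decoupled([100, 300])
three_step_sync = billed_ms_sync_chain([100, 300, 200])
```

With a caller that does 100 ms of its own work and a callee that does 300 ms, the synchronous chain bills 700 ms in total versus 400 ms when the steps are decoupled, and the gap widens with every extra hop.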
Isn’t the problem of “your data should fit in a single instance with sql” solved by using NAS/ SAN? Yea the processing is still on a multi core single node machine but storage shouldn’t be a concern, no?
it's a great question. I think the issue with that approach is that at some point the storage layer would become a performance bottleneck. That or the compute power of the machine (even if you used the largest one available). Also, the single machine would be a single point of failure.
I'm not familiar with Celery so this answer might not be super helpful, but for async processing, I like to use something like AWS SNS/SQS, with a lambda consuming and processing messages in the SQS queue. This setup provides some nice elasticity. sorry if I'm misunderstanding the use case :)
@codetothemoon thanks for the reply, I have a question regarding this too. Imagine I have set up Amazon SQS with my API; which auto scaling process should I follow? My code is a bit too large to put on Lambda, so should I use some other way to deploy it?
What he said about scalability in mind is just a bunch of tools with a catchy name and some over-the-top technical buzzwords that when the finance team asks "Why is our tech-infra cost twice the amount of the revenue," He will answer with "This is just the beginning, I'm about to unleash another service that will do nothing to the stakeholder, but still cool."
there have been several comments citing cost as a concern, which is surprising to me. The approaches outlined in this video can be extremely inexpensive, and in many cases nearly free
@codetothemoon Since it's about scaling, my comment is also from a cost-scaling perspective. Yes, it's cheaper (near free) at some stages, but at certain stages the cost is unpayable, therefore it's not scalable. Let's say we have 10k DAU, with a cost of 1 dollar for every active user. When using a serverless approach, by the time we reach 100k DAU, the cost will still be 1 dollar for every active user, which means we're not scaling correctly, we're just providing more power to the system. Even worse, the cost will be higher than 1 dollar and some of us will get laid off.
@gigiperih Gotcha - I think it's true that the cost goes up linearly with DAU, but the slope is much shallower than the numbers you are giving here. Unless your application is ultra computationally intensive, you should be looking at *a fraction of a penny* per DAU, not $1. That said, what do you see as the best (more cost effective) alternative to the approaches laid out in the video?
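To put numbers on the two positions in this exchange, here is a small Python sketch of a purely linear cost model. The $1 figure comes from the comment above; the half-cent figure is a hypothetical value picked only to represent "a fraction of a penny" and is not real pricing.

```python
from fractions import Fraction


def total_cost(dau: int, cost_per_dau: Fraction) -> Fraction:
    """Linear cost model: total cost = DAU x cost per active user (in dollars)."""
    return dau * cost_per_dau


dollar_per_user = Fraction(1)            # figure quoted in the thread
half_cent_per_user = Fraction(1, 200)    # hypothetical "fraction of a penny"

costs = {
    dau: (total_cost(dau, dollar_per_user), total_cost(dau, half_cent_per_user))
    for dau in (10_000, 100_000)
}
```

Both curves are linear, so the per-user cost stays flat as DAU grows, which is the first commenter's complaint. What differs is the slope: under these assumptions 100k DAU costs $100,000 at $1 per user but $500 at half a cent per user.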
@et379 yeah I'm skeptical of that as well. for most applications costs should rise very modestly with scale when using T* EC2 instances or Lambda. That said, what do you see as a better alternative solution?
thanks - i've heard great things about railway but haven't tried it myself yet. what do you like/dislike most about it? would you recommend it over AWS?
Good point - maybe this is personal bias but I try to stay away from caching layers for which state needs to be managed in my application business logic. Totally open to using Redis as a durable data store, but in cases where that’s not possible, a CDN seems like it will handle much of the caching requirements. That said I can see there being some cases where bespoke caching logic would be preferable, ie cache entry A can be leveraged for both request A and request B, but a CDN would have no way of knowing that.
@codetothemoon another use case would be blocking unwanted JWTs that have not yet expired across a set of services. Another one is to use it for session storage so any instance of a service can pick up the request.
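A minimal Python sketch of the token-blocking idea. The class and method names are hypothetical, and the dict stands in for a shared store such as Redis, where the natural equivalent would be a key with an expiry matching the token's remaining lifetime.

```python
import time
from typing import Optional


class TokenDenylist:
    """Sketch of a revocation list; the dict stands in for a shared store."""

    def __init__(self) -> None:
        self._revoked: dict[str, float] = {}   # token id (jti) -> token expiry

    def revoke(self, jti: str, expires_at: float) -> None:
        # Keep the entry only until the token would have expired anyway.
        self._revoked[jti] = expires_at

    def is_revoked(self, jti: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        expires_at = self._revoked.get(jti)
        if expires_at is None:
            return False
        if expires_at <= now:
            del self._revoked[jti]   # expired tokens get rejected by normal validation
            return False
        return True


denylist = TokenDenylist()
denylist.revoke("token-123", expires_at=1_000.0)
blocked_before_expiry = denylist.is_revoked("token-123", now=500.0)
blocked_after_expiry = denylist.is_revoked("token-123", now=1_500.0)
```

Because every service instance consults the same store, a token revoked through one instance is rejected by all of them, and entries clean themselves up once the token's own expiry has passed.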
Depends on the use case, but typically a server based approach (ECS, EKS, etc). Consolidating the layers of Lambdas can also be considered but that may just be kicking the can down the road
@codetothemoon but what if you want to stay in serverless/lambda land? Consolidation feels a bit anti-pattern-y, wondering if there are better solutions. I keep thinking, but the only thing that comes to mind is warming up the lambdas
My setup: EC2 + nginx 😅 The EC2 instance is a t2.micro and RAM usage is 98%, but I don't care. Unless and until I am designing an end-user-facing product, I tolerate response times of up to 10 seconds 😅
Nice! I used to run on EC2 but it kept crashing because of the memory. So I moved to App Runner (dockerized). Now it's a managed service like Beanstalk, only like £3 a month, and comes with auto scaling, no memory worries, it's just amazing! Slow deployments but it's worth the price
I really enjoyed this video. But I do disagree with you about “always building your application to be scalable” for a few different reasons. First, I think that an application’s architecture and scalability should be designed with the needs of that application in mind. If I only need my basic calculator web app to host a maximum of 5 users at a time, then I can just host one backend node. If I need my package tracking application for example, UPS, to handle thousands of packages being scanned every second, and varying throughout the days and busy seasons, then the design will be much different. Design for current needs, not future potential ones. It’s like buying a 12 ft fishing pole to fish for lake trout. Second, high level engineering like this costs money. Engineer’s time is valuable and costly. Why spend so much time ensuring we have the most optimized data cluster when we have more important features to implement and bugs to fix? Third, it’s impossible to design a system that is perfect at scaling for your specific case. In any case, you will discover specific bottlenecks you didn’t know would occur, latencies, etc. and will have to make changes as you scale. In summary, design your system for what you need it to do. Don’t waste precious time and money on low impact outcomes. You will have to make changes along the way, it won’t be perfect from the beginning.
In my mind there is no such thing as microservices anymore, only planetary system. You start with a monolith(Sun) then you can break it down into services(planets) and so on(satellites).
nice! funny, i originally started learning Rust ~2 years ago specifically as a means of mitigating cold start issues. What I didn't look into is how the crate size affects cold start time - have you had any learnings there or does it seem imperceptible pretty much no matter what?
@@codetothemoon As long as you use one lambda per route, you usually don't have to worry about binary size. But it is true that you have to keep the number of dependencies low. I always check each crate to see if it lets me disable default features and only build what I need. A couple of months ago I built a "lambdalith" that handled all the routes and it was definitely slower than normal Rust lambdas but still faster than Python ones
@@codetothemoon cold starts do not scale linearly with package size. There seems to be some weirdness under the hood with Firecracker and the package size, as testing within my company showed that in some scenarios, a slightly larger package can have an improved cold start. Ofc if you're maxing out at 50MB zipped then it's a different story
@@codetothemoon pinnacle is hard to say. Maybe kernel dev, who knows. But web apps are a mess. An entirely over engineered industry making things that people don't really want in the first place. Just make native stuff so people can be happy instead of waiting on yet another slow ass web2.0 mistake
@@JorgetePanete I’m a TC-39 Delegate, so I might be biased, but JS has a place in the client. Though probably in a much reduced role to how it is used today.
@@AnthonyBullard My view on javascript/python is that they can be great for really small pieces of code, otherwise you need the goods from statically typed compiled languages
This video had a lot of useful info, but not a cohesive narrative frame. The first question you asked was never answered. (Why _is_ the first slide a bad architecture? You didn't tell us. Also, why are you using it in every project if it's so bad? Also, also, what's the trap of using this totally normal architecture? Is the trap that we don't always use it? Cause if so, you didn't frame it very well in the opening paragraphs.) You also claimed that every toy project should include scaling simply because you wouldn't have to learn it more than once. That's bananas. Your recommendation lacks a basic cost/benefit analysis. Why would I add more work to a project if I can't articulate a clear benefit? Also (and least important), it would be really helpful if you described the key parts of each slide before using the word "this" to refer to it. I could have understood your key points while driving my car, but you expected me to read it.
The part on serverless is horrible: you described the benefit of this approach as insignificant and the downside as just a wrong way of using the architecture. I could just as easily dismiss your “best” architecture. Overall, pretty weak.
@@codetothemoon Well, I have been using Lambdas in production since they were still in beta. There is no 'particular' use case for serverless. If you are a small company, serverless is a good way to go. You do not need to care about security patching of your EC2 instances or about scaling groups, and, as you mentioned, you pay as you go (we are running a $15M+ company, and I do not really see Lambda in our bill). Most web APIs are simple and do not require any complex background operations (got a request, fetched data from a database, returned a response). Regarding 'cold start', like you mentioned, it's not bad, but you did not say that if you want to make it 'fast', you can just provision capacity for a particular function (it's going to be warm). You outsource orchestration to AWS, which is not bad at all. Would I use it as primary infrastructure to build Netflix? No! But there is so much serverless has to offer: easy event-based architecture implementation, a short learning curve for developers, etc.
it has been so long since the video of "web scale" became viral people have forgotten about that. suggesting kv and document databases shows how inexperienced you are
You keep saying "I don't think anyone does this anymore" multiple times. I definitely do, and so do most people I know. Automation infrastructure and cloud infrastructure cost money vs. just hiring someone to do the steps on something they host themselves. 90% of the time your first example, where something is hosted on a single server, is good enough. Not only that, multiple sites hosted on a single server is also the norm.
these days you can get a full CI/CD setup completely free via GitHub Actions or its analogs from the major cloud providers. And the free tier EC2 instance (and likely analogs from other cloud providers) should be sufficient for most small projects. re: multiple sites on one physical server - I realize this is common and there's nothing wrong with it. But I'd personally prefer to approach this by using small cloud VMs which represent fractions of one bare metal machine (like the free tier one I referred to), such that each environment is completely isolated.
@@codetothemoon do you really need a CI/CD pipeline for most projects? I honestly don't think you do. If I'm not deploying a standard CMS, I'm building an app by myself that won't have more than a few thousand users. I don't even use frameworks: HTML + CSS + some PHP and a little jQuery for web stuff, C++ / Python for desktop applications. That does the job 99% of the time. It means my projects have essentially zero dependencies, so I avoid dependency hell. No complex build environment, meaning I don't need containerisation. I'm able to maintain and secure that much more easily than, say, a standard React project that pulls in 2-3 dozen node packages. And that's just the frontend. As far as I can tell after 15+ years of software development, the front end world has become seriously, unnecessarily complex and bloated in order to achieve very simple things. There's a massive over-reliance on libraries instead of people just being able to build simple functionality. A perfect example came while porting a current simple frontend to React: we had a feature that simply showed a remote desktop ID in a floating div when a button was clicked, a simple show/hide feature. Literally 1 line of code that toggles a div's visibility using a ternary statement. One of the guys on the team couldn't do that and instead used an external library to achieve the same functionality. That library now becomes a build dependency that needs to go through a security audit, and so does every subsequent update to it, which is something we now need to track, not to mention that the library is tens of KB in size. All to do what one line of code had done. We are going backwards.
@@codetothemoon I agree with this. Also, paying $30 a month for scalable software is nothing compared to an engineer's wage. Seems like a no-brainer to go with cheaper, time-saving tech
Very clear descriptions and explanations with no unnecessary fluff. Such videos are hard to find. Kudos.
thank you so much, glad you found it valuable!
Yo! Great video, man.
I love how you went through this in an iterative approach, much like you would when dealing with issues of scale.
thank you, much appreciated!
I love your videos man
I agree that you should build things with scale in mind, however for side projects a lot of the cloud "pieces" in your diagram are cost-prohibitive. I think it's perfectly acceptable to stand up a tiny vps and deploy your software that way (either via docker or whatever) and tackle migrating to more scalable solutions as needed. The key here is knowing what to avoid (like storing stuff in memory, or relying on the local filesystem) and migrating shouldn't be very difficult.
I think we're mostly in agreement here - an important part is to avoid making the backend stateful. I'd also add that the database schema should be designed with scale in mind - perhaps the trickiest and most overlooked part. Regarding cost - I think ECS, Elastic Beanstalk are pretty cost efficient at small scale. Lambda and S3 are even more so.
Local filesystem also has its place. I assume you mean to not rely on it for durability, and that is absolutely true. It doesn't hurt to use it as a cache for some workloads though.
@@bonsairobo that is like saying I’ll have n mediocre memcache instances. I wouldn't recommend this.
@@bscheirman It's a valid solution on the client side, and a working interim solution on the server side. memcache is not a free dependency.
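To make the distinction in this exchange concrete, here is a minimal sketch of local disk used purely as a cache and never as the durable copy. The `DiskCache` class and the `fetch` callable are hypothetical names for illustration; `fetch` stands in for whatever durable store (database, object storage) actually owns the data.

```python
import hashlib
from pathlib import Path


class DiskCache:
    """Local-disk cache in front of a durable source of truth.

    `fetch` is a hypothetical callable that loads bytes from the durable
    store. Losing the cache directory only costs a re-fetch, never data.
    """

    def __init__(self, directory, fetch):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch

    def _path(self, key):
        # Hash the key so arbitrary strings map to safe file names.
        return self.directory / hashlib.sha256(key.encode("utf-8")).hexdigest()

    def get(self, key):
        path = self._path(key)
        if path.exists():
            return path.read_bytes()  # cache hit: served from local disk
        value = self.fetch(key)       # cache miss: go to the durable store
        path.write_bytes(value)
        return value
```

Because every instance can rebuild its cache from the durable store, instances can still be added or removed freely, which is the property the thread cares about.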
Definitely, I don’t think he was talking about deploying your website to Netlify or not 🤣
A lot of the things I build are administration panels and tooling for corp users, with SSO and generally no demand for large scaling, we know demand is
My experience is similar. We have some servers that are load-balanced but the vast majority of projects are hosted on a 2vCore, 4GB RAM instance. Most of the businesses need an internal system to automate some processes. YouTubers always make it seem like every project is the next Netflix or something
Great comment. Similar experience. Simple Stack will always win until it doesn't. Then you will need to solve the next problem but there is no need to solve a problem that is not there and probably will not be there.
I used to manage a Kubernetes cluster on AWS via EKS. But ngl, it was pretty rough. Don't get me wrong, I love containerized workloads and prefer it over "serverless", but K8s in AWS comes with a ton of overhead. Also, something to consider is that EKS and ECS have their own version of the cold start problem. It's not as granular as Lambda, but if you don't have enough spare capacity to handle a spike in traffic, you have to wait for additional underlying EC2 or Fargate instances to spin up, which can take minutes, whereas Lambda can spin up new instances in seconds.
Very good point - for the reasons you mentioned I think serverless might be the king in scenarios where traffic is extremely spikey and unpredictable. This can be mitigated somewhat by configuring your ECS scaling to be more proactive, ie scale up long before that extra capacity is needed, but that means extra cost. Plus, given a sufficient level of unpredictability, it might also be ineffective.
I personally haven't worked with applications fielding these sorts of traffic patterns, but I'm sure they do exist.
Totally agree. I work on AKS (Azure's EKS equivalent) and it also has a ton of other problems and overheads. All of these rely on underlying VMs (EC2s), which take time to spin up and get added to the node pool. These underlying VMs sometimes get into a bad state. Clusters get into a bad state, etc. Also, if we use Kubernetes, we have to manage things like mTLS for pod-to-pod communication if we need encryption in transit, whereas with Lambda functions it's handled by the cloud provider.
But I agree that they are better than the nightmare of Lambda functions calling other functions.
Lambda is always going to be vastly more responsive to scaling than EKS, but it's worth noting that you can rectify a lot of these scaling concerns with Karpenter, which afaik has first-class support in EKS (because they built it for exactly this reason)
FWIW at my work we compared vitess and citus and ended up going with citus.
Vitess is basically still MySQL 5.7 in terms of features, while Citus is generally 2 weeks behind major Postgres versions and supports all features. We also needed an SLA that hosting Vitess ourselves inside k8s couldn't give us, but Azure Cosmos DB for PostgreSQL (which is just Citus) could. You can also start at 1 node and scale out later, which is useful for some of our smaller datacenters. We couldn't use PlanetScale because of the data residency requirements of some countries. We also didn't want to have to get a lot of people to learn Vitess, which has a more complicated CLI.
Citus would be harder to host entirely inside k8s if you wanted to though. I think there's a k8s operator but it's 3rd party commercial.
interesting, this is a great anecdote for those thinking about scaling an SQL database. I hadn't actually heard of Citus, thanks for putting it on my radar!
Love the video. You work through architectures step by step with clear reasoning of trade offs.
Will definitely be recommending your video to anyone looking for an introduction to systems architecture. A+
thank you so much for the kind words!
Great video! Would've loved to see one last scenario, the enterprise-scale multi-region deployment. Not that any individual developer would use it or need it for their own projects, but many of the devs in your audience work or want to work for some of these larger orgs, and it would be great to see what this "mega architecture" would look like from your perspective
Building for scale definitely takes way longer. If you are using serverless, suddenly you have to worry about IAM and configuring each individual lambda, and about passing messages around through SQS. In general your whole system is now asynchronous, which comes with its own set of problems and requires specific event-driven patterns to manage, which adds complexity and time. Not to mention this setup makes debugging more difficult, and local development isn't possible (yes, I know about LocalStack, but it's unreliable and limited).
My approach is the same way I approach programming: KISS. Just make a single container for your application and address scaling issues when they arise, if any. I can't tell you how many times I've worked on a serverless project that is just for some internal tool with fewer than 1000 users.
A small nitpick: what you've called Docker containers are usually OCI containers and don't require Docker runtime to operate. I.e. in k8s there's no Docker runtime or you can run a container built in Docker in Podman.
thanks for pointing this out! I definitely got a little sloppy there.
@@codetothemoon Love your work nevertheless!
I wish you would've talked about vendor lock-ins as well. I'm still a bit hesitant and still drawn to images/containers rather than server-less functions for that particular reason. Or am I missing something?
My philosophy for the jump to serverless is that I need every piece of the pie to be self-hostable. That way I know that even if it's a pain, I always have the door open to leave. I'll tell you in 10 years how this theory plays out
I usually pipe the lambdas written in Node.js through a REST API gateway. Consolidating them into a monolith Express app for containers isn’t a huge effort if I ever decide to go that route, but at present it seems unlikely. One of my clients works on an extremely small budget with low traffic, and for them Lambda is a godsend due to the minuscule operational and maintenance costs. The cold start time with the lambdas makes it less snappy, but they are happy with the overall performance.
I run an express server I can run locally, and then deploy it to Lambda with a thin wrapper. Migrating is very easy this way.
@@origanami this is my plan.
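A rough Python analogue of the "thin wrapper" idea described above (the commenters use Express; this is only a sketch of the same shape, not their code). The application logic in `app` knows nothing about Lambda, and `lambda_handler` is a small adapter. The route names are hypothetical, and the adapter assumes the API Gateway REST proxy event fields `httpMethod`, `path` and `body`.

```python
import json


def app(method, path, body):
    """Framework-agnostic application logic; returns (status, payload).

    The routes here are hypothetical examples.
    """
    if method == "GET" and path == "/health":
        return 200, {"ok": True}
    if method == "POST" and path == "/echo":
        return 200, {"echo": body}
    return 404, {"error": "not found"}


def lambda_handler(event, context):
    """Thin adapter: translate an API Gateway proxy event into an app call."""
    raw = event.get("body")
    body = json.loads(raw) if raw else None
    status, payload = app(event["httpMethod"], event["path"], body)
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }
```

Moving to containers then means writing a different adapter (for example an HTTP server entry point) around the same `app` function, which is why the migration stays cheap.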
Keep everything containerized and you'll always have an exit path. Cloud providers actually don't want their customers to feel trapped. You make more money from customers that actually like working with you, not captives.
I admit that "cloud" and "microservices" look at first like shiny toys that I need to adopt for all of my problems, but the simplicity of a modular monolith wins in 99% of my projects. You scale and transform as needed, not on day 1.
you don't need the cloud or microservices to make your application scalable! your application just needs to be stateless (that's the easy part) and think about how to design your database schema such that it can scale (that's the harder part). I personally think monoliths are completely fine in many cases. Re: cloud though, it's hard to argue against the cloud when things are so inexpensive (or completely free)
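As an illustration of the "stateless" point, here is a minimal sketch in which the request handler keeps nothing in process memory between calls. `InMemoryStore` is a hypothetical stand-in for an external store such as Redis or a database table, and `handle_add_to_cart` is an invented example endpoint.

```python
class InMemoryStore:
    """Stand-in for an external store (e.g. Redis or a DB table).

    In a real deployment this would live outside the application process.
    """

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


def handle_add_to_cart(store, session_id, item):
    """Stateless handler: all state is read from and written to `store`."""
    key = f"cart:{session_id}"
    cart = store.get(key, []) + [item]
    store.set(key, cart)
    return cart


# Two calls that could have landed on two different instances:
shared = InMemoryStore()
handle_add_to_cart(shared, "s1", "book")
cart = handle_add_to_cart(shared, "s1", "pen")
```

Since no instance holds the cart itself, a load balancer can send each request to any instance, and instances can be added or removed without losing data.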
Continuous integration is actually just pushing changes to trunk/master/main frequently.
For example, if you're using feature branches that last longer than about a day, you're not using CI regardless of how the software is built.
I think scaling the database is the deepest subject here. Sharding can absolutely be a pain. There are actually some options for relational (SQL) DBs that scale out; but it is quite advanced tech so there aren't a ton of good options. It's also hard to know exactly what you're getting from a scale-out RDBMS if you don't already know the relevant concepts like consensus, replication, transactionality, etc.
TiDB is one of the most noteworthy open source examples I've come across. It's built on top of a distributed KV store called TiKV. It's written in Rust and Go. The authors maintain the most mature implementation of RAFT consensus in the Rust ecosystem that I'm aware of. Under the hood it's using RocksDB as an embedded KV store for the replicated state machine. It just seems very smartly designed to me, so I hope I get to actually use it one of these days. It would be my first choice for a SQL DB that I need to scale.
My only concern about TiKV/TiDB is the latency in distributed scenarios, and I don't think we have any benchmarks. The only one I managed to find was for single node instances.
For now, i feel like the only serious self hosted solution (for me) would be Vitess + MySQL (MyRocks)
How are you going to use it as a SQL dB if it's KV?
@@abhishekshah11 TiKV is the distributed KV used for state replication. TiDB is built on top of it with a full SQL engine.
7:00 - CockroachDB was built by ex-Google engineers and is inspired by Cloud Spanner. it's a beast. it has a serverless offering. use that basically always unless you need something specific.
nice, this is a great testimonial to have! seems great on paper, but I tend to be cautious until I hear success stories...
@@codetothemoon netflix. The biggest CockroachDB cluster at Netflix is a 60-node, single-region cluster with 26.5 terabytes of data and they have 100 production clusters running atm.
One Multi-master concession can be eventual consistency. Instead of fully atomic transactions, you get pseudo atomic on the node, then other nodes have eventual consistency guarantees. Meaning you have eventual consistency of data reads.
You would manage writes and synchronization with something like zookeeper.
I always thought CockroachDB was a disgusting name, but it implies that your data will survive well after the apocalypse.
yeah I definitely see what they were going for, but I do suspect the visceral reaction most probably have from the name offsets the sense of resilience that they are actually trying to convey. branding is hard...
Key takeaway: be prepared to scale, learn how to scale an architecture because it's useful, avoid bloating your architecture with microservices and serverless functions, and acknowledge the fine line between monolithic and modular architecture for each application
my dev group didn’t plan for scale. our db was json because that’s what we were most comfortable with (we’re amateurs). our api was all one file because we were just trying to get it done. now we have 200k users a week and i’m desperately trying to rewrite the backend.
i feel your pain! this is precisely the scenario that I'm hoping to help folks avoid with this video!
@@codetothemoon I would argue the opposite. They went the quick route and validated their ideas in the market. This trumps any ideal architecture. Now they have the luxury of having enough users to be forced to care about scale. Thinking too much about scale early would have made them lose steam, time, energy and money while not getting to market. Ideally you do both, but realistically there's always overhead in thinking about scale early.
Schema is not the only reason you use a SQL database. What you really want is transaction consistency. If you use a key-value store, you will have to do that yourself in the application. That is hard and if you are able to implement in the application, it would probably be much slower than the database doing it since the database engineers (the people who build the database software) are skilled in that area and also there is a lot of effort put in just for that.
transaction consistency and the broader "distributed systems" concept of consistency (I think you are referring more to the former though) are actually both supported by many NoSQL databases
All of the cost factors could be thrown out of the window if you just host the entire backend stack yourself (kubernetes, databases, apis, etc).
Once you set up your own private "backend as a service", every single future project can be easily integrated by either adding another container or adding another node to the cluster. The nodes are pretty cheap as well, and you can get second-hand mini desktops for less than the price of a Raspberry Pi with 10x the performance.
I agree with most of the content.
The only addition I would do is the use of a queue (usually Kafka) to decouple between nodes, which seems to be very popular.
Watched a few of your videos, loved the clear and precise way of explaining, subbed.
thanks, really happy to have you onboard!
I would just throw in that if you ARE using Kubernetes, you can add an autoscaler to your containers (Deployment or StatefulSet) to get a lot of the same benefits as serverless, without having to redesign your app
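For intuition about what such an autoscaler does, the Kubernetes Horizontal Pod Autoscaler documentation describes its core rule as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). Below is a small sketch of that rule. The function name and the min/max defaults are illustrative assumptions, and the real controller adds tolerances and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Core HPA scaling rule, clamped to the configured bounds.

    Simplified: the real controller also applies a tolerance band and
    stabilization behaviour that this sketch leaves out.
    """
    raw = math.ceil(current_replicas * (current_metric / target_metric))
    return max(min_replicas, min(max_replicas, raw))
```

For example, 3 replicas averaging 90% CPU against a 60% target gives ceil(3 × 1.5) = 5 replicas.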
That's a great breakdown! Love your presentation style. I think I've encountered every pattern already over the course of my career ;-)
Those diagrams look very elegant. Glad you had the url of the site in there.
What didn't really work out was using an ORM for talking to the database. I was working for a pretty big e-commerce shop in Germany, and when you don't have control over your own queries, bad things happen to your database (N+1 cascading queries).
At another job I had to work with Cassandra. It was OK in the beginning, but I was really happy once we migrated to Postgres. The pain of not being able to query the data more flexibly was just too high. So I'd always prefer a relational DB over NoSQL as long as the dataset fits in there.
thank you, really happy you liked it. Agree with your perspective on ORMs. They seem great on paper but in practice they tend to cause issues - to add to what you mentioned, I think there's a lot of value making your queries explicit such that you have fine grained control over them. Also because they are explicit you can just copy and paste them directly into a SQL tool and iterate on them until you get exactly what you're looking for. Then copy back to the code, replace tokens with bindings, etc.
Definitely understand the relief in going back to SQL after being on Cassandra. Hard to go without that power unless you absolutely have to 😎
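The N+1 pattern mentioned above is easy to show with explicit SQL. This is a minimal sketch using Python's built-in `sqlite3`; the `customers` and `orders` tables and their contents are made up for illustration. The first function issues one query per customer (what a lazily loading ORM often does behind the scenes), and the second gets the same answer with a single explicit JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Linus');
INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")


def totals_n_plus_one(conn):
    """One query for the customers, then one more query per customer."""
    queries = 1
    result = {}
    customers = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
    for customer_id, name in customers:
        row = conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        queries += 1
        result[name] = row[0]
    return result, queries


def totals_single_query(conn):
    """Same result with one explicit JOIN + GROUP BY."""
    rows = conn.execute("""
        SELECT c.name, COALESCE(SUM(o.total), 0)
        FROM customers c
        LEFT JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.id
    """).fetchall()
    return dict(rows), 1
```

With two customers the difference is 3 queries versus 1; with thousands of rows per page view, the per-row round trips are what hurts the database.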
I'm thinking of a mix of containers and serverless. Well it's an Instagram clone and thinking of using serverless for the image stuff like processing, storing. Bad idea?
I think this is a great approach - assuming the image stuff is async such that the user's browser won't be actively waiting for the results. I think serverless is a great fit for those sorts of things, while serving all the APIs consumed by the browser in the containers to reduce latency
It would be nice to discuss the different methods of testing scale and getting some real world data.
please show more of the glove80 keyboard. I only come here to see you're still using it. Doesn't have to be crazy...just like 1 more degree of tilt. kthanks bye.
funny, I was using it during filming but the camera angle on this one happened to cut off my desk and most of the keyboard (I think that's what you're referring to). I'll make sure it's more in frame next time :)
Really loved to see a video from you on such a topic. Cleared a lot of ambiguity.
:)
thanks, glad you got something out of it!
the hardest part is scaling the db. for that reason i use a mix of cockroach/postgres, scylladb and redis, all at the same time, depending on what my data needs are.
This was a really amazing way of explaining system design, and the Eraser tool you used is amazing too!
Thank you!
thanks for watching, glad you liked it!
Great info in such short time. Gives me a wide perspective of latest system deployment choices
thanks, glad you found it valuable!
I feel like you could even skip using a proper database and just stand up a small sqlite database as proxy for a potential later solution. I just updated a minecraft plugin and I have no users and hooked up a mysql database that has absolutely terrible latency and would've been better off just doing the local sqlite (but it's also made me realise I should be doing the db updates async instead of sync)
potentially. but if you're planning for scalability, you'd just need to think about designing your schema for scale right from the start. then switching from sqlite to another database theoretically shouldn't be too bad.
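On the "async instead of sync" point, one common shape is a write-behind queue: the caller enqueues the update and returns immediately, while a single background thread performs the database writes. This is a minimal sketch with the standard library only; the `AsyncWriter` class, the `scores` table and its columns are hypothetical.

```python
import queue
import sqlite3
import threading


class AsyncWriter:
    """Write-behind queue: callers enqueue, one background thread writes."""

    def __init__(self, db_path):
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, args=(db_path,), daemon=True)
        self._thread.start()

    def _run(self, db_path):
        # The connection is created and used only on the worker thread.
        conn = sqlite3.connect(db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS scores (player TEXT PRIMARY KEY, score INTEGER)"
        )
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: stop the worker
                break
            conn.execute("INSERT OR REPLACE INTO scores VALUES (?, ?)", item)
            conn.commit()
        conn.close()

    def save(self, player, score):
        """Returns immediately; the write happens in the background."""
        self._queue.put((player, score))

    def close(self):
        """Flush pending writes and stop the worker."""
        self._queue.put(None)
        self._thread.join()
```

The trade-off is that a crash can lose writes still sitting in the queue, so this suits data where a small loss window is acceptable.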
There is no problem scaling once it's needed, if while developing your system, you are always aware of the two most important things in software development: coupling and cohesion.
Not sure I agree with “never use lambdas that call each other”, that really depends on your traffic pattern. We have a main GraphQL lambda that handles most requests, so it’s almost always “warm” and it infrequently invokes other lambdas for document and report generation, which call back to the GraphQL lambda to query the data they need to generate the documents. It works quite well and has never had a cascading cold start issue
Super video.
I like your watered down approach to building with scale in mind. 99% of projects will never make it to the stage of scale up, and that’s okay, but being aware of the trappings of your architecture is important; I’ve gone through the process of moving from serverless to containers and it was a pain but very worth it as the user base grew.
Nice to see a shout out for Elastic Beanstalk, which is a lovely way to step into containers
Another place serverless is seen a lot is with ML runtimes (which tend to be pretty expensive)… I’m curious, have you come across a good architecture for these, particularly with caching?
for me the best way to scale the system is to tune its performance. scaling the servers is a linear gain, but tuning system performance can be a far bigger gain, e.g. going from O(n) to O(log n)
great point - I should have touched on it in the video. the horizontal scaling techniques discussed are not an alternative for optimizing the time complexity of your business logic. We assume your application logic is optimized already, and I should have called this out explicitly.
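A tiny example of the kind of algorithmic gain being discussed: checking membership in a sorted list by scanning is O(n), while a binary search via the standard `bisect` module is O(log n). The function names are just for illustration.

```python
from bisect import bisect_left


def linear_contains(sorted_ids, target):
    """O(n): look at every element until a match is found."""
    for value in sorted_ids:
        if value == target:
            return True
    return False


def binary_contains(sorted_ids, target):
    """O(log n): binary search; requires the input to be sorted."""
    index = bisect_left(sorted_ids, target)
    return index < len(sorted_ids) and sorted_ids[index] == target
```

On a million sorted IDs the scan may touch up to a million elements, while the binary search needs about 20 comparisons, a gain that adding servers cannot match.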
What's your thoughts on the de-clouding that's happening? Particularly in the small and medium business world?
I wasn't aware of this phenomenon. is there somewhere I can read about it? are they switching to self hosting?
I suggest that if you are using serverless functions to fetch data from a SQL database, you put a connection pooler like PgBouncer (for Postgres) in front of it. Otherwise the connection count will not decrease even after a function is done; connections may take up to 15 mins to be released, and you will quickly reach the limit
ahh this makes sense, thanks for the tip!
where does something like pocketbase or a SQLite server fit into all this? I've found SQLite on one machine is super simple and easy to set up, rather than having a load balancer or multiple db servers.
SQLite is fantastic for getting started, but it may be very difficult to make it a long term scalable solution. to be fault tolerant and horizontally scalable, your application hosts can't be used as durable data stores as you may need to add and remove them as your traffic volume fluctuates. I think there is some narrative around replication with SQLite, but that seems a bit like going against the grain.
What do you think about lambda warmers for the cold start problem?
they can definitely help, and I've used them when cornered by cold start latency. But it always felt like a band aid to cover up the fact that we really should have gone with ECS for latency sensitive use cases.
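For readers who haven't seen one, a "warmer" is typically a scheduled ping that invokes the function so an instance stays initialized, with the handler returning early on those pings. A minimal sketch follows. The `warmer` field is a made-up convention for the scheduled event's payload (not an AWS-defined key), while `queryStringParameters` follows the API Gateway proxy event shape.

```python
def handler(event, context):
    """Lambda-style handler that short-circuits on warm-up pings."""
    # Hypothetical convention: the scheduled warm-up event carries {"warmer": true}.
    if event.get("warmer"):
        return {"statusCode": 204, "body": ""}

    # Normal request path.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {"statusCode": 200, "body": f"hello {name}"}
```

A periodic ping generally keeps only about one instance warm, so a burst of concurrent traffic can still hit cold starts, which fits the "band aid" description above.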
Thank you for the brevity and clarity with which you share this information. Brevity and clarity are rare these days. I don't do development, but I design complex systems, and there's just too much vague theoretical BS out there on this. Please make more such videos on system design, or consider monetizing your gift for this through a Udemy course.
thanks for the kind words! this one seemed to resonate with folks so I'll probably be making more.
Going multi-region unlocks another layer of complexity. Often multi-master isn't enough, and you need to go full CQRS or something to segregate reads and writes while simultaneously propagating event logs across regions.
if the sharding is "automatic", idk how to trust it
with turso, you can have a database per user or per 10 users or something
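One way to make sharding less of a black box is to own the routing rule yourself. Here is a minimal sketch of explicit, deterministic routing from a user ID to one of N databases. The function names and the `users_NNN` naming scheme are hypothetical, and this is not how any particular product (Turso included) routes internally.

```python
import hashlib


def shard_for(user_id: str, shard_count: int) -> int:
    """Map a user ID to a shard index with a stable hash.

    sha256 is used rather than Python's built-in hash(), which is
    randomized per process for strings.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def database_for(user_id: str, shard_count: int) -> str:
    """Hypothetical naming scheme: one database per shard."""
    return f"users_{shard_for(user_id, shard_count):03d}"
```

A caveat of plain modulo routing: changing `shard_count` remaps most users, so growing the cluster needs a migration plan or a scheme such as consistent hashing.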
This was extremely well presented and very informative, thank you!
thanks for watching, really happy you got something out of it!
There are serverless functions now without much of a cold start, at least for JavaScript code, which is run in custom JS runtimes that don't take long to spin up (but not all APIs / libraries are supported)
I think they're called edge runtimes (separate from the concept of an edge location). I don't think AWS supports these yet. I think the big one that does is Cloudflare Workers.
I am just going off the top of my head, I could be getting some details wrong, but its something to look into
Awesome, and you gave a lot of clarity, thank you
thanks, glad you liked it!
Very good explanation for the most common patterns in use nowadays. And the cautionary section on serverless is excellent. I haven't used them, and I won't 😀
3:10 - sticky sessions on load balancers is a thing if absolutely necessary
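As a sketch of what "sticky" means here: the balancer remembers which backend a session was first sent to and keeps sending it there. The class below is a toy in-process model with hypothetical names; real load balancers usually implement this with a cookie or a hash of the client address.

```python
import itertools


class StickyBalancer:
    """Round-robin for new sessions, then pin each session to its backend."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)
        self._assigned = {}

    def route(self, session_id):
        # First request for a session picks the next backend in rotation;
        # every later request for that session reuses the same backend.
        if session_id not in self._assigned:
            self._assigned[session_id] = next(self._cycle)
        return self._assigned[session_id]
```

The cost is the one the video warns about: if the pinned backend goes away, any state held only on it goes too, which is why stickiness is best kept as a last resort.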
The problem is actually not over-design.
It's with the promised services, which are usually not up to par. For example, Kubernetes offerings that don't run on bare metal usually have much lower performance, while the bare metal ones cost far more than maintaining them ourselves.
So the real requirement is always to make sure our first design conforms to the goals of the clients that will be using it.
If it's small, a simple monolith is fine, but if we have clients standing by with a lot of users, then focusing on scaling from the start is a good idea.
Great video. We run a monolithic architecture here, and it's largely the same setup as you've shown - we use HAProxy+ProxySQL in front of our application and database stacks (Percona XtraDB Cluster). Long shot but is there any chance you'd share those Architecture diagrams?
thank you! I have no problem sharing the diagrams themselves, but I don't think I can do so without sharing my personal email address. Here's the eraser.io code for my favorite architecture though, this should get you pretty close to anything you'd like to replicate from the video!
// Define groups and nodes
User {
Web Browser [icon: browser]
Person [icon: user]
}
Application {
CDN [icon: aws-cloudfront]
Frontend {
HTML [icon: code]
JavaScript [icon: javascript]
}
Backend {
Container Platform [icon: aws-ecs]{
Load Balancer [icon: aws-elastic-load-balancing]
Instance 1 [icon: aws-ec2]
Instance 2 [icon: aws-ec2]
Instance 3 [icon: aws-ec2]
}
Database Cluster [icon: database] {
Node 1 [icon: database]
Node 2 [icon: database]
Node 3 [icon: database]
Node 4 [icon: database]
}
}
}
// Define connections
Person > Web Browser > CDN > Frontend > Load Balancer > Instance 1 > Database Cluster
Load Balancer > Instance 2 > Database Cluster
Load Balancer > Instance 3 > Database Cluster
Sorry for not thanking you sooner, as I've been super busy. So, thank you! @@codetothemoon
What platforms do you use to host this infra? AWS Elastic Beanstalk/ECS for the backend and AWS RDS/DynamoDB for the db?
for me yes - the services you've cited are kind of my go-tos just because i'm so used to AWS. But I realize other cloud providers have equivalents of these that are probably viable options as well.
What do you think of serverless based on v8 isolates such as cloudflare workers? There are some limitations and security concerns, but if you can deal with the limitations and aren't handling super sensitive data, it solves a lot of the cost/cold start issues of microvm/firecracker based serverless. Do you know why AWS doesn't have such an offering yet?
0:59 The services I work with have this architecture (except it's on-prem). Almost all have bespoke hot restart that eliminates downtime.
ZIP files startup way quicker for some reason on lambdas (at least my experience with GO). Anyone know why?
Fantastic video btw.
Thank you! I haven't personally tried using Go for lambdas, but typically smaller bundle size means preferable cold start times, so I suspect it's got something to do with that
You didn't mention that lambda-to-lambda requests mean you're being charged for both lambdas at the same time, since the first one is still running while waiting for the second... Another issue is that you would need to set up the lambda to run within VPC peering, otherwise invoking another lambda will cause another trip through the Internet.
But if you have to, it may be worth looking into step functions.
that's a great point as well, thanks for pointing it out! I think you can have the one lambda invoke the other without going through the internet simply by using the AWS SDK to do the invocation - though I could be wrong about that. Also, step functions are great! Definitely a much better option than having Lambdas call each other directly - assuming its for non-realtime operations of course.
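To put rough numbers on the double-billing point above, here is a small Python sketch. It assumes billing is proportional to each function's wall-clock duration and ignores per-request charges and memory size; the function names and the 50 ms / 200 ms durations are made up for illustration.

```python
def billed_ms_sync_chain(caller_own_ms: int, callee_ms: int) -> int:
    """Total billed duration when the caller blocks until the callee returns."""
    # The caller keeps running (and being billed) while it waits.
    caller_billed = caller_own_ms + callee_ms
    return caller_billed + callee_ms


def billed_ms_async_handoff(caller_own_ms: int, callee_ms: int) -> int:
    """Total billed duration when the caller hands off and exits immediately.

    This stands in for a queue or workflow-style handoff where the caller
    does not wait; any overhead of the handoff itself is ignored here.
    """
    return caller_own_ms + callee_ms


if __name__ == "__main__":
    print(billed_ms_sync_chain(50, 200))     # 450
    print(billed_ms_async_handoff(50, 200))  # 250
```

In this toy case the callee's 200 ms is paid for twice in the synchronous chain, which is the effect described above.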
Amazing video. Thanks for posting! What tool did you use to create the diagrams in your video?
thanks for watching! The diagramming tool is eraser.io. Fantastic tool
Designing for scale on side projects or small ones is a good habit for when you need to scale. You also can't know if your small project might become big later on
I disagree with "don't allow serverless to invoke serverless" though - offloading large dependencies to their own lambda and invoking that only when needed can help a ton in keeping normal start times down. You do need to be careful not to invoke more from there
Where is the "link to the article in the description"? (not that we can't see the url plainly in the video, just noticed it wasn't there. :)
ahhh thanks for pointing this out! I've just added it 😎
Isn’t the problem of “your data should fit in a single instance with sql” solved by using NAS/ SAN?
Yea the processing is still on a multi core single node machine but storage shouldn’t be a concern, no?
it's a great question. I think the issue with that approach is that at some point the storage layer would become a performance bottleneck. That or the compute power of the machine (even if you used the largest one available). Also, the single machine would be a single point of failure.
Please do continue making amazing videos like this one
thank you, I aim to do so!
Cf Workers not mentioned 😢
good point - this is a personal bias as I haven't tried them. heard great things about them so I've put it on my list of stuff to check out 😎
What do you recommend when the application has a Celery worker? How can we scale it?
I'm not familiar with Celery so this answer might not be super helpful, but for async processing, I like to use something like AWS SNS/SQS, with a lambda consuming and processing messages in the SQS queue. This setup provides some nice elasticity. sorry if I'm misunderstanding the use case :)
@@codetothemoon thanks for the reply. I have a question regarding this too: imagine I have set up Amazon SQS with my API. Which auto scaling process should I follow, given that my code is a bit too large to put on Lambda? Or should I deploy it some other way?
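A rough sketch of the queue-plus-consumers pattern discussed in this thread, using Python's standard library as an in-process stand-in for SQS. Everything here (`process`, `run_workers`, the thread-based workers) is hypothetical and only meant to show the shape: producers enqueue, and the number of consumers is the knob you scale. With a real queue the consumers do not have to be Lambdas; containers or VMs polling the queue work as well.

```python
import queue
import threading


def process(message: str) -> str:
    """Placeholder for the real work a consumer would do."""
    return message.upper()


def run_workers(messages: list[str], worker_count: int) -> list[str]:
    """Drain a queue with `worker_count` consumers and return the results."""
    jobs: queue.Queue = queue.Queue()
    results: list[str] = []
    lock = threading.Lock()

    # Producer side: enqueue everything up front.
    for message in messages:
        jobs.put(message)

    def worker() -> None:
        while True:
            try:
                message = jobs.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker exits
            outcome = process(message)
            with lock:
                results.append(outcome)
            jobs.task_done()

    # Consumer side: worker_count is the scaling knob.
    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    # Result order may vary between runs because workers run concurrently.
    print(sorted(run_workers(["a", "b", "c"], worker_count=2)))
```

In a hosted setup the usual signal for adding or removing consumers is how many messages are waiting in the queue, rather than CPU on the API servers.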
What site did you use to make these diagrams??
eraser.io!
Does anyone know what tool that is in the browser that he is using?
eraser.io
@@codetothemoon thanks! 💯
Does the last one mean one lambda per function? So a huge waste of decades of thinking around how to handle more than one request in every framework?
"This is what everyone starts with"
*screams in still working on 20 year old legacy monolith app*
hahaha! yeah, "we'll do it better later" doesn't always seem to happen...
Nice view & format!
Thank you glad you liked it!
I used CockroachDB, but not in large scale. I would say it works excellent.
It's just like Postgres. Btw it has nice free tier.
Cockroach has a lot of issues with inconsistent reads. Catastrophic for serious apps.
ahh this is good to know - are there any articles or videos about this you can refer me to?
Care to elaborate? What kind of issues are there?
would love to know more about this
What he said about scalability in mind is just a bunch of tools with a catchy name and some over-the-top technical buzzwords that when the finance team asks "Why is our tech-infra cost twice the amount of the revenue," He will answer with "This is just the beginning, I'm about to unleash another service that will do nothing to the stakeholder, but still cool."
there have been several comments citing cost as a concern, which is surprising to me. The approaches outlined in this video can be extremely inexpensive, and in many cases nearly free
@@codetothemoon well, only if your solutions don't actually scale. As soon as your usage goes up the bill hits the fan
@@codetothemoon Since it's about scaling, my comment is also from a cost-scaling perspective. Yes, it's cheaper (near free) at some stages, but at certain stages the cost is unpayable, therefore, it's not scalable.
Let's say we have 10k DAU, with a cost of 1 dollar for every active user. When using a serverless approach, by the time we reach 100k DAU, the cost will still be 1 dollar for every active user, which means we're not scaling correctly, we're just providing more power to the system. Even worse, the cost could end up higher than 1 dollar per user and some of us will get laid off.
Gotcha - I think its true that the cost goes up linearly with DAU, but the slope is much shallower than the numbers you are giving here. Unless your application is ultra computationally intensive, you should be looking at *a fraction of a penny* per DAU, not $1@@gigiperih
That said, what do you see as the best (more cost effective) alternative to the approaches laid out in the video?
yeah I'm skeptical of that as well. for most applications costs should rise very modestly with scale when using T* EC2 instances or Lambda. That said, what do you see as a better alternative solution?@@et379
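The disagreement in this thread is really about the slope of the cost line, so here is a back-of-the-envelope Python sketch. All inputs are made-up assumptions for illustration (100 requests per user per day, an all-in cost of $1 per million requests); they are not real pricing for any provider.

```python
def monthly_cost(
    dau: int,
    requests_per_user_per_day: int,
    cost_per_million_requests: float,
    days: int = 30,
) -> float:
    """Monthly bill under a purely pay-per-request model (hypothetical rates)."""
    total_requests = dau * requests_per_user_per_day * days
    return total_requests / 1_000_000 * cost_per_million_requests


def monthly_cost_per_dau(
    dau: int,
    requests_per_user_per_day: int,
    cost_per_million_requests: float,
) -> float:
    """Monthly cost divided by daily active users."""
    total = monthly_cost(dau, requests_per_user_per_day, cost_per_million_requests)
    return total / dau


if __name__ == "__main__":
    for dau in (10_000, 100_000):
        total = monthly_cost(dau, 100, 1.0)
        per_user = monthly_cost_per_dau(dau, 100, 1.0)
        print(f"{dau} DAU: ${total:.2f}/month, ${per_user:.4f} per DAU")
```

Under these assumptions the bill grows linearly (about $30 a month at 10k DAU, about $300 at 100k), while the per-user figure stays flat at a fraction of a cent. Whether that slope is acceptable depends entirely on the real per-request cost of the workload, which is the point both sides are arguing about.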
great vid. what game are those city maps from?
Thank you! Game is Cities: Skylines. The first one. Apparently the sequel which came out a few months ago is still full of bugs
Nice content, I'm now using railway to deploy my apps
thanks - i've heard great things about railway but haven't tried it myself yet. what do you like/dislike most about it? would you recommend it over AWS?
Forgot to talk about caching such as Redis
Good point - maybe this is personal bias but I try to stay away from caching layers for which state needs to be managed in my application business logic. Totally open to using Redis as a durable data store, but in cases where that’s not possible, a CDN seems like it will handle much of the caching requirements. That said I can see there being some cases where bespoke caching logic would be preferable, ie cache entry A can be leveraged for both request A and request B, but a CDN would have no way of knowing that.
@@codetothemoon another use case would be blocking unwanted JWTs that have not yet expired across a set of services. Another one is to use it for session storage so any instance of a service can pick up the request.
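A minimal Python sketch of the JWT blocking idea from the comment above. A plain dict stands in for a shared store such as Redis, and `TokenDenylist` with its methods is a hypothetical name for illustration, not a real library API. The assumption is that each token carries an ID and an expiry time, and that expired tokens are already rejected elsewhere by normal validation.

```python
import time
from typing import Dict, Optional


class TokenDenylist:
    """In-memory stand-in for a shared store that all services can query."""

    def __init__(self) -> None:
        # token ID -> the token's own expiry time (epoch seconds)
        self._revoked: Dict[str, float] = {}

    def revoke(self, token_id: str, expires_at: float) -> None:
        """Block a token until it would have expired anyway."""
        self._revoked[token_id] = expires_at

    def is_revoked(self, token_id: str, now: Optional[float] = None) -> bool:
        """Return True if the token is blocked and has not yet expired."""
        current = time.time() if now is None else now
        expires_at = self._revoked.get(token_id)
        if expires_at is None:
            return False
        if expires_at <= current:
            # The token has expired on its own, so normal validation
            # rejects it and the denylist entry can be dropped.
            del self._revoked[token_id]
            return False
        return True


if __name__ == "__main__":
    denylist = TokenDenylist()
    denylist.revoke("token-1", expires_at=time.time() + 3600)
    print(denylist.is_revoked("token-1"))  # True
    print(denylist.is_revoked("token-2"))  # False
```

Because entries only need to live until the token's own expiry, the list stays small, which is why a store with built-in key expiry is a natural fit for this.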
What sketching tool is this?
so what’s the solution for the lambdas calling lambdas?
Depends on the use case, but typically a server based approach (ECS, EKS, etc). Consolidating the layers of Lambdas can also be considered but that may just be kicking the can down the road
@@codetothemoon but what if you want to stay in serverless/lambda land? Consolidation feels a bit anti-pattern-y, wondering if there are better solutions. I keep thinking, but the only thing that comes to mind is warming up the lambdas
> nobody does this anymore
> elastic beanstalk
Oh man
is the implication that elastic beanstalk is out of style too?
I assumed most people have moved on to ECS directly
My setup: EC2 + nginx 😅
The EC2 is a t2.micro
RAM usage is 98%
But I don't care
Unless and until I am designing an end-user-facing product, I tolerate response times up to 10 seconds 😅
hah nice! whatever works for you - this setup is definitely cost effective! free tier can really take you a long way....
Nice! I used to run on EC2 but it kept crashing because of the memory. So I moved to App Runner (dockerized). Now it's a managed service like Beanstalk, only like £3 a month, and comes with auto scaling, no memory worries, it's just amazing! Slow deployments but it's worth the price
I really enjoyed this video. But I do disagree with you about “always building your application to be scalable” for a few different reasons. First, I think that an application’s architecture and scalability should be designed with the needs of that application in mind. If I only need my basic calculator web app to host a maximum of 5 users at a time, then I can just host one backend node. If I need my package tracking application for example, UPS, to handle thousands of packages being scanned every second, and varying throughout the days and busy seasons, then the design will be much different. Design for current needs, not future potential ones. It’s like buying a 12 ft fishing pole to fish for lake trout. Second, high level engineering like this costs money. Engineer’s time is valuable and costly. Why spend so much time ensuring we have the most optimized data cluster when we have more important features to implement and bugs to fix? Third, it’s impossible to design a system that is perfect at scaling for your specific case. In any case, you will discover specific bottlenecks you didn’t know would occur, latencies, etc. and will have to make changes as you scale.
In summary, design your system for what you need it to do. Don’t waste precious time and money on low impact outcomes. You will have to make changes along the way, it won’t be perfect from the beginning.
In my mind there is no such thing as microservices anymore, only planetary system. You start with a monolith(Sun) then you can break it down into services(planets) and so on(satellites).
Be careful about over-engineering though. When just starting out, making the app work as intended is more important
your lack of experience tells you "you don't need to build like this"
I’m not sure I understand (perhaps also due to my lack of experience)
“… and, in that case, you’ll need to do what’s called sharting…”😂😂
Teehee! 💩
This video was so good!
Thank you really glad you liked it!
Great video!
thank you, really glad you liked it!
Terrific video! Don’t listen to all the haters.
thank you!
well it's official. Now I need to procure a Leptos shirt :)
Hah! AFAIK they don’t have any funding so buying swag seems like a good way to support them 😎
@@codetothemoon do you have a link to the official store for them so I know it's headed to Greg and crew?
They link to a store on the official leptos site 👍
I build lambdas with Rust. The cold starts are almost imperceptible.
nice! funny, i originally started learning Rust ~2 years ago specifically as a means of mitigating cold start issues. What I didn't look into is how the crate size affects cold start time - have you had any learnings there or does it seem imperceptible pretty much no matter what?
@@codetothemoon As long as you use one lambda per route, you usually don't have to worry about binary size. But it is true that you have to keep the number of dependencies low. I always check each crate to see if it lets me disable default features and only build what I need.
A couple of months ago I built a "lambdalith" that handled all the routes and it was definitely slower than normal Rust lambdas but still faster than Python ones
@@adrianjdelgado ahh this is interesting thanks!
@@codetothemoon cold start do not linearly scale with package sizes. There seems to be some weirdness under the hood with firecracker and the package size, as testing within my company showed that in some scenarios, a slightly larger package can have an improved coldstart. Ofc if you're maxing out at 50MB zipped then it's a different story
Great video
thank you, glad you liked it!
My god I need my data sharted right now.
I think Cockroach is used by Pinterest, so I guess it's reliable.
Hehe can't unhear "sharting" every time you say it
i know - most unfortunate tech term ever 💩 I believe we have Ultima Online to thank for it
This is focused pretty much on web apps, which is like the trash of software development
in your view, what type of software development would be the pinnacle?
@@codetothemoon pinnacle is hard to say. Maybe kernel dev, who knows. But web apps are a mess. An entirely over engineered industry making things that people don't really want in the first place. Just make native stuff so people can be happy instead of waiting on yet another slow ass web2.0 mistake
thanks!
thanks for watching, glad you got something out of it!
>infinitely scalable
>javascript
I think the JavaScript was client side, not node. Given that CttM loves Rust (and wears a Leptos tshirt), I doubt he’s advocating server-side JS
@@AnthonyBullard I saw the shirt too and I thought it was weird to give js today as an example even if on the client
@@JorgetePanete I’m a TC-39 Delegate, so I might be biased, but JS has a place in the client. Though probably in a much reduced role to how it is used today.
@@AnthonyBullard My view on javascript/python is that they can be great for really small pieces of code, otherwise you need the goods from statically typed compiled languages
This video had a lot of useful info, but not a cohesive narrative frame. The first question you asked was never answered. (Why _is_ the first slide a bad architecture? You didn't tell us. Also, why are you using it in every project if it's so bad? Also, also, what's the trap of using this totally normal architecture? Is the trap that we don't always use it? Cause if so, you didn't frame it very well in the opening paragraphs.)
You also claimed that every toy project should include scaling simply because you wouldn't have to learn it more than once. That's bananas.
Your recommendation lacks a basic cost/benefit analysis. Why would I add more work to a project if I can't articulate a clear benefit?
Also (and least important), it would be really helpful if you described the key parts of each slide before using the word "this" to refer to it. I could have understood your key points while driving my car, but you expected me to read it.
This is a bit harsh. You could have said all this in a nicer way and still make your point.
The part about serverless is horrible: you described the benefit of this approach as insignificant, and the downside is just a wrong way of using the architecture. I can easily dismiss your “best” architecture. Overall, pretty weak.
thanks for the feedback! would you consider serverless critical for your particular use case?
@@codetothemoon Well, I have been using Lambdas in production since they were still in beta. There is no 'particular' use case for serverless. If you are a small company, serverless is a good way to go. You do not need to care about security patching of your EC2 or scalability groups, and as you mentioned, you pay as you go (we are running a $15M+ company, and I do not really see Lambda in our bill). Most web APIs are simple and do not require any complex background operations (got a request, fetched data from a database, returned a response). Regarding 'cold start', like you mentioned, it's not bad, but you did not say that if you want to make it 'fast', you can just provision capacity for a particular function (it's going to be warm). You outsource orchestration to AWS, which is not bad at all. Would I use it as primary infrastructure to build Netflix? No! But there is so much serverless has to offer: easy event-based architecture implementation, a short learning curve for developers, etc.
it has been so long since the video of "web scale" became viral people have forgotten about that. suggesting kv and document databases shows how inexperienced you are
Thanks for your feedback!
You keep saying "I don't think anyone does this anymore" multiple times. I definitely do and so do most people I know.
Automation infrastructure and cloud infrastructure costs money VS just hiring someone to do the steps on something they host themselves.
90% of the time your first example where something is hosted on a single server is good enough. Not only that multiple sites hosted on a single server is also the norm.
these days you can get a full CI/CD setup completely free via GitHub Actions or its analogs from the major cloud providers. And the free tier EC2 instance (and likely analogs from other cloud providers) should be sufficient for most small projects.
re: multiple sites on one physical server - I realize this is common and there's nothing wrong with it. But I'd personally prefer to approach this by using small cloud VMs which represent fractions of one bare metal machine (like the free tier one I referred to), such that each environment is completely isolated.
@@codetothemoon do you really need a CI/CD pipeline for most projects? I honestly don't think you do.
If I'm not deploying a standard CMS, I'm building an app by myself that won't have more than a few thousand users.
I don't even use frameworks: HTML + CSS + some PHP and a little jQuery for web stuff, C++/Python for desktop applications.
That does the job 99% of the time. It means my projects have essentially zero dependencies, so I avoid dependency hell. No complex build environment, meaning I don't need containerisation.
I'm able to maintain and secure that much more easily than, say, a standard React project that pulls in 2-3 dozen node packages. And that's just the frontend.
As far as I can tell after 15+ years of software development, the front-end world has become seriously, unnecessarily complex and bloated in order to achieve very simple things.
There's a massive over-reliance on libraries instead of people just being able to build simple functionality.
A perfect example: while porting a simple existing frontend to React, we had a feature that simply showed a remote desktop ID in a floating div when a button was clicked, a simple show/hide feature.
Literally 1 line of code that toggles a div's visibility using a ternary statement.
One of the guys on the team couldn't do that and instead used an external library to achieve the same functionality.
A library that now becomes a build dependency that needs to go through a security audit, as does every subsequent update to that library, which is something we now need to track, not to mention the size of the library is tens of KB.
All to do the same thing one line of code did. We are going backwards.
@@codetothemoon I agree with this. Also, paying $30 a month for scalable software is nothing compared to an engineer's wage. Seems like a no-brainer to go with cheaper, time-saving tech