I know this will sound smug and I am sorry, but: 100% faster should be 0 seconds. So the 377% faster and 509% faster mentioned at 3:10 make no sense. What do those numbers mean? How did you calculate them?
I believe he meant something like: 100% = two times as performant -> final time x/2; 377% = 3.77 times as performant -> final time x/3.77. If it took 5 seconds and now it takes me 1 second, I would say my thingy is doing 500% better, because I can do one thingy five times in the time the old thingy took to do one
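For anyone puzzled by the same thing, here's the arithmetic under both readings, with made-up before/after times (the 377%/509% figures in the video presumably use the first reading):

```python
# Two common readings of "X% faster"; the times here are made up
# purely to illustrate the arithmetic.
old, new = 5.66, 1.5  # hypothetical before/after seconds

# Reading 1: the new speed as a percentage of the old speed
# ("377% faster" ~= 3.77x the throughput).
ratio_pct = old / new * 100

# Reading 2: strict percent improvement over the old time.
improvement_pct = (old / new - 1) * 100

print(round(ratio_pct), round(improvement_pct))  # 377 vs 277
```

The two conventions disagree by exactly 100 percentage points, which is usually the whole argument in these comment threads.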
Impressive. But wouldn't it be even faster if we removed some more requests and rolled the "Your Server", "UT Ingest Server" and "S3" components into a single thing managing uploaded files? Something that kind of works as a common base for data?
hang on, are you basically getting the files from the clients now? Will you have the same bandwidth as with direct S3? Will you pay for the ingress traffic?
uploading should really just be a single chunked-transfer HTTP request with a single response. the server can easily authenticate that and save the partial data to get resumability, and more
Thanks @dsherc, this is insane. Just a side note: this could be a bit misleading. I see that you mentioned we can't check the file size of files being uploaded. But even if we can't dynamically check file size while uploading, we can limit the max file size by adding a size cap to the presigned POST. Essentially this is what I did: on upload requests we ask for the file size, and we return the presigned URL with the file size added to the signature as a cap.
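A minimal sketch of that scheme, using a generic HMAC token rather than AWS's actual SigV4 presigned POST (all names here are hypothetical; the point is just that the size cap lives inside the signature, so the client can't raise it):

```python
import hashlib
import hmac
import json

SECRET = b"server-side-signing-key"  # hypothetical key, never sent to clients

def issue_upload_token(filename: str, declared_size: int) -> dict:
    """Client declares a size up front; server signs the cap into the token."""
    payload = json.dumps({"file": filename, "max_bytes": declared_size}, sort_keys=True)
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}

def verify_upload(token: dict, actual_size: int) -> bool:
    """On upload, reject if the signature is bad or the file outgrew its cap."""
    expected = hmac.new(SECRET, token["payload"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["sig"]):
        return False
    return actual_size <= json.loads(token["payload"])["max_bytes"]

token = issue_upload_token("cat.png", 3_200_000)
print(verify_upload(token, 3_100_000))  # True: within the signed cap
print(verify_upload(token, 9_000_000))  # False: exceeds the cap
```

With real S3 you get the same effect for free by putting a `content-length-range` condition in the presigned POST policy, since S3 enforces the signed conditions itself.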
I think you are ignoring a huge security loophole in your logic. If the browser gets the presigned URL, then they can just use it directly without having to go through your ingest server, thus ending up with ghost files anyways
It's kind of sad that resumable file transfer is a big feature now, because I remember it being a standard thing when I was a kid. It was lost somewhere along the way, and I'm glad to see someone is paying attention.
@@danhorus oh right, I got the impression it was the client's own authentication and a direct upload to S3. I obviously don't understand what this solution provides.
@@dancarter5595 An easier way to upload things? They also add some code to the process so you don't need to do it yourself. I mean, it's like using Vercel so you don't have to set up your own infra.
Yeah, that's the thing here that sort of defeats using it for anything production that is user-data sensitive. In the EU at least, cause the US ofc doesn't care about user data. You are going to be in breach of GDPR: since you are the administrator of the data, you cannot share it with third parties without consent.
So the upload/forward from the UT ingest server to the "S3" is now not validated, which means that if the connection between those two fails at any point, you get invalid results. That is a huge cost. Even with validation, the ingest server would need to store the files until the actual upload/forward completes. Yes, even if you practically pipe the upload directly between two sockets.

Additionally, you keep connections alive (from the file upload to the ingest server) while waiting for the response of the external server. That's not good: if those servers take longer than expected to respond, your ingest server may stack up a bunch of inactive sockets that it keeps open for no reason other than waiting. You essentially now have an external bottleneck for your hosted server, costing you resources.

Also, as you said yourself, the difference is much bigger with smaller files, which just means your overhead from separate requests got reduced. Of course fewer requests means less added latency. The percentages are kind of misleading; you would actually need a graph that shows the difference depending on file size.
Fwiw, Lambdas are not the only way to have serverless compute in AWS. ECS Fargate also offers the benefits of serverless (scale to zero, pay for what you use, etc.) without the limitations of Lambda.
Well, that is nothing surprising; everyone should know that every serverless or cloud-compute application always has overhead. It is like saying the new file upload built in Rust is 10x faster than in JavaScript lol
Congrats on v7!! Quick question. If the uploads go to your server and then from your "proxy" to s3, aren't you duplicating network usage at the same time? I imagine that for large videos/files it would get quite expensive compared to the previous approach
@@t3dotgg nice! Hope they keep not charging for that in the future 😂 I guess this would become more noticeable if you allow "bring your own bucket" as it will no longer be in your account. What about the cost of re-processing/proxying the video/file on your server? You will go from 0 to "something". Really curious about this as well!
I love it. I had implemented the same structure you had in the past, and I was planning on creating an ingest server to propagate data, super similar to your architecture. That's a great validation of the concept. I would love to use your project, but I run everything over gRPC to traffic the data.
@@kevboutin everything is ok in the right context. Personally I tend to shy away from anything that names itself something that it clearly isn't. There are always servers...
@@m12652 so the name of something is your problem? The name is not a problem for me if it solves problems and increases productivity for less money. Priorities always vary I suppose. 🤷‍♀️
I'd love to see Theo work on some Remix projects. Remix offers a great deal of built-in type safety, eliminating the need for extra implementation effort.
If Theo makes this fully free (100% self-hosted for everyone) I will be very happy. It would no longer be a service tho. But it could offer premium capabilities for companies
I'm doing a beginner's web dev course that has a file storage project. I ran into the latency issue with this architecture on day one. Originally I tried:
1. Client sends upload request to my server.
2. Server requests signed URL from Supabase.
3. Supabase responds with URL.
4. Server sends URL to client.
5. Client uploads and notifies server when it's done.
6. Server updates db and sends success response.
I can't center a div but I could tell this was horrifically slow! I noticed immediately and switched to streaming through my server to Supabase, which was 2-3x faster for small files.
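A back-of-the-envelope version of why those six steps hurt, with made-up per-hop numbers (real latencies vary wildly; the point is only that fixed round trips dominate small uploads):

```python
# Hypothetical per-hop latencies, just to show why the signed-URL dance
# hurts small files: the coordination round trips dwarf the transfer itself.
HOP_MS = 50        # assumed latency per network round trip
TRANSFER_MS = 200  # assumed time to move the bytes of a small file

# Signed-URL flow: four coordination round trips plus the upload hop.
signed_url_flow_ms = 4 * HOP_MS + TRANSFER_MS + HOP_MS

# Streaming flow: one request in; the server pipes bytes onward as they arrive.
streaming_flow_ms = HOP_MS + TRANSFER_MS

print(signed_url_flow_ms, streaming_flow_ms)  # 450 vs 250
```

For a multi-gigabyte file the 200 ms transfer term grows until the extra hops stop mattering, which is why the gap the commenter saw was biggest on small files.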
I see the upload to the bucket from the client browser goes through te ingest server and forwards to the bucket hosting server. here is an idea for custom file scanning/checking: can there be a future where a a website can host their own "approval server" that receives a connection from the ingest server, and "listens in" on the file as it is being uploaded to the bucket server and gives a go/no back to the ingest server? it doesn't seem like it slows down the upload (as it is being scanned as it is uploaded), takes barely any time to get the green light, and if it gets rejected the ingest server just tells the bucket server to discard the upload and returns an error to the client browser. with how fast "just forward the packet" seems to be, it is mostly up to the approval server to respond quick enough. headers are always at the start and are the most checked thing to scan on, so by the time the file uploaded the headers has been processed and a green light has been given to ingest. Just an idea. let me know what you think.
But what did you use to build your ingest server?!?! TypeScript? .NET? Go? Rust? Something else??? I wanna know the details about your serverFULL architecture!!! There are no details in your blog post either about what you used to build your ingest server, how it's hosted, etc. I'm extremely interested in what you landed on for those tech choices.
Who is this for? Why am I able to just do uploads to s3 in all my apps with aws apis/sdks without 3rd party packages to help, let alone a 3rd party saas service? Honest question. I just don't get why this exists or would be popular beyond maybe a brand new dev following a tutorial where s3 is just out of scope... I'm either too dumb to see where the value is or too smart to depend on a saas to do what the aws sdks do for free.
> Says "Honest question" then immediately shits on the people using it by calling them "brand new devs" Assuming this is actually honest, maybe check out my other videos about UploadThing and S3? tl;dr - if you think S3 is easy to set up, your implementation is FULL of security issues and probably offers a bad user experience too Most real companies with object storage have built their own UploadThing-like solution, but we're a generic that anyone can use at any scale :)
Even these pitches at the end "now you can bring your own bucket!" and "now you can run our server directly in your infra!" seem baffling to me. We already have our own buckets and our own infrastructure simply by using s3 directly. How are those selling points of introducing a saas between us and s3? Again, honest question. I have never felt more out of touch, and can't tell if thats a good or a bad thing hahaha.
Infra matters more than anything the frontend/client could ever achieve, because on the frontend you can only show a loader, nothing else; the client has limited internet bandwidth.
When we trigger S3 uploads/copies through various means, rather than having our API state update the front end, we allow our client to hit a headObject presigned URL to assert that the object has successfully landed. Requires some ugly polling, but it's cheap polling
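That cheap-polling loop might look something like this sketch, with a stand-in for the headObject check (the names, attempt counts, and delays are all made up):

```python
import time

def poll_until(check, max_attempts=6, base_delay=0.01):
    """Poll check() until it returns True, with exponential backoff.

    `check` stands in for hitting a headObject presigned URL and
    treating HTTP 200 as "the object landed".
    """
    for attempt in range(max_attempts):
        if check():
            return True
        time.sleep(base_delay * (2 ** attempt))  # 10ms, 20ms, 40ms, ...
    return False

# Simulate an object that "lands" on the third check.
state = {"n": 0}
def fake_head_object():
    state["n"] += 1
    return state["n"] >= 3

print(poll_until(fake_head_object))  # True
```

The backoff keeps it cheap: failed polls get progressively rarer instead of hammering the endpoint at a fixed rate.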
@@t3dotgg But YT manages their object storage :) I’m genuinely surprised there’s a market for what your company is offering-it’s something an above-average developer could probably knock out in a day as part of their sprint. That said, it takes real business savvy to identify a need and turn it into a viable product with customers. No criticism of your product at all-it’s more of an eye-opener for those of us in tech about how smart business moves can make all the difference.
@@hemanthaugust7217 True, it's really amazing how the JavaScript guys can complicate everything; any reasonable backend dev finishes that in a single day with a lib
Great success! It's also quite cute that, even after so many livestreams and videos that you have done, you end up sounding a bit like a school kid presenting their project for the first time in front of the class when you are talking about something that you are really proud of.
Now it's faster, but it also costs more money: you need to run a server, you need to pay for the bandwidth, and so on. So it's a trade-off: you pay more for your infra, and you get a better user experience. It's the same as with auth: you can use a 3rd-party auth system, which saves you a ton of work, but you can't control the user experience down to the very details.
this is similar to investing: companies need to say something different, since most of them aren't innovative; instead they just go back and forth between things we have done in the past so people will invest in them.
I wonder how pricing would work with "bring your own bucket". But we're very excited for it since our organisation has rules on what geolocation a bucket can exist in. And even just using local infrastructure.
@@Itsneil17 you know that this is like saying "just make your own WordPress"? I guess UploadThing is simpler, but getting it right is really hard. That's why we use abstractions that hide the real complexity
content analysis + detection, transcoding, knowing when uploads are complete, client and server communicating directly with one another... sounds like php4
I think Theo should ditch this project. A lot of Theo fans can gather the pitchforks at me, but just read the other comments (FTP exists, you can do this directly in S3, the point of UploadThing was that the data wasn't passing through their server, Uppy exists). Yeah, don't use UploadThing.
Congrats. Owning your own infrastructure is something I always found important. Can we expect video titles like "We stopped using the cloud" with details about how you manage your own bare-metal Linux servers soon?
This just in. Serverless proven to be a buzzword to keep you purchasing overpriced subscription model technology. In other news, paint is wet when applied.
This statement is probably coming from someone who has never built any applications professionally using serverless solutions. It's a paradigm shift and one many people haven't wrapped their heads around yet. People fear what they do not understand and despise things that require LOTS of real world work to become proficient in.
Congrats on the launch, less complex and faster, net win 👍 I imagine you went with the previous architecture first because it let you bootstrap more quickly, without committing yet to the upfront cost of rolling & maintaining your own ingest server. Is that right?
This is just meant to be educational, to show what things can be slow and how to resolve them, for inexperienced developers who haven't reached or considered these steps on their journey.
@@macchiato_1881 That was my 2nd guess but I was seeing a lot of comments in bad faith so I really couldn't tell without any tone indicators lol. In a way my reply speaks to them too
@@sanjaux why do you need tone indicators? People like you need to handle negative comments better. I get not all criticism is good. But are you just going to whine at every valid negative criticism or joke you get?
@@macchiato_1881 Well the actual jokes no I'd ignore those, but criticism is best resolved through talking it out. Since this isn't criticism, more signs would have helped differentiate your joke from something actually worth discussing. Handle them better? I'm just trying to understand the thought process behind some comments (the serious ones)
I'm filming a video tomorrow about all of the dumbest things people have said about UploadThing. Reply with some good ones here and you might get featured ;)
"Typical case of things developers care about, but the customers dont"
- some twitter user
@@martinlesko1521 that’s the one that inspired the video :’) The security one was too good as well
I've been using UploadThing for a long time now. I know how an S3 bucket works, but honestly I got tripped up on handling S3 permissions initially. UploadThing is faster, smoother to configure & clean in its operations. I hope UploadThing becomes the norm for all businesses. It's really good. Wishing good luck to Theo and Julius.
"Would rather use the much more stable and simpler Amazon S3, and does speed even matter? The user should be fine waiting a few more seconds." - Some guy in discord
"I mean, just self host. ¯\_(ツ)_/¯" - Another random discord guy
Rule No.2 when you make an App: Make it slow so that when you remove the slow logic in the code, you can brag about how fast it became.
What is rule No. 1?
I mean, even if you didn't try to do that consciously, it would happen. You don't write everything perfectly the first time, especially when working on an MVP. It has to work before it can be optimized
😂
@@bastianventura exactly, premature optimization is the death of projects. Make it work, then make it fast
@@bastianventura dude, it's not a nuclear fusion equation analysing app! it's a freakin S3 uploader! you could have it up and running in 1 prompt! but I guarantee 90% of its code is to limit your ability to upload based on your tier... you should 1. PLAN 2. CODE 3. OPTIMIZE. I guess he missed 1.
the web dev world is slowly reverting. soon we will get "we used literally zero npm packages and just vanilla JS, and our product shipped 10x faster, and the average API response time is 0.0001ms"
bro i'm writing on paper.
and I am here moving from vanilla JS into npm land..
who wouldve known that less is more
Curiously, I recently discovered a way, with the vanilla Navigation API and View Transitions, to make an app like Next.js, with all the features, faster, and with no build step
While the tooling has a few npm packages for sure, Astro is great for that: you can ship zero JS if you want, proper grid layout with a few lines of CSS (as much as I love Tailwind, it adds pages and pages of CSS), and the (optional) SSR features like Astro Actions are specifically designed to work without JS.
2024 is the year of serverlesslessness
Wouldn't it be a serverfulness?
bro left vercel and realised serverless is better
And serverless was never actually serverless
Or serverfulness
Ran out of VC money 😂
1) S3 does support resumability
2) File sizes can be checked using `content-length-range`
3) S3 can reject on file extension and mime types
4) You could have ditched Lambda and done a webhook back to the server
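Point 2 in concrete form: a hedged sketch of the policy document an S3 presigned POST signs, showing where `content-length-range` (and a mime-type condition, per point 3) go. The bucket, key, expiry, and limits are made up, and real use still needs the full SigV4 signing step on top:

```python
import base64
import json

def build_post_policy(bucket: str, key: str, max_bytes: int) -> str:
    """Build the (unsigned) policy document for a browser POST upload.

    S3 enforces every condition listed here server-side, so a client
    that lies about file size or type gets rejected by S3 itself.
    """
    policy = {
        "expiration": "2030-01-01T00:00:00Z",  # made-up expiry
        "conditions": [
            {"bucket": bucket},
            ["eq", "$key", key],
            ["content-length-range", 1, max_bytes],      # size cap (point 2)
            ["starts-with", "$Content-Type", "image/"],  # mime check (point 3)
        ],
    }
    # The base64 form is what actually gets signed and POSTed.
    return base64.b64encode(json.dumps(policy).encode()).decode()

encoded = build_post_policy("my-bucket", "uploads/cat.png", 4_000_000)
decoded = json.loads(base64.b64decode(encoded))
print(decoded["conditions"][2])  # ['content-length-range', 1, 4000000]
```

Because the conditions ride inside the signature, none of this requires a byte of the upload to pass through your own server.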
Truth! I was shaking my head during so much of this.
very very true
Lmao, expecting proper knowledge from Theo is stupid. He's a YouTube influenza
THIS!! I'm a noob just starting out with backend, and even I was thinking why not just do all that on S3 directly...why need the middleman uploader?!!
Theo finally discovered servers. Massive win
1 to 2 years and we will have completed a full cycle. "new" web devs already "discovering" PHP again. Not long before people uploading HTML files to a nginx/Apache server again and calling it "zero dependency websites". This will be the new big thing.
More and more, I'm coming around to the idea that all these microservices, serverless, edge networks, etc. create way more complexity than is needed for the vast majority of use cases. We devs do love to complicate things.
but then deploying everything yourself isnt a great idea either
@@martinlesko1521 why not?
We are just learning. We want to make things better, so we try something new. Then the flaws show up and we adapt.
I've been saying that for years! Every major outage, too: it's basically always either DNS or _microservices_
Resume Driven Development.
Doesn't help AWS (in particular) sell you their shit even if it's worse for you.
One day we will figure out how to cut out the middleman entirely and upload straight to our own servers, which can then transcode files, upload them to S3, etc. Oh wait, we actually had that figured out in 2005...
I used to use TUS in C# and it was a pain in the ass, I ended up writing my own upload client and server code and the code was 10x simpler...
why would we do something faster and more logical when we can do something easy and new? Logic left the room long time ago
I can make it even simpler by removing your server and just by using free and open source Uppy to upload directly to S3/R2 or wherever, it has resumability and other plugins for free too
I think the difference is that Uppy's "ingest server" can be run by you (using Tus) or by Transloadit. You still need a server if you wish to have resumability and no ghost files, though
@@danhorus Interesting
I know the joke is that "this meeting could be an email"
But i feel UT could just be a blog describing best practices for S3 and or a config file.
What is the product offered?
😂 savage
Let us know about the costs difference later down the line, because serverless tends to be very expensive, but your new infrastructure uses a lot more bandwidth on the application side
I had a debate with this bloke on his discord 2 years ago where he couldn’t fathom that I refused to use serverless for a next app due to serverless constraints and performance issues I was having. Good to see he’s finally coming around 🎉
I thought the selling point was that with upload thing your data never passes through it.
the other day Theo was working on Laravel, now he's going back to servers, tech really is evolving backwards
The old ways are still best.
@brainiti I don't know about best, but it helps that the old ways were resource-constrained, so we knew how to make things well while being lean
S3 has resumability. You must tweak a bunch of config and code to do it. But it works.
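The config-and-code the comment alludes to is S3's multipart upload: split the file into parts of at least 5 MiB (except the last), upload each part independently, and on resume re-send only the parts that never completed. A stdlib-only sketch of the bookkeeping half (the actual part uploads and CompleteMultipartUpload call are omitted):

```python
PART_SIZE = 5 * 1024 * 1024  # S3's minimum part size (except the final part)

def plan_parts(total_bytes: int, part_size: int = PART_SIZE):
    """Return (part_number, offset, length) tuples for a multipart upload.

    On resume, you re-upload only the part numbers that never completed;
    S3 stitches everything together on CompleteMultipartUpload.
    """
    parts = []
    offset = 0
    number = 1
    while offset < total_bytes:
        length = min(part_size, total_bytes - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts

parts = plan_parts(12 * 1024 * 1024)  # a 12 MiB file
print(len(parts))   # 3 parts: 5 MiB + 5 MiB + 2 MiB
print(parts[-1])    # (3, 10485760, 2097152)
```

This is also why "tweak a bunch of config" is fair: the retry loop, part tracking, and the lifecycle rule to reap abandoned multipart uploads are all on you.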
it's incredible how such nice things happen when Vercel turns off the taps 😉
Why do I need a service for this in the first place?
It turns out you really really don't. In fact it's probably dirtier and bad practice to use this.
How is it even possible to upload 4MB of images in 1.5 seconds, nooo, impossible, upload so fast. I mean what are we even watching...
So uploadthing is an abstraction on s3? S3 already has a dead simple API so what am I missing?
next step: don't use a SaaS and set up S3 on your own
yes but then you need SRE engineers, which is not cheap.
The most astonishing thing is how this can be a product someone pays for :) 99.99999% of it is just S3.
Did you watch the video?
Wait until he figures out how quick and simple FTP is...
Wait until pfqniet realizes that this is built for people with actual users...
@@t3dotgg well, in many cases FTP was enough for enterprises, so... :D
@@d3stinYwOw it still is 😢 (sftp will NEVER die)
@@t3dotgg just steer clear of bank tech and you'll never have to find out
@@t3dotgg lol you sound so goofy when you reply to people like this
one year later.
"we made file reading 10x faster and lowered our cloud cost 10x by going bare metal."
it is always nice to see people get excited when they reinvent the wheel
Brb going to make a website in Assembly.
@@BlueEyesWhiteEagle that never happened before, but I see some JS frameworks developing a coding style that's very similar to spaghetti PHP code.
@@orcofnbu its sarcasm bud. Lol
Legitimate question, but isn’t 1.5s to upload 3.2MB still really slow? I don’t know what kind of internet you have, but a 50mbps upload would’ve sent the data in 500ms, what is taking the extra second?
Groundwork, check the 5:58 mark.
@@ramonsouza9846 well, 11:15 is the *upgraded* version of the app... Sure it's faster, but I wouldn't call a turtle amazing when compared to a snail if it could have the speed of a rabbit...
Theo realized that he would be homeless if he continued using serverless
Bro has 7 major versions in a year
We follow semver :)
@@t3dotgg 7 breaking changes in a year? Still insane
@@alexeydmitrievich5970 it's a new product, of course they are gonna have a lot of breaking changes
Yikes 😬
Uhu, so your customers had to rewrite the entire integration 7 times in the same year? So sad for people with real projects
So, if I write my own upload logic, instead of using serverless upload services (like uploadthing), my apps will be much faster?
Wow this is massive reduction in complexity! I hope though one day we'll have technology advanced enough to use this thing called "Your server" to store a file. Sure hope we would be able to achieve even less arrows on the graph then...
the worst part is, Theo trash-talked DHH's blog post about leaving the cloud just a year ago, and he is slowly getting towards it...
1.5s to upload 4 images and a total of under 4MB? That's the fast version that has chat asking how it's possible?
Yeah haha, people never deal with massive uploads, nowadays SaaS is the goat
@@victor95pc I think you mean S3 the goat.
The bring-your-own-bucket is really important. We have contracts at work that specify we have to store customer data in Australia, so if we can't control where it's stored, we can't use the service.
This might be a stupid question, but what is the advantage this service provides over a library integrated on my server or front end?
@@aaronevans7713 I suppose you'd only use the AWS S3 SDK in the back-end server anyway and send pre-signed URLs to the front-end, right? Otherwise you'd have to push some form of credentials to the front end. Honest question: What's the issue with a 3.2 MB (uncompressed JavaScript) client in the back-end?
This puts a limit on the bandwidth available as you are proxying the file uploads to s3, if you have a ton of concurrent uploads you will also need to scale your own servers.
Not if the "Ingress Server" is on AWS EC2. Instead of paying S3 traffic coming from internet, they are paying S3 traffic from inside AWS (Which may be even cheaper).
Incoming traffic to the server from internet is free (Well, let's say included on the per-hour price)
@@framegrace1 I am not talking about pricing, but about bandwidth. S3 has distributed endpoints for content delivery and you can have 100s of people upload simultaneously at high Mbps; on the other hand, your one EC2 instance is limited to whatever Mbps Amazon gives it, and if you try to upload 4-5 big files at the same time (from different users with good bandwidth) it will bottleneck it for everyone
@@halfsoft If they use a normal single EC2 instance on the free tier, of course. But I guess they have someone who knows what they are doing.
@@framegrace1 And what is your point exactly? What I said is that to handle more concurrent users they will need to scale the number of instances they run. Then they need load balancing to distribute traffic across the EC2 instances. And what's more, you lose the advantages of the distributed infrastructure Amazon has built for S3.
Did you account for filesystem caching before/after in your demo? I assume so, given the breadth of architecture changes you described. But caching recently used files in RAM (as modern operating systems tend to do) can make a very noticeable difference in responsiveness. Especially if the files come from spinning disks, network storage, RAID with parity, SATA SSDs... Pretty much anything but NVME.
Any kind of A/B performance testing where caching is a possibility requires either pre-caching all inputs (run it several times until the numbers look stable) or somehow guaranteeing that the inputs will never be cached. I'm sure you already knew that. But people often forget that it applies to their own demos.
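For what it's worth, the "pre-caching" option is just a warm-up loop before measuring. A rough sketch (function and parameter names are my own, not from the video):

```typescript
// Rough sketch of benchmarking with cache warm-up (all names hypothetical).
// Run the operation a few times first so OS file caches (and JIT, etc.) are
// hot, then average only the measured runs.
async function benchWarm(
  op: () => Promise<void>,
  warmups = 3,
  runs = 5
): Promise<number> {
  // Warm-up phase: prime caches, discard all timings
  for (let i = 0; i < warmups; i++) await op();

  // Measured phase: only these runs count toward the result
  let total = 0;
  for (let i = 0; i < runs; i++) {
    const t0 = Date.now();
    await op();
    total += Date.now() - t0;
  }
  return total / runs; // mean latency of the warm runs
}
```

The alternative (guaranteeing inputs are never cached) usually means dropping OS caches or generating fresh random inputs per run, which is harder to do reliably in a demo.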
Its the old convenience vs performance choice in software. A tale as old as time.
Did you do any load/performance tests for your UT Ingest Server? Would be really nice to have a video just on that :) Also scaling of this server is an interesting topic...
This can be solved simply by the client notifying the server once the file upload is done. This is just over-engineering at its finest. His reasoning was that there would be ghost files if the client didn't notify the server. The solution to that is to always upload to a temp location and move the file to its actual location once the client confirms the upload finished, with an S3 lifecycle rule that deletes stale temp files based on their last-modified date.
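A minimal sketch of the lifecycle rule that approach relies on (the prefix and retention window are assumptions, not anything from the video):

```json
{
  "Rules": [
    {
      "ID": "expire-unclaimed-temp-uploads",
      "Status": "Enabled",
      "Filter": { "Prefix": "tmp/" },
      "Expiration": { "Days": 1 }
    }
  ]
}
```

Anything the client never confirms stays under `tmp/` and S3 deletes it after a day; confirmed files get copied to their final key before that.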
does it allow for resume and improve the time for smaller uploads? Still, they made their changes, and going back to S3 isn't feasible for their marketing either. Plus they now support other types of "buckets", so I guess it isn't just about S3 being used inefficiently; instead it gives them marketing leverage to be more independent and agnostic
Doesn't moving files cost money with S3? Not sure
@@theairaccumulator7144 Once in the region, you can transfer within the region for free. Going back out of the region will cost again.
We always did that with Rails years ago using a free gem maintained by the community not a SaaS company
> This can be simply solved by client notifying the server once the file upload done.
Rule #1 of web security:
Never believe the client.
Thank you for sharing the details, this is great work, excited for you guys and I may become a customer in the near future :)
Hey @t3dotgg, curious to know if/how you have mitigated against slowloris DoS attacks with the new architecture?
I know this will sound smug and I am sorry, but: 100% faster should be 0 seconds. So 377% faster and 509% faster mentioned at 3:10 makes no sense, what do those numbers mean? How did you calculate them?
I believe he meant something like:
100% = two times as performant -> final time x/2
377% = 3.77 times as performant -> final time x/3.77
If it took 5 seconds and now it takes me 1 second I would say my thingy is doing 500% better, because I can do one thingy five times in the time the old thingy took to do one
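One common convention (an assumption on my part; the video doesn't define its formula) is percent improvement relative to the new time, which shows exactly why these phrases get confusing:

```typescript
// One common reading of "X% faster" (an assumption, not the video's
// definition): percent improvement in throughput,
// i.e. ((oldTime / newTime) - 1) * 100.
function percentFaster(oldMs: number, newMs: number): number {
  return (oldMs / newMs - 1) * 100;
}

// 5 s -> 1 s: 5x the speed, i.e. 400% faster by this convention
console.log(percentFaster(5000, 1000)); // → 400
```

So the same 5s → 1s change can be called "500% of the old speed" or "400% faster" depending on the convention, which is probably where the 377%/509% numbers come from.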
we will be able to use uploadthing to upload to our own google bucket? mind blowing!
I call it Stupidity as a Service, or SaaS in short. Ain't that a bootyful thing?
Right after the Vercel sponsorship ended we get this..?
Anyone needs a s3 upload proxy?😮
But why? U can just upload to s3 directly.
Hot babes like it when u be SaaSsy... at least that's what she said.
Interesting to see you share the thought process behind everything, helps to learn :)
Impressive. But wouldn't it be even faster, if we remove some more requests and roll the "Your Server", "UT Ingest Server" and "S3" components into a single thing managing uploaded files? Something that kind of works as a common base for data?
hang on, are you basically getting the files from the clients now? Will you have the same bandwidth as the direct S3? Will you pay for the ingress traffic?
What I have learned, when it comes to IT.. the absurd amount of work is usually necessary due to initial incompetence...
With the amount of time webdev goes full circle I am surprised we never get dizzy.
uploading should really just be a single chunked-transfer HTTP request with a single response. the server can easily authenticate that and save the partial data to get resumability, and more
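The resume bookkeeping such a server would keep per upload can be sketched roughly like this (the class name and shape are my own invention, not any real API):

```typescript
// Hypothetical sketch: per-upload state for a chunked, resumable upload.
// The server appends chunks as they arrive; on reconnect it tells the
// client how many contiguous bytes it already has, so the client can
// resume from that offset.
class PartialUpload {
  private received = 0;
  private parts: Buffer[] = [];

  // Persist an arriving chunk (in-memory here; a real server would
  // write to disk or object storage)
  append(chunk: Buffer): void {
    this.parts.push(chunk);
    this.received += chunk.length;
  }

  // Byte offset the client should resume from after a dropped connection
  resumeOffset(): number {
    return this.received;
  }

  // Reassemble the full body once the final chunk has arrived
  assemble(): Buffer {
    return Buffer.concat(this.parts);
  }
}
```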
Thanks @dsherc, this is insane.
Just a side note: this could be a bit misleading.
I see that you mentioned we can't check the file size of files being uploaded.
But even if we can't dynamically check the size mid-upload, we can enforce a max file size by adding a size cap to the presigned POST.
Essentially this is what I did: on upload requests we ask for the file size, then return the presigned URL with the file size baked into the signature as a cap.
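For anyone curious, the size cap lives in the POST policy document that gets signed. A sketch of building that policy (bucket/key names are made up, and the actual HMAC signing step, which an SDK normally handles, is omitted):

```typescript
// Hypothetical sketch: the POST policy behind a presigned S3 POST.
// The "content-length-range" condition makes S3 itself reject any body
// outside the declared byte range, server-side, no proxy needed.
function buildPostPolicy(bucket: string, key: string, maxBytes: number): string {
  const policy = {
    expiration: new Date(Date.now() + 5 * 60_000).toISOString(), // short-lived
    conditions: [
      { bucket },
      ["eq", "$key", key],
      // Enforced by S3: uploads smaller than 1 byte or larger than
      // maxBytes fail with a 4xx at upload time
      ["content-length-range", 1, maxBytes],
    ],
  };
  // The base64 policy is what gets HMAC-signed with the secret key
  // (signing omitted here; the SDK's createPresignedPost does this).
  return Buffer.from(JSON.stringify(policy)).toString("base64");
}

const encoded = buildPostPolicy("my-bucket", "uploads/avatar.png", 5 * 1024 * 1024);
```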
BYOB: Bring Your Own Bucket
Hoped for more info about the new server ( why no serverless, what's the tech, etc. ), but this looks amazing and makes sense now 😊 Great video 😊
I think you are ignoring a huge security loophole in your logic. If the browser gets the presigned URL, then they can just use it directly without having to go through your ingest server, thus ending up with ghost files anyways
Aka. We started using servers, the results are insane!
It's kind of sad that resumable file transfer is a big feature now, because I remember it being a standard thing when I was a kid. It was lost somewhere along the way, and I'm glad to see someone is paying attention.
S3 doesn't support resuming!? Jesus Christ. This is exactly what I mean.
@@guard13007 S3 does support resumability and more. His UploadThing is useless
Up next: I went back to client side react
Oh cool, now my third party upload service has access to all the data I store. Neat.
They already had access before, no? It's their S3 bucket
@@danhorus oh right, I got the impression it was clients own authentication and direct upload to S3. I obviously don't understand what this solution provides.
@@dancarter5595 An easier way to upload things? They also add some code to the process so you don't need to do it yourself. I mean it's like using Vercel so you don't have to set your infra.
Yeah, that's the thing here that sort of defeats using it for anything production-grade that handles sensitive user data. In the EU at least, since the US of course doesn't care about user data. You'd be in breach of GDPR: as the controller of the data, you cannot share it with 3rd parties without consent.
Sounds like serverless slop is circling back. Also Just uploading directly to S3 is theoretically still faster.
I'd assume S3 to even practically be faster.
@@shubhamcweb Yeah, based on his illustration, simply removing the middleman in between with a direct connection to S3 is obviously faster.
So the upload/forward from the UT ingest server to the "S3" is now not validated. Which means if the connection between those two fails for some reason at any point, you get invalid results. That is a huge cost.
In theory even if you had validation, the ingest server would need to store the files until the actual upload/forward completes. Yes, even if you practically pipe the upload directly between two sockets.
Additionally you keep connections alive (from the file upload to the ingest server) while waiting for the response of the external server. That's not good. If these servers take longer than expected to respond, your ingest server may stack a bunch of inactive sockets which it keeps open for no other reason than waiting. You essentially now have an external bottleneck for your hosted server, costing you resources.
Also, as you said yourself, the difference is much higher with smaller files, which just means your per-request overhead got reduced. Of course fewer requests means less added latency. The percentages are kind of misleading; you'd actually need a graph showing the difference as a function of file size.
Fwiw, Lambdas are not the only way to have serverless compute in AWS. ECS Fargate also offers the benefits of serverless (scale to zero, pay for what you use, etc.) without the limitations of Lambda.
Well, that's nothing surprising; everyone should know that any serverless or cloud-compute application always has overhead. It's like saying "the new file upload built in Rust is 10x faster than in JavaScript" lol
14:50 No offense but weird to round one up and the other down, when both are 733 ms.
3733 ms -> almost 4 seconds
733 ms -> almost half a second
If the amount is greater than 1, it's natural to round to a whole number :)
Yeah, the double-standard rounding is a bit cringe. But you can tell he's really happy, and with those numbers, I'd be happy too.
Are bandwidth costs negligible now? If not this seems much more expensive for UT to scale.
You've truly mastered the art of making things simple (or should I say, too simple) while monetizing the convenience. Well played. 👏
hmm... i dont get it. why not just request s3 upload permission from the client and upload directly to s3? bit confused...
I am working on an adult website. can i use upload thing or its against the TOS?
1.3 seconds to upload 5 MB doesn't sound quick; maybe I'm missing something, but in 2024 this is an awfully slow result
Congrats on v7!! Quick question. If the uploads go to your server and then from your "proxy" to s3, aren't you duplicating network usage at the same time? I imagine that for large videos/files it would get quite expensive compared to the previous approach
If the storage and server are in the same AWS region and account, AWS will not charge :)
@@t3dotgg nice! Hope they keep not charging for that in the future 😂 I guess this would become more noticeable if you allow "bring your own bucket" as it will no longer be in your account.
What about the cost of re-processing/proxying the video/file on your server? You will go from 0 to "something". Really curious about this as well!
I love it. I had implemented the same structure you had in the past, and I was planning on creating an ingest layer super similar to your architecture. That's great validation of the concept.
I would love to use your project, but I run everything over gRPC to move the data.
It's pretty cool that you naturally use a sequence diagram to explain it without even thinking about it, or at least without mentioning it.
Time to remove all the sleeps in the code
You improved your product by eliminating network hops, as you should. But the main component (S3) is still serverless.
How do you host your ingestion server? Are you running your own k8s cluster?
That upgrade sounds like the logical path. Amazing optimization and simplification from the user's perspective!
Serverless has become a huge pain, I'll definitely not use it for a new project.
It was just another pointless sales pitch...
This is such a ridiculous comment. I can provide dozens of real examples where serverless has transformed team productivity.
@@kevboutin everything is ok in the right context. Personally I tend to shy away from anything that names itself something that it clearly isn't. There are always servers...
@@m12652 so the name of something is your problem? The name is not a problem for me if it solves problems and increases productivity for less money. Priorities always vary I suppose. 🤷♀
I'd love to see Theo work on some Remix projects. Remix offers a great deal of built-in type safety, eliminating the need for extra implementation effort.
If theo makes this fully free (100% self hosted for everyone) I will be very happy
It would be no longer a service tho
But could offer premium capabilities for companies
I'm doing a beginner's web dev course that has a file storage project. I ran into the latency issue with this architecture on day one. Originally I tried:
1. Client sends upload request to my server.
2. Server requests signed URL from Supabase.
3. Supabase responds with URL.
4. Server sends URL to client.
5. Client uploads and notifies server when it's done.
6. Server updates db and sends success response.
I can't center a div but I could tell this was horrifically slow! I noticed immediately and switched to streaming through my server to Supabase which was 2-3x faster for small files.
I see the upload from the client browser goes through the ingest server, which forwards it to the bucket hosting server.
here is an idea for custom file scanning/checking:
can there be a future where a website can host their own "approval server" that receives a connection from the ingest server,
and "listens in" on the file as it is being uploaded to the bucket server and gives a go/no back to the ingest server?
it doesn't seem like it slows down the upload (as it is being scanned as it is uploaded), takes barely any time to get the green light,
and if it gets rejected the ingest server just tells the bucket server to discard the upload and returns an error to the client browser.
with how fast "just forward the packet" seems to be, it is mostly up to the approval server to respond quick enough.
headers are always at the start and are the most-checked thing to scan,
so by the time the file has uploaded, the headers have been processed and a green light has been given to ingest.
Just an idea. let me know what you think.
But what did you use to build your ingest server?!?! typescript? .NET? Go? Rust? Something else??? I wanna know the details about your serverFULL architecture!!! There's no details in your blog post either about what you used to build your ingest server in, how it's hosted, etc. I'm extremely interested in what you landed on for those tech choices.
Who is this for? Why am I able to just do uploads to s3 in all my apps with aws apis/sdks without 3rd party packages to help, let alone a 3rd party saas service? Honest question. I just don't get why this exists or would be popular beyond maybe a brand new dev following a tutorial where s3 is just out of scope... I'm either too dumb to see where the value is or too smart to depend on a saas to do what the aws sdks do for free.
> Says "Honest question" then immediately shits on the people using it by calling them "brand new devs"
Assuming this is actually honest, maybe check out my other videos about UploadThing and S3? tl;dr - if you think S3 is easy to set up, your implementation is FULL of security issues and probably offers a bad user experience too
Most real companies with object storage have built their own UploadThing-like solution, but ours is a generic one that anyone can use at any scale :)
Even these pitches at the end "now you can bring your own bucket!" and "now you can run our server directly in your infra!" seem baffling to me. We already have our own buckets and our own infrastructure simply by using s3 directly. How are those selling points of introducing a saas between us and s3? Again, honest question. I have never felt more out of touch, and can't tell if thats a good or a bad thing hahaha.
Kudos Theo! And thank you for driving us away from serverless!
"just forward the packet bro, don't process it" *makes app 5x faster*
Do you now need to pay more for ingress into the VPC to the server compared to the user sending directly to S3?
Infra matters more than anything the frontend/client could ever achieve.
On the frontend you can only show a loader, nothing else, because the client has limited internet bandwidth.
When we trigger S3 uploads/copies through various means, rather than having our API state update the front end we allow our client to hit a headObject presigned url to assert that the object has successfully landed. Requires some ugly polling but it’s cheap polling
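Sketched out, that cheap polling loop is something like this (function and parameter names are hypothetical, and `fetchFn` is injected so it can be stubbed):

```typescript
// Hypothetical sketch of polling a presigned headObject (HEAD) URL until
// the object has landed in S3, instead of waiting for an API state update.
type FetchLike = (url: string, init?: { method: string }) => Promise<{ status: number }>;

async function waitForObject(
  presignedHeadUrl: string,
  fetchFn: FetchLike, // pass the global fetch in real use; stubbed in tests
  { attempts = 10, delayMs = 500 } = {}
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    const res = await fetchFn(presignedHeadUrl, { method: "HEAD" });
    if (res.status === 200) return true; // object has landed
    // 403/404 until the upload or copy completes; wait and retry
    await new Promise((r) => setTimeout(r, delayMs));
  }
  return false; // gave up; caller decides how to surface the timeout
}
```

HEAD requests carry no body, which is what keeps the polling cheap.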
So the product is a S3 proxy server? Alright
Technically speaking, YouTube is also just a proxy server on top of object storage ;)
Technically speaking that's only a part of their API
@@t3dotgg But YT manages their object storage :) I'm genuinely surprised there's a market for what your company is offering; it's something an above-average developer could probably knock out in a day as part of their sprint. That said, it takes real business savvy to identify a need and turn it into a viable product with customers. No criticism of your product at all; it's more of an eye-opener for those of us in tech about how smart business moves can make all the difference.
@@hemanthaugust7217 True, it's really amazing how the JavaScript guys can complicate everything; any reasonable backend dev could finish that in a single day with a lib
Everything is an API over a storage
What would the pricing structure be like for BYOB?
I thought it was gonna be like "not serverless anymore but ... edge"
bring your own bucket + file filtering seems interesting
Great success! It's also quite cute that, even after so many live-streams and videos that you have done, you end up sounding a bit like a school kid presenting their project the first-time in front of the class, when you are talking about something that you are really proud of.
Why can we only sign in with Github?
Now it's faster, but it also costs more money: you need to run a server, pay for the bandwidth, and so on. So it's a trade-off: you pay more for your infra and you get a better user experience. It's the same as with auth; you can use a 3rd-party auth system, which saves you a ton of work, but you can't control the user experience down to the details.
Great use case! Love this kind of video
this is similar to investment, companies need to say something different, since most of them aren't innovative, instead they just go back and forth between things we have done in the past so people will invest in them.
I wonder how pricing would work with "bring your own bucket". But we're very excited for it since our organisation has rules on what geolocation a bucket can exist in. And even just using local infrastructure.
Just make your own infra/software for this. Waste of money spending it on upload thing
@@Itsneil17 you know this is like saying "just make your own WordPress"? I guess UploadThing is simpler, but getting it right is really hard. That's why we use abstractions that hide the real complexity
@@Itsneil17 making our own infra/software also costs money.
@@RedPsyched I've made my own infra for stuff like this. Yes it wasn't cheap at the start but now it costs less than using 3rd party
@@Qrzychu92 yes, in fact, what a discovery that not everyone uses WP. People use frameworks, not a drag-and-drop editor, for building websites.
Your servers are in Elixir now?
content analysis + detection, transcoding, knowing when uploads are complete, client and server communicating directly with one another... sounds like php4
I think theo should ditch this project.
A lot of Theo fans can gather the pitchforks at me, but just read the other comments (FTP exists, you can do this directly in S3, the point of UploadThing was that the data wasn't passing through their server, Uppy exists)
yeah , don't use uploadthing
I want to do audio file uploads through an API endpoint to S3. Should I keep SQS in front of Lambda with the maximum timeout, or make a microservice?
Congrats. Owning your own infrastructure is something I always found important. Can we expect video titles like "We stopped using the cloud" with details about how you manage your own bare-metal Linux servers soon?
This just in. Serverless proven to be a buzzword to keep you purchasing overpriced subscription model technology. In other news, paint is wet when applied.
This statement is probably coming from someone who has never built any applications professionally using serverless solutions. It's a paradigm shift and one many people haven't wrapped their heads around yet. People fear what they do not understand and despise things that require LOTS of real world work to become proficient in.
C'mon bro, next you are gonna try to tell me water is wet or something?
Congrats on the launch, less complex and faster, net win 👍
I imagine you went with the previous architecture first because it let you bootstrap more quickly, without committing yet to the upfront cost of rolling & maintaining your own ingest server. is that right?
You mean to say, removing the thing which causes your thing to be slow makes your thing go fast? 🤯🤯🤯🤯🤯🤯
This is just meant to be educational: to show what things can be slow and how to resolve them, for inexperienced developers who haven't reached or considered these steps on their journey.
@@sanjaux it's a joke. Jesus karen
@@macchiato_1881 That was my 2nd guess but I was seeing a lot of comments in bad faith so I really couldn't tell without any tone indicators lol. In a way my reply speaks to them too
@@sanjaux why do you need tone indicators? People like you need to handle negative comments better. I get not all criticism is good. But are you just going to whine at every valid negative criticism or joke you get?
@@macchiato_1881 Well the actual jokes no I'd ignore those, but criticism is best resolved through talking it out. Since this isn't criticism, more signs would have helped differentiate your joke from something actually worth discussing. Handle them better? I'm just trying to understand the thought process behind some comments (the serious ones)