Sqlite Is Getting So Good
- Published 21 Nov 2024
- Recorded live on twitch, GET IN
Article
turso.tech/blo...
support me via the link
tur.so/topshelf
By: Glauber Costa | x.com/glcst?re...
My Stream
/ theprimeagen
Best Way To Support Me
Become a backend engineer. It's my favorite site
boot.dev/?prom...
The best way to support me is also to support yourself in becoming a better backend engineer.
MY MAIN YT CHANNEL: Has well edited engineering videos
/ theprimeagen
Discord
/ discord
Have something for me to read or react to?: / theprimeagen
Kinesis Advantage 360: bit.ly/Prime-K...
I came here for Sqlite but the presentation just mentioned it and then took a hard turn to Rust and deadlocks.
Fun fact: I referenced a video of yours about SQLite for a NotebookLM generated podcast. The "hosts" now call SQLite "Squeel Lite". Thanks for nothing pal! :)
that's crazily hilarious 🤣🤣
😂
try not to train your models on inferior data sources
I used to say it ironically. Now I say it "squeel lite" without even thinking about it
Once you hear squeel, you can never go back. It's just too catchy and correct.
Sqlite is godly. I started using it instead of json files for small projects. Somehow it's faster for tiny datasets. Didn't expect that.
It’s faster than the file system, check their blog
@@paladin9876 yeah cause file system works on pages, while sqlite directly uses disk.
Both use pages. If you have 10 files and you want to use them, your app must use fopen() ten times. But if those same files are stored in a Sqlite database then we only have to use fopen() one time. That's the difference.
@@martijnb3381 that's not all. It also depends on how one is updating the JSON file: if a program is just dumping/stringifying and writing the whole data on each update, that's also an issue, especially if said data is tens or hundreds of MB.
Exactly what the creator of SQLite said. "SQLite was not created to go up against databases like PostgreSQL or MySQL, it was created to go up against flat file databases."
1:07 Reinventing the wheel is a fantastic way to understand the _wheel_ more broadly.
We should all endeavor to re-invent the wheel once. If for nothing more than to get it out of your system. But the key is to do that BEFORE people start paying you to code.
@@seanwoods647 Or, do it in the downtime when no major features are being planned. When code has gone to production you get lots of use cases that _should_ be turned into Tests.
I use SQLite for my beginner projects whenever I start a new programming language, typically a TODO list (yes, technically reinventing the wheel, no I don't care that there's a lot of them, it's for learning, people should reinvent the wheel to learn), and it's amazing
Bro same. Together we can flood the market and monopolize shitty to do apps.
"Blame the tools."
I literally LOL'd. Perfect delivery.
To be fair, sometimes the tools are indeed buggy 😅
Rawdogging sync rust is indeed the way the lord intended it. But i am using Embassy, which is async and i am loving it. Calling await on a physical pin that resolves when there is physically 3.3v on it is wild.
hardware interrupts ?
@@marwan7614 yup. that's what they are under the hood. but you don't have to write anything by hand. just throw in an await and the computer does its thang.
🔥 love the sponsorship Acknowledgement straight out of the gate.
you do realize it is required by law right?
@@schachmatsch4790 no shit, but the fact that he says it so clearly and boldly is a lot different than others that just have a small notice somewhere most people won't directly see it
What's it with you guys and sponsorship.
If a video is sponsored and the creator lies about the content; I stop watching the creator. simple
@@schachmatsch4790 I'm not familiar with the law. Platform thing or something else?
@@p0xygen Except for the fact that he hedges. "Turso does pay me" and "This article, I guess, you would definitely be able to call this thing sponsored." What thing? Is he saying it just because Turso pays him for something else? What thing is he talking about, this youtube video? If Turso is paying him to make the video, just say so.
Imagine when Prime discovers formal methods and starts writing specifications
He'll finally write Haskell..
Prime and writing D and Ada? No way
He might even learn to spell TLA+
The expert system I maintain (IRM) is actually one of the original applications that Sqlite was developed for.
DST is the acronym for STD in my native language. Listening to this video was an experience.
I'm not sure if Daylight Saving Time is that in my timezone.
In the Star Wars Jedi Academy community it refers to Dark Side Tools which is a bunch of hacks like wallhack, aimbot etc which is quite notorious lol
7:30 When you hold a non-async safe lock across an async boundary, the executor might suspend the task for an unknown amount of time.
During that time another task that waits on the same mutex might be woken up and then wait until the first task lets go of the mutex, which will block the executor from continuing to execute the first task, which leads to a deadlock where all threads of the executor are blocked by the mutex.
Yep. And async mutexes explicitly exist to avoid this deadlock. For whatever reason they didn't use one. Looks like an application bug and not a Rust bug.
@@justanothercomment416 I mean that it was an application bug was never in question.
And using a non-async lock in async code, *actually* can have its advantages.
async locks work by keeping a linked list of the task wakers that are waiting on it, which has a certain overhead.
So if you know you *never* keep a lock over an async-await point, it might actually be worth using a non-async lock.
(Especially in some low-level code, for example when implementing a future manually.)
@@justanothercomment416 It's actually officially recommended in the async Rust book, and I think in a Jon Gjengset video about async, to use *sync Mutexes* for small critical sections, *as long as you don't straddle an async boundary*. Of course, that turned out to be harder than it looks.
@@VivekYadav-ds8oz Interesting. Because I looked before I even commented and found async mutexes to be the de facto recommendation. It would appear some of those with "authority" don't really understand what they are saying. Because the async implementation explicitly exists to avoid all of this nonsense.
@@justanothercomment416 I.. highly doubt that. The "official", or as you rightly called it - "authoritative" recommendations do explicitly point out these potential problems. But the thing is, async mutexes are very slow. And a lot of the times, the critical section is really small and doesn't involve an async operation, maybe like pushing into a HashMap, or updating a counter, etc. It did irk me too the first time I heard it, as it can create the problems precisely mentioned here. However, it is a risk people are willing to take, and as we saw sometimes it might not pay off.
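To make the thread above concrete, here is a minimal sketch, using only the standard library, of what "holding a sync lock across an await point" means. It is not the code from the article: `YieldOnce`, `NoopWake`, and the two task functions are hypothetical names invented for illustration, and the futures are polled by hand rather than by a real executor such as tokio. The helper polls a task once so that it suspends, then uses `try_lock` to check whether any other task on the same thread could take the lock in the meantime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Stand-in for any `.await` that suspends: Pending on the first poll, Ready on the second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// Waker that does nothing; this sketch polls by hand instead of using a real executor.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Risky: the guard is still alive while the task is suspended at the await point.
async fn holds_lock_across_await(counter: Arc<Mutex<u32>>) {
    let mut guard = counter.lock().unwrap();
    *guard += 1;
    YieldOnce(false).await;
    *guard += 1;
}

/// Safer: the guard is dropped at the end of the inner block, before the await point.
async fn releases_before_await(counter: Arc<Mutex<u32>>) {
    {
        let mut guard = counter.lock().unwrap();
        *guard += 1;
    }
    YieldOnce(false).await;
    *counter.lock().unwrap() += 1;
}

/// Polls `task` once so it suspends, then checks from the same thread whether
/// another task could take the lock. Finishes the task before returning.
fn lock_free_while_suspended<F>(task: F, counter: &Mutex<u32>) -> bool
where
    F: Future<Output = ()>,
{
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut task = Box::pin(task);
    assert!(task.as_mut().poll(&mut cx).is_pending());
    let free = counter.try_lock().is_ok();
    assert!(task.as_mut().poll(&mut cx).is_ready());
    free
}
```

Passing `holds_lock_across_await(...)` to `lock_free_while_suspended` reports that the lock is not free while the task is parked, which is the window in which a second task calling a blocking `lock()` would stall the executor thread. The scoped version leaves the lock free, which is the "small critical section" pattern described in the replies.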
The way you do aggregates (or rather a way), is to use an external cache, something like a redis or whatnot, to save the per-user metrics you want to aggregate, and then work on the cache when it comes to finalizing the numbers.
Don't run aggregate jobs across the databases, run it across your event log. Give every user a mutable DB that stores the live data, but pool all the user events for aggregation.
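A rough sketch of the event-log idea in the comment above, assuming an in-memory stand-in for both the per-user stores and the log. `Event`, `Service`, and the `bytes_stored` metric are hypothetical names chosen for illustration; a real system would back them with per-user database files and a durable log.

```rust
use std::collections::{HashMap, HashSet};

/// One entry in the shared, append-only event log (hypothetical shape).
#[derive(Clone, Debug, PartialEq)]
struct Event {
    user_id: u32,
    bytes_stored: i64,
}

/// Stand-in for "one mutable store per user" plus a pooled event log.
#[derive(Default)]
struct Service {
    /// Live per-user state; in the comment's design this would be one DB per user.
    per_user: HashMap<u32, i64>,
    /// Every event from every user, in arrival order.
    event_log: Vec<Event>,
}

impl Service {
    /// Applies the event to the user's own state and appends it to the shared log.
    fn record(&mut self, event: Event) {
        *self.per_user.entry(event.user_id).or_insert(0) += event.bytes_stored;
        self.event_log.push(event);
    }

    /// Cross-user aggregate computed from the log alone; no per-user store is touched.
    fn total_bytes(&self) -> i64 {
        self.event_log.iter().map(|e| e.bytes_stored).sum()
    }

    /// Number of distinct users that appear in the log.
    fn active_users(&self) -> usize {
        self.event_log
            .iter()
            .map(|e| e.user_id)
            .collect::<HashSet<_>>()
            .len()
    }
}
```

The point of the split is that aggregation never has to open thousands of per-user databases: each user reads and writes their own state, and reporting jobs scan a single log.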
I work with multi-tenant databases. The issues usually come from not balancing the load correctly. In our testing we found pretty much every version of standard squeal postgres Microsoft whatever to be faster than squeal light
SQLite is used for embedded systems and stand alone applications. If you find yourself needing to load balance servers you’re using the wrong tool. Next try to use resident Postgres for your cell phone app. Different tools for different jobs.
@@ElSantoLuchador I don't think you understood. Turso is saying they are doing multi tenant SQL lite databases. I'm saying that's dumb
@@mattymattffs to be honest, I didn't get what you meant in the first comment either, but I entirely agree with you
i love that every point in the conclusion of why spells out "borrow checker" aka that's the whole reason why.
The wheel has been re-invented probably more than any other part on your car
Fun fact: I have the job I have after striking up a conversation with Richard Hipp while in line at a conference. The power of the right T-shirt
I love SQLite. It’s so versatile and fast as hell.
Petition to actually change it to squeel.
Imagine MySqueel or PostGresqueel
you can't change it to what it is
literally impossible, comrade
Prime you should get Joran on stream to ask for advice with your simulation testing questions! It would be a really fun conversation
Sqlite is qualified for Aerospace.
Is that the new “military grade” meaningless label for things now. The “milspec” of tech?
Boeing is qualified for Aerospace
@@JohnSmith-op7ls DO-178B is the certification. I think it's not a buzzword like "milspec" 😊
@@allesarfint You win this comment section
So you're telling me Sqlite is missing a few bolts...
28:00 I guess it depends on your definition of "cold start". I assumed it would have meant that every database has all the important bits in the filesystem cache in any situation. If it actually meant "not having to boot a new VM", sure, that's a much, much easier thing to accomplish if you don't need VM level separation for security purposes.
And considering the amount of hardware security bugs related to VM escape vulnerabilities, I would rather bet on well written server software instead of VM containment.
In .NET, calling Wait (sync) on an async function can also result in a deadlock under certain scenarios and has to be avoided, so this isn't just a Rust thing.
yep. i was thinking the same
Can you give an example or is it a one-in-a-million thing?
@@schmidt-5099 I can't post links but search for article "Understanding Async, Avoiding Deadlocks in C#"
Understanding Async, Avoiding Deadlocks in C# article explains this more in-depth
Sqlite has always been good
I feel like a lot of us were just sleeping on it...
it's pretty much the only option for mobile
@@Kane0123 it is probably the most prolific piece of software of all time, you find it everywhere from your iPhone to airplanes to missiles to hospitals. a major vulnerability in sqlite would shut down a huge part of the world's economy.
It's always been good for client side storage in apps, all these people trying to shoehorn it into servers and hacking it into a server-side database are misguided. That's just not its purpose.
@@justinsmith3981 in Flutter I use Isar, a NoSQL local database, it's FAST
so funny to see an idea you thought of on the throne the day before being discussed in a youtube video, I love it
Not entirely clear, but that mutex should be an async mutex. If they used the standard mutex, the described bug is expected.
wth is an async mutex?
@@hbobenicio It's a mutex that understands Rust's async/await and behaves correctly for these types of use. A normal mutex, by contrast, blocks the whole thread when it is locked again while already held, which isn't what you would want for an async operation and its scheduler in the first place. The "deadlock" is the block as requested by the mutex.
While making some assumptions as the full context was not provided, it appears this was not a Rust bug but a user bug from using the incorrect mutex for the purpose.
We do single-tenant DBs at work. Orchestrating them is easy by having a single system/admin db where "pointer" records about each of the account exist. All the system-wide stuff also goes there of course
First time I've seen Prime read an article positively.
I am impressed by things people are creating and I am jealous. The first 10 years of my career I spent building CRUD and video streaming apps (hooking third party service). Just 3 years ago, I deep dived into more interesting topics.
I totally knew you were going to say “shill-o-gen”!
1:00 I think even companies can benefit from reinventing the wheel.
Depending on what your company or project within that company does, all the various "wheels" might only almost work for you. Libraries are rarely general enough to fit all situations, and having a bunch of corner cases where you need to use inevitable hacks to work with a library/tool builds up into tech debt over time.
Specialized projects/libraries can also turn into debt, but that debt is generally very stable and not too painful.
Did they reinvent the wheel?
Or did they make of their own wheel so that they could have the wheel that they need?
I just feel like this saying is inappropriately used sometimes.
Semantics. They reinvented the wheel. The only time it isn't is when it innovates because it solves notable problems or improves existing standards.
Reinvented for your niche is still reinventing.
tubeless tires were a reinvention of the wheel
@@audiocorps2334 is it "reinventing the wheel" when what you did was make the wheel skinnier and taller for your specific use-case, or is it just "making the wheel I need"?
Antithesis = An-ti-thu-siss.
Prime reading skill issues aside /s,
Fantastic video and highlighted article. SQLite is great and Turso helps emphasise this here.
Anything involving Tigerstyle with DST just vibes with me these days. My favourite way to get shit done both in professional work and personal projects especially.
@7:51, a lot of thread synchronization and locks utilize thread local storage. These are not safe in async code since this code and the release of the locks might not run on the same thread. C# is introducing async task local storage to create async safe synchronization and locks.
It would be cool to implement this testing strategy on a very simple project with different branches or components for triggering different bugs just to learn/teach.
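As a toy version of that idea, here is a minimal seeded simulation sketch using only the standard library. It is not Turso's, TigerBeetle's, or Antithesis's tooling: `Rng`, `BuggyStack`, `simulate`, and `first_failing_seed` are hypothetical names, and the bug is planted deliberately. A seed drives a random sequence of operations against both the component under test and a trivially correct model, and any mismatch is reported with the step at which it happened, so the same seed replays the same failure.

```rust
/// Capacity shared by the component under test and the reference model.
const CAP: usize = 4;

/// Small deterministic PRNG (xorshift64); the same seed always yields the same sequence.
struct Rng(u64);

impl Rng {
    fn new(seed: u64) -> Self {
        // Mix the seed and force it non-zero, since xorshift must not start at 0.
        Rng(seed.wrapping_mul(0x9E37_79B9_7F4A_7C15) | 1)
    }

    fn next(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Component under test: a bounded stack with a deliberately planted bug.
struct BuggyStack {
    items: Vec<u32>,
}

impl BuggyStack {
    fn push(&mut self, value: u32) {
        if self.items.len() < CAP {
            self.items.push(value);
        } else if let Some(top) = self.items.last_mut() {
            // BUG (intentional): a full stack should ignore the push,
            // but this overwrites the top element instead.
            *top = value;
        }
    }

    fn pop(&mut self) -> Option<u32> {
        self.items.pop()
    }
}

/// Runs `steps` random operations for `seed`, comparing the stack against a simple model.
/// Returns Err(step) at the first step where the two disagree.
fn simulate(seed: u64, steps: usize) -> Result<(), usize> {
    let mut rng = Rng::new(seed);
    let mut sut = BuggyStack { items: Vec::new() };
    let mut model: Vec<u32> = Vec::new();
    for step in 0..steps {
        let r = rng.next();
        if r % 3 != 0 {
            // Roughly two thirds of operations are pushes.
            let value = (r >> 8) as u32;
            sut.push(value);
            if model.len() < CAP {
                model.push(value);
            }
        } else if sut.pop() != model.pop() {
            return Err(step);
        }
    }
    Ok(())
}

/// Tries seeds 0..max_seed and returns the first failing (seed, step), if any.
fn first_failing_seed(max_seed: u64, steps: usize) -> Option<(u64, usize)> {
    (0..max_seed).find_map(|seed| simulate(seed, steps).err().map(|step| (seed, step)))
}
```

A typical loop would call `first_failing_seed` over a range of seeds, then rerun `simulate` with the reported seed under a debugger or with extra logging. Because every random choice comes from the seed, the rerun fails at the same step.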
I'm building a project that has a SQlite DB per tenant, it's super stable and easy to do.
How many tenants? How do you handle table/column changes?
19:33 SQLite is a regular filesystem file. There is no real usable async filesystem kernel routine. All async fs stuff is run in a thread pool under the hood and only appears to be async through syntax sugar. In all languages.
NT Native API comes looking 😉
Sqlite is best for read-heavy usage, but for write-heavy usage we should prefer other databases.
@21:22 The issue is the "move" keyword: Everything referenced in this block will be moved in!
I think creating a similar "clone" and "borrow" block would make sense. But you would lose the explicit `.clone` and `&` in those.
For what it is worth, the move block semantics is not that bad and it forces you to be explicit and correct. But yeah... so many people raise this issue.
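For readers who have not hit this yet, here is a small sketch of the explicit `.clone` and `&` the comment refers to, written with standard-library threads instead of an async runtime so it stands alone. The function name `sum_and_lengths` and its parameters are hypothetical. The first closure moves a clone created inside the spawn block; the second moves only a reference, which works because `thread::scope` guarantees the thread finishes before the borrowed value goes away.

```rust
use std::sync::Arc;
use std::thread;

/// Returns (sum of values, number of values, length of label).
fn sum_and_lengths(values: Vec<u64>, label: String) -> (u64, usize, usize) {
    let values = Arc::new(values);

    // "clone" flavour: shadow `values` inside the block, so `move` takes
    // only the clone and the original stays usable below.
    let handle = thread::spawn({
        let values = Arc::clone(&values);
        move || values.iter().sum::<u64>()
    });

    // "borrow" flavour: shadow `label` with a reference, so `move` takes
    // the reference rather than the String itself.
    let label_len = thread::scope(|s| {
        let label = &label;
        s.spawn(move || label.len()).join().unwrap()
    });

    let sum = handle.join().unwrap();
    // Both originals are still owned here.
    (sum, values.len(), label_len)
}
```

The shadowing keeps names clean (no `inner_values` or `values2`) while leaving every clone and borrow visible at the call site, which is the explicitness the comment argues for.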
I once got bamboozled doing that kind of research for myself. I wrote my CRUD app building my own scheduler using Rust's mio non-blocking event loops for any hardware IO and thread pools with sync channels for work dispatch.
It was very lightweight on RAM and rather fast. But when spamming it with simulated DDoS attacks, it hung and didn't answer any more requests at some point.
I gave up and rewrote the entire thing using tokio. And the problems went away!
Just found out a week later that my experience was most likely caused by a bug in the rustls crate that I used in the first version, which was not present in the tokio-rustls crate I used for the second. This bug was so bad, it was reported as a CVE because it hung at some place in the TLS handshake that could be exploited.
I'm still wondering about speed comparisons of both implementations, with the bug in rustls being fixed by now.
But I didn't get around to rewriting it again yet :D
sqlite will be the db of choice for my current hobby project getting used to rust.
If it just had a good timeseries module so I could throw influxdb out of the stack :D
@@Th1200 sqlite3 has a robust extension api. Write one
I use it all the time for personal projects. It’s amazing
I've used it since last month in all my small projects where I won't handle much data. My reason for switching from Postgres to SQLite for my personal project is that I don't have to set up Postgres on the server when I deploy to my VPS. I just run the binary (I embed SQLite in the binary, plus the frontend) and everything is done
Sqlite just works 😊
Yup, if you don't actually need an enterprise-grade db and/or your data set is small and/or your architecture does not need a separate database, SQLite is just perfect #Chef'sKiss
I don't think the deadlock is Rust specific, except in the matter that Rust's Mutex is non-reentrant, whereas a reentrant Mutex might not have caused the deadlock (but that might cause other problems!).
Basically thread A locks the mutex and calls some_sync_function. Then some_sync_function does async stuff that causes thread A to yield to the executor to run another task, via thread A. Thread A then gets another request and runs the initial code again, which tries to lock the Mutex. But since Rust's Mutex is not re-entrant, thread A is now deadlocked with itself and the Mutex is perhaps never released. All other threads handling requests now also deadlock on the perma-locked Mutex.
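The non-reentrancy part can be checked without any async code. A minimal sketch, assuming nothing beyond `std::sync::Mutex` (the helper name `relock_would_block` is made up): while one guard is alive, a second attempt from the same thread does not succeed. The sketch uses `try_lock` because the standard library documents that a second blocking `lock()` on the same thread will not return (it may deadlock or panic).

```rust
use std::sync::{Mutex, TryLockError};

/// Returns true if a second lock attempt on the same thread would block
/// while the first guard is still alive.
fn relock_would_block(m: &Mutex<u32>) -> bool {
    let _guard = m.lock().unwrap();
    // A blocking `lock()` here would never return; `try_lock()` reports the
    // situation instead of hanging.
    let blocked = matches!(m.try_lock(), Err(TryLockError::WouldBlock));
    blocked
}
```

Once `_guard` goes out of scope at the end of the function, the mutex can be locked again as usual.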
2:17 someone will hear the file based database and say aha I know my excel skills are not wasted.
Sync event loop in Rust sounds really interesting, would love to see it along with a zero cold start serverless server.
Re-inventing the wheel can be seen in different ways. Dogmatic DRY people say it all the time, and they ignore the learning opportunity.
People who love frameworking instead of just writing code re-invent the wheel in a negative way because they often make frameworks that ignore or even prevent using what built-in features a technology has. (Like J.S. frameworks that don't let you use html5 goodness)
My only complaint about SQLite is i wish it would enforce column types
Strict mode. It’s in the docs
I did a multi-tenant encrypted at rest database out of SQLite3 16 years ago.
DST in pt-br is Doença Sexualmente Transmissível (english’s STD)
Great acronym, love it
It will be fun to watch all developers who migrated their projects to SQLite when the ops guys ask "And how are we supposed to do backups now?!"
"Microseconds did not matter..." My Z80 would like to have a chat with you.
This is what you call a solution in search of a problem kids.
we need kernel developers to use tiger beetle style. Just caught up on the rust/kernel drama from a few months ago and it is sad
Greetings from the Netherlands, the Primeagen. I am not sure if you noticed the acrostic in the section "Appendix: Why not Zig?" Look at the boldly printed first letter of each sentence.
yes i noticed it right away
@@ThePrimeTimeagen Excellent, I should have known
OT: If the Primagen would be reacting to the new biggest found prime number, how would he sign off that video? "The name is the Mersenne Primagen!"
Yes, that is very exciting!
For the example shown at 21:47, while it would be nice to have some shorthand, you can at least not pollute naming with inner or inner_something by doing the clone within task::spawn but before the async move {}
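use std::sync::Arc;

// Arc rather than Rc: tokio::task::spawn requires the future to be Send.
let a = Arc::new(something);
let handle = tokio::task::spawn({
    // Shadow `a` with a clone inside the block, so only the clone is moved.
    let a = Arc::clone(&a);
    async move { inner_usage(a); }
});
other_usage(a);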
7:23 "Why can't you hold a mutex across an async function?"
You can, but the problem is when you .await. There's a lot underlying rust's future model here, but basically the future will *always* return pending or will just straight up block the thread forever, depending on if you used a mutex that returns a future (ala, an async mutex from tokio or smol) or one that just goes and contacts the operating system for sync (std or crossbeam). The way around this comes down to just not mixing mutexes, or not doing recursive locking.
Glauber's a boss
"That's why I'm personally inventing kubernetes myself right now"
I've always been iffy about serverless. I just hate the idea of building something on serverless then the platform decides to raise rates and I am sort of stuck because switching providers is too costly in time/money. I've always preferred running on stuff that can easily be self hosted or moved to another provider without any issue and because of that self hosting serverless just seems pointless.
Serverless definitely has its use cases but I have yet to find a use case for my stuff that can't easily be done another way and without the fear of my bill exploding.
IMO serverless is for projects that have no budget to spend on infrastructure. Serverless is really expensive compared with cloud VMs and PaaS offerings like Azure App Service or AWS Lightsail.
also migrating away from serverless is most of the time, rewriting the entire app
It's deeply silly to me, just make people download it at that point
Async sucks. Zig killed async for solid reasons. Google killed io_uring. You use Tokio for direct IO, rayon for bulk parallel computation. But um, all kinds of issues. Also stare at Concurrency for LLVM. If they can salvage async, would be amazing. Cancel blockage, function pointers, measuring memory, hidden control flow, coloring. Ya, I wanna see. There's no roll your own to reduce syscalls. It's a nightmare. He would be the one to fix it though.
No cold start can be handled the same way Android handles its VMs. You basically pre-fork the initialized base VM (zygote in Android speak IIRC). This way, you're always connecting to a pre-warmed image rather than cold starting on each need.
Where could we see a stream where the CEO of Turso, the CEO of TigerBeetle and Primagen team-up in the limp-biscuit-Olympics?
The Shillagean with this week's Shillogism
"This post is going hard" right after he read the line about the "r-word" nice 😅
The unit tests I write are just straightforward testing practices that you'd find in any good book, and they catch all sorts of things I would not have thought about or predicted. I rely pretty much exclusively on just run-of-the-mill testing practices, unit tests written in a basic unit test framework (a-la xUnit or whatever), and I haven't had any code defects in anything I've released in literally years. I also write a lot of runtime asserts checking before/after contracts.
Of course: I've seen other people's tests and they most certainly fit this "only catch what you predicted" criticism. In fact more often they don't even catch what you would predict; more often than not if you look at the test it's actually testing nothing at all: no asserts that could fail, or everything meaningful is mocked out. That's not because "testing is limited", that's because people are limited: they are not trained or competent at testing, and the organization largely tries to indoctrinate them to be uncritical and unthinking which goes against the ability to do testing well. Do you think calling it "simulation testing" is going to change anything in the long run? If that gets popular, the majority will just do that terribly too.
"everything meaningful is mocked out": dependencies should be mocked, or you will be testing multiple classes at the same time. You should have a separate test for the dependency that tests it in isolation.
Nice! I love Sequel Lite!
I don't get not reinventing the wheel argument, because we don't have cars with wooden wheels. The risk to reward ratio has to be worthwhile.
Per user DB... I do not feel that at all. Have fun managing that in some multi instance cluster environment. A normal database is 10x easier
Per user db is bullshit in 99% of cases, but there might be 1% of cases where it makes sense. Like some online photoshop, or other offline-like tools used online.
Sooo. Basically this is a 30 min ad? No thanks. Good that you point it out though.
W editor ❤️
Prime consistently highlighting paragraphs without two or three characters on each end is triggering me so much 😭
Can you mention that if you are paid to review/read an article whether that also has impact on your opinion on it? Because I automatically assume bias when something is sponsored
7:13 "ready or not, here i come, socket write" :D
_sorry, had to_
they would not have this problem if they hired C++ engineers instead of rust kiddies.
I quite literally have the tigerstyle video in the next tab. Didn't realize that was that.
1:31 the r-word is "rust"
Squeal lite?
0:00 Did he say squirrel it? Like making something become a squirrel?
I have seen this kind of deadlock caused by the OS switching threads. For me, the solution was to use shorter time slots so that the probability goes down.
The solution at that time, 20 years ago, was to use something faster than the OS thread switch; back then that was DirectX timers. We had about 240 VoIP streams plus control, sending and receiving 30 ms chunks, while the OS allowed only a 1 ms delta. A mutex in a hardware interrupt callback caused trouble.
Does anyone know where you can find the presentation that Prime mentions around 9:35?
SurrealDB already has multi tenancy/db per user
Woah the word multi tenancy turned into a link
@@kv4648 for me it’s not, but I am seeing a bunch of other links. Mindblowing. YouTube showing those uniquely per user? Interesting…
Great video!
I think Elixir might be the only language that does async right. And other BEAM languages, of course.
KimStyle method implies keyboard, mouse and handgun
Hey that's Clickbait Prime lol. SQLite is freaking awesome for some things you would not expect.
Sqlite is amazing, pity that it is a pain to get working in a cloud environment.
Can you make a video on how you implemented the seeded simulation testing in your go project?
The issue with async + mutex is that async is all about interweaving work on a thread - the thread "parks" the work and goes and does something else whilst waiting for the async task to return - so if the mutex is implemented at thread level (common) it blocks the entire thread and all async stuff happening on said thread. Easy to deadlock. Also if you understand multithreaded code in your language / environment / OS etc. you'll get it right during analysis. I mean you do have to understand how async works under the covers before you go writing concurrent code, right? Or is that just me?
3:20; Couchbase enters the chat
Why do they need to create all 5000 db files in one call? MQ that thing and have "eventual" creation. Still fast but no threadaches?
I would use the word "focused" over "terrified". If as a software engineer you are chosen to be overseen by the leader of DPRK, trust me, it's because you're the best there is in the country and you know exactly what you need to do.
Now if Joe Biden were to look over your shoulder, that would be terrifying. Because you'd know you weren't chosen because you were the best, but rather because you had the more compatible politics. You know you're likely to fail, and you have no idea what's gonna happen to you when you do.
I haven't used multi-tenancy in production yet because of too much overhead. Maybe I haven't implemented the correct pattern, but I would really appreciate super simple multi-tenancy. I think I came across the same issues as these dudes
i don't get it. They are creating a service for creating and accessing SQLite database files? why? anyone can easily create their own SQLite database files on their own computer or server.
File-based database. MS Access came out in 1992.
Different use cases. Access is specifically for small businesses to store their records, it's very good at that and I'm amazed that they dropped support because people keep just using Excel.
Adding an advise advert
Always was the best
Interesting as I need to use SQLite for an upcoming assignment 😎