how does a Mutex even work? (atoms in the computer??)

Low Level

Додати в
- Мій плейлист
- Переглянути пізніше
Поділитися

Поділитися

Вставка

Розмір відео:

Показувати елементи керування програвачем

Автоматичне відтворення

Автоповтор

Опубліковано 3 лют 2025

КОМЕНТАРІ • 122

@edgeeffect 2 роки тому ⁺⁷⁹
I looked at the title and thought "Oh! Everyone knows what a mutex is!" But I'm glad I watched because I'd never thought that acquiring a mutex either needs a mutex or needs to be atomic. ;)
Great stuff as ever!
@tf_d 2 роки тому ⁺¹²
Reminds me of the question, "who delivers the mailman's mail?"
@Stopinvadingmyhardware Рік тому ⁺³
@@tf_d his wife
@Vnifit 2 роки тому ⁺⁹⁰
God I love your channel, your videos are the best I've seen on embedded/assembly stuff. Really fascinating topics presented beautifully!
@savithaiyer8017 2 роки тому ⁺⁴²
Damn, I’d love a video on a DIY RTOS. Love your videos man!!!!
@esra_erimez 2 роки тому ⁺²
+1 because like wasn't enough
@formulaetor8686 Місяць тому
I LOVE YOU for the sheer simplicity of this video!
@KenJackson_US 2 роки тому ⁺⁶
If your processor has a Read-Modify-Write instruction, you can implement it in one instruction, simple. If your processor lacks it, the simplest way is to disable interrupts for the read, modify and write instructions and then reenable interrupts. Though you have to be careful to _not_ reenable interrupts if they were initially disabled.
@MindlessMegaLawl 2 роки тому ⁺¹⁵
Mad respect as an argentinian for showing footage of the Super TC2000 motor racing series, one of (if not) the most technologically advanced series in Latin America. Thank you ✨
@filker0 2 роки тому ⁺⁸
This is a good video, though I would have used different instructions for the MuTex.
Most ISAs have instructions that are intended for implementation of synchronization primitives like MutEx and Semaphores. The ARM Cortex-M provides LDREX and STREX. The first sets a reservation for the source memory address. If any other entity writes the location, the reservation is lost. STREX conditionally writes to the location, and will fail if the reservation has been lost. The entity that writes the memory and breaks the lock could be another thread on the same or other processor cores or DMA from an external master. Usually, a MutEx will not be set up in memory that is the target for DMA, but having multiple cores, even on microcontrollers, is becoming more common.
@victoraraujo1255 2 роки тому ⁺²
I'm a simple man. I see a video of this guy, i watch and leave a like
@crustykrist7144 2 роки тому ⁺¹
great video bro. Very clear and to the point. TY!
@LowLevelTV 2 роки тому
Thank you :)
@luismorales9878 2 роки тому ⁺⁴
Nice, i always wonder how could a mutex gives the security of multithread, the Atomic operations Is the key, thanks Man, AND like
@jonshouse1 2 роки тому ⁺³
"Mutex in C" should be "Mutex with POSIX threads". Plenty of systems use mutex without pthreads. ESP32 Base code and FreeRTOS for example. Not sure why anyone would make yet another RTOS. FreeRTOS is good for small stuff, NuttX for larger stuff, Linux for big stuff, Any of the three +Gate array for obscure/awkward stuff !
@PerryNewlin1 3 місяці тому
absurdly amazing explanation!
@gaborm4767 2 роки тому ⁺¹⁹
What are the differences between mutex and semaphore and what are the pros and cons?
@abidanbrito 2 роки тому ⁺¹³
From my understanding both are synchronization primitives and both are used to solve critical section problems. Having said that, a mutex is a locking mechanism, whereas a semaphore is a signaling mechanism.
The former goes to say that a single task can acquire the mutex, effectively owning it. This means that only the thread / process associated with that task can enter the critical section. The mutex needs to be released by the owner thread, and it can only do so upon exiting the critical section. At a high level, it can be described as a mutual exclusion object. A mutex object is either locked or unlocked.
The latter pretty much says that if a thread is waiting on a semaphore it can still be signaled by another thread. In short, a semaphore can be used by several threads (based on priority). That is not the case with a mutex. There are two types of semaphores: binary and counting. At a high level, it can be described by means of an integer variable. Note that a binary semaphore is not equivalent to a mutex.
As for pros / cons, I recommended looking those up online (there are some lists on blogs), or trying both out for yourself on a microcontroller.
@gregoryfenn1462 2 роки тому ⁺⁷
Mutex is a Boolean lock/unlock state. Semaphore is a finite resource counter (E.g. only three processes can access some data at once). Note that a mutex is a semaphore - it’s a semaphore where the counter is set to one so that only one thing can access the resource at at time.
@gaborm4767 2 роки тому
@@gregoryfenn1462 yes, so i ask pros and cons because if you can do the same and more with a semaphore like with mutexes, what is the reason for the existence of the mutexes?
@gregoryfenn1462 2 роки тому
@@gaborm4767 randombit answered too but the short answer in general is that since semaphores can do more than muticies, they won’t be as efficient or simple to use as a mutex. A tool that is can do lots of things will be less good at particular task compared to tools that are more specialised. So if you just need to lock or unlock a resource, a mutex is better than a semaphore.
@abdullafaizov7963 23 дні тому
Damn bro, you explain so nicely
@Nellak2011 2 роки тому ⁺¹
Atomic lock sounds like Atomic Clock.
Imagine a processor running on an Atomic Clock using Atomic Locks for Mutexes, while the Program running is running Atomic Simulations that were written in the Atom editor.
@nimcompoo 2 роки тому ⁺³
Everything was fast until the point u opened the atom editor
@KangJangkrik 2 роки тому
@@nimcompoo yess thanks to atom's electron js
@petevenuti7355 2 роки тому ⁺¹
I know that this is ment for humor, but timing between threads are critical and running on an atomic clock would be awesome
@PurpleComrad 2 роки тому ⁺²
Low Level Gang 😎
@joaquinipar6133 2 роки тому ⁺¹
0:52 unexpected tc2000 appearance
@petevenuti7355 2 роки тому ⁺⁷
what about in distributied systems? are there other techniques that work with highly asymmetric systems? possibly other complete semantics and methodologies that don't require coordination in distributed multi-processor systems. please what can you tell me?
@jumpingphantom 6 днів тому
Abraham Silberschatz's Operating systems book aka "the dinosaur book" is a classic gem it covers a lot of topics like this, I recommend anyone interested to give it a read
@rallokkcaz 2 роки тому ⁺¹
LOW LEVEL GANGGGGG!
@marksabelita 10 місяців тому
so freakin helpful! 👏👏
@n00blamer 2 роки тому ⁺¹
02:29 lock doesn't make instruction atomic; the instructions the prefix has effect are already atomic read-modify-write operations.
SO: "The LOCK prefix ensures that the CPU has exclusive ownership of the appropriate cache line for the duration of the operation, and provides certain additional ordering guarantees. This may be achieved by asserting a bus lock, but the CPU will avoid this where possible."
The effect of lock can be quite dramatic performance drop as "ownership of the appropriate cache line" means that on multi-core architecture the most recent writes into the line have to be propagated to every other core accessing the line (when the cores don't share the caches, the level of sharing varies between architectures and cache hierarchy levels but L1 is typically exclusive).
@AlessioSangalli Рік тому
Nah. Cache coherency is a small part of memory consistency. What the lock guarantees is that load/store queues are flushed and executed the right way and that the observes in the specified domain see the same thing. But the first part of your message is correct, as in a single core system hardware locks are not necessary.
@n00blamer Рік тому
@@AlessioSangalli TL;DR - the instructions are already atomic and lock does nothing along those lines. Short version is that it gives exclusive access to the memory and what you describe is effect of that fact. It just happens to be a detail that it is more efficient to have the exclusivity per cache line rather than all of the available memory so that more shit can happen at the same time. Thanks for clarifications btw.
@AlessioSangalli Рік тому
@@n00blamer yeah but what I wanted to clarify is that there are mechanisms that act even before the caches that play a role here. Thanks!
@n00blamer Рік тому
@@AlessioSangalli Indeed, I recognized that and thank you! I guess when you write too much past the comment in mind it gets tangled up ... :P anyways, cool stuff
@Meleeman011 9 місяців тому
this is why global state in frontend apps exists i get it now
@Bvic3 Рік тому ⁺¹
Great video. Now, how does lock work in silicon?
@gabrielbarrantes6946 Рік тому
Is probably quite simple, let's say two cores, you check the instruction on the other core if is not a mutex then execute normally, lock the memory block and continue, if by chance both cores are receiving the mutex in their pipelines at the same, then just choose one and lock that one, the other will wait in the loop until the lock is released...
Pretty much is communication within the cores as you would in human time scales, like a core would say, I am locking mutex at address in the next cycle, is any of you going to lock at that address next cycle, if no then no collision, if yes just "hard-print" which core will lock and which one will wait...
Is pretty much what is expected but at a cycle/instruction level...
"Hard print" because processors at printed not coded lol
@alexandrohdez3982 2 роки тому
👏👏👏 great video..
@jacob_90s 2 роки тому
This just really makes me hate all the assholes on programming forums I had to deal with all the more, back when I first started learning how to program in 2006. Just asking how anything in the standard library works would immediately get you attacked:
"Why are you even asking that; just use the ones in the library"
"What, you think you can do it better"
Even if you managed to get thru to them that you just wanted to learn how they work and weren't trying to implement it yourself, you would suddenly be met with this wall of silence. Always dozens of people willing to criticize, but never anyone willing to actually answer the damn question.
Have learned a lot since then, both about how the underlying mechanics of many programming languages and hardware, as well as that the most vocal people in a group often have little else to contribute that simply being the loudest, but it really made me reluctant to want to deal with other programmers unless it was face to face
@flaguser4196 2 роки тому
been using lock flags in gui apps even before having an idea what a mutex is. i think just calling it "mutex" without understanding how it works behind the scenes makes it unnecessarily mysterious.
@_-deep_a_b.862 9 місяців тому
i learned something Thanks!
@lucidmoses 2 роки тому ⁺²
May want to add a sleep step in there so your not wasting the cpu. Other threads could be wanting to run.
@conorstewart2214 2 роки тому
It should be a yield step not a sleep step, a sleep step would still waste the CPU, a yield step would allow the RTOS to execute other tasks whilst it is waiting. It is just an example though so what he showed here probably wont be what ends up in the final RTOS.
@SerBallister 2 роки тому
@@conorstewart2214 I do like RTOS. Need to be careful with thread priorities. Yield really only works on threads of same priority.
@marcossidoruk8033 Рік тому
The context switching may be expensive and the other threads may not be able to do anything meaningful if you are locking for short time periods. What LLL described is a spinlock and it is actually the way to go when you are only locking for short amounts of time and do not want to pay the context switching cost.
@AlessioSangalli Рік тому ⁺²
OK im a little confused. The STM32 blue pill should have one CortexM3 core. As it is not a multiprocessor, are hardware atomic operations (as in with the correct memory barriers) even useful?
@AlessioSangalli Рік тому
The example at 2:20 uses Intel assembly, that is more likely to have multiple cores
@dominiquefortin5345 Рік тому
For my part, I never understood why cpu maker create a “CAS” opcode (compare and set) for mutex. Compare operation require the ALU so are slow as opposed to an “XCHG” opcode (for i86 or TAS for 68k) which is much faster because it is only 1 read and 1 write to memory. From there, you build a spinlock mutex to make an OS mutex which can be used to make a full semaphore.
@marcossidoruk8033 Рік тому
cas is for other things, not spinlocks.
You are indeed right, atomic exchange one time and then atomic load in a loop is better for spinlock implementation.
@whtiequillBj 2 роки тому
I was recently reading about quiescent consistency vs liniarliability.
The document I was reading was: Quiescent Consistency: Defining and Verifying Relaxed Linearizability
tl;dr you wait for all code to end execution (a point of quiescence) so there are no thread collisions.
Linearizability is what you explain with the lock whereas quiescence would be programming the start and end of threads such that they wouldn't collide.
The documentation calls quiescence soft and linearized hard. Cause calls separated by points of quiescence can not collide yet using a lock is used essentially to fake quiescence.
Since you are writing an OS what is it that contributed to using the Linearized paradigm vs using times of quiescence to separator thread timing?
@herrxerex8484 2 роки тому
You are a godsend
@esra_erimez 2 роки тому ⁺¹
DIY RTOS video, please.
@kvelez Рік тому
Excellent.
@tempname-dr2bm Рік тому ⁺¹
This is not an mutex, this is a spinlock ;)
@Just_An_Idea_For_Consideration 2 роки тому
I may be wrong, but I believe we called this a Semaphore back in the early 1980s.
@DFPercush 2 роки тому
I didn't learn the word "mutex" in my computer engineering classes (mid 2000's) either, just semaphores. A mutex is basically a semaphore initialized to 1. But being able to implement that in user space with a spin lock gives certain advantages, I guess, like avoiding a context switch / system call. Sometimes you need to get the scheduler involved so you don't have worker threads eating up cpu cycles when there's nothing to do. But just ensuring exclusive access tends to burn cycles when things are already spun up anyway.
@ross9263 Рік тому
Hey can you do a video on concurrent programming: threads, fibers, etc. maybe even job systems
@ovsepavakian6109 2 роки тому
Does it mean all the mutex implementations are actually powered by cpu support of lock commands?
@kevy1yt 2 роки тому ⁺²
Isn’t there a C directive to define a variable, in this case your mutex var, as atomic so you would not need to drop into assembly?
@kevy1yt 2 роки тому
@250CC I never understood what volatile really means or does. I know it has something to do with interrupt routines. Thoughts?
@KaneYork 2 роки тому
@250CC do NOT attempt to use volatile to implement atomic access. Your code is wrong and will fail as soon as you stop using x86 processors (free acqrel on every basic memory operation)
Atomics also inhibit optimization, but you are told EXACTLY how optimization is inhibited instead of "I dunno it does something I guess"
Relaxed atomics are free on ARM, just use those
@KaneYork 2 роки тому ⁺¹
I have seen exactly two valid use cases for C volatile: (1) POSIX signal handlers (2) Platform-specific mmio code. The C standard alone gives no useful behavior for volatile and you must combine it with additional guarantees for it to do anything at all
@DFPercush 2 роки тому
@@kevy1yt If you do any development on game consoles you'll be using volatile to access I/O registers. It doesn't mean atomic though.
@kevy1yt 2 роки тому
@@DFPercush what is its purpose with I/O in consoles?
@Stopinvadingmyhardware Рік тому
Wait, you wrote a RTOS, and didn’t make the main process the scheduler, which keeps track of these things?
@109Rage 2 роки тому ⁺¹
I knew the basic idea of a mutex already, but it didn't occur to me that a mutex itself could run into a race condition as well, into partway into your explanation on what they are lol. Still not a hundred percent on how this works (dunno any assembly) but now I know it's not as straightforward as I naively assumed.
@int16_t 2 роки тому
Think of two people which needs to use a bathroom. The people are the threads and they need to shit. If they shit, they lock the bathroom and if they're done, they unlocked it.
@petevenuti7355 2 роки тому ⁺¹
got the source up anywhere?
@shadamethyst1258 2 роки тому
I heard that you also have to temporarily disable DMA before you perform the atomic operation; is that true on your chip?
@kayakMike1000 2 роки тому
Hum... I am working on the same thing, though I am targeting a few different chips... Sorta built-in portability.
So... Lemme get your opinion. The OS needs to set up each process stack (and initial context frame), each process heap, schedule each process, and provide mechanism for IPC?
@noahwilliams8996 2 роки тому ⁺¹
But how does the assembly instruction work?
@williamdrum9899 2 роки тому
I'm not too familiar with x86 but I believe what happens is that if another processor attempts to access it, it will halt until the lock is released and then continue.
So if I run the instruction
lock inc byte ptr [1234h] then any other processor that tries to do the same will idle until the first processor finishes. Remember this is only really a thing for instructions that need to read, edit, and writeback. You actually can't LOCK any arbitrary instruction from what I remember. Only the ones where it matters
@Jim_Frost 2 роки тому
Is your OS code on Github? It would be great to follow along for learning.
@mjan9347 2 роки тому
Can you do a video on TSX (HLE/RTM)?
@LewisCowles 2 роки тому
Do semaphore next ;) then yielding (generators)
@NoorquackerInd 2 роки тому
Does this mean that there are times when there are non-blocking mutex lock functions, like where it'll just fail (silently or not) if it's already in use?
@DFPercush 2 роки тому
It's certainly possible. For example in Windows API there is a function, WaitForSingleObject(handle, timeout), which can return with an error code after a specified delay (which could be 0). Or if you're rolling your own assembly, you can just set a flag indicating whether the lock succeeded and not loop. I'm not sure if the POSIX standard says anything about that. If you're stuck with a blocking API, you could also use a native mutex to protect your own custom mutex variable, to get around the problem in the "bad mutex" example. That's not perfect, as the process scheduler could interrupt you at any time with the lock active, but it at least minimizes the time in the critical section, assuming that the real data you're trying to protect takes significantly longer to process.
@efronlicht1043 2 роки тому
As another example, Go 1.18 added TryLock() to its various mutexes for exactly this case.
@kap1840 Рік тому
Will this implementation work in a multi processor environment where cpu caches are used?
@AlessioSangalli Рік тому
The atomic hardware operations will take care to trigger the appropriate memory barriers, otherwise they would.be useless. In a non-multiproceasor environment atomics are not even necessary for mutexes
@yokba 2 роки тому
Could you explain the difference between mutex and spinlock?
@kennichdendenn 2 роки тому ⁺¹
The spinlock is used in conjunction with a mutex variable - it is one possibility of waiting until the mutex var is no longer in the locked state. You can think of the spinlock as a while loop that checks, if the mutex var is currently in the unlocked state. If not, it checks again and again and again until the mutex finally IS unlocked. Thats kinda wasteful because it uses CPU cycles for the constant checking but very easy to implement.
@DFPercush 2 роки тому
A spin lock is one possible implementation of a mutex.
@SerBallister 2 роки тому ⁺¹
A mutex usually puts the waiting thread to sleep, so the only CPU used is when the mutex is released and the waiting thread is told to wake up. A spinlock will have the CPU in a loop constantly testing a variable.
@marcossidoruk8033 Рік тому
A spinlock puts the thread to spin on a loop while a mutex puts the thread to sleep, usually using a special OS system call that tells the OS the thread is waiting on a mutex so that the OS can optimize for that in a way it wouldnt be able otherwise.
Generally, spinlocks are faster than mutexes because mutexes pay a rather expensive syscall cost and spinlocks dont. However Spinlocks do not release the threads so if the number of threads is greater than the number of cores in the system the OS shcheduler will probably go nuts switching between the waiting threads and that is bad. Also spinlocks are generally a bad idea if you are waiting for relatively long periods of time.
So generally, use a mutex if you are waiting for long periods of time or you have a lot of threads and use a spinlock if you are locking and unlocking really quickly.
@mytech6779 Рік тому
Ah I suspected locks need some amount of hardware support.
@Ryan-xq3kl 2 роки тому
Are these only relevant to hardware level operations ?
@nothappyz 2 роки тому
Wait, wouldn't that be busy waiting? I'm taking a uni class on calculators, please educate me
@gabrielbarrantes6946 Рік тому
So, the software mutex is actually done at a processor level 😂😂, makes sense if you think about it, because is a recursive problem if you try to solve it at a software level, so do it at processor (physical) level, is not quite clear from the video tho...
@cvspvr Рік тому
just a friendly reminder: templeos is still the greatest operating system of all time
@lagging_barish3736 2 роки тому
Atomic Operations are operations on the CPU that cannot be interrupted and are therefore immune to a race condition.
What if there are 2 threads running on 2 separate cores of the cpu. Both threads can compare and swap at the exact same time. What happens then?
@marcossidoruk8033 Рік тому
One thread will have to wait on the other for the memory access, thats the whole point of atomic operations and I think he mentioned it.
@satellite964 2 роки тому
Yeah but how is it implemented in hardware>?
@AvenDonn 2 роки тому
Great, now I'm going to have to find a video how uninterruptable instructions are possible in multi-core CPU architecture.
And you didn't even show how mutexes are released. Or how re-entrancy is handled (each thread has an ID)
@KangJangkrik 2 роки тому
So Linux' mutex can be improved with that way?
@bob-ny6kn 2 роки тому
Back in the '80s we called it a semaphore. How is this "new" again?
@humanrayla4785 5 місяців тому
Still confused but progress
@eyemotif 2 роки тому ⁺²
// rawr
@TheExileFox 2 роки тому
But what if you want something that isn't a mutex but also isn't a regular data type? something with a "write protect lock" that can be enabled or disabled on-the-fly? (but still allows multiple threads to read from it regardless of it's state)
What do you suggest then?
@BuckFord-p3f 3 місяці тому
Adelbert Square
@admink8662 2 роки тому
Greatt
@sayanhanra3029 Рік тому
Can anyone give me resources for learning os? as I am a beginner.....
@traveldiary1455 8 місяців тому ⁺¹
It's not mutex , wrong tutorial , it's condvar
@khanumar6938 4 місяці тому
@@traveldiary1455 both work together
@petemoss3160 3 місяці тому
MuTexas?
@erin1569 16 днів тому
omg I really liked this, now pls just don't turn out to be some weirdo
@dtikvxcdgjbv7975 2 роки тому
Advanced.
@arctan2 2 роки тому
🐧
@_Omni 2 роки тому
Spinlock
@abdullaalmosalami Рік тому
I'm sorry but I don't think there was enough explanation on how this "global variable" actually "locks" some other piece of memory for one task. Let's say I have two tasks, A and B, and they both at some point are attempting to write to a global variable myVar. If within the code of each task you literally have myVar = something, then what? Where does this "global variable mutex" come in to play?
@cernejr 2 роки тому ⁺¹
To me the background music is very annoying. I had to watch on mute with CC.
@TheVoogler 2 роки тому
Lmao
@zilog1 2 роки тому
rawr owo

Наступне

Автоматичне відтворення