I think there's a mistake in the discussion of the walk function, where you say the first entry of the newly allocated table is marked valid. pte still points to an entry in the previous-level page table, which is marked valid and made to point to the newly allocated page by the line:
*pte = PA2PTE(pagetable) | PTE_V;
Yes, I thought the same thing.
Yes you're right!
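For reference, this is roughly what that part of walk looks like in xv6's kernel/vm.c (quoted from memory, so check it against the actual source); the comments spell out why *pte belongs to the previous level's table:

pte_t *
walk(pagetable_t pagetable, uint64 va, int alloc)
{
  if(va >= MAXVA)
    panic("walk");

  for(int level = 2; level > 0; level--) {
    pte_t *pte = &pagetable[PX(level, va)];     // entry in the current (parent) level's table
    if(*pte & PTE_V) {
      pagetable = (pagetable_t)PTE2PA(*pte);    // descend into the existing next-level table
    } else {
      if(!alloc || (pagetable = (pde_t*)kalloc()) == 0)
        return 0;
      memset(pagetable, 0, PGSIZE);             // freshly allocated table: all 512 entries invalid
      *pte = PA2PTE(pagetable) | PTE_V;         // it's the parent's entry that gets marked valid
    }
  }
  return &pagetable[PX(0, va)];                 // leaf-level PTE for this virtual address
}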
This series is awesome. I really hope for a continuation!
plz continue this series!!!
It has been an excellent learning resource for getting into kernel inner workings so far, and I think most of the things clicked for me.
Best *nix series on youtube by far!!
I really hope there will be a continuation!
Somehow the youtube algorithm landed me on this goldmine. great job!
I loved the series so far and hope we will get another episode in the future :)! Thank you for your work
Excellent. Simple code base, well explained. Don't keep us waiting for too long ...
Great Work. Please continue the series🙏🙏
Incredibly useful series of videos. Can't thank you enough. Hopefully you will get a chance to continue this series. Waiting eagerly.
I'm gutted to have reached the end of the playlist so far, but I bet these videos are tough for you to make, considering the amount of information you're condensing into sensibly long videos. Thank you so much for that; I learned an incredible amount!
Such a nice series, Francis. Really well explained, at least from the point of view of someone who has little to no experience with kernel stuff. Can't wait for the next episode!
I keep my fingers crossed that you continue the series. I'm astonished, by the way, that you do the usually boring code readings. I'm not a newbie, but studying the code is very intimidating if there's no guide who knows which parts of the code are important and which are of secondary importance. Great job!
I am currently studying xv6 (there is a book which is REALLY great). This series is a great complement. Please continue it!!!!!!!!!
I have been waiting impatiently for another episode of "Source Dive" since the very end of the previous one ;-)
22:48 Having paging enabled for the kernel as well allows for the protections that you show later, such as read-only kernel code and stack guard pages. Incidentally, they should have mapped the .rodata read-only at 57:20. Interesting video, thank you!
Extremely valuable walkthrough series.
Thank you so much for this amazing series of videos diving into the OS code! I really enjoyed the series. I'm eagerly awaiting the continuation about processes and trampoline, as well as the scheduler.
Please keep making these videos; you're doing a fantastic job!
By the way, do you know of any similar deep dives into the code for other projects? Any articles, books, blogs, or videos you could recommend?
Amazing content. When you explain it like this, it's very understandable. I hope you continue the xv6 source dive.
This is simply amazing! Best OS explaining video I've seen so far. Thank you so much and please keep it up!
Point of clarification: on the page table diagram, should all of the “PTE 512” be “PTE 511”? Since there are 512 entries but we’re going 0, 1, 2 … 510, 511.
You're correct!
this is awesome. i love these videos that dig into well written open source code.
please keep updating!!!!
Great series. Thank you very much. Enjoying every second of it.
Great video, as always. Perfectly clear explanation, just as it should be. P.S. It's nice to watch how your hair grows with each video.
This is great education and great entertainment all in one! Thank you!
Can't wait for the next episode
Loving this series. Keep up the awesome work!
Great series! When's the next part?
when are you doing #6? great content
I really like this series; it's a shame I couldn't have made such a series myself first. Now you are the "pioneer" :) Really lovely stuff
Appreciate this series, was very useful. Thank you.
Thank you so much for another great video in this series.
Thanks for the high quality content!
Totally fascinating. Please continue.
Loving this series! :)
Really love your series, and I'm looking forward to each new episode. I really like that you always follow functions to the end. However, I had the impression that this episode was a little bit quicker and you "rushed" some parts, for example the KSTACK macro.
Thank you. Yes, it's quite a big topic and hard to fit into an hour!
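For anyone revisiting it, the KSTACK macro in kernel/memlayout.h is (if I remember right) just:

// map kernel stacks beneath the trampoline,
// each surrounded by invalid guard pages.
#define KSTACK(p) (TRAMPOLINE - ((p)+1)* 2*PGSIZE)

Each process gets a two-page slot below the trampoline but only one page of it is actually mapped as the stack, so an overflow runs into an unmapped guard page and faults immediately instead of silently corrupting the neighbouring stack.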
Great series, pls release more pieces
When are you making more videos?! I love this series!
Keep this series coming!! Very good videos
I might be wrong, but 55 bits of addressing is 2^55, or about 36 petabytes of addresses? 256 petabytes of potential RAM is indeed a really big number!
Thanks for making this video. I learned a lot.
You're welcome, and good to hear!
oh no, I need the next episode! :D
This is awesome, can you finish the series
Can you explain a little bit about how you set up the gdbserver attach in VS Code? Or is there a video I missed?
such a great series
I feel even more confused... so every time I dereference a pointer in code, it goes through all that page table jumping? How do I get contiguous memory access then, if what I actually get is access to a contiguous part of virtual memory? I need to spend more time on this topic...
@Low Byte Productions I've never done debugging with VS Code. Actually, I've never used gdb before. What gdb extension are you using? I use the risc-v toolchain directly on my macOS, and make qemu-gdb is working.
I'm using "GDB Beyond" & "Native Debug"
In your graphic, wouldn't the last page table entries be 511 since you start at 0?
You're right!
Virtual memory was much easier to understand when your Windows PC had 4 MB of RAM, while an application could successfully allocate and use 2 GB of memory.
Memory is your 2 GB virtual address space.
RAM is one thing that can be used to hold your memory.
I understand virtual memory right now.
Hey, I am trying to debug with gdb, any hint on how to configure the launch.json?
thanks
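I can't speak for the exact setup used in the videos, but with the "Native Debug" extension something along these lines is a reasonable starting point (the paths and port are placeholders; make qemu-gdb prints the port it actually listens on):

{
  "name": "Attach to xv6 (qemu)",
  "type": "gdb",
  "request": "attach",
  "executable": "${workspaceFolder}/kernel/kernel",
  "target": "localhost:26000",
  "remote": true,
  "cwd": "${workspaceFolder}",
  "gdbpath": "riscv64-unknown-elf-gdb"
}

Start make qemu-gdb in a terminal first, then launch this configuration so VS Code's gdb attaches to qemu's waiting gdb stub.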
Is there a reason these page table entries are not implemented as a struct using bitfields? I have the feeling this would make a lot of this much easier to grasp compared to dealing with all these magic bit shifts and mask operations. Is the worry that the compiler will do some shenanigans to the struct's memory layout behind the scenes? I thought that could be turned off?!
I can't say for sure, but I would guess that bitfields having implementation-defined behaviour would be the main reason. Even if you can control the semantics with compiler flags, that is essentially out-of-band with respect to the code. Using shifts and masks means that the implementation will work regardless of compiler.
As a side note: I don't personally think of shifts/masks as magic. When I first encountered them, it was definitely confusing, but it actually became intuitive and natural at a certain point. I think that's the case with a lot of things.
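To make the comparison concrete, here's a sketch of both styles for an Sv39 PTE. The macros are the shift/mask approach as in xv6's riscv.h; the struct is a hypothetical illustration (not from xv6), and its layout is exactly the part the standard leaves implementation-defined:

typedef unsigned long uint64;   // stand-in for xv6's own typedef

// shift/mask style (xv6):
#define PTE_V        (1L << 0)                         // valid bit
#define PA2PTE(pa)   ((((uint64)(pa)) >> 12) << 10)    // physical address -> PPN field of a PTE
#define PTE2PA(pte)  (((pte) >> 10) << 12)             // PPN field of a PTE -> physical address

// hypothetical bitfield alternative:
struct sv39_pte {
  uint64 v : 1, r : 1, w : 1, x : 1, u : 1, g : 1, a : 1, d : 1;  // flag bits 0-7
  uint64 rsw : 2;       // bits 8-9, reserved for software
  uint64 ppn : 44;      // bits 10-53, physical page number
  uint64 reserved : 10; // bits 54-63
};

The struct version reads nicely, but whether those fields really land on those bit positions (and whether the whole thing stays 8 bytes) is up to the compiler, which matters a lot when the hardware page-table walker is the other consumer of the data.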
Is there any documentation for this source code?
When we are done setting satp to turn on address translation and return to our main() (or more likely, just increment the PC), does this only work because we have mapped the same physical addresses to the same virtual addresses for the kernel? The PC will still point to the old address, if I understand it correctly.
Exactly - it works because we built a 1-to-1 mapping for the physical address space.
Thank you!
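For context, the actual switch-on is tiny; in xv6 it's roughly this (kvminithart in kernel/vm.c, quoted from memory):

// Switch the hardware page table register to the kernel's page table,
// and enable paging.
void
kvminithart()
{
  sfence_vma();                          // wait for any previous page-table writes to complete
  w_satp(MAKE_SATP(kernel_pagetable));   // satp = Sv39 mode + physical page number of the root table
  sfence_vma();                          // flush stale entries from the TLB
}

The very next instruction fetch already goes through translation, and it only works because the kernel is identity-mapped, so the old physical value in the PC is also a valid virtual address.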
Thanks again :-) Keep going.
I felt so adventurous that I tried to cross-compile the kernel on M1 Mac with nix and I failed miserably :D
I haven't fully understood VM yet. I'll need the next few days to play around with it a bit.
Please do one on OpenSSL!
Am I understanding correctly that these page tables are in main memory? Even if it lives de facto in the cache, that seems like an enormous overhead for literally every instruction. This can't be right, it can't be tying up the busses like that.
It's absolutely true. But what you're saying is also true - it is an enormous overhead. This is why a structure called the TLB exists: the translation lookaside buffer. It's a cache that operates as a kind of hash table. Hardware accessing memory will first check the TLB, and otherwise walk the page table (writing the translated address into the TLB afterwards).
@@LowByteProductions I'm sort of disgusted, but also kind of impressed that it seems to still be pretty fast. What is the actual cycle count for a TLB hit in the cache, then?
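I don't have exact figures for any particular chip, but as a rough back-of-the-envelope (all assumed numbers): a first-level TLB hit is typically overlapped with the L1 cache lookup, so it costs essentially nothing extra, while a miss triggers a walk of up to 3 additional memory reads on Sv39 (which are themselves often served from cache). Assuming, say, a 99% TLB hit rate:

  average extra accesses per reference ≈ miss_rate * walk_depth ≈ 0.01 * 3 = 0.03

so the amortized overhead ends up in the low single-digit percent range rather than the 4x you'd expect from walking the table on every access.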
I don't understand something. If there is a 3-level page table, each level holds 512 entries, each entry takes 8 bytes (64 bits), and the "walk" function initializes them all, that means the page table alone needs 512*512*512*8 B == 1 GB of memory. Something's not right here (probably I didn't understand the drawio diagram correctly).
That seems mostly correct, although it only considers the third level of page tables. In total, the page tables use (1 + 512 + 512*512)*(4KiB), which is about 1GiB + 2MiB.
Remember that this only covers the memory used for virtual address translation. The final layer of page tables has 512*512*512 entries, which all point to 4KiB pages that hold the memory the process sees (512 GiB total).
It would need that much page table memory _if all_ of the virtual memory space were to be initialized, but that’s never the case. Programs start out with a pre-determined amount of memory needed to load the code and pre-initialized data, and at run time request additional memory on an as-needed basis, so the kernel only needs to initialize the page table for that smaller range of memory and then add pages as needed. Entries in the 1st-level table don’t need to point to a 2nd-level table if there is no memory allocated in the range the entry would point to, and likewise for the 2nd-level entries.
You don't need to use the full table for everything. For example, you can map a 2 MiB process with only 3 tables. You'd need the ~1 GiB of tables only to map something huge that uses the full 512 GiB of address space, and even then the overhead is only ~0.2%.
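Putting numbers on that (Sv39, 4 KiB pages, 512 entries * 8 bytes = 4 KiB per table):

  fully populated:  (1 + 512 + 512*512) tables * 4 KiB = 1 GiB + 2 MiB + 4 KiB
  a 2 MiB mapping:  1 root + 1 mid-level + 1 leaf table = 3 * 4 KiB = 12 KiB

The ~1 GiB worst case only materializes if a process actually maps the entire 512 GiB address space.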
So in xv6 the kernel's read-only data is mapped as writeable? >_> That's kinda funny
It is indeed! I guess it's for simplicity's sake more than anything else.
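For the curious, the RAM part of the kernel mapping boils down to just two kvmmap calls in kvmmake (kernel/vm.c), roughly (quoted from memory, leaving out the device MMIO and trampoline mappings):

// kernel text: executable and read-only.
kvmmap(kpgtbl, KERNBASE, KERNBASE, (uint64)etext-KERNBASE, PTE_R | PTE_X);

// everything from etext to PHYSTOP (.rodata, .data, .bss and free RAM): read-write.
kvmmap(kpgtbl, (uint64)etext, (uint64)etext, PHYSTOP-(uint64)etext, PTE_R | PTE_W);

Since the linker script places .rodata after the etext symbol, it falls into the second, writable region; mapping it read-only would need either a third region or moving it inside the text mapping.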
I've watched this three times now. What I don't get is: why do we need that 3-level structure of tables? I mean, why can't we have a linear table with 512*512*512 entries, each entry containing a mapping of a virtual page to a physical page?
The thing about page tables is that they're sparse, i.e. you actually don't define most of the table. You only need to set up entries for the memory you map for each process. What you're suggesting would be simpler in a way, but also extremely costly: 512*512*512 entries, with 8 bytes per entry, is 1 GiB - and that's per process! Not to mention the kernel's page table.
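The three levels fall straight out of how the address is carved up, which is what makes the sparsity work: an Sv39 virtual address has a 12-bit page offset and three 9-bit indices, extracted roughly as in xv6's riscv.h:

#define PGSHIFT 12                                  // 12 offset bits within a 4 KiB page
#define PXMASK  0x1FF                               // 9 bits -> 512 entries per table
#define PXSHIFT(level) (PGSHIFT + (9 * (level)))
#define PX(level, va)  ((((uint64)(va)) >> PXSHIFT(level)) & PXMASK)

A single flat table would have to reserve space for all 2^27 possible index values up front, whereas the tree only allocates a 4 KiB table for the index ranges that are actually in use.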
Nice series
I think you explain things that aren't directly related to kernel stuff a bit too much, so I end up skipping large parts of the video, like walking through the printf algorithm iteration by iteration, or explaining how the linked list for handling free memory works.
I just feel like that is normal C programming, not very kernel-specific.
However, remember that I'm just a random person who might disagree with you on something, so keep up the great work 😉
Your explanations are brilliant! I found a playlist of videos that cover the general concept of virtual memory and, for example, the TLB. As someone who has not studied this before, they really helped me gain a solid understanding of the topic: ua-cam.com/play/PLiwt1iVUib9s2Uo5BeYmwkDFUh70fJPxX.html
Chad
Great content, but difficult to digest.
Maybe don't go so in-depth
And just get the principle across. People who really want applicable knowledge will do further research anyway
That's the thing, though: I'm making content for people who want to go further. This whole channel is about going in depth because so few others actually do.
@@LowByteProductions I see, I understand. Maybe it would be better to have a fairly concise description of everything you plan to explain at the beginning, and then also go in-depth later, to get the best of both worlds?