The Origins of Process Memory | Exploring the Use of Various Memory Allocators in Linux C

  • Published 5 Jun 2024
  • In this video, we discuss how memory can be allocated to a process when coding in C using the Linux System Call ABI. We talk about how an ELF gets processed and loaded into memory, and how the memory is mapped between the user and kernel space. We go over three separate methods (malloc, sbrk, and mmap) to allocate more memory to a process other than the predefined stack space that the compiler puts in at compile time.

COMMENTS • 182

  • @michaelflynn6952
    @michaelflynn6952 2 years ago +121

    well son, when a cpu loves some data in its cache very much...

  • @Bureikusaru
    @Bureikusaru 2 years ago +64

    The addresses returned by calling sbrk(0) immediately followed by sbrk(0x1000) should actually be exactly the same if no other procedures like printf are allocating memory behind the scenes. What sbrk returns is actually the previous break, not the new break as suggested in the video. The returned address of sbrk(0x1000) can be used as newly allocated memory. Then calling a sbrk(0) after sbrk(0x1000) would actually show the new break after it was incremented by 4096 (0x1000) bytes.

    • @veis2208
      @veis2208 7 months ago +2

      Actually, it is not guaranteed that printf leaves the program break alone or avoids malloc.
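
    A minimal sketch of the return-value behaviour described in this thread, assuming Linux with glibc (musl's sbrk, for instance, refuses non-zero increments). The function name is made up for illustration. No printf or malloc call sits between the three sbrk calls, so nothing else should move the break in a single-threaded program:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* asks glibc to declare sbrk() */
#endif
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Returns 1 if sbrk(increment) handed back the *previous* break and the
 * break then moved by exactly `increment` bytes, 0 if it did not, and
 * -1 if the kernel refused to move the break at all. */
int sbrk_returns_previous_break(intptr_t increment)
{
    assert(increment > 0);

    void *before = sbrk(0);           /* query only: break is unchanged */
    void *block  = sbrk(increment);   /* grow the data segment */
    if (block == (void *)-1)
        return -1;
    void *after  = sbrk(0);           /* query the new break */

    int ok = block == before &&
             (uintptr_t)after - (uintptr_t)before == (uintptr_t)increment;

    sbrk(-increment);                 /* give the memory back */
    return ok;
}
```

    Under those assumptions, `sbrk_returns_previous_break(0x1000)` should report 1: the value returned by the growing call is the start of the new region, not its end.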

  • @metal571
    @metal571 2 years ago +119

    Never seen brk or sbrk before after 7 years in the embedded world. Really interesting, thanks

    • @Conenion
      @Conenion 2 years ago +12

      You have probably never seen them because a) they belong to the Unix/Linux API (#include <unistd.h>) and b) they are not meant to be used by end users.

    • @gabemcguire2463
      @gabemcguire2463 2 years ago +6

      I've only seen sbrk in my second year of college. I had to use it to implement malloc 😫

    • @metal571
      @metal571 2 years ago +1

      @Conenion I've used unistd.h many times but never seen this one before. Although where I work we are quickly moving from C to modern C++, where using "new" doesn't make much sense since e.g. std::make_unique exists.

    • @Conenion
      @Conenion 2 years ago +5

      @metal571
      > never seen this one before.
      Again, because they shouldn't be used, even when using C. See
      man sbrk
      on your Linux box and see the "NOTES" section.

    • @Conenion
      @Conenion 2 years ago +3

      @gabemcguire2463
      👍
      Very good task for students to learn an important topic. Very good teacher.

  • @kalebbruwer
    @kalebbruwer 2 years ago +34

    I've heard of writing your own memory allocators, but it never occurred to me that there is something below malloc, which uses heap pointers directly. Very interesting

    • @williamdrum9899
      @williamdrum9899 2 years ago +11

      Repeat after me: There is no such thing as free space

  • @RayBellis
    @RayBellis 2 years ago +220

    The heap is not really separate - the underlying memory returned by a `malloc` call is itself allocated using `sbrk` or `mmap` (these days more often the latter). It's an abstraction layer on top of those system calls, and it also permits resize and deallocating, which is why it's more complicated and slower. A particular disadvantage of using `sbrk` is that it's not possible to reclaim that memory once allocated - it becomes a permanent part of the process's virtual memory size (VSZ).
    You should probably also mention `alloca`, which can allocate a variable amount of memory on the stack but only for the duration of the current function - it's automatically released when the function returns and the stack frame is reverted to its pre-function-call state.
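
    A small sketch of the `alloca` behaviour mentioned above, assuming glibc's <alloca.h> (alloca is not part of ISO C or POSIX). The function name and the 4096-byte cap are illustrative choices only; alloca has no failure return, so the cap is there to keep the stack use bounded:

```c
#include <alloca.h>   /* alloca() on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reverses the string `s` in place, using a scratch copy that lives in
 * this function's own stack frame. There is no matching free(): the
 * scratch buffer vanishes when the function returns. */
void reverse_with_stack_scratch(char *s)
{
    size_t len = strlen(s);
    assert(len <= 4096);               /* arbitrary cap for this sketch */

    char *scratch = alloca(len + 1);   /* carved out of the current frame */
    memcpy(scratch, s, len + 1);

    for (size_t i = 0; i < len; i++)
        s[i] = scratch[len - 1 - i];
}                                      /* scratch is released here */
```

    For example, running it on a buffer holding "heap" leaves "paeh" in that buffer.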

    • @reecesx
      @reecesx 2 years ago +23

      Yeah, there's no way in hell modern allocators are using sbrk over mmap or better platform-specific APIs to reserve uncommitted page tables. I'd also question the heap notion. There is no such thing as a heap anymore. It's all just floating reserved heap objects controlled by platform runtimes, possibly wrapped with canaries, and obscured with runtime heap ASLR. We don't live in old POSIX land anymore, where you might find yourself with a defined map of memory blocks and you can only grow down and up until you collide (heap and stack position being defined by the platform's ABI). This is BS. We can freely reserve and release pages (MmAllocateNonCachedMemory, alloc_pages) as well as reserve address space easily (mmap, file views/sections, NtAllocateVirtualMemory, et al). Funny thing about the description, `We talk about how an ELF gets processed and loaded into memory, and how the memory is mapped between the user and kernel space`: he doesn't even mention binfmt. This dude stinks of larp. Whatever helps the guy boost his CV I guess?

    • @nothappyz
      @nothappyz 2 years ago +42

      @reecesx why you gotta be like that

    • @theterribleanimator1793
      @theterribleanimator1793 2 years ago +1

      @reecesx You stink of elitism.

    • @FelixHdez
      @FelixHdez 2 years ago +2

      @reecesx How did you learn all of this?

    • @nickscurvy8635
      @nickscurvy8635 2 years ago +10

      @FelixHdez I can't speak for them, but this stuff you either pick up very slowly over time just programming in these environments (eventually, if you code in C long enough, you start to wonder how malloc works and start poking around source code and manpages) or through an actual course in topics related to "low level" and "systems" programming.

  • @willvoiceit1
    @willvoiceit1 2 years ago +54

    mmap actually returns MAP_FAILED (-1) on an unsuccessful allocation, so a NULL check won't catch it like with malloc. Learned that the hard way!
    Did not know about sbrk though.

    • @williamdrum9899
      @williamdrum9899 2 years ago +3

      I've always wondered what NULL really means (as an assembly programmer I have no clue how it's actually stored)

    • @flachlappen
      @flachlappen 2 years ago +7

      @williamdrum9899 in C it's just the number 0

    • @blucky_yt
      @blucky_yt 2 years ago

      @williamdrum9899 null doesn't really exist; what it's actually represented as depends on the language. In C it's pretty much 0; in other languages it's some type that the compiler then turns into whatever it represents null as.

    • @hongkyang7107
      @hongkyang7107 2 years ago

      @williamdrum9899 I think it depends on which encoding table is used; for the most common ones, ASCII and UTF-8, NULL is the value 0x00

  • @marciomaiajr
    @marciomaiajr 2 years ago +13

    Awesome explanations. I never thought about using system calls to allocate memory.

  • @paxdriver
    @paxdriver 2 years ago

    Just discovered your channel this week, it's my fav find of the year kudos

  • @doowi1182
    @doowi1182 2 years ago +3

    Awesome video! Been a C programmer for 4 yrs and haven't ever heard of sbrk or mmap!

  • @repflies
    @repflies 2 years ago +4

    Quality content as always. Thank you!

  • @bastian9945
    @bastian9945 2 years ago +5

    Keep this going I don't use it but your explanations are really good so I just like to watch and learn how it works.
    And I really like to see more rust.

  • @jackyli6716
    @jackyli6716 2 years ago +1

    Nice! I love this kind of tutorial video

  • @jakemorales7949
    @jakemorales7949 2 years ago +17

    Memory allocation was such a mystery to me in college, and this helps me get a better picture. Thanks!
    So, can you also say that the 3 ways of getting memory are from the heap, the stack, and OS virtual memory, respectively?

    • @N00byEdge
      @N00byEdge 2 years ago +6

      OS Virtual memory encapsulates heap and stack to begin with. Heap and stack are just special cases of os virtual memory

    • @Conenion
      @Conenion 2 years ago +1

      You have to differentiate between user land and what the kernel does to manage a process' memory. A process is a running program. A program is a file on disk, it is what the compiler + linker spits out.
      A running process "sees" only the user land. In an OS with memory mapping, a process has no idea where in physical memory the code of the program (called the text segment) resides. The process "sees" only linear addresses starting from zero. Mapping is done in pages, typically 4 KiB each, and is managed by the OS with support from the hardware's memory management unit (MMU). This unit also has a translation lookaside buffer (TLB), which is a cache for recent translations of virtual to physical addresses.
      Heap and stack are managed by the user land (support libraries like the glibc in Linux contain the code for malloc/free to manage the heap).
      To answer your question: "Getting memory" is calling malloc (or similar depending on your programming language). That call is then handled by glibc (in user land), which makes the appropriate calls to the OS, which then does the memory management for the process, including mapping from virtual to physical addresses. The stack is handled automagically. It simply grows from the highest address possible downwards in the direction of the end of the heap.

  • @excessreactant9045
    @excessreactant9045 2 years ago

    Great video thanks 👍

  • @mahdavimail
    @mahdavimail 2 years ago

    Thanks for sharing

  • @ManuelGx2
    @ManuelGx2 1 year ago +4

    Couple of questions. At the 5:00 mark:
    - The program break was incremented by 0x21000 (in the terminal output) and not by 0x1000 (as seen in line 11, in the editor). Why is that?
    - When you use that new allocated space as an array you start indexing at position 100, is there a particular reason why?
    BTW I really enjoy your videos, keep up the great work!

    • @Acorn_Anomaly
      @Acorn_Anomaly 8 months ago +2

      The printf() call is likely allocating memory behind the scenes.
      The sbrk() call is documented as returning the _previous_ break value, not the position the break just got moved to. That way, if the break is increased (i.e. memory is allocated), the return value is a pointer into the newly available memory region. (If the adjustment is zero, then the previous break value is the same as the current value.)
      If no allocations are made in between the two calls, the sbrk(0) and the sbrk(0x1000) should actually return the same value.
      You can easily test this, by taking this code, removing anything that accesses the new memory, and changing the second sbrk() into another sbrk(0). You'll see that the break has adjusted, even when you didn't ask it to.
      But if you move BOTH sbrk() calls to BEFORE the printf() statements, they'll be the same.

  • @nickscurvy8635
    @nickscurvy8635 2 years ago +1

    When ur kid asked where memory came from you missed an excellent opportunity to say something like
    "Someone told me once, but I don't remember"

  • @krzysztofadamski2884
    @krzysztofadamski2884 2 years ago +13

    BTW in C you should not explicitly cast the void pointer. It is not needed and may be dangerous as it turns compiler warnings off for this assignment.

    • @moczikgabor
      @moczikgabor 2 years ago

      If you want to access the memory pointed to by the void ptr, you have to cast it to some type. How else would you access it as ints or structs or whatever?

    • @krzysztofadamski2884
      @krzysztofadamski2884 2 years ago +7

      @moczikgabor yes, but you don't have to do this cast *explicitly*. In C (contrary to C++), you can assign a void pointer to any object pointer type without a cast. That is why you don't have to (and should not) cast the return value of malloc() in C.
      You are correct that you can't access (dereference) the value of void pointer as compiler needs to know the type. But when you assign the pointer, the type will be known.
      In other words, it should be:
      int *myNewArray = firstEnd;
      The cast is not needed.

    • @moczikgabor
      @moczikgabor 2 years ago +1

      @krzysztofadamski2884 You got me! I have been using C for decades and didn't know that. 😱
      I do embedded work in C though; malloc and pointers are evil on microcontrollers, so I do not really use them.
      The biggest annoyance, by the way, is when you assign the arithmetic result of some uint8 operands to a uint16 variable, and the operands are not automatically cast to uint16 before the operation, so it gives the wrong result...

    • @krzysztofadamski2884
      @krzysztofadamski2884 2 years ago +4

      @moczikgabor C in general is full of traps but I wouldn't call pointers "evil". In fact, the only parts of the C language I would consider evil are undefined and implementation-defined behaviours.
      The rules of automatic type promotions are also hard, though.

    • @moczikgabor
      @moczikgabor 2 years ago +3

      @krzysztofadamski2884 Yes, I do not mean pointers are evil in general; I am confident with them on proper systems. But they are evil on small (8-bit or so) microcontrollers. They can't point just anywhere: some architectures have 8-bit pointers which can't point too far. Some compilers generate code for you with bigger pointers, but when you check the disassembly you'll see that it requires a ton of workaround code, because the arch has no direct support for what you wrote. Another problem is that many microcontrollers have a Harvard architecture and very little RAM, so compilers may place string literals (for example) as true constants in the .text section, not in initialized data. Thus, if you try to index one with a pointer made for the RAM, it will fail. There are RAM ptrs, ROM ptrs, far, near... These are quite implementation-defined behaviours on MCUs, and you can't even be sure whether your code will work with another compiler. (hint: it won't... usually...)
      I'd rather avoid pointer arithmetic altogether on 8-bit MCUs. Either I solve the problem another way, or I explicitly program the series of instructions the arch can do instead (for example, sequentially reading FLASH content from an address with the core instruction available for this purpose).

  • @nzeu725
    @nzeu725 3 months ago

    very nice

  • @dream_emulator
    @dream_emulator 2 years ago

    Such a cool channel

  • @zemoxian
    @zemoxian 2 years ago +1

    It’s been well over a decade since I’ve used mmap but I think it’s possible to use it for Inter process communication.
    If you map the memory to a file, multiple processes can use the same file to share the memory.
    That’s about all I recall about the usage at this point. I wouldn’t even be surprised that I’m misremembering some other means of shared memory. I’m pretty sure I used mmap for that.
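
    The file-backed sharing recalled above can be sketched roughly as follows, assuming Linux/POSIX. The function names and the /tmp scratch path are hypothetical, and fork() stands in for "another process" purely to keep the example in one file: the child opens and maps the file on its own rather than inheriting the parent's mapping.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* mkstemp(), ftruncate() under strict -std= */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Maps the first int of `path` read/write and shared; MAP_FAILED on error. */
static int *map_shared_int(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return MAP_FAILED;
    int *p = mmap(NULL, sizeof *p, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                     /* the mapping outlives the descriptor */
    return p;
}

/* A child process writes `value` through its own mapping of a temp file;
 * the parent returns what it then reads through a separate mapping of the
 * same file, or -1 on failure. */
int share_int_through_file(int value)
{
    assert(value >= 0);            /* -1 is reserved for errors */

    char path[] = "/tmp/mmap_ipc_XXXXXX";     /* hypothetical scratch file */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    int sized = ftruncate(fd, sizeof(int));   /* file must cover the mapping */
    close(fd);

    int seen = -1;
    int *mine = sized == 0 ? map_shared_int(path) : MAP_FAILED;
    if (mine != MAP_FAILED) {
        pid_t pid = fork();
        if (pid == 0) {                        /* child process */
            int *theirs = map_shared_int(path);
            if (theirs == MAP_FAILED)
                _exit(1);
            *theirs = value;
            _exit(0);
        }
        int status;
        if (pid > 0 && waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            seen = *mine;                      /* written by the child */
        munmap(mine, sizeof *mine);
    }
    unlink(path);
    return seen;
}
```

    Real IPC over a mapping also needs some form of synchronisation; here waitpid() plays that role, which only works because the writer exits before the reader looks.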

  • @tejasjani2544
    @tejasjani2544 2 years ago +2

    It is too advanced a concept for me, but it is so interesting 🤔. Thanks ❤️ for introducing these new concepts.

  • @leokiller123able
    @leokiller123able 1 year ago

    glibc malloc doesn't just allocate memory; it does tons of optimisations to reduce the number of calls to mmap, avoid useless reallocations, etc., that you don't want to deal with yourself. So if you are working on a production build or just a large program, please use malloc: it will almost always be faster than calling mmap for every allocation. The only cases I can think of for using mmap directly are memory-mapped files and reimplementing your own allocator.

  • @redcrafterlppa303
    @redcrafterlppa303 2 years ago +5

    Casual C programmer :
    Memory comes from malloc
    Casual C++/Java/C# programmer :
    Memory comes from new
    😂

  • @krzysztofadamski2884
    @krzysztofadamski2884 2 years ago +6

    The topic is interesting. When I interview people I almost always ask them exactly this question.
    However, I have one complaint, as your material may be confusing to people: you seem to suggest that malloc is a *separate* allocation method from sbrk/mmap, which is not the case. Malloc is just a function which does one of those two syscalls under the hood (which one it uses depends on the size of the block you are requesting). If you strace your malloc example you will see exactly that.
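
    One rough way to watch that from inside a program, assuming Linux with glibc and no other thread calling malloc at the same time, is to compare the program break before and after a malloc call. The helper name below is made up for illustration:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* asks glibc to declare sbrk() */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns how many bytes the program break moved while malloc(size) ran,
 * or -1 if the allocation failed. The block is freed before returning. */
long break_growth_for_malloc(size_t size)
{
    assert(size > 0);

    uintptr_t before = (uintptr_t)sbrk(0);   /* break before the request */
    void *p = malloc(size);
    uintptr_t after = (uintptr_t)sbrk(0);    /* break after the request */

    if (p == NULL)
        return -1;
    free(p);
    return (long)(after - before);
}
```

    With glibc's default settings, a small request on a fresh heap typically reports a positive number (the heap was extended with brk), while a first request well above the default 128 KiB mmap threshold typically reports 0 because the block came from mmap instead. Running the program under `strace -e trace=brk,mmap` shows the same thing from the outside.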

    • @LowLevelLearning
      @LowLevelLearning  2 years ago +1

      Yup! To get memory, malloc definitely has to invoke mmap to generate new arenas. I just wanted to get the point across about the different APIs one has at their disposal, even if one relies on the other. Thanks for watching!

    • @krzysztofadamski2884
      @krzysztofadamski2884 2 years ago

      @LowLevelLearning well, it can (and will) use brk() in addition to mmap.

    • @9SMTM6
      @9SMTM6 2 years ago +1

      What kind of work do you do that this is relevant? I've done a bit of performance sensitive code, but I've never had to touch that, remotely.
      And considering what I've read from others, that memory allocated with sbrk is permanent, I think I'll keep my fingers off it anyway... It's also never going to be portable to Windows, and probably not to Mac either.
      I would really think that in the scenarios, at least concerning performance, where malloc is not desirable, other approaches are probably superior, like preallocation, your own allocator for a manual memory pool, alloca, etc...

    • @krzysztofadamski2884
      @krzysztofadamski2884 2 years ago +1

      @9SMTM6 I do all sorts of stuff related to low level Linux programming on the boundary of kernel and userspace. I didn't say that knowing how malloc is implemented is helpful in my work, but many people who work in my area would know that simply because they are interested in the inner workings of Unix-like operating systems. They know that because they were curious enough, or they learned that by stracing their programs; it doesn't matter. What matters is whether the job candidate is interested in those areas and goes into details, or just takes everything for granted.
      I'm not saying you have to know how malloc works to be positively evaluated by me, but it is a good starting point for a discussion and further questions.
      I couldn't care less about Windows and Mac, but you are wrong: Mac does support brk()/sbrk() calls. It has Unix roots.

    • @9SMTM6
      @9SMTM6 2 years ago

      @krzysztofadamski2884 frankly, while what I've learned from this video is not uninteresting, it's not really what I'd hoped for.
      Like yeah, obviously malloc is using syscalls to request additional memory if it runs out. Duh.
      This video DOES describe different syscalls and how they behave, yeah.
      But what it doesn't do is explain WHY these calls behave like that, which is what the first part of the title implies it does, and what I was really looking for. It's just deferring to the OS.
      I know that there is some kind of virtual address translation, but how that works, and why it's designed the way it is, is unclear to me.
      Why is the upper border 0x7FFF[..]? And not 0xFFF[..]? What's with the addresses between the stack and the heap? I'm pretty sure they are forbidden when not mapped manually, but no one says that explicitly? How the hell does an expansion of the bss section not fuck up every heap pointer?
      If you know of good material regarding that I'd welcome it.

  • @n0kodoko143
    @n0kodoko143 2 years ago

    Cool!

  • @pooladkhay
    @pooladkhay 2 years ago +1

    Please suggest some books for me to get a better understanding of os and memory and low level stuff

  • @Meow_YT
    @Meow_YT 2 years ago

    While I don't understand x86 / x64 memory dynamics, coming from ARM and how Acorn memory management worked, I have some understanding of memory management in a virtual memory space.

  • @pnuema1.618
    @pnuema1.618 1 year ago

    Aren't you possibly dereferencing a NULL pointer on line 13 if malloc fails in your first example?

  • @roboticbrain2027
    @roboticbrain2027 2 years ago +4

    FYI: sbrk() seems to accept negative values to shrink the memory again. But this must be a nightmare to maintain when using multiple different types of allocators...

    • @44r0n-9
      @44r0n-9 2 years ago

      That's all handled by the OS, absolutely no problem.

    • @roboticbrain2027
      @roboticbrain2027 2 years ago

      @44r0n-9 The OS can't possibly handle that!
      Imagine allocator A increasing the system break by 1024 bytes. Then allocator B does the same. If allocator A now wants to release the 1024 bytes allocated by it, it can't do that without making sure allocator B has already released its section of memory. The best A can do is ignore the deallocation, which ultimately leaks memory. Hence: a nightmare to maintain. There is a reason sbrk() is a mostly legacy API.
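
      The A/B scenario above can be reproduced in a few lines, assuming Linux with glibc and nothing else moving the break in between. The function name is illustrative; the two sbrk calls stand in for the two allocators:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* asks glibc to declare sbrk() */
#endif
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Two "allocators" each grow the break by `size` bytes, then the first one
 * shrinks it by `size`. Returns 1 if that shrink cut off the SECOND
 * region (B's), 0 if not, and -1 if the break could not be moved. */
int shrink_releases_newest_region(intptr_t size)
{
    assert(size > 0);

    char *a = sbrk(size);          /* allocator A's region */
    if (a == (char *)-1)
        return -1;
    char *b = sbrk(size);          /* allocator B's region, directly above A's */
    if (b == (char *)-1) {
        sbrk(-size);
        return -1;
    }

    sbrk(-size);                   /* A "frees" size bytes ... */
    char *top = sbrk(0);           /* ... and the break lands on B's start */

    int b_was_released = (b == a + size) && (top == b);

    sbrk(-size);                   /* restore the original break */
    return b_was_released;
}
```

      The break is a single high-water mark, so a negative increment can only ever give back the most recently added bytes, whoever owns them.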

    • @Kirillissimus
      @Kirillissimus 4 months ago

      @roboticbrain2027 I would not go as far as calling sbrk a legacy API, as it is much faster than mmap and these are the only ways to allocate memory in Linux apart from the stack. Last time I checked the glibc source code, whenever malloc needed a new chunk to extend the heap it actually tried sbrk() first and only resorted to mmap() once it had reached the point where it couldn't. So it is still a very important system call and it is still used internally, although indeed I don't see much reason to use it manually unless you want to make your own version of malloc() for some reason.

  • @ronalerquinigoagurto555
    @ronalerquinigoagurto555 2 years ago +1

    Hi, can you make a video explaining UEFI and device tree?

  • @thehackr258
    @thehackr258 2 months ago

    If the memory allocated by sbrk is inside the ELF, does sbrk increase the size of the final executable?

  • @williamdrum9899
    @williamdrum9899 2 years ago +1

    I'm used to NES which basically just says "You have 2 KB of RAM from 0x0000 to 0x07FF, it's your job to figure out how you want to use it"

  • @yacoubcheik76
    @yacoubcheik76 1 year ago

    Which compiler do you use ?

  • @luissalazar5000
    @luissalazar5000 2 years ago

    Do the memory addresses in the memory map and the programs' output correspond to x86, or are they not specific to any CPU?

    • @williamdrum9899
      @williamdrum9899 2 years ago

      It depends on the CPU and the hardware. The example he gave is most likely based on x86 but I wouldn't take the address ranges too literally. This was more of a general overview

  • @honkhonk8009
    @honkhonk8009 2 years ago +1

    I've heard some people say that Linux was bad because it was a monolithic kernel. What do they mean by that? Does it mean that your drivers were supposed to run as a separate kernel from your OS?

    • @Skythedragon
      @Skythedragon 2 years ago +1

      Yes, it means that a lot more stuff is part of the kernel (such as drivers). If one of these programs crashes, the entire kernel will crash, which is an issue.
      There are also microkernels, which do *a lot* more in userspace (so not in the kernel), meaning that if something crashes, the kernel will most likely keep running. The downside of running things in userspace is that it usually requires more syscalls to the kernel to do specific tasks, which can be slow.

    • @Conenion
      @Conenion 2 years ago

      First things first: Linux isn't bad. :-)
      It is a monolithic kernel, yes, because the entire kernel is one large executable in the sense, that everything in the kernel runs in kernel mode, i.e. in the same context, including privileges. So, for example, if a driver hangs in an endless loop, the whole system hangs (except I believe there is a watchdog mechanism which prevents this). As you can see from my explanation, Linux is a monolithic OS, even though it has modules (*.ko in /usr/lib/modules). This only is a feature to load/unload drivers instead of building one large executable by linking them all in.
      The micro kernel fanboys said this is bad, because micro kernels do better by only having a very small kernel actually running in kernel mode. Drivers (and everything else that does not need it) run in a less privileged mode. Beginning in the '90s, the academic OS crowd was mainly in favor of micro kernels, believing no new OS should follow the, in their view, antiquated monolithic kernel approach. As it turns out, though, micro kernels are much harder to develop and to debug. Also, they introduce a communication overhead, which can be problematic in a kernel because it affects the entire system's performance. Torvalds, being pragmatic, wanted to develop something that works; he wasn't interested in academia's perceived beauty.
      See the "Tanenbaum vs Torvalds" debate. It even has its own Wikipedia page.

  • @hextav
    @hextav 6 months ago

    could you make a new video on arena allocators?

  • @melvin4524
    @melvin4524 8 months ago

    .data stores global variables, not const vars. All global vars are initialized to 0 unless specified otherwise. Also, between heap and stack there is an address range (memory pages) used for shared libraries. The stack grows towards lower memory addresses (x86, x64), in the direction of the shared library pages, not the heap address space.

  • @AmanSingh-sp6bi
    @AmanSingh-sp6bi 2 years ago +1

    BTW mmap returns -1 on error, not NULL.
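
    A minimal sketch of that check, assuming Linux. The wrapper name is made up; it turns MAP_FAILED into NULL so callers can test the result the way they would test malloc's:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Returns a zero-filled, private, anonymous mapping of `len` bytes,
 * or NULL on failure. */
void *map_anonymous(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)           /* MAP_FAILED is (void *)-1, not NULL */
        return NULL;

    /* Without MAP_FIXED the kernel is not expected to place a mapping at
     * address 0, so NULL stays free to mean "failed". */
    assert(p != NULL);
    return p;
}
```

    On Linux a zero-length request is rejected with EINVAL, so `map_anonymous(0)` is an easy way to exercise the failure path; a successful mapping is released with munmap(p, len).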

  • @fusca14tube
    @fusca14tube 2 years ago +3

    Hi LLL... Why is the first addr of sbrk 0x...06000 and the last 0x...27000? Shouldn't the last addr be equal to 0x...07000?

    • @krzysztofadamski2884
      @krzysztofadamski2884 2 years ago +6

      Nice catch! Most likely this happened because the printf call allocated some memory (0x20000 bytes) for its buffer using malloc internally, and malloc uses (s)brk itself to fulfil the allocation.
      Run this program via strace and you will see more brk() syscalls than those explicitly made from the code.

    • @fusca14tube
      @fusca14tube 2 years ago

      @krzysztofadamski2884 Wow! Thanks! I'll do it.

    • @LowLevelLearning
      @LowLevelLearning  2 years ago +3

      @fusca14tube Yup! What he said^ glibc allocated memory in the backend to make room for the first printf.

    • @ABaumstumpf
      @ABaumstumpf 2 years ago +1

      Also his explanation of what sbrk does was wrong.
      From the man-pages: "On success, sbrk() returns the previous program break. (If the break was increased, then this value is a pointer to the start of the newly allocated memory)."
      What we see is the increase due to printf - NOT the increase from his call to sbrk.
      If you had
      void* ptr1 = sbrk(0);
      void* ptr2 = sbrk(1000);
      then both would point to the same address.

    • @fusca14tube
      @fusca14tube 2 years ago

      @ABaumstumpf Thanks!

  • @haydengalloway5177
    @haydengalloway5177 1 year ago

    Your sbrk example code seg faults when you try to assign to any element in the array.

  • @vishwanathbondugula4593
    @vishwanathbondugula4593 2 months ago

    Hi LLL, I am just a beginner but your code without checking the return values of malloc, printf and using strcpy instead of strncpy felt very weird!

  • @PowerfullPillow
    @PowerfullPillow 2 years ago

    How can I tell where a line of my code is stored? Can i execute it from inside the program itself or pass it as a string to some function?

    • @user-lt9oc8vf9y
      @user-lt9oc8vf9y 2 years ago +1

      If that code is in a function, you can take its function pointer and pass that around to other functions. However, you can't get the original source code at runtime once compiled. The C compiler turns your C code into a binary file of native machine code. What you can do is cast the function pointer into a char pointer and read some of that machine code.
      But keep in mind that that string isn't zero-terminated and you can't know the length of your function's machine code.

    • @williamdrum9899
      @williamdrum9899 2 years ago

      Technically yes, but it's not a good idea, since most modern machines use virtual memory and address space layout randomization, which means there's no guarantee that, say, &main = 0xDEADBEEF every time. This is something you can only really do on Game Boy or most game consoles made before the year 2000 or so

    • @williamdrum9899
      @williamdrum9899 1 year ago

      I think the closest you can get is printf("%p",&main); or something like that. As for how many bytes your function takes up, you pretty much need assembly for that. Pity there's no sizeof() for functions. Would be useful on embedded hardware

  • @pinguinul_gnom
    @pinguinul_gnom 2 years ago +8

    Cool stuff. Now how do we allocate memory in aarch64 assembly? :)))

    • @LowLevelLearning
      @LowLevelLearning  2 years ago +2

      Teehee ;D

    • @krzysztofadamski2884
      @krzysztofadamski2884 2 years ago +1

      The same as you would from any other language - you call a syscall (brk or mmap) for that. Or, if you link to some library, you can call malloc function that will do that for you.

    • @bob-ny6kn
      @bob-ny6kn 2 years ago +4

      Manage memory by hand before you code rather than writing bloatware. As a novice, I created an algorithm that reduced the size of "the expert's" code by a factor of eight. He called my code inefficient, so I showed there was no difference in execution. He called my code confusing, so I added a page of comments in the source. He had my code removed because he was the expert, and I was the novice. Nobody cares.

  • @jeffspaulding9834
    @jeffspaulding9834 4 months ago

    Strange there's no mention of mmap(2) being able to map files and devices, not just anonymous regions of memory.

  • @shadowchasernql
    @shadowchasernql 8 months ago

    So the three types of allocation are:
    malloc: the portable userland function
    mmap: the weird syscall one
    sbrk: you are not expected to understand this

  • @zxuiji
    @zxuiji 2 years ago

    Yeah, I think it would be easier to manage memory with a couple of wrappers around both mmap and malloc (I say malloc because I don't like collisions in handling). I'm imagining something like the following:
    #include <stdlib.h>   /* realloc, size_t, NULL */

    typedef struct PAGE PAGE;
    struct PAGE
    {
        size_t size;
        PAGE *prev;
        PAGE *next;
        /* backend that actually resizes the block: realloc, an mmap wrapper, ... */
        void* (*palloc)( void *addr, size_t size );
    };
    PAGE stdc_page = { 0, NULL, NULL, realloc };
    void* palloc( void *addr, size_t size )
    {
        /* the PAGE header sits immediately before the pointer handed to the user */
        PAGE* page = ((PAGE*)addr) - 1;
        page = page->palloc( page, size + sizeof(PAGE) );
        return page ? (void*)(page + 1) : NULL;
    }
    Of course with a little more fault checking and declarations, but the above is the simplest form I could think of to get across how I would map both types of functions into one

  • @CanoTheVolcano
    @CanoTheVolcano 2 years ago

    Does c++ new use malloc behind the scenes?

    • @Conenion
      @Conenion 2 years ago

      Yes. In any case, they have to use the same memory management code, at some point, as C++ can call C code.

  • @StephenBuergler
    @StephenBuergler 8 months ago

    what is stack memory? is something special about it? could you malloc some ram, put your stack pointer there, and never return?

    • @jeffspaulding9834
      @jeffspaulding9834 4 months ago +1

      If you're running on an operating system, the stack pointer is part of the C "runtime" - i.e. the mechanics of how it works are assumed to be taken care of for you. There's not any way to mess about with the stack pointer in the standard library without going beyond C and into assembly. So basically doing stuff like that puts you into undefined behavior territory as far as the C standard goes.
      (Note the "running on an operating system" part. If you're in a freestanding environment like a bare metal microcontroller, all bets are off.)
      Some operating systems give you some additional tools that let you add additional stacks and jump between them without dropping to assembly. During university I remember implementing a simple coroutine-based consumer-producer program that used two stacks on a FreeBSD or Linux box using (I believe) sigaltstack() to create the second stack. But I mostly copied the stack creating code and that was years ago, so I don't remember the exact technique.

  • @doktornouveau862
    @doktornouveau862 1 year ago

    I understand what you’re saying, but what does it mean?

  • @melihcelik9797
    @melihcelik9797 2 years ago +5

    If you really need an allocator, use the one in glibc. If you really can't, use mmap and manage the memory yourself; sbrk and brk make things a lot harder.
    But let's be real, you can probably get access to a glibc that is compatible with your system. Use that instead. Don't make things harder for yourself. This level of "low level" computing is implemented better by professionals who have spent half their lives on it. Don't make your day a nightmare, use malloc.
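
    For the "use mmap and manage the memory" route, here is a rough sketch of about the simplest scheme that works: a bump (arena) allocator over one anonymous mapping. The Arena type and the arena_* names are hypothetical, the 16-byte alignment is an assumption, and individual frees are deliberately not supported; everything is released at once.

    ```c
    #define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
    #include <stddef.h>
    #include <stdio.h>
    #include <sys/mman.h>

    typedef struct {
        unsigned char *base;  /* start of the mapping */
        size_t cap;           /* bytes mapped */
        size_t used;          /* bytes handed out so far */
    } Arena;

    /* Map cap bytes of zeroed, private memory. Returns 0 on success. */
    int arena_init(Arena *a, size_t cap)
    {
        void *p = mmap(NULL, cap, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return -1;
        a->base = p;
        a->cap = cap;
        a->used = 0;
        return 0;
    }

    /* Hand out the next size bytes, rounded up to 16; NULL when full. */
    void *arena_alloc(Arena *a, size_t size)
    {
        size_t aligned = (size + 15) & ~(size_t)15;
        if (aligned < size || aligned > a->cap - a->used)
            return NULL;
        void *p = a->base + a->used;
        a->used += aligned;
        return p;
    }

    /* Give the whole mapping back to the kernel. */
    void arena_destroy(Arena *a)
    {
        munmap(a->base, a->cap);
        a->base = NULL;
        a->cap = a->used = 0;
    }

    int main(void)
    {
        Arena a;
        if (arena_init(&a, 4096) != 0) {
            perror("mmap");
            return 1;
        }
        char *msg = arena_alloc(&a, 32);
        if (msg != NULL) {
            snprintf(msg, 32, "hello from the arena");
            puts(msg);
        }
        arena_destroy(&a);
        return 0;
    }
    ```

    Anything beyond this (freeing single blocks, reuse, growth, thread safety) is exactly the work malloc already does for you.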

  • @derpenz5376
    @derpenz5376 1 year ago

    Why does the stack grow backwards?

  • @TheR971
    @TheR971 2 years ago

    When two nand gates love each other very much...

  • @RonJohn63
    @RonJohn63 1 year ago

    Out of curiosity, why did you write "(NULL != myHeap)"? It's backwards.

    • @LowLevelLearning
      @LowLevelLearning  1 year ago +1

      Getting in the habit of writing comparisons with the constant on the left is good because, if you accidentally type = instead of ==, (0 = x) will throw a compiler error and force you to check the problem, whereas (x = 0) is valid syntax that returns a value, so you may miss it.
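
      A small illustration of that point. The function names are made up for this example, and the extra parentheses around the assignment are only there to keep common compilers from warning about it.

      ```c
      #include <stdio.h>

      /* Constant on the left: mistyping this as (0 = x) does not compile. */
      int is_zero_yoda(int x)
      {
          return (0 == x);
      }

      /* Meant to be (x == 0). The typo assigns 0 to x, and the value of
         the assignment expression is 0, so this always reports "false". */
      int is_zero_typo(int x)
      {
          return ((x = 0)) ? 1 : 0;
      }

      int main(void)
      {
          printf("yoda(0) = %d\n", is_zero_yoda(0));   /* 1 */
          printf("typo(0) = %d\n", is_zero_typo(0));   /* 0, silently wrong */
          return 0;
      }
      ```

      Most modern compilers will also flag the plain `if (x = 0)` form when warnings such as -Wall are enabled, so the habit matters most where warnings are off or ignored.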

    • @RonJohn63
      @RonJohn63 1 year ago

      @LowLevelLearning Interesting, and a good point.

  • @milasudril
    @milasudril 2 years ago

    mmap only allocates multiples of pages
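
    A quick way to see this on Linux, as a sketch: ask for a single byte and then touch the last byte of the page. The helper name round_up_to_page is made up for this example; the page size comes from sysconf(_SC_PAGESIZE).

    ```c
    #define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
    #include <stddef.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Round len up to the next multiple of the system page size. */
    size_t round_up_to_page(size_t len)
    {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        return (len + page - 1) / page * page;
    }

    int main(void)
    {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);

        /* Request one byte; the kernel still maps a whole page. */
        unsigned char *p = mmap(NULL, 1, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        p[page - 1] = 0xAB;   /* last byte of the page is usable too */
        printf("asked for 1 byte, page size is %zu, effective size %zu\n",
               page, round_up_to_page(1));

        munmap(p, round_up_to_page(1));
        return 0;
    }
    ```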

  • @emmanuelbeaucage4461
    @emmanuelbeaucage4461 1 year ago

    Sometimes a pointer and malloc love each other a lot... so they get married and have a lot of memory buffers together...

  • @KeinNiemand
    @KeinNiemand 9 months ago

    ok but how does it work on windows?

  • @hansdampf2284
    @hansdampf2284 6 months ago

    I don’t think your estimates of the relative performance of malloc vs mmap vs sbrk are correct.
    Yes, sbrk is probably the fastest, but all it’s doing is little more than asking the operating system to move the boundary of the mapped pages.
    But glibc’s malloc is way more complicated than most people think. It’s not just managing some blocks of memory. It does tons of optimizations: it fills gaps left by freeing, and it uses the empty space itself to hold a linked list that connects only the free blocks, so it can traverse the list faster when searching for a fitting free block. For big allocations it even uses mmap internally.

    • @Kirillissimus
      @Kirillissimus 4 months ago

      malloc() actually does even more! At some point it becomes impossible to extend the last segment any further because it hits some other already allocated virtual memory, and brk() just returns -1. That is when malloc uses mmap to get more memory chunks, and each time it calls mmap it may get a random place within the virtual address space. So it has to manage multiple chunks of various sizes which may or may not be contiguous. And it must try its best to squeeze newly requested data into one of the existing chunks before mmap-ing a new chunk for it plus some spare. And don't forget that mmap-ed chunks can be resized unless they get way too big or hit some other already allocated memory, so it is beneficial to try that first instead of just mmap-ing new chunks whenever needed. And it is best to get some spare memory for next time. The logic is not super complex, but doing malloc() efficiently really takes much more than you might expect at first glance.
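
      To make the "linked list inside the free blocks" idea from this thread concrete, here is a toy sketch. It is not glibc's implementation: it uses fixed-size blocks from a static pool, and the Block type and pool_* names are invented for illustration. The point is only that a free block's own bytes hold the pointer to the next free block, so the list costs no extra memory.

      ```c
      #include <stddef.h>
      #include <stdio.h>

      #define BLOCK_SIZE  64
      #define BLOCK_COUNT 8

      /* While a block is free, its first bytes store the list link.
         Once allocated, the caller owns all BLOCK_SIZE bytes. */
      typedef union Block {
          union Block *next;
          unsigned char payload[BLOCK_SIZE];
      } Block;

      static Block pool[BLOCK_COUNT];
      static Block *free_list;

      /* Thread every block onto the free list, lowest address first. */
      void pool_init(void)
      {
          free_list = NULL;
          for (size_t i = BLOCK_COUNT; i-- > 0; ) {
              pool[i].next = free_list;
              free_list = &pool[i];
          }
      }

      /* Pop the head of the free list; NULL when the pool is exhausted. */
      void *pool_alloc(void)
      {
          Block *b = free_list;
          if (b != NULL)
              free_list = b->next;
          return b;
      }

      /* Push the block back; its memory is reused to hold the link. */
      void pool_free(void *p)
      {
          Block *b = p;
          b->next = free_list;
          free_list = b;
      }

      int main(void)
      {
          pool_init();
          void *a = pool_alloc();
          void *b = pool_alloc();
          pool_free(a);
          void *c = pool_alloc();   /* reuses the block just freed */
          printf("a=%p b=%p c=%p\n", a, b, c);
          return 0;
      }
      ```

      A real malloc additionally has to deal with variable sizes, splitting and merging neighbours, and the multiple mmap-ed chunks described above.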

  • @soufianefariss
    @soufianefariss 2 years ago

    TIL about sbrk/brk.
    Thank you.

  • @raptoress6131
    @raptoress6131 4 months ago

    The computer fairies

  • @9SMTM6
    @9SMTM6 2 years ago +2

    It's kind of missing the part I find really relevant, while repeating much that I already know.
    And actually the first part of the title is at the very least misleading regarding that.
    Yeah, of course there's some syscall behind additional memory, if the heap is running out.
    And it's neat to see how these syscalls behave differently.
    But what I'm wondering is related to WHY they're behaving like that. How do these system calls do it?
    I don't really know how this works, what the OS does. All I know is that there's some kind of memory address translation going on.
    I would've loved to see that explained, and based on that you could explain why the different syscalls behave the way they do.

    • @moczikgabor
      @moczikgabor 2 years ago

      I can't give you a complete and detailed answer, but on early x86 processors (in real mode) memory was accessed using absolute addresses, as segment:offset pairs. Since then there has been an MMU (memory management unit), and in protected mode the segment registers inside the CPU are actually descriptor table indexes; the descriptor table holds the information about where that memory block sits in the contiguous address space. There is a related thing, the TLB (translation lookaside buffer). It is a cache that speeds up the virtual->physical memory address translation.
      Like I said, I have gaping holes in my detailed knowledge. I started writing an operating system more than 20 years ago just for fun, and was very interested in the internals back then. Of course I never finished it, but at least it was bootable and it printed a welcome prompt. 😃

  • @wskinnyodden
    @wskinnyodden 2 years ago

    I knew it! Elfs had to be involved in this shit! Hah! And you scientists deny magic! (ROFL)

  • @yajurrai6491
    @yajurrai6491 2 years ago +2

    FIRST

  • @ggre55
    @ggre55 4 days ago

    Evil syscalls 😂

  • @platin2148
    @platin2148 2 years ago

    It’s not very useful to do a munmap in the example.

  • @anon_y_mousse
    @anon_y_mousse 2 years ago

    I really wish people would stop pronouncing abbreviations in a manner inconsistent with the word of which they're an abbreviation. For instance, "strcpy()" is not "stir copy" but rather "string copy". It's like when people pronounce "char" as though you're going to flame-broil some meat. It should be "care" as it is an abbreviation for character.

    • @44r0n-9
      @44r0n-9 2 years ago +1

      I can really visualize you sitting in a room with a dozen people talking about Care pointers and no one knowing wtf you're talking about lmao.
      I get the idea, but it's just not intuitive.

    • @anon_y_mousse
      @anon_y_mousse 2 years ago +1

      @44r0n-9 Well, people could just be unlazy and say character in full, as that's what it's supposed to represent. Supposed because it doesn't actually, but that was the original intent.