What Happens After The Compiler in C++ (How Linking Works) - Anders Schau Knatten - C++ on Sea 2023

  • Published 5 Sep 2024
  • cpponsea.uk/
    ---
    We all know roughly what the compiler does: it translates your source code into machine code. Thanks to Compiler Explorer, many of us now even look at the generated assembly code.
    But wait a minute, that code is full of labels and function names, and the CPU knows of no such things! Most of these are also defined in different compilation units, so how can we jump to code when we don't know where it comes from? And even for our own compilation unit, how can the compiler know where in memory the machine code will eventually be loaded, so it can generate the right jumps? Even worse, what if that function comes from a dynamic shared object?
    This talk gives an introduction to how the compiler, linker, loader and operating system cooperate to get from a compilation unit to a running process. We'll look at static and dynamic linking, relocations, position-independent code, sections and segments, and virtual memory. The talk covers Linux only, but similar principles apply on Windows and Mac.
    ---
    Slides: github.com/phi...
    Sponsored By think-cell: www.think-cell...
    ---
    Anders Schau Knatten
    Anders started programming in Turbo Pascal in 1995, and has been programming professionally in various languages since 2001. He's currently a principal developer at Ascenium, working on a new general-purpose CPU design. He's the author of cppquiz.org and the blog C++ on a Friday.
    ---
    C++ on Sea is an annual C++ and coding conference, in Folkestone, in the UK.
    - Annual C++ on Sea, C++ conference: cpponsea.uk/
    - 2023 Program: cpponsea.uk/20...
    - Twitter: / cpponsea
    ---
    YouTube Videos Filmed, Edited & Optimised by Digital Medium: events.digital...
    #cpp #cpponsea #compiler

COMMENTS • 22

  • @hedgechasing
    @hedgechasing 1 year ago +17

    Around 8:50, the mov eax, 0 before the call is not about main returning zero when you don't specify a return value; it is part of the C ABI. The functions here are written with nothing in the parentheses, as is normal in C++, but in C empty parentheses do not mean no arguments. They declare a K&R-style function with an unknown number of arguments of unknown types. The actual definition would need to specify the arguments in order to use them, but callers could just write extern void whatever() to declare it, since K&R function calls are not type checked. What the 0 specifically represents is the number of vector arguments (usually floating-point values) passed to the function. Variadic functions need to know how many vector registers to save, and that value gives them an upper bound. If the empty parens were replaced with (void), the mov eax, 0 would go away even at -O0; without that change it will persist even at higher optimization levels (assuming the two functions are in separate translation units, so the function doesn't get inlined).

    • @andersknatten
      @andersknatten 11 months ago +4

      Thanks for correcting that! I had forgotten about this difference between C and C++. I'm mostly writing C++, I guess it shows. :)
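
A minimal sketch, not taken from the talk, of the variadic case the comment above refers to. `sum_doubles` is a hypothetical helper introduced only for illustration. Under the x86-64 System V ABI, a caller of a variadic function sets AL to an upper bound on the number of vector registers used for arguments, so the callee knows how many of them it may need to save before walking its argument list.

```cpp
#include <cstdarg>

// Hypothetical helper (not from the talk): sums `count` doubles that are
// passed as variadic arguments.
// On x86-64 System V, the caller sets AL to an upper bound on the number of
// vector registers used for the arguments of a call like this one.
double sum_doubles(int count, ...) {
    va_list args;
    va_start(args, count);
    double total = 0.0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, double);  // each argument is read as a double
    }
    va_end(args);
    return total;
}
```

Compiling a call such as `sum_doubles(3, 1.0, 2.0, 3.0)` and inspecting the assembly in Compiler Explorer should show a write to eax or al just before the call instruction.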

  • @deckard5pegasus673
    @deckard5pegasus673 10 months ago +6

    50:14
    -fpic = position-independent code with a machine-specific limit on the size of the GOT (x86 has no such limit).
    -fPIC = position-independent code with no size limit for the GOT.

  • @nitsanbh
    @nitsanbh 11 months ago +4

    I learned so much from this talk,
    Thank you!

  • @Byynx
    @Byynx 6 months ago +1

    This video is a gem!!!

  • @alx9r
    @alx9r 11 months ago +3

    I can also recommend James McNellis’ “Everything you wanted to know about DLLs” on this topic.

  • @widnyj5561
    @widnyj5561 11 months ago +2

    The part about function calling near the end was the most interesting

  • @denisfedotov6954
    @denisfedotov6954 11 months ago +5

    Nice talk! However, lazy binding is disabled by default in many modern Linux distributions as an attack mitigation, so that the GOT entries used by the PLT are read-only during program execution. This is known as full RELRO.

    • @andersknatten
      @andersknatten 11 months ago +1

      Thanks, I didn't know that.

  • @VincentZalzal
    @VincentZalzal 11 months ago +1

    Excellent talk, the clearest I've seen on this topic!

    • @cpponsea
      @cpponsea 11 months ago

      Great to hear! Thank you for your comment.

    • @andersknatten
      @andersknatten 11 months ago +1

      Thank you! I'm very happy to hear that.

  • @rezwanarefin3493
    @rezwanarefin3493 8 months ago

    18:05 Actually, the compiler does know which compute() function you are calling in this example, because compute() is in the same file. In fact, even at -O1 it will remove the call and inline compute(). The compiler wouldn't know that if compute() were not available in the same translation unit.

  • @dascandy
    @dascandy 1 year ago +1

    @6:21 The middle line on the right has "48 89 e5", which is the start of your compute function; the bottom left has "b8 01 00 00 00", which moves 1 into eax, followed by 5d (pop rbp) and c3 (ret).

    • @andersknatten
      @andersknatten 11 months ago +2

      Yeah, that's what I'm trying to point out at @7:25 too.

  • @Danielm103
    @Danielm103 1 year ago +1

    Awesome talk. I'd be interested to know what Link Time Code Generation and other optimizations like COMDAT folding and /OPT:REF do.

  • @mikefochtman7164
    @mikefochtman7164 10 months ago

    Boy, this explains a lot of nitty-gritty details. We had an application that required several separate processes to have access to a large block of common memory (about 64 kbytes). We did this by defining a large int array in a shared object and initializing it to non-zero. This was 20-some years ago; it might not still work, I don't know. But by initializing it, the array was put in the shared .data segment. So each process had access to the same large array, and one process could 'see' what another process wrote. (Yes, there were other concerns about collisions and such, but the gist of it was that the DLL and its .data segment were shared by all.)

    • @kayakMike1000
      @kayakMike1000 10 months ago

      Well, I suppose you could put a lock on that shared memory to ensure concurrent integrity.

  • @gustavbw
    @gustavbw 10 months ago

    53:20 (on lazy-loading): I understand the concept as partially preparing data when it is declared and only loading the full extent when it is used (or not even then), or as disguising access to some data as actually fetching it first: it is declared and you can reference it, but it's not actually there. Instead, the instructions to get it there are.
    What you're describing sounds to me like caching - i.e. storing the output of some functionality in an easily accessible way so that you do not have to invoke said functionality again. But I might be off here (also I come from a very much not systems/compiler background so I completely understand if "lazy-loading" is the term used for it in your field).
    Side question: Would this mean that you could have runtime dynamic linking if you implemented cache invalidation for this step of the process? (i.e. be able to change bits of the machine code as stored on disk, which when the invalidation occurs, would take effect?)

    • @andersknatten
      @andersknatten 10 months ago

      Yes, lazy loading is a good way to describe this!
      I guess you could do some sort of runtime dynamic linking if you had some way of resetting the GOT to point back into the PLT stub and then convince the dynamic linker/loader to load something different next time. Provided that you have prepared GOT/PLT entries for everything during compilation. Depending on what you mean by runtime dynamic linking of course, I'm just replying very generally here.
      Note, btw, that we never change any *machine code* here, we only change data. It's just pointers in the GOT that are updated, from pointing at the stubs in the PLT to pointing at the real functions.
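
The point in the reply above, that only a pointer in the GOT is updated and no machine code changes, can be sketched in plain C++ with a function pointer. This is only a toy model under stated assumptions: `got_slot`, `resolve_and_call`, `compute_via_plt` and `real_compute` are hypothetical names standing in for a GOT entry, the resolver path, a PLT entry and a function in a shared object. A real dynamic linker does considerably more.

```cpp
// Hypothetical stand-in for a function that lives in a shared object.
int real_compute(int x) { return x * 2; }

// Stub that plays the role of the resolver path reached on the first call.
int resolve_and_call(int x);

// Stand-in for one GOT slot: plain data holding a code address.
// It starts out pointing at the resolver stub.
int (*got_slot)(int) = resolve_and_call;

// Counts how often resolution runs (for demonstration only).
int resolve_count = 0;

int resolve_and_call(int x) {
    ++resolve_count;
    got_slot = real_compute;  // patch the data slot; no code is modified
    return got_slot(x);       // forward the original call to the real function
}

// Stand-in for the PLT entry: always jumps through the GOT slot.
int compute_via_plt(int x) { return got_slot(x); }
```

The first call to `compute_via_plt` goes through the stub and patches the slot; later calls reach `real_compute` directly through the same slot.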

  • @rudalert
    @rudalert 11 months ago

    Thank you for the interesting talk!
    Question about the last chapter: will the loader copy ("load") the function from the shared object into the .got section? I am confused how the state (if the function has any) is differentiated between the processes using the same shared object.

    • @andersknatten
      @andersknatten 11 months ago +1

      What kind of state are you thinking of? If you're thinking of function arguments and local variables, these go on the stack or in registers, which are unique to each process and in fact each invocation in that process. If you're thinking of local static variables, these go in data sections like `.data`, which each process gets a unique copy of. Only the read-only segments are shared between processes.
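
A small sketch of the last point in the reply above, assuming a POSIX system. It uses `fork()` as a convenient stand-in for two processes that share the same code: that is not the same mechanism as two independent programs loading one shared object, but writable data ends up private to each process in both cases. `next_id` and `parent_view_after_child` are hypothetical names introduced for illustration.

```cpp
#include <sys/wait.h>
#include <unistd.h>

// Local static state lives in a writable data section, which each process
// gets its own copy of.
int next_id() {
    static int counter = 0;
    return ++counter;
}

// Forks a child that calls next_id() `child_calls` times, waits for it, and
// returns what the parent sees on its own next call afterwards.
// Returns -1 if fork() fails.
int parent_view_after_child(int child_calls) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        for (int i = 0; i < child_calls; ++i) next_id();
        _exit(0);  // leave the child without running the parent's cleanup
    }
    int status = 0;
    waitpid(pid, &status, 0);
    return next_id();  // unaffected by the child's increments
}
```

However many times the child increments its copy of `counter`, the parent's copy only advances when the parent itself calls `next_id()`.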