CppCon 2017: Nir Friedman “What C++ developers should know about globals (and the linker)”

  • Published Dec 22, 2024

COMMENTS • 47

  • @greenUserman
    @greenUserman 5 years ago +10

    Cool! I really liked this talk. Made me realize I don't know enough about the linker.

  • @ChristianBrugger
    @ChristianBrugger 1 year ago +1

    This talk has some really great questions and comments at the end.

  •  7 years ago +12

    In most cases you don't need to have the global variable as part of the shared library export table. You can use a linker script to remove the symbol for the variable from the export table and only leave the functions/classes that are actually used by the host program visible. This way the shared library will always get private instances of its own globals.

    • @quickNir
      @quickNir 7 years ago +9

      Unfortunately I didn't have time to talk about visibility. As you said, you can hide visibility and this does resolve these issues, but then you get multiple copies of the global. I don't know if I agree with "most"; certainly any global that you don't own yourself, it's not really safe to hide. E.g. if you write a library that uses a logging library (very common), you shouldn't fail to export globals from that logging library. Someone might want to use the logging library and your library, and not want there to be two global loggers. In some cases it would be ok, if you say have a factory as an implementation detail. But many factories, part of the whole point is that clients of the library can register in it. So I think it simply depends.

    •  7 years ago +3

      Yes, you are right. My comment was intended to be an observation on your idea that more than one instance of the same global is less "evil" than multiple constructor/destructor calls on the same instance.

    • @ceigey-au
      @ceigey-au 6 years ago +1

      Off topic: hello fellow aviator-wearing sloth person!

  • @MPDR
    @MPDR 4 years ago +6

    I once had a problem where I had a static vector that was filled at "compile time" with information about dynamic types for an ECS implementation. I encountered the bug when I noticed the vector was being filled before main() executed and then, when main() executed, the vector was actually empty. I ended up using the lazy-loading function solution shown here, which I had found on a website. Cool to be able to actually understand why that was happening. Great talk!
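
A minimal sketch of the lazy-initialization ("construct on first use") approach this comment refers to, assuming a registry that is filled by static registrar objects. The names `type_registry` and `TypeRegistrar` are hypothetical and do not come from the talk or the comment.

```cpp
#include <string>
#include <vector>

// Hypothetical registry of component type names for an ECS.
inline std::vector<std::string>& type_registry() {
    // A function-local static is constructed the first time control
    // reaches this line, so the vector always exists before any caller
    // uses it, regardless of static initialization order.
    static std::vector<std::string> registry;
    return registry;
}

// Hypothetical helper whose constructor runs during static
// initialization and records a type name in the registry.
struct TypeRegistrar {
    explicit TypeRegistrar(const char* name) {
        type_registry().emplace_back(name);
    }
};

// Namespace-scope objects: their constructors run before main().
static TypeRegistrar register_position{"Position"};
static TypeRegistrar register_velocity{"Velocity"};
```

With a plain namespace-scope vector instead of the accessor, a registrar in another translation unit could push into the vector before the vector's own constructor has run, which is one plausible way to end up with entries that are gone by the time `main()` starts.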

  • @sanderjobing
    @sanderjobing 7 years ago +3

    With C++14 using gcc 5.4, I tried declaring a static g_str in the header file as shown in the presentation at 19:57, but this does not link: multiple definition of `detail::g_str[abi:cxx11]()'. Anyone tried this solution also? I was thinking of using it as a best practice.

    • @quickNir
      @quickNir 6 years ago

      Do you have the include guard for your header file? Sorry if that's a silly question. You definitely shouldn't have problems declaring (defining, to be precise) a *static* variable in the header.

    • @ryannicholl8661
      @ryannicholl8661 2 years ago +1

      Of course not, this is an ODR violation.
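
The linker error quoted in this thread names a function, `detail::g_str()`, which suggests an accessor function defined in the header without `inline`. The slide's exact code is not reproduced here, so the following is only a sketch under that assumption; the string value is a placeholder.

```cpp
#include <string>

namespace detail {
// `inline` allows this definition to appear in every translation unit
// that includes the header. Without it, a second .cpp file including
// the header would typically cause a "multiple definition" link error.
inline std::string& g_str() {
    static std::string instance = "hello";  // placeholder value
    return instance;
}
}  // namespace detail

// Per-translation-unit reference to the single shared string. `static`
// gives the reference internal linkage, so it can also sit in a header.
static std::string& g_str = detail::g_str();
```

Every translation unit gets its own reference, but all of them refer to the one string owned by `detail::g_str()`.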

  • @lesto12321
    @lesto12321 2 years ago +1

    As embedded programmers, we have tons of globals that need fast access and that have to share data between HW interrupt routines and the main tasks/threads.

  • @KobiCohenArazi
    @KobiCohenArazi 5 years ago +2

    Nir, around 22:03: why `std::string& g_str` and not `auto&`? There was a comment from the audience. Thanks.

    • @puyadaravi3109
      @puyadaravi3109 4 years ago

      @Kobi Cohen-Arazi Because then you would be accessing the global via a pointer (he could have used `auto` or `std::string`). Also I think there is a `static` keyword missing for the `g_str` declaration.

  • @shivanshu3
    @shivanshu3 6 years ago +2

    This doesn't seem to happen with MSVC. MSVC creates 2 separate globals - one for the dll and one for the exe. But I was able to repro this with g++ on Linux.
    I'm using cl.exe version 19.11.25506 and link.exe version 14.11.25506.0

    • @bloodgain
      @bloodgain 6 years ago +5

      I would see the creation of 2 separate globals as an error. It's certainly unexpected behavior from the programmer's perspective. Here, where the example is just a value, it may not matter, but it certainly matters in many cases -- e.g. the example given of a global logger.

    • @andersknatten
      @andersknatten 3 years ago

      I have the same problem. This works fine on Linux, but on Windows with cl.exe 19.16.27045 and link.exe 14.16.27045.0 I get two separate globals.

    • @pauldubois0
      @pauldubois0 2 years ago

      It's because in Unix toolchains, symbols in shared libs default to being visible; but in MS toolchains they default to being hidden. See other comment threads on this video for the implications of symbol hiding.

    • @andersknatten
      @andersknatten 2 years ago

      @pauldubois0 Did you verify this? I don't have the code I used to test with anymore, but I'm pretty sure I exported the symbols when testing with msvc.

  • @iddn
    @iddn 7 years ago +24

    AFAICT the C++ standard totally ignores the fact that shared libraries exist. Everything is assumed to be a monolithic application. This leads to problems not only with Boost.Log but also Boost.ASIO, and can even prevent mixing the two in the same application. It's a nightmare.

    • @ryannicholl8661
      @ryannicholl8661 2 years ago

      Static/dynamic are implementation details.

    • @llothar68
      @llothar68 1 year ago

      @ryannicholl8661 Implementation should be part of the language. It's insane that stupid people (abstract guys with high IQ but stupid as shit in real life) try to isolate the language from the world it's living in. That we don't have standardized, language-enforced build tools is part of it. This is not the 1980s anymore, where specifying a language is enough. I hate you people from the depth of my heart.

    • @llothar68
      @llothar68 8 days ago

      @ryannicholl8661 this is a real stupid ignorant answer from university ivory tower

  • @Manava2012
    @Manava2012 5 years ago

    @25:00 Defining a global in a header file is not a possibility, right? You would have multiple globals with the same name, one per translation unit, and the build would fail. Not sure what is meant here.

    • @keris3920
      @keris3920 4 years ago

      Globals can be defined in headers. Consider a header only library where all headers are included in a single header file. You can define a global at the top of your aggregate header and use it in all subsequent headers.
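
One way to define a global in a header without the multiple-definition build failure is a C++17 inline variable. A minimal sketch, assuming C++17 or later; the names `g_app_name` and `app_name` are illustrative only.

```cpp
#include <string>

// Assumes C++17 or later. `inline` permits the same definition to
// appear in every translation unit that includes this header; within
// one linked program they all refer to a single object.
inline std::string g_app_name = "demo";  // illustrative name and value

// Hypothetical accessor used by the rest of a header-only library.
inline const std::string& app_name() {
    return g_app_name;
}
```

Without `inline` (and without `static`), including the header from two .cpp files would define the same external symbol twice, which is the build failure the question describes.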

  • @kwkfortythree39
    @kwkfortythree39 7 years ago +4

    I went to the slides URL but cannot find the slides for this talk, only the general repository. What's the direct link, please?

    • @farway-417
      @farway-417 7 years ago +1

      It is CppCon's default message. It might be Nir Friedman hasn't given them the slides (yet).

    • @quickNir
      @quickNir 7 years ago +17

      Apologies about that, I need to submit them. Here is the direct link: www.nirfriedman.com/reveal_globals_linker/.

    • @zhaohui0923
      @zhaohui0923 7 years ago

      Could you please submit a PDF or slide version to the official git repository? I still can't find it. Thanks a lot.

  • @andersknatten
    @andersknatten 3 years ago

    One important heads up: Make sure that these global variables (or the functions that contain them) are exported in the dynamic library! Both in the case of the inline global, and in the case of the static local (the only two I checked), you get two different copies used in the running process if the symbols are hidden. The loader / dynamic linker is not able to ensure uniqueness of hidden symbols.
    When making a shared library, it's not unusual to do `-fvisibility=hidden` to only export the symbols you intend to export.
    - For the inline solution you can just annotate the variable with `__attribute__ ((visibility ("default")))` to export it, and everything works.
    - For the static local solution, you have to annotate the function *containing* the static local with `__attribute__ ((visibility ("default")))`
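
The two annotations described above can be sketched as follows, assuming GCC or Clang, C++17, and a library built with `-fvisibility=hidden`. The macro `MYLIB_EXPORT` and the names `g_counter` and `global_name` are hypothetical.

```cpp
#include <string>

// Assumption: GCC or Clang. Other compilers get an empty macro here,
// which means this sketch exports nothing on them.
#if defined(__GNUC__)
#  define MYLIB_EXPORT __attribute__((visibility("default")))
#else
#  define MYLIB_EXPORT
#endif

// Inline-variable case: annotate the variable itself.
MYLIB_EXPORT inline int g_counter = 0;

// Static-local case: annotate the function that contains the static.
MYLIB_EXPORT inline std::string& global_name() {
    static std::string name = "default";
    return name;
}
```

In a single executable the attribute changes nothing observable; it only matters once the header is compiled into a shared library whose default visibility is hidden.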

  • @zhaoli2984
    @zhaoli2984 6 years ago +2

    nice talk. that trick is neat

  • @ryannicholl8661
    @ryannicholl8661 2 years ago

    ODR violation?

  • @AbbeyRoad69147
    @AbbeyRoad69147 1 year ago

    He is tricking the linker into not executing the global constructor. Global constructors run before main() or on load of the .so file. No surprise it segfaults.

  • @lukeskywalker2116
    @lukeskywalker2116 1 year ago

    Ada 95 solved this problem and the C++ people said “oh, that’s too hard.” Ada 95 is C++ 2030

  • @markramirez3920
    @markramirez3920 2 years ago

    "C++ uses C linkers" ...

  • @TheEVEInspiration
    @TheEVEInspiration 7 years ago +4

    So, don't use dynamic linked libraries.
    There is no need for them in most applications as they offer no benefits, only downsides.
    The whole thing is an artificial problem and should not impact development concerns.

    • @quickNir
      @quickNir 7 years ago +26

      This statement is rather hyperbolic. Static and shared linking both have their pros and cons. And if you're writing code for other people, they may have a requirement of having you ship a shared library (or both), so it will not be up to you, so the issues here will still be relevant. For a quick but reasonable summary of the trade-offs: stackoverflow.com/questions/1993390/static-linking-vs-dynamic-linking. Another nice point not discussed here, is that shared libraries allow function interposition, a specific and common example of which is the "LD_PRELOAD trick". AFAIK the environmental variable MALLOC_CHECK_ also depends on interposition. This allows you to quickly and easily rerun your application with a debug or instrumentation oriented version of a function without recompiling. If you were to statically link (in the MALLOC_CHECK_ example, glibc), you can't do any of these tricks and must recompile/relink each time.

    • @TheEVEInspiration
      @TheEVEInspiration 7 years ago +1

      First, thanks for the response and I know I can be provocative!
      I would argue that the case you bring up for dynamic linking can be done without dynamic linking libraries just the same. For example at the expense of an indirection step and a different way of compiling and linking.
      I remember from the 90s the Microsoft C/C++ compiler had the option to compile per function and link only what is needed. When a compiler does that, every function is essentially on its own or part of a graph of code that gets included only when possibly needed. Linking is very fast and could in theory even be done incrementally / just in time. Having a new binary to run with a different implementation of a function in place can be very rapid and with some effort and tooling even with other code changes.
      Anyway, DLL hell became a meme for a reason and there is still heavy over-use of them IMO.
      Back when I still did C++, I hated it when a library had no source code with it. It impeded debugging and learning what truly happened and made having everything compiled the same way unlikely. As for hidden issues like the ones you demonstrated, I actually ran into these before in the 90s and had to link libraries twice (back then I did not know why).
      The only "valid" reason I have seen for shipping libs in compiled form was the fear of customers and others seeing the code. Today with open source and so many security fears in general (not related), as a customer I would prefer having the sources even more than ever before.
      And hiding implementations is just a technical hotfix for a non-technical problem and that is always a bad thing for someone, usually the customer.
      Don't take any of this as a critique on you, your presentation was good.
      I am just voicing a point of view.

    • @TheEVEInspiration
      @TheEVEInspiration 7 years ago

      And it might be my confirmation bias, but I can see a lot of what I wrote also noted on the page you linked.

    • @quickNir
      @quickNir 7 years ago

      Well, that would be an extra indirection. For a function that isn't getting inlined the shared library approach has zero overhead. Not to mention that the shared library approach is non-intrusive: you can use interposition on functions defined in libraries you don't control. That's why it's easy to swap in tcmalloc or jemalloc. Anyway I'm not really sure what the big downsides of shared libraries are. If you stick purely to one or the other, then you don't get these issues, so it seems weird to blame it on shared linking. In fact, a global defined in a shared library is always safe, even if people have other static libraries. The inverse is not true (and how I demonstrate the issue).
      Of the advantages of static linking listed on the SO page, 2/3 (running in limited environments, and startup time) are 100% irrelevant in what I do (HFT). As for making distribution easier, I'm not sure if I really agree. Copying over a directory isn't substantially harder than copying over a single file. Meanwhile, you have all of the benefits discussed above, as well as potentially using less memory and getting better performance when running different, but code-sharing binaries on the same machine.

    • @TheEVEInspiration
      @TheEVEInspiration 7 years ago

      "Well, that would be an extra indirection. For a function that isn't getting inlined the shared library approach has zero overhead"
      Indirection that is perfectly predictable is practically free. And where non-inlined code is located in memory matters, just as with data layout. Code gets cached, and thus pre-fetch predictability and information density have a performance impact. One of the jobs of an optimizing compiler+linker system is to figure out either by profiling or static call graph analysis where to put code relative to calling code. Code that is hardly ever called must be moved away from hotspots to keep the hotspots dense and increase code cache effectiveness.
      This matters when ultimate run-time performance is your primary goal. The less of a library is used, the less dense the code is from the perspective of your CPU's cache(s). Using very little of the functionality in a library is common. Static linking alone can remove the dead code and have a benefit (if the library isn't a tangled mess).
      Your use-case might be different in that your typical hotspots reside fully in your main program or library. Then this effect does not affect you as much as each can get optimized on its own. This is a design element creators of libraries need to keep in mind. Too often I see tiny and frequently needed functions put in APIs that are not part of the main program or even have to be invoked remotely.