5. C to Assembly

  • Published May 15, 2024
  • MIT 6.172 Performance Engineering of Software Systems, Fall 2018
    Instructor: Tao B. Schardl
    View the complete course: ocw.mit.edu/6-172F18
    UA-cam Playlist: • MIT 6.172 Performance ...
    This lecture focuses on how C code is implemented in x86-64 assembly. Dr. Schardl reasons through the mapping from C code to assembly in two steps: C to LLVM IR and then LLVM IR to Assembly.
    License: Creative Commons BY-NC-SA
    More information at ocw.mit.edu/terms
    More courses at ocw.mit.edu

COMMENTS • 144

  • @leixun
    @leixun 3 years ago +185

    *My takeaways:*
    1. How does C code become assembly 5:20
    2. LLVM IR primer 10:25
    3. C to LLVM IR 19:45
    4. LLVM IR to assembly 48:18
    5. Case study 1:07:42

    • @Sam-AZ
      @Sam-AZ 2 years ago +1

      Thank you.

    • @leixun
      @leixun 2 years ago +1

      @Sam-AZ You are welcome!

    • @frejustossou9910
      @frejustossou9910 2 years ago +1

      @leixun Hello, which is your channel?

    • @frejustossou9910
      @frejustossou9910 2 years ago +1

      @leixun Done. Your channel is very interesting.

    • @leixun
      @leixun 2 years ago +2

      @frejustossou9910 Thanks!

  • @klausehrhardt4481
    @klausehrhardt4481 2 years ago +4

    Nice to have a look at the intermediate LLVM output files and the way compilers do their work.

  • @shirleyachara3809
    @shirleyachara3809 2 years ago +7

    Best lecture on this topic! Thanks 🙏

  • @josephgibson2548
    @josephgibson2548 4 years ago +20

    Thank you for these videos!

  • @SuperlativeCG
    @SuperlativeCG 2 years ago +50

    The only assembly language I can read are IKEA instructions.

  • @jeffpowell860
    @jeffpowell860 3 years ago +19

    primer, priming, primed....primmed?

  • @filipecotrimmelo7714
    @filipecotrimmelo7714 2 years ago +1

    This class was really amazing.

  • @jimjrivan
    @jimjrivan 2 months ago

    Congratulations on the lecture, professor!

  • @byarkan
    @byarkan 2 years ago +32

    Amish man is my new favorite instructor right now

  • @bariswheel
    @bariswheel 2 years ago +4

    Great stuff thank you

  • @norbi4148
    @norbi4148 2 years ago +4

    Looking at the MIT logo I got scared and suddenly thought of BME MIT. I still have PTSD from it to this day :D

  • @piotrlenarczyk5803
    @piotrlenarczyk5803 2 years ago +1

    Thank you for the video.

  • @user-ju1qd9ek2m
    @user-ju1qd9ek2m 1 year ago

    very clear talk, thanks

  • @custodiogomesbarcellos4972
    @custodiogomesbarcellos4972 2 years ago

    Great content.

  • @ill_t5
    @ill_t5 1 year ago

    Truly something wonderful 🤍🤍.

  • @davereid-daly2205
    @davereid-daly2205 2 years ago +6

    Brilliant, simple, straightforward explanation. Why does everyone else make this seem complicated?

    • @pschneider1968
      @pschneider1968 2 years ago

      Because it is complicated?!

    • @davereid-daly2205
      @davereid-daly2205 2 years ago +4

      @pschneider1968 I don't find it complicated. Generally, in my experience, people who don't fully understand a system struggle to teach others how it works and they tend to complicate things. I see this all the time in my area of expertise. People love to teach others, but there are not many good teachers because too few of them have an in-depth grasp of the material.

    • @pschneider1968
      @pschneider1968 2 years ago

      @davereid-daly2205 Don't get me wrong - the lecture is great! But for me, this is really complicated stuff... E.g. if you look at the x86-64 instruction pipeline, branch prediction, etc. Someone once said (I don't recall who it was) that "the true sign of advanced technology is that it's indistinguishable from magic" 😁

    • @davereid-daly2205
      @davereid-daly2205 2 years ago +2

      @pschneider1968 I'm not sure what you are trying to say, to be honest. But I don't agree with your quoted statement. Magic is largely a product of misdirection, whereas computers are a product of engineering and great design, two very different things.

    • @pschneider1968
      @pschneider1968 2 years ago

      @davereid-daly2205 Yeah, I see that you are not getting my point. Let's leave it at that.

  • @pythontron8710
    @pythontron8710 2 years ago +19

    I would appreciate a Holy C to assembly course.

    • @Antagon666
      @Antagon666 2 years ago +12

      Functions are called Prayers, and are executed by God himself.

  • @raghav151196
    @raghav151196 3 years ago +7

    In the final example of fib, under LBB0_1,
    we calculate (n-1) as: leaq -1(%rbx), %rdi
    but calculate (n-2) in 2 steps.
    Why did we do so?

    • @rickr530
      @rickr530 2 years ago +1

      Yes it does not make sense. It should be optimized to a single instruction: leaq -2(%rbx), %rdi.

    • @digama0
      @digama0 2 years ago +6

      It is because the optimization was set to -O1 in this example, instead of -O3 which can make line-by-line comparison harder. You can see on godbolt (link block seems active, see /z/nzfvh5Gfc on the compiler explorer) that clang will in fact optimize the second ADD to LEA when you crank up the optimization level. My guess is that the reason it didn't select LEA to begin with was because the second calculation was the last use of %rbx, so it figured it would reuse %rbx for the result of computing n-2, and then filling the linkage arguments is a separate stage, where %rdi gets filled. (The -O3 code is even more confusing because it actually inlines one of the recursive calls into a while loop, so it would not have made a good demonstration for the example.)
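
For readers following this thread, here is a minimal sketch of the function under discussion. It assumes the case study uses the usual recursive form of fib; the code is written from that assumption, not copied from the slides, and the register names in the comments are the ones quoted in the comments above.

```c
#include <stdint.h>

/* Recursive Fibonacci in the form the thread appears to be discussing
   (an assumption, not the lecture's exact source). */
int64_t fib(int64_t n) {
    if (n < 2) {
        return n;                /* base cases: fib(0) = 0, fib(1) = 1 */
    }
    /* n has to survive the first call because it is needed again to form
       n - 2. A compiler may therefore keep n in a callee-saved register
       (the thread mentions %rbx) and build n - 1 in the argument register
       without touching n, which is what a single leaq does. */
    int64_t a = fib(n - 1);
    /* This is the last use of n, so its register may be overwritten in
       place and then copied to the argument register. */
    int64_t b = fib(n - 2);
    return a + b;
}
```

Compiling a file containing this function with `clang -O1 -S` writes the assembly to a `.s` file; the exact instruction selection depends on the compiler version and optimization level, so the output may not match the slide line for line.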

  • @deanlhouston
    @deanlhouston 2 years ago +35

    One of the students asks why is C called 'C'... and the instructor didn't know.
    C language is called 'C' because C comes after 'B'. C was derived from and an improvement over an earlier programming language called 'B'. Nothing mysterious or philosophical was involved in naming it.

    • @josh5457
      @josh5457 2 years ago +8

      B itself was derived from BCPL (which came from CPL). One could imagine that C's name is a double entendre, as C comes after B alphabetically but is also the next letter in BCPL (so should C's successor be called D or P?). Honestly I doubt Ritchie and Thompson had this in mind lol but it's fun to think about

    • @jamesmillerjo
      @jamesmillerjo 2 years ago

      And no theory has been confirmed

    • @slbewsp
      @slbewsp 2 years ago +10

      I believe the question was about phi, not about C.

    • @OEFarredondo
      @OEFarredondo 1 year ago

      I'd have called it G

  • @TB-jl9fr
    @TB-jl9fr 2 years ago +1

    Lots of Indian dudes in the lecture. Seems to be the same everywhere in this sort of lecture :D
    After watching 5 min of that video I remembered why I changed my subject from embedded systems to electrical engineering xD

  • @richardhelper167
    @richardhelper167 2 years ago

    I wonder why, in the LLVM IR representation of the mm_base function at 25:22, parameters A and B are marked as readonly. They'd better be preceded by a const keyword.

    • @digama0
      @digama0 2 years ago

      Adding the readonly keyword only means that writing to the pointer is UB, so as long as the C compiler knows that the pointer isn't written to and doesn't pass the pointer to something else that does, it is free to insert the attribute. LLVM can do this as well during optimization, but the C compiler knows the C spec better and can infer constness in places where LLVM might not be able to.
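
To make the const versus readonly distinction concrete, here is a small sketch. The function `mm_small` is a hypothetical stand-in for the lecture's mm_base (its name, signature, and loop order are assumptions). `const` is a promise written in the C source, while `readonly` is an attribute on an IR pointer argument that the toolchain may attach when it can establish that the function never writes through that pointer, with or without `const` in the source.

```c
#include <stddef.h>

/* Hypothetical stand-in for mm_base: C += A * B for n x n row-major
   matrices. A and B are only ever read; C is read and written. */
void mm_small(double *restrict C, const double *restrict A,
              const double *restrict B, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            for (size_t k = 0; k < n; ++k) {
                /* Loads from A and B, one load and one store on C. */
                C[i * n + j] += A[i * n + k] * B[k * n + j];
            }
        }
    }
}
```

Emitting IR for a file like this with `clang -O1 -S -emit-llvm` is one way to check which parameters end up carrying `readonly`; the result depends on the compiler version, so treat it as something to inspect rather than a guarantee.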

  • @banalestorchid5814
    @banalestorchid5814 2 years ago

    @9:34 "Primer" is not pronounced "prim-mer" if it were then it would be written "primmer". It is "prime-er" as in "priming" you for something.

  • @CaseyAnthonyVEVO
    @CaseyAnthonyVEVO 2 years ago

    @16:44 it might just be the network security guy in me but when I saw ICMP I thought of something other than Boolean logic LOL. How confusing.

  • @StephenCameron
    @StephenCameron 2 years ago +2

    At 1:18:26, why did it use leaq to calculate n-1, but movq and addq to compute n-2? Why not again use leaq -2(%rbx), %rdi to compute n-2?

    • @StephenCameron
      @StephenCameron 2 years ago +1

      To answer my own question, the answer *may* be: because leaq is done by address decode hw and add/mov are not, and so they can be executed simultaneously by the CPU, but it may not do two leaq's simultaneously. But this is just a guess, I don't actually know this.

    • @isovideo7497
      @isovideo7497 2 years ago +2

      The n-1 calculation needs to preserve the original n as n is used later. The n-2 term doesn't need to preserve n, so the addq is faster, and this is what you see prior to optimization. The optimizer could indeed combine the addq and movq as that is a simple multi-instruction optimization.

  • @vivekkaushik9508
    @vivekkaushik9508 1 year ago +1

    Watch this before going to bed. You'll wake up smarter.

  • @fabiobairros3582
    @fabiobairros3582 2 years ago +4

    Congrats on the class! How can I implement the C operation (x % y) in assembly?

    • @MrGeorge1896
      @MrGeorge1896 2 years ago +12

      Use the DIV/IDIV divide instruction. The remainder of the division will be stored in register RDX. (=EDX in 32 bit mode)

    • @H33t3Speaks
      @H33t3Speaks 2 years ago

      @MrGeorge1896 Pretty sure there’s a ‘mod’ instruction between two registers or an ALU.

    • @karolmaczek
      @karolmaczek 2 years ago +2

      @H33t3Speaks no

    • @isovideo7497
      @isovideo7497 2 years ago +1

      A div instruction should also give the remainder, but if y is a constant power of two, it may use faster and-mask operations (e.g. x % 4 == x & 0x3).

    • @fabiobairros3582
      @fabiobairros3582 2 years ago

      @isovideo7497 Thanks! Do you know how the Python function pow(a, b, m) works? What algorithm does it use?
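
The replies above about DIV/IDIV and the and-mask trick can be sketched in C as follows. The function names are made up for illustration. One caveat worth stating: the mask shortcut matches `%` for unsigned values; for negative signed values C's `%` truncates toward zero (so `-5 % 4` is `-1`), while `-5 & 3` is `3`, which is why a compiler needs extra fix-up code for signed operands.

```c
#include <stdint.h>

/* Remainder computed from the quotient. This mirrors the identity that a
   divide instruction satisfies when it returns quotient and remainder
   together: x == (x / y) * y + (x % y).
   Preconditions (assumed): y != 0, and not (x == INT64_MIN && y == -1). */
int64_t remainder_via_quotient(int64_t x, int64_t y) {
    return x - (x / y) * y;
}

/* Power-of-two shortcut: for unsigned x and m a power of two,
   x % m equals x & (m - 1). */
uint64_t mod_pow2(uint64_t x, uint64_t m) {
    return x & (m - 1);
}
```

For example, `remainder_via_quotient(17, 5)` gives 2 and `mod_pow2(13, 4)` gives 1, the same as `17 % 5` and `13 % 4`.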

  • @luserdroog
    @luserdroog 2 years ago

    The shape of the Greek letter phi looks like the graph with lots of loops.

    • @StephenCameron
      @StephenCameron 2 years ago +2

      Seems like psi would have been a better fit.

  • @93hothead
    @93hothead 2 years ago

    I'm just looking at these and am completely clueless as to what is going on.... Is there any way to get used to RISC-V architecture??

    • @sergiog5543
      @sergiog5543 2 years ago

      no

    • @pschneider1968
      @pschneider1968 2 years ago +1

      @sergiog5543 There is a reason Andrew S. Tanenbaum, in the initial newsgroup discussion "Linux is obsolete", said that Intel x86 was a "weird" architecture 😉😁

  • @marcioaso
    @marcioaso 4 years ago +20

    I thought it would be C to Assembly, not C to LLVM.

  • @sethtrowbridge9122
    @sethtrowbridge9122 2 years ago +10

    you. in the background. with the squeaky chair: I don't know who you are. I don't know what you want. If you are looking for ransom I can tell you I don't have money, but what I do have are a very particular set of skills. Skills I have acquired over a very long career. Skills that make me a nightmare for people like you.

  • @kenichimori8533
    @kenichimori8533 2 years ago

    Define

  • @solome6478
    @solome6478 2 years ago

    1:10 ahh college life of pulling multiple all-nighters to cram...

  • @bagtea
    @bagtea 3 years ago +6

    who else is struggling with Assembly to machine :(

    • @mvisperas
      @mvisperas 3 years ago

      Try hand assembling. You will learn how assembly is translated to machine code. I used to hand-calculate relative jumps; this can teach you how positive and negative numbers work.

    • @bagtea
      @bagtea 3 years ago +5

      @mvisperas lol I somehow managed to learn and do well in that exam but now I forgot everything
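
As an illustration of the hand calculation mentioned above, here is a sketch of how the displacement byte of a short x86 jump can be worked out. The helper name `encode_rel8` is made up for this example. The short form of `jmp` is the opcode byte 0xEB followed by one signed byte, and that byte is measured from the address of the next instruction, i.e. the jump's own address plus 2.

```c
#include <stdint.h>

/* Computes the displacement byte for a short jump (opcode 0xEB + rel8).
   Returns 1 and stores the byte in *out if the target is reachable,
   or 0 if the distance does not fit in a signed 8-bit value. */
int encode_rel8(uint64_t jump_addr, uint64_t target_addr, uint8_t *out) {
    /* Distance from the end of the 2-byte jump instruction to the target. */
    int64_t disp = (int64_t)(target_addr - (jump_addr + 2));
    if (disp < -128 || disp > 127) {
        return 0;                /* would need a longer jump encoding */
    }
    /* Converting to uint8_t keeps the low 8 bits, which is exactly the
       two's-complement encoding: -2 becomes 0xFE. */
    *out = (uint8_t)disp;
    return 1;
}
```

A jump that targets itself is a handy check: the displacement is -2, so the two bytes are EB FE. A backward jump yields a byte of 0x80 or above, and a forward jump yields a byte of 0x7F or below, which is where the positive and negative numbers come in.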

  • @pekertimulia125
    @pekertimulia125 2 years ago +1

    Smalltalk and C++

  • @magno5157
    @magno5157 2 years ago +7

    59:34 the use of "top" and "bottom" here (referring to the fact that stack grows "downward") is very awful.
    It's much, much more common that people say "the top of the stack" and "the base of a stack frame". In which case, %rbp points to the bottom of the current stack frame and %rsp points to the top of the stack, instead of reversing the sense of direction by describing %rbp as the "top" and %rsp as the "bottom".

    • @haydenp14
      @haydenp14 2 years ago +1

      good insight

    • @evanconley9825
      @evanconley9825 2 years ago +1

      The professor's reference was accurate and is definitely the proper way to define both the "direction" that the stack "grows" as well as "where in the stack frame" each of %rsp and %rbp points to. It is exactly as shown, regardless of changing the semantics. Using the phrases "top of stack" and "bottom of stack frame" doesn't change the definitions, and that explanation doesn't result in the description you provided, because %rbp does not point to the bottom of the current stack frame, it points to the top; %rsp points to the bottom of the current stack frame.
      This is not a matter of interpretation or perspective. %rbp position 0 is near the top of the current frame, holding previous return address, and %rbp position >0 is at the top of the current frame, holding the current return address. %rbp

    • @magno5157
      @magno5157 2 years ago

      @evanconley9825 I never said it was wrong. His way of referencing was just against common usage.

    • @akshatghoshal6098
      @akshatghoshal6098 2 years ago

      @magno5157 I think this is how they taught me in university as well, just like how this professor is teaching

    • @akshatghoshal6098
      @akshatghoshal6098 2 years ago

      @magno5157 actually never mind, I don't know exactly what you're talking about since I am bad lol
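
Whichever end gets called "top", the direction of growth the thread refers to can be observed directly. The sketch below is an experiment, not a portable guarantee: it assumes a GCC or Clang compiler (for the `noinline` attribute) and a platform such as x86-64 where each call's frame is placed at a lower address than its caller's. The function names are made up for illustration.

```c
#include <stdint.h>

/* noinline (a GCC/Clang attribute) keeps this function in its own frame
   so that its local really lives in a callee frame. */
__attribute__((noinline))
static void record_local_address(uintptr_t *out) {
    volatile char marker = 0;          /* local in the callee's frame */
    *out = (uintptr_t)&marker;
}

/* Returns 1 if the callee's local sits at a lower address than the
   caller's local, i.e. the stack grows toward lower addresses here. */
int stack_grows_down(void) {
    volatile char outer = 0;           /* local in the caller's frame */
    uintptr_t inner = 0;
    record_local_address(&inner);
    return inner < (uintptr_t)&outer;
}
```

On a typical x86-64 build this returns 1, which is the sense in which newer frames and %rsp sit at lower addresses than the frame base in %rbp.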

  • @AndreyVarlamov
    @AndreyVarlamov 2 years ago

    learning assembly in 2020 = reinventing wheel...

  • @day7141
    @day7141 2 years ago

    What’s this, only plebes use C?

    • @longlostwraith5106
      @longlostwraith5106 2 years ago

      C runs everything, BOI.

    • @day7141
      @day7141 2 years ago

      @@longlostwraith5106 You’re delusional. Crypts run everything? By that you mean Vice Lords, which are a gang of Rabbi’s.
      There’s a reason the BL’s always attack those old Jews in New York all the time. It’s because they’re the leadership of the Vice Lords. The Folk are the White people branches. The Crypts are the PoC.
      You’re delusional. These groups are all trash. They ruin lives for dollars and street corners. Gangs are stupid people trying to compete in a modern economy.

  • @darkwoodmovies
    @darkwoodmovies 5 months ago

    Love this open courseware, truly thank you... but as an aside, I don't think we need to donate to a school with a $23.5 billion endowment.

    • @mitocw
      @mitocw 5 months ago +1

      The additional funds we are asking for are not for survival but to thrive! MIT gives $1-2 million every year to MIT OpenCourseWare and that's not going away. We've been publishing for 20+ years now, i.e. MIT has given tens of millions of dollars away for free (not to mention the generous material contributions of all the instructors and students at MIT... which are purely voluntary). We will always be publishing courses... but we could always publish more with more money. You can help us publish more courses and help us share more knowledge. ocw.mit.edu/donate

  • @whkee
    @whkee 2 years ago +7

    Assembly to Machine ☝️🤣

  • @tamoghnamukerjee9283
    @tamoghnamukerjee9283 2 years ago +3

    I am sorry, I thought I get a 'C' just to sign up and 'assemble' an imaginary study table.
    You mean to say I was wrong.

  • @jonathanjollimore4794
    @jonathanjollimore4794 2 years ago

    Will yea look at all the numberphile someone times yea need a good old head from logic problems and puzzles ;)

  • @justcurious1940
    @justcurious1940 7 months ago

    I was planning to dig more under 'C' but after seeing this video I lost interest, It's so messy and more complex than 'C' or even 'C++'.

  • @arturo.gonzalex
    @arturo.gonzalex 2 years ago +2

    In european universities they teach us exactly the same. But we can study for free, and in US the price is 100k. Why?

    • @love_pets1363
      @love_pets1363 2 years ago +1

      It's the american dream.

    • @timothydee1507
      @timothydee1507 2 years ago +2

      MIT is ranked #3 in the world and like 8 of the top 10 universities are also American

    • @swarnavasamanta2628
      @swarnavasamanta2628 2 years ago

      The value of a degree from MIT is far more than any university from EU

    • @arturo.gonzalex
      @arturo.gonzalex 2 years ago

      @swarnavasamanta2628 what exactly makes it more valuable? enormous student debt?

    • @swarnavasamanta2628
      @swarnavasamanta2628 2 years ago

      @arturo.gonzalex what makes it valuable is the people who are around you. Everyone who gets into these colleges is smart and works hard, so you're in a good environment. And also the research opportunities in these colleges are manifold. Not to mention there are numerous scholarships for these prestigious colleges. And you get picked up like a hot cake when you graduate from any of these colleges. European universities are good but these are the best.

  • @ScoopexUs
    @ScoopexUs 2 years ago +2

    This isn't really Computer Science, since it analyzes one type of compiler, not even one type of CPU let alone all types (for which Assembler works very similarly). To take the phi loop example, in another compiler there would be nothing to correspond to a phi "instruction", and the variable i would correspond to a single register (by any alias).
    If instead Assembler was taught generally, anyone could write their own compiler for a current or future CPU (by synthesis, bottom-up software engineering). They would still be as unable as you (or any professor) to parse the output of another compiler. You are left with identifying patterns, part of which you understand. But what actual code is run, and how CPUs execute code, is left a mystery to CS students.
    This is why there are no hands in the air. Doing it this way leaves the subject too alien to what they've been taught for years, which is a top-down software engineering approach (yes, even for C which has all the power of Assembler without the benefits and speed.)

    • @alex_pincha
      @alex_pincha 2 years ago +6

      6:57 "this is not a compiler class........"

    • @generessler6282
      @generessler6282 2 years ago +4

      He didn't mention that phi is a standard part of static single assignment, which has been part of the compiler literature for 15 years or more. Any compiler that uses SSA as an intermediate rep (which these days is virtually all) will need a phi representation. In fact the reason llvm and clang exist is to use best-of-breed compiler techniques - like SSA - in a fresh impl rather than trying to glue them into gcc. So this lecture is about compiler engineering as it affects the students in the course, which is about implementing high performing software. Whether that's computer science is a religious issue.
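
To show why static single assignment needs phi at all, here is a small sketch. The C function and the IR in its comment are hand-written for illustration (the IR is not real compiler output, and it assumes the check for n > 0 has already happened before the loop is entered). In C, `i` and `sum` are reassigned on every iteration; in SSA each value is defined exactly once, so the top of the loop has to choose between the value coming from the entry block and the value produced by the previous iteration.

```c
#include <stdint.h>

/* Sums 0 + 1 + ... + (n - 1).

   Hand-written SSA-style sketch of the loop body:

     loop:
       %i        = phi i64 [ 0, %entry ], [ %i.next,   %loop ]
       %sum      = phi i64 [ 0, %entry ], [ %sum.next, %loop ]
       %sum.next = add i64 %sum, %i
       %i.next   = add i64 %i, 1
       %again    = icmp slt i64 %i.next, %n
       br i1 %again, label %loop, label %exit

   Each phi picks its first operand when control arrives from %entry and
   its second when control arrives from the loop's own back edge. */
int64_t sum_below(int64_t n) {
    int64_t sum = 0;
    for (int64_t i = 0; i < n; ++i) {
        sum += i;
    }
    return sum;
}
```

For example, `sum_below(5)` returns 10. After register allocation the two phi values would typically each live in a single register, which matches the point in the parent comment that the final assembly has nothing corresponding to a phi instruction.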

  • @kenichimori8533
    @kenichimori8533 2 years ago

    Cyrillic alphabet assembly Osakana.
    お魚 (fish)

  • @josemanuelquispemamani9672
    @josemanuelquispemamani9672 2 years ago +1

    We need Spanish subtitles

  • @filipecotrimmelo7714
    @filipecotrimmelo7714 2 years ago +1

    Assembly is easier than LLVM lol

  • @setandforgetinvesting6708
    @setandforgetinvesting6708 2 years ago

    My brain can't understand this

  • @jamesmillerjo
    @jamesmillerjo 2 years ago

    So many MIT jokes

  • @rty1955
    @rty1955 2 years ago +1

    OMG x86? Really? What a highly limited, brain dead processor.
    The x86 is a toy trying to compete in a grown up world.
    I'm an old-timer who has written about a million lines of assembler code for over 14 different CPUs. x86 has got to be the worst I've coded on; the best? IBM mainframes. I've written for: Data General, General Automation 460, CDC 1700, PDP 11 series, Quotron, IBM Series/1, Intel 8080, Zilog Z80, PIC processors, TI, IBM mainframes from 1401 thru s/390, Cray, Silicon Graphics and a bunch of other types of machines. I even wrote microcode for IBM 360/30 and PDP 11/44. So I am pretty well versed in many CPU architectures.
    I taught COBOL programmers how to read core dumps as well.
    I used to read 3,000 page core dumps often when major subsystems failed.
    I taught mainframe operating systems concepts at NYU in NY.
    To me, computers are tools to get a job done. Some computers do things very well others do not. Same thing for computer languages. There is no one language that is efficient for all cases. This is why I learned 12 different languages. As a programmer you should ALWAYS be learning. I was self taught IBM assembler and when I did, a big light went on. Suddenly everything made sense. I was a sponge to learn more. Things I could do in assembly could not even be dreamed of by a high level language programmer. I could do more with very limited memory by adopting assembly concepts. I made the mainframe do things that IBM said could not be done. Some code I wrote in the 80s is still running today

  • @MrTiagovla
    @MrTiagovla 3 years ago

    Just jump to the last 10min.

  • @notgate2624
    @notgate2624 2 years ago +1

    Way too high-level. I feel like anyone who has heard of LLVM wouldn't get a lot out of this. No details were given on how it WORKS. He just talks about what it does.
    How to draw an owl:
    1) Draw a circle
    2) Draw the rest of the owl

  • @makerofstartup7902
    @makerofstartup7902 2 years ago

    This video is for complete youngsters, and seeing the fib call at 8:57, you're pretty safe to close this video.
    Tip: modern software uses graphics processing, input processing and much more, not fib algorithms or calls.

  • @astaghfirullahalzimastaghf3648
    @astaghfirullahalzimastaghf3648 3 years ago +1

    I thought every lecturer
    at this university was good.
    Unfortunately some are pure capitalists or mercenaries
    who don't know how to give a proper lecture.

    • @ASCENDANTGAMERSAGE
      @ASCENDANTGAMERSAGE 2 years ago +10

      What?

    • @josephphillips865
      @josephphillips865 2 years ago

      @ASCENDANTGAMERSAGE This university has very good lecturers; however, many universities seem to be all about the money rather than providing a quality education that best serves the needs of students. Ex: a school that charges a bunch of money while having low standards for hiring instructors. Students might get a degree but likely will not have credits that will transfer to a properly accredited university.

  • @illonggoako1372
    @illonggoako1372 4 years ago +2

    To Mark Zuckerberg this is just history... museum information..

    • @tratbagd4500
      @tratbagd4500 4 years ago +7

      What are you talking about ?

    • @Itachi.Uchiha.Offical
      @Itachi.Uchiha.Offical 3 years ago +7

      Yes, what are you talking about? :D

    • @piggubiggu5324
      @piggubiggu5324 3 years ago +1

      He's saying that Mark Zuckerberg thinks all of this is garbage. He's probably right.

    • @starc0w
      @starc0w 2 years ago +9

      @piggubiggu5324 No, he's definitely not right about that.
      Compilers don't fall from the sky. And someone has to understand the basics. That is essential.
      This knowledge is very valuable and important.

    • @Raison_d-etre
      @Raison_d-etre 2 years ago +2

      He also said he was too busy to read books. He wouldn't care about politics but for its impact on his company. Why would you think Zuckerberg is a good judge of anything other than what would improve his company's bottom line?

  • @illonggoako1372
    @illonggoako1372 4 years ago +1

    Obsolete

    • @doggo660
      @doggo660 4 years ago +6

      lmao what?

    • @mvisperas
      @mvisperas 3 years ago +10

      Assembly is not obsolete. The only language a CPU knows is the machine codes. Somebody has to write those compilers.

    • @vladusa
      @vladusa 2 years ago +7

      whoever this commenter is has no idea what the absolute shit he's doing

    • @davidomar742
      @davidomar742 2 years ago +7

      go back to writing your cute little JavaScript, kid

    • @vladusa
      @vladusa 2 years ago

      @davidomar742 u talking about me?

  • @ridwanm5789
    @ridwanm5789 2 years ago +1

    gcc -S untitled.c > untitled.s (am I correct?)

  • @thomasclapton2010
    @thomasclapton2010 1 year ago

    Amish man is my new favorite instructor right now