I recently got hired by a company for a mixed programming position. I had NO experience in embedded systems programming whatsoever and was struggling to catch up. This course single-handedly brought me up to speed with quality and clarity unrivaled by any other UA-cam training course (and I have watched quite a few). If you have just found this course, keep going. You have come to the right place. The fact that Miro Samek has posted this course for free is the height of scholastic generosity. I have learned more from this course than any computer-science college class.
I have seriously never in my life seen a better tutorial. This gives me just the exact information I need, no extra jokes, no useless explanations. Excellent work!
Agreed. So tired of the loud, sarcastic, quippy, quirky, meme-filled teaching style that is unfortunately taking over the world right now. Feels good being spoken to like a grown-up!
I have watched and read hundreds of programming tutorials in my time and (being the inquisitive person I am) have always been left wondering about what's actually going on inside the system, amongst other things. This tutorial and your teaching are the best I have ever seen, and they put the university I went to to complete shame. Good thing I knew my uni was bad, though; that's why I left after the first year. Thank you so much for these lessons, you are a GOD in my eyes.
I really liked how you explained the intricate detail of memory and pipelining, rather than just coding purely in C. It really cleared some things up for me, thanks. :)
This is a beautiful demonstration of how to teach both to the beginner and the intermediate at the same time. I've known about nesting control structures for quite some time, but coming from a scientific (not computer science) background, I find the side discussions regarding memory etc. illuminating. Thank you!
Sir, thank you for all the effort you are putting in, these tutorial videos are great! Tutorials this useful are really rare to find. Please keep it up for the people out there who are craving to learn.
Great material, excellent teaching style. We went over the low-level concepts in my Computer Organization class, but seeing it all in action, with the concepts being brought up as needed, is what really makes it all click.
Just thought I'd put this here in case anyone's interested. I was curious about the chosen method of determining whether a number is odd using bitwise operations as opposed to the more typical modulus arithmetic method. After some digging, I found that the bitwise approach may be marginally more efficient than using modulus, however bitwise is not guaranteed to work for negative numbers. For this reason, it is generally advised to use modulus unless you're working with a particularly time sensitive application, for example an interrupt, and you know that the number you're testing will never be negative.
That's an excellent point. Generally, the first concern during coding is to express the intent as clearly as possible, and the modulus operator would do this more clearly than the (obscure) bit operation. (Specifically, for the uninitiated, the test for "oddness" with the modulus operator would be ((counter % 2) != 0).) Presumably, most modern compilers would optimize the modulus (division) the same way as the bit operator. I should have mentioned this in the video, but at the same time I wanted to provide an example for the bit operator because it is particularly important for embedded programmers. I hope this makes sense. --MMS
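For anyone who wants to compare the two tests side by side, here is a minimal C sketch. The function names `is_odd_bit` and `is_odd_mod` are made up for illustration, and the bitwise version assumes two's-complement integers (which is what ARM compilers use), so treat it as a sketch rather than a portable guarantee.

```c
#include <stdbool.h>

/* Bitwise test: looks only at the least-significant bit.
 * Assumption: two's-complement representation, so a negative
 * odd number also has bit 0 set. */
bool is_odd_bit(int x) {
    return (x & 1) != 0;
}

/* Modulus test: compare the remainder against 0, not against 1,
 * because in C99 and later (-3 % 2) evaluates to -1. */
bool is_odd_mod(int x) {
    return (x % 2) != 0;
}
```

Under these assumptions both functions agree for negative inputs as well. The variant that goes wrong for negative numbers is `(x % 2) == 1`.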
Hi, at @6:41 you mentioned that the generated machine code is faster than the C code because there is only one conditional branch at the bottom of the loop. I don't understand how there isn't one conditional branch in the C code, as there are two outgoing arrows and two ingoing arrows in both situations. The only difference to me is that the conditional branch is at the bottom vs. the top. Can you clarify what you meant by this?
I'm having issues following along with the first 2 lessons in Keil micro vision. Any tips or recommended videos to watch so I can know how to interpret your earlier lessons with a different IDE? Thanks.
Maybe you can skip forward and watch a lesson that uses KEIL µVision? The first lesson to do so is lesson #21 "Foreground/Background Architecture" ua-cam.com/video/AoLLKbvEY8Q/v-deo.htmlsi=MzaGjGY9y6j3DlSp
At 6'40", you compared the efficiency of the generated code with that of the code we intend to write, and then you said that the generated code is faster because it has only one loop at the bottom. However, the flowchart on the left also has only one loop, which stays at the beginning. I am quite confused why the right flowchart is faster than the left one.
@108670028652039476584 The code generated by the compiler is faster because it avoids a branch to jump over the body of the loop when the condition is false. As I explain just a minute later in this lesson, a branch (jump) stalls the instruction pipeline in the CPU, and therefore it is quite costly (the cost is at least 3 clock cycles, which corresponds to the number of stages in the pipeline). --MMS
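One way to see the difference between the two flowcharts is to write both loop shapes with explicit jumps and count the branch instructions that get executed. This is only a sketch: the function names and the `branches` counter are made up for illustration, and it counts executed branch instructions, not pipeline stalls or clock cycles.

```c
/* Loop as drawn on the left: test at the top, plus an
 * unconditional jump back to the test after every pass. */
int top_test(int limit, int *branches) {
    int counter = 0;
    *branches = 0;
top:
    ++*branches;              /* conditional branch: leave when the test fails */
    if (!(counter < limit)) goto done;
    ++counter;
    ++*branches;              /* unconditional branch back to the test */
    goto top;
done:
    return counter;
}

/* Loop as generated by the compiler: one jump to the test up front,
 * then a single conditional branch at the bottom of every pass. */
int bottom_test(int limit, int *branches) {
    int counter = 0;
    *branches = 1;            /* one-time unconditional branch to the test */
    goto test;
body:
    ++counter;
test:
    ++*branches;              /* conditional branch back to the body */
    if (counter < limit) goto body;
    return counter;
}
```

With a limit of 5, both versions return 5, but the first executes 11 branch instructions and the second only 7. The saving is one branch instruction per pass through the loop.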
At 3:42, how do you step one assembly code at a time? When I perform Step Into in debugger mode, it is stepping to the next statement in the C code, not the assembly code.
You need to click on the disassembly window and make it active. Then single-stepping will advance one machine instruction at a time. To go back to stepping one C line at a time, click on the source code window. --MMS
Thank you for the course. I have one doubt about running the debugger a second time. On the first run, the green highlight in the debugger steps through each assembly instruction, i.e. both the B.N and the BLT.N, but on the second run it jumps directly between ADDS and CMP and doesn't highlight the "0x8e: 0xdbfc BLT.N 0x8a". Note that I am using "Step Into F11" and I know that the BLT.N is executed (checked by putting a breakpoint on it), but what's the reason for it not being highlighted after the first run?
You are using two modes of debugging: one *source* line at a time and one machine *instruction* at a time. The source-line mode is used when the active window is the source code window. The machine-instruction mode is used when the disassembly window is active. --MMS
3:55 I wanted to see if I could predict what the value would be for the compare instruction on my machine. My code is almost the same as yours except that I have while (counter < 5). I used page A6-64 (page 202 of the document) of the ARMv7-M architecture reference manual you linked in the description. The description for the compare instruction in the reference manual states that bits 15:6 are 0b 010000 1010, bits 5:3 are for Rm, and bits 2:0 are for Rn. Now, in my code, the value of 0 is moved to R0. So, bits 5:3 are 000. Since we are comparing this value to #5, whose binary representation is 101, we should get that the compare instruction has a value of 0b 010000 1010 000 101, which is 0x4285. However, my disassembler is saying that the instruction value is actually 0x2805. Would you please explain how/why it should be 0x2805 instead of 0x4285?
Great video, very clear and well presented. One question I have is: when you wrote the code for determining if the counter was odd, you put '!= 0'. Why did you choose this as opposed to '== 1'? I get that both ways mean the same thing and would work, but is this a more efficient way or something? It just seems like reverse logic to me.
Why did I choose '!=0' vs. '==1'? Because comparison against zero is almost always the fastest. CPUs have instructions that set status bits when the argument is (or isn't) zero. This means that most of the time the CPU does not really need to do any comparison. In contrast, comparing to anything else typically requires generating this constant in a register (like 1), and then comparing with it, which is as expensive as subtraction. --MMS
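A possible explanation, offered as a sketch rather than an official answer: the bit layout quoted above looks like the register form of CMP (compare two registers), while `counter < 5` compares a register with a constant, which as far as I can tell uses the immediate form with the layout 00101 Rn imm8. The two helper functions below are hypothetical names that simply assemble those bit fields.

```c
#include <stdint.h>

/* CMP Rn, Rm (register form, 16-bit encoding): 010000 1010 Rm Rn */
uint16_t encode_cmp_reg(unsigned rn, unsigned rm) {
    return (uint16_t)(0x4280u | ((rm & 7u) << 3) | (rn & 7u));
}

/* CMP Rn, #imm8 (immediate form, 16-bit encoding): 00101 Rn imm8 */
uint16_t encode_cmp_imm(unsigned rn, unsigned imm8) {
    return (uint16_t)(0x2800u | ((rn & 7u) << 8) | (imm8 & 0xFFu));
}
```

Under that reading, `encode_cmp_imm(0, 5)` gives 0x2805, which is what the disassembler shows for CMP R0, #5, whereas 0x4285 is `encode_cmp_reg(5, 0)`, i.e. a comparison of register R5 with register R0.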
The "T1" name is explained in the "ARM Architecture Reference Manual", as shown in the video. The BLT instruction has multiple such "encodings" (slightly different binary representations). All these "encodings" are disassembled to the BLT instruction, but they have different restrictions as to the context of use. Some encodings cannot be used in the context of the "IT block" (IF-THEN block). Please read the "ARM Architecture Reference Manual" shown in the video for more information. --MMS
Thanks for the great tutorials. I have one question: in the disassembly window, when I'm using the while loop, I am not able to see the additional instructions (like B.N or BLT.N) as in the second part of this tutorial. I am only able to see the regular instructions (ADDS and MOVS) as in the first part of this video. Do you have any suggestions or ideas why that is happening? Thanks
In order to have branch instructions (such as B.N or BLT.N), you need to have branches, such as if-statements and while-loops in your C code, which are gradually added to the code as the lesson progresses. Perhaps you don't have them in your code yet? (BTW, the .N suffix after branch instructions denotes the "narrow" 16-bit encoding. There are also "wide" 32-bit encodings, marked .W, that can jump much further from the current location.)
+Nawar Youssef I have the exact same issue. I thought it could be due to a newer version of IAR, but that doesn't make sense really. We should be able to see the same instructions in the disassembly regardless of version
+Johnny Hall Thanks Johnny, I figured it out! It was due to the old version of the code when we copy and paste the folders. Somehow it was compiled as the code for the previous lesson. All you need to do is create a new project for each lesson and then copy the previous code, instead of copying the whole folder.
Your tutorials are great. I ordered the DE0-Nano-SoC a few days ago and I saw your tutorials today. I like the DNS for the Ethernet speed and I like IAR because, I like to see what is happening on the register level and I like your tutorials. The DNS is a hybrid with the Cortex A9 and an FPGA. It is made by Altera. I looked at the IAR site and it offers support for the Cortex A9 under a different name and the site also lists Altera as a support group or something. Is there a fit with DE0-Nano-SoC and IAR that I am missing or is it something that possibly could happen later?
How does the program decide what general purpose register to use when storing data? For e.g. in this video, the value of the counter variable is stored in register R1 (see 2:07). However, in my code, the value is stored in R0. Is it possible to force the program to store a value to a certain register or is it automatically determined? How do you work with more than one register at a time and is it possible to fill up all the general purpose registers with data? I.e. you force the registers R0 to R12 to hold values at the same time.
The compiler is free to choose the CPU registers as it pleases (within limits). And no, you can't force any specific way of allocating registers to variables, at least not in the *standard* C. The restrictions on choosing the registers are known as the calling convention. This is the agreement as to what registers can be clobbered and which must be preserved across function calls (lessons 8 and 9 talk about functions and the ARM Procedure Calling Standard).
I've placed a link to the "ARMv7-M Architecture Reference Manual" at the end of the video description, but this is not an official link from ARM. To get to the official documentation, ARM Ltd. requires you to sign up to their website. Please go to arm.com, type "Architecture Reference Manual" into the search box, and sign up to download the PDF.
Hello, I'm a newbie to embedded systems programming. First I would like to say, "Thanks for this series!" In order for me to understand everything to the fullest, I watch your videos multiple times in an attempt to make sure I grasp every concept before moving on. I hope you can help me with my question: In this video at position 9:00 minutes, we see: 0x80: 0xdbfb BLT.N ??main_1 According to your previous explanation, the offset for the BRANCH is 0xfb (-5), which means this instruction would branch to 0x80 + 0xfb = 0x7b:, and not 0x7a. Can you clarify why the offset is not 0xfa?
Daniel P HI Daniel. I've picked up on this too which I think is an error in the video. Calculating the new PC location is not correct from what I can see. If you add more statements ++counter; you will increase the offset and create a new value and the PC - Offset formula doesn't go to the right location.
Daniel P As I explain on several occasions later in the course, any value loaded to the PC on Cortex-M must be an odd number. This is because the least-significant bit in the PC is not used for addressing, but rather it is the THUMB flag, meaning that it indicates the THUMB instruction set. But Cortex-M supports only THUMB (Thumb-2 to be precise), so this bit always has to be 1. In fact, as an experiment you can try to load an even value to the PC. For example, you might change the value in the LR register right before returning from a function (see the upcoming lesson about functions). As you will see, this will cause the HardFault exception in the CPU. --MMS
Quantum Leaps, LLC hi this is interesting. I thought the address shown in PC counter was actually 2 instructions behind the real address due to pipelining.
+Quantum Leaps, LLC Hello! It is perfect tutorial. I am very pleased to watch it. As for this case, I don't understand why 0x7e: 0xdbfc works with 0 in the least-significant-bit and 0x80: 0xdbfb shouldn't work. Both instructions load PC, don't they? Could you explain?
+Dmitrii Kozlov The branch instructions do not load the PC. They increment or decrement the PC (they are *relative* to the current PC). So, you might think that the B instruction is like ADD pc,pc,#xx, which is of course different than LDR pc,... Regarding the offsets, your example 0x80:0xdbfb would not work because the offset 0xfb is odd, whereas the THUMB-2 instructions are always at even addresses.
@9:01, the jump encoded at address 0x80 (= 128 in decimal) is fb, which equals minus 5 bytes, right? Then this would seem to indicate a jump to address 0x7b (= 123 in decimal). But in the Disassembly view it appears that the jump should be to address 0x7a. Can you please explain the apparent discrepancy?
+Forbes Winthrop By now it is probably one of the most frequently asked questions. I explain later in the course that any value loaded to the PC (Program Counter) register on the ARM Cortex-M CPU must be odd. This is because the least-significant-bit in the PC is NOT used for addressing (instructions are always aligned on the even address boundary, anyway). So this LSB in PC is instead used to carry information about the state of the CPU, with 0==ARM-state and 1==THUMB-state. But Cortex-M supports only the THUMB-state, so the LSB bit must be always 1. BTW, you can try to load an even address to the PC from the debugger. You should try it and see what happens!
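For readers who want to check the numbers themselves, here is another way to do the arithmetic, based on my reading of the conditional branch encoding (1101 cond imm8) in the ARMv7-M Architecture Reference Manual. The assumptions are that the 8-bit offset counts 16-bit halfwords (so it is doubled) and that the PC used in the calculation is the address of the branch plus 4. The function name is made up for illustration.

```c
#include <stdint.h>

/* Target of a 16-bit conditional branch such as BLT.N.
 * Assumed rule: target = (instruction address + 4) + 2 * signed imm8. */
uint32_t bcond_t1_target(uint32_t insn_addr, uint16_t insn) {
    int32_t imm8 = (int32_t)(insn & 0xFFu);
    if (imm8 >= 0x80) {
        imm8 -= 0x100;        /* sign-extend the 8-bit field */
    }
    return insn_addr + 4u + (uint32_t)(imm8 * 2);
}
```

With this rule, 0xDBFC at address 0x7E gives 0x7E + 4 - 8 = 0x7A, and 0xDBFB at address 0x80 gives 0x80 + 4 - 10 = 0x7A, which matches the Disassembly view in both cases.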
Hello, I am confused about a few things.. how do we know "d" = t1, B = condition ?? and How does FC = -4? I rewatched the counting segment, and still do not understand how 0xFC = -4?
+Mathew Wilson In 32-bit arithmetic the bit pattern for -4 is 0xFFFFFFFC (just try 0x00000000 - 0x00000004). In 8-bit arithmetic, 0x00 - 0x04 == 0xFC.
Please watch the video @5:00. Here I show the encoding of various B (branch) instructions. The encoding for the BLT.N instruction at the address 0x7E uses the 8-bit immediate offset. All those offsets are interpreted as *signed* numbers. So, an 8-bit signed number 0xFC is -4. If it was interpreted as unsigned number, it would be 252. So, in the end it all depends on how the bit pattern is interpreted.
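The same point can be shown in a few lines of C. This is just an illustrative sketch with made-up function names: one bit pattern, read once as an unsigned and once as a two's-complement signed 8-bit number.

```c
#include <stdint.h>

/* Read the 8-bit pattern as an unsigned value (0..255). */
unsigned as_unsigned8(uint8_t bits) {
    return bits;
}

/* Read the same pattern as a two's-complement signed value (-128..127):
 * if the top bit is set, the value is the unsigned reading minus 256. */
int as_signed8(uint8_t bits) {
    return (bits & 0x80u) ? (int)bits - 256 : (int)bits;
}
```

For the pattern 0xFC, `as_unsigned8` returns 252 and `as_signed8` returns -4; for 0xFB the signed reading is -5.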
+Sai Ganesh At 9:30 I give just an example of the "if" statement. There is no actual code (only comments) in the if-branch and else-branch. But if you would put some actual code in the if-branch, then this code would execute when the counter is odd. Similarly, if you had some actual code in the else-branch, then this code would execute when the counter is NOT odd (i.e., when it is even).
Hi I'm a little confused on the computation of the program counter when branching. So given this: 0x000004DC E7F7 B 0x000004CE When computing the program counter, we add the offset value of the encoding to the current value of the PC. In this case we have PC: 0x4DC and an offset of F7, which in two's complement is -9. So adding them together we get 4D3, which isn't the target address of the branch. What am I missing here?
Your B instruction starts with 11100, so it uses the T2 encoding (see 5:05), so the immediate offset is 11-bit 0x7F7. But you're right that after sign-extending from 11 bits, it is -9. So, now 0x4DC - 9 == 0x4D3 -> 0x4D2 (after discarding the LS-Bit). But I think that this must be further corrected for the pipelining effect 0x4D2 - 4 == 0x4CE. --MMS
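As a cross-check of the numbers in this exchange, here is a small sketch of how I read the unconditional branch encoding (11100 imm11) in the ARMv7-M Architecture Reference Manual: the 11-bit offset is sign-extended, doubled because it counts halfwords, and added to the address of the branch plus 4. The function name is hypothetical.

```c
#include <stdint.h>

/* Target of a 16-bit unconditional branch (B with an 11-bit offset).
 * Assumed rule: target = (instruction address + 4) + 2 * signed imm11. */
uint32_t b_t2_target(uint32_t insn_addr, uint16_t insn) {
    int32_t imm11 = (int32_t)(insn & 0x7FFu);
    if (imm11 >= 0x400) {
        imm11 -= 0x800;       /* sign-extend the 11-bit field */
    }
    return insn_addr + 4u + (uint32_t)(imm11 * 2);
}
```

For 0xE7F7 at address 0x4DC this gives 0x4DC + 4 - 18 = 0x4CE, the target shown by the disassembler.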
Hi Samek, thank you for the series of tutorials. In this tutorial I need one clarification: for 18 and 21 the binary representation seems to be different. Is that correct?
Of course. How would that all work otherwise? The binary representation of 18 is: 10010. Binary representation of 21 is: 10101. I hope you see that's different. --MMS
You don't say what you see instead of the "B.N" instruction. But some disassemblers might simply show the "B" (Branch) instruction. The suffix ".N" means the "narrow" encoding, that is, a 16-bit instruction that can branch only within a limited distance from the current PC (Program Counter), which is the case here (see the instruction binary encoding 0xe000). There are also "wide" branch instructions (suffix ".W") encoded as 32-bit instructions. --MMS
I really appreciate you giving your time :) I can only see the instruction highlighted at 3:35 (0x76: 0x2000 MOVS R0, #0), which is the first line after main. After that I only see a series of ADDS. I don't see B.N, ADDS, CMP, BLT.N and so on. I hope I am explaining this well
You most likely see the disassembly code from the very first version of the code, when the counter variable is incremented several times (++counter;). But at 3:45 the code is already modified to contain the while() loop. --MMS
I am really really sorry. I did not realise I had two main.c files open. The main.c file that was on top had the while loop, but that one was not being executed. Instead, the other main.c file, which had the ++counter code, was being executed, and that's why I saw the old, incorrect disassembly instructions. I am sorry for wasting your valuable time. Thanks for being patient :)
I wrote the same code with version ARM 8.50.4, and in the disassembly view the if-statement is not considered. I don't understand; can someone help me please?
I had the same issue. Copying the files in the beginning of the lesson seems to be the problem. You have to create from scratch the files following the steps in lesson 1
I'm not sure why the compiler generates different machine code for you (I'm presuming that the C code is identical), but I would check the optimization level. At a different level of optimization, the compiler might change the ordering of code and put the else-part first. Then the condition of the branch instruction would be exactly the logical negation, to jump over the else-part. I hope you can see how it works.
The optimization level is low, and the code is the same as you wrote. But when I downloaded the lesson2 project and ran it, the assembler code was the same as in the video.
Calculating the offset isn't quite right here. Modifying the code to include three ++counter statements increases the offset value but the simple formula of Current PC - Offset doesn't give the right location. Help anyone?
It's foolish, but I need your help! I am planning to work through a tutorial about programming the Stellaris Launchpad in C using the IAR Workbench IDE. When I enter debug mode, I have not yet found how to place views side by side. Once in a while it works, but I have not figured out what I did to make it do what I want. The views appear ordered as tabs of a window. I have no problems with the first row of views, in which 4 views appear in parallel, side by side. My problem is with the additional views. It always shows a window across the full screen with 3 tabs, "Locals", "Memory" and "Debug Log". I want them to show up not under tabs, but each as a window of its own. I know that somehow it is done with the light blue bar on the left of this window, but I am unable to split the full width of the screen into 3 individual "screens" as I have them in the upper row of windows in the layout.
Helmut: The documentation for the IAR EWARM Integrated Development Environment is located in \arm\doc\EWARM_IDEGuide.ENU.pdf. Also, the IAR C-SPY debugger is described in \arm\doc\EWARM_DebuggingGuide.ENU.pdf
Before resorting to posting my request for help here, I studied the documents you link to! Apparently my question is too silly to be mentioned there!
I agree that the IAR IDE dock-window implementation is not intuitive, and non-standard. Unfortunately, I don't have the space here to explain what a 260+ page IAR manual has missed. The only advice I can give is that you can grab a window by the tab and drag it to the position you like. Please watch the window outline before you "drop" the window. Also, often to achieve the desired position, you need to drag a given window multiple times.
Hello sir, I have a question. I understand that the offset 0xFC is -4, but I can't understand how the offset 0xFB acts as -6. In the second while loop the address is 0x80 and the instruction is 0xDBFB, so 0x80 - 5 = 0x7B, but it goes to 0x7A. Because the offset 0xFB is -5, it should go to 0x7B. Please help me, sir.
I explain this several times throughout this video course, but let me quickly repeat. The offset 0xFB is obviously -5, which is an *odd* number. But THUMB2 instructions cannot be aligned at odd addresses. Therefore the LSB (Least-Significant Bit) loaded to the PC (Program Counter) is *not* used for addressing (this address bit must be zero, so that all instructions are aligned at even addresses). Instead, the LSB loaded to the PC is used to indicate the CPU mode (1=THUMB, 0=ARM). In Cortex-M CPUs this bit *must* always be "1", because the CPU does not support the ARM instruction set at all. Therefore, you will always see *odd* addresses loaded to the PC, while the actual instruction reached will be at an *even* address. --MMS
This course focuses on the fundamental concepts in embedded programming and also shows you how these concepts are ultimately realized in an embedded processor. These things are not going to change any time soon, because fundamental concepts never go out of style. Moreover, when you see and understand how the fundamental concepts are implemented at the low level, you will gain deeper understanding and you'll use the concepts with greater confidence. Please also remember that the course is still going on and is getting progressively more advanced. Stay tuned! --MMS
This is embedded programming for the ARM architecture. What don't you understand? Do you have a question? Just go to the next videos if you want to dig deeper into interrupts etc.
save the code in another directory: WROOOOONG, CHUMP. Put it into version control, that's backup on steroids. And start to think in terms of branches (diversions from a chosen point), probably intended to merge at some point in time. Suddenly, you don't have to juggle a multitude of possibly changing dirs, all the diffs are by definition anchored to a fixed point in the time/change continuum, and contemporary tools (editors/IDEs) even give you a lot of info (origin, changes) inline. OR: by investing a little bit into the process you reap a multitude of benefits.
@mrdkyzmrdany8742 Since the publication of this video, the complete projects for all lessons have been put under version control and are available on GitHub at: github.com/QuantumLeaps/modern-embedded-programming-course --MMS
Can you share your LinkedIn profile so I can connect with you?
@@abominabletruthman Couldn't agree more.
I love this style that the lecture's topic seems easy but the lecture itself reviews some essential concepts in architecture and assembly.
Dude, thank you!!! All the Coursera courses pale in comparison with this. Nothing better than a practical approach!
Everything in this course and channel is BRILLIANT. I have looked at this a few times; it's very valuable.
Your tutorials are the master piece for embedded engineers...!!...Great work....
I've been searching so long to find such a deep tutorial. I really thank you !
This is one of the best tutorials I have ever seen.
great teaching, very concise,clear and in depth
Best tutorial i have ever had on Embedded systems..kudos!!
You are the best! Such good teaching.
Great Course... Really appreciate your work. I look forward more tutorials like this from you.
Anyone got it?
Thank you very much, please continue this fantastic training
My brain is burning, it's so pleasant. Big up thanks for the work bro !
Thank You Very Much for the tutorial and sharing your precious knowledge.
@@StateMachineCOM Thank you.
Amazing tutorial. I liked the part when you used the & operator instead of a%2!=0. I guess the former variant is quicker in the processor.
@@StateMachineCOM Thank you for such quick response. Yes, I was switching the active window and that was the problem. Really great quality content.
@@StateMachineCOM Thanks for this response. I was stuck with this for half an hour.
Just amazing!
first of its kind ! keep the tutorials coming.. thank you
Great lesson. I would be grateful if you could provide the link to the ARM architecture reference manual you showed in this video.
Thank you for the incredible material!
Love the videos
Amazing lessons!
Thank you very much sir....Incredible materials....thanks a lot for your gratitude....🇮🇳
3:55
I wanted to see if I could predict what the value would be for the compare instruction on my machine. My code is almost similar to yours except that I have while (counter < 5).
I used page A6-64 (page 202 of the document) of the ARMv7-M architecture reference manual you linked in the description. The description for the compare instruction in the reference manual states that bits 15:6 are 0b 010000 1010, and bits 5:3 are for Rm and 2:0 are for Rn.
Now, in my code, the value of 0 is moved to R0. So, bits 5:3 are 000. Since we are comparing this value to #5, whose binary representation is 101, we should get that the compare instruction has a value of: 0b 010000 1010 000 101, which is 0x4285.
However, my disassembler is saying that the instruction value is actually 0x2805.
Would you please explain how/why it should be 0x2805 instead of 0x4285?
Thanks a lot! The tutorials are amazing!
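A possible explanation for the 0x2805 question above, offered as a sketch: the bit pattern 010000 1010 Rm Rn is the *register* form of CMP, whereas comparing against the constant #5 uses the *immediate* form, whose 16-bit encoding is 00101, then 3 bits of Rn, then an 8-bit immediate. The helper names below are hypothetical; the field layouts are the two 16-bit CMP encodings as I understand them from the ARMv7-M manual.

```c
#include <stdint.h>
#include <stdbool.h>

/* CMP (immediate), 16-bit encoding: 0 0 1 0 1 | Rn (3 bits) | imm8 */
bool is_cmp_imm_t1(uint16_t insn)    { return (insn >> 11) == 0x05u; }
unsigned cmp_imm_rn(uint16_t insn)   { return (insn >> 8) & 0x7u; }
unsigned cmp_imm_imm8(uint16_t insn) { return insn & 0xFFu; }

/* CMP (register), 16-bit encoding: 0 1 0 0 0 0 1 0 1 0 | Rm (3) | Rn (3) */
bool is_cmp_reg_t1(uint16_t insn)    { return (insn >> 6) == 0x10Au; }
unsigned cmp_reg_rm(uint16_t insn)   { return (insn >> 3) & 0x7u; }
unsigned cmp_reg_rn(uint16_t insn)   { return insn & 0x7u; }
```

With these helpers, 0x2805 decodes as the immediate form with Rn = 0 and imm8 = 5, i.e. CMP R0, #5. The hand-built value 0x4285 decodes as the register form with Rm = 0 and Rn = 5, which would compare register R5 with register R0 rather than R0 with the constant 5.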
Great video, very clear and well presented.
One question I have: when you wrote the code for determining if the counter was odd, you put '!= 0'. Why did you choose this as opposed to '== 1'?
I get that both ways mean the same thing and would work, but is this a more efficient way or something? It just seems like reverse logic to me.
Why did I choose '!=0' vs. '==1'? Because comparison against zero is almost always the fastest. CPUs have instructions that set status bits when the argument is (or isn't) zero. This means that most of the time the CPU does not really need to do any comparison. In contrast, comparing to anything else typically requires generating this constant in a register (like 1), and then comparing with it, which is as expensive as subtraction. --MMS
@@StateMachineCOM Fantastic response. Thank you, I knew there must have been a reason.
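A small sketch of the odd-number tests discussed in this thread. Apart from speed, there is a second reason to prefer '!= 0' when the remainder operator is used on a signed value: in C99 and later, -3 % 2 is -1, so a test against '== 1' misses negative odd numbers. The function names are hypothetical, and the mask version assumes two's-complement integers (which is the case on ARM Cortex-M).

```c
#include <stdbool.h>

/* Test the least-significant bit; compares against zero. */
bool is_odd_mask(int n)    { return (n & 1) != 0; }

/* Remainder compared against zero: correct for negative n as well. */
bool is_odd_rem_ne0(int n) { return (n % 2) != 0; }

/* Remainder compared against one: wrong for negative odd n,
 * because the remainder takes the sign of the dividend. */
bool is_odd_rem_eq1(int n) { return (n % 2) == 1; }
```

For non-negative values all three agree; for n = -3 the first two report "odd" and the third does not.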
At 5:00 you said our instruction uses the encoding 'd', so it's T1. I couldn't quite understand why. Could you please elaborate?
The "T1" name is explained in the "ARM Architecture Reference Manual", as shown in the video. The BLT instruction has multiple such "encodings" (slightly different binary representations). All these "encodings" are disassembled to the BLT instruction, but they have different restrictions as to the context of use. Some encodings cannot be used in the context of the "IT block" (IF-THEN block). Please read the "ARM Architecture Reference Manual" shown in the video for more information. --MMS
Thanks for the great tutorials. I have one question: in the disassembly window, when I'm using the while loop, I am not able to see the additional instructions (like B.N or BLT.N) as in the second part of this tutorial. I am only able to see the regular instructions (ADDS and MOVS) as in the first part of this video. Do you have any suggestions or ideas why that is happening? Thanks
In order to have branch instructions (such as B.N or BLT.N), you need to have branches, such as if-statements and while-loops in your C code, which are gradually added to the code as the lesson progresses. Perhaps you don't have them in your code yet? (BTW, the .N suffix after branch instructions denotes a "narrow" 16-bit encoding. There are also "wide" 32-bit branch encodings that can jump much further from the current location.)
Thanks for the information, but I am using a while loop, so I think it may be because of a different version of the software. Thanks anyway.
+Nawar Youssef I have the exact same issue. I thought it could be due to a newer version of IAR, but that doesn't make sense really. We should be able to see the same instructions in the disassembly regardless of version
+Johnny Hall thx Johnny, I figured it out! It was due to the old version of the code left over when we copy and paste the folders. Somehow it was still compiled as the code for the previous lesson. All you need to do is create a new project for each lesson, and then you can copy the previous code instead of copying the whole folder.
Nawar Youssef ah! Thank you! Very clever
Your tutorials are great. I ordered the DE0-Nano-SoC a few days ago and I saw your tutorials today. I like the DNS for the Ethernet speed and I like IAR because, I like to see what is happening on the register level and I like your tutorials. The DNS is a hybrid with the Cortex A9 and an FPGA. It is made by Altera. I looked at the IAR site and it offers support for the Cortex A9 under a different name and the site also lists Altera as a support group or something. Is there a fit with DE0-Nano-SoC and IAR that I am missing or is it something that possibly could happen later?
I found my answer. There is a c-compiler in my kit but it was delayed in loading on my PC.
I was incorrect. My c-compiler looks to be line code text. Will IAR run code for my DE0-Nano-SoC?
How does the program decide what general purpose register to use when storing data? For example, in this video, the value of the counter variable is stored in register R1 (see 2:07). However, in my code, the value is stored in R0. Is it possible to force the program to store a value in a certain register, or is it automatically determined? How do you work with more than one register at a time, and is it possible to fill up all the general purpose registers with data, i.e. to force the registers R0 to R12 to hold values at the same time?
The compiler is free to choose the CPU registers as it pleases (within limits). And no, you can't force any specific way of allocating registers to variables, at least not in the *standard* C. The restrictions on choosing the registers are known as the calling convention. This is the agreement as to what registers can be clobbered and which must be preserved across function calls (lessons 8 and 9 talk about functions and the ARM Procedure Calling Standard).
I've placed a link to the "ARMv7-M Architecture Reference Manual" at the end of the video description, but this is not an official link from ARM.
To get to the official documentation, ARM Ltd. requires you to sign up to their website. Please go to arm.com, type "Architecture Reference Manual" into the search box, and sign up to download the PDF.
Hello, I'm a newbie to embedded systems programming. First I would like to say, "Thanks for this series!" In order for me to understand everything to the fullest, I watch your videos multiple times in an attempt to make sure I grasp every concept before moving on.
I hope you can help me with my question:
In this video at position 9:00 minutes, we see:
0x80: 0xdbfb BLT.N ??main_1
According to your previous explanation, the offset for the BRANCH is 0xfb (-5), which means this instruction would branch to 0x80 + 0xfb = 0x7b:, and not 0x7a.
Can you clarify why the offset is not 0xfa?
Daniel P HI Daniel. I've picked up on this too which I think is an error in the video. Calculating the new PC location is not correct from what I can see. If you add more statements ++counter; you will increase the offset and create a new value and the PC - Offset formula doesn't go to the right location.
Daniel P
As I explain on several occasions later in the course, any load to the PC on Cortex-M must be an odd number. This is because the least-significant-bit in the PC is not for addressing, but rather it is the THUMB-flag, meaning that it indicates the THUMB instruction set. But Cortex-M supports only THUMB (Thumb-2 to be precise), so this bit must always be 1.
In fact, as an experiment you can try to load something even to the PC. For example, you might change the value in the LR register right before returning from a function (see the upcoming lesson about functions). As you will see, this will cause the HardFault exception in the CPU.
--MMS
Quantum Leaps, LLC hi this is interesting. I thought the address shown in PC counter was actually 2 instructions behind the real address due to pipelining.
+Quantum Leaps, LLC Hello! It is perfect tutorial. I am very pleased to watch it. As for this case, I don't understand why 0x7e: 0xdbfc works with 0 in the least-significant-bit and 0x80: 0xdbfb shouldn't work. Both instructions load PC, don't they? Could you explain?
+Dmitrii Kozlov The branch instructions do not load the PC. They increment or decrement the PC (they are *relative* to the current PC). So, you might think that the B instruction is like ADD pc,pc,#xx, which is of course different than LDR pc,... Regarding the offsets, your example 0x80:0xdbfb would not work because the offset 0xfb is odd, whereas the THUMB-2 instructions are always at even addresses.
@9:01, the jump encoded at address 0x80 (= 128 in decimal) is fb, which equals minus 5 bytes, right? Then this would seem to indicate a jump to address 0x7b (= 123 in decimal). But in the Disassembly view it appears that the jump should be to address 0x7a. Can you please explain the apparent discrepancy?
+Forbes Winthrop By now it is probably one of the most frequently asked questions. I explain later in the course that any value loaded to the PC (Program Counter) register on the ARM Cortex-M CPU must be odd. This is because the least-significant-bit in the PC is NOT used for addressing (instructions are always aligned on the even address boundary, anyway). So this LSB in PC is instead used to carry information about the state of the CPU, with 0==ARM-state and 1==THUMB-state. But Cortex-M supports only the THUMB-state, so the LSB bit must be always 1. BTW, you can try to load an even address to the PC from the debugger. You should try it and see what happens!
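For readers who want to check the numbers in this question against the manual, here is a minimal sketch. It assumes the branch-target rule for the 16-bit conditional branch as I read it in the ARMv7-M Architecture Reference Manual: the 8-bit offset is sign-extended, counted in halfwords (multiplied by 2), and added to the PC value, which reads as the address of the branch instruction plus 4. The function name is hypothetical.

```c
#include <stdint.h>

/* Target address of a 16-bit conditional branch (top nibble 0xD).
 * insn_addr: address of the branch instruction itself
 * insn:      the 16-bit opcode, e.g. 0xDBFB */
uint32_t branch_t1_target(uint32_t insn_addr, uint16_t insn) {
    int32_t imm8 = insn & 0xFF;      /* low 8 bits hold the offset */
    if (imm8 & 0x80) {
        imm8 -= 0x100;               /* sign-extend: 0xFB becomes -5 */
    }
    int32_t offset = imm8 * 2;       /* offset counts 16-bit halfwords */
    return insn_addr + 4u + (uint32_t)offset;  /* PC reads as insn_addr + 4 */
}
```

Under that rule, the instruction 0xDBFB at 0x80 gives 0x80 + 4 - 10 = 0x7A, and the earlier 0xDBFC at 0x7E gives 0x7E + 4 - 8 = 0x7A, which matches the target shown in the disassembly view in both cases.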
Quantum Leaps, LLC Thanks. Great course.
Hello, I am confused about a few things.. how do we know "d" = t1, B = condition ?? and How does FC = -4? I rewatched the counting segment, and still do not understand how 0xFC = -4?
+Mathew Wilson In 32-bit arithmetic the bit pattern for -4 is 0xFFFFFFFC (just try computing 0x00000000 - 0x00000004). In 8-bit arithmetic, 0x00 - 0x04 == 0xFC.
how do you know that the remaining address is set to "FFFFFF" in 0xFFFFFFFC when only FC is given?
Please watch the video @5:00. Here I show the encoding of various B (branch) instructions. The encoding for the BLT.N instruction at the address 0x7E uses the 8-bit immediate offset. All those offsets are interpreted as *signed* numbers. So, an 8-bit signed number 0xFC is -4. If it was interpreted as unsigned number, it would be 252. So, in the end it all depends on how the bit pattern is interpreted.
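The signed versus unsigned reading of the same bit pattern can be tried out in a few lines of C. This is only an illustrative sketch; the helper names are hypothetical.

```c
#include <stdint.h>

/* Interpret an 8-bit pattern as a signed (two's-complement) number. */
int32_t sign_extend8(uint8_t bits) {
    return (bits & 0x80u) ? (int32_t)bits - 256 : (int32_t)bits;
}

/* Show a signed 32-bit value as its raw 32-bit pattern. */
uint32_t as_32bit_pattern(int32_t value) {
    return (uint32_t)value;
}
```

Here `sign_extend8(0xFC)` yields -4, while the same byte read as unsigned is 252. Going the other way, `as_32bit_pattern(-4)` yields 0xFFFFFFFC, which is where the leading FFFFFF comes from: sign extension copies the top bit of the 8-bit value into all the higher bits.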
@@StateMachineCOM Thanks, that was a good explanation, but can you throw light on these as well: how does "d" = T1 and B = condition?
Thanks in advance
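One way to see where the "d" and the "B" come from is to split the opcode 0xDBFC into its fields. The sketch below assumes the field layout of the 16-bit conditional branch as I understand it from the ARM Architecture Reference Manual: bits 15:12 are 1101 (hex D), bits 11:8 hold the condition code, and bits 7:0 hold the 8-bit offset. The function and table names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Condition-code mnemonics for codes 0x0 to 0xD. */
static const char *const COND_NAMES[14] = {
    "EQ", "NE", "CS", "CC", "MI", "PL", "VS",
    "VC", "HI", "LS", "GE", "LT", "GT", "LE"
};

/* Returns the condition mnemonic of a 16-bit conditional branch,
 * or NULL if the opcode is not such a branch. */
const char *branch_t1_cond(uint16_t insn) {
    if ((insn >> 12) != 0xDu) {
        return NULL;                 /* top nibble is not D */
    }
    unsigned cond = (insn >> 8) & 0xFu;
    if (cond >= 14u) {
        return NULL;                 /* codes 0xE and 0xF are not conditional branches */
    }
    return COND_NAMES[cond];
}
```

For 0xDBFC the top nibble is D, which selects this encoding; the next nibble is B (binary 1011), which is the code for LT, hence BLT; and the low byte FC is the offset.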
Please clarify what happens if we assume the counter is odd at 9:30. It is not clear to me in this video. Thanks in advance
+Sai Ganesh At 9:30 I give just an example of the "if" statement. There is no actual code (only comments) in the if-branch and else-branch. But if you would put some actual code in the if-branch, then this code would execute when the counter is odd. Similarly, if you had some actual code in the else-branch, then this code would execute when the counter is NOT odd (i.e., when it is even).
Hi I'm a little confused on the computation of the program counter when branching.
So given this:
0x000004DC E7F7 B 0x000004CE
When computing the program counter, we add the offset value of the encoding to the current value of the PC. In this case we have PC: 0x4DC and an offset of F7, which in two's complement is -9. So adding them together we get 4D3, which isn't the target address of the branch. What am I missing here?
Your B instruction starts with 11100, so it uses the T2 encoding (see 5:05), so the immediate offset is 11-bit 0x7F7. But you're right that after sign-extending from 11 bits, it is -9. So, now 0x4DC - 9 == 0x4D3 -> 0x4D2 (after discarding the LS-Bit). But I think that this must be further corrected for the pipelining effect 0x4D2 - 4 == 0x4CE. --MMS
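The same target can also be reached with the rule that, as far as I can tell from the ARMv7-M Architecture Reference Manual, applies to this 16-bit unconditional branch: sign-extend the 11-bit offset, multiply it by 2 (it counts halfwords), and add it to the PC value, which reads as the address of the branch plus 4. A minimal sketch, with a hypothetical function name:

```c
#include <stdint.h>

/* Target address of a 16-bit unconditional branch (top five bits 11100).
 * insn_addr: address of the branch instruction itself
 * insn:      the 16-bit opcode, e.g. 0xE7F7 */
uint32_t branch_t2_target(uint32_t insn_addr, uint16_t insn) {
    int32_t imm11 = insn & 0x7FF;    /* low 11 bits hold the offset */
    if (imm11 & 0x400) {
        imm11 -= 0x800;              /* sign-extend: 0x7F7 becomes -9 */
    }
    return insn_addr + 4u + (uint32_t)(imm11 * 2);
}
```

For the instruction in the question this gives 0x4DC + 4 - 18 = 0x4CE, the target shown by the disassembler.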
Hi Samek
Thank you for the series of tutorials.
In this tutorial I need one clarification:
For 18 and 21 the binary representation seems to be different.
Is that correct?
Of course. How would that all work otherwise? The binary representation of 18 is: 10010. Binary representation of 21 is: 10101. I hope you see that's different. --MMS
@@StateMachineCOM
Got it, it is Dec to Binary👍
I had thought it was Hex to Binary
Thanks for your response.
Really awesome videos 👌
Thanks for the great videos. At 3:42 I don't see the "B.N" instruction on my computer. Can you please tell me the reason why?
You don't say what you see instead of the "B.N" instruction. But some disassemblers might simply show the "B" (Branch) instruction. The suffix ".N" means a "narrow" branch, that is, one encoded as a 16-bit instruction, which is the case here (see the instruction binary encoding 0xe000) and which can reach only a limited distance from the current PC (Program Counter). There are also "wide" (".W") branch instructions encoded as 32-bit instructions, which can jump much further. --MMS
I really appreciate you giving your time :) I can only see the instruction highlighted at 3:35 (0x76: 0x2000 MOVS R0, #0), which is the first line after main. After that I only see a series of ADDS. I don't see B.N, ADDS, CMP, BLT.N and so on. I hope I am explaining this well
You most likely see the disassembly code from the very first version of the code, when the counter variable is incremented several times (++counter;). But at 3:45 the code is already modified to contain the while() loop. --MMS
I am really really sorry. I did not realise I had two main.c files open. The main.c file that was on top had the while loop, but that was not being executed. Instead, the other main.c file, which had the ++counter code, was being executed, and that's why I saw the old, incorrect disassembly instructions. I am sorry for wasting your valuable time. Thanks for being patient :)
I wrote the same code with ARM version 8.50.4, and in the disassembly view the if statement is not considered. I don't understand. Can someone help me, please?
Hello, how did you move the green line in disassembly up and down ?
I had the same issue.
Copying the files in the beginning of the lesson seems to be the problem.
You have to create from scratch the files following the steps in lesson 1
Hello sir,
when I make the project and run the debugger, the branch instruction after CMP is BGE.N, not BLT.N.
What is the reason for this?
I'm not sure why the compiler generates different machine code for you (I'm presuming that the C code is identical), but I would check the optimization level. At a different level of optimization, the compiler might change the ordering of code, and put the else-part first. Then the condition of the branch instruction would be exactly the logical negation, to jump over the else part. I hope you can see how it works.
The optimization level is low, and the code is the same as you wrote,
but when I downloaded the lesson2 project and ran it, the assembly code was the same as in the video
@@mohamedabuelkasem4940 You probably had declared unsigned int
Calculating the offset isn't quite right here. Modifying the code to include three ++counter statements increases the offset value but the simple formula of Current PC - Offset doesn't give the right location. Help anyone?
Good lesson
It's foolish, but I need your help! I am planning to work through a tutorial about programming the Stellaris Launchpad in C using the IAR Workbench IDE. When I enter debug mode, I have not yet found how to place views in parallel. Once in a while it works, but I have not found out what I did to make it do what I want. The views appear ordered in tabs of a window. I have no problems with the first row of views, in which 4 views appear in parallel, side by side.
My problem is with the additional views. It always shows a window over the full screen with 3 tabs, "Locals", "Memory" and "Debug Log". I want them to show up not under tabs, but each as a window of its own. I know that somehow it is done with the light blue bar on the left of this window, but I am unable to split the full width of the screen into 3 individual "screens" as I have them in the upper row of windows in the layout.
Helmut: The documentation for the IAR EWARM Integrated Development Environment is located in \arm\doc\EWARM_IDEGuide.ENU.pdf. Also, the IAR C-SPY debugger is described in \arm\doc\EWARM_DebuggingGuide.ENU.pdf
Before resorting to posting my request for help here, I studied the documents you link to! Apparently my question is too silly to be mentioned there!
I agree that the IAR IDE dock-window implementation is not intuitive, and non-standard. Unfortunately, I don't have the space here to explain what a 260+ page IAR manual has missed. The only advice I can give is that you can grab a window by the tab and drag it to the position you like. Please watch the window outline before you "drop" the window. Also, often to achieve the desired position, you need to drag a given window multiple times.
Thank you.
Hello sir
I am going to ask a question. I understand that the offset 0xFC is -4, but I can't understand how the offset 0xFB gives -6. In the second while loop the address is 0x80 and we have 0xDBFB, so 0x80 - 5 = 0x7B, but it goes to 0x7A.
Because the offset 0xFB is -5, it should go to 0x7B.
Please help me, sir.
I explain this several times throughout this video course, but let me quickly repeat. The offset 0xFB is obviously -5, which is an *odd* number. But THUMB2 instructions cannot be aligned at odd addresses. Therefore the LSB (Least-Significant Bit) loaded to the PC (Program Counter) is *not* used for addressing (this address bit must be zero, so that all instructions are aligned at even addresses). Instead, the LSB loaded to the PC is used to indicate the CPU mode (1=THUMB, 0=ARM). In Cortex-M CPUs this bit *must* be always "1", because the CPU does not support the ARM instruction set at all. Therefore, you will always see *odd* addresses loaded to the PC, while the actual instruction reached will be at an *even* address.
--MMS
Quantum Leaps, LLC. Thanks for help sir
Is this course still relevant?
This course focuses on the fundamental concepts in embedded programming and also shows you how these concepts are ultimately realized in an embedded processor. These things are not going to change any time soon, because fundamental concepts never go out of style. Moreover, when you see and understand how the fundamental concepts are implemented at the low level, you will gain deeper understanding and you'll use the concepts with greater confidence. Please also remember that the course is still going on and is getting progressively more advanced. Stay tuned! --MMS
Fockin A
Embedded programming??
LOL, this is pure C programming. Show me how to program interrupts, the NVIC, USB and more, not a fu... for loop function
This is embedded programming for the ARM architecture. What don't you understand? Do you have a question? Just go to the next videos if you want to dig deeper into interrupts etc.
save the code in another directory: WROOOOONG, CHUMP.
Put it into version control, that's backup on steroids. And start to think in terms of branches (diversions from a chosen point), probably intended to merge at some point in time.
Suddenly, you don't have to juggle a multitude of possibly changing dirs, all the diffs are by definition anchored to a fixed point in the time/change continuum, and contemporary tools (editors/IDEs) even give you a lot of info (origin, changes) inline.
OR: by investing a little bit into the process you reap multitudes of benefits.
@mrdkyzmrdany8742 Since the publication of this video, the complete projects for all lessons have been put under version control and are available on GitHub at: github.com/QuantumLeaps/modern-embedded-programming-course --MMS