I remember a nice article by Jack Ganssle on the need for watchdogs. He describes a system that was crashing all the time yet continued to perform as desired because of a good watchdog. In fact, the customers never knew the system was buggy.
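Ganssle's articles are about embedded systems with hardware watchdog timers, but the same idea can be sketched at the process level. A rough Python sketch (all names invented; do_work_step is a stand-in for the real, possibly buggy, work):

```python
# Minimal software-watchdog sketch: a supervising process restarts the
# worker whenever the worker stops sending heartbeats in time.
import multiprocessing as mp
import time

HEARTBEAT_TIMEOUT = 5.0  # seconds of silence before the worker is restarted

def do_work_step():
    time.sleep(0.5)  # stand-in for real work; might hang or crash in practice

def worker(heartbeat):
    while True:
        do_work_step()
        heartbeat.value = time.time()  # "pet" the watchdog

def main():
    heartbeat = mp.Value("d", time.time())
    proc = mp.Process(target=worker, args=(heartbeat,), daemon=True)
    proc.start()
    while True:
        time.sleep(1.0)
        if time.time() - heartbeat.value > HEARTBEAT_TIMEOUT:
            print("watchdog: worker unresponsive, restarting it")  # log it!
            proc.terminate()
            proc.join()
            heartbeat.value = time.time()
            proc = mp.Process(target=worker, args=(heartbeat,), daemon=True)
            proc.start()

if __name__ == "__main__":
    main()
```

The point is the same one the article makes: the supervisor keeps the service alive and the customer never notices, but every restart should still be logged so the underlying bug eventually gets fixed.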
I'd like to point out that in the example @ 10:58 the error condition is basically swept under the rug, and a possibly incorrect default value is returned that could cause a bug elsewhere in a calculation. These bugs can be extremely hard to detect and troubleshoot.
This is a great point; for most web software I'd argue a crash would be *better* here. On the Apollo, probably not. I guess knowing which is the right thing to do given the context you're in is also an aspect of engineering.
This is where another engineering principle kicks in: risk analysis. Once you identify a possible failure or error, you need to figure out how serious it is. Could this kill someone? Could this lead to billing errors, leaked confidential data, or other legal headaches? Or does it just mean the display styling might be off a little? Is this a situation we can recover from, or must we avoid it at all costs? Once you can answer those questions, deciding how to handle the situation gets easier. Sometimes a hard crash IS the failsafe mode. If a one-tonne robotic arm on an assembly line can't determine its position, it should probably stop moving.
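To make the trade-off in this thread concrete, here is a hypothetical Python sketch (not the code from the video) of the two strategies: fail fast when a wrong answer is dangerous, or fall back to a known-safe value, but never silently:

```python
# Hypothetical sketch, not the code from the video: two ways to handle the
# same bad input, chosen by risk analysis rather than by habit.
import logging
import math

logger = logging.getLogger("navigation")

def angle_fail_fast(sine: float, cosine: float) -> float:
    """Refuse to guess: a wrong answer here would be worse than stopping."""
    if not (math.isfinite(sine) and math.isfinite(cosine)) or (sine == 0 and cosine == 0):
        raise ValueError(f"cannot compute an angle from sine={sine!r}, cosine={cosine!r}")
    return math.atan2(sine, cosine)

def angle_fail_safe(sine: float, cosine: float, fallback: float = 0.0) -> float:
    """Keep running, but never silently: log loudly and use a known-safe value."""
    try:
        return angle_fail_fast(sine, cosine)
    except ValueError:
        logger.error("bad angle input sine=%r cosine=%r, using fallback %r",
                     sine, cosine, fallback)
        return fallback
```

Either choice can be right; the mistake this thread is pointing at is picking the fallback version silently, without ever making the decision consciously.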
Weird, I've been struggling with the feeling that thinking about what could go wrong was slowing down my development process, but this point of view changed my mind.
Great topic. Software engineers think beyond good users and consider malicious users, and the environments the software will run in, too. This includes engineering around data and system behaviours, not only the software. Classic examples are SQL injection, or defaulting to installing a driver whenever a USB device is inserted. Software engineering needs to occur at different levels, from core functions to systems-level engineering. This is why it requires a team approach.
I think I have been (by this definition) an engineer from a very early age (maybe 5?). My brain has always fundamentally worked this way, looking at anything and instantly branching out and finding every potential problem with any idea before I even try to generate a solution. My solutions are thus worked backwards from paths that have the fewest hard problems, and then typically those problems are mitigated as much as possible with at least 1-3 redundancies. I am not even talking about programming here, this is how everything in my life works LOL. Although I program in this exact same fashion. For instance, I recently decided to set up a small-ish 36 gallon aquarium in my place and I spent 2-3 months stressing over all the potential failure points. Ended up turning my entire living room into a shallow bath tub, sealing along the baseboard to mitigate water damage, pouring a perfectly level epoxy top onto the stand, covering it with a non-slip mat in case of an earthquake shaking it off the stand, reinforcing it with more silicone, waterproofing all the joints in the stand against leaks because it is fiberboard, mapping out potential water spill paths and making plans to put some 4x4s under the couch to stop the initial flood of water going down the hallway with momentum if the tank exploded. My friends all think I am insane.
Great distinction. I find myself focusing on what could go wrong, on performance, and on the lifetime of the software, while my colleagues tend to focus on delivering as many features as possible :)
I effectively got canceled on a Government programme for arguing that we shouldn't rely on an external actor removing their link to our API to protect the API from being used after a certain date/time. I argued we should correctly record the time window the endpoint could be used for the object (we're talking applications for money here); another person argued that simply letting it return a 500 error was a great idea. Some people don't care about doing things right. Production launch rolled around, and the link was still present on the external system two months after close. Thankfully I finally got listened to, the feature made it into the production release, and it was smooth sailing. No one ever goes back and thanks you for those, only for the failures, and those who cause the failures never get held to account.
One of my CS teachers emphasised that no matter how clearly you label the input field as a Date and document that you expect a "mm/dd/yyyy", someone will try to enter "peanut butter" or "garbage". Always check your inputs and always ask "what's the worst that could (reasonably) happen?"
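That advice is easy to sketch. A minimal Python example, with made-up range limits, that accepts the documented mm/dd/yyyy format and rejects everything else with a message that says what was expected:

```python
# "Always check your inputs": parse the documented mm/dd/yyyy format and
# reject anything else ("peanut butter" included) with a helpful message.
from datetime import date, datetime

def parse_us_date(text: str) -> date:
    cleaned = text.strip()
    try:
        parsed = datetime.strptime(cleaned, "%m/%d/%Y").date()
    except ValueError:
        raise ValueError(f"expected a date in mm/dd/yyyy format, got {text!r}") from None
    # Reasonableness check: reject dates wildly outside the domain we handle.
    if not (1900 <= parsed.year <= 2100):
        raise ValueError(f"year {parsed.year} is outside the supported range 1900-2100")
    return parsed

# parse_us_date("02/29/2024")   -> date(2024, 2, 29)
# parse_us_date("peanut butter") -> ValueError with a message naming the expected format
```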
Excellent video! You must go beyond the happy path and not stop once it's working. Refactoring is a key step too that often gets missed once a feature is "working".
What annoys me most, in terms of what can go wrong, is how a developer or support techie following our work might have to make live database changes as a result of weaknesses in our logic, such as data inconsistencies. Such live changes are quite stressful. Are our design weaknesses causing stress to colleagues, present and future? Will they have to endure health hazards?
Good point. I’m experiencing this now. It’s hard to tell when colleagues are capable and management accepts anything they say because it fits the existing pattern. Hard not to think it’s me.
I think engineering is more about doing the most with as little as possible. Best performance, reliability, modularity, etc. depending on what's important for that use case. For the least amount of cost, power, failures, modifications, etc.
"How does it work?" was answered in 1970 when I asked about the family car. Dad locked up the tool box. My question was interpreted as, "How can I break it?"
I always hew to the old saying: "If architects and engineers built buildings the way software designers wrote software, the first woodpecker that comes along would destroy civilization." A few years ago ... it came to light that airlines were not shutting down the avionics when the aircraft was at the gate between flights. This could go on for many days or weeks, with ground power keeping everything up and running. A particular nav system on board 787s would crash when a counter overflowed. So the order went out that that aircraft model had to have someone make sure the avionics system was shut off (restarted) at least every 51 days.
Interesting reflection, considering that I make a distinction between coders (people that code) and programmers (people that code with TODO lists and a git repo). Actually, engineering is the conscious construction of something (not necessarily a program) against a specification that the product is tested against. It has been around far longer than programming, and the tests are both negative (it shouldn't break) and positive (it should do a specific task).
I characterize programmers as people who are able to break problems apart into abstract concepts, design architectures to model those concepts, and design tooling that can manipulate the model in a reliable and efficient manner. A coder is someone who can take a spec and write it, but is somewhat limited in their ability to see the bigger picture or design systems, usually through inexperience but sometimes through complacency via lack of need.
It's interesting how many similar distinctions exist. I usually call them:
1: Programmers just want everything explained in detail and then type it down with minimal thinking
2: Software developers actually think about the original problem and what solution would fit the best
Good engineering allows a user to use what was engineered without needing to understand how it works. An engineer is someone who makes those things, and what is made is made as difficult as possible to use incorrectly. Also, an engineer makes something with n design criteria, for example: performance, speed of delivery, security, extensibility, optimized memory use, ease of use, documentation, ease of onboarding. And these criteria have different relative levels of importance. So the engineer will combine the elements of their domain to produce something that satisfies all those criteria, simultaneously and in the right balance.
Perfect video! I do see a lot of projects going in with a full happy-path mindset and then trying to figure out what's going on when tons of bugs start happening. I'm studying software testing and this video is perfect. I bought two of your books, continuous delivery and modern system design. I will read them for sure!! Thanks!
Another great video, thanks! If you want to be a real engineer of any kind, it's worth spending some time on engineering ethics. There is more to engineering than "design and build" if you want to do things the right way and avoid unnecessary risks. Cheers!
I would say this applies to any job with a lot of dependencies and risk factors. Movie set production needs this "what could go wrong" thinking. Gameplay designers (not the code, but the game design itself) need it. Any physical object that people trust their lives to, or that can cause harm, needs it. Maybe all of the above could be called an engineering mindset. Still, in many cases I prefer to call it real-world usage-scenario risk analysis, as that's broader and fits better. Also, not every solution or finding would fall under something that really needs engineering to solve.
I was wondering about this for some time. Considering my programming projects are usually less destructive and less important or they are automated, I have not had to worry about people using the software wrong or something going wrong. I now know I am a programmer and have simply not had the chance to be a software engineer. I probably could be one, but the need has not come up for me to think about what could go wrong.
I would argue that the most important engineering principle is discipline, in the sense that we limit ourselves in some way to achieve some sort of simplicity. For example, electrical engineers might work with the "lumped matter discipline" to ensure that Kirchhoff's laws hold, or further restrict themselves to digital circuit elements to more easily create more complex digital devices. I liken this to using high-level programming languages, or limiting use of goto, or functional programming. These are all abstractions away from the hardware that simplify program analysis and construction.
@@Olu-AkinsurojuMaxwell Sure. So, an electrical engineering student will first learn about Maxwell's Equations. These are very complicated electromagnetic formulas that apply to everything. If you built circuits out of random materials, you would have to analyse them in this way. Instead, we produce electrical components with consistent properties that allow us to model them as, for example, fixed resistances and capacitances. This technique removes some of the complexity of Maxwell's Equations and reduces them to the much simpler Kirchhoff's Laws. Similarly, when using semiconductors, you can make sure to run them in certain operating conditions so that they are always in saturation, which allows you to use simpler equations in your circuit model. After that, maybe you could make an arbitrary rule that all wires in your circuit should be at either 5V or 0V, with certain conditions about how they switch from one to the other, then lump bunches of components together as logic gates, almost completely abstracting away the electrical nature to allow purely digital analysis. The whole field of electrical engineering is about restricting yourself to allow for higher-level abstractions and progressively easier analysis.

Computer engineering is similar. You can use assembly language to write programs and directly interface with the hardware. However, you can also restrict yourself to just C and only use libraries to interface with hardware. After that, you might decide to adhere to coding standards, e.g. not using *goto*, or always using modules for encapsulation. Then, you might switch to a language that does automatic memory management, or enforces message passing as the only way elements can communicate. The next step might arguably be languages that remove variables and unflagged side effects. All of these abstractions are intended to move you away from needing to think about the hardware and to let you analyse your work more easily. Unfortunately, unlike hardware abstractions, software abstractions do sometimes involve compromises with regard to build times and sometimes also runtime speed and memory footprint. A true engineer will still use the abstractions but see these challenges as opportunities for improvement. Always remember, the word _engineer_ is more closely linked to the word _ingenuity_ than the word _engine_.
Do computer science degree programs actually teach software engineering? I hear claims from the software community that "computer science is a branch of mathematics" If that's true, this isn't engineering but rather they're teaching mathematicians. In general, while engineers use the results of mathematicians as tools, they don't "do mathematics" and rarely if ever, do engineers have to come up with new mathematical theories. Very occasionally, software engineers might be required to invent new algorithms, but, from my experience, software engineering is almost all about finding prewritten libraries and knowing how to apply them to engineer software solutions. Software architecture is, I believe, the software engineers' biggest challenge. Software should be maintainable and as bug-free as possible.
Most CS programmes don't teach engineering well, but I think that your analogies are wrong. Most advancement in software these days is not an academic process of "Computer Science"; it is a practical process of engineering. If you are working in places where you only reuse other people's code, then that is not what I mean by "software engineering". Software engineering is the set of guide rails that help us achieve a better outcome if we follow them. I'd argue that this is the same for all forms of "engineering". You may certainly be relying on knowledge that comes from other places, but engineering is a practical discipline of solving real-world problems, and in my experience we are always solving new problems, because otherwise, what is the point? We can copy software for free, so what we build is always new.
Sadly, one place I find this thinking the least is in IT management. We can "get by" with too few people, too little documentation, and all kinds of "cost"-saving measures if we ignore all the un-happy paths...
This right here is the real reason why software teams need Continuous delivery. Folks - assume hidden risks, work in small batches, use feature toggles, contain blast radii, write tests before you write your code, write a few tests after writing your code and finally, plan for observability and automatic feedback. Trust me - life will be good. And learn from your occasional small safe failures and make your system resilient one unit test at a time. Release code gently and often. Aim to build enough confidence so that you can release on a Friday evening - it can be done.
The difference is that in some Canadian provinces "engineer" is a regulated term. If you have a Computer Science degree it's not enough to call yourself an engineer. Literally, they will sue you if you do.
*quickly changing position title from developer to engineer* 😅 Not just because I prefer engineer to developer but because that's the way I have always gone about it. In the past that has earned me the title of being "problem" driven, but I have learned to think about solutions before I present the problem to anyone so that now I am considered "solutions" driven. 😊
I do tend to focus on risk-assessing everything crucial and mitigating the risks, as part of a sense of responsibility instilled in me in public service. It might not qualify me to be called a software engineer. But it is seen as a weakness of mine by those with a more 'positive', 'optimistic' approach. It can be used against me. Today people have slogans to use against my ilk, such as "go fast and break things", and they call my ilk "risk averse". You can lose a job over it these days. Career even. That is what can go wrong. How do we mitigate it?
When writing a method, I usually try to first validate the input in all the ways I can imagine, and only after that I write the actual logic. Of course that's also something that works very well when using TDD. There I first try to write all the tests that have names like "divisor can not be null", before I write the "happy tests". But how often did I hear the phrase "That will never happen!" when I asked about some specific condition that hasn't been checked... Yes, normally the users don't write more than 200 characters in the "username" field, but sometimes someone makes a copy & paste error or even a hacker tries an exploit, so it's better to take bad input into consideration. Edit: What I forgot to mention: I usually still throw an exception when my validation catches bad input, but I try to make sure to provide as much context as possible within that exception, so that the logs will help to identify the original source of the issue. Even worse than a problem that has not been caught is an error that has been caught but then only reports "something went wrong" without providing any help about what went wrong and how to fix it.
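Something like this is what that looks like in practice, as a rough Python sketch with invented names (register_user, request_id): validate first, and make the exception carry enough context to be useful in the logs:

```python
# Validate up front; when input is bad, raise an exception that carries
# enough context for the logs to point at the real source of the problem.
MAX_USERNAME_LENGTH = 200

def register_user(username: str, request_id: str) -> None:
    if username is None:
        raise ValueError(f"[request {request_id}] username is missing")
    trimmed = username.strip()
    if not trimmed:
        raise ValueError(f"[request {request_id}] username is blank")
    if len(trimmed) > MAX_USERNAME_LENGTH:
        # "That will never happen!", until a paste error or an exploit attempt.
        raise ValueError(
            f"[request {request_id}] username is {len(trimmed)} characters, "
            f"limit is {MAX_USERNAME_LENGTH}; first 20 chars: {trimmed[:20]!r}"
        )
    save_user(trimmed)  # hypothetical happy-path logic, reached only after the checks

def save_user(username: str) -> None:
    ...  # persistence left out of the sketch
```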
It's called Degenerate Testing/Coding. Particularly when I do TDD (Test Driven Development), you ask yourself upfront, "Given X, how should this piece of software/function behave?"
I was born thinking about what could go wrong. I won't lie to you all, I was a very destructive child. Dishes, Legos, VCRs, Piggy Banks, CDs, Computers, Tables, Phones, etc. I would always destroy them and then see if I could build it back up. Of course, when I got older, I found out what screws and screwdrivers were lol.

Then my high school had automotive classes where we diagnosed and repaired cars. (usually the teachers) There were also donated cars. Sometimes some of us destroyed the components. However, we learned that there were ways to recover from those failures if they weren't too bad. Stripping a bolt, recoverable. Putting fluids in the wrong location, recoverable. Breaking a portion of the suspension, recoverable. Having the car fall off the lift... not recoverable. I remember when someone stripped the threads of a bolt for the engine. I was tasked to use a tap and die to repair it. I found the teeth that perfectly fit into the threads. It was in metric. I fixed it. When we went to screw it, it did not work. Come to find out the threads were imperial. A difference of 0.5 millimeters or less lol. I didn't think to check in both measurements then compare the two.

When I learn something, I learn it well. Then I try to find all sorts of ways I can destroy it or others can destroy it. First, I build a small foundation of knowledge to test against when something goes wrong. Whatever that small foundation does, it also happens when scaled up. I've done this in Math, Science, Software, Art, Automotive, etc. Writing code that works NEVER is enough for me. I want to know how people can break it. I like searching for bugs and seeing where things go wrong. I think about what a human, animal, natural disaster, etc. would do to destroy it. I actually want to get enough skills to build something (not tied to business, just for fun) and say, "This is unhackable", and have someone go in there and destroy it. Then look around picking up the pieces seeing what they did. There is nothing better than watching someone or something take what you've created and destroying it.

I should say though, one should think about what could go wrong to a certain extent. Otherwise, you'll be like me in my 20s thinking about how a bad element can run into the school and looking for all sorts of escape plans on campus. In fact in 2020, I watched a show called "Monk" and laughed because that was almost my life in my 20s. It still is, I just toned it down a bit lol. Except when I went to that laundromat and saw a bent pillar and the walls cracked on the area where some of the weight should have been distributed at. I called that in so fast. Not sure what happened because I never went back there!
LOL - one way to get this point-of-view is to be required to maintain your software for a couple of decades. My third for-sale software project was used daily from 1985 to 2010. It hadn't crashed in over a decade when it was finally retired. I couldn't blame anybody except me for any issues, and it made me quite a "defensive programmer."
Thanks Dave. I really understand every bit of this video thanks to your explanations at a low level with great examples. I sometimes find your CD videos a bit high level due to the fact that you're aiming them at experienced professionals. That's not to say I'm not learning from them, just that for a beginner/intermediate like me they take more avenues of investigation after watching to decode some of the terms.
"What could possibly go wrong?" is a question usually asked by people who expect barely anything can go wrong. This question is pointless, as the answer to it is always the same: Everything. There's absolutely nothing that for sure cannot go wrong. E.g. no matter how safe you write a program that writes file data, a disk write operation can always fail as the disk might fail itself, the power supply of the computer can always fail in the middle of a write operation, the system might have to kill the process performing the operation at any given point, there might be a bug in the filesystem driver, and so on. If data corruption is not an option under any circumstances, you must never alter data directly, you must always write altered data to a new file, verify the data was correctly written and can correctly be read back and only delete the old file afterwards. No other write operation is guaranteed to be safe. But in practice you must make compromises. Nobody will accept that your program is 10 times slower or needs 100 times more disk space just to be safe against any possible write failure, when 999 out of 1000 people will never experience such an issue in their entire lifetime, as then your code has no real world benefit for those 999 people, just for the 1 unlucky guy. And that unlucky guy could also roll back an hourly backup, which is far easier to create once an hour in the background. The question is only, how expensive would that data loss be, as up to one hour of data would be lost? And if it really is that expensive, wouldn't it be cheaper to just have two identical systems running at all times and if one system ever fails, you can switch to the backup system and recover the first one from there without any data loss at all? At least those two systems will run at full speed and only use as much storage space is really required. Think of cars. Is it really worth to design a car that will never malfunction when needed, even though it's 10 times as expensive as a normal car? Wouldn't it make much more sense to just buy two or three cars instead, so you always have a backup car if one fails or just rent a car or call a cab when needed?
Real engineers often have a Proof of Concept phase where they do not care about failure modes or edge cases. A Proof of Concept is used to remove the technological risk / uncertainty from the project. The lessons learned in making the PoC are used to plan and develop the final product. The PoC itself is thrown away. It is a waste of time & effort to go through failure modes etc. while still in the PoC phase. SW engineers would do well to adopt this as well. Far too often SW projects fail because there is still a lot of uncertainty & risk, both technical and in user interaction. If that is the case, FIRST build a PoC with a minimal team. If you scale up a team while there is still a lot of risk & uncertainty, the project is most likely to run over budget or fail outright.
Funny that I focus only on what could go wrong, totally destroying the possibility of a glorious outcome in my head, to the point of "Why should I write any code if someone's got a way to break it anyway?"
We are living in a time where it's not allowed to think "what can go wrong"; most people who do this will be attacked and publicly shamed, so most of us don't let ourselves have a mindset of thinking about what can go wrong. We are labelled either a negative person or a conspiracy theorist if we have the "what can go wrong" mindset.
IDK man. As an engineer and someone who has been tinkering with and repairing stuff all my life, I think this contingency mindset is something to aspire to, but it's something that's rarely practiced. Don't forget that a lot of engineering is hamstrung by constraints around budget and time. And sometimes engineers can get so fixated on technical pursuits that they completely disregard the end user. I've seen it too many times for it to be an anomaly.
A good quality video on the mindset of problem solving, although the title is clickbaity and objectionable. Since “software developers” and “software engineers” are the same thing.
Exactly, good software should still be engineered. But we must ask ourselves why software has been stuck at the level of engineering for so long. "The Future of Programming" by Bret Victor should give you something to think about. As long as we're stuck on the engineering level, Alan Kay's computer revolution has not happened.
According to the Threat Modeling Manifesto, the threat modeling process should answer the following four questions:
1. What are we working on?
2. What can go wrong?
3. What are we going to do about it?
4. Did we do a good enough job?
As I have always said, there are differences, but all roles are essential. When constructing a building you need architects, civil engineers, construction workers... all are essential parts of the process, but their roles are all different in essence! The worst, and most negligent, kind of person is the one who categorizes themselves as a "jack of all trades"; distrust these people!
Margaret Hamilton took her daughter to work, and she pressed the wrong button and crashed the simulation. It is described here, though I have read it in a few places: wehackthemoon.com/people/margaret-hamilton-her-daughters-simulation
So, what title or role do we use when we want to emphasize the engineering-like nature of the discipline, but we do not have the right to use the protected title, as here in Canada for example? (I do agree about design for failure, though!)
Sad that some industry leaders who build their current stuff on their legacy are exactly the opposite of this. When "happy path" only is actually the outcome of a fight between dev and management... because management is fine with less than the happy path, as shipping is more important than quality.
If you focus on glorious good paths instead of asking what could go wrong, you basically invoke Murphy's Laws for engineering. It's never a question of "if", only "when" and "how bad". I really don't mind being called "pessimistic" when I deliver value.
Perhaps this is the reason why certain people get promoted while decent engineers are not. They are too skeptical and don't show their „driven to win“ attitude. 🤷🏼♂️
Where you say "positive" I'd say "delusional". The very first thing I learned on this job is that defensive code/design is robust and therefore both resilient and reliable. Error handling is job one. I have too many stories of stupid, easily foreseen disasters to recount.
On the other side of the coin, an assessment needs to be done from time to time to balance overengineering against the MVP; otherwise, failure to meet the time to market could kill all the good intentions.
I think that assumes that doing a good job is more costly; I don't believe that it is over the duration of a software project. Because good code is simpler, easier to work on and easier to change, teams that spend the time to keep their code "good" are much more efficient. "There is no trade off between Speed & Quality" - State of DevOps Reports.
I think that (good) software *engineering* is being aware of the (entire) *context* in which your software runs. And then act on that awareness. This context can include possible failures that can occur, but also can mean user-behavior, system-behavior, heck sometimes even physics that can get in the way, etc.
How do we make all the checks within a function if the language isn't prepared for that? In Java you have unchecked exceptions, so if the function you're calling is arbitrary code, it can throw any of them; we "should" then put everything in a try-catch, or one for each call, but that's bad practice. It also doesn't play well with lambdas. Rust has the Result type, which is wonderful, but how do we tell Mr. NoMoneyForThat and Ms. NoTimeForThat that we should use better languages? Now imagine telling the poor folks using interpreted, loosely typed languages that there's such a thing as good practice.
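For what it's worth, the Result style can be approximated even without changing languages. A rough sketch in Python 3.10+ (all names invented), just to show callers being nudged into considering the error case instead of being surprised by an exception:

```python
# A Rust-style Result emulated with two small dataclasses: callers have to
# pattern-match (or isinstance-check) before they can get at the value.
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")

@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T

@dataclass(frozen=True)
class Err(Generic[E]):
    error: E

Result = Union[Ok[T], Err[E]]

def divide(numerator: float, divisor: float) -> "Result[float, str]":
    if divisor == 0:
        return Err("divisor must not be zero")
    return Ok(numerator / divisor)

match divide(10, 0):
    case Ok(value):
        print("result:", value)
    case Err(message):
        print("failed:", message)
```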
Both software developer and software engineer mean the same thing and people use these terms interchangeably. Almost every developer/engineer tries to do their best and they want to have code that is maintainable in the future, robust, doesn't have errors and is resilient to changes. Of course not everyone has the knowledge/experience to do it and not everyone is trained enough to imagine ALL the possible errors that might occur. Even the best software engineers/developers can't foresee all the failures that will happen with their code and it is impossible to foresee how users will end up interacting with the product. Suggesting there's a divide between software developers and software engineers is just trying to have a go at less experienced people. Instead of using the term 'developer' as a derisive term, let's focus on only promoting approaches that have proven track record of providing benefit to long-term code base health. Trying to change definition of the term 'developer' into effectively a cuss word is not helping anyone, it is only alienating less experienced people in the field and causes junior developers to have self-doubts and impostor syndrome, all the while trying to elevate one-self (in this case Dave) to this status of a highly qualified person that always gets it right, when that's not the case at all.
I'd argue, and do in my book on the topic, that what you say may be true, but that it is only true because in software we have devalued what the term "engineering" means. Most software development is not done as engineering in the classic sense of the term. I define engineer like this. "Engineering is the application of an empirical, scientific approach to finding efficient solutions to practical problems." that is not how most software developers approach their work!
@@ContinuousDelivery yes, you're free to choose a definition of the word as you see fit. There will be others who use slightly different definitions for the word engineer. For example there are certainly people out there who don't consider software engineering to be real engineering and would argue vehemently that it's not engineering, using a definition of engineering that suits their view to be derogatory towards software engineers. They'd laugh at the notion that you're doing any engineering. It's the same argument as software development vs software engineering. It's just a word play to put down everyone who does not do things the way you do.
Software developers (and engineers) need to realise the second order consequences of their poor actions or inactions. Creating software is as much a moral exercise as it is academic or practical. Sure, we're not all making pacemakers, but there are so many examples of poor software costing jobs, bankrupting companies, resulting in wrongful criminal convictions for users, prison, and sadly suicide. We have moral obligations to keep high levels of quality, and push back, even refuse, when requested to do things that violate users or are unlawful (putting in the effort to document and communicate why). The problem is that many people simply do what they got told to on the JIRA ticket with little accountability. They end up allowing themselves to be treated as an exchangeable part on the factory line and they do it to themselves.
There are software developers and there are engineers. The term 'software engineer' is a misnomer. If I see anyone call themselves a software engineer on a resume, I will throw it away.
By this definition, I've been a software engineer since I was 21. I was writing an inventory management system and one of the users was named Bob. Bob was computer literate, but if there was a way to enter the wrong value or press the wrong key at the wrong time, Bob was an expert at it. I learned VERY quickly that anything I wrote needed to be "Bob-proof". Countless times I asked myself, "Now what could Bob do here that would mess this up?" Sadly, Bob passed away a couple of years ago, but he had an outsized positive influence on my software development career.
That was a comforting comment. A nice way to look at things like this.
Solution: make sure events follow your design, and freeze everything else
If you ever wrote a book about your career, you could give it the title "Bits and Bob". :)
Sorry to hear about Bob.
On the other hand, based on my experience, more problems are caused by overengineering than by almost anything else. Writing the simplest code that meets the requirements and is also testable is the mark of a skilled SWE.
Persistently asking the question, "What can go wrong?" is not only valuable for engineering, it is also a key factor in what the US Navy calls Operational Risk Management (that acronym ORM again, but with a different meaning). Following it allows you to run a munitions handling operation with a minimum of whoopsies.
In my own software development practice, I boil it down to the phrase, "In God we trust, but all user input and function returns get checked."
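Two small Python examples of the "function returns get checked" half of that motto (my own illustrations, assuming a POSIX shell at /bin/sh): many calls report failure through their return value rather than an exception, and ignoring that value is how silent corruption starts.

```python
# "Function returns get checked": some standard calls signal failure via
# their return value, not an exception, so the caller has to look.
import re
import subprocess

def extract_order_id(line: str) -> str:
    match = re.search(r"order-(\d+)", line)
    if match is None:                       # re.search returns None when nothing matches
        raise ValueError(f"no order id found in {line!r}")
    return match.group(1)

def run_backup(script: str) -> None:
    result = subprocess.run(["/bin/sh", script], capture_output=True, text=True)
    if result.returncode != 0:              # subprocess.run does not raise by default
        raise RuntimeError(
            f"backup script failed ({result.returncode}): {result.stderr.strip()}"
        )
```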
When I was in college, my dad told me about the Therac-25, so I learned very early on about the importance of defensive programming. This is one of the reasons why I personally do not like C++: I found myself constantly thinking about failure modes of the programming environment in addition to the application failure modes, while at the same time solving the application problem. I know you can write bad code in any language, but C++ just made it too easy to do. I'm not "blaming" C++, because I am a firm believer in the expression "poor craftsmen blame their tools", but as a "good craftsman" I just found myself thinking about the tool too much.
Modern C++ has more or less dealt with these concerns.
Yup, good discussion on software engineering. And in 30+ years of working as a developer, software engineer, and enterprise architect one fact that is proven again and again is that regardless of titles there are people that are natural born software engineers and people that will never get it. I've spent a career writing financial software so while lives aren't on the line people do still get a bit testy when you send a few million dollars to the wrong spot. So having a development organization that understands the difference between a system that "works" when everything is happy and a system that is reliable and behaves in a defined fashion under all likely conditions, and that is well structured to allow maintenance and future development (as well as many other characteristics) is vital.
Yep. But even errors in a financial software can be fatal. See his video on the British Post Office scandal with the buggy Fujitsu software :/
I was once labeled a pessimist who thought too much about what could go wrong and took too long to deliver.
The person who criticized me would regularly spend the weekend repairing his own clever code in production with an angry client behind his back.
Pay it once up front, or pay it three times over down the road.
I'm working at a company now where I was brought in to help with a couple big projects they weren't having much luck doing with their then current staff. One dev straight up told me he never wrote tests. They just take extra time and serve no purpose. I wonder why those big projects were having so many troubles at this place? Not all devs there have that attitude thankfully, but the rot is still pretty deep on some teams.
@@queenstownswords couldn't agree more. This is what I try to explain to the development manager, but he's more interested in showing his bosses how many tickets have been completed by the team and passed on to the QA.
It wastes the time of both the devs and the QA, as most of the issues could easily be caught at the developer's end (typos, not following the mockups properly, etc)
The annoying thing is, all that weekend work and jumping in to solve client problems means that person will be seen as the superstar, despite all those problems being self inflicted. Strangely, if the client has regular problems they're more likely to keep you on as they'll feel you're necessary. If the client experiences zero problems they interpret it as you're adding no value!
@@paul_andrews You hit that one right on the nose :)
Bob Martin advocates starting TDD test cases with the error cases and maybe some of the edge cases first. I've tried this, and I rather like it, because they're pretty easy to do. It builds momentum for me, and these are cases that need to be covered anyhow. Adding them at the end of the process is more of a drag.
I also like to follow the philosophy of: "Hard in training. Easy in battle." I want to subject my code to as many adverse scenarios as possible in a testing environment to confirm that it responds as desired. I never want a this-should-never-happen scenario to occur for the first time in production.
Returning to the NASA theme, astronauts simulate many different scenarios, including catastrophic ones. The training allows them to experience really bad scenarios in a safe environment before potentially encountering them for the first time in space. Considering failure scenarios in our software is the same.
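A rough illustration of that ordering in Python's unittest, with a made-up parse_quantity function under test (not strict red-green-refactor, just showing the error and edge cases written before the happy path):

```python
# Error cases first: they are cheap to write, they have to be covered anyway,
# and they shape the design before the happy path gets locked in.
import unittest

def parse_quantity(text: str) -> int:
    if text is None or not text.strip():
        raise ValueError("quantity is required")
    try:
        value = int(text)
    except ValueError:
        raise ValueError(f"quantity must be a whole number, got {text!r}") from None
    if value <= 0:
        raise ValueError("quantity must be positive")
    return value

class ParseQuantityTests(unittest.TestCase):
    # Failure modes and edge cases first.
    def test_rejects_empty_input(self):
        with self.assertRaises(ValueError):
            parse_quantity("  ")

    def test_rejects_non_numeric_input(self):
        with self.assertRaises(ValueError):
            parse_quantity("peanut butter")

    def test_rejects_zero_and_negatives(self):
        for bad in ("0", "-3"):
            with self.assertRaises(ValueError):
                parse_quantity(bad)

    # Happy path last.
    def test_parses_a_valid_quantity(self):
        self.assertEqual(parse_quantity("42"), 42)

if __name__ == "__main__":
    unittest.main()
```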
Agree totally. Happy path tests are of course necessary, but to me the interesting tests are the ones where you're trying your hardest to abuse the code under test. I once rewrote a troublesome file processing application, and part of the test process for the new system was feeding it fuzzed input: input that was _almost_ correct. I had a system that could generate input files where each record had a slight error, and it would generate every possible category of error. It would generate data sets of tens of thousands of fuzzed input files which I'd throw at the system. And it found some issues too, in code that was already well unit and integration tested and had gone through QA testing. The night that system went live I had no worries and slept like a baby. (Yeah, tempting fate, I know.)
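A toy version of that "almost correct" generator in Python; the record fields and mutations are purely illustrative, not from the original system:

```python
# Generate records that are valid except for exactly one corrupted field,
# then feed each one to the system under test and assert it fails cleanly.
import random

VALID_RECORD = {"account": "12345678", "amount": "199.99", "currency": "GBP", "date": "2024-03-01"}

MUTATIONS = {
    "account": ["", "1234", "12345678901234567890", "12A45678"],
    "amount": ["", "-199.99", "199,99", "NaN", "1e309"],
    "currency": ["", "GB", "GBPX", "???"],
    "date": ["", "2024-02-30", "01/03/2024", "yesterday"],
}

def fuzzed_records(seed: int = 0):
    """Yield records that are valid except for exactly one bad field."""
    rng = random.Random(seed)
    for field, bad_values in MUTATIONS.items():
        for bad in bad_values:
            record = dict(VALID_RECORD)
            record[field] = bad
            yield record
    # A few random single-character corruptions for good measure.
    for _ in range(20):
        field = rng.choice(list(VALID_RECORD))
        record = dict(VALID_RECORD)
        value = record[field]
        pos = rng.randrange(len(value))
        record[field] = value[:pos] + chr(rng.randrange(32, 127)) + value[pos + 1:]
        yield record

for record in fuzzed_records():
    pass  # hand each record to the system under test; expect a clean rejection, not a crash
```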
Often it is necessary to work through the happy flow first because it is not totally clear what you need to make. In those cases, working through edge cases and failure modes is a waste of time.
True engineers often do this as well. First make a Proof of Concept, then properly engineer the product.
I find that SW projects far too often skip the PoC phase, leading to disaster. PoC comes before MVP (Minimum Viable Product).
@@TheEvertw Agree. I don't do typical TDD because I don't want to invest a lot in tests for code I may decide isn't the right approach and throw away. So my first iteration of "tests" is exercising the code in the debugger, playing around with it, changing values to take different execution paths, getting a feel for it. When I have done that, and if I still like the approach, then I can start writing the real tests: the happy path and, as the code is written, all the tests for the more interesting non-happy paths.
@@TheEvertw Point well taken. I think there's a happy medium or possibly it's two different contexts or it's a hybrid. You suggested two different contexts when you mentioned Proof of Concept, maybe via a prototype, and then Properly Engineered. But all too often, I've seen the PoC prototype become the final version, since we're often forced to move onto the next thing.
I don't think that Bob Martin's advice necessarily applies to a PoC prototype. That's building something that mostly satisfies sunny day scenarios just to get a concept of what's really desired or required by the customer.
Bob's advice is probably more useful in the Properly Engineered phase. Maybe we throw away the prototype and start anew now that we have a sense of what we want to do. Maybe we refactor the prototype into something more battle worthy.
Bob's advice to start with the error/edge cases is to shape the solution space without making too many assumptions about what the solution space is when you first start. Bob has a mantra about this. Here's a copy from one of his blogs:
" 'As the tests get more specific, the code gets more generic.'
As a test suite grows, it becomes ever more specific. i.e. it becomes an ever more detailed specification of behavior. Good software developers meet this increase in specification by increasing the generality of their code. To say this differently: Programmers make specific cases work by writing code that makes the general case work.
As a rule, the production code is getting more and more general if you can think of tests that you have not written; but that the production code will pass anyway. If the changes you make to the production code, pursuant to a test, make that test pass, but would not make other unwritten tests pass, then you are likely making the production code too specific."
I've had issues with posting links in YouTube comments, so I won't post any here. If you want to see the entire blog, do an internet search for "The Cycles of TDD December 2014".
He demonstrated this in a group presentation where he implemented a Stack using this technique. He even commented multiple times that his implementations were not what most of us would do when implementing a Stack. But this technique nudges him toward a good Stack implementation as he was shaping the solution one test at a time.
The video ends a bit abruptly, but he's mostly done with his presentation by that point. The presentation is on YouTube and it can be found via a search for "Robert Cecil Martin (uncle Bob) demonstrates test driven development by implementing a stack in Java".
@@TheEvertw My suggestion would be to write at least the "happy flow" tests for the PoC and then try to break the functionality with new tests as soon as you go into the MVP stage.
A place I worked at when (much) younger, referred to all new programmers as Murphy-Coders - anything that can go wrong with their code, eventually will - the engineering mindset not only takes training, but some level of hard lessons that are hopefully sandboxed to non-critical projects.
My first week on the job, I ran a 10-year-old file, test.bat, inside our build folder, expecting it to do something build related. It deleted all the folders in my Git directory, and I had to reclone everything. I learned pretty quickly to think more carefully and not take things at face value haha
I couldn't agree more with the point about leaving the success path of the software to the last moment. It is spot on and is what I say every day, to the point of being called a "broken record".
The moment you write the success path of a method, class or system before any error handling or failure scenarios (including tests for them), you sabotage yourself into thinking your work is done, or at least that the majority of it is.
It is much more difficult to retrace our steps afterwards and find all the things that could possibly go wrong. Beyond that, and for the same reasons, I believe that the moment the last piece of success-path code is written, all of the code necessary to achieve that goal must be written and done too, regardless of the size or effort required.
For lack of an official term, I call this way of coding, and its discipline, "working in depth" instead of "working in breadth".
There’s a huge difference between prototyping and production.
Software Dev and software engineering really are the same thing.
When you’re working on an actual production-grade system, it requires project management, functional and non-functional requirements, architecture, risk management, test plans, compliance, and an appropriate design/architecture for maintainability and updates.
However, the reason nothing in practice comes anywhere close is because most business software is really just about attaining market share, not engineering proper fault-tolerant systems. Most companies are just building prototypes and marketing the hell out of them, even if their stuff doesn’t work. However, with success, you can build in the safeguards and production-grade level maintenance capabilities once there’s funding to justify that additional overhead.
This one is a standout, Dave!! It's so easy to code for the happy path and fail to think of all the many unhappy ones that can come from the overwhelming different combinations that all the inputs can take.
I'm still halfway through the video so it might even be mentioned, but the Therac-25 comes to mind. The program took such a specific combination of precisely timed steps to go wrong, but the stakes were immensely high if it went wrong and innocent people paid for that lack of forethought with their lives.
Thanks, glad you liked it.
When I was about 10 years old my dad brought an Apple IIe home and I started playing around with it. I started going through the manual, looking up commands and checking to see what they did. Eventually I came across a command called "format" and the documentation said it formatted the disk. Not knowing what the word format meant, and not having a dictionary or the internet handy, I decided to try it. Needless to say, that day I learned to hate people who use a word to define that same word. And I run across this in most documentation to this day.
Fun fact: something very like that hypothetical story happened to us... We had a formatting tool that was only supposed to format disks used with our external devices. Because it didn't check which drive it was formatting, one customer managed to set up the system so that drive C was among the disks we would format. So... that PC was history....
Just because you're not working on a piece of software that might cost people their lives, it doesn't mean that your code won't be used in such applications. My teams deal with tracking shipments, rerouting and providing live updates in order to enable high-efficiency JIT manufacturing. One would think that in the case of error it would only affect our clients' productivity and only cost some money... But our system is also used for tracking donor organs for transplant, where the cost can be much higher. I have to constantly remind our engineers that their oversights in code can have more than just a monetary cost...
I remember a nice article by Jack Ganssle on the need for watchdogs. He describes a system that was crashing all the time yet continued to perform as desired because of a good watchdog. In fact, the customers never knew the system was buggy.
Maggie Hamilton is one of my heroes. Thanks for putting this together.
I'd like to point out that in the example @ 10:58 the error condition is basically swept under the rug and a possibly incorrect default value is returned that could cause a bug elsewhere in a calculation. These bugs can be extremely hard to detect and troubleshoot.
This is a great point; for most web software I'd argue a crash would be *better* here. On the Apollo, probably not. I guess knowing which is the right thing to do given the context you're in is also an aspect of engineering.
Returning zero when you accidentally divide by zero seems like a very bad idea.
This is where another engineering principle kicks in: risk analysis.
Once you identify a possible failure or error, you need to figure out how serious it is. Could this kill someone? Could this lead to billing errors, leaked confidential data, or other legal headaches? Or does it just mean the display styling might be off a little? Is this a situation we can recover from, or must we avoid it at all costs?
Once you can answer those questions, deciding how to handle the situation gets easier. Sometimes a hard crash IS the failsafe mode. If a one tonne robotic arm on an assembly line can't determine its position, it should probably stop moving.
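As a rough illustration of the trade-off (not the video's actual code; the class and method names are made up), compare silently substituting a default with failing loudly. Which one is right depends on exactly the kind of risk analysis described above.

// Illustrative sketch only; names are hypothetical.
final class Ratio {
    // Option A: swallow the error. The caller gets a plausible-looking number,
    // and the bug surfaces far away, in some later calculation.
    static double rateOrZero(double numerator, double denominator) {
        return denominator == 0.0 ? 0.0 : numerator / denominator;
    }

    // Option B: fail fast. The caller is forced to decide what a zero
    // denominator means in its own context (retry, alarm, safe shutdown...).
    static double rate(double numerator, double denominator) {
        if (denominator == 0.0) {
            throw new ArithmeticException("rate undefined: denominator is zero");
        }
        return numerator / denominator;
    }
}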
Weird, I've been struggling with the feeling that thinking about what could go wrong was something that slowed down my development process, but this point of view changed my mind.
Great topic. Software engineers think beyond good users, and consider malicious users and the environments the software will run in too. This includes engineering around data and system behaviours, not only the software. The classic examples are SQL injection, or a default that installs a driver whenever a USB device is inserted.
Software engineering needs to occur at different levels, from core functions up to systems-level engineering. This is why it requires a team approach.
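For the SQL injection example, the standard defence is to keep user input out of the query text entirely. A minimal JDBC sketch; the table and column names are made up for illustration.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class UserLookup {
    // Vulnerable: concatenating input lets something like "anything' OR '1'='1" rewrite the query.
    static ResultSet findUnsafe(Connection conn, String name) throws SQLException {
        String sql = "SELECT id, name FROM users WHERE name = '" + name + "'";
        return conn.createStatement().executeQuery(sql);
    }

    // Safer: a parameterized query treats the input purely as data.
    static ResultSet findSafe(Connection conn, String name) throws SQLException {
        PreparedStatement stmt =
                conn.prepareStatement("SELECT id, name FROM users WHERE name = ?");
        stmt.setString(1, name);
        return stmt.executeQuery();
    }
}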
I think I have been (by this definition) an engineer from a very early age (maybe 5?). My brain has always fundamentally worked this way, looking at anything and instantly branching out and finding every potential problem with any idea before I even try to generate a solution. My solutions are thus worked backwards from paths that have the fewest hard problems and then typically those problems are mitigated as much as possible with at least 1-3 redundancies.
I am not even talking about programming here, this is how everything in my life works LOL. Although I program in this exact same fashion.
For instance I recently decided to set up a small-ish 36 gallon aquarium in my place and I spent 2-3 months stressing over all the potential failure points. I ended up turning my entire living room into a shallow bath tub, sealing along the baseboard to mitigate water damage, pouring a perfectly level epoxy top onto the stand, covering it with a non-slip mat in case an earthquake shakes it off the stand, reinforcing it with more silicone, waterproofing all the joints in the stand against leaks because it is fiberboard, mapping out potential water spill paths and making plans to put some 4x4s under the couch to stop the initial flood of water going down the hallway with momentum if the tank exploded. My friends all think I am insane.
Great distinction. I find myself focusing on what could go wrong, on performance, and on the lifetime of the software, while my colleagues tend to focus on delivering as many features as possible :)
I effectively got cancelled on a Government programme for arguing that we shouldn't rely on an external actor removing their link to our API to protect the API from being used after a certain date/time. I argued we should correctly record the time window the endpoint could be used for the object (we're talking applications for money here); another argued that simply letting it 500 was a great idea. Some people don't care about doing things right.
Production launch rolled around, and the link was still present on the external system 2 months after close. Thankfully I finally got listened to, the feature made it into the production release, and it was smooth sailing. No one ever goes back and thanks you for those calls, only for the failures, and those who make the failures never get held to account.
One of my CS teachers emphasised that no matter how clearly you label the input field as a Date and document that you expect a "mm/dd/yyyy", someone will try to enter "peanut butter" or "garbage".
Always check your inputs and always ask "what's the worst that could (reasonably) happen?"
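In that spirit, here is a small sketch of checking the date input before trusting it. The helper is hypothetical; note the use of 'uuuu' instead of 'yyyy' in the pattern, because strict resolution in java.time needs an era-free year.

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Optional;

final class DateInput {
    // STRICT resolution rejects impossible dates such as 02/30/2024,
    // and anything like "peanut butter" fails to parse at all.
    private static final DateTimeFormatter US_DATE =
            DateTimeFormatter.ofPattern("MM/dd/uuuu").withResolverStyle(ResolverStyle.STRICT);

    static Optional<LocalDate> parse(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(LocalDate.parse(raw.trim(), US_DATE));
        } catch (DateTimeParseException e) {
            return Optional.empty(); // the caller decides how to report the bad input
        }
    }
}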
Margaret Hamilton was so impressive. Thank you for giving credit where it's due and mentioning the mother of software engineering. Great video!
Absolutely!
Excellent video! You must go beyond the happy path and not stop once it's working. Refactoring is a key step too, and it often gets missed once a feature is "working".
What annoys me most in terms of what can go wrong, is how a developer or support techie following our work might have to make live database changes as a result of failures of our weak logic such as data inconsistencies. Such live changes are quite stressful. Are our design weaknesses causing stress to colleagues, present and future? Will they have to endure health hazards?
Good point. I’m experiencing this now. It’s hard to tell when colleagues are capable and management accepts anything they say because it fits the existing pattern. Hard not to think it’s me.
I think engineering is more about doing the most with as little as possible.
Best performance, reliability, modularity, etc. depending on what's important for that use case. For the least amount of cost, power, failures, modifications, etc.
"How does it work?" was answered in 1970 when I asked about the family car. Dad locked up the tool box. My question was interpreted as, "How can I break it?"
I always hew to the old saying: "If architects and engineers built buildings the way software designers wrote software, the first woodpecker that comes along would destroy civilization."
A few years ago ...
it came to light that airlines were not shutting down the avionics when the aircraft was at the gate between flights. This could go on for many days or weeks, with ground power keeping everything up and running.
A particular nav system on board 787s would crash when a counter overflowed.
So the order went out that someone had to make sure the avionics system on that aircraft model was shut off (restarted) at least every 51 days.
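The details of that avionics counter aren't given here, so purely as an assumed illustration: a signed 32-bit counter of milliseconds wraps negative after roughly 24.8 days, and the same failure shape applies to whatever width and tick rate the real system used.

// Toy illustration; the counter width and tick rate are assumptions, not the real avionics design.
public final class UptimeOverflow {
    public static void main(String[] args) {
        int millisecondsUp = Integer.MAX_VALUE; // roughly 24.8 days of uptime in a signed 32-bit counter
        millisecondsUp++;                       // the next tick...
        System.out.println(millisecondsUp);     // prints -2147483648: "time" suddenly jumps backwards
        // Any code that assumes this value only ever grows now takes an unplanned branch,
        // which is why "restart it at least every N days" becomes the operational mitigation.
    }
}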
Interesting reflection, considering that I make a distinction between coders (people that code) and programmers (people that code with TODO lists and a git repo). Actually: engineering is a conscious construction of something (not necessarily programs) using a specification that they test their product against. It has been known far longer than programming, and the tests are both negative -- it shouldn't break, and positive -- it should do a specific task.
I characterize programmers as people who are able to break problems apart into abstract concepts, design architectures to model those concepts, and design tooling that is able to manipulate the model in a reliable and efficient manner. A coder is someone who can take a spec and write it, but is somewhat limited in their ability to see the bigger picture or design systems, usually through inexperience but sometimes through complacency via lack of need.
@@RiversJ I buy that alternate definition as *_very_* relevant thought, but I would call that "program designers."
It's interesting how many similar distinctions exist.
I usually call them:
1: Programmers
just want everything explained in detail and then type it down with minimal thinking
2: Software developers
actually think about the original problem and what solution would fit the best
I was born as an engineer by that definition, my first thought after building something was how to break it and if successful, how to prevent it.
Good engineering allows a user to use what was engineered without needing to understand how it works. An engineer is someone who makes those things. What is made should be as difficult as possible to use incorrectly. An engineer also makes something against some number of design criteria, for example: performance, speed of delivery, security, extensibility, optimized memory use, ease of use, documentation, ease of onboarding. And these criteria have different relative levels of importance. So the engineer will combine the elements of their domain to produce something that satisfies all those criteria, simultaneously and in the right balance.
Perfect video! I see a lot of projects go in with a full happy-path mindset and then try to figure out what's going on when tons of bugs appear. I'm studying software testing and this video is perfect. I bought two of your books, continuous delivery and modern system design. I will read them for sure!! Thanks!
Thanks for the feedback. Glad it was helpful!
Another great video, thanks! If you want to be a real engineer of any kind, it's worth spending some time on engineering ethics. There is more to engineering than "design and build" if you want to do things the right way and avoid unnecessary risks. Cheers!
I would say this applies to any job with a lot of dependencies and risk factors. A movie-set shoot needs this "what could go wrong" thinking. Gameplay designers (not the code, but the game design itself) need it. Any physical object that people trust with their lives, or that can cause harm, needs it.
Maybe all of the above could be called an engineering mindset. Still, in many cases I prefer "real-world usage-scenario risk analysis", as it is broader and fits better. Also, not all of the resulting solutions or findings would fall under something that really needs engineering to solve.
I was wondering about this for some time. Considering my programming projects are usually less destructive and less important or they are automated, I have not had to worry about people using the software wrong or something going wrong. I now know I am a programmer and have simply not had the chance to be a software engineer. I probably could be one, but the need has not come up for me to think about what could go wrong.
I would argue that the most important engineering principle is discipline, in the sense that we limit ourselves in some way to achieve some sort of simplicity.
For example, electrical engineers might work with the "lumped matter discipline" to ensure that Kirchhoff's laws hold, or further restrict themselves to digital circuit elements to more easily create more complex digital devices.
I liken this to using high-level programming languages, or limiting use of goto, or functional programming. These are all abstractions away from the hardware that simplify program analysis and construction.
If you don't mind, please can you explain it in simpler terms? I just entered the discipline a few months ago.
@@Olu-AkinsurojuMaxwell Sure.
So, an electrical engineering student will first learn about Maxwell's Equations. These are very complicated electromagnetic formulas that apply to everything. If you built circuits out of random materials, you would have to analyse them in this way. Instead, we produce electrical components with consistent properties that allow us to model them as, for example, fixed resistances and capacitances. This technique removes some of the complexity of Maxwell's Equations and reduces them to the much simpler Kirchhoff's Laws. Similarly, when using semiconductors, you can make sure to run them in certain operating conditions so that they are always in saturation, which allows you to use simpler equations in your circuit model. After that, maybe you could make an arbitrary rule that all wires in your circuit should be at either 5V or 0V, with certain conditions about how they switch from one to the other, then lump bunches of components together as logic gates, almost completely abstracting away the electrical nature to allow purely digital analysis. The whole field of electrical engineering is about restricting yourself to allow for higher-level abstractions and progressively easier analysis.
Computer engineering is similar. You can use assembly language to write programs and directly interface with the hardware. However, you can also restrict yourself to just C and only using libraries to interface with hardware. After that, you might decide to adhere to coding standards, e.g. not using *goto*, or always using modules for encapsulation. Then, you might switch to a language that does automatic memory management, or enforces message passing as the only way elements can communicate. The next step might arguably be languages that remove variables and unflagged side effects. All of these abstractions are intended to move you away from needing to think about the hardware and being able to more easily analyse your work. Unfortunately, unlike hardware abstractions, software abstractions do sometimes involve compromises with regard to build times and sometimes also runtime speed and memory footprint. A true engineer will still use the abstractions but see these challenges as opportunities for improvement.
Always remember, the word _engineer_ is more closely linked to the word _ingenuity_ than the word _engine_.
@@pdr. Thank you
Do computer science degree programs actually teach software engineering? I hear claims from the software community that "computer science is a branch of mathematics". If that's true, this isn't engineering; rather, they're teaching mathematicians. In general, while engineers use the results of mathematicians as tools, they don't "do mathematics", and rarely, if ever, do engineers have to come up with new mathematical theories. Very occasionally, software engineers might be required to invent new algorithms, but, from my experience, software engineering is almost all about finding prewritten libraries and knowing how to apply them to engineer software solutions. Software architecture is, I believe, the software engineer's biggest challenge. Software should be maintainable and as bug-free as possible.
Most CS programmes don't teach engineering well, but I think that your analogies are wrong. Most advancement in software these days is not an academic process of "Computer Science"; it is a practical process of engineering. If you work in places where you only reuse other people's code, then that is not what I mean by "software engineering". Software engineering is the guide rails that help us to achieve a better outcome if we follow them. I'd argue that this is the same for all forms of engineering. You may certainly be relying on knowledge that comes from other places, but engineering is a practical discipline of solving real-world problems, and in my experience we are always solving new problems, because otherwise, what is the point? We can copy software for free, so the work is always new.
Sadly, one place I find this thinking the least is in IT management. We can "get by" with too few people, too little documentation, and all kinds of "cost-saving" measures if we ignore all the unhappy paths...
Hi Dave, interesting video again.
Really would love to see your insights about the Post Office glitch.
Actually I talked about the Post office scandal in my video about "Ethics in SW Dev" ua-cam.com/video/XcysNttn0WI/v-deo.htmlsi=jApmNy894NongcaK
This right here is the real reason why software teams need Continuous delivery. Folks - assume hidden risks, work in small batches, use feature toggles, contain blast radii, write tests before you write your code, write a few tests after writing your code and finally, plan for observability and automatic feedback. Trust me - life will be good. And learn from your occasional small safe failures and make your system resilient one unit test at a time. Release code gently and often. Aim to build enough confidence so that you can release on a Friday evening - it can be done.
An apt subject given the current exposure of the Post Office / Fujitsu scandal in the news...
to me those words mean the same thing
The difference is that in some Canadian provinces "engineer" is a regulated term. If you have a Computer Science degree it's not enough to call yourself an engineer. Literally, they will sue you if you do.
*quickly changing position title from developer to engineer* 😅
Not just because I prefer engineer to developer but because that's the way I have always gone about it.
In the past that has earned me the title of being "problem" driven, but I have learned to think about solutions before I present the problem to anyone so that now I am considered "solutions" driven. 😊
I do tend to focus on risk-assessing everything crucial and mitigating the risks, as part of a sense of responsibility instilled in me in public service. It might not qualify me to be called a software engineer. But it is seen as a weakness of mine by those with a more 'positive', 'optimistic' approach. It can be used against me. Today people have slogans to use against my ilk, such as "go fast and break things", and call my ilk "risk averse". You can lose a job over it these days. A career, even. That is what can go wrong. How do we mitigate it?
When writing a method, I usually try to first validate the input in all ways I can imagine and only after that I write the actual logic.
Of course that's also something working very well when using TDD.
There I first try to write all the tests that have names like "divisor can not be null", before I write the "happy tests".
But how often did I hear the phrase "That will never happen!" when I asked about some specific condition that hasn't been checked...
Yes, normally the users don't write more than 200 characters in the "username" field, but sometimes someone makes a copy & paste error or even a hacker tries an exploit, so it's better to take bad input into consideration.
Edit:
What I forgot to mention:
I usually still throw an exception when my validation catches bad input, but I try to make sure to provide as much context as possible within that exception, so that the logs will help to identify the original source of the issue.
Even worse than a problem that has not been caught is an error that has been caught but then only reports "something went wrong", without providing any help about what went wrong and how to fix it.
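A minimal sketch of what that looks like in practice, assuming Java and hypothetical names: guard clauses first, and when validation fails, the exception says exactly which value was wrong.

// Hypothetical example; validate up front, then let the happy path run on trusted input.
final class Transfer {
    static long splitEvenly(long amountInCents, int recipients) {
        if (amountInCents < 0) {
            throw new IllegalArgumentException(
                    "amountInCents must be >= 0 but was " + amountInCents);
        }
        if (recipients <= 0) {
            throw new IllegalArgumentException(
                    "recipients must be > 0 but was " + recipients
                    + " (amountInCents=" + amountInCents + ")");
        }
        // Happy path last: by now the inputs are known to be sane.
        return amountInCents / recipients;
    }
}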
Another amazing video worth sharing with your entire team! Thanks.
Absolutely Enjoyed this. Thanks for sharing.
It's called Degenerate Testing/Coding. Particularly when I do TDD (Test Driven Development), you ask yourself upfront, "Given X, how should this piece of software/function behave?"
Very important topic and a clear explanation.
Thank you!
I was born thinking about what could go wrong. I won't lie to you all, I was a very destructive child. Dishes, Legos, VCRs, Piggy Banks, CDs, Computers, Tables, Phones, etc. I would always destroy them and then see if I could build it back up. Of course, when I got older, I found out what screws and screwdrivers were lol. Then my high school had automotive classes where we diagnosed and repaired cars. (usually the teachers) There were also donated cars. Sometimes some of us destroyed the components. However, we learned that there were ways to recover from those failures if they weren't too bad. Stripping a bolt, recoverable. Putting fluids in the wrong location, recoverable. Breaking a portion of the suspension, recoverable. Having the car fall off the lift...not recoverable. I remember when someone stripped the threads of a bolt for the engine. I was tasked to use a tap and die to repair it. I found the teeth that perfectly fit into the threads. It was in metric. I fixed it. When we went to screw it, it did not work. Come to find out the threads were imperial. A difference of 0.5 millimeters or less lol. I didn't think to check in both measurements then compare the two.
When I learn something, I learn it well. Then I try to find all sorts of ways I can destroy it or others can destroy it. First, I build a small foundation of knowledge to test against when something goes wrong. Whatever that small foundation does, it also happens when scaled up. I've done this in Math, Science, Software, Art, Automotive, etc. Writing code that works NEVER is enough for me. I want to know how people can break it. I like searching for bugs and seeing where things go wrong. I think about what a human, animal, natural disaster, etc. would do to destroy it.
I actually want to get enough skills to build something (not tied to business just for fun) and say, "This is unhackable" and have someone go in there and destroy it. Then look around picking up the pieces seeing what they did. There is nothing better than watching someone or something take what you've created and destroying it.
I should say though, one should think about what could go wrong to a certain extent. Otherwise, you'll be like me in my 20s thinking about how a bad element can run into the school and looking for all sorts of escape plans on campus. In fact in 2020, I watched a show called "Monk" and laughed because that was almost my life in my 20s. It still is I just toned it down a bit lol. Except when I went to that laundromat and saw a bent pillar and the walls cracked on the area where some of the weight should have been distributed at. I called that in so fast. Not sure what happened because I never went back there!
LOL - one way to get this point-of-view is to be required to maintain your software for a couple of decades. My third for-sale software project was used daily from 1985 to 2010. It hadn't crashed in over a decade when it was finally retired. I couldn't blame anybody except me for any issues, and it made me quite a "defensive programmer."
So the answer is risk mitigation and more in-depth software planning? I needed this answer 2 years ago.
Great discussion. Absolutely fundamental.
Thanks Dave. I really understand every bit of this video thanks to your explanations at a low level with great examples. I sometimes find your CD videos a bit high level due to the fact that you're aiming them at experienced professionals. That's not to say I'm not learning from them, just that for a beginner/intermediate like me they take more avenues of investigation after watching to decode some of the terms.
"What could possibly go wrong?" is a question usually asked by people who expect barely anything can go wrong. This question is pointless, as the answer to it is always the same: Everything. There's absolutely nothing that for sure cannot go wrong.
E.g. no matter how safe you write a program that writes file data, a disk write operation can always fail as the disk might fail itself, the power supply of the computer can always fail in the middle of a write operation, the system might have to kill the process performing the operation at any given point, there might be a bug in the filesystem driver, and so on. If data corruption is not an option under any circumstances, you must never alter data directly, you must always write altered data to a new file, verify the data was correctly written and can correctly be read back and only delete the old file afterwards. No other write operation is guaranteed to be safe.
But in practice you must make compromises. Nobody will accept that your program is 10 times slower or needs 100 times more disk space just to be safe against any possible write failure, when 999 out of 1000 people will never experience such an issue in their entire lifetime; then your code has no real-world benefit for those 999 people, just for the 1 unlucky guy. And that unlucky guy could also roll back an hourly backup, which is far easier to create once an hour in the background. The question is only, how expensive would that data loss be, as up to one hour of data would be lost? And if it really is that expensive, wouldn't it be cheaper to just have two identical systems running at all times, so that if one system ever fails, you can switch to the backup system and recover the first one from there without any data loss at all? At least those two systems will run at full speed and only use as much storage space as is really required. Think of cars. Is it really worth designing a car that will never malfunction when needed, even though it's 10 times as expensive as a normal car? Wouldn't it make much more sense to just buy two or three cars instead, so you always have a backup car if one fails, or just rent a car or call a cab when needed?
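The write-then-swap approach described above looks roughly like this; a simplified sketch that relies on an atomic rename where the filesystem supports it (a production version would also force the data to disk before the move).

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class SafeSave {
    // Write the new contents beside the target, verify them, then replace the
    // old file in one step. If we crash midway, the original file is still intact.
    static void save(Path target, String contents) throws IOException {
        Path temp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.writeString(temp, contents, StandardCharsets.UTF_8);
        // Read back as a cheap check that what we wrote is what we meant to write.
        if (!contents.equals(Files.readString(temp, StandardCharsets.UTF_8))) {
            throw new IOException("verification failed for " + temp);
        }
        Files.move(temp, target,
                StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE);
    }
}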
Real engineers often have a Proof of Concept phase where they do not care about failure modes or edge cases. A Proof of Concept is used to remove the technological risk / uncertainty from the project. The lessons learned in making the PoC are used to plan and develop the final product. The PoC itself is thrown away. It is a waste of time & effort to go through failure modes etc while still in the PoC phase.
SW engineers would do well to adopt this as well. Far too often SW projects fail because there is still a lot of uncertainty & risk, both technical and in user interaction. If that is the case, FIRST build a PoC with a minimal team. If you scale up a team while there is still a lot of risk & uncertainty, the project is most likely to run over budget or fail outright.
Funny that I focus only on what could go wrong, totally destroying the possibility of glorious outcome in my head to the point "Why should I write any code if someone's got a way to break it anyway?"
We are living in a time where it's not allowed to think "what can go wrong"; most people who do this will be attacked and publicly shamed, so most of us don't let ourselves have a mindset of thinking about what can go wrong. We are labelled either negative people or conspiracy theorists if we have the "what can go wrong" mindset.
IDK man. As an engineer and someone who has been tinkering with and repairing stuff all my life, I think this contingency mindset is something to aspire to, but it is rarely practiced. Don't forget that a lot of engineering is hamstrung by constraints around budget and time. And sometimes engineers can get so fixated on technical pursuits that they completely disregard the end user. I've seen it too many times for it to be an anomaly.
A good quality video on the mindset of problem solving, although the title is clickbaity and objectionable. Since “software developers” and “software engineers” are the same thing.
An interesting concept also found in your book, "Modern Software Engineering".
As long as companies are looking for developers to solve a certain problem in a minimal amount of time, it does not matter what's "engineering".
Exactly, good software should still be engineered.
But we must ask ourselves why software has been stuck at the level of engineering for so long. "The Future of Programming" by Bret Victor is food for thought here. As long as we're stuck at the engineering level, Alan Kay's computer revolution has not happened.
According to the Threat Modeling Manifesto, the threat modeling process should answer the following four questions:
What are we working on?
What can go wrong?
What are we going to do about it?
Did we do a good enough job?
As I have always said, there are differences, but all roles are essential. When constructing a building you need architects, civil engineers, construction workers... all are essential parts of the process, but their roles are all different in essence! The worst, and most negligent, kind of person is the one who categorizes themselves as a "jack of all trades"; distrust these people!
I love the way you explain.
I'm a computer scientist. I've met many terrible "software engineers" according to your criteria.
9:20 Could you please provide a reference to the claim that the Apollo 11 crew pushed a button that they should not have pushed? Great video, btw!
Margaret took her daughter to work, and she pressed the wrong button and crashed the simulation. It is described here, though I have read it in a few places: wehackthemoon.com/people/margaret-hamilton-her-daughters-simulation
@@ContinuousDelivery thank you!
Like this definition, simple and clear.
So, what title or role do we use when we want to emphasize the engineering-like nature of the discipline, but we do not have the right to use the protected title, as here in Canada for example? (I do agree about design for failure, though!)
I don't think the video is about titles. It just says you need the engineering mindset to write good sw, in any role.
Sad that some industry leaders who built their current stuff from their legacy are exactly the opposite of this, when "happy path only" is actually the outcome of a fight between dev and management... because management is fine with less than the happy path, as shipping is more important than quality.
Make great - not date. That's what I hear from our most senior product leadership.
If you focus on glorious good paths instead of asking what could go wrong, you basically invoke Murphy's Laws for engineering. It's never a question of "if", only "when" and "how bad". I really don't mind being called "pessimistic" when I deliver value.
Perhaps this is the reason why certain people get promoted while decent engineers are not. They are too skeptical and don't show their "driven to win" attitude. 🤷🏼‍♂️
Thank you ❤❤
Where you say "positive" I'd say "delusional". The very first thing I learned on this job is that defensive code/design is robust and therefore both resilient and reliable. Error handling is job one. I have too many stories of stupid, easily foreseen disasters to recount.
"Done right" or "Done right now"? Your boss will choose the former and be promoted before the later needs to be paid back.
As a pessimist, I'm naturally good at this! And it's not a joke! 😁
So developers do not check inputs etc.?
BS distinction
I would add a little bit to your statement. 'Engineering is about pragmatism and trade offs'
Floppy disk? I'd have to google what that is.
Write a program to print "Hello World". What could possibly go wrong?
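Tongue in cheek, but even that has failure modes: stdout can be a closed pipe or a full disk, and Java's System.out swallows the write error unless you ask about it afterwards. A runnable sketch:

// Even "Hello World" can fail; PrintStream hides IOExceptions, so check for them explicitly.
public final class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World");
        if (System.out.checkError()) {
            // e.g. output piped to a process that has already exited, or the disk is full
            System.exit(1);
        }
    }
}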
I didn’t learn anything. Failure mode work has nothing to do with developer vs engineer. Just one of the processes that need to be applied.
I'm glad to know my chronic anxiety makes me an engineer with no prior education
On the other side of the coin, assessment needs to be done from time to time to balance over-engineering against the MVP; otherwise, failing to meet the time to market could kill all the good intentions.
I think that assumes that doing a good job is more costly; I don't believe that it is over the duration of a software project. Because good code is simpler, easier to work on and easier to change, teams that spend the time to keep their code "good" are much more efficient. "There is no trade-off between Speed & Quality" - State of DevOps Reports.
Making it work VS making it survive inevitable failure.
He knew the guy who'd swipe lunches in the employee kitchen fridge would swipe software A and run it😈🤣
I write code. That's what I do.
There's some math to it. There's some art to it.
This leaves one open question: was Margaret Hamilton's dress gold-and-white or black-and-blue?
Looks like Engineer A had good potential to become a Black Hat Hacker, or an Anarchist.
Engineer B is also called an ..
I think that (good) software *engineering* is being aware of the (entire) *context* in which your software runs. And then act on that awareness.
This context can include possible failures that can occur, but also can mean user-behavior, system-behavior, heck sometimes even physics that can get in the way, etc.
How do we make all the checks within a function if the language isn't prepared for that?
In Java you can have unchecked exceptions; assuming the function you call is arbitrary code, it can throw any of them. We "should" then wrap everything in a try-catch, or one per call, but that's bad practice.
Either way this doesn't work well with lambdas. Rust has the Result type, which is wonderful, but how do we tell Mr. NoMoneyForThat and Ms. NoTimeForThat that we should use better languages? Now imagine telling the poor folks using interpreted, loosely typed languages that there's such a thing as good practice.
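You can get a long way toward the Result idea even in plain Java; here is a bare-bones, home-grown sketch of the pattern (not a standard library type), just to show the shape.

import java.util.function.Function;

// A minimal Result: either a value or an error message, with no exceptions
// escaping across lambda boundaries. Home-grown illustration, not a library API.
final class Result<T> {
    private final T value;       // present on success
    private final String error;  // present on failure

    private Result(T value, String error) {
        this.value = value;
        this.error = error;
    }

    static <T> Result<T> ok(T value)       { return new Result<>(value, null); }
    static <T> Result<T> err(String error) { return new Result<>(null, error); }

    boolean isOk() { return error == null; }

    // Chain computations without try-catch; failures simply flow through.
    <R> Result<R> map(Function<T, R> f) {
        return isOk() ? ok(f.apply(value)) : err(error);
    }

    T orElse(T fallback) { return isOk() ? value : fallback; }

    static Result<Double> divide(double numerator, double denominator) {
        return denominator == 0.0 ? err("division by zero") : ok(numerator / denominator);
    }
}

Usage is then something like Result.divide(10, 0).map(x -> x * 2).orElse(0.0), and the "what can go wrong" path is visible in the types instead of hiding in an unchecked throw.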
Apollo did have bugs found, and they were life threatening ;-)
if you don't find bugs in a program somebody is cheating
Mind your constraints when building solutions.
Both software developer and software engineer mean the same thing and people use these terms interchangeably. Almost every developer/engineer tries to do their best; they want code that is maintainable in the future, robust, free of errors and resilient to changes. Of course, not everyone has the knowledge/experience to do it and not everyone is trained enough to imagine ALL the possible errors that might occur. Even the best software engineers/developers can't foresee all the failures that will happen with their code, and it is impossible to foresee how users will end up interacting with the product. Suggesting there's a divide between software developers and software engineers is just trying to have a go at less experienced people. Instead of using the term 'developer' as a derisive term, let's focus on promoting only approaches that have a proven track record of providing benefit to long-term code-base health. Trying to change the definition of the term 'developer' into effectively a cuss word is not helping anyone; it only alienates less experienced people in the field and causes junior developers to have self-doubts and impostor syndrome, all the while trying to elevate oneself (in this case Dave) to the status of a highly qualified person who always gets it right, when that's not the case at all.
I'd argue, and do in my book on the topic, that what you say may be true, but that it is only true because in software we have devalued what the term "engineering" means. Most software development is not done as engineering in the classic sense of the term. I define engineer like this. "Engineering is the application of an empirical, scientific approach to finding efficient solutions to practical problems." that is not how most software developers approach their work!
@@ContinuousDelivery yes, you're free to choose a definition of the word as you see fit. There will be others who use slightly different definitions for the word engineer. For example there are certainly people out there who don't consider software engineering to be real engineering and would argue vehemently that it's not engineering, using a definition of engineering that suits their view to be derogatory towards software engineers.
They'd laugh at the notion that you're doing any engineering.
It's the same argument as software development vs software engineering. It's just a word play to put down everyone who does not do things the way you do.
Software developers (and engineers) need to realise the second order consequences of their poor actions or inactions. Creating software is as much a moral exercise as it is academic or practical.
Sure, we're not all making pacemakers, but there are so many examples of poor software costing jobs, bankrupting companies, resulting in wrongful criminal convictions for users, prison, and sadly suicide.
We have moral obligations to keep high levels of quality, and push back, even refuse, when requested to do things that violate users or are unlawful (putting in the effort to document and communicate why).
The problem is that many people simply do what they got told to on the JIRA ticket with little accountability. They end up allowing themselves to be treated as an exchangeable part on the factory line and they do it to themselves.
There are software developers and there are engineers. The term 'software engineer' is a misnomer. If I see anyone call themselves a software engineer on a resume, I will throw it away.