Migrations - The Hardest Actual Problem in Computer Science • Matt Ranney • YOW! 2022

  • Published 13 Jun 2024
  • This presentation was recorded at YOW! 2022. #GOTOcon #YOW
    yowcon.com
    Matt Ranney - Principal Engineer at #DoorDash ‪@DoorDash‬
    RESOURCES
    / mranney
    / mranney
    ABSTRACT
    #Migrations sound boring and hard. If you do them wrong, migrations can cause outages, data corruption, and slow down your whole engineering team. But how can you do them right?
    In this talk, I'll share some learnings from working on different migrations over the years and offer some ideas for how to avoid as much pain as possible. [...]
    Download slides and read the full abstract here:
    yowcon.com/sydney-2022/sessio...
    RECOMMENDED BOOKS
    Simon Brown • Software Architecture for Developers Vol. 2 • leanpub.com/visualising-softw...
    David Farley • Modern Software Engineering • amzn.to/3GI468M
    Dave Farley & Jez Humble • Continuous Delivery • amzn.to/3ocIHwd
    Woods, Erder & Pureur • Continuous Architecture in Practice • amzn.to/2QWAmkl
    George Fairbanks • Just Enough Software Architecture • amzn.to/3uZzVo0
    / gotocon
    / goto-
    / gotoconferences
    #MattRanney #SoftwareEngineering #Programming #SoftwareDevelopment #YOWcon
  • Science & Technology

COMMENTS • 59

  • @Keisuki
    @Keisuki 1 year ago +85

    A big reason, I think, that devs don't like doing migrations, is that migrations are massive career killers.
    As mentioned near the end, successful migrations simply aren't rewarded, they don't produce any cool metrics, they don't generate business value, and they don't get you promoted.
    But then, on the other side, failed migrations are an incredible risk to the dev. This obviously depends on the culture of the company you're working for, but if developers have ever seen one of their peers be called out or blamed, there's no way those devs will willingly do a migration. A failed migration can cause immense amounts of damage to a business, and if the bosses are the vindictive kind, they will come after you and make things awful for you.

    • @Keisuki
      @Keisuki 1 year ago +1

      Great talk, was very interesting. 8.5/10

    • @ridcully
      @ridcully 1 year ago +3

      These migrations outlast yearly planning cycles. I’m convinced there is immense business value in doing a necessary migration successfully, but the benefits outlive the standard reward cycles of those shouldering the risk.

    • @steveftoth
      @steveftoth 1 year ago +4

      @ridcully exactly. Scale is a feature, and scaling these services is always harder than we want to say it is. That should be the way we sell these migrations. Scale is a feature, maintenance reduction is a feature, and both have huge bottom-line implications, but most engineers are unable to build those metrics in a form management can understand.

    • @jasonlotito
      @jasonlotito 1 year ago +6

      "they don't produce any cool metrics, they don't generate business value, and they don't get you promoted"
      If they don't produce anything of value, then they aren't worth doing. If you are doing a migration, it's because it's the best solution for whatever you are trying to solve. Whether that's stability, performance, capacity to grow, being able to hire, lower costs, security, etc. Migrations shouldn't be rewarded. The results of migrations are what you need to focus on.
      I've been a part of a large number of migrations, and that's how you need to approach them. A migration isn't worth anything. Saving $100,000 a month because of a migration while also improving stability by an extra few 9's is something worthy of recognition.
      At the same time, you need engineering leadership that calls that out and rewards people who do that.

    • @slesers
      @slesers 1 year ago +3

      For an internal dev, a migration may be a career killer. For a consultant, it's a career builder.

  • @mbunkus
    @mbunkus 1 year ago +31

    The talk was really great, but as with any advice, one needs to keep in mind the situation the advice was given for. Matt's talking about systems at an enormous scale that are accessed by millions across the globe 24/7 and must therefore be up 24/7 & cope with such a huge influx of requests. For such systems the requirements & useful techniques are vastly different than for, say, business applications for HR departments of small to medium companies (say up to 5,000 employees) with maybe 20 concurrent users max. A huge number of us developers are developing tools for such markets, and the solutions we can use there are definitely vastly different. You can easily use a single PostgreSQL instance & ORMs there; they provide immense value and their drawbacks simply won't matter in such environments.
    The drawback of "don't use ORMs", "create a lot of micro services", "use multi-master databases" etc. is immense complexity. That complexity is needed to scale massively, but massive amounts of us developers don't need such massive scale and would do ourselves a disservice by introducing massive complexity.
    What I'm trying to say: when designing your system, spend a good amount of time thinking about the targeted environment & adjust the tools & techniques you use accordingly. Don't blindly follow cool YT videos.
    I'm really not trying to throw shade on the video. I thoroughly enjoyed it. I'm just starting to see both "LOL no ORMs you stupid!?" and "LOL ORMs are stupid" comments, and both are just as wrong as they're right. It depends.

    • @AmanSingh-ou6tq
      @AmanSingh-ou6tq 1 year ago +3

      "There is no silver bullet in Software Engineering. There are only tradeoffs."

    • @vyas-n
      @vyas-n 1 year ago

      While historically I would agree with you, in the current landscape I would find it hard to believe that the ORM is the best architectural solution. With today's tech stacks for developing a low-complexity or low-uptime system, an ORM might be too much complexity for the application.
      Questions that come up are:
      - Do you even need a SQL database? Could you just use NoSQL or even Object Storage?
      - Does your SQL schema need to be strictly tied to your application?

    • @thePontiacBandit
      @thePontiacBandit 10 months ago

      Totally agree. Context matters.
      I will say though, a crucial part of using ORMs is learning how to work around them. In a fast-paced environment with lots of developers, a lack of knowledge and being overly trusting of an ORM can create technical debt that is hard to triage and deal with.

  • @Stephendenham
    @Stephendenham 1 year ago +26

    Why is it so hard? In part because we don't have a way to recognize and reward the problems our architecture avoided

    • @ChrisAthanas
      @ChrisAthanas 1 year ago +2

      This is so true

    • @bbravoo
      @bbravoo 9 months ago

      But also because you need to keep the plane flying and upgrade the engines while you build the next aircraft

  • @wrjacqmein
    @wrjacqmein 1 year ago +13

    "Good Abstractions - A good abstraction or interface is one that allows either side to change something without requiring coordination or changes to the other side" - Matt Ranney

    • @agcwall
      @agcwall 1 year ago

      I still don't understand this point of the talk... I can't think of a single abstraction where you can do this. If you change an interface on one side, you're necessarily forced to change the user, with the exception of purely additive changes like adding a new method or parameter. Then he asks if GraphQL is a good abstraction... so confused, GraphQL isn't an abstraction, it's a way of querying for data...

    • @jimmyhirr5773
      @jimmyhirr5773 1 year ago

      @agcwall ASCII is a good example. This is an interface that was originally created for teletypes and line printers. It's survived multiple generations of advancements in computers and now it's everywhere. If someone creates a new type of display or keyboard, there's no need to change ASCII to support it.

    • @jimmyhirr5773
      @jimmyhirr5773 1 year ago

      @agcwall I wouldn't be so dismissive about an interface supporting adding a new method or parameter without breaking. There's lots of statically typed languages and data formats (like XML) where adding a new method or parameter will cause breaking changes.

    • @agcwall
      @agcwall 1 year ago

      @jimmyhirr5773 And if you want to add new features and support Chinese, for example, you break this interface. I still don't see how this serves as an example of "change something without requiring coordination or changes to the other side". I still can't think of a single example.

    • @agcwall
      @agcwall 1 year ago

      I feel like maybe we're using differing terminology for the idea of a "protocol"... if you want to supply info not supported by the protocol, you force a change on both ends, period.
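
The "purely additive change" case debated in this thread can be sketched in a few lines. This is only an illustration, not anything shown in the talk: the `Order` type, the `parse_order` function and the JSON fields are all hypothetical, and the consumer is assumed to be a "tolerant reader" that ignores fields it does not know about.

```python
import json
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical consumer-side view of a message; not from the talk.
    order_id: str
    total_cents: int


def parse_order(payload: str) -> Order:
    """Tolerant reader: pick out the fields this consumer needs, ignore the rest."""
    data = json.loads(payload)
    return Order(order_id=data["order_id"], total_cents=data["total_cents"])


# Version 1 of the producer's message.
v1 = json.dumps({"order_id": "A1", "total_cents": 1250})
# Version 2 adds a field; the consumer has not been redeployed.
v2 = json.dumps({"order_id": "A1", "total_cents": 1250, "currency": "AUD"})

# The old consumer parses both versions to the same value.
assert parse_order(v1) == parse_order(v2)
```

Under these assumptions the producer can add `currency` without coordinating with the consumer. Renaming or removing `total_cents` would still force a change on both sides, which is the objection raised above.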

  • @utoob7361
    @utoob7361 1 year ago +3

    The problem with engineering, and computer science specifically, is that putting out fires makes you a hero, preventing fires gets you laid off. So bad programming is actually a good career move.

  • @MaximilianSchafzahl
    @MaximilianSchafzahl 1 year ago +7

    thanks, good practice oriented talk

  • @Tubingonline1
    @Tubingonline1 1 year ago +1

    This is an awesome talk, cannot believe "Migrations" could be made so interesting! 😊

  • @tyeadel
    @tyeadel 1 year ago +2

    Oh yes. Once migrated a system from a mainframe to a group of SQL servers. There were a large number of remote clients all with their own systems and we had an in-house database system written (largely undocumented) over a few years by developers who had left. Oh what fun!!!

  • @garrettdoorenbos4048
    @garrettdoorenbos4048 1 year ago +1

    Great presentation, very informative.

  • @ericlegoubin4462
    @ericlegoubin4462 1 year ago +3

    It's true that the migration doesn't bring anything in terms of functional value.
    However, as Matt Ranney points out, velocity had dropped and something had to be done. If the time to market has actually decreased after the migration, the migration can be considered successful. The point is to determine the success criteria before the migration and to measure them after it.
    I have the impression that the solution code should have circular dependencies to force both upward and downward compatibility, as mentioned in the presentation. The scenario of refactoring the solution to break these dependencies had to be discarded due to the workload.
    Also, I note that the unit test coverage was not enough and that the proof of isofunctionality came from comparing the database state.
    Regarding the notion of good abstraction, this is similar to the notion of pivot format or service contract. Good abstractions are the key to dividing the problem. Migrating a large monolith remains fundamentally risky and therefore difficult! Migrating a set of small monoliths is more achievable.
    Among the approaches available today, these come to mind:
    * Micro-service architectures allow migrating block by block and smoothing the migration load over several iterations.
    * Progressive deployments such as A/B testing allow deploying a new version under real conditions with limited business risk.
    * Hexagonal architecture improves the segregation between the business code that we generally want to keep and the infrastructure code that is often the object of the migration.
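
The hexagonal-architecture point in the last bullet can be sketched roughly as follows. This is a minimal illustration under assumptions, not the architecture from the talk: the `SalaryStore` port, both adapters and the `give_raise` function are hypothetical names, and SQLite from the standard library stands in for whatever database is actually being migrated to.

```python
import sqlite3
from typing import Protocol


class SalaryStore(Protocol):
    """Port: the only thing the business code knows about storage (hypothetical)."""

    def get_salary(self, employee_id: str) -> int: ...
    def set_salary(self, employee_id: str, cents: int) -> None: ...


class InMemoryStore:
    """Adapter standing in for the old infrastructure."""

    def __init__(self) -> None:
        self._rows: dict = {}

    def get_salary(self, employee_id: str) -> int:
        return self._rows[employee_id]

    def set_salary(self, employee_id: str, cents: int) -> None:
        self._rows[employee_id] = cents


class SqliteStore:
    """Adapter standing in for the new infrastructure."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS salary "
            "(employee_id TEXT PRIMARY KEY, cents INTEGER NOT NULL)"
        )

    def get_salary(self, employee_id: str) -> int:
        row = self._conn.execute(
            "SELECT cents FROM salary WHERE employee_id = ?", (employee_id,)
        ).fetchone()
        if row is None:
            raise KeyError(employee_id)
        return row[0]

    def set_salary(self, employee_id: str, cents: int) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO salary (employee_id, cents) VALUES (?, ?)",
            (employee_id, cents),
        )


def give_raise(store: SalaryStore, employee_id: str, percent: int) -> int:
    """Business rule: depends only on the port, so it survives a storage migration."""
    new_cents = store.get_salary(employee_id) * (100 + percent) // 100
    store.set_salary(employee_id, new_cents)
    return new_cents


old_store = InMemoryStore()
new_store = SqliteStore(sqlite3.connect(":memory:"))
for store in (old_store, new_store):
    store.set_salary("e1", 100_000)
    give_raise(store, "e1", 5)
```

The business rule in `give_raise` is unchanged when the adapter is swapped, which is the segregation the comment describes. It says nothing about the harder parts of a live migration, such as moving the existing data.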

  • @theodorbarth8436
    @theodorbarth8436 1 year ago +13

    I totally agree: don't use an ORM.

  • @jimmyhirr5773
    @jimmyhirr5773 1 year ago

    The definition of a good interface is both concise and accurate. It's implicit in the design goals of having high cohesion and a high locality of connascence. One thing it misses is that if you do have to make a change across the interface, it's better for it to be small and simple. This is covered by coupling and by the strength and degree of connascence.

  • @KnThSelf2ThSelfBTrue
    @KnThSelf2ThSelfBTrue 1 year ago +2

    idk, I feel like a service mesh with canaries can address a lot of these issues. And also, like... an ORM sharing a database with RPC services, and one that has built-in eventing/message-brokering system that other services can subscribe to sounds like a lot of complexity shoved into a single django process that has nothing to do with the complexity of migrations.

  • @woolfel
    @woolfel 1 year ago

    As a consultant, I've worked on several migration projects and it's super painful. It pays well in the consulting world and constitutes a significant percentage of the work at Fortune 500 shops. For me, maintainability, documentation and ease of ramp-up are way more important than lines of code. Too many people obsess over lines of code or trying to show how smart they are.
    When Bank of America acquired Bank Boston, the original estimate for integration was 3-5 years. It ended up being much longer, closer to 8 years.

    • @jimmyhirr5773
      @jimmyhirr5773 1 year ago

      According to the CHAOS report, the average software project is delivered in double the estimated time. So that outcome is very common for all software projects.

  • @OrcusMaximus
    @OrcusMaximus 1 year ago

    Zero downtime. Does PostgreSQL support seamless failover to a slave? If not, is your database server never taken down for patching?

  • @millertime6
    @millertime6 11 months ago

    Me: used an ORM and auto incrementing keys just yesterday 😩 🤦🏽‍♂️

  • @matthewpublikum3114
    @matthewpublikum3114 1 year ago

    31:30 more info on that replication problem

  • @uome2k7
    @uome2k7 1 year ago +1

    You don't cover what you think a good abstraction would be. But I don't know what would work here. To make a meaningful change, the contract would have to be changed in some way or your APIs have to have deviated (potentially farther) from your database models.
    Migrations are hard but it's fun to untangle the spaghetti. It really helps you want to write/design good code to leave it easier for the next team. It's hardest to convince business to pay for this because they aren't getting any new value from it. And it's hard to justify the potential cost savings/velocity increases ahead of time.

    • @OrcusMaximus
      @OrcusMaximus 1 year ago

      Yep. Finding bugs in old code proving that it has *never* run, and so can be deleted. Great feeling.

  • @richardduncan3403
    @richardduncan3403 1 year ago +2

    Nice preso. This guy would have made a great character on Silicon Valley as the witty migration guy.

  • @adambickford8720
    @adambickford8720 1 year ago +4

    Migrations have zero 'wow' factor; it either worked 'like it was supposed to' or you look like a clown.
    It's also usually incredibly slow with long feedback cycles so you are never in 'the zone'. You make some tweaks, kick something off and wait, the whole time contemplating how much this sucks. It doesn't have that 'one more try' factor that makes development fun.

  • @nekony3563
    @nekony3563 1 year ago +22

    Migrations are not that hard. Having no downtime is hard. If having no downtime "doesn't add direct business value", then migrate with downtime.

  • @clementdato6328
    @clementdato6328 1 year ago

    very true, I rise above the average attractiveness

  • @br3nto
    @br3nto 1 year ago +1

    47:04 automate it. How?

    • @ChrisAthanas
      @ChrisAthanas 1 year ago

      That’s what he is saying the challenge is

    • @br3nto
      @br3nto 1 year ago +1

      @ChrisAthanas yeah I get that. It would be nice to hear from him what he has already been able to automate, what the easy parts are, what the hard parts are, what seems most fruitful to automate first, and whether he has developed any techniques yet.

    • @ChrisAthanas
      @ChrisAthanas 1 year ago

      @br3nto Migration techniques are so often "roll your own", built around a custom API and database, that automation is difficult.
      It would be better to have some best practices for designing systems that are easier to migrate from the start.
      As the field matures this will become the norm and AI tools may help, but it will continue to be a pain point, as each business solution is different and future expansion directions are usually unknown.

    • @br3nto
      @br3nto 1 year ago +1

      @ChrisAthanas yeah definitely. Showing comparisons of different techniques is useful, like when he demonstrated the pros and cons of two different techniques of shared db vs double write. More of that would be awesome. The hard part about migrations is we have to pick a migration path, so we often have nothing to compare against.
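
For readers unfamiliar with the "double write" approach mentioned here, the idea can be sketched as below. This is a hedged illustration only, not the speaker's implementation: the `DoubleWriter` class is hypothetical, and plain dictionaries stand in for the old and new databases.

```python
class DoubleWriter:
    """Writes go to both stores; reads are served from the old one and checked."""

    def __init__(self, old_store: dict, new_store: dict) -> None:
        self.old_store = old_store
        self.new_store = new_store
        self.mismatches: list = []  # keys where the two stores disagreed on read

    def write(self, key: str, value: str) -> None:
        # The old store remains the source of truth during the migration.
        self.old_store[key] = value
        self.new_store[key] = value

    def read(self, key: str):
        expected = self.old_store.get(key)
        if self.new_store.get(key) != expected:
            self.mismatches.append(key)
        return expected


old_db = {"order-1": "delivered"}  # row that predates the migration
new_db = {}
writer = DoubleWriter(old_db, new_db)

writer.write("order-2", "pending")
writer.read("order-2")  # both stores agree
writer.read("order-1")  # never backfilled, so it is recorded as a mismatch
```

The mismatch list is what makes the comparison possible: it shows which data still has to be backfilled before reads can be switched to the new store. A real system would also have to handle partial failures between the two writes, which this sketch ignores.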

    • @br3nto
      @br3nto 1 year ago +1

      @ChrisAthanas I guess it will become more commonplace too. We’re always building new code, so in the future there will be more code to migrate. We will need to improve the writing of code and migration of code just to keep ourselves from being inundated with work.

  • @declanmcardle
    @declanmcardle 1 year ago

    @3:00 a year? Try 10 or 20!

  • @br3nto
    @br3nto 1 year ago +8

    Lol. The advice, Don’t use an ORM, is like saying, Don’t use a web server framework or Don’t use a client-side rendering framework, or don’t use a server-side rendering framework. That advice is the kind you get from dogma; it may not even be generally good advice let alone good for your specific context.

    • @nielsbom5558
      @nielsbom5558 1 year ago +1

      Are you talking from experience like the speaker is? Have you done large scale migrations of systems that always need to keep working? And if so: what did you think of the talk?

    • @Keisuki
      @Keisuki 1 year ago

      As with all things, the answer is unfortunately "It depends".
      I've seen situations where ORMs have been really useful, and situations where ORMs have screwed everything up to the point of needing a complete rewrite.

    • @br3nto
      @br3nto 1 year ago +4

      @nielsbom5558 my experience is relative. I’ve done plenty of migrating old systems to new systems with the requirement of no downtime. I’m not sure how it compares to the speaker’s experience. Large is also relative. Some were large to me. I think the speaker didn’t actually talk about what he wanted to talk about. Which is a shame because he probably had a lot of useful things to say. Right at the very end he says there is a lack of tooling to help with migrations and no automation. That the process should be automated. Basically implying that it should be as easy and methodical as refactoring code. But he didn’t go into any of that in detail. There’s not really a lot of useful information I can take away and apply. It was good seeing the comparison between two approaches of migrating a service and the pros and cons. The advice about ORMs doesn’t make sense, because we don’t choose the systems that need to be migrated; they come as they come and we have to deal with that history somehow. That advice only makes sense in a brand new project; but why would we not use good tools and frameworks? We don’t know what the future holds, so coding for the future is a fool’s errand. Saying that though, there are styles of coding that are easier to migrate than others… e.g. highly nested code or deep call stacks are much harder to migrate. If you can’t isolate and test specific functionality, you can’t easily do a migration. That’s true regardless of the framework and libraries or infrastructure choices that are made.

    • @janekschleicher9661
      @janekschleicher9661 10 months ago +1

      @nielsbom5558 Well, the question here is: what do you have to do before you have a system so big and successful that it could need such a hard and painful migration? If you start with a very clean, future-safe, well-abstracted initial implementation, you might take longer and you probably can't kickstart it as a one-man (or small team) project, since one critical part of doing good abstractions is having intense and diverse discussions. You probably can't start with juniors, and it might be expensive in the beginning, especially if you try not to take frameworks that give you something for "free".
      The problem here is, if you do all of this, you might have a great architecture that would be very safe if you get very big, but you won't get that far, because everyone who is just prototyping fast and postponing migration problems until they are very successful and already printing money, so that they can invest even 25% of their whole budget just in the migration, will have overtaken you.
      That's why close to all successful companies started with a simple monolith that became hard to maintain at some point. If they hadn't, they wouldn't have been successful in the beginning and their wonderful architecture would never have paid off. DoorDash started that way too.
      Having to do a migration is a luxury problem of the winners, not a sign of bad design to start with.
      Django/ORM/Admin Panel/Middleware/Postgres allows you to bring up a professional service running for many users (maybe not millions or billions) in just a few days as a one-person project.
      Starting with an abstraction over Kafka, distributed microservices, distributed databases, service layers and so on would be future-proof, but kickstarting it might take months of multiple seniors working on it with a solid budget already in place.
      And you never know whether the product you are developing is really the product needed, so whatever you think about it, you'll need some rapid prototyping. Also, you don't really know what the pain points of migrations will be 10 or 20 years from now. While it's a good idea to apply best practices whenever you can, it might be a not so great idea to reject current best practices just for the sake of a possible, unknown migration in 10 years. Software would already be much easier to refactor and migrate if current best practices were routinely applied. I would bet that most migrations suffer mostly from a lack of documentation, testing and automation (all of them current best practices), and not so much from using one technology as consistently as possible. Again, there's a point in that a successful product is somehow guaranteed to be overwhelmed by its own success, so all of these points are already complicated to fulfill.

  • @laughingvampire7555
    @laughingvampire7555 11 months ago

    ORMs considered harmful

  • @dengan699
    @dengan699 1 year ago +10

    LOL "don't use an ORM"
    Sorry what? You literally just spent 51 min talking about how you couldn't migrate a database. An ORM's main job is to abstract which database you use.
    The joy of an ORM is that you can run your local SQLite with the same schema as your production Postgres, without needing Postgres locally / Aurora / MySQL / whatever.
    Enjoy running your bespoke MySQL instance which only works under Debian Jessie with a dozen bespoke patches to make it work so you can run your tests!