Looking for books & other references mentioned in this video?
Check out the video description for all the links!
Want early access to videos & exclusive perks?
Join our channel membership today: ua-cam.com/channels/s_tLP3AiwYKwdUHpltJPuA.htmljoin
Question for you: What’s your biggest takeaway from this video? Let us know in the comments! ⬇
Finally someone who understands that not every company is Netflix.
Or is just flexing to get a job at Netflix
Some timecodes
6:40 Isolate the data
9:20 Release train
12:50 Horror, pain and suffering
15:00 Strangler fig
22:00 Branch by abstraction
26:00 Parallel run
28:30 Extracting data
39:00 Partitions
Thanks
Great coverage. Clear Audio. Good visible slides. Good view of the speaker. Well done GOTO. Well done.
I disagree about the audio. They could have applied some echo-removal filter.
Yes agree - the number of talks I've seen where the speaker is referring to a slide and the camera is on the speaker! This talk is great for clarity.
I am falling in love with goto talks. Keep going! 🥰
This is a great talk! Sam Newman's book is a MUST if you find yourself facing a big monolith that is asking for an architectural change. Too bad I can't like this talk more than once.
In the database example, it is clear that moving the join to the application layer is not a scalable solution, as also stated in other comments. I guess a likely solution for huge problems like this, where a join over millions of records is required, is to use a middleware-layer cache, data analytics, or reporting tool.
Nevertheless, as size grows and data increases, everything gets more complicated.
Interesting take, gives me stuff to think about 🤔💬
Wonderful talk. Highly recommend Sam's books, too!
Great speaker, great content!
34:43 "What instead we're going to want to do is move the join relationship up into the application tier".
Yeah, as mentioned, you are absolutely definitely not going to want to do that. Joining through the application in this way won't just be slower. It will be several orders of magnitude slower for any non-trivial cardinality. Here's a fun exercise: Put a modest 10,000 rows on the left, then put a still pretty modest 1,000,000 rows on the right, then design a system which joins through the application in this way for any significant and realistic amount of data, and then start looking for a new job.
If you're going to have a function of your application that regularly yields a million+ rows then you probably don't want to pull that away from its DB and put that into a microservice. Like he said it's all about trade-offs.
@@patwentz3830 My example doesn't require that the function yield a million rows, merely that the search space is a million rows.
The purpose of his solution was to exemplify a simple way to decompose the database. BUT in the case you mentioned, a solution would be to replicate the data through events to the other microservice so the join could be done in the database layer; the microservice which holds the selling data could have a copy of the catalog data. Sure, it could have consistency problems, but it's all about context and the trade-offs that come with it.
@@pafernandesful Yep, event carried state transfer if you choose A over C in CAP (this is what we do at my company)
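To make the event-carried state transfer idea from this thread concrete, here is a minimal sketch, assuming the finance side keeps its own copy of catalog data and updates it from events. The table names, the event shape, and the function names are all hypothetical; an in-memory SQLite database stands in for the finance service's store, and no real message broker is involved.

```python
import sqlite3


def make_finance_db():
    """Finance-owned database holding the ledger plus a replicated catalog copy."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE catalog_copy (sku TEXT PRIMARY KEY, title TEXT NOT NULL)")
    db.execute("CREATE TABLE ledger (sku TEXT NOT NULL, quantity INTEGER NOT NULL)")
    return db


def on_catalog_item_changed(db, event):
    """Apply a (hypothetical) catalog event to the local, eventually consistent copy."""
    db.execute(
        "INSERT OR REPLACE INTO catalog_copy (sku, title) VALUES (?, ?)",
        (event["sku"], event["title"]),
    )


def best_sellers(db):
    """The join runs inside the finance database, not in application code."""
    return db.execute(
        "SELECT c.title, SUM(l.quantity) AS sold "
        "FROM ledger l JOIN catalog_copy c ON c.sku = l.sku "
        "GROUP BY c.sku, c.title "
        "ORDER BY sold DESC, c.title"
    ).fetchall()
```

The trade-off is the one named above: the copy can lag behind the catalog service, so reports may briefly show stale titles in exchange for staying available and keeping the join in the database.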
As always, impressive talk by SAM.
Great talk! Nice overview of microservices patterns.
Thanks
we were in the master class together. Let's connect.
19:00 I'm a bit surprised Sam does not mention the possibility of releasing functionality without actually deploying it, usually via an announcement email that has not been run by the development team.
He mentions that later
Hello Sam, I have a scenario; can you please provide your advice? The situation I am in is modernizing C++ services based on the IBM Tuxedo framework (they are quite a spaghetti, with all services hosted and scaled together). You could probably call it a scaling monolith developed in a legacy tech stack (C++). The company has decided to modernize it to a microservices model in the Azure cloud using Java and a modern tech stack. The code base is almost 30 years old, with a lot of business logic and time-tested implementation. What would be your advice on modernizing such a distributed monolith? Should we pick domain by domain and reimplement from scratch? Or vertically slice the processing, keeping it in C++ and moving it into the cloud alongside Java-based microservices (with a JNI interface into the vertically sliced C++ business logic)? And have calls back to the IBM DC for any other domain dependencies? There is also the challenge of cross-environment communication (Azure cloud and IBM DC).
Amazing talk!
Monolith is not Legacy, Legacy is not Monolith. Those are two distinct terms with distinct meanings, not to be used interchangeably.
But that thing with the catalogue at the end... Pretty hard to get that with a monolith. I've encountered this many times in my career. You could have a lot of availability and try to polyfill with HA services. See how well that works out for you with a 1s time to diagnose
Fantastic
I'd stick with a well-decomposed monolith with decoupled modules, until the need for scale exceeded the trouble that comes with trading simple ACID transactions for event-driven thingies and sagas ...
What if my legacy system is delivered by a third party and we don't have any access to the code, and the only thing we can access is the database?
We cannot have integration from the legacy system to the new microservice platform. Our suggestion is to create integration services between the legacy database and the new microservices. Do you think that is a good practice?
Unfortunately, this presentation assumes you already have bounded contexts in your monolith, which are, imo, the first and hardest goal to reach.
I can't pinpoint exactly what, but there is something Jim Morrisonesque in this talk.
A question regarding the Catalog and Finance example:
How do you make a pagination for that query?
For example, you ask how many records were sold last week, but there are more than 10k results, so you want to bring these results back split into pages.
Maybe for the current page, send their IDs as a list via a service call and fetch their details as a list from the catalog service?
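The reply above can be sketched roughly as follows: page through the finance-owned rows first, then resolve only that page's IDs with one batched lookup. This is an illustration under stated assumptions, not how the talk implements it. `CATALOG`, `LEDGER`, `catalog_lookup`, and `sales_page` are hypothetical names, and in-memory data stands in for the two services.

```python
# Hypothetical stand-ins: in a real system these live in two separate services.
CATALOG = {"A1": "Kind of Blue", "B2": "Blue Train", "C3": "Giant Steps"}
LEDGER = [  # finance-owned rows, ordered by sale id
    {"id": 1, "sku": "A1", "quantity": 2},
    {"id": 2, "sku": "B2", "quantity": 1},
    {"id": 3, "sku": "A1", "quantity": 3},
    {"id": 4, "sku": "C3", "quantity": 1},
    {"id": 5, "sku": "Z9", "quantity": 4},  # item since removed from the catalog
]


def catalog_lookup(skus):
    """Stand-in for ONE batched request to the catalog service."""
    return {sku: CATALOG[sku] for sku in set(skus) if sku in CATALOG}


def sales_page(after_id, page_size):
    """Return one page of sales plus a cursor for the next page (None when done)."""
    # Keyset pagination over the ledger: only rows after the cursor are considered.
    rows = [row for row in LEDGER if row["id"] > after_id][:page_size]
    # Resolve titles for this page only, in a single batched call.
    titles = catalog_lookup([row["sku"] for row in rows])
    page = [
        {"id": row["id"], "title": titles.get(row["sku"], "<missing>"), "quantity": row["quantity"]}
        for row in rows
    ]
    next_cursor = rows[-1]["id"] if len(rows) == page_size else None
    return page, next_cursor
```

Paging on the ledger side keeps each cross-service call bounded by the page size. Sorting or filtering by a catalog field (for example by title) would not fit this shape and would need a different approach, such as a replicated copy of the catalog data.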
CAP theorem rather?
Excellent Boss 👍👍💪 💪
We already have a way to create, build and deploy modules in Java. OSGI allows us to build our application as independent modules which can then be deployed on a running instance just like a plugin.
37:40 simply never delete anything
35:50 FKs don’t improve perf of joins
Actually, the argument of "losing" integrity enforcement on a DB split is kind of a straw man. The enforcement is encoded more easily, however it is coded after all!
In large government projects, audit trail requirements are very strict. If you can't write an audit trail record to your audit trail store, you must roll back all pending data changes to avoid a situation where data in some database has changed without any audit trace.
So, if you have split your DB for your microservices, you have to either ensure separate audit trail records in every database (and then somehow still be able to aggregate them for analysis in a single tool with a single UI) or use distributed transactions. Either way, it can get tricky to maintain such a system.
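For context, the single-database version of this requirement is straightforward, which is what makes the split costly. A minimal sketch, assuming one SQLite database holds both the data and the audit trail (table and function names are hypothetical):

```python
import sqlite3


def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
    db.execute(
        "CREATE TABLE audit_trail ("
        "id INTEGER PRIMARY KEY, actor TEXT NOT NULL, action TEXT NOT NULL)"
    )
    db.execute("INSERT INTO accounts VALUES ('acc-1', 100)")
    db.commit()
    return db


def update_balance(db, account_id, new_balance, actor):
    """Change data and write the audit record in one local transaction."""
    try:
        with db:  # commits on success, rolls back if the block raises
            db.execute(
                "UPDATE accounts SET balance = ? WHERE id = ?",
                (new_balance, account_id),
            )
            db.execute(
                "INSERT INTO audit_trail (actor, action) VALUES (?, ?)",
                (actor, f"set {account_id} balance to {new_balance}"),
            )
        return True
    except sqlite3.IntegrityError:
        # The audit write failed, so the balance change was rolled back too.
        return False
```

Once the accounts table and the audit store sit in different databases, this one `with db:` block no longer covers both writes, which is where per-service audit records or distributed transactions come in.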
It's really night and day. In the database, it's described declaratively; and once described, none of your application code needs to consider dangling references because (barring a database bug) they can't happen. In the application code, it's described operationally in the way the code behaves, and all code that deals with these database entities needs to consider dangling references, because they'll occur sooner or later.
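The declarative versus operational contrast can be illustrated with a small sketch, assuming SQLite for the single-database case and a plain dictionary lookup for the split case. The schema and the helper names `try_delete_item` and `title_for` are hypothetical.

```python
import sqlite3


def single_db():
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
    db.execute("CREATE TABLE catalog (sku TEXT PRIMARY KEY, title TEXT NOT NULL)")
    db.execute(
        "CREATE TABLE ledger ("
        "id INTEGER PRIMARY KEY, "
        "sku TEXT NOT NULL REFERENCES catalog(sku), "  # declared once, enforced everywhere
        "quantity INTEGER NOT NULL)"
    )
    return db


def try_delete_item(db, sku):
    """Declarative case: the database refuses to create a dangling reference."""
    try:
        db.execute("DELETE FROM catalog WHERE sku = ?", (sku,))
        return True
    except sqlite3.IntegrityError:
        return False


def title_for(ledger_row, catalog_titles):
    """Operational case: after a split, every reader must expect a missing item."""
    return catalog_titles.get(ledger_row["sku"], "<item no longer in catalog>")
```

In the first case the rule lives in one place in the schema. In the second, each piece of code that reads ledger rows needs its own equivalent of the fallback in `title_for`.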
Nice trouser
My head hurts from hearing "CAP Theory" instead of "The CAP Theorem"
you should become a game streamer. Then you'll have full control over all the trivial details that don't impact the world ;-)
@@LewisCowles Only someone who does not actually understand the distinction between "theory" and "theorem" would assert that the distinction between them is trivial, or in other words, by commenting you have demonstrated that you have no business commenting.
Don't worry about it.. networks suck and consistency is impossible.. look after your users
Many tutors on the youtube are not harmed during the talk......
Thank you. Good talk until getting into the database topic.
The so-called "monolith databases" are usually those created by "developers." Unfortunately, developers are getting into the business of database creation (tools make it very easy and tempting), especially with the advent of NoSQL.
This kind of talk about database refactoring (about data in general) points to a systemic problem within organizations, especially in developer-driven cultures.
It seems to me developers are never able to understand that data/information is the enterprise's shared asset. Programming languages, tools, applications, technology, and methodology come and go.
But information always stays. Therefore, creating and managing data (databases) in a sustainable way requires experienced data specialists (data architects, database engineers).
Your microservices will be as good as the level of maturity of the data management in the organization.
I agree, but let me tell you one thing: there are developers out there who put a lot of effort into creating good data structures. Of course a data specialist will be better prepared, but don't underestimate the fact that a developer will be the person who makes that data usable by services, and somehow this can be a pro. The best thing would be a good collaboration between those two roles, but that cannot always happen.
Personally, in many situations I have preferred to delegate logic to the DB instead of doing it later in the code, and I spend a lot of time designing tables in order to avoid later modifications.
I see the data as the effective core of applications, at least the ones I've been working on in recent years.
The application itself is what accesses, presents and manipulates this data in the most effective way possible, though it remains a "satellite". Plus, as time passes the DB grows and acquires a certain value: a direct one, but also an indirect one that can be determined through data analysis.
Maybe there are other developers out there doing better than me, but honestly I wouldn't say that DBs are only for database engineers. Of course we are talking about the average application, not gigantic projects.
These are only personal thoughts, but the main point here is far from DB construction. I'd really like to split a monolith app into two parts, three at most, because the underlying data allows me to do that. But when it comes to losing foreign keys and joins... well, then I prefer not to split that part of the application and to keep data integrity.
What do you think about this? I would appreciate any suggestions.... Thanks
Microservices solve one giant problem by creating all n! possible permutations of it. ;-)
Github scientist, that's cool
For mysql diffing and change management, www.skeema.io/ looks good.
Koalas want to kill you!
errrline
Annoying Echo!!!