Your videos are so validating. I have naturally evolved into this pattern after facing some of the challenges you mentioned in my other projects. Handling state propagation this way really decouples your services. Although it may seem like you're writing more code, IMO verbose is better, and it makes reading things a lot simpler because you only have relevant events and commands handled in the domain.
Great video, as usual!
In terms of solving the underlying problem, more often than not, if the integrity of the system depends on both the data being written and the message being sent successfully, I have found the best solution is to wrap them both in a single transaction. If the message send fails, the DB write is rolled back and the operation errors out. This is the most straightforward solution to the problem and works well for the majority of use cases. Persisting messages and sending them later (as in the outbox pattern) leads to us having to build a mini-message broker ourselves which is often much more trouble than it’s worth. And the listen to yourself pattern is almost never a great option for all the reasons you outlined so well in this video.
Most of the time you cannot have a transaction span both the database and the queue. That's why these patterns exist.
@@RtoipKa for sure, if wrapping both calls in a transaction is not an option, it makes sense that we’d need an alternative. I’ve just rarely encountered a situation where it’s not possible, but maybe that comes down to the particular stacks I have used.
@@DevMastery Probably yes, because I have never been in a situation where this is possible. For example, any cloud message broker. And distributed transactions have problems of their own, even when they are possible.
@@RtoipKa with a database platform that supports transactions, we can typically do:
START TRANSACTION
DB WRITE
MESSAGE SEND
COMMIT TRANSACTION
If MESSAGE SEND fails we can ROLLBACK and send an error code to the client.
Of course, because this is non-atomic, there is a very slim, but non-zero, chance that the ROLLBACK or the COMMIT fails, so we need to handle those scenarios if we want to be bulletproof. But they are rare enough that good logging, alerting, and manual intervention can often be the answer.
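For what it's worth, a minimal sketch of that flow in Python, with sqlite standing in for the database and a hypothetical broker.publish() client; the point is just that the send happens before the commit, so a failed send rolls the write back:

```python
import sqlite3

def place_order(conn: sqlite3.Connection, broker, order_id: str, payload: str):
    # "with conn" opens a transaction: it commits on success and
    # rolls back automatically if anything inside raises.
    with conn:
        # DB WRITE
        conn.execute(
            "INSERT INTO orders (id, payload) VALUES (?, ?)",
            (order_id, payload),
        )
        # MESSAGE SEND happens before COMMIT; if it raises,
        # the insert above is rolled back and the caller sees the error.
        broker.publish("orders", {"order_id": order_id})
```

The remaining slim window is the COMMIT itself failing after a successful send, which is where the logging and alerting mentioned above come in.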
Read Your Writes is a pretty good waypoint on the road to event sourcing (and arguably event sourcing is just a special case of this: a stream of events is a queue of messages). I think its biggest value is in guiding you away from having consumers that depend for correctness on querying state that they aren't themselves responsible for updating.
Thank you! We use exactly this process with GCP buckets and bucket notifications + a scheduler.
6:17 So what happens when it fails to deliver?
The Listen to Yourself pattern is also a way to scale the monolith. You publish the event to the broker, and you listen to it again.
But we might argue that this is a monolith and the logical boundary is the monolith itself, so we just go around again.
I think I saw this in your video about avoiding the usage of MediatR as an in-memory event dispatcher.
Thanks for the great content!
I can't find who first defined the "Listen to Yourself" pattern. Do you have any sources?
What's the first book in your background, if I may ask? The one at the bottom is the DDD one from Eric Evans, I guess. 😀
I learned a lot from this video. Thank you so much!
You're very welcome!
The problem with the Command approach is that it's direct, unicast, and imperative, vs. the Event approach (messaging), which is indirect, multicast, and declarative. As a consequence, to issue a command you have to be aware of all of the receivers, and these receivers have to be online and running; otherwise you have to persist your state and keep retrying (which is complex and impractical). In practice, it can (and must) only be applied to near-real-time systems where issuing a late command is the same as, or worse than, not issuing it at all. It can be fixed, of course. With events. Which should have been used in the first place.
Commands don't have to be synchronous. A command is direct in the sense that you know the endpoint that consumes the message, but it doesn't need to be available or able to process it at that moment (queues). Both events on a topic and commands on a queue remove temporal coupling. Commands are more coupled than events because you know the consumer, yes.
Using this pattern alongside a create (POST) operation, we must reply with a 202 status code.
But using this pattern alongside an update (PUT) operation, returning 202 looks wrong.
I don't think this pattern can be used alongside an update operation; am I right?
Does the command not just push this same issue to the command handler/consumer? Command processing would still need to make at minimum a database update and then publish a separate event saying it happened?
Really good summary of some variations on the outbox, but I see the durable store as a pattern for longer-term failure of the database, not a 2PC problem. You have not mentioned how input event message consistency is maintained, i.e. there are not 2 but 3 datastore transactions (at least, discounting side effects/APIs): input event, database, output event. There is also the poison transaction/message situation, which also needs a dead-letter-queue pattern.
I suspect there are more similar patterns for other failure modes - is there a summary of all the various failure modes and the group of patterns that need to be implemented together to ensure fully resilient processing?
Another pattern is to roll back the changes to the datastore and restore consistency if you fail to forward the message to the message broker.
And what happens if you fail to roll back? I'm assuming you mean within a transaction, meaning publish then commit the trx. That's the same problem as commit the trx then publish. They're not a single atomic operation.
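To make the point concrete, a rough sketch (hypothetical db and broker clients) of the two orderings and where each one is left inconsistent:

```python
def commit_then_publish(db, broker, event):
    db.commit()            # state is durable...
    broker.publish(event)  # ...but if this fails, the state changed and the event is lost

def publish_then_commit(db, broker, event):
    broker.publish(event)  # subscribers may already be reacting...
    db.commit()            # ...but if this fails, they reacted to something that never happened
```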
Thanks for sharing the Listen to Yourself pattern, Derek. But I don't understand how it can solve the consistency issue when used with the Event Sourcing pattern. Do you mean the publisher (I guess the command handler in the Event Sourcing pattern) can subscribe to events that got stored in EventStore DB and then try to publish another integration event to the message broker? What if that publication fails?
If you're storing events as the source of truth for state, then the events you store are the ones you can subscribe to. This is common for creating projections (read models) with event sourcing; within a boundary, that is. As for publishing externally as an outside/external/integration event, it doesn't solve the problem.
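A rough sketch of that within-a-boundary use, assuming a hypothetical subscription that yields the stored events in order; the projection just folds them into a read model:

```python
def project_orders(subscription, read_db):
    # The same events persisted as the source of truth are the ones
    # we subscribe to for building the read model (projection).
    for event in subscription:
        if event["type"] == "OrderPlaced":
            read_db.execute(
                "INSERT INTO orders_view (id, status) VALUES (?, 'placed')",
                (event["order_id"],),
            )
        elif event["type"] == "OrderShipped":
            read_db.execute(
                "UPDATE orders_view SET status = 'shipped' WHERE id = ?",
                (event["order_id"],),
            )
```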
@@CodeOpinion Would you recommend using the traditional outbox pattern for publishing integration events that will be consumed by other microservices?
There is a flaw with the listen-to-yourself approach. Assume you are processing orders: if you write the order first to a message broker and then listen to it to write it to the database, how are you going to handle duplicates?
Each order has a unique id.
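So the write can be made idempotent by keying on that id; a minimal sketch (sqlite for brevity, hypothetical message shape):

```python
import sqlite3

def handle_order_message(conn: sqlite3.Connection, message: dict):
    # The order id is the primary key, so a duplicate delivery hits the
    # unique constraint and is dropped instead of creating a second row.
    try:
        with conn:
            conn.execute(
                "INSERT INTO orders (id, payload) VALUES (?, ?)",
                (message["order_id"], message["payload"]),
            )
    except sqlite3.IntegrityError:
        pass  # already processed this order id: ignore the duplicate
```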
Great video as always! Thanks!
Great video! I was also thinking of alternatives to the Outbox Pattern.
What do you think of binlog consumers as an alternative solution?
My thought:
I was wondering, do we really need to manage outbox tables? Why can't we simplify the design by implementing binlog consumers on the actual table?
Example: when we are performing CRUD operations on, let's say, the ORDER table, we can set up consumers of that table's binlog, which guarantees that whenever there is an UPSERT on the ORDER table we also publish an event to the subscribers.
Let me know your thoughts.
The issue with CDC, unless you're creating an outbox table, is that the captured changes often don't define the business intent that occurred. I mention that here: ua-cam.com/video/YusVrd9rHJU/v-deo.html
Thanks for getting back. As I mentioned, why should we even bother to create an outbox table? We can set up CDC on the actual table instead, which will have the business details.
I don't truly get the purpose of an outbox table if we can simply do log tailing on the actual (business) table.
It would be a violation of a fundamental principle of microservice architecture, which is that microservices shouldn't share a database.
In the outbox pattern, the code responsible for reading records from the outbox table and publishing events to the message bus either
- is implemented in the producing service (for example runs in a different thread)
- or is a separate process/service (however in this case, it's still considered to be part of the 'logical microservice', even though the microservice now consists of two physical services)
The consuming side is not aware of the outbox pattern whatsoever; it just listens for the events.
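A rough sketch of that publisher side (whether it runs as a background thread or a separate process), assuming an outbox table with a processed_at column and a hypothetical broker.publish() client:

```python
import time

def run_outbox_relay(conn, broker, batch_size=50):
    # Poll the outbox for unpublished records, publish them, then mark them processed.
    # A crash between publish and update means a resend, so delivery is at-least-once.
    while True:
        rows = conn.execute(
            "SELECT id, type, payload FROM outbox "
            "WHERE processed_at IS NULL ORDER BY id LIMIT ?",
            (batch_size,),
        ).fetchall()
        for record_id, event_type, payload in rows:
            broker.publish(event_type, payload)  # hypothetical broker client
            with conn:
                conn.execute(
                    "UPDATE outbox SET processed_at = CURRENT_TIMESTAMP WHERE id = ?",
                    (record_id,),
                )
        time.sleep(1)  # simple polling interval
```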
No, the outbox table and the original table ought to be in the same database, so there is no sharing of the DB here.
Not too much into this for now, but won't consuming only the log, without having an outbox table, leave you without the information about whether the event was already successfully sent?
Off-topic question: what book is that on top of the DDD one??
Having done wide-scale event-based architecture, I can say this is the most frustrating problem we work with. Outbox has been used widely, but it is clunky. Theoretically the truth is in the event stream, and so that's where the 'real' data should live.
Ya, assuming you're using an event stream as the point of truth (event sourcing). However, many tend to use topics in, say, Kafka as an event stream, in the sense of it being both a point of truth and a means to communicate, which is a mess of coupling.
I've seen the McDonald's approach before, but I fail to see how it really solves the consistency issue. What if DynamoDB is also down? It's still a dual write to two different databases, so we are back to square one... Not to mention that further commands can come in, and when the queue comes back the events will be mixed up in order. Terrible idea.
It doesn't solve the consistency issue. It's likely a "good enough" in their situation. The trade-off with the outbox pattern is that your database is going to have an increase in load. If that's not feasible, using a fallback can be a better option. It also doesn't need to be all or nothing; both can work together. Use an outbox where you absolutely need the consistency, and fallbacks where you're fine with the risk of the fallback itself being down. It can happen, but what's the likelihood of it? What's the risk? Is it worth it?
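A tiny sketch of that fallback idea, with hypothetical broker and fallback_store clients; a separate retry job would replay whatever lands in the fallback:

```python
def publish_with_fallback(broker, fallback_store, event: dict):
    try:
        broker.publish(event)   # primary path: straight to the broker
    except ConnectionError:
        # Broker unavailable: park the event somewhere durable (e.g. a table
        # or a DynamoDB-like store) so a retry job can replay it later.
        fallback_store.save(event)
```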
@@CodeOpinion Yeah, agreed. I suppose CDC would've been a better alternative to show then :)
@@CodeOpinion An alternative approach can be the following. It is possible to "short circuit" the outbox pattern's DB load by saving the "new" event in the outbox and, still within the transaction, doing your message broker call. If that succeeds, you update the event to "processed" (we are still in the same transaction, so no other process can access our newly created event record yet), and therefore your outbox polling won't even see the event. Because we assume that in most cases the message broker call will succeed, you can lower the intensity of the outbox processor and reduce the (heavy?) load on your database.
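Sketched roughly (sqlite for brevity, hypothetical broker.publish()); only the "processed" marking is short-circuited, the outbox row is still written either way:

```python
import sqlite3

def save_and_try_publish(conn: sqlite3.Connection, broker, event_type: str, payload: str):
    with conn:  # single transaction: commits at the end, rolls back on exception
        cur = conn.execute(
            "INSERT INTO outbox (type, payload) VALUES (?, ?)",
            (event_type, payload),
        )
        try:
            broker.publish(event_type, payload)  # attempt the send inside the transaction
            # Mark it processed so the outbox poller normally never sees it.
            conn.execute(
                "UPDATE outbox SET processed_at = CURRENT_TIMESTAMP WHERE id = ?",
                (cur.lastrowid,),
            )
        except Exception:
            # Send failed: leave the row unprocessed; the regular outbox
            # relay will pick it up and retry later.
            pass
```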
@@HansDeMulder But if your transaction commit fails, you have published to the message broker an event that in fact did not happen, so with this solution you return to the listen-to-yourself problem: subscribers can receive an event that relates to nothing and never will.
Thank you Derek for the great explanation on this topic!
I'm implementing an outbox pattern at the moment and I want to ask if it's a good idea to store events in a db not only for processing them but also for auditing and traceability purposes in the future?
You could; depending on what type of DB you're using, that might be feasible (or not). For tracing, you might be better served using something like OpenTelemetry so you can see the entire flow.
super valuable, thank you 🔥
Hey this is great stuff