Handling Failures in Message Driven Architecture

  • Published Jul 24, 2024
  • Many great libraries help to add resilience and fault tolerance by handling failures in a message driven architecture. However, it's not just as simple as adding retries, timeouts, circuit breakers, etc., globally to all network calls. Many implications are specific to the context of the request being processed. And in many cases, it's not solely a technical concern but rather it's a business concern.
    📝 Blog: codeopinion.com
    👋 Twitter: / codeopinion
    ✨ LinkedIn: / dcomartin
    0:00 Intro
    1:09 Immediate Retry
    2:05 Exponential Backoff
    4:12 Deadletter
    4:55 Circuit Breaker
    6:47 Cascading Failures
    #softwarearchitecture #messagedrivenarchitecture #eventdrivenarchitecture
  • Science & Technology
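
The first three chapters (immediate retry, exponential backoff, dead letter) can be combined into one small consumer loop. The following is a minimal sketch, not the code from the video: `handle_with_retries`, `dead_letters`, `max_attempts`, and `base_delay` are hypothetical names, and the in-memory list only stands in for a real dead letter queue.

```python
import time
from typing import Callable, List


def handle_with_retries(
    message: dict,
    handler: Callable[[dict], None],
    dead_letters: List[dict],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try the handler up to max_attempts times, then dead-letter the message."""
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return True
        except Exception as error:  # a real consumer would retry only transient errors
            if attempt == max_attempts:
                # Out of attempts: park the message (hypothetical in-memory DLQ).
                dead_letters.append({"message": message, "error": repr(error)})
                return False
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    return False
```

Note that the loop blocks the consumer while it sleeps, which is exactly the hold-up on other messages that the description warns about; whether that is acceptable depends on the message being handled.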

COMMENTS • 39

  • @varshard0
    @varshard0 3 years ago +7

    Funny enough, I presented message driven architecture in my company's knowledge sharing session 2 days ago. Had you uploaded the video sooner, I could've just shown your video and eaten popcorn.

  • @andersjuul8310
    @andersjuul8310 3 years ago

    Nice! This kicks off the day beautifully here in Denmark :)
    Especially: emphasizing business involvement in how we should handle errors, and the hold-up caused by retries while handling a specific message.
    Please, please keep it up! I love it! :)

  • @voduyquang4778
    @voduyquang4778 10 months ago

    I really want to say thank you. Your videos and the accompanying documents are very clear and professional, and the topics are very relevant nowadays. Big thanks to you, please keep it up!!!

  • @cbaxtermusic
    @cbaxtermusic 2 years ago

    You have the best, most consumable videos on very complex topics. I love this channel. I'm shocked you are not at 1mil followers yet.

    • @CodeOpinion
      @CodeOpinion  2 years ago

      I appreciate that! Thanks 👍

  • @manan5
    @manan5 3 years ago

    Glad I found your channel. Really advanced concepts in such a nice and crisp video.

    • @CodeOpinion
      @CodeOpinion  3 years ago

      Hopefully they are helpful.

    • @manan5
      @manan5 3 years ago

      @CodeOpinion Indeed they are. This level of content is not easy to find. Thanks again :)

  • @glaydersen
    @glaydersen 3 years ago +1

    Really helpful, as always! Thanks for the video!

  • @sathyajithps013
    @sathyajithps013 3 years ago

    I was literally sitting down to create two queues for processing purchases in my ongoing project and this video dropped. What timing! I always thought of just retrying, or burying the message at the back of the queue, or just creating a new queue for failed messages. Excellent video once again. 👌

    • @CodeOpinion
      @CodeOpinion  3 years ago +2

      It's really just understanding the implications of whatever you choose to do with failures. There are consequences, sometimes not immediately understood. Hopefully that's the message that came across.

    • @sathyajithps013
      @sathyajithps013 3 years ago

      @CodeOpinion Yep, definitely. Take a look at the requirements/business rules and implement the solution. Keep an eye on changing requirements and on whether the queue->consumer process is working as expected, and refactor if necessary, right?

  • @yamirtomas
    @yamirtomas 3 years ago

    Jesus, man, you just keep making super interesting videos! Keep it going! And thanks!

  • @edenr1988
    @edenr1988 3 years ago

    Very clear, I like the way you explain, keep it up 👍

  • @3sviat
    @3sviat 3 years ago +1

    Thank you for your videos!
    I like the idea of a loosely coupled monolith, but I think in the real world too few projects start with this approach. Could you share your thoughts on strategies for moving away from a coupled monolith when it is live in production? Have you had such cases in your practice?

  • @vinylwarmth
    @vinylwarmth 3 years ago

    Appreciate this video, thanks. Doing a lot more messaging in my latest role, so I'm finding your videos extremely useful 👍
    I notice you use Kafka in many of your videos. Have you got a blog post/video on why or how you'd choose between Kafka/Service Bus/Event Grid? I'd be interested to learn about that.

    • @CodeOpinion
      @CodeOpinion  3 years ago +1

      I have not, but it's a good topic. I'll try to cover that in an upcoming video.

  • @bruno.arruda
    @bruno.arruda 3 years ago

    Nice video!
    Do you have any vision or draft of what a distributed circuit breaker could look like?
    Another service sitting between the multiple instances and the external services? Or is there a framework/library already doing such a thing?
    Thanks!

    • @CodeOpinion
      @CodeOpinion  3 years ago +1

      github.com/Polly-Contrib/Polly.Contrib.AzureFunctions.CircuitBreaker
      This is the only one I was made aware of. I don't know of any specific libraries, nor have I used any. I've always had to implement it myself for both circuit breakers and failover, usually using a shared cache or by toggling a feature flag.
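
The shared-cache approach mentioned in this reply could look roughly like the sketch below. It is an illustration under assumptions, not the author's implementation: `SharedCircuitBreaker`, `CircuitOpenError`, and the key names are hypothetical, and a plain dict stands in for the shared cache. A real shared cache would also need atomic increments, which this sketch does not attempt.

```python
import time
from typing import Any, Callable, MutableMapping


class CircuitOpenError(Exception):
    """Raised when the circuit is open and the call is skipped."""


class SharedCircuitBreaker:
    """Circuit breaker whose state lives in a store shared by all instances."""

    def __init__(
        self,
        store: MutableMapping,
        name: str,
        failure_threshold: int = 3,
        open_seconds: float = 30.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self.store = store  # stand-in for a shared cache
        self.failures_key = f"circuit:{name}:failures"
        self.opened_key = f"circuit:{name}:opened_at"
        self.failure_threshold = failure_threshold
        self.open_seconds = open_seconds
        self.clock = clock

    def call(self, operation: Callable[[], Any]) -> Any:
        opened_at = self.store.get(self.opened_key)
        if opened_at is not None and self.clock() - opened_at < self.open_seconds:
            raise CircuitOpenError("circuit is open; skipping call")
        try:
            # Either the circuit is closed, or the open period has elapsed
            # and this is a trial call.
            result = operation()
        except Exception:
            failures = self.store.get(self.failures_key, 0) + 1
            self.store[self.failures_key] = failures
            # Open (or re-open after a failed trial call) at the threshold.
            if failures >= self.failure_threshold:
                self.store[self.opened_key] = self.clock()
            raise
        # Success closes the circuit for every instance sharing the store.
        self.store[self.failures_key] = 0
        self.store.pop(self.opened_key, None)
        return result
```

Because the failure count and the open timestamp live in the store rather than in the object, a second consumer instance constructed with the same `store` and `name` stops calling the failing service as soon as the first instance trips the breaker.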

  • @allinvanguard
    @allinvanguard 3 years ago +1

    Great video! I have a question: what is the usual process for continuing with dead-lettered messages? Are these messages rescheduled by the queue after X amount of time? Would the consumer actively try to re-query these messages once it knows that it is now capable of processing them?

    • @CodeOpinion
      @CodeOpinion  3 years ago +2

      It really is up to you how advanced you make it. For me, it starts with analyzing what the failure is and why it occurred, to determine whether you're handling that particular fault correctly. If you have reporting/metrics on them, you can be alerted when it's occurring so you can take action. For example, if some external service you're using is down and you have failed messages, maybe you turn off the functionality that's producing those messages in the first place. Another example: maybe there's maintenance on some service that takes it down for an hour every weekend. Once you understand that, you might go to the dead letter queue and build something that retries those messages X hours after failure. Again, there's no global solution; it's usually per message/consumer and what it's actually doing. Hope this answers your question.
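
The "retry them after X hours" idea from this reply can be sketched as a small redrive job. Everything here is an assumption made for illustration: the entry shape (`message`, `reason`, `failed_at`), the `republish` callback, and the `ServiceUnavailable` reason name are hypothetical, and a list stands in for the dead letter queue.

```python
from datetime import datetime, timedelta
from typing import Any, Callable, Dict, FrozenSet, List


def redrive_dead_letters(
    dead_letters: List[Dict[str, Any]],
    republish: Callable[[Any], None],
    now: datetime,
    retry_after: timedelta = timedelta(hours=1),
    retryable_reasons: FrozenSet[str] = frozenset({"ServiceUnavailable"}),
) -> int:
    """Republish dead letters that failed for a retryable reason long enough ago."""
    remaining = []
    redriven = 0
    for entry in dead_letters:
        old_enough = now - entry["failed_at"] >= retry_after
        if entry["reason"] in retryable_reasons and old_enough:
            republish(entry["message"])
            redriven += 1
        else:
            # Keep everything else parked for manual analysis.
            remaining.append(entry)
    dead_letters[:] = remaining
    return redriven
```

Filtering on the failure reason keeps the job per message type and per fault, in line with the point that there is no global solution: a validation failure stays parked no matter how old it is.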

  • @carlosmauriciorebolledosie6286

    Hi, great content on this channel. I would like to know if there is a demo in code about how to build a dead letter queue mechanism, if I become a member could I find this demo?

    • @CodeOpinion
      @CodeOpinion  1 year ago

      Use a messaging library that provides you with an API or a way to configure retries and failures (DLQ). If you're in the .NET space, check out NServiceBus, MassTransit, Brighter or Jasper.

  • @tony-ma
    @tony-ma 2 years ago

    Can I spin up multiple projection services (the same projection, but multiple instances in Kubernetes for parallel processing) for one persistent subscription from the eventstore?

    • @CodeOpinion
      @CodeOpinion  2 years ago +1

      I believe you're referring to the competing consumers pattern, that I talk about in this video: ua-cam.com/video/xv6Ljbq6me8/v-deo.html
      If you need to process events from a stream in order, then no you won't be able to use the competing consumers pattern but rather you would want to use a catch up subscription.

  • @neha6000
    @neha6000 1 year ago

    Hi, where can I see a practical implementation?

  • @thanhvo2092
    @thanhvo2092 3 years ago

    Like before watching, thank you!

  • @varshard0
    @varshard0 3 years ago

    A trick I used for an exponential backoff that doesn't hang up your worker node is using multiple queues as a sort of pseudo dead letter queues.
    1. Create a few queues and set the working queue as their dead-letter queue.
    2. Set a different message TTL on each queue, one per back-off period. When a message expires, it is requeued in the working queue.
    3. When a message fails on a consumer node, the node increments the number of retries in the message header, then sends the message to the matching pseudo dead letter queue, i.e. retries = 1 goes to queue 1.
    4. After the message is published to a pseudo dead letter queue, the consumer ACKs the message on the working queue to remove it.
    You need multiple queues with different TTLs instead of setting a TTL on each message, because a per-message TTL is only taken into account when the message is at the head of a queue. So a message with a short TTL has to wait for an earlier message with a longer TTL to expire first: a head-of-line blocking issue.
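
The four steps above can be sketched as plain data plus a routing function. The comment does not name a broker; this sketch assumes RabbitMQ, whose queue arguments `x-message-ttl`, `x-dead-letter-exchange`, and `x-dead-letter-routing-key` provide this behavior. The queue names, the back-off periods, and the `retries` header are hypothetical, and no broker client is used, so declaring the queues and publishing are left to the caller.

```python
from typing import Dict, Optional, Sequence

WORK_QUEUE = "work"                 # hypothetical working queue name
BACKOFF_SECONDS = (5, 30, 300)      # hypothetical back-off period per retry queue


def retry_queue_declarations(
    work_queue: str = WORK_QUEUE,
    backoffs: Sequence[int] = BACKOFF_SECONDS,
) -> Dict[str, Dict[str, object]]:
    """Steps 1 and 2: one retry queue per back-off period, expiring into the work queue."""
    return {
        f"{work_queue}.retry.{index}": {
            "x-message-ttl": seconds * 1000,          # TTL is in milliseconds
            "x-dead-letter-exchange": "",             # default exchange
            "x-dead-letter-routing-key": work_queue,  # expired messages return here
        }
        for index, seconds in enumerate(backoffs, start=1)
    }


def next_destination(
    headers: Dict[str, int],
    work_queue: str = WORK_QUEUE,
    backoffs: Sequence[int] = BACKOFF_SECONDS,
) -> Optional[str]:
    """Step 3: bump the retry counter and pick the matching retry queue.

    Returns None when the retries are used up, so the caller can park the
    message somewhere permanent instead.
    """
    retries = headers.get("retries", 0) + 1
    headers["retries"] = retries
    if retries > len(backoffs):
        return None
    return f"{work_queue}.retry.{retries}"
```

A consumer would publish the failed message to the returned queue and only then ACK the original on the working queue (step 4), so the message is never held by neither queue.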

    • @CodeOpinion
      @CodeOpinion  3 years ago

      So you're hinting at what I mentioned at the end, which is the value of messaging libraries!

  • @arikshapiro4056
    @arikshapiro4056 2 years ago

    What would you do if you have a client that sent the initial command that started the flow asynchronously, but needs to know if a failure happened? Would you have the client consume from the dead letter queue to check if there's a matching message? Or would you send a MessageFailed message to the message broker yourself and have the client consume those?

    • @CodeOpinion
      @CodeOpinion  2 years ago +1

      I think it really depends on the command and the context of if the user/someone needs to be notified of the failure. You sure could have a retry limit and publish a type of failure event to notify the producer (or any consumer). Also if the producer has some way of verifying/querying if a message has been processed then that is also an option. If it hasn't been processed and has exceeded a certain SLA or expected timeframe then it could show something appropriate to the User.
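
The first option in this reply, a retry limit followed by a published failure event, might look like the sketch below. The `Command` and `MessageFailed` types, the `publish` and `requeue` callbacks, and the attempt counter carried on the command are all hypothetical; `MessageFailed` borrows its name from the question above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Command:
    correlation_id: str
    payload: Dict[str, Any]
    attempts: int = 0


@dataclass
class MessageFailed:
    correlation_id: str
    reason: str


def consume(
    command: Command,
    handler: Callable[[Dict[str, Any]], None],
    publish: Callable[[MessageFailed], None],
    requeue: Callable[[Command], None],
    max_attempts: int = 3,
) -> None:
    """Handle a command; after max_attempts failures, publish a failure event."""
    try:
        handler(command.payload)
    except Exception as error:
        command.attempts += 1
        if command.attempts >= max_attempts:
            # Retry limit reached: notify the producer (or any other consumer).
            publish(MessageFailed(command.correlation_id, str(error)))
        else:
            requeue(command)
```

The correlation id is what lets the original client match the failure event to the command it sent.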

    • @arikshapiro4056
      @arikshapiro4056 2 years ago

      @CodeOpinion Thank you for your answer! Love your videos