Zabbix global event correlation explained

  • Published 26 Dec 2024

COMMENTS • 8

  • @EinTypOhneHandle
    @EinTypOhneHandle 4 months ago

    Thanks for the video - quick question regarding aggressiveness of that rule: Let's say I configure this the way you have, so that if my router goes down, my node won't show any problems. Does this affect all problems? So let's say my router raises a low priority problem because the temperature is too high. Let's say after this, unrelated, one of the connected nodes goes down and raises an unavailable problem. Would this problem get suppressed until my router resolves the temperature problem? If so, this would be a huge drawback. Would there be a way around this? Maybe only be able to apply this for problems of priority high+?

  • @d.howardcolesjr4862
    @d.howardcolesjr4862 1 year ago

    Hey, this is templatable, hehe. Good solution. The location-based tag is very good.

  • @clecimarfernandes4992
    @clecimarfernandes4992 8 months ago

    Hi Aigars, thank you for this video. I'm trying to do this and add one more tag to the event correlation: a tag for an event, for example a tag pair where the tag is problem1, so that the event correlation runs only when problem1 occurs. Is that possible? I'm running some tests.

  • @dombayan
    @dombayan 1 year ago

    Hi Aigars, thanks very much, very informative video. To me the only drawback will be the constantly raising/closing triggers of non-router hosts. I tried it in my network, but those endless raising/closing alerts while the router is down spoil the trigger count statistics... Do you think there is a way to suppress the dependent triggers? (without using the host dependency ;) )

    • @aigarskadikis
      @aigarskadikis 1 year ago

      It's actually recommended to use a cron job that erases all symptoms from the database once in a while. The SQL commands are in the video description. Usually, when we use this cron job in production, we add a "clock" constraint so that only records older than 14 days are deleted.
      Let us know if that improves the view of the statistics.
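
The "clock" constraint mentioned in the reply above can be illustrated with a minimal Python sketch. The actual SQL commands are in the video description and are not reproduced here, so the table name `symptom_events` and the helper `purge_old_rows` are hypothetical, and an in-memory SQLite database stands in for the real Zabbix database. The only assumption carried over from Zabbix is that `clock` holds a Unix timestamp in seconds.

```python
import sqlite3
import time

RETENTION_DAYS = 14  # retention window mentioned in the reply


def purge_old_rows(conn, now=None, retention_days=RETENTION_DAYS):
    """Delete rows whose clock is older than the retention window.

    `symptom_events` is a hypothetical stand-in table, not a Zabbix table.
    Returns the number of deleted rows.
    """
    now = int(time.time()) if now is None else now
    cutoff = now - retention_days * 86400  # 86400 seconds per day
    cur = conn.execute("DELETE FROM symptom_events WHERE clock < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


def demo():
    """Build a tiny in-memory table and purge it with a fixed 'now'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE symptom_events ("
        "eventid INTEGER PRIMARY KEY, clock INTEGER NOT NULL)"
    )
    now = 1_700_000_000  # fixed timestamp so the demo is deterministic
    rows = [
        (1, now - 20 * 86400),  # 20 days old
        (2, now - 15 * 86400),  # 15 days old
        (3, now - 13 * 86400),  # 13 days old
        (4, now),               # fresh
    ]
    conn.executemany("INSERT INTO symptom_events VALUES (?, ?)", rows)
    deleted = purge_old_rows(conn, now=now)
    remaining = [
        r[0] for r in conn.execute(
            "SELECT eventid FROM symptom_events ORDER BY eventid"
        )
    ]
    return deleted, remaining
```

Calling `demo()` removes the two rows older than 14 days and keeps the other two. In a real deployment the same cutoff logic would go into the WHERE clause of the statements from the video description, run from cron.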

  • @stevenleander6101
    @stevenleander6101 1 year ago

    Hi, great video. I tried the setup as per your description, and I have an alert action that sends an SMS after 3 minutes of downtime, allowing the correlation to do its magic before the SMS is sent. All fine, I only receive 1 SMS with the problem. But when all the units come online again, I receive resolved SMSes for all nodes. Is there a way to avoid this behavior?
    Btw, thanks for a great Summit 2023, I enjoyed it very much.

  • @d.howardcolesjr4862
    @d.howardcolesjr4862 1 year ago

    The template can have a macro called "{$PARENT_LOCATION}" with a default value of "site1", and then tags such as "Location" with a value of {$PARENT_LOCATION}. That macro can be overridden at the closest template linked to the node, or on the node itself. The router's template and macros can be the same. So "Location" would have value of {$PARENT_NAME} in the template tag, with the macro having a default value. Then you could override that macro at the router (or parent) itself. I've tested this and it's working like a charm.
    So, as I said, this solution can work with templates, with minimal manual intervention.
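
The override behaviour described in this comment can be sketched in Python as a simplified model of user-macro precedence (host-level value first, then the closest linked template, then templates further away). This is not Zabbix code: `resolve_macro`, `expand_tag_value`, and the sample data are hypothetical names for illustration, and context macros and global macros are left out.

```python
import re

# Simplified pattern for user macros such as {$PARENT_LOCATION}
MACRO_RE = re.compile(r"\{\$[A-Z0-9_.]+\}")


def resolve_macro(name, host_macros, template_chain):
    """Return the macro value using host-first precedence.

    template_chain is a list of dicts ordered from the template closest
    to the host to the one farthest away. Returns None if undefined.
    """
    if name in host_macros:
        return host_macros[name]
    for template_macros in template_chain:
        if name in template_macros:
            return template_macros[name]
    return None


def expand_tag_value(value, host_macros, template_chain):
    """Replace every user macro in a tag value; leave unknown macros as-is."""
    def substitute(match):
        resolved = resolve_macro(match.group(0), host_macros, template_chain)
        return match.group(0) if resolved is None else resolved
    return MACRO_RE.sub(substitute, value)


# Template-level default, as in the comment: {$PARENT_LOCATION} = "site1"
base_template = {"{$PARENT_LOCATION}": "site1"}

# A node that inherits the default value for its "Location" tag
node_default = expand_tag_value("{$PARENT_LOCATION}", {}, [base_template])

# A node (or router) that overrides the macro at host level
node_site2 = expand_tag_value(
    "{$PARENT_LOCATION}", {"{$PARENT_LOCATION}": "site2"}, [base_template]
)
```

Here `node_default` resolves to the template default and `node_site2` to the host-level override, so a router and its nodes end up with matching "Location" tag values as long as they share the same macro value.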

    • @d.howardcolesjr4862
      @d.howardcolesjr4862 1 year ago

      Correction: 'So "Location" would have value of {$PARENT_NAME} in the template tag, with the macro having a default value.' should be: 'So "Location" would have value of {$PARENT_LOCATION} in the template tag, with the macro having a default value.' DOH.