这是indexloc提供的服务,不要输入任何密码
Skip to content
This repository was archived by the owner on Feb 13, 2025. It is now read-only.

Conversation

@Dieterbe
Copy link
Contributor

@Dieterbe Dieterbe commented Feb 4, 2015

also gave the old lookup function a new name to clearly distinguish the two.

@Dieterbe
Copy link
Contributor Author

Dieterbe commented Feb 4, 2015

note: i couldn't get the thing to work where when you load the rule in the Rule tab, it also provides the lookup table. i was debugging the parse tree it uses in conf/conf.go to catch the lookupSeries statement but it didn't appear

@Dieterbe
Copy link
Contributor Author

Dieterbe commented Feb 5, 2015

on @mjibson 's request, here's my alerting rule:

macro dm {
    # "g" means global, "s" means seggregated per tag
    template = dm
    $g_hist = graphiteBand($q_global, $duration, $period, "", 1)
    $g_now = graphite($q_global, $duration, "", "")
    $g_hist_med = median($g_hist)
    $g_now_med = median($g_now)

    $s_hist = graphiteBand($q_segg, $duration, $period ,$s_tag, 1)
    $s_now = graphite($q_segg, $duration, "", $s_tag)
    $s_hist_med = median($s_hist)
    $s_hist_dev= dev($s_hist)
    $s_now_med = median($s_now)
    $s_now_dev = dev($s_now)

    # we add 0.01 so that we can be sure never to divide by zero.
    $s_med_diff = ($s_now_med - $s_hist_med)/($s_hist_dev+0.01)
    $s_med_bad = $s_med_diff < $s_min_med_diff

    $g_strength = $g_now_med/$g_hist_med
    $s_med_issues = sum(t($s_med_bad,""))
    warn = $s_med_issues > 0
    crit = $s_med_issues > $s_max_num_issues_crit || $g_strength < $g_min_strength
    #warnNotification = web
    #critNotification = web
}

lookup logsettings {
    entry log=some-important|logs-here {
        importance = 50
    }
    entry log=* {
    importance = 10
    }
}
alert bi {
    $duration = "60m"
    # TODO 7d once we have the data
    $period = "3d"
    $q_global = "transformNull(sum(stats._sum_vimeo.warehouselogger.*),0)"
    $q_segg = "aliasByNode(transformNull(stats._sum_vimeo.warehouselogger.*,0),3)"
    $s_tag = "log"
    $s_min_med_diff = -4
    $s_max_num_issues_crit = 10
    $g_min_strength = 0.7
    macro = dm
    $importance = lookupSeries($s_now, "logsettings", "importance")
    $tpl_s_tag = log
    $tpl_s_tag_plural = logs
    $tpl_unit = Warehouselog messages
    $tpl_g_ge = sum_vimeo warehouselogger sum by category from -1week||
    $tpl_s_ge = sum_vimeo warehouselogger from -1week||
    $tpl_s_ge_pre = sum_vimeo warehouselogger category=
    $tpl_s_ge_post = from -1week||
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should stay the same, at least for now.

@madelynnblue
Copy link
Contributor

Could you also post the code where you tried to get it working in the rule page? I'd like to see that attempt.

@Dieterbe
Copy link
Contributor Author

Dieterbe commented Feb 5, 2015

is this what you mean?

diff --git a/cmd/bosun/conf/conf.go b/cmd/bosun/conf/conf.go
index e7073d3..2db4bf7 100644
--- a/cmd/bosun/conf/conf.go
+++ b/cmd/bosun/conf/conf.go
@@ -1067,10 +1067,12 @@ func (c *Conf) AlertTemplateStrings() (*AlertTemplateStrings, error) {
                }
                lookups := make(map[string]bool)
                walk := func(n eparse.Node) {
+            fmt.Println(n)
                        eparse.Walk(n, func(n eparse.Node) {
                                switch n := n.(type) {
                                case *eparse.FuncNode:
-                                       if n.Name != "lookup" || len(n.Args) == 0 {
+                    fmt.Println("n", n.Name, n.Args)
+                                       if (n.Name != "lookupDB" && n.Name != "lookupSeries") || len(n.Args) == 0 {
                                                return
                                        }
                                        switch n := n.Args[0].(type) {

@madelynnblue
Copy link
Contributor

Your logic is wrong, should be:

if (n.Name != "lookupDB" || n.Name != "lookupSeries") && len(n.Args) == 0 {

@Dieterbe
Copy link
Contributor Author

Dieterbe commented Feb 6, 2015

ok but printing all the nodes as we iterate over them didn't even show any lookupSeries ?

@Dieterbe
Copy link
Contributor Author

Dieterbe commented Feb 6, 2015

i can confirm that if i change it to if (n.Name != "lookupDB" || n.Name != "lookupSeries") || len(n.Args) == 0 { it also doesn't load the lookuptable into the textarea when i bring up a rule that uses lookupSeries, into the Rule page.

@madelynnblue
Copy link
Contributor

Err, my comment was wrong. I think your logic was right. Hold on, checking.

@Dieterbe
Copy link
Contributor Author

i think my lookupSeries doesn't show up in the parse tree because it's not used in any of the computations that make up the warn or crit outcome
my lookupSeries only serves to set a variable that is used in the template.
and the logic to include lookup tables only seems to inspect the parse tree of the warn/crit expressions.

i tried to work around it by adding an $importance < 0 || clause to warn and crit (and i know this expr won't change the outcome because $importance is always >0).
so with a config like:

macro dm {
    # "g" means global, "s" means seggregated per tag
    template = dm
    $g_hist = graphiteBand("$q_global", "$duration", "$period", "", 1)
    $g_now = graphite("$q_global", "$duration", "", "")
    $g_hist_med = median($g_hist)
    $g_now_med = median($g_now)

    $g_reqs = graphite("$q_global", "1d", "", "")
    $g_stdevs = graphite("stdev($q_global,10)", "1d", "", "")
    $g_stdev_last  = last($g_stdevs)

    $g_hist_dev = dev($g_hist)
    $g_now_dev = dev($g_now)
    # too coarse
    #$g_dev_units = ($g_now_dev * $g_hist_med ) / ( ($g_hist_dev * $g_now_med) + 0.01)
    $g_dev_units = ($g_stdev_last * $g_hist_med ) / ( ($g_hist_dev * $g_now_med) + 0.01)

    $s_hist = graphiteBand("$q_segg", "$duration", "$period" ,"$s_tag", $s_period)
    $s_now = graphite("$q_segg", "$duration", "", "$s_tag")
    $s_hist_med = median($s_hist)
    $s_hist_dev= dev($s_hist)
    $s_now_med = median($s_now)
    $s_now_dev = dev($s_now)

    # we add 0.01 so that we can be sure never to divide by zero.
    $s_med_diff = ($s_now_med - $s_hist_med)/($s_hist_dev+0.01)
    $s_med_bad = $s_med_diff < $s_min_med_diff

    $g_strength = $g_now_med/$g_hist_med
    $s_med_issues = sum(t($s_med_bad,""))
}

macro dm-finish {
    warn = $importance < 0 || $s_med_issues > 0 || ($g_max_erratic_warn != -1 && $g_dev_units > $g_max_erratic_warn)
    crit = $importance < 0 || $s_med_issues > $s_max_num_issues_crit || $g_strength < $g_min_strength || ($g_max_erratic_crit != -1 && $g_dev_units > $g_max_erratic_crit)
}

lookup logsettings {
    entry log=some-important|logs-here {
        importance = 50
    }
    entry log=* {
        importance = 10
    }
}
alert bi {
    $duration = 3h
    $period = 7d
    $q_global = transformNull(sum(stats._sum_vimeo.warehouselogger.*),0)
    $q_segg = aliasByNode(transformNull(stats._sum_vimeo.warehouselogger.*,0),3)
    $s_tag = log
    $s_period = 1
    $s_min_med_diff = -4
    $s_max_num_issues_crit = 10
    $g_min_strength = 0.7
    # if logs suddenly spike down, that's not very helpful to know what's up. so don't do this
    # we do this for request traffic which is a better indicator.
    # bugs in code would cause logs' median to drop, not spikes.
    $g_max_erratic_warn = -1
    $g_max_erratic_crit = -1
    macro = dm
    $importance = lookupSeries($s_now, "logsettings", "importance")
    macro = dm-finish
    $tpl_s_tag = log
    $tpl_s_tag_plural = logs
    $tpl_unit = Warehouselog messages
    $tpl_g_ge = sum_vimeo warehouselogger sum by category from -1week||
    $tpl_s_ge = sum_vimeo warehouselogger from -1week||
    $tpl_s_ge_pre = sum_vimeo warehouselogger category=
    $tpl_s_ge_post = from -1week||
    warnNotification = logalerts
    critNotification = logalerts
}

note i had to split up the macro in 2 macro's because $importance is used in the warn/crit, but to set $importance, via lookupSeries, we need access to a series which is created in the beginning of the macro.

this results in:

panic: interface conversion: parse.Node is *parse.FuncNode, not *parse.StringNode [recovered]
    panic: interface conversion: parse.Node is *parse.FuncNode, not *parse.StringNode [recovered]
    panic: interface conversion: parse.Node is *parse.FuncNode, not *parse.StringNode

goroutine 1 [running]:
bosun.org/cmd/bosun/conf.errRecover(0xc2081a7c88)
    /home/dieter/go/src/bosun.org/cmd/bosun/conf/conf.go:166 +0xbc
bosun.org/cmd/bosun/expr/parse.(*Tree).recover(0xc2081f8c40, 0xc2081a7518)
    /home/dieter/go/src/bosun.org/cmd/bosun/expr/parse/parse.go:177 +0x87
bosun.org/cmd/bosun/conf.func·018(0xc2081e65c0, 0x3, 0x4, 0x0, 0x0, 0x0)
    /home/dieter/go/src/bosun.org/cmd/bosun/conf/conf.go:1204 +0x79
bosun.org/cmd/bosun/expr/parse.(*FuncNode).Tags(0xc2081c4690, 0x0, 0x0, 0x0)
    /home/dieter/go/src/bosun.org/cmd/bosun/expr/parse/node.go:132 +0x79
bosun.org/cmd/bosun/expr/parse.(*BinaryNode).Check(0xc2081aec00, 0x0, 0x0)
    /home/dieter/go/src/bosun.org/cmd/bosun/expr/parse/node.go:270 +0x3b9
bosun.org/cmd/bosun/expr/parse.(*BinaryNode).Check(0xc2081ac5a0, 0x0, 0x0)
    /home/dieter/go/src/bosun.org/cmd/bosun/expr/parse/node.go:264 +0x2e7
bosun.org/cmd/bosun/expr/parse.(*BinaryNode).Check(0xc20804fc80, 0x0, 0x0)
    /home/dieter/go/src/bosun.org/cmd/bosun/expr/parse/node.go:264 +0x2e7
bosun.org/cmd/bosun/expr/parse.(*Tree).parse(0xc2081f8c40)
    /home/dieter/go/src/bosun.org/cmd/bosun/expr/parse/parse.go:230 +0xb5
bosun.org/cmd/bosun/expr/parse.(*Tree).Parse(0xc2081f8c40, 0xc208052c00, 0x3b5, 0xc2081fa3d0, 0x2, 0x2, 0x0, 0x0)
    /home/dieter/go/src/bosun.org/cmd/bosun/expr/parse/parse.go:220 +0xf0
bosun.org/cmd/bosun/expr/parse.Parse(0xc208052c00, 0x3b5, 0xc2081fa3d0, 0x2, 0x2, 0xc2081f8c40, 0x0, 0x0)
    /home/dieter/go/src/bosun.org/cmd/bosun/expr/parse/parse.go:99 +0xbd
bosun.org/cmd/bosun/expr.New(0xc208052c00, 0x3b5, 0xc2081fa3d0, 0x2, 0x2, 0x9f, 0x0, 0x0)
    /home/dieter/go/src/bosun.org/cmd/bosun/expr/expr.go:54 +0x11e
bosun.org/cmd/bosun/conf.(*Conf).NewExpr(0xc20808c1c0, 0xc208052c00, 0x3b5, 0x4, 0x0, 0x0)
    /home/dieter/go/src/bosun.org/cmd/bosun/conf/conf.go:1126 +0xcf
bosun.org/cmd/bosun/conf.(*Conf).loadAlert(0xc20808c1c0, 0xc20804c440)
    /home/dieter/go/src/bosun.org/cmd/bosun/conf/conf.go:764 +0x168e
bosun.org/cmd/bosun/conf.(*Conf).loadSection(0xc20808c1c0, 0xc20804c440)
    /home/dieter/go/src/bosun.org/cmd/bosun/conf/conf.go:455 +0xd8
bosun.org/cmd/bosun/conf.New(0x7fffd808c0e2, 0xa, 0xc20816d500, 0x308d, 0xc20808c1c0, 0x0, 0x0)
    /home/dieter/go/src/bosun.org/cmd/bosun/conf/conf.go:344 +0xa25
bosun.org/cmd/bosun/conf.ParseFile(0x7fffd808c0e2, 0xa, 0xc2080cfdc0, 0x0, 0x0)
    /home/dieter/go/src/bosun.org/cmd/bosun/conf/conf.go:310 +0xdf
main.main()
    /home/dieter/go/src/bosun.org/cmd/bosun/main.go:54 +0x1e6

goroutine 8 [syscall]:
os/signal.loop()
    /usr/lib/go/src/os/signal/signal_unix.go:21 +0x1f
created by os/signal.init·1
    /usr/lib/go/src/os/signal/signal_unix.go:27 +0x35

I did some debugging, so we can see with what output the function breaks. like so:

diff --git a/cmd/bosun/conf/conf.go b/cmd/bosun/conf/conf.go
index e7073d3..e351612 100644
--- a/cmd/bosun/conf/conf.go
+++ b/cmd/bosun/conf/conf.go
@@ -1201,6 +1203,7 @@ func (c *Conf) Funcs() map[string]eparse.Func {
                return results, nil
        }
        tagLookup := func(args []eparse.Node) (eparse.Tags, error) {
+        fmt.Println("taglookup for", args)
                name := args[0].(*eparse.StringNode).Text
                lookup := c.Lookups[name]
                if lookup == nil {

the result is:

taglookup for [graphite("aliasByNode(transformNull(stats._sum_vimeo.warehouselogger.*,0),3)", "3h", "", "log") "logsettings" "importance"]

i'm hoping this is enough for @mjibson to understand what's going on and help me fixing this :)

@Dieterbe
Copy link
Contributor Author

I just realized, the solution we came up with yesterday (including all lookup tables from config into generated config upon rule execution, with the tables specified via web ui overriding them) works for executing the rules without missing lookup table errors, but it still doesn't address the main problem: the lookup tables that are "implicitly" used by the template are not shown in the web ui, which is an omission because i typically want to at least see, and often modify them, along with my alerting rules and template. I think that the proper solution is scanning the template for lookup table names (or perhaps i should say variables that come from lookup tables) so that we can include them.
to your point of dynamically generated names via printf, we could just tell the user "if you do this, you should declare in text in the template which lookup table names this will evaluate to", perhaps in a comment or something. anything we can easily parse.

@Dieterbe
Copy link
Contributor Author

I found the cause of the panic, it was because i was using the same taglookup for both the current lookup function and the new lookupSeries. but interestingly that faulty code path was not being invoked if i only had 1 macro, and everything worked well (including the lookupSeries itself) with the faulty code. (the rule execution, template rendering etc all worked well!)
But everytime (and only when) i use 2 macro's instead of 1, it invokes the buggy code and panics.
So while that is weird and beyond my comprehension, I'm happy after just fixing the code.

the other problem still applies (loading up the rule in the web ui doesn't bring up the needed lookup tables) and the workaround i thought of (including the var from the lookuptable in the warn/crit expression) doesn't actually work though (perhaps because i'm going through a macro). but let's just merge this (the lookupSeries stuff works) and i'll open a new ticket for the problems with lookuptables.

kylebrandt added a commit that referenced this pull request Feb 26, 2015
add lookup function that leverages tags in given series
@kylebrandt kylebrandt merged commit 16faf02 into bosun-monitor:master Feb 26, 2015
Dieterbe added a commit that referenced this pull request May 30, 2016
and clarify lookup()

lookupsSeries was implemented in #690
but not documented yet.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants