Rule templates reference

COMMERCIAL FEATURE: Access business service monitoring (BSM), including rule templates, in the packaged Sensu Go distribution. For more information, read Get started with commercial features.

NOTE: Business service monitoring (BSM) is in public preview and is subject to change.

Rule templates are the resources that Sensu applies to service components for business service monitoring. A rule template applies to selections of events defined by a service component’s query. This selection of events is the rule’s input.

The rule template evaluates the selection of events using an ECMAScript 5 (JavaScript) expression specified in the rule template’s eval object and emits a single event based on this evaluation. For example, a rule template’s expression might define the thresholds at which Sensu will consider a service component online, degraded, or offline:

  • Online until fewer than 70% of the service component’s events have a check status of OK.
  • Degraded while 50-69% of the service component’s events have a check status of OK.
  • Offline when fewer than 50% of the service component’s events have a check status of OK.
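
As a rough illustration, the thresholds above could be expressed as a plain JavaScript function. This is a minimal sketch, not a Sensu rule template: the function name classifyComponent, the state strings, and the choice to treat an empty selection as offline are assumptions made for the example.

```javascript
// Hypothetical helper (not part of Sensu): classify a service component
// from the check statuses of its selected events, where 0 means OK.
function classifyComponent(statuses) {
  if (statuses.length === 0) {
    return "offline"; // assumption: no events means the component is offline
  }
  var ok = statuses.filter(function (s) { return s === 0; }).length;
  // Multiply before dividing to avoid floating-point drift at the boundaries.
  var percentOK = (ok * 100) / statuses.length;
  if (percentOK >= 70) {
    return "online";
  }
  if (percentOK >= 50) {
    return "degraded";
  }
  return "offline";
}
```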

The rule template expression can also create arbitrary events.

Rule template example

This example rule template creates an event when the percentage of events with the given status exceeds the given threshold:

---
type: RuleTemplate
api_version: bsm/v1
metadata:
  name: status-threshold
  namespace: default
spec:
  description: Creates an event when the percentage of events with the given status exceed the given threshold
  arguments:
    required:
      - threshold
    properties:
      status:
        type: string
        default: non-zero
        enum:
          - non-zero
          - warning
          - critical
          - unknown
      threshold:
        type: number
        description: Numeric value that triggers an event when surpassed
  eval: |
    var statusMap = {
      "non-zero": 1,
      "warning": 1,
      "critical": 2,
    };
    function main(args) {
      var total = sensu.events.count();
      var num = sensu.events.count(args.status);
      if (num / total <= args.threshold) {
        return;
      }
      return sensu.new_event({
        status: statusMap[args.status],
        /* ... */,
      });
    }
{
  "type": "RuleTemplate",
  "api_version": "bsm/v1",
  "metadata": {
    "name": "status-threshold",
    "namespace": "default"
  },
  "spec": {
    "description": "Creates an event when the percentage of events with the given status exceed the given threshold",
    "arguments": {
      "required": [
        "threshold"
      ],
      "properties": {
        "status": {
          "type": "string",
          "default": "non-zero",
          "enum": [
            "non-zero",
            "warning",
            "critical",
            "unknown"
          ]
        },
        "threshold": {
          "type": "number",
          "description": "Numeric value that triggers an event when surpassed"
        }
      }
    },
    "eval": "var statusMap = {\n  \"non-zero\": 1,\n  \"warning\": 1,\n  \"critical\": 2,\n};\nfunction main(args) {\n  var total = sensu.events.count();\n  var num = sensu.events.count(args.status);\n  if (num / total <= args.threshold) {\n    return;\n  }\n  return sensu.new_event({\n    status: statusMap[args.status],\n    /* ... */,\n  });\n}"
  }
}
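
To experiment with an expression like this outside Sensu, one option is to stub the sensu object in plain JavaScript. The sketch below is a stand-in built on assumptions: makeSensuStub and its status-matching rules are hypothetical, new_event simply echoes its argument, and the stub is passed to main explicitly instead of being a global. The fields elided in the example are left out. Note that the comparison uses num / total, so threshold is treated as a fraction between 0 and 1.

```javascript
// Hypothetical stub of the `sensu` object for local experimentation only;
// the real object is provided by Sensu when it evaluates the expression.
function makeSensuStub(statuses) {
  function matches(status, s) {
    if (status === "non-zero") return s !== 0;
    if (status === "warning") return s === 1;
    if (status === "critical") return s === 2;
    if (status === "unknown") return s > 2;
    return false;
  }
  return {
    events: {
      // With no argument, count every selected event.
      count: function (status) {
        if (status === undefined) return statuses.length;
        return statuses.filter(function (s) { return matches(status, s); }).length;
      }
    },
    new_event: function (fields) { return fields; }
  };
}

var statusMap = { "non-zero": 1, "warning": 1, "critical": 2 };

// Same logic as the example's main(), with the stub passed in explicitly.
function main(sensu, args) {
  var total = sensu.events.count();
  var num = sensu.events.count(args.status);
  if (num / total <= args.threshold) {
    return; // at or below the threshold: no event
  }
  return sensu.new_event({ status: statusMap[args.status] });
}

var stub = makeSensuStub([0, 0, 1, 2]); // two OK, one warning, one critical
var nonZero = main(stub, { status: "non-zero", threshold: 0.25 }); // 2/4 > 0.25
var critical = main(stub, { status: "critical", threshold: 0.25 }); // 1/4 <= 0.25
```

In this run, nonZero is an object with status 1 and critical is undefined, because one critical event out of four does not exceed the 0.25 threshold.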

Apply rule templates to service components

Rule templates are general, parameterized resources that can apply to one or more service components. To apply a rule template to a specific service component:

  • List the rule template name in the service component’s rules.template field.
  • Specify the arguments the rule template requires in the service component’s rules.template.arguments object.

Several service components can use the same rule template with different argument values. For example, a rule template might evaluate one argument, threshold_ok, against the number of events with OK status, as represented by the following logic:

if numberEventsOK < threshold_ok {
  emit warning event
}

You can specify a variety of thresholds as arguments in service component definitions that reference this rule template. One service component might set a threshold_ok value at 10; another service component might set the value at 50. Both service components can make use of the same rule template at the threshold that makes sense for that component.
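
The reuse described above can be sketched in plain JavaScript. This is only an illustration of the idea, not Sensu's API: thresholdOkRule, the shape of the returned object, and the sample data are all hypothetical.

```javascript
// Hypothetical stand-in for one rule template's logic. The same function
// serves every component; only the arguments differ.
function thresholdOkRule(args, eventStatuses) {
  var numberEventsOK = eventStatuses.filter(function (s) {
    return s === 0; // check status 0 means OK
  }).length;
  if (numberEventsOK < args.threshold_ok) {
    return { status: 1, output: "WARNING: only " + numberEventsOK + " events are OK" };
  }
  return null; // threshold satisfied: no warning event
}

// Twenty selected events, all OK.
var statuses = [];
for (var i = 0; i < 20; i++) {
  statuses.push(0);
}

var componentA = thresholdOkRule({ threshold_ok: 10 }, statuses); // null
var componentB = thresholdOkRule({ threshold_ok: 50 }, statuses); // warning event
```

With the same twenty OK events, the component that needs 10 stays quiet while the component that needs 50 gets a warning.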

Service components can reference more than one rule template. Sensu evaluates each rule separately, and each rule produces its own event as output.

Built-in rule template: Aggregate

Sensu’s business service monitoring (BSM) includes a built-in rule template, aggregate, that allows you to treat the results of multiple disparate check executions that run across multiple disparate systems as a single result (event). This built-in rule template is ready to use with your service components.

Reference the rule template name in the rules.template field and configure the arguments in the rules.template.arguments object in your service component resource definitions.

Use the aggregate rule template for services that can be considered healthy as long as a minimum threshold is satisfied. For example, you might set the minimum threshold at 10 web servers with an OK status or 70% of processes running with an OK status.

The aggregate rule template is useful in dynamic environments and environments with some tolerance for failure.

To review the aggregate resource definition, retrieve it with a GET request to the BSM API:

curl -X GET \
http://127.0.0.1:8080/api/enterprise/bsm/v1/namespaces/default/rule-templates/aggregate \
-H "Authorization: Key $SENSU_API_KEY"

The response will include the complete aggregate rule template resource definition:

{
  "type": "RuleTemplate",
  "api_version": "bsm/v1",
  "metadata": {
    "name": "aggregate",
    "namespace": "default"
  },
  "spec": {
    "arguments": {
      "properties": {
        "critical_count": {
          "description": "create an event with a critical status if there the number of critical events is equal to or greater than this count",
          "type": "number"
        },
        "critical_threshold": {
          "description": "create an event with a critical status if the percentage of non-zero events is equal to or greater than this threshold",
          "type": "number"
        },
        "metric_handlers": {
          "default": {},
          "description": "metric handlers to use for produced metrics",
          "items": {
            "type": "string"
          },
          "type": "array"
        },
        "produce_metrics": {
          "default": {},
          "description": "produce metrics from aggregate data and include them in the produced event",
          "type": "boolean"
        },
        "set_metric_annotations": {
          "default": {},
          "description": "annotate the produced event with metric annotations",
          "type": "boolean"
        },
        "warning_count": {
          "description": "create an event with a warning status if there the number of critical events is equal to or greater than this count",
          "type": "number"
        },
        "warning_threshold": {
          "description": "create an event with a warning status if the percentage of non-zero events is equal to or greater than this threshold",
          "type": "number"
        }
      },
      "required": null
    },
    "description": "Monitor a distributed service - aggregate one or more events into a single event. This BSM rule template allows you to treat the results of multiple disparate check executions – executed across multiple disparate systems – as a single event. This template is extremely useful in dynamic environments and/or environments that have a reasonable tolerance for failure. Use this template when a service can be considered healthy as long as a minimum threshold is satisfied (for example, at least 5 healthy web servers? at least 70% of N processes healthy?).",
    "eval": "\nif (events && events.length == 0) {\n    event.check.output = \"WARNING: No events selected for aggregate\n\";\n    event.check.status = 1;\n    return event;\n}\n\nevent.annotations[\"io.sensu.bsm.selected_event_count\"] = events.length;\n\npercentOK = sensu.PercentageBySeverity(\"ok\");\n\nif (!!args[\"produce_metrics\"]) {\n    var handlers = [];\n\n    if (!!args[\"metric_handlers\"]) {\n        handlers = args[\"metric_handlers\"].slice();\n    }\n\n    var ts = Math.floor(new Date().getTime() / 1000);\n\n    event.timestamp = ts;\n\n    var tags = [\n        {\n            name: \"service\",\n            value: event.entity.name\n        },\n        {\n            name: \"entity\",\n            value: event.entity.name\n        },\n        {\n            name: \"check\",\n            value: event.check.name\n        }\n    ];\n\n    event.metrics = sensu.NewMetrics({\n        handlers: handlers,\n        points: [\n            {\n                name: \"percent_non_zero\",\n                timestamp: ts,\n                value: sensu.PercentageBySeverity(\"non-zero\"),\n                tags: tags\n            },\n            {\n                name: \"percent_ok\",\n                timestamp: ts,\n                value: percentOK,\n                tags: tags\n            },\n            {\n                name: \"percent_warning\",\n                timestamp: ts,\n                value: sensu.PercentageBySeverity(\"warning\"),\n                tags: tags\n            },\n            {\n                name: \"percent_critical\",\n                timestamp: ts,\n                value: sensu.PercentageBySeverity(\"critical\"),\n                tags: tags\n            },\n            {\n                name: \"percent_unknown\",\n                timestamp: ts,\n                value: sensu.PercentageBySeverity(\"unknown\"),\n                tags: tags\n            },\n            {\n                name: \"count_non_zero\",\n                
timestamp: ts,\n                value: sensu.CountBySeverity(\"non-zero\"),\n                tags: tags\n            },\n            {\n                name: \"count_ok\",\n                timestamp: ts,\n                value: sensu.CountBySeverity(\"ok\"),\n                tags: tags\n            },\n            {\n                name: \"count_warning\",\n                timestamp: ts,\n                value: sensu.CountBySeverity(\"warning\"),\n                tags: tags\n            },\n            {\n                name: \"count_critical\",\n                timestamp: ts,\n                value: sensu.CountBySeverity(\"critical\"),\n                tags: tags\n            },\n            {\n                name: \"count_unknown\",\n                timestamp: ts,\n                value: sensu.CountBySeverity(\"unknown\"),\n                tags: tags\n            }\n        ]\n    });\n\n    if (!!args[\"set_metric_annotations\"]) {\n        var i = 0;\n\n        while(i < event.metrics.points.length) {\n            event.annotations[\"io.sensu.bsm.selected_event_\" + event.metrics.points[i].name] = event.metrics.points[i].value.toString();\n            i++;\n        }\n    }\n}\n\nif (!!args[\"critical_threshold\"] && percentOK <= args[\"critical_threshold\"]) {\n    event.check.output = \"CRITICAL: Less than \" + args[\"critical_threshold\"].toString() + \"% of selected events are OK (\" + percentOK.toString() + \"%)\n\";\n    event.check.status = 2;\n    return event;\n}\n\nif (!!args[\"warning_threshold\"] && percentOK <= args[\"warning_threshold\"]) {\n    event.check.output = \"WARNING: Less than \" + args[\"warning_threshold\"].toString() + \"% of selected events are OK (\" + percentOK.toString() + \"%)\n\";\n    event.check.status = 1;\n    return event;\n}\n\nif (!!args[\"critical_count\"]) {\n    crit = sensu.CountBySeverity(\"critical\");\n\n    if (crit >= args[\"critical_count\"]) {\n        event.check.output = \"CRITICAL: \" + 
args[\"critical_count\"].toString() + \" or more selected events are in a critical state (\" + crit.toString() + \")\n\";\n        event.check.status = 2;\n        return event;\n    }\n}\n\nif (!!args[\"warning_count\"]) {\n    warn = sensu.CountBySeverity(\"warning\");\n\n    if (warn >= args[\"warning_count\"]) {\n        event.check.output = \"WARNING: \" + args[\"warning_count\"].toString() + \" or more selected events are in a warning state (\" + warn.toString() + \")\n\";\n        event.check.status = 1;\n        return event;\n    }\n}\n\nevent.check.output = \"Everything looks good (\" + percentOK.toString() + \"% OK)\";\nevent.check.status = 0;\n\nreturn event;\n"
  }
}
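
The order of checks in the eval expression above can be easier to follow in condensed form. The sketch below mirrors that order using plain values in place of Sensu's events and sensu objects; the function name aggregateStatus and its parameters are hypothetical, and metric production is omitted. Because the expression tests each argument with !!, an argument set to 0 behaves as if it were unset.

```javascript
// Hypothetical condensation of the aggregate expression's decision order.
// Returns the check status the produced event would carry.
function aggregateStatus(args, selectedCount, percentOK, criticalCount, warningCount) {
  if (selectedCount === 0) {
    return 1; // no events selected: warning
  }
  if (!!args.critical_threshold && percentOK <= args.critical_threshold) {
    return 2; // too few OK events: critical
  }
  if (!!args.warning_threshold && percentOK <= args.warning_threshold) {
    return 1; // fewer OK events than the warning threshold allows
  }
  if (!!args.critical_count && criticalCount >= args.critical_count) {
    return 2; // too many critical events
  }
  if (!!args.warning_count && warningCount >= args.warning_count) {
    return 1; // too many warning events
  }
  return 0; // everything looks good
}

var thresholds = { critical_threshold: 50, warning_threshold: 70 };
var healthy = aggregateStatus(thresholds, 10, 80, 1, 1);  // 0
var degraded = aggregateStatus(thresholds, 10, 65, 2, 1); // 1
var failing = aggregateStatus(thresholds, 10, 50, 4, 1);  // 2
```

Percentage thresholds are evaluated before counts, so a count argument only matters when neither percentage threshold has already produced a result.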

The configuration for a service component that references the aggregate rule template might look like this example:

---
type: ServiceComponent
api_version: bsm/v1
metadata:
  name: web-app
  namespace: default
  created_by: admin
spec:
  cron: ''
  handlers:
  - slack
  interval: 30
  query:
  - type: fieldSelector
    value: event.check.labels.service == applications
  rules:
  - arguments:
      critical_threshold: 50
      metric_handlers:
      - influxdb
      produce_metrics: true
      set_metric_annotations: true
      warning_threshold: 70
    name: crit50-warn70
    template: aggregate
  services:
  - applications
{
  "type": "ServiceComponent",
  "api_version": "bsm/v1",
  "metadata": {
    "name": "web-app",
    "namespace": "default",
    "created_by": "admin"
  },
  "spec": {
    "cron": "",
    "handlers": [
      "slack"
    ],
    "interval": 30,
    "query": [
      {
        "type": "fieldSelector",
        "value": "event.check.labels.service == applications"
      }
    ],
    "rules": [
      {
        "arguments": {
          "critical_threshold": 50,
          "metric_handlers": [
            "influxdb"
          ],
          "produce_metrics": true,
          "set_metric_annotations": true,
          "warning_threshold": 70
        },
        "name": "crit50-warn70",
        "template": "aggregate"
      }
    ],
    "services": [
      "applications"
    ]
  }
}

Rule template specification

Top-level attributes

type
description Top-level attribute that specifies the resource type. For rule template configuration, the type should always be RuleTemplate.
required Required for rule template configuration in wrapped-json or yaml format.
type String
example
type: RuleTemplate
{
  "type": "RuleTemplate"
}
api_version
description Top-level attribute that specifies the Sensu API group and version. For rule template configuration in this version of Sensu, the api_version should always be bsm/v1.
required Required for rule template configuration in wrapped-json or yaml format.
type String
example
api_version: bsm/v1
{
  "api_version": "bsm/v1"
}
metadata
description Top-level scope that contains the rule template’s name and namespace as well as the created_by field.
required true
type Map of key-value pairs
example
metadata:
  name: status-threshold
  namespace: default
  created_by: admin
{
  "metadata": {
    "name": "status-threshold",
    "namespace": "default",
    "created_by": "admin"
  }
}
spec
description Top-level map that includes the rule template configuration spec attributes.
required Required for rule template configuration in wrapped-json or yaml format.
type Map of key-value pairs
example
spec:
  description: Creates an event when the percentage of events with the given status exceed the given threshold
  arguments:
    required:
      - threshold
    properties:
      status:
        type: string
        default: non-zero
        enum:
          - non-zero
          - warning
          - critical
          - unknown
      threshold:
        type: number
        description: Numeric value that triggers an event when surpassed
  eval: |
    var statusMap = {
      "non-zero": 1,
      "warning": 1,
      "critical": 2,
    };
    function main(args) {
      var total = sensu.events.count();
      var num = sensu.events.count(args.status);
      if (num / total <= args.threshold) {
        return;
      }
      return sensu.new_event({
        status: statusMap[args.status],
        /* ... */,
      });
    }
{
  "spec": {
    "description": "Creates an event when the percentage of events with the given status exceed the given threshold",
    "arguments": {
      "required": [
        "threshold"
      ],
      "properties": {
        "status": {
          "type": "string",
          "default": "non-zero",
          "enum": [
            "non-zero",
            "warning",
            "critical",
            "unknown"
          ]
        },
        "threshold": {
          "type": "number",
          "description": "Numeric value that triggers an event when surpassed"
        }
      }
    },
    "eval": "var statusMap = {\n  \"non-zero\": 1,\n  \"warning\": 1,\n  \"critical\": 2,\n};\nfunction main(args) {\n  var total = sensu.events.count();\n  var num = sensu.events.count(args.status);\n  if (num / total <= args.threshold) {\n    return;\n  }\n  return sensu.new_event({\n    status: statusMap[args.status],\n    /* ... */,\n  });\n}"
  }
}

Metadata attributes

name
description Name for the rule template that is used internally by Sensu.
required true
type String
example
name: status-threshold
{
  "name": "status-threshold"
}
namespace
description Sensu RBAC namespace that the rule template belongs to.
required true
type String
example
namespace: default
{
  "namespace": "default"
}
created_by
description Username of the Sensu user who created or last updated the rule template. Sensu automatically populates the created_by field when the rule template is created or updated.
required false
type String
example
created_by: admin
{
  "created_by": "admin"
}

Spec attributes

description
description Plain text description of the rule template’s behavior.
required true
type String
example
description: Creates an event when the percentage of events with the given status exceed the given threshold
{
  "description": "Creates an event when the percentage of events with the given status exceed the given threshold"
}
arguments
description The rule template’s arguments using JSON Schema properties.
required true
type Map of key-value pairs
example
arguments:
  required:
    - threshold
  properties:
    status:
      type: string
      default: non-zero
      enum:
        - non-zero
        - warning
        - critical
        - unknown
    threshold:
      type: number
      description: Numeric value that triggers an event when surpassed
{
  "arguments": {
    "required": [
      "threshold"
    ],
    "properties": {
      "status": {
        "type": "string",
        "default": "non-zero",
        "enum": [
          "non-zero",
          "warning",
          "critical",
          "unknown"
        ]
      },
      "threshold": {
        "type": "number",
        "description": "Numeric value that triggers an event when surpassed"
      }
    }
  }
}
eval
description ECMAScript 5 (JavaScript) expression for the rule template to evaluate.
required true
type String
example
eval: |
    var statusMap = {
      "non-zero": 1,
      "warning": 1,
      "critical": 2,
    };
    function main(args) {
      var total = sensu.events.count();
      var num = sensu.events.count(args.status);
      if (num / total <= args.threshold) {
        return;
      }
      return sensu.new_event({
        status: statusMap[args.status],
        /* ... */,
      });
    }
{
  "eval": "var statusMap = {\n  \"non-zero\": 1,\n  \"warning\": 1,\n  \"critical\": 2,\n};\nfunction main(args) {\n  var total = sensu.events.count();\n  var num = sensu.events.count(args.status);\n  if (num / total <= args.threshold) {\n    return;\n  }\n  return sensu.new_event({\n    status: statusMap[args.status],\n    /* ... */,\n  });\n}"
}

Arguments attributes

required
description List of the arguments that the rule template requires. Each listed argument must be defined in the properties object.
required false
type Array
example
required:
  - threshold
{
  "required": [
    "threshold"
  ]
}

properties
description Properties that define each argument’s behavior, expressed in JSON Schema.
required true
type Map of key-value pairs
example
properties:
  status:
    type: string
    default: non-zero
    enum:
      - non-zero
      - warning
      - critical
      - unknown
  threshold:
    type: number
    description: Numeric value that triggers an event when surpassed
{
  "properties": {
    "status": {
      "type": "string",
      "default": "non-zero",
      "enum": [
        "non-zero",
        "warning",
        "critical",
        "unknown"
      ]
    },
    "threshold": {
      "type": "number",
      "description": "Numeric value that triggers an event when surpassed"
    }
  }
}
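
To make the roles of required and properties concrete, the sketch below resolves a set of supplied arguments against the example schema. How Sensu itself validates arguments is not described here, so treat this as an assumption-based illustration: resolveArguments is a hypothetical helper, and it handles only the required, type, enum, and default keywords for primitive types.

```javascript
// The arguments object from the status-threshold example.
var statusThresholdArguments = {
  required: ["threshold"],
  properties: {
    status: {
      type: "string",
      "default": "non-zero",
      "enum": ["non-zero", "warning", "critical", "unknown"]
    },
    threshold: { type: "number" }
  }
};

// Hypothetical helper (not part of Sensu): apply defaults and report
// problems for a small subset of JSON Schema keywords.
function resolveArguments(schema, supplied) {
  var has = Object.prototype.hasOwnProperty;
  var props = schema.properties || {};
  var resolved = {};
  var errors = [];

  (schema.required || []).forEach(function (name) {
    if (!has.call(supplied, name)) {
      errors.push("missing required argument: " + name);
    }
  });

  Object.keys(props).forEach(function (name) {
    var prop = props[name];
    var value = has.call(supplied, name) ? supplied[name] : prop["default"];
    if (value === undefined) {
      return; // not supplied and no default
    }
    // typeof matches JSON Schema names only for string, number, and boolean.
    if (prop.type && typeof value !== prop.type) {
      errors.push(name + " must be of type " + prop.type);
      return;
    }
    if (prop["enum"] && prop["enum"].indexOf(value) === -1) {
      errors.push(name + " must be one of: " + prop["enum"].join(", "));
      return;
    }
    resolved[name] = value;
  });

  return { args: resolved, errors: errors };
}

var result = resolveArguments(statusThresholdArguments, { threshold: 0.25 });
```

Here result.args contains threshold 0.25 plus the default status of non-zero, and result.errors is empty. Omitting threshold would instead produce a missing required argument error.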