Checks
- Check commands
- Check result specification
- Check scheduling
- Proxy checks
- Check token substitution
- Check hooks
- Check specification
- Examples
Checks work with Sensu agents to produce monitoring events automatically. You can use checks to monitor server resources, services, and application health as well as collect and analyze metrics. Read the guide to monitoring server resources to get started. You can discover, download, and share Sensu check assets using Bonsai, the Sensu asset index.
Check commands
Each Sensu check definition specifies a command and the schedule on which it is executed. Check commands are executables that are run by a Sensu agent.
A command may include command line arguments for controlling the behavior of the command executable. Many common checks are available as assets from Bonsai and support command line arguments so different check definitions can use the same executable.
Sensu advises against requiring root privileges to execute check commands or scripts. The Sensu user is not permitted to kill timed out processes invoked by the root user, which could result in zombie processes.
How and where are check commands executed?
All check commands are executed by Sensu agents as the sensu user. Commands must be executable files that are discoverable on the Sensu agent system (for example, installed in a system $PATH directory).
Check result specification
Although Sensu agents attempt to execute any command defined for a check, successful processing of check results requires adherence to a simple specification.
- Result data is output to STDOUT or STDERR
  - For service checks, this output is typically a human-readable message.
  - For metric checks, this output contains the measurements gathered by the check.
- Exit status code indicates state
  - 0 indicates “OK”
  - 1 indicates “WARNING”
  - 2 indicates “CRITICAL”
  - Exit status codes other than 0, 1, or 2 indicate an “UNKNOWN” or custom status
PRO TIP: Those familiar with the Nagios monitoring system may recognize this specification, as it is the same one used by Nagios plugins. As a result, Nagios plugins can be used with Sensu without any modification.
At every execution of a check command – regardless of success or failure – the Sensu agent publishes the check’s result for eventual handling by the event processor (the Sensu backend).
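As a rough illustration of the specification above, the following Python sketch maps a measured value to an exit status and a human-readable message. The function names, the usage percentage, and the threshold arguments are hypothetical and are not part of Sensu or any existing plugin.

```python
# Exit status codes from the check result specification.
OK, WARNING, CRITICAL = 0, 1, 2


def evaluate(value, warning, critical):
    """Map a measured value to an (exit status, message) pair."""
    if value >= critical:
        return CRITICAL, f"CRITICAL: usage is {value}%"
    if value >= warning:
        return WARNING, f"WARNING: usage is {value}%"
    return OK, f"OK: usage is {value}%"


def run_check(value, warning, critical):
    """Print the result data to STDOUT and return the exit status code."""
    status, message = evaluate(value, warning, critical)
    print(message)
    return status


# A real check script would end with something like:
#   sys.exit(run_check(measured_value, 75, 90))
status = run_check(80, 75, 90)
```

Because the agent reads only the output and the exit status, a script that follows this pattern can be written in any language.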
Check scheduling
Checks are scheduled by the Sensu backend, which publishes check execution requests to entities via a publish-subscribe model.
Subscriptions
Checks have a defined set of subscriptions, transport topics to which the Sensu backend publishes check requests. Sensu entities become subscribers to these topics (called subscriptions) via their individual subscriptions attribute. In practice, subscriptions typically correspond to a specific role or responsibility (for example, a webserver or database).
Subscriptions are powerful primitives in the monitoring context because they allow you to effectively monitor for specific behaviors or characteristics corresponding to the function being provided by a particular system. For example, disk capacity thresholds might be more important (or at least different) on a database server as opposed to a webserver; conversely, CPU or memory usage thresholds might be more important on a caching system than on a file server. Subscriptions also allow you to configure check requests for an entire group or subgroup of systems rather than requiring a traditional one-to-one mapping.
To configure subscriptions for a check, use the subscriptions attribute to specify an array of one or more subscription names.
Sensu schedules checks once per interval for each agent with a matching subscription.
For example, if three agents are configured with the system subscription, a check configured with the system subscription results in three monitoring events per interval: one check execution per agent per interval.
In order for Sensu to execute a check, the check definition must include a subscription that matches the subscription of at least one Sensu agent.
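The matching rule described above can be sketched in a few lines of Python. This is a simplified model of the publish-subscribe behavior, not Sensu's implementation; the agent names and the helper function are hypothetical.

```python
def subscribed_agents(check_subscriptions, agents):
    """Return the names of agents that share at least one subscription with the check."""
    wanted = set(check_subscriptions)
    return sorted(name for name, subs in agents.items() if wanted & set(subs))


# Hypothetical agents and the values of their subscriptions attributes.
agents = {
    "agent-a": ["system"],
    "agent-b": ["system", "webserver"],
    "agent-c": ["system"],
    "agent-d": ["database"],
}

# Three agents subscribe to "system", so the check runs three times per interval.
executions = subscribed_agents(["system"], agents)
print(executions)
```

A check whose subscriptions match no agent yields an empty list in this model, which corresponds to a check that is never executed.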
Round-robin checks
By default, Sensu schedules checks once per interval for each agent with a matching subscription: one check execution per agent per interval.
Sensu also supports deduplicated check execution when configured with the round_robin check attribute.
For checks with round_robin set to true, Sensu executes the check once per interval, cycling through the available agents alphabetically according to agent name.
For example, for three agents configured with the system subscription (agents A, B, and C), a check configured with the system subscription and round_robin set to true results in one monitoring event per interval, with the agent creating the event following the pattern A -> B -> C -> A -> B -> C for the first six intervals.
In the diagram above, the standard check is executed by agents A, B, and C every 60 seconds, while the round-robin check cycles through the available agents, resulting in each agent executing the check every 180 seconds.
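The rotation described in this section can be modeled with a short Python sketch. It only illustrates the alphabetical cycling and the resulting per-agent period; the function name is hypothetical, and the sketch does not account for agents joining or leaving.

```python
from itertools import cycle, islice


def round_robin_order(agent_names, intervals):
    """Return which agent executes the check in each of the first `intervals` intervals."""
    ordered = sorted(agent_names)  # alphabetical by agent name
    return list(islice(cycle(ordered), intervals))


order = round_robin_order(["C", "A", "B"], 6)
print(" -> ".join(order))

# With a 60-second interval and three agents, each agent runs the check
# once every 60 * 3 seconds.
per_agent_period = 60 * 3
print(per_agent_period)
```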
To use check ttl and round_robin together, your check configuration must also specify a proxy_entity_name. If you do not specify a proxy_entity_name when using check ttl and round_robin together, your check will stop executing.
PRO TIP: Use round robin to distribute check execution workload across multiple agents when using proxy checks.
Scheduling
You can schedule checks using the interval, cron, and publish attributes.
Sensu requires that checks include either an interval attribute (interval scheduling) or a cron attribute (cron scheduling).
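If you generate check definitions programmatically, a small pre-flight test for this requirement can catch mistakes early. The following Python sketch is an assumption-laden helper, not Sensu's own validation: it only tests that one of the two attributes is present and non-empty, and it does not parse the cron expression.

```python
def has_schedule(spec):
    """Return True if the check spec sets an interval or a cron schedule."""
    interval = spec.get("interval")
    cron = spec.get("cron")
    has_interval = isinstance(interval, int) and interval > 0
    has_cron = isinstance(cron, str) and cron.strip() != ""
    return has_interval or has_cron


# Hypothetical spec maps, shaped like the spec scope of a check definition.
interval_spec = {"command": "check-cpu.sh -w 75 -c 90", "interval": 60, "publish": True}
cron_spec = {"command": "check-cpu.sh -w 75 -c 90", "cron": "* * * * *", "publish": True}
unscheduled_spec = {"command": "check-cpu.sh -w 75 -c 90", "publish": True}

print(has_schedule(interval_spec), has_schedule(cron_spec), has_schedule(unscheduled_spec))
```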
Interval scheduling
You can schedule a check to be executed at regular intervals using the interval and publish check attributes.
For example, to schedule a check to execute every 60 seconds, set the interval attribute to 60 and the publish attribute to true.
NOTE: When creating an interval check, Sensu calculates an initial offset to splay the check’s first scheduled request. This helps to balance the load of both the backend and the agent, and may result in a delay before initial check execution.
Example interval check
type: CheckConfig
api_version: core/v2
metadata:
  name: interval_check
  namespace: default
spec:
  command: check-cpu.sh -w 75 -c 90
  handlers:
  - slack
  interval: 60
  publish: true
  subscriptions:
  - system

{
  "type": "CheckConfig",
  "api_version": "core/v2",
  "metadata": {
    "name": "interval_check",
    "namespace": "default"
  },
  "spec": {
    "command": "check-cpu.sh -w 75 -c 90",
    "subscriptions": ["system"],
    "handlers": ["slack"],
    "interval": 60,
    "publish": true
  }
}
Cron scheduling
You can also schedule checks using cron syntax.
For example, to schedule a check to execute once a minute at the start of the minute, set the cron attribute to * * * * * and the publish attribute to true.
Example cron check
type: CheckConfig
api_version: core/v2
metadata:
  name: cron_check
  namespace: default
spec:
  command: check-cpu.sh -w 75 -c 90
  cron: '* * * * *'
  handlers:
  - slack
  publish: true
  subscriptions:
  - system

{
  "type": "CheckConfig",
  "api_version": "core/v2",
  "metadata": {
    "name": "cron_check",
    "namespace": "default"
  },
  "spec": {
    "command": "check-cpu.sh -w 75 -c 90",
    "subscriptions": ["system"],
    "handlers": ["slack"],
    "cron": "* * * * *",
    "publish": true
  }
}
Ad-hoc scheduling
In addition to automatic execution, you can create checks to be scheduled manually using the checks API.
To create a check with ad-hoc scheduling, set the publish attribute to false in addition to an interval or cron schedule.
Example ad-hoc check
type: CheckConfig
api_version: core/v2
metadata:
  name: ad_hoc_check
  namespace: default
spec:
  command: check-cpu.sh -w 75 -c 90
  handlers:
  - slack
  interval: 60
  publish: false
  subscriptions:
  - system

{
  "type": "CheckConfig",
  "api_version": "core/v2",
  "metadata": {
    "name": "ad_hoc_check",
    "namespace": "default"
  },
  "spec": {
    "command": "check-cpu.sh -w 75 -c 90",
    "subscriptions": ["system"],
    "handlers": ["slack"],
    "interval": 60,
    "publish": false
  }
}
Proxy checks
Sensu supports running proxy checks where the results are considered to be for an entity that isn’t actually the one executing the check, regardless of whether that entity is a Sensu agent entity or a proxy entity.
Proxy entities allow Sensu to monitor external resources on systems or devices where a Sensu agent cannot be installed, like a network switch or a website.
You can create a proxy check using the proxy_entity_name attribute or the proxy_requests attributes.
Using a proxy check to monitor a proxy entity
When executing checks that include a proxy_entity_name, Sensu agents report the resulting events under the specified proxy entity instead of the agent entity.
If the proxy entity doesn’t exist, Sensu creates the proxy entity when the event is received by the backend.
To avoid duplicate events, we recommend using the round_robin attribute with proxy checks.
Example proxy check using a proxy_entity_name
The following proxy check runs every 60 seconds, cycling through the agents with the proxy subscription alphabetically according to the agent name, for the proxy entity sensu-site.
type: CheckConfig
api_version: core/v2
metadata:
  name: proxy_check
  namespace: default
spec:
  command: http_check.sh https://sensu.io
  handlers:
  - slack
  interval: 60
  proxy_entity_name: sensu-site
  publish: true
  round_robin: true
  subscriptions:
  - proxy

{
  "type": "CheckConfig",
  "api_version": "core/v2",
  "metadata": {
    "name": "proxy_check",
    "namespace": "default"
  },
  "spec": {
    "command": "http_check.sh https://sensu.io",
    "subscriptions": ["proxy"],
    "handlers": ["slack"],
    "interval": 60,
    "publish": true,
    "round_robin": true,
    "proxy_entity_name": "sensu-site"
  }
}
Using a proxy check to monitor multiple proxy entities
The proxy_requests check attributes allow Sensu to run a check for each entity that matches the definitions specified in the entity_attributes, resulting in monitoring events that represent each matching proxy entity.
The entity attributes must match exactly as stated; no variables or directives have any special meaning, but you can still use Sensu query expressions to perform more complicated filtering on the available value, such as finding entities with particular subscriptions.
The proxy_requests attributes are a great way to monitor multiple entities using a single check definition when combined with token substitution.
Since checks including proxy_requests attributes need to be executed for each matching entity, we recommend using the round_robin attribute to distribute the check execution workload evenly across your Sensu agents.
Example proxy check using proxy_requests
The following proxy check runs every 60 seconds, cycling through the agents with the proxy subscription alphabetically according to the agent name, for all existing proxy entities with the custom label proxy_type set to website.
This check uses token substitution to import the value of the custom entity label url to complete the check command.
See the entity reference for information about using custom labels.
type: CheckConfig
api_version: core/v2
metadata:
  name: proxy_check_proxy_requests
  namespace: default
spec:
  command: http_check.sh {{ .labels.url }}
  handlers:
  - slack
  interval: 60
  proxy_requests:
    entity_attributes:
    - entity.labels.proxy_type == 'website'
  publish: true
  round_robin: true
  subscriptions:
  - proxy

{
  "type": "CheckConfig",
  "api_version": "core/v2",
  "metadata": {
    "name": "proxy_check_proxy_requests",
    "namespace": "default"
  },
  "spec": {
    "command": "http_check.sh {{ .labels.url }}",
    "subscriptions": ["proxy"],
    "handlers": ["slack"],
    "interval": 60,
    "publish": true,
    "proxy_requests": {
      "entity_attributes": [
        "entity.labels.proxy_type == 'website'"
      ]
    },
    "round_robin": true
  }
}
Fine-tuning proxy check scheduling with splay
Sensu supports distributing proxy check executions across an interval using the splay and splay_coverage attributes.
For example, if we assume that the proxy_check_proxy_requests check in the example above matches three proxy entities, we’d expect to see a burst of three events every 60 seconds.
If we add the splay attribute (set to true) and the splay_coverage attribute (set to 90) to the proxy_requests scope, Sensu distributes the three check executions over 90% of the 60-second interval, resulting in three events splayed evenly across a 54-second period.
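The arithmetic behind the splay window can be checked with a short Python sketch. The window calculation follows the description above; the assumption that executions start at the beginning of the window and are spaced at equal steps is ours, and the function name is hypothetical.

```python
def splay_offsets(interval, splay_coverage, entity_count):
    """Return (window, offsets) in seconds for evenly splayed executions.

    entity_count must be a positive integer.
    """
    window = interval * splay_coverage / 100  # portion of the interval used for splaying
    step = window / entity_count              # assumed equal spacing between executions
    return window, [i * step for i in range(entity_count)]


window, offsets = splay_offsets(interval=60, splay_coverage=90, entity_count=3)
print(window, offsets)
```

For the three-entity example, this gives a 54-second window with executions starting 18 seconds apart under the equal-spacing assumption.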
Check token substitution
Sensu check definitions may include attributes that you want to override on an entity-by-entity basis. For example, check commands, which may include command line arguments for controlling the behavior of the check command, may benefit from entity-specific thresholds. Sensu check tokens are check definition placeholders that the Sensu agent replaces with the corresponding entity definition attribute values (including custom attributes).
Learn how to use check tokens with the Sensu tokens reference documentation.
NOTE: Check tokens are processed before check execution, so token substitutions do not apply to check data delivered via the local agent socket input.
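To make the idea of token substitution concrete, the following Python sketch replaces tokens of the form {{ .path.to.value }} in a command string with values from an entity. It is a simplified stand-in written for illustration, not the agent's implementation; the regular expression, the function name, and the entity map are assumptions, and features documented in the tokens reference are not modeled.

```python
import re

# Matches tokens such as {{ .labels.url }} and captures the dotted path.
TOKEN = re.compile(r"\{\{\s*\.([\w.]+)\s*\}\}")


def substitute_tokens(command, entity):
    """Replace each token in the command with the value found in the entity map."""

    def lookup(match):
        value = entity
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError if the entity lacks the attribute
        return str(value)

    return TOKEN.sub(lookup, command)


# Hypothetical proxy entity with custom labels.
entity = {"name": "sensu-site", "labels": {"proxy_type": "website", "url": "https://sensu.io"}}
command = substitute_tokens("http_check.sh {{ .labels.url }}", entity)
print(command)
```

In this sketch a missing attribute raises an error rather than running the command with an empty value.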
Check hooks
Check hooks are commands run by the Sensu agent in response to the result of check command execution. The Sensu agent executes the appropriate configured hook command, depending on the check execution status (for example, 0, 1, or 2).
Learn how to use check hooks with the Sensu hooks reference documentation.
Check specification
Top-level attributes
type | |
---|---|
description | Top-level attribute specifying the sensuctl create resource type. Checks should always be of type CheckConfig . |
required | Required for check definitions in wrapped-json or yaml format for use with sensuctl create . |
type | String |
example |
|
api_version | |
---|---|
description | Top-level attribute specifying the Sensu API group and version. For checks in Sensu backend version 5.4, this attribute should always be core/v2 . |
required | Required for check definitions in wrapped-json or yaml format for use with sensuctl create . |
type | String |
example |
|
metadata | |
---|---|
description | Top-level collection of metadata about the check, including the name and namespace as well as custom labels and annotations . The metadata map is always at the top level of the check definition. This means that in wrapped-json and yaml formats, the metadata scope occurs outside the spec scope. See the metadata attributes reference for details. |
required | Required for check definitions in wrapped-json or yaml format for use with sensuctl create . |
type | Map of key-value pairs |
example |
|
spec | |
---|---|
description | Top-level map that includes the check spec attributes. |
required | Required for check definitions in wrapped-json or yaml format for use with sensuctl create . |
type | Map of key-value pairs |
example |
|
Spec attributes
command | |
---|---|
description | The check command to be executed. |
required | true |
type | String |
example |
|
subscriptions | |
---|---|
description | An array of Sensu entity subscriptions that check requests will be sent to. The array cannot be empty and its items must each be a string. |
required | true |
type | Array |
example |
|
handlers | |
---|---|
description | An array of Sensu event handlers (names) to use for events created by the check. Each array item must be a string. |
required | false |
type | Array |
example |
|
interval | |
---|---|
description | How often the check is executed, in seconds |
required | true (unless cron is configured) |
type | Integer |
example |
|
cron | |
---|---|
description | When the check should be executed, using cron syntax or these predefined schedules. |
required | true (unless interval is configured) |
type | String |
example |
|
publish | |
---|---|
description | Whether check requests are published for the check. |
required | false |
default | false |
type | Boolean |
example |
|
timeout | |
---|---|
description | The check execution duration timeout in seconds (hard stop). |
required | false |
type | Integer |
example |
|
ttl | |
---|---|
description | The time to live (TTL) in seconds until check results are considered stale. If an agent stops publishing results for the check, and the TTL expires, an event will be created for the agent’s entity. The check ttl must be greater than the check interval and should allow enough time for the check execution and result processing to complete. For example, for a check that has an interval of 60 (seconds) and a timeout of 30 (seconds), the appropriate ttl is at least 90 (seconds). To use check ttl and round_robin together, your check configuration must also specify a proxy_entity_name. If you do not specify a proxy_entity_name when using check ttl and round_robin together, your check will stop executing. NOTE: Adding TTLs to checks adds overhead, so use the ttl attribute sparingly. |
required | false |
type | Integer |
example |
|
stdin | |
---|---|
description | Whether the Sensu agent writes JSON serialized Sensu entity and check data to the command process’ STDIN. The command must expect the JSON data via STDIN, read it, and close STDIN. This attribute cannot be used with existing Sensu check plugins or Nagios plugins because the Sensu agent will wait indefinitely for the check process to read and close STDIN. |
required | false |
type | Boolean |
default | false |
example |
|
low_flap_threshold | |
---|---|
description | The flap detection low threshold (% state change) for the check. Sensu uses the same flap detection algorithm as Nagios. |
required | false |
type | Integer |
example |
|
high_flap_threshold | |
---|---|
description | The flap detection high threshold (% state change) for the check. Sensu uses the same flap detection algorithm as Nagios. |
required | true (if low_flap_threshold is configured) |
type | Integer |
example |
|
runtime_assets | |
---|---|
description | An array of Sensu assets (names), required at runtime for the execution of the command |
required | false |
type | Array |
example |
|
check_hooks | |
---|---|
description | An array of check response types with respective arrays of Sensu hook names. Sensu hooks are commands run by the Sensu agent in response to the result of the check command execution. Hooks are executed, in order of precedence, based on their severity type: 1 to 255 , ok , warning , critical , unknown , and finally non-zero . |
required | false |
type | Array |
example |
|
proxy_entity_name | |
---|---|
description | The entity name, used to create a proxy entity for an external resource (for example, a network switch). |
required | false |
type | String |
validated | \A[\w\.\-]+\z |
example |
|
proxy_requests | |
---|---|
description | Sensu proxy request attributes allow you to assign the check to run for multiple entities according to their entity_attributes . In the example below, the check executes for all entities with entity class proxy and the custom proxy type label website . Proxy requests are a great way to reuse check definitions for a group of entities. For more information, see the proxy requests specification and the guide to monitoring external resources. |
required | false |
type | Hash |
example |
|
silenced | |
---|---|
description | The silences that apply to this check. |
type | Array |
example |
|
env_vars | |
---|---|
description | An array of environment variables to use with command execution. NOTE: To add env_vars to a check, use sensuctl create . |
required | false |
type | Array |
example |
|
output_metric_format | |
---|---|
description | The metric format generated by the check command. Sensu supports the following metric formats: nagios_perfdata (Nagios Performance Data), graphite_plaintext (Graphite Plaintext Protocol), influxdb_line (InfluxDB Line Protocol), and opentsdb_line (OpenTSDB Data Specification). When a check includes an output_metric_format, Sensu will extract the metrics from the check output and add them to the event data in Sensu metric format. For more information about extracting metrics using Sensu, see the guide. |
required | false |
type | String |
example |
|
output_metric_handlers | |
---|---|
description | An array of Sensu handlers to use for events created by the check. Each array item must be a string. output_metric_handlers should be used in place of the handlers attribute if output_metric_format is configured. Metric handlers must be able to process Sensu metric format. For an example, see the Sensu InfluxDB handler. |
required | false |
type | Array |
example |
|
round_robin | |
---|---|
description | When set to true, Sensu executes the check once per interval, cycling through each subscribing agent in turn. See round-robin checks for more information. Use the round_robin attribute with proxy checks to avoid duplicate events and distribute proxy check executions evenly across multiple agents. See proxy checks for more information. To use check ttl and round_robin together, your check configuration must also specify a proxy_entity_name. If you do not specify a proxy_entity_name when using check ttl and round_robin together, your check will stop executing. |
required | false |
type | Boolean |
example |
|
subdue | |
---|---|
description | Check subdues are not yet implemented in Sensu Go. Although the subdue attribute appears in check definitions by default, it is a placeholder and should not be modified. |
example |
|
Metadata attributes
name | |
---|---|
description | A unique string used to identify the check. Check names cannot contain special characters or spaces (validated with Go regex \A[\w\.\-]+\z ). Each check must have a unique name within its namespace. |
required | true |
type | String |
example |
|
namespace | |
---|---|
description | The Sensu RBAC namespace that this check belongs to. |
required | false |
type | String |
default | default |
example |
|
labels | |
---|---|
description | Custom attributes to include with event data, which can be accessed using event filters. In contrast to annotations, you can use labels to create meaningful collections that can be selected with API filtering and sensuctl filtering. Overusing labels can impact Sensu’s internal performance, so we recommend moving complex, non-identifying metadata to annotations. |
required | false |
type | Map of key-value pairs. Keys can contain only letters, numbers, and underscores, but must start with a letter. Values can be any valid UTF-8 string. |
default | null |
example |
|
annotations | |
---|---|
description | Non-identifying metadata to include with event data, which can be accessed using event filters. You can use annotations to add data that’s meaningful to people or external tools interacting with Sensu. In contrast to labels, annotations cannot be used in API filtering or sensuctl filtering and do not impact Sensu’s internal performance. |
required | false |
type | Map of key-value pairs. Keys and values can be any valid UTF-8 string. |
default | null |
example |
|
Proxy requests attributes
entity_attributes | |
---|---|
description | Sensu entity attributes to match entities in the registry, using Sensu query expressions |
required | false |
type | Array |
example |
|
splay | |
---|---|
description | Whether proxy check requests should be splayed, that is, published evenly over a window of time determined by the check interval and a configurable splay coverage percentage. For example, if a check has an interval of 60 seconds and a configured splay coverage of 90%, its proxy check requests would be splayed evenly over a time window of 60 seconds * 90% = 54 seconds, leaving 6 seconds for the last proxy check execution before the next round of proxy check requests for the same check. |
required | false |
type | Boolean |
default | false |
example |
|
splay_coverage | |
---|---|
description | The percentage of the check interval over which Sensu can execute the check for all applicable entities, as defined in the entity attributes. Sensu uses the splay coverage attribute to determine the amount of time check requests can be published over (before the next check interval). |
required | required if splay attribute is set to true |
type | Integer |
example |
|
Check output truncation attributes
max_output_size | |
---|---|
description | Maximum size, in bytes, of stored check outputs. When this attribute is set to a non-zero value, the Sensu backend truncates check outputs larger than this value before storing to etcd. max_output_size does not affect data sent to Sensu filters, mutators, and handlers. |
required | false |
type | Integer |
example |
|
discard_output | |
---|---|
description | Discard check output after extracting metrics. No check output will be sent to the Sensu backend. |
required | false |
type | Boolean |
example |
|
Examples
Minimum recommended check attributes
NOTE: The attribute interval is not required if a valid cron schedule is defined.
type: CheckConfig
api_version: core/v2
metadata:
  name: check_minimum
  namespace: default
spec:
  command: collect.sh
  handlers:
  - slack
  interval: 10
  publish: true
  subscriptions:
  - system

{
  "type": "CheckConfig",
  "api_version": "core/v2",
  "metadata": {
    "namespace": "default",
    "name": "check_minimum"
  },
  "spec": {
    "command": "collect.sh",
    "subscriptions": [
      "system"
    ],
    "handlers": [
      "slack"
    ],
    "interval": 10,
    "publish": true
  }
}
Metric check
type: CheckConfig
api_version: core/v2
metadata:
  annotations:
    slack-channel: '#monitoring'
  labels:
    region: us-west-1
  name: collect-metrics
  namespace: default
spec:
  check_hooks: null
  command: collect.sh
  discard_output: true
  env_vars: null
  handlers: []
  high_flap_threshold: 0
  interval: 10
  low_flap_threshold: 0
  output_metric_format: graphite_plaintext
  output_metric_handlers:
  - influx-db
  proxy_entity_name: ""
  publish: true
  round_robin: false
  runtime_assets: null
  stdin: false
  subscriptions:
  - system
  timeout: 0
  ttl: 0

{
  "type": "CheckConfig",
  "api_version": "core/v2",
  "metadata": {
    "name": "collect-metrics",
    "namespace": "default",
    "labels": {
      "region": "us-west-1"
    },
    "annotations": {
      "slack-channel": "#monitoring"
    }
  },
  "spec": {
    "command": "collect.sh",
    "handlers": [],
    "high_flap_threshold": 0,
    "interval": 10,
    "low_flap_threshold": 0,
    "publish": true,
    "runtime_assets": null,
    "subscriptions": [
      "system"
    ],
    "proxy_entity_name": "",
    "check_hooks": null,
    "stdin": false,
    "ttl": 0,
    "timeout": 0,
    "round_robin": false,
    "output_metric_format": "graphite_plaintext",
    "output_metric_handlers": [
      "influx-db"
    ],
    "env_vars": null,
    "discard_output": true
  }
}