Collect service metrics with Sensu checks

PRO TIP: You can use the HTTP Service Monitoring (Local) integration in the Sensu Catalog to collect service metrics instead of following this guide. Follow the Catalog prompts to configure the Sensu resources you need and start processing your observability data with a few clicks.

Sensu checks are commands or scripts that the Sensu agent executes. Each check outputs data and produces an exit code that indicates a state. If you are unfamiliar with checks, read the checks reference for details and examples. You can also learn how to configure monitoring checks in Monitor server resources.
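
As a rough illustration of that contract, the sketch below prints a status line with Nagios-style performance data and returns an exit code of 0 (OK), 1 (warning), or 2 (critical). The function name check_duration, the example-check label, and the greater-than-or-equal threshold comparison are assumptions made for this example; they are not taken from http-perf or any other Sensu plugin.

```shell
#!/bin/sh
# Hypothetical check logic: map a measured duration in seconds to a
# Nagios-style status line and a check exit code (0 OK, 1 warning, 2 critical).
check_duration() {
  # $1: measured seconds, $2: warning threshold, $3: critical threshold
  awk -v d="$1" -v w="$2" -v c="$3" 'BEGIN {
    status = "OK"; code = 0
    if (d >= c)      { status = "CRITICAL"; code = 2 }
    else if (d >= w) { status = "WARNING";  code = 1 }
    # Text after the pipe is the performance data a metric format can parse.
    printf "example-check %s: %ss | total_request_duration=%s\n", status, d, d
    exit code
  }'
}

# Example run with a fast response and 1s/2s thresholds.
check_duration 0.001088 1 2
```

The function's return status is the exit code that the agent would report as the check status.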

This guide demonstrates how to use a check to extract service metrics for an NGINX webserver, with output in Nagios Performance Data format.

Requirements

To follow this guide, install the Sensu backend, make sure at least one Sensu agent is running, and install and configure sensuctl.

Before you begin, add the debug handler to your Sensu instance. The check in this guide will use it to write metric events to a file for inspection.

Configure a Sensu entity

Every Sensu agent has a defined set of subscriptions that determine which checks the agent will execute. For an agent to execute a specific check, you must specify the same subscription in the agent configuration and the check definition. To run the NGINX webserver check, you’ll need a Sensu entity with the subscription webserver.
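
The matching rule can be pictured with a small sketch: an agent runs a check when the agent's subscription list and the check's subscription list share at least one name. This is a conceptual model only, not how Sensu schedules checks internally, and the function name shares_subscription is hypothetical.

```shell
#!/bin/sh
# Conceptual model only, not Sensu's scheduler: succeed when the two
# space-separated subscription lists have at least one name in common.
shares_subscription() {
  # $1: agent subscriptions, $2: check subscriptions
  for agent_sub in $1; do
    for check_sub in $2; do
      [ "$agent_sub" = "$check_sub" ] && return 0
    done
  done
  return 1
}

if shares_subscription "webserver entity:sensu-centos" "webserver"; then
  echo "agent will run the check"
fi
```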

To add the webserver subscription to the entity the Sensu agent is observing, first find your agent entity name:

sensuctl entity list

The ID in the response is the name of your entity.

Replace <ENTITY_NAME> with the name of your agent entity in the following sensuctl command. Then, run:

sensuctl entity update <ENTITY_NAME>
  • For Entity Class, press enter.
  • For Subscriptions, type webserver and press enter.

Confirm both Sensu services are running:

systemctl status sensu-backend && systemctl status sensu-agent

The response should indicate active (running) for both the Sensu backend and agent.

Register the dynamic runtime asset

The check that collects service metrics relies on a command from the sensu/http-checks dynamic runtime asset. Use sensuctl to register the sensu/http-checks dynamic runtime asset:

sensuctl asset add sensu/http-checks:0.5.0 -r http-checks

The response will indicate that the asset was added:

fetching bonsai asset: sensu/http-checks:0.5.0
added asset: sensu/http-checks:0.5.0

You have successfully added the Sensu asset resource, but the asset will not get downloaded until
it's invoked by another Sensu resource (ex. check). To add this runtime asset to the appropriate
resource, populate the "runtime_assets" field with ["http-checks"].

This example uses the -r (rename) flag to specify a shorter name for the dynamic runtime asset: http-checks.

Use sensuctl to confirm that the http-checks dynamic runtime asset is ready to use:

sensuctl asset list

The sensuctl response should list http-checks:

     Name                                       URL                                    Hash    
────────────── ───────────────────────────────────────────────────────────────────── ──────────
  http-checks   //assets.bonsai.sensu.io/.../http-checks_0.5.0_windows_amd64.tar.gz   52ae075  
  http-checks   //assets.bonsai.sensu.io/.../http-checks_0.5.0_darwin_amd64.tar.gz    72d0f15  
  http-checks   //assets.bonsai.sensu.io/.../http-checks_0.5.0_linux_armv7.tar.gz     ef18587  
  http-checks   //assets.bonsai.sensu.io/.../http-checks_0.5.0_linux_arm64.tar.gz     3504ddf  
  http-checks   //assets.bonsai.sensu.io/.../http-checks_0.5.0_linux_386.tar.gz       60b8883  
  http-checks   //assets.bonsai.sensu.io/.../http-checks_0.5.0_linux_amd64.tar.gz     1db73a8  

Install and configure NGINX

The webserver check requires a running NGINX service, so you’ll need to install and configure NGINX.

NOTE: You may need to install and update the EPEL repository with sudo yum install epel-release and sudo yum update before you can install NGINX.

Install NGINX:

sudo yum install nginx

Enable and start the NGINX service:

sudo systemctl enable nginx && sudo systemctl start nginx

Verify that NGINX is serving webpages:

curl -sI http://localhost

The response should include HTTP/1.1 200 OK to indicate that NGINX processed your request as expected:

HTTP/1.1 200 OK
Server: nginx/1.20.1
Date: Tue, 02 Nov 2021 20:15:40 GMT
Content-Type: text/html
Content-Length: 4833
Last-Modified: Fri, 16 May 2014 15:12:48 GMT
Connection: keep-alive
ETag: "xxxxxxxx-xxxx"
Accept-Ranges: bytes

With your NGINX service running, you can configure the check to collect service metrics.

Create a check to collect metrics

Create the collect-metrics check with a command that uses the http-perf performance check from the http-checks dynamic runtime asset:

sensuctl check create collect-metrics \
--command 'http-perf --url http://localhost --warning 1s --critical 2s' \
--handlers debug \
--interval 15 \
--subscriptions webserver \
--runtime-assets http-checks \
--output-metric-format nagios_perfdata

This example check specifies a 15-second interval for collecting metrics, a subscription to ensure the check will run on any entity that includes the webserver subscription, the name of the dynamic runtime asset the check needs to work properly, and the nagios_perfdata output metric format.

You should receive a confirmation response: Created.

To view the check resource you just created, run sensuctl check info with either the yaml or the wrapped-json format:

sensuctl check info collect-metrics --format yaml
sensuctl check info collect-metrics --format wrapped-json

The sensuctl response will list the complete check resource definition, shown here first in YAML and then in wrapped JSON:

---
type: CheckConfig
api_version: core/v2
metadata:
  created_by: admin
  name: collect-metrics
  namespace: default
spec:
  check_hooks: null
  command: http-perf --url http://localhost --warning 1s --critical 2s
  env_vars: null
  handlers:
  - debug
  high_flap_threshold: 0
  interval: 15
  low_flap_threshold: 0
  output_metric_format: nagios_perfdata
  output_metric_handlers: null
  pipelines: []
  proxy_entity_name: ""
  publish: true
  round_robin: false
  runtime_assets:
  - http-checks
  secrets: null
  stdin: false
  subdue: null
  subscriptions:
  - webserver
  timeout: 0
  ttl: 0

{
  "type": "CheckConfig",
  "api_version": "core/v2",
  "metadata": {
    "created_by": "admin",
    "name": "collect-metrics",
    "namespace": "default"
  },
  "spec": {
    "check_hooks": null,
    "command": "http-perf --url http://localhost --warning 1s --critical 2s",
    "env_vars": null,
    "handlers": [
      "debug"
    ],
    "high_flap_threshold": 0,
    "interval": 15,
    "low_flap_threshold": 0,
    "output_metric_format": "nagios_perfdata",
    "output_metric_handlers": null,
    "pipelines": [
    ],
    "proxy_entity_name": "",
    "publish": true,
    "round_robin": false,
    "runtime_assets": [
      "http-checks"
    ],
    "secrets": null,
    "stdin": false,
    "subdue": null,
    "subscriptions": [
      "webserver"
    ],
    "timeout": 0,
    "ttl": 0
  }
}

Confirm that your check is collecting metrics

If the check is collecting metrics correctly according to its output_metric_format, the metrics will be extracted in Sensu metric format and passed to the observability pipeline for handling. The Sensu agent will log errors if it cannot parse the check output.
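
To picture what extraction means for nagios_perfdata output, the sketch below splits the text after the pipe character into name and value pairs, using the http-perf output from this guide as sample input. It is a simplified illustration and not Sensu's parser: the function name perfdata_points is hypothetical, and it only handles plain label=value pairs, dropping any optional threshold fields after a semicolon.

```shell
#!/bin/sh
# Illustrative only: split Nagios-perfdata-style check output into
# "name value" lines. This is not the parser the Sensu agent uses.
perfdata_points() {
  # $1: full check output, for example "check OK | a=1, b=2"
  printf '%s\n' "$1" |
    awk -F'|' 'NF > 1 {
      # Pairs may be separated by spaces, commas, or both.
      n = split($2, pairs, /[ ,]+/)
      for (i = 1; i <= n; i++) {
        if (split(pairs[i], kv, "=") == 2) {
          sub(/;.*/, "", kv[2])   # drop optional ;warn;crit;min;max fields
          print kv[1], kv[2]
        }
      }
    }'
}

perfdata_points 'http-perf OK: 0.001088s | dns_duration=0.000216, tls_handshake_duration=0.000000, connect_duration=0.000140, first_byte_duration=0.001071, total_request_duration=0.001088'
```

Each printed pair corresponds to one entry that you would expect to find in the event's metrics points.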

To confirm that the check extracted metrics, inspect the event passed to the debug handler in the debug-event.json file:

cat /var/log/sensu/debug-event.json

If the Sensu agent correctly ingested the metrics, the event will include a top-level metrics section with a populated points array, similar to this example:

{
  "check": {
    "command": "http-perf --url http://localhost --warning 1s --critical 2s",
    "handlers": [
      "debug"
    ],
    "high_flap_threshold": 0,
    "interval": 15,
    "low_flap_threshold": 0,
    "publish": true,
    "runtime_assets": [
      "http-checks"
    ],
    "subscriptions": [
      "webserver"
    ],
    "proxy_entity_name": "",
    "check_hooks": null,
    "stdin": false,
    "subdue": null,
    "ttl": 0,
    "timeout": 0,
    "round_robin": false,
    "duration": 0.011235081,
    "executed": 1635886845,
    "history": [
      {
        "status": 0,
        "executed": 1635886785
      },
      {
        "status": 0,
        "executed": 1635886800
      },
      {
        "status": 0,
        "executed": 1635886815
      },
      {
        "status": 0,
        "executed": 1635886830
      },
      {
        "status": 0,
        "executed": 1635886845
      }
    ],
    "issued": 1635886845,
    "output": "http-perf OK: 0.001088s | dns_duration=0.000216, tls_handshake_duration=0.000000, connect_duration=0.000140, first_byte_duration=0.001071, total_request_duration=0.001088\n",
    "state": "passing",
    "status": 0,
    "total_state_change": 0,
    "last_ok": 1635886845,
    "occurrences": 5,
    "occurrences_watermark": 5,
    "output_metric_format": "nagios_perfdata",
    "output_metric_handlers": null,
    "env_vars": null,
    "metadata": {
      "name": "collect-metrics",
      "namespace": "default"
    },
    "secrets": null,
    "is_silenced": false,
    "scheduler": "memory",
    "processed_by": "sensu-centos",
    "pipelines": []
  },
  "metrics": {
    "handlers": null,
    "points": [
      {
        "name": "dns_duration",
        "value": 0.000216,
        "timestamp": 1635886845,
        "tags": null
      },
      {
        "name": "tls_handshake_duration",
        "value": 0,
        "timestamp": 1635886845,
        "tags": null
      },
      {
        "name": "connect_duration",
        "value": 0.00014,
        "timestamp": 1635886845,
        "tags": null
      },
      {
        "name": "first_byte_duration",
        "value": 0.001071,
        "timestamp": 1635886845,
        "tags": null
      },
      {
        "name": "total_request_duration",
        "value": 0.001088,
        "timestamp": 1635886845,
        "tags": null
      }
    ]
  },
  "metadata": {
    "namespace": "default"
  },
  "id": "d19ee7f9-8cc5-447b-9059-895e89e14667",
  "sequence": 146,
  "pipelines": null,
  "timestamp": 1635886845,
  "entity": {
    "entity_class": "agent",
    "system": {
      "hostname": "sensu-centos",
      "os": "linux",
      "platform": "centos",
      "platform_family": "rhel",
      "platform_version": "7.9.2009",
      "network": {
        "interfaces": [
          {
            "name": "lo",
            "addresses": [
              "127.0.0.1/8",
              "::1/128"
            ]
          },
          {
            "name": "eth0",
            "mac": "08:00:27:8b:c9:3f",
            "addresses": [
              "10.0.2.15/24",
              "fe80::20b8:8cea:fa4:2e57/64"
            ]
          },
          {
            "name": "eth1",
            "mac": "08:00:27:40:ab:31",
            "addresses": [
              "192.168.200.95/24",
              "fe80::a00:27ff:fe40:ab31/64"
            ]
          }
        ]
      },
      "arch": "amd64",
      "libc_type": "glibc",
      "vm_system": "vbox",
      "vm_role": "guest",
      "cloud_provider": "",
      "processes": null
    },
    "subscriptions": [
      "webserver",
      "entity:sensu-centos"
    ],
    "last_seen": 1635886845,
    "deregister": false,
    "deregistration": {},
    "user": "agent",
    "redact": [
      "password",
      "passwd",
      "pass",
      "api_key",
      "api_token",
      "access_key",
      "secret_key",
      "private_key",
      "secret"
    ],
    "metadata": {
      "name": "sensu-centos",
      "namespace": "default"
    },
    "sensu_agent_version": "6.5.4"
  }
}
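
Because the full event is long, it can help to print only the metric points. The sketch below assumes jq is installed, which is not one of this guide's requirements, and the function name list_metric_points is hypothetical. It reads the debug handler's file path used above unless you pass a different path.

```shell
#!/bin/sh
# Assumes jq is installed; jq is not part of this guide's requirements.
# Prints one "name value timestamp" line per metric point in an event file.
list_metric_points() {
  event_file="${1:-/var/log/sensu/debug-event.json}"
  # The trailing "?" keeps jq quiet when the event has no metrics section.
  jq -r '.metrics.points[]? | "\(.name) \(.value) \(.timestamp)"' "$event_file"
}
```

Running list_metric_points with no argument reads /var/log/sensu/debug-event.json. If it prints nothing, the event did not contain any extracted metrics.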

What’s next

Now that you know how to extract metrics from check output, learn to use a metrics handler to populate service and time-series metrics in InfluxDB or send data to Sumo Logic.

Read Monitor server resources with checks to learn how to monitor an NGINX webserver rather than collect metrics. You can also learn to use Sensu to collect Prometheus metrics.

Learn more about the Sensu resources you created in this guide:

The events reference includes more information about the metrics section and metrics points array.

Visit Bonsai, the Sensu asset index, for more information about the sensu/http-checks dynamic runtime asset’s capabilities.