All about the RabbitMQ exporter - so you can safely and reliably monitor metrics from RabbitMQ, the widely adopted, lightweight, easy-to-deploy open source message broker that often plays a mission-critical role.
RabbitMQ is a widely adopted open source message broker. A message broker is software that enables applications, systems, and services to communicate with each other and exchange information.
RabbitMQ is lightweight, easy to deploy on premises and in the cloud, and able to handle millions of users and transactions. It can be deployed in distributed and federated configurations to meet high-scale, high-availability requirements. It supports multiple messaging protocols, including AMQP, MQTT, and STOMP.
Since it is a mission-critical piece of software that ties applications together, monitoring is a must. A RabbitMQ exporter is required to collect and expose RabbitMQ metrics. It queries RabbitMQ, scrapes the data, and exposes the metrics on a Kubernetes service endpoint that Prometheus can in turn scrape to ingest the time series data. For monitoring RabbitMQ we use an external Prometheus exporter, which is maintained by the Prometheus community. Once deployed, this exporter scrapes a sizable set of metrics from RabbitMQ and gives users crucial information about the message broker that is difficult to get from RabbitMQ directly.
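Once an exporter is deployed, its endpoint can be checked directly. A quick sketch, assuming the exporter Service is called rabbitmq-exporter and listens on 9419 (the default exporter port used later in this article):

# Forward the exporter port locally, then query it from another terminal
kubectl port-forward svc/rabbitmq-exporter 9419:9419
curl -s http://localhost:9419/metrics | head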
For this setup, we are using the Bitnami RabbitMQ Helm chart to start the cluster.
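For reference, starting a cluster with the Bitnami chart looks roughly like this (release name and namespace are placeholders):

helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm install rabbitmq bitnami/rabbitmq --namespace rabbitmq --create-namespace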
RabbitMQ has a built-in Prometheus plugin as well as an external, community-maintained Prometheus exporter - below we explain how to set up both.
With the latest version of Prometheus (2.33 as of February 2022), there are three ways to set up a Prometheus exporter:
Setting up the native way
This method has been supported by Prometheus since the beginning. To set up an exporter the native way, the Prometheus configuration needs to be updated to add the target.
A sample configuration:
# scrape_config job
- job_name: rabbitmq-staging
  scrape_interval: 45s
  scrape_timeout: 30s
  metrics_path: "/metrics"
  static_configs:
    - targets:
        - <RabbitMQ endpoint>
Setting up Kubernetes service discovery
This method is applicable to Kubernetes deployments only. A default scrape config can be added to the prometheus.yaml file and an annotation added to the exporter service. With this, Prometheus will automatically start scraping data from the services that expose the mentioned path.
prometheus.yaml:
- job_name: kubernetes-services
  scrape_interval: 15s
  scrape_timeout: 10s
  kubernetes_sd_configs:
    - role: service
  relabel_configs:
    # Example relabel to scrape only endpoints that have the
    # prometheus.io/scrape: "true" annotation.
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    # prometheus.io/path: "/scrape/path" annotation.
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
    # prometheus.io/port: "80" annotation.
    - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
      action: replace
      target_label: __address__
      regex: (.+)(?::\d+);(\d+)
      replacement: $1:$2
Exporter service:
annotations:
  prometheus.io/path: /metrics
  prometheus.io/scrape: "true"
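For context, here is a minimal sketch of a Kubernetes Service carrying these annotations; the name, selector, and port are illustrative assumptions (9419 is the exporter port used later in this article):

apiVersion: v1
kind: Service
metadata:
  name: rabbitmq-exporter          # hypothetical service name
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: /metrics
    prometheus.io/port: "9419"
spec:
  selector:
    app: prometheus-rabbitmq-exporter
  ports:
    - name: metrics
      port: 9419
      targetPort: 9419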
Setting up a service monitor
The Prometheus operator supports an automated way of scraping data from exporters by setting up a service monitor Kubernetes object. A sample service monitor for RabbitMQ can be found here. These are the necessary steps:
Step 1
Add/update the Prometheus operator's selectors. By default, the Prometheus operator comes with empty selectors, which will select every service monitor available in the cluster for scraping data.
To check your Prometheus configuration:
kubectl get prometheus -n <namespace> -o yaml
A sample output will look like this.
ruleNamespaceSelector: {}
ruleSelector:
  matchLabels:
    app: kube-prometheus-stack
    release: kps
scrapeInterval: 1m
scrapeTimeout: 10s
securityContext:
  fsGroup: 2000
  runAsGroup: 2000
  runAsNonRoot: true
  runAsUser: 1000
serviceAccountName: kps-kube-prometheus-stack-prometheus
serviceMonitorNamespaceSelector: {}
serviceMonitorSelector:
  matchLabels:
    release: kps
Here you can see that this Prometheus configuration selects all the service monitors with the label release = kps.
So, if you are modifying the default Prometheus operator configuration for service monitor scraping, make sure you use the right labels in your service monitor as well.
Step 2
Add a service monitor and make sure it has a matching label and namespace for the Prometheus service monitor selectors (serviceMonitorNamespaceSelector & serviceMonitorSelector).
Sample configuration:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  annotations:
    meta.helm.sh/release-name: rabbitmq-exporter
    meta.helm.sh/release-namespace: monitor
  creationTimestamp: "2022-04-04T10:22:52Z"
  generation: 1
  labels:
    app: prometheus-rabbitmq-exporter
    app.kubernetes.io/managed-by: Helm
    chart: prometheus-rabbitmq-exporter-1.1.0
    heritage: Helm
    release: kps
  name: rabbitmq-exporter-prometheus-rabbitmq-exporter
  namespace: monitor
  resourceVersion: "86677099"
  uid: 55943299-a8ed-4553-9cdb-cc784176aea8
spec:
  endpoints:
    - interval: 15s
      port: rabbitmq-exporter
  selector:
    matchLabels:
      app: prometheus-rabbitmq-exporter
      release: rabbitmq-exporter
Here you can see that the service monitor carries the matching label release = kps that we specified in the Prometheus operator scraping configuration.
The following are handpicked metrics that will give insights into RabbitMQ operations.
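For example, the metrics referenced by the alert rules later in this article include the following (names as exposed by the community exporter; descriptions are paraphrased):

rabbitmq_up                  # exporter can reach and query RabbitMQ (1 = up)
rabbitmq_running             # per-node running flag; sum() gives the number of running nodes
rabbitmq_partitions          # network partitions currently seen by a node
rabbitmq_node_mem_used       # memory used by a node, in bytes
rabbitmq_node_mem_limit      # configured memory high watermark, in bytes
rabbitmq_connectionsTotal    # total number of open connections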
Additionally, there is a solution to monitor RabbitMQ by using the built-in Prometheus plugin from RabbitMQ. Our recommendation is to use both options.
RabbitMQ version 3.8.0 and above ships a built-in Prometheus metrics plugin that exposes all RabbitMQ metrics in Prometheus format on an endpoint that Prometheus can scrape, either through auto-discovery or through a service monitor. To enable the RabbitMQ plugin via the Helm chart, set metrics.enabled to "true".
helm install <release name> bitnami/rabbitmq --set metrics.enabled=true
More details about the plugin can be found here.
In the case of a standard Prometheus installation, once the plugin is enabled in RabbitMQ, annotations need to be added to the RabbitMQ pods (if you are using the RabbitMQ chart, they are added automatically). Here are the annotations:
annotations:
  prometheus.io/path: /metrics
  prometheus.io/scrape: "true"
These annotations should be added at the pod level. Prometheus will then automatically start scraping the data if pod discovery is enabled.
Prometheus configuration for pod discovery:
- job_name: "kubernetes-pods"
kubernetes_sd_configs:
- role: pod
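The minimal job above discovers every pod. In practice it is usually combined with relabel rules that honor the pod annotations, mirroring the service-level example earlier; the following is a sketch of that common pattern, not the only valid form:

- job_name: "kubernetes-pods"
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    # Keep only pods annotated with prometheus.io/scrape: "true"
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    # Honor a custom metrics path from prometheus.io/path
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
    # Honor a custom port from prometheus.io/port
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      target_label: __address__
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2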
In the case of the Prometheus Operator, once the plugin is enabled in RabbitMQ, the service monitor needs to be enabled as well. For this, run the following command:
helm upgrade --install <release name> bitnami/rabbitmq --set metrics.enabled=true --set metrics.serviceMonitor.enabled=true
Once the service monitor is created, the Prometheus operator will start scraping the metrics automatically with the default configuration.
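To confirm that the chart actually created the object before expecting data in Prometheus (namespace is a placeholder):

kubectl get servicemonitor -n <namespace>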
More details are available in the official plugin repository at https://github.com/rabbitmq/rabbitmq-prometheus.
This is a Prometheus exporter of core RabbitMQ metrics, developed by the RabbitMQ core team. It is largely a "clean room" design that reuses some prior work from Prometheus exporters done by the community. The plugin is new as of RabbitMQ 3.8.0. See Monitoring RabbitMQ with Prometheus and Grafana.
This plugin is included in RabbitMQ 3.8.x releases. Like all plugins, it has to be enabled before it can be used.
To enable it with rabbitmq-plugins:
rabbitmq-plugins enable rabbitmq_prometheus
See the documentation guide.
The default port used by the plugin is 15692 and the endpoint path is /metrics. To try it with curl:
curl -v -H "Accept:text/plain" "http://localhost:15692/metrics"
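The response is the standard Prometheus text exposition format. A trimmed, illustrative excerpt (exact metric names, help strings, and values depend on your RabbitMQ version and workload):

# HELP rabbitmq_connections_opened_total Total number of connections opened
# TYPE rabbitmq_connections_opened_total counter
rabbitmq_connections_opened_total 12
# HELP rabbitmq_process_resident_memory_bytes Memory used in bytes
# TYPE rabbitmq_process_resident_memory_bytes gauge
rabbitmq_process_resident_memory_bytes 1.1e+08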
In most environments, no configuration is necessary.
See the entire list of metrics exposed via the default port.
This exporter supports the following options via a set of prometheus.* configuration keys:
- prometheus.return_per_object_metrics returns individual (per-object) metrics that are not aggregated (default is false).
- prometheus.path defines the scrape endpoint (default is "/metrics").
- prometheus.tcp.* controls HTTP listener settings that match those used by the RabbitMQ HTTP API.
- prometheus.ssl.* controls TLS (HTTPS) listener settings that match those used by the RabbitMQ HTTP API.
Sample configuration snippet:
# these values are defaults
prometheus.return_per_object_metrics = false
prometheus.path = /metrics
prometheus.tcp.port = 15692
When metrics are returned per object, nodes with 80k queues have been measured to take 58 seconds to return 1.9 million metrics in a 98MB response payload. In order to not put unnecessary pressure on your metrics system, metrics are aggregated by default.
When debugging, it may be useful to return metrics per object (unaggregated). This can be enabled on-the-fly, without restarting or configuring RabbitMQ, using the following command:
rabbitmqctl eval 'application:set_env(rabbitmq_prometheus, return_per_object_metrics, true).'
To go back to aggregated metrics on-the-fly, run the following command:
rabbitmqctl eval 'application:set_env(rabbitmq_prometheus, return_per_object_metrics, false).'
To install the external exporter with the community Helm chart:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install [RELEASE_NAME] prometheus-community/prometheus-rabbitmq-exporter
The most important chart values are:
- rabbitmq.url: the RabbitMQ management URL the exporter connects to.
- rabbitmq.user: the RabbitMQ connection user.
- rabbitmq.password: the RabbitMQ password.
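As a sketch, the same values can also be passed at install time with --set flags (the release name and URL below are placeholders):

helm install rabbitmq-exporter prometheus-community/prometheus-rabbitmq-exporter \
  --set rabbitmq.url=http://<rabbitmq-host>:15672 \
  --set rabbitmq.user=guest \
  --set rabbitmq.password=guest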
A sample values.yaml for the exporter chart:

rabbitmq:
  url: http://ncmq-rabbitmq-hana.nc.svc.cluster.local:15672
  user: guest
  password: guest
  # If existingPasswordSecret is set then password is ignored
  existingPasswordSecret: ~
  existingPasswordSecretKey: password
  capabilities: bert,no_sort
  include_queues: ".*"
  include_vhost: ".*"
  skip_queues: "^$"
  skip_verify: "false"
  skip_vhost: "^$"
  exporters: "exchange,node,overview,queue"
  output_format: "TTY"
  timeout: 30
  max_queues: 0

## Additional labels to set in the Deployment object. Together with standard labels from
## the chart
additionalLabels: {}
podLabels: {}

# Either use annotations
annotations:
  prometheus.io/scrape: "true"
  prometheus.io/path: "/metrics"
  prometheus.io/port: "9419"

# or use the service monitor
prometheus:
  monitor:
    enabled: true
    additionalLabels:
      release: kps
    interval: 15s
    namespace: []
  rules:
    enabled: true
    additionalLabels:
      release: kps
      app: kube-prometheus-stack
After digging into all the valuable metrics, this section explains in detail how we can get critical alerts.
PromQL is a query language for the Prometheus monitoring system. It is designed for building powerful yet simple queries for graphs, alerts, or derived time series (aka recording rules). PromQL was designed from scratch and has little in common with other query languages used in time series databases, such as SQL in TimescaleDB, InfluxQL, or Flux. More details can be found here.
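As a quick example, the expressions used by the alert rules further below can be run directly as PromQL queries (metric names come from the community exporter):

# Percentage of the memory high watermark currently used by each node
rabbitmq_node_mem_used / rabbitmq_node_mem_limit * 100

# Number of running nodes in the cluster
sum(rabbitmq_running)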
Prometheus works together with Alertmanager, which is responsible for sending alerts (via email, Slack, or any other supported channel) when any of the trigger conditions is met. Alerting rules allow users to define alerts based on Prometheus query expressions. They are defined based on the available metrics scraped by the exporter. Click here for a good source of community-defined alerts.
A general alert looks as follows:
- alert: (Alert Name)
  expr: (metric exported from the exporter) >, <, ==, <=, or >= (value)
  for: (wait for a certain duration between first encountering a new expression output vector element and counting an alert as firing for this element)
  labels: (allows specifying a set of additional labels to be attached to the alert)
  annotations: (specifies a set of informational labels that can be used to store longer additional information)
Some of the recommended RabbitMQ alerts are:
- alert: RabbitmqDown
  expr: rabbitmq_up{service="{{ template "rabbitmq.fullname" . }}"} == 0
  for: 5m
  labels:
    severity: error
  annotations:
    summary: Rabbitmq down (instance {{ "{{ $labels.instance }}" }})
    description: RabbitMQ node down

- alert: ClusterDown
  expr: |
    sum(rabbitmq_running{service="{{ template "rabbitmq.fullname" . }}"})
    < {{ .Values.replicaCount }}
  for: 5m
  labels:
    severity: error
  annotations:
    summary: Cluster down (instance {{ "{{ $labels.instance }}" }})
    description: |
      Less than {{ .Values.replicaCount }} nodes running in RabbitMQ cluster
      VALUE = {{ "{{ $value }}" }}

- alert: ClusterPartition
  expr: rabbitmq_partitions{service="{{ template "rabbitmq.fullname" . }}"} > 0
  for: 5m
  labels:
    severity: error
  annotations:
    summary: Cluster partition (instance {{ "{{ $labels.instance }}" }})
    description: |
      Cluster partition
      VALUE = {{ "{{ $value }}" }}

- alert: OutOfMemory
  expr: |
    rabbitmq_node_mem_used{service="{{ template "rabbitmq.fullname" . }}"}
    / rabbitmq_node_mem_limit{service="{{ template "rabbitmq.fullname" . }}"}
    * 100 > 90
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Out of memory (instance {{ "{{ $labels.instance }}" }})
    description: |
      Memory available for RabbitMQ is low (< 10%)
      VALUE = {{ "{{ $value }}" }}
      LABELS: {{ "{{ $labels }}" }}

- alert: TooManyConnections
  expr: rabbitmq_connectionsTotal{service="{{ template "rabbitmq.fullname" . }}"} > 1000
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Too many connections (instance {{ "{{ $labels.instance }}" }})
    description: |
      RabbitMQ instance has too many connections (> 1000)
      VALUE = {{ "{{ $value }}" }}
      LABELS: {{ "{{ $labels }}" }}
Alerts can be enabled, disabled, altered, or added using the helm chart here.
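For orientation, the rules above are written as Helm templates; rendered for a release, one of them ends up inside a PrometheusRule object roughly like this sketch (the name and namespace are placeholders, and the labels match the operator's rule selectors shown earlier):

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: rabbitmq-exporter-rules        # hypothetical name
  namespace: monitor
  labels:
    release: kps
    app: kube-prometheus-stack
spec:
  groups:
    - name: rabbitmq
      rules:
        - alert: RabbitmqDown
          expr: rabbitmq_up{service="rabbitmq-exporter-prometheus-rabbitmq-exporter"} == 0
          for: 5m
          labels:
            severity: error
          annotations:
            summary: Rabbitmq down (instance {{ $labels.instance }})
            description: RabbitMQ node down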
Graphs are easier to understand and more user-friendly than a row of numbers. For this purpose, users can plot their time series data in visualized format using Grafana.
Grafana is an open-source dashboarding tool used for visualizing metrics with the help of customizable and illustrative charts and graphs. It connects very well with Prometheus and makes monitoring easy and informative. Dashboards in Grafana are made up of panels, with each panel running a PromQL query to fetch metrics from Prometheus.
Grafana supports community-driven dashboards for most widely used software, which can be imported directly into Grafana from the community dashboards library.
NexClipper uses a widely adopted community RabbitMQ dashboard, which has a lot of useful panels.
What is a Panel?
Panels are the most basic component of a dashboard and can display information in various ways, such as gauge, text, bar chart, graph, and so on. They provide information in a very interactive way. Users can view every panel separately and check the value of metrics within a specific time range.
The values on the panels are queried using PromQL, the Prometheus query language. PromQL is a simple query language used to query metrics within Prometheus. It enables users to query data, aggregate it, apply arithmetic functions to the metrics, and then visualize the results on panels.
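For instance, a panel tracking open connections per scraped instance could use an aggregation over the exporter metric already used in the alerts above:

# Total open connections, summed per instance, for a Grafana panel
sum by (instance) (rabbitmq_connectionsTotal)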
Here is an example panel:
Showing system up/down with other consumer-related information
This is the dashboard that has been used.

This concludes our discussion of the RabbitMQ exporter! If you have any questions, you can reach our team via support@nexclipper.io and stay tuned for further exporter reviews and tips coming soon.