What is Prometheus?
Prometheus is an open-source system monitoring and alerting toolkit originally built at SoundCloud. Since its inception in 2012, many companies and organizations have adopted Prometheus, and the project has a very active developer and user community. It is now a standalone open-source project and is maintained independently of any company. To emphasize this, and to clarify the project’s governance structure, Prometheus joined the Cloud Native Computing Foundation in 2016 as the second hosted project, after Kubernetes.
Prometheus collects and stores its metrics as time-series data, i.e. metrics information is stored with the timestamp at which it was recorded, alongside optional key-value pairs called labels.
Prometheus’s main features are:
- a multi-dimensional data model with time series data identified by metric name and key/value pairs
- PromQL, a flexible query language to leverage this dimensionality
- no reliance on distributed storage; single server nodes are autonomous
- time series collection happens via a pull model over HTTP
- pushing time series is supported via an intermediary gateway
- targets are discovered via service discovery or static configuration
- multiple modes of graphing and dashboarding support such as Grafana
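As a rough illustration of the pull model and static target configuration listed above, a minimal prometheus.yml might look like the sketch below. The job name, the 15-second interval, and the target address (localhost:9100, the Node Exporter's default port) are assumptions chosen for illustration, not values from this article.

```yaml
# prometheus.yml (minimal sketch; all values are illustrative assumptions)
global:
  scrape_interval: 15s              # how often Prometheus pulls metrics from each target

scrape_configs:
  - job_name: "node"                # hypothetical job name; attached to series as the job label
    static_configs:
      - targets: ["localhost:9100"] # assumed Node Exporter address, scraped over HTTP at /metrics
```

With a configuration along these lines, every scraped series carries job and instance labels, which is what PromQL expressions can later aggregate or filter on.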
Prometheus Node Exporter on Linux
Prometheus Recording Rules
Queries that aggregate over thousands of time series can get slow when computed ad hoc. To make this more efficient, Prometheus can prerecord expressions into new persisted time series via configured recording rules. Let’s say we are interested in recording the per-second rate of CPU time (node_cpu_seconds_total) averaged over all CPUs per instance (but preserving the job, instance, and mode dimensions) as measured over a window of 5 minutes. We could write this as:
avg by (job, instance, mode) (rate(node_cpu_seconds_total[5m]))
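To persist that expression as a new time series, it can be wrapped in a recording rule. The following is a minimal sketch: the file name, group name, and recorded series name are assumptions (the series name follows the common level:metric:operations naming convention).

```yaml
# prometheus.rules.yml (hypothetical file name)
# Reference it from prometheus.yml with:
#   rule_files:
#     - "prometheus.rules.yml"
groups:
  - name: cpu-node                  # hypothetical group name
    rules:
      # Name of the new, precomputed time series (assumed name)
      - record: job_instance_mode:node_cpu_seconds:avg_rate5m
        # The expression from the text, evaluated at each rule evaluation interval
        expr: avg by (job, instance, mode) (rate(node_cpu_seconds_total[5m]))
```

Once the rule is loaded, dashboards and alerts can query the recorded series by name rather than re-evaluating the aggregation each time.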
Prometheus Server Down Alerts Setup
Alerting with Prometheus is separated into two parts. Alerting rules in Prometheus servers send alerts to an Alertmanager. The Alertmanager then manages those alerts, including silencing, inhibition, aggregation, and sending out notifications via methods such as email, on-call notification systems, and chat platforms.
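The two parts described above could be sketched as follows: an alerting rule that fires when a scrape target stops responding, and the prometheus.yml section that points Prometheus at an Alertmanager. The group name, severity label, 5-minute hold period, rule file name, and Alertmanager address (localhost:9093, its default port) are all assumptions for illustration.

```yaml
# alert.rules.yml (hypothetical file name): fires when a target is unreachable
groups:
  - name: availability              # hypothetical group name
    rules:
      - alert: InstanceDown
        expr: up == 0               # "up" is set to 0 by Prometheus when a scrape fails
        for: 5m                     # assumed hold period before the alert fires
        labels:
          severity: critical        # assumed label, usable for routing in Alertmanager
        annotations:
          summary: "Instance {{ $labels.instance }} of job {{ $labels.job }} is down"
---
# prometheus.yml (fragment): load the rule file and send alerts to Alertmanager
rule_files:
  - "alert.rules.yml"
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]   # assumed Alertmanager address
```

Silencing, inhibition, grouping, and the actual notification receivers (email, chat, on-call systems) are configured separately in Alertmanager's own configuration file.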
The main steps to setting up alerting and notifications are: