Fetching metrics

In this notebook, we will learn how to use the Prometheus API client to fetch raw metrics from a Prometheus host and format them so that they are ready for data science analysis.

You can find more information about the functions of the API client here.

Installing the library:

To begin any exploratory analysis, we need to first install all the required packages.

For this notebook in particular, we need the prometheus-api-client library.

# !pip3 install prometheus-api-client
# !pip3 install matplotlib pandas

from prometheus_api_client import PrometheusConnect
from prometheus_api_client.metric_range_df import MetricRangeDataFrame
from prometheus_api_client.utils import parse_datetime
from datetime import timedelta
import pandas as pd

Connecting to Prometheus

The PrometheusConnect class of the library can be used to connect to a Prometheus host.

To learn more about this class, refer to the documentation.

We will connect to a public Prometheus instance: http://demo.robustperception.io:9090

prom_url = "http://demo.robustperception.io:9090"
print("Prometheus uri: ", prom_url)

# Creating the prometheus connect object with the required parameters
pc = PrometheusConnect(url=prom_url, disable_ssl=True)

# Fetching a list of all metrics scraped by the Prometheus host.
# Column labels are passed as a list; a set is unordered and is rejected by recent pandas versions
pd.DataFrame(pc.all_metrics(), columns=["metrics"])
Prometheus uri:  http://demo.robustperception.io:9090
metrics
0 ALERTS
1 ALERTS_FOR_STATE
2 alertmanager_alerts
3 alertmanager_alerts_invalid_total
4 alertmanager_alerts_received_total
... ...
501 scrape_duration_seconds
502 scrape_samples_post_metric_relabeling
503 scrape_samples_scraped
504 scrape_series_added
505 up

506 rows × 1 columns

Fetching Metrics from Prometheus

Every metric in Prometheus is stored as time series data: streams of timestamped values belonging to the same metric and the same set of labeled dimensions. Each of these time series is uniquely identified by:

  • metric name - Specifies the general feature of a system that is measured. E.g. http_requests_total - the total number of HTTP requests received.

  • labels - Provides more details to identify a particular dimensional instantiation of the metric. E.g. http_requests_total{method="POST", handler="/api/tracks"}: all HTTP requests that used the method POST to the /api/tracks handler
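
As a rough illustration of how a metric name and its labels combine into a single selector, the sketch below assembles a PromQL selector string from a name and a dictionary of labels. The helper build_selector is hypothetical and not part of the Prometheus API client, and it makes no attempt to escape special characters in label values.

```python
def build_selector(metric_name, labels=None):
    """Return a PromQL selector such as name{key="value"}.

    Hypothetical helper for illustration only; label values are not escaped.
    """
    if not labels:
        return metric_name
    pairs = ",".join(f'{key}="{value}"' for key, value in labels.items())
    return f"{metric_name}{{{pairs}}}"


selector = build_selector(
    "http_requests_total", {"method": "POST", "handler": "/api/tracks"}
)
print(selector)
```

Two time series with the same metric name but different label values produce different selectors, which is what makes each series uniquely identifiable.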

Prometheus provides a functional query language called PromQL (Prometheus Query Language) that lets the user select and aggregate time series data in real time.

The get_current_metric_value() method in the library can be used to fetch the current value of the metrics selected by a PromQL expression.

Parameters:

Examples of fetching metrics using different queries

# Here, we are fetching the current values of a particular metric name
pc.get_current_metric_value("alertmanager_alerts")
[{'metric': {'__name__': 'alertmanager_alerts',
   'instance': 'demo.robustperception.io:9093',
   'job': 'alertmanager',
   'state': 'active'},
  'value': [1606158844.546, '4']},
 {'metric': {'__name__': 'alertmanager_alerts',
   'instance': 'demo.robustperception.io:9093',
   'job': 'alertmanager',
   'state': 'suppressed'},
  'value': [1606158844.546, '0']}]
# Now, let's see if we can fetch a particular label configuration of this metric
pc.get_current_metric_value("alertmanager_alerts{state='active'}")
[{'metric': {'__name__': 'alertmanager_alerts',
   'instance': 'demo.robustperception.io:9093',
   'job': 'alertmanager',
   'state': 'active'},
  'value': [1606158848.546, '4']}]
# Sum of all the values of a metric
# You can also try other PromQL functions and aggregations such as rate and count
# More functions here: https://prometheus.io/docs/prometheus/latest/querying/examples/
pc.get_current_metric_value("sum(scrape_duration_seconds)")
[{'metric': {}, 'value': [1606158849.597, '0.042080264']}]
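
Each result above pairs a label set with a [timestamp, value] list, where the timestamp is in Unix seconds and the value is returned as a string. A minimal sketch of turning such a response into typed Python values follows. The sample data is copied from the output of the first query above, and parse_instant_result is a hypothetical helper, not a library function.

```python
from datetime import datetime, timezone

# Sample copied from the output of the first query above.
sample_response = [
    {
        "metric": {
            "__name__": "alertmanager_alerts",
            "instance": "demo.robustperception.io:9093",
            "job": "alertmanager",
            "state": "active",
        },
        "value": [1606158844.546, "4"],
    },
    {
        "metric": {
            "__name__": "alertmanager_alerts",
            "instance": "demo.robustperception.io:9093",
            "job": "alertmanager",
            "state": "suppressed",
        },
        "value": [1606158844.546, "0"],
    },
]


def parse_instant_result(result):
    """Convert each series into a (labels, UTC datetime, float value) tuple."""
    rows = []
    for series in result:
        unix_seconds, raw_value = series["value"]
        when = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
        rows.append((series["metric"], when, float(raw_value)))
    return rows


parsed = parse_instant_result(sample_response)
for labels, when, value in parsed:
    print(labels["state"], when.isoformat(), value)
```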

Collecting Historical Data

To fetch historical data instead of just the current value, say the past few days of data, we can use the get_metric_range_data() method. This method fetches the data for the specified metric and label configuration within the specified time range. Along with the start_time and end_time of the collection window, we need to specify the chunk size, i.e. the time span of data requested in one go. A very large chunk size can make the code unresponsive if there is a lot of data. If the end goal is to create a data frame, the chunk size does not affect the size of the final frame. It is therefore good practice to choose a balanced chunk size, neither too large nor too small.
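
To make the idea of chunking concrete, here is a minimal sketch that splits a time range into fixed-size windows. It only illustrates the concept: split_into_chunks is a hypothetical helper, the dates are made up, and this is not necessarily how the library divides its requests internally.

```python
from datetime import datetime, timedelta, timezone


def split_into_chunks(start, end, chunk_size):
    """Yield consecutive (chunk_start, chunk_end) windows covering start..end."""
    if chunk_size <= timedelta(0):
        raise ValueError("chunk_size must be positive")
    cursor = start
    while cursor < end:
        # The last window is shortened so it never runs past the end time.
        chunk_end = min(cursor + chunk_size, end)
        yield cursor, chunk_end
        cursor = chunk_end


# Illustrative two-day range, split into 12-hour windows.
end = datetime(2020, 11, 23, 19, 14, tzinfo=timezone.utc)
start = end - timedelta(days=2)
chunks = list(split_into_chunks(start, end, timedelta(hours=12)))
for chunk_start, chunk_end in chunks:
    print(chunk_start.isoformat(), "->", chunk_end.isoformat())
```

With a two-day range and 12-hour windows this gives four windows. A smaller chunk size means more, lighter requests, while the combined data stays the same.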

metric_data = pc.get_metric_range_data(
    "alertmanager_alerts{job='alertmanager'}",  # metric name and label config
    start_time=parse_datetime(
        "2d"
    ),  # datetime object for metric range start time
    end_time=parse_datetime(
        "now"
    ),  # datetime object for metric range end time
    chunk_size=timedelta(
        hours=12
    ),  # timedelta object for duration of metric data downloaded in one request
)
metric_data[0]["metric"]
{'__name__': 'alertmanager_alerts',
 'instance': 'demo.robustperception.io:9093',
 'job': 'alertmanager',
 'state': 'active'}

Creating pandas Data Frame

We can easily create a pandas data frame from the JSON response using the MetricRangeDataFrame class.

metric_df = MetricRangeDataFrame(metric_data)
metric_df
__name__ instance job state value
timestamp
1.605986e+09 alertmanager_alerts demo.robustperception.io:9093 alertmanager active 4
1.605986e+09 alertmanager_alerts demo.robustperception.io:9093 alertmanager active 4
1.605986e+09 alertmanager_alerts demo.robustperception.io:9093 alertmanager active 4
1.605986e+09 alertmanager_alerts demo.robustperception.io:9093 alertmanager active 4
1.605986e+09 alertmanager_alerts demo.robustperception.io:9093 alertmanager active 4
... ... ... ... ... ...
1.606159e+09 alertmanager_alerts demo.robustperception.io:9093 alertmanager suppressed 0
1.606159e+09 alertmanager_alerts demo.robustperception.io:9093 alertmanager suppressed 0
1.606159e+09 alertmanager_alerts demo.robustperception.io:9093 alertmanager suppressed 0
1.606159e+09 alertmanager_alerts demo.robustperception.io:9093 alertmanager suppressed 0
1.606159e+09 alertmanager_alerts demo.robustperception.io:9093 alertmanager suppressed 0

34558 rows × 5 columns
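
For intuition about the table above, the sketch below flattens a range-query response by hand into one row per sample. It assumes the standard Prometheus range format, in which each series carries a values list of [timestamp, value] pairs. The sample data is made up to resemble the real response, and flatten_range_result is a hypothetical helper rather than the way MetricRangeDataFrame is implemented.

```python
def flatten_range_result(result):
    """Return one dict per sample, combining the series labels with each sample."""
    rows = []
    for series in result:
        for timestamp, raw_value in series["values"]:
            row = dict(series["metric"])  # copy the labels for every sample
            row["timestamp"] = timestamp
            row["value"] = float(raw_value)  # values arrive as strings
            rows.append(row)
    return rows


# Made-up sample shaped like a range-query response (assumption for illustration).
sample_range = [
    {
        "metric": {"__name__": "alertmanager_alerts", "state": "active"},
        "values": [[1605986056.103, "4"], [1605986066.096, "4"]],
    },
    {
        "metric": {"__name__": "alertmanager_alerts", "state": "suppressed"},
        "values": [[1605986056.103, "0"]],
    },
]

rows = flatten_range_result(sample_range)
for row in rows:
    print(row)
```

The resulting list of dictionaries could be passed to pd.DataFrame(rows).set_index("timestamp") to obtain a table with the same layout: one column per label, a value column, and the timestamp as the index.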

Converting UTC timestamps to Python datetime objects

metric_df.index = pd.to_datetime(metric_df.index, unit="s", utc=True)
metric_df
__name__ instance job state value
timestamp
2020-11-21 19:14:16.102999926+00:00 alertmanager_alerts demo.robustperception.io:9093 alertmanager active 4
2020-11-21 19:14:26.095999956+00:00 alertmanager_alerts demo.robustperception.io:9093 alertmanager active 4
2020-11-21 19:14:36.095999956+00:00 alertmanager_alerts demo.robustperception.io:9093 alertmanager active 4
2020-11-21 19:14:46.098000050+00:00 alertmanager_alerts demo.robustperception.io:9093 alertmanager active 4
2020-11-21 19:14:56.105000019+00:00 alertmanager_alerts demo.robustperception.io:9093 alertmanager active 4
... ... ... ... ... ...
2020-11-23 19:13:26.095999956+00:00 alertmanager_alerts demo.robustperception.io:9093 alertmanager suppressed 0
2020-11-23 19:13:36.105000019+00:00 alertmanager_alerts demo.robustperception.io:9093 alertmanager suppressed 0
2020-11-23 19:13:46.114000082+00:00 alertmanager_alerts demo.robustperception.io:9093 alertmanager suppressed 0
2020-11-23 19:13:56.096999884+00:00 alertmanager_alerts demo.robustperception.io:9093 alertmanager suppressed 0
2020-11-23 19:14:06.095999956+00:00 alertmanager_alerts demo.robustperception.io:9093 alertmanager suppressed 0

34558 rows × 5 columns

Conclusion

Great! In this notebook, we saw how to use the Prometheus API client to create a pandas time series data frame that can be used for analysis. In the next post, we will learn how to manipulate and visualize this data frame.