Metrics¶
Prometheus provides four types of metrics. This document explains them only briefly; refer to the official Prometheus documentation to learn more.
Note that, of the four types, the most commonly used are the Counter and the Gauge. The Histogram and Summary are extremely useful but used much less, as they are more complicated to fully understand and set up correctly. Many Prometheus client libraries do not even support Histograms and Summaries.
Basics¶
Every metric must have a unique name and a help text. Titan will not let you override a metric: make sure the names (identifiers) are unique.
Note
Make sure metrics bear a unique name.
The help text is also mandatory, as per Prometheus; it gives more context on the metric tracked.
Optionally, metrics can also take labels, which will be detailed later in this document.
All metrics are R6 classes; you can have any number of them in a project.
Counter¶
A counter is the most basic metric one can use. It consists of a simple counter that can only go up; the value of a counter can never decrease.
Tip
The value of counters can only increase; use the Gauge if the value may decrease.
This can be used to measure the number of times an application is visited, or the number of times an endpoint is hit: these values only ever go up. Never use Counters for values that go down; titan will not let you decrease their value.
Instantiate a new counter from the `Counter` R6 class, give it a name and some help text, then use the `inc` method to increase it. You can also use the `set` method to set it to a specific value; again, make sure that value is greater than the one the counter already holds, or titan will throw a warning and leave the counter unchanged.
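The workflow described above might look like the following minimal sketch. It assumes that `Counter$new()` accepts `name` and `help` arguments and that titan's `previewMetrics()` prints the metrics currently tracked; the metric name and help text are made up for illustration, so check the package reference before relying on the exact signatures.

```r
library(titan)

# Illustrative metric: the name and help text are placeholders.
visits <- Counter$new(
  name = "sketch_visits_total",
  help = "Total number of visits to the application"
)

visits$inc(1)   # increase by one
visits$inc(2)   # increase by two
visits$set(10)  # accepted: 10 is greater than the current value

# Lower than the current value: per the text above, this should only
# raise a warning and leave the counter unchanged.
visits$set(5)

# Print the metrics as they would be exposed to Prometheus.
previewMetrics()
```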
Gauge¶
A gauge is very similar to the Counter, with the exception that its value can decrease.
It is set up in the same way as the Counter; the only difference is that it also has a `dec` method to decrease the gauge.
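As a hedged sketch of the gauge described above, assuming `Gauge$new()` takes the same `name` and `help` arguments as the Counter and that `previewMetrics()` prints the current metrics (the metric itself is hypothetical):

```r
library(titan)

# Illustrative metric: the name and help text are placeholders.
in_flight <- Gauge$new(
  name = "sketch_requests_in_flight",
  help = "Number of requests currently being processed"
)

in_flight$inc(3)  # three requests start
in_flight$dec(1)  # one request finishes
in_flight$set(0)  # unlike a counter, a gauge can be set to a lower value

previewMetrics()
```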
So why would you use a Counter when a Gauge does the same and more? Because the two are stored and processed differently by Prometheus. Prometheus is, at its core, a time series database and will take the metric type into account when reporting metrics.
Tip
Gauges and counters are fundamentally stored as different data types; do not simply switch one for the other, think thoroughly about what you measure.
Histogram¶
Histograms allow you to count observations and put them in configurable buckets.
Start by declaring a predicate: a function which puts the observations into buckets. A bucket is defined using the `bucket` function, which takes 1) the label of the bucket, and 2) the value of said bucket. Below we create a predicate that puts the observations into two buckets. It will be used to measure the time it takes to process a request: if the request takes less than 3 seconds it goes into a bucket called “3”, and if it takes more than 3 seconds the observation goes into another bucket called “9”.
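A predicate of this kind could be sketched as follows. The `bucket_label()` helper is hypothetical and exists only to keep the threshold logic easy to check on its own; `bucket()` is assumed to take the label first and the value second, as described above.

```r
library(titan)

# Hypothetical helper: pick the bucket label for a duration in seconds.
bucket_label <- function(seconds) {
  if (seconds < 3) "3" else "9"
}

# Predicate handed to the histogram: wraps each observation in a bucket.
pred <- function(value) {
  v <- as.numeric(value)
  bucket(bucket_label(v), v)
}
```

Keeping the threshold in a plain function means it can be tested without creating any metric.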
The creation of the histogram itself differs little from other metrics: specify a unique name, help text, and pass the predicate function previously defined.
Here, to demonstrate, we create a function that simulates a request taking time by making it sleep for a random duration between 1 and 9 seconds. When the function exits we `observe` the time difference between the beginning and the end of the function. The `observe` method will internally run the `predicate` and place the results into buckets.
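Putting the pieces together, the simulation described above might be sketched like this. It assumes `Histogram$new()` accepts `name`, `help` and `predicate` arguments and that `previewMetrics()` prints the current metrics; the metric name and the `simulate_request()` function are illustrative only. Note that running it takes a while, since each simulated request sleeps for 1 to 9 seconds.

```r
library(titan)

# Predicate: durations under 3 seconds go to bucket "3", the rest to "9".
pred <- function(value) {
  v <- as.numeric(value)
  if (v < 3) {
    return(bucket("3", v))
  }
  bucket("9", v)
}

# Illustrative metric: the name and help text are placeholders.
process_time <- Histogram$new(
  name = "sketch_request_process_seconds",
  help = "Time taken to process a simulated request",
  predicate = pred
)

# Simulate a request; the elapsed time is observed when the function exits.
simulate_request <- function() {
  start <- Sys.time()
  on.exit({
    elapsed <- as.numeric(difftime(Sys.time(), start, units = "secs"))
    process_time$observe(elapsed)
  })
  Sys.sleep(sample(1:9, 1))
  invisible(NULL)
}

# Four simulated requests, each taking between 1 and 9 seconds.
for (i in 1:4) simulate_request()

previewMetrics()
```

Using `on.exit()` means the observation is recorded even if the body of the function fails part-way through.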
Note that the histogram (as per Prometheus standards) also logs the `count`, the number of observations, and the `sum`, the sum of the observations. Above we can see that 4 requests were made that took a total of ~13 seconds; 2 of these took less than 3 seconds and the other 2 took more than that.
Summary¶
The Summary metric is very similar to the Histogram and works the same way in titan (predicate, etc.), except that it does not count the observations in each bucket; instead it computes their sum. In a Summary these buckets are called quantiles, and they must be between zero and one (0 < q < 1).
Labels¶
Labels add granularity to metrics without duplicating them; they can be applied to any metric.
From the official documentation:
Warning
Remember that every unique combination of key-value label pairs represents a new time series, which can dramatically increase the amount of data stored. Do not use labels to store dimensions with high cardinality (many different label values), such as user IDs, email addresses, or other unbounded sets of values. — Official documentation
Say for instance you have a small API with three endpoints and simply want to track the number of times they get pinged.
Tip
All labels specified must be used or titan will throw a warning and ignore the action.
Though you could create three separate Counters, it might be more convenient to create a single Counter with a label that is set to the path being hit.
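A labelled counter for such an API might be sketched as below. It assumes the metric constructor accepts a `labels` argument naming the labels and that label values are passed as named arguments to `inc`; the metric name, label name and paths are illustrative.

```r
library(titan)

# One counter for all endpoints, distinguished by the "path" label.
hits <- Counter$new(
  name = "sketch_api_hits_total",
  help = "Number of requests received, by endpoint",
  labels = "path"
)

# Each distinct label value becomes its own time series.
hits$inc(1, path = "/")
hits$inc(2, path = "/count")
hits$inc(1, path = "/home")

previewMetrics()
```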
If you use `labels` you must specify all of the labels every time you change the value of the metric (`inc`, `dec`, `set`, `observe`). Otherwise titan throws a warning and ignores the action.
Error
Since all labels must be specified this will fail.
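The failure mode can be sketched with a counter that declares two labels. This assumes `labels` accepts a character vector of label names; the metric and label names are hypothetical.

```r
library(titan)

# Counter declaring two labels: both must be supplied on every change.
requests <- Counter$new(
  name = "sketch_api_requests_total",
  help = "Number of requests received, by endpoint and HTTP method",
  labels = c("path", "method")
)

# Both labels supplied: the increment is recorded.
requests$inc(1, path = "/", method = "GET")

# The "method" label is missing: per the text above, titan should
# throw a warning and ignore this increment.
requests$inc(1, path = "/")

previewMetrics()
```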