Adding metrics for Prometheus

In this video we're going to add some Prometheus metrics to an existing Golang application. This is going to be about the simplest application we can have, but hopefully it'll give you an understanding of how to instrument your application with metrics. We're going to do this over HTTP - Prometheus does support other ways of emitting metrics, but in this video we're going to expose them on an HTTP endpoint.

So firstly I just want to talk you through the different types of metrics that you can have in Prometheus. These are the basic ones:

The first one is a basic counter. A counter is a cumulative metric that represents a single numeric value that only ever goes up. One example would be transaction success: every time a transaction is successful, it goes up. This isn't something that should go down. A different measure, such as the rate of successful transactions over time, might go up and down a little bit, but the counter itself never does.

Next we have a gauge. A gauge is a metric that represents a single numeric value that can go up and down arbitrarily. The example I've given here is my blood sugar. I'm a type 1 diabetic, and this year I gave a talk at GopherCon in Singapore where I spoke about how I made a system to monitor my blood sugar, and I actually used a gauge for that. My blood sugar will be at one value, I'll eat some carbohydrates, and it'll go up.

The next one is a histogram. This is a metric that samples observations, usually things like request durations or response sizes, and counts them in buckets. It also provides a sum of all the observed values. The example I've given here is HTTP request duration in seconds. If you're trying to measure SLOs on the service, or you're trying to figure out in general how well your service is performing, bucket by bucket, this is really good.

The next metric type is summary. Like a histogram it samples observations, but it calculates configurable quantiles over a sliding time window. To be honest, I rarely use this one. Usually the histogram is good enough, but we've got an example here of an HTTP response size, and we've got objectives - the quantiles we want reported, each with the error we're willing to tolerate.

Making these is very simple. Figuring out which one to use is the difficult part, and oftentimes more than one of them would achieve what you want to do. So don't be too concerned about getting this perfect.

One thing to pull out is labels - you can have multiple labels, and they add different dimensions to your metrics. You might measure path, maybe query string (that would be a terrible idea, because every distinct label value creates a separate time series and you'd end up with lots and lots of them, but you could do it). More interesting is maybe you'd have a label on transaction success for the card type, so Visa, Amex and MasterCard would each be a label value.

Once you've defined your metrics, you then just need to register them using prometheus.Register. This doesn't panic, it returns an error, but there's also prometheus.MustRegister, which panics if registration fails. I tend to avoid panicking in programs when I can.

After registering our metrics, we can start using them in our code. I demonstrate this with some basic functions - one incrementing transaction success, another setting blood sugar to a random number, and endpoints for measuring duration and response size.

What makes this all useful is the Prometheus HTTP handler at /metrics, which exposes all these metrics. When you hit this endpoint, you get not only your custom metrics but also lots of Go-specific metrics that the Prometheus client library collects for us - things like the number of goroutines running, the Go version, etc.

I actually made a mistake in the demo by not registering all the metrics (I only registered transaction success). I left this in the video because it's a good lesson in debugging - sometimes things fail silently. Even though we were updating the other metrics, they weren't showing up at /metrics because they weren't registered.

After fixing this by registering all metrics (blood sugar gauge, HTTP duration, and HTTP response size), we can see all our custom metrics in the /metrics endpoint, properly labeled and tracked.

In the next video, we'll talk about how we can make use of this information and how this might work in a more production-like system.

Matt Boyle
