I wanted monitoring. I had already deployed the kube-prometheus-stack (via the rancher-monitoring chart), and I wanted to leverage it for my non-k8s monitoring needs as well. It seemed silly to deploy an RPM- or static-Docker-based monitoring solution when I had that shiny k3s cluster sitting there, ready to go, with something already deployed on it. It should be as easy as coaxing the existing prometheus instance to scrape a few other /metrics endpoints, right? …right?
Assumptions
You’ve deployed one of the following charts to get prometheus / prometheus-operator / grafana running in your cluster:
Note: this is just using the kube-prometheus-stack charts under the hood, so all of the same CRDs will apply here (sorta, see Rancher Specifics)
You’re running node_exporter somewhere else, say on your desktop (or a fleet of bare metal servers doing non-k8s things).
For the purposes of this post, you’ll see einsteinium.lan.zeroent.net:9100 as my “external” (meaning external to the cluster) instance of node_exporter
Suggestions on the internet
When you search for “prometheus operator external metrics”, you come across a couple of blogs:
In both of these posts, the authors suggest using a combination of a custom ServiceMonitor resource pointing to a Service resource, which in turn points to an Endpoints resource that actually references your endpoint outside the cluster.
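A rough sketch of what those posts describe might look like this (the names, namespace, and IP address below are placeholders for illustration; note that the Endpoints object only accepts IP addresses):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: external-node-exporter
  namespace: monitoring
  labels:
    app: external-node-exporter
spec:
  # no selector: the Endpoints object below is maintained by hand
  ports:
    - name: metrics
      port: 9100
      targetPort: 9100
---
apiVersion: v1
kind: Endpoints
metadata:
  name: external-node-exporter   # must match the Service name
  namespace: monitoring
subsets:
  - addresses:
      - ip: 192.168.1.50         # the external host's IP; hostnames are not allowed here
    ports:
      - name: metrics
        port: 9100
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: external-node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: external-node-exporter
  endpoints:
    - port: metrics
      interval: 30s
```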
The problem
Even once you get past the improperly indented yaml examples, there are some problems with this solution:
It requires you to reference the external instance by IP address (hostnames aren’t allowed)
You have to define (and keep in sync) three resources just to monitor one node
The first point is the biggest; it makes the whole setup somewhat fragile. If the host’s IP changes, you have to go update the Endpoints resource. If you’re doing any hostname-based proxying on the target, you’re hooped.
The other commonly suggested solution involves manually modifying the kind: Prometheus resource that was deployed with the helm chart (danger) and pointing it at a kind: Secret resource where you define, in traditional prometheus config format, extra jobs for Prometheus to go scrape.
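As a rough sketch (the secret name, key, and job below are placeholders), the extra jobs live in a plain prometheus scrape config snippet:

```yaml
# prometheus-additional.yaml -- plain prometheus scrape config, not a k8s resource
- job_name: external-node-exporter
  static_configs:
    - targets:
        - einsteinium.lan.zeroent.net:9100
```

…which gets wrapped up in a Secret and referenced from the operator-managed Prometheus resource:

```yaml
# the hand-edit to the kind: Prometheus resource (the part a helm upgrade can clobber)
spec:
  additionalScrapeConfigs:
    name: additional-scrape-configs    # name of the Secret
    key: prometheus-additional.yaml    # key inside the Secret
```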
The problem
You’re manually modifying the main Prometheus custom resource (which tells the prometheus-operator how to deploy/configure prometheus). The moment you run a helm upgrade on your installation, those changes could very well get blown away.
You have one secret that you have to update any time you want to add/remove an endpoint for monitoring.
k8s secrets are notoriously messy to update. You basically have to keep the source of the secret around and re-generate it every time, à la:
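Something along these lines (the file name, secret name, and namespace are just examples; cattle-monitoring-system is where a default rancher-monitoring install lives):

```sh
# re-render the secret from its source file and shove it back into the cluster
kubectl create secret generic additional-scrape-configs \
  --from-file=prometheus-additional.yaml \
  --dry-run=client -o yaml \
  | kubectl -n cattle-monitoring-system apply -f -
```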
So, there has to be a better way (at least, I think it’s a better way). While trying to get blackbox_exporter to work, I came across the kind: Probe CRD provided by the prometheus-operator chart. All of the examples of using this CRD reference the blackbox_exporter, but after some digging, you can use it to create plain prometheus scrape jobs. At the end of the day, all of the methods described here on the page are, one way or another, just appending to the active prometheus jobs configuration section; so if you can get a job into that section, it doesn’t really matter which resource put it there.
You’ll want to create a resource that looks something like this:
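Here’s a minimal sketch, assuming a default rancher-monitoring install (the namespace and release label are assumptions; they need to match whatever your Prometheus’s probeNamespaceSelector/probeSelector actually pick up):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: einsteinium-node-exporter
  namespace: cattle-monitoring-system   # assumption: default rancher-monitoring namespace
  labels:
    release: rancher-monitoring         # assumption: matches the default probeSelector
spec:
  jobName: einsteinium-node-exporter
  interval: 30s
  prober:
    # the "prober" here is really just the host prometheus will scrape
    url: einsteinium.lan.zeroent.net:9100
    # override the blackbox_exporter-style default of /probe
    path: /metrics
  targets:
    staticConfig:
      static:
        - einsteinium.lan.zeroent.net:9100
      relabelingConfigs:
        # example: tag everything from this probe so it's easy to find later
        - targetLabel: environment
          replacement: external
      # labels:
      #   environment: external
```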
When we kubectl apply -f this resource, we can see the resulting changes in the generated prometheus config:
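Trimmed down, the generated job ends up looking roughly like this (the exact output varies a bit by operator version):

```yaml
- job_name: probe/cattle-monitoring-system/einsteinium-node-exporter
  metrics_path: /metrics
  static_configs:
    - targets:
        - einsteinium.lan.zeroent.net:9100
  relabel_configs:
    # generated by the operator: the static target becomes the ?target= parameter
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    # ...and the actual scrape is pointed at prober.url
    - target_label: __address__
      replacement: einsteinium.lan.zeroent.net:9100
    # our relabelingConfigs from the Probe get appended at the end
    - target_label: environment
      replacement: external
```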
Notes
In the example above, I’m showing the use of relabelingConfigs and have left the labels directive commented out. You can use either, but don’t use both. Also, be sure to read the rancher-flavored caveats below.
As far as I can tell, you need to define the hostname in both targets.staticConfig.static[] and prober.url. If you don’t define it in targets.staticConfig.static[], the job doesn’t get created. If you don’t define prober.url, you can’t define prober.path (url is a required key per the spec), and you need to set prober.path to /metrics, otherwise prometheus is going to query /probe (see Disadvantages)
Advantages
This has several advantages over the approaches described above (the ServiceMonitor/Service/Endpoints combo and the additionalScrapeConfigs secret):
No modifying the original deployment of prometheus-operator
One resource per monitoring target. This allows you to easily template out the resource definition and plumb it into your automation for host provisioning (see the sketch after this list):
New host? Apply one of these resources.
Deleting a host? Delete the corresponding resource.
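For example, a hypothetical sketch using envsubst (whatever templating your provisioning tooling already has, Ansible, Helm, etc., works just as well):

```sh
# probe.yaml.tpl -- the same Probe resource as above, with the host templated out:
#   metadata:
#     name: ${TARGET_HOST}-node-exporter
#   ...
#   prober:
#     url: ${TARGET_HOST}:9100
#   targets:
#     staticConfig:
#       static:
#         - ${TARGET_HOST}:9100

# new host? render and apply:
TARGET_HOST=einsteinium.lan.zeroent.net envsubst < probe.yaml.tpl | kubectl apply -f -

# decommissioning? delete the same rendered resource:
TARGET_HOST=einsteinium.lan.zeroent.net envsubst < probe.yaml.tpl | kubectl delete -f -
```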
Disadvantages
To be clear, this is abusing the probe CRD. It seems like it was intended largely for the blackbox_exporter:
All of the docs examples for the Probe CRD reference using it with blackbox_exporter
Even though we’re overriding the path, it still sends blackbox_exporter-style arguments along with the request (see below). Thankfully, node_exporter just ignores them and passes back the metrics as expected.
Wireshark capture of incoming request
This isn’t necessarily a problem, but it’s worth being aware of.
Because we’re abusing the functionality of the Probe CRD, this solution could break with any update.
This is due to a refactor of the kind: Probe CRD that occurred between the version of kube-prometheus that the rancher-monitoring chart is using (v0.48.0) and the mainline version from upstream (kube-prometheus, v0.53.1).
What this all means: If you’re using the upstream kube-prometheus chart (rather than the rancher-monitoring chart) and you want to use the dynamic relabeling provided by prometheus, you’ll have to specify it directly under spec as metricRelabelings like so:
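Something like this (a sketch; the rest of the Probe spec stays the same as the earlier example):

```yaml
spec:
  # ...same prober / targets as before...
  # on newer versions of the CRD, the relabeling lives directly under spec
  metricRelabelings:
    - targetLabel: environment
      replacement: external
```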
Ideally, the prometheus-operator would provide a generic scrape config CRD, and it just so happens there’s an open GitHub issue for exactly that. As soon as that gets implemented, you’ll have a nice, clean, k8s-native way to point prometheus at external scrape targets.