prometheus metric for container healthcheck status #2166

replicajune · 2019-02-06T17:34:20Z

Hi,

As far as I know, no metrics are available for the healthcheck status of a container.

I see a metric about the "up" state of a container (container_last_seen), but nothing about what can be checked through State.Health.Status with Docker.

This status isn't really a metric because it returns a string, but I would guess that a boolean for each possible value would be useful (running, healthy, unhealthy for the ones I know).
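The "one boolean per possible value" idea could look roughly like the sketch below. It is only an illustration, not anything cAdvisor provides: the metric name container_health_status and the state label are hypothetical, and the state list assumes the values Docker uses for a health status (starting, healthy, unhealthy), with "none" standing in for containers that define no health check.

```python
from typing import Dict

# Assumed health states; "none" is a placeholder for containers that
# define no HEALTHCHECK.
HEALTH_STATES = ("none", "starting", "healthy", "unhealthy")


def health_state_samples(name: str, status: str) -> Dict[str, int]:
    """Return one 0/1 sample per possible state, keyed by series name."""
    if status not in HEALTH_STATES:
        raise ValueError(f"unknown health status: {status!r}")
    # container_health_status is a hypothetical metric name.
    return {
        f'container_health_status{{name="{name}",state="{state}"}}': int(state == status)
        for state in HEALTH_STATES
    }


def render(samples: Dict[str, int]) -> str:
    """Format the samples as Prometheus text exposition lines."""
    return "\n".join(f"{series} {value}" for series, value in samples.items())


if __name__ == "__main__":
    print(render(health_state_samples("foo", "unhealthy")))
```

Exactly one series per container is 1 at any time, so a query can filter on the state label instead of parsing a string.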

dashpole · 2019-02-06T17:38:14Z

Does an equivalent exist for all container runtimes cAdvisor supports (mesos, containerd, rkt, docker)?

We usually try and stay away from spec-based metrics, as they tend to be runtime-specific, and generate large numbers of metric streams for each container.

replicajune · 2019-02-06T18:38:58Z

I'm not familiar with all the specifications that exist at this time. I'm under the impression (and could be wrong) that the OCI has proposed or will propose something standard for this.

So, unfortunately, I have no idea.

What I need is a metric about the work produced by a container, rather than the state of a process (container_tasks_state) or whether a container is up or not.

The healthcheck instruction and the related status in Docker help to figure out whether a container actually does what it should, and I don't see any metrics about that for now.

xavs · 2019-05-29T08:21:15Z

This would be one very useful addition.

BulatSaif · 2019-10-21T11:54:21Z

Has anyone found a workaround?

fontanacalifornia · 2019-11-21T21:37:50Z

I am also looking to accomplish this.

dashpole · 2019-11-23T01:11:43Z

The kubelet does have this kind of metric: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/prober/prober_manager.go#L38.
Those metrics are registered at /metrics/probes on the kubelet's port.

But that doesn't help anyone not using kubernetes...

I'm not sure if cAdvisor should take on metrics collection on probes, as it isn't performing them. I believe we currently only fetch the container from docker at container creation time, so this would require us to poll the runtime for the information. I'm not sure we can provide accurate cumulative probe metrics based on sampling the state. It seems like we are bound to miss probe failures.
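The concern about missing probe failures when sampling can be illustrated with a small simulation. This is a sketch under assumptions: the per-second status timeline and the 5-second polling interval are made up for illustration and do not describe how cAdvisor or Docker actually schedule anything.

```python
from typing import List


def count_failures(timeline: List[str]) -> int:
    """Count transitions into 'unhealthy' in a sequence of observed statuses."""
    failures = 0
    previous = None
    for status in timeline:
        if status == "unhealthy" and previous != "unhealthy":
            failures += 1
        previous = status
    return failures


def sample(timeline: List[str], interval: int) -> List[str]:
    """Keep only every `interval`-th observation, as a poller would."""
    return timeline[::interval]


if __name__ == "__main__":
    # Hypothetical per-second status: short failures at t=3 and t=7..8.
    timeline = (
        ["healthy"] * 3 + ["unhealthy"] + ["healthy"] * 3
        + ["unhealthy"] * 2 + ["healthy"] * 3
    )
    print("actual failures:", count_failures(timeline))
    print("seen when polling every 5s:", count_failures(sample(timeline, 5)))
```

With this timeline the poller observes only t=0, 5, and 10, all of which are healthy, so both failures go unseen. A cumulative failure counter built from such samples would undercount.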

anil4u-04 · 2019-12-02T15:09:51Z

Hi Team,

Any advice, update, or workaround here would be very helpful for everyone. We need this "health_check" very badly.

serhiiromaniuk · 2020-02-18T11:15:48Z

Hi everybody! sum(time() - container_last_seen) by (name) is a workaround for me, but sometimes it works really badly.

serhiiromaniuk · 2020-02-20T12:05:35Z

Also, for alerts, sum(rate(container_last_seen{name=~".+"}[5m])) by (container_label_com_docker_compose_service) < 1 with 15s scrapes helps me to stop crying all day.
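For readers unfamiliar with the time() - container_last_seen trick, the same staleness check can be sketched outside PromQL. This is an illustration only: the function name, the input dictionary, and the threshold are hypothetical, and the timestamps are assumed to be Unix seconds as in container_last_seen.

```python
import time
from typing import Dict, List, Optional


def stale_containers(
    last_seen: Dict[str, float],
    threshold_s: float,
    now: Optional[float] = None,
) -> List[str]:
    """Return names whose last-seen timestamp is older than threshold_s seconds."""
    now = time.time() if now is None else now
    return sorted(name for name, ts in last_seen.items() if now - ts > threshold_s)


if __name__ == "__main__":
    # Hypothetical timestamps: "a" was seen 10s ago, "b" 70s ago.
    print(stale_containers({"a": 100.0, "b": 40.0}, threshold_s=30, now=110.0))
```

Like the query, this only says that a container has not been seen recently. It says nothing about whether a running container is healthy.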

mbigras · 2020-04-03T06:53:11Z

It's hard to create alerts based on metrics that disappear, and it also goes against Prometheus best practices. I still don't understand why we can't just use absent and move on, but you can read more about it here:

https://www.robustperception.io/existential-issues-with-metrics

Recently, a coworker discovered this exporter:

https://github.com/prometheus-net/docker_exporter

It exposes a very valuable metric, docker_container_running_state, which won't disappear when the container stops!

Here's an example:

$ sudo docker run \
	--name docker_exporter \
	--detach \
	--restart always \
	--volume /var/run/docker.sock:/var/run/docker.sock \
	--publish 9417:9417 \
	prometheusnet/docker_exporter
$ sudo docker create --name foo -it ubuntu sleep 10
$ sudo docker start foo
$ curl -s localhost:9417/metrics | grep state
docker_container_running_state{name="foo"} 1
docker_container_running_state{name="docker_exporter"} 1
# wait ten seconds
$ curl -s localhost:9417/metrics | grep state
docker_container_running_state{name="foo"} 0
docker_container_running_state{name="docker_exporter"} 1

Cobertos · 2020-08-28T06:13:59Z

Healthchecking should be added to the above repo when prometheus-net/docker_exporter#11 is merged.

karugaru · 2021-03-09T08:18:19Z

To solve this issue, I wrote an application in Go that exports the state of the container.

https://github.com/karugaru/docker_state_exporter
https://hub.docker.com/r/karugaru/docker_state_exporter

Try it if you like!

mariusleu · 2022-11-10T06:38:52Z

Can someone add this metric for monitoring the HEALTHCHECK of a container? (https://docs.docker.com/engine/reference/builder/#healthcheck)

Pandede · 2023-11-16T05:51:48Z

Is there any progress on this issue?

luanpaschoal · 2023-12-12T14:30:23Z

Up!

skast96 · 2023-12-14T14:11:51Z

Is there a way to do that in 2023?

mateuszdrab · 2024-05-28T00:21:53Z

I just had a play with the source code and put together a more or less working PoC, which I committed to my fork.

The metric is called container_health_state, but I noticed it reports 0 when there is no health check, so a better way to present this is probably needed.

container_health_state{container_label_org_opencontainers_image_created="",container_label_org_opencontainers_image_description="",container_label_org_opencontainers_image_licenses="",container_label_org_opencontainers_image_revision="",container_label_org_opencontainers_image_source="",container_label_org_opencontainers_image_title="",container_label_org_opencontainers_image_url="",container_label_org_opencontainers_image_version="",id="/user.slice/user-0.slice/user@0.service/init.scope",image="",name=""} 0 1716855349446

container_health_state{container_label_org_opencontainers_image_created="2024-05-08T20:07:29.227Z",container_label_org_opencontainers_image_description="Pi-hole in a docker container",container_label_org_opencontainers_image_licenses="NOASSERTION",container_label_org_opencontainers_image_revision="c2887aeffe4ac7d4d0730e739c4cd5a4ad40e958",container_label_org_opencontainers_image_source="https://github.com/pi-hole/docker-pi-hole",container_label_org_opencontainers_image_title="docker-pi-hole",container_label_org_opencontainers_image_url="https://github.com/pi-hole/docker-pi-hole",container_label_org_opencontainers_image_version="2024.05.0",id="/system.slice/docker-3dc08d059431db016cf7bf1065b11f600a8acd9b7b654ad59fd00596b891d9b1.scope",image="pihole/pihole:latest",name="Pi-hole-Redirect"} 1 1716855349134

Does anyone want to give it a try?

You'll have to compile cAdvisor from source, or I can provide a compiled binary.
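On the point about reporting 0 when there is no health check, one possible presentation is to emit no sample at all for such containers, so that 0 always means "has a health check and is not healthy". The sketch below is not the code from the fork. It assumes a dictionary shaped like `docker inspect` output, where State.Health is missing when no HEALTHCHECK is defined, and the helper names are hypothetical.

```python
from typing import Any, Dict, Optional


def health_value(inspect: Dict[str, Any]) -> Optional[int]:
    """Return 1 (healthy), 0 (not healthy), or None (no health check)."""
    health = inspect.get("State", {}).get("Health")
    if not health or health.get("Status") in (None, "", "none"):
        return None  # no HEALTHCHECK configured: emit no sample at all
    return 1 if health["Status"] == "healthy" else 0


def exposition_line(name: str, inspect: Dict[str, Any]) -> Optional[str]:
    """Build a text exposition line, or None when the series should be skipped."""
    value = health_value(inspect)
    if value is None:
        return None
    return f'container_health_state{{name="{name}"}} {value}'


if __name__ == "__main__":
    print(exposition_line("with-check", {"State": {"Health": {"Status": "unhealthy"}}}))
    print(exposition_line("no-check", {"State": {"Status": "running"}}))
```

The trade-off is that the series is absent for containers without a health check, which brings back the disappearing-metric concern raised earlier in this thread. A separate state label would be another option.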

dashpole added the kind/enhancement label May 29, 2019

corhere mentioned this issue Dec 18, 2023

Prometheus Metric for Health Check Status moby/moby#46958

Open
