ELK support: understanding metrics, displaying graph in kibana4, tuning #951

Open
gregbkr opened this issue Nov 4, 2015 · 1 comment

gregbkr commented Nov 4, 2015

Hello everyone,

First, thank you cAdvisor team for making that product available for Docker infra, it's a great solution for monitoring and so easy to deploy! :-)

I am running cAdvisor (latest version) with ELK support (ElasticSearch 1.7 | Logstash 1.5.3 | Kibana 4) on Docker.
It is working so far and I can get some graphs in Kibana 4. Please excuse me if this is the wrong place for these questions :-O. I need help on several points:

  1. MACHINE_NAME ID
    The field machine_name contains the ID of the cAdvisor container. It would be great to have the name of the container instead (the same as what you did with the field Container_Name).
  2. LIMITS
    How can I find the available limits (CPU, RAM, network) in order to set the graph limits or to display percentages? I can't find these values in the logs, but I can see them on the cAdvisor live web page.
  3. GRAPH VALUES LOGIC
    To understand container_Name and the graphs, can you confirm the following?
    "/" = root = /docker + /user + other_processes_running_locally
    "/docker" = container1 + container2 + etc
    "/user" = user session
    For CPU it seems to work.
    [screenshot: cpu]
    For RAM: /docker is nearly zero while some containers use 4GB of RAM. So my guesses are wrong.
    [screenshot: mem]
    For page faults and network: the cAdvisor container exceeds the root "/". So root does not represent the sum of all containers :-(
    [screenshot: net]
  4. CPU METRIC
    How should the CPU usage metric be interpreted?
    For the cAdvisor container I see stats.cpu.usage.total=30,000,000,000,000.
    "/" (root) seems to display cpu.usage.total=180,000,000,000,000 (and still growing).
    I found this definition: cpu.usage is the delta of cumulative CPU usage from the beginning of the minute to the end of the minute.
    On my server: cat /proc/cpuinfo
    bogomips: 5200.18 x 2 procs = 10,400.36 instructions per second --> 624,021.6 instructions per minute maximum.
    So I don't understand how a container can have such a big number...
    [screenshot: cpu]
  5. PAGEFAULT & TUNING
    cAdvisor generates lots of page faults and usually crashes after a few days. Is there some tuning you recommend?
  6. DATA IN ARRAY:
    I see the available fields for eth0, but nothing for eth1. I am not sure what I am doing wrong.
    [screenshot: eth1]
    The same goes for other metrics (IO, filesystem): I can see the fields in an array, but I can't build any visualization with them, and Kibana says "array is not well supported".
    [screenshot: io_disk]

Anyway, since I am asking for help, I am glad to help in return by testing any recommendations ;-)
You can build and access a test environment in Docker in a few clicks at: https://github.com/gregbkr/docker-elk-cadvisor-dashboards

Thanks a lot for your support, and have a very good week!

Regards,
Greg.

@jchauncey

cpu is an ever growing counter. You need to take the derivative (rate of change) when you graph that value.
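
A minimal Python sketch of that derivative, in case it helps. It assumes the counter is cumulative CPU time in nanoseconds (which would explain the very large values above) and that samples arrive as (timestamp in seconds, counter value) pairs. The function names and the sample data are made up for illustration and are not part of cAdvisor.

```python
NANOS_PER_SECOND = 1_000_000_000


def cpu_cores_used(samples):
    """Turn a cumulative CPU counter into a rate.

    samples: list of (timestamp_seconds, cumulative_cpu_ns), sorted by time.
    Returns a list of (timestamp_seconds, cores_used), one entry per interval.
    Assumption: the counter is in nanoseconds of CPU time.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0 or c1 < c0:
            # Skip duplicate/out-of-order samples and counter resets
            # (for example after a container restart).
            continue
        rates.append((t1, (c1 - c0) / (elapsed * NANOS_PER_SECOND)))
    return rates


def cpu_percent(cores_used, num_cores):
    """Express a rate in cores as a percentage of the machine's capacity."""
    return 100.0 * cores_used / num_cores


# Hypothetical samples taken one minute apart.
samples = [(0, 0), (60, 30_000_000_000), (120, 90_000_000_000)]
for timestamp, cores in cpu_cores_used(samples):
    print(timestamp, cores, cpu_percent(cores, num_cores=2))
```

With this reading, 30 billion nanoseconds consumed over 60 seconds is half a core on average, so the raw counter is never compared to a per-minute maximum; only its rate of change is graphed.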
