Monitoring Networks With Prometheus: Štefan Šafár, CDN Engineer

Štefan Šafár is a CDN engineer at Showmax who uses Prometheus for cloud-native monitoring. Prometheus is an open-source time-series database that stores floating-point values at regular intervals and allows for powerful querying of metrics with labels. It integrates well with Showmax's stack and allows more capabilities than other monitoring systems. Štefan provides examples of PromQL queries used at Showmax and links to Grafana dashboards they use to visualize Prometheus data.


Monitoring networks with Prometheus

Štefan Šafár
CDN Engineer
@som_zlo

@ShowmaxDevs https://tech.showmax.com
Who am I?
● I’m Štefan Šafár
● CDN Engineer @ Showmax
● We deliver tens of Gbit/s
● Prometheus user since 2015
● Used to do security, networks and cloud infrastructure
● Usually based in Prague

Contents
● What is Prometheus
● Why we use it
● Query examples & dashboards

What is Prometheus
● Time-series database
● Stores floating-point values every X seconds
● Raw data - no aggregation
● Powerful query language
● Can sum/average/add/multiply any data
● Labels allow you to slice the data
● Exporters for different services (e.g. SNMP)
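
To make the "labels allow you to slice the data" point concrete, here is a minimal Python sketch of how label matchers narrow down a set of series. It is only an illustration, not how Prometheus is implemented: the sample series, label values and the `select` helper are all made up for this example. One detail worth knowing is that PromQL regex matchers are anchored at both ends, which is why the sketch uses `re.fullmatch` (Prometheus itself uses RE2 syntax, which differs slightly from Python's `re`).

```python
import re

# Hypothetical samples: (metric name, label set, current value).
SERIES = [
    ("arista_port_outOctets", {"device": "sw1", "description": "NAP uplink 1", "mtu": "1500"}, 1.2e12),
    ("arista_port_outOctets", {"device": "sw1", "description": "office", "mtu": "9214"}, 3.4e9),
    ("arista_port_outOctets", {"device": "sw2", "description": "to NAP", "mtu": "1500"}, 8.0e11),
]

def select(series, name, equal=None, not_equal=None, regex=None):
    """Return (labels, value) pairs matching the name and the =, != and =~ matchers."""
    equal, not_equal, regex = equal or {}, not_equal or {}, regex or {}
    selected = []
    for metric, labels, value in series:
        if metric != name:
            continue
        if any(labels.get(k, "") != v for k, v in equal.items()):
            continue
        if any(labels.get(k, "") == v for k, v in not_equal.items()):
            continue
        # Regex matchers must match the whole label value, hence fullmatch.
        if any(re.fullmatch(p, labels.get(k, "")) is None for k, p in regex.items()):
            continue
        selected.append((labels, value))
    return selected

# Roughly: arista_port_outOctets{description=~".*NAP.*"}
nap_ports = select(SERIES, "arista_port_outOctets", regex={"description": ".*NAP.*"})
# Roughly: arista_port_outOctets{mtu!="1500"}
jumbo_ports = select(SERIES, "arista_port_outOctets", not_equal={"mtu": "1500"})
```
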

Why Prometheus
● Cloud-native monitoring
● Integrates very well with the rest of our stack
● Ops use it already - one system to rule them all
● It allows you to do more stuff more easily
● Everything else* sucks

* that I know of

PromQL Examples
● arista_port_outOctets{description=~".*NAP.*"}
● rate(arista_port_outOctets{description=~".*NAP.*"}[3m])
● rate(arista_port_outOctets{description=~".*NAP.*"}[3m])*8
● sum(rate(arista_port_outOctets{description=~".*NAP.*"}[3m])*8)
● arista_port_outOctets{mtu!="1500"}
● (arista_tcam_used / arista_tcam_total)*100
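
The progression above goes from a raw octet counter, to a per-second rate, to bits per second, to a sum over ports. A rough Python sketch of that arithmetic follows. It is a simplification under stated assumptions: the sample values are invented, `simple_rate` is a hypothetical helper, and unlike the real `rate()` it does not extrapolate to the edges of the window.

```python
def simple_rate(samples):
    """Per-second increase of a counter over (timestamp, value) samples.

    A value lower than its predecessor is treated as a counter reset
    (restart from zero). No extrapolation to the window boundaries is done.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        increase += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed

# Hypothetical octet counter scraped every 60 s across a 3 minute window.
window = [(0, 1_000_000), (60, 1_750_000), (120, 2_500_000), (180, 3_250_000)]

octets_per_second = simple_rate(window)   # like rate(...[3m])
bits_per_second = octets_per_second * 8   # like rate(...[3m])*8

# sum(...) adds the per-port results into one number; here a second,
# made-up port carries twice the traffic of the first.
total_bps = sum(r * 8 for r in [octets_per_second, 2 * octets_per_second])
```
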

PromQL Examples
● sum(rate(arista_port_outOctets{description=~".*NAP.*"}[3m]))*8 - sum(rate(arista_port_outOctets{description=~".*NAP.*"}[3m] offset 1d))*8
● arista_sfp_alarms
● arista_sfp_alarms AND ON (device, instance) arista_admin_up == 0
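
The last query combines two things: the `== 0` comparison filters `arista_admin_up` down to series whose value is 0, and `AND ON (device, instance)` then keeps only those alarm series that have a partner with the same `device` and `instance` labels. The Python sketch below mimics that matching. The label values, the extra `alarm` label and the `and_on` helper are hypothetical and only serve the illustration.

```python
def and_on(left, right, on):
    """Keep left-hand samples whose `on` labels equal those of some right-hand sample."""
    keys = {tuple(labels.get(k, "") for k in on) for labels, _ in right}
    return [
        (labels, value)
        for labels, value in left
        if tuple(labels.get(k, "") for k in on) in keys
    ]

# Hypothetical instant vectors as (labels, value) pairs.
sfp_alarms = [
    ({"device": "sw1", "instance": "a:9200", "alarm": "rxPower"}, 1.0),
    ({"device": "sw2", "instance": "b:9200", "alarm": "rxPower"}, 1.0),
]
admin_up = [
    ({"device": "sw1", "instance": "a:9200"}, 0.0),
    ({"device": "sw2", "instance": "b:9200"}, 1.0),
]

# arista_admin_up == 0: a comparison without `bool` drops non-matching series.
admin_down = [(labels, value) for labels, value in admin_up if value == 0]

# arista_sfp_alarms AND ON (device, instance) (arista_admin_up == 0)
result = and_on(sfp_alarms, admin_down, on=("device", "instance"))
```

The result keeps the labels and values of the left-hand side; the right-hand side only decides which series survive.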

PromQL Examples
● quantile_over_time(0.99,rate(ifHCOutOctets{ifAlias="600_P2P-CRESTA-OFFICE"}[3m])[1h:])*8
● quantile_over_time(0.95,rate(ifHCOutOctets{ifAlias=~".*OPTINET.*"}[3m])[1w:])*8
● quantile_over_time(0.95,sum by (instance)(rate(ifHCOutOctets{ifAlias=~".*OPTINET.*"}[3m]))[1w:])*8
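
These queries use a subquery (`[1h:]`, `[1w:]`) to evaluate the inner rate repeatedly over the range, and `quantile_over_time` then picks a percentile from the resulting values, which is the usual way to get a 95th percentile traffic figure. The sketch below shows the percentile step only, using linear interpolation between the two nearest ranks. The `quantile` helper and the traffic numbers are assumptions made for the example.

```python
import math

def quantile(q, values):
    """q-quantile (0 <= q <= 1) with linear interpolation between ranks."""
    if not values:
        return math.nan
    ordered = sorted(values)
    rank = q * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = min(len(ordered) - 1, lower + 1)
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight

# Hypothetical per-step rates in octets/s, as a subquery might produce them.
rates = [1.0e6, 2.0e6, 3.0e6, 4.0e6, 5.0e6]

p95_octets = quantile(0.95, rates)
p95_bits = p95_octets * 8   # the trailing *8 in the queries
```

Note that a single short burst barely moves the 95th percentile, which is why it is commonly preferred over the maximum for capacity and billing figures.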

PromQL Examples
● (arista_tcam_used / arista_tcam_total)*100
● irate(arista_port_inOctets[5m]) / irate(arista_port_inUcastPkts[5m]) < 2000
● arista_admin_up != arista_l2_up
● arista_sfp_stats{sensor="rxPower"}
● arista_sfp_stats{sensor="rxPower"} AND on(device, instance) (arista_admin_up == 1)
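
The `irate` query above divides incoming octets per second by incoming packets per second, which gives an average packet size in bytes, and then keeps only series below 2000. As a rough sketch of that calculation: `irate` looks only at the last two samples in the window, so it reacts faster than `rate` but is noisier. The helper name `instant_rate` and all sample values are invented for this example.

```python
def instant_rate(samples):
    """Per-second rate from the last two (timestamp, value) samples only."""
    if len(samples) < 2:
        return None
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    # A drop in the counter is treated as a reset (restart from zero).
    delta = v1 - v0 if v1 >= v0 else v1
    return delta / (t1 - t0)

# Hypothetical counters for one port, scraped every 60 s.
in_octets = [(0, 0), (60, 6_000_000), (120, 15_000_000)]
in_packets = [(0, 0), (60, 10_000), (120, 16_000)]

avg_packet_size = instant_rate(in_octets) / instant_rate(in_packets)

# `< 2000` acts as a filter: the series is kept only if the comparison holds.
kept = avg_packet_size < 2000
```
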

Grafana dashboards
● https://grafana.showmax.cc/d/vvJSOdkWk/sfp-inventory?orgId=1
● https://grafana.showmax.cc/d/OZmQd16ik/bgp-status?orgId=1
● https://grafana.showmax.cc/d/kduYH-DWz/sfp-receive-power?orgId=1

Summary
● SNMP sucks
● Prometheus is awesome
● Grafana is awesome
● You are awesome

THANK YOU!
Get in touch!

Štefan Šafár
som_zlo

Additional links
● Data source for most of the queries used in Examples: https://github.com/Showmax/arista-eos-exporter
● Blog post about Prometheus: https://tech.showmax.com/2019/10/prometheus-introduction/
