
Metrics monitor

On-device metrics monitor

The metrics monitor (available from qbee-agent version 2024.36) provides simple monitoring of selected system metrics by comparing them to a configured threshold. If a system metric reaches or exceeds its threshold, a log message with severity WARN is sent to the qbee backend. This log message can in turn trigger a user notification if one has been configured.

Qbee can monitor many of the same metrics that it displays graphs for. A key advantage of metrics monitoring is that it does not depend on metrics collection being switched on, so devices can be monitored without constantly sending metrics data to the backend. This reduces the amount of data travelling over the network and can yield substantial savings on high-cost network connections.

How does it work?

When a system metric equals or exceeds its configured threshold, the qbee-agent creates a log entry and stores the state of the metric. Metrics in the triggered state are not considered on subsequent qbee-agent runs, so the warning is not repeated. Once the metric falls below the configured threshold, its state is cleared. Reconfiguring the metric's threshold also clears the state.
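The trigger-and-clear behaviour described above can be modelled with a small state machine. The following is a minimal sketch, not the agent's actual implementation: the class name `ThresholdMonitor`, its methods, and the log message text are all hypothetical and only illustrate the rules stated in this section.

```python
from dataclasses import dataclass, field


@dataclass
class ThresholdMonitor:
    """Hypothetical model of the trigger/clear rules for one metric."""

    threshold: float
    triggered: bool = False
    log: list = field(default_factory=list)

    def check(self, value: float) -> None:
        """Evaluate one reading, as if called once per agent run."""
        if value >= self.threshold:
            if not self.triggered:
                # First run at or above the threshold: log once, store state.
                self.triggered = True
                self.log.append(
                    f"WARN: value {value} reached threshold {self.threshold}"
                )
            # Already triggered: no new log entry is created.
        else:
            # Below the threshold: clear the stored state.
            self.triggered = False

    def reconfigure(self, threshold: float) -> None:
        """Changing the threshold also clears the stored state."""
        self.threshold = threshold
        self.triggered = False


monitor = ThresholdMonitor(threshold=30)
for reading in [10, 30, 45, 50, 20, 35]:
    monitor.check(reading)

print(len(monitor.log))  # 2: one entry at 30, one at 35 after the drop to 20
```

In this model, the readings 45 and 50 produce no additional entries because the metric is already in the triggered state; only the drop to 20 re-arms it.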

Available system metrics

cpu:user

Description:     The percentage of time the CPU spends in user space
Threshold value: 0-100
Resource id:     none

cpu:system

Description:     The percentage of time the CPU spends in system space
Threshold value: 0-100
Resource id:     none

cpu:iowait

Description:     The percentage of time the CPU waits for I/O
Threshold value: 0-100
Resource id:     none

memory:memutil

Description:     The percentage of total memory currently in use
Threshold value: 0-100
Resource id:     none

memory:swaputil

Description:     The percentage of total swap currently in use
Threshold value: 0-100
Resource id:     none

filesystem:use

Description:     The percentage of filesystem space in use on a given partition
Threshold value: 0-100
Resource id:     required; must be a filesystem mountpoint (e.g. / or /data)
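To pick a sensible threshold for filesystem:use, it can help to check the current usage of a mountpoint on the device first. The sketch below uses Python's standard `shutil.disk_usage`; the helper name `filesystem_use_percent` is hypothetical, and the agent may compute the percentage differently (for example, by excluding reserved blocks), so treat the result as an approximation.

```python
import shutil


def filesystem_use_percent(mountpoint: str) -> float:
    """Return the used share of a filesystem as a percentage (0-100)."""
    usage = shutil.disk_usage(mountpoint)  # named tuple: total, used, free
    if usage.total == 0:
        # Some pseudo filesystems report a size of zero.
        return 0.0
    return 100.0 * usage.used / usage.total


# Example: inspect the root filesystem before choosing a threshold.
print(f"/ is {filesystem_use_percent('/'):.1f}% full")
```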

loadavg_weighted:1min

Description:     The system load average for the last minute
Threshold value: 0-
Resource id:     none

loadavg_weighted:5min

Description:     The system load average for the last 5 minutes
Threshold value: 0-
Resource id:     none

loadavg_weighted:15min

Description:     The system load average for the last 15 minutes
Threshold value: 0-
Resource id:     none

network:rx_bytes

Description:     Bytes received on a network interface between agent intervals
Threshold value: 0-
Resource id:     required; must be a configured network interface (e.g. eth0)

network:tx_bytes

Description:     Bytes transmitted on a network interface between agent intervals
Threshold value: 0-
Resource id:     required; must be a configured network interface (e.g. eth0)
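The network metrics are described as byte counts between agent intervals rather than running totals. One plausible way to derive such a value is to sample a cumulative interface counter twice and take the difference. The sketch below assumes this approach; it is not confirmed as the agent's method. The helper names are hypothetical, and the `/sys/class/net/<interface>/statistics/rx_bytes` path is Linux-specific.

```python
from pathlib import Path


def read_rx_bytes(interface: str) -> int:
    """Read the cumulative received-bytes counter (Linux sysfs only)."""
    path = Path(f"/sys/class/net/{interface}/statistics/rx_bytes")
    return int(path.read_text())


def bytes_between(previous: int, current: int) -> int:
    """Bytes counted between two samples of a cumulative counter."""
    if current < previous:
        # The counter was reset (e.g. reboot or interface restart), so only
        # the bytes counted since the reset are known.
        return current
    return current - previous


# Example with fixed sample values instead of live counters.
print(bytes_between(1_000, 4_500))  # 3500
```

A threshold for network:rx_bytes would then be compared against a per-interval value of this kind, so the length of the agent interval affects which threshold is reasonable.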

temperature:temperature

Description:     Temperature in Celsius reported by temperature sensors
Threshold value: 0-
Resource id:     required; currently the only supported value is cpu_temp

Example: Setting a threshold for cpu:user

In the following screenshot, we set a threshold of 30% for the cpu:user metric.

Set cpu:user metric to 30%

Log messages example:

Set cpu:user metric to 30%