Metrics
Tracking metrics gives insight into how the system is operating and establishes a baseline for what is normal. Consistent tracking of metrics can help detect anomalies early.
The Integrity application, if configured to do so, generates a number of metrics. Each metric has a name and a type, e.g. Counter, Gauge, or Timer. The name and type determine which time series a metric exposes. These time series can be monitored and analyzed using a supported monitoring backend. Currently, Integrity supports JMX and Prometheus.
The type, naming conventions and number of time series generated from a particular metric may vary depending on the selected monitoring backend.
Monitoring systems
JMX
The JVM runtime can be monitored using standard JMX tools such as JConsole and Java Mission Control. Among many other things, these tools display information about the metrics generated by Integrity.
To instrument the application to provide JMX metrics, activate the JMX scenario. The following example shows how to start a server that activates JMX metrics and allows a JMX client to connect remotely from 192.168.0.10 to port 7091 on the server host:
JMX clients can always connect to the JVM from localhost, but if the JMX client is running on a remote host (e.g. if the server is running inside a Docker container and the client on the Docker host), the JMX_REMOTE scenario must be activated.
Prometheus
To instrument the application to provide metrics to the Prometheus monitoring system, activate the PROMETHEUS scenario. The following example shows how to start a server that exposes Prometheus metrics:
The application activates an HTTP service exposing metrics that can be scraped by the Prometheus application at the following URL:
The table below lists the base names of all available metrics. Each metric type is exposed by Prometheus as follows:
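To check by hand what the metrics endpoint returns, a small script can fetch the page and parse it. The following is a minimal sketch, assuming the endpoint serves the standard Prometheus text exposition format. The function names, the HELP text and the sample values are made up for illustration, and the URL is left as a parameter because it depends on the installation.

```python
from typing import Dict
from urllib.request import urlopen


def parse_exposition(text: str) -> Dict[str, float]:
    """Parse Prometheus text exposition lines into {series: value}.

    Simplified: comment lines (# HELP / # TYPE) are skipped, a label set
    is kept as part of the key, and optional timestamps are ignored.
    """
    samples: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Label values may contain spaces, so split after the label set.
            end = line.rindex("}") + 1
            key, rest = line[:end], line[end:].split()
        else:
            parts = line.split()
            key, rest = parts[0], parts[1:]
        samples[key] = float(rest[0])
    return samples


def scrape(url: str) -> Dict[str, float]:
    """Fetch a metrics page; the URL is supplied by the caller."""
    with urlopen(url, timeout=5) as response:
        return parse_exposition(response.read().decode("utf-8"))


# Made-up sample page, used here instead of a live endpoint.
SAMPLE = """\
# HELP ffid_pipes_processed_duration_seconds Illustrative help text
# TYPE ffid_pipes_processed_duration_seconds summary
ffid_pipes_processed_duration_seconds_count 42
ffid_pipes_processed_duration_seconds_sum 3.5
"""

if __name__ == "__main__":
    for series, value in parse_exposition(SAMPLE).items():
        print(series, value)
```

For anything beyond a quick check, a maintained parser or the Prometheus server itself is the better choice; this sketch does not handle every corner of the format.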
Counter: a value that can only be incremented. Exposes the single time series <base_name>_total.
Gauge: a value that can go up and down. Exposes the single time series <base_name>_max.
Timer: a sampled value such as a duration or latency. Exposes the time series <base_name>_count and <base_name>_sum.
Example: the metric ffid_pipes_processed_duration is a Timer describing the time spent processing successful requests. This metric generates the following time series, with the unit seconds inserted before the suffix:
ffid_pipes_processed_duration_seconds_sum
ffid_pipes_processed_duration_seconds_count
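The naming rules above can be summarised in a few lines of code, which may be handy when generating dashboards or alert rules from a list of base names. This is only a sketch of the conventions as listed in this section, not Integrity code. The function name and the unit parameter are hypothetical, and applying the unit to types other than Timer is an assumption.

```python
from typing import List


def prometheus_series(base_name: str, metric_type: str, unit: str = "") -> List[str]:
    """Return the time series names expected for a metric base name.

    Follows the suffixes listed in this section. The optional unit
    (e.g. "seconds") is placed between the base name and the suffix,
    as in the ffid_pipes_processed_duration example.
    """
    stem = f"{base_name}_{unit}" if unit else base_name
    kind = metric_type.lower()
    if kind == "counter":
        return [f"{stem}_total"]
    if kind == "gauge":
        return [f"{stem}_max"]
    if kind == "timer":
        return [f"{stem}_sum", f"{stem}_count"]
    raise ValueError(f"unknown metric type: {metric_type}")


if __name__ == "__main__":
    for name in prometheus_series("ffid_pipes_processed_duration", "Timer", unit="seconds"):
        print(name)
```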
Using the following Prometheus queries, we can graph the most commonly used statistics about timers.
Average latency:
Throughput (requests per second):
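Both statistics are derived from how the _sum and _count series of a timer change over a time window. In PromQL this is commonly written with the rate() function, for example rate(<series>_sum[5m]) / rate(<series>_count[5m]) for average latency and rate(<series>_count[5m]) for throughput, where the 5m window is an arbitrary choice. The sketch below shows the same arithmetic on two scrapes of a timer. The class and function names and the sample numbers are illustrative, and counter resets (e.g. after a server restart) are not handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimerSample:
    timestamp: float      # scrape time, in seconds
    total_seconds: float  # value of the <series>_sum time series
    count: float          # value of the <series>_count time series


def average_latency(earlier: TimerSample, later: TimerSample) -> Optional[float]:
    """Average seconds per request between two scrapes, or None if idle."""
    requests = later.count - earlier.count
    if requests <= 0:
        return None
    return (later.total_seconds - earlier.total_seconds) / requests


def throughput(earlier: TimerSample, later: TimerSample) -> float:
    """Requests per second between two scrapes."""
    elapsed = later.timestamp - earlier.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")
    return (later.count - earlier.count) / elapsed


if __name__ == "__main__":
    # Made-up values: 300 requests taking 15 s in total over one minute.
    first = TimerSample(timestamp=0.0, total_seconds=10.0, count=100)
    second = TimerSample(timestamp=60.0, total_seconds=25.0, count=400)
    print(average_latency(first, second))
    print(throughput(first, second))
```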
Available Metrics
Integrity
Here is a list of fixed metrics exposed by the Integrity application.
Dynamic metrics
Apart from fixed metric names, Integrity exposes a number of dynamic metric names. This means that metrics are exposed based on the naming and configuration of the current system setup. The idea is to allow detailed monitoring where desired.
Dynamic metrics generally come from modules that have outbound connections (HTTP, LDAP, SQL, SMTP, etc.), the Pipes module (valves), and authenticators. Details about these metrics are found in the respective sections of the documentation.
Vert.x
The Integrity application is largely built on Eclipse Vert.x. Vert.x generates metrics related to the Vert.x event bus, monitoring low-level activity on the application level.
Metrics generated by Vert.x can also help monitor and analyse incoming/outgoing traffic over various protocols, for example:
HTTP activity
TCP activity
Datagram transmission and reception
Metrics exposed by Eclipse Vert.x are documented here:
https://vertx.io/docs/vertx-micrometer-metrics/java/#_vert_x_core_tools_metrics