Prometheus is an open-source monitoring system and time-series database. It uses a pull-based model: it collects metrics by querying each target defined in its configuration.
Prometheus scrapes metrics from instrumented jobs, either directly or via an intermediary push gateway for short-lived jobs.
It stores all scraped samples locally and runs rules over this data to either aggregate and record new time series from existing data or generate alerts.
Grafana or other API consumers can be used to visualize the collected data.
Our light-4j Prometheus metrics handler collects the API runtime information and saves it to the Prometheus metric data module. The Prometheus server pulls the metrics from the metrics_path, which is configurable in the Prometheus config YAML file (for example: /v1/prometheus). The pull interval is also set in the config file, in seconds (scrape_interval: 30s). A Grafana instance is hooked up to Prometheus to display the metrics on dashboards from two different perspectives.
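The scrape side of this arrangement can be sketched as a minimal Prometheus job; the job name and target address below are made up for illustration, only metrics_path and scrape_interval come from the text above:

```yaml
scrape_configs:
  - job_name: 'light-4j-api'          # hypothetical job name
    metrics_path: '/v1/prometheus'    # path served by the light-4j metrics handler
    scrape_interval: 30s              # pull every 30 seconds
    static_configs:
      - targets: ['localhost:8443']   # hypothetical host:port of the API
```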
By default, Prometheus metric targets (the nodes to monitor) are provided statically in the Prometheus config file. Static targets are simple when we know our infrastructure in advance, but in a microservice architecture the service APIs have dynamic targets in the cloud. Our solution is to use Consul to provide targets for Prometheus to monitor dynamically.
Consul is a service discovery tool by HashiCorp. It allows services and virtual machines to be registered with it, and then provides DNS and HTTP interfaces to query the state of the registered services and virtual machines.
In Prometheus, we configure this with consul_sd_configs. Prometheus then queries the Consul HTTP interface for the catalog of targets to monitor.
Configuration file (prometheus.yml) for the MiddlewareHandler (com.networknt.metrics.prometheus.PrometheusHandler):
// If the metrics handler is enabled or not
// If the Prometheus hotspot is enabled or not.
// Hotspot includes thread, memory, classloader, ...
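A minimal sketch of that handler config file, assuming the two options are plain booleans named `enabled` and `enableHotspot` (check the prometheus.yml shipped with your light-4j version for the exact keys):

```yaml
# If the metrics handler is enabled or not
enabled: true
# If the Prometheus hotspot is enabled or not.
# Hotspot includes thread, memory, classloader, ...
enableHotspot: false
```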
For each service API that you want to monitor with Prometheus, several config changes need to be made in handler.yml:
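A hedged sketch of that registration, in light-4j's handler.yml format; the alias name and chain layout here are illustrative, not the exact shipped file, and only the PrometheusHandler class name comes from this document:

```yaml
# handler.yml (fragment) -- chain the metrics collector into every request
handlers:
  # collector middleware from the light-4j metrics module
  - com.networknt.metrics.prometheus.PrometheusHandler@prometheus

chains:
  default:
    - prometheus
```

The scrape endpoint itself (for example /v1/prometheus) also needs a path entry mapped to the module's GET handler so the Prometheus server has something to pull from.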
Prometheus is configured via a configuration file written in YAML format, following the Prometheus schema.
Here is the sample config file. It can be found from the link:
```yaml
# my global config
global:
  scrape_interval: 15s     # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'services'
    consul_sd_configs:
      - server: 'consul:8500'
    tls_config:
      # Skip verification until we have resolved why the certificate validation
      # for the kubelet on API server nodes fail.
      insecure_skip_verify: true
    relabel_configs:
      - source_labels: ['__meta_consul_service']
      - source_labels: ['__meta_consul_tags']
      - source_labels: ['__meta_consul_tags']

  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']
```
Currently we are using the HTTPS protocol. For the local environment, we skip the SSL certificate check by setting `insecure_skip_verify: true` under `tls_config` in the config file.
Prometheus Metric names and labels
Every time series is uniquely identified by its metric name and a set of key-value pairs, also known as labels.
The metric name specifies the general feature of a system that is measured (e.g. `http_requests_total` - the total number of HTTP requests received). It may contain ASCII letters and digits, as well as underscores and colons. It must match the regex `[a-zA-Z_:][a-zA-Z0-9_:]*`.
Labels enable Prometheus’s dimensional data model: any given combination of labels for the same metric name identifies a particular dimensional instantiation of that metric (for example: all HTTP requests that used the method POST to the /api/tracks handler). The query language allows filtering and aggregation based on these dimensions. Changing any label value, including adding or removing a label, will create a new time series.
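For example, using the http_requests_total series from above, a PromQL query can select one label combination or aggregate across a dimension (this shows query syntax only; the metric and label names are from the generic example, not this project):

```
# all HTTP POST requests to the /api/tracks handler
http_requests_total{method="POST", handler="/api/tracks"}

# per-handler request rate over the last 5 minutes, summed across instances
sum by (handler) (rate(http_requests_total[5m]))
```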
The service “endpoint” and “clientId” are added as labels for Prometheus’s dimensional data model.
A sample result for the Prometheus metrics looks like this: