Friday, October 14, 2022

The Prometheus prometheus.yml File

Introduction
The explanation is as follows:
The default configuration file has four main configuration sections: global, alerting, rule_files, and scrape_configs.
1. The global Field
The explanation is as follows:
The global section contains the global configuration for the entire Prometheus server.

The scrape_interval field defines how often Prometheus pulls data; we specify 15 seconds above.

The evaluation_interval field defines how often Prometheus re-evaluates the rules; for now we do not need to care about this setting.
Example
We do it as follows:
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
Example
We do it as follows:
# my global config
global:
  # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  scrape_interval: 15s 
  # Evaluate rules every 15 seconds. The default is every 1 minute.
  evaluation_interval: 15s 
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 
#'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this 
  # config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
2. The alerting Field
The explanation is as follows:
The alerting section contains the configuration of the tool we will send an alert to if there is a problem with our system; as mentioned above, for Prometheus we will use Alertmanager. Right now we do not need alerts, so we comment the section out with a #.
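When an Alertmanager instance is actually running, the section can be enabled by uncommenting the target. A minimal sketch; the hostname alertmanager is illustrative, and 9093 is Alertmanager's default port:

```yaml
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093  # assumes Alertmanager is reachable under this hostname
```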
3. The rule_files Field
The explanation is as follows:
The rule_files section contains the configuration that defines the rules for when Prometheus should fire an alert via the Alertmanager, as well as the recording rules, which we will learn about later.
The explanation is as follows:
Alerting rules are used to define conditions under which alerts are triggered. As an essential part of monitoring and reliability engineering, you can set up notifications via various channels such as email, Slack, or Squadcast to help detect and resolve issues before they become critical.

In this case, the rule_files field points to a directory containing alert rules, which define the conditions under which alerts are triggered. Triggered alerts get sent to the specified Alertmanager targets, which you can further configure to send notifications to various channels, such as email or the Squadcast platform.
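For instance, a rule file matched by such a rule_files entry might look like this. This is a sketch; the group name, alert name, and threshold are illustrative:

```yaml
groups:
  - name: example-alerts        # illustrative group name
    rules:
      - alert: InstanceDown     # fires when a scrape target stops responding
        expr: up == 0           # the built-in up metric is 0 when a scrape fails
        for: 1m                 # the condition must hold for 1 minute before firing
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
```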

Example
We do it as follows:
global:
  scrape_interval: 15s
  evaluation_interval: 1m

rule_files:
  - /etc/prometheus/rules/*.rules

# alerting is a top-level section, not a per-job one
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093

scrape_configs:
  - job_name: 'darwin-service-1'
    scrape_interval: 5s
    static_configs:
      - targets: ['darwin-service-1:80']
    relabel_configs:
      - source_labels: [job]
        target_label: job
        replacement: 'darwin-new-service'
  - job_name: 'darwin-service-2'
    scrape_interval: 10s
    static_configs:
      - targets: ['darwin-service-2:80']
4. The scrape_configs Field

Consul
Example
We do it as follows. Here, consul_sd_configs and a service named my-service are used.
# my global config
global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # By default, evaluate rules every 15 seconds.
  # scrape_timeout is set to the global default (10s).

scrape_configs:
# consul job
- job_name: 'consul'
  consul_sd_configs:
  - server: 'apm-showroom-consul-server:8500'
    services:
    - my-service
  metrics_path: '/actuator/prometheus'
  relabel_configs:
  - source_labels: [__meta_consul_service_id]
    regex: 'my-service-(.*)-(.*)-(.*)'
    replacement: 'my-service-$1'
    target_label: node
  - source_labels: [__meta_consul_service_id]
    target_label: instance
# prometheus job
- job_name: 'prometheus'
  static_configs:
  - targets: 
    - 'apm-showroom-prometheus:9090'
The explanation is as follows:
Prometheus scrapes all the instances related to the service my-service by resolving them dynamically.

There is no need to know the IP address, the hostname, or the port.

You can see in the prometheus.yml configuration that we create a new label, node, thanks to the relabel_configs; this label holds the node name without the reference to the metrics type.
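The relabel rule above can be sketched in Python. The service id my-service-node1-http-metrics is a hypothetical example matching the my-service-(.*)-(.*)-(.*) pattern, and Python's \1 backreference plays the role of Prometheus's $1:

```python
import re

# Prometheus fully anchors relabel regexes, hence the ^ and $ here.
PATTERN = re.compile(r'^my-service-(.*)-(.*)-(.*)$')

def node_label(service_id: str) -> str:
    """Mimic the relabel rule: keep only the node part of the service id."""
    m = PATTERN.match(service_id)
    # Prometheus's replacement 'my-service-$1' corresponds to group(1) here.
    # Like Prometheus, leave the value untouched when the regex does not match.
    return f"my-service-{m.group(1)}" if m else service_id

print(node_label("my-service-node1-http-metrics"))  # my-service-node1
```

Note that the greedy groups still split on the last two hyphens, so the first group keeps only the node part.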


