Deploying Prometheus with Email, WeChat Work, and DingTalk Alerting

1 Deploy the Prometheus server
1.1 Download the binary package
$ wget https://github.com/prometheus/prometheus/releases/download/v2.12.0/prometheus-2.12.0.linux-amd64.tar.gz
1.2 Extract and move to /usr/local/prometheus
$ tar zxvf prometheus-2.12.0.linux-amd64.tar.gz
$ mv prometheus-2.12.0.linux-amd64 /usr/local/prometheus
1.3 Configure and start
$ cat prometheus.yml
global:
  scrape_interval: 15s # Default scrape interval: scrape targets every 15 seconds.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "localhost:9093"
            # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "/usr/local/prometheus/rules/mysql*.rules"
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'linux'
    static_configs:
      - targets: ['localhost:9100'] # node_exporter listens on :9100 (see section 2.3)
        labels:
          instance: node1

  - job_name: 'mysql'
    static_configs:
      - targets: ['192.168.1.11:9104']
        labels:
          instance: db1

$ /usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus
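The `rule_files` entry in prometheus.yml loads every file matching `/usr/local/prometheus/rules/mysql*.rules`, but no such file is shown in this post. Below is a minimal sketch of one, assuming the Prometheus 2.x YAML rule format and the `mysql_up` metric exposed by mysqld_exporter. The file name `mysql_alerts.rules`, the alert name `MySQLDown`, the `for: 1m` duration, and the `RULES_DIR` variable are illustrative assumptions, not part of the original setup.

```shell
#!/bin/sh
# Hypothetical location; set RULES_DIR=/usr/local/prometheus/rules on the real host.
RULES_DIR="${RULES_DIR:-./rules}"
mkdir -p "$RULES_DIR"

# Quoted heredoc so the shell leaves {{ ... }} and $labels untouched.
cat > "$RULES_DIR/mysql_alerts.rules" <<'EOF'
groups:
  - name: mysql
    rules:
      - alert: MySQLDown            # illustrative alert name
        expr: mysql_up == 0         # mysqld_exporter reports 0 when it cannot reach MySQL
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "MySQL on {{ $labels.instance }} is down"
EOF

echo "wrote $RULES_DIR/mysql_alerts.rules"
```

The `promtool` binary shipped in the Prometheus tarball can validate the file with `promtool check rules <file>` before Prometheus is restarted.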
2 Deploy node_exporter
2.1 Download the binary package
$ wget https://github.com/prometheus/node_exporter/releases/download/v0.17.0/node_exporter-0.17.0.linux-amd64.tar.gz
2.2 Extract and move to /usr/local/prometheus
$ tar zxvf node_exporter-0.17.0.linux-amd64.tar.gz
$ mv node_exporter-0.17.0.linux-amd64 /usr/local/prometheus/node_exporter
2.3 Start
$ /usr/local/prometheus/node_exporter/node_exporter --web.listen-address=:9100
3 Deploy mysqld_exporter
3.1 Download the binary package
$ wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.11.0/mysqld_exporter-0.11.0.linux-amd64.tar.gz
3.2 Extract and move to /usr/local/prometheus
$ tar zxvf mysqld_exporter-0.11.0.linux-amd64.tar.gz
$ mv mysqld_exporter-0.11.0.linux-amd64 /usr/local/prometheus/mysqld_exporter
3.3 Create a MySQL user for mysqld_exporter, grant privileges, and start
$ cat .my.cnf
[client]
user=mysqld_exporter
password=000000

$ /usr/local/prometheus/mysqld_exporter/mysqld_exporter --config.my-cnf=/usr/local/prometheus/mysqld_exporter/.my.cnf
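The heading mentions creating and authorizing a MySQL user, but only the `.my.cnf` is shown. A sketch of the SQL step follows, based on the privileges the mysqld_exporter README asks for (`PROCESS`, `REPLICATION CLIENT`, `SELECT`). The `localhost` host part and the `exporter_grants_sql` helper are assumptions for illustration; the user name and password simply mirror the `.my.cnf` above.

```shell
#!/bin/sh
# Hypothetical helper: prints the SQL so it can be reviewed before it is applied.
exporter_grants_sql() {
  user="$1"
  password="$2"
  cat <<EOF
CREATE USER '${user}'@'localhost' IDENTIFIED BY '${password}';
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO '${user}'@'localhost';
EOF
}

# Print the statements for the account used in .my.cnf.
exporter_grants_sql mysqld_exporter 000000
```

To apply the statements, the output can be piped into the client, for example `exporter_grants_sql mysqld_exporter 000000 | mysql -u root -p`.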
4 Deploy Alertmanager
4.1 Download the binary package
$ wget https://github.com/prometheus/alertmanager/releases/download/v0.16.1/alertmanager-0.16.1.linux-amd64.tar.gz
4.2 Extract and move to /usr/local/prometheus
$ tar zxvf alertmanager-0.16.1.linux-amd64.tar.gz
$ mv alertmanager-0.16.1.linux-amd64 /usr/local/prometheus/alertmanager
4.3 Edit the configuration file and start
$ cat alertmanager.yml
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.163.com:25' # SMTP server used to send mail
  smtp_from: 'XXXXXX@163.com' # Sender address
  smtp_auth_username: 'XXXXX@163.com' # Mailbox (SMTP login) name
  smtp_auth_password: 'XXXXXXXX' # Mailbox password or authorization code

templates:
  - 'template/*.tmpl'

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 24h
  receiver: 'ops_dingding'

receivers:
  - name: 'email'
    email_configs:
      - to: 'XXXXX@163.com' # Address that receives alerts
        html: '{{ template "test.html" . }}' # Template for the email body
        headers: { Subject: "[WARN] Alert email" } # Subject of the email

  - name: 'wechat'
    wechat_configs:
      - corp_id: 'XXXXX'
        to_party: '1'
        agent_id: '1000002'
        api_secret: 'XXXXX'

  - name: 'ops_dingding'
    webhook_configs:
      - url: 'http://localhost:8060/dingtalk/ops_dingding/send'

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']

$ /usr/local/prometheus/alertmanager/alertmanager --config.file=/usr/local/prometheus/alertmanager/alertmanager.yml
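The email receiver renders a template named `test.html`, loaded from `template/*.tmpl`, but the template itself is not included in this post. Here is a minimal sketch of such a file, assuming the relative path resolves to a `template` directory next to alertmanager.yml and using the standard `.Alerts`, `.Labels`, `.Annotations`, and `.StartsAt` fields of Alertmanager's template data. The file name `email.tmpl`, the table layout, and the `TEMPLATE_DIR` variable are assumptions.

```shell
#!/bin/sh
# Hypothetical location; set TEMPLATE_DIR=/usr/local/prometheus/alertmanager/template on the real host.
TEMPLATE_DIR="${TEMPLATE_DIR:-./template}"
mkdir -p "$TEMPLATE_DIR"

# Quoted heredoc so the Go template syntax is written verbatim.
cat > "$TEMPLATE_DIR/email.tmpl" <<'EOF'
{{ define "test.html" }}
<table border="1">
  <tr><th>Alert</th><th>Instance</th><th>Severity</th><th>Summary</th><th>Started</th></tr>
  {{ range .Alerts }}
  <tr>
    <td>{{ .Labels.alertname }}</td>
    <td>{{ .Labels.instance }}</td>
    <td>{{ .Labels.severity }}</td>
    <td>{{ .Annotations.summary }}</td>
    <td>{{ .StartsAt }}</td>
  </tr>
  {{ end }}
</table>
{{ end }}
EOF

echo "wrote $TEMPLATE_DIR/email.tmpl"
```

The name inside `define` has to match the name used in the `html:` field of the email receiver; the file name only needs to match the `*.tmpl` glob.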
5 Push Prometheus alerts to DingTalk via webhook
5.1 Add a DingTalk robot and obtain its webhook URL

See https://open-doc.dingtalk.com/docs/doc.htm?treeId=257&articleId=105735&docType=1

5.2 Download the plugin (binary)
$ wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v0.3.0/prometheus-webhook-dingtalk-0.3.0.linux-amd64.tar.gz
5.3 Extract and move the binary to /usr/local/prometheus/alertmanager
$ tar zxvf prometheus-webhook-dingtalk-0.3.0.linux-amd64.tar.gz
$ mv prometheus-webhook-dingtalk-0.3.0.linux-amd64/prometheus-webhook-dingtalk /usr/local/prometheus/alertmanager
5.4 Write a startup script (replace the webhook URL and ding.profile with your own)
$ cat dingding_start.sh
# Send stdout to the log first, then point stderr at the same file.
nohup /usr/local/prometheus/alertmanager/prometheus-webhook-dingtalk --ding.profile="ops_dingding=https://oapi.dingtalk.com/robot/send?access_token=XXXXXXX" >/usr/local/prometheus/alertmanager/dingding.log 2>&1 &

$ sh dingding_start.sh
5.5 Edit alertmanager.yml to add the webhook configuration, then restart Alertmanager
  - name: 'ops_dingding'
    webhook_configs:
      - url: 'http://localhost:8060/dingtalk/ops_dingding/send'
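
To check the whole chain without waiting for a real failure, a hand-made alert can be posted to Alertmanager, which should then forward it to the `ops_dingding` receiver. The sketch below assumes Alertmanager is reachable at `http://localhost:9093` and that the `/api/v1/alerts` endpoint is available, as it should be on the 0.16.x release used here (newer releases expose `/api/v2/alerts` instead). The `test_alert_payload` helper, the `ManualTest` alert name, and the `SEND_TEST_ALERT` switch are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints a one-alert JSON payload for Alertmanager's alerts API.
test_alert_payload() {
  alertname="$1"
  instance="$2"
  cat <<EOF
[
  {
    "labels": {
      "alertname": "${alertname}",
      "instance": "${instance}",
      "severity": "warning"
    },
    "annotations": {
      "summary": "Manual test alert, safe to ignore"
    }
  }
]
EOF
}

# Assumed Alertmanager address; matches the localhost:9093 target in prometheus.yml.
ALERTMANAGER_URL="${ALERTMANAGER_URL:-http://localhost:9093}"

# Only send when explicitly asked, so the script can be dry-run safely.
if [ "${SEND_TEST_ALERT:-0}" = "1" ]; then
  test_alert_payload ManualTest node1 |
    curl -sS -XPOST -H 'Content-Type: application/json' \
      --data-binary @- "$ALERTMANAGER_URL/api/v1/alerts"
else
  test_alert_payload ManualTest node1
fi
```

Running the script with `SEND_TEST_ALERT=1` posts the alert; with the `group_wait: 10s` setting above, the DingTalk message should appear roughly ten seconds later if every component is wired correctly.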