Monitoring ZooKeeper with Prometheus

These notes record how I configured Prometheus to monitor ZooKeeper, for future reference. Two approaches are covered:

  1. Monitor with jmx_exporter, a collector that can be configured to scrape JMX targets and expose their mBeans.
    GitHub: https://github.com/prometheus/jmx_exporter

  2. Monitor with zookeeper_exporter
    GitHub: https://github.com/jiankunking/zookeeper_exporter

1. Monitoring with jmx_exporter

1.1 Download jmx_exporter

Create a directory to hold the jmx_exporter jar and its config file:

[admin@haifly-bj-kafka1 ~]$ mkdir jmx_exporter
[admin@haifly-bj-kafka1 ~]$ cd jmx_exporter
[admin@haifly-bj-kafka1 jmx_exporter]$ wget https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.3.1/jmx_prometheus_javaagent-0.3.1.jar

1.2 Add a config file named zookeeper.yml

The rules are taken from the example on GitHub: https://github.com/prometheus/jmx_exporter/blob/master/example_configs/zookeeper.yaml

[admin@haifly-bj-kafka1 jmx_exporter]$ vim zookeeper.yml

rules:
  # replicated Zookeeper
  - pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+)><>(\\w+)"
    name: "zookeeper_$2"
  - pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+)><>(\\w+)"
    name: "zookeeper_$3"
    labels:
      replicaId: "$2"
  - pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+), name2=(\\w+)><>(\\w+)"
    name: "zookeeper_$4"
    labels:
      replicaId: "$2"
      memberType: "$3"
  - pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+), name2=(\\w+), name3=(\\w+)><>(\\w+)"
    name: "zookeeper_$4_$5"
    labels:
      replicaId: "$2"
      memberType: "$3"
  # standalone Zookeeper
  - pattern: "org.apache.ZooKeeperService<name0=StandaloneServer_port(\\d+)><>(\\w+)"
    name: "zookeeper_$2"
  - pattern: "org.apache.ZooKeeperService<name0=StandaloneServer_port(\\d+), name1=InMemoryDataTree><>(\\w+)"
    name: "zookeeper_$2"
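
To make the rules above easier to read, here is a minimal Python sketch of how one of them turns an MBean attribute into a metric name and labels. It is not jmx_exporter's own code: it assumes the exporter matches each unanchored pattern against a string of the form `domain<beanProperties><keys>attrName: value` and substitutes `$N` with the N-th capture group. The function names and the sample bean string are made up for illustration.

```python
import re

# Two of the rules from zookeeper.yml, with the YAML escaping (\\d) reduced
# to plain regex escaping (\d).
RULES = [
    {
        "pattern": r"org.apache.ZooKeeperService<name0=ReplicatedServer_id(\d+), "
                   r"name1=replica.(\d+), name2=(\w+)><>(\w+)",
        "name": "zookeeper_$4",
        "labels": {"replicaId": "$2", "memberType": "$3"},
    },
    {
        "pattern": r"org.apache.ZooKeeperService<name0=StandaloneServer_port(\d+)><>(\w+)",
        "name": "zookeeper_$2",
        "labels": {},
    },
]


def expand(template, match):
    """Replace every $N in template with capture group N of match."""
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), template)


def apply_rules(bean_line, rules=RULES):
    """Return (metric_name, labels) for the first rule that matches, else None."""
    for rule in rules:
        match = re.search(rule["pattern"], bean_line)  # unanchored, by assumption
        if match:
            name = expand(rule["name"], match)
            labels = {key: expand(value, match) for key, value in rule["labels"].items()}
            return name, labels
    return None
```

For a hypothetical follower bean, `apply_rules("org.apache.ZooKeeperService<name0=ReplicatedServer_id1, name1=replica.1, name2=Follower><>PacketsReceived: 42")` yields the metric name `zookeeper_PacketsReceived` with the labels `replicaId="1"` and `memberType="Follower"`.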

1.3 Modify the startup configuration in bin/zkServer.sh

if [ "x$SERVER_JVMFLAGS"  != "x" ]
then
    JVMFLAGS="$SERVER_JVMFLAGS $JVMFLAGS"
fi

## Added: load the jmx_exporter javaagent (listens on port 52181, reads zookeeper.yml)
JMX_DIR="/work/admin/jmx_exporter"
JVMFLAGS="$JVMFLAGS -javaagent:$JMX_DIR/jmx_prometheus_javaagent-0.3.1.jar=52181:$JMX_DIR/zookeeper.yml"

if [ "x$2" != "x" ]
then
    ZOOCFG="$ZOOCFGDIR/$2"
fi

1.4 Restart the ZooKeeper service

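ZooKeeper on this host runs under supervisor (see the supervisorctl status output in section 2.2), so it can be restarted from there to pick up the new JVMFLAGS:

[admin@haifly-bj-kafka1 ~]$ supervisorctl restart zookeeper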
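
Once ZooKeeper is back up, the javaagent configured in 1.3 should be serving metrics on port 52181. The following Python sketch is one way to check that before touching the Prometheus config. It assumes the agent exposes the usual text format at `/metrics` and that samples carry no trailing timestamp; the helper names are illustrative.

```python
from urllib.request import urlopen


def fetch_metrics(url="http://localhost:52181/metrics", timeout=5.0):
    """Download the text exposition served by the javaagent."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def select_samples(text, prefix="zookeeper_"):
    """Return {series: value} for sample lines whose name starts with prefix.

    Comment lines (# HELP / # TYPE) are skipped. The value is taken to be the
    last space-separated field, which assumes no timestamp follows it.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or not line.startswith(prefix):
            continue
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Usage, on the ZooKeeper host:
#   samples = select_samples(fetch_metrics())
#   for series in sorted(samples):
#       print(series, samples[series])
```

An empty result usually means either the agent did not load (check the `-javaagent` flag in the process command line) or none of the rules in zookeeper.yml matched.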

1.5 Prometheus configuration

- job_name: 'kafka'
  static_configs:
  - targets: ['192.168.6.213:9100']
    labels: {cluster: 'product',type: 'basic',env: 'kafka',job: 'kafka1',export: 'kafka'}
  - targets: ['192.168.6.213:9308']
    labels: {cluster: 'product',type: 'middle',env: 'kafka',job: 'kafka1',export: 'kafka_exporter'}
  - targets: ['192.168.6.213:52181']
    labels: {cluster: 'product',type: 'middle',env: 'kafka',job: 'kafka1',course: "zookeeper",export: 'jmx_exporter'}
  - targets: ['192.168.3.94:9100']
    labels: {cluster: 'product',type: 'basic',env: 'kafka',job: 'kafka2',export: 'kafka'}
  - targets: ['192.168.3.94:9308']
    labels: {cluster: 'product',type: 'middle',env: 'kafka',job: 'kafka2',export: 'kafka_exporter'}
  - targets: ['192.168.3.94:52181']
    labels: {cluster: 'product',type: 'middle',env: 'kafka',job: 'kafka2',course: "zookeeper",export: 'jmx_exporter'}
  - targets: ['192.168.5.245:9100']
    labels: {cluster: 'product',type: 'basic',env: 'kafka',job: 'kafka3',export: 'kafka'}
  - targets: ['192.168.5.245:9308']
    labels: {cluster: 'product',type: 'middle',env: 'kafka',job: 'kafka3',export: 'kafka_exporter'}
  - targets: ['192.168.5.245:52181']
    labels: {cluster: 'product',type: 'middle',env: 'kafka',job: 'kafka3',course: "zookeeper",export: 'jmx_exporter'}
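
The job above repeats the same three exporters for every host. If the host list grows, the entries could be generated rather than copied by hand. The sketch below is only an illustration of that idea, not part of the original setup: the function name is hypothetical, and the result is a plain Python list that would still need to be serialized to YAML with whatever library is available.

```python
def kafka_static_configs(hosts, cluster="product", env="kafka"):
    """Build the static_configs entries for a list of (job_label, ip) pairs."""
    # (port, type label, export label, extra labels) for each exporter on a host
    exporters = [
        (9100, "basic", "kafka", {}),
        (9308, "middle", "kafka_exporter", {}),
        (52181, "middle", "jmx_exporter", {"course": "zookeeper"}),
    ]
    configs = []
    for job, ip in hosts:
        for port, type_label, export, extra in exporters:
            labels = {"cluster": cluster, "type": type_label, "env": env, "job": job}
            labels.update(extra)
            labels["export"] = export
            configs.append({"targets": [f"{ip}:{port}"], "labels": labels})
    return configs


# The three hosts used in this article.
HOSTS = [
    ("kafka1", "192.168.6.213"),
    ("kafka2", "192.168.3.94"),
    ("kafka3", "192.168.5.245"),
]
```

Calling `kafka_static_configs(HOSTS)` returns nine entries, three per host, with the same targets and labels as the hand-written config.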

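Restart Prometheus

If the lifecycle API is enabled (Prometheus started with --web.enable-lifecycle), the configuration can instead be hot-reloaded like this: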

curl -X POST http://127.0.0.1:9090/-/reload
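
The same reload can be triggered from a script. This is a small Python sketch of the curl call above using only the standard library; the function names are illustrative, and the default address is the one used in this article.

```python
from urllib.request import Request, urlopen


def build_reload_request(base_url="http://127.0.0.1:9090"):
    """Build the POST request for Prometheus's /-/reload endpoint."""
    return Request(base_url.rstrip("/") + "/-/reload", data=b"", method="POST")


def reload_prometheus(base_url="http://127.0.0.1:9090", timeout=10.0):
    """Send the reload request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on a non-2xx reply, for example
    when the lifecycle API is not enabled.
    """
    with urlopen(build_reload_request(base_url), timeout=timeout) as resp:
        return resp.status
```

If the new configuration is invalid, Prometheus rejects the reload and keeps running with the old one, so it is worth checking the returned status or the Prometheus log after calling this.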

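1.6 Grafana dashboard

Import Grafana dashboard ID 10607.

The panels in my own dashboard have been modified somewhat, so a fresh import may look slightly different; treat it as a reference only.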

2. Monitoring with zookeeper_exporter

2.1 Download zookeeper_exporter

[admin@haifly-bj-kafka1 ~]$ mkdir zookeeper_exporter
[admin@haifly-bj-kafka1 ~]$ cd zookeeper_exporter
[admin@haifly-bj-kafka1 zookeeper_exporter]$ wget https://github.com/carlpett/zookeeper_exporter/releases/download/v1.0.2/zookeeper_exporter

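ZooKeeper provides a set of four-letter commands (The Four Letter Words) for querying the current state of a ZooKeeper server and related information. zookeeper_exporter builds its metrics from them.

The available commands are: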

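Command  Description
conf     Prints the serving configuration.
cons     Lists full connection/session details for all clients connected to this server, including packets received/sent, session id, operation latencies, and the last operation performed.
crst     Resets connection/session statistics for all connections.
dump     Lists the outstanding sessions and ephemeral nodes. Only works on the leader.
envi     Prints details about the serving environment.
reqs     Lists outstanding (unprocessed) requests.
ruok     "Are you ok": tests whether the server is running in a non-error state. If so it replies "imok"; otherwise it does not respond at all.
stat     Lists brief details for the server and its connected clients.
srst     Resets server statistics.
srvr     Lists full details for the server.
wchs     Lists brief information on the server's watches.
wchc     Lists detailed watch information by session: a list of sessions with their associated watches.
wchp     Lists detailed watch information by path: a list of paths with their associated sessions.
mntr     Outputs a list of variables that can be used to monitor the health of the cluster.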
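
The commands in the table are sent as four raw bytes over the client port, and the server writes its reply and closes the connection. A minimal Python sketch of that exchange follows. The function name is made up for illustration, and note that on ZooKeeper 3.5 and later most of these commands must be allowed through the `4lw.commands.whitelist` setting before the server will answer them.

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send one four-letter command to ZooKeeper and return the reply as text."""
    if len(command) != 4 or not command.isalpha():
        raise ValueError("expected a four-letter command such as 'ruok' or 'mntr'")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server closes the connection after replying
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Usage, assuming ZooKeeper listens on localhost:2181:
#   print(four_letter_word("localhost", 2181, "ruok"))
#   print(four_letter_word("localhost", 2181, "mntr"))
```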

zookeeper_exporter metrics and their meanings

The raw values can be viewed with: echo mntr | nc zookeeper_ip 2181

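Metric                         Meaning
zk_version                     Version
zk_avg_latency                 Average response latency
zk_max_latency                 Maximum response latency
zk_min_latency                 Minimum response latency
zk_packets_received            Packets received
zk_packets_sent                Packets sent
zk_num_alive_connections       Active connections
zk_outstanding_requests        Queued (outstanding) requests
zk_server_state                Role of the server (leader or follower)
zk_znode_count                 Number of znodes
zk_watch_count                 Number of watches
zk_ephemerals_count            Number of ephemeral nodes
zk_approximate_data_size       Approximate total data size
zk_open_file_descriptor_count  Open file descriptors
zk_max_file_descriptor_count   Maximum file descriptors

Reported by the leader only:
zk_followers                   Number of followers
zk_synced_followers            Number of followers in sync
zk_pending_syncs               Sync operations still pending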
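
To illustrate how these values can be used, the Python sketch below parses `mntr` output into a dictionary and derives one simple health signal from it. It assumes each line is a key and a value separated by a tab; the function names and the choice of check are illustrative and are not how zookeeper_exporter itself is implemented.

```python
def parse_mntr(output):
    """Parse `mntr` output (one key<TAB>value pair per line) into a dict.

    Numeric values become int or float; anything else (zk_version,
    zk_server_state) stays a string.
    """
    metrics = {}
    for line in output.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # skip blank or malformed lines
        value = value.strip()
        for cast in (int, float):
            try:
                metrics[key] = cast(value)
                break
            except ValueError:
                continue
        else:
            metrics[key] = value
    return metrics


def unsynced_followers(metrics):
    """Return followers minus synced followers, or None when not on the leader.

    zk_followers and zk_synced_followers are only reported by the leader.
    """
    if "zk_followers" not in metrics:
        return None
    return metrics["zk_followers"] - metrics.get("zk_synced_followers", 0)
```

A non-zero result from `unsynced_followers` on the leader suggests at least one follower is lagging, which is the kind of condition worth alerting on.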

2.2 Configure and start zookeeper_exporter

Run it with --help to see the available options:

[admin@haifly-bj-kafka1 zookeeper_exporter]$ ./zookeeper_exporter --help
Usage of ./zookeeper_exporter:
  -bind-addr string
        bind address for the metrics server (default ":9141")
  -log-level string
        log level (default "info")
  -metrics-path string
        path to metrics endpoint (default "/metrics")
  -reset-on-scrape
        should a reset command be sent to zookeeper on each scrape (default true)
  -version
        show version and exit
  -zookeeper string
        host:port for zookeeper socket (default "localhost:2181")

Configure zookeeper_exporter (as a supervisor program)

[program:zookeeper_exporter]
command=/work/admin/zookeeper_exporter/zookeeper_exporter -bind-addr :9141 -zookeeper localhost:2181
directory=/work/admin/zookeeper_exporter
redirect_stderr=true
autorestart=true

Start zookeeper_exporter

I use supervisor to manage service processes, so it is enough to open the supervisorctl console and run update:

[admin@haifly-bj-kafka1 ~]$ supervisorctl 
Server requires authentication
Username:supervisor
Password:

kafka RUNNING pid 26081, uptime 5:14:07
kafka_exporter RUNNING pid 13039, uptime 219 days, 19:50:05
ll-register RUNNING pid 12575, uptime 219 days, 20:06:43
node_exporter RUNNING pid 12606, uptime 219 days, 20:06:43
zookeeper RUNNING pid 26040, uptime 5:14:23
supervisor> update
zookeeper_exporter: added process group
supervisor> status
kafka RUNNING pid 26081, uptime 5:14:10
kafka_exporter RUNNING pid 13039, uptime 219 days, 19:50:08
ll-register RUNNING pid 12575, uptime 219 days, 20:06:46
node_exporter RUNNING pid 12606, uptime 219 days, 20:06:46
zookeeper RUNNING pid 26040, uptime 5:14:26
zookeeper_exporter RUNNING pid 26980, uptime 0:00:20
supervisor>

2.3 Prometheus configuration

- job_name: 'kafka'
  static_configs:
  - targets: ['192.168.6.213:9100']
    labels: {cluster: 'product',type: 'basic',env: 'kafka',job: 'kafka1',export: 'kafka'}
  - targets: ['192.168.6.213:9308']
    labels: {cluster: 'product',type: 'middle',env: 'kafka',job: 'kafka1',export: 'kafka_exporter'}
  - targets: ['192.168.6.213:52181']
    labels: {cluster: 'product',type: 'middle',env: 'kafka',job: 'kafka1',course: "zookeeper",export: 'jmx_exporter'}
  - targets: ['192.168.6.213:9141']
    labels: {cluster: 'product',type: 'middle',env: 'kafka',job: 'kafka1',course: "zookeeper",export: 'zookeeper_exporter'}

Restart Prometheus, or hot-reload the configuration:

curl -X POST http://127.0.0.1:9090/-/reload

2.4 Grafana dashboard

Grafana dashboard ID 9236 is recommended.
