https://grafana.com/grafana/dashboards/11962
This Prometheus dashboard computes the latency directly with sum(kafka_server_zookeeperclientmetrics_zookeeperrequestlatencyms{job="$job",instance=~"$broker"}) by (instance)
which doesn't look right to me.
The Kafka JMX metrics expose this metric in the following format:
kafka_server_zookeeperclientmetrics_zookeeperrequestlatencyms{quantile="0.50"} 1.0
kafka_server_zookeeperclientmetrics_zookeeperrequestlatencyms{quantile="0.75"} 1.0
kafka_server_zookeeperclientmetrics_zookeeperrequestlatencyms{quantile="0.95"} 4.0
kafka_server_zookeeperclientmetrics_zookeeperrequestlatencyms{quantile="0.98"} 14587.7
kafka_server_zookeeperclientmetrics_zookeeperrequestlatencyms{quantile="0.99"} 17068.0
kafka_server_zookeeperclientmetrics_zookeeperrequestlatencyms{quantile="0.999",} 17068.0
An article I read explains quantiles like this: if the 0.9-quantile is 120, then 90% of all sampled values are less than 120.
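To make that definition concrete, here is a minimal Python sketch. The sample list and the helper name fraction_below are made up for illustration; they are not taken from Kafka or Prometheus.

```python
def fraction_below(samples, threshold):
    """Return the fraction of samples strictly less than threshold."""
    return sum(1 for s in samples if s < threshold) / len(samples)


# Hypothetical request latencies in ms: nine fast requests and one slow outlier.
latencies_ms = [10, 20, 30, 40, 50, 60, 70, 80, 90, 500]

# 9 of the 10 samples are below 120, so 120 fits the article's
# description of a 0.9-quantile for this data set.
share = fraction_below(latencies_ms, 120)
print(share)
```

Note that the single 500 ms outlier does not move this value at all, which is why a quantile says something different from an average or a sum.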
https://cloud.tencent.com/developer/news/319419
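So it seems this metric can't simply be read as a latency value: each series is a separate quantile, and the dashboard query sums all of them per instance…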
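A rough Python sketch of what that sum does with the values pasted above. The query has no matcher on the quantile label, so for one broker all six series get added together. The dict below is copied by hand from the JMX output; it only imitates the aggregation and is not how Prometheus is implemented.

```python
# Quantile label -> value in ms, copied from the metric output above.
exposed = {
    "0.50": 1.0,
    "0.75": 1.0,
    "0.95": 4.0,
    "0.98": 14587.7,
    "0.99": 17068.0,
    "0.999": 17068.0,
}

# What sum(...) by (instance) would return for this broker:
# the six quantile values added up, which is not a latency.
summed = sum(exposed.values())

# Picking a single quantile keeps a meaningful number.
p99 = exposed["0.99"]

print(f"sum over quantiles: {summed:.1f} ms")
print(f"p99 only:           {p99:.1f} ms")
```

Assuming the exporter keeps the quantile label as shown, one possible fix for the panel is to add a matcher such as quantile="0.99" to the selector, so that each instance contributes only one series.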