Periodically reset ZooKeeper latency stats #1188

@jrhee17

We record ZooKeeper-related stats because they are useful for troubleshooting.
Latency-related stats are especially valuable because they can indicate whether ZooKeeper message processing is a bottleneck.

TimeGauge.builder("replica.zk.latency", this, TimeUnit.MILLISECONDS,
                  self -> serverStats(self).getAvgLatency())
         .tag("type", "avg")
         .register(meterRegistry);
TimeGauge.builder("replica.zk.latency", this, TimeUnit.MILLISECONDS,
                  self -> serverStats(self).getMaxLatency())
         .tag("type", "max")
         .register(meterRegistry);
TimeGauge.builder("replica.zk.latency", this, TimeUnit.MILLISECONDS,
                  self -> serverStats(self).getMinLatency())
         .tag("type", "min")
         .register(meterRegistry);

It seems that these latency values are never reset (e.g. the max latency is the maximum over all requests since server start).
It may be worth periodically resetting them via serverStats(self).resetLatency() so that the gauges reflect the latency of recent requests only.

Note that, unlike Micrometer, which applies a recency bias, this is a hard reset, so it may be difficult to add alerts without false positives. (That said, we weren't really looking at this metric before anyway.)
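
A minimal sketch of what the periodic reset could look like, under some assumptions. `LatencyStats` is a hypothetical stand-in for ZooKeeper's `ServerStats` (so the snippet runs without ZooKeeper on the classpath), and `PeriodicLatencyReset`, the thread name, and the reset interval are made up for illustration. In the real code the scheduled task would call serverStats(self).resetLatency() instead.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for ZooKeeper's ServerStats: tracks min/max/avg
// latency since the last reset.
final class LatencyStats {
    private long min = Long.MAX_VALUE;
    private long max;
    private long total;
    private long count;

    synchronized void record(long latencyMillis) {
        min = Math.min(min, latencyMillis);
        max = Math.max(max, latencyMillis);
        total += latencyMillis;
        count++;
    }

    // Report 0 when nothing has been recorded since the last reset.
    synchronized long getMinLatency() { return count == 0 ? 0 : min; }
    synchronized long getMaxLatency() { return max; }
    synchronized double getAvgLatency() { return count == 0 ? 0 : (double) total / count; }

    // Hard reset: all history before this call is discarded.
    synchronized void resetLatency() {
        min = Long.MAX_VALUE;
        max = 0;
        total = 0;
        count = 0;
    }
}

final class PeriodicLatencyReset {
    // Schedules resetLatency() at a fixed interval on a daemon thread.
    // The caller owns the returned executor and should shut it down.
    static ScheduledExecutorService start(LatencyStats stats, long period, TimeUnit unit) {
        ScheduledExecutorService executor =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "zk-latency-reset");
                    t.setDaemon(true);
                    return t;
                });
        executor.scheduleAtFixedRate(stats::resetLatency, period, period, unit);
        return executor;
    }

    public static void main(String[] args) throws InterruptedException {
        LatencyStats stats = new LatencyStats();
        stats.record(5);
        stats.record(120);
        System.out.println("max before reset: " + stats.getMaxLatency());

        // Short interval for demonstration only; a real interval would
        // likely match the metrics scrape/export period.
        ScheduledExecutorService executor = start(stats, 100, TimeUnit.MILLISECONDS);
        Thread.sleep(300);
        System.out.println("max after reset: " + stats.getMaxLatency());
        executor.shutdownNow();
    }
}
```

Because the gauges read from the same object that is being reset, a scrape that lands right after a reset sees values from very few requests (or zeros), which is the alerting caveat mentioned above.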
