http/client: Count and export integrated queue length#3259
http/client: Count and export integrated queue length#3259xemul wants to merge 1 commit intoscylladb:masterfrom
Conversation
|
@nyh , please review |
nyh
left a comment
There was a problem hiding this comment.
Looks good to me, this is the first time I see this integrated-length.hh (I guess I've been sleeping under a rock), so I'm not 100% sold on its usefulness, but it does sound like it might be useful.
I think there's a small problem here. I'm clicking "approve", because I don't think it matters much, but please consider it anyway:
Looking at the integrated_length, it seems it has a third optional template parameter, Resolution which defaults to nanoseconds. I don't see any reason to keep nanosecond resolution, when the lowres_clock's resolution is milliseconds. So I think the "Resolution" can, and should, be milliseconds, not nanoseconds.
By the way, why doesn't the resolution defaults to taking the resolution of the clock itself instead of a separate parameter?
Well, it's two fold. On one hand, HTTP requests are not expected to come at nanoseconds frequency, that's true. On the other hand, unless the update frequency exceeds the resolution, it only affects the internal 64-bit integral value overflow rate. For observed value in the range of [0,1000], even using nanoseconds resolution the calculated 64-bit integral would overflow in ~6 months.
Because both, steady_clock and lowres_clock in seastar have the same |
f9b9b90 to
f12f99d
Compare
|
upd:
|
When I suggested to use milliseconds, I was for some reason convinced that lowres_clock actually has milliseconds resolution. Now I see you were right, it has the same resolution as the regular system clock, and increments the time every task_quota, which concievably maybe less than one millisecond... So maybe I was wrong, and it's better to use microseconds not milliseconds? Although milliseconds is probably good enough - and will allow integration of larger numbers over longer time periods. So I don't know. |
|
The integrated_value Resultion "interacts" with the frequency of checkpoints. If they happen faster than two subsequent resolution tick, they would get squashed. For http client we don't expect requests to get queued as fast as once-per-less-than-millisecond. |
So would you like to leave it millisecond or return it to microsecond? Up to you. When I suggested millisecond, I was under the mistaken impression that this is the resolution we already have. |
There's an implicit queue of requests in the client -- when the connection pool exhausts the client waits for a free socket to become available. Since there can be many parallel fibers trying to call .make_request(), this forms a queue of requests. This patch counts the number of "queued" requests into the integrated length value merged recently. Checkpoint happens once the request gets its socket and proceeds. Exporting should be performed by the caller. For that the client provides const reference to the integrated length value itself. It's preferred that called exports .integral() value as COUNTER metrics and rate()-s it on the dashboard. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
f12f99d to
a188ce8
Compare
|
I think microseconds is safer yet harmless (pushed the updated patch) |
There's an implicit queue of requests in the client -- when the connection pool exhausts the client waits for a free socket to become available. Since there can be many parallel fibers trying to call .make_request(), this forms a queue of requests.
This patch counts the number of "queued" requests into the integrated length value merged recently. Checkpoint happens once the request gets its socket and proceeds.
Exporting should be performed by the caller. For that the client provides const reference to the integrated length value itself. It's preferred that called exports .integral() value as COUNTER metrics and rate()-s it on the dashboard.