PoC: Bound Instruments: Add WithAttributes to instruments. #7790
dashpole wants to merge 14 commits into open-telemetry:main
Conversation
Codecov Report ❌ Patch coverage is …
Additional details and impacted files:
@@ Coverage Diff @@
## main #7790 +/- ##
=======================================
- Coverage 86.1% 81.3% -4.8%
=======================================
Files 302 304 +2
Lines 22046 23439 +1393
=======================================
+ Hits 18992 19073 +81
- Misses 2674 3981 +1307
- Partials 380 385 +5
What about adding …?
}
http.HandleFunc("/", func(_ http.ResponseWriter, r *http.Request) {
	apiCounter.Add(r.Context(), 1)
	apiCounter.WithAttributes(attribute.String("http.method", r.Method)).Add(r.Context(), 1)
It seems like this will also need an externally managed sync.Pool to avoid the []attribute.KeyValue allocation here.
Yes. This is currently the "naive" benchmark case from the description, where users just use the API in the simplest way. The performance without the pooling optimization (in either case) does improve with the bound instrument pattern, but it does still have an allocation, as you point out (a sketch of the pooled variant follows the numbers below):
- NoFilter + Naive: 881 ns -> 528 ns; 5 allocs -> 1 alloc
- Filtered + Naive: 1608 ns -> 537 ns; 7 allocs -> 1 alloc
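For reference, the pooled variant of this pattern might look roughly like the sketch below. It assumes the `WithAttributes` method on the instrument shown in the diff above and reuses `apiCounter` from the example; the pool capacity and handler wiring are illustrative only, not part of this PR.

```go
import (
	"net/http"
	"sync"

	"go.opentelemetry.io/otel/attribute"
)

// attrPool is an externally managed pool of attribute slices, per the comment
// above. The capacity of 8 is an arbitrary illustrative choice.
var attrPool = sync.Pool{
	New: func() any {
		s := make([]attribute.KeyValue, 0, 8)
		return &s
	},
}

func handler(_ http.ResponseWriter, r *http.Request) {
	attrs := attrPool.Get().(*[]attribute.KeyValue)
	defer func() {
		*attrs = (*attrs)[:0] // Reset before returning to the pool.
		attrPool.Put(attrs)
	}()
	*attrs = append(*attrs, attribute.String("http.method", r.Method))
	// apiCounter is the counter from the example; WithAttributes is the
	// method this PoC adds to the instrument.
	apiCounter.WithAttributes(*attrs...).Add(r.Context(), 1)
}
```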
Sorry, I'm still not following the logic here.
The criticism of the current API is that users who want to use the API in a performant way need to manage an external sync.Pool or cache.
For users to use the proposed API here, they will need to use an external sync.Pool or cache.
Why doesn't the same criticism apply to the new API?
I don't consider sync.Pools and a sync.Map to be the same FWIW.
I would summarize my criticisms of the current API as:
- Slow with two allocations even when you use sync.Pools.
- Really slow, and has lots of allocations if you don't use sync.Pools.
The bound instrument API is:
- Fast, and has no allocations when you use sync.Pools.
- Slower, with one allocation, if you don't use sync.Pools.
That is also an option. It will improve the performance of the "Precomputed" case in-line with this PR. It will not improve the performance of the "Dynamic" or "Naive" cases, as the attributes are still provided using the option pattern.
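For context, a rough sketch of what the "Precomputed" versus "Dynamic"/"Naive" cases look like under the existing option-based API; the `record` function and the attribute values here are made up for illustration.

```go
import (
	"context"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

// precomputed is built once, e.g. at package init.
var precomputed = attribute.NewSet(attribute.String("http.method", "GET"))

func record(ctx context.Context, counter metric.Int64Counter, method string) {
	// "Precomputed": the attribute.Set already exists, so each Add only pays
	// for wrapping it in the option.
	counter.Add(ctx, 1, metric.WithAttributeSet(precomputed))

	// "Dynamic"/"Naive": the attribute value is only known per call, so each
	// Add builds a new set from the supplied key-values inside the option.
	counter.Add(ctx, 1, metric.WithAttributes(attribute.String("http.method", method)))
}
```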
Is this a real use case? If attributes are pre-computed, why are they being added in line?
This is not a valid assertion: #7770 (comment)
Sorry, by "in-line" I just meant "similar to the results of". I.e., "It will improve the performance of the 'Precomputed' case similar to the results of this PR." It would have this result:
I see, thanks for the clarification 👍
I believe my point is correct that moving the option from the … Let's move the caching discussion to #7768, if that is OK, and make sure we agree on what our recommendation is for performant code with dynamic attributes.
It seems like the motivation proposed in this PR's description is based on performance. I'm happy to have a consolidated discussion about performance in #7768, but it means that this PR needs to be motivated solely on ergonomics and idiomatic code, which motivates #7790 (comment) even more.
I just want to agree on how we measure the performance of our recommended usage of the API + SDK in #7768. This PR can still be motivated by performance if it shows performance improvements over the current benchmarks. Once that is merged, we can have more productive discussions about alternatives.
... well, it seems like that means my comments would still apply(?). If we want to motivate this approach based on performance, it needs to show clear and considerable performance improvements over other API designs, including the existing one, when both designs are used as optimally as possible in real use cases.
b.Run("Dynamic/WithAttributeSet", func(b *testing.B) {
	counter := testCounter(b, mp.provider())
	b.ReportAllocs()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			// Wrap in a function so we can use defer.
			func() {
				attrsSlice := attrPool.Get().(*[]attribute.KeyValue)
				defer func() {
					*attrsSlice = (*attrsSlice)[:0] // Reset.
					attrPool.Put(attrsSlice)
				}()
				appendAttributes(attrsLen, attrsSlice)
				addOpt := addOptPool.Get().(*[]metric.AddOption)
				defer func() {
					*addOpt = (*addOpt)[:0]
					addOptPool.Put(addOpt)
				}()
				set := attribute.NewSet(*attrsSlice...)
				*addOpt = append(*addOpt, metric.WithAttributeSet(set))
				counter.Add(ctx, 1, *addOpt...)
			}()
		}
	})
})
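The benchmark above relies on `attrPool`, `addOptPool`, `attrsLen`, and `appendAttributes` being defined elsewhere in the benchmark file. A plausible definition of the two pools (illustrative, not necessarily the actual one) would be:

```go
var (
	// The pools store *[]T so that putting a slice back does not allocate
	// when the value is boxed into an interface.
	attrPool = sync.Pool{
		New: func() any {
			s := make([]attribute.KeyValue, 0, attrsLen)
			return &s
		},
	}
	addOptPool = sync.Pool{
		New: func() any {
			s := make([]metric.AddOption, 0, 1)
			return &s
		},
	}
)
```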
It seems like we are talking in circles at this point 😑
var setCache sync.Map
b.Run("Dynamic/WithAttributeSet", func(b *testing.B) {
	counter := testCounter(b, mp.provider())
	b.ReportAllocs()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			// Wrap in a function so we can use defer.
			func() {
				attrsSlice := attrPool.Get().(*[]attribute.KeyValue)
				defer func() {
					*attrsSlice = (*attrsSlice)[:0] // Reset.
					attrPool.Put(attrsSlice)
				}()
				appendAttributes(attrsLen, attrsSlice)
				addOpt := addOptPool.Get().(*[]metric.AddOption)
				defer func() {
					*addOpt = (*addOpt)[:0]
					addOptPool.Put(addOpt)
				}()
				d := attribute.NewDistinct(*attrsSlice...)
				set, loaded := setCache.Load(d)
				if !loaded {
					set, _ = setCache.LoadOrStore(d, attribute.NewSet(*attrsSlice...))
				}
				*addOpt = append(*addOpt, metric.WithAttributeSet(set.(attribute.Set)))
				counter.Add(ctx, 1, *addOpt...)
			}()
		}
	})
})
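One caveat with this cached variant: the `sync.Map` keyed by the distinct attribute combination grows with every new combination and is never pruned, so it trades memory for the per-call `attribute.NewSet` cost and may not pay off for high-cardinality dynamic attributes.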
I'm going to block this so it is clear that, as a Maintainer of this code, I do not agree this is the direction we should take the API.
This is adding an additional and orthogonal approach to configuring the attributes recorded during measurements. It is contrary to our policy and to all other uses of our options.
The conversation on performance motivating this PR remains hyperbolic. I have not found it an effective argument, and I do not see it motivating this change as the argument has been laid out.
In regard to prototyping an API change that would motivate bound attributes at the OpenTelemetry specification level, I see this proposal as a viable alternative. I can understand the value of bound instruments, and this PR review should in no way be understood as opposition to finding a solution that would support this feature.
This does not address the issue described in #7743 at all. That proposal would improve the precomputed case in a similar fashion to this PoC, but it doesn't alleviate the core issue I'm interested in.
I'm sorry. I've put the cart before the horse a bit here. From my point of view, we just have differences of opinion about what the correct apples-to-apples benchmarks are. I thought I was on solid footing copying guidance from our contributing guide, but clearly you disagree. If you accept my benchmarks, then these results are actually really good, and (IMO) would justify a new API method. But yeah, if all of this was just to shave the precomputed case from 42 ns to 12 ns, it wouldn't be enough of a performance improvement to be worth it. Please take a look at #7743 (comment) when you have time. If we can't find agreement there, then we can close the issue and related PRs and focus on more productive things.
Prototype for open-telemetry/opentelemetry-specification#4835
Notes for reviewers
metric/example_test.go
TODO:
Benchmarks