Introduce built-in swift-distributed-tracing support#857
Introduce built-in swift-distributed-tracing support#857ktoso merged 20 commits intoswift-server:mainfrom
Conversation
czechboy0
left a comment
There was a problem hiding this comment.
Approach looks good assuming we really need the trait. I'd also be totally open to this being added without a trait, like logging - curious what others think. From my read a lot of the complexity in this PR is caused by that trait, so removing it would simplify.
Also regarding where the tracing happens, the v1 of this could just be a structured span inside execute, maybe?
I assume folks would be okay only getting spans when using the async/await API, and not the pre-concurrency one.
Sources/AsyncHTTPClient/NIOTransportServices/NWErrorHandler.swift
Outdated
Show resolved
Hide resolved
|
Actually, in case folks would like to see the super naive way to do tracing in AHC, if having the trait is not a requirement, here's what it could look like: #858 |
|
I'm ok with making the simplest possible span at first here, we can do that at first. I also am trying to land the I'm memory tracer so this pr would not have to have a copy paste one anymore |
8f2b2cb to
886db6f
Compare
26147af to
ca59a21
Compare
|
I pushed this along some more, cleaned up and added some tests that cover async execute, the ELF based get/post. I did have to dance around the availability problems a bit... Perhaps there's some cleaner ways to pull it off... This is nearing something that might be acceptable, please review folks. |
Sources/AsyncHTTPClient/NIOTransportServices/NWErrorHandler.swift
Outdated
Show resolved
Hide resolved
…raits" This reverts commit 886db6f.
|
Seems we may have overall agreement on the approach, so I'm going to add some more test cases to make sure we're handling errors and cancellations well -- and ping again once this is more complete :-) |
8a6fce9 to
91ea66e
Compare
Co-authored-by: George Barnett <gbarnett@apple.com>
Co-authored-by: George Barnett <gbarnett@apple.com>
|
FYI; Hmm, that failure is a warning/as-error in a file this PR didn't touch. |
|
Since everyone involved has generally agreed and we have a formal LGTM as well, I'm going to merge this. More than happy to work on follow-ups and we can improve what information we're emitting anytime. |
|
ooc, is this taken further up to include metrics and logs aswell, as otel provides. the discussion begins where a certain inflexibility comes between expecations and available time. additionally the part make inclusive cases more attractive as all measurement steps are centralized somewhat but kept as separate as possible in its centralization. like a protocol would. tl;dr why not otel? its tracing taken one step further. |
|
@wibed I recommend watching this recently presented talk about the separation of "API packages" and "backends" (like OTel): https://www.youtube.com/watch?v=HSxIFLsoODc |
## Motivation `AsyncHTTPClient` recently gained built-in support for Distributed Tracing (swift-server/async-http-client#857, swift-server/async-http-client#861, swift-server/async-http-client#862). It'd be a great showcase of Distributed Tracing to have an "end-to-end" example involving two server communicating via HTTP and this is now achievable without any manual HTTP client instrumentation. We already have multiple demos for both Vapor and Hummingbird so it feels natural for this demo to exist in Swift OTel as well. ## Modifications - Adds an example with two HTTP servers (Hummingbird & Vapor) where one calls the other via `AsyncHTTPClient` ## Result We now have an example that spans (no pun intended) two servers, connected via `AsyncHTTPClient`.
Almost 5 years after the initial tracing proposal PR for this HTTPClient, I'd like to propose it again.
Popular server packages should be embracing distributed tracing, and none is more crucial in this than the async-http-client.
This PR proposes an approach to include tracing with:
TracingSupporttrait is enabled by defaultThe details for where to start/end spans are TBD, we can do better than this.
Currently there is a single span that starts before we send the request, and ends as we get the response head. We can further improve this with spans for streaming the request body, spans for the time it takes to send the request etc. However first I'd like to get agreement on this approach for the instrumentation.
The spans
This creates very basic spans. Basically when you do
client.get()/post()etc, as well as the asynchronoustry await client.execute()we create a span that:There's no spans yet for the response body streaming or the waiting in the pool etc. All this can be added incrementally.
⏭️ The spans don't have fancy attributes yet;
⏭️ and the error cases are likely not fully covered. If someone more familiar with HTTPClient wants to take that on that'd be best, but I can follow-up with more handling once we're happy with this approach in general.
I think this should be done in follow-ups, so we can keep polishing up the tracing support as we use it in practice.
Availability woes
It's been somewhat painful to not cause the entire HTTPClient types to suddenly get @Availability which would be a breaking change; So I'm playing a bunch of tricks with storing
any Sendableas storage and casting that... We could probably use unsafe bitcasts, but taking this conservatively for now -- it should work first and foremost, and we can spend time shaving off overhead eventually.I don't like one bit where we're doing that, the
mutating func startRequestSpan<T>(tracer: T?) {}is kinda horrible, but if we make this haveTracerwe end up forcing the surrounding type to have availability...⏭️ We could think of some better ways to express this, but it's all internal so I think we can take this incrementally.
Configuration
It is possible to set a tracer "per client" by using the Configuration, which in combination with the in memory tracer (Incoming here apple/swift-distributed-tracing#180), will be able to be used for testing. One can also pass
nilto configuration to disable picking up the global tracer during configuration.We can use this for testing, and it's important for e.g. the otel library so when exporting while using the httpclient, we don't necessarily also cause spans of that to be exported -- if we don't care we can disable it this way. (alternatively, by making a parent span inside swift-otel that is NOT recording also works)
Context propagation
✅ This also includes header propagation, although I've ran out of night to make a test for it -- I'll follow up on it shortly.
Resolves slashmo/gsoc-swift-tracing#46
Resolves slashmo/gsoc-swift-tracing#95
Replaces #320
Replaces #289
cc @slashmo @simonjbeaumont @czechboy0 @FranzBusch @fabianfett