[alerts] add explicit versions to alert payloads#10555
Conversation
| comment TEXT NOT NULL, | ||
|
|
||
| -- The version of the alert class' schema that this alert's payload conforms | ||
| -- to. |
There was a problem hiding this comment.
Can you add a comment indicating that this is zero-indexed?
I know this is sorta implied by the type, but convention sometimes implies starting with 1, so it's worth being explicit
| const CLASS: AlertClass = AlertClass::Probe; | ||
| const VERSION: u32 = | ||
| <alert_types::Probe as alert_types::AlertPayload>::VERSION; | ||
| static DATA: LazyLock<serde_json::Value> = | ||
| LazyLock::new(|| serde_json::json!({})); |
There was a problem hiding this comment.
This might not matter much right now for probes, but to set a good precedent:
- Should
CLASSreferencealert_types::Probe::CLASS? Currently it points directly at thedb-modelenum variant. - Should
DATAbe referencingalert_types::Probe? These happen to both be empty JSON, but could theoretically get out-of-sync?
| use std::fmt; | ||
|
|
||
| /// Trait implemented by alerts. | ||
| pub trait AlertPayload: Serialize + JsonSchema + std::fmt::Debug { |
There was a problem hiding this comment.
Couple of thoughts:
- This trait requires
JsonSchema, but AFAICT, nothing is consuming that right now? I understand that we want to use this in the future though. - From what I can tell, nothing actually validates that "the version should be bumped when the schema changes". If someone alters a payload struct and doesn't bump the version, nothing fails (or notifies them at all).
I know you touched on this a bit in the commit message, but:
- How hard would it be to have a registry of all
AlertPayloadimpls, with the schema of eachJsonSchemasaved somewhere keyed by(class, version)? Then changing a struct without bumpingVERSIONcould cause a test failure, and make it clear what a programmer needs to change.
We don't technically need to block the PR behind this, but it seems like we should move VERSION to be guarded by something better than "the honor system" ASAP.
There was a problem hiding this comment.
All of what you're describing is the eventual goal. For now, my intent for this PR was just to have the base notion of having versions, so that we could add version 0 for some alerts. I wanted to get the bare minimum infrastructure for saying that alerts have versions at all before we added any alert classes. I definitely agree that we will need to be able to generate the schemas and emit them somewhere, and that we need tooling for actually validating that versions are correct. I want to be able to do that separately from writing a DE that emits v0 alerts, though.
mergeconflict
left a comment
There was a problem hiding this comment.
nice! i think @smklein already found all the fun stuff, so just one nit from me.
|
|
||
| impl AlertPayload for $Name { | ||
| const CLASS: AlertClass = AlertClass::$Class; | ||
| const VERSION: u32 = 0; |
There was a problem hiding this comment.
nit: i wonder if we could vary the version (maybe spit out arbitrary version numbers from proptest) so that if the production code happened to have an "oops, the version is always zero" bug, we'd catch it
There was a problem hiding this comment.
I think testing for this is part of the work on actually adding tooling and infrastructure for supporting multiple versions, which I'd like to defer for a later PR.
As per @smklein in #10555 (comment) this should be documented, and I figured we should have a constraint for it too.
This addresses the suggestion from #10555 (comment)
This branch adds an explicit notion of version numbers for alert payloads. This is implemented by changing the entry points for publishing alerts (either via `Nexus::alert_publish` or `SitrepBuilder::request_alert` for FM) to take a new `AlertPayload` trait implemented by types which describe alert payloads. These must provide an alert class and version constant, in addition to implementing `serde::Serialize` and `JsonSchema`. Eventually we will also use these to generate the schemas for each version of the alert so that they can be communicated to potential consumers. For now, though, we are at least enforcing that a given schema always provides the same alert class and version number (rather than being able to pass in any arbitrary class). Now, when an alert is created, the schema version is recorded in the alert record. We backfill any existing alert records as being version 0. This doesn't actually matter because, you know, there aren't any, but we may as well do the right thing there. When an alert is delivered to a webhook receiver, the version number is included in a JSON field and a new `x-oxide-alert-version` header.
Co-authored-by: Sean Klein <sean@oxide.computer>
As per @smklein in #10555 (comment) this should be documented, and I figured we should have a constraint for it too.
This addresses the suggestion from #10555 (comment)
153b061 to
2729e97
Compare
This branch adds an explicit notion of version numbers for alert payloads. This is implemented by changing the entry points for publishing alerts (either via
Nexus::alert_publishorSitrepBuilder::request_alertfor FM) to take a newAlertPayloadtrait implemented by types which describe alert payloads. These must provide an alert class and version constant, in addition to implementingserde::SerializeandJsonSchema. Eventually we will also use these to generate the schemas for each version of the alert so that they can be communicated to potential consumers. For now, though, we are at least enforcing that a given schema always provides the same alert class and version number (rather than being able to pass in any arbitrary class).Now, when an alert is created, the schema version is recorded in the alert record. We backfill any existing alert records as being version 0. This doesn't actually matter because, you know, there aren't any, but we may as well do the right thing there. When an alert is delivered to a webhook receiver, the version number is included in a JSON field and a new
x-oxide-alert-versionheader.