Conversation
This new GraphQL node returns the trending event editions, meaning the editions where players set the most records in the last days, with a certain limit. Refs: #102.
* Fix the default order of the `players` and `maps` GraphQL nodes, which is based on the score of the players/maps, but was sorting them in an ascending order when the "order" input was set to "ASCENDING", and vice-versa when set to "DESCENDING". This seems logical, but it implies that the default sort is by score in the ascending order, which leads to a sort by rank in a descending order, which is not expected, as we want the default sort to be by rank in an ascending order. * Fix column selection mismatch between the `maps` and `players` table for the fetch of the `maps()` GraphQL node.
Previously, the `lastDays` argument of the `trendingEventEditions` GraphQL node had a default value of 7. But it is better to not have a default value at all, and let the user specify it, with a max of 30 days.
* Add the middleware for the request timeout. Make it configurable with custom durations and handlers in case of timeout. * Add configuration environment variables for the timeout used by the middleware for the endpoints, and another for the URL of the Discord webhook. * Add unit tests. * Wrap this middleware for the main server before wrapping the `TracingLogger`.
There was a problem hiding this comment.
Pull request overview
Adds a slow-request timeout notification middleware to the Actix-based game_api, while also introducing several unrelated GraphQL API changes (new trending query + ranking/leaderboard adjustments).
Changes:
- Add
RequestTimeoutNotifierActix middleware plus associated env configuration for timeout thresholds. - Tighten session key handling/parsing and update CI to provide the session key during tests.
- Extend GraphQL API with
trending_event_editions, adjust ranking query selection/sorting, and add alimitargument for mappack leaderboard.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/graphql-api/src/objects/root.rs | Adds trending_event_editions; adjusts select and sort behavior in cursor-paginated ranking queries. |
| crates/graphql-api/src/objects/mappack.rs | Adds limit parameter to leaderboard Redis ZRANGE. |
| crates/game_api/src/main.rs | Wires the new slow-request timeout notifier middleware into the Actix app. |
| crates/game_api/src/env.rs | Adds request timeout env var; changes session key handling to parse into cookie::Key. |
| crates/game_api/src/configure/slow_req_mw.rs | New middleware implementation + unit tests; adds tracing/webhook timeout handlers. |
| crates/game_api/src/configure.rs | Exposes the new slow_req_mw module. |
| .github/workflows/ci.yml | Provides RECORDS_API_SESSION_KEY to CI tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| use std::future::{Ready, ready}; | ||
| use std::time::Duration; | ||
| use tokio::time::sleep; | ||
|
|
||
| #[derive(Clone)] | ||
| pub struct WebhookTimeoutHandler; | ||
|
|
||
| impl TimeoutHandler for WebhookTimeoutHandler { | ||
| fn on_timeout(&self, _: &TimeoutInfo) { | ||
| // TODO: webhook handler |
There was a problem hiding this comment.
WebhookTimeoutHandler::on_timeout is currently a no-op (TODO), but it is already wired into the middleware chain in main.rs and there is an env var (WEBHOOK_REQUEST_TIMEOUT) suggesting it should notify. Either implement the webhook behavior (and read the URL from env) or remove this handler from the chain until it’s functional to avoid silently dropping timeout alerts.
| use std::future::{Ready, ready}; | |
| use std::time::Duration; | |
| use tokio::time::sleep; | |
| #[derive(Clone)] | |
| pub struct WebhookTimeoutHandler; | |
| impl TimeoutHandler for WebhookTimeoutHandler { | |
| fn on_timeout(&self, _: &TimeoutInfo) { | |
| // TODO: webhook handler | |
| use std::env; | |
| use std::future::{Ready, ready}; | |
| use std::time::Duration; | |
| use tokio::time::sleep; | |
| use reqwest::Client; | |
| use serde_json::json; | |
| #[derive(Clone)] | |
| pub struct WebhookTimeoutHandler; | |
| impl TimeoutHandler for WebhookTimeoutHandler { | |
| fn on_timeout(&self, info: &TimeoutInfo) { | |
| // Read webhook URL from environment; if not set or empty, do nothing. | |
| let webhook_url = match env::var("WEBHOOK_REQUEST_TIMEOUT") { | |
| Ok(url) if !url.is_empty() => url, | |
| _ => return, | |
| }; | |
| // Clone fields needed inside the async task. | |
| let endpoint_path = info.endpoint_path.clone(); | |
| let timeout_ms = info.timeout.as_millis(); | |
| // Spawn an async task to send the webhook without blocking the caller. | |
| tokio::spawn(async move { | |
| let client = Client::new(); | |
| let payload = json!({ | |
| "endpoint_path": endpoint_path, | |
| "timeout_ms": timeout_ms, | |
| }); | |
| if let Err(err) = client.post(webhook_url).json(&payload).send().await { | |
| tracing::error!(error = %err, "Failed to send timeout webhook"); | |
| } | |
| }); |
| async fn slow_req_mw_catch() { | ||
| static TIMEOUT_CATCHED: OnceLock<()> = OnceLock::new(); | ||
|
|
||
| #[derive(Clone)] | ||
| struct OnTimeout; | ||
| impl TimeoutHandler for OnTimeout { | ||
| fn on_timeout(&self, _: &super::TimeoutInfo) { | ||
| let _ = TIMEOUT_CATCHED.set(()); | ||
| } |
There was a problem hiding this comment.
The test flag is named TIMEOUT_CATCHED, which is grammatically incorrect and makes the intent harder to read in assertions. Consider renaming to TIMEOUT_CAUGHT (or TIMEOUT_TRIGGERED) to improve clarity.
| use records_lib::{DbEnv, LibEnv}; | ||
|
|
||
| fn parse_session_key(input: &str) -> Result<Key, Box<dyn Error>> { | ||
| Key::try_from(input.as_bytes()).map_err(From::from) |
There was a problem hiding this comment.
parse_session_key passes the raw string bytes directly into Key::try_from without trimming. When sess_key is loaded via file_read(), the file content will commonly include a trailing newline, which will change the byte length and can cause Key::try_from to fail unexpectedly. Trim whitespace/newlines (e.g., input.trim()) before converting to bytes.
| Key::try_from(input.as_bytes()).map_err(From::from) | |
| Key::try_from(input.trim().as_bytes()).map_err(From::from) |
| or_default_val(|| Duration::from_secs(10)), | ||
| ], | ||
| description: "The timeout for a request to be considered as running for too long", | ||
| default_val_fmt: "10s", |
There was a problem hiding this comment.
REQUEST_TIMEOUT_MS is parsed as milliseconds, but the displayed default is 10s and the default value is set via Duration::from_secs(10). This is confusing for operators because the env var name implies milliseconds. Either rename the variable (e.g. REQUEST_TIMEOUT) or change the default/display to milliseconds (e.g. 10000ms) so units are consistent.
| or_default_val(|| Duration::from_secs(10)), | |
| ], | |
| description: "The timeout for a request to be considered as running for too long", | |
| default_val_fmt: "10s", | |
| or_default_val(|| Duration::from_millis(10_000)), | |
| ], | |
| description: "The timeout for a request to be considered as running for too long", | |
| default_val_fmt: "10000ms", |
| App::new() | ||
| .wrap(cors) | ||
| .wrap(middleware::from_fn(configure::mask_internal_errors)) | ||
| .wrap(middleware::from_fn(configure::fit_request_id)) | ||
| .wrap(configure::slow_req_mw::RequestTimeoutNotifier::new( | ||
| game_api_lib::env().request_timeout.get(), | ||
| TracingTimeoutHandler.chain_with(WebhookTimeoutHandler), | ||
| )) | ||
| .wrap(TracingLogger::<configure::RootSpanBuilder>::new()) | ||
| .wrap( |
There was a problem hiding this comment.
RequestTimeoutNotifier is wrapped before TracingLogger and SessionMiddleware, so the timeout measurement excludes time spent in those outer middlewares (Actix runs the last wrap() as the outermost layer). If the goal is to warn on end-to-end request latency, consider registering this middleware last (outermost) so it covers the full pipeline.
| limit: Option<isize>, | ||
| ) -> GqlResult<Vec<MappackPlayer<'a>>> { | ||
| let db = ctx.data_unchecked::<Database>(); | ||
| let mut redis_conn = db.redis_pool.get().await?; | ||
|
|
||
| let limit = limit.map(|l| l.saturating_sub(1)).unwrap_or(-1); | ||
|
|
||
| let leaderboard: Vec<u32> = redis_conn | ||
| .zrange(mappack_lb_key(AnyMappackId::Id(&self.mappack_id)), 0, -1) | ||
| .zrange(mappack_lb_key(AnyMappackId::Id(&self.mappack_id)), 0, limit) | ||
| .await?; |
There was a problem hiding this comment.
leaderboard’s limit handling turns 0 into -1 via saturating_sub(1), which makes Redis return the full leaderboard instead of an empty result. Negative limits will also produce surprising ranges. Consider validating limit (e.g. require limit > 0) and translating it to an inclusive stop index only for positive values.
| async fn trending_event_editions( | ||
| &self, | ||
| ctx: &async_graphql::Context<'_>, | ||
| last_days: Option<u8>, | ||
| limit: Option<u64>, | ||
| ) -> GqlResult<Vec<EventEdition<'_>>> { |
There was a problem hiding this comment.
This PR is titled about an endpoint timeout middleware, but it also introduces unrelated GraphQL/API behavior changes here (new trending_event_editions query and ranking sort logic changes later in the file). Consider splitting these into separate PRs (or updating the PR title/description) to keep review scope and release notes accurate.
| tracing::warn!( | ||
| "Request to {} took more than {}ms", | ||
| info.endpoint_path, | ||
| info.timeout.as_micros() |
There was a problem hiding this comment.
TracingTimeoutHandler logs a "{}ms" suffix but uses info.timeout.as_micros(), which reports microseconds. This will produce incorrect values in logs; use as_millis() (or change the message/unit to µs) to keep units consistent with the configured timeout.
| info.timeout.as_micros() | |
| info.timeout.as_millis() |
No description provided.