aksd: harden App Insights telemetry to anonymous app-usage counts#626
aksd: harden App Insights telemetry to anonymous app-usage counts#626gambtho wants to merge 13 commits intoAzure:headlamp-downstreamfrom
Conversation
Disable AJAX/fetch/exception/cookies/storage/auto-route auto-collection, drop the unconditional trackPageView() call (URL contains cluster and resource identifiers), and register a telemetry initializer that strips ai.user.* / ai.session.id / ai.location.ip tags and clears uri/refUri/url fields on every outgoing envelope.
Lock the contract: identifying tags (ai.user.*, ai.session.id, ai.location.ip) are removed; uri/refUri/url fields on baseData are blanked when present; benign tags and other baseData fields are preserved; envelopes without tags or baseData don't crash.
Gate the trackEvent call in the redux listener middleware on an explicit allowlist of HeadlampEventType values plus the synthetic 'exception' name. Future event-type additions that accidentally encode user data in the type string cannot leak — they must be opted into telemetry explicitly. action.payload.data is still never forwarded (already true), now with a comment making that contract explicit.
Two new cases: dispatching an event whose type is not on the allowlist does not call trackEvent; dispatching an allowlisted event with a populated data field calls trackEvent with only the type, never the data. The two existing analytics-tracking cases used the placeholder type 'test-event-type', which the allowlist correctly rejects. Updated them to use 'headlamp.list-view' so they continue to verify the appInsights-defined and appInsights-undefined paths.
Drop trackException, which shipped the full Error including message and
stack — error messages routinely contain identifiers (namespaces, resource
names, URLs, YAML fragments).
ErrorBoundary now calls trackEvent('exception', { errorName: error.name }),
shipping only the constructor name (TypeError, RangeError, KubeApiError,
etc.). Gives error-class distributions for triage without leaking content.
'exception' is allowlisted in headlampEventSlice.
Verify componentDidCatch ships trackEvent('exception', { errorName }) and
nothing else. A custom Error subclass produces its class name; a plain
Error produces 'Error'. Asserts that the message (which contained a
namespace and URL in the test fixture) is never forwarded in any property.
There was a problem hiding this comment.
Pull request overview
This PR hardens App Insights telemetry from the Headlamp frontend to ensure only anonymous app-usage counts are emitted, addressing an external review concern about overly liberal telemetry (URLs, error messages/stacks, user/session correlation).
Changes:
- Adds a telemetry event allowlist in the redux listener middleware and ensures only the event type is forwarded (never
action.payload.data). - Replaces exception telemetry from
trackException(error)with a scrubbedtrackEvent('exception', { errorName }). - Locks down App Insights SDK auto-collection and adds a telemetry initializer to strip identifying tags and URL fields, with new unit tests covering these behaviors.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| frontend/src/redux/headlampEventSlice.ts | Adds allowlist-based filtering before forwarding events to App Insights. |
| frontend/src/redux/headlampEventSlice.test.tsx | Updates/extends tests to verify allowlist behavior and that payload data is never forwarded. |
| frontend/src/lib/analytics.tsx | Removes trackException and leaves only trackEvent. |
| frontend/src/lib/analytics.test.ts | Adds unit tests for the telemetry initializer’s scrubbing behavior. |
| frontend/src/components/common/ErrorBoundary/ErrorBoundary.tsx | Sends scrubbed exception telemetry (error.name only) instead of full Error objects. |
| frontend/src/components/common/ErrorBoundary/ErrorBoundary.test.tsx | Adds tests asserting exception telemetry contains only errorName and no message/URL fragments. |
| frontend/src/analyticsSetup.ts | Disables SDK auto-collection, removes pageview tracking, and adds a privacy telemetry initializer. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
trackException was removed in the privacy hardening change; the plugin-compatibility snapshot needs to follow.
Move the function from analyticsSetup.ts (which has module-load side effects when REACT_APP_APPINSIGHTS_CONNECTION_STRING is set) to a new lib/analyticsPrivacy.ts. The unit test imports from there so it never risks pulling AppInsights initialization into the test's module graph. Addresses Copilot review feedback on PR Azure#626.
Make TELEMETRY_EVENT_ALLOWLIST module-private and expose only an isTelemetryEventAllowlisted(type) predicate. ReadonlySet only restricts TypeScript callers; the underlying Set is mutable at runtime. A predicate closes that hole — the collection is unreachable from outside the module. Addresses Copilot review feedback on PR Azure#626.
Regenerated under Node 20 (matching CI) with no other changes. The diff is exclusively MUI emotion-generated CSS class-hash updates (e.g. 'css-1ml9dpn-MuiTable-root' -> 'css-1ok7ihs-MuiTable-root'); no semantic content shifted. 39 files / 49 line changes. Drift was pre-existing on headlamp-downstream and surfaced as snapshot mismatches on PR Azure#626's CI. Refreshing here so this PR's CI can go green.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 48 out of 48 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
skoeva
left a comment
There was a problem hiding this comment.
You could update the snapshots in the frontend since the CI seems to be failing
Any envelope the SDK emits or queues during initialization would otherwise bypass the privacy scrubber. Register the initializer first so the scrubbing pipeline is in place before telemetry can flow.
The previous wording claimed the allowlist "cannot be mutated at runtime", which isn't quite true — Set instances remain mutable, even when typed as ReadonlySet. The actual guarantee is encapsulation: only the isTelemetryEventAllowlisted predicate is exported, so external callers cannot reach the underlying set. Reword to describe what's actually guaranteed.
Removing trackException entirely from lib/analytics broke the public
plugin API surface — window.pluginLib.analytics.trackException is gone
for any external plugin still calling it. Restore the export as a thin
backwards-compatible wrapper that emits trackEvent('exception', {
errorName: error.name }), so only the constructor name reaches the
wire — never the message, stack, or Error object. Mark @deprecated
with a pointer to the recommended trackEvent call.
Restore the corresponding line in pluginLib.snapshot and add unit
tests covering the scrubbing behavior, the empty-name fallback, and
the appInsights-uninitialized no-op path.
45ca87f to
96fffec
Compare
Summary
Reduce App Insights telemetry shipped from the Headlamp frontend to anonymous app-usage counts only. An external review flagged the previous implementation as too liberal.
trackPageView()call (URLs in Headlamp encode cluster, namespace, and resource names).HeadlampEventTypeaddition that accidentally encodes user data in its name string cannot leak — it must be opted into telemetry explicitly.action.payload.datais still never forwarded.trackException(which shipped the fullErrorincluding message and stack) withtrackEvent('exception', { errorName: error.name }). Ships only the constructor name (e.g.TypeError,KubeApiError) — no message, no stack.ai.user.*/ai.session.id/ai.location.ipenvelope tags and clearsuri/refUri/urlfields on every outgoing envelope, regardless of how they got there.Net result: each outgoing telemetry envelope contains an allowlisted event name (
headlamp.list-view,headlamp.delete-resource, …) and, for'exception',errorName. No URL, user id, session id, IP, cluster, namespace, resource name, error message, or stack.Test plan
npm run tsccleannpm run lintcleananalytics.test.ts,headlampEventSlice.test.tsx,ErrorBoundary.test.tsx)REACT_APP_APPINSIGHTS_CONNECTION_STRINGset; confirm in browser devtools (Network →track) that envelopes contain noai.user.*/ai.session.id/ai.location.iptags, nouri/refUri/url, noPageView, noRemoteDependencyData, and that triggering an error in anErrorBoundaryproduces a singleEventDataenvelope withname: 'exception'andproperties: { errorName: ... }only.