feat: Add canister_metrics endpoint on management canister #6217

Open
dsarlis wants to merge 18 commits into dfinity:master from dsarlis:dimitris/management-canister-metrics

Conversation

@dsarlis dsarlis commented Apr 7, 2026

This PR introduces a management canister API that reports canister-related metrics to canister controllers or subnet admins. The API currently includes only a set of cycles consumed by different use cases, but it is designed so that it can easily be extended with more metrics (cycles-related or otherwise, e.g. calls processed).

@dsarlis dsarlis requested review from a team as code owners April 7, 2026 07:35
@github-actions github-actions Bot added the interface-spec Changes to the IC Interface Specification label Apr 7, 2026
Comment thread docs/references/ic-interface-spec.md Outdated
Comment thread docs/references/_attachments/ic.did Outdated
@dsarlis dsarlis requested a review from mraszyk April 7, 2026 09:24
Comment thread docs/references/ic-interface-spec.md
@mraszyk mraszyk left a comment

LGTM modulo the note about query calls in the formal part of the spec

@dsarlis dsarlis requested a review from mraszyk April 7, 2026 09:44
Comment thread docs/references/ic-interface-spec.md Outdated
Co-authored-by: mraszyk <31483726+mraszyk@users.noreply.github.com>
dsarlis added a commit to dsarlis/ic that referenced this pull request May 5, 2026
Add counter versions of metrics for consumed cycles that are stored in
`ReplicatedState`. The existing ones behave like gauges (their values
can go down when prepayments are made and up when refunds are issued),
which makes it harder for consumers to build automated monitoring tools
that aggregate over them. Because the new metrics increase
monotonically, it's easier to calculate rates of change, show
aggregates over time, etc.
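
To illustrate why a monotonic counter is easier to consume, here is a minimal sketch of a rate calculation over two samples. The `CounterSample` type and `consumption_rate` function are hypothetical names made up for this example; they are not part of the PR or the replica code.

```rust
/// Hypothetical sample of a monotonically increasing "consumed cycles" counter.
#[derive(Clone, Copy, Debug)]
pub struct CounterSample {
    pub timestamp_secs: u64,
    pub consumed_cycles: u128,
}

/// Returns the average consumption rate in cycles per second between two samples.
/// Returns `None` if no time has elapsed, if the samples are out of order, or if
/// the counter went backwards (which a true counter should never do).
pub fn consumption_rate(prev: CounterSample, curr: CounterSample) -> Option<f64> {
    let elapsed = curr.timestamp_secs.checked_sub(prev.timestamp_secs)?;
    if elapsed == 0 {
        return None;
    }
    let delta = curr.consumed_cycles.checked_sub(prev.consumed_cycles)?;
    Some(delta as f64 / elapsed as f64)
}
```

With a gauge, the same subtraction could yield a negative or misleading delta whenever a prepayment or refund lands between the two samples, so a consumer would need extra bookkeeping.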

The key idea is to introduce a second map of `<CyclesUseCase,
NominalCycles>` in the `ReplicatedState` that is updated only once
per use case: either at the payment stage if the precise amount is
known, or at the refund stage if a prepayment is made with an expected
refund later. The second map is similar to the existing one in all other
aspects (how it is stored in checkpoints and how it is exposed as
Prometheus metrics); only the way the values are updated differs.
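
The update rule described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code from the PR: the enum lists only two example variants, `NominalCycles` is reduced to a plain `u128` alias, and the struct and method names (`ConsumedCyclesCounters`, `record_exact_payment`, `record_settled_prepayment`) are hypothetical.

```rust
use std::collections::BTreeMap;

/// Hypothetical subset of use cases; the real `CyclesUseCase` enum has more variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum CyclesUseCase {
    Instructions,
    HttpsOutcalls,
}

/// Simplified stand-in for the real `NominalCycles` type (assumption).
pub type NominalCycles = u128;

/// Counter-style map: every entry only ever increases.
#[derive(Default)]
pub struct ConsumedCyclesCounters {
    by_use_case: BTreeMap<CyclesUseCase, NominalCycles>,
}

impl ConsumedCyclesCounters {
    /// Payment stage: the precise amount is known, so record it immediately.
    pub fn record_exact_payment(&mut self, use_case: CyclesUseCase, amount: NominalCycles) {
        *self.by_use_case.entry(use_case).or_insert(0) += amount;
    }

    /// Refund stage: nothing was recorded at prepayment time, so record the
    /// net amount (prepaid minus refund) once, when the refund is known.
    pub fn record_settled_prepayment(
        &mut self,
        use_case: CyclesUseCase,
        prepaid: NominalCycles,
        refund: NominalCycles,
    ) {
        let consumed = prepaid.saturating_sub(refund);
        *self.by_use_case.entry(use_case).or_insert(0) += consumed;
    }

    /// Current counter value for a use case (zero if never recorded).
    pub fn get(&self, use_case: CyclesUseCase) -> NominalCycles {
        self.by_use_case.get(&use_case).copied().unwrap_or(0)
    }
}
```

Because each use case is recorded exactly once, with a non-negative amount, the values in this map can never decrease.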

A new map is introduced to ease the transition, as migrating from the
old map to the new one is non-trivial: a proper cutoff point needs to be
introduced to handle outstanding callbacks that might have been created
before the metric was introduced. This is left for a follow-up if and
when people decide to do it. The new map will be used in a follow-up
that will implement the [new management canister
endpoint](dfinity/portal#6217) to retrieve
canister-level metrics.

Additionally, the new metrics include the use case `HttpsOutcalls` in
the canister-level metrics, as it's useful to determine how much each
canister uses this feature. I've opted not to change the existing
metrics to do the same, as it would make things less clean imo than the
current approach: a single specific API is used to perform this update
in exactly one place where it's needed.

The changes in the PR are mostly driven by the addition of the new map
of metrics, updates to protobuf files to store the new metrics,
changes to include the `HttpsOutcalls` use case, and some changes in
tests to support the new metrics.
@dsarlis dsarlis requested a review from a team as a code owner May 5, 2026 08:59
@mraszyk mraszyk left a comment

This doesn't seem merged at the moment so not sure what kind of feedback/action is expected.

list_canisters : () -> (list_canisters_result) query;

// Returns canister related metrics
canister_metrics: (canister_metrics_args) -> (canister_metrics_result) query;
Contributor

let's move this close to canister_status since it belongs together

Contributor Author

Do you mean only here or also in the main document? It's currently listed as the last API; it seems we kinda follow chronological order there, but I can move it. I don't have a strong opinion.

dsarlis commented May 5, 2026

> This doesn't seem merged at the moment so not sure what kind of feedback/action is expected.

Nothing more than what you already provided. I merged with master and had to fix some conflicts, so I just wanted to make sure it's ready (sans the changelog entry, which will be added right before merging).

pull Bot pushed a commit to bit-cook/ic that referenced this pull request May 5, 2026
Implement the `canister_metrics` endpoint as described in
dfinity/portal#6217 to allow controllers or
subnet admins to retrieve canister level metrics for a target canister.
The metrics currently contain some basic cycles-related ones but can be
extended in the future to contain more.

The necessary boilerplate is added to wire the new endpoint through the
code stack. As defined in the interface spec, the API is available both
in replicated mode and as a query call.

A few tests are also added to ensure that the endpoint correctly returns
the values that are present in the replicated state.
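
The access rule stated above (only controllers or subnet admins may retrieve metrics for a target canister) could be expressed roughly as in the sketch below. It is an illustration only: principals are modelled as plain strings, and the function name `may_read_canister_metrics` and its parameters are hypothetical, not taken from the implementation.

```rust
use std::collections::BTreeSet;

/// Returns true if `caller` is allowed to read metrics for the target canister.
/// Assumption: principals are represented as their textual form, and the
/// controller and subnet admin sets are supplied by the caller of this check.
pub fn may_read_canister_metrics(
    caller: &str,
    controllers: &BTreeSet<String>,
    subnet_admins: &BTreeSet<String>,
) -> bool {
    controllers.contains(caller) || subnet_admins.contains(caller)
}
```

Keeping the check as a pure function over the two sets makes it straightforward to unit test independently of the replicated state.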