
Add precision at fixed recall #73

Merged
ablaom merged 13 commits into dev from precision-given-recall2 on Mar 22, 2026

Conversation

@ablaom (Member) commented Mar 8, 2026

Replaces #69.

Docstring
PrecisionAtFixedRecall(; recall_threshold=0.95)

Return a callable measure for computing the precision at fixed recall. Aliases: precision_at_fixed_recall.

m(ŷ, y)

Evaluate some measure m returned by the PrecisionAtFixedRecall constructor (e.g., m = PrecisionAtFixedRecall()) on predictions ŷ, given ground truth observations y. It is expected that ŷ be a vector of distributions over the binary set of unique elements of y; specifically, ŷ should have eltype <:UnivariateFinite from the CategoricalDistributions.jl package.

This metric is useful in applications such as toxicity detection, anomaly detection, and screening for disease markers, where one wants a cap on the proportion of positives that are misclassified (one minus the recall) while also minimizing the rate of false alarms (one minus the precision).

More precisely, the measure:

  1. Determines all values of the recall, as one varies the probability threshold for a positive outcome over all predicted probabilities for that class.
  2. Among these recalls, finds the smallest one that exceeds or equals recall_threshold.
  3. Returns the corresponding precision for that recall.

In the event there are multiple precisions for the same recall, the mean precision is returned. In the event no recall is found in Step 2, a precision of 0 is returned.
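The three steps can be sketched in Python (an illustrative re-implementation only, not the package's Julia code; the actual implementation is `Functions.precision_at_fixed_recall`, and all names below are local to the sketch):

```python
def precision_recall_points(scores, labels):
    """Step 1: sweep the probability threshold over all predicted scores,
    calling a score >= threshold a positive, and collect (precision, recall)
    pairs. labels are 0/1 with 1 the positive class."""
    n_pos = sum(labels)
    pairs = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        precision = tp / (tp + fp) if tp + fp else 1.0
        recall = tp / n_pos
        pairs.append((precision, recall))
    return pairs

def precision_at_fixed_recall(scores, labels, recall_threshold=0.95):
    pairs = precision_recall_points(scores, labels)
    # Step 2: among the recalls, find the smallest that is >= recall_threshold.
    feasible = [r for _, r in pairs if r >= recall_threshold]
    if not feasible:
        return 0.0  # no recall meets the threshold
    r_star = min(feasible)
    # Step 3: return the corresponding precision; if several points share
    # that recall, return the mean of their precisions.
    ps = [p for p, r in pairs if r == r_star]
    return sum(ps) / len(ps)
```

For example, with scores `[0.9, 0.8, 0.3, 0.2]` and labels `[1, 1, 0, 1]`, the only recall meeting a 0.95 threshold is 1.0, attained at the lowest threshold, so the returned precision is 3/4.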

Core implementation: Functions.precision_at_fixed_recall.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs) <: ScientificTypesBase.OrderedFactor{2}.

See also precision_recall_curve.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits

consumes_multiple_observations = true
can_report_unaggregated = false
kind_of_proxy = LearnAPI.Distribution()
observation_scitype = ScientificTypesBase.OrderedFactor{2}
can_consume_tables = false
supports_weights = false
supports_class_weights = false
orientation = StatisticalMeasuresBase.Score()
external_aggregation_mode = StatisticalMeasuresBase.Mean()
human_name = precision at fixed recall

@ablaom ablaom mentioned this pull request Mar 8, 2026
@ablaom ablaom requested a review from OkonSamuel March 9, 2026 21:08
src/functions.jl Outdated
recalls = @view recalls[1:end-1]
precisions = @view precisions[1:end-1]

recall_threshold <= 1 || !isempty(recalls) || return 0
@OkonSamuel (Member) commented Mar 22, 2026

Suggested change
recall_threshold <= 1 || !isempty(recalls) || return 0
recall_threshold <= 1 && !isempty(recalls) || return 0

I'm guessing you wanted to jump to the code block below when recall_threshold <= 1. Maybe this doesn't affect anything, as the case is caught below and a zero is returned.
But I don't think recalls and precisions are ever empty arrays: by construction they always have length at least two, so even after removing the last item they have at least one item left.

A side note: when recall_threshold is negative, things still work out as stated in the docstring, but the interpretation isn't clear. For example, it no longer makes sense to say that the proportion of positives that are misclassified is one minus the recall_threshold.

@ablaom (Member, Author)

You are right. There are always at least two points returned by precision_recall_curve, so I've just dropped that redundant check. So now:

recall_threshold <= 1 || return 0

I could also drop this line entirely, because recalls cannot exceed 1 and the subsequent logic works, but including this line avoids the search.
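The effect of the retained guard can be seen in a minimal Python sketch (a hypothetical helper operating on precomputed recalls and precisions, not the package's code):

```python
def precision_at_threshold(recalls, precisions, recall_threshold):
    """Guard first: no recall can exceed 1, so a threshold above 1 is
    unreachable and we can return 0 without searching the curve."""
    if recall_threshold > 1:
        return 0.0
    feasible = [(r, p) for r, p in zip(recalls, precisions) if r >= recall_threshold]
    if not feasible:
        return 0.0
    r_star = min(r for r, _ in feasible)  # smallest recall meeting the threshold
    ps = [p for r, p in feasible if r == r_star]
    return sum(ps) / len(ps)  # mean precision if that recall occurs more than once
```

Without the guard, a threshold above 1 would still fall through to the empty-`feasible` branch and return 0; the guard just skips the search, matching the behavior the tests below exercise.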

end
# extreme thresholds:
@test Functions.precision_at_fixed_recall(yhat, y, "X"; recall_threshold=0.9)==0
@test Functions.precision_at_fixed_recall(yhat, y, "X"; recall_threshold=1)==0
@OkonSamuel (Member)

Maybe we should have one more test

    @test Functions.precision_at_fixed_recall(yhat, y, "X"; recall_threshold=1.2)==0

@ablaom (Member, Author)

Added, thanks!

@ablaom (Member, Author) commented Mar 22, 2026

@OkonSamuel Thank you for your careful review.

@ablaom ablaom merged commit 341e246 into dev Mar 22, 2026
3 checks passed
This was referenced Mar 22, 2026
2 participants