Conversation
tweak the implementation to exclude pr-curve "limit point"
tweak again
src/functions.jl
Outdated
recalls = @view recalls[1:end-1]
precisions = @view precisions[1:end-1]

recall_threshold <= 1 || !isempty(recalls) || return 0
Suggested change:
- recall_threshold <= 1 || !isempty(recalls) || return 0
+ recall_threshold <= 1 && !isempty(recalls) || return 0
I'm guessing here you wanted to jump to the code block below if recall_threshold <= 1. Maybe this doesn't affect anything, as it's caught below and a zero is returned.
But I don't think the recalls and precisions are ever empty arrays: they should always have a length of at least two from the way they were constructed, and even after removing the last item, these arrays should have at least one item left.
A side note: when recall_threshold is negative, things still work out as stated in the docstring, but the interpretation isn't clear. For example, it no longer makes sense to say that the proportion of positives that are misclassified is one minus recall_threshold.
You are right. There are always at least two points returned by precision_recall_curve, so I've just dropped that redundant check. So now:
recall_threshold <= 1 || return 0
I could also drop this line entirely, because recalls cannot exceed 1 and the subsequent logic still works, but including the line avoids the search.
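The difference between the two guard forms under discussion comes down to Julia's short-circuit evaluation. A small standalone sketch (hypothetical helper functions with made-up values, not the package's actual code) shows why the `||` version falls through for an out-of-range threshold:

```julia
# `a || b || return 0` only returns 0 when BOTH conditions are false;
# `a && b || return 0` returns 0 when EITHER condition is false.
function guard_or(recall_threshold, recalls)
    recall_threshold <= 1 || !isempty(recalls) || return 0
    return 1  # reached the subsequent "search" logic
end

function guard_and(recall_threshold, recalls)
    recall_threshold <= 1 && !isempty(recalls) || return 0
    return 1
end

guard_or(1.2, [0.5])   # 1: falls through, because recalls is non-empty
guard_and(1.2, [0.5])  # 0: intended early return for threshold > 1
```

With the redundant non-emptiness check dropped, as in the reply above, the two forms coincide.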
end
# extreme thresholds:
@test Functions.precision_at_fixed_recall(yhat, y, "X"; recall_threshold=0.9)==0
@test Functions.precision_at_fixed_recall(yhat, y, "X"; recall_threshold=1)==0
Maybe we should have one more test:
@test Functions.precision_at_fixed_recall(yhat, y, "X"; recall_threshold=1.2)==0
Make review suggestions. Co-authored-by: Okon Samuel <[email protected]>
@OkonSamuel Thank you for your careful review.
Replaces #69.
Docstring
Return a callable measure for computing the precision at fixed recall. Aliases: precision_at_fixed_recall.

    m(ŷ, y)

Evaluate some measure m returned by the PrecisionAtFixedRecall constructor (e.g., m = PrecisionAtFixedRecall()) on predictions ŷ, given ground truth observations y. It is expected that ŷ be a vector of distributions over the binary set of unique elements of y; specifically, ŷ should have eltype <:UnivariateFinite from the CategoricalDistributions.jl package.

This metric is useful, in applications such as toxicity detection, anomaly detection, and screening for disease markers, if one wants a cap on the proportion of positives that are misclassified (one minus the recall) while minimizing the rate of false alarms (one minus the precision).
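As an illustration of the quantity being computed, here is a self-contained sketch (not the package implementation; the curve points are made up, and the assumption that recalls are sorted in increasing order with the curve's "limit point" already excluded is mine):

```julia
# Given points on a precision-recall curve, report the precision at the
# first recall >= recall_threshold; if several points share that recall,
# average their precisions; if no recall qualifies, return 0.
function precision_at_recall_sketch(recalls, precisions, recall_threshold)
    recall_threshold <= 1 || return 0.0
    i = findfirst(>=(recall_threshold), recalls)
    i === nothing && return 0.0
    matching = findall(==(recalls[i]), recalls)
    return sum(precisions[matching]) / length(matching)
end

recalls    = [0.2, 0.5, 0.5, 0.8]
precisions = [1.0, 0.9, 0.7, 0.6]
precision_at_recall_sketch(recalls, precisions, 0.4)  # ≈ 0.8 (mean of 0.9 and 0.7)
precision_at_recall_sketch(recalls, precisions, 1.2)  # 0.0 (threshold out of range)
```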
More precisely, the measure:
recall_threshold.

In the event there are multiple precisions for the same recall, the mean precision is returned. In the event no recall is found in Step 2, a precision of 0 is returned.

Core implementation: Functions.precision_at_fixed_recall.

Generally, an observation obs in MLUtils.eachobs(y) is expected to satisfy ScientificTypes.scitype(obs) <: ScientificTypesBase.OrderedFactor{2}.

See also precision_recall_curve.

For a complete dictionary of available measures, keyed on constructor, run measures().

Traits