Status: Phase 2B PR #15 (additive). Existing consumers reading confidence continue to work.
Before PR #15, every CVE finding carried a single confidence field that combined:
- Static evidence depth (symbol cooccurrence vs decompiled code vs P-code dataflow)
- Reachability multiplier (`directly_reachable` 1.0 ... `unreachable` 0.5)
- Backport penalty (-0.30 if the component shows a distro patch revision)
- EPSS additive (+0.10 / +0.05 / -0.05 depending on FIRST.org EPSS bucket)
External reviewers (Gemini, ChatGPT) flagged this as a category error:
"EPSS is a population-level exploitation likelihood -- making it modify detection confidence makes the field look like a ranking heuristic, not a probability of true positive."
PR #15 splits the single field into two scores with distinct semantics:
| Score | Meaning | Range | Source |
|---|---|---|---|
| `confidence` (a.k.a. `detection_confidence`) | Probability that the finding reflects a real vulnerability. Pure static evidence depth. | 0.0 - 1.0 | `confidence_caps.py` (capped) |
| `priority_score` | Operational priority for analyst triage. Combines detection confidence with EPSS, reachability, CVSS, backport. | 0.0 - 1.0 | `scoring.compute_priority_score()` |
Iron rule: if you want to rank findings by "look at this first", read
priority_score. If you want to know "is this a true positive", read
confidence. They are no longer the same thing.
`confidence` is a strict static-evidence-only signal, capped at the appropriate level from
`confidence_caps.py`:
| Cap constant | Value | When |
|---|---|---|
| `SYMBOL_COOCCURRENCE_CAP` | 0.40 | Symbol co-occurrence only -- no code path confirmed |
| `STATIC_CODE_VERIFIED_CAP` | 0.55 | Decompiled code inspected but no LLM taint trace |
| `STATIC_ONLY_CAP` | 0.60 | Generic static-only ceiling (legacy) |
| `PCODE_VERIFIED_CAP` | 0.75 | P-code SSA dataflow confirmed source -> sink |
CVE findings emitted by cve_scan.py are always capped at
STATIC_CODE_VERIFIED_CAP (0.55) -- NVD evidence by itself is "decompiled
code level" at best, never P-code-confirmed.
EPSS, reachability, backport status, and CVSS do not modify this
field. They feed priority_score instead.
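As a rough illustration of how a cap keeps the detection signal bounded, the sketch below restates the constants from the table and applies one of them. Only the constant names and values come from the table; the helper `apply_cap` is hypothetical and is not taken from `confidence_caps.py`.

```python
# Cap values restated from the table above; the real definitions live in
# confidence_caps.py and may be organized differently.
SYMBOL_COOCCURRENCE_CAP = 0.40
STATIC_CODE_VERIFIED_CAP = 0.55
STATIC_ONLY_CAP = 0.60
PCODE_VERIFIED_CAP = 0.75


def apply_cap(raw_confidence: float, cap: float) -> float:
    """Clamp a raw static-evidence score into [0.0, cap] (hypothetical helper)."""
    return max(0.0, min(cap, raw_confidence))


# A CVE finding is held to the decompiled-code ceiling; EPSS and CVSS play no role here.
cve_confidence = apply_cap(0.5292, STATIC_CODE_VERIFIED_CAP)  # below the cap, unchanged
over_cap = apply_cap(0.90, STATIC_CODE_VERIFIED_CAP)          # clipped to the cap
```

Keeping the cap as the only upper bound is what lets the field be read as "how much the static evidence supports", independent of any exploitation signal.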
| Stage | Sets `confidence` | Sets `priority_score` |
|---|---|---|
| `cve_scan` | yes (static cap) | yes (full `PriorityInputs`) |
| `pattern_scan` | yes | via `findings.py` default (uses `confidence` only) |
| `taint_propagation` | yes (P-code cap when verified) | via `findings.py` default |
| `findings` (assembler) | normalizes existing field | fills `priority_score` for any finding missing it |
`priority_score` is the operational priority for analyst triage. It is NOT a probability of true positive.

    priority_score = clamp01(
        detection_confidence * 0.50       # 50% weight: detection
        + epss_score * 0.25               # 25% weight: EPSS (if known)
        + reachability_mult * 0.15        # 15% weight: reachability
        + (cvss_base / 10.0) * 0.10       # 10% weight: CVSS (if known)
        - backport_penalty                # 0.20 if backport_present, else 0
    )

Reachability multipliers (different from the legacy _REACHABILITY_MULTIPLIERS
in cve_scan.py -- these were retuned for the priority weighting):
| Reachability | Multiplier |
|---|---|
| `directly_reachable` | 1.0 |
| `potentially_reachable` | 0.7 |
| `unknown` (or missing) | 0.5 |
| `unreachable` | 0.2 |
Backport penalty: a flat -0.20 when backport_present=True. The
finding is still surfaced; it just ranks lower than an unpatched peer.
Both EPSS and CVSS are optional inputs -- when None, their term is
omitted from the sum (NOT replaced with zero), so a finding with no CVSS
data does not get artificially penalized.
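The weighted sum can be restated as a small function. This is a minimal sketch under stated assumptions, not the code behind `scoring.compute_priority_score()`: the name `sketch_priority_score` and its signature are hypothetical, unrecognized reachability strings are assumed to behave like `unknown`, and an omitted EPSS or CVSS term is simply skipped without renormalizing the remaining weights (the text above does not say whether the real scorer renormalizes).

```python
from typing import Optional

# Reachability multipliers restated from the table above.
REACHABILITY_MULT = {
    "directly_reachable": 1.0,
    "potentially_reachable": 0.7,
    "unknown": 0.5,
    "unreachable": 0.2,
}
BACKPORT_PENALTY = 0.20


def clamp01(x: float) -> float:
    """Clamp a value into [0.0, 1.0]."""
    return max(0.0, min(1.0, x))


def sketch_priority_score(
    detection_confidence: float,
    epss_score: Optional[float] = None,
    reachability: Optional[str] = None,
    cvss_base: Optional[float] = None,
    backport_present: bool = False,
) -> float:
    """Illustrative restatement of the weighted sum; not the project's scorer."""
    # Missing reachability uses the "unknown" multiplier; treating unrecognized
    # strings the same way is an assumption of this sketch.
    reach = REACHABILITY_MULT.get(reachability or "unknown", REACHABILITY_MULT["unknown"])
    total = detection_confidence * 0.50 + reach * 0.15
    if epss_score is not None:   # optional term, skipped when EPSS is unknown
        total += epss_score * 0.25
    if cvss_base is not None:    # optional term, skipped when CVSS is unknown
        total += (cvss_base / 10.0) * 0.10
    if backport_present:
        total -= BACKPORT_PENALTY
    return clamp01(total)


score = sketch_priority_score(
    0.5292, epss_score=0.42, reachability="directly_reachable", cvss_base=9.8
)
print(round(score, 4))  # 0.6176
```

With only `detection_confidence` supplied, the sketch yields the detection term plus the `unknown` reachability term and nothing else.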
Every finding with a `priority_score` also carries a `priority_inputs`
object documenting which signals were combined; its schema is listed at the end of this document.
This object is intentionally serialized verbatim so analysts can audit the priority computation without re-running the scorer.
scoring.priority_bucket() (and the parallel
quality_metrics._priority_bucket_label()) classify a priority_score
into one of four operational buckets:
| Bucket | Range |
|---|---|
| `critical` | >= 0.80 |
| `high` | 0.60 - 0.80 |
| `medium` | 0.40 - 0.60 |
| `low` | < 0.40 |
quality_metrics.count_findings_by_priority() aggregates a finding list
into bucket counts. Pre-PR #15 findings (which lack priority_score)
land in an unscored bucket; the existing per-confidence aggregation is
preserved unchanged.
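The bucketing and aggregation described above might look roughly like the following. The function names `bucket_for` and `count_by_bucket` are hypothetical stand-ins, not the signatures of `scoring.priority_bucket()` or `quality_metrics.count_findings_by_priority()`, and the sketch assumes each range includes its lower bound, which the table does not state explicitly for 0.60 and 0.40.

```python
from collections import Counter
from typing import Iterable, Mapping, Optional


def bucket_for(priority_score: Optional[float]) -> str:
    """Map a score to a bucket label; None marks a finding without priority_score."""
    if priority_score is None:
        return "unscored"
    if priority_score >= 0.80:
        return "critical"
    if priority_score >= 0.60:   # assumption: lower bound inclusive
        return "high"
    if priority_score >= 0.40:   # assumption: lower bound inclusive
        return "medium"
    return "low"


def count_by_bucket(findings: Iterable[Mapping]) -> Counter:
    """Aggregate a finding list into per-bucket counts."""
    return Counter(bucket_for(f.get("priority_score")) for f in findings)


counts = count_by_bucket([
    {"priority_score": 0.6176},
    {"priority_score": 0.4176},
    {"confidence": 0.55},        # pre-PR #15 shape: no priority_score -> unscored
])
```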
Hypothetical CVE: CVE-2024-9999 in dnsmasq 2.0 -- CVSS 9.8 critical,
EPSS 0.42 (very high), reachability `directly_reachable`, no backport.

Before PR #15 (legacy single field):

    match_conf = 0.90                                  # exact CPE match
    score = 9.8                                        # CVSS base
    confidence = 0.90 * 9.8/10.0 * 0.6 = 0.5292        # _finding_confidence
    confidence = min(0.60, 0.5292) = 0.5292            # _STATIC_CONFIDENCE_CAP
    confidence *= 1.0 (directly_reachable)             # reach multiplier
    confidence += 0.10 (EPSS >= 0.10) = 0.6292         # EPSS additive
    confidence = min(0.60, 0.6292) = 0.60              # capped again
    # Single field reads:
    finding["confidence"] == 0.60

The 0.60 is read by some downstream UIs as "we are 60% sure this is a
real vulnerability". That is wrong -- the static evidence alone only
supports 0.53; the rest of the bump comes from EPSS, which says nothing
about whether this binary is actually vulnerable.
After PR #15 (two fields):

    match_conf = 0.90
    score = 9.8
    # detection_confidence: static evidence only, capped at STATIC_CODE_VERIFIED_CAP
    detection_confidence = 0.90 * 9.8/10.0 * 0.6 = 0.5292
    detection_confidence = min(0.55, 0.5292) = 0.5292
    # priority_score: operational signal from the scorer
    priority_score = (
        0.5292 * 0.50      # detection     = 0.2646
        + 0.42 * 0.25      # EPSS          = 0.1050
        + 1.0 * 0.15       # reachability  = 0.1500
        + 0.98 * 0.10      # CVSS / 10     = 0.0980
        - 0.0              # no backport
    ) = 0.6176             # clamped to [0,1] -> 0.6176
    # Two fields read:
    finding["confidence"] == 0.5292        # strict detection -- unchanged by EPSS
    finding["priority_score"] == 0.6176    # operational priority -- EPSS visible
    finding["priority_inputs"] == {
        "detection_confidence": 0.5292,
        "epss_score": 0.42,
        "epss_percentile": 0.93,
        "reachability": "directly_reachable",
        "backport_present": False,
        "cvss_base": 9.8,
    }

The detection number now honestly reflects what the static evidence
supports; the EPSS boost is visible in `priority_score` and traceable in
`priority_inputs` for analyst audit.
Same finding, but `backport_present=True` (component has an opkg
distro patch revision):

    detection_confidence == 0.5292             # UNCHANGED -- backport doesn't lie about evidence
    priority_score == 0.6176 - 0.20 = 0.4176   # drops from the high bucket to medium

Detection stays anchored to the static evidence; priority drops because the vuln may already be patched in this image.
- Schema version is NOT bumped. Both `priority_score` and `priority_inputs` are additive optional fields, exactly like `category` in PR #7a.
- Existing consumers reading `confidence` continue to work -- they will just see lower numbers for CVE findings (no more EPSS additive).
- New ranking UIs should read `priority_score` instead of `confidence`. The `priority_inputs` field exposes the inputs for full auditability.
- Quality metrics: a new `count_findings_by_priority()` helper produces per-bucket counts. The existing per-confidence and per-category aggregations are unchanged.
- SARIF export: `properties.scout_priority_score` and `properties.scout_priority_inputs` mirror the new fields. The existing `properties.confidence` key is unchanged.
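On the consumer side, switching a ranking view over to the new field could be sketched as below. The helper `rank_for_triage` is hypothetical, and falling back to `confidence` for findings that lack `priority_score` is an assumption of this sketch rather than documented behavior.

```python
from typing import Any, Dict, List


def rank_for_triage(findings: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort findings highest priority first (hypothetical consumer-side helper)."""

    def sort_key(finding: Dict[str, Any]) -> float:
        # Assumption: fall back to confidence when priority_score is absent or null.
        score = finding.get("priority_score")
        if score is None:
            score = finding.get("confidence", 0.0)
        return score

    return sorted(findings, key=sort_key, reverse=True)


findings = [
    {"id": "A", "confidence": 0.5292, "priority_score": 0.4176},
    {"id": "B", "confidence": 0.5292, "priority_score": 0.6176},
    {"id": "C", "confidence": 0.40},   # older finding without priority_score
]
ordered_ids = [f["id"] for f in rank_for_triage(findings)]
```

Findings A and B share the same `confidence`, so only `priority_score` separates the backported one from its unpatched peer.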
`priority_inputs` schema:

    {
      "detection_confidence": 0.55,          // float 0.0-1.0; the static-evidence cap
      "epss_score": 0.42,                    // float 0.0-1.0 or null (FIRST.org EPSS)
      "epss_percentile": 0.93,               // float 0.0-1.0 or null (companion percentile)
      "reachability": "directly_reachable",  // string or null
      "backport_present": false,             // boolean
      "cvss_base": 9.8                       // float 0.0-10.0 or null
    }
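One way to model this object in Python is a dataclass that serializes field-for-field. This is a sketch only: the class name `PriorityInputsSketch` is hypothetical, and the defaults are assumptions; the actual `PriorityInputs` type used by `cve_scan` may be shaped differently.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class PriorityInputsSketch:
    """Field names mirror the schema above; defaults are assumptions."""
    detection_confidence: float
    epss_score: Optional[float] = None
    epss_percentile: Optional[float] = None
    reachability: Optional[str] = None
    backport_present: bool = False
    cvss_base: Optional[float] = None


inputs = PriorityInputsSketch(
    detection_confidence=0.5292,
    epss_score=0.42,
    epss_percentile=0.93,
    reachability="directly_reachable",
    cvss_base=9.8,
)
# asdict() keeps every field, so unknown inputs serialize as JSON null
# rather than disappearing from the audit record.
serialized = json.dumps(asdict(inputs), sort_keys=True)
```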