Confidence

Every ORES result includes a confidence value between 0.0 and 1.0. Confidence tells you how much of the scoring model was backed by real signals you provided, versus neutral defaults the engine assumed on your behalf.

The Intuition

Think of the ORES scoring model as a puzzle with five pieces — the five scoring dimensions. Each piece has a different size (weight), and each piece can be filled in by one or more signals.

When you provide signals, you fill in pieces of the puzzle with real data. The pieces you leave empty get filled with neutral defaults — reasonable assumptions, but assumptions nonetheless.

Confidence answers one question: How much of the puzzle is real data versus assumptions?

graph LR
    subgraph "2 Signals = 0.35 confidence"
        A1["Base Vuln<br/>30%<br/>:heavy_check_mark: cvss"] --- A2["Exploitability<br/>25%<br/>:x: defaults"]
        A2 --- A3["Env Context<br/>20%<br/>:heavy_check_mark: blast_radius"]
        A3 --- A4["Remed Gap<br/>15%<br/>:x: defaults"]
        A4 --- A5["Lateral Risk<br/>10%<br/>:heavy_check_mark: blast_radius"]
    end

    style A1 fill:#0d9488,color:#fff,stroke:none
    style A2 fill:#64748b,color:#fff,stroke:none
    style A3 fill:#0d9488,color:#fff,stroke:none
    style A4 fill:#64748b,color:#fff,stroke:none
    style A5 fill:#0d9488,color:#fff,stroke:none

graph LR
    subgraph "8 Signals = 1.00 confidence"
        B1["Base Vuln<br/>30%<br/>:heavy_check_mark: cvss, nist"] --- B2["Exploitability<br/>25%<br/>:heavy_check_mark: epss, threat_intel"]
        B2 --- B3["Env Context<br/>20%<br/>:heavy_check_mark: asset, blast_radius"]
        B3 --- B4["Remed Gap<br/>15%<br/>:heavy_check_mark: patch, compliance"]
        B4 --- B5["Lateral Risk<br/>10%<br/>:heavy_check_mark: blast_radius"]
    end

    style B1 fill:#0d9488,color:#fff,stroke:none
    style B2 fill:#0d9488,color:#fff,stroke:none
    style B3 fill:#7c3aed,color:#fff,stroke:none
    style B4 fill:#6d28d9,color:#fff,stroke:none
    style B5 fill:#5b21b6,color:#fff,stroke:none

Confidence is metadata, not a penalty

Confidence does not reduce or adjust the score. The score already reflects defaults for uncovered dimensions. Confidence is there to tell you — and your downstream systems — how trustworthy that score is.

How Confidence Is Calculated

Confidence is a weighted average of dimension coverage. For each of the five scoring dimensions, ORES checks how many of the recognized signal types you provided:

Dimension	Weight	Signal Types	Sources
Base Vulnerability	30%	`cvss`, `nist`	2
Exploitability	25%	`epss`, `threat_intel`	2
Environmental Context	20%	`asset`, `blast_radius`	2
Remediation Gap	15%	`patch`, `compliance`	2
Lateral Risk	10%	`blast_radius`	1

For each dimension:

dimension_coverage = (sources you provided) / (total sources for that dimension)

Then confidence is the weighted sum:

confidence = SUM(dimension_weight * dimension_coverage)
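The two formulas above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the engine's actual implementation: the `DIMENSIONS` mapping and the `estimate_confidence` name are assumptions introduced here, with weights and signal types copied from the table.

```python
# Weights and recognized signal types, copied from the dimension table.
# The dictionary keys are illustrative names, not ORES identifiers.
DIMENSIONS = {
    "base_vulnerability": (0.30, ("cvss", "nist")),
    "exploitability": (0.25, ("epss", "threat_intel")),
    "environmental_context": (0.20, ("asset", "blast_radius")),
    "remediation_gap": (0.15, ("patch", "compliance")),
    "lateral_risk": (0.10, ("blast_radius",)),
}


def estimate_confidence(provided):
    """Return the weighted sum of per-dimension coverage (unrounded)."""
    provided = set(provided)
    total = 0.0
    for weight, signal_types in DIMENSIONS.values():
        # dimension_coverage = sources provided / total sources
        covered = sum(1 for s in signal_types if s in provided)
        total += weight * covered / len(signal_types)
    return total


# The two cases from the diagrams: cvss + blast_radius, then all eight signals.
print(round(estimate_confidence(["cvss", "blast_radius"]), 2))  # 0.35
all_signals = {s for _, types in DIMENSIONS.values() for s in types}
print(round(estimate_confidence(all_signals), 2))  # 1.0
```

Note that `blast_radius` counts toward two dimensions, which is why a single signal can move confidence by more than its share of one dimension.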

Worked Example: cvss + epss + threat_intel

You provide three signals: cvss, epss, and threat_intel.

Dimension	Coverage	Weighted
Base Vulnerability (30%)	1/2 = 0.50	0.30 x 0.50 = 0.150
Exploitability (25%)	2/2 = 1.00	0.25 x 1.00 = 0.250
Environmental Context (20%)	0/2 = 0.00	0.20 x 0.00 = 0.000
Remediation Gap (15%)	0/2 = 0.00	0.15 x 0.00 = 0.000
Lateral Risk (10%)	0/1 = 0.00	0.10 x 0.00 = 0.000
Total		0.150 + 0.250 + 0.000 + 0.000 + 0.000 = 0.40

Final confidence: 0.40

Confidence by Signal Count

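Here is how confidence builds as you add signals one at a time, following the order of the dimension table above (`blast_radius` covers Lateral Risk along the way). Values are rounded to two decimal places: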

Signals Provided	Confidence	Level
`cvss` only	0.15	Very Low
`cvss` + `nist`	0.30	Low
`cvss` + `nist` + `epss`	0.43	Moderate
`cvss` + `nist` + `epss` + `threat_intel`	0.55	Moderate
+ `asset`	0.65	Good
+ `blast_radius`	0.85	High
+ `patch`	0.93	High
+ `compliance` (all 8)	1.00	Complete

Confidence Ranges

Range	Level	Interpretation
0.00 - 0.30	Very Low	Only one or two dimensions covered. Treat scores as rough directional signals, not authoritative assessments.
0.30 - 0.55	Low to Moderate	Foundation (vulnerability + exploitability) is likely covered, but environmental context is missing. Useful for triage; supplement with asset data.
0.55 - 0.75	Moderate to Good	Most dimensions have at least one signal. Reasonably reliable for prioritization.
0.75 - 1.00	High to Complete	All or nearly all dimensions covered. Reliable for automated decision-making, SLA enforcement, and audit trails.
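If downstream code needs the level name rather than the raw number, the ranges can be mapped with a small helper. This is a sketch under one stated assumption: the table's ranges share their boundary values (0.30, 0.55, 0.75), and the helper assigns each boundary to the higher band. The `confidence_level` name is hypothetical.

```python
def confidence_level(confidence):
    """Map a confidence value to the level names in the ranges table.

    Assumption: a boundary value (0.30, 0.55, 0.75) belongs to the
    higher band, since the table itself does not say which side wins.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence < 0.30:
        return "Very Low"
    if confidence < 0.55:
        return "Low to Moderate"
    if confidence < 0.75:
        return "Moderate to Good"
    return "High to Complete"


print(confidence_level(0.15))  # Very Low
print(confidence_level(0.85))  # High to Complete
```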

Low confidence does not mean low risk

A score of 78 with confidence 0.15 means: "Based on what we know, this looks high-risk, but 85% of the model weight rests on neutral defaults rather than your data." The vulnerability could turn out to be more dangerous once you fill in the missing context.

How to Improve Confidence

Add signals for uncovered dimensions. Sorted by impact (largest weight contribution first):

Priority	Action	Dimensions Covered	Confidence Boost
1	Add `cvss` and/or `nist`	Base Vulnerability	up to +30%
2	Add `epss` and/or `threat_intel`	Exploitability	up to +25%
3	Add `blast_radius`	Environmental Context + Lateral Risk	up to +20%
4	Add `asset`	Environmental Context	up to +10%
5	Add `patch`	Remediation Gap	up to +7.5%
6	Add `compliance`	Remediation Gap	up to +7.5%
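To decide which signal to add next for a specific finding, the same weights can be used to rank the signals that are still missing. The sketch below is an assumption-labeled illustration, not part of ORES: `DIMENSIONS` and `missing_signal_boosts` are hypothetical names, and the boost for each signal is its share of every dimension it feeds.

```python
# Weights and signal types from the dimension table (illustrative key names).
DIMENSIONS = {
    "base_vulnerability": (0.30, ("cvss", "nist")),
    "exploitability": (0.25, ("epss", "threat_intel")),
    "environmental_context": (0.20, ("asset", "blast_radius")),
    "remediation_gap": (0.15, ("patch", "compliance")),
    "lateral_risk": (0.10, ("blast_radius",)),
}


def missing_signal_boosts(provided):
    """Return (signal, boost) pairs for absent signals, largest boost first."""
    provided = set(provided)
    boosts = {}
    for weight, signal_types in DIMENSIONS.values():
        for signal in signal_types:
            if signal not in provided:
                # Each signal contributes an equal share of its dimension's weight.
                share = weight / len(signal_types)
                boosts[signal] = boosts.get(signal, 0.0) + share
    # Sort by descending boost, then by name so ties are deterministic.
    return sorted(boosts.items(), key=lambda item: (-item[1], item[0]))


for signal, boost in missing_signal_boosts(["cvss"]):
    print(f"{signal}: +{boost:.3f}")
```

With only `cvss` provided, `blast_radius` ranks first because it contributes to both Environmental Context and Lateral Risk.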

Check derived_from in every result

The explanation.factors array in every result includes derived_from lists for each factor. Any factor showing "derived_from": ["defaults"] is an uncovered dimension. Add the corresponding signal type to improve both confidence and accuracy.
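A check like the one described above might look as follows. Only `explanation.factors` and `derived_from` come from the text; the `name` key on each factor, the `uncovered_factors` helper, and the sample payload are assumptions made for illustration, so adjust them to the factor shape your ORES version returns.

```python
def uncovered_factors(result):
    """Return names of factors that were scored purely from defaults."""
    factors = result.get("explanation", {}).get("factors", [])
    return [
        factor.get("name", "<unnamed>")  # "name" is an assumed key
        for factor in factors
        if factor.get("derived_from") == ["defaults"]
    ]


# Illustrative payload only, not real ORES output.
sample = {
    "explanation": {
        "factors": [
            {"name": "base_vulnerability", "derived_from": ["cvss"]},
            {"name": "exploitability", "derived_from": ["defaults"]},
            {"name": "remediation_gap", "derived_from": ["defaults"]},
        ]
    }
}

print(uncovered_factors(sample))  # ['exploitability', 'remediation_gap']
```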

Confidence in the API Response

Confidence is returned in the explanation field of every EvaluationResult:

{
  "apiVersion": "ores.dev/v1",
  "kind": "EvaluationResult",
  "score": 62,
  "label": "medium",
  "mode": "weighted",
  "version": "0.2.0",
  "explanation": {
    "signals_provided": 2,
    "signals_used": 2,
    "signals_unknown": 0,
    "unknown_signals": [],
    "warnings": [],
    "confidence": 0.35,
    "factors": [ ... ]
  }
}

The supporting fields give you additional insight into input quality:

signals_provided: Total number of signal keys in the request
signals_used: Signals that were recognized and successfully normalized
signals_unknown: Signals that were not recognized (typos, unsupported types)

When signals_used < signals_provided

This means some signals were either unknown or invalid. Check the warnings array and unknown_signals list for details. Common causes: typos in signal names, invalid field values, or using a signal type not supported by your ORES version.
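One way to surface these problems before acting on a score is to collect them from the explanation fields. The sketch below uses only the field names shown in the response above; it assumes `unknown_signals` and `warnings` are lists of strings, and the `input_quality_issues` helper and sample payload are hypothetical.

```python
def input_quality_issues(result):
    """Collect human-readable input problems from a result's explanation."""
    explanation = result["explanation"]
    issues = []
    dropped = explanation["signals_provided"] - explanation["signals_used"]
    if dropped > 0:
        issues.append(f"{dropped} signal(s) were provided but not used")
    # Assumed to be lists of strings; adapt if your version returns objects.
    for name in explanation.get("unknown_signals", []):
        issues.append(f"unknown signal: {name}")
    for warning in explanation.get("warnings", []):
        issues.append(f"warning: {warning}")
    return issues


# Illustrative payload with a misspelled signal name.
sample = {
    "explanation": {
        "signals_provided": 3,
        "signals_used": 2,
        "signals_unknown": 1,
        "unknown_signals": ["cvs"],
        "warnings": [],
    }
}

for issue in input_quality_issues(sample):
    print(issue)
```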

Using Confidence in Automation

Confidence is designed to power downstream decision-making. Here are common patterns:

Gate on Confidence

# Only auto-create tickets for high scores with at least moderate confidence
if result.score >= 70 and result.explanation.confidence >= 0.55:
    create_jira_ticket(result)
else:
    queue_for_human_review(result)

Tiered SLAs

# Adjust SLA based on confidence
if result.explanation.confidence >= 0.75:
    sla = "4 hours"   # High confidence: enforce strict SLA
elif result.explanation.confidence >= 0.40:
    sla = "24 hours"  # Moderate: standard SLA
else:
    sla = "best effort"  # Low confidence: enrich data first

Re-score Workflow

# Store low-confidence results for re-scoring when data arrives
if result.explanation.confidence < 0.55:
    store_for_rescore(result, missing_dimensions=get_defaults(result))
    # When asset data is ingested, re-evaluate automatically