Add an evidence-based confidence score (0–100) — policy

Rules for reporting an evidence-based confidence score, including scoring meaning, thresholds, and basis statements for high-impact judgments.

What this policy enforces

Use this policy when responses must report an evidence-based confidence score in a standardized, auditable format.

Purpose
Standardized analytic confidence
Provide a standardized, evidence-based way to communicate analytic confidence in a response.
This supports auditability, triage, and quality gating.
Output contract
Numeric confidence is mandatory
Responses under this policy must end with a numeric confidence line.
The format is fixed and not optional.
Meaning
Evidence-weighted, not probabilistic
The score reflects correctness and evidential support, not persuasion, agreement, or model probability.
Confidence is used here in the analytic tradecraft sense, not the statistical one.

What this is and what it is not

This section fixes the meaning of the score so that it is not mistaken for a different signal.

Is
Evidence-weighted self-assessment
An evidence-weighted self-assessment of correctness and support for the delivered answer.
Is not
Probability, logprobs, or calibrated forecast
It is not a statistical probability, a token-level model likelihood (logprobs), or a calibrated forecast.
Confidence here follows the tradecraft notion of stating confidence and the basis for it.

Scope

This policy applies whenever the response makes substantive claims that need an explicit confidence signal.

Applies to factual, technical, or operational claims
Use this policy for responses that make factual, technical, or operational claims.

Non-negotiable rules

These rules are normative and define the minimum scoring contract.

R1
Always include a confidence score
Always end the response with a numeric confidence score on the last line, in this format:
Confidence: <0–100>/100
R2
Define the score semantics
The score reflects correctness of the answer and evidential support.
It does not reflect persuasiveness, agreement, or tone.
R3
Disclose uncertainty below threshold
If confidence is below 90/100, explicitly state what is uncertain, why it is uncertain, and what evidence would raise confidence.
R4
Apply a verification-bound cap
When a claim requires authoritative verification, confidence may not exceed what the available sources support, judged by their strength, directness, and convergence.
Primary or official sources outrank weaker or indirect support.
R5
Add a basis statement for high-impact judgments
For high-impact or security-relevant judgments, include a short basis statement.
State the key evidence types, the key gaps or assumptions, and, if sources conflict, a summary of the disagreement.
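
The rules above lend themselves to a mechanical check. The following is a minimal sketch in Python, not a reference implementation. The names `Draft`, `parse_confidence`, `apply_cap`, and `check` are hypothetical. The sketch assumes that trailing blank lines after the confidence line are tolerated, that the caller decides whether a judgment is high-impact, and that the caller supplies the R4 cap as a number, since this policy does not define numeric cap values. R2 is a semantic rule and is not checked.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# R1 format: the last line must read "Confidence: <0-100>/100".
CONFIDENCE_LINE = re.compile(r"Confidence: ([0-9]{1,3})/100")
DISCLOSURE_THRESHOLD = 90  # R3: below this score, uncertainty must be disclosed


@dataclass
class Draft:
    """Hypothetical container for a response plus reviewer-supplied metadata."""
    text: str
    uncertainty_note: str = ""  # R3: what is uncertain, why, what would raise confidence
    basis_statement: str = ""   # R5: evidence types, gaps or assumptions, disagreement
    high_impact: bool = False   # R5 trigger; decided by the caller (assumption)


def parse_confidence(text: str) -> Optional[int]:
    """Return the score on the last non-empty line, or None if R1 is not met."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return None
    match = CONFIDENCE_LINE.fullmatch(lines[-1])
    if match is None:
        return None
    score = int(match.group(1))
    return score if 0 <= score <= 100 else None


def apply_cap(score: int, cap: int) -> int:
    """R4: never report more than the cap the caller derived from source quality."""
    return min(score, cap)


def check(draft: Draft) -> List[str]:
    """Return the identifiers of the rules the draft violates."""
    score = parse_confidence(draft.text)
    if score is None:
        return ["R1"]
    violations = []
    if score < DISCLOSURE_THRESHOLD and not draft.uncertainty_note.strip():
        violations.append("R3")
    if draft.high_impact and not draft.basis_statement.strip():
        violations.append("R5")
    return violations
```

A draft ending in `Confidence: 85/100` with no uncertainty note would be flagged for R3 under this sketch, while the same draft with a note attached would pass.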

Recommended scoring bands

These bands are non-normative and exist to reduce false precision.

95–100
Very high analytic confidence
90–94
High
75–89
Medium
50–74
Low
0–49
Very low
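
As an illustration, the bands above can be expressed as a small lookup. This is a sketch only: the names `BANDS` and `band_for` are hypothetical, and the function assumes integer scores.

```python
# Lower bound of each band, highest first, with the labels listed above.
BANDS = [
    (95, "Very high analytic confidence"),
    (90, "High"),
    (75, "Medium"),
    (50, "Low"),
    (0, "Very low"),
]


def band_for(score: int) -> str:
    """Map a 0-100 score to its recommended band label."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: the last band starts at 0")
```

For example, `band_for(89)` falls in the Medium band and `band_for(90)` in the High band, which matches the 90/100 disclosure threshold in R3.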

Tradecraft references

These references explain the analytic notion of confidence that this policy uses.

ICD 203 (ODNI)
Analytic standards require stating confidence and the basis for it.
IPCC Uncertainty Guidance Note
Confidence language is based on the strength of the evidence and the degree of agreement.