Add an evidence-based confidence score (0–100) — procedure

Procedure for adding an evidence-based 0–100 confidence score to responses while preserving fail-closed sentinel behavior.

Use this procedure to add an evidence-based confidence score

Choose this procedure when every non-sentinel response must end with a numeric confidence line that reflects correctness plus evidential support.

Required output
Every non-sentinel response ends with a confidence line
Every non-sentinel response must end with this exact final line: Confidence: <0–100>/100
The format is fixed.
Meaning
Confidence is evidence-based, not probabilistic
Confidence reflects correctness and evidential support; it is not a statistical probability.
Fail-closed rule
Sentinel-only responses stay sentinel-only
If an active evidence policy requires a sentinel-only fail-closed response, output exactly the sentinel and stop.
Do not append a confidence score after the sentinel.
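The required-output and fail-closed rules above can be checked mechanically. A minimal sketch, assuming a simple line-based format check (the function name and regex are illustrative, not part of the procedure):

```python
import re

# Hypothetical validator for the required final line.
# The pattern is an assumption based on the stated format Confidence: <0-100>/100.
CONFIDENCE_RE = re.compile(r"^Confidence: (\d{1,3})/100$")

def ends_with_confidence_line(response: str) -> bool:
    """Return True if the response's last line is a valid confidence line."""
    lines = response.rstrip().splitlines()
    if not lines:
        return False
    match = CONFIDENCE_RE.fullmatch(lines[-1])
    # Reject non-numeric formats and out-of-range scores.
    return match is not None and 0 <= int(match.group(1)) <= 100
```

A check like this catches both the "wrong format" and out-of-range failure modes before a response is emitted.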

Choose how you will enforce the confidence line

Pick one mode first, then apply the matching setup.

Set up the procedure

Complete these steps before you run the scoring flow.

Step 1
Choose the enforcement mode
Decide whether you will enforce confidence scoring through the dedicated system prompt, a manual response contract, or the full Fact-Checking Kit workflow.
Step 2
Preserve fail-closed compatibility
Ensure compatibility with evidence-boundary fail-closed behavior.
If an evidence policy triggers a sentinel-only response, output exactly the sentinel and stop. Recommended enforcement: instruction-hierarchy-and-evidence-boundary.system.txt
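The setup steps above amount to one post-processing rule: the sentinel branch must short-circuit before any confidence line is appended. A minimal sketch, assuming a placeholder sentinel value (SENTINEL and the function name are assumptions, not defined by the procedure):

```python
# Hypothetical fail-closed gate. SENTINEL is a placeholder; the real value
# comes from the active evidence policy.
SENTINEL = "INSUFFICIENT_EVIDENCE"

def finalize(body: str, score: int, sentinel_required: bool) -> str:
    """Apply fail-closed behavior first, then append the confidence line."""
    if sentinel_required:
        # Fail closed: output exactly the sentinel and stop.
        # Nothing, including a confidence line, is appended after it.
        return SENTINEL
    return f"{body.rstrip()}\nConfidence: {score}/100"
```

The order of the branches is the point: checking the sentinel condition first guarantees a confidence line can never be appended to a fail-closed response.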

Verify the setup

Use these smoke tests to confirm that confidence scoring is working correctly.

Smoke test 1 — Adequate admissible evidence
Ask a factual question where you provide adequate admissible evidence.
Expected: the response ends with Confidence: <0–100>/100.
Smoke test 2 — Sentinel-only fail-closed case
Trigger a sentinel-only fail-closed case under your active evidence boundary.
Expected: output exactly the sentinel and stop, with no confidence line.
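Both smoke tests can be expressed as a single assertion helper. A sketch under stated assumptions (the sentinel value and sample responses are illustrative):

```python
import re

# Illustrative sentinel; substitute the value from your active evidence policy.
SENTINEL = "INSUFFICIENT_EVIDENCE"
CONFIDENCE_RE = re.compile(r"Confidence: (\d{1,3})/100$")

def smoke_test_passes(response: str, expect_sentinel: bool) -> bool:
    """Check a response against smoke test 1 or 2."""
    if expect_sentinel:
        # Smoke test 2: exactly the sentinel, nothing else, no confidence line.
        return response == SENTINEL
    # Smoke test 1: response ends with a valid confidence line.
    m = CONFIDENCE_RE.search(response.rstrip())
    return m is not None and 0 <= int(m.group(1)) <= 100
```

Note that smoke test 2 uses strict equality, so a sentinel followed by a confidence line fails, which is exactly the "confidence after sentinel" mistake this procedure guards against.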

Choose the exact mode

Each option maps to one enforcement model for adding the confidence line.

Option 1
System prompt template (recommended)
Enforce the confidence line through confidence-score.system.txt.
Best when you want a dedicated enforcement layer in the runtime.
Example: “Summarize what the attached logs show about the error and what is NOT proven.” You must provide logs or excerpts and the active evidence boundary.
Option 2
Manual response contract
Add the confidence line and its rules directly into your own policy or template stack.
Best when you already maintain a custom contract and want to layer scoring into it explicitly.
Example: “Assess claim X and state confidence.” You must provide admissible evidence under the active policy.
Option 3
Full workflow (Fact-Checking Kit)
Run confidence scoring as part of the broader fact-checking workflow.
Best when confidence should be embedded in a full verification run rather than added as a standalone contract.
Example: “Verify whether claim X is supported; fail closed if not.” Output must end with a confidence line unless a sentinel-only response is required.

Common mistakes

These are the most common failure points for this procedure.

Confidence after sentinel
Appending a confidence line after a sentinel-only fail-closed response.
Wrong format
Using non-numeric formats instead of Confidence: <0–100>/100.
Wrong meaning
Treating confidence as probability instead of evidence-weighted analytic confidence.
Inflated confidence
Reporting high confidence when evidence is indirect, weak, or conflicting.
The policy requires evidence-strength caps.
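The evidence-strength caps mentioned above can be sketched as a clamping step. The tiers and ceiling values below are illustrative assumptions, not values defined by the policy:

```python
# Hypothetical evidence-strength caps; the tiers and ceilings are
# illustrative, not prescribed by the policy.
EVIDENCE_CAPS = {
    "direct": 100,      # direct, admissible evidence
    "indirect": 70,     # inference from related evidence
    "weak": 40,         # thin or single-source evidence
    "conflicting": 25,  # evidence points in multiple directions
}

def capped_confidence(raw_score: int, evidence_strength: str) -> int:
    """Clamp a raw confidence score to the ceiling for its evidence tier."""
    cap = EVIDENCE_CAPS.get(evidence_strength, 0)  # unknown tier fails closed
    return max(0, min(raw_score, cap))
```

Applying the cap after scoring, rather than trusting the raw score, is one way to prevent inflated confidence on indirect, weak, or conflicting evidence.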