Confidence Scoring System – Technical Overview

Visibility: Public

What is a Confidence Score?

Each AI‑generated response is assigned a confidence score that indicates the system’s assessment of its reliability. The scoring pipeline evaluates multiple quality signals and applies structured deductions to a baseline score of High.

Major Signals That Impact the Confidence Score

The system evaluates quality signals in the following categories:

Citation Coverage & Precision
- Detects presence/absence of citations.
- Tracks citation density (percentage of sentences containing references).
- Measures context precision (alignment of response text with provided source context).
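
As an illustration of the citation density signal, here is a minimal sketch. It assumes citations appear as bracketed numbers such as [1] or [2, 3] and uses a naive sentence splitter; neither the marker format nor the function name citation_density comes from the system itself.

```python
import re

# Assumption: citations look like [1] or [2, 3]. The real marker format may differ.
CITATION_PATTERN = re.compile(r"\[\d+(?:,\s*\d+)*\]")


def citation_density(text: str) -> float:
    """Return the percentage of sentences that contain at least one citation."""
    # Naive split: break on whitespace that follows ., !, or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    cited = sum(1 for s in sentences if CITATION_PATTERN.search(s))
    return 100.0 * cited / len(sentences)


if __name__ == "__main__":
    sample = "The API supports paging [1]. Limits are configurable. Defaults vary by plan [2, 3]."
    print(f"{citation_density(sample):.1f}% of sentences are cited")
```

A production pipeline would likely need a more robust sentence tokenizer, since abbreviations and decimals can confuse a regex split.
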
Content Integrity Checks
- PII Detection: Screens for accidental disclosure of personally identifiable information.
- Tone Analysis: Flags negative, risky, or unsupported tonal patterns.
- Language Quality: Identifies non‑committal phrasing (e.g., “it depends,” “not available”) indicating reduced reliability.
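
The Language Quality check can be pictured as a simple phrase scan. The sketch below is hypothetical: the phrase list contains only the two examples named above, and the function name find_non_committal is an assumption rather than part of the actual pipeline.

```python
# Assumption: only the two example phrases from this overview are listed;
# a real check would presumably use a longer, curated list.
NON_COMMITTAL_PHRASES = ("it depends", "not available")


def find_non_committal(text: str) -> list[str]:
    """Return the non-committal phrases found in the text (case-insensitive)."""
    lowered = text.lower()
    return [phrase for phrase in NON_COMMITTAL_PHRASES if phrase in lowered]


if __name__ == "__main__":
    print(find_non_committal("It depends on your plan; pricing is not available."))
```
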
Knowledge Coverage
If requested information is not retrievable within the provided materials, the system surfaces a structured placeholder:
- INFORMATION MISSING IN KNOWLEDGE HUB (highlighted in yellow for visibility).

Scoring Levels & Criteria

High: Source‑backed, precise, no integrity violations.
Medium: Generally reliable, but with at least one cautionary signal (e.g., PII risk, vague language).
Low: Insufficient support (few/no citations, weak alignment with source context). Recommended for human review before use.
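
To make the deduction model concrete, here is a rough sketch of how signals might map onto the three levels, starting from a baseline of High. The Signals fields, the threshold values, and the score function are illustrative assumptions; the actual pipeline's weights and cutoffs are not documented here.

```python
from dataclasses import dataclass
from enum import IntEnum


class Confidence(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2


@dataclass
class Signals:
    citation_density: float   # percentage of cited sentences, 0-100
    context_precision: float  # alignment with source context, 0.0-1.0
    pii_detected: bool = False
    tone_flagged: bool = False
    non_committal: bool = False


# Hypothetical thresholds, chosen only for illustration.
MIN_CITATION_DENSITY = 20.0
MIN_CONTEXT_PRECISION = 0.5


def score(signals: Signals) -> Confidence:
    """Start at High and apply deductions only; the level never increases."""
    level = Confidence.HIGH
    # Any cautionary integrity signal caps the score at Medium.
    if signals.pii_detected or signals.tone_flagged or signals.non_committal:
        level = min(level, Confidence.MEDIUM)
    # Insufficient source support caps the score at Low.
    if (signals.citation_density < MIN_CITATION_DENSITY
            or signals.context_precision < MIN_CONTEXT_PRECISION):
        level = min(level, Confidence.LOW)
    return level


if __name__ == "__main__":
    print(score(Signals(citation_density=80.0, context_precision=0.9)).name)
    print(score(Signals(80.0, 0.9, non_committal=True)).name)
    print(score(Signals(5.0, 0.9)).name)
```

Using min() at each step keeps the logic deduction-only: no signal can raise a score that an earlier check has already lowered.
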

Key Notes

The scoring mechanism applies deductions only: scores always initialize at High and degrade based on signal strength.
All scores are advisory heuristics, not authoritative truth judgments. Human review of both the response and source materials remains required in high‑stakes contexts.