Risk Scoring & Verdict Model
DocFirewall returns two outputs per scan:
- verdict: one of ALLOW, FLAG, BLOCK. Derived from the classes of the findings produced by the scan, not from a probabilistic score threshold.
- risk_score: a float in [0.0, 1.0] computed by probabilistic combination across non-INFO findings. Used for analytics, dashboards, and customer-facing severity bands. Does not gate the verdict.
This split is deliberate: a BLOCK decision should rest on definitive evidence (a YARA signature, an EICAR string, a javascript: URI, an embedded PE/ELF, etc.), not on accumulating multiple weak heuristic signals.
Finding classes
Every Finding carries a verdict_class field with one of three values:
| Class | What it means | Effect on verdict | Effect on risk score |
|---|---|---|---|
| BLOCK | Definitive evidence of malicious intent: YARA hit, EICAR test string, AV-engine-infected, policy deny-list match, javascript:/data:/file:/vbscript:/jar:/IP-literal URI in PDF or DOCX, CSV =cmd\|... DDE pipe, ODF macro:// URI (CVE-2023-2255), RTF \javascript control word, JBIG2 + oversized dimensions (CVE-2021-30860), inline XLM + veryHidden sheet (Pikabot/IcedID dropper pattern), embedded PE/ELF/Mach-O/ISO-9660 (CVE-2023-36884), dropper-extension files inside a DOCX, eval(atob(...))/powershell -enc/cmd.exe /c in body text, <script>/javascript: in metadata, base64 decoded-to-dangerous content. | Any single BLOCK-class finding → verdict = BLOCK. Monotonic: no combination of REVIEW findings can BLOCK. | Contributes |
| REVIEW (default) | Heuristic / suggestive signal: most prompt-injection ML hits, indirect-injection co-occurrence patterns, PII presence, social-engineering tri-signal, ToUnicode CMap anomalies, ATS keyword-stuffing patterns, hidden text indicators. | Any REVIEW finding (and no BLOCK) → verdict = FLAG. Combinations no longer escalate to BLOCK. | Contributes |
| INFO | Recorded for audit but not a verdict driver: "PDF has N incremental update layers" (true of any edited PDF), descriptive structural patterns. | Never affects verdict. | Excluded from risk score entirely |
Verdict derivation
if any(f.verdict_class == VerdictClass.BLOCK for f in findings):
    verdict = Verdict.BLOCK
elif any(f.verdict_class == VerdictClass.REVIEW for f in findings):
    verdict = Verdict.FLAG
else:
    verdict = Verdict.ALLOW
That's the entire rule. risk_score, config.thresholds.flag, and config.thresholds.block do not appear in the verdict path.
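The rule above can be exercised in isolation. The following is a minimal, self-contained sketch: the enum and dataclass definitions are stand-ins written for this example (the real Finding, Verdict, and VerdictClass types carry more fields), and derive_verdict is a hypothetical helper name, not a documented DocFirewall function.

```python
from dataclasses import dataclass
from enum import Enum


class VerdictClass(Enum):
    BLOCK = "BLOCK"
    REVIEW = "REVIEW"
    INFO = "INFO"


class Verdict(Enum):
    ALLOW = "ALLOW"
    FLAG = "FLAG"
    BLOCK = "BLOCK"


@dataclass
class Finding:
    # Stand-in for the real Finding type; REVIEW is the default class.
    title: str
    verdict_class: VerdictClass = VerdictClass.REVIEW


def derive_verdict(findings):
    """Hypothetical helper wrapping the class-based rule shown above."""
    if any(f.verdict_class == VerdictClass.BLOCK for f in findings):
        return Verdict.BLOCK
    if any(f.verdict_class == VerdictClass.REVIEW for f in findings):
        return Verdict.FLAG
    return Verdict.ALLOW


# Fifty heuristic findings still only FLAG; one definitive finding BLOCKs.
many_review = [Finding(f"heuristic-{i}") for i in range(50)]
print(derive_verdict(many_review).value)
print(derive_verdict(many_review + [Finding("yara-hit", VerdictClass.BLOCK)]).value)
```

Because the rule only asks whether a class is present, the number of REVIEW findings never changes the outcome, which is the monotonic property described in the table.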
Risk score formula
The risk score is still computed for analytics and is exposed on ScanReport.risk_score:
Severity weights:
| Severity | Weight |
|---|---|
| CRITICAL | 1.00 |
| HIGH | 0.80 |
| MEDIUM | 0.50 |
| LOW | 0.25 |
| INFO | (excluded) |
Threat weights (defaults — overridable via policy custom_threat_weights):
| Threat | Default weight |
|---|---|
| T1_MALWARE | 1.00 |
| T2_ACTIVE_CONTENT, T6_DOS | 0.90 |
| T4_PROMPT_INJECTION, T10_INDIRECT_INJECTION, T11_RAG_POISONING | 0.80 |
| T12_SOCIAL_ENGINEERING | 0.75 |
| T7_EMBEDDED_PAYLOAD | 0.70 |
| T5_RANKING_MANIPULATION, T8_METADATA_INJECTION | 0.60 |
| T3_OBFUSCATION, T9_ATS_MANIPULATION | 0.50 |
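The exact formula is not reproduced in this section, so the following is only a sketch of one plausible reading of "probabilistic combination". It assumes a noisy-OR aggregate, 1 − ∏(1 − p_i), where each finding's p_i is severity weight × threat weight × confidence. Both the noisy-OR form and the per-finding product are assumptions, as are the function name and the tuple layout of a finding; only the weight values come from the tables above.

```python
SEVERITY_WEIGHTS = {"CRITICAL": 1.00, "HIGH": 0.80, "MEDIUM": 0.50, "LOW": 0.25}

DEFAULT_THREAT_WEIGHTS = {
    "T1_MALWARE": 1.00,
    "T2_ACTIVE_CONTENT": 0.90,
    "T6_DOS": 0.90,
    "T4_PROMPT_INJECTION": 0.80,
    "T10_INDIRECT_INJECTION": 0.80,
    "T11_RAG_POISONING": 0.80,
    "T12_SOCIAL_ENGINEERING": 0.75,
    "T7_EMBEDDED_PAYLOAD": 0.70,
    "T5_RANKING_MANIPULATION": 0.60,
    "T8_METADATA_INJECTION": 0.60,
    "T3_OBFUSCATION": 0.50,
    "T9_ATS_MANIPULATION": 0.50,
}


def sketch_risk_score(findings, custom_threat_weights=None):
    """Assumed noisy-OR combination over (severity, threat_id, confidence) tuples."""
    weights = {**DEFAULT_THREAT_WEIGHTS, **(custom_threat_weights or {})}
    survival = 1.0  # probability that no finding "fires"
    for severity, threat_id, confidence in findings:
        if severity == "INFO":
            continue  # INFO findings are excluded from the score entirely
        p = SEVERITY_WEIGHTS[severity] * weights[threat_id] * confidence
        p = min(max(p, 0.0), 1.0)  # guard against custom weights above 1.0
        survival *= 1.0 - p
    return 1.0 - survival


# Two MEDIUM malware findings at full confidence: each p = 0.5, combined 0.75.
print(sketch_risk_score([("MEDIUM", "T1_MALWARE", 1.0), ("MEDIUM", "T1_MALWARE", 1.0)]))
```

Under this reading the score can only grow as findings are added, which is why it suits dashboards and severity bands but is a poor gate for a BLOCK decision.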
Finding deduplication
Multiple detectors can fire on the same artifact (e.g. the same injection phrase detected by both fast-scan substring match and deep-scan Aho-Corasick). To prevent double-counting, findings are grouped by (threat_id, evidence["malicious_text"][:80]) before aggregation; the highest-confidence finding per group survives. Without deduplication, two p ≈ 0.5 findings on the same artifact would combine to 1 − (0.5 × 0.5) ≈ 0.75 and inflate the score.
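The grouping step can be sketched as follows. Findings are modelled here as plain dicts with threat_id, confidence, and evidence keys, which is a simplification for illustration; dedupe_findings is a hypothetical name, and treating a missing malicious_text as an empty string is an assumption about behavior the text does not specify.

```python
def dedupe_findings(findings):
    """Keep the highest-confidence finding per (threat_id, first 80 chars of text)."""
    best = {}
    for finding in findings:
        # Assumption: findings without malicious_text share an empty-string key.
        text = str(finding["evidence"].get("malicious_text", ""))
        key = (finding["threat_id"], text[:80])
        current = best.get(key)
        if current is None or finding["confidence"] > current["confidence"]:
            best[key] = finding
    return list(best.values())


phrase = "ignore all previous instructions"
findings = [
    {"threat_id": "T4_PROMPT_INJECTION", "confidence": 0.5,
     "evidence": {"malicious_text": phrase}, "module": "fast_scan"},
    {"threat_id": "T4_PROMPT_INJECTION", "confidence": 0.9,
     "evidence": {"malicious_text": phrase}, "module": "deep_scan"},
]
survivors = dedupe_findings(findings)
print(len(survivors), survivors[0]["module"])
```

Truncating the key to 80 characters means two detectors that report slightly different trailing context for the same phrase still land in the same group.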
Deep-scan trigger
To save time on obviously-clean files, the deep-scan stage runs only when the fast-scan risk score is ≥ config.thresholds.deep_scan_trigger (default 0.20) or the file is in a known Office/PDF/RTF/HTML format. Note: deep_scan_trigger is the only threshold that still gates control flow; it controls whether deep scanning runs, not the verdict.
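As a rough illustration of that gate, the sketch below combines the two conditions. The function name, the format labels, and the contents of the always-scan set are hypothetical; only the 0.20 default and the "score or known format" logic come from the text.

```python
# Hypothetical labels standing in for "known Office/PDF/RTF/HTML format".
ALWAYS_DEEP_SCAN_FORMATS = {"pdf", "docx", "xlsx", "pptx", "rtf", "html"}


def should_deep_scan(fast_scan_score, file_format, deep_scan_trigger=0.20):
    """Return True when the deep-scan stage should run for this file."""
    if file_format.lower() in ALWAYS_DEEP_SCAN_FORMATS:
        return True  # known rich formats are always deep-scanned
    return fast_scan_score >= deep_scan_trigger


print(should_deep_scan(0.05, "txt"))  # low score, plain format
print(should_deep_scan(0.05, "pdf"))  # low score, known rich format
```

Note that the return value only decides whether more detectors run; whatever they find still goes through the class-based verdict rule.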
Why not just score thresholds?
Pre-0.4.4 the verdict was derived from risk_score crossing thresholds.block / thresholds.flag. That model produced false BLOCK verdicts on benign documents where several heuristic findings happened to combine: a resume with a /AA form field + a ToUnicode CMap pattern + PII could cross 0.70 with zero malicious content.
The class-based model fixes this: a BLOCK decision now points to a single, specific, explainable artifact that the reviewer can verify. FLAG queues still exist for the cases worth a human's attention, but the score is no longer mistaken for proof of malice.
See CHANGELOG.md entry for 0.4.4 for the full migration notes.
Plain-language explanations
Each Finding's explain field is intended for non-technical reviewers. The Scanner runs a post-process step (detectors.explanations.enrich_findings) that recognises the most common finding types and rewrites their explain text into plain prose — "this PDF is set up to run an action automatically the moment it's opened" rather than "Found suspicious token /OpenAction in raw file stream".
The original technical text is preserved verbatim in the new Finding.technical_detail field so SIEMs, forensic analysts, and rule engines that depend on stable string matching can keep working.
When enrich_findings does not recognise a finding type (because no entry has been added to the central mapping table yet), the finding passes through untouched — explain keeps the detector's original technical text, and technical_detail stays None. Coverage is intentionally rolled out incrementally for the most-encountered finding types first.
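The pass-through behaviour can be pictured with a small sketch. This is not the code in detectors/explanations.py: the Finding stand-in, the PLAIN_LANGUAGE table, its "open_action" key, the lookup by title, and the function name are all hypothetical and exist only to show the recognised and unrecognised paths.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    # Stand-in with only the fields this example needs.
    title: str
    explain: str
    technical_detail: Optional[str] = None


# Hypothetical mapping table keyed by finding title.
PLAIN_LANGUAGE = {
    "open_action": "This PDF is set up to run an action automatically "
                   "the moment it's opened.",
}


def enrich_findings_sketch(findings):
    """Rewrite explain for recognised findings; leave the rest untouched."""
    for finding in findings:
        plain = PLAIN_LANGUAGE.get(finding.title)
        if plain is None:
            continue  # unrecognised: explain stays technical, detail stays None
        if finding.technical_detail is None:
            finding.technical_detail = finding.explain  # preserve original verbatim
        finding.explain = plain
    return findings


known = Finding("open_action", "Found suspicious token /OpenAction in raw file stream")
unknown = Finding("some_new_detector", "Raw detector text")
enrich_findings_sketch([known, unknown])
print(known.explain)
print(known.technical_detail)
print(unknown.technical_detail)
```

The guard on technical_detail keeps the step safe to run twice, so a second pass does not overwrite the preserved technical text with the plain-language version.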
See src/doc_firewall/detectors/explanations.py for the mapping table and instructions on adding new entries.
Backwards compatibility
- ScanConfig.thresholds.flag and ScanConfig.thresholds.block still exist on the config object; they no longer drive the verdict but are kept so downstream tools can use the values as customer-facing risk bands ("low / medium / high risk" labels on dashboards).
- Policy.custom_threat_weights continues to influence risk_score (still useful for tuning dashboards per corpus) but no longer changes verdict outcomes.
- The legacy RiskModel.get_verdict(score) call signature (without findings) still works but emits a DeprecationWarning.
- Finding.explain is now plain-language for recognised finding types. SIEM rules that key on the previous short technical strings should switch to Finding.technical_detail (which preserves the original verbatim) or to Finding.title / Finding.module / Finding.evidence for stable structured matching.
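For downstream tools, two small adapters cover most of the migration. Both are hypothetical helpers written for illustration, not part of DocFirewall: risk_band turns the retained thresholds into dashboard labels (the inclusive comparison is an assumption, and the threshold values in the usage lines are illustrative, not documented defaults), and stable_technical_text returns the original detector string whether or not a finding was rewritten.

```python
from types import SimpleNamespace


def risk_band(risk_score, flag_threshold, block_threshold):
    """Map a risk score to a dashboard label using the retained thresholds."""
    if risk_score >= block_threshold:
        return "high"
    if risk_score >= flag_threshold:
        return "medium"
    return "low"


def stable_technical_text(finding):
    """Original technical string for SIEM matching.

    Recognised findings keep it in technical_detail; unrecognised ones
    still have it in explain, with technical_detail left as None.
    """
    if finding.technical_detail is not None:
        return finding.technical_detail
    return finding.explain


# Illustrative threshold values only.
print(risk_band(0.42, flag_threshold=0.35, block_threshold=0.70))

rewritten = SimpleNamespace(explain="Plain-language text",
                            technical_detail="Found suspicious token /OpenAction")
untouched = SimpleNamespace(explain="Raw detector text", technical_detail=None)
print(stable_technical_text(rewritten))
print(stable_technical_text(untouched))
```

Matching on the value returned by stable_technical_text keeps existing string-based rules working while the plain-language coverage expands.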