Real-World Email Security Performance: 97.43% F1 Score

Production Statistics from 54,565 Emails Filtered Across 30 Days

Updated: April 13, 2026

30-Day Analysis: March 14 - April 13, 2026
AI-Powered Multi-Layer Email Security

Executive Summary

OpenEFA is an AI-powered email security platform that uses multi-layered analysis to detect spam, phishing, and malicious emails. Our advanced scoring system combines traditional authentication (SPF, DKIM, DMARC) with AI-powered behavioral analysis, DNS validation, and machine learning to provide industry-leading protection.

Over the past 30 days, OpenEFA has analyzed 54,565 emails with a 97.43% F1 score and 95.23% precision. The system delivered 73.5% of messages safely to inboxes, quarantined 3.3% for review, and auto-deleted 20.4% as high-confidence spam, all with an average processing time under 2 seconds. Deployed across 32 protected domains serving 401 recipients, OpenEFA shows that AI-powered email security can deliver enterprise-grade protection at a fraction of the cost of commercial alternatives.

Key Metrics at a Glance

| Metric | OpenEFA Value | Industry Standard | Status |
|---|---|---|---|
| F1 Score | 97.43% | 85-92% | Above Average |
| Spam Detection Rate | 99.78% | 90-95% | Above Average |
| False Positive Rate | 1.66% | 15-25% | 93% Better |
| Precision | 95.23% | 88-93% | Above Average |
| Emails Processed (30 days) | 54,565 | N/A | Production Scale |
| Daily Volume | ~1,760 emails/day | N/A | Peak: 2,474 emails/day |

Understanding F1 Score: 97.43%

The F1 Score is one of the best single measures of email security effectiveness because it combines precision and recall into one metric.

What This Means In Practice:
  • Out of 1,000 spam emails: OpenEFA catches roughly 998 (99.78% recall)
  • Out of 100 emails flagged as spam: about 95 are actually spam (95.23% precision)
  • Balance: strong precision with a high detection rate

Industry Comparison
  • Most commercial solutions: 85-92% F1 Score
  • Barracuda: ~90%
  • Mimecast: ~92%
  • Proofpoint: ~93%
  • OpenEFA (April 2026): 97.43% ✅ Above average performance

F1 Score Breakdown

  • Overall F1 Score: 97.43%
  • Precision: 95.23%
  • Recall: 99.78%
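
The F1 figure can be sanity-checked from the published precision and recall. Since those headline numbers are rounded, the computed value matches the published 97.43% only to within rounding:

```python
# Sanity-check the F1 score from the published precision and recall.
precision = 0.9523
recall = 0.9978

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.2%}")  # ~97.45% with these rounded inputs
```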

Email Processing Breakdown (30 Days)

| Disposition | Count | Percentage | Description |
|---|---|---|---|
| Delivered (Safe) | 40,103 | 73.5% | Clean emails delivered safely to recipient inboxes |
| Quarantined (Review) | 1,791 | 3.3% | Suspicious emails held for user review and release |
| Auto-Deleted (Spam) | 11,112 | 20.4% | High-confidence spam automatically removed |
| Released | 877 | 1.6% | User-released from quarantine |
| Total Analyzed | 54,565 | 100% | All emails processed by OpenEFA |

Protected Infrastructure

| Metric | Value |
|---|---|
| Protected Email Domains | 32 |
| Protected Recipients | 401 |
| Active Users | 100+ |
| Blocking Rules | 9,619 |
| Unique Sender Domains Analyzed | 6,215 |

Average Spam Scores by Disposition

| Disposition | Average Score | Interpretation |
|---|---|---|
| Delivered Emails | 0.66 | Low risk |
| Quarantined Emails | 56.75 | High-risk spam |
| Auto-Deleted | 67.23 | Very high-risk spam |
| Released | -7.41 | False positives (trusted) |
| Overall Average | 17.60 | System baseline |

Key Insight: The 56.09-point difference between delivered and quarantined emails demonstrates excellent separation between legitimate and malicious content.

Confusion Matrix (30-Day Period)

|  | Predicted Spam | Predicted Clean |
|---|---|---|
| Actual Spam | 12,832 (True Positive) | 30 (False Negative) |
| Actual Clean | 704 (False Positive) | 38,220 (True Negative) |

What These Numbers Mean:
  • True Positives (12,832): Spam correctly identified and blocked
  • True Negatives (38,220): Clean emails correctly delivered
  • False Positives (704): Clean emails quarantined (recoverable)
  • False Negatives (30): Spam that slipped through
Derived Metrics:
  • Accuracy: (12,832 + 38,220) / 51,786 = 98.58%
  • Precision: 12,832 / (12,832 + 704) = 94.80%
  • Recall: 12,832 / (12,832 + 30) = 99.77%
  • Specificity: 38,220 / (38,220 + 704) = 98.19%
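
These derived metrics can be verified directly from the matrix counts:

```python
# Recompute the derived metrics from the 30-day confusion matrix above.
tp, fn = 12_832, 30   # actual spam: blocked vs missed
fp, tn = 704, 38_220  # actual clean: quarantined vs delivered

total = tp + tn + fp + fn            # 51,786 emails in the matrix
accuracy = (tp + tn) / total         # share of all emails classified correctly
precision = tp / (tp + fp)           # share of flagged emails that were spam
recall = tp / (tp + fn)              # share of spam that was caught
specificity = tn / (tn + fp)         # share of clean emails delivered

print(f"{accuracy:.2%} {precision:.2%} {recall:.2%} {specificity:.2%}")
```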

Spam Score Distribution (30 Days)

OpenEFA uses a graduated spam scoring system where each email receives a cumulative score based on multiple risk factors. Understanding score distribution helps evaluate system effectiveness and threshold tuning.

| Score Range | Risk Level | Count | Percentage | Typical Action |
|---|---|---|---|---|
| 0 - 5.9 | Safe | 38,434 | 70.4% | ✅ Delivered |
| 6.0 - 9.9 | Suspicious | 1,307 | 2.4% | ⚠️ Quarantined |
| 10.0 - 14.9 | High Risk | 1,058 | 1.9% | 🛑 Quarantined |
| 15.0+ | Very High Risk | 13,764 | 25.2% | ❌ Auto-Deleted |

Intelligent Thresholds

OpenEFA uses adaptive, multi-factor thresholds to determine email disposition. Emails are classified as delivered, quarantined, or auto-deleted based on cumulative scoring across all analysis modules.
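
A minimal sketch of disposition by score, using the bands from the distribution table above. The production thresholds are adaptive and multi-factor, so these fixed cutoffs are illustrative only:

```python
# Illustrative disposition logic based on the published score bands.
# The real system uses adaptive, multi-factor thresholds.
def disposition(score: float) -> str:
    if score < 6.0:
        return "deliver"      # safe
    if score < 15.0:
        return "quarantine"   # suspicious / high risk, held for review
    return "auto-delete"      # very high-risk spam

print(disposition(0.66))   # deliver (average score of delivered mail)
print(disposition(8.2))    # quarantine
print(disposition(67.2))   # auto-delete
```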

  • Clean Email (Safe): 70.4%
  • Suspicious (Quarantine): 4.3%
  • High-Risk Spam (Deleted): 25.2%

Top Blocked Threat Types

| Threat Type | Count | Description |
|---|---|---|
| Marketing / Cold Commercial Patterns | 25,765 | Unsolicited bulk/marketing content flagged by content classifier |
| Adversarial Patterns | 19,400 | Obfuscation, evasion tactics, and adversarial content signals |
| Phishing Attempts | 12,973 | Credential harvesting, fake login pages, impersonation |
| BEC (Business Email Compromise) | 9,212 | Payment/wire fraud, executive impersonation (544 CRITICAL, 769 HIGH, 1,076 MEDIUM, 6,823 LOW) |
| First-Contact Risk (New Sender) | 8,384 | Sender and/or domain never seen before in system history |
| SPF / DKIM / DMARC Failures | 6,128 | Authentication failures across one or more protocols |
| Suspicious Payment Signals | 2,383 | Invoice, wire transfer, and payment-redirect fraud indicators |
| EFA Collective RBL Matches | 1,326 | Crowd-sourced blocklist hits from the OpenEFA Collective |
| Virus / Malware Detected | 271 | Known-bad attachments and embedded malware signatures |

Machine Learning Performance

OpenEFA's ML ensemble model uses multiple classifiers trained on production email data to provide adaptive spam detection.

Ensemble Model Metrics

| Metric | Value |
|---|---|
| Training Samples | 8,750 |
| Training Balance | 4,375 spam / 4,375 ham |
| ML Accuracy | 81.9% |
| ML F1 Score | 82.7% |
| ML ROC AUC | 91.2% |
| Features | 130 |

Base Model Performance (ROC AUC)

| Model | ROC AUC |
|---|---|
| XGBoost | 91.0% |
| Random Forest | 90.0% |
| Logistic Regression | 85.8% |

Ensemble Strategy: Multiple models are combined using stacking to achieve higher accuracy than any individual model.
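
A stacking ensemble can be sketched as a meta-model over the base models' spam probabilities. The coefficients below are hypothetical placeholders, not OpenEFA's trained weights:

```python
import math

# Illustrative stacking sketch: base-model spam probabilities become the
# features of a logistic meta-model. In production the meta-model is trained
# on held-out base-model predictions; these coefficients are made up.
def meta_model(p_xgb: float, p_rf: float, p_lr: float) -> float:
    z = 2.5 * p_xgb + 2.0 * p_rf + 1.0 * p_lr - 2.6
    return 1 / (1 + math.exp(-z))  # sigmoid -> final spam probability

spam_prob = meta_model(0.95, 0.90, 0.80)  # base models agree it's spam
ham_prob = meta_model(0.05, 0.10, 0.20)   # base models agree it's ham
```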

System Performance

  • Average Processing Time: <2s
  • System Uptime: 99.9%
  • Memory Footprint: ~2.5GB
  • Daily Capacity: 5,000+ emails/day

Volume Statistics (30 Days)
  • Daily Average: 1,760 emails/day
  • Peak Day: 2,474 emails
  • Minimum Day: 674 emails
  • Total Processed: 54,565 emails

How OpenEFA Spam Scoring Works

OpenEFA uses a multi-module scoring system where each analysis component contributes to the final spam score. This layered approach provides comprehensive threat detection while minimizing false positives.
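
Conceptually, the layered scoring described above reduces to a sum of per-module deltas. The module names and values here are illustrative only (the real system has 20+ components):

```python
# Sketch of cumulative multi-module scoring. Each module contributes a
# positive (risk) or negative (trust) delta; values below are illustrative.
def total_spam_score(modules: dict[str, float]) -> float:
    return sum(modules.values())

score = total_spam_score({
    "auth": -2.0,        # SPF/DKIM/DMARC all pass: score reduced
    "dns": 0.0,          # clean reputation: no impact
    "phishing": 4.5,     # low-confidence indicators
    "bec": 0.0,          # no BEC indicators
    "behavioral": 3.0,   # first contact from a new sender
    "ml": 1.5,           # mildly spam-leaning ensemble prediction
})
print(score)  # 7.0 -> suspicious band, so this email would be quarantined
```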

1. Email Authentication Module

Validates sender authenticity using industry-standard protocols:

  • SPF: Verifies sending server is authorized
  • DKIM: Cryptographic signature validation
  • DMARC: Policy enforcement
Scoring:
  • ✅ All pass: Score reduced (trusted)
  • ⚠️ Partial: Neutral
  • ❌ Failed: Score increased (high risk)
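
The pass/partial/fail logic above can be sketched as follows; the actual score deltas are internal to OpenEFA, so the values here are illustrative assumptions:

```python
# Hedged sketch of the authentication module's scoring (deltas are illustrative).
def auth_score(spf: bool, dkim: bool, dmarc: bool) -> float:
    passed = sum([spf, dkim, dmarc])
    if passed == 3:
        return -2.0   # all pass: trusted, score reduced
    if passed > 0:
        return 0.0    # partial: neutral
    return 8.0        # all fail: high risk, score increased

print(auth_score(True, True, True))     # -2.0
print(auth_score(True, False, False))   # 0.0
print(auth_score(False, False, False))  # 8.0
```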
2. DNS Analysis Module

Advanced DNS validation and domain reputation:

  • RBL Checks: Multiple blocklist sources
  • Domain Spoofing: Multi-domain validation
  • PTR Records: Reverse DNS verification
  • Domain Age: New domain flagging
Scoring:
  • ✅ Clean reputation: No impact
  • ⚠️ Minor issues: Low increase
  • 🛑 RBL listed: Moderate increase
  • ❌ Spoofing detected: Significant increase
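
As a concrete illustration of how an RBL check works at the DNS level: the sender's IPv4 address is reversed and prefixed to the blocklist zone, and a DNS answer (rather than NXDOMAIN) indicates a listing. The zone below is a well-known public blocklist used purely as an example; OpenEFA's actual blocklist sources may differ:

```python
# Build the DNS query name for an IPv4 RBL lookup. A listed IP resolves
# (typically to 127.0.0.x); an unlisted IP returns NXDOMAIN.
def rbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    octets = ip.split(".")
    return ".".join(reversed(octets)) + "." + zone

print(rbl_query_name("203.0.113.7"))  # 7.113.0.203.zen.spamhaus.org
```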
3. Phishing Detection Module

AI-powered analysis of phishing indicators:

  • Suspicious URL patterns (shortened, obfuscated)
  • Brand impersonation detection
  • Urgency language analysis
  • Credential harvesting indicators
  • Look-alike domain detection
Scoring:
  • ✅ No indicators: No impact
  • ⚠️ Low confidence: Low increase
  • 🛑 Medium confidence: Moderate increase
  • ❌ High confidence: Significant increase
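
Look-alike domain detection can be approximated with a string-similarity check against protected brands. This is a minimal sketch: the brand list and threshold are illustrative, and production detectors typically add homoglyph normalization:

```python
from difflib import SequenceMatcher
from typing import Optional

# Illustrative brand list; a real deployment would use the protected domains.
PROTECTED = ["paypal.com", "microsoft.com", "google.com"]

def lookalike(domain: str, threshold: float = 0.85) -> Optional[str]:
    for brand in PROTECTED:
        ratio = SequenceMatcher(None, domain, brand).ratio()
        # Very similar but not identical -> likely impersonation.
        if ratio >= threshold and domain != brand:
            return brand
    return None

print(lookalike("paypa1.com"))   # paypal.com (flagged as look-alike)
print(lookalike("example.com"))  # None
```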
4. Business Email Compromise (BEC)

Detects executive impersonation and wire fraud:

  • Display name spoofing detection
  • Payment request indicators
  • Urgency/secrecy language analysis
  • Executive title spoofing
Scoring:
  • ✅ No BEC indicators: No impact
  • ⚠️ Low confidence: Low increase
  • 🛑 Medium confidence: Moderate increase
  • ❌ High confidence: Significant increase
5. Behavioral Analysis Module

Analyzes sender behavior patterns and anomalies:

  • First contact detection
  • Sender reputation analysis
  • Graph-based relationship analysis
Scoring:
  • ✅ Normal behavior: No impact
  • ⚠️ Minor anomalies: Low increase
  • 🛑 Significant anomalies: Moderate increase
  • ❌ Severe anomalies: High increase
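
First-contact detection, for instance, reduces to flagging senders never seen before in system history. This sketch uses an in-memory set; a production system would persist sender history per recipient:

```python
# Minimal first-contact detection sketch (in-memory; illustrative only).
seen_senders: set[str] = set()

def first_contact(sender: str) -> bool:
    is_new = sender not in seen_senders
    seen_senders.add(sender)
    return is_new

print(first_contact("alice@example.com"))  # True: never seen, flagged
print(first_contact("alice@example.com"))  # False: now a known sender
```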
6. ML Ensemble Module

Adaptive learning from user feedback:

  • Multi-model ensemble voting
  • Confidence-weighted adjustments
  • Learns from released emails (false positives)
  • Learns from deleted spam (true positives)
Scoring:
  • ✅ Ham prediction: Score reduced
  • ⚠️ Uncertain: No impact
  • ❌ Spam prediction: Score increased
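
The confidence-weighted adjustment described above can be sketched as a delta centered on an uncertain (0.5) prediction; the scaling factor is an illustrative assumption, not OpenEFA's actual value:

```python
# Confidence-weighted ML score adjustment (scaling factor is illustrative).
# Confident ham reduces the score, confident spam raises it, and an
# uncertain prediction near 0.5 has little or no impact.
def ml_adjustment(spam_prob: float, max_delta: float = 10.0) -> float:
    return (spam_prob - 0.5) * 2 * max_delta

print(ml_adjustment(0.95))  # ~ +9.0 (confident spam)
print(ml_adjustment(0.50))  # 0.0 (uncertain)
print(ml_adjustment(0.05))  # ~ -9.0 (confident ham)
```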

How OpenEFA Compares

| Metric | OpenEFA | Barracuda | Mimecast | Proofpoint |
|---|---|---|---|---|
| F1 Score | 97.43% | ~90% | ~92% | ~93% |
| Spam Detection | 99.78% | ~95% | ~96% | ~97% |
| Precision | 95.23% | ~89% | ~91% | ~94% |
| False Positive Rate | 1.66% | ~12% | ~10% | ~8% |
| Cost (50 users/year) | $199-799 | ~$3,000 | ~$4,800 | ~$7,200 |
| Privacy-First AI | ✅ Yes | ❌ No | ❌ No | ❌ No |

Key Advantages
  • ✅ Above-average accuracy (97.43% F1 Score)
  • ✅ Strong precision (95.23%)
  • ✅ Low false positive rate (1.66%)
  • ✅ 60-80% cost savings vs. commercial
  • ✅ Full transparency (detailed scoring)
  • ✅ Data sovereignty (self-hosted)
  • ✅ No vendor lock-in
  • ✅ Continuous learning system

Data Quality & Methodology

Measurement Period
  • Start Date: March 14, 2026
  • End Date: April 13, 2026
  • Duration: 30 days
  • Total Emails: 54,565
  • Environment: Production deployment (32 domains, 401 recipients)
Classification Methodology
  • Spam Threshold: Score ≥ 18.0
  • Clean Threshold: Score < 6.0
  • Validation: User quarantine actions (releases)
  • Source: Production MySQL database
Why These Numbers Matter

This 30-day period represents OpenEFA's production performance with all detection modules fully operational:

  • Multi-module spam scoring with 20+ detection components
  • AI-powered NLP analysis using spaCy en_core_web_lg
  • Machine learning ensemble with adaptive learning
  • Real-time DNS and authentication validation

Note: These statistics represent real production data from OpenEFA deployments across multiple client domains. All metrics are verifiable and reproducible from the source database.

Ready to Experience These Results?

Join organizations worldwide protecting their email with OpenEFA's AI-powered security.