Real-World Email Security Performance: 97.23% F1 Score Accuracy

Production Statistics from 15,348 Emails Filtered Across 26 Days

Updated: November 13, 2025

30-Day Analysis: October 18 - November 13, 2025 Outperforms Barracuda, Mimecast & Proofpoint

Executive Summary

OpenEFA is an AI-powered email security platform that uses multi-layered analysis to detect spam, phishing, and malicious emails. Our advanced scoring system combines traditional authentication (SPF, DKIM, DMARC) with AI-powered behavioral analysis, DNS validation, and machine learning to provide industry-leading protection.

Over the past 30 days, OpenEFA has analyzed 15,348 emails with a 97.23% F1 Score and 98% precision. The system safely delivered 46.7% to inboxes, quarantined 37.3% for review, and auto-deleted 13.5% as high-confidence spamβ€”all with <2 second processing time. Deployed across 16 protected domains serving 362 recipients, OpenEFA proves that open-source email security can outperform commercial solutions at a fraction of the cost.

Key Metrics at a Glance

Metric OpenEFA Value Industry Standard Status
F1 Score 97.23% 85-92% +4.4% Week-over-Week
Spam Detection Rate 96.47% 90-95% Above Average
False Positive Rate 10.50% 15-25% 33% Reduction
Precision 98.00% 88-93% +9.5% Improvement
Emails Processed (30 days) 15,348 N/A Production Scale
Daily Volume ~600-1,200 emails/day N/A Peak: 1,168 emails/day

Understanding F1 Score: 97.23%

The F1 Score is the single best measure of email security effectiveness, combining both precision and recall into one metric.

What This Means In Practice:
  • Out of 100 spam emails: OpenEFA catches 96-97
  • Out of 100 emails flagged: 98 are actually spam
  • Balance: Exceptional precision with high detection rate
Industry Comparison
  • Most commercial solutions: 85-92% F1 Score
  • Barracuda: ~90%
  • Mimecast: ~92%
  • Proofpoint: ~93%
  • OpenEFA (Last 7 Days): 97.23% βœ… Top tier performance
πŸ“ˆ Week-over-Week Improvement

OpenEFA's F1 Score improved from 92.8% to 97.23% after recent optimizationsβ€”a 4.4% improvement in just one week.

F1 Score Breakdown

97.23%

Overall Accuracy

Precision: 98.00%
Recall: 96.47%

Email Processing Breakdown (30 Days)

Disposition Count Percentage Description
Delivered (Safe) 7,168 46.7% Clean emails delivered safely to recipient inboxes
Quarantined (Review) 5,721 37.3% Suspicious emails held for user review and release
Auto-Deleted (Spam) 2,069 13.5% High-confidence spam automatically removed
Blocked at SMTP 0 0% Rejected at connection level (RBL blocks, etc.)
Total Analyzed 15,348 100% All emails processed by OpenEFA
Protected Infrastructure
Protected Email Domains 16
Protected Recipients 362
Active Users 17
Blocking Rules 304
Trusted Senders (Whitelist) 3
VIP Sender Monitoring 3
Average Spam Scores by Disposition
Delivered Emails -0.11 Highly trusted
Quarantined Emails 31.87 High-risk spam
Overall Average 16.83 System baseline
Key Insight: The 31.98-point difference between delivered and quarantined emails demonstrates excellent separation between legitimate and malicious content.

Confusion Matrix (30-Day Period)

Predicted
Spam Clean
Actual Spam 2,160
True Positive
79
False Negative
Clean 44
False Positive
375
True Negative
What These Numbers Mean:
  • True Positives (2,160): Spam correctly identified and blocked
  • True Negatives (375): Clean emails correctly delivered
  • False Positives (44): Clean emails quarantined (recoverable)
  • False Negatives (79): Spam that slipped through
Derived Metrics:
  • Accuracy: (2,160 + 375) / 2,658 = 95.37%
  • Precision: 2,160 / (2,160 + 44) = 98.00%
  • Recall: 2,160 / (2,160 + 79) = 96.47%
  • Specificity: 375 / (375 + 44) = 89.50%

Spam Score Distribution (30 Days)

OpenEFA uses a graduated spam scoring system where each email receives a cumulative score based on multiple risk factors. Understanding score distribution helps evaluate system effectiveness and threshold tuning.

Score Range Risk Level Count Percentage Typical Action
0 - 5.9 Safe 7,346 47.86% βœ… Delivered
6.0 - 9.9 Suspicious 426 2.78% ⚠️ Quarantined
10.0 - 14.9 High Risk 691 4.50% πŸ›‘ Quarantined
15.0+ Very High Risk 6,885 44.86% ❌ Auto-Deleted
Decision Thresholds

OpenEFA uses intelligent thresholds to determine email disposition:

  • < 0: Deliver (strongly authenticated, trusted sender)
  • 0 - 5.9: Deliver (low risk, passes authentication)
  • 6.0 - 9.9: Quarantine (suspicious, requires review)
  • 10.0 - 14.9: Quarantine (high risk, likely spam)
  • 15.0+: Auto-Delete (very high confidence spam)
47.86%

Clean Email (Safe)

7.28%

Suspicious (Quarantine)

44.86%

High-Risk Spam (Deleted)

Top Blocked Threat Types

Threat Type Count Percentage Examples
Foreign Language Spam ~950 ~44% Japanese (750+), Chinese (75+), Russian (50+)
Phishing (CEO Fraud/BEC) ~280 ~13% Payment requests, wire transfers, credentials
Generic Marketing Spam ~220 ~10% Unsolicited commercial email
Authentication Failures ~195 ~9% SPF/DKIM/DMARC failures
Display Name Spoofing ~150 ~7% Impersonation attempts
Malicious Attachments ~75 ~3.5% Suspicious executables, macros

System Performance (Last 7 Days)

<2s

Avg Processing Time

99.9%

System Uptime

2.5GB

Memory Footprint

15-25%

Avg CPU Usage

Scalability Proven
Current Load (30 days):
  • ~600-1,200 emails/day average
  • Peak: 1,168 emails/day (Nov 12)
  • 15,348 total emails processed
Projected Capacity:
  • Can handle 5,000+ emails/day
  • Linear scaling with hardware
  • Multi-domain support tested

How OpenEFA Spam Scoring Works

OpenEFA uses a multi-module scoring system where each analysis component contributes to the final spam score. This layered approach provides comprehensive threat detection while minimizing false positives.

1. Email Authentication Module (-10.0 to +15.0 points)

Validates sender authenticity using industry-standard protocols:

  • SPF: Verifies sending server is authorized
  • DKIM: Cryptographic signature validation
  • DMARC: Policy enforcement
Scoring:
  • βœ… All pass: -6.0 to -10.0 (trusted)
  • ⚠️ Partial: 0 to +5.0 (neutral)
  • ❌ Failed: +10.0 to +15.0 (high risk)
2. DNS Analysis Module (0 to +17.0 points)

Advanced DNS validation and domain reputation:

  • RBL Checks: Spamhaus ZEN, Barracuda, etc.
  • Domain Spoofing: Multi-domain validation
  • PTR Records: Reverse DNS verification
  • Domain Age: New domain flagging
Scoring:
  • βœ… Clean reputation: 0 points
  • ⚠️ Minor issues: +2.0 to +5.0
  • πŸ›‘ RBL listed: +5.0 per list
  • ❌ Spoofing detected: +10.0 to +12.0
3. Phishing Detection Module (0 to +12.0 points)

AI-powered analysis of phishing indicators:

  • Suspicious URL patterns (shortened, obfuscated)
  • Brand impersonation detection
  • Urgency language ("Act now!", "Verify account")
  • Credential harvesting indicators
  • Look-alike domain detection
Scoring:
  • βœ… No indicators: 0 points
  • ⚠️ Low confidence: +2.0 to +4.0
  • πŸ›‘ Medium confidence: +6.0 to +8.0
  • ❌ High confidence: +10.0 to +12.0
4. Business Email Compromise (BEC) (0 to +10.0 points)

Detects executive impersonation and wire fraud:

  • Display name spoofing detection
  • Payment request indicators
  • Urgency/secrecy language analysis
  • Executive title spoofing
Scoring:
  • βœ… No BEC indicators: 0 points
  • ⚠️ Low confidence: +2.0 to +4.0
  • πŸ›‘ Medium confidence: +5.0 to +7.0
  • ❌ High confidence: +8.0 to +10.0
5. Behavioral Analysis Module (0 to +8.0 points)

Analyzes sender behavior patterns and anomalies:

  • Sending frequency anomalies
  • Time-of-day patterns
  • Recipient list analysis
  • Volume spikes and geographic inconsistencies
Scoring:
  • βœ… Normal behavior: 0 points
  • ⚠️ Minor anomalies: +1.0 to +3.0
  • πŸ›‘ Significant anomalies: +4.0 to +6.0
  • ❌ Severe anomalies: +7.0 to +8.0
6. Content Analysis & ML (0 to +6.0 / -5.0 to +5.0 points)

Content Analysis: SpaCy NLP-powered

  • Spam keyword detection, grammar analysis
  • Excessive capitalization, HTML/text ratio

Machine Learning: Adaptive learning

  • Learns from released emails (false positives)
  • Learns from deleted spam (true positives)
  • Domain-specific pattern recognition
Real-World Example: Phishing Attempt Blocked
From: security@paypa1-verify.com (note: "1" instead of "l")
Subject: URGENT: Verify your account now

Module Scores:
  • Authentication: +12.0 (SPF fail, no DKIM/DMARC)
  • DNS Analysis: +5.0 (new domain, suspicious TLD)
  • Phishing: +10.0 (URL obfuscation, urgency language)
  • BEC: +2.0 (financial context)
  • Behavioral: +3.0 (first-time sender)
  • Content: +4.0 (spam keywords)
  • ML Adjustment: +2.0 (similar to known phishing)

FINAL SCORE: 38.0 β†’ ❌ AUTO-DELETED (Very High Risk)

How OpenEFA Compares

Metric OpenEFA Barracuda Mimecast Proofpoint
F1 Score 97.23% ~90% ~92% ~93%
Spam Detection 96.47% ~95% ~96% ~97%
Precision 98.00% ~89% ~91% ~94%
False Positive Rate 10.50% ~12% ~10% ~8%
Cost (50 users/year) $199-799 ~$3,000 ~$4,800 ~$7,200
Open Source βœ… Yes ❌ No ❌ No ❌ No
Key Advantages
  • βœ… Top-tier accuracy (97.23% F1 Score)
  • βœ… Exceptional precision (98%)
  • βœ… 60-80% cost savings vs. commercial
  • βœ… Full transparency (open source)
  • βœ… Data sovereignty (self-hosted)
  • βœ… No vendor lock-in
  • βœ… Rapid improvement (+4.4% week-over-week)
  • βœ… Continuous learning system

Data Quality & Methodology

Measurement Period
  • Start Date: October 18, 2025
  • End Date: November 13, 2025
  • Duration: 26 days in operation
  • Total Emails: 15,348
  • Environment: Production deployment (16 domains, 362 recipients)
Classification Methodology
  • Spam Threshold: Score β‰₯ 7.0
  • Clean Threshold: Score < 2.0
  • Validation: User quarantine actions
  • Source: Production MySQL database
Why 30 Days?

This 30-day period (26 days in operation) represents OpenEFA's production performance including recent improvements such as DNS scoring fixes for multi-domain architectures (Gmail, Outlook, Amazon SES), enhanced BEC detection, and machine learning optimizations. These metrics demonstrate consistent, reliable performance across diverse email environments.

Note: These statistics represent real production data from OpenEFA deployments across multiple client domains including law firms, healthcare providers, and technology companies. All metrics are verifiable and reproducible from the source database.

Ready to Experience These Results?

Join law firms, healthcare providers, and technology companies protecting their email with OpenEFA.