Content Shift Detection: When Message Patterns Change Without Explanation

People are remarkably consistent in how they write. The words they choose, the topics they address, the types of requests they make, the level of formality they maintain — all of these form a content fingerprint that persists over months and years. When that fingerprint changes abruptly, the message may still come from the right address, but it may not come from the right person.

Content shift detection is the practice of identifying when a sender's messaging pattern deviates from their established baseline in ways that suggest compromise, impersonation, or manipulation. It is a valuable signal for email security because it answers a question that authentication alone cannot: whether the message reads like something its claimed author would write.

Why Content Consistency Matters

Every sender develops a natural content profile over time. This profile includes dimensions that most people never consciously think about:

Topic distribution: A sender from your IT department discusses infrastructure, tickets, and system updates. A sender from your legal team discusses contracts, compliance, and review deadlines. These topic domains are consistent and predictable.
Request types: Some senders ask questions. Some send status updates. Some issue directives. The mix of request types for a given sender tends to remain stable.
Formality level: Some senders are casual and use contractions, first names, and sentence fragments. Others maintain formal structure with full salutations and closing remarks. This register rarely changes without reason.
Sentence structure: Average sentence length, paragraph density, use of bullet points versus prose — these structural patterns are deeply habitual.
Vocabulary range: The specific words and phrases a person uses regularly form a lexical signature. Industry jargon, preferred abbreviations, and even filler phrases like "just wanted to follow up" or "as discussed" are consistent markers.

None of these dimensions is individually conclusive. But together, they create a rich profile of what "normal" looks like for any given sender — and a clear basis for detecting when something has changed.
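To make these dimensions concrete, here is a minimal sketch of how a few of them could be measured from a message body. It is an illustration only, not OpenEFA's implementation: the `StyleFeatures` fields, the greeting pattern, and the apostrophe-based contraction count are all assumptions chosen for brevity.

```python
import re
from dataclasses import dataclass

# Hypothetical greeting pattern; a real system would need a much broader list.
GREETING = re.compile(
    r"^\s*(hi|hello|hey|dear|good (morning|afternoon|evening))\b", re.I
)


@dataclass
class StyleFeatures:
    word_count: int
    avg_sentence_len: float   # words per sentence
    contraction_rate: float   # share of words containing an inner apostrophe
    has_greeting: bool


def extract_features(body: str) -> StyleFeatures:
    words = re.findall(r"[A-Za-z']+", body)
    sentences = [s for s in re.split(r"[.!?]+", body) if s.strip()]
    # Crude proxy: any word with an inner apostrophe counts as a contraction
    # (this also counts possessives such as "Sam's").
    contractions = [w for w in words if "'" in w.strip("'")]
    n = len(words)
    return StyleFeatures(
        word_count=n,
        avg_sentence_len=n / len(sentences) if sentences else 0.0,
        contraction_rate=len(contractions) / n if n else 0.0,
        has_greeting=bool(GREETING.match(body)),
    )
```

Collected over many messages, features like these give each sender a numeric baseline that later messages can be compared against.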

How Content Shifts Indicate Compromise

When an attacker takes control of a legitimate email account, they inherit the sender's identity but not their communication habits. The resulting messages often exhibit detectable shifts across multiple content dimensions:

Topic Departure

A sender who has exclusively discussed project timelines and deliverables for six months suddenly sends a message about updating banking information. The topic is completely outside the sender's established domain. This departure is meaningful even when the message is well-written and professional, because the content does not match the relationship's history.

Request Escalation

The sender's messages have historically been informational — status updates, scheduling confirmations, document shares. A new message introduces an urgent financial request with a deadline. The shift from informational to transactional, combined with artificial urgency, is a pattern that appears consistently in business email compromise attacks.

Tone Shift

A sender who writes in relaxed, conversational language suddenly sends a message that is terse, formal, and directive. Or the reverse — a consistently formal sender suddenly adopts an unusually casual tone. These shifts can indicate that a different person is operating the account, or that the message was generated by an AI tool or template that doesn't match the sender's natural voice.

Language and Grammar Changes

Subtle changes in grammar patterns, spelling conventions (British vs. American English), punctuation habits, or even the way dates and numbers are formatted can indicate a different author. A sender who has never used the Oxford comma suddenly uses it consistently. A sender who writes "colour" suddenly writes "color." These micro-patterns are difficult for attackers to replicate because they require intimate knowledge of the sender's writing habits.
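As a rough illustration of the spelling example, the sketch below checks whether a new message uses a different spelling convention from the sender's history. The five-word list and the function names are hypothetical; a realistic check would need a far larger lexicon and would treat a single word as weak evidence.

```python
import re
from collections import Counter

# Tiny illustrative word list (an assumption, not a complete lexicon).
BRITISH_TO_AMERICAN = {
    "colour": "color",
    "organise": "organize",
    "centre": "center",
    "favour": "favor",
    "analyse": "analyze",
}
AMERICAN = set(BRITISH_TO_AMERICAN.values())


def spelling_votes(text: str) -> Counter:
    """Count words that reveal a British or American spelling habit."""
    votes = Counter()
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in BRITISH_TO_AMERICAN:
            votes["british"] += 1
        elif token in AMERICAN:
            votes["american"] += 1
    return votes


def convention_flipped(history: list[str], new_message: str) -> bool:
    """True when the new message prefers the opposite convention."""
    past = sum((spelling_votes(m) for m in history), Counter())
    new = spelling_votes(new_message)
    if not past or not new:
        return False  # not enough evidence on one side to compare
    return past.most_common(1)[0][0] != new.most_common(1)[0][0]
```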

Emotional Manipulation

Compromised accounts are often used to send messages that introduce emotional pressure — urgency, secrecy, authority, or fear — that the real sender has never employed. "Please handle this confidentially" or "I need this done before the board meeting" from a sender who has never used such language is a significant content shift.

How OpenEFA® Tracks Content Baselines

OpenEFA builds a multi-dimensional content profile for each sender-recipient relationship. This profile is not a simple keyword list — it is a behavioral model that captures how a sender communicates across several axes:

Topic modeling: OpenEFA categorizes the subject matter of messages over time, building a map of what topics are normal for each sender. When a message falls outside the established topic space, the deviation is measured and scored.
Intent classification: Each message is classified by intent — informational, request, confirmation, question, directive, escalation. The distribution of intent types for a given sender forms a stable baseline.
Stylometric analysis: Writing style features including sentence length, vocabulary complexity, punctuation patterns, and structural preferences are tracked per sender. Deviations in style are weighted according to their statistical significance.
Request sensitivity scoring: Messages that introduce financial transactions, credential requests, access changes, or confidentiality requirements are scored for sensitivity. When a sender who has never made sensitive requests suddenly introduces one, the sensitivity score creates a flag independent of content matching.
Relationship context: The content profile is specific to each sender-recipient pair. A sender may discuss finances with the CFO and project timelines with engineering. OpenEFA understands that a financial request to engineering is anomalous even if the sender discusses finances in other relationships.
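The relationship-context idea can be sketched with a profile keyed by sender-recipient pair rather than by sender alone. This is a simplified illustration under assumed names (`RelationshipProfiles`, `observe`, `topic_share`), and it takes the topic label as given instead of modeling it; it does not describe OpenEFA's internal data model.

```python
from collections import Counter, defaultdict


class RelationshipProfiles:
    """Topic counts stored per (sender, recipient) pair."""

    def __init__(self) -> None:
        self._topics: defaultdict[tuple[str, str], Counter] = defaultdict(Counter)

    def observe(self, sender: str, recipient: str, topic: str) -> None:
        self._topics[(sender, recipient)][topic] += 1

    def topic_share(self, sender: str, recipient: str, topic: str) -> float:
        """Fraction of this pair's past messages that were about `topic`."""
        counts = self._topics.get((sender, recipient))
        if not counts:
            return 0.0  # no history for this pair
        return counts[topic] / sum(counts.values())
```

Because the key is the pair, a sender who regularly discusses finance with the CFO still has a finance share of zero in their relationship with engineering, which is what makes a financial request there stand out.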

The Challenge of Detecting Content Shift

Content shift detection is not simple pattern matching. It requires solving several difficult problems:

Distinguishing Change from Evolution

People's communication patterns do change over time. A new project begins, and the topics shift. A promotion changes the sender's role, and their request types evolve. OpenEFA addresses this by tracking the rate of change, not just the presence of change. Gradual evolution over weeks is normal. Abrupt shifts within a single message or a short burst of messages are not.
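One common way to separate gradual evolution from abrupt change is an exponentially weighted moving average: the baseline drifts along with the sender, while a value far outside the recent spread is flagged and kept out of the baseline. The sketch below applies that technique to a single numeric feature such as word count. The class name, the smoothing factor, the threshold, and the warm-up length are illustrative assumptions, not OpenEFA parameters.

```python
class DriftTracker:
    """EWMA baseline for one numeric feature (for example, word count)."""

    def __init__(self, alpha: float = 0.2, threshold: float = 3.0,
                 warmup: int = 5) -> None:
        self.alpha = alpha          # how quickly the baseline follows new data
        self.threshold = threshold  # allowed deviation, in standard deviations
        self.warmup = warmup        # observations to absorb before flagging
        self.n = 0
        self.mean: float | None = None
        self.var = 0.0

    def update(self, x: float) -> bool:
        """Record x; return True if it is an abrupt departure."""
        self.n += 1
        if self.mean is None:
            self.mean = float(x)
            return False
        diff = x - self.mean
        std = self.var ** 0.5
        abrupt = self.n > self.warmup and abs(diff) > self.threshold * std
        if not abrupt:
            # Only normal-looking values move the baseline, so slow drift
            # is absorbed while a suspicious outlier does not pollute it.
            self.mean += self.alpha * diff
            self.var = (1 - self.alpha) * (self.var + self.alpha * diff * diff)
        return abrupt
```

Fed a slowly rising series of word counts followed by a sudden 45-word message, a tracker like this would accept the drift and flag only the final value.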

Handling Low-Volume Relationships

Some sender-recipient pairs exchange only a few messages per month. Building a robust baseline from limited data requires careful statistical treatment. OpenEFA uses hierarchical models that combine relationship-specific data with broader sender behavior patterns, allowing meaningful baseline construction even for infrequent correspondents.
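A full hierarchical model is beyond a short example, but the core idea of partial pooling can be shown with a simple shrinkage estimate: a pair with little history leans on the sender's overall behavior, and a pair with plenty of history relies mostly on its own data. The function name and the `prior_strength` value are assumptions for illustration, standing in for whatever model OpenEFA actually uses.

```python
def pooled_mean(pair_values: list[float], sender_mean: float,
                prior_strength: float = 5.0) -> float:
    """Blend a pair-specific mean with the sender-wide mean.

    prior_strength acts like a count of "virtual" observations drawn
    from the sender-wide baseline (a hypothetical tuning parameter).
    """
    n = len(pair_values)
    if n == 0:
        return sender_mean  # no pair history: fall back to the sender
    pair_mean = sum(pair_values) / n
    weight = n / (n + prior_strength)
    return weight * pair_mean + (1 - weight) * sender_mean
```

With five observations and a prior strength of five, the estimate sits halfway between the two means; as the pair's history grows, the weight on its own data approaches one.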

Avoiding False Positives on Legitimate New Topics

A sender may legitimately introduce a new topic for the first time. The goal is not to block every novel message, but to recognize when novel content co-occurs with other risk signals. A new topic from a sender whose timing, volume, and style are all consistent is handled differently from a new topic that arrives with timing anomalies and a changed writing style.
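The distinction can be written as a small triage rule. This is a deliberately simplified sketch: the three boolean inputs and the `Action` names are hypothetical, and a production system would work with graded scores rather than yes/no flags.

```python
from enum import Enum


class Action(Enum):
    DELIVER = "deliver"
    ANNOTATE = "deliver_with_context"
    HOLD = "hold_for_verification"


def triage(novel_topic: bool, timing_anomaly: bool,
           style_anomaly: bool) -> Action:
    """Escalate a novel topic only when other risk signals corroborate it."""
    if not novel_topic:
        return Action.DELIVER
    corroborating = int(timing_anomaly) + int(style_anomaly)
    if corroborating == 0:
        # New topic, but timing and style look like the usual sender.
        return Action.ANNOTATE
    return Action.HOLD
```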

A Real-World Scenario

Consider this situation:

Sender: A department head who has been emailing your HR director for two years.

Established pattern: Monthly messages about headcount planning, team reviews, and interview scheduling. Tone is warm and conversational, uses first names, often includes pleasantries. Typical message length: 150–300 words.

New message: A 45-word email with no greeting, requesting that HR "process the attached direct deposit change form for [employee name] immediately" and "confirm completion by end of day."

The message is signed with a full formal name block that the sender has never used before.

Authentication passes. The email address is correct. The subject line references an ongoing HR process. A traditional security gateway sees a clean message from a trusted internal sender.

But OpenEFA detects a constellation of content shifts:

Topic shift: Direct deposit changes have never appeared in this relationship's history. The sender has always discussed planning and scheduling, not payroll modifications.
Tone shift: The warm, conversational register has been replaced by a curt, directive tone. No greeting, no pleasantries, no first-name usage.
Length anomaly: At 45 words, the message is far shorter than the sender's established range of 150–300 words.
Request type shift: The sender has never issued a directive with a deadline in this relationship. Their messages have been collaborative and informational.
Formality shift: The full formal signature block replaces the sender's usual casual sign-off.
Urgency introduction: "Immediately" and "by end of day" are urgency markers that have never appeared in this sender's communication.

Each signal on its own might be explainable. Together, they paint a clear picture: the person writing this message is not communicating the way this sender communicates. The account may be compromised, and this request deserves additional verification before it is processed.
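Two of the signals in this scenario are easy to quantify. The sketch below scores the length anomaly as a z-score against past word counts and lists urgency markers that never appeared in earlier messages. The marker list is an assumption, any history values you pass in are your own data, and the helper names are hypothetical.

```python
import statistics

# Illustrative marker list; "immediately" and "by end of day" come from
# the scenario, the others are assumed examples.
URGENCY_MARKERS = ("immediately", "by end of day", "urgent", "asap")


def length_z_score(history_word_counts: list[int], new_word_count: int) -> float:
    """Standard deviations between the new length and the historical mean.

    Needs at least two past messages with some variation in length.
    """
    mu = statistics.mean(history_word_counts)
    sigma = statistics.stdev(history_word_counts)
    return (new_word_count - mu) / sigma


def new_urgency_markers(history_bodies: list[str], new_body: str) -> list[str]:
    """Urgency markers in the new message that the sender never used before."""
    seen = " ".join(history_bodies).lower()
    body = new_body.lower()
    return [m for m in URGENCY_MARKERS if m in body and m not in seen]
```

For a sender whose past messages ran between 150 and 300 words, a 45-word message lands several standard deviations below the mean, and both "immediately" and "by end of day" would be reported as first-time markers.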

Signal Composition

Content shift detection becomes significantly more powerful when combined with other signals in the OpenEFA framework:

Content shift + timing anomaly: A message with unusual content arrives at an unusual time for the sender. The content deviation makes it less likely that the timing deviation is a simple schedule change.
Content shift + reply chain entry: A reply in an existing thread introduces a topic or request type that has never appeared in the conversation. The thread context suggests continuity, but the content says otherwise.
Content shift + volume change: A sender who normally sends long, detailed monthly updates suddenly sends several short, directive messages in a single day. The content and volume shifts reinforce each other.
Content shift + first-time attachment: A sender whose messages have always been plain text suddenly introduces an attachment alongside a content shift. The combination of new behavior across two dimensions increases confidence that the message warrants scrutiny.

This compositional approach means that OpenEFA can confidently flag messages that exhibit multiple simultaneous deviations while allowing individual, isolated changes to pass with appropriate context.
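One simple way to express this kind of composition is a noisy-OR combination, in which each signal contributes a score between 0 and 1 and several moderate scores add up to a high combined score. The sketch below shows that technique under the assumption that signals are independent, which is a simplification. The signal names are examples, and nothing here is taken from OpenEFA's actual scoring.

```python
import math


def combine(scores: dict[str, float]) -> float:
    """Noisy-OR: probability that at least one signal is a true alarm,
    treating the signals as independent (a simplifying assumption)."""
    for name, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must be in [0, 1]")
    return 1.0 - math.prod(1.0 - s for s in scores.values())
```

A lone content score of 0.4 stays at 0.4, while the same content score paired with a timing score of 0.5 combines to 0.7, so isolated changes remain below a threshold that simultaneous deviations can cross.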

The Broader Principle

Content Shift Detection is part of the OpenEFA Signals framework — a set of behavioral and contextual patterns that reveal risk before it becomes an incident.

The core principle: what a person says is as identifiable as who they claim to be. Authentication verifies the address. Content analysis verifies the author. When the two diverge, the most sophisticated attacks become visible.

Traditional security asks: "Does this message contain anything dangerous?"

OpenEFA asks: "Does this message sound like the person who is supposed to be sending it?"

Attackers can steal credentials. They can forge headers. They can pass every authentication check. What they rarely manage is to replicate years of communication habits, personal vocabulary, and relationship-specific patterns. That gap between the real sender and the impersonator is where content shift detection operates, and where some of the most dangerous attacks are caught.
