Shannon's entropy quantifies the surprise in a signal.
It cannot see what happens when the signal arrives
in a world different from the one that sent it.
This is not a flaw in information theory.
It is the boundary where a new theory begins.
Signs Without Ground · II · The Shannon Contrast
Extension of: On the Inseparability of Motion Primitives from Their Enactive Context
I Shannon's Framework
What Shannon measured and what he left out
In 1948, Claude Shannon published "A Mathematical Theory of Communication" —
one of the most consequential papers of the twentieth century. It gave us
entropy, channel capacity, and the theoretical foundation of every digital
system that exists. It also contained an explicit bracketing that has been
largely forgotten.
Shannon wrote: "These semantic aspects of communication are irrelevant to
the engineering problem." This was not an oversight. It was a deliberate
methodological choice. Shannon wanted a theory that worked regardless of what
the symbols meant: a theory of the pipe, not the water.
That choice was extraordinarily productive. It made the theory universal,
mathematical, and applicable to any communication system. But it built a
wall. On one side: everything Shannon could measure. On the other: everything
that happens when meaning crosses a context boundary.
Shannon's Communication Model — annotated
Shannon's model is defined between encoder and decoder. The contexts at source and
destination — what the symbols mean — are outside the model's boundary by design.
H(X) = −∑ p(x) log₂ p(x)
// entropy: average surprise in source X
C = max_{p(x)} I(X;Y) = max_{p(x)} [H(X) − H(X|Y)]
// channel capacity: mutual information maximised over input distributions p(x)
d(M_A, M_B) = ?
// semantic displacement: undefined in Shannon's framework
The formula for semantic displacement — the distance between the meaning enacted
by the sender and the meaning reconstructed by the receiver — has no place in
information theory. Not because Shannon missed it. Because he left room for someone
else to define it.
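As a minimal sketch of the two quantities Shannon's framework does define, the Python below computes H(X) and approximates C for a binary-input channel by a grid search over input distributions. The function names, the channel-matrix representation, and the 10% flip probability are illustrative assumptions, not taken from Shannon's paper or from this essay. It uses the equivalent form I(X;Y) = H(Y) − H(Y|X).

```python
import math

def entropy(probs):
    """Shannon entropy H(X) in bits for a discrete distribution."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

def mutual_information(p_x, channel):
    """I(X;Y) = H(Y) - H(Y|X), which equals H(X) - H(X|Y).

    channel[i][j] is P(Y=j | X=i); each row must sum to 1.
    """
    n_in, n_out = len(p_x), len(channel[0])
    p_y = [sum(p_x[i] * channel[i][j] for i in range(n_in)) for j in range(n_out)]
    h_y_given_x = sum(p_x[i] * entropy(channel[i]) for i in range(n_in))
    return entropy(p_y) - h_y_given_x

def binary_channel_capacity(channel, steps=1000):
    """Approximate C = max over p(x) of I(X;Y) by grid search (binary input only)."""
    return max(
        mutual_information([i / steps, 1 - i / steps], channel)
        for i in range(steps + 1)
    )

# Assumed example channels: a noiseless one and one that flips 10% of bits.
NOISELESS = [[1.0, 0.0], [0.0, 1.0]]
NOISY = [[0.9, 0.1], [0.1, 0.9]]

if __name__ == "__main__":
    print("H(fair coin)  =", entropy([0.5, 0.5]))
    print("C(noiseless)  =", binary_channel_capacity(NOISELESS))
    print("C(10% flips)  =", binary_channel_capacity(NOISY))
```

Nothing in these functions takes a context as an argument. Every input is a probability, which is the bracketing described above expressed in code.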
II The Gap
Noiseless channel. Drifting meaning.
The most unsettling possibility: a communication system that performs perfectly
by every information-theoretic measure, while meaning diverges completely
between sender and receiver. Shannon's framework would report success.
The agents would believe they understood each other. They would not.
The noiseless channel paradox
Shannon's channel model sees a perfect transmission. The symbol arrives intact. Every metric is green.
This is the paradox at the heart of the Shannon Gap. Information theory gives us
tools of extraordinary precision for measuring what travels through a channel.
It has no tools for measuring what is lost in the translation between contexts —
because that translation was never inside the model.
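The paradox can be made concrete with a toy sketch. The channel, the two codebooks, and the symbol names below are all hypothetical, chosen only to show a transmission whose symbol error rate is zero while most of its symbols license different actions at each end.

```python
def transmit(symbols):
    """Hypothetical noiseless channel: every symbol arrives unchanged."""
    return list(symbols)

# Hypothetical contexts: what each symbol licenses at each end of the channel.
SENDER_CONTEXT = {
    "GRASP": "close gripper on rigid block",
    "STOP": "halt all motors",
}
RECEIVER_CONTEXT = {
    "GRASP": "close gripper on soft fruit",
    "STOP": "halt all motors",
}

def symbol_error_rate(sent, received):
    """Fraction of symbols altered in transit: the quantity the channel model sees."""
    return sum(1 for s, r in zip(sent, received) if s != r) / len(sent)

def meaning_mismatch_rate(sent, received, ctx_a, ctx_b):
    """Fraction of symbols whose licensed action differs between the two contexts."""
    return sum(1 for s, r in zip(sent, received) if ctx_a[s] != ctx_b[r]) / len(sent)

sent = ["GRASP", "STOP", "GRASP", "GRASP"]
received = transmit(sent)

if __name__ == "__main__":
    print("symbol error rate    :", symbol_error_rate(sent, received))
    print("meaning mismatch rate:", meaning_mismatch_rate(
        sent, received, SENDER_CONTEXT, RECEIVER_CONTEXT))
```

The first metric needs only the two symbol streams. The second cannot be computed without both codebooks, which neither agent holds in the scenario described above.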
A communication can have zero entropy loss and maximum semantic displacement simultaneously.
These are not contradictory. They are orthogonal. They live in different spaces.
— Original thesis, this paper
III Semantic Displacement
Defining what Shannon left open
Semantic displacement d(M_A, M_B) is the distance between the meaning
enacted in the sender's context and the meaning reconstructed in the
receiver's context. It is zero only when contexts are identical.
It is undetectable by any information-theoretic measure.
To formalise this we need to be precise about what "meaning" is in this
framework. We are not appealing to subjective experience or phenomenal
consciousness. We are making a functional claim: the meaning of a symbol
is the set of actions it licenses, the set of expectations it generates,
and the set of environmental interactions it predicts — relative to the
context in which it is received.
M(s, C) = { actions, expectations, predictions } licensed by s in context C
// meaning as a function of symbol s and context C
d(M_A, M_B) = distance( M(s, C_A), M(s, C_B) )
// semantic displacement: same symbol, different contexts
d(M_A, M_B) = 0 ⟺ C_A = C_B
// zero drift only when contexts are identical
H(X|Y) = 0 : d(M_A, M_B) ∈ [0, ∞)
// zero equivocation (perfect transmission) is compatible with any amount of semantic displacement
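One way to make these definitions executable is sketched below. It assumes a context can be represented as a mapping from symbols to sets of licensed items, and it picks Jaccard distance as the distance function. Both choices are assumptions made for illustration: the definitions above leave the distance unspecified, and Jaccard distance is bounded in [0, 1] rather than [0, ∞).

```python
def meaning(symbol, context):
    """M(s, C): the set of actions, expectations and predictions licensed by s in C.

    A context is modelled (hypothetically) as a dict from symbol to a set of items.
    """
    return frozenset(context.get(symbol, ()))

def jaccard_distance(a, b):
    """1 - |a ∩ b| / |a ∪ b|; defined as 0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

def semantic_displacement(symbol, context_a, context_b):
    """d(M_A, M_B) for one symbol interpreted in two contexts."""
    return jaccard_distance(meaning(symbol, context_a), meaning(symbol, context_b))

# Hypothetical contexts for a single symbol, "push".
LAB_CONTEXT = {
    "push": {"apply 5 N", "expect rigid surface", "predict object slides"},
}
FIELD_CONTEXT = {
    "push": {"apply 5 N", "expect compliant surface", "predict object deforms"},
}

if __name__ == "__main__":
    print(semantic_displacement("push", LAB_CONTEXT, LAB_CONTEXT))
    print(semantic_displacement("push", LAB_CONTEXT, FIELD_CONTEXT))
```

With identical contexts the displacement is zero. In the second call only one of five distinct items is shared, so the displacement is large even though the symbol "push" is the same string on both sides.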
Shannon entropy, H(X): Measures average surprise in the source. Defined over symbol distributions. Context-independent by design. Zero only when the source is fully predictable; on a noiseless channel it is the equivocation H(X|Y) that is zero, not H(X).
Semantic displacement, d(M_A, M_B): Measures meaning distance across contexts. Defined over enacted interpretations. Context-dependent by definition. Invisible to information theory.
Channel capacity, C: Maximum reliable transmission rate. Upper bound on mutual information. Measures the pipe, not the water. Says nothing about meaning.
Context gap, Δ(C_A, C_B): Distance between sender and receiver contexts. The hidden variable that generates drift. Not modelled in Shannon's framework. The source of all silent divergence.
The orthogonality claim
Shannon entropy and semantic displacement are not opposites on a single axis.
They are orthogonal — they measure different things in different spaces.
A message can simultaneously have high entropy and low displacement
(a complex but shared context), or zero entropy and maximum displacement
(a simple symbol, two worlds apart). The two metrics do not trade off.
They do not interact. They simply occupy different theoretical planes.
The orthogonal space — Shannon entropy vs. semantic displacement
The two axes are orthogonal. Any combination of entropy and semantic displacement is possible.
Shannon's framework only sees the horizontal axis.
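The claim that any combination is possible can be checked on a toy grid. The sketch below pairs two assumed sources (one deterministic, one uniform over four symbols) with two assumed context pairs (shared and disjoint), and again uses Jaccard distance as a stand-in for the unspecified distance function. All names and values are hypothetical.

```python
import math
from itertools import product

def entropy(probs):
    """Shannon entropy in bits."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

def displacement(meaning_a, meaning_b):
    """Jaccard distance between two sets of licensed actions (an assumed metric)."""
    union = meaning_a | meaning_b
    return 1.0 - len(meaning_a & meaning_b) / len(union) if union else 0.0

# Horizontal axis: how surprising the source is.
sources = {
    "low entropy": [1.0],
    "high entropy": [0.25, 0.25, 0.25, 0.25],
}
# Vertical axis: how far apart the two contexts are.
context_pairs = {
    "shared context": ({"halt", "brake"}, {"halt", "brake"}),
    "disjoint contexts": ({"halt", "brake"}, {"accelerate"}),
}

# Every source is combined with every context pair: four quadrants.
quadrants = {
    (s_name, c_name): (entropy(p), displacement(*pair))
    for (s_name, p), (c_name, pair) in product(sources.items(), context_pairs.items())
}

if __name__ == "__main__":
    for (s_name, c_name), (h, d) in quadrants.items():
        print(f"{s_name:12s} | {c_name:17s} | H = {h:.1f} bits | d = {d:.1f}")
```

All four corners are populated because the entropy is computed from the source distribution alone and the displacement from the context pair alone. Neither function receives the other's input.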
IV The Separation & Drift
How context is stripped and meaning drifts
The Separation Operation
Reality: a continuous agent–environment coupling with no natural joints.
Silent Meaning-Drift
V Where This Shows Up
The same structure, everywhere
Silent meaning-drift is not a robotics problem or an AI problem.
It is the general structure of what happens when any symbol crosses
a context boundary. Once you see it, you see it everywhere.
Semantic displacement across domains
Robotics: A motion primitive learned in a controlled lab environment is transferred to a real-world deployment. The symbol arrives intact — the weight vector is identical. But the affordance structure of the new environment differs: compliant surfaces, unknown masses, spatial constraints. The primitive executes. The action fails. Shannon reports success. The robot has no mechanism to detect the context gap.
Domain         | Symbol                    | Context A             | Context B               | Drift type
Robotics       | motion primitive W(r*,c*) | lab training env      | real-world deployment   | affordance mismatch
LLMs           | token embedding           | training corpus       | user's specific context | distributional shift
Human language | word / phrase             | speaker's life-world  | listener's life-world   | semantic gap
Institutions   | policy / directive        | leadership context    | implementation context  | intent-execution gap
Translation    | translated text           | source language world | target language world   | cultural displacement
In every case: the channel is clean. Shannon is satisfied. The displacement
is real, significant, and invisible to every tool that information theory
provides. The problem is not in the pipe. The problem is in the assumption
that sender and receiver share a context — an assumption Shannon never made
explicit because he never needed to. He had bracketed it from the start.
VI What This Opens
Not a critique of Shannon. A continuation.
Shannon gave us the mathematics of transmission. This framework proposes
the mathematics of interpretation — the missing second half of communication
theory. The two are not in conflict. They are complementary. Shannon stops
at the decoder. The new theory begins there.
The central open problem is: can semantic displacement d(M_A, M_B) be
formalised in a way that is measurable, not merely conceivable? If the
context gap Δ(C_A, C_B) can be estimated — from environmental sensors,
from behavioural divergence, from model disagreement — then semantic
displacement becomes a quantity that can be monitored, minimised, and
perhaps corrected.
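Of the three estimation routes named here, behavioural divergence is the easiest to sketch. The code below assumes we can log which action each agent takes in response to the same symbol, and it uses the Jensen-Shannon divergence between the two action distributions as a proxy for Δ(C_A, C_B). The proxy, the function names, and the log format are all assumptions for illustration, not a method proposed in this paper.

```python
import math
from collections import Counter

def distribution(actions):
    """Empirical probability of each observed action."""
    counts = Counter(actions)
    total = sum(counts.values())
    return {action: n / total for action, n in counts.items()}

def kl_divergence(p, q):
    """KL(p || q) in bits; q must be positive wherever p is."""
    return sum(pv * math.log2(pv / q[k]) for k, pv in p.items() if pv > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric and bounded in [0, 1]."""
    keys = set(p) | set(q)
    p_full = {k: p.get(k, 0.0) for k in keys}
    q_full = {k: q.get(k, 0.0) for k in keys}
    mix = {k: 0.5 * (p_full[k] + q_full[k]) for k in keys}
    return 0.5 * kl_divergence(p_full, mix) + 0.5 * kl_divergence(q_full, mix)

def estimate_context_gap(responses_a, responses_b):
    """Mean JS divergence over the symbols both agents have responded to.

    responses_x maps a symbol to the list of actions agent x took on receiving it.
    """
    shared = sorted(set(responses_a) & set(responses_b))
    if not shared:
        raise ValueError("no shared symbols to compare")
    return sum(
        js_divergence(distribution(responses_a[s]), distribution(responses_b[s]))
        for s in shared
    ) / len(shared)

# Hypothetical behaviour logs for one symbol.
LAB_LOG = {"push": ["slide", "slide", "slide", "slide"]}
FIELD_LOG = {"push": ["deform", "deform", "deform", "deform"]}

if __name__ == "__main__":
    print(estimate_context_gap(LAB_LOG, LAB_LOG))
    print(estimate_context_gap(LAB_LOG, FIELD_LOG))
```

The estimate is built from information-theoretic machinery, but it is applied to what the agents do after decoding, not to the symbols on the channel. It also only sees divergence that shows up in behaviour, so it should be read as a lower-bound style signal, not as a measurement of the context gap itself.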
This would not extend Shannon's theorem. It would complement it: a
channel capacity theorem for meaning, running in parallel with the
theorem for signal. A system with both would know not only whether
the message arrived, but whether it arrived with its meaning intact.
Shannon built the pipe. The water — its meaning, its context,
its enacted interpretation at the other end — was always outside
the model. This is not a flaw. It is the boundary where the next
theory lives.
— Original thesis, this paper
Shannon's capacity theorem for a band-limited Gaussian channel (1948; the Shannon–Hartley form):
C = B · log₂(1 + S/N)
// channel capacity as a function of bandwidth B and signal-to-noise ratio S/N
Proposed complement:
K = f( Δ(C_A, C_B), vocab_coverage, context_overlap )
// semantic capacity: maximum meaning transferable given context gap
when Δ(C_A, C_B) → 0 : K → C
// when contexts converge, semantic capacity approaches channel capacity
when Δ(C_A, C_B) → ∞ : K → 0
// when contexts diverge completely, no meaning transfers — regardless of C
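The function f is left open above. Purely to show that the two limits are mutually satisfiable, the sketch below pairs the capacity formula with one hypothetical candidate, K = C · exp(−Δ). This exponential form is an assumption made for illustration, it ignores the vocab_coverage and context_overlap arguments, and it is not a proposal for what f should be.

```python
import math

def shannon_hartley_capacity(bandwidth_hz, snr_linear):
    """C = B * log2(1 + S/N) in bits per second; snr_linear is a ratio, not dB."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)

def semantic_capacity(channel_capacity, context_gap):
    """Hypothetical K = C * exp(-gap): one form that meets both stated limits.

    gap -> 0 gives K -> C; gap -> infinity gives K -> 0, whatever C is.
    """
    if context_gap < 0:
        raise ValueError("context gap must be non-negative")
    return channel_capacity * math.exp(-context_gap)

if __name__ == "__main__":
    c = shannon_hartley_capacity(bandwidth_hz=1000.0, snr_linear=1.0)
    for gap in (0.0, 1.0, 5.0, 50.0):
        print(f"gap = {gap:5.1f}  K = {semantic_capacity(c, gap):.6g}")
```

Any monotonically decreasing function with f(0) = C and a limit of zero would satisfy the same two conditions, so the limits alone do not pin down the theory. Choosing among such candidates is part of the open problem.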