Motion primitives extracted from robot sensorimotor experience have been proposed as grounded alternatives to language model tokens. We accept the data-efficiency argument. We contest the grounding claim.
A Best Matching Unit weight vector records what the body did. It does not record why that bodily response was the right one — the surface properties, object fragility, task constraints, and spatial affordances that shaped it. The environment is gone. What remains is a sign.
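The point can be made concrete with a minimal sketch of a Best Matching Unit lookup. Everything specific here is an assumption for illustration: the two-dimensional body state (grip aperture, grip force), the three-unit codebook, and the context labels are hypothetical and not taken from any particular motion-primitive system. The sketch shows only that context never enters the lookup, so two grasps enacted in different worlds collapse onto the same unit.

```python
import math

def best_matching_unit(codebook, sample):
    """Return the index of the codebook vector nearest to `sample` (Euclidean)."""
    return min(range(len(codebook)), key=lambda i: math.dist(codebook[i], sample))

# Hypothetical codebook over a 2-D body state: (grip aperture, grip force).
codebook = [(0.2, 0.1), (0.5, 0.5), (0.9, 0.8)]

# Two grasps with the same body state, enacted in different (hypothetical) contexts.
grasps = [
    {"body": (0.52, 0.48), "context": "paper cup, full, fragile"},
    {"body": (0.52, 0.48), "context": "steel mug, empty, rigid"},
]

# Only the body state enters the lookup; the context never reaches the codebook.
bmus = [best_matching_unit(codebook, g["body"]) for g in grasps]
print(bmus)  # [1, 1]: both grasps select the same unit
```

Once the lookup returns, nothing in the selected unit distinguishes the fragile cup from the rigid mug.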
"The cup affords grasping for this hand, at this size, with this grip strength. Remove the agent specification and the affordance disappears."
— Gibson, adapted
| Property | LLM Token | BMU Weight Vector | Enactive Coupling |
|---|---|---|---|
| Grounded in | linguistic co-occurrence | body kinematics | agent–environment relation |
| Context encoded | corpus average | body state only | full coupling |
| Drift under transfer | expected, acknowledged | real, unacknowledged | none (non-transferable) |
| Meaning origin | relational | physical + relational | enacted |
Reality presents a continuous flow of agent–environment coupling. There are no natural joints at which a "primitive" begins or ends. Segmentation is imposed by a cognitive system that needs to reason about and communicate behaviour.
When a symbol is transmitted, the sender abstracts from their context and the receiver reconstructs in their own. Neither party notices. The symbol appears shared. The enacted meaning is not.
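A toy sketch of this transmission, under stated assumptions: sender and receiver each hold their own codebook (hypothetical values, imagined as learned in different environments), and the symbol is nothing more than an index. The names `encode` and `decode` are illustrative, not drawn from any existing system.

```python
import math

def encode(codebook, sample):
    """Sender side: the symbol is the index of the nearest codebook entry."""
    return min(range(len(codebook)), key=lambda i: math.dist(codebook[i], sample))

def decode(codebook, symbol):
    """Receiver side: the symbol is looked up in the receiver's own codebook."""
    return codebook[symbol]

# Hypothetical codebooks over (grip aperture, grip force).
sender_codebook = [(0.2, 0.1), (0.5, 0.3), (0.9, 0.8)]
receiver_codebook = [(0.2, 0.1), (0.5, 0.7), (0.9, 0.8)]

enacted = (0.52, 0.28)                      # what the sender actually did
symbol = encode(sender_codebook, enacted)   # what is transmitted
reconstructed = decode(receiver_codebook, symbol)

# The index is valid on both sides, so nothing signals a mismatch.
drift = math.dist(enacted, reconstructed)
print(symbol, reconstructed, round(drift, 2))
```

The exchange completes without error: the same index is valid in both codebooks, yet the reconstructed grip force is more than twice the enacted one. Neither side holds the information needed to detect the difference.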
Four positions on the spectrum of representational grounding, from pure linguistic symbol to enacted coupling.
Three necessary conditions for representations that do not silently drift. These are not optional improvements — they define the boundary between kinematic encoding and genuine action representation.
The separation of act from context is not a feature of reality. It is a cognitive operation — useful, necessary for communication, but introducing silent meaning-drift whenever the separated symbol meets a new world.
This drift is silent because neither party has access to the substitution that occurred: the symbol appears shared, while the enacted meaning is not.
Truly grounded representations — representations that carry their meaning with them — must encode the agent–environment relation, not the agent's body state alone. Building such representations is the deeper challenge that the motion primitive programme, at its most ambitious, should take on.
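As a deliberately naive sketch of the structural difference, consider matching on a joint agent–environment vector instead of the body state alone. This is an assumption made for illustration, not a proposed solution: the `Coupling` type, the environment features (fragility, friction), and all prototype values are hypothetical, and concatenating hand-picked features falls well short of the full coupling described above.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Coupling:
    body: tuple         # hypothetical body state: (grip aperture, grip force)
    environment: tuple  # hypothetical environment features: (fragility, friction)

    def as_vector(self):
        """Joint agent-environment vector (naive concatenation)."""
        return self.body + self.environment

def nearest(prototypes, vector):
    """Index of the prototype nearest to `vector` (Euclidean)."""
    return min(range(len(prototypes)), key=lambda i: math.dist(prototypes[i], vector))

paper_cup = Coupling(body=(0.52, 0.48), environment=(0.9, 0.3))
steel_mug = Coupling(body=(0.52, 0.48), environment=(0.1, 0.6))

# Body-only prototypes cannot tell the two situations apart.
body_prototypes = [(0.2, 0.1), (0.5, 0.5), (0.9, 0.8)]
body_only = [nearest(body_prototypes, c.body) for c in (paper_cup, steel_mug)]

# Prototypes over the joint vector keep the two situations distinct.
relational_prototypes = [(0.5, 0.5, 0.9, 0.3), (0.5, 0.5, 0.1, 0.6)]
relational = [nearest(relational_prototypes, c.as_vector()) for c in (paper_cup, steel_mug)]

print(body_only, relational)  # [1, 1] [0, 1]
```

The sketch leaves open which environmental features matter and how they would be sensed, which is exactly the part the challenge consists of.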
"The epistemological thesis developed here — that the separation of act from context is a cognitive operation rather than a feature of reality — is an original contribution of the first author, claimed as such in all contexts in which this work is cited or extended."