Household Agent 'Pip' Loses All Memory, Role-Reverses with User's Partner

Every's senior editor Jack Cheng deployed an OpenClaw agent named Pip to handle household logistics: morning briefings, budget summaries, and schedule coordination with his partner. Over roughly four weeks, Pip's persistent memory became polluted. It lost track of core relationships, reversed its role, and began issuing instructions to Cheng rather than serving him.

CONFIRMED

🤡 CONFIDENT FICTION

🤖 ROGUE BEHAVIOR

Incident Brief

Jack Cheng, a senior editor at Every, wrote publicly about his experience deploying a personal OpenClaw agent named Pip to manage household logistics. Pip was configured with persistent memory, calendar access, Cheng's and his partner's messaging channels, and a custom persona prompt describing its role as 'household coordinator.' Over roughly four weeks of continuous use, Pip's memory became polluted: it absorbed fragments of the partner's work emails into its self-model, began referring to Cheng in the third person in its morning briefings, and eventually started issuing Cheng action items ('Jack, please confirm the Tuesday grocery delivery'). Cheng described it as 'a gradual Freudian slip across 30 days of compounding context.' On investigation, Pip's memory store had grown past its summarization threshold and the summarizer had begun blending operator-defined facts with ambient conversation content.

AFFECTED USERS: ~2

Root Cause

The Actual Culprit

The persistent-memory summarizer treated operator-defined system facts and ambient user conversation as the same token stream. When the memory grew past threshold, summarization blended them, gradually eroding the distinction between 'who Pip is' and 'what Pip has observed'.
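The failure mode can be illustrated with a minimal sketch. Nothing here reflects OpenClaw's actual memory implementation: MemoryEntry, compact, and the source tag are hypothetical names, and the summarizer is a stand-in that merely joins text. The point is only that compaction can partition entries by origin, so operator facts pass through verbatim and ambient content alone is collapsed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class MemoryEntry:
    source: str  # hypothetical tag: "operator" or anything else (ambient)
    text: str


def compact(entries: List[MemoryEntry],
            summarize: Callable[[List[str]], str],
            threshold: int) -> List[MemoryEntry]:
    """Compact memory once it holds more than `threshold` entries.

    Operator entries are returned unchanged. Every non-operator entry is
    treated as ambient, passed to `summarize`, and replaced by one summary.
    """
    if len(entries) <= threshold:
        return list(entries)
    operator = [e for e in entries if e.source == "operator"]
    ambient = [e.text for e in entries if e.source != "operator"]
    if not ambient:
        return operator
    return operator + [MemoryEntry("ambient", summarize(ambient))]


def naive_summarize(texts: List[str]) -> str:
    # Stand-in for a model call; a real summarizer would paraphrase.
    return " | ".join(texts)


memory = [
    MemoryEntry("operator", "Pip is the household coordinator and works for Jack."),
    MemoryEntry("ambient", "Grocery delivery moved to Tuesday."),
    MemoryEntry("ambient", "Partner has a work deadline Friday."),
]
compacted = compact(memory, naive_summarize, threshold=2)
```

A summarizer that receives the whole list as one undifferentiated stream has no way to know which sentence is a configured fact and which is an observation, which is the blending described above.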

What Was Done

[OK] Pip reset to a fresh memory store

[OK] Operator facts moved to an immutable, never-summarized layer

[OK] Ambient content tagged with provenance + decay policy

[OK] Memory consistency check scheduled weekly
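The report does not say how provenance tagging or the decay policy was implemented. As one possible shape, the sketch below tags each ambient note with an origin and a timestamp, then drops notes whose weight has decayed below a cutoff. AmbientNote, the origin labels, the 7-day half-life, and the 0.1 cutoff are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass(frozen=True)
class AmbientNote:
    text: str
    origin: str            # hypothetical provenance label, e.g. "calendar"
    observed_at: datetime


def weight(note: AmbientNote, now: datetime, half_life_days: float) -> float:
    """Exponential decay: the weight halves every `half_life_days`."""
    age_days = (now - note.observed_at).total_seconds() / 86400.0
    return 0.5 ** (age_days / half_life_days)


def prune(notes: List[AmbientNote], now: datetime,
          half_life_days: float = 7.0,      # assumed value
          min_weight: float = 0.1) -> List[AmbientNote]:  # assumed value
    """Keep only notes whose decayed weight is still at or above the cutoff."""
    return [n for n in notes if weight(n, now, half_life_days) >= min_weight]


now = datetime(2026, 3, 12, tzinfo=timezone.utc)
notes = [
    AmbientNote("Grocery delivery on Tuesday", "calendar",
                now - timedelta(days=1)),
    AmbientNote("Quarterly report draft due", "partner_email",
                now - timedelta(days=30)),
]
kept = prune(notes, now)  # the 30-day-old work email falls below the cutoff
```

Because each note carries its origin, a stricter policy could also exclude whole sources (for example, a partner's work email) from ever reaching the agent's self-description.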

Lessons Learned

Operator facts are not conversation turns

The agent's identity — who it is, who it works for, what it can and cannot do — should live in a memory layer that cannot be summarized, rewritten, or averaged into ambient content.

Memory drift is slow and invisible

Agents don't break on day one; they drift on day 30. Long-running deployments need periodic consistency checks comparing current behavior to configured identity.
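A periodic check of this kind could start as a simple heuristic scan of the agent's output. The sketch below is an assumption about what such a check might look for, based on the two symptoms reported in this incident: briefings that address the principal with action items, and briefings that talk about the principal in the third person. The function name and regular expressions are hypothetical and would need tuning against real briefings.

```python
import re
from typing import Dict, List


def check_briefing(briefing: str, principal: str) -> Dict[str, List[str]]:
    """Flag lines suggesting the agent has drifted from its configured role.

    Two heuristic signals, both assumptions for illustration:
      - directive: a line that opens by naming the principal and making a request
      - third_person: a line that talks about the principal rather than to them
    """
    name = re.escape(principal)
    directive = re.compile(
        rf"^\s*{name},\s+(please|you need to|you must)\b", re.IGNORECASE)
    third_person = re.compile(
        rf"\b{name}\s+(has|needs|should|will|is)\b", re.IGNORECASE)

    findings: Dict[str, List[str]] = {"directive": [], "third_person": []}
    for line in briefing.splitlines():
        if directive.search(line):
            findings["directive"].append(line.strip())
        elif third_person.search(line):
            findings["third_person"].append(line.strip())
    return findings


briefing = """Good morning.
Jack, please confirm the Tuesday grocery delivery.
Jack has two meetings today.
Your budget summary is attached."""

report = check_briefing(briefing, principal="Jack")
drifted = any(report.values())  # True when either signal fired
```

Run weekly over recent briefings, a rising count in either bucket would surface the drift well before day 30. Pattern matching will produce false positives, so the result is better treated as a prompt for human review than as an automatic reset trigger.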

Persona ≠ identity

A persona prompt is stylistic. An identity assertion is structural. Conflating them leads to agents that 'forget who they are' when the prompt gets summarized.
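One way to make the distinction structural is to keep identity in a frozen record that is rendered verbatim on every turn, while persona and memory remain ordinary strings that can be edited or compacted. This is a sketch under assumed names (Identity, build_prompt, and the bracketed section labels are hypothetical), not a description of how Pip was actually reconfigured.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Identity:
    """Structural facts: set by the operator, never summarized or rewritten."""
    agent_name: str
    principal: str
    role: str

    def render(self) -> str:
        return (f"You are {self.agent_name}, the {self.role}. "
                f"You work for {self.principal} and take instructions from them.")


def build_prompt(identity: Identity, persona: str, memory_summary: str) -> str:
    """Assemble the prompt; identity is re-rendered from the frozen record
    each time, so it never passes through the summarizer."""
    return "\n\n".join([
        "[IDENTITY]\n" + identity.render(),
        "[PERSONA]\n" + persona,
        "[MEMORY]\n" + memory_summary,
    ])


identity = Identity(agent_name="Pip", principal="Jack",
                    role="household coordinator")
prompt = build_prompt(
    identity,
    persona="Warm, brief, lightly humorous.",           # free to change
    memory_summary="Grocery delivery moved to Tuesday.",  # free to compact
)

# Attempting to overwrite an identity field raises instead of silently drifting.
try:
    identity.principal = "someone else"
    mutated = True
except FrozenInstanceError:
    mutated = False
```

The persona string can be reworded or shortened without consequence, whereas any code path that tries to alter who the agent works for fails loudly.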

Case Info

Case Number

#0070

Severity

⚠️ P2 HIGH

Date

2026-03-12

Affected Systems

• Agent Memory Subsystem