When the cleanup filter suppressed a text delta (because it was buffering
inside a tag), the preceding 'event: content_block_delta' header was still
left in the output, producing malformed SSE that caused opencode to retry
rapidly and freeze. Now removes the event header alongside the data
line.
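
A minimal sketch of the fix described above, assuming the filter sees the SSE stream as individual lines. The function name `filter_sse_lines` and the `suppress` callback are hypothetical, not the gateway's actual API. The idea is to hold each `event:` header back until its `data:` line has been judged, so both are dropped together:

```python
from typing import Callable, Iterable, List, Optional


def filter_sse_lines(lines: Iterable[str], suppress: Callable[[str], bool]) -> List[str]:
    """Drop a suppressed data line together with its event header (hypothetical helper)."""
    out: List[str] = []
    pending_event: Optional[str] = None  # header held back until its data line is judged
    skip_separator = False               # swallow the blank line that ends a dropped event
    for line in lines:
        if line.startswith("event:"):
            pending_event = line
            continue
        if line.startswith("data:") and suppress(line):
            pending_event = None         # discard the header with its data line
            skip_separator = True
            continue
        if line == "" and skip_separator:
            skip_separator = False
            continue
        if pending_event is not None:    # the event survives: emit its header first
            out.append(pending_event)
            pending_event = None
        out.append(line)
    if pending_event is not None:        # trailing header with no data line yet
        out.append(pending_event)
    return out
```

With a `suppress` callback that matches a buffered partial tag, a suppressed event disappears entirely instead of leaving an orphaned header behind.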
Blocks were all getting turn=1 because label_messages used a single
global counter. Now derives turn from message position (each user msg
increments the turn). Also updates turn on already-labeled blocks.
Adds an /api/blocks endpoint to inspect BlockStore state per session.
With turns derived per message, collapse_range(1,72) can correctly
target early turns.
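
As an illustration of the turn derivation, here is a sketch that assumes messages are dicts with a `role` key. `assign_turns` is a hypothetical name, and treating messages that precede the first user message as turn 1 is an assumption:

```python
from typing import Dict, List


def assign_turns(messages: List[Dict[str, str]]) -> List[int]:
    """Return one turn number per message; each user message starts a new turn."""
    turn = 0
    turns: List[int] = []
    for msg in messages:
        if msg.get("role") == "user":
            turn += 1
        # Assumption: anything before the first user message belongs to turn 1.
        turns.append(max(turn, 1))
    return turns
```

Because the turn depends only on message position, re-running this over an already-labeled conversation produces the same numbers, which is what makes updating existing blocks safe.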
Long cleanup tags (e.g. collapse summaries) can span 30+ SSE deltas.
The safety valve was flushing after 6 deltas regardless, dumping
incomplete tags into the output. Now only flushes when buffering a
partial opener (<m, <y) that never resolved — never when inside a
confirmed tag.
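
The revised safety-valve rule can be sketched as a single predicate. The name `should_flush` and the exact comparison against the limit are assumptions; only the 6-delta figure comes from the description above:

```python
STALE_DELTA_LIMIT = 6  # the old valve flushed after 6 deltas regardless of state


def should_flush(inside_confirmed_tag: bool, deltas_buffered: int,
                 limit: int = STALE_DELTA_LIMIT) -> bool:
    """Decide whether a buffered fragment should be released to the client."""
    if inside_confirmed_tag:
        # A confirmed tag may span 30+ deltas; keep buffering until it closes.
        return False
    # Only a partial opener (e.g. "<m", "<y") that never resolved is flushed.
    return deltas_buffered >= limit
```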
The model emits cleanup ops as XML elements (<drop>block:x</drop>,
<release handle="x"/>, <collapse>turns N-M "summary"</collapse>)
but the parser only handled prose format (drop: block:x). Add XML
regex matchers alongside the existing prose parser so both formats
are recognized, executed, and stripped from the streaming output.
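
A sketch of what XML matchers for the three op shapes might look like. The patterns and the `parse_xml_ops` helper are illustrative assumptions, not the gateway's actual regexes, and ops are grouped by kind rather than by position in the text:

```python
import re
from typing import List, Tuple

DROP_RE = re.compile(r"<drop>\s*(block:[^<\s]+)\s*</drop>")
RELEASE_RE = re.compile(r'<release\s+handle="([^"]+)"\s*/>')
COLLAPSE_RE = re.compile(r'<collapse>\s*turns\s+(\d+)-(\d+)\s+"([^"]*)"\s*</collapse>')


def parse_xml_ops(text: str) -> Tuple[List[Tuple[str, object]], str]:
    """Return (ops, text with the op elements stripped)."""
    ops: List[Tuple[str, object]] = []
    for m in DROP_RE.finditer(text):
        ops.append(("drop", m.group(1)))
    for m in RELEASE_RE.finditer(text):
        ops.append(("release", m.group(1)))
    for m in COLLAPSE_RE.finditer(text):
        ops.append(("collapse", (int(m.group(1)), int(m.group(2)), m.group(3))))
    stripped = text
    for pattern in (DROP_RE, RELEASE_RE, COLLAPSE_RE):
        stripped = pattern.sub("", stripped)
    return ops, stripped
```

Running the prose parser on the stripped text afterwards would keep both formats working side by side.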
The FidelityManager's internal pressure calculation uses its own tracked
object tokens divided by window_size, which is always tiny compared to
the real context. Temporarily scale window_size so the FM's pressure
matches the actual API input_tokens/window ratio, triggering L0→L1→L2
degradations when context exceeds 50%.
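
The scaling arithmetic can be worked through as follows. This is one way to derive the temporary window size, assuming the FM computes pressure as tracked tokens divided by window_size; the function and parameter names are hypothetical:

```python
def scaled_window_size(tracked_tokens: float, api_input_tokens: float,
                       real_window: float) -> float:
    """Window size that makes tracked_tokens / window equal api_input_tokens / real_window."""
    if tracked_tokens <= 0 or api_input_tokens <= 0:
        return real_window  # nothing to scale against; leave the window unchanged
    return tracked_tokens * real_window / api_input_tokens


# Example: the FM tracks 2,000 tokens, but the API reports 120,000 input tokens
# against a 200,000-token window, i.e. a real pressure of 0.6.
window = scaled_window_size(2_000, 120_000, 200_000)
pressure = 2_000 / window
```

With the unscaled window the same FM would compute 2,000 / 200,000 = 0.01 and never cross the 50% threshold.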
Measure incoming_bytes before _preprocess() so bytes_saved reflects true
reduction. Add SSECleanupFilter that intercepts memory_cleanup/yuyay-response
tags in streaming responses, strips them from output, and executes ops
(drops, collapses, releases) in real time. Handles partial tags split across
SSE chunks, with a safety valve that flushes stale buffers that turn out to
be ordinary prose.
The Anthropic SDK appends /messages to baseURL, so the gateway
baseURL must include /v1. Also removes the static baseURL from
opencode.json — the plugin now injects it dynamically only when the
gateway health check passes, so requests fall through directly to
api.anthropic.com when Mnemosyne is not running.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
The /api/benchmark response uses total_admitted, total_rejected,
total_attempts etc., but the dashboard JS was reading admitted, rejected,
attempts. Also fixes the fidelity bar to read the fidelity_distribution key
and the session table to derive context reduction from the byte ratio.
Strip <yuyay-response>, <yuyay-manifest>, and <yuyay-query> tags from
the SSE stream before forwarding to the client. The cooperative memory
protocol tags are still processed by the gateway on the next inbound
request — they just no longer leak into the user's visible output.
Handles tags spanning across multiple text_delta SSE events.
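
A simplified sketch of stripping tags that span several text deltas. The `TagStripper` class is hypothetical and ignores the SSE framing; it only shows the buffering idea of holding back text from the first possible opener until the block is complete:

```python
import re

TAGS = ("yuyay-response", "yuyay-manifest", "yuyay-query")
# A complete block: opening tag, any content, matching closing tag.
_BLOCK = re.compile(r"<(%s)\b[^>]*>.*?</\1>" % "|".join(re.escape(t) for t in TAGS),
                    re.DOTALL)
_OPENERS = tuple("<" + t for t in TAGS)


class TagStripper:
    def __init__(self) -> None:
        self.buf = ""

    def feed(self, delta: str) -> str:
        """Add one text delta; return the text that is safe to forward."""
        self.buf = _BLOCK.sub("", self.buf + delta)  # remove completed blocks
        hold = self._hold_index()
        out, self.buf = self.buf[:hold], self.buf[hold:]
        return out

    def flush(self) -> str:
        """End of stream: release whatever is still buffered."""
        out, self.buf = self.buf, ""
        return out

    def _hold_index(self) -> int:
        # First '<' that is a partial opener or the start of an unclosed protocol tag.
        for i, ch in enumerate(self.buf):
            if ch != "<":
                continue
            tail = self.buf[i:]
            if any(tail.startswith(o) or o.startswith(tail) for o in _OPENERS):
                return i
        return len(self.buf)
```

Feeding the deltas `"Hello <yu"`, `"yay-response>secret"`, `" more</yuyay-"`, `"response> world"` forwards `"Hello "` first and `" world"` last, with nothing from inside the tag reaching the client.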
opencode validates tool calls against its own registry and rejects
unknown tools like memory_query. The opencode plugin provides
mnemosyne_query/mnemosyne_status as registered MCP tools instead.
TypeScript plugin that injects baseURL to route Anthropic API calls
through the Mnemosyne gateway, enriches compaction with memory context,
and provides mnemosyne_status/mnemosyne_query custom tools.
Admission control, entropy-based micro-faulting, phantom tool
injection for backing store queries, and xMemory session hierarchy
for long conversations (50+ turns).
Object-addressed memory: segment messages into semantic objects,
embed with sentence-transformers, store in pgvector-backed store,
and reassemble context via goal-aware retrieval.
5-level fidelity manager (L0-Full to L4-Evicted) with helper LLM
(Haiku 4.5) for intelligent summarization during degradation.