26 commits

Author SHA1 Message Date
f5c2c91057 fix: remove orphaned SSE event headers when suppressing text deltas
When the cleanup filter suppresses a text delta (buffering inside a
tag), the preceding 'event: content_block_delta' header was left in
the output, producing malformed SSE that caused opencode to retry
rapidly and freeze. Now removes the event header alongside the data
line.
2026-03-13 21:39:48 -06:00
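The fix above can be illustrated with a minimal sketch that treats each SSE event (header plus data line) as one unit, so a suppressed delta never leaves an orphaned `event:` line behind. The function name `filter_sse_events` and the `should_suppress` predicate are hypothetical and are not taken from the project's code.

```python
import json
from typing import Callable


def filter_sse_events(raw: str, should_suppress: Callable[[dict], bool]) -> str:
    """Drop whole SSE events (header + data) whose payload is suppressed.

    Hypothetical helper: events are assumed to be separated by a blank line,
    as in the SSE wire format.
    """
    kept = []
    for block in raw.split("\n\n"):
        if not block.strip():
            continue  # skip empty fragments produced by the trailing separator
        data_lines = [
            line[len("data:"):].strip()
            for line in block.split("\n")
            if line.startswith("data:")
        ]
        if data_lines:
            try:
                payload = json.loads("\n".join(data_lines))
            except ValueError:
                payload = None  # non-JSON data is passed through untouched
            if isinstance(payload, dict) and should_suppress(payload):
                continue  # removes the 'event:' header together with the data line
        kept.append(block)
    return "".join(block + "\n\n" for block in kept)
```

A caller would pass a predicate such as `lambda p: p.get("delta", {}).get("text", "").startswith("<")`; the real filter's buffering decision is more involved than this placeholder.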
2c42f9b52a fix: assign conversation turn numbers to blocks and add /api/blocks debug endpoint
Blocks were all getting turn=1 because label_messages used a single
global counter. Now derives turn from message position (each user msg
increments the turn). Also updates turn on already-labeled blocks.
Adds /api/blocks endpoint to inspect BlockStore state per session.
This enables collapse_range(1,72) to correctly target early turns.
2026-03-13 21:35:19 -06:00
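Deriving the turn from message position, as described above, might look like the following sketch. `assign_turns` is a hypothetical stand-in for the relevant part of `label_messages`, and placing messages that precede the first user message in turn 1 is an assumption.

```python
from typing import Dict, List


def assign_turns(messages: List[Dict[str, str]]) -> List[int]:
    """Return one turn number per message; each user message starts a new turn."""
    turns: List[int] = []
    turn = 0
    for msg in messages:
        if msg.get("role") == "user":
            turn += 1
        # Assumption: anything before the first user message counts as turn 1.
        turns.append(max(turn, 1))
    return turns
```

With turns assigned this way, a range operation such as `collapse_range(1, 72)` can select blocks by comparing their stored turn against the range bounds.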
e0af1edadf fix: safety valve only flushes partial openers, not real tags
Long cleanup tags (e.g. collapse summaries) can span 30+ SSE deltas.
The safety valve was flushing after 6 deltas regardless, dumping
incomplete tags into the output. Now only flushes when buffering a
partial opener (<m, <y) that never resolved — never when inside a
confirmed tag.
2026-03-13 21:27:08 -06:00
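The flush rule described above reduces to a small decision function. This is a sketch only: the opener strings are inferred from the tag names mentioned in the log, the buffer is assumed to start at the `<` that triggered buffering, and `should_flush` is a hypothetical name.

```python
# Assumed opening-tag prefixes (tag names come from the commit log; exact syntax is a guess).
OPENERS = ("<memory_cleanup", "<yuyay-response")
MAX_PARTIAL_DELTAS = 6  # threshold mentioned in the commit message


def should_flush(buffer: str, deltas_buffered: int) -> bool:
    """Flush only a stale partial opener, never the body of a confirmed tag."""
    inside_confirmed_tag = any(buffer.startswith(opener) for opener in OPENERS)
    if inside_confirmed_tag:
        return False  # long tags may span 30+ deltas; keep buffering
    return deltas_buffered >= MAX_PARTIAL_DELTAS
```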
ad2c296ba3 fix: parse XML-format cleanup tags and strip from SSE stream
The model emits cleanup ops as XML elements (<drop>block:x</drop>,
<release handle="x"/>, <collapse>turns N-M "summary"</collapse>)
but the parser only handled prose format (drop: block:x). Add XML
regex matchers alongside the existing prose parser so both formats
are recognized, executed, and stripped from the streaming output.
2026-03-13 21:23:26 -06:00
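A sketch of recognizing both formats with regular expressions follows. The patterns are reconstructed from the three XML examples and the one prose example in the message above; the function names and the tuple shape of the returned ops are hypothetical.

```python
import re
from typing import List, Tuple

# Patterns inferred from the examples in the commit message (assumptions).
DROP_XML = re.compile(r"<drop>\s*(block:[^<\s]+)\s*</drop>")
RELEASE_XML = re.compile(r'<release\s+handle="([^"]+)"\s*/>')
COLLAPSE_XML = re.compile(
    r'<collapse>\s*turns\s+(\d+)-(\d+)\s+"([^"]*)"\s*</collapse>', re.DOTALL
)
DROP_PROSE = re.compile(r"^[ \t]*drop:[ \t]*(block:\S+)[ \t]*$", re.MULTILINE)

ALL_PATTERNS = (DROP_XML, RELEASE_XML, COLLAPSE_XML, DROP_PROSE)


def parse_cleanup_ops(text: str) -> List[Tuple]:
    """Collect cleanup operations from either XML or prose syntax."""
    ops: List[Tuple] = []
    for pattern in (DROP_XML, DROP_PROSE):
        ops.extend(("drop", m.group(1)) for m in pattern.finditer(text))
    ops.extend(("release", m.group(1)) for m in RELEASE_XML.finditer(text))
    ops.extend(
        ("collapse", int(m.group(1)), int(m.group(2)), m.group(3))
        for m in COLLAPSE_XML.finditer(text)
    )
    return ops


def strip_cleanup_ops(text: str) -> str:
    """Remove every recognized cleanup op from the visible text."""
    for pattern in ALL_PATTERNS:
        text = pattern.sub("", text)
    return text
```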
65e4e38a98 fix: scale FM window_size to match real API pressure for fidelity degradation
The FidelityManager's internal pressure calculation uses its own tracked
object tokens divided by window_size, which is always tiny compared to
the real context. Temporarily scale window_size so the FM's pressure
matches the actual API input_tokens/window ratio, triggering L0→L1→L2
degradations when context exceeds 50%.
2026-03-13 21:13:38 -06:00
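The scaling described above amounts to solving tracked_tokens / scaled_window = input_tokens / real_window for the scaled window. A minimal sketch, with a hypothetical function name and parameter names:

```python
def scaled_window_size(tracked_tokens: int, api_input_tokens: int, real_window: int) -> int:
    """Pick a window_size so the FM's internal pressure equals the real API pressure.

    FM pressure is assumed to be tracked_tokens / window_size, as the commit
    message states; all names here are illustrative.
    """
    if tracked_tokens <= 0 or api_input_tokens <= 0:
        return real_window  # nothing to scale against; keep the real window
    real_pressure = api_input_tokens / real_window
    return max(1, round(tracked_tokens / real_pressure))
```

For instance, with 2,000 tracked tokens and a real context at 60% of a 200,000-token window, the scaled window comes out near 3,333 tokens, so the FM also sees roughly 60% pressure and crosses the 50% degradation threshold.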
92fba55f70 fix: accurate context reduction stats and SSE cleanup tag filter
Measure incoming_bytes before _preprocess() so bytes_saved reflects true
reduction. Add SSECleanupFilter that intercepts memory_cleanup/yuyay-response
tags in streaming responses, strips them from output, and executes ops
(drops, collapses, releases) in real-time. Handles partial tags split across
SSE chunks with a safety valve to flush stale buffers for prose.
2026-03-13 21:07:52 -06:00
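The ordering issue in the first sentence above can be sketched as follows: the incoming size has to be captured before preprocessing, because a preprocessing step that mutates the request in place would otherwise make both measurements identical. `measure_reduction` and the `preprocess` callable are hypothetical stand-ins, and measuring size as serialized JSON bytes is an assumption.

```python
import json
from typing import Callable, Dict


def measure_reduction(body: dict, preprocess: Callable[[dict], dict]) -> Dict[str, int]:
    """Measure request size before and after preprocessing."""
    # Capture the size first: preprocess() may mutate `body` in place.
    incoming_bytes = len(json.dumps(body).encode("utf-8"))
    outbound = preprocess(body)
    outgoing_bytes = len(json.dumps(outbound).encode("utf-8"))
    return {
        "incoming_bytes": incoming_bytes,
        "outgoing_bytes": outgoing_bytes,
        "bytes_saved": incoming_bytes - outgoing_bytes,
    }
```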
2bf6baaa33 fix: rebuild session history for undo and segment only new messages
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 15:40:48 -06:00
6719d3f3f0 fix: render collapsed turn summaries in outbound context
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 15:40:47 -06:00
235e88d416 fix: route mnemosyne provider instead of anthropic in opencode plugin
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 13:38:26 -06:00
5702a5a1e2 fix: wire bytes_saved through benchmark, restore _check_token_cap, apply block cleanup to outbound
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 13:37:33 -06:00
fa1f27bad5 fix: remove YuyayStreamFilter so pichay can receive yuyay-response blocks
2026-03-13 12:40:18 -06:00
59cce5c6d2 fix: pass real incoming/outgoing bytes to record_turn for context reduction stats
2026-03-13 12:35:57 -06:00
8df3f4f2b7 feat: add systemd user service for mnemosyne auto-restart
2026-03-13 12:07:17 -06:00
f8f85aea47 fix: inject /v1 suffix in baseURL and skip injection when gateway is down
The Anthropic SDK appends /messages to baseURL, so the gateway
baseURL must include /v1. Also removes the static baseURL from
opencode.json — the plugin now injects it dynamically only when the
gateway health check passes, so requests fall through directly to
api.anthropic.com when Mnemosyne is not running.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 11:59:50 -06:00
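The injection rule above can be sketched as a single function. The actual plugin is TypeScript; this Python version only illustrates the logic, and `resolve_base_url` and the injected `is_healthy` callable are hypothetical.

```python
from typing import Callable, Optional


def resolve_base_url(gateway_url: str, is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the baseURL to inject, or None to leave the SDK default untouched.

    Per the commit message, the SDK appends /messages to baseURL, so the
    gateway URL must end in /v1.
    """
    root = gateway_url.rstrip("/")
    if not is_healthy(root):
        return None  # gateway down: requests go straight to api.anthropic.com
    return root if root.endswith("/v1") else root + "/v1"
```

Passing the health check in as a callable keeps the sketch free of network code; a real implementation would probe the gateway's health endpoint, whose path is not given in the log.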
a822c497c5 fix: correct dashboard JS field mappings for benchmark API
The /api/benchmark response uses total_admitted, total_rejected,
total_attempts etc. but the dashboard JS was reading admitted, rejected,
attempts. Also fixes the fidelity bar to read the fidelity_distribution key and
the session table to derive context reduction from the byte ratio.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 11:59:41 -06:00
bee90915db feat: add SSE stream filter for yuyay protocol tags
Strip <yuyay-response>, <yuyay-manifest>, and <yuyay-query> tags from
the SSE stream before forwarding to the client. The cooperative memory
protocol tags are still processed by the gateway on the next inbound
request — they just no longer leak into the user's visible output.

Handles tags spanning across multiple text_delta SSE events.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 11:49:12 -06:00
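Stripping a tag that spans several text deltas requires holding back any trailing text that could be the start of an opening tag. The sketch below shows one way to do that for a single tag name; `TagStripper` is a hypothetical class, and it assumes opening tags carry no attributes.

```python
class TagStripper:
    """Strip <tag>...</tag> spans from a stream of text deltas (illustrative sketch)."""

    def __init__(self, tag: str) -> None:
        self.open = f"<{tag}>"
        self.close = f"</{tag}>"
        self.buffer = ""
        self.in_tag = False

    def _partial_opener_len(self) -> int:
        # Longest buffer suffix that is a proper prefix of the opening tag.
        for k in range(min(len(self.open) - 1, len(self.buffer)), 0, -1):
            if self.buffer.endswith(self.open[:k]):
                return k
        return 0

    def feed(self, delta: str) -> str:
        """Consume one delta and return the text that is safe to forward."""
        self.buffer += delta
        out = []
        while True:
            if self.in_tag:
                end = self.buffer.find(self.close)
                if end == -1:
                    # Keep only a tail that might begin the closing tag.
                    self.buffer = self.buffer[-(len(self.close) - 1):]
                    break
                self.buffer = self.buffer[end + len(self.close):]
                self.in_tag = False
            else:
                start = self.buffer.find(self.open)
                if start != -1:
                    out.append(self.buffer[:start])
                    self.buffer = self.buffer[start + len(self.open):]
                    self.in_tag = True
                    continue
                keep = self._partial_opener_len()
                cut = len(self.buffer) - keep
                out.append(self.buffer[:cut])
                self.buffer = self.buffer[cut:]
                break
        return "".join(out)

    def flush(self) -> str:
        """At end of stream, release held text unless it sits inside a tag."""
        held = "" if self.in_tag else self.buffer
        self.buffer = ""
        return held
```

One instance per tag name (`yuyay-response`, `yuyay-manifest`, `yuyay-query`) could be chained, feeding each stripper's output into the next.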
6b9b3df64d fix: disable phantom tool injection when proxying for opencode
opencode validates tool calls against its own registry and rejects
unknown tools like memory_query. The opencode plugin provides
mnemosyne_query/mnemosyne_status as registered MCP tools instead.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 11:43:56 -06:00
7c6a3dbe4a docs: add architecture and reference documentation
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 11:41:41 -06:00
b21871b8fc feat: add opencode plugin for Mnemosyne routing
TypeScript plugin that injects baseURL to route Anthropic API calls
through the Mnemosyne gateway, enriches compaction with memory context,
and provides mnemosyne_status/mnemosyne_query custom tools.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 11:41:36 -06:00
9b25b33a50 test: add gateway integration tests
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 11:41:28 -06:00
d660414ad7 feat: add benchmarking, auth, and utility modules
CLI benchmark command, threshold auto-tuning, OAuth PKCE auth
(same flow as Claude Code), cost tracking, telemetry, and replay.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 11:41:22 -06:00
681c1454cb feat: add memory management pipeline
Admission control, entropy-based micro-faulting, phantom tool
injection for backing store queries, and xMemory session hierarchy
for long conversations (50+ turns).

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 11:41:12 -06:00
a13719f754 feat: add object store with semantic segmentation
Object-addressed memory: segment messages into semantic objects,
embed with sentence-transformers, store in pgvector-backed store,
and reassemble context via goal-aware retrieval.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 11:41:04 -06:00
d26c56c2f0 feat: add multi-fidelity compression engine
5-level fidelity manager (L0-Full to L4-Evicted) with helper LLM
(Haiku 4.5) for intelligent summarization during degradation.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 11:40:56 -06:00
974863e7b3 feat: add core proxy framework with gateway and providers
Multi-provider HTTP proxy (Anthropic + OpenAI) with session management,
message processing pipeline, block labeling, cache control placement,
and embedded monitoring dashboard.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 11:40:49 -06:00
ed0361f97c chore: initialize project scaffold and config
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-13 11:40:35 -06:00