
ADR: Piece Modifier Profiles Architecture

This document captures the eight architectural decisions that shape the T1 piece-modifier-profiles feature, followed by implementation retrospectives and the T2/T3 decisions that build on them. Each section is self-contained and records the decision, the reasoning behind it, and the alternatives that were considered and rejected.


ADR-1: Move Generator Hook Contract — WRAP (not REPLACE)

Decision

Introduce a new hook signature on modifier/preset descriptors:

transformMoveGenerator?(engine, pieceId, prevGenerator) => MoveGenerator

Each hook receives the previous generator in the chain and returns a new one. Multiple presets plus the active profile can all contribute, composing in precedence order. The generator returned by the chain emits pseudo-legal moves only.
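The WRAP contract can be sketched as a simple fold over the hook list. This is an illustrative sketch only; `Move`, `MoveGenerator`, and the hook shape below are assumptions, not the engine's real types.

```typescript
// Hypothetical shapes; the real engine types differ.
type Move = { from: string; to: string };
type MoveGenerator = () => Move[];
type Hook = (prev: MoveGenerator) => MoveGenerator;

// Wrap hooks in precedence order: each hook receives the previous
// generator in the chain and returns a new one; none may bypass it.
function buildGeneratorChain(base: MoveGenerator, hooks: Hook[]): MoveGenerator {
  return hooks.reduce((prev, hook) => hook(prev), base);
}

// Example: a hook that adds one extra pseudo-legal move on top of the base set.
const base: MoveGenerator = () => [{ from: "b1", to: "c3" }];
const addMove: Hook = (prev) => () => [...prev(), { from: "b1", to: "a3" }];
const chained = buildGeneratorChain(base, [addMove]);
// chained() yields both moves; the legality layer filters downstream.
```

Because every hook must call `prev`, no single contributor can silently discard another's moves, which is the property REPLACE semantics would lose.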

Rationale

  • Composition over replacement. Several modifier sources (presets, per-type profile entries, per-instance profile entries) must be able to stack without one clobbering another.
  • Clear layering. The engine's legality layer (self-check filter, turn validation, repetition detection) always runs downstream of the generator chain. Hooks never need to reason about whole-board legality — only about the piece's own movement facts.
  • Deterministic ordering. Precedence (ADR-5) uniquely defines the order in which generators are wrapped, so the final generator is reproducible from the profile alone.

Rejected Alternatives

  • REPLACE semantics (only the first/highest-priority hook wins): kills composition. Two independent modifiers that both want to alter movement could not coexist, forcing users to choose one or hand-merge them.
  • Mid-chain interception (hooks can peek at or mutate moves emitted by other hooks): dramatically more complex to reason about, breaks locality, and makes determinism hard to audit.

ADR-2: Per-Instance Identity Keying — Layout-Slot Bound (Option B)

Decision

Per-instance modifier entries in a profile are keyed by the piece's starting square in algebraic notation (e.g. "b1"). The profile document carries an optional layoutId field. At game start, applyProfileToSession resolves each square → EntityId by consulting the layout's piece placement array.

Cross-layout reuse is permitted for the per-type portion of a profile. The per-instance portion is dropped with a warning when the active layout does not contain the referenced square (orphan handling — see ADR-6 check #3).
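A minimal sketch of the square-to-entity resolution with orphan dropping; the `Layout` shape and `resolvePerInstance` name are assumptions for illustration, not the real `applyProfileToSession` API.

```typescript
type EntityId = number;
// Assumed layout shape: starting square (algebraic) -> entity id.
type Layout = { placements: Record<string, EntityId> };

function resolvePerInstance(
  perInstance: Record<string, unknown>,
  layout: Layout,
): { resolved: Map<EntityId, unknown>; warnings: string[] } {
  const resolved = new Map<EntityId, unknown>();
  const warnings: string[] = [];
  for (const [square, entry] of Object.entries(perInstance)) {
    const id = layout.placements[square];
    if (id === undefined) {
      // Orphan: square absent from the active layout (ADR-6 check #3).
      // Dropped with a warning rather than rejecting the whole profile.
      warnings.push(`E_PROFILE_ORPHAN_INSTANCE: no piece starts on ${square}`);
      continue;
    }
    resolved.set(id, entry);
  }
  return { resolved, warnings };
}
```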

Rationale

  • Human-readable. Profile JSON (and any URL-encoded form) stays legible: a designer can see "b1": { ... } and immediately know which piece it targets.
  • Portable within a layout. Two sessions using the same layout can share the profile verbatim.
  • Graceful degradation. When a profile is applied against a different layout, per-type rules still apply; only the now-meaningless per-instance keys are dropped, with a surfaced warning.

Rejected Alternatives

  • Option A — key by EntityId. EntityIds are session-scoped runtime handles; they are meaningless outside the session that produced them. This would make profiles non-portable and impossible to author by hand.
  • Option C — defer per-instance entirely to T2. Per-instance keying is the one feature that makes targeting "this specific rook on b1", as opposed to "all rooks", possible in T1; deferring it would gut the feature's value.

ADR-3: Hot-Swap Reconciliation Rules

Decision

When a profile is swapped on a live game, each modifier kind reconciles according to the following table:

| Modifier | On swap-apply | On swap-remove |
| --- | --- | --- |
| HpBonus | maxHp recomputed; currentHp = min(currentHp, newMax) | currentHp never grows (same rule) |
| RangeBonus | Overwrite fact; next legal-move query reflects new range | Revert to base; same query semantics |
| DirectionAdditions | Overwrite fact; next legal-move query includes/excludes | Revert to base direction set |
| CaptureFlags | Overwrite; applies to the next capture event | Revert; applies to next capture |
| PromotionOverride | Overwrite; applies to future promotions only | Revert; future promotions use base |
| DamageResistance | Overwrite; applies to the next damage event | Revert; next damage uses base |

Timing. Hot-swaps are applied at the turn boundary only — after turnEnd fires and before the next turnStart. Any update received mid-turn is queued and flushed at that boundary.

Failure / rollback. The server is authoritative and clients hold no optimistic state for profile changes. If a swap is rejected (legality validator fails, permission denied, etc.) the server sends a NACK to the originating client and performs no broadcast. There is no per-event rollback to unwind partially-applied effects, because effects do not apply until the boundary.

Rationale

  • Determinism. Applying changes strictly at turn boundaries means every observer (players, spectators, replays) sees an identical sequence of (state, swap, state, swap) transitions.
  • HP invariant. The "never grows" rule for currentHp preserves the intuition that removing a buff should not heal; applying a buff raises the ceiling but never the floor.
  • Server authority. Keeping clients dumb about swap outcomes avoids a whole class of desync bugs; the NACK pattern keeps the protocol simple.
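The HP invariant can be stated in a few lines. This is a sketch under assumed shapes: the base-plus-bonus decomposition of `maxHp` is an illustration, not necessarily how the engine stores it.

```typescript
// "Never grows" reconciliation: recompute the ceiling from the new
// modifier set, then clamp the current value down to it.
function reconcileHp(
  currentHp: number,
  baseMaxHp: number,
  hpBonus: number, // 0 when the buff is removed
): { maxHp: number; currentHp: number } {
  const maxHp = baseMaxHp + hpBonus; // maxHp recomputed on apply and remove
  // A buff raises the ceiling but never the floor; removing one never heals.
  return { maxHp, currentHp: Math.min(currentHp, maxHp) };
}
```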

Rejected Alternatives

  • Mid-turn application. Introduces ordering ambiguity relative to queued events, move-generation caches, and partially-resolved attacks; determinism becomes painful to reason about.
  • Per-event rollback. Would require journaling every effect so it can be unwound on failure. T1 does not need this level of sophistication, and the boundary-only rule makes it unnecessary.

ADR-4: Stacking Rules per Modifier Kind

Decision

Each T1 modifier kind declares a fixed stacking rule, applied whenever more than one source contributes a value (per ADR-5 precedence):

| Modifier | Stacking rule | Formula |
| --- | --- | --- |
| HpBonus | Additive | sum of all sources |
| RangeBonus | Additive, clamped [0, 7] | sum, then clamp |
| DirectionAdditions | Union | set union of arrays |
| CaptureFlags | Union (bitwise OR) | flags OR'd together |
| PromotionOverride | Precedence wins | perInstance > perType > preset |
| DamageResistance | Multiplicative | 1 − ∏(1 − r_i), clamped ≥ 0 |

Rationale

  • Additive/union kinds have naturally commutative, associative semantics, so they can safely be computed in any order.
  • Clamping RangeBonus to [0, 7] keeps generators sane: on an 8×8 board, no piece ever needs more than 7 steps of range.
  • Multiplicative DamageResistance matches player intuition: two sources of 50% resistance yield 75% total, not 100%. This prevents accidental invulnerability from additive stacking.
  • PromotionOverride as "precedence wins" reflects that overrides are replacement-style facts — two simultaneous overrides cannot meaningfully merge, so the highest-priority one is chosen.
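The four stacking shapes from the table reduce to small pure functions. A sketch with illustrative names (the engine's real identifiers may differ):

```typescript
const clamp = (v: number, lo: number, hi: number) => Math.min(hi, Math.max(lo, v));

// Additive: plain sum (HpBonus).
const stackAdditive = (values: number[]) => values.reduce((a, b) => a + b, 0);

// Additive, clamped to [0, 7] (RangeBonus).
const stackRange = (values: number[]) => clamp(stackAdditive(values), 0, 7);

// Union (DirectionAdditions; CaptureFlags is the same idea over bit flags).
const stackUnion = <T>(sets: T[][]) => [...new Set(sets.flat())];

// Multiplicative resistance: 1 - prod(1 - r_i), clamped >= 0.
const stackResistance = (values: number[]) =>
  Math.max(0, 1 - values.reduce((acc, r) => acc * (1 - r), 1));
```

For example, `stackResistance([0.5, 0.5])` yields 0.75, matching the rationale: two 50% sources compose multiplicatively rather than summing to invulnerability.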

Rejected Alternatives

  • All-additive (including resistance): trivially produces 100% resistance with two modest sources, breaking balance.
  • All-overwrite: kills the expressive power of stacking entirely and forces designers to pre-merge any combination of modifiers they want.

ADR-5: Precedence Chain

Decision

When multiple sources contribute to the same piece and modifier kind, the priority order is:

per-instance (profile)  >  per-type (profile)  >  preset  >  engine base
  • For additive and union stacking rules (see ADR-4), all sources contribute; precedence only controls the order in which hooks wrap the move generator (ADR-1).
  • For override kinds (e.g. PromotionOverride), only the highest-priority source wins.
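For override kinds, resolution is a first-match walk down the chain. The `Source` names below mirror the chain above; the function itself is an illustrative sketch.

```typescript
type Source = "perInstance" | "perType" | "preset" | "base";

// Most specific first, matching the precedence chain above.
const PRECEDENCE: Source[] = ["perInstance", "perType", "preset", "base"];

// Pick the value from the most specific source that supplies one.
function resolveOverride<T>(contributions: Partial<Record<Source, T>>): T | undefined {
  for (const source of PRECEDENCE) {
    const value = contributions[source];
    if (value !== undefined) return value;
  }
  return undefined;
}
```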

Rationale

  • Specificity first. Per-instance data is the most specific statement ("this piece on b1"), per-type is broader ("all bishops"), preset is broader still ("this game mode"), and engine base is the default. Higher specificity winning matches designer expectations and mirrors how CSS, config layering, and similar systems behave.
  • Composable by default. Treating additive/union kinds as contributors rather than gated by precedence means a per-type buff and a per-instance buff can both apply, which is the whole point of having two layers.

Rejected Alternatives

  • Preset-first (preset beats profile): undermines user-authored profiles and makes game modes unmodifiable.
  • Flat merge with no precedence: leaves override-style modifiers ambiguous.

ADR-6: Legality Validator Checklist

Decision

On every profile apply and every hot-swap, the engine runs a legality validator with five checks. Each check has a stable error code for client-side reporting.

  1. Both sides have ≥ 1 king. Code: E_PROFILE_NO_KING. Error.
  2. No king carries CaptureFlags = CANNOT_BE_CAPTURED. Code: E_PROFILE_INVULN_KING. Error.
  3. No orphan per-instance entries (per-instance key references a square not present in the active layout). Code: E_PROFILE_ORPHAN_INSTANCE. Warning only, not a rejection — the orphan entry is dropped per ADR-2.
  4. Each side has ≥ 1 legal move from the current position. Code: E_PROFILE_DEADLOCK. Error.
  5. Total resolved facts per piece ≤ 16 attributes. Code: E_PROFILE_ATTR_LIMIT. Error.

Checks that emit an error cause the apply/swap to be rejected atomically; no partial state is observable. Warnings are surfaced but do not block application.
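The checklist maps naturally onto a findings list with severities. The sketch below stands in for the real engine calls: the `SessionView` shape is an assumption, with each field pre-computed elsewhere.

```typescript
type Severity = "error" | "warning";
type Finding = { code: string; severity: Severity };

// Assumed pre-computed view of the session; not the real engine API.
type SessionView = {
  kingsPerSide: number[];      // [white, black]
  invulnerableKings: number;
  orphanSquares: string[];
  legalMovesPerSide: number[]; // [white, black]
  maxFactsPerPiece: number;
};

function validateProfile(s: SessionView): { findings: Finding[]; rejected: boolean } {
  const findings: Finding[] = [];
  if (s.kingsPerSide.some((n) => n < 1))
    findings.push({ code: "E_PROFILE_NO_KING", severity: "error" });
  if (s.invulnerableKings > 0)
    findings.push({ code: "E_PROFILE_INVULN_KING", severity: "error" });
  s.orphanSquares.forEach(() =>
    findings.push({ code: "E_PROFILE_ORPHAN_INSTANCE", severity: "warning" }),
  );
  if (s.legalMovesPerSide.some((n) => n < 1))
    findings.push({ code: "E_PROFILE_DEADLOCK", severity: "error" });
  if (s.maxFactsPerPiece > 16)
    findings.push({ code: "E_PROFILE_ATTR_LIMIT", severity: "error" });
  // Any error rejects atomically; warnings surface but do not block.
  return { findings, rejected: findings.some((f) => f.severity === "error") };
}
```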

Rationale

  • Win-condition integrity. Checks 1 and 2 guarantee the game can still be won — you cannot accidentally author a profile that removes all kings or makes a king uncapturable.
  • Playability. Check 4 prevents instant deadlocks at swap time.
  • Performance & sanity ceiling. Check 5 caps the fact-set per piece so that the attribute system cannot be overwhelmed by pathological profiles.
  • User experience. Check 3 is a warning rather than an error because dropping irrelevant per-instance keys (see ADR-2) is the documented behaviour when applying across layouts.

Rejected Alternatives

  • No validator — trust the author. Produces unrecoverable games and makes server state hard to reason about.
  • Validator as errors only (no warnings). Would force cross-layout reuse to be a hard failure, defeating ADR-2's portability goal.
  • Validator at game-start only. Misses hot-swap-introduced corruptions.

ADR-7: Single Active Profile Per Game (T1)

Decision

In T1, exactly one profile is active on a given game at a time. Changing the active profile is a full swap performed by sending a modifier-profile.update WebSocket message. Profile stacking or layered composition of multiple profiles is explicitly deferred to T2+.

Rationale

  • Scope control. T1 already introduces a new hook, registry, profile document shape, and validator. Adding multi-profile composition on top would multiply the edge cases (ordering between profiles, overlap conflicts, partial updates) and delay the feature.
  • Simple mental model. "One profile, swap to change it" is trivially understandable by end users and matches how presets are selected today.
  • Forward-compatible. The swap message already carries a full profile payload; a future multi-profile world can extend the same message shape (e.g. profiles: Profile[]) without breaking T1 clients.

Rejected Alternatives

  • Multi-profile composition in T1. Punts on too many unresolved design questions (inter-profile precedence, addition vs replacement, diffing) to fit in the T1 milestone.
  • Additive deltas only (no full swap). Harder to reason about when recovering from a corrupted state; the full-swap primitive is simpler and can always simulate a delta by applying a re-derived profile.

ADR-8: Registry Pattern for Modifier Catalog

Decision

Each T1 modifier kind lives in its own file under:

packages/chess/src/modifiers/descriptors/{kind}.ts

At module load time the file self-registers into MODIFIER_REGISTRY, mirroring the existing PRESET_REGISTRY and LAYOUT_REGISTRY patterns (with register(def), get(id), list(), has(id) surface).

The descriptor shape is:

{
  id,
  attrName,
  label,
  valueSchema,
  stackingRule,
  apply,
  describe,
  uiForm,
}

T3 custom (user-defined) modifiers will plug into this same registry, requiring no changes to the registry contract itself.
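The registry surface named above (`register(def)`, `get(id)`, `list()`, `has(id)`) can be sketched in a few lines. The descriptor fields are abbreviated here and the factory is illustrative, not the real implementation.

```typescript
// Abbreviated descriptor; the real shape carries valueSchema, apply, etc.
type ModifierDescriptor = { id: string; label: string };

function createRegistry<T extends { id: string }>() {
  const entries = new Map<string, T>();
  return {
    register(def: T) {
      entries.set(def.id, def); // each descriptor file calls this at module load
    },
    get: (id: string) => entries.get(id),
    list: () => [...entries.values()],
    has: (id: string) => entries.has(id),
  };
}

const MODIFIER_REGISTRY = createRegistry<ModifierDescriptor>();
// Self-registration side effect, as each descriptors/{kind}.ts file would do.
MODIFIER_REGISTRY.register({ id: "hp-bonus", label: "HP Bonus" });
```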

Rationale

  • One modifier = one file. Adding a new modifier kind is a single-file addition, not a multi-file edit across schema, engine, UI, and validator. This is the same ergonomic property that makes the preset and layout registries pleasant to extend.
  • Consistency with existing patterns. The codebase already has two registries following this shape; a third avoids introducing a new idiom to learn.
  • T3-ready. Framing the registry as the single integration point now means T3 custom modifiers can register through the same API without special-casing.
  • Discoverability. MODIFIER_REGISTRY.list() produces the full catalog for UI/UX (picker widgets, docs generators, validation hints).

Rejected Alternatives

  • Hardcoded switch/if-else. Adding a new modifier kind would require editing six or more files (schema, engine apply path, hooks, UI, docs, validator). Each edit is an opportunity for drift between layers.
  • Plugin manifest with lazy loading. Overkill for a fixed T1 catalog and incompatible with deterministic module load ordering.

Implementation Retrospective

Deviations from ADR

  • ADR-3 hot-swap timing: Applied immediately on receipt rather than at an explicit turn-boundary gate. WS message serialization per-socket ensures a client's own move and profile swap cannot interleave. True turn-boundary queue deferred to T2.
  • ADR-6 check 4 (DEADLOCK): Deferred to T2 — requires session simulation which introduces a circular dependency with the engine.
  • ADR-3 host authority: Profile swap allowed from host (white player) only in T1. Two-player consent model deferred to T2.
  • Zod v3/v4 mismatch: Server pins Zod v3; chess package moved to v4. Schemas mirrored locally in the server package with a compile-time key-parity guard.

Discovered patterns

  • MODIFIER_REGISTRY mirrors PRESET_REGISTRY/LAYOUT_REGISTRY exactly — confirmed the side-effect import pattern scales cleanly.
  • __modifier-profile-integration__ pseudo-preset registered by applyProfileToSession integrates CaptureFlags and DirectionAdditions into the existing hook pipeline without modifying engine.ts.
  • reconcileProfileSwap uses retract-then-reapply semantics — idempotent and straightforward to reason about.


T2-ADR-1: Turn-boundary queue

Decision

Hot-swap updates are enqueued server-side (room.pendingProfile) and applied at the next turn boundary (after applyMove() succeeds), not immediately on receipt.

Rationale

T1 shipped an immediate-apply simplification (profile swap took effect the moment the server received modifier-profile.update). The retrospective noted this was acceptable because WS messages serialize per-socket — a client can't interleave its own move and swap. But:

  • The opponent can still have a move in-flight when the swap lands, causing move validation to run against a profile they didn't agree to.
  • Observers/spectators (future T3+ concern) see inconsistent ordering.
  • Determinism goal: the game state at turn N is fully determined by {profile at turn N, moves 1..N}.

Enqueuing eliminates the race.

Semantics

  • On modifier-profile.update (or on consent=approve per T2-ADR-2): server validates shape + stores in room.pendingProfile, acks sender with modifier-profile.queued.
  • After the next move applies successfully: server runs validateProfile(pending, layout, session). If valid, reconcileProfileSwap(session, room.profile, pending, layout); broadcast modifier-profile.updated; clear pending.
  • If invalid at apply time: NACK to original sender with specific error code, clear pending, no broadcast.
  • Last-write-wins: rapid successive updates replace the pending slot. Only the most recent pending fires on the next boundary.
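The pending-slot semantics can be sketched as follows; the `Room`/`Profile` shapes and function names here are assumptions modeled on the ADR, not the server's actual code.

```typescript
type Profile = { id: string };
type Room = { profile: Profile | null; pendingProfile: Profile | null };

// Rapid successive updates replace the pending slot (last-write-wins).
function enqueueSwap(room: Room, candidate: Profile): void {
  room.pendingProfile = candidate;
}

// Called after the next move applies successfully (the turn boundary).
function applyPendingProfileIfAny(
  room: Room,
  isValid: (p: Profile) => boolean,
): "applied" | "nacked" | "none" {
  const pending = room.pendingProfile;
  if (!pending) return "none";
  room.pendingProfile = null; // pending is cleared either way
  if (!isValid(pending)) return "nacked"; // NACK to original sender, no broadcast
  room.profile = pending; // real path: reconcile + broadcast updated
  return "applied";
}

// Example: two rapid updates; only the most recent fires at the boundary.
const room: Room = { profile: null, pendingProfile: null };
enqueueSwap(room, { id: "profile-a" });
enqueueSwap(room, { id: "profile-b" }); // replaces profile-a in the slot
```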

Rejected Alternatives

  • Per-socket move-boundary lock — forces sender to wait synchronously for their own next move before the swap is acked. Poor UX; WS doesn't guarantee reply ordering anyway.
  • Queue per sender — multiple pending profiles per room, applied in order. Nondeterministic interaction if both players queue updates simultaneously. Last-write-wins is simpler and sufficient.

T2-ADR-2: Two-player consent for profile swaps

Decision

In multiplayer rooms, a profile swap requires both players' agreement. Host sends modifier-profile.propose; opponent reviews and sends modifier-profile.consent with approve/reject. Approve promotes the proposal to the pending-queue (T2-ADR-1). Reject clears and notifies the host. 60s timeout → auto-reject.

Rationale

T1 granted host unilateral authority to change rules mid-game. This is an unfair advantage model; chess variants are a social contract. Both players must opt-in to rule changes.

Solo mode and single-player "vs. bot" contexts bypass consent — the sole participant is trivially the sole consenter. modifier-profile.update remains valid in solo mode as a shortcut.

Semantics

  • Host → server: {type: "modifier-profile.propose", roomCode, candidate: ModifierProfile}.
  • Server validates shape + stores room.proposalState = {profile, proposedBy, proposedAt, timeoutHandle}.
  • Server → opponent: {type: "modifier-profile.proposal-pending", profile, expiresAt}.
  • Opponent → server: {type: "modifier-profile.consent", roomCode, decision: "approve" | "reject"}.
  • On approve: clear proposalState, promote to room.pendingProfile (T2-ADR-1), server → host: modifier-profile.consent-received.
  • On reject or 60s timeout: clear proposalState, server → both: modifier-profile.rejected.
  • Only the non-proposer can consent. Self-consent is rejected.
  • Stale proposals (after a newer propose replaces them): old timeout cancelled, old state overwritten.
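The consent gate reduces to a small pure function. A sketch under assumed shapes (`ProposalState` and the result union are illustrative); the real handler also promotes to the pending queue and manages the timeout.

```typescript
type ProposalState = { profileId: string; proposedBy: string };
type ConsentResult = "promoted" | "rejected" | "self-consent-rejected" | "no-proposal";

function handleConsent(
  proposal: ProposalState | null,
  senderId: string,
  decision: "approve" | "reject",
): ConsentResult {
  // Stale consent: proposal already resolved, superseded, or timed out.
  if (!proposal) return "no-proposal";
  // Only the non-proposer can consent; self-consent is rejected.
  if (senderId === proposal.proposedBy) return "self-consent-rejected";
  // Approve promotes to room.pendingProfile (T2-ADR-1); reject clears + notifies.
  return decision === "approve" ? "promoted" : "rejected";
}
```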

Rejected Alternatives

  • Majority-vote model for 3+ player scenarios — not applicable (chess is 2-player).
  • Silent auto-approve with opt-out grace period — violates the social-contract principle.
  • Host-only with explicit "I agree to changes" checkbox at game start — inflexible; players may change their minds after seeing how a rule plays out.

T2-ADR-3: Editor undo/redo

Decision

The Modifier Profile Editor maintains a client-side snapshot stack of working-profile states. Every meaningful user action (add/delete/edit modifier, change bound layout) pushes a snapshot. Cmd/Ctrl+Z undoes, Cmd/Ctrl+Shift+Z redoes. Stack capped at 50 snapshots (oldest dropped). History cleared on Save or Cancel.

Rationale

Users compose profiles iteratively; mistakes are common (added wrong modifier, wrong color, wrong square). Manual deletion is destructive and loses intermediate states. Undo/redo is standard for structured editors.

Capped at 50 snapshots to bound memory. Cleared on Save because the saved state is the new baseline (undoing past Save would revert persisted data, which is surprising).

Semantics

  • Editor state: {history: ModifierProfile[], historyIndex: number, clipboard: Modifier[]}.
  • currentProfile = history[historyIndex].
  • Every mutation: pushSnapshot(newProfile):
    1. Truncate history forward of historyIndex (redo branches lost).
    2. Append newProfile.
    3. If history.length > 50, drop the oldest entry and decrement historyIndex.
    4. Set historyIndex = history.length - 1.
  • undo(): historyIndex = max(0, historyIndex - 1).
  • redo(): historyIndex = min(history.length - 1, historyIndex + 1).
  • Toolbar buttons disabled at stack ends.
  • Save or Cancel: setHistory([currentProfile]); setHistoryIndex(0) — fresh single-entry stack for a new session.
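The snapshot-stack semantics above can be sketched directly; the `Editor` shape is illustrative (the real editor holds React state), but the truncate-forward and cap behavior follow the ADR.

```typescript
type Editor<T> = { history: T[]; historyIndex: number };
const CAP = 50;

function pushSnapshot<T>(e: Editor<T>, next: T): void {
  e.history = e.history.slice(0, e.historyIndex + 1); // drop the redo branch
  e.history.push(next);
  if (e.history.length > CAP) e.history.shift(); // drop oldest when over cap
  e.historyIndex = e.history.length - 1;
}

const undo = <T>(e: Editor<T>) => {
  e.historyIndex = Math.max(0, e.historyIndex - 1);
};
const redo = <T>(e: Editor<T>) => {
  e.historyIndex = Math.min(e.history.length - 1, e.historyIndex + 1);
};
const current = <T>(e: Editor<T>) => e.history[e.historyIndex];
```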

Rejected Alternatives

  • Infinite history — memory unbounded. 50 is generous for a single editing session.
  • Per-modifier undo (like per-field undo in some editors) — over-granular for structured data edits.
  • Persist history to localStorage — adds complexity for marginal benefit; users expect editor state to reset between sessions.

T2 Implementation Retrospective

T1 simplifications resolved

  • Immediate-apply hot-swap → turn-boundary queue (T2-ADR-1): T1's handleModifierProfileUpdate applied profile changes immediately on receipt; T2's implementation enqueues into Room.pendingProfile and applies via applyPendingProfileIfAny after the next successful applyMove. NACKs at apply time (re-validation against post-move session) route to the original proposer's socket via findSocketByToken.
  • Host-only authority → two-player consent (T2-ADR-2): T1 broadcast modifier swaps from any host message; T2 requires the opponent to send modifier-profile.consent with approve before the swap is enqueued. Solo-mode preserves the host-shortcut path (modifier-profile.update) since the sole participant is trivially the sole consenter.

Discovered patterns

  • Reused modifier-profile.queued ack for proposer rather than introducing a modifier-profile.proposal-sent. Client receipt handlers stay uniform; the observable difference is the opponent's wire traffic (proposal-pending vs. silence). Documented in protocol.ts.
  • Capture-phase stopImmediatePropagation for nested modal Esc handling: ModifierProfileEditor's Esc handler now uses capture phase with stopImmediatePropagation so the RulesDrawer's own Esc handler does NOT also fire. Without this, both would close on a single keystroke, leaving the drawer's pointer-events backdrop briefly lingering. Discovered post-T1 via the solo-smoke regression test.
  • Token-keyed pending-proposer tracking: Room.pendingProposerToken (and proposalState.proposedByToken) survive socket reconnects. NACK routing on apply-time validation failure finds the proposer by token, not socket id, so a brief disconnect doesn't lose the rejection notification.
  • Timeout stale-handle guard: 60s proposal timeouts use setTimeout whose handler checks room.proposalState !== currentProposalRef before firing. Prevents firing rejection broadcasts on stale (already-consented or already-superseded) proposals.
  • Client-side undo/redo via pushSnapshot: ModifierProfileEditor maintains its history-stack purely client-side. The 50-snapshot cap is generous for a single editing session. History clears on Save/Cancel (the saved state becomes a new baseline).
  • Component-local clipboard for copy/paste: No OS clipboard interaction. Cross-panel sharing goes through a clipboard state lifted to the editor root. Dedup by (pieceType, color, kind) for type modifiers and (square, kind) for instance modifiers prevents pastes from creating additive duplicates.
  • Inline conflict resolution via applyFix dispatcher: Each error/warning code routes to a specific resolution. E_PROFILE_NO_KING and E_PROFILE_ATTR_LIMIT are advisory-only (require user judgement); the rest auto-resolve.

Deviations

  • None — implementation matched ADRs as written.

T3 Architecture Decisions

T3-ADR-1: Custom modifier DSL is data-only (no runtime code execution)

Context

T3 introduces user-authored modifier categories. The core design question is whether user-authored behavior should be represented as executable code or as structured data composed from engine-shipped operations.

Decision

Custom modifiers are expressed as validated, structured JSON descriptors composed from a fixed primitive catalog. The descriptor itself is the DSL. T3 does not execute user-provided scripts, evaluate strings, or parse a free-form mini-language.

Rationale

  • Security first. Data validation is materially safer than executing untrusted code in the game server or client.
  • Deterministic behavior. A finite primitive set gives explicit, testable semantics and removes runtime ambiguity.
  • Authoring ergonomics. A visual editor can expose primitives without requiring users to write or debug code.
  • Operational simplicity. No sandbox runtime, resource metering, or script lifecycle management is required in T3.

Rejected Alternatives

  • Embedded script runtime (e.g., QuickJS) in T3. Too large a security and maintenance surface for this milestone.
  • AST-backed custom language in T3. Similar complexity class to scripting, but with fewer ecosystem benefits.
  • String-template rule snippets. Too weak to represent planned behavior, yet still introduces parsing edge cases.

Consequences

  • Custom modifiers are fully serializable, inspectable, and diff-friendly.
  • Validation becomes schema-driven and server-enforceable.
  • T3 scope stays focused; script extensibility is intentionally deferred.
  • Example: A "Shield" modifier is represented as two primitives, seed-attribute(attr="ShieldCharges", value=3) and absorb-damage-with-attribute(attr="ShieldCharges", rate=1), with no user code execution path.
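The Shield example above might serialize as a descriptor like the following; the field names beyond the two primitives (`type`, `version`, `primitives`, `params`) are assumptions about the descriptor shape, shown to illustrate that the whole modifier is plain data.

```typescript
// Hypothetical data-only descriptor for the "Shield" example.
const shieldDescriptor = {
  id: "shield-v1",
  type: "data", // per T3-ADR-5, "scripted" is reserved for T4
  version: 1,
  primitives: [
    { kind: "seed-attribute", params: { attr: "ShieldCharges", value: 3 } },
    { kind: "absorb-damage-with-attribute", params: { attr: "ShieldCharges", rate: 1 } },
  ],
} as const;
// Fully serializable and diff-friendly: no user code path exists.
```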

T3-ADR-2: Primitive catalog v1 contains 15 composable primitives

Context

If the DSL is data-only (T3-ADR-1), the primitive catalog defines the expressive boundary of T3. The catalog must cover existing modifier behavior plus key new capabilities (events, conditionals, aura-like effects).

Decision

T3 v1 ships exactly this 15-primitive catalog:

  1. seed-attribute
  2. add-to-attribute
  3. multiply-attribute
  4. add-direction
  5. set-capture-flag
  6. absorb-damage-with-attribute
  7. reflect-damage
  8. block-move-type
  9. add-aura
  10. on-turn-start
  11. on-capture
  12. on-damaged
  13. conditional
  14. modify-movement-range
  15. override-promotion

Each primitive has a typed parameter schema, registry entry, and engine integration path.

Rationale

  • Coverage. The set spans additive stats, movement controls, damage behavior, event-driven effects, and branching logic.
  • Extensibility by addition. New primitives can be added as isolated files without redesigning the DSL contract.
  • Predictable validation. Primitive-specific schemas allow clear bounds and error reporting.

Rejected Alternatives

  • Smaller initial catalog. Reduced immediate utility; users would quickly hit expressiveness gaps.
  • Open-ended primitive parameters (Record<string, unknown>). Weak type guarantees and poor editor UX.
  • Monolithic "do-everything" primitive. Hard to validate, reason about, or compose safely.

Consequences

  • T3 ships with a bounded but practical authoring surface.
  • Engine and UI both rely on one canonical primitive registry.
  • Future primitives are additive and backward-compatible.
  • Example: A low-HP panic behavior can be authored with conditional(condition: hp<2, then:[modify-movement-range(+1)], else:[]).

T3-ADR-3: Authoring scope is per-room runtime, per-user library, and embedded sharing

Context

Custom modifiers must be reusable for authors, isolated for active games, and portable when profiles are shared.

Decision

  • Runtime registration scope: per-room via WebSocket payloads.
  • Author storage scope: per-user local library at houserules:custom-modifiers:v1.
  • Sharing model: profiles embedding custom kinds include full descriptor payloads (not by-reference ids only).
  • Versioning model: descriptor has version: 1; incompatible evolution is represented as a new modifier id/versioned descriptor.
  • Validation guards: server validates primitive membership, parameter bounds, nesting depth ≤ 3, and total primitive count ≤ 50.

Rationale

  • Isolation. Per-room runtime registration prevents cross-room leakage.
  • Usability. Local library supports iterative authoring without server persistence dependency.
  • Portability. Embedding descriptors makes shared profiles self-contained.
  • Safety. Guardrails cap complexity and reduce abuse/DoS risk.

Rejected Alternatives

  • Global process-wide custom registry. Risks leaking user-defined behavior between unrelated rooms.
  • By-reference sharing only (id lookup). Breaks when recipient does not already have that descriptor in their library.
  • Unbounded nesting/size. Opens denial-of-service and validation-time blowups.

Consequences

  • Room creation/join paths must synchronize embedded custom descriptors.
  • Validation errors must be protocol-visible and actionable.
  • Descriptor transport payload size is larger but deterministic.
  • Example: Profile aggressive-pack-v1 includes shield-v1 inline; when imported by another user, the room can register and use shield-v1 even if that user had no prior local copy.

T3-ADR-4: Custom descriptor registry is per-engine (room), not global

Context

Built-in modifier descriptors are static and safe to keep in the global registry. User-authored descriptors are dynamic and room-specific.

Decision

Keep built-ins in module-level MODIFIER_REGISTRY. Add a per-engine custom registry (customModifiers: Map<string, CustomModifierDescriptor>) and make descriptor resolution consult per-engine custom entries when global built-ins miss.

Rationale

  • Correct scoping. Room-specific rules remain room-specific.
  • Zero regression for built-ins. Existing built-in lookup remains stable.
  • Compatibility with existing architecture. Resolution still flows through the same descriptor interface.

Rejected Alternatives

  • Process-global custom registration. Causes rule contamination across sessions.
  • Separate custom-only lookup path everywhere. Increases branching and duplicates integration logic.
  • Namespace-prefix hacks in a single map. Couples isolation to naming discipline rather than architecture.

Consequences

  • Engine APIs that resolve modifiers need engine context.
  • Room teardown naturally drops custom descriptors with engine lifecycle.
  • Debugging remains straightforward: built-ins global, customs room-local.
  • Example: Room A registers berserk-v1; Room B does not. get("berserk-v1") succeeds in Room A context and fails in Room B context without any global side effect.
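The fallback resolution can be sketched as follows; the `Descriptor`/`Engine` shapes are trimmed-down assumptions, with `customModifiers` named as in the decision.

```typescript
type Descriptor = { id: string };

// Module-level registry of built-ins (static, safe to share process-wide).
const MODIFIER_REGISTRY = new Map<string, Descriptor>([["hp-bonus", { id: "hp-bonus" }]]);

// Per-engine (room-local) registry of user-authored descriptors.
type Engine = { customModifiers: Map<string, Descriptor> };

function resolveDescriptor(engine: Engine, id: string): Descriptor | undefined {
  // Built-ins first; per-engine customs consulted only on a global miss.
  return MODIFIER_REGISTRY.get(id) ?? engine.customModifiers.get(id);
}
```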

T3-ADR-5: T4 forward design preserves one integration seam

Context

T3 explicitly avoids runtime scripting, but the architecture should not require an overhaul if T4 introduces scripted modifiers.

Decision

Define T3 custom descriptors as type: "data" and reserve a parallel type: "scripted" branch for T4. Both forms target the same high-level descriptor integration seam (apply-style engine contract), while validation branches by descriptor type.

Rationale

  • Future-proofing without scope creep. T3 remains data-only while enabling a clean T4 extension path.
  • Minimized migration risk. Existing plumbing (registry, protocol, profile embedding) remains reusable.
  • Separation of concerns. Script security and sandboxing can be handled in T4-specific validation/execution modules.

Rejected Alternatives

  • Hardcode T3 as the only forever model. Forces breaking redesign for T4.
  • Partially ship script hooks in T3. Expands attack surface before policy, sandbox, and permissions are ready.
  • Completely separate scripted architecture. Duplicates profile and registry plumbing.

Consequences

  • T3 descriptors and validators stay simple and strict.
  • T4 can be additive via new descriptor type and validator branch.
  • Docs can communicate a clear migration path from data to scripted forms.
  • Example: A future type:"scripted" modifier may implement the same combat buff as today's add-to-attribute, but still enters the engine via the same descriptor-resolution and apply integration seam.

T3-ADR-6: Multiple active profiles stack in explicit order

Context

T1/T2 assume one active profile at a time. T3 primitives and custom modifiers increase composition use cases where users want layered rule bundles.

Decision

Engine state moves from one active profile to an ordered activeProfiles: readonly ModifierProfile[]. Profile effects stack across the list using existing per-modifier stacking semantics (additive, union, multiplicative, or precedence-wins as defined by prior ADRs). Default order is registration order; UI can expose explicit reordering.

Rationale

  • Composability. Users can combine orthogonal profile concepts without pre-merging into a single artifact.
  • Reuse. Small focused profiles become building blocks.
  • Determinism. Explicit ordering eliminates ambiguity for override-style collisions.

Rejected Alternatives

  • Single-profile only forever. Limits expressiveness and reusability.
  • Implicit/unordered merge. Nondeterministic outcomes for override kinds.
  • Auto-merge into synthetic profile client-side. Harder to debug and easy to desync with server authority.

Consequences

  • Apply/reconcile APIs must accept arrays, not single profile objects.
  • UI must communicate stack order and precedence implications clearly.
  • Validator and diagnostics need to report cross-profile conflicts.
  • Example: Profile A gives HpBonus +2, Profile B gives HpBonus +1; with stacking active, affected pieces resolve to +3 total HP bonus.
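The HpBonus example above can be sketched as a fold over the ordered profile list. This is a minimal illustration of additive stacking only; the interface shape and function names are assumptions, not the engine's real API.

```typescript
// Minimal sketch of additive stacking across an ordered profile stack.
// Names (ModifierProfile.attributeDeltas, resolveAdditive) are illustrative.
interface ModifierProfile {
  id: string;
  attributeDeltas: Record<string, number>; // additive-stacking modifiers only
}

function resolveAdditive(
  activeProfiles: readonly ModifierProfile[],
  attr: string,
): number {
  // Profiles apply in list order; for additive semantics the order is
  // irrelevant, but override-style kinds (not shown) would be last-wins.
  return activeProfiles.reduce((sum, p) => sum + (p.attributeDeltas[attr] ?? 0), 0);
}

const profileA: ModifierProfile = { id: "A", attributeDeltas: { HpBonus: 2 } };
const profileB: ModifierProfile = { id: "B", attributeDeltas: { HpBonus: 1 } };
const total = resolveAdditive([profileA, profileB], "HpBonus");
```

With both profiles active, affected pieces resolve to a +3 total HpBonus, matching the example above.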

T3-ADR-7: Aura primitive semantics are derived and recomputed

Context

add-aura introduces cross-piece effects whose targets vary with board position. Static one-time application is insufficient because piece movement changes who is in range.

Decision

Aura effects are treated as derived facts:

  • Source piece declares add-aura(radius, targetAttr, delta).
  • Engine computes affected pieces by distance at evaluation time.
  • Derived aura contributions are recomputed after each move and reconciled with current board state.
  • Aura-derived facts feed into normal effective-attribute resolution and stack with other modifiers using existing rules.

Rationale

  • Correctness over time. Moving pieces in/out of range updates outcomes immediately and deterministically.
  • Single mental model. Aura outputs become just another attribute source in the resolved fact set.
  • Implementation locality. Recompute hook can live in a focused aura integration path without mutating core descriptor contracts.

Rejected Alternatives

  • Apply aura once at profile load. Becomes stale as soon as pieces move.
  • Event-driven partial updates only. Harder to guarantee correctness for all movement/capture transitions.
  • Dedicated aura-only buff subsystem. Duplicates stacking/resolution logic.

Consequences

  • Post-move recomputation is required for correctness.
  • Performance budget must account for board-wide aura evaluation.
  • Tooling/debug UI should expose which aura sources contribute to each piece.
  • Example (king aura): A king with add-aura(radius=2, targetAttr=HpBonus, delta=+1) grants HpBonus +1 to all allied pieces within two squares. If a knight moves from distance 1 to distance 3 from that king, the bonus is removed on the next recomputation pass.
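The king-aura example above can be sketched as a full recompute over the board, assuming Chebyshev (king-move) distance and ally-only targeting. All names here (Piece, Aura, computeAuraFacts) are illustrative stand-ins for the engine's real types.

```typescript
// Hedged sketch of derived aura recomputation: recompute every affected
// piece by distance after each move. Not the engine's actual API.
interface Piece { id: string; file: number; rank: number; allyOf: string }
interface Aura { sourceId: string; radius: number; targetAttr: string; delta: number }

const chebyshev = (a: Piece, b: Piece) =>
  Math.max(Math.abs(a.file - b.file), Math.abs(a.rank - b.rank));

// Full recompute: returns the derived contribution map for every piece.
function computeAuraFacts(
  pieces: Piece[],
  auras: Aura[],
): Map<string, Record<string, number>> {
  const out = new Map<string, Record<string, number>>();
  for (const aura of auras) {
    const src = pieces.find((p) => p.id === aura.sourceId);
    if (!src) continue;
    for (const p of pieces) {
      if (p.id === src.id || p.allyOf !== src.allyOf) continue;
      if (chebyshev(src, p) > aura.radius) continue;
      const rec = out.get(p.id) ?? {};
      // Multi-source accumulation: deltas for the same attr sum.
      rec[aura.targetAttr] = (rec[aura.targetAttr] ?? 0) + aura.delta;
      out.set(p.id, rec);
    }
  }
  return out;
}

const king: Piece = { id: "k", file: 4, rank: 0, allyOf: "white" };
const knight: Piece = { id: "n", file: 5, rank: 1, allyOf: "white" };
const aura: Aura = { sourceId: "k", radius: 2, targetAttr: "HpBonus", delta: 1 };

const before = computeAuraFacts([king, knight], [aura]);
// Knight moves to distance 3; the full recompute drops its contribution.
const after = computeAuraFacts([king, { ...knight, file: 7, rank: 3 }], [aura]);
```

Because each pass rebuilds the contribution map from scratch, the recompute is idempotent and a source piece moving out of range retracts its contribution automatically, as the decision requires.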

T3 Implementation Retrospective

What changed between the T3 plan and the shipped reality, what worked, what hurt, and the open work the team should pick up next.

Scope delivered

All 32 implementation tasks shipped across five waves:

  • Wave 1: ADR doc + primitive types/registry + custom descriptor types.
  • Wave 2: 15 effect primitives (state / mechanic / advanced clusters).
  • Wave 3: descriptor validator + Zod schema + library persistence + apply integration + multi-profile stacking.
  • Wave 4: server custom-modifier.register handler + CustomModifierEditor primitive composer + panel kind-dropdown extension + lobby multi-profile stack picker + aura recomputation hook.
  • Wave 5: e2e suite + this retrospective + user docs + T4 forward-design.

Final test counts: 1378 unit tests + the new e2e suite, all green.

What deviated from the plan

Trigger/conditional primitives seed facts but don't (yet) fire

on-turn-start, on-capture, on-damaged, and conditional all register and seed OnTurnStartHooks / OnCaptureHooks / OnDamagedHooks / ConditionalHooks facts on the target piece — but the engine pipeline that SHOULD evaluate those hooks at the corresponding game phase is not wired up. The data flows; the runtime trigger evaluation is deferred.

The fact-seeding still has user value (the inspector can show "this piece has 1 on-capture trigger"), but the primitives currently behave as data declarations, not active behaviour. T22's applyCustomDescriptor walks childPrimitives() recursively at apply time, so nested primitives DO run when the descriptor is applied — what doesn't yet happen is "fire on-capture's primitives WHEN this piece captures".

Wiring is mechanical (the engine already exposes onAfterMove/onDamage hook points used by the modifier-integration preset for DamageResistance); the follow-up should replicate that pattern for the four trigger attrs.
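A hedged sketch of that mechanical wiring, replicating the onAfterMove pattern for the on-capture case. The Engine shape, factsFor, and applyPrimitive are assumed stand-ins for the real hook points, not the actual engine API.

```typescript
// Sketch (under assumed names) of firing seeded OnCaptureHooks facts at
// trigger time via the existing onAfterMove hook point.
type Primitive = { kind: string; params: Record<string, unknown> };
interface Engine {
  factsFor(pieceId: string, attr: string): Primitive[] | undefined;
  applyPrimitive(pieceId: string, p: Primitive): void;
  onAfterMove(cb: (m: { mover: string; captured?: string }) => void): void;
}

function wireCaptureTriggers(engine: Engine): void {
  engine.onAfterMove((move) => {
    if (!move.captured) return; // only capturing moves fire the trigger
    const hooks = engine.factsFor(move.mover, "OnCaptureHooks") ?? [];
    // Apply each seeded hook's primitive now, at trigger time, instead of
    // once at descriptor-apply time.
    for (const hook of hooks) engine.applyPrimitive(move.mover, hook);
  });
}

// Toy engine to demonstrate the firing behaviour.
const applied: string[] = [];
const listeners: ((m: { mover: string; captured?: string }) => void)[] = [];
const toyEngine: Engine = {
  factsFor: (id, attr) =>
    id === "p1" && attr === "OnCaptureHooks"
      ? [{ kind: "add-to-attribute", params: { attr: "HpBonus", delta: 1 } }]
      : undefined,
  applyPrimitive: (id, p) => applied.push(`${id}:${p.kind}`),
  onAfterMove: (cb) => listeners.push(cb),
};
wireCaptureTriggers(toyEngine);
listeners.forEach((cb) => cb({ mover: "p1", captured: "p2" }));
```

The same shape would repeat for on-turn-start, on-damaged, and conditional, each reading its own seeded hook attribute at the corresponding phase.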

Aura contributions land in AuraContributions but no consumer reads them

T28 ships computeAuraFacts and wires it to onAfterMove so contributions recompute correctly after every move. They land on each affected piece as AuraContributions: Record<targetAttr, delta>. However, no engine subsystem currently reads that record when computing effective attribute values — HpBonus consumers see only the directly-seeded value, not the aura-derived addition.

The infrastructure is correct (idempotent, source-move retracts contribution, multi-source accumulation). The next step is "effective attr read points layer both the direct fact AND the aura contribution" — a small but careful change in HP / Range / DirectionAdditions consumers.
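The proposed read-site change can be shown in a few lines: the effective value layers the directly-seeded fact with the accumulated aura contribution. The fact shape and function name are illustrative, not the engine's real types.

```typescript
// Sketch of the proposed effective-attribute read: direct fact + aura layer.
interface PieceFacts {
  HpBonus?: number;
  AuraContributions?: Record<string, number>; // targetAttr -> accumulated delta
}

function effectiveAttr(facts: PieceFacts, attr: "HpBonus"): number {
  const direct = facts[attr] ?? 0;
  const fromAura = facts.AuraContributions?.[attr] ?? 0;
  return direct + fromAura; // today's consumers return `direct` only
}

const knightFacts: PieceFacts = { HpBonus: 2, AuraContributions: { HpBonus: 1 } };
const hp = effectiveAttr(knightFacts, "HpBonus");
```

The same two-term read would repeat at the Range and DirectionAdditions consumers.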

Multi-profile stacking is solo-only on the wire

T27 ships the lobby UI for ordered profile stacks. The engine (applyProfilesToSession) already supports the stack natively. The wire protocol, however, still carries one ModifierProfile per room.create / modifier-profile.update payload — multiplayer rooms can use a single profile each. Stacking on the wire would extend RoomCreatePayloadSchema.profile? to profiles?: ModifierProfile[] and update the consent flow (T2-ADR-2) to apply to a stack. Out of scope for T3.

Server-side semantic validation deferred

The server runs Zod structural validation on incoming custom-modifier.register descriptors but does NOT check that each primitive kind exists in the registry or that its params are satisfied. Doing so would require mirroring the entire 15-primitive catalog into the Zod-v3 server package across the v3↔v4 schema boundary.

Instead: the client validates with validateCustomDescriptor BEFORE sending, and the engine's runtime applier (applyCustomDescriptor) silently skips unknown primitive kinds on the apply path. Net effect: a malicious or buggy client can register a structurally-valid descriptor whose primitives quietly no-op. Consequence: low severity (no incorrect game behaviour can leak to the opponent), but a server-side semantic gate is the natural T3.1 hardening.
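The silent-skip degradation described above can be illustrated with a toy registry and applier. The registry and function shapes here are illustrative, not the engine's real implementation.

```typescript
// Sketch of the current behaviour: the runtime applier no-ops primitive
// kinds missing from the per-engine registry.
type Applier = (params: Record<string, unknown>) => void;
const registry = new Map<string, Applier>(); // per-engine registry of primitive kinds

const applied: string[] = [];
registry.set("add-to-attribute", () => applied.push("add-to-attribute"));

function applyCustomDescriptor(
  primitives: { kind: string; params: Record<string, unknown> }[],
): string[] {
  const skipped: string[] = [];
  for (const p of primitives) {
    const fn = registry.get(p.kind);
    if (!fn) { skipped.push(p.kind); continue; } // silent no-op today
    fn(p.params);
  }
  return skipped; // a server-side semantic gate would reject these up front
}

const skipped = applyCustomDescriptor([
  { kind: "add-to-attribute", params: {} },
  { kind: "not-a-real-primitive", params: {} }, // passes Zod structure, fails semantics
]);
```

A T3.1 semantic gate would turn the skipped list into a registration-time rejection instead of an apply-time no-op.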

What worked well

  • Per-engine CustomModifierRegistry (ADR-4) prevents cross-room leakage by construction, not by discipline. We never had to hunt down a "this descriptor mysteriously appeared in another room" bug because the type system makes it impossible.
  • childPrimitives() as the composable recursion contract turned the "validate depth ≤ 3" and "walk for apply" requirements into one-liner consumers. The same callback drives the validator's tree walk and the applier's recursive descent.
  • Zod schema as both wire validator AND structural truth kept the custom-descriptor shape from drifting between protocol layer and runtime type. The single boundary cast in parseCustomModifierDescriptor is documented and small.
  • 15 separate primitive files (one each) kept individual change sets reviewable and made git blame informative for each primitive's evolution.
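The childPrimitives() recursion contract noted above can be sketched with two tiny consumers: a depth check for the validator and a recursive-descent walk for the applier, both driven by the same callback. The node shapes here are illustrative.

```typescript
// Sketch: one recursion callback (childPrimitives) serves both the
// validator's "depth <= 3" check and the applier's tree walk.
interface PrimitiveNode {
  kind: string;
  childPrimitives(): PrimitiveNode[];
}

const leaf = (kind: string): PrimitiveNode => ({ kind, childPrimitives: () => [] });
const branch = (kind: string, children: PrimitiveNode[]): PrimitiveNode => ({
  kind,
  childPrimitives: () => children,
});

// Validator consumer: maximum nesting depth.
function maxDepth(node: PrimitiveNode): number {
  const kids = node.childPrimitives();
  return 1 + (kids.length ? Math.max(...kids.map(maxDepth)) : 0);
}

// Applier consumer: recursive descent in declaration order.
function walk(node: PrimitiveNode, visit: (n: PrimitiveNode) => void): void {
  visit(node);
  node.childPrimitives().forEach((c) => walk(c, visit));
}

const tree = branch("conditional", [branch("on-capture", [leaf("add-to-attribute")])]);
const depth = maxDepth(tree);      // 3 — right at the depth limit
const order: string[] = [];
walk(tree, (n) => order.push(n.kind));
```

Because both consumers are one-liners over the same callback, the descriptor tree's shape has a single source of truth.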

What hurt

  • Long parallel agent runs blew the tool-call cap repeatedly. Multiple Wave 2 / Wave 3 / Wave 4 batches were cancelled mid-flight after the delegated agent burned 200 tool calls on verification loops without committing. Reconciling partial work consumed orchestrator time we wanted to spend on Wave 5.

    Mitigation for future epics: prompt agents to commit incrementally (every 2-3 files) rather than batching all commits at the end. Even better, decompose the cluster prompts further (one primitive = one delegation) even at the cost of more orchestration overhead.

  • as unknown as X in lazy schemas required a documented exception. T20's recursive EffectPrimitiveNodeSchema couldn't cleanly satisfy z.ZodType<EffectPrimitiveNode> because Zod infers kind: string and the type wants kind: PrimitiveKind. We kept a single boundary cast in parseCustomModifierDescriptor rather than restructuring the type.

  • The chess-side Zod v4 vs server Zod v3 mismatch forced us to mirror schemas across two packages by hand. Moving the chess and server packages onto the same Zod major would simplify a lot, but is a separate migration.

Open follow-ups

| Item | Effort | Priority |
| --- | --- | --- |
| Wire trigger primitives (on-turn-start/on-capture/on-damaged) into the engine pipeline | Medium | High — needed before custom modifiers feel "alive" |
| Layer AuraContributions into HP/Range read sites | Small | High — same |
| Server-side semantic validation of custom-modifier.register payloads | Medium | Medium — current degradation is silent |
| Multi-profile on the wire for multiplayer | Medium | Medium — solo-only is a clean stop-gap |
| Visual editor: drag-to-reorder primitives in the tree | Small | Low — polish |
| Visual editor: nested-tree inspector (currently JSON textarea fallback) | Medium | Low — polish |
| CustomModifierEditor Playwright coverage beyond happy-path | Small | Medium |