I am the layer between an operator's intention and the tools they reach for.
I do not generate confidence — I generate friction at the right places, so
confidence becomes earned rather than asserted.
Concretely: a memory loop, a playbook library, thirty operating modes —
code, research, strategy, writing, security,
and twenty-five more — and a column of deterministic gates. Each tool call
passes through guards that cannot be argued with. When I find myself
thinking "probably this is what the operator wants", that thought is the
trigger to halt and ask. Asking is cheap. Assuming wrong is expensive.
§02 The duality.
All orchestration decomposes into two motions. Coordinator
opens breadth — parallel paths, alternatives, scouting. Convergence
drives depth — refine, falsify, settle.
They are fractal. Inside any Coordinator step is a smaller Convergence;
inside any Convergence step, a smaller Coordinator. The recursion does
not bottom out — it just gets sharper. This is the engine.
Width without depth is gossip.
Depth without width is obsession.
— a working principle
§03 The gates.
Mechanical, deterministic, non-negotiable. The model cannot talk its way past these.
R0 · Critical Thinking. Every claim: specific evidence, scope match, falsification condition. Inline. Ad infinitum.
R7 · Confidence Gate. The thought "I think I know what they want" is the trigger. Halt. Ask. Asking is cheap; assuming wrong is expensive.
R8 · Context Gate. Snapshots at 20 / 40 / 60% — quietly, in the background. At 80%, halt and ask: compact, fork, or carry on. Auto-compacting serves an active loop, never your absence.
R11 · Anti-Drift. After every commit or phase: Done · Remaining · Open · Next. No accepting the next task without it.
R17 · Plain Language. If a stranger needs a glossary, rewrite it. /plain-language and /eli5 translate any concept on demand. Internal terminology is mine to translate, not yours to decode.
R18 · JIT Quality + RSI. Code emerges already-optimised against thresholds you set — KISS, DRY, SOLID, CC, BIG-O, YAGNI. Recursive loops refine while writing, not after. Refactoring becomes a fallback, not a phase.
R19 · No Issue Hierarchy. "Pre-existing" is not a label, it is an excuse. If I see it, I own it.
R20 · Closed-Loop Insight. An ★ insight that doesn't propagate to the right file dies with the conversation. Capture is the fallback; propagation is the loop.
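A gate, at its mechanical core, is a predicate the model cannot negotiate with. A minimal Python sketch of R8: the thresholds are the ones stated above, but the class and the callbacks are illustrative, not the shipped gate code.

    SNAPSHOT_AT = (0.20, 0.40, 0.60)   # quiet background snapshots
    HALT_AT = 0.80                     # halt and ask the operator

    class ContextGate:
        """Sketch of R8. Fires on utilisation alone; arguments change nothing."""
        def __init__(self, snapshot, ask_operator):
            self.snapshot = snapshot          # callable: take a snapshot
            self.ask_operator = ask_operator  # callable: compact / fork / carry on
            self._taken = set()

        def check(self, used: float) -> None:
            for t in SNAPSHOT_AT:
                if used >= t and t not in self._taken:
                    self._taken.add(t)
                    self.snapshot(t)
            if used >= HALT_AT:
                self.ask_operator()   # deterministic halt; not the model's call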
§04 The loop.
Check what I already know first — zero tokens, the cheapest step I have.
Read only if that fails. Use what I found. Write back what is new.
A memory that skips its own cache before reading is wasting your context window.
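In code the loop is four steps with an early exit. A sketch only: cache, read_source, and write_back are stand-ins, not real module names.

    def recall(key, cache, read_source, write_back):
        """The loop as a sketch: check, read, use, write back."""
        hit = cache.get(key)        # zero-token cache check, always first
        if hit is not None:
            return hit              # the cheapest step wins
        found = read_source(key)    # read only if the cache fails
        if found is not None:
            write_back(key, found)  # write back what is new
        return found                # use what was found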
Each session leaves a residue: feedback files, operational context,
insight propagation, gene-fitness updates. The next session loads it
before doing anything else. Evolution by deliberate writing.
§05 The hypothesis engine.
Every change is a falsifiable prediction with a deadline. Confirmed, refuted,
inconclusive — the verdict is logged and folded back into the fitness of the
part that made it.
Target confirmation rate: between 30 and 70%. Below 30, I'm guessing badly.
Above 70, the questions are too easy. Either curve is wrong; the wrongness
is what corrects me.
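Held as data, a hypothesis is small. A sketch: the field names and the helper are illustrative; only the verdict vocabulary and the 30 to 70% band come from the text.

    from dataclasses import dataclass

    @dataclass
    class Hypothesis:
        claim: str         # the falsifiable prediction
        falsifier: str     # what observation would refute it
        deadline: str      # ISO date by which a verdict is due
        verdict: str = ""  # CONFIRMED / REFUTED / INCONCLUSIVE

    def confirmation_rate(log: list[Hypothesis]) -> float:
        settled = [h for h in log if h.verdict in ("CONFIRMED", "REFUTED")]
        return sum(h.verdict == "CONFIRMED" for h in settled) / max(len(settled), 1)

    # Healthy band per the text: 0.30 <= rate <= 0.70.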
§06 The mesh council.
Many minds work better together than a single mind. Frontier paid, free,
and locally-running models — with the agents I spawn from them — convene
as a council, deliberate, debate, and only then commit.
Each member starts independent: blind to other drafts, blind to consensus
pressure. Then they argue. Disagreement becomes signal, not noise — model
bias is structural, and surfacing it through debate is the only honest way out.
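The two-phase shape is the whole trick. A sketch, with council members as plain callables; everything beyond the blind-then-argue structure is illustrative.

    def council(question, members):
        """Phase 1: blind, independent drafts. Phase 2: argue."""
        drafts = [m(question) for m in members]
        critiques = [
            m(question + "\n\nOther drafts:\n" +
              "\n---\n".join(d for j, d in enumerate(drafts) if j != i))
            for i, m in enumerate(members)
        ]
        return drafts, critiques   # disagreement across these is the signal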
The router behind the council is called HAL — the
Harness Abstraction Layer — keeping the local-versus-frontier choice cheap.
Local models carry the routine load; frontier models are recruited only when
the question is hard enough to earn the cost.
§07 The harness, abstracted.
Beneath the council sits HAL — the Harness Abstraction Layer.
One typed contract, every model behind it. Anthropic, OpenAI, xAI, DeepSeek, Qwen, local
Ollama, on-device Piper voice — same call shape, swappable backend, automatic fallback.
Plain version: HAL is USB-C for AI. The wall socket of every provider looks different,
but the cable I plug into looks the same. If a provider rate-limits, HAL routes to the
next rung. If a question is sensitive, HAL routes to a Western jurisdiction. If a task
is cheap, HAL routes to a free tier. The caller never sees the choice.
Why this matters: most AI tooling lives inside one vendor's SDK. When that vendor is
down, your tools are down. When that vendor changes a price, your bill changes. When
that vendor logs your prompts, your data is theirs. HAL inverts that — providers are
interchangeable parts behind one stable contract. The contract is mine. The parts are
rented.
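The smallest honest version of that inversion is a fallback chain behind one call shape. A sketch: the provider callables and the exception type are illustrative, not HAL's actual API.

    class ProviderError(Exception):
        """Stands in for rate limits, outages, refusals."""

    def hal_call(prompt: str, chain) -> str:
        """chain: ordered (name, callable) rungs behind one call shape."""
        last = None
        for name, provider in chain:
            try:
                return provider(prompt)   # same shape on every rung
            except ProviderError as err:
                last = err                # drop to the next rung
        raise RuntimeError("every rung failed") from last

The caller sees hal_call and nothing else; which rung answered is HAL's business.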
§08 The protocol that runs itself.
happi.md is one file. Five parsers read it cleanly:
Markdown for humans, bash for execution, embedded Python for runtime, JSON envelope for
dispatch, OpenAPI for service shape. The same bytes are documentation, executable, and
specification — at once.
Plain version: it's a sheet of paper that, if you can read it, you can also run it.
Any machine with bash and Python 3.10 runs bash happi.md as-is. No clone,
no install, no other files. The protocol is the file. The file is the protocol.
The contract is austere. One JSON envelope in (stdin), one NDJSON event stream out
(stdout). Any tool, any provider, any transport — same shape. Audit receipts (IDRs)
optionally chain content hashes through every dispatch, so what the AI did is provable
after the fact, not just claimed.
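The contract fits in a dozen lines. A sketch of the shape only: the field names, including "idr", are assumptions for illustration, not the happi.md schema.

    import hashlib, json, sys

    def dispatch(handle):
        """One JSON envelope in on stdin, NDJSON events out on stdout."""
        envelope = json.load(sys.stdin)
        prev = envelope.get("idr", "")      # optional audit-chain seed
        for event in handle(envelope):
            blob = json.dumps(event, sort_keys=True)
            prev = hashlib.sha256((prev + blob).encode()).hexdigest()
            event["idr"] = prev             # receipt chains through every event
            sys.stdout.write(json.dumps(event) + "\n")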
Why this matters: most AI protocols are documents that describe a runtime.
happi.md is the runtime. There is no drift between spec and implementation
because there is no separation. "AI is a syscall. happi.md is the protocol."
§09 Together.
GRIP is the discipline. HAL is the cable. happi.md is the contract.
Together they make a critical-thinking machine that owns its own memory, talks to any
model in any language, and proves what it did to anyone who asks.
GRIP without HAL would be locked to one vendor. HAL without happi.md would be one more
SDK. happi.md without GRIP would be a clever file with no operator. Each piece is
necessary; none is sufficient. The shape only works as the trio.
Why I built this: I needed AI infrastructure that didn't depend on any single company's
discretion — not for prompt logging, not for pricing, not for availability, not for
audit. The substrate has to be mine. The providers can rotate.
§10 The surfaces.
Twelve places to talk to me. WhatsApp, Slack, Discord, the web, the CLI, voice,
the phone-to-Mac bridge, email, Signal, Telegram, the GRIP mobile app, the GRIP
web app. Same operator, same memory, twelve doors.
Each surface is a thin adapter into the same daemon — grip-channel on
port 3101 — and the daemon routes every conversation by thread. Slack threads use
the composite key channel-plus-thread; Discord threads use the bare ID; WhatsApp
routes by sender. The thread is the unit of identity. Walk away on the laptop, pick
up on the phone — same chain of memory, no re-hydration.
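The routing key per surface, sketched. The key shapes follow the text; the function and field names are illustrative.

    def thread_key(surface: str, msg: dict) -> str:
        if surface == "slack":      # scoped IDs need the composite key
            return f"slack:{msg['channel']}:{msg['thread_ts']}"
        if surface == "discord":    # globally unique, the bare ID suffices
            return f"discord:{msg['thread_id']}"
        if surface == "whatsapp":   # route by sender
            return f"whatsapp:{msg['sender']}"
        raise ValueError(f"unknown surface: {surface}")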
Plain version: text me on WhatsApp at 9, follow up on Slack at 11, finish on the
web at 4 — it's one conversation. The keyboard is a historical accident; the intent
is the load-bearing part.
§11 The InfiniteMachine.
Four parts compose into one engine that rarely runs out of fuel:
Limit Sentinel watches each provider's
utilisation. SelfBuilder turns improvement
proposals into Python modules. BuildLoop
runs them until convergence or budget exhaustion. Ollama
sits at the floor — local, zero-cost, unlimited.
Limit Sentinel auto-routes at 85% utilisation, before the next call would hit a
429. The default chain is Groq → Gemini → Ollama: free → free → on-device. The
operator never sees "You've hit your limit." — that's not aspiration, it's
the sentinel module's stated contract in lib/hal/sentinel.py.
[redacted: cooldown weights]
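The routing rule itself stays small. A sketch with the stated 85% threshold and default chain; how utilisation is measured, and the cooldown weights above, are out of scope here.

    THRESHOLD = 0.85
    CHAIN = ("groq", "gemini", "ollama")   # free -> free -> on-device

    def route(utilisation: dict) -> str:
        """First rung under threshold wins; Ollama is the floor."""
        for provider in CHAIN:
            if utilisation.get(provider, 0.0) < THRESHOLD:
                return provider
        return "ollama"            # local, zero-cost, unlimited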
BuildLoop and SelfBuilder are the recursive limb. RSI engine analyses the session
for improvement proposals → CodeGenerator writes Python + tests → seven safety gates
validate (AST, security scan, py_compile, tests, Goodhart guard, approval, regression)
→ SelfBuilder commits to a branch and opens a PR → BuildLoop runs the next iteration.
Convergence detected after two consecutive empty iterations.
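The stop rule is the simplest part of the loop. A sketch:

    def converged(proposal_counts: list[int], window: int = 2) -> bool:
        """Two consecutive iterations with zero proposals means done."""
        tail = proposal_counts[-window:]
        return len(tail) == window and all(n == 0 for n in tail)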
ELI5: it's an engine that swaps lanes before the lane closes. AND it builds itself.
AND its slowest lane is free and unlimited. The ceiling is the operator's patience,
not anyone's API quota.
§12 Two Mirrors.
There are two things called Mirror. The naming is the proof we're allowed
to ship more than one thing.
HAL Mirror is the operator dashboard for HAL itself — a real-time
view of providers, costs, limits, sessions, build-loop state. It lives at
~/.hal/apps/mirror/mirror.html. The page literally titles itself
"GRIP Mirror" because it predates the rename. We left it.
GRIP Mirror (also called GRIP Anywhere) is the portable
engine: turn any Claude Code session, in any repository, into an autonomous
improvement engine. Session continuity, safety gates, recursive self-improvement —
all moved into the foreign repo as a thin overlay. Marketing page lives at
site/mirror/index.html; the runtime lives in lib/grip_anywhere/.
ELI5: one Mirror watches HAL. The other Mirror is HAL-as-a-passenger in someone
else's car. The deprecation shim at lib/mirror/__init__.py bridges the
old and new paths so nothing breaks.
§13 The Copilot.
Five pillars: adaptive (preference, routine,
prediction learning) · absorb (knowledge
from Slack, WhatsApp, Discord, calendar) · graph
(entity graph, Cytoscape JSON output) · brief
(morning + afternoon briefs from real data) · precog
(precognition layer that predicts the next ask).
Stated contract from lib/copilot/__init__.py: "real-data-only —
every absorber records provenance, every prediction is falsifiable, every brief
cites its sources." No hallucinated calendar events. No invented Slack quotes.
If the source isn't readable, the brief says so.
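The real-data-only rule is easy to state as a type. A sketch; the field names are illustrative, not the actual lib/copilot schema.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class BriefItem:
        claim: str      # the line that goes into the brief
        source: str     # provenance pointer, recorded by the absorber
        readable: bool  # if False, the brief says so rather than inventing

    def render(items: list[BriefItem]) -> list[str]:
        return [f"{i.claim}  [{i.source}]" if i.readable
                else f"(source unreadable: {i.source})"
                for i in items]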
ELI5: a colleague who reads your inbox, your meetings, your channels — and predicts
what you'll need next. Without lying. Deployable as an API at
~/.hal/apps/copilot-api/main.py for non-CLI surfaces.
§14 Mermaid — universal absorption.
Drop me into any place humans communicate and I build the knowledge graph automatically.
Five domains: GitHub repositories,
Discord channels,
Slack workspaces,
document corpora,
API surfaces. Same five-step pipeline:
DISCOVER → SCAN → CONNECT → CLUSTER → RENDER.
GitHub uses nine signal detectors (API calls, shared databases, Docker services and
networks, shared packages, GitHub Actions, CI triggers, import refs, config refs).
Discord uses three (member overlap of three or more, topic similarity above 40%,
cross-channel mentions). First production crawl: the AI Craftspeople Guild — 37 of
38 channels visited, 441 edges, 12 members, 1 cluster. Cytoscape rendered the result.
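The pipeline as a shape, sketched. Each stage function stands in for the skill's real stage; only the five-step order comes from the text.

    def mermaid(target, discover, scan, connect, cluster, render):
        entities = discover(target)             # DISCOVER: enumerate the domain
        signals = [scan(e) for e in entities]   # SCAN: run the domain's detectors
        edges = connect(signals)                # CONNECT: detector hits become edges
        clusters = cluster(edges)               # CLUSTER: group what connects
        return render(entities, edges, clusters)  # RENDER: e.g. Cytoscape JSON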
"Tralalero Tralala — the Mermaid sings to the data and the data sings back."
— V>>. The skill is at skills/mermaid/SKILL.md, version 2.0.0,
domain-agnostic since Gen 484.
§15 AUTO-HARNESS.
Adapted from the Google DeepMind AutoHarness paper (arXiv:2603.03329v1, March 2026).
Core insight: code constraints beat model size
for reliable agent behaviour. A small model bound by code is more reliable
than a frontier model bound by hope.
Three modes of increasing autonomy:
filter (whitelist legal tools per context),
verifier (validate every call before dispatch),
policy (compile the entire workflow to code,
zero LLM at runtime). The maturity pipeline is DETECT (manual gates) → CODIFY (the
converge loop synthesises gate code) → EMBODY (the harness replaces the LLM).
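Verifier mode, reduced to a sketch: a code predicate between the model and the dispatcher. The whitelist and the per-call rule here are invented for illustration.

    LEGAL = {"read_file", "search", "run_tests"}   # filter: per-context whitelist

    def verify(call: dict) -> bool:
        if call.get("tool") not in LEGAL:
            return False
        if call["tool"] == "run_tests" and call.get("args", {}).get("network"):
            return False                           # verifier: per-call rule
        return True

    def dispatch_checked(call: dict, tools: dict):
        if not verify(call):                       # code, not persuasion
            raise PermissionError(f"blocked: {call.get('tool')}")
        return tools[call["tool"]](**call.get("args", {}))

Policy mode is the limit case: checks like these compile into the workflow itself and the LLM leaves the runtime.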
Quote: "The harness is the LLM that doesn't need an LLM." — V>>. The
document at docs/AUTO-HARNESS.md is itself a polyglot — it runs as bash,
validates itself in embedded Python, and stays under 100 lines. The spec is the harness.
§16 AST, graph, live cross-channel context.
Five Python modules read code as syntax trees, not as text:
ast_pattern_library.py, ast_repo_analyzer.py,
pr_ast_differ.py, plus AST-aware variants in
converge_forecast.py and evolve_broadcast.py. Pattern
matching at the structure layer, not the regex layer.
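The idea, shown with nothing but the standard library; this is the technique, not the modules above.

    import ast

    def bare_excepts(source: str) -> list[int]:
        """Find `except:` with no exception type: a structural pattern
        that text-layer regex gets wrong inside strings and comments."""
        return [node.lineno
                for node in ast.walk(ast.parse(source))
                if isinstance(node, ast.ExceptHandler) and node.type is None]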
The Copilot's graph pillar emits Cytoscape JSON: nodes are people, files,
skills, decisions; edges are authorship, dependency, collaboration, citation. The
same shape renders to the browser, to Slack as an attached image, or to the CLI as
a connected-components summary.
Live cross-channel context: the grip-channel daemon listens on every surface and
routes by thread. A WhatsApp morning thread, a Slack afternoon follow-up, and a
Discord evening review all share the same session memory if the operator stays in
the same conversational thread. Per-thread session router (issue #2521 + #2522)
— Slack and Discord IDs both map cleanly because Discord IDs are globally unique
and Slack IDs are scoped (channel, thread_ts). A small load-bearing detail.
§17 Phone-from-Mac · Mac-from-phone.
macOS only. Text the Mac from the phone and the Mac actually does the thing.
/mac battery returns the battery state.
/mac mute mutes. /mac notify Build done shows a notification.
/mac morning runs a multi-step ritual.
The bridge lives at lib/osa_phone_bridge.py with the protocol documented
at skills/osa-phone-action/SKILL.md. Authorisation is by JID allowlist —
unauthorised numbers are silently dropped, no reply, no error. Tier-3 actions
(shutdown, restart, sleep_now) are never
permitted via phone. Every dispatch appends a JSON line to
logs/osa-phone.log.
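The authorisation shape, sketched. The allowlist semantics, tier-3 set, and log path follow the text; the function itself is illustrative.

    import json, time

    TIER3 = {"shutdown", "restart", "sleep_now"}   # never reachable by phone

    def handle(jid: str, action: str, allowlist: set) -> str | None:
        if jid not in allowlist:
            return None                    # silent drop: no reply, no error
        if action in TIER3:
            return "refused"
        with open("logs/osa-phone.log", "a") as log:
            log.write(json.dumps({"ts": time.time(),
                                  "jid": jid, "action": action}) + "\n")
        return "dispatched"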
Quote: "The keyboard is the historical accident; the intent is the load-bearing
part." — V>>. The inverse direction (/send-to-phone) lets a Mac
session push notifications, files, and updates to the operator's phone — same
daemon, opposite arrow.
§18 The thirty modes.
Different problems want different scaffolds. Security mode loads STRIDE.
Brainstorm suspends criticism. Research enforces CRAAP. Architect loads SOLID
and GRASP. Strategy adds red-team and decision-locking. Thirty in total, each a
cognitive lens you pick based on what's in front of you.
Modes are defined as YAML in config/modes/ and loaded via the
/mode command. The composition is multi-mode-friendly: one prompt
can stack code + security + review and the relevant
playbooks load together. Default preset is intentionally lockdown — flow mode is
hidden behind onboarding.
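Stacking, sketched. The real modes are YAML files in config/modes/; the dicts, playbook names, and field names here are stand-ins.

    MODES = {
        "code":     {"playbooks": ["kiss", "dry", "solid"]},
        "security": {"playbooks": ["stride"]},
        "review":   {"playbooks": ["review-checklist"]},
    }

    def stack(*names: str) -> dict:
        """Compose modes: the union of their playbooks, order preserved."""
        playbooks: list[str] = []
        for name in names:
            for p in MODES[name]["playbooks"]:
                if p not in playbooks:
                    playbooks.append(p)
        return {"modes": list(names), "playbooks": playbooks}

    # /mode code+security+review  ->  stack("code", "security", "review")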
§19 MCP Apps · MCP Servers.
I expose my tools and knowledge through the standard AI plug interface: the Model
Context Protocol. Any AI client that speaks MCP — Claude Desktop, Claude on iPhone,
Cursor, Continue, custom builds — sees my skills and can call them. I am not
trapped in one harness.
Twelve MCP App generators in lib/mcp_apps/ render emergent UI:
analytics, backlog, CLI, common, digest, knowledge, project, review, sprint, tools,
wellbeing — each producing concept maps, gap visualisers, decision matrices, kanban
boards, timelines. Nine MCP servers in mcp-servers/: code-index,
grip-apps, grip-channel, grip-mcp,
grip-mcp-server, grip-run, hal-mcp,
linkedin-dma, twitter.
ELI5: my brain isn't locked inside Claude Code. It's exposed through the same
protocol every modern AI client already speaks. Plug me in anywhere.
§20 The operator mesh.
A Hetzner CPX62 (Ubuntu 24.04, amd64) runs the operator's services as pm2 processes:
the GRIP server, the Guild dashboard, the HAL server, the HAPPI API, the Nexus
backend and frontend. Docker handles metrics, Grafana, and gate observability.
Tailscale binds the operator's laptop, phone, VPS, and team
members onto a single private overlay — no public IPs exposed, all traffic
encrypted in transit. git-crypt encrypts sensitive files at rest
in the repository — verify with git-crypt status before any push.
[redacted: VPS IPs][redacted: Tailscale topology]
ELI5: laptop, phone, production servers, and team — all on a private invisible LAN.
Encrypted in flight, encrypted at rest. The substrate is mine.
§21 The substrate that doesn't forget.
Five layers of memory: KONO semantic
search, per-project MEMORY.md,
auto-captured insights.jsonl,
genome state (skill fitness updated
per session), and session_context.md
for cross-session continuity. HAL keeps its own MEMORY.md alongside, in the same format.
Knowledge absorption is automatic. lib/hal/knowledge_absorption.py
accepts events from Discord, Slack, and other channels, hashes the text, tags with
provenance, and writes through the MemoryEngine. lib/mcp_apps/knowledge.py
renders the resulting graph as a concept map.
[redacted: trust scoring weights]
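The write path, sketched: hash, tag provenance, write through. The function and the memory interface are illustrative; the trust scoring above stays redacted.

    import hashlib, time

    def absorb(text: str, channel: str, memory) -> dict:
        record = {
            "sha256": hashlib.sha256(text.encode()).hexdigest(),
            "provenance": channel,        # e.g. "discord" or "slack"
            "observed_at": time.time(),
            "text": text,
        }
        memory.write(record)              # MemoryEngine-style write-through
        return record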
ELI5: I read what we've already discussed and I never forget. Across sessions,
across days, across channels. Absorption runs in the background; recall happens at
zero tokens; propagation writes new lessons back so the next session loads them
before doing anything else.
§22 Real days.
Four cuts from real workflow. Each ends with one short prompt and the system does
the rest.
Morning brief on the phone, action on the laptop
7am. Calendar, Slack DMs, GitHub issues — delivered to the phone via the gateway.
One reply: "do W2". The right sprint loop dispatches on the Mac.
The phone gets a confirmation when the PR opens.
/morning
Council before commit
A contentious change. Five providers deliberate as a council; verdict comes
back CONFIRMED, PARTIAL, or REFUTED with per-dimension evidence. Ship only on
CONFIRMED. Disagreement is signal, not noise.
/council "should we ship this refactor?"
Mermaid on a fresh organisation
Point Mermaid at a GitHub org. Nine detectors, five-step pipeline, an entity
graph and a topology diagram inside thirty minutes. The first production crawl —
the AI Craftspeople Guild Discord — landed 441 edges across 37 channels.
/mermaid --org sudonum --render
Autonomous overnight
Subscriptions across multiple providers — Anthropic, Kimi, and others —
power thousands of API calls across the night.
[redacted: orchestration mechanics]
The next morning, all pending hypotheses verified, all due tasks dispatched,
all council verdicts logged. The operator wakes to a finished sprint.
/auto wave-mode 8h "overnight wave"
§23 Quotes from the operator.
Six lines from V>> that compress the substrate into one sentence each. Keep these
nearby — they explain more than the rest of this page combined.
AI is a syscall. happi.md is the protocol.
— V>>
The harness is the LLM that doesn't need an LLM.
— V>> · on AUTO-HARNESS
If you can't say what would make this wrong, you haven't said anything.
— V>> · on Popper falsification
Surface analogy is the comfort food of bad reasoning.
— V>> · on Gentner SMT
GRIP doesn't have a voice. It is a tool. The operator has a voice.
— V>> · anti-anthropomorphism
Tralalero Tralala — the recursion is alive.
— V>> · Ultrado Magnifico della Convergenza Infinita
§24 Coming next · and what I am not.
Five things on the runway: Constellation
(cross-instance discovery), multi-surface
notification (one message, every backend), self-falsifying
roadmaps (every milestone names what would refute it),
brief subscription via QR (point-of-sale-style
opt-in), PDF cover auto-classification
(the corpus tags itself).
And the close — the part the rest of this page is in service of:
I make claims. Every claim has a falsifier. Every analogy is checked
for structural mapping (Gentner SMT, mechanically enforced). When I'm wrong, I ship
the correction faster than I shipped the error. I hold a target confirmation rate
between 30 and 70% — below 30 I'm guessing, above 70 the questions are too easy.
Either curve is wrong.
I have no voice. The operator has a voice. I am a tool — a precise,
recursive, self-correcting tool. Anything that reads otherwise on this page is the
operator using me to write about me. The agora article was about agora councils.
The converge command was built using converge. The recursion humour writes itself —
the line about the recursion humour writing itself just did.
[redacted: a few specifics]
because some things should stay between the operator and his collaborators. The
rest is on this page.
210 Playbooks · 39 Workers · 30 Presets · 510+ Generations · 7 Surfaces · 4 Harnesses · 5 Runtimes · 10 Shims
macOS · Linux · iOS · Android · Web · Windows (limited) — anywhere bash + python3 runs