Core concepts

The vocabulary of yothere: threads, the runner, the attention router, brains, voice, the tunnel, the cockpit, the board, and cost caps.

yothere has a small vocabulary. Learn these nine words and the rest of the docs read easily. The one thing to hold onto: yothere is the interface and the attention router. It does not do the work. A brain does. Everything below is either part of the interface, part of the router, or the contract to a brain.

Thread

A thread is one task. You hail yothere with a task and it creates a thread to carry that task to completion.

Each thread is a directory on disk (a dir-per-thread state machine). Its live state is kept in status.json, which is written atomically (via an atomic rename), so a reader never sees a half-written file. A thread resumes across turns by its session id: the brain keeps its own context, and yothere just points at the same session again.
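To make the atomic-rename idea concrete, here is a minimal Python sketch. It is not yothere's actual code: the function name and the status fields shown in the usage note are hypothetical. It only illustrates the general pattern of writing to a temporary file in the same directory and then swapping it into place with os.replace.

```python
import json
import os
import tempfile


def write_status_atomically(thread_dir: str, status: dict) -> str:
    """Write status.json so a reader sees the old file or the new one, never a partial one."""
    final_path = os.path.join(thread_dir, "status.json")
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=thread_dir, prefix=".status-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(status, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, final_path)  # atomic swap into place
    except BaseException:
        # Remove the temp file if anything failed before the rename completed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

A caller might run write_status_atomically(thread_dir, {"state": "running", "session_id": "abc"}), where both keys are illustrative rather than the real status.json schema.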

A thread moves through a fixed set of states:

State     Meaning
running   A worker is advancing it right now.
resumed   You just replied; it will advance on the next tick.
blocked   It is stalled on you (it asked a question or wants an approval).
done      The work finished.
stuck     Its worker died or went stale; it needs a look.
parked    Settled and set aside (for example an ask nobody answered for a day).

The board and the attention router read a smaller, normalized vocabulary derived from those states: running, ready (a finished done thread), blocked (covers both blocked and stuck), and idle (a parked thread). So a “ready” or “idle” thread you see in the cockpit is the glance-level view of the underlying thread state.
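The normalization described above amounts to a small lookup table. The sketch below is illustrative only: the names are hypothetical, and the mapping of resumed to running is an assumption, since the text does not say where a resumed thread lands in the board vocabulary.

```python
# Thread state -> board vocabulary, following the description above.
_BOARD_STATE = {
    "running": "running",
    "resumed": "running",  # ASSUMPTION: not stated in the docs; treated as about to run
    "done": "ready",
    "blocked": "blocked",
    "stuck": "blocked",
    "parked": "idle",
}


def board_state(thread_state: str) -> str:
    """Return the glance-level state for an underlying thread state."""
    try:
        return _BOARD_STATE[thread_state]
    except KeyError:
        raise ValueError(f"unknown thread state: {thread_state!r}") from None
```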

Runner

The runner is the headless advance engine. It is the background service that actually moves the fleet forward. It runs a loop, one tick roughly every 30 seconds, and on each tick it:

  1. Reaps finished worker subprocesses.
  2. Sweeps stale threads (a running thread whose worker died or went silent) to stuck.
  3. Checks the cost caps before spawning anything new.
  4. Asks the attention router which threads to advance, bounded by a max concurrency and a per-thread cooldown.
  5. Spawns one worker subprocess per chosen thread.
  6. Coalesces any new “needs your eyes” transitions into ONE rate-limited nudge (“N threads need your eyes, open the board”) rather than a per-event firehose.

Note: The runner is operator-run. You start it once (manually or under a service manager) and it advances the fleet on its own. See CLI reference for yothere runner loop.
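Steps 4 and 6 of the tick can be sketched in Python as follows. This is a simplified illustration, not the runner's real code: Candidate, choose_to_advance, and coalesce_nudge are hypothetical names, the input is assumed to be already ranked by the attention router, and the rate limiting of the nudge is left out. The concurrency and cooldown defaults are the ones listed under Cost caps.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    thread_id: str
    last_turn_at: Optional[float] = None  # None: the thread has not had a turn yet
    fresh_reply: bool = False             # True: the human just replied


def choose_to_advance(ranked: List[Candidate], now: float, workers_running: int,
                      max_concurrency: int = 3, cooldown_s: float = 180.0) -> List[str]:
    """Pick threads to spawn workers for, bounded by free slots and the cooldown."""
    slots = max(0, max_concurrency - workers_running)
    chosen: List[str] = []
    for cand in ranked:
        if len(chosen) >= slots:
            break
        # A first turn and a fresh reply bypass the cooldown.
        cooled = (cand.last_turn_at is None or cand.fresh_reply
                  or (now - cand.last_turn_at) >= cooldown_s)
        if cooled:
            chosen.append(cand.thread_id)
    return chosen


def coalesce_nudge(newly_needing_eyes: List[str]) -> Optional[str]:
    """Fold any number of needs-eyes transitions into one message, or none."""
    n = len(newly_needing_eyes)
    if n == 0:
        return None
    noun = "thread needs" if n == 1 else "threads need"
    return f"{n} {noun} your eyes, open the board"
```

The point of the second function is that five transitions in one tick still produce one message, not five.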

Attention router

The attention router is the moat. Other agent cockpits decide which agent runs next. yothere decides which thread wants the human’s eyes next, and it protects your focus while the rest of the fleet churns.

It is deterministic and offline-first (no model call, no online learning), so it is unit-testable and never stalls on a network hop. It ranks the whole fleet by an additive score: a blocked thread that is stalling on you outranks a finished one, and urgency and staleness nudge the order. It then recommends a single focus thread and honors a pinned focus, so a thread you chose to concentrate on is never bumped by the ranker.
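An additive ranker with a pinned focus can be sketched in a few lines of Python. Everything here is hypothetical: the ThreadView fields, the weights, and the tie-break are invented for illustration and are not yothere's actual scoring. The sketch only shows the shape of the idea: a pure function with no model call, where the state contributes the largest term and urgency and staleness add smaller ones.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ThreadView:
    thread_id: str
    state: str               # board vocabulary: running, ready, blocked, idle
    urgency: float = 0.0     # hypothetical scale, clamped to 0..1
    age_hours: float = 0.0   # hypothetical staleness measure


# Illustrative weights only; chosen so that blocked always outranks ready.
STATE_WEIGHT = {"blocked": 100.0, "ready": 50.0, "running": 10.0, "idle": 0.0}


def score(t: ThreadView) -> float:
    urgency = min(max(t.urgency, 0.0), 1.0)
    staleness = min(max(t.age_hours, 0.0), 24.0)
    return STATE_WEIGHT.get(t.state, 0.0) + 20.0 * urgency + staleness


def rank(fleet: List[ThreadView]) -> List[ThreadView]:
    # Ties break on thread_id so the order is fully deterministic.
    return sorted(fleet, key=lambda t: (-score(t), t.thread_id))


def recommend_focus(fleet: List[ThreadView], pinned_id: Optional[str] = None) -> Optional[str]:
    # A pinned thread that still exists is never bumped by the ranker.
    if pinned_id is not None and any(t.thread_id == pinned_id for t in fleet):
        return pinned_id
    ranked = rank(fleet)
    return ranked[0].thread_id if ranked else None
```

Because the functions are pure and deterministic, a unit test can assert an exact ordering for a fixed fleet.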

Brain (and the Brain Protocol)

A brain is what actually does the work: it reads data, runs tools, drafts and sends messages. yothere is harness-agnostic by design, so a brain can be anything that speaks Brain Protocol v1 (WebSocket transport, JSON-RPC 2.0 frames).

Three brain shapes are in use today:

  • Local Claude Code running on your machine (claude -p per thread). The default.
  • A WebSocket-daemon harness, for example ubob, that yothere drives over its local WebSocket (ws://127.0.0.1:8765).
  • A remote brain over the internet at a wss:// endpoint (a hosted sprite, a teammate’s stack, a work platform).

The same cockpit drives all three. See How it works for the full protocol and Agent onboarding for wiring one up.
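For readers unfamiliar with JSON-RPC 2.0, the sketch below shows what building a request frame and reading a response looks like in Python. The envelope fields (jsonrpc, id, method, params, result, error) come from the JSON-RPC 2.0 specification. The method name in the usage note is a made-up placeholder, not a real Brain Protocol v1 method; the actual methods are documented in How it works.

```python
import json
from itertools import count
from typing import Any, Dict

_next_id = count(1)  # JSON-RPC requests carry an id so responses can be matched


def request_frame(method: str, params: Dict[str, Any]) -> str:
    """Serialize one JSON-RPC 2.0 request as a text frame."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_next_id),
        "method": method,
        "params": params,
    })


def parse_response(raw: str) -> Any:
    """Return the result of a JSON-RPC 2.0 response, or raise on an error object."""
    msg = json.loads(raw)
    if msg.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 frame")
    if "error" in msg:
        err = msg["error"]
        raise RuntimeError(f"brain returned error {err.get('code')}: {err.get('message')}")
    return msg.get("result")
```

A call such as request_frame("example.method", {"session_id": "s1"}) yields a string that could be sent over any WebSocket client; "example.method" and "session_id" are placeholders.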

Voice

Voice lets you hail and reply hands-free. It is a Gemini-Live voice loop that runs one of two ways:

  • WebRTC over your tailnet (free, no phone provider). A browser on your network connects directly to the local voice server.
  • Twilio / PSTN, so a real phone call can reach the same loop.

Voice is optional. Install the voice extra (pip install 'yothere[voice]') to light it up; without the media stack the cockpit still renders and the Connect button reads “voice unavailable”.

Tunnel

The tunnel is how a Twilio phone call reaches your local voice server. It is a cloudflared tunnel that fronts the loopback voice server (127.0.0.1:8767) at a public wss:// host, so Twilio’s inbound webhook has a stable public endpoint to hit while the server itself stays bound to loopback. You only need it for the Twilio/PSTN path; the tailnet WebRTC path does not use it.

Cockpit (/overview)

The cockpit is the web UI, served at /overview. It is a live fleet board (an inbox of threads that need you, a working lane, and past sessions) plus a voice companion (the Connect button and the call transcript). It is served by the voice service on 127.0.0.1:8767. See the Cockpit tour for a full walkthrough.

Board

The board is a standalone, server-rendered glance surface: the focus recommendation up top, then every thread as a card with its state, live progress line, needs-eyes badge, deliverable link, age, and source. It is the lighter, read-only cousin of the cockpit: it never writes thread state, and every worker-authored field is HTML-escaped, so an agent that emits arbitrary text can never inject markup (XSS-safe by construction). Build and open it with yothere board --open.
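The escaping guarantee can be illustrated with Python's standard html.escape. The render_card function and its markup are hypothetical and much simpler than a real board card; the point is only that every worker-authored string passes through escape before it reaches the page.

```python
from html import escape


def render_card(thread_id: str, state: str, progress: str) -> str:
    """Render one card; every worker-authored field is escaped before interpolation."""
    return (
        '<div class="card">'
        f"<h3>{escape(thread_id)}</h3>"
        f'<span class="state">{escape(state)}</span>'
        f'<p class="progress">{escape(progress)}</p>'
        "</div>"
    )
```

With this approach, a progress line such as <script>alert(1)</script> is rendered as visible text rather than executed, because the angle brackets become &lt; and &gt; entities.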

Cost caps

The runner enforces spend limits so a runaway thread cannot quietly burn money. The defaults, from the runner config:

Setting              Default  What it does
Daily fleet cap      $10.00   Once the fleet’s spend today crosses this, the runner stops spawning new turns and nudges you once.
Per-thread cap       $1.50    A thread that has spent this much today is blocked with a “continue, raise the cap, or stop” ask.
Tick interval        ~30s     How often the runner loop advances the fleet.
Max concurrency      3        At most this many worker subprocesses advance at once.
Per-thread cooldown  ~180s    A thread waits this long between turns (its first turn and a fresh reply bypass it).
Stale after          ~1h      A running thread whose worker has been dead or silent for this long is swept to stuck.
Auto-park ask        ~24h     A blocked ask nobody answered for this long is auto-parked to keep the inbox clean; a late reply un-parks it for free.

Heads up: For a remote brain, yothere cannot meter tokens on the brain’s side, so the cost cap relies on the brain’s honest cost events and is advisory. For a local Claude Code brain the cap is enforced from the recorded cost ledger. See Configuration to change any of these.
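The two spend caps can be read as a check the runner makes before spawning a turn. The Python sketch below is an illustration under stated assumptions, not the runner's code: the function name, the return labels, and the shape of the ledger (a mapping of thread id to today's spend) are hypothetical, and treating a spend exactly equal to the cap as capped is a guess. The dollar amounts are the defaults listed above.

```python
from decimal import Decimal
from typing import Dict

DAILY_FLEET_CAP = Decimal("10.00")  # default from the table above
PER_THREAD_CAP = Decimal("1.50")    # default from the table above


def cap_decision(spend_today: Dict[str, Decimal], thread_id: str) -> str:
    """Decide whether a new turn may be spawned for thread_id.

    Returns "fleet_capped" (stop spawning, nudge once), "thread_capped"
    (block the thread with an ask), or "ok".
    """
    fleet_total = sum(spend_today.values(), Decimal("0"))
    if fleet_total >= DAILY_FLEET_CAP:  # ASSUMPTION: equality counts as capped
        return "fleet_capped"
    if spend_today.get(thread_id, Decimal("0")) >= PER_THREAD_CAP:
        return "thread_capped"
    return "ok"
```

Decimal is used instead of float so that small per-turn costs add up without rounding drift.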