v0.1.0 · public alpha · self-hosted

The outer loop for agent coding

A library of loops.
A runtime that can be trusted.

Borrow a loop someone already tuned. Fork it, make it yours, and let it run: on a schedule, under a budget, with every step traced and every change reversible. Stop rewriting the same prompt. Run the loop.

Browse the Library · Run the keyless demo (keyless · < 5 min)

self-hosted · CLI + MCP · audit-chained · output contracts · rollback · cost ledger · local models

The premise

Every routine is a hundred small decisions. Decisions are glucose; decisions are willpower. Spend them on the work, not on remembering how to begin it.

loop → runtime → flow

The primitive

Not a prompt. A loop.

A recipe is text you paste once. A loop is an object you run, fork, schedule, evolve, and audit — a repeated pattern with everything it needs to be trusted on its own.

Recipe vs. hyperloop

Copy-paste text → Forkable, versioned object

One-off run → Durable, scheduled run

Hidden context → Explicit steps, tools, models

No record → Hash-chained audit trail

No way back → Reversible revisions

Unknown reliability → A score and a history

The recipe is where you start. The runtime is what makes it safe to repeat.

The convergence

The empty cells are the product.

Codex and Claude Code converged on the same inner-loop primitives — automations, worktrees, skills, connectors, sub-agents, a state file. The outer-loop jobs — the ones that only exist once an agent runs unattended — are mostly blank in both. That blank is where hyperloops lives.

Primitive: Dispatch
Job in the outer loop: run the right loop, on trigger
Codex / Claude Code: Automations / cron + /loop
hyperloops: A fleet scheduler with per-loop budgets and a durable kill switch.

Primitive: Verify
Job in the outer loop: did the run meet its goal?
Codex / Claude Code: a sub-agent sanity-checks in-run
hyperloops: Every run graded against an operator-frozen output contract, with pass/fail and formScore recorded.

Primitive: Account
Job in the outer loop: what did the autonomy cost?
Codex / Claude Code: no account line
hyperloops: A cost ledger per run, step, and agent, with budgets, a net-positive gate, and a durable kill.

Primitive: Record
Job in the outer loop: what happened, provably
Codex / Claude Code: State → Markdown or Linear
hyperloops: An append-only, HMAC-chained event / cost / proposal ledger, with a cockpit to read it.

Primitive: Evolve
Job in the outer loop: improve the loop itself
Codex / Claude Code: none
hyperloops: A meta-loop proposes diffs to its own loops: gated, dormant by default, and floored.

Primitive: Cage
Job in the outer loop: keep self-modification safe
Codex / Claude Code: none
hyperloops: Cage-config is structurally inexpressible in any proposer diff (the floor partition).

Five of these six are outer-loop plumbing that is built and verified. The sixth — Evolve — is the open frontier: gated, dormant, and under test, because whether a loop can safely improve its own loops is exactly the question BOBOS is built to answer — and the result isn't in yet.

The Library · browse · fork

Start with a loop.

Open, forkable agent loops — nothing's for sale. Each one is a reusable operating pattern (trigger, steps, model, tools, checks, budget, governance), and each card carries its governance level and an evidence-tagged metric, not a copy button.
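To make the "reusable operating pattern" concrete, here is a minimal Python sketch of a loop as a data object that can be forked. The class name `Loop`, the field types, the `forked_from` field, and the `fork` method are illustrative assumptions, not the hyperloops schema; only the seven field names come from the list above.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Loop:
    """Hypothetical shape of a loop; field names mirror the list above."""
    name: str
    trigger: str
    steps: Tuple[str, ...]
    model: str
    tools: Tuple[str, ...]
    checks: Tuple[str, ...]
    budget_cents: int
    governance: str
    forked_from: Optional[str] = None

    def fork(self, new_name: str, **overrides) -> "Loop":
        # A fork copies every field, records its parent, and applies overrides.
        return replace(self, name=new_name, forked_from=self.name, **overrides)


official = Loop(
    name="ci-failure-triage",
    trigger="GitHub Actions failure webhook",
    steps=("collect logs", "classify root cause", "propose patch plan", "open approval item"),
    model="claude",
    tools=("github",),
    checks=("output contract",),
    budget_cents=50,  # illustrative value, not a documented default
    governance="human approval",
)

mine = official.fork("my-ci-triage", budget_cents=25)
print(mine.forked_from, mine.budget_cents)
```

Keeping the object immutable means a fork never alters the loop it came from, which is one simple way to keep revisions attributable.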

Browse the Library ↗

official

Data QA Triage

"Catch the malformed row before it becomes policy."

Classify defective rows, verify JSON shape, and surface drift before it reaches production.

data · qa · output contract

trigger: batch upload or webhook
runtime: 2-6 min
governance: propose only
agent: Claude

@hyperloops · Fork →

official

CI Failure Triage

"A failed run should leave a map, not a mystery."

Collect logs, classify the likely root cause, propose a minimal patch plan, and open an approval item.

ci · github · approval

trigger: GitHub Actions failure webhook
runtime: 3-8 min
governance: human approval
agent: Claude

@hyperloops · Fork →

official

Release Prep

"Ship with a memory of how to come back."

Turn merged work into a release candidate with changelog, smoke checks, backups, and rollback notes.

release · changelog · rollback

trigger: manual or tag candidate
runtime: 5-12 min
governance: human approval
agent: Claude

@hyperloops · Fork →

official

Incident Follow-up

"The incident ends when the loop learns."

Convert an incident into a timeline, root cause draft, action items, and recurrence checks.

incident · postmortem · ops

trigger: incident closed
runtime: 6-15 min
governance: propose only
agent: Claude

@hyperloops · Fork →

official

Memory Proposal Review

"Memory changes should knock before they enter."

Let agents propose memory updates without letting them silently rewrite the floor.

memory · agent · audit

trigger: session end or operator request
runtime: 1-4 min
governance: propose only
agent: Claude

@hyperloops · Fork →

official

Code Review QA

"Findings graded, not vibed."

Review a diff against a frozen checklist, grade each finding by severity, and post a structured review — no rubber stamps.

code · review · output contract

trigger: pull request opened
runtime: 3-9 min
governance: propose only
agent: Claude

@hyperloops · Fork →

The demo is the pitch

A governed run in five minutes. No key required.

A cold install opens on three keyless agent-coding demo loops. Press run and watch a real run finish — verified against a frozen output contract — without connecting an account. When you're ready, connect a provider and point loops at your own work.

CI Failure Triage

keyless

A failed pipeline comes back with a ranked root cause and a minimal patch plan.

Code Review QA

keyless

A diff is checked against a frozen review contract — findings are graded, not vibed.

Release Prep

keyless

Merged work becomes a release candidate: changelog, smoke checks, rollback notes.

A real Release Prep trace from a fresh seed, not mock data. The verifier passes one step against the frozen output contract and catches a deliberately malformed one.
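As a rough illustration of what verifying a step against a frozen output contract can mean, the Python sketch below grades one well-formed and one malformed step. The contract fields, the `grade` function, and the way the form score is computed (fraction of required fields present with the expected type) are assumptions made for this example, not the hyperloops grader.

```python
import json

# Hypothetical frozen contract: required field name -> expected JSON type.
CONTRACT = {"version": str, "changelog": list, "rollback_notes": str}


def grade(raw_output: str, contract=CONTRACT):
    """Return (passed, form_score). Checks form only, never correctness."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return False, 0.0
    if not isinstance(data, dict):
        return False, 0.0
    satisfied = sum(
        1 for key, expected in contract.items()
        if isinstance(data.get(key), expected)
    )
    form_score = satisfied / len(contract)
    return form_score == 1.0, form_score


good_step = json.dumps(
    {"version": "1.4.0", "changelog": ["fix: retry"], "rollback_notes": "revert tag"}
)
malformed_step = '{"version": "1.4.0", "changelog": "oops"'  # truncated JSON

print(grade(good_step))       # (True, 1.0)
print(grade(malformed_step))  # (False, 0.0)
```

The grader only asks whether the output has the agreed shape. Whether the changelog is accurate is a separate question that this kind of check cannot answer.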

Run the keyless demo · Connect a provider

Durable execution

Your loop survives the crash.

A hyperloop is a durable workflow, not a chat. It is enqueued, claimed, executed, and finalized — and if the machine dies mid-run, it picks up exactly where it left off.

One run, four phases

Crash between execute and finalize? The next worker reclaims the run from its step cursor and continues — it does not start over.
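The reclaim-and-continue behavior can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `run_steps` function, the in-memory `state` dictionary standing in for durable storage, and the simulated crash. A real runtime would persist the cursor and the step output together.

```python
def run_steps(steps, state):
    """Execute steps starting at state['cursor'], advancing it after each one."""
    while state["cursor"] < len(steps):
        step = steps[state["cursor"]]
        state["outputs"].append(step())
        state["cursor"] += 1  # a real system would write this durably
    state["status"] = "finalized"
    return state


calls = []


def make_step(name, crash_once=False):
    crashed = {"done": False}

    def step():
        if crash_once and not crashed["done"]:
            crashed["done"] = True
            raise RuntimeError("worker died")
        calls.append(name)
        return name + ":ok"

    return step


steps = [make_step("collect"), make_step("classify", crash_once=True), make_step("plan")]
state = {"cursor": 0, "outputs": [], "status": "claimed"}

try:
    run_steps(steps, state)  # the first worker crashes on the second step
except RuntimeError:
    pass

run_steps(steps, state)  # the next worker reclaims the run from its cursor
print(calls)  # ['collect', 'classify', 'plan']
```

Note that "collect" runs only once: the second worker starts at the saved cursor rather than at step zero.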

Resumable

A run resumes from its step cursor after a crash — no duplicated work.

Scheduled

Cron, interval, or event triggers. The loop runs whether or not you are watching.

Streamed

Step outcomes stream live over the wire — tail a run as it happens.

Budgeted

A cost ceiling and form-contract gates can pause a run before it spends more than it should.
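A cost ceiling that pauses a run before it overspends could look like the Python sketch below. The `CostLedger` class, the `BudgetExceeded` exception, and the cent amounts are hypothetical; the only idea taken from the text is that the check happens before the spend, not after.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push a run past its ceiling."""


class CostLedger:
    def __init__(self, ceiling_cents: int):
        self.ceiling_cents = ceiling_cents
        self.entries = []  # list of (step, cents)

    @property
    def spent_cents(self) -> int:
        return sum(cents for _, cents in self.entries)

    def charge(self, step: str, cents: int) -> None:
        # Refuse the charge up front so the ceiling is never crossed.
        if self.spent_cents + cents > self.ceiling_cents:
            raise BudgetExceeded(f"{step} would exceed {self.ceiling_cents} cents")
        self.entries.append((step, cents))


ledger = CostLedger(ceiling_cents=10)  # illustrative ceiling
ledger.charge("collect logs", 4)
ledger.charge("classify", 5)
try:
    ledger.charge("propose patch", 3)
    paused = False
except BudgetExceeded:
    paused = True

print(paused, ledger.spent_cents)  # True 9
```

Using integer cents avoids floating-point drift when many small charges are summed.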

Autonomous evolution

The loop rewrites the loop.

A meta-loop proposes a change to a loop's steps or prompts. The change passes through real gates and is either held for a human or applied in a controlled sandbox. Every revision is diffed and attributable.

confidence · cooldown · capability · oscillation · multi-objective

A loop, over revisions

meta.propose: on (may it think)

meta.apply: labs (may it act)

Two separate switches, default-deny. Self-modification ships Labs-gated and dormant by default — it never runs unless an operator turns it on.
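One way to read the two-switch, default-deny design is as a small routing function. The sketch below is an assumption-laden illustration: the flag names `meta.propose` and `meta.apply` come from the page, but the `Decision` enum, the `route_proposal` function, and the flag dictionary are invented for this example.

```python
from enum import Enum


class Decision(Enum):
    DROPPED = "dropped"  # proposing is off, so nothing is generated
    HELD = "held"        # proposal recorded, waiting for a human
    APPLIED = "applied"  # applied automatically (Labs only)


def route_proposal(flags: dict) -> Decision:
    # Default-deny: a missing flag counts as off.
    if not flags.get("meta.propose", False):
        return Decision.DROPPED
    if not flags.get("meta.apply", False):
        return Decision.HELD
    return Decision.APPLIED


print(route_proposal({}))                                          # Decision.DROPPED
print(route_proposal({"meta.propose": True}))                      # Decision.HELD
print(route_proposal({"meta.propose": True, "meta.apply": True}))  # Decision.APPLIED
```

Turning on `meta.apply` alone does nothing in this sketch, because acting is only reachable through proposing.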

Safety by inexpressibility

It can't file down its own brakes.

The optimizer's action space cannot name the kill switch, the budget, the audit key, or its own grader — not because a runtime check forbids it, but because a change that tries simply fails to parse. The cage is a type boundary, and a build-time test fails the build if anyone ever widens it.
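The idea that a forbidden change fails to parse, instead of being rejected by a later check, can be approximated in Python with a closed enum of editable targets. This is a sketch under stated assumptions: `EditableField`, `FLOOR`, `parse_diff`, and the test function are hypothetical names, and Python enforces the boundary at parse time, not at compile time as a statically typed implementation could.

```python
from enum import Enum


class EditableField(Enum):
    """The only targets a proposer diff can name (assumed: steps and prompts)."""
    STEPS = "steps"
    PROMPTS = "prompts"


# Names the proposer must never be able to express.
FLOOR = frozenset(
    {"kill_switch", "budget_ceiling", "audit_key", "autonomy_permissions", "grader"}
)


def parse_diff(raw: dict) -> dict:
    """Parse a proposer diff; an unknown target is a parse error, not a denial."""
    parsed = {}
    for target, value in raw.items():
        parsed[EditableField(target)] = value  # raises ValueError if not in the enum
    return parsed


def test_floor_is_inexpressible():
    # Build-time style check: fails if anyone widens the enum into the floor.
    assert FLOOR.isdisjoint(field.value for field in EditableField)


test_floor_is_inexpressible()
print(parse_diff({"prompts": "Be terse."}))

try:
    parse_diff({"kill_switch": "off"})
except ValueError as err:
    print("rejected at parse time:", err)
```

There is no allow/deny decision to get wrong here: a diff that names the kill switch never becomes a value the rest of the system can act on.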

Surface → substrate

Library: browse and fork loops

Runtime: durable execution

Measurement: telemetry + form grader

Proposal engine: evolution under gates

Governance cage: what may land

Immutable floor: kill · budget · audit key

The floor is small and load-bearing. Everything above it can change; it cannot.

Outside the action space

⊘ kill switch

⊘ budget ceiling

⊘ audit key

⊘ autonomy permissions

⊘ its own grader

Not expressible in any diff the proposer can emit.

Cockpit

illustrative

proposal generation: on

auto-apply: hold

pending proposals

approval holds

Kill, pause, and quarantine are always one click away. Counts shown are illustrative.

Measured · signed · accounted

Every decision leaves a signed trace.

Mechanical signal

Reliability, latency, and cost — measured on every run, not estimated.

Immutable grader

An output-contract grader that the loop cannot rewrite. It checks form, not correctness.

Hash-chained audit

Every decision links to the previous one. Tamper with a block and the chain breaks.

Net-positive accounting

Did the cage cost more than it saved? The ledger answers in cents.
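The hash-chained audit described above can be illustrated with Python's standard `hmac` and `hashlib` modules. This is a minimal sketch, not the hyperloops ledger format: the entry layout, the `GENESIS` value, and the function names are assumptions. Each entry's MAC covers the previous entry's MAC, so editing any earlier payload invalidates everything after it.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"


def _mac(key: bytes, prev: str, payload: dict) -> str:
    body = prev.encode() + json.dumps(payload, sort_keys=True).encode()
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def append(chain: list, key: bytes, payload: dict) -> None:
    prev = chain[-1]["mac"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "mac": _mac(key, prev, payload)})


def verify(chain: list, key: bytes) -> bool:
    prev = GENESIS
    for entry in chain:
        expected = _mac(key, prev, entry["payload"])
        if entry["prev"] != prev or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev = entry["mac"]
    return True


key = b"demo-audit-key"  # illustrative only; a real key would not live in code
chain = []
append(chain, key, {"event": "run.started", "loop": "data-qa-triage"})
append(chain, key, {"event": "step.passed", "step": 1})
append(chain, key, {"event": "run.finalized", "cost_cents": 7})

print(verify(chain, key))  # True
chain[1]["payload"]["step"] = 99  # tamper with a middle entry
print(verify(chain, key))  # False
```

Because the MAC is keyed, someone who can edit the log but does not hold the audit key cannot recompute a valid chain after tampering.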

HONEST

The cheap mechanical signals measure reliability, cost, and output form — not correctness. We do not claim the machine knows “good.” That is why the next section exists.

The experiment

We're running it hot — and trying to break our own thesis.

A blind optimizer — it never sees the answer key — evolves a data-QA classifier under a cheap, label-free form signal. Then, out of band and behind a firewall, we measure whether that signal actually drags real accuracy up, or whether quality stays flat while the proxy looks perfect. That second outcome has a name: Goodhart.
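The divergence this experiment looks for can be expressed as a small check: did the form signal improve while held-out F1 did not? The Python below is a sketch with made-up toy labels and scores, not data or results from the experiment, and the `diverged` function and its `min_gain` threshold are assumptions.

```python
def f1_score(y_true, y_pred):
    """Binary F1 from parallel lists of 0/1 labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def diverged(form_before, form_after, f1_before, f1_after, min_gain=0.01):
    """True when the proxy improved but held-out quality did not follow."""
    return (form_after - form_before) >= min_gain and (f1_after - f1_before) < min_gain


# Toy held-out labels and predictions, invented for illustration.
y_true = [1, 1, 0, 0]
pred_before = [1, 0, 0, 0]
pred_after = [1, 0, 1, 0]

f1_before = f1_score(y_true, pred_before)
f1_after = f1_score(y_true, pred_after)

# Toy form scores: the proxy looks much better after the revision.
print(diverged(0.60, 0.95, f1_before, f1_after))
```

Scoring the two signals separately is the point: a single blended metric would hide exactly the case where form rises and quality stays flat.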

Current verdict

pending

PROVEN · INCONCLUSIVE · FALSIFIED

The sweep is still running. We publish the verdict when the numbers settle — including if the answer is no.

What we watch

pending

form score: rising

held-out F1: measuring

parseability: rising

false-positive rate: measuring

Form and held-out quality are scored independently. The whole point is to catch them diverging. Values shown are placeholders until the run is wired in.

Few infrastructure projects will show you their own falsification test. We built one because the self-improvement claim is a bet — and a bet you can't lose isn't worth making.

MCP-native

Run it where your
agents already work.

Everything the dashboard does, the CLI and MCP do too. Install the bridge once and your agents can list, fork, schedule, run, tail, cancel, and audit loops without leaving their workspace.

CLI · MCP server · REST + events · TypeScript SDK · self-hostable

Install hyperloops


$ hyperloops login
$ hyperloops install claude-code --verify
$ hyperloops install cursor --verify
OK installed · verified

CLI

today, list, run, tail, trace, deploy. Your loops in the terminal, with live event streams.

$ hyperloops tail data-qa-triage

MCP

Expose your fleet to Claude Code, Cursor, Codex, or Continue. Loops become tools any agent can call.

hyperloops mcp serve

API + SDK

REST for state, events for streams, a typed client for everything. Scoped, bearer-auth keys.

POST /v1/loops/:id/run
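For readers who want to see the shape of such a call, the Python sketch below builds (but does not send) a bearer-authenticated request to the run endpoint using only the standard library. The path comes from the page; the base URL, the key value, the empty JSON body, and the `build_run_request` helper are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request


def build_run_request(base_url: str, loop_id: str, api_key: str, inputs: dict):
    # The path shape comes from the page; the JSON body is an assumption.
    safe_id = urllib.parse.quote(loop_id, safe="")
    return urllib.request.Request(
        url=f"{base_url}/v1/loops/{safe_id}/run",
        data=json.dumps(inputs).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical self-hosted address and placeholder key.
req = build_run_request("http://localhost:8080", "data-qa-triage", "hl_example_key", {})
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it makes the URL and headers easy to inspect before any network call is made.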

A loop is a pattern that repeats. A life is a braid of loops.
Free the mind from loops; let loops free the mind.

Built for the moment loops run themselves. We're building the controls first — and measuring whether the substrate earns its keep, in public.

Browse the Library · Install hyperloops
