Morph — Git for AI-assisted development

The Problem

Git doesn't know how your code got here

When an agent writes your code (in Cursor, Claude Code, OpenCode, or across a fleet of agents under Agent of Empires), Git snapshots the file tree and can show you line-level diffs between snapshots. But it has no idea how those files arrived, and it can't tell you whether the result still works.

No link from code to prompt

Which prompt produced this refactor? Which conversation led to this bug fix? Git has nowhere to store any of that.

Agent sessions disappear

The agent may try three approaches before settling on one. Tool calls, file reads, shell commands, token usage — none of it survives the commit.

"Did it get better?" isn't in the diff

You can't read a diff and know whether the tests still pass, the benchmarks improved, or the agent regressed an edge case. You have to run it.

Mixed authorship isn't tracked

Which agent, which model, which prompt, under which environment — all of it matters for reviewing and reproducing AI-authored changes.

Merge can silently regress behavior

Two agent branches can merge cleanly at the text level while the resulting code fails tests that both parents passed.

Probabilistic outputs break assumptions

Git assumes identity is byte equality and reproducibility is identical output. Neither holds when an LLM is in the loop.

Our Thesis

Version the transformation, not just the output

Same content-addressed Merkle DAG as Git. Same hash-of-contents identity. Three additions that we believe make version control work for agent-authored code. Morph runs alongside Git (separate .morph/ and .git/) — drop Git later if you want to. The ideas below are where we're heading; the implementation is partial and actively being built.

Runs and Traces — permanent agent receipts

Every agent session is recorded as an immutable Run with a full Trace: every prompt, response, tool call, file read, file edit, shell command, and token count. IDE hooks parse the agent's complete transcript into structured events, so recording is always-on and doesn't depend on the agent calling a tool.

What a Trace contains

run        c81f2a…
actor      agent:cursor / model:claude-opus-4
env        {os: darwin, rust: 1.82}

// Trace events
prompt     "fix the retry loop"
tool_call  read(src/retry.rs)
tool_call  edit(src/retry.rs, +12/-4)
shell      cargo test -- retry
response   "fixed; tests pass."
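To make "content-addressed" concrete, here is a minimal Python sketch of how a Trace like the one above could get its identity from a hash of canonical bytes. The field names, the JSON canonicalization, and the choice of SHA-256 are all assumptions for illustration; this page does not specify Morph's actual object encoding.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TraceEvent:
    kind: str      # e.g. "prompt", "tool_call", "shell", "response"
    payload: str   # hypothetical: the raw text of the event


@dataclass(frozen=True)
class Trace:
    actor: str     # e.g. "agent:cursor / model:claude-opus-4"
    env: dict      # e.g. {"os": "darwin", "rust": "1.82"}
    events: tuple  # TraceEvent values, in recorded order


def canonical_bytes(obj) -> bytes:
    # Sorted keys and fixed separators give one byte form per logical value.
    # (Assumed canonicalization, not Morph's documented format.)
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def object_id(trace: Trace) -> str:
    # Identity is the hash of the contents: same contents, same id.
    return hashlib.sha256(canonical_bytes(asdict(trace))).hexdigest()


trace = Trace(
    actor="agent:cursor / model:claude-opus-4",
    env={"os": "darwin", "rust": "1.82"},
    events=(
        TraceEvent("prompt", "fix the retry loop"),
        TraceEvent("tool_call", "read(src/retry.rs)"),
        TraceEvent("response", "fixed; tests pass."),
    ),
)
print(object_id(trace)[:12])
```

Because the id is derived from the bytes, editing any event after the fact yields a different id, which is what lets a commit point at a trace without that trace being silently rewritten.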

"Doesn't Claude already record this on disk? Doesn't Langfuse?" Yes — and no. The on-disk transcripts and the LLM observability platforms each solve a different problem. Only morph traces live inside the version control DAG, addressable from the commit they produced. Read the full argument in SESSION-TRACKING.md.

Property a reviewer needs	Claude / Cursor / OpenCode (on-disk transcripts)	Langfuse / Phoenix / OTEL (LLM observability)	Morph traces
Linked from a specific commit	No — correlated only by timestamp, if at all	No — indexed by app/session, not by VCS commit	Yes — `commit.evidence_refs`
Content-addressed (immutable)	No — files can be rotated, edited, deleted	No — span IDs are random	Yes — hash of canonical bytes
Visible to teammates	No — lives only on the developer's laptop	Yes — via the SaaS dashboard (data egress required)	Yes — via opt-in morph remote, local-first
Same shape across agent tools	No — Cursor, Claude, OpenCode all differ	Partial — OTLP-shaped, but app-defined fields vary	Yes — one Run+Trace schema per repo
Merge-aware	No	No — no notion of "merge"	Yes — case provenance via `morph merge-plan`
Local-first / offline	Yes	No — ships to a hosted backend	Yes

Behavioral commits — evidence, not vibes

A Morph commit stores a file tree hash and a behavioral contract: which pipeline was run, which evaluation suite, what scores were observed, and in which environment. Fresh repos start relaxed so you can commit immediately; opt into tests_total and tests_passed enforcement with morph policy init when ready. Tell Morph your test suite once with morph config commit.test_command "cargo test --workspace" (or pytest, vitest, jest, go), then plain morph commit -m "…" runs it, parses the metrics, and attaches them automatically.

A Morph commit vs. a Git commit

// Git commit
tree        a3f2c…
parent      b71e0…
message     "fix retry"

// Morph commit
tree        a3f2c…
parent      b71e0…
message     "fix retry"
pipeline    d4e8a…
eval_suite  f92b1…
metrics     {tests_passed: 545, pass_rate: 1.0}
evidence    [run:c81f, trace:e02a]
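The metrics field above comes from parsing test-runner output. As a rough illustration of that step (not Morph's own parser, which this page does not show), the sketch below pulls tests_total, tests_passed, and pass_rate out of the summary lines that cargo test prints. The function name and the choice of 0.0 for an empty run are assumptions.

```python
import re

# cargo prints one summary line per test binary, for example:
#   test result: ok. 545 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
SUMMARY = re.compile(r"test result: \w+\. (\d+) passed; (\d+) failed;")


def parse_cargo_test(output: str) -> dict:
    """Sum the per-binary summaries into one metrics dict."""
    passed = failed = 0
    for match in SUMMARY.finditer(output):
        passed += int(match.group(1))
        failed += int(match.group(2))
    total = passed + failed
    return {
        "tests_total": total,
        "tests_passed": passed,
        # Assumption: an empty run counts as 0.0 rather than an error.
        "pass_rate": passed / total if total else 0.0,
    }


sample = (
    "test result: ok. 500 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out\n"
    "test result: ok. 45 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out\n"
)
print(parse_cargo_test(sample))
```

A workspace run prints several summary lines, one per crate and test target, which is why the sketch sums them instead of taking the first match.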

Merge by behavioral dominance

Instead of three-way text merge alone, morph merge requires the candidate to dominate both parents' certified metrics: at least as good on every declared metric. If the merged code regresses on anything, the merge fails. morph merge-plan previews the bar to beat before you merge.

Merge with evidence

// At merge time, Morph records:

parent_1_scores  {pass_rate: 0.94, p95_ms: 340}
parent_2_scores  {pass_rate: 0.91, p95_ms: 280}
bar_to_beat      {pass_rate: 0.94, p95_ms: 280}
merged_scores    {pass_rate: 0.95, p95_ms: 275}

// Dominates on both. Merge accepted.
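The bar in the record above can be read as "the better of the two parents, metric by metric." Here is a small Python sketch of that rule using the same numbers. The direction labels ("maximize" / "minimize") and the function names are hypothetical; the source only says that eval suites version their metric directions.

```python
# Assumed directions for the two metrics in the example above.
DIRECTIONS = {"pass_rate": "maximize", "p95_ms": "minimize"}


def bar_to_beat(parent_1: dict, parent_2: dict, directions: dict) -> dict:
    """Take the better parent value for each declared metric."""
    bar = {}
    for metric, direction in directions.items():
        pick = max if direction == "maximize" else min
        bar[metric] = pick(parent_1[metric], parent_2[metric])
    return bar


def meets_bar(candidate: dict, bar: dict, directions: dict) -> bool:
    """True when the candidate is at least as good on every declared metric."""
    for metric, direction in directions.items():
        if direction == "maximize" and candidate[metric] < bar[metric]:
            return False
        if direction == "minimize" and candidate[metric] > bar[metric]:
            return False
    return True


parent_1 = {"pass_rate": 0.94, "p95_ms": 340}
parent_2 = {"pass_rate": 0.91, "p95_ms": 280}
merged = {"pass_rate": 0.95, "p95_ms": 275}

bar = bar_to_beat(parent_1, parent_2, DIRECTIONS)
print(bar, meets_bar(merged, bar, DIRECTIONS))
```

Note that the bar can be stricter than either parent alone: here it combines parent 1's pass rate with parent 2's latency, so a merge that simply reproduces one parent would not clear it.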

Agent Integrations

One command to wire up your agent

Morph ships with an MCP server (morph-mcp) and setup commands that install the right config, hooks, and rules for your IDE — or your multi-agent session manager. Hooks parse the full agent transcript into structured Trace events, so every prompt, tool call, and file edit is recorded — you don't depend on the agent remembering to record.

Cursor

MCP server, hooks for always-on recording, and rules for behavioral commits. Writes into .cursor/.

morph setup cursor

Cursor setup →

Claude Code

MCP server and hooks for Anthropic's coding agent. Records tool calls, file edits, and shell invocations.

morph setup claude-code

Claude Code setup →

OpenCode

MCP config, AGENTS.md, and a recording plugin — one command, fully-traced sessions.

morph setup opencode

OpenCode setup →

Agent of Empires

Multi-agent session manager that drives Claude Code, OpenCode, Cursor CLI, and others through tmux + Docker sandboxes. Morph wraps every session with lifecycle hooks: a commit on create, a Run + Trace on every launch, a final commit on destroy.

AoE on GitHub →

morph setup aoe

Agent of Empires setup →

Try the Alpha

Zero to recording in three commands

Install the two binaries (morph and morph-mcp), initialize a Morph repo in your project, and wire up your IDE. Morph runs side-by-side with Git — commits in one are independent of commits in the other. Heads-up: this is alpha software. Some commands are half-built, the on-disk format may change (use morph upgrade to migrate), and we'd love to hear what breaks — file an issue.

# 1. Install the binaries (macOS via Homebrew — recommended)
$ brew tap r/morph
$ brew install morph # installs both `morph` and `morph-mcp`

# … or build from source (any platform with Rust)
$ git clone https://github.com/r/morph.git && cd morph
$ cargo install --path morph-cli && cargo install --path morph-mcp

# 2. Initialize in your project
$ cd /path/to/your/project
$ morph init

# 3. Wire up your agent (pick one)
$ morph setup cursor
$ morph setup claude-code
$ morph setup opencode
$ morph setup aoe # Agent of Empires — multi-agent session manager

New here? Walk through the ~20-minute getting-started tutorial →

How It Works

Git-shaped CLI, richer objects

Morph mirrors Git where possible: if you know Git, you know Morph. The CLI adds commands for recording agent sessions, certifying commits against policy, and inspecting traces.

# Standard Git-shaped workflow (init is relaxed by default;
# tighten with `morph policy init` when ready)
$ morph init
$ morph config commit.test_command "cargo test --workspace" # once
$ morph add .
$ morph commit -m "fix retry loop"   # runs tests, attaches metrics
$ morph log # history with metrics
$ morph diff main feature

# Eval-driven workflow: spec-first cases for case-provenance at merge
$ morph eval add specs/login.yaml  # YAML or Cucumber
$ morph eval show               # inspect the registered suite
$ morph eval gaps               # report unaddressed evidence gaps

# Branching and behavioral merge
$ morph branch feature
$ morph checkout feature
$ morph merge-plan main  # preview bar to beat + case provenance
$ morph merge main       # dominance required

# Inspect recorded agent work
$ morph inspect summary     # overview of recorded runs
$ morph inspect show <run>   # grouped steps (prompts, tools, files)
$ morph inspect target <ref> # the code the agent was working on
$ morph inspect artifact <ref> # what the agent produced

# Policy, certification, gating
$ morph policy require-metrics tests_passed pass_rate
$ morph certify --metrics-file metrics.json
$ morph gate                # exit 1 if HEAD fails policy

# Team inspection (local browser UI + JSON API)
$ morph serve               # http://127.0.0.1:8765
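For intuition about the policy commands in the listing above, here is a minimal Python sketch of a required-metrics gate: given a metrics file and the metric names a policy requires, report what is missing and map that to an exit code. The flat JSON layout of metrics.json and both function names are assumptions; Morph's real file format may differ.

```python
import json


def missing_metrics(metrics_json: str, required: list) -> list:
    """Return the required metric names that the metrics file does not report."""
    metrics = json.loads(metrics_json)  # assumed shape: {"metric_name": value, ...}
    return [name for name in required if name not in metrics]


def gate_exit_code(missing: list) -> int:
    """Mirror the 'exit 1 if HEAD fails policy' convention from the listing."""
    return 1 if missing else 0


required = ["tests_passed", "pass_rate"]
ok = missing_metrics('{"tests_passed": 545, "pass_rate": 1.0}', required)
bad = missing_metrics('{"tests_passed": 545}', required)
print(gate_exit_code(ok), gate_exit_code(bad), bad)
```

A non-zero exit code is what makes a gate like this usable as a CI step: the pipeline stops when a commit lacks the evidence the policy asks for.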

Privacy & Sharing

What morph records, what crosses the wire

Morph records everything; that's the design point. Reviewability, replay, attribution, prompt-as-spec, and merge-aware behavioral context all depend on it. The tradeoff is that traces contain whatever happened in your agent session, so you should know exactly what crosses the wire, and when, before you let any of it leave your laptop.

git push code only

→ git remote (GitHub / GitLab / self-hosted)

.morph/ is in .git/info/exclude automatically. The git push physically cannot include runs, traces, prompts, or model responses. Teammates pulling git see ordinary git commits and a clean working tree.

$ git push origin main
// no .morph/, no traces, no prompts
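If you want to confirm the exclusion yourself before pushing, a check along these lines would do it. This is a sketch under assumptions: the exact pattern Morph writes into .git/info/exclude is not stated here, so the function accepts a few plausible spellings, and its name is made up for the example.

```python
from pathlib import Path

# Assumed spellings of the exclude pattern; Morph's actual entry may differ.
CANDIDATE_PATTERNS = {".morph", ".morph/", "/.morph", "/.morph/"}


def morph_is_excluded(repo_root) -> bool:
    """Check whether .git/info/exclude lists the .morph directory."""
    exclude = Path(repo_root) / ".git" / "info" / "exclude"
    if not exclude.is_file():
        return False
    patterns = set()
    for line in exclude.read_text().splitlines():
        entry = line.strip()
        if entry and not entry.startswith("#"):  # skip blanks and comments
            patterns.add(entry)
    return bool(patterns & CANDIDATE_PATTERNS)


print(morph_is_excluded("."))
```

Git's own `git check-ignore .morph` answers the same question with full pattern semantics, so prefer it when Git is on hand; the sketch only shows what the exclude entry amounts to.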

morph push opt-in, separate

→ morph remote (independently configured)

A morph remote is a separate channel with separate access control. Default install configures none. When you do push, you're sending the prompts, responses, file contents the agent read, shell stdout/stderr, and model parameters — verbatim.

$ morph remote add team ssh://team-host/morph/repo
$ morph push team main

The team-sharing model in one line: code goes through your existing git remote; behavioral history goes through a separate morph remote that only people you'd trust to read your IDE history can pull from. Set them up explicitly. Neither channel is silent.

`morph forget` — the secret-leak escape hatch

A trace caught a credential or PII you didn't intend to record? morph forget <hash> permanently retires the offending Run, Trace, or prompt blob from your local store and writes an immutable Tombstone object recording the actor / reason / timestamp. Pass --remote team and the next morph push team ships the tombstone; the teammate's next morph fetch applies it automatically. The merge gate treats any evidence_ref that resolves to a tombstone as "no claim" rather than a hard error, so retroactively forgetting evidence does not retroactively break commits.

$ morph forget <run-hash> --remote team --reason "leaked db password"
$ morph push team main
// teammates: morph fetch team — tombstone applied silently

Forget refuses to retire commits, blobs (other than prompts), trees, pipelines, eval suites, or annotations, because those carry structural meaning the DAG depends on. It is also whole-object-only: there is no partial-redaction story. Already-fetched copies on teammates' laptops stay where they are until that teammate fetches the tombstone. Full design in SECURITY.md.
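The "no claim" behavior is easier to see in code. Below is a minimal Python sketch of how a merge gate might resolve a commit's evidence references when some of them point at tombstones. The Tombstone fields come from the description above (actor, reason, timestamp); the store layout, the function name, and the error type are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tombstone:
    actor: str
    reason: str
    timestamp: str


def resolve_evidence(evidence_refs: list, store: dict):
    """Split references into usable evidence and forgotten ('no claim') refs."""
    claims, no_claim = [], []
    for ref in evidence_refs:
        if ref not in store:
            # A reference to nothing at all is still a hard error.
            raise LookupError(f"dangling evidence ref: {ref}")
        obj = store[ref]
        if isinstance(obj, Tombstone):
            no_claim.append(ref)  # forgotten on purpose: skip, don't fail
        else:
            claims.append(obj)
    return claims, no_claim


store = {
    "run:c81f": {"metrics": {"pass_rate": 1.0}},
    "trace:e02a": Tombstone("alice", "leaked db password", "2025-01-01T00:00:00Z"),
}
claims, no_claim = resolve_evidence(["run:c81f", "trace:e02a"], store)
print(len(claims), no_claim)
```

The distinction between a tombstone and a missing object is the point: the first records that someone deliberately retired the evidence, while the second means the store is incomplete.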

Things morph does not yet do

Stated up front, so you don't discover them by reading the source:

No client-side redaction filter on morph push. Today push is "send everything reachable from this commit." A redaction-on-push hook is roadmap.
No selective fetch. morph fetch team pulls the full DAG.
No encryption at rest in .morph/. Same posture as every agent tool's on-disk transcripts (Claude / Cursor / OpenCode all unencrypted on disk by default). Use disk encryption.
No automatic secret scanning. Morph does not look at trace bytes for tokens, API keys, or PII before recording or pushing. Use the agent-level guardrails your IDE provides.
SSH transport does not yet carry tombstones. morph forget works end-to-end across filesystem morph remotes today; the SSH remote-helper protocol upgrade is on the roadmap. Until then, ssh into an SSH-served remote and run morph forget there too.

The full plain-language privacy story — what's in .morph/, what's in a trace, the team-setup checklist, and the "I leaked a secret, what do I do" recipe — is in docs/SECURITY.md. Read it before you push to a morph remote.

Design Principles

What Morph assumes

Morph is built on a small set of axioms. Violate any of them and something breaks.

Content-addressed, immutable objects. Every object is identified by a hash of its contents. History cannot be tampered with.

Evidence does not rewrite history. Runs, traces, and evaluation results never mutate prior commits. New evidence produces new objects.

Pipeline steps compose cleanly. Sequential chaining and parallel execution are well-defined — like Promise.then() and Promise.all().

Evaluation suites are explicit contracts. "Better" is never implicit. Metrics, directions, fixture sources, and aggregation methods are all versioned and hashed.

Scores are partially ordered. One scorecard dominates another only if it is at least as good on every metric. If A is more accurate but slower, they're incomparable.

Merge records scores from both parents. Every merge commit records what both parents achieved and what the merged code achieved.

Environment is part of the record. Model version, sampling settings, toolchain — without this, scores from different environments aren't comparable.

Reproducibility means re-running the checks. You can't get identical outputs from an LLM. Reproducibility means re-running the evaluation and getting consistent aggregate scores.
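The partial-order axiom above is the one most likely to surprise people, so here is a small Python sketch of a pairwise comparison that can answer "incomparable." The metric names, directions, and return labels are hypothetical choices for the example.

```python
def compare(a: dict, b: dict, directions: dict) -> str:
    """Compare two scorecards under a per-metric direction ("maximize"/"minimize")."""
    a_better = b_better = False
    for metric, direction in directions.items():
        if a[metric] == b[metric]:
            continue  # a tie on this metric favors neither side
        a_wins = (a[metric] > b[metric]) == (direction == "maximize")
        if a_wins:
            a_better = True
        else:
            b_better = True
    if a_better and b_better:
        return "incomparable"
    if a_better:
        return "a dominates"
    if b_better:
        return "b dominates"
    return "equal"


directions = {"accuracy": "maximize", "latency_ms": "minimize"}
a = {"accuracy": 0.90, "latency_ms": 400}  # more accurate but slower
b = {"accuracy": 0.80, "latency_ms": 300}
print(compare(a, b, directions))
```

There is no weighting step that trades accuracy against latency. If you want one scorecard to beat another, it has to hold its ground on every metric the suite declares.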

Where We Are

An honest status report — and an invitation

We think the problem is real and the thesis is right. The implementation is genuinely alpha: some of it works well, some of it is held together with duct tape, and a lot of it needs your eyes, your bug reports, and your PRs. Here's where things stand today.

What works today (solid)

One-command setup for Cursor, Claude Code, OpenCode, and the Agent of Empires multi-agent session manager (morph setup <name>)
Recording prompts and responses as immutable Runs + Traces, always-on via hooks
Core Git-shaped workflow: init, add, commit, log, diff, branch, checkout, merge, tag, stash, revert
Behavioral merge with dominance check and merge-plan preview (incl. case provenance)
Policy with required_metrics gate, certify, and gate for pass/fail enforcement; relaxed default on morph init — tighten with morph policy init when ready
Eval-driven workflow: morph eval add, rebuild, show, run, from-output, record, gaps — ingest YAML/Cucumber specs, parse cargo/pytest/vitest/jest/go output, fail commits without metrics
morph serve: local browser UI + JSON API for inspecting commits, runs, traces
1,100+ unit/CLI tests and 37 end-to-end Cucumber scenarios across 16 features

What's rough (WIP)

Structured trace events (tool calls, file edits, shell invocations) are captured inconsistently across IDEs — coverage is improving but uneven
Eval-suite ingestion handles YAML and Cucumber out of the box; richer expectation DSLs and per-case scoring are still in progress
Storage: filesystem only; SQLite and real remote backends are designed but not implemented
On-disk format has already changed twice; expect more morph upgrade migrations before v1.0
Remotes are local-path only; no hosted Morph forge yet
Windows support is untested — we develop on macOS and Linux
Docs and CLI help are catching up to the code; some commands are under-documented

Where you can help (invitation)

Try it in a real project, break it, and file an issue — especially if recording misses events
Add trace adapters for other agents (Aider, Cline, Zed, Codex CLI, …)
Build real evaluation suites and share what shape they want to take
Implement a real remote backend (HTTP, S3, or a hosted forge)
Sharpen the theory: read THEORY.md and push back on where it's wrong
Improve docs, write tutorials, record screencasts
Port to Windows & test on uncommon setups

Ready to jump in?

We're a small research project and every issue, PR, and conversation moves this forward. Star the repo to follow along, or grab an open issue and start hacking.
