Alpha: Morph is an early research prototype. Expect bugs, rough edges, and breaking changes.
Alpha · v0.7.1 · Research prototype

Git for AI-assisted development

Morph is a version control system that records every agent session as an immutable Run with a full Trace, attaches evaluation scores to commits, and gates merges on behavioral dominance — not just a clean text diff. It's an open research prototype. We're building it in the open and we want your help.

The Problem

Git doesn't know how your code got here

When an agent writes your code in Cursor, Claude Code, or OpenCode, Git snapshots the file tree and can show you line-level diffs between snapshots. But it has no idea how those files arrived, and it can't tell you whether the result still works.

No link from code to prompt

Which prompt produced this refactor? Which conversation led to this bug fix? Git has nowhere to store any of that.

Agent sessions disappear

The agent may try three approaches before settling on one. Tool calls, file reads, shell commands, token usage — none of it survives the commit.

"Did it get better?" isn't in the diff

You can't read a diff and know whether the tests still pass, the benchmarks improved, or the agent regressed an edge case. You have to run it.

Mixed authorship isn't tracked

Which agent, which model, which prompt, under which environment — all of it matters for reviewing and reproducing AI-authored changes.

Merge can silently regress behavior

Two agent branches can merge cleanly at the text level while the resulting code fails tests that both parents passed.

Probabilistic outputs break assumptions

Git assumes identity is byte equality and reproducibility is identical output. Neither holds when an LLM is in the loop.

Version the transformation, not just the output

Same content-addressed Merkle DAG as Git. Same hash-of-contents identity. Three additions that we believe make version control work for agent-authored code. Morph runs alongside Git (separate .morph/ and .git/) — drop Git later if you want to. The ideas below are where we're heading; the implementation is partial and actively being built.
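
To make "hash-of-contents identity" concrete, here is a minimal sketch of a content-addressed Merkle structure. It is an illustration only, not Morph's object format: the SHA-256 hash, the JSON encoding, and the names object_id and make_commit are all assumptions made for this example.

```python
import hashlib
import json
from typing import Dict, Optional


def object_id(obj: dict) -> str:
    """Identify an object by hashing a canonical encoding of its contents.

    SHA-256 over sorted-key JSON is an assumption for this sketch.
    """
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def make_commit(files: Dict[str, str], parent: Optional[str], message: str) -> dict:
    """Build blob -> tree -> commit, each level referring to its children by hash."""
    tree = {
        "type": "tree",
        "entries": {
            path: object_id({"type": "blob", "data": data})
            for path, data in files.items()
        },
    }
    return {"type": "commit", "tree": object_id(tree), "parent": parent, "message": message}


base = make_commit({"src/retry.rs": "loop {}"}, None, "initial")
same = make_commit({"src/retry.rs": "loop {}"}, None, "initial")
edited = make_commit({"src/retry.rs": "loop { retry(); }"}, None, "initial")

# Identical contents give identical ids; any edit propagates up to the commit id.
print(object_id(base) == object_id(same))    # True
print(object_id(base) == object_id(edited))  # False
```

Because each level stores only the hashes of its children, changing one file changes the tree hash, which in turn changes the commit hash.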

Runs and Traces — permanent agent receipts

Every agent session is recorded as an immutable Run with a full Trace: every prompt, response, tool call, file read, file edit, shell command, and token count. IDE hooks parse the agent's complete transcript into structured events, so recording is always-on and doesn't depend on the agent calling a tool.

What a Trace contains

run        c81f2a…
actor      agent:cursor / model:claude-opus-4
env        {os: darwin, rust: 1.82}

// Trace events
prompt     "fix the retry loop"
tool_call  read(src/retry.rs)
tool_call  edit(src/retry.rs, +12/-4)
shell      cargo test -- retry
response   "fixed; tests pass."
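
One way to picture a Run and its Trace as data is an immutable record holding an ordered sequence of typed events. The sketch below mirrors the listing above; the class names (Event, Run) and field names are hypothetical and are not taken from Morph's schema.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Event:
    kind: str    # "prompt", "tool_call", "shell", or "response"
    detail: str  # free-form payload for this sketch


@dataclass(frozen=True)
class Run:
    run_id: str
    actor: str
    events: Tuple[Event, ...]  # a tuple, so events cannot be appended after the fact

    def counts(self) -> Counter:
        """Count events by kind, e.g. for a quick session summary."""
        return Counter(event.kind for event in self.events)


run = Run(
    run_id="c81f2a…",
    actor="agent:cursor / model:claude-opus-4",
    events=(
        Event("prompt", "fix the retry loop"),
        Event("tool_call", "read(src/retry.rs)"),
        Event("tool_call", "edit(src/retry.rs, +12/-4)"),
        Event("shell", "cargo test -- retry"),
        Event("response", "fixed; tests pass."),
    ),
)

print(run.counts()["tool_call"])  # 2
```

Frozen dataclasses reject attribute assignment after construction, which is a lightweight way to model "immutable" in a sketch like this.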

Behavioral commits — evidence, not vibes

A Morph commit stores a file tree hash and an optional behavioral contract: which pipeline was run, which evaluation suite, what scores were observed, and in which environment. Plain morph commit -m "…" still works exactly like Git when you don't need evaluation gating.

A Morph commit vs. a Git commit

// Git commit
tree        a3f2c…
parent      b71e0…
message     "fix retry"

// Morph commit
tree        a3f2c…
parent      b71e0…
message     "fix retry"
pipeline    d4e8a…
eval_suite  f92b1…
metrics     {tests_passed: 545, pass_rate: 1.0}
evidence    [run:c81f, trace:e02a]
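
The "optional behavioral contract" can be sketched as a commit record whose extra fields are present only when evidence exists. The class and method names below (BehavioralContract, Commit, to_record) are hypothetical; only the field names come from the listing above.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class BehavioralContract:
    pipeline: str
    eval_suite: str
    metrics: Dict[str, float]
    evidence: List[str]


@dataclass(frozen=True)
class Commit:
    tree: str
    parent: str
    message: str
    contract: Optional[BehavioralContract] = None  # absent for a plain, Git-like commit

    def to_record(self) -> dict:
        """Flatten to a record; contract fields appear only when a contract is set."""
        record = {"tree": self.tree, "parent": self.parent, "message": self.message}
        if self.contract is not None:
            record.update(asdict(self.contract))
        return record


plain = Commit("a3f2c…", "b71e0…", "fix retry")
certified = Commit(
    "a3f2c…", "b71e0…", "fix retry",
    contract=BehavioralContract(
        pipeline="d4e8a…",
        eval_suite="f92b1…",
        metrics={"tests_passed": 545, "pass_rate": 1.0},
        evidence=["run:c81f", "trace:e02a"],
    ),
)

print(sorted(plain.to_record()))         # ['message', 'parent', 'tree']
print(certified.to_record()["metrics"])  # {'tests_passed': 545, 'pass_rate': 1.0}
```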

Merge by behavioral dominance

Instead of three-way text merge alone, morph merge requires the candidate to dominate both parents' certified metrics: at least as good on every declared metric. If the merged code regresses on anything, the merge fails. morph merge-plan previews the bar to beat before you merge.

Merge with evidence

// At merge time, Morph records:

parent_1_scores  {pass_rate: 0.94, p95_ms: 340}
parent_2_scores  {pass_rate: 0.91, p95_ms: 280}
bar_to_beat      {pass_rate: 0.94, p95_ms: 280}
merged_scores    {pass_rate: 0.95, p95_ms: 275}

// Dominates on both. Merge accepted.
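
The numbers above can be reproduced with a short sketch of the dominance check. It assumes each metric declares a direction (higher is better for pass_rate, lower is better for p95_ms) and that the bar is the per-metric best of the two parents. The DIRECTIONS table and the function names are hypothetical; this is not Morph's actual merge code.

```python
from typing import Dict

# Assumed direction per metric: +1 means higher is better, -1 means lower is better.
DIRECTIONS: Dict[str, int] = {"pass_rate": +1, "p95_ms": -1}


def bar_to_beat(a: Dict[str, float], b: Dict[str, float]) -> Dict[str, float]:
    """Take the better parent value for each metric, respecting its direction."""
    return {m: (max if d > 0 else min)(a[m], b[m]) for m, d in DIRECTIONS.items()}


def meets_bar(candidate: Dict[str, float], bar: Dict[str, float]) -> bool:
    """True when the candidate is at least as good as the bar on every metric."""
    return all(d * candidate[m] >= d * bar[m] for m, d in DIRECTIONS.items())


parent_1 = {"pass_rate": 0.94, "p95_ms": 340}
parent_2 = {"pass_rate": 0.91, "p95_ms": 280}
merged = {"pass_rate": 0.95, "p95_ms": 275}

bar = bar_to_beat(parent_1, parent_2)
print(bar)                     # {'pass_rate': 0.94, 'p95_ms': 280}
print(meets_bar(merged, bar))  # True

# A candidate that improves pass_rate but regresses latency fails the check.
print(meets_bar({"pass_rate": 0.95, "p95_ms": 300}, bar))  # False
```

Multiplying by the direction turns "lower is better" metrics into "higher is better" ones, so a single comparison covers both cases.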

IDE Integrations

One command to wire up your agent

Morph ships with an MCP server (morph-mcp) and setup commands that install the right config, hooks, and rules for your IDE. Hooks parse the full agent transcript into structured Trace events, so every prompt, tool call, and file edit is recorded — you don't depend on the agent remembering to record.

Cursor

MCP server, hooks for always-on recording, and rules for behavioral commits. Writes into .cursor/.

morph setup cursor

Claude Code

MCP server and hooks for Anthropic's coding agent. Records tool calls, file edits, and shell invocations.

morph setup claude-code

OpenCode

MCP config, AGENTS.md, and a recording plugin — one command, fully-traced sessions.

morph setup opencode

Zero to recording in three steps

Build the two binaries (morph and morph-mcp), initialize a Morph repo in your project, and wire up your IDE. Morph runs side-by-side with Git — commits in one are independent of commits in the other. Heads-up: this is alpha software. Some commands are half-built, the on-disk format may change (use morph upgrade to migrate), and we'd love to hear what breaks — file an issue.

# 1. Build and install the binaries
$ git clone https://github.com/r/morph.git && cd morph
$ cargo install --path morph-cli && cargo install --path morph-mcp

# 2. Initialize in your project
$ cd /path/to/your/project
$ morph init

# 3. Wire up your IDE (pick one)
$ morph setup cursor
$ morph setup claude-code
$ morph setup opencode

How It Works

Git-shaped CLI, richer objects

Morph mirrors Git where possible: if you know Git, you know Morph. The CLI adds commands for recording agent sessions, certifying commits against policy, and inspecting traces.

# Standard Git-shaped workflow
$ morph init
$ morph add .
$ morph commit -m "fix retry loop" --metrics '{"pass_rate": 1.0}'
$ morph log # history with metrics
$ morph diff main feature

# Branching and behavioral merge
$ morph branch feature
$ morph checkout feature
$ morph merge-plan main  # preview bar to beat
$ morph merge main       # dominance required

# Inspect recorded agent work
$ morph tap summary         # overview of recorded runs
$ morph tap inspect <run>   # grouped steps (prompts, tools, files)
$ morph traces target-context <ref> # the code the agent was working on
$ morph traces final-artifact <ref> # what the agent produced

# Policy, certification, gating
$ morph policy set --require pass_rate=1.0
$ morph certify --metrics-file metrics.json
$ morph gate                # exit 1 if HEAD fails policy

# Team inspection (local browser UI + JSON API)
$ morph serve               # http://127.0.0.1:8765
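
The policy and gate commands above suggest a simple pass/fail check. The sketch below shows one way such a gate could behave, assuming each requirement is a minimum threshold and a missing metric counts as a failure. Both are assumptions, and the function names are hypothetical rather than Morph internals.

```python
from typing import Dict, List


def failed_requirements(policy: Dict[str, float], metrics: Dict[str, float]) -> List[str]:
    """List every requirement that the observed metrics do not satisfy."""
    failures = []
    for name, minimum in policy.items():
        observed = metrics.get(name)
        # Assumption: a metric that was never recorded fails the requirement.
        if observed is None or observed < minimum:
            failures.append(f"{name}: required >= {minimum}, observed {observed}")
    return failures


def gate(policy: Dict[str, float], metrics: Dict[str, float]) -> int:
    """Return a process-style exit code: 0 when the policy holds, 1 otherwise."""
    failures = failed_requirements(policy, metrics)
    for line in failures:
        print(line)
    return 1 if failures else 0


policy = {"pass_rate": 1.0}

passing = gate(policy, {"pass_rate": 1.0})    # 0
failing = gate(policy, {"pass_rate": 0.98})   # 1
missing = gate(policy, {"tests_passed": 545}) # 1
```

Returning an exit code rather than raising keeps the check easy to wire into CI, where a non-zero status blocks the pipeline.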

What Morph assumes

Morph is built on a small set of axioms. Violate any of them and something breaks.

01  Content-addressed, immutable objects. Every object is identified by a hash of its contents, so any change to recorded history changes its hashes and is detectable.

02  Evidence does not rewrite history. Runs, traces, and evaluation results never mutate prior commits. New evidence produces new objects.

03  Pipeline steps compose cleanly. Sequential chaining and parallel execution are well-defined, much like Promise.then() and Promise.all() in JavaScript.

04  Evaluation suites are explicit contracts. "Better" is never implicit. Metrics, directions, fixture sources, and aggregation methods are all versioned and hashed.

05  Scores are partially ordered. One scorecard dominates another only if it is at least as good on every metric. If A is more accurate but slower than B, the two are incomparable.

06  Merge records scores from both parents. Every merge commit records what both parents achieved and what the merged code achieved.

07  Environment is part of the record. Model version, sampling settings, and toolchain are all recorded; without them, scores from different environments aren't comparable.

08  Reproducibility means re-running the checks. You can't get identical outputs from an LLM. Reproducibility means re-running the evaluation and getting consistent aggregate scores.
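
Axiom 05 is easiest to see with a worked comparison. The sketch below classifies a pair of scorecards under a partial order; the metric names, their directions, and the compare function are illustrative assumptions, not part of Morph.

```python
from typing import Dict


def compare(a: Dict[str, float], b: Dict[str, float], directions: Dict[str, int]) -> str:
    """Classify two scorecards: equal, one dominates the other, or incomparable.

    directions maps each metric to +1 (higher is better) or -1 (lower is better).
    """
    a_at_least_b = all(d * a[m] >= d * b[m] for m, d in directions.items())
    b_at_least_a = all(d * b[m] >= d * a[m] for m, d in directions.items())
    if a_at_least_b and b_at_least_a:
        return "equal"
    if a_at_least_b:
        return "a dominates b"
    if b_at_least_a:
        return "b dominates a"
    return "incomparable"


directions = {"accuracy": +1, "p95_ms": -1}

# A is more accurate but slower, so neither scorecard dominates.
a = {"accuracy": 0.95, "p95_ms": 340}
b = {"accuracy": 0.91, "p95_ms": 280}
print(compare(a, b, directions))  # incomparable

# C matches A on accuracy and is faster, so C dominates A.
c = {"accuracy": 0.95, "p95_ms": 300}
print(compare(c, a, directions))  # a dominates b
```

In the second call the first argument is c, so "a dominates b" means c dominates a. The "incomparable" outcome is what separates a partial order from a single ranked score.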

Where We Are

An honest status report — and an invitation

We think the problem is real and the thesis is right. The implementation is genuinely alpha: some of it works well, some of it is held together with duct tape, and a lot of it needs your eyes, your bug reports, and your PRs. Here's where things stand today.

What works today · Solid

  • IDE setup for Cursor, Claude Code, and OpenCode via one morph setup command
  • Recording prompts and responses as immutable Runs + Traces, always-on via hooks
  • Core Git-shaped workflow: init, add, commit, log, diff, branch, checkout, merge, tag, stash, revert
  • Behavioral merge with dominance check and merge-plan preview
  • Policy, certify, and gate for pass/fail enforcement
  • morph serve: local browser UI + JSON API for inspecting commits, runs, traces
  • 545+ unit/CLI tests and 32 end-to-end Cucumber scenarios

What's rough · WIP

  • Structured trace events (tool calls, file edits, shell invocations) are captured inconsistently across IDEs — coverage is improving but uneven
  • Evaluation-suite integration is minimal: metrics are honored, but the full "suite as versioned contract" story is still being built
  • Storage: filesystem only; SQLite and real remote backends are designed but not implemented
  • On-disk format has already changed twice; expect more morph upgrade migrations before v1.0
  • Remotes are local-path only; no hosted Morph forge yet
  • Windows support is untested — we develop on macOS and Linux
  • Docs and CLI help are catching up to the code; some commands are under-documented

Where you can help · Invitation

  • Try it in a real project, break it, and file an issue — especially if recording misses events
  • Add trace adapters for other agents (Aider, Cline, Zed, Codex CLI, …)
  • Build real evaluation suites and share what shape they want to take
  • Implement a real remote backend (HTTP, S3, or a hosted forge)
  • Sharpen the theory: read THEORY.md and push back on where it's wrong
  • Improve docs, write tutorials, record screencasts
  • Port to Windows and test on uncommon setups

Ready to jump in?

We're a small research project and every issue, PR, and conversation moves this forward. Star the repo to follow along, or grab an open issue and start hacking.

Read the theory and the spec

The formal model — pipelines as monadic computations, certificate vectors, the merge monotonicity theorem — plus a concrete v0 system design with object schemas and CLI reference.

Morph: Version Control for AI-Assisted Development

Raffi Krikorian · Mozilla

Theory · v0 Spec · Paper (LaTeX)