Manifest-backed proofs — Merkle-batch up to 10,000 items in one anchor

When you have many small items to anchor (every CSV row, every JSON record, every eval verdict, every line of a daily ledger), anchoring each one separately costs one transaction per item. Manifest-backed proofs bind up to 10,000 items under a Merkle root, then anchor that root in one transaction. Each item can still be proven individually via a Merkle inclusion path; the chain footprint is one OP_RETURN regardless of N.

Companion docs: API reference · OpenAPI spec · What to hash · Bundle spec — manifest-items-v1 · Production checklist · Compatibility map

What "manifest mode" means, and doesn't. The anchor mode field reads back as standard, sealed, or manifest, but those are not three parallel privacy tiers you pick between. "Manifest" only describes the shape of one anchor: it commits a Merkle root over an items[] batch instead of a single file's hash. You don't set it; the notary auto-selects it from the presence of items[] (sending mode: "manifest" is unnecessary, and sending it alongside sha256_hex 400s).

Two clarifications this guide assumes:

- It's the plain-SHA-256 batch path. If your rows are low-entropy and need HMAC leaves, you want a sealed manifest: the same batching idea, but a different wire shape sent via mode: "sealed" with a proof_set (see Sealed mode), not the items[] path below. "Sealed" and "manifest" are orthogonal axes (single-vs-batch × plain-vs-sealed), not mutually exclusive modes.
- The leaf rule here is manifest-items-v1, which is not the csv-row-v1 rule the selective-disclosure / redaction tooling binds to. Don't reach for items[] to mint a disclosure carrier; that path goes through the disclosure builder / sealed proof_set, not here.

1. The 60-second framing

You send a list of {label, sha256_hex} items. The notary builds a Merkle tree over them (sha256 over the canonical bytes of {label, sha256_hex} per leaf, plain SHA-256 inner nodes, last-node duplication at odd levels) and anchors a transaction that commits to that root, carried on chain as the canonical record's 20-byte doc_hash. It returns one anchor: proof_id, txid, leaf_count, root. Inclusion proofs for individual leaves are re-derivable from the items list later; the chain pins the root (via that 20-byte doc_hash), your store carries the leaves.

Concrete: an eval pipeline scoring 5,000 model outputs anchors all 5,000 scores in one transaction. A reviewer who later wants to verify one specific score gets the row + the inclusion path (O(log N) sibling hashes) and confirms it against the on-chain root, without seeing or needing the other 4,999 rows.
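To put a number on the O(log N) claim, here is a minimal sketch that counts the sibling hashes in an inclusion path under the last-node-duplication rule described above. The function name is illustrative and not part of any Satsignal SDK.

```python
def inclusion_path_length(leaf_count: int) -> int:
    """Number of sibling hashes needed to walk from one leaf to the root."""
    if leaf_count < 1:
        raise ValueError("a manifest needs at least one item")
    length = 0
    width = leaf_count
    while width > 1:
        # An odd-width level duplicates its last node before pairing.
        width = (width + 1) // 2
        length += 1
    return length


for n in (2, 5_000, 10_000):
    print(n, inclusion_path_length(n))
```

For the 5,000-score example this gives 13 sibling hashes, i.e. 416 bytes of 32-byte SHA-256 digests alongside the revealed row. At the 10,000-item cap the path is 14 hashes.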

2. Use this when

You have many small items and a per-item anchor is wasteful (1 txn × N items vs 1 txn × 1 manifest).
The items share a logical group — a batch of eval results, a day's ledger entries, the records in a single export, a multi-file release.
Selective disclosure matters: you want to reveal one item to a counterparty without disclosing the others.
The items are structured rows (CSV, JSONL, table records) and merkle-row-v1 (or its sealed variant) is the right scheme.
You're building an agent-session evidence bundle (the manifest is the binding artifact at session end — see Agents).
You're shipping a release that bundles N artifacts (binaries, configs, docs) and want one anchor per release rather than N.

Don't use this when:

You have only one artifact — use Files.
N > 10,000 — split into multiple sub-manifests + a top-level manifest of those (a "manifest of manifests").
The items are low-entropy and a plain SHA-256 leaf could be brute-forced — use the sealed variant merkle-row-sealed-v1 (covered in Sealed mode).
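The low-entropy caveat is easy to underestimate, so here is a small sketch of the guessing attack it refers to. The row layout (case, verdict) and the candidate values are hypothetical; the point is only that anyone who can enumerate the plausible rows can match a plain SHA-256 item hash.

```python
import hashlib
import json


def row_hash(row: dict) -> str:
    """Plain SHA-256 over compact, sorted-key JSON bytes of a row."""
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# An item hash as it would appear in a plain manifest.
published = row_hash({"case": "Q1", "verdict": "fail"})

# The attacker knows the row layout and tries every plausible verdict.
candidates = [{"case": "Q1", "verdict": v} for v in ("pass", "fail", "skip")]
recovered = next(c for c in candidates if row_hash(c) == published)
print(recovered["verdict"])  # fail
```

Three guesses recover the verdict. A keyed (HMAC) leaf, as used by the sealed variant, removes this shortcut because the guesser lacks the key material.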

3. What you send

POST /api/v1/anchors with items (the presence of items selects manifest mode; you MUST NOT also send sha256_hex or file_size).

{
  "folder_slug": "agent-runs-prod",
  "items": [
    {"label": "Q1", "sha256_hex": "10343a87...aa921669"},
    {"label": "Q2", "sha256_hex": "f68246b5...4b56894c"}
  ],
  "category": "evidence_bundle",
  "label": "math-eval verdicts 2026-05-07",
  "session_id": "run-2026-05-09-001"
}

field	type	required	meaning
`folder_slug`	string	yes	folder slug.
`items`	array	yes	1..10,000 items. Each item is `{label: string, sha256_hex: 64-hex}`.
`category`	string	yes	usually `evidence_bundle`. Other enum values allowed.
`label`	string	no	free-text tag for the manifest itself (display only).
`session_id`	string	no	grouping key for `GET /api/v1/anchors?session_id=...`.
`force_new`	bool	no	bypass the same-root dedup gate.

MAX_LEAVES = 10000. Above that, the request 400s with manifest_too_many_items.

Item shape

{"label": "row-7", "sha256_hex": "abc123..."}

field	type	constraints
`label`	string	1..256 chars. Preserved exactly — bytes matter (used in the leaf-construction rule).
`sha256_hex`	string	64 lowercase hex chars. Hash of whatever item you're committing — a row, a file, a JSON envelope, etc.

The label is part of the leaf hash. Labels can leak metadata — "alice-bid", "bob-bid" in a sealed-bid auction leaks bidder names even if the bid amounts stay sealed. Use neutral labels (row-7, item-12) where label privacy matters; see the threat-model note in /spec-merkle-row.
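A client-side pre-flight check can catch the constraints above before a request is spent. This is a sketch, not the server's validator: the function name is made up for illustration, and it assumes "1..256 chars" means Python string length (code points), which this guide does not spell out.

```python
import re

MAX_LEAVES = 10_000
MAX_LABEL_CHARS = 256
SHA256_HEX = re.compile(r"[0-9a-f]{64}")  # lowercase only


def validate_items(items: list) -> list:
    """Return a list of problems; an empty list means the batch looks sendable."""
    problems = []
    if not 1 <= len(items) <= MAX_LEAVES:
        problems.append(f"item count {len(items)} is outside 1..{MAX_LEAVES}")
    for i, item in enumerate(items):
        label = item.get("label")
        digest = item.get("sha256_hex")
        if not isinstance(label, str) or not 1 <= len(label) <= MAX_LABEL_CHARS:
            problems.append(f"items[{i}]: label must be 1..{MAX_LABEL_CHARS} chars")
        if not isinstance(digest, str) or not SHA256_HEX.fullmatch(digest):
            problems.append(f"items[{i}]: sha256_hex must be 64 lowercase hex chars")
    return problems


print(validate_items([{"label": "row-7", "sha256_hex": "ab" * 32}]))  # []
```

Uppercase hex from some hashing tools is the most common failure here; lowercasing the digest before submission avoids it.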

Leaf-construction rule

Each leaf is sha256(canonical_bytes(label, sha256_hex)) — the notary applies SCJ-v1 canonicalization (sorted keys, compact, NFC — deliberately NOT RFC 8785/JCS; see /spec-provenance §3) to {label, sha256_hex}, produces UTF-8 bytes, sha256s. Inner nodes are plain SHA-256. Last node duplicates at odd levels to round to a complete binary tree.

For verification later, a verifier with the items list reproduces the leaves and the tree identically — SCJ-v1 is deterministic.

Response

{
  "proof_id": "abc123def456...",
  "txid": "5e9a...c4f1",
  "mode": "manifest",
  "category": "evidence_bundle",
  "folder_slug": "agent-runs-prod",
  "proof_url": "https://app.satsignal.cloud/w/.../r/abc123def456",
  "bundle_url": "https://app.satsignal.cloud/bundle/abc123def456.mbnt",
  "leaf_count": 2,
  "root": "<64 hex Merkle root>"
}

leaf_count and root are manifest-mode-specific. The .mbnt bundle's canonical doc carries subject.kind == "manifest" and subject.scheme == "manifest-items-v1"; verifiers dispatch on the canonical-doc subject, not on the manifest layer (see bundle-v1 §3.4).

4. What you store

proof_id + txid — the standard handles.
root + leaf_count — useful for quick sanity checks at verify time without re-fetching the bundle.
The full items[] list, in submission order. This is the load-bearing one. The Merkle root commits to leaf ORDER — re-ordering the items produces a different root. Persist the array exactly as you submitted it.
The original artifacts behind each item. The leaf carries a hash; the verifier needs the underlying bytes to confirm the hash matches a specific record. Storing only the sha256_hex and not the row content makes the manifest verifiable in chain-existence but not row-by-row.
The bundle_url — fetch the .mbnt once and cache it, same as standard mode.

Database row shape (reference)

column	type	notes
`proof_id`	text PK	from response
`txid`	text	from response
`root`	text	Merkle root
`leaf_count`	int	count of items
`folder_slug`	text	namespace
`session_id`	text	optional grouping

Plus a separate items table:

column	type	notes
`proof_id`	text FK	links to manifest
`position`	int	0-indexed submission order
`label`	text	as submitted
`sha256_hex`	text	as submitted
`artifact_path`	text	where the original bytes for this item live
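One way to realize the two reference tables is a small SQLite store. This is a sketch under assumptions: the table names (manifests, manifest_items), the helper names, and the choice of SQLite are illustrative; only the column names come from the tables above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS manifests (
    proof_id    TEXT PRIMARY KEY,
    txid        TEXT NOT NULL,
    root        TEXT NOT NULL,
    leaf_count  INTEGER NOT NULL,
    folder_slug TEXT NOT NULL,
    session_id  TEXT
);
CREATE TABLE IF NOT EXISTS manifest_items (
    proof_id      TEXT NOT NULL REFERENCES manifests(proof_id),
    position      INTEGER NOT NULL,  -- 0-indexed submission order
    label         TEXT NOT NULL,
    sha256_hex    TEXT NOT NULL,
    artifact_path TEXT,
    PRIMARY KEY (proof_id, position)
);
"""


def open_store(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def store_manifest(db, response, items, artifact_paths, session_id=None):
    """Persist one anchor response plus its items, preserving submission order."""
    if len(items) != len(artifact_paths):
        raise ValueError("need exactly one artifact path per item")
    with db:  # single transaction: commit on success, roll back on error
        db.execute(
            "INSERT INTO manifests VALUES (?, ?, ?, ?, ?, ?)",
            (response["proof_id"], response["txid"], response["root"],
             response["leaf_count"], response["folder_slug"], session_id),
        )
        db.executemany(
            "INSERT INTO manifest_items VALUES (?, ?, ?, ?, ?)",
            [(response["proof_id"], pos, it["label"], it["sha256_hex"], path)
             for pos, (it, path) in enumerate(zip(items, artifact_paths))],
        )


def load_items(db, proof_id):
    """Return the items list exactly as submitted, ordered by position."""
    rows = db.execute(
        "SELECT label, sha256_hex FROM manifest_items "
        "WHERE proof_id = ? ORDER BY position", (proof_id,),
    ).fetchall()
    return [{"label": label, "sha256_hex": digest} for label, digest in rows]
```

The explicit position column is what makes the stored list safe to read back: without ORDER BY position, a database is free to return rows in any order, and a reordered list yields a different root.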

5. What verification needs later

Verification splits into two cases.

Case A — verify the manifest as a whole

Three things:

The .mbnt bundle — carries manifest-items-v1 canonical doc with the root.
The items list — fetched from your store, in submission order.
A BSV node — any public one.

The verifier rebuilds the Merkle tree from the items, compares the root to the canonical doc's subject.root, and confirms the canonical doc's hash matches the on-chain OP_RETURN payload.

The manifest-mode canonical subject is exactly this shape (plus an optional category):

"subject": {
  "kind": "manifest",
  "scheme": "manifest-items-v1",
  "algo": "sha256",
  "leaf_count": 3,
  "root": "<64-hex Merkle root>"
}

Note where the root lives: subject.root — not subject.proofs.chunk_merkle.root. The proofs.chunk_merkle path belongs to a different shape — a chunked file anchor (a single file anchored with a proof_set); readers pattern-matching from the file-anchor docs hit exactly that mix-up and conclude a valid manifest proof mismatches. The per-item leaves ride in the bundle's proofs.json, not in the canonical doc.
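The root comparison in Case A can be sketched as follows. Assumptions, stated plainly: the leaf bytes follow the SCJ-v1 summary in this guide (sorted keys, compact separators, raw UTF-8) with labels already NFC-normalized, and inner nodes are sha256(left || right) over raw 32-byte digests, as in this guide's own reveal example. The normative rule lives in bundle-v1 §3.4; the function names are illustrative. The on-chain doc_hash check is not covered here.

```python
import hashlib
import json


def leaf_hash(item: dict) -> bytes:
    canonical = json.dumps(
        {"label": item["label"], "sha256_hex": item["sha256_hex"]},
        sort_keys=True, separators=(",", ":"), ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(canonical).digest()


def merkle_root(leaves: list) -> bytes:
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # last-node duplication at odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


def check_manifest(items: list, subject: dict) -> bool:
    """Compare a rebuilt root against the canonical doc's subject.root."""
    if not items:
        return False
    if subject.get("kind") != "manifest":
        return False
    if subject.get("scheme") != "manifest-items-v1":
        return False
    if subject.get("leaf_count") != len(items):
        return False
    rebuilt = merkle_root([leaf_hash(it) for it in items])
    return rebuilt.hex() == subject.get("root")
```

Checking kind and scheme first mirrors the dispatch rule above: a chunked file anchor keeps its root elsewhere, so it should be rejected here rather than compared against the wrong field.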

Case B — verify one item without disclosing the rest

Build an inclusion proof for the target item:

Index i of the item in the submission order.
The item itself: {label, sha256_hex}.
The sibling hashes along the path from leaf i to the root (O(log N) hashes). These can be computed by anyone with the full items list — usually the manifest holder.

A verifier with just the item + the inclusion proof + the root (read from the bundle or from proof_url) re-walks the path, confirms the root matches the on-chain anchor, and confirms the item hash is what the inclusion proof claims. The other items stay opaque.

For tabular data specifically, the merkle-row-v1 scheme documents the canonical row-encoding rule and the standard inclusion-proof shape. See /spec-merkle-row.

Cross-link: What to hash — Manifest rows / tables covers how to compute the per-item sha256_hex for common item shapes (CSV rows, JSON records, eval result rows).

Byte-exact canonicalization — the #1 verification gotcha

A proof commits to the exact bytes you hashed, not to the logical JSON object. Re-serializing the same object differently — json.dumps(..., indent=2), a different key order, an extra trailing newline — changes the sha256, and the verifier reports a sha256 mismatch even though "the data" looks identical. This bites in two places:

Per item (this guide). Compute each sha256_hex over canonical bytes and store those bytes (or enough to regenerate them), not a pretty-printed copy:

```python
canonical = json.dumps(row, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
sha256_hex = hashlib.sha256(canonical).hexdigest()
```

Whole-manifest provenance. When you anchor a satsignal.provenance.v1 manifest via POST /api/v1/provenance/anchor, the endpoint re-canonicalizes it with Satsignal Canonical JSON v1 (sorted keys, separators=(",",":"), NFC, ensure_ascii=False) and commits those bytes — not whatever spacing you sent. To re-verify later, hand the verifier the canonical bytes: either re-canonicalize with the snippet above, or use the copy the notary already embeds in the .mbnt bundle. A manifest you saved with indent=2 will not verify against its own proof — that mismatch is the proof working, not breaking.

The exact SCJ-v1 rule — with reference implementations and pinned test vectors — is specified in spec-provenance §canonicalization.
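The failure mode is easy to reproduce locally. The sketch below hashes one hypothetical row four ways; only the canonical serialization matches itself. It uses plain json.dumps with the options named above and does not apply NFC, so it illustrates the gotcha rather than implementing SCJ-v1 in full.

```python
import hashlib
import json

row = {"score": 0.92, "case": "Q1", "note": "caf\u00e9"}  # hypothetical eval row


def sha(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Canonical form: sorted keys, compact separators, raw UTF-8.
canonical = json.dumps(row, sort_keys=True, separators=(",", ":"),
                       ensure_ascii=False).encode("utf-8")

# Three serializations of "the same data" that hash differently.
pretty = json.dumps(row, sort_keys=True, indent=2,
                    ensure_ascii=False).encode("utf-8")
escaped = json.dumps(row, sort_keys=True,
                     separators=(",", ":")).encode("utf-8")  # \u00e9 escape
trailing_newline = canonical + b"\n"

for name, data in [("pretty", pretty), ("escaped", escaped),
                   ("trailing newline", trailing_newline)]:
    print(f"{name:17s} matches canonical: {sha(data) == sha(canonical)}")

# Key order in the source dict is harmless once sort_keys=True is applied.
shuffled = {"note": row["note"], "case": row["case"], "score": row["score"]}
reserialized = json.dumps(shuffled, sort_keys=True, separators=(",", ":"),
                          ensure_ascii=False).encode("utf-8")
print("shuffled keys     matches canonical:", sha(reserialized) == sha(canonical))
```

The first three lines print False and the last prints True, which is the practical rule: store the canonical bytes (or a recipe that regenerates them exactly), never a display copy.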

6. Copy-paste example

Anchor a 1000-row eval result set

import hashlib, json, os, urllib.request

API = "https://app.satsignal.cloud"
KEY = os.environ["SATSIGNAL_API_KEY"]
FOLDER = "eval-runs-prod"

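# Build the items list. Each item's sha256_hex is the SHA-256 of the
# row's canonical bytes. eval_results is your own list of
# JSON-serializable row dicts. ensure_ascii=False keeps non-ASCII text
# as raw UTF-8 rather than \uXXXX escapes, matching the canonical-bytes
# snippet in section 5.
items = []
for i, row in enumerate(eval_results):
    row_bytes = json.dumps(row, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False).encode("utf-8")
    items.append({
        "label": f"item-{i:04d}",
        "sha256_hex": hashlib.sha256(row_bytes).hexdigest(),
    })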

body = json.dumps({
    "folder_slug": FOLDER,
    "items": items,
    "category": "evidence_bundle",
    "label": "math-eval verdicts 2026-05-26",
    "session_id": "run-2026-05-26-001",
}).encode("utf-8")

req = urllib.request.Request(
    f"{API}/api/v1/anchors",
    data=body, method="POST",
    headers={"Authorization": f"Bearer {KEY}",
             "Content-Type": "application/json",
             "Idempotency-Key": "eval-run-2026-05-26-001"},
)
with urllib.request.urlopen(req) as resp:
    out = json.load(resp)

print(f"Manifest anchored: {out['proof_id']}")
print(f"  txid:       {out['txid']}")
print(f"  leaf_count: {out['leaf_count']}")
print(f"  root:       {out['root']}")

Anchor a multi-file release

export SATSIGNAL_API_KEY=sk_...
export FOLDER=release-gates

# Hash each file, build the items list.
ITEMS_JSON=$(jq -n '[]')
for f in dist/*; do
  HASH=$(sha256sum "$f" | awk '{print $1}')
  LABEL=$(basename "$f")
  ITEMS_JSON=$(jq --arg l "$LABEL" --arg h "$HASH" \
    '. + [{label: $l, sha256_hex: $h}]' <<< "$ITEMS_JSON")
done

# Anchor the manifest.
curl -X POST https://app.satsignal.cloud/api/v1/anchors \
  -H "Authorization: Bearer $SATSIGNAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg folder "$FOLDER" --argjson items "$ITEMS_JSON" '{
    folder_slug: $folder,
    items: $items,
    category: "evidence_bundle",
    label: "release v1.2.3"
  }')"

Reveal one row to a counterparty

After anchoring, you (the manifest holder) build an inclusion proof for one item:

import hashlib, json

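def scj_bytes(obj):
    # SCJ-v1 style bytes: sorted keys, compact separators, raw UTF-8
    # (no \uXXXX escapes). This is deliberately not RFC 8785/JCS.
    # SCJ-v1 also specifies NFC; this sketch assumes labels are already
    # NFC-normalized.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def leaf_hash(item):
    return hashlib.sha256(scj_bytes(item)).digest()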

def build_tree(leaves):
    """Returns the level-by-level list, with the root last."""
    levels = [leaves]
    cur = leaves
    while len(cur) > 1:
        if len(cur) % 2 == 1:
            cur = cur + [cur[-1]]
        cur = [hashlib.sha256(cur[i] + cur[i+1]).digest()
               for i in range(0, len(cur), 2)]
        levels.append(cur)
    return levels

def inclusion_path(levels, index):
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # last-node duplication
        path.append(level[sibling])
        index //= 2
    return path

items = [...]  # your stored items list, in submission order
leaves = [leaf_hash(it) for it in items]
levels = build_tree(leaves)
root = levels[-1][0]

target_index = 7
reveal = {
    "item": items[target_index],
    "index": target_index,
    "leaf_count": len(items),
    "inclusion_path": [h.hex() for h in inclusion_path(levels, target_index)],
    "root": root.hex(),
}

A counterparty with just the reveal object + the on-chain root (via proof_url or the .mbnt bundle) walks the path, recomputes the root, confirms it matches. The other 9,999 items stay opaque.
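The counterparty's side of that exchange can be sketched like this. It assumes the reveal object has the shape built above, that leaf bytes follow the SCJ-v1 summary (sorted keys, compact separators, raw UTF-8, labels already NFC), and that inner nodes are sha256(left || right) over raw digests. The function name verify_reveal is illustrative. The trusted root must come from the bundle or proof_url, never from the reveal object itself.

```python
import hashlib
import json


def verify_reveal(reveal: dict, trusted_root_hex: str) -> bool:
    """Walk an inclusion path from one item up to an independently obtained root."""
    index = reveal["index"]
    leaf_count = reveal["leaf_count"]
    if not 0 <= index < leaf_count:
        return False

    # The path must have exactly one sibling per tree level.
    expected_len, width = 0, leaf_count
    while width > 1:
        width = (width + 1) // 2
        expected_len += 1
    if len(reveal["inclusion_path"]) != expected_len:
        return False

    item = reveal["item"]
    leaf_bytes = json.dumps(
        {"label": item["label"], "sha256_hex": item["sha256_hex"]},
        sort_keys=True, separators=(",", ":"), ensure_ascii=False,
    ).encode("utf-8")
    node = hashlib.sha256(leaf_bytes).digest()

    for sibling_hex in reveal["inclusion_path"]:
        sibling = bytes.fromhex(sibling_hex)
        if index % 2 == 0:
            node = hashlib.sha256(node + sibling).digest()  # we are the left child
        else:
            node = hashlib.sha256(sibling + node).digest()  # we are the right child
        index //= 2

    return node.hex() == trusted_root_hex
```

A second, separate step confirms that the revealed sha256_hex matches the disclosed row bytes; the path only proves that the {label, sha256_hex} pair is in the anchored batch.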

7. Production notes

MAX_LEAVES = 10,000

Hard cap at the API layer. Above that, the request 400s. For larger batches, split into sub-manifests + a top-level manifest:

sub-1 (10000 items) → root_1 → anchor → manifest-proof-1
sub-2 (10000 items) → root_2 → anchor → manifest-proof-2
top (items=[{label: "sub-1", sha256_hex: root_1},
            {label: "sub-2", sha256_hex: root_2}]) → anchor

The top-level manifest binds the sub-manifests; a verifier can prove one row from sub-1 with: (a) inclusion proof in sub-1's manifest, (b) inclusion proof in the top manifest, (c) chain confirmation of the top.
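The split itself is mechanical. A minimal sketch, with hypothetical helper names: chunk the items list at the cap, anchor each chunk as its own manifest, then feed the returned root values into a top-level items list using the sub-N label convention from the diagram above.

```python
MAX_LEAVES = 10_000


def split_into_sub_manifests(items: list, max_leaves: int = MAX_LEAVES) -> list:
    """Chunk an oversized items list, preserving order within and across chunks."""
    return [items[i:i + max_leaves] for i in range(0, len(items), max_leaves)]


def top_level_items(sub_roots: list) -> list:
    """Build the top manifest's items from each sub-manifest's returned root."""
    return [{"label": f"sub-{n}", "sha256_hex": root}
            for n, root in enumerate(sub_roots, start=1)]


# 25,000 placeholder items split into chunks of 10,000 / 10,000 / 5,000.
chunks = split_into_sub_manifests([{"label": f"item-{i}"} for i in range(25_000)])
print([len(c) for c in chunks])
```

Two levels cover up to 10,000 × 10,000 items. Record which chunk and position each original item landed in, since proving a row later needs both the sub-manifest's items list and the top manifest's.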

`force_new` for dedup override

Re-submitting the same items[] list produces the same Merkle root, which hits the default-dedup gate and returns the original anchor's proof_id without burning quota. This is usually what you want.

If you want a fresh anchor on a re-submission (e.g. debugging, or two logically-distinct manifests that happen to share an items list), set force_new: true in the body. The notary anchors a fresh transaction.

{
  "folder_slug": "...",
  "items": [...],
  "category": "evidence_bundle",
  "force_new": true
}

Item order matters

The Merkle root commits to the items in submission order. Sorting the items at submission time gives you a deterministic root that can be reproduced from the (sorted) items list later; submitting in arbitrary order gives a root that depends on the original ordering, which you must preserve.

Pick one strategy and stick to it. The most robust pattern is to sort by label before submission; then any verifier can reproduce the manifest deterministically from just the items content.

Recommended: keep an items index file alongside the manifest. The anchor response does not echo the items[] array back. The downloaded .mbnt's proofs.json does carry the ordered leaves (label + sha256_hex + derived leaf_hash), but the server-side bundle copy is only kept until the proof is deleted — if it is deleted and you hold no local copy, the submission order is gone. The blessed shape is a plain JSON file written at submission time and stored next to your cached .mbnt, under the same retention policy as the artifacts:

{
  "proof_id": "abc123def456...",
  "txid": "5e9a...c4f1",
  "root": "<64 hex Merkle root from the response>",
  "leaf_count": 2,
  "items": [
    {"index": 0, "label": "rows/0001.json", "sha256_hex": "63d5c3e6..."},
    {"index": 1, "label": "rows/0002.json", "sha256_hex": "3cad58f5..."}
  ]
}

items is the exact array you submitted, in submission order, with an explicit index so a partial copy is detectable. This is a client-side convention, not an API surface — any verifier can take the file, recompute the leaves per the leaf-construction rule above, and confirm the recomputed root equals both the root recorded here and the one in the bundle's canonical doc.
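Writing and re-reading that index file might look like the sketch below. The helper names are hypothetical, and since this is a client-side convention the file layout is simply the one shown above. Pretty-printing is harmless here because the index file itself is never hashed; only the label and sha256_hex values feed the leaves.

```python
import json
from pathlib import Path


def write_items_index(path, response: dict, items: list) -> None:
    """Write the items index next to the cached .mbnt at submission time."""
    if response["leaf_count"] != len(items):
        raise ValueError("leaf_count in the response does not match items sent")
    index = {
        "proof_id": response["proof_id"],
        "txid": response["txid"],
        "root": response["root"],
        "leaf_count": response["leaf_count"],
        "items": [{"index": i, "label": it["label"], "sha256_hex": it["sha256_hex"]}
                  for i, it in enumerate(items)],
    }
    Path(path).write_text(json.dumps(index, indent=2, ensure_ascii=False),
                          encoding="utf-8")


def read_items_index(path) -> list:
    """Return the items in submission order, refusing partial or shuffled copies."""
    index = json.loads(Path(path).read_text(encoding="utf-8"))
    entries = index["items"]
    positions = [entry["index"] for entry in entries]
    if len(entries) != index["leaf_count"] or positions != list(range(len(entries))):
        raise ValueError("items index is partial or out of order")
    return [{"label": e["label"], "sha256_hex": e["sha256_hex"]} for e in entries]
```

read_items_index returns exactly the array shape that was submitted, so its output can be passed straight into a leaf and root recomputation.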

Idempotency

Idempotency-Key works the same as standard mode. The body-hash check covers the full items[] list — a retry with one item modified is a different body and returns 409 idempotency_key_reuse_body_mismatch on the same key.

Sealed manifests

A sealed manifest is documented in Sealed mode under the merkle-row-sealed-v1 scheme. The wire shape is similar but with HMAC algos and a salt_b64 field; the canonical doc binds {leaf_count, root, scheme} and the per-leaf material rides in proofs.json off-chain.

Rate limits & quota

One manifest anchor = one anchor against the monthly quota, regardless of how many items it carries. This is the primary quota advantage of manifest-backed proofs: a 5,000-row eval run consumes one anchor slot, not 5,000.

Rate-limit behavior is unchanged from standard mode — the plan quota window is the only key-level throttle; there is no separate hourly burst limit (see Files §7).

Bundle size

The .mbnt bundle carries proofs.json with the full merkle_leaves[] array: 64 hex chars per leaf, ~64 bytes per entry. A 10,000-item manifest produces a proofs.json of roughly 650 KB. The bundle_url download is bearer-auth gated; size is rarely an issue, but be aware of it on slow networks.

For the full pre-flight checklist (key rotation, broadcast failure recovery, support flow), see Production checklist.

8. Errors you might see

code	name	meaning	what to do
`400`	`manifest_too_many_items`	> 10,000 items	split into sub-manifests + top-level manifest
`400`	`manifest_empty`	empty `items[]`	items must have ≥ 1 entry
`400`	`manifest_label_too_long`	a label > 256 chars	truncate or restructure
`400`	`manifest_sha256_invalid`	a `sha256_hex` isn't 64 lowercase hex	regenerate hashes; use lowercase
`400`	`manifest_disallows_sha256`	sent `sha256_hex` at top level alongside `items`	drop the top-level `sha256_hex`
`400`	`manifest_disallows_file_size`	sent `file_size` at top level alongside `items`	drop the top-level `file_size`
`404`	`folder_not_found`	folder doesn't exist	create the folder
`409`	`idempotency_key_reuse_body_mismatch`	key reuse with a different body	use a fresh key
`429`	`quota_exceeded`	monthly anchor count exhausted	email `hello@satsignal.cloud` for a cap lift, or wait for the monthly reset

9. Legacy field aliases (if your code sends them)

Same as all other paths: responses emit the canonical proof / folder names only; requests written against the legacy spellings keep working forever.

Full canonical/legacy mapping across endpoints, fields, scopes, CLI flags, and error codes: Compatibility map.

10. Where this fits

For the full Merkle-construction rule + the canonical-doc shape for manifest-items-v1, see bundle-v1 §3.4 + §4.3.
For row-level schemes (merkle-row-v1 and the sealed variant merkle-row-sealed-v1), see /spec-merkle-row.
For sealed manifests (low-entropy rows that need HMAC leaves instead of plain SHA-256), see Sealed mode.
For agent-session manifests (the final binding artifact at session end), see Agents.
If you have only one artifact, use Files — manifest mode is overkill for N=1.
For "what counts as the canonical bytes of an item" for various row shapes, see What to hash — Manifest rows / tables.

Questions about this specification? Email hello@satsignal.cloud.