Manifest-backed proofs — Merkle-batch up to 10,000 items in one anchor

When you have many small items to anchor (every CSV row, every JSON record, every eval verdict, every line of a daily ledger), anchoring each one separately costs one transaction per item. Manifest-backed proofs bind up to 10,000 items under a Merkle root, then anchor that root in one transaction. Each item can still be proven individually via a Merkle inclusion path; the chain footprint is one OP_RETURN regardless of N.

Companion docs: API reference · OpenAPI spec · What to hash · Bundle spec — manifest-items-v1 · Production checklist · Compatibility map

What "manifest mode" means, and doesn't. The anchor mode field reads back as standard, sealed, or manifest, but those are not three parallel privacy tiers you pick between. "Manifest" only describes the shape of one anchor: it commits a Merkle root over an items[] batch instead of a single file's hash. You don't set it; the notary auto-selects it from the presence of items[] (sending mode: "manifest" is unnecessary, and sending it alongside sha256_hex 400s). Two clarifications this guide assumes:

- It's the plain-SHA-256 batch path. If your rows are low-entropy and need HMAC leaves, you want a sealed manifest: the same batching idea, but a different wire shape sent via mode: "sealed" with a proof_set (see Sealed mode), not the items[] path below. "Sealed" and "manifest" are orthogonal axes (single-vs-batch × plain-vs-sealed), not mutually exclusive modes.
- The leaf rule here is manifest-items-v1, which is not the csv-row-v1 rule the selective-disclosure / redaction tooling binds to. Don't reach for items[] to mint a disclosure carrier; that path goes through the disclosure builder / sealed proof_set, not here.

1. The 60-second framing

You send a list of {label, sha256_hex} items. The notary builds a Merkle tree over them (one SHA-256 over the canonicalized {label, sha256_hex} per leaf, plain SHA-256 inner nodes, last-node duplication at odd levels) and anchors a transaction that commits to that root; on chain, the commitment is the canonical record's 20-byte doc_hash. It returns one anchor: proof_id, txid, leaf_count, root. Inclusion proofs for individual leaves are re-derivable from the items list later; the chain pins the root (via that 20-byte doc_hash), your store carries the leaves.

Concrete: an eval pipeline scoring 5,000 model outputs anchors all 5,000 scores in one transaction. A reviewer who later wants to verify one specific score gets the row + the inclusion path (O(log N) sibling hashes) and confirms it against the on-chain root, without seeing or needing the other 4,999 rows.

2. Use this when

Don't use this when:

3. What you send

POST /api/v1/anchors with items (the presence of items selects manifest mode; you MUST NOT also send sha256_hex or file_size).

{
  "folder_slug": "agent-runs-prod",
  "items": [
    {"label": "Q1", "sha256_hex": "10343a87...aa921669"},
    {"label": "Q2", "sha256_hex": "f68246b5...4b56894c"}
  ],
  "category": "evidence_bundle",
  "label": "math-eval verdicts 2026-05-07",
  "session_id": "run-2026-05-09-001"
}

| field | type | required | meaning |
|---|---|---|---|
| folder_slug | string | yes | folder slug. |
| items | array | yes | 1..10,000 items. Each item is {label: string, sha256_hex: 64-hex}. |
| category | string | yes | usually evidence_bundle. Other enum values allowed. |
| label | string | no | free-text tag for the manifest itself (display only). |
| session_id | string | no | grouping key for GET /api/v1/anchors?session_id=.... |
| force_new | bool | no | bypass the same-root dedup gate. |

MAX_LEAVES = 10000. Above that, the request 400s with manifest_too_many_items.

Item shape

{"label": "row-7", "sha256_hex": "abc123..."}

| field | type | constraints |
|---|---|---|
| label | string | 1..256 chars. Preserved exactly: bytes matter (used in the leaf-construction rule). |
| sha256_hex | string | 64 lowercase hex chars. Hash of whatever item you're committing: a row, a file, a JSON envelope, etc. |
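
The constraints above can be checked client-side before a request is sent. The following is a minimal sketch, not part of the API: `check_items` is a hypothetical helper name, and it assumes the 256-character label limit counts characters rather than bytes, which this guide does not spell out.

```python
import re

MAX_LEAVES = 10_000       # cap stated in this guide
MAX_LABEL_CHARS = 256     # label limit stated in this guide (assumed to count characters)
HEX64 = re.compile(r"[0-9a-f]{64}")


def check_items(items):
    """Return a list of problem strings; an empty list means no problem was found."""
    problems = []
    if not 1 <= len(items) <= MAX_LEAVES:
        problems.append(f"items must hold 1..{MAX_LEAVES} entries, got {len(items)}")
    for pos, item in enumerate(items):
        label = item.get("label")
        digest = item.get("sha256_hex")
        if not isinstance(label, str) or not 1 <= len(label) <= MAX_LABEL_CHARS:
            problems.append(f"item {pos}: label must be a 1..{MAX_LABEL_CHARS}-char string")
        if not isinstance(digest, str) or not HEX64.fullmatch(digest):
            problems.append(f"item {pos}: sha256_hex must be 64 lowercase hex chars")
    return problems
```

A check like this only mirrors the documented limits; the notary's own 400 responses remain the authority.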

The label is part of the leaf hash. Labels can leak metadata — "alice-bid", "bob-bid" in a sealed-bid auction leaks bidder names even if the bid amounts stay sealed. Use neutral labels (row-7, item-12) where label privacy matters; see the threat-model note in /spec-merkle-row.

Leaf-construction rule

Each leaf is sha256(canonical_bytes(label, sha256_hex)): the notary applies SCJ-v1 canonicalization (sorted keys, compact, NFC; deliberately NOT RFC 8785/JCS; see /spec-provenance §3) to {label, sha256_hex}, encodes the result as UTF-8, and takes its SHA-256. Inner nodes are plain SHA-256. At odd levels the last node is duplicated, which rounds the tree up to a complete binary tree.

For verification later, a verifier with the items list reproduces the leaves and the tree identically — SCJ-v1 is deterministic.
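
To make the leaf rule concrete, here is a rough Python approximation of the bytes that get hashed for one leaf. It is a sketch under assumptions: `leaf_bytes` and `leaf_hash` are illustrative names, applying NFC to the label is one reading of the "NFC" part of SCJ-v1, and the reference implementations and pinned test vectors in the spec are the authority if they disagree.

```python
import hashlib
import json
import unicodedata


def leaf_bytes(label, sha256_hex):
    """Approximate SCJ-v1 bytes for one leaf: sorted keys, compact, NFC, UTF-8."""
    obj = {
        "label": unicodedata.normalize("NFC", label),  # assumption: NFC applies to the label
        "sha256_hex": sha256_hex,
    }
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def leaf_hash(label, sha256_hex):
    """SHA-256 over the canonical leaf bytes (32 raw bytes)."""
    return hashlib.sha256(leaf_bytes(label, sha256_hex)).digest()


example = leaf_bytes("row-7", "ab" * 32)
print(example.decode("utf-8"))
```

Because the label sits inside the hashed bytes, "row-7" and "row-07" produce different leaves even when they point at the same sha256_hex.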

Response

{
  "proof_id": "abc123def456...",
  "txid": "5e9a...c4f1",
  "mode": "manifest",
  "category": "evidence_bundle",
  "folder_slug": "agent-runs-prod",
  "proof_url": "https://app.satsignal.cloud/w/.../r/abc123def456",
  "bundle_url": "https://app.satsignal.cloud/bundle/abc123def456.mbnt",
  "leaf_count": 2,
  "root": "<64 hex Merkle root>"
}

leaf_count and root are manifest-mode-specific. The .mbnt bundle's canonical doc carries subject.kind == "manifest" and subject.scheme == "manifest-items-v1"; verifiers dispatch on the canonical-doc subject, not on the manifest layer (see bundle-v1 §3.4).

4. What you store

Database row shape (reference)

| column | type | notes |
|---|---|---|
| proof_id | text PK | from response |
| txid | text | from response |
| root | text | Merkle root |
| leaf_count | int | count of items |
| folder_slug | text | namespace |
| session_id | text | optional grouping |

Plus a separate items table:

| column | type | notes |
|---|---|---|
| proof_id | text FK | links to manifest |
| position | int | 0-indexed submission order |
| label | text | as submitted |
| sha256_hex | text | as submitted |
| artifact_path | text | where the original bytes for this item live |
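
As one possible realization of the two tables above, the sketch below uses SQLite from the Python standard library. The table names `manifests` and `manifest_items` and the helper names are hypothetical; only the columns come from this guide.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS manifests (
    proof_id    TEXT PRIMARY KEY,
    txid        TEXT NOT NULL,
    root        TEXT NOT NULL,
    leaf_count  INTEGER NOT NULL,
    folder_slug TEXT NOT NULL,
    session_id  TEXT
);
CREATE TABLE IF NOT EXISTS manifest_items (
    proof_id      TEXT NOT NULL REFERENCES manifests(proof_id),
    position      INTEGER NOT NULL,   -- 0-indexed submission order
    label         TEXT NOT NULL,
    sha256_hex    TEXT NOT NULL,
    artifact_path TEXT,
    PRIMARY KEY (proof_id, position)
);
"""


def store_manifest(conn, response, items, session_id=None):
    """Persist the anchor response and the submitted items in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO manifests VALUES (?, ?, ?, ?, ?, ?)",
            (response["proof_id"], response["txid"], response["root"],
             response["leaf_count"], response["folder_slug"], session_id),
        )
        conn.executemany(
            "INSERT INTO manifest_items VALUES (?, ?, ?, ?, ?)",
            [(response["proof_id"], pos, it["label"], it["sha256_hex"],
              it.get("artifact_path")) for pos, it in enumerate(items)],
        )


def load_items(conn, proof_id):
    """Return the items in submission order, ready for tree reconstruction."""
    rows = conn.execute(
        "SELECT label, sha256_hex FROM manifest_items "
        "WHERE proof_id = ? ORDER BY position",
        (proof_id,),
    )
    return [{"label": label, "sha256_hex": digest} for label, digest in rows]
```

Keying the items table on (proof_id, position) keeps submission order explicit, which matters because the root depends on it.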

5. What verification needs later

Verification splits into two cases.

Case A — verify the manifest as a whole

Three things:

  1. The .mbnt bundle — carries manifest-items-v1 canonical doc with the root.
  2. The items list — fetched from your store, in submission order.
  3. A BSV node — any public one.

The verifier rebuilds the Merkle tree from the items, compares the root to the canonical doc's subject.root, and confirms the canonical doc's hash matches the on-chain OP_RETURN payload.

The manifest-mode canonical subject is exactly this shape (plus an optional category):

"subject": {
  "kind": "manifest",
  "scheme": "manifest-items-v1",
  "algo": "sha256",
  "leaf_count": 3,
  "root": "<64-hex Merkle root>"
}

Note where the root lives: subject.root — not subject.proofs.chunk_merkle.root. The proofs.chunk_merkle path belongs to a different shape — a chunked file anchor (a single file anchored with a proof_set); readers pattern-matching from the file-anchor docs hit exactly that mix-up and conclude a valid manifest proof mismatches. The per-item leaves ride in the bundle's proofs.json, not in the canonical doc.
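
The root-comparison step of Case A can be sketched as follows. This is an illustration rather than the reference verifier: `check_subject` is a hypothetical helper, the canonicalization is a plain-Python approximation of SCJ-v1, and the on-chain comparison of the canonical doc's hash is left out.

```python
import hashlib
import json


def leaf_hash(item):
    """Leaf = SHA-256 over the compact, key-sorted JSON of {label, sha256_hex}."""
    leaf = {"label": item["label"], "sha256_hex": item["sha256_hex"]}
    text = json.dumps(leaf, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(text.encode("utf-8")).digest()


def merkle_root(leaves):
    """Plain SHA-256 inner nodes; the last node is duplicated at odd levels."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


def check_subject(subject, items):
    """Return None if the items reproduce subject.root, else a short reason."""
    if subject.get("kind") != "manifest" or subject.get("scheme") != "manifest-items-v1":
        return "not a manifest-items-v1 subject"
    if not items or subject.get("leaf_count") != len(items):
        return "leaf_count mismatch"
    rebuilt = merkle_root([leaf_hash(it) for it in items]).hex()
    if rebuilt != subject.get("root"):  # note: subject.root, not proofs.chunk_merkle.root
        return "root mismatch"
    return None
```

A "root mismatch" on a manifest you believe is valid is most often an ordering or canonicalization difference, so it is worth checking those before suspecting the anchor.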

Case B — verify one item without disclosing the rest

Build an inclusion proof for the target item:

  1. Index i of the item in the submission order.
  2. The item itself: {label, sha256_hex}.
  3. The sibling hashes along the path from leaf i to the root (O(log N) hashes). These can be computed by anyone with the full items list — usually the manifest holder.

A verifier with just the item + the inclusion proof + the root (read from the bundle or from proof_url) re-walks the path, confirms the root matches the on-chain anchor, and confirms the item hash is what the inclusion proof claims. The other items stay opaque.

For tabular data specifically, the merkle-row-v1 scheme documents the canonical row-encoding rule.

Cross-link: What to hash — Manifest rows / tables covers how to compute the per-item sha256_hex for common item shapes (CSV rows, JSON records, eval result rows).

Byte-exact canonicalization — the #1 verification gotcha

A proof commits to the exact bytes you hashed, not to the logical JSON object. Re-serializing the same object differently — json.dumps(..., indent=2), a different key order, an extra trailing newline — changes the sha256, and the verifier reports a sha256 mismatch even though "the data" looks identical. This bites in two places:

```python
canonical = json.dumps(row, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
sha256_hex = hashlib.sha256(canonical).hexdigest()
```

The exact SCJ-v1 rule — with reference implementations and pinned test vectors — is specified in spec-provenance §canonicalization.
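
A small demonstration of the gotcha, using only the standard library. The row contents are made up for illustration; the point is that three serializations of the same logical object hash to three different values.

```python
import hashlib
import json

# Hypothetical eval row, used only to illustrate the effect.
row = {"verdict": "pass", "score": 0.93, "id": 17}

compact = json.dumps(row, sort_keys=True, separators=(",", ":"),
                     ensure_ascii=False).encode("utf-8")
pretty = json.dumps(row, sort_keys=True, indent=2,
                    ensure_ascii=False).encode("utf-8")
with_newline = compact + b"\n"  # same text plus one trailing newline

digests = {
    "compact": hashlib.sha256(compact).hexdigest(),
    "pretty": hashlib.sha256(pretty).hexdigest(),
    "with_newline": hashlib.sha256(with_newline).hexdigest(),
}
for name, digest in digests.items():
    print(f"{name:13s} {digest}")
```

All three byte strings parse back to the same object, yet only the one that matches the bytes hashed at submission time will verify.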

6. Copy-paste example

Anchor a 1000-row eval result set

import hashlib, json, os, urllib.request

API = "https://app.satsignal.cloud"
KEY = os.environ["SATSIGNAL_API_KEY"]
FOLDER = "eval-runs-prod"

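# Build the items list. Each item's sha256_hex is the SHA-256 of the
# row's canonical bytes. `eval_results` is your own list of
# JSON-serializable row dicts, loaded before this point.
items = []
for i, row in enumerate(eval_results):
    row_bytes = json.dumps(row, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False).encode("utf-8")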
    items.append({
        "label": f"item-{i:04d}",
        "sha256_hex": hashlib.sha256(row_bytes).hexdigest(),
    })

body = json.dumps({
    "folder_slug": FOLDER,
    "items": items,
    "category": "evidence_bundle",
    "label": "math-eval verdicts 2026-05-26",
    "session_id": "run-2026-05-26-001",
}).encode("utf-8")

req = urllib.request.Request(
    f"{API}/api/v1/anchors",
    data=body, method="POST",
    headers={"Authorization": f"Bearer {KEY}",
             "Content-Type": "application/json",
             "Idempotency-Key": "eval-run-2026-05-26-001"},
)
with urllib.request.urlopen(req) as resp:
    out = json.load(resp)

print(f"Manifest anchored: {out['proof_id']}")
print(f"  txid:       {out['txid']}")
print(f"  leaf_count: {out['leaf_count']}")
print(f"  root:       {out['root']}")

Anchor a multi-file release

export SATSIGNAL_API_KEY=sk_...
export FOLDER=release-gates

# Hash each file, build the items list.
ITEMS_JSON=$(jq -n '[]')
for f in dist/*; do
  HASH=$(sha256sum "$f" | awk '{print $1}')
  LABEL=$(basename "$f")
  ITEMS_JSON=$(jq --arg l "$LABEL" --arg h "$HASH" \
    '. + [{label: $l, sha256_hex: $h}]' <<< "$ITEMS_JSON")
done

# Anchor the manifest.
curl -X POST https://app.satsignal.cloud/api/v1/anchors \
  -H "Authorization: Bearer $SATSIGNAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg folder "$FOLDER" --argjson items "$ITEMS_JSON" '{
    folder_slug: $folder,
    items: $items,
    category: "evidence_bundle",
    label: "release v1.2.3"
  }')"

Reveal one row to a counterparty

After anchoring, you (the manifest holder) build an inclusion proof for one item:

import hashlib, json

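def scj_bytes(obj):
    # SCJ-v1 style bytes: sorted keys, compact separators, raw UTF-8.
    # Not RFC 8785/JCS. SCJ-v1 also specifies NFC; see /spec-provenance.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def leaf_hash(item):
    # Hash only the two committed fields, so extra stored keys
    # (for example an "index") cannot change the leaf.
    leaf = {"label": item["label"], "sha256_hex": item["sha256_hex"]}
    return hashlib.sha256(scj_bytes(leaf)).digest()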

def build_tree(leaves):
    """Returns the level-by-level list, with the root last."""
    levels = [leaves]
    cur = leaves
    while len(cur) > 1:
        if len(cur) % 2 == 1:
            cur = cur + [cur[-1]]
        cur = [hashlib.sha256(cur[i] + cur[i+1]).digest()
               for i in range(0, len(cur), 2)]
        levels.append(cur)
    return levels

def inclusion_path(levels, index):
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # last-node duplication
        path.append(level[sibling])
        index //= 2
    return path

items = [...]  # your stored items list, in submission order
leaves = [leaf_hash(it) for it in items]
levels = build_tree(leaves)
root = levels[-1][0]

target_index = 7
reveal = {
    "item": items[target_index],
    "index": target_index,
    "leaf_count": len(items),
    "inclusion_path": [h.hex() for h in inclusion_path(levels, target_index)],
    "root": root.hex(),
}

A counterparty with just the reveal object and the on-chain root (via proof_url or the .mbnt bundle) walks the path, recomputes the root, and confirms it matches. The other items stay opaque.
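
The counterparty's side of that exchange might look like the sketch below. It is illustrative only: `verify_reveal` is a hypothetical name, the leaf canonicalization is a plain-Python approximation of SCJ-v1, and the root and leaf count are assumed to come from a source the verifier trusts (the bundle's canonical doc), not from the reveal object itself.

```python
import hashlib
import json


def leaf_hash(item):
    leaf = {"label": item["label"], "sha256_hex": item["sha256_hex"]}
    text = json.dumps(leaf, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(text.encode("utf-8")).digest()


def verify_reveal(reveal, trusted_root_hex, trusted_leaf_count):
    """Walk the inclusion path from the revealed item up to the root."""
    index = reveal["index"]
    if not 0 <= index < trusted_leaf_count:
        return False
    # Expected path length for a tree that duplicates the last node at odd levels.
    depth, width = 0, trusted_leaf_count
    while width > 1:
        width = (width + 1) // 2
        depth += 1
    if len(reveal["inclusion_path"]) != depth:
        return False
    node = leaf_hash(reveal["item"])
    for sibling_hex in reveal["inclusion_path"]:
        sibling = bytes.fromhex(sibling_hex)
        # Even index: current node is the left child; odd index: the right child.
        pair = node + sibling if index % 2 == 0 else sibling + node
        node = hashlib.sha256(pair).digest()
        index //= 2
    return node.hex() == trusted_root_hex


# Three-item demo, revealing the last item (its sibling is its own duplicate).
items = [{"label": f"row-{i}",
          "sha256_hex": hashlib.sha256(f"payload {i}".encode()).hexdigest()}
         for i in range(3)]
l0, l1, l2 = (leaf_hash(it) for it in items)
left = hashlib.sha256(l0 + l1).digest()
right = hashlib.sha256(l2 + l2).digest()
root_hex = hashlib.sha256(left + right).hexdigest()
reveal = {"item": items[2], "index": 2, "leaf_count": 3,
          "inclusion_path": [l2.hex(), left.hex()], "root": root_hex}
print(verify_reveal(reveal, root_hex, 3))
```

Taking the root from the anchor rather than from the reveal object is the design point: a reveal that carries its own root proves nothing by itself.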

7. Production notes

MAX_LEAVES = 10,000

Hard cap at the API layer. Above that, the request 400s. For larger batches, split into sub-manifests + a top-level manifest:

sub-1 (5000 items) → root_1 → anchor → manifest-proof-1
sub-2 (5000 items) → root_2 → anchor → manifest-proof-2
top (items=[{label: "sub-1", sha256_hex: root_1},
            {label: "sub-2", sha256_hex: root_2}]) → anchor

The top-level manifest binds the sub-manifests; a verifier can prove one row from sub-1 with: (a) inclusion proof in sub-1's manifest, (b) inclusion proof in the top manifest, (c) chain confirmation of the top.
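
The split itself can be sketched in a few lines. The helper names are hypothetical, and the sketch assumes each sub-manifest's root is read from the root field of its own anchor response before the top-level items are built.

```python
MAX_LEAVES = 10_000  # API cap stated in this guide


def split_items(items, max_leaves=MAX_LEAVES):
    """Cut an oversized items list into consecutive sub-manifests, order preserved."""
    return [items[i:i + max_leaves] for i in range(0, len(items), max_leaves)]


def top_level_items(sub_roots):
    """Build the top manifest's items from the sub-manifest roots, in order.

    sub_roots: 64-hex root strings taken from each sub-manifest's anchor response.
    """
    return [{"label": f"sub-{n}", "sha256_hex": root}
            for n, root in enumerate(sub_roots, start=1)]
```

For example, 25,000 items split into sub-manifests of 10,000, 10,000, and 5,000, and the top manifest then carries three items. The top manifest is itself subject to the same 10,000-item cap.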

force_new for dedup override

Re-submitting the same items[] list produces the same Merkle root, which hits the default-dedup gate and returns the original anchor's proof_id without burning quota. This is usually what you want.

If you want a fresh anchor on a re-submission (e.g. debugging, or two logically-distinct manifests that happen to share an items list), set force_new: true in the body. The notary anchors a fresh transaction.

{
  "folder_slug": "...",
  "items": [...],
  "category": "evidence_bundle",
  "force_new": true
}

Item order matters

The Merkle root commits to the items in submission order. Sorting the items at submission time gives you a deterministic root that can be reproduced from the (sorted) items list later; submitting in arbitrary order gives a root that depends on the original ordering, which you must preserve.

Pick one strategy and stick to it. The most robust pattern is to sort by label before submission; then any verifier can reproduce the manifest deterministically from just the items content.
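
One way to apply the sort-by-label pattern is sketched below. `sorted_for_submission` is a hypothetical helper; breaking ties on sha256_hex is an addition of this sketch so that the order stays deterministic even if two items share a label, which this guide neither requires nor forbids.

```python
def sorted_for_submission(items):
    """Order items by label bytes, then by hash, independent of input order."""
    return sorted(
        items,
        key=lambda it: (it["label"].encode("utf-8"), it["sha256_hex"]),
    )
```

Whatever ordering rule you choose, a verifier has to apply the same one, so it is worth recording it next to the items.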

Recommended: keep an items index file alongside the manifest. The anchor response does not echo the items[] array back. The downloaded .mbnt's proofs.json does carry the ordered leaves (label + sha256_hex + derived leaf_hash), but the server-side bundle copy is only kept until the proof is deleted — if it is deleted and you hold no local copy, the submission order is gone. The blessed shape is a plain JSON file written at submission time and stored next to your cached .mbnt, under the same retention policy as the artifacts:

{
  "proof_id": "abc123def456...",
  "txid": "5e9a...c4f1",
  "root": "<64 hex Merkle root from the response>",
  "leaf_count": 2,
  "items": [
    {"index": 0, "label": "rows/0001.json", "sha256_hex": "63d5c3e6..."},
    {"index": 1, "label": "rows/0002.json", "sha256_hex": "3cad58f5..."}
  ]
}

items is the exact array you submitted, in submission order, with an explicit index so a partial copy is detectable. This is a client-side convention, not an API surface — any verifier can take the file, recompute the leaves per the leaf-construction rule above, and confirm the recomputed root equals both the root recorded here and the one in the bundle's canonical doc.
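
Writing and sanity-checking such an index file could look like this. It is a client-side sketch with hypothetical helper names; the field names follow the JSON shape shown above.

```python
import json


def build_index(response, items):
    """Assemble the index document from the anchor response and submitted items."""
    return {
        "proof_id": response["proof_id"],
        "txid": response["txid"],
        "root": response["root"],
        "leaf_count": response["leaf_count"],
        "items": [
            {"index": i, "label": it["label"], "sha256_hex": it["sha256_hex"]}
            for i, it in enumerate(items)
        ],
    }


def write_index(path, index_doc):
    """Write the index next to the cached .mbnt, as UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(index_doc, fh, ensure_ascii=False, indent=2)


def index_is_complete(index_doc):
    """Detect a partial or reordered copy via leaf_count and the explicit index."""
    entries = index_doc["items"]
    return (len(entries) == index_doc["leaf_count"]
            and all(e["index"] == i for i, e in enumerate(entries)))
```

The completeness check only catches missing or reordered entries; confirming the contents still requires recomputing the root from the listed items.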

Idempotency

Idempotency-Key works the same as standard mode. The body-hash check covers the full items[] list — a retry with one item modified is a different body and returns 409 idempotency_key_reuse_body_mismatch on the same key.

Sealed manifests

A sealed manifest is documented in Sealed mode under the merkle-row-sealed-v1 scheme. The wire shape is similar but with HMAC algos and a salt_b64 field; the canonical doc binds {leaf_count, root, scheme} and the per-leaf material rides in proofs.json off-chain.

Rate limits & quota

One manifest anchor = one anchor against the monthly quota, regardless of how many items it carries. This is the primary quota advantage of manifest-backed proofs: a 5,000-row eval run consumes one anchor slot, not 5,000.

Rate-limit behavior is unchanged from standard mode — the plan quota window is the only key-level throttle; there is no separate hourly burst limit (see Files §7).

Bundle size

The .mbnt bundle carries proofs.json with the full merkle_leaves[] array: 64 hex chars per leaf, roughly 64 bytes per entry. A 10,000-item manifest produces a proofs.json of around 650 KB. The bundle_url download is bearer-auth gated; size is rarely an issue, but be aware of it on slow networks.

For the full pre-flight checklist (key rotation, broadcast failure recovery, support flow), see Production checklist.

8. Errors you might see

| code | name | meaning | what to do |
|---|---|---|---|
| 400 | manifest_too_many_items | > 10,000 items | split into sub-manifests + top-level manifest |
| 400 | manifest_empty | empty items[] | items must have ≥ 1 entry |
| 400 | manifest_label_too_long | a label > 256 chars | truncate or restructure |
| 400 | manifest_sha256_invalid | a sha256_hex isn't 64 lowercase hex | regenerate hashes; use lowercase |
| 400 | manifest_disallows_sha256 | sent sha256_hex at top level alongside items | drop the top-level sha256_hex |
| 400 | manifest_disallows_file_size | sent file_size at top level alongside items | drop the top-level file_size |
| 404 | folder_not_found | folder doesn't exist | create the folder |
| 409 | idempotency_key_reuse_body_mismatch | key reuse with a different body | use a fresh key |
| 429 | quota_exceeded | monthly anchor count exhausted | email hello@satsignal.cloud for a cap lift, or wait for the monthly reset |

9. Legacy field aliases (if your code sends them)

Same as all other paths: responses emit the canonical proof / folder names only; requests written against the legacy spellings keep working forever.

Full canonical/legacy mapping across endpoints, fields, scopes, CLI flags, and error codes: Compatibility map.

10. Where this fits

Questions about this specification? Email hello@satsignal.cloud.