Manifest-backed proofs — Merkle-batch up to 10,000 items in one anchor
When you have many small items to anchor (every CSV row, every JSON record, every eval verdict, every line of a daily ledger), anchoring each one separately costs one transaction per item. Manifest-backed proofs bind up to 10,000 items under a single Merkle root and anchor that root in one transaction. Each item can still be proven individually via a Merkle inclusion path; the chain footprint is one OP_RETURN regardless of N.
Companion docs: API reference · OpenAPI spec · What to hash · Bundle spec — manifest-items-v1 · Production checklist · Compatibility map
What "manifest mode" means, and what it doesn't. The anchor mode field reads back as standard, sealed, or manifest, but those are not three parallel privacy tiers you pick between. "Manifest" only describes the shape of one anchor: it commits a Merkle root over an items[] batch instead of a single file's hash. You don't set it; the notary auto-selects it from the presence of items[] (sending mode: "manifest" is unnecessary, and sending it alongside sha256_hex 400s). Two clarifications this guide assumes:
- It's the plain-SHA-256 batch path. If your rows are low-entropy and need HMAC leaves, you want a sealed manifest: the same batching idea, but a different wire shape sent via mode: "sealed" with a proof_set (see Sealed mode), not the items[] path below. "Sealed" and "manifest" are orthogonal axes (single-vs-batch × plain-vs-sealed), not mutually exclusive modes.
- The leaf rule here is manifest-items-v1, which is not the csv-row-v1 rule the selective-disclosure / redaction tooling binds to. Don't reach for items[] to mint a disclosure carrier; that path goes through the disclosure builder / sealed proof_set, not here.
1. The 60-second framing
You send a list of {label, sha256_hex} items. The notary builds a Merkle tree over them (sha256 of the canonicalized {label, sha256_hex} per leaf, plain SHA-256 inner nodes, last-node duplication at odd levels) and anchors a transaction that commits to that root, carried on chain as the canonical record's 20-byte doc_hash. It returns one anchor: proof_id, txid, leaf_count, root. Inclusion proofs for individual leaves are re-derivable from the items list later; the chain pins the root (via that 20-byte doc_hash), your store carries the leaves.
Concrete: an eval pipeline scoring 5,000 model outputs anchors all 5,000 scores in one transaction. A reviewer who later wants to verify one specific score gets the row + the inclusion path (O(log N) sibling hashes) and confirms it against the on-chain root, without seeing or needing the other 4,999 rows.
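To put a number on "O(log N) sibling hashes", the path length can be counted directly. This is a small sketch that assumes the tree shape described above (pairwise hashing with the last node duplicated at odd levels) and 32-byte SHA-256 siblings; inclusion_path_length is a hypothetical helper name, not part of the API.

```python
def inclusion_path_length(leaf_count: int) -> int:
    """Number of sibling hashes needed to prove one leaf out of leaf_count."""
    if leaf_count < 1:
        raise ValueError("leaf_count must be >= 1")
    length = 0
    n = leaf_count
    while n > 1:
        # Each level pairs nodes up; an odd level is padded by duplicating
        # its last node, so the next level has ceil(n / 2) nodes.
        n = (n + 1) // 2
        length += 1
    return length


for n in (1, 2, 5_000, 10_000):
    hashes = inclusion_path_length(n)
    print(f"{n:>6} items -> {hashes} sibling hashes ({32 * hashes} bytes)")
```

Under these assumptions a reviewer checking one score out of 5,000 receives 13 sibling hashes, and a full 10,000-item manifest needs 14.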
2. Use this when
- You have many small items and a per-item anchor is wasteful (1 txn × N items vs 1 txn × 1 manifest).
- The items share a logical group — a batch of eval results, a day's ledger entries, the records in a single export, a multi-file release.
- Selective disclosure matters: you want to reveal one item to a counterparty without disclosing the others.
- The items are structured rows (CSV, JSONL, table records) and merkle-row-v1 (or its sealed variant) is the right scheme.
- You're building an agent-session evidence bundle (the manifest is the binding artifact at session end; see Agents).
- You're shipping a release that bundles N artifacts (binaries, configs, docs) and want one anchor per release rather than N.
Don't use this when:
- You have only one artifact — use Files.
- N > 10,000 — split into multiple sub-manifests + a top-level manifest of those (a "manifest of manifests").
- The items are low-entropy and a plain SHA-256 leaf could be brute-forced: use the sealed variant merkle-row-sealed-v1 (covered in Sealed mode).
3. What you send
POST /api/v1/anchors with items (the presence of items selects manifest mode; you MUST NOT also send sha256_hex or file_size).
```json
{
  "folder_slug": "agent-runs-prod",
  "items": [
    {"label": "Q1", "sha256_hex": "10343a87...aa921669"},
    {"label": "Q2", "sha256_hex": "f68246b5...4b56894c"}
  ],
  "category": "evidence_bundle",
  "label": "math-eval verdicts 2026-05-07",
  "session_id": "run-2026-05-09-001"
}
```
| field | type | required | meaning |
|---|---|---|---|
| folder_slug | string | yes | folder slug. |
| items | array | yes | 1..10,000 items. Each item is {label: string, sha256_hex: 64-hex}. |
| category | string | yes | usually evidence_bundle. Other enum values allowed. |
| label | string | no | free-text tag for the manifest itself (display only). |
| session_id | string | no | grouping key for GET /api/v1/anchors?session_id=.... |
| force_new | bool | no | bypass the same-root dedup gate. |
MAX_LEAVES = 10000. Above that, the request 400s with manifest_too_many_items.
Item shape
{"label": "row-7", "sha256_hex": "abc123..."}
| field | type | constraints |
|---|---|---|
| label | string | 1..256 chars. Preserved exactly; bytes matter (used in the leaf-construction rule). |
| sha256_hex | string | 64 lowercase hex chars. Hash of whatever item you're committing: a row, a file, a JSON envelope, etc. |
The label is part of the leaf hash. Labels can leak metadata — "alice-bid", "bob-bid" in a sealed-bid auction leaks bidder names even if the bid amounts stay sealed. Use neutral labels (row-7, item-12) where label privacy matters; see the threat-model note in /spec-merkle-row.
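The constraints above are cheap to check before the request leaves your process. The following is a minimal client-side sketch, not the notary's own validation: check_items is a hypothetical helper, the problem strings reuse the error names from section 8 where one exists, and label length is assumed to be counted in Python string characters (the table says "chars" without specifying code points or bytes).

```python
import re

MAX_LEAVES = 10_000
_HEX64 = re.compile(r"[0-9a-f]{64}")  # exactly 64 lowercase hex characters


def check_items(items):
    """Return a list of problems; an empty list means the batch looks sendable."""
    problems = []
    if len(items) == 0:
        problems.append("manifest_empty")
    if len(items) > MAX_LEAVES:
        problems.append("manifest_too_many_items")
    for pos, item in enumerate(items):
        label = item.get("label")
        digest = item.get("sha256_hex")
        if not isinstance(label, str) or len(label) == 0:
            problems.append(f"item {pos}: label missing or empty")
        elif len(label) > 256:  # assumption: limit counted in str characters
            problems.append(f"item {pos}: manifest_label_too_long")
        if not isinstance(digest, str) or not _HEX64.fullmatch(digest):
            problems.append(f"item {pos}: manifest_sha256_invalid")
    return problems
```

Running this before the POST turns a 400 round-trip into a local error, which matters most for batches near the 10,000-item cap.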
Leaf-construction rule
Each leaf is sha256(canonical_bytes(label, sha256_hex)) — the notary applies SCJ-v1 canonicalization (sorted keys, compact, NFC — deliberately NOT RFC 8785/JCS; see /spec-provenance §3) to {label, sha256_hex}, produces UTF-8 bytes, sha256s. Inner nodes are plain SHA-256. Last node duplicates at odd levels to round to a complete binary tree.
For verification later, a verifier with the items list reproduces the leaves and the tree identically — SCJ-v1 is deterministic.
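As a rough illustration of the rule, the leaf and root computation might look like the sketch below. It assumes labels are already NFC-normalized, approximates SCJ-v1 with Python's json.dumps using the settings this guide names (sorted keys, compact separators, ensure_ascii=False), and assumes an inner node is the SHA-256 of the two raw 32-byte child digests concatenated left then right. The function names are illustrative; the authoritative rule is the one in the spec.

```python
import hashlib
import json


def leaf_hash(item):
    """sha256 over the compact, key-sorted JSON of {label, sha256_hex}."""
    data = json.dumps(
        {"label": item["label"], "sha256_hex": item["sha256_hex"]},
        sort_keys=True, separators=(",", ":"), ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(data).digest()


def merkle_root(items):
    """Hex Merkle root over items, in the order given."""
    if not items:
        raise ValueError("a manifest needs at least one item")
    level = [leaf_hash(it) for it in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # last-node duplication at odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()
```

Comparing merkle_root(items) with the root in the anchor response right after submission is a quick way to confirm that your stored list and the notary's tree agree.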
Response
```json
{
  "proof_id": "abc123def456...",
  "txid": "5e9a...c4f1",
  "mode": "manifest",
  "category": "evidence_bundle",
  "folder_slug": "agent-runs-prod",
  "proof_url": "https://app.satsignal.cloud/w/.../r/abc123def456",
  "bundle_url": "https://app.satsignal.cloud/bundle/abc123def456.mbnt",
  "leaf_count": 2,
  "root": "<64 hex Merkle root>"
}
```
leaf_count and root are manifest-mode-specific. The .mbnt bundle's canonical doc carries subject.kind == "manifest" and subject.scheme == "manifest-items-v1"; verifiers dispatch on the canonical-doc subject, not on the manifest layer (see bundle-v1 §3.4).
4. What you store
- proof_id + txid: the standard handles.
- root + leaf_count: useful for quick sanity checks at verify time without re-fetching the bundle.
- The full items[] list, in submission order. This is the load-bearing one. The Merkle root commits to leaf ORDER; re-ordering the items produces a different root. Persist the array exactly as you submitted it.
- The original artifacts behind each item. The leaf carries a hash; the verifier needs the underlying bytes to confirm the hash matches a specific record. Storing only the sha256_hex and not the row content makes the manifest verifiable in chain-existence but not row-by-row.
- The bundle_url: fetch the .mbnt once and cache it, same as standard mode.
Database row shape (reference)
| column | type | notes |
|---|---|---|
| proof_id | text PK | from response |
| txid | text | from response |
| root | text | Merkle root |
| leaf_count | int | count of items |
| folder_slug | text | namespace |
| session_id | text | optional grouping |
Plus a separate items table:
| column | type | notes |
|---|---|---|
| proof_id | text FK | links to manifest |
| position | int | 0-indexed submission order |
| label | text | as submitted |
| sha256_hex | text | as submitted |
| artifact_path | text | where the original bytes for this item live |
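One way to realize the two reference tables is shown below with Python's built-in sqlite3 module. This is only a sketch: the table names manifests and manifest_items, and the helpers store_manifest and load_items, are hypothetical, and any relational store with an ordered position column would serve equally well.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS manifests (
    proof_id    TEXT PRIMARY KEY,
    txid        TEXT NOT NULL,
    root        TEXT NOT NULL,
    leaf_count  INTEGER NOT NULL,
    folder_slug TEXT NOT NULL,
    session_id  TEXT
);
CREATE TABLE IF NOT EXISTS manifest_items (
    proof_id      TEXT NOT NULL REFERENCES manifests(proof_id),
    position      INTEGER NOT NULL,   -- 0-indexed submission order
    label         TEXT NOT NULL,
    sha256_hex    TEXT NOT NULL,
    artifact_path TEXT,
    PRIMARY KEY (proof_id, position)
);
"""


def store_manifest(conn, response, items, session_id=None):
    """Persist the anchor response and the items exactly as submitted."""
    with conn:  # one transaction: either everything is stored or nothing
        conn.execute(
            "INSERT INTO manifests VALUES (?, ?, ?, ?, ?, ?)",
            (response["proof_id"], response["txid"], response["root"],
             response["leaf_count"], response["folder_slug"], session_id),
        )
        conn.executemany(
            "INSERT INTO manifest_items VALUES (?, ?, ?, ?, ?)",
            [(response["proof_id"], pos, it["label"], it["sha256_hex"], None)
             for pos, it in enumerate(items)],
        )


def load_items(conn, proof_id):
    """Return the items in submission order, ready for root recomputation."""
    rows = conn.execute(
        "SELECT label, sha256_hex FROM manifest_items "
        "WHERE proof_id = ? ORDER BY position",
        (proof_id,),
    )
    return [{"label": label, "sha256_hex": digest} for label, digest in rows]
```

The composite primary key on (proof_id, position) is a deliberate choice here: it makes a duplicated or skipped position fail at insert time rather than at verify time.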
5. What verification needs later
Verification splits into two cases.
Case A — verify the manifest as a whole
Three things:
- The .mbnt bundle: carries the manifest-items-v1 canonical doc with the root.
- The items list: fetched from your store, in submission order.
- A BSV node: any public one.
The verifier rebuilds the Merkle tree from the items, compares the root to the canonical doc's subject.root, and confirms the canonical doc's hash matches the on-chain OP_RETURN payload.
The manifest-mode canonical subject is exactly this shape (plus an optional category):

```json
"subject": {
  "kind": "manifest",
  "scheme": "manifest-items-v1",
  "algo": "sha256",
  "leaf_count": 3,
  "root": "<64-hex Merkle root>"
}
```
Note where the root lives: subject.root — not subject.proofs.chunk_merkle.root. The proofs.chunk_merkle path belongs to a different shape — a chunked file anchor (a single file anchored with a proof_set); readers pattern-matching from the file-anchor docs hit exactly that mix-up and conclude a valid manifest proof mismatches. The per-item leaves ride in the bundle's proofs.json, not in the canonical doc.
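A verifier can guard against that mix-up by checking the subject shape before reading the root. The sketch below assumes the canonical doc has already been parsed into a Python dict with a top-level subject key, as in the fragment above; manifest_root is a hypothetical helper name.

```python
def manifest_root(canonical_doc):
    """Return (root, leaf_count) from a manifest-items-v1 canonical doc.

    Raises ValueError for any other subject shape, so a chunked file
    anchor is never mistaken for a manifest.
    """
    subject = canonical_doc.get("subject", {})
    if subject.get("kind") != "manifest":
        raise ValueError(f"not a manifest subject: kind={subject.get('kind')!r}")
    if subject.get("scheme") != "manifest-items-v1":
        raise ValueError(f"unexpected scheme: {subject.get('scheme')!r}")
    if subject.get("algo") != "sha256":
        raise ValueError(f"unexpected algo: {subject.get('algo')!r}")
    # The root lives directly on the subject, not under proofs.chunk_merkle.
    return subject["root"], subject["leaf_count"]
```

Failing loudly on an unexpected kind or scheme gives a clearer signal than a root mismatch reported further down the verification path.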
Case B — verify one item without disclosing the rest
Build an inclusion proof for the target item:
- Index i of the item in the submission order.
- The item itself: {label, sha256_hex}.
- The sibling hashes along the path from leaf i to the root (O(log N) hashes). These can be computed by anyone with the full items list, usually the manifest holder.
A verifier with just the item + the inclusion proof + the root (read from the bundle or from proof_url) re-walks the path, confirms the root matches the on-chain anchor, and confirms the item hash is what the inclusion proof claims. The other items stay opaque.
For tabular data specifically, the merkle-row-v1 scheme documents the canonical row-encoding rule and the standard inclusion-proof shape. See /spec-merkle-row.
Cross-link: What to hash — Manifest rows / tables covers how to compute the per-item sha256_hex for common item shapes (CSV rows, JSON records, eval result rows).
Byte-exact canonicalization — the #1 verification gotcha
A proof commits to the exact bytes you hashed, not to the logical JSON object. Re-serializing the same object differently — json.dumps(..., indent=2), a different key order, an extra trailing newline — changes the sha256, and the verifier reports a sha256 mismatch even though "the data" looks identical. This bites in two places:
- Per item (this guide). Compute each sha256_hex over canonical bytes and store those bytes (or enough to regenerate them), not a pretty-printed copy:

```python
canonical = json.dumps(row, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
sha256_hex = hashlib.sha256(canonical).hexdigest()
```

- Whole-manifest provenance. When you anchor a satsignal.provenance.v1 manifest via POST /api/v1/provenance/anchor, the endpoint re-canonicalizes it with Satsignal Canonical JSON v1 (sorted keys, separators=(",",":"), NFC, ensure_ascii=False) and commits those bytes, not whatever spacing you sent. To re-verify later, hand the verifier the canonical bytes: either re-canonicalize with the snippet above, or use the copy the notary already embeds in the .mbnt bundle. A manifest you saved with indent=2 will not verify against its own proof; that mismatch is the proof working, not breaking.
The exact SCJ-v1 rule — with reference implementations and pinned test vectors — is specified in spec-provenance §canonicalization.
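The effect is easy to reproduce locally. This short sketch uses a made-up row and compares the hash of its compact, key-sorted serialization with the hash of a pretty-printed copy of the same object; the row contents are purely illustrative.

```python
import hashlib
import json

row = {"score": 0.87, "id": "Q1"}  # illustrative row, not real data

canonical = json.dumps(row, sort_keys=True, separators=(",", ":"),
                       ensure_ascii=False).encode("utf-8")
pretty = json.dumps(row, indent=2).encode("utf-8")

canonical_hash = hashlib.sha256(canonical).hexdigest()
pretty_hash = hashlib.sha256(pretty).hexdigest()

# Same logical object, different bytes, therefore different hashes.
print(canonical)
print(canonical_hash == pretty_hash)

# Parsing the pretty copy and re-canonicalizing recovers the original bytes.
recovered = json.dumps(json.loads(pretty), sort_keys=True,
                       separators=(",", ":"), ensure_ascii=False).encode("utf-8")
print(recovered == canonical)
```

The last step is the practical takeaway: a pretty-printed copy is recoverable as long as you re-canonicalize it before hashing, rather than hashing the file as saved.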
6. Copy-paste example
Anchor a 1000-row eval result set
```python
import hashlib, json, os, urllib.request

API = "https://app.satsignal.cloud"
KEY = os.environ["SATSIGNAL_API_KEY"]
FOLDER = "eval-runs-prod"

# eval_results is your own list of JSON-serializable row dicts.
# Build the items list. Each item's sha256_hex is the SHA-256 of the
# row's canonical bytes.
items = []
for i, row in enumerate(eval_results):
    row_bytes = json.dumps(row, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False).encode("utf-8")
    items.append({
        "label": f"item-{i:04d}",
        "sha256_hex": hashlib.sha256(row_bytes).hexdigest(),
    })

body = json.dumps({
    "folder_slug": FOLDER,
    "items": items,
    "category": "evidence_bundle",
    "label": "math-eval verdicts 2026-05-26",
    "session_id": "run-2026-05-26-001",
}).encode("utf-8")

req = urllib.request.Request(
    f"{API}/api/v1/anchors",
    data=body, method="POST",
    headers={"Authorization": f"Bearer {KEY}",
             "Content-Type": "application/json",
             "Idempotency-Key": "eval-run-2026-05-26-001"},
)
with urllib.request.urlopen(req) as resp:
    out = json.load(resp)

print(f"Manifest anchored: {out['proof_id']}")
print(f"  txid:       {out['txid']}")
print(f"  leaf_count: {out['leaf_count']}")
print(f"  root:       {out['root']}")
```
Anchor a multi-file release
```bash
export SATSIGNAL_API_KEY=sk_...
export FOLDER=release-gates

# Hash each file, build the items list.
ITEMS_JSON=$(jq -n '[]')
for f in dist/*; do
  HASH=$(sha256sum "$f" | awk '{print $1}')
  LABEL=$(basename "$f")
  ITEMS_JSON=$(jq --arg l "$LABEL" --arg h "$HASH" \
    '. + [{label: $l, sha256_hex: $h}]' <<< "$ITEMS_JSON")
done

# Anchor the manifest.
curl -X POST https://app.satsignal.cloud/api/v1/anchors \
  -H "Authorization: Bearer $SATSIGNAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg folder "$FOLDER" --argjson items "$ITEMS_JSON" '{
    folder_slug: $folder,
    items: $items,
    category: "evidence_bundle",
    label: "release v1.2.3"
  }')"
```
Reveal one row to a counterparty
After anchoring, you (the manifest holder) build an inclusion proof for one item:
```python
import hashlib, json

def scj_bytes(obj):
    # Compact, key-sorted JSON as UTF-8, using the SCJ-v1 settings this
    # guide names (not RFC 8785/JCS). Strings are assumed to be NFC already.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def leaf_hash(item):
    return hashlib.sha256(scj_bytes(item)).digest()

def build_tree(leaves):
    """Returns the level-by-level list, with the root last."""
    levels = [leaves]
    cur = leaves
    while len(cur) > 1:
        if len(cur) % 2 == 1:
            cur = cur + [cur[-1]]  # last-node duplication at odd levels
        cur = [hashlib.sha256(cur[i] + cur[i+1]).digest()
               for i in range(0, len(cur), 2)]
        levels.append(cur)
    return levels

def inclusion_path(levels, index):
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # last-node duplication
        path.append(level[sibling])
        index //= 2
    return path

items = [...]  # your stored items list, in submission order
leaves = [leaf_hash(it) for it in items]
levels = build_tree(leaves)
root = levels[-1][0]

target_index = 7
reveal = {
    "item": items[target_index],
    "index": target_index,
    "leaf_count": len(items),
    "inclusion_path": [h.hex() for h in inclusion_path(levels, target_index)],
    "root": root.hex(),
}
```
A counterparty with just the reveal object + the on-chain root (via proof_url or the .mbnt bundle) walks the path, recomputes the root, confirms it matches. The other 9,999 items stay opaque.
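On the receiving side, the walk might look like the sketch below. It assumes the same leaf and inner-node rules as the builder (compact key-sorted JSON leaves, SHA-256 over left then right child digests, last-node duplication) and a reveal object with the item, index, leaf_count, and inclusion_path fields shown above. verify_inclusion is a hypothetical helper; the root it is compared against should come from the bundle or proof_url, not from the reveal itself.

```python
import hashlib
import json


def leaf_hash(item):
    """sha256 over the compact, key-sorted JSON of {label, sha256_hex}."""
    data = json.dumps(
        {"label": item["label"], "sha256_hex": item["sha256_hex"]},
        sort_keys=True, separators=(",", ":"), ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(data).digest()


def expected_path_length(leaf_count):
    length = 0
    while leaf_count > 1:
        leaf_count = (leaf_count + 1) // 2
        length += 1
    return length


def verify_inclusion(reveal, anchored_root_hex):
    """True if the revealed item hashes up to the independently fetched root."""
    index = reveal["index"]
    path = [bytes.fromhex(h) for h in reveal["inclusion_path"]]
    if not 0 <= index < reveal["leaf_count"]:
        return False
    if len(path) != expected_path_length(reveal["leaf_count"]):
        return False
    node = leaf_hash(reveal["item"])
    for sibling in path:
        if index % 2 == 0:   # current node is a left child
            node = hashlib.sha256(node + sibling).digest()
        else:                # current node is a right child
            node = hashlib.sha256(sibling + node).digest()
        index //= 2
    return node.hex() == anchored_root_hex
```

The verifier never needs the other items: only the revealed item, its position, the sibling hashes, and a root it trusts because it read that root from the anchored record.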
7. Production notes
MAX_LEAVES = 10,000
Hard cap at the API layer. Above that, the request 400s. For larger batches, split into sub-manifests + a top-level manifest:
sub-1 (5000 items) → root_1 → anchor → manifest-proof-1
sub-2 (5000 items) → root_2 → anchor → manifest-proof-2
top (items=[{label: "sub-1", sha256_hex: root_1},
{label: "sub-2", sha256_hex: root_2}]) → anchor
The top-level manifest binds the sub-manifests; a verifier can prove one row from sub-1 with: (a) inclusion proof in sub-1's manifest, (b) inclusion proof in the top manifest, (c) chain confirmation of the top.
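The split itself is mechanical. A minimal sketch, assuming you anchor each chunk separately and collect the root from each response before building the top-level items; split_items and top_level_items are hypothetical helper names, and the sub-N labels simply follow the diagram above.

```python
MAX_LEAVES = 10_000


def split_items(items, chunk_size=MAX_LEAVES):
    """Cut the items list into consecutive chunks of at most chunk_size."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be >= 1")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def top_level_items(sub_roots):
    """Build the items[] for the top manifest from the sub-manifest roots.

    sub_roots is the list of 64-hex root values returned by each
    sub-manifest anchor, in the same order the chunks were produced.
    """
    return [
        {"label": f"sub-{n}", "sha256_hex": root}
        for n, root in enumerate(sub_roots, start=1)
    ]
```

Keep the chunk order stable and record which chunk each row landed in; a row's two-step proof needs its index inside the sub-manifest and the sub-manifest's index inside the top manifest.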
force_new for dedup override
Re-submitting the same items[] list produces the same Merkle root, which hits the default-dedup gate and returns the original anchor's proof_id without burning quota. This is usually what you want.
If you want a fresh anchor on a re-submission (e.g. debugging, or two logically-distinct manifests that happen to share an items list), set force_new: true in the body. The notary anchors a fresh transaction.
```json
{
  "folder_slug": "...",
  "items": [...],
  "category": "evidence_bundle",
  "force_new": true
}
```
Item order matters
The Merkle root commits to the items in submission order. Sorting the items at submission time gives you a deterministic root that can be reproduced from the (sorted) items list later; submitting in arbitrary order gives a root that depends on the original ordering, which you must preserve.
Pick one strategy and stick to it. The most robust pattern is to sort by label before submission; then any verifier can reproduce the manifest deterministically from just the items content.
Recommended: keep an items index file alongside the manifest. The anchor response does not echo the items[] array back. The downloaded .mbnt's proofs.json does carry the ordered leaves (label + sha256_hex + derived leaf_hash), but the server-side bundle copy is only kept until the proof is deleted — if it is deleted and you hold no local copy, the submission order is gone. The blessed shape is a plain JSON file written at submission time and stored next to your cached .mbnt, under the same retention policy as the artifacts:
```json
{
  "proof_id": "abc123def456...",
  "txid": "5e9a...c4f1",
  "root": "<64 hex Merkle root from the response>",
  "leaf_count": 2,
  "items": [
    {"index": 0, "label": "rows/0001.json", "sha256_hex": "63d5c3e6..."},
    {"index": 1, "label": "rows/0002.json", "sha256_hex": "3cad58f5..."}
  ]
}
```
items is the exact array you submitted, in submission order, with an explicit index so a partial copy is detectable. This is a client-side convention, not an API surface — any verifier can take the file, recompute the leaves per the leaf-construction rule above, and confirm the recomputed root equals both the root recorded here and the one in the bundle's canonical doc.
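Writing and re-reading that file can be sketched as follows. The helper names build_index, save_index, and items_from_index are hypothetical; the completeness check relies only on the explicit index values and leaf_count described above, and recomputing the root remains a separate step.

```python
import json


def build_index(response, items):
    """Assemble the index document from the anchor response and submitted items."""
    return {
        "proof_id": response["proof_id"],
        "txid": response["txid"],
        "root": response["root"],
        "leaf_count": response["leaf_count"],
        "items": [
            {"index": i, "label": it["label"], "sha256_hex": it["sha256_hex"]}
            for i, it in enumerate(items)
        ],
    }


def save_index(path, index_doc):
    # Pretty-printing is fine here: this file is never hashed as a whole.
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(index_doc, fh, ensure_ascii=False, indent=2)


def items_from_index(index_doc):
    """Return the ordered items, refusing a partial or reordered copy."""
    entries = index_doc["items"]
    if len(entries) != index_doc["leaf_count"]:
        raise ValueError("items index is incomplete: count differs from leaf_count")
    for expected, entry in enumerate(entries):
        if entry["index"] != expected:
            raise ValueError(f"items index has a gap or reorder at position {expected}")
    return [{"label": e["label"], "sha256_hex": e["sha256_hex"]} for e in entries]
```

Call build_index and save_index in the same code path that handles the anchor response, so the index can never lag behind a successful submission.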
Idempotency
Idempotency-Key works the same as standard mode. The body-hash check covers the full items[] list — a retry with one item modified is a different body and returns 409 idempotency_key_reuse_body_mismatch on the same key.
Sealed manifests
A sealed manifest is documented in Sealed mode under the merkle-row-sealed-v1 scheme. The wire shape is similar but with HMAC algos and a salt_b64 field; the canonical doc binds {leaf_count, root, scheme} and the per-leaf material rides in proofs.json off-chain.
Rate limits & quota
One manifest anchor = one anchor against the monthly quota, regardless of how many items it carries. This is the primary quota advantage of manifest-backed proofs: a 5,000-row eval run consumes one anchor slot, not 5,000.
Rate-limit behavior is unchanged from standard mode — the plan quota window is the only key-level throttle; there is no separate hourly burst limit (see Files §7).
Bundle size
The .mbnt bundle carries proofs.json with the full merkle_leaves[] array: 64 hex chars per leaf, ~64 bytes per entry. A 10,000-item manifest produces a proofs.json of roughly 650 KB. The bundle_url download is bearer-auth gated; size is rarely an issue but be aware on slow networks.
For the full pre-flight checklist (key rotation, broadcast failure recovery, support flow), see Production checklist.
8. Errors you might see
| code | name | meaning | what to do |
|---|---|---|---|
| 400 | manifest_too_many_items | > 10,000 items | split into sub-manifests + top-level manifest |
| 400 | manifest_empty | empty items[] | items must have ≥ 1 entry |
| 400 | manifest_label_too_long | a label > 256 chars | truncate or restructure |
| 400 | manifest_sha256_invalid | a sha256_hex isn't 64 lowercase hex | regenerate hashes; use lowercase |
| 400 | manifest_disallows_sha256 | sent sha256_hex at top level alongside items | drop the top-level sha256_hex |
| 400 | manifest_disallows_file_size | sent file_size at top level alongside items | drop the top-level file_size |
| 404 | folder_not_found | folder doesn't exist | create the folder |
| 409 | idempotency_key_reuse_body_mismatch | key reuse with a different body | use a fresh key |
| 429 | quota_exceeded | monthly anchor count exhausted | email hello@satsignal.cloud for a cap lift, or wait for the monthly reset |
9. Legacy field aliases (if your code sends them)
Same as all other paths: responses emit the canonical proof / folder names only; requests written against the legacy spellings keep working forever.
Full canonical/legacy mapping across endpoints, fields, scopes, CLI flags, and error codes: Compatibility map.
10. Where this fits
- For the full Merkle-construction rule + the canonical-doc shape for manifest-items-v1, see bundle-v1 §3.4 + §4.3.
- For row-level schemes (merkle-row-v1 and the sealed variant merkle-row-sealed-v1), see /spec-merkle-row.
- For sealed manifests (low-entropy rows that need HMAC leaves instead of plain SHA-256), see Sealed mode.
- For agent-session manifests (the final binding artifact at session end), see Agents.
- If you have only one artifact, use Files — manifest mode is overkill for N=1.
- For "what counts as the canonical bytes of an item" for various row shapes, see What to hash — Manifest rows / tables.
Questions about this specification? Email hello@satsignal.cloud.