MBNT on-chain wire format
Naming. Canonical vocabulary is proof / folder on every surface; where this spec shows a legacy spelling (receipt, etc.) it documents a frozen on-disk/on-chain format — same record. Full alias map: the compatibility map.
In plain words: this is the lowest level — the exact ~28-byte blob Satsignal writes into one Bitcoin SV transaction output for every anchor. You only need this page if you are writing a verifier that reads the raw chain directly (no Satsignal API) and wants to walk from a txid to the committed document hash by hand. If you just want to understand or check a proof, start at the bundle spec, which uses this format but is the right entry point. §11 below — what an anchor publicly commits to — defers to the canonical what an anchor proves / does not prove.
This page is the wire-level contract — the byte-exact spec an implementer follows to read or write the on-chain payload. For the user-facing overview of what Satsignal does and how proofs work, start at satsignal.cloud/docs.html.
The MBNT payload is the bytes that appear inside the OP_FALSE OP_RETURN output of a Satsignal anchor transaction. Everything else about a Satsignal proof (canonical doc, manifest, miner acceptance, merkle proofs) is off-chain — but the chain commitment lives here, and a non-Satsignal verifier reading a transaction needs to parse this payload to walk from txid to a 20-byte document hash.
This page is the standalone wire-format spec. It supersedes the description embedded in parseMbnt() in verifier.html. The reference implementations are:
- The public wire-level contract is this page; the browser verifier ships a JavaScript parser for the same payload
- JavaScript parse: parseMbnt() in /verify's page source
Both produce/accept byte-identical payloads.
1. Where the payload lives
A Satsignal anchor is a BSV transaction that carries exactly one OP_FALSE OP_RETURN output with one MBNT payload (alongside an ordinary P2PKH change output). As built and serialized on chain, the script is
OP_FALSE OP_RETURN <push N> <payload bytes>
i.e. the raw scriptPubKey begins 00 6a. <push N> is either a single-byte push opcode (0x01..0x4b) or OP_PUSHDATA1 (0x4c) followed by a 1-byte length, depending on total payload size. Total payload is between 28 and 220 bytes (the upper bound is BSV's data-carrier relay norm — Satsignal does not emit larger payloads).
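The push-opcode choice described above can be sketched in Python. This is a minimal illustration, not the production wallet code: the function name build_anchor_script is hypothetical, and the only rules it encodes are the ones stated in this section (28..220 payload bytes, direct push up to 0x4b, OP_PUSHDATA1 beyond that).

```python
OP_FALSE = 0x00
OP_RETURN = 0x6A
OP_PUSHDATA1 = 0x4C


def build_anchor_script(payload: bytes) -> bytes:
    """Wrap an MBNT payload as OP_FALSE OP_RETURN <push N> <payload>.

    Hypothetical helper: the name and error messages are illustrative only.
    """
    n = len(payload)
    if not 28 <= n <= 220:
        raise ValueError(f"payload must be 28..220 bytes, got {n}")
    if n <= 0x4B:
        # Direct push: the opcode byte itself is the length (0x01..0x4b).
        push = bytes([n])
    else:
        # Longer payloads need OP_PUSHDATA1 followed by a 1-byte length.
        push = bytes([OP_PUSHDATA1, n])
    return bytes([OP_FALSE, OP_RETURN]) + push + payload
```

A 28-byte payload therefore yields a 31-byte script, and any payload of 76 bytes or more costs one extra script byte for the OP_PUSHDATA1 opcode.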
Explorer representations differ — accept both script shapes. The leading OP_FALSE (0x00) is always present in the raw transaction bytes (and in Bitails' parsed output scripts), but WhatsOnChain's JSON API (/v1/bsv/main/tx/hash/{txid}) strips it from vout[].scriptPubKey.hex — the same output reads 6a 22 4d 42 4e 54 … there. A parser MUST tolerate both shapes: strip one optional leading 0x00, then require 0x6a (OP_RETURN) followed by a single push whose first 4 bytes are MBNT. Checking for a literal 00 6a prefix breaks against WhatsOnChain's JSON view; checking for a bare 6a prefix breaks against raw transaction hex. (A raw-tx walker always sees the 00; the optional-strip rule is what makes the same parser correct against explorer-parsed script hex too.)
To extract the payload from a raw tx hex: walk inputs/outputs; for each output's scriptPubKey, strip an optional leading 0x00, check for 0x6a and a single push; the pushed bytes whose first 4 bytes are MBNT are the payload.
Worked example — mainnet anchor 05aac3a419328aee45404a4a11034b76bbc043c0b891d59faca94b7f35b0e218, output 1, raw scriptPubKey (37 bytes):
00 6a 22 OP_FALSE, OP_RETURN, push 0x22 (34 payload bytes)
4d 42 4e 54 magic — ASCII "MBNT"
01 01 version 0x01, subtype 0x01 (generic)
00 06 tlv_len = 6
01 e6 29 9c 3b 1d 69 7a 84 d6 b4 92 a0 30 6e 14 36 8a 98 59
doc_hash — sha256(canonical_doc)[:20]
05 04 d5 b0 b0 c6
TLV: tag 0x05 (issuer_id), len 4, value d5b0b0c6
WhatsOnChain's JSON serves the identical script minus the leading 00 (36 bytes, 6a224d424e54…).
The browser verifier at /verify implements this parse in about 25 lines of JavaScript. Other-language ports tend to be the same shape.
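A Python port of that shape might look like the sketch below. It is not the parseMbnt() source; the function name extract_mbnt_payload is hypothetical, and treating trailing bytes after the push as "not a single push" is this sketch's reading of the rule in this section. The script bytes are the worked example above.

```python
def extract_mbnt_payload(script: bytes):
    """Return the MBNT payload pushed by this script, or None if it is not one."""
    i = 0
    if script[:1] == b"\x00":          # optional leading OP_FALSE
        i = 1
    if script[i:i + 1] != b"\x6a":     # OP_RETURN is required
        return None
    i += 1
    if i >= len(script):
        return None
    op = script[i]
    i += 1
    if 0x01 <= op <= 0x4B:             # direct push, opcode is the length
        n = op
    elif op == 0x4C:                   # OP_PUSHDATA1, next byte is the length
        if i >= len(script):
            return None
        n = script[i]
        i += 1
    else:
        return None
    payload = script[i:i + n]
    # Assumption: "a single push" means the push must end exactly at script end.
    if len(payload) != n or i + n != len(script):
        return None
    if payload[:4] != b"MBNT":
        return None
    return payload


# Raw scriptPubKey from the worked example (37 bytes).
RAW_SCRIPT = bytes.fromhex(
    "006a22" "4d424e54" "0101" "0006"
    "01e6299c3b1d697a84d6b492a0306e14368a9859"
    "0504d5b0b0c6"
)
payload = extract_mbnt_payload(RAW_SCRIPT)
# The WhatsOnChain JSON shape (no leading 00) yields the same payload.
same = extract_mbnt_payload(RAW_SCRIPT[1:])
```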
2. Byte layout
offset size field encoding
------ ---- ----------- ------------------------------------
0 4 magic ASCII "MBNT" — 0x4D 0x42 0x4E 0x54
4 1 version currently 0x01 (sole shipped value)
5 1 subtype see §4 (subtype registry)
6 2 tlv_len uint16, big-endian, bytes of TLV section
8 20 doc_hash first 20 bytes of sha256(canonical_doc)
28 N tlvs tag-length-value entries, see §6
N = tlv_len, 0..192
------ ----
total 28+N payload 28..220 bytes
A minimal MBNT payload (no TLVs) is exactly 28 bytes:
4D 42 4E 54 01 01 00 00 <20 bytes of doc_hash>
└ "MBNT" ┘ v st tlv_len └─── doc_hash ─────┘
A payload with two TLVs (currency=USD, timestamp_unix=epoch s) is 43 bytes (the 28-byte header and doc_hash plus 15 TLV bytes, so tlv_len = 0x000F):
4D 42 4E 54 01 01 00 0F <20 bytes>
01 03 55 53 44 // tag=01 (currency), len=3, "USD"
06 08 00 00 00 00 68 31 7C 80 // tag=06 (timestamp_unix), len=8, BE u64
3. Magic and version
- Magic is the literal ASCII bytes MBNT (0x4D 0x42 0x4E 0x54). Verifiers MUST check the magic before reading any other field, because the transaction may carry an OP_RETURN belonging to a different protocol.
- Version is currently 0x01. There is no version 0x00. Verifiers MUST refuse unknown versions; future versions will be additive in shape but never silently re-interpret v1 fields.
- Forward compatibility. A v2 introduction will keep the magic and bump the version byte. A v1 verifier that hits a v2 payload refuses to parse; that is the correct behavior. No "best effort" partial-parse path.
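The §2 layout and the magic/version checks above can be sketched with Python's struct module. The names MbntHeader and parse_header are hypothetical, and rejecting a payload whose length is not exactly 28 + tlv_len is this sketch's reading of the "total 28+N" row in §2.

```python
import struct
from typing import NamedTuple


class MbntHeader(NamedTuple):
    magic: bytes
    version: int
    subtype: int
    tlv_len: int
    doc_hash: bytes


def parse_header(payload: bytes) -> MbntHeader:
    """Decode the fixed 28-byte prefix of an MBNT v1 payload."""
    if len(payload) < 28:
        raise ValueError("payload shorter than 28 bytes")
    # ">4sBBH20s": magic, version, subtype, big-endian uint16 tlv_len, doc_hash.
    hdr = MbntHeader(*struct.unpack(">4sBBH20s", payload[:28]))
    if hdr.magic != b"MBNT":           # check magic before trusting any field
        raise ValueError("not an MBNT payload")
    if hdr.version != 0x01:            # no best-effort parse of other versions
        raise ValueError(f"unsupported MBNT version 0x{hdr.version:02x}")
    if hdr.tlv_len > 192:
        raise ValueError("tlv_len exceeds the 192-byte cap")
    if len(payload) != 28 + hdr.tlv_len:
        raise ValueError("payload length does not match tlv_len")
    return hdr
```

Checking the version inside the header parser means no later stage ever sees fields from a payload it cannot interpret.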
4. Subtype registry
The subtype byte selects the schema applied to the off-chain canonical document the doc_hash commits to. Codes are 1 byte, never reused:
| code | name | meaning |
|---|---|---|
| 0x01 | generic | unspecified document; the canonical doc itself names its schema in a subject.kind field |
| 0x02 | wire | a financial-wire receipt schema |
| 0x03 | doc_sign | a document-signing receipt schema |
| 0x04 | event | an event-log receipt schema |
generic (0x01) is what every Satsignal anchor produced by the public API ships today. The other three codes are reserved for private deployments running their own canonical-doc schemas.
A verifier that doesn't recognize a subtype byte SHOULD still report the on-chain commitment (doc_hash, txid, version) but MUST NOT claim to have validated the canonical document — it doesn't know which schema applies.
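One way to honor that rule is to always return the chain-visible fields and carry an explicit flag saying whether the schema is known. The sketch below is illustrative only; the function name describe_commitment and the dictionary keys are hypothetical.

```python
SUBTYPES = {0x01: "generic", 0x02: "wire", 0x03: "doc_sign", 0x04: "event"}


def describe_commitment(txid: str, version: int, subtype: int, doc_hash: bytes) -> dict:
    """Report what is on chain; never imply the canonical doc was validated."""
    name = SUBTYPES.get(subtype)
    return {
        "txid": txid,
        "version": version,
        "doc_hash": doc_hash.hex(),
        "subtype": name if name is not None else f"unknown(0x{subtype:02x})",
        # Callers must skip canonical-doc validation when this is False.
        "schema_known": name is not None,
    }
```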
5. doc_hash
doc_hash is exactly 20 bytes, computed as
doc_hash = sha256(canonical_doc_bytes)[:20]
canonical_doc_bytes is the Satsignal Canonical JSON v1 (SCJ-v1) UTF-8 encoding of the canonical-doc JSON object. The canonicalization algorithm — NFC-normalize all strings, sort all object keys by Unicode code point, emit minimal JSON (separators=(",",":")), no whitespace, no NaN/Infinity, no floats — is the same one used for the provenance manifest (see /spec-provenance §3), the manifest-items-v1 leaf preimage, and in merkle_row.py §1.1 and the commit_reveal.py helper.
Integers MUST stay in the JS-safe range. Every integer in a canonical doc (attachments[].bytes, subject.amount_minor_units, proof size, leaf_count) MUST satisfy |n| ≤ 2^53 − 1 (9007199254740991). This is a normative canonicalization constraint, not just a sanity bound: a JS verifier re-parses every integer via JSON.parse → Number, which silently rounds values outside ±(2^53−1), so an out-of-range integer canonicalizes to different bytes in JS than in Python's arbitrary-precision ints. The JS verifier then recomputes a different doc_hash and verification fails. The published canonical.schema.json carries "maximum": 9007199254740991 on every integer field to enforce this. Integers MUST also be encoded as JSON integer literals (no decimal point), because SCJ-v1 forbids floats. This is a lexical rule: JSON Schema's value-based integer type cannot reject an integral-valued float like 1.0 (the JSON data model unifies it with 1), so the SCJ-v1 rule and the reference validator schema.py are authoritative here and reject a 1.0 encoding the JSON Schema would accept.
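The canonicalization rules in this section can be approximated with the Python standard library. This is a sketch under stated assumptions, not notary/canonical.py: it assumes non-ASCII characters are emitted as raw UTF-8 rather than \u escapes, that object keys are NFC-normalized along with string values, and that keys colliding after normalization are an error. The names canonical_bytes and doc_hash are hypothetical. Check any port against the reference implementation and its test vectors before relying on it.

```python
import hashlib
import json
import unicodedata

MAX_SAFE_INT = 2**53 - 1  # 9007199254740991


def _normalize(value):
    """Recursively apply the NFC, no-float and JS-safe-integer rules."""
    if value is None or isinstance(value, bool):
        return value
    if isinstance(value, int):
        if abs(value) > MAX_SAFE_INT:
            raise ValueError("integer outside the JS-safe range")
        return value
    if isinstance(value, float):
        # json.loads turns a literal like 1.0 into a float, so it lands here.
        raise ValueError("floats are forbidden")
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)
    if isinstance(value, list):
        return [_normalize(v) for v in value]
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            if not isinstance(key, str):
                raise TypeError("object keys must be strings")
            nkey = unicodedata.normalize("NFC", key)
            if nkey in out:  # assumption: treat post-NFC collisions as errors
                raise ValueError("keys collide after NFC normalization")
            out[nkey] = _normalize(item)
        return out
    raise TypeError(f"unsupported type {type(value).__name__}")


def canonical_bytes(doc) -> bytes:
    # sort_keys orders Python str keys by code point; separators remove whitespace.
    text = json.dumps(_normalize(doc), sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False, allow_nan=False)
    return text.encode("utf-8")


def doc_hash(doc) -> bytes:
    return hashlib.sha256(canonical_bytes(doc)).digest()[:20]
```

Because Python's json.loads parses 1.0 as a float, feeding parsed input through this sketch rejects the integral-float encoding that a JSON Schema check alone would accept.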
SCJ-v1 is NOT RFC 8785 (JCS). A verifier MUST NOT substitute an RFC 8785 / JCS library for the canonical doc: SCJ-v1 sorts keys by code point (not RFC 8785's UTF-16 code-unit order; they differ for supplementary-plane keys) and NFC-normalizes strings (RFC 8785 does not). The product uses three distinct, non-interchangeable JSON canonicalizers: the same object can hash to three different canonical forms depending on which scheme anchors it. Do not assume any two are equivalent:
| Rule | Used by | NFC? | Key sort | Floats | Implementation |
|------|---------|:----:|----------|:------:|----------------|
| SCJ-v1 | provenance manifest, MBNT canonical doc, manifest-items-v1 leaves, merkle-row, audit-packet | yes | Unicode code point | forbidden | notary/canonical.py (Py) + verifier/canon.mjs (JS) |
| json-jcs-v1 | JSON-file content_canonical + json-keypath-v1 leaves | yes | UTF-16 code unit | finite only | jcsCanonicalize (JS only) |
| RFC 8785 (JCS) | satsignal.json.field.v1 deep-field disclosure only | no | UTF-16 code unit | yes | real jcs library (disclosure/jcs.py) |
Only the third row is pure RFC 8785. json-jcs-v1 is RFC 8785 plus an NFC step (so a stock RFC 8785 library diverges on non-NFC input); SCJ-v1 differs from both on key-sort order and floats. See /spec-disclosure-json-keypath §2 and /spec-disclosure-json-field §2.4.
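The key-sort difference is easy to reproduce. The sketch below sorts two keys both ways: one in the BMP private-use area (U+E000) and one in a supplementary plane (U+1F600). Sorting by the UTF-16 big-endian encoding is used here as a stand-in for code-unit order; the two example keys are arbitrary choices for illustration.

```python
keys = ["\ue000", "\U0001f600"]

# Code-point order: Python compares str values by code point.
by_code_point = sorted(keys)

# UTF-16 code-unit order: compare the big-endian UTF-16 encodings byte-wise.
# U+1F600 encodes as the surrogate pair D83D DE00, which sorts below E000.
by_utf16_unit = sorted(keys, key=lambda s: s.encode("utf-16-be"))

print([f"U+{ord(k):04X}" for k in by_code_point])   # ['U+E000', 'U+1F600']
print([f"U+{ord(k):04X}" for k in by_utf16_unit])   # ['U+1F600', 'U+E000']
```

Any object carrying both kinds of key therefore serializes in a different member order under the two rules, which is enough to change the hash.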
The canonical-doc shape itself depends on the proof category and mode. For each shape, the spec is:
- File proof (categories: output, evidence_bundle) with multi-proof v2 (byte_exact / content_canonical / chunk_merkle): the canonical doc follows the generic-v2 schema.
- Manifest mode (Phase 8b): subject.kind = "manifest" with scheme: "manifest-items-v1", root, leaf_count.
- Sealed mode: see /spec (SPEC_v2_sealed.md).
- Selective row-reveal (merkle-row-v1, merkle-row-sealed-v1): see /spec-merkle-row.
5.1 Merkle conventions (normative)
The product runs three non-interchangeable Merkle tree shapes. They differ at exactly two points — the odd-node rule and the single-leaf rule — so a verifier author who reuses the wrong builder for a given scheme false-rejects valid anchors. All three are SHA-256 binary trees over already-hashed 32-byte leaves; the pair-combine is always sha256(left || right) over the raw 64-byte buffer.
| Scheme | Schemes / trees it governs | Odd node | Single leaf (N=1) | Role |
|---|---|---|---|---|
| chunk_merkle | manifest-items-v1; the chunk_merkle schemes (text-line-v1, json-keypath-v1, csv-row-v1, csv-column-v1, pdf-page-v1, image-tile-v1, zip-file-v1, satsignal.csv.row.v1; + sealed-only json-ast-v1, text-tree-v1) | duplicate-last: the unpaired last node self-pairs, parent = sha256(leaf‖leaf) | root = leaf (bare, no extra hashing) | build root + build/verify proof path (odd node ⇒ a self-sibling {side:"R", hash: own} entry) |
| satsignal.disclosure.v1 | the selective-disclosure tree (/spec-disclosure) | promote-unchanged: the unpaired last node moves up untouched | root = leaf (bare; proof_path == []) | build root + walk-only verify (odd node ⇒ no proof entry) |
| merkle-session-v1 | session_commitment (sealed closing-handoff, /spec §4.5) | duplicate-last | root = sha256(leaf‖leaf) — NOT the bare leaf | build root only (recorded, not independently verified — /spec §4.5) |
The single-leaf rule is the trap: chunk_merkle and satsignal.disclosure.v1 return the bare leaf, but merkle-session-v1 returns sha256(leaf‖leaf) so a 1-leaf session root can never be mistaken for a raw leaf hash. The odd-node rule is the other trap: chunk_merkle / merkle-session-v1 duplicate the last node, while satsignal.disclosure.v1 promotes it unchanged — these produce different roots for any odd-width level. Frozen per-scheme vectors covering both edges live in tests/vectors/merkle-single-leaf-v1/ (cross-scheme) and tests/vectors/sealed/session-commitment-v1/ (session); tests/test_wg5b_merkle_conformance.py pins them to the reference builders.
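The two divergence points can be made concrete with one parameterized root builder. This is a sketch derived from the table above, not the reference builders; the function names are hypothetical, and it covers root construction only (no proof paths). Leaves are assumed to be already-hashed 32-byte values.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _root(leaves, odd_rule: str, hash_single_leaf: bool) -> bytes:
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    if len(level) == 1 and hash_single_leaf:
        return _h(level[0] + level[0])       # session rule: never the bare leaf
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(_h(level[i] + level[i + 1]))
            elif odd_rule == "duplicate":
                nxt.append(_h(level[i] + level[i]))   # duplicate-last
            else:
                nxt.append(level[i])                  # promote-unchanged
        level = nxt
    return level[0]


def chunk_merkle_root(leaves) -> bytes:
    return _root(leaves, "duplicate", hash_single_leaf=False)


def disclosure_root(leaves) -> bytes:
    return _root(leaves, "promote", hash_single_leaf=False)


def session_root(leaves) -> bytes:
    return _root(leaves, "duplicate", hash_single_leaf=True)
```

With three leaves a, b, c, the duplicate-last builders return sha256(sha256(a‖b) ‖ sha256(c‖c)) while the promote-unchanged builder returns sha256(sha256(a‖b) ‖ c), so feeding a scheme's leaves to the wrong builder yields a root that cannot match.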
In every case the recipe is the same: re-canonicalize the same JSON, sha256 it, take the first 20 bytes, compare to the on-chain doc_hash.
6. TLV section
The 0–192 bytes after the header carry zero or more tag-length-value entries:
[tag (1 B)] [length (1 B)] [value (length B)]
- tag is a 1-byte code from §7. Tags are sorted ascending in serialized form so that two callers with the same logical TLV set produce byte-identical on-chain bytes regardless of dict order.
- length is 1 byte (0–255). MBNT does not use 2-byte lengths.
- value is length bytes, encoded per the tag's spec in §7.
Verifiers MUST:
- reject duplicate tags (parse error: the payload is ambiguous)
- reject TLVs that overrun tlv_len
- accept unknown tag values (record them as (tag, value) pairs; do not derive any meaning)
- treat the TLV section as optional metadata only; anything load-bearing for the proof's claim lives in the canonical doc, not on chain
The total TLV section size is capped at 192 bytes so that 28 + tlv_len ≤ 220 (BSV relay norm).
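The parsing rules in this section can be sketched as follows. The names parse_tlvs and encode_tlvs are hypothetical. The parser does not reject out-of-order tags, because ascending order is stated here as a serialization rule rather than as a verifier MUST; a stricter verifier could add that check.

```python
def parse_tlvs(section: bytes) -> list:
    """Split the TLV section into (tag, value) pairs, applying the MUST rules."""
    if len(section) > 192:
        raise ValueError("TLV section exceeds 192 bytes")
    pairs, seen, i = [], set(), 0
    while i < len(section):
        if i + 2 > len(section):
            raise ValueError("truncated TLV header")
        tag, length = section[i], section[i + 1]
        i += 2
        if i + length > len(section):
            raise ValueError("TLV overruns tlv_len")
        if tag in seen:
            raise ValueError(f"duplicate tag 0x{tag:02x}")
        seen.add(tag)
        pairs.append((tag, section[i:i + length]))   # unknown tags are kept as-is
        i += length
    return pairs


def encode_tlvs(tlvs: dict) -> bytes:
    """Serialize {tag: value} in ascending tag order so output is deterministic."""
    out = b""
    for tag in sorted(tlvs):
        value = tlvs[tag]
        if not 0 <= tag <= 0xFF or len(value) > 0xFF:
            raise ValueError("tag and length must each fit in one byte")
        out += bytes([tag, len(value)]) + value
    if len(out) > 192:
        raise ValueError("TLV section exceeds 192 bytes")
    return out
```

Taking a dict in encode_tlvs makes duplicate tags unrepresentable on the writing side, and the sort makes the output independent of insertion order.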
7. TLV tag registry
Public API anchors today emit one TLV by default — issuer_id (tag 0x05) — committing the 4-byte hash of the operator's DID. See §11 for what that means as a public chain-property and how to opt out. The other tags below are reserved for private deployments that want additional indexable on-chain metadata:
| tag | name | length | value encoding |
|---|---|---|---|
| 0x01 | currency | 3 | ASCII (e.g. b"USD") |
| 0x02 | amount_bucket | 1 | floor(log10(amount_in_minor_units)) (0..9) |
| 0x03 | reference_hash | 8 | sha256(reference_id_utf8)[:8] |
| 0x04 | counterparty_hash | 16 | sha256(counterparty_id_utf8)[:16] |
| 0x05 | issuer_id | 4 | sha256(issuer utf-8)[:4] |
| 0x06 | timestamp_unix | 8 | unsigned 64-bit big-endian seconds since epoch |
| 0x07 | subdoc_hash | 20 | secondary 20-byte document commitment |
Values are public on chain — never put PII or secrets in TLVs. Use the off-chain canonical doc and let doc_hash commit to them.
New tags are appended to this registry as private deployments need them; existing codes are never repurposed.
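The value encodings in the registry can be sketched as small helpers. The function names are hypothetical. Two choices here are assumptions rather than spec text: amount_bucket is computed from the decimal digit count to avoid floating-point log10 rounding, and amounts below 1 or at 10^10 and above are rejected because the registry gives no bucket for them.

```python
import hashlib
import struct


def _truncated_sha256(text: str, size: int) -> bytes:
    return hashlib.sha256(text.encode("utf-8")).digest()[:size]


def issuer_id(issuer: str) -> bytes:             # tag 0x05, 4 bytes
    return _truncated_sha256(issuer, 4)


def reference_hash(reference_id: str) -> bytes:  # tag 0x03, 8 bytes
    return _truncated_sha256(reference_id, 8)


def counterparty_hash(counterparty_id: str) -> bytes:  # tag 0x04, 16 bytes
    return _truncated_sha256(counterparty_id, 16)


def timestamp_unix(seconds: int) -> bytes:       # tag 0x06, big-endian u64
    return struct.pack(">Q", seconds)


def amount_bucket(amount_minor_units: int) -> bytes:   # tag 0x02, 1 byte
    if amount_minor_units < 1:
        raise ValueError("log10 is undefined below 1")  # assumption, see lead-in
    bucket = len(str(amount_minor_units)) - 1    # equals floor(log10(n)) for n >= 1
    if bucket > 9:
        raise ValueError("amount exceeds the 0..9 bucket range")
    return bytes([bucket])
```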
8. End-to-end verification recipe
A third party working in a language with no Satsignal helper library, and holding only (txid, canonical_doc_bytes), can verify the binding with stdlib JSON + SHA-256:
- Fetch the raw transaction by txid from any public BSV node (e.g. WhatsOnChain or Bitails).
- Walk the outputs; find the first OP_FALSE OP_RETURN <push> whose pushed bytes start with the magic MBNT.
- Parse the payload per §2: confirm magic, confirm version 0x01, read subtype, read tlv_len, slice doc_hash = payload[8:28].
- Re-canonicalize the supplied canonical doc (NFC, sort keys, no whitespace, no floats) and compute expected = sha256(...)[:20].
- If doc_hash == expected (constant-time compare, byte-equal), the document was committed in this transaction and the chain timestamp is its anchor time. If they differ, the document does not match this anchor, full stop.
That is the entire chain-side check. No Satsignal API is involved. For category-specific claims (the document's payload, manifest leaves, sealed reveals) the auditor then follows the spec for that category to walk the canonical doc.
For a worked example walking a real hosted-tier txid byte by byte, see /whats-on-chain.
9. Operational notes
- Single MBNT per tx. Satsignal anchors emit exactly one MBNT output per transaction. A verifier MAY accept multiple — but the current production wallet does not produce them.
- Total payload ≤ 220 bytes. The TLV cap of 192 bytes plus the 28-byte header keeps Satsignal under BSV's data-carrier relay norm. Larger payloads would not reach miners reliably.
- No on-chain compression. Bytes are exactly as specified above — no varint length, no zlib, no encoding stage.
- No on-chain secrets. Only the 20-byte doc_hash and any TLVs named in §7. Sealed-mode salts and revealed payloads are off-chain.
10. Confirmation depth for verifiers
The chain commitment is observable from the moment a transaction is broadcast, but a transaction is not final until it is mined into a block and that block is buried under further work. Verifiers SHOULD display confirmation count prominently next to the txid and apply a minimum-depth gate appropriate to the use case:
| use case | recommended minimum |
|---|---|
| audit-trail / evidence dispositioning | ≥ 1 |
| sealed-bid auction reveal-after / fairness | ≥ 1 |
| settlement / counterparty-risk gating | ≥ 6 |
| 0 confirmations (mempool only) | display "unconfirmed"; never claim "anchored" |
A 0-confirmation proof is a valid commitment to the broadcast network but not yet to chain history; a verifier that treats it as anchored is conflating two distinct properties. Confirmation timing depends on current chain conditions.
This is a verifier-side rule, not a protocol-byte change — the MBNT payload itself carries no confirmation field. Verifiers fetch confirmation count from the explorer they use to retrieve the raw tx (e.g. WhatsOnChain /v1/bsv/main/tx/hash/{txid} returns a confirmations field).
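A depth gate over the table above can be a few lines of verifier-side code. In the sketch below the use-case keys, the function name anchor_status, and the intermediate "confirming" label are all hypothetical; only the thresholds and the "unconfirmed" rule come from this section. The confirmation count is passed in by the caller, however it was fetched.

```python
# Minimum depths from the table; the dictionary keys are illustrative names.
MIN_DEPTH = {"audit_trail": 1, "sealed_bid": 1, "settlement": 6}


def anchor_status(confirmations: int, use_case: str) -> str:
    """Classify a transaction for display; never report mempool-only as anchored."""
    if confirmations <= 0:
        return "unconfirmed"
    if confirmations < MIN_DEPTH[use_case]:
        return "confirming"      # mined, but below this use case's gate
    return "anchored"
```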
11. What an anchor publicly commits to
Beyond doc_hash, a Satsignal anchor publicly commits the operator's DID fingerprint via the issuer_id TLV (§7, tag 0x05):
issuer_id = sha256(doc["issuer"].encode("utf-8"))[:4] # 4 bytes
For Satsignal's hosted tier this resolves to a constant — d5b0b0c6 == sha256("did:web:satsignal.cloud")[:4] — so every hosted-tier anchor is publicly chain-tagged with these same 4 bytes, whether it is a Standard or a Sealed proof: sealing hides the link between your file and the anchor, but it does not hide the operator fingerprint, which sits in a separate TLV.
Consequence. A chain observer who knows the operator's DID can enumerate every anchor that operator has issued. A multi-tenant deployment that gives each customer their own DID makes each customer's anchors enumerable by anyone watching the chain, even though the underlying documents stay sealed by the 20-byte hash.
For most use cases this is the intended property — discoverability of "Satsignal-issued anchors" by ecosystem indexers is half the reason a public protocol prefix exists. It is not appropriate for sealed-bid auctions among non-trivial counterparties or whistleblower-class users where "the bid was placed via Satsignal customer X" leaks before the reveal.
Opt-out. Emission is controlled at the anchoring pipeline, not per API request: the pipeline's publish configuration carries an issuer flag (default on); a deployment that runs the pipeline directly passes publish={"issuer": False} to suppress the TLV, producing a strict 28-byte payload with no operator fingerprint on chain. The hosted API exposes no per-request opt-out today — every anchor minted through POST /api/v1/anchors carries the hosted tier's constant issuer_id, so for hosted-tier users the property is effectively always-on. Operators running their own deployment where unlinkability matters should flip the default and treat opt-in as the exceptional path.
Authenticity caveat. The issuer_id is a discoverability handle, not an authenticity guarantee. Because a Satsignal anchor's authenticity is "this key signed it," anyone holding the signing key can mint an anchor carrying the legitimate issuer_id. A relying party MUST therefore verify the bound doc_hash — re-hash the document and confirm it matches the anchor's 20-byte commitment — rather than trusting the issuer_id tag as proof that Satsignal vouched for the content. See security.html §04 ("Keys and operator authority") for the operator-key blast radius and the key-rotation / incident posture.
This section describes only what is chain-visible. What an anchor actually proves — tamper-evidence and an upper time bound, not authorship and not prior existence — is stated canonically in the bundle spec.
12. Versioning
MBNT v1 is stable. A future MBNT v2 would change the version byte to 0x02, may extend the header, and may introduce new subtype codes. This document specifies v1 only.
Verifiers MUST refuse unknown versions. For an unknown subtype, a verifier MUST refuse to validate the canonical document but, per §4, SHOULD still report the on-chain commitment (doc_hash, txid, version). Silently accepting a future version would defeat the spec-as-contract.
Questions about this specification? Email hello@satsignal.cloud.