json-keypath-v1 — selective-disclosure profile for JSON top-level-key leaves (native anchor rule)
Status: active. This profile is the disclosure-side write-down of the per-top-level-key chunk_merkle rule every JSON anchor already commits; it adds no new on-chain behavior. It repeats, for JSON, exactly what csv-row-v1.md does for CSV and text-line-v1.md does for text.

subject_profile literal: json-keypath-v1 (hyphenated; the literal JSON anchors stamp into subject.proofs.chunk_merkle.scheme). This is a NATIVE profile: the leaf is the bare sha256(utf8(entry)) (standard) or HMAC(per-leaf HKDF salt, utf8(entry)) (sealed), where entry is the canonical key:value string of §3, never the salted framed RFC-6901 preimage of the deprecated satsignal.json.field.v1.
1. Why this exists
A selective disclosure proves a revealed unit into the exact merkle leaf the anchor already committed on chain. JSON anchors chunk a .json file by top-level key under chunk_merkle.scheme = "json-keypath-v1". To redact a JSON object you already anchored — revealing some top-level keys, withholding others — the disclosure leaf rule MUST equal the anchor's json-keypath-v1 leaf rule byte-for-byte. This profile pins that rule.
Granularity is TOP-LEVEL KEY (locked). Deep-field redaction (a JSON Pointer / RFC-6901 path into a nested value) has no on-chain deep-field leaves to prove into and is a planned FUTURE anchor scheme, not this profile. The deprecated satsignal.json.field.v1 profile cannot bind to a live JSON anchor (salted/framed leaf + RFC-6901 pointer granularity); see §9.
2. Inputs and canonicalization (json-jcs-v1)
The leaf-set is computed from the original file bytes via the SAME canon the anchor applies (JCS / RFC-8785-ish — JSON.parse then re-serialize each value canonically, byte-identical in the standard and sealed anchor branches):
- Parse the file bytes as JSON (JSON.parse). The decoder is the anchor's JSON.parse(file.text()).
- Objects-only gate. The top-level value MUST be a JSON object. A top-level array or scalar (string / number / boolean / null) is still hashed into a content_canonical (the JCS of the whole value) but produces NO chunk_merkle: it has no top-level keys to chunk and is therefore NOT redactable under this profile. A json-keypath-v1 disclosure source MUST be an object.
- Per-value canonicalization (json-jcs-v1). Each top-level value is canonicalized as RFC 8785 (JCS) plus a UTF-8 NFC pre-normalization step: NFC-normalize all strings and keys, then sort object keys by UTF-16 code unit (RFC 8785 §3.2.3), separators=(",",":") (no whitespace), JCS-shortest-form numbers (finite floats permitted; NaN / Infinity rejected).
> Not pure RFC 8785, and not SCJ-v1. The NFC step is an addition
> to RFC 8785 (which does not normalize), so a verifier using a stock
> RFC 8785 library MUST NFC-normalize the input first or it will diverge
> on non-NFC strings. This is also distinct from SCJ-v1 (the
> provenance / MBNT / manifest-items rule; see /spec-mbnt),
> which sorts keys by Unicode code point and forbids floats.
> json-jcs-v1 sorts by UTF-16 code unit and permits finite floats;
> the two key orders differ for supplementary-plane ("astral") keys.
The content-canonical hash is sha256 of the full canonical JCS string under scheme json-jcs-v1; it is not part of the per-leaf rule but is the anchor's content_canonical. For the §4 worked-example object it is 44aa973edc1e37bb08daa176fe96db64fe8e827323d408a2ba12c65ab2cc182c.
A file the anchor accepted recomputes to the same leaves here; a true mismatch surfaces as the distinct recompute-mismatch failure (§7), never a silent reject.
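The canonicalization steps of this section can be sketched in a few lines of Node.js. This is a minimal illustration, not the anchor's code: the function names jcs and contentCanonicalHash are hypothetical, and the sketch assumes the input came from JSON.parse and that ECMAScript's JSON.stringify provides the JCS string and number forms (RFC 8785 defines both in terms of ECMAScript serialization).

```javascript
const { createHash } = require("node:crypto");

// Sketch of json-jcs-v1: RFC 8785-style output plus NFC on strings and keys.
// Hypothetical helper; assumes `value` was produced by JSON.parse.
function jcs(value) {
  if (typeof value === "string") return JSON.stringify(value.normalize("NFC"));
  if (typeof value === "number") {
    if (!Number.isFinite(value)) throw new Error("NaN / Infinity rejected");
    return JSON.stringify(value); // ECMAScript shortest-form number
  }
  if (value === null || typeof value === "boolean") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map(jcs).join(",") + "]";
  if (typeof value !== "object") throw new Error("not a JSON value");
  // NFC-normalize keys, then sort by UTF-16 code unit (plain JS string comparison).
  const members = Object.keys(value).map((k) => [k.normalize("NFC"), value[k]]);
  members.sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return "{" + members.map(([k, v]) => JSON.stringify(k) + ":" + jcs(v)).join(",") + "}";
}

// content_canonical: sha256 over the UTF-8 bytes of the whole canonical string.
function contentCanonicalHash(fileText) {
  return createHash("sha256").update(jcs(JSON.parse(fileText)), "utf8").digest("hex");
}

const sample = '{ "b": [1.50, "x"], "a": {"z": null, "y": true} }';
console.log(jcs(JSON.parse(sample)));
console.log(contentCanonicalHash(sample));

// UTF-16 code-unit order: an astral key (lead surrogate 0xD83D) sorts before
// U+FF21, whereas a code-point sort would put U+FF21 first.
console.log(jcs({ "\uFF21": 1, "\u{1F600}": 2 }));
```

A stock RFC 8785 library would need the NFC step applied to its input first; here it is folded into the serializer for brevity.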
3. Leaf extraction — sorted top-level keys, key:jcs(value) entry, NO header
After parse + the objects-only gate, segment into leaves over the top-level keys:
keys = Object.keys(parsed).sort() // UTF-16 code-unit sort (JS String default)
entry_i = JSON.stringify(keys[i].normalize("NFC")) + ":" + jcs(parsed[keys[i]])
- Sorted-key order. Leaves are the top-level keys sorted by UTF-16 code unit (Object.keys().sort(), RFC 8785 §3.2.3; not code point, the two differ only for supplementary-plane keys). The leaf index i is the key's position in the sorted-key list, zero-indexed; it is NOT the source-document key order.
- Entry rule. The leaf entry for sorted key k is the NFC-quoted key (JSON.stringify(k.normalize("NFC")), i.e. the key with its JSON string quotes and escaping), then a literal :, then jcs(value) (the JCS canonicalization of that key's value). No whitespace; this is exactly one canonical "key":value pair.
- No header concept. Unlike csv-row-v1 (which excludes row 0), json-keypath-v1 keeps every top-level key: leaf 0 is the first sorted key.
- leaf_id = "k" + 6-digit zero-padded leaf index (e.g. k000000). Display / ordering handle only; NOT part of any hash preimage. (The "k" prefix differs from csv-row-v1's "r" and text-line-v1's "l" for readability only; it is not load-bearing.)
A non-object top-level value (array / scalar) is not a valid json-keypath-v1 disclosure source (no top-level keys, no chunk_merkle); the tool fails closed (objects-only gate, §2).
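As an illustration of the segmentation rule, the following Node.js sketch applies the objects-only gate, sorts the top-level keys, and builds each entry and leaf_id. The names extractLeaves and jcs are hypothetical, and the inline jcs is a simplified stand-in for the §2 canonicalization.

```javascript
// Simplified stand-in for the §2 json-jcs-v1 canonicalization (hypothetical helper).
function jcs(value) {
  if (typeof value === "string") return JSON.stringify(value.normalize("NFC"));
  if (value === null || typeof value !== "object") {
    if (typeof value === "number" && !Number.isFinite(value)) throw new Error("non-finite number");
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) return "[" + value.map(jcs).join(",") + "]";
  const members = Object.keys(value).map((k) => [k.normalize("NFC"), value[k]]);
  members.sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return "{" + members.map(([k, v]) => JSON.stringify(k) + ":" + jcs(v)).join(",") + "}";
}

// Parse, gate, sort, and emit one { leaf_id, entry } per top-level key.
function extractLeaves(fileText) {
  const parsed = JSON.parse(fileText);
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    throw new Error("objects-only gate: top-level value must be a JSON object");
  }
  const keys = Object.keys(parsed).sort(); // default sort compares UTF-16 code units
  return keys.map((k, i) => ({
    leaf_id: "k" + String(i).padStart(6, "0"), // display handle, never hashed
    entry: JSON.stringify(k.normalize("NFC")) + ":" + jcs(parsed[k]),
  }));
}

const source = JSON.stringify({ name: "AcmeCorp", ssn: "123-45-6789", balance: 1000, public_id: 42 });
for (const leaf of extractLeaves(source)) console.log(leaf.leaf_id, leaf.entry);
```

The leaf index follows the sorted order, so balance becomes k000000 even though it appears third in the source object.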
4. Leaf hash — bare sha256 of the canonical entry (standard mode)
For a STANDARD JSON anchor (chunk_merkle.algo == "sha256"):
leaf_hash_i = sha256( utf8( entry_i ) )
Bare — no profile literal, no leaf_id, no salt, no 0x00 separators. A standard json-keypath-v1 revealed[i] carries {leaf_id, profile: "json-keypath-v1", value: <canonical entry string>, leaf_hash, proof_path} and no salt_b64 (§5). The value is the canonical entry string "key":jcs(value) itself (whitespace-free, NFC).
The verifier's value→bytes rule is utf8(value); it recomputes sha256(utf8(value)) and compares to the published leaf_hash, then walks proof_path to linked_anchor.root.
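The standard leaf check can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, the revealed entry is modeled as a plain object with the fields listed above, and only the leaf_hash comparison is shown (the proof_path walk is a separate step).

```javascript
const { createHash } = require("node:crypto");

// Bare sha256 over the UTF-8 bytes of the canonical entry string.
function standardLeafHash(entry) {
  return createHash("sha256").update(Buffer.from(entry, "utf8")).digest("hex");
}

// Hypothetical per-leaf check for a standard revealed[i].
function checkStandardLeaf(revealed) {
  return standardLeafHash(revealed.value) === revealed.leaf_hash ? "ok" : "leaf_hash_mismatch";
}

const value = '"name":"AcmeCorp"';
const revealed = {
  leaf_id: "k000001",
  profile: "json-keypath-v1",
  value,
  leaf_hash: standardLeafHash(value),
  proof_path: [], // omitted in this sketch
};
console.log(checkStandardLeaf(revealed));

// A non-canonical entry (space after the colon) hashes to a different leaf.
console.log(checkStandardLeaf({ ...revealed, value: '"name": "AcmeCorp"' }));
```

Because the preimage is the entry string alone, any whitespace or ordering difference in value changes the hash.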
Worked example (NOT placeholders — computed against the anchor rule)
Source file bytes = the compact JSON.stringify of the object {"name":"AcmeCorp","ssn":"123-45-6789","balance":1000,"public_id":42}.
Sorted top-level keys → balance, name, public_id, ssn → 4 leaves:
| leaf_id | value (canonical entry) | sha256(utf8(value)) |
|---|---|---|
| k000000 | "balance":1000 | 01eae28b15e02f53498cc62386411dc9b8e20bd9913e9b467388745a6c7e62ee |
| k000001 | "name":"AcmeCorp" | f041ba92cd88620b074d511008b24a22f72e8804f15532891c3f4a3b09cef36c |
| k000002 | "public_id":42 | b62d2aa2e3776e429e11caf5c22c240aafe66758728800b9de3ec06cbec0d462 |
| k000003 | "ssn":"123-45-6789" | 046b8db3c2040b01ccf3ff8b8789f580a9817202ec499751481825f86b3a1e6b |
Standard root = 3c06af94a0af735cd19cc77349b7a962464cda703be16c66bbad2395913dbec8 (4 leaves → even tree at every level; no odd-last self-sibling arises in this example — the duplicate-last-on-odd rule still applies and is pinned by the csv-row-v1 / text-line-v1 odd-last vectors that share this merkle, §6).
These are frozen in tests/vectors/disclosure-v1/json_keypath_v1_native/N1.fixture.json.
5. Salts — standard mode is UNSALTED (privacy posture is first-class)
Standard json-keypath-v1 leaves are unsalted bare sha256(utf8(entry)). The honest characterization is stronger than "an incidental proof_path sibling leaks":
- The standard .mbnt publishes EVERY leaf hash, including redacted keys. proofs.json carries merkle_leaves = the complete ordered list of every top-level-key leaf hash, redacted keys included. A holder of the standard bundle has the exact sha256(utf8("key":jcs(value))) of each withheld key and can guess-and-confirm it entirely offline, not only when a withheld key happens to sit on a revealed key's proof_path.
- Zero per-leaf entropy ⇒ identical withheld entries have identical leaf hashes. With no salt, no leaf_id, and no profile tag in the preimage (§4), two redacted keys with the same canonical "key":value entry produce the same leaf hash (cross-equality leak).
- A withheld key's recovery cost equals THAT ENTRY'S OWN entropy. A guess at the "key":value entry is confirmed in one sha256 against the published leaf hash, so entries with low-entropy or small-space values, such as a boolean, enum, status flag, small number, date, or known-format identifier (phone, SSN, currency amount), are trivially recoverable; only genuinely high-entropy free-form values are protected. Mask render mode makes the attacker's job easier (§7): mask prints each withheld key as "key":"[REDACTED]", disclosing the exact key name and its sorted position, so for a masked withheld key only the value remains to be guessed (the standard leaf is the bare sha256 of the full "key":value entry, whose key part the attacker already knows).
This is the anchor-time-chosen tradeoff, not a defect — the discloser accepted it by anchoring in standard mode. Do NOT use standard mode to withhold low-entropy or small-space sensitive values (e.g. an ssn, balance, or status field); route that data to sealed mode (§5b), where redacted keys are unguessable and equal withheld entries do not collide. No keyless scheme can protect a redacted key that is itself the guessable secret; that is the cost of the no-keyfile requirement and the reason sealed exists. Choose sealed before anchoring if any withheld key could be low-entropy.
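To make the guess-and-confirm risk concrete, here is a small Node.js sketch of the offline attack against a standard leaf. Everything in it is illustrative: recoverSmallInteger is a hypothetical function, the "published" hash is computed locally as a stand-in for a merkle_leaves entry, and the search space (integers up to a bound) is an assumption about what the attacker tries.

```javascript
const { createHash } = require("node:crypto");
const sha256Hex = (s) => createHash("sha256").update(s, "utf8").digest("hex");

// Stand-in for the withheld key's leaf hash as published in merkle_leaves.
const publishedLeaf = sha256Hex('"balance":1000');

// The key name is known (for example from a mask-mode copy); only the value
// is guessed. One sha256 per guess confirms or rejects it, entirely offline.
function recoverSmallInteger(keyName, leafHash, max) {
  for (let guess = 0; guess <= max; guess++) {
    const entry = JSON.stringify(keyName) + ":" + String(guess);
    if (sha256Hex(entry) === leafHash) return entry;
  }
  return null;
}

console.log(recoverSmallInteger("balance", publishedLeaf, 100000));

// Cross-equality: equal entries always produce equal standard leaves.
console.log(sha256Hex('"status":"active"') === sha256Hex('"status":"active"'));
```

The loop needs only as many hashes as the value space is large, which is why small-space values belong in sealed mode.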
A standard revealed[i] MUST NOT carry salt_b64. The structural schema treats salt_b64 as optional for json-keypath-v1; the verifier ignores any stray salt_b64 under the bare-sha256 standard rule.
5b. Sealed mode — HMAC leaf under a per-leaf HKDF salt (algo: "merkle-hmac-sha256")
For a SEALED JSON anchor (chunk_merkle.algo == "merkle-hmac-sha256", chunk_merkle.salt_version == "salt_v1"), the leaf is keyed:
salt_i = HKDF-SHA256(ikm = master_salt,
salt = "satsignal-sealed-v1/per-leaf",
info = "chunk/" || u32_be(i), L = 32)
leaf_hash_i = HMAC-SHA256(key = salt_i, msg = utf8(entry_i))
This is the same per-leaf HKDF/HMAC derivation the sealed CSV and text anchors use — the anchor's sealed merkle assembly is generic across file types. Only the leaf hash differs from standard; canonicalization (§2), segmentation (§3), and the merkle (§6) are identical.
A sealed revealed[i] carries salt_b64 = base64(salt_i) — the PER-LEAF salt for that revealed key. salt_b64 is REQUIRED for a sealed leaf; a sealed carrier with a revealed leaf missing salt_b64 fails closed with sealed_leaf_missing_salt.
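The sealed derivation might look like this in Node.js. It is a sketch under stated assumptions: the HKDF salt string and the "chunk/" prefix are taken as their UTF-8 bytes, u32_be(i) as a 4-byte big-endian integer, and the function names (perLeafSalt, sealedLeafHash, checkSealedLeaf) are hypothetical. The anchor's actual byte encodings are not spelled out above and may differ.

```javascript
const { hkdfSync, createHmac } = require("node:crypto");

// salt_i = HKDF-SHA256(ikm = master_salt, salt = label, info = "chunk/" || u32_be(i), L = 32)
function perLeafSalt(masterSalt, i) {
  const index = Buffer.alloc(4);
  index.writeUInt32BE(i, 0);
  const info = Buffer.concat([Buffer.from("chunk/", "utf8"), index]);
  const label = Buffer.from("satsignal-sealed-v1/per-leaf", "utf8");
  return Buffer.from(hkdfSync("sha256", masterSalt, label, info, 32));
}

// leaf_hash_i = HMAC-SHA256(key = salt_i, msg = utf8(entry_i))
function sealedLeafHash(salt, entry) {
  return createHmac("sha256", salt).update(entry, "utf8").digest("hex");
}

// Hypothetical verifier-side check: the verifier only ever sees the per-leaf salt.
function checkSealedLeaf(revealed) {
  if (typeof revealed.salt_b64 !== "string") return "sealed_leaf_missing_salt";
  const salt = Buffer.from(revealed.salt_b64, "base64");
  return sealedLeafHash(salt, revealed.value) === revealed.leaf_hash ? "ok" : "leaf_hash_mismatch";
}

const masterSalt = Buffer.from(Array.from({ length: 32 }, (_, n) => n)); // bytes 0x00..0x1f
const entry = '"name":"AcmeCorp"';
const salt1 = perLeafSalt(masterSalt, 1);
const revealed = {
  leaf_id: "k000001",
  profile: "json-keypath-v1",
  value: entry,
  salt_b64: salt1.toString("base64"),
  leaf_hash: sealedLeafHash(salt1, entry),
};
console.log(checkSealedLeaf(revealed));
```

Because the leaf index feeds the HKDF info, the same entry at two different positions produces two unrelated leaf hashes.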
5b.1 What a sealed disclosure carries — per-leaf salt, NEVER the master
The redact tool reads the 32-byte master salt from the SOURCE .mbnt manifest.json (salt_b64, base64url) and derives the per-leaf salts. The disclosure output carries ONLY the per-leaf salts of the revealed keys. THE MASTER-SALT-STRIP RULE (forever): a disclosure .mbnt MUST NOT contain the master salt in any encoding, and MUST NOT carry a redacted key's per-leaf salt. Shipping the master salt would let any recipient re-derive every per-leaf salt and unseal every redacted key. The tool enforces this structurally (it never ships the source manifest.json) and with a P0 runtime guard (redact-core.mjs:_assertMasterSaltStripped, scheme/mode-independent). Revealing the per-leaf HKDF salts of revealed keys leaks nothing about the master salt or other keys (HKDF-Expand is a PRF).
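A runtime guard in the spirit of the strip rule could be sketched as below. This is not the implementation of _assertMasterSaltStripped, whose internals are not described here; assertMasterSaltAbsent and masterSaltEncodings are hypothetical names, and the set of encodings scanned (raw bytes, hex in both cases, base64, base64url) is an assumption about what "any encoding" should cover at minimum.

```javascript
// Candidate byte patterns for the master salt. Unpadded base64 is used so
// that both padded and unpadded occurrences are caught by a substring scan.
function masterSaltEncodings(masterSalt) {
  const hex = masterSalt.toString("hex");
  return [
    masterSalt, // raw bytes
    Buffer.from(hex, "utf8"),
    Buffer.from(hex.toUpperCase(), "utf8"),
    Buffer.from(masterSalt.toString("base64").replace(/=+$/, ""), "utf8"),
    Buffer.from(masterSalt.toString("base64url"), "utf8"),
  ];
}

// Hypothetical guard: throw if any encoding appears anywhere in the output bytes.
function assertMasterSaltAbsent(outputBytes, masterSalt) {
  for (const needle of masterSaltEncodings(masterSalt)) {
    if (outputBytes.includes(needle)) {
      throw new Error("master salt present in disclosure output");
    }
  }
}

const masterSalt = Buffer.from(Array.from({ length: 32 }, (_, n) => n));
const perLeafStandIn = Buffer.alloc(32, 0xab).toString("base64"); // placeholder per-leaf salt
const output = Buffer.from(JSON.stringify({ revealed: [{ leaf_id: "k000001", salt_b64: perLeafStandIn }] }));
assertMasterSaltAbsent(output, masterSalt); // passes silently
console.log("master salt not found in output");
```

A byte scan like this is a backstop only; the structural rule (never copying the source manifest.json into the disclosure) is what keeps the salt out in the first place.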
5b.2 Privacy posture
A sealed redacted key is unguessable: its leaf is an HMAC under a per-leaf salt the verifier cannot derive without the master salt, which the disclosure never carries. Standard = disclosed-keys-only guarantee with brute-forceable redacted keys; sealed = redacted keys stay private. The choice is made at anchor time.
5b.3 Worked example (NOT placeholders)
Same 4-key object as §4; master salt = 0x00 0x01 … 0x1f (the bearer secret, NEVER shipped). Sealed leaves (sorted-key order):
| leaf_id | value | HMAC(salt_i, utf8(value)) |
|---|---|---|
| k000000 | "balance":1000 | 7c18ca5de8b2f87545da58ac37c646deb5e76bbb249fc819ffe37a68c46de449 |
| k000001 | "name":"AcmeCorp" | 484ade177adba084462f1f0bc614f36dfcca64e323ab5a0013c158fcf2bdf3e0 |
| k000002 | "public_id":42 | 90e79917674b2e934228bd17d2a8930cd77ee236bf536a18bdc67f7fca4493f6 |
| k000003 | "ssn":"123-45-6789" | 47e77e627543dbd1a86e08af03de23490108c51306831b991d912e6ec1b093fa |
Sealed root = 8d8a2a6d5b0490b19501452a729a967e1306df3164d2e8fc129c9afd52cb463c. Frozen in tests/vectors/disclosure-v1/json_keypath_v1_native_sealed/S1.fixture.json.
6. Merkle behavior — DUPLICATE-LAST on odd
The tree is duplicate-last-on-odd, identical to csv-row-v1 / text-line-v1 and to the anchor (merkleRootFromHexLeaves / merkleRootFromLeafBytes): at each level an odd last node pairs with itself (SHA-256(node || node)). The verifier only walks proof_path — it never rebuilds the root — so the duplicate-last tree verifies with no merkle-walk change. The redact tool emits duplicate-last-correct paths (a self-sibling entry for an odd-promoted node).
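A duplicate-last-on-odd root computation can be sketched as follows. The sketch assumes that node || node means concatenation of the raw 32-byte digests (not their hex strings) and that a single-leaf tree has the leaf itself as root; neither detail is stated above. merkleRoot and h are hypothetical names, not the anchor's merkleRootFromHexLeaves.

```javascript
const { createHash } = require("node:crypto");

// SHA-256 over the concatenation of the given buffers.
const h = (...bufs) => createHash("sha256").update(Buffer.concat(bufs)).digest();

function merkleRoot(hexLeaves) {
  if (hexLeaves.length === 0) throw new Error("no leaves");
  let level = hexLeaves.map((x) => Buffer.from(x, "hex"));
  while (level.length > 1) {
    const next = [];
    for (let i = 0; i < level.length; i += 2) {
      const left = level[i];
      // Odd last node pairs with itself: SHA-256(node || node).
      const right = i + 1 < level.length ? level[i + 1] : left;
      next.push(h(left, right));
    }
    level = next;
  }
  return level[0].toString("hex");
}

// The four standard leaf hashes from the §4 table, in sorted-key order.
const leaves = [
  "01eae28b15e02f53498cc62386411dc9b8e20bd9913e9b467388745a6c7e62ee",
  "f041ba92cd88620b074d511008b24a22f72e8804f15532891c3f4a3b09cef36c",
  "b62d2aa2e3776e429e11caf5c22c240aafe66758728800b9de3ec06cbec0d462",
  "046b8db3c2040b01ccf3ff8b8789f580a9817202ec499751481825f86b3a1e6b",
];
const publishedRoot = "3c06af94a0af735cd19cc77349b7a962464cda703be16c66bbad2395913dbec8";
console.log("root:", merkleRoot(leaves));
console.log("matches §4 root:", merkleRoot(leaves) === publishedRoot);
```

If the comparison prints false, the byte-concatenation assumption in this sketch is the first thing to revisit.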
Worked example (the §4 four-leaf tree)
Leaves A=k000000, B=k000001, C=k000002, D=k000003 (the §4 hashes). With 4 leaves the tree is even at every level, so no odd-last self-sibling arises here:
- Level 0 → 1: pair A,B → L1[0] = SHA-256(A || B) = 8141f6bcbe27f7a2d5ca2c493555ecd871b6324dec5fc7e69442eb8324f4cab8; pair C,D → L1[1] = SHA-256(C || D) = df77ab3a324d8519188a22288190c054b87386b97d16f617ddba9d15026d143f.
- ROOT = SHA-256(L1[0] || L1[1]) = 3c06af94…dbec8 (§4).
Proof paths (frozen in N1, revealing k000001 + k000002):
- k000001 (B): [{L, A}, {R, L1[1]}] (A = k000000 is the level-0 left sibling; L1[1] at level 1, on the right).
- k000002 (C): [{R, D}, {L, L1[0]}] (D = k000003 is the level-0 right sibling; L1[0] at level 1, on the left).
The odd-last self-sibling case (SHA-256(node || node) with a two-entry self-sibling path) does not arise for this even 4-leaf example; it is the same shared duplicate-last rule, pinned by the odd-last vectors of csv-row-v1 (csv_row_v1_native/N1, Carol = 2-entry path) and text-line-v1 (text_line_v1_native/N1, l000002 = 2-entry path). A conforming verifier MUST walk a self-sibling entry; it MUST NOT reject it or assume promote-unchanged.
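A proof_path walk that handles the self-sibling entry can be sketched as below. The entry shape { side, hash } is an assumption made for illustration (the text above only shows the abbreviated {L, A} form), as is the convention that side names the sibling's position and that digests are concatenated as raw bytes. walkProofPath is a hypothetical name.

```javascript
const { createHash } = require("node:crypto");
const h = (...bufs) => createHash("sha256").update(Buffer.concat(bufs)).digest();

// Fold the leaf hash up the tree, one sibling per level.
function walkProofPath(leafHashHex, proofPath) {
  let node = Buffer.from(leafHashHex, "hex");
  for (const { side, hash } of proofPath) {
    const sibling = Buffer.from(hash, "hex");
    if (side === "L") node = h(sibling, node); // sibling sits on the left
    else if (side === "R") node = h(node, sibling); // sibling sits on the right
    else throw new Error("unknown side: " + side);
  }
  return node.toString("hex");
}

// Three-leaf tree: the last leaf c is odd at level 0, so its first path entry
// is a self-sibling (its own hash on the right), giving a two-entry path.
const [a, b, c] = [1, 2, 3].map((n) => Buffer.alloc(32, n));
const root = h(h(a, b), h(c, c)).toString("hex");
const pathForC = [
  { side: "R", hash: c.toString("hex") }, // self-sibling
  { side: "L", hash: h(a, b).toString("hex") },
];
console.log(walkProofPath(c.toString("hex"), pathForC) === root);
```

The walk treats the self-sibling like any other entry, which is why no special case is needed on the verifier side.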
7. Original anchor binding + the TWO render modes
A disclosure binds to the existing anchor via the §4 chain of disclosure-v1.md: the carrier canonical.json (carried VERBATIM) hashes to the on-chain document_hash; its subject.proofs.chunk_merkle.root equals linked_anchor.root; its scheme equals linked_anchor.subject_profile == "json-keypath-v1"; and its algo selects the leaf rule (sha256 standard / merkle-hmac-sha256 sealed). The redact tool recomputes the leaves from the original file, hard-fails if they do not match the committed merkle_leaves + root (wrong file / wrong bundle / edited file), then builds proof paths for the revealed keys. No re-anchor; no new scheme.
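The scheme, root, and algo checks in this binding chain can be illustrated with a short sketch. The object layout mirrors the dotted paths in the text; the function name checkBinding, the return shape, and the order of the checks are assumptions, and the document_hash comparison is left out because its inputs are defined in disclosure-v1.md rather than here.

```javascript
// Hypothetical binding check over a parsed carrier canonical.json and linked_anchor.
function checkBinding(carrier, linkedAnchor) {
  const cm = carrier.subject.proofs.chunk_merkle;
  if (cm.scheme !== "json-keypath-v1" || linkedAnchor.subject_profile !== cm.scheme) {
    return { ok: false, code: "linked_anchor_profile_mismatch" };
  }
  if (cm.root !== linkedAnchor.root) {
    return { ok: false, code: "linked_anchor_root_mismatch" };
  }
  // algo selects the leaf rule.
  if (cm.algo === "sha256") return { ok: true, mode: "standard" };
  if (cm.algo === "merkle-hmac-sha256" && cm.salt_version === "salt_v1") {
    return { ok: true, mode: "sealed" };
  }
  return { ok: false, code: "unsupported_linked_algo" };
}

const root = "3c06af94a0af735cd19cc77349b7a962464cda703be16c66bbad2395913dbec8";
const carrier = {
  subject: { proofs: { chunk_merkle: { scheme: "json-keypath-v1", algo: "sha256", root } } },
};
const linkedAnchor = { subject_profile: "json-keypath-v1", root };
console.log(checkBinding(carrier, linkedAnchor));
```

Returning a code instead of throwing keeps each failure distinguishable, which matches how the negative vectors in §8 are named.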
The proof binds the revealed keys to the on-chain root via the proof_path walk. The redacted copy is presentation-only: its bytes feed ONLY presentation.view_sha256 — they are NOT part of the leaf preimage and NOT independently re-attested. presentation.format == "json", .json extension, and presentation.view_sha256 == sha256(redacted-copy bytes).
JSON ships TWO owner-chosen render modes (default drop). This is the divergence from text-line-v1 (which has one render mode); both JSON modes attest the same revealed leaves — they differ only in the presentation bytes:
- drop (default): structure_disclosure: "positions_hidden", redaction_marker: "(key omitted)". The copy is the canonical JCS object of only the revealed keys (sorted). Withheld keys do not appear; their existence/positions are hidden.
- mask: structure_disclosure: "positions_preserved", redaction_marker: "[REDACTED]". The copy is all keys (sorted), with each withheld key rendered as "key":"[REDACTED]". The "[REDACTED]" placeholder is presentation-only and is NOT attested (it is not a leaf value); the mask copy discloses the set of keys and their positions, but the withheld values stay withheld.
Worked example — reveal name + public_id (k000001, k000002)
Both modes attest the same two leaves (the §4 / §5b proof paths); they differ only in the redacted-copy bytes and therefore in view_sha256:
- drop copy = {"name":"AcmeCorp","public_id":42} → presentation.view_sha256 = 4d07c17305b62ca015bf4772d7b5cb2ae0073db7c2d95daabdfa76eccf4a444b.
- mask copy = {"balance":"[REDACTED]","name":"AcmeCorp","public_id":42,"ssn":"[REDACTED]"} → presentation.view_sha256 = 3cb1ff995587ee10c5dab8aa84d69b2412acd054f05c591fe52576e930a126f6.
(The N1 / S1 frozen fixtures pin the drop view_sha256 4d07c173…; the mask view_sha256 is the same for standard and sealed since masking is presentation-only.)
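The two render modes can be sketched as a single function. This is an illustration only: renderCopy and viewSha256 are hypothetical names, JSON.stringify stands in for jcs(value) because the example values are scalars, and the copy bytes are assumed to be the UTF-8 string with no trailing newline.

```javascript
const { createHash } = require("node:crypto");

// Build the presentation copy for the given revealed keys in "drop" or "mask" mode.
function renderCopy(obj, revealedKeys, mode) {
  if (mode !== "drop" && mode !== "mask") throw new Error("unknown render mode: " + mode);
  const reveal = new Set(revealedKeys);
  const parts = [];
  for (const k of Object.keys(obj).sort()) {
    if (reveal.has(k)) {
      parts.push(JSON.stringify(k) + ":" + JSON.stringify(obj[k])); // scalar values only here
    } else if (mode === "mask") {
      parts.push(JSON.stringify(k) + ':"[REDACTED]"'); // placeholder, not attested
    }
    // drop mode: withheld keys are simply omitted
  }
  return "{" + parts.join(",") + "}";
}

const viewSha256 = (copy) => createHash("sha256").update(copy, "utf8").digest("hex");

const doc = { name: "AcmeCorp", ssn: "123-45-6789", balance: 1000, public_id: 42 };
for (const mode of ["drop", "mask"]) {
  const copy = renderCopy(doc, ["name", "public_id"], mode);
  console.log(mode, copy, viewSha256(copy));
}
```

Both copies attest the same two leaves; only the bytes fed to view_sha256 differ.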
8. Fixtures (test vectors)
[FOREVER-CONTRACT] — disclosure-v1.md §11 forbids a profile without vectors. Frozen, oracle-computed + tool-cross-checked:
- Standard: tests/vectors/disclosure-v1/json_keypath_v1_native/
  - N1: the §4 happy path (reveal k000001 name + k000002 public_id, redact k000000 balance + k000003 ssn; no salt_b64; drop-mode copy).
  - N2_linked_anchor_profile_mismatch: carrier scheme json-keypath-v2, root equal → linked_anchor_profile_mismatch.
  - N3_non_jcs_value_mistake: a non-conformant discloser emitted a NON-JCS entry for k000001 ("name": "AcmeCorp" with a space after the colon) instead of the canonical whitespace-free "name":"AcmeCorp"; the bare sha256 of the non-JCS value no longer equals the committed leaf_hash → leaf_hash_mismatch (pins §3/§4: the entry is the canonical JCS key:value, whitespace-free).
  - negatives/ overlays: N1_leaf_hash_mismatch, N1_merkle_path_mismatch, N1_linked_anchor_root_mismatch, N1_linked_anchor_canonical_hash_mismatch.
- Sealed: tests/vectors/disclosure-v1/json_keypath_v1_native_sealed/
  - S1: the §5b happy path (per-leaf salt_b64; reveal k000001 + k000002).
  - negatives/: S1_leaf_hash_mismatch, S1_wrong_salt, S1_merkle_path_mismatch, S1_linked_anchor_root_mismatch, S1_linked_anchor_canonical_hash_mismatch, S2_missing_salt (sealed_leaf_missing_salt), S3_wrong_salt_version (unsupported_linked_algo), S4_linked_anchor_profile_mismatch.
9. Out of scope / deprecation pointers
- Deep-field / RFC-6901 granularity is NOT this profile. The deprecated salted satsignal.json.field.v1 (json-field-v1.md) addresses one leaf per nested field by JSON Pointer with a salted, framed preimage and cannot bind to a live json-keypath-v1 anchor (deep RFC-6901 pointer + salted/framed leaf vs. this profile's top-level-key + bare/sealed leaf). It stays inert (allowlist literals are never removed; its frozen corpus is a regression guard only). This json-keypath-v1 profile is DISTINCT from that deprecated json-field-v1 one. A future deep-field ANCHOR scheme + a de-salted native profile is a separate effort.
- Top-level array / scalar JSON has no top-level keys and no chunk_merkle (objects-only gate, §2); it gets a content_canonical hash but is not redactable under this profile.
11. Profile registry pointer
Registered in disclosure-v1.md §11. json-keypath-v1 is the native top-level-key rule that JSON anchors actually emit; a disclosure binds to the chunk_merkle the anchor already committed (scheme == "json-keypath-v1"), revealing a subset of its per-top-level-key leaves — no re-anchor, no new scheme. Leaf rule: §§2–4 standard, §5b sealed; merkle §6; binding + render modes §7; vectors §8.
Questions about this specification? Email hello@satsignal.cloud.