Watermarking

Layer 2 embeds an imperceptible payload in the asset’s signal data. The watermark survives common transformations that would strip or corrupt an embedded C2PA manifest (re-encoding, cropping, screenshots, format conversion). When the manifest is gone, the watermark ID points back to the manifest record in the soft-binding index.

Engines

| Modality | Engine | Licence | Payload capacity | Notes |
| --- | --- | --- | --- | --- |
| Image | Adobe TrustMark | MIT | 48 bits | Arbitrary resolution; C2PA soft-binding compatible |
| Audio | Meta AudioSeal | MIT | 16 bits/segment | Localised: detects watermarked seconds, not entire file |
| Video | Meta VideoSeal | MIT | 16 bits/frame-group | Temporal-propagation; survives re-encode at reasonable bitrates |
| Text | Google SynthID-Text | Apache 2.0 | Green/red token logits | Generation-time hook only; cannot watermark existing text |

Engine selection is set by the recipe field watermark.engine. Mixed-modality assets (e.g. video with audio) embed both VideoSeal and AudioSeal when the recipe requests it.
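
As a rough illustration of that selection rule, the sketch below maps each modality to its engine and runs only the engines that the recipe requests and that match a track in the asset. The lowercase engine identifiers and the function name are hypothetical; the values actually accepted by watermark.engine are defined by the recipe schema, not by this example.

```python
from typing import Iterable, List

# Engine per modality, taken from the engines table. The lowercase identifiers
# are hypothetical; the real values accepted by watermark.engine may differ.
ENGINE_BY_MODALITY = {
    "image": "trustmark",
    "audio": "audioseal",
    "video": "videoseal",
    "text": "synthid-text",
}


def engines_to_run(asset_modalities: Iterable[str],
                   requested_engines: Iterable[str]) -> List[str]:
    """Return the engines that are both requested and applicable to the asset."""
    modalities = set(asset_modalities)
    requested = set(requested_engines)
    return sorted(
        engine
        for modality, engine in ENGINE_BY_MODALITY.items()
        if modality in modalities and engine in requested
    )


# A video with an audio track embeds both watermarks only if both are requested.
both = engines_to_run({"video", "audio"}, {"videoseal", "audioseal"})
video_only = engines_to_run({"video", "audio"}, {"videoseal"})
```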

Payload format

Each watermark encodes a 16-byte ULID (watermark_id). The ULID maps to a manifest record in Postgres.

The raw payload also carries a 4-byte truncated HMAC-SHA-256 computed over (tenant_id, watermark_id). This prevents cross-tenant payload spoofing: because the tag is bound to the tenant, a watermark ID decoded from an asset cannot be claimed by a different tenant.

[ watermark_id: 16 bytes ][ hmac_truncated: 4 bytes ]
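
The layout above can be sketched in a few lines of Python. This is a minimal illustration, not the service's implementation: the key handling, the UTF-8 encoding of tenant_id, and the length-prefixed message layout are all assumptions, since this page only states that the HMAC covers (tenant_id, watermark_id).

```python
import hashlib
import hmac
from typing import Optional

WATERMARK_ID_LEN = 16  # binary ULID
TAG_LEN = 4            # truncated HMAC-SHA-256


def _tag(key: bytes, tenant_id: str, watermark_id: bytes) -> bytes:
    # Assumed message layout: 2-byte length prefix, tenant_id, then the ULID.
    # The length prefix keeps (tenant_id, watermark_id) pairs unambiguous.
    tenant = tenant_id.encode("utf-8")
    message = len(tenant).to_bytes(2, "big") + tenant + watermark_id
    return hmac.new(key, message, hashlib.sha256).digest()[:TAG_LEN]


def build_payload(key: bytes, tenant_id: str, watermark_id: bytes) -> bytes:
    """Return watermark_id followed by the 4-byte truncated tag (20 bytes)."""
    if len(watermark_id) != WATERMARK_ID_LEN:
        raise ValueError("watermark_id must be 16 bytes")
    return watermark_id + _tag(key, tenant_id, watermark_id)


def verify_payload(key: bytes, tenant_id: str, payload: bytes) -> Optional[bytes]:
    """Return the watermark_id if the tag verifies for this tenant, else None."""
    if len(payload) != WATERMARK_ID_LEN + TAG_LEN:
        return None
    watermark_id = payload[:WATERMARK_ID_LEN]
    tag = payload[WATERMARK_ID_LEN:]
    expected = _tag(key, tenant_id, watermark_id)
    # Constant-time comparison avoids leaking how many tag bytes matched.
    return watermark_id if hmac.compare_digest(tag, expected) else None


key = b"k" * 32                # placeholder key for the example
wid = bytes(range(16))         # placeholder 16-byte ULID
payload = build_payload(key, "tenant-a", wid)
```

Verifying the same payload as "tenant-a" returns the watermark ID, while verifying it as "tenant-b" returns None. Note that a 4-byte tag leaves roughly a 1 in 2^32 chance that a random tag is accepted, so it deters spoofing rather than providing full-strength authentication.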

Quality targets

| Modality | Metric | Target | Measurement |
| --- | --- | --- | --- |
| Image | PSNR | ≥ 42 dB | vs. original pre-watermark |
| Audio | PESQ | ≥ 3.0 (speech) | MOS-LQO on speech corpus |
| Video | VMAF | ≥ 90 | vs. source frame sequence |

Quality is measured automatically in CI for each model update. A regression below threshold fails the build.
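
For the image target, the check amounts to computing PSNR between the original and the watermarked pixels and comparing it with 42 dB. The sketch below uses the standard PSNR definition on flat sequences of 8-bit samples; the function names are illustrative, and the actual CI harness is not described on this page.

```python
import math
from typing import Sequence

PSNR_TARGET_DB = 42.0  # image target from the quality table


def psnr(original: Sequence[int], watermarked: Sequence[int], peak: int = 255) -> float:
    """PSNR in dB between two equal-length sequences of 8-bit samples."""
    if len(original) != len(watermarked) or len(original) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, watermarked)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def passes_image_gate(original: Sequence[int], watermarked: Sequence[int]) -> bool:
    """True when the watermarked image meets the PSNR target."""
    return psnr(original, watermarked) >= PSNR_TARGET_DB


flat = [100] * 64
subtle = [101] * 64   # every sample off by 1: about 48.1 dB
visible = [103] * 64  # every sample off by 3: about 38.6 dB
```

With these toy inputs, the one-level perturbation passes the gate and the three-level perturbation fails it.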

Robustness table

The following transformations are covered in the Public Beta robustness benchmark (make bench.image, make bench.audio, make bench.video).

Image (TrustMark)

| Attack | Detection rate |
| --- | --- |
| JPEG re-encode (quality ≥ 70) | ≥ 99% |
| PNG round-trip | 100% |
| Resize (0.5× to 2×) | ≥ 97% |
| Crop (≥ 50% area preserved) | ≥ 92% |
| Screenshot (desktop, 1× DPI) | ≥ 95% |
| WebP conversion | ≥ 98% |
| Brightness/contrast ±20% | ≥ 96% |
| JPEG quality 50–70 | ≥ 90% |

Audio (AudioSeal)

| Attack | Detection rate |
| --- | --- |
| MP3 re-encode (≥ 128 kbps) | ≥ 99% |
| AAC re-encode | ≥ 97% |
| Resample (44.1→22 kHz) | ≥ 95% |
| Noise addition (SNR ≥ 20 dB) | ≥ 96% |
| Speed change ±10% | ≥ 90% |

Video (VideoSeal)

| Attack | Detection rate |
| --- | --- |
| H.264 re-encode (CRF ≤ 28) | ≥ 98% |
| H.265 re-encode (CRF ≤ 28) | ≥ 97% |
| Resolution change (720p→480p) | ≥ 94% |
| Screen-record at 1× | ≥ 93% |

What watermarking does not cover

  • High-effort adversarial removal attacks (e.g. adversarial perturbation specifically targeting TrustMark)
  • Extreme re-encoding at very low bitrate (JPEG quality < 50, MP3 < 64 kbps)
  • Image attacks that reduce PSNR below 30 dB (the watermark may persist but image quality is unusable)
  • Audio pitch-shifting > ±20%
  • SynthID-Text: any post-generation transformation (paraphrase, translation, abbreviation)

Robustness benchmarks are gated in CI. Results are published in docs/WATERMARK-POLICY.md.
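
A gate of this kind can be pictured as follows: each attack yields a list of per-sample detection outcomes, and the build fails if any attack's detection rate falls below its threshold. The attack keys and function name are hypothetical, and only three thresholds from the image table are reproduced; this is not the benchmark's actual code.

```python
from typing import Dict, List

# Minimum detection rate in percent, copied from three rows of the image table.
# The dictionary keys are hypothetical labels for those rows.
THRESHOLD_PCT = {
    "jpeg_q70_plus": 99,
    "resize_0.5x_2x": 97,
    "crop_50pct": 92,
}


def failing_attacks(outcomes: Dict[str, List[bool]]) -> List[str]:
    """Return the attacks whose detection rate is below the threshold."""
    failures = []
    for attack, threshold in THRESHOLD_PCT.items():
        results = outcomes[attack]
        detected = sum(results)
        # Integer comparison avoids floating-point edge cases at the boundary.
        if detected * 100 < threshold * len(results):
            failures.append(attack)
    return sorted(failures)


run = {
    "jpeg_q70_plus": [True] * 99 + [False] * 1,    # 99%: passes
    "resize_0.5x_2x": [True] * 96 + [False] * 4,   # 96%: below 97%
    "crop_50pct": [True] * 92 + [False] * 8,       # 92%: passes
}
failed = failing_attacks(run)  # a non-empty list would fail the build
```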

SynthID-Text: generation-time only

SynthID-Text works by modifying token logit distributions at generation time. It must be integrated as a hook inside the LLM inference pipeline — it cannot watermark existing text. The recipe text-genai-disclosure-v1 documents the required hook interface.
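
To see why the watermark has to be applied during generation, consider the toy hook below. It is not SynthID-Text's algorithm and not the interface from text-genai-disclosure-v1; it is a simplified keyed green/red logit bias, with a stand-in "model" that emits flat logits. All names are hypothetical. The point is that the signal lives in which tokens get sampled, so text that was written without the hook carries no such bias.

```python
import hashlib
from typing import List


def is_green(key: bytes, prev_token: int, token: int) -> bool:
    """Keyed pseudo-random split of the vocabulary, reseeded by the previous token."""
    data = key + prev_token.to_bytes(4, "big") + token.to_bytes(4, "big")
    return hashlib.sha256(data).digest()[0] % 2 == 0


def watermark_hook(key: bytes, prev_token: int, logits: List[float],
                   bias: float = 2.0) -> List[float]:
    """Called once per decoding step, before sampling: boost 'green' tokens."""
    return [
        logit + bias if is_green(key, prev_token, token) else logit
        for token, logit in enumerate(logits)
    ]


def generate(key: bytes, length: int, vocab_size: int = 50) -> List[int]:
    """Greedy decoding over a stand-in model that returns flat logits."""
    tokens = [0]
    for _ in range(length):
        logits = watermark_hook(key, tokens[-1], [0.0] * vocab_size)
        tokens.append(max(range(vocab_size), key=logits.__getitem__))
    return tokens


def green_fraction(key: bytes, tokens: List[int]) -> float:
    """Detector: share of tokens that are green given their predecessor."""
    pairs = list(zip(tokens, tokens[1:]))
    return sum(is_green(key, prev, tok) for prev, tok in pairs) / len(pairs)


key = b"demo-key"
tokens = generate(key, 20)
score = green_fraction(key, tokens)
```

In this toy setup every generated token is green under the generating key, whereas text produced without the hook would be expected to score near one half.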

If your use case requires watermarking text that already exists (e.g. extracted OCR), use a C2PA manifest sidecar instead.

Watermark as a backup signal

The watermark is one layer of the provenance stack — not a standalone guarantee. If the watermark is present and the manifest is also present, the verification result is verified_manifest_and_watermark_match (highest confidence). If only the watermark is present, the result is watermark_only (lower confidence). If neither is present, the result is no_provenance.
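
The three outcomes named above can be written as a small decision function. This is a sketch of the mapping as described in this section only: it assumes that when both signals are present they agree, and it returns None for the manifest-without-watermark case, which this section does not name.

```python
from typing import Optional


def verification_state(manifest_present: bool, watermark_present: bool) -> Optional[str]:
    """Map the two signals to the result names used in this section."""
    if manifest_present and watermark_present:
        # Assumes the decoded watermark ID matches the manifest record.
        return "verified_manifest_and_watermark_match"
    if watermark_present:
        return "watermark_only"
    if not manifest_present:
        return "no_provenance"
    # Manifest present, watermark absent: not named in this section.
    return None
```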

See Verification States for the full signal-to-state mapping.