
Why a Black Box Isn’t Real PDF Redaction

Visual masking can leave underlying PDF text intact.

LAST UPDATED: 2026-01-08

You’re sharing a PDF — a contract, a payslip, a medical record, a scanned passport. You hide the private details with black boxes, export the file, and assume the information is no longer there.

Often, that assumption is wrong.

In many PDFs, those black rectangles are only visual overlays. The underlying text may still exist inside the file — searchable, selectable, and recoverable depending on how the document was created and how it is opened. To the eye it looks censored, but to the software it may not be.

In this guide we cover masking vs redaction, famous failures, why PDFs are tricky, a safe workflow, and a 2-minute verification test.

Masking vs redaction: the difference that matters

Masking (what most people accidentally do):

  • Adds shapes/annotations over content
  • Changes what you see
  • Often does not remove what the file contains

Redaction (what you actually want):

  • Permanently removes the selected content from the document
  • Produces a sanitized output you can share with confidence

Think of it like this: masking changes what you see; redaction changes what the file contains.
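
To make the difference concrete, here is a minimal sketch using a toy, uncompressed page content stream. The sample data and the helper name `literal_strings_shown` are invented for illustration, and real PDFs usually compress and encode their streams, so treat this as a demonstration of the idea rather than an extraction tool. The first instruction draws text; the second paints a black rectangle over the same spot. The rectangle is added to the stream, but the text instruction is never taken away.

```python
import re

# Hypothetical sample data: a tiny, uncompressed PDF page content stream.
# Line 1 shows a text string; line 2 fills a black rectangle on top of it.
content_stream = (
    b"BT /F1 12 Tf 72 700 Td (SSN 123-45-6789) Tj ET\n"
    b"0 0 0 rg 70 695 150 16 re f\n"
)

def literal_strings_shown(stream: bytes) -> list[str]:
    """Return literal strings passed to the Tj (show text) operator.

    Deliberately simplified: ignores escapes, nested parentheses,
    hex strings, and TJ arrays.
    """
    matches = re.findall(rb"\(([^()\\]*)\)\s*Tj", stream)
    return [m.decode("latin-1") for m in matches]

# The black box changed what a viewer paints, not what the stream contains.
print(literal_strings_shown(content_stream))  # ['SSN 123-45-6789']
```

A true redaction would rewrite the stream so that the text instruction itself is gone.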

The Redaction Hall of Fame: three real-world failures

Each of these cases shows the same problem: documents that looked redacted, but still contained the original text inside the file.

  1. The Manafort filing (2019): black bars, real text. Reporting described how the text under the black bars in a court filing from Paul Manafort's defense team was still present in the file and could be copied out, exposing details the filing was meant to keep private.
    Takeaway: If your redaction is just a visual cover, the original content may still be there.
  2. Facebook / Six4Three (2018): a “technical glitch” with a long tail. WIRED reported that court documents were redacted improperly, and a technical glitch exposed underlying text when processed in another tool.
    Takeaway: You don’t control how other people’s tools interpret your PDF. If the content remains in the file, assume it can surface elsewhere.
  3. The New York Times Snowden PDF slip (2014): journalism under pressure, redaction under-tested. OpenNews noted that an NSA document was improperly redacted, revealing the name of an NSA agent.
    Takeaway: Even experienced teams can miss this when moving fast. A good workflow is simple, repeatable, and verified.

A PDF is more than a single page

It can contain:

  • text layers,
  • images,
  • annotations and comments,
  • form fields,
  • embedded files,
  • document properties (metadata).

So when you “cover” something, you may only be covering one layer, leaving the original untouched. That’s the core problem.
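
As a rough illustration of how many places content can live, the sketch below scans a file's raw bytes for the name tokens that PDF syntax uses for several of the components listed above. The function name `inventory` and the labels are this example's own, and the approach is a heuristic: objects stored in compressed object streams are not visible in the raw bytes, so a `False` result does not prove a component is absent.

```python
# PDF name tokens for common non-page content; labels are illustrative only.
MARKERS = {
    "annotations": b"/Annots",
    "form fields": b"/AcroForm",
    "embedded files": b"/EmbeddedFiles",
    "document info": b"/Info",
    "XMP metadata": b"/Metadata",
}

def inventory(pdf_bytes: bytes) -> dict[str, bool]:
    """Report which marker tokens appear anywhere in the raw file bytes.

    Heuristic only: compressed object streams can hide these tokens,
    so True means "present", but False only means "not seen".
    """
    return {label: token in pdf_bytes for label, token in MARKERS.items()}

# Usage with a hypothetical file name:
#   from pathlib import Path
#   print(inventory(Path("exported.pdf").read_bytes()))
```

Anything reported as present is a place to check before sharing. A dedicated PDF library gives a more reliable answer than byte scanning.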

A simple, reliable redaction workflow

  1. Redact (remove the content)
    Use a tool that performs true redaction and exports a new file. You mark the sensitive areas and apply the redaction so the underlying data is deleted, not just covered.
    Objective: the information is removed from the document, not merely hidden.
  2. Remove hidden information
    After redacting, clear the material people often overlook: author names, creation software, other document properties, comments, annotations, and form fields. Removing metadata reduces the chance that something unintended remains embedded in the file.
  3. Flatten (finalize the file)
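    Flattening typically merges annotations and form fields into the page content, leaving a single, fixed layout. It does not replace redaction: text that was only covered can still be in the file after flattening. What it does is help prevent casual editing once the sensitive content has already been removed.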
  4. Verify
    Always check the exported file. Search for removed text, try selecting or copying near redacted areas, and review document properties. This step catches the most common failures before the file is shared.

The 2-minute “before you send” confidence test

Do this on the exported file every time:

  • Search test: Search the PDF for the sensitive words you tried to remove (names, numbers, addresses). If they appear anywhere, stop.
  • Select test: Try selecting around the redacted area. If selection behaves as if there were still text “under” the black box, stop.
  • Copy/paste sanity check: Confirm that nothing sensitive can be copied from the redacted region using normal selection behavior.
  • Metadata check: Open document properties. If you see names, internal filenames, or software trails you don’t want to share, remove them.
  • Fresh viewer check: Open the PDF in a different viewer (phone preview counts). If anything looks layered or editable, stop.

Two minutes. Huge payoff.
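
The search test can also be scripted. The sketch below assumes you have already extracted the text of the exported PDF with a tool of your choice and passed it in as a string. The function names `find_leaks` and `_normalize` are hypothetical. Matching ignores case, spacing, and punctuation, so a number split across lines or written with different separators is still caught. This errs on the side of false alarms rather than misses.

```python
import re

def _normalize(s: str) -> str:
    # Case-fold and drop everything except letters and digits, so that
    # "123-45-6789", "123 45 6789", and line-wrapped text compare equal.
    return re.sub(r"[\W_]+", "", s.casefold())

def find_leaks(extracted_text: str, sensitive_terms: list[str]) -> list[str]:
    """Return the sensitive terms that still appear in the extracted text."""
    haystack = _normalize(extracted_text)
    leaks = []
    for term in sensitive_terms:
        needle = _normalize(term)
        if needle and needle in haystack:
            leaks.append(term)
    return leaks

# Hypothetical example: text pulled from an exported PDF.
text = "Employee: Jane\nDoe, SSN 123 45 6789"
print(find_leaks(text, ["Jane Doe", "123-45-6789", "Acme Corp"]))
# ['Jane Doe', '123-45-6789']
```

An empty result only means the extractor did not see the terms. The select, metadata, and fresh viewer checks still apply.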

FAQ

Is drawing a black box over a PDF safe?

Not reliably. In many PDFs, a black box is only a visual overlay — the underlying text can remain inside the file.

Is “flatten PDF” the same as redaction?

No. Flattening merges elements such as annotations and form fields into the page; it changes how the PDF is packaged, not what it contains. Redaction removes the content itself.

What’s the safest order?

Redact → remove hidden info/metadata → flatten → verify.

How can I quickly check if a redaction actually worked?

Run a simple verification pass on the exported file: search for the removed terms, try selecting/copying near the redacted area, review document properties (metadata), and open it in a different viewer. If anything sensitive shows up, the redaction didn’t hold.