discardInvisibleText

Discard invisible text.
[read/write property]
discardInvisibleText([out, retval] VARIANT_BOOL *discard)
discardInvisibleText([in] VARIANT_BOOL discard)

PDF files can contain "invisible" text. It's typically used when a PDF file is OCRed: the original image is displayed, with invisible text overlaid so that search and copy/paste work.

If this property is set to true, XpdfText will discard any invisible text. This isn't generally useful: most of the time you'll want the OCRed text. But when a PDF file that already contains real (non-image) text is OCRed, it can end up with duplicated text, and this option will discard the OCR copy.

There are two different ways to represent invisible text in PDF:

  1. by setting the text rendering mode to 3 (invisible)
  2. by setting the alpha (opacity) to 0, i.e., fully transparent

discardInvisibleText will discard text drawn with either of those techniques.

It's also possible to hide text in other ways, e.g., by placing it behind the image. discardInvisibleText does not discard that text.
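
The rule described above can be sketched as a small predicate. This is only an illustration in Python, not XpdfText code: the TextRun class, its fields, and both helper functions are hypothetical names invented for the example, and the source does not say which alpha value (fill or stroke) the component actually checks.

```python
from dataclasses import dataclass


# Hypothetical model of one text-drawing operation; NOT part of the XpdfText API.
@dataclass
class TextRun:
    text: str
    render_mode: int = 0         # text rendering mode; 3 means invisible
    alpha: float = 1.0           # opacity; 0 means fully transparent
    behind_image: bool = False   # hidden by other means (covered by an image)


def is_invisible(run: TextRun) -> bool:
    """True if the run uses either of the two techniques listed above."""
    return run.render_mode == 3 or run.alpha == 0.0


def discard_invisible(runs):
    """Keep only the runs that the predicate does not flag."""
    return [r for r in runs if not is_invisible(r)]


runs = [
    TextRun("visible"),
    TextRun("ocr layer", render_mode=3),
    TextRun("transparent", alpha=0.0),
    TextRun("covered", behind_image=True),
]
kept = [r.text for r in discard_invisible(runs)]
print(kept)  # ['visible', 'covered']
```

Note that the "covered" run survives: text hidden behind an image matches neither technique, so it is kept.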

VB6:
pdf.discardInvisibleText = True
clipText