pdfSetDiscardInvisibleText

Discard invisible text.
void pdfSetDiscardInvisibleText(int discard)
PDF files can contain "invisible" text. It's typically used when a PDF file is OCRed: the original image is displayed, with invisible text overlaid so that search and copy/paste work.

If this function is called with a non-zero argument, XpdfText will discard any invisible text. This isn't generally useful – most of the time you'll want the OCRed text. But when a PDF file that already contains real (non-image) text is OCRed, it can end up with duplicated text: the original text plus the invisible OCR layer. In that case, this option discards the OCR copy.

There are two different ways to represent invisible text in PDF:

  1. by setting the text "render mode" to 3 (invisible)
  2. by setting the alpha (transparency) to 0

pdfSetDiscardInvisibleText will discard text drawn with either of those techniques.

It's also possible to hide text in other ways, e.g., by placing it behind the image. pdfSetDiscardInvisibleText does not discard that text.
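
As a rough illustration of the rule described above, the check can be modeled as a small predicate. This is only a sketch, not XpdfText's actual implementation: the TextState struct, its fields, and is_discardable_invisible are hypothetical names invented for this example, and the exact alpha comparison the library uses is an assumption.

```c
#include <stdbool.h>

/* Hypothetical model of the state in effect when a piece of text is drawn.
   None of these names are part of the XpdfText API. */
typedef struct {
    int    render_mode;   /* PDF text render mode; 3 means invisible */
    double alpha;         /* 0.0 = fully transparent, 1.0 = opaque */
    bool   behind_image;  /* text is hidden behind an image */
} TextState;

/* Returns true if the text would be discarded under the rule described
   above: render mode 3, or alpha equal to 0 (assumed exact comparison). */
bool is_discardable_invisible(const TextState *ts)
{
    if (ts->render_mode == 3) {
        return true;   /* technique 1: invisible render mode */
    }
    if (ts->alpha == 0.0) {
        return true;   /* technique 2: fully transparent */
    }
    /* Text hidden some other way (e.g., behind an image) is not discarded;
       behind_image is deliberately ignored. */
    return false;
}
```

In this model, text with render mode 0 and alpha 1.0 that sits behind an image is still kept, which matches the limitation noted above.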

This setting is global - it applies to all PDF handles.

C:
pdfSetDiscardInvisibleText(1);