getNumInvisibleChars

Get the number of invisible chars on the most recent page.
getNumInvisibleChars([out, retval] int *n)
This function returns the number of invisible characters on the most recently converted page or region, i.e., the last page from the last call to convertToTextFile, convertToTextString, extractTextFromRect, extractTextFromRect2, buildWordList, or buildWordListFromRect2.

This function, along with getNumVisibleChars and getNumRemovedDupChars, is useful for detecting problematic scanned pages. In "electronic" (non-scanned) PDF files, all of the text is visible, so there are zero invisible characters. Removed duplicate characters usually come from "fake boldface" text, and the number of removed duplicates is small. Invisible characters are used in scanned PDF files, where invisible OCR text is overlaid on top of the scanned image. If an electronic PDF file is OCRed, it can end up with both visible and invisible characters.

VB:
nVis = pdf.getNumVisibleChars()
nInvis = pdf.getNumInvisibleChars()
nDup = pdf.getNumRemovedDupChars()
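
The counts above can be turned into a rough page classification. The following is only a sketch: ClassifyPage is a hypothetical helper, not part of the component, and the category names and the rule that any invisible character indicates OCR text are assumptions based on the description above, not documented behavior.

```vb
' Hypothetical helper (not part of the library): classify a page from the
' visible and invisible character counts of the most recently converted page.
Function ClassifyPage(nVis, nInvis)
    If nVis = 0 And nInvis = 0 Then
        ' No text at all: a blank page, or a scan with no OCR layer.
        ClassifyPage = "no text"
    ElseIf nInvis = 0 Then
        ' All text is visible: an "electronic" (non-scanned) page.
        ClassifyPage = "electronic"
    ElseIf nVis = 0 Then
        ' Only invisible text: OCR text overlaid on a scanned image.
        ClassifyPage = "scanned with OCR"
    Else
        ' Both kinds present, e.g. an electronic page that was also OCRed.
        ClassifyPage = "mixed"
    End If
End Function

' Example use, after converting a page with one of the functions listed above:
' kind = ClassifyPage(pdf.getNumVisibleChars(), pdf.getNumInvisibleChars())
```

Pages classified as "mixed" or "scanned with OCR" may deserve a closer look, since their extracted text depends on OCR quality.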
getNumVisibleChars
getNumRemovedDupChars