Functions/Properties
Setup & configuration
componentVersion: Retrieve the component version number.
setConfig: Process a configuration command.
Opening & closing PDF files
loadFile: Load a PDF file from disk.
loadFileWithPassword: Load a PDF file from disk, with a password.
loadStream: Load a PDF file from an OLE IStream object.
loadStreamWithPassword: Load a PDF file from an OLE IStream object, with a password.
closeFile: Close the currently open PDF file.
Extracting text
convertToTextFile: Convert pages to text and write to a file.
convertToTextString: Convert pages to text and return a string.
extractTextFromRect: Extract text from a rectangular region.
extractTextFromRect2: Extract text from a rectangular region.
Word lists
buildWordList: Construct a word list.
buildWordListFromRect2: Construct a word list for a rectangular region.
getPrimaryDirection: Get the primary writing direction of the word list.
getNumWords: Get the number of words on the word list.
getWord: Get a word handle.
getWordText: Get the text of a word.
getWordLength: Get the Unicode length of a word.
getWordFontName: Get the name of the font used by a word.
getWordColor: Get the color of a word.
getWordBox: Get the bounding box of a word.
getWordBox2: Get the bounding box of a word.
getWordCharBox: Get the bounding box of a character in a word.
getWordCharBox2: Get the bounding box of a character in a word.
getWordSpaceAfter: Check for a space after a word.
getWordFontSize: Get the font size used by a word.
getWordFontIsFixedWidth: Get the "fixed width" font flag for a word.
getWordFontIsSerif: Get the "serif" font flag for a word.
getWordFontIsSymbolic: Get the "symbolic" font flag for a word.
getWordFontIsItalic: Get the "italic" font flag for a word.
getWordFontIsBold: Get the "bold" font flag for a word.
getWordRotation: Get the rotation angle of a word.
getWordCharPos: Get the character position of a word.
getWordCharLen: Get the character length of a word.
getWordDirection: Get the writing direction of a word.
Text statistics
getNumVisibleChars: Get the number of visible chars on the most recent page.
getNumInvisibleChars: Get the number of invisible chars on the most recent page.
getNumRemovedDupChars: Get the number of removed duplicate chars on the most recent page.
Annotations
buildAnnotList: Construct an annotation list.
getNumAnnots: Get the number of annotations on the annotation list.
getAnnot: Get an annotation handle.
getAnnotType: Get the type of an annotation.
getAnnotRect: Get the bounding box of an annotation.
getAnnotContent: Get the content of an annotation.
Form fields and XFA data
getFormType: Get the type of form in the PDF file.
getNumFormFields: Get the number of form fields.
sortFormFields: Sort the form fields in row-major order.
getFormField: Get a form field handle.
getFormFieldType: Get the type of a form field.
getFormFieldName: Get the name of a form field.
getFormFieldBBox: Get a form field's bounding box.
getFormFieldMaxLength: Get the maximum length of a form field's value.
getFormFieldValue: Get the value of a form field.
extractXFAData: Extract XFA form data.
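For readers unfamiliar with the term, "row-major order" (as used in the sortFormFields entry) generally means top-to-bottom by row, then left-to-right within each row. The sketch below illustrates that idea on plain bounding boxes. It does not call the component and is not its implementation: the FieldBox type, the sort_row_major function, and the row_tolerance parameter are all hypothetical names introduced here, and the code assumes PDF-style coordinates in which y increases upward.

```python
from dataclasses import dataclass


@dataclass
class FieldBox:
    """Hypothetical stand-in for a form field's name and bounding box."""
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def sort_row_major(fields, row_tolerance=2.0):
    """Order fields top-to-bottom, then left-to-right within each row.

    Assumes y increases upward, so a larger y_max is higher on the page.
    Fields whose top edges lie within row_tolerance of the first field
    in a row are treated as belonging to that row (an assumption made
    for this sketch only).
    """
    # Visit fields from the top of the page downward.
    by_top = sorted(fields, key=lambda f: -f.y_max)

    # Group fields into rows by comparing top edges.
    rows = []
    for field in by_top:
        if rows and abs(rows[-1][0].y_max - field.y_max) <= row_tolerance:
            rows[-1].append(field)
        else:
            rows.append([field])

    # Within each row, order fields from left to right.
    ordered = []
    for row in rows:
        ordered.extend(sorted(row, key=lambda f: f.x_min))
    return ordered
```

The tolerance exists because fields that look aligned on the page often differ by a fraction of a unit in their coordinates; a strict comparison would split them into separate rows.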
Setting parameters
textEncoding: Set the encoding to use for text output.
readingOrderMode: Set text extraction mode to "reading order".
physicalLayoutMode: Set text extraction mode to "physical layout".
simpleLayoutMode: Set text extraction mode to "simple layout".
simple2LayoutMode: Set text extraction mode to "simple2 layout".
tableLayoutMode: Set text extraction mode to "table layout".
linePrinterMode: Set text extraction mode to "line printer".
rawMode: Set text extraction mode to "raw".
fixedPitch: Set the text pitch.
fixedLineSpacing: Set the text line spacing.
clipText: Separate clipped text from unclipped text.
discardDiagonalText: Discard diagonal text.
discardClippedText: Discard clipped text.
discardInvisibleText: Discard invisible text.
pageBreaks: Enable/disable page break characters between pages.
keepTinyChars: Keep tiny characters.
mapNumericCharNames: Map numeric character names to Unicode.
PDF file information
numPages: Get the number of pages.
getPageWidth: Get the width of the specified page.
getPageHeight: Get the height of the specified page.
getPageBoxXMax: Get the maximum x coordinate of the specified page box.
getPageBoxXMin: Get the minimum x coordinate of the specified page box.
getPageBoxYMax: Get the maximum y coordinate of the specified page box.
getPageBoxYMin: Get the minimum y coordinate of the specified page box.
getPageSize: Get the size of the specified page.
getPageRotation: Get the default rotation for the specified page.
getPageUserUnit: Get the UserUnit scaling factor for the specified page.
getFormType2: Get the type of form in the PDF file.
okToExtractText: Check to see if the PDF file allows text extraction.
okToPrint: Check to see if the PDF file allows printing.
okToChange: Check to see if the PDF file allows changing.
okToAddNotes: Check to see if the PDF file allows adding notes.
isTagged: Check to see if the PDF file is tagged.
fileIsDamaged: Check to see if the PDF file is damaged.
usesJavaScript: Returns true if the PDF document uses JavaScript.
Text-type Info Entries
getInfoString: Get the content of a document info field.
title: Get the document title.
subject: Get the document subject.
keywords: Get the document keywords.
author: Get the document author.
creator: Get the document creator.
producer: Get the document producer.
Date-type Info Entries
getInfoDate: Parse a document info field as a date.
getCreationDate: Get file creation date.
getModificationDate: Get file modification date.
Scanning the Info Entries
numInfoFields: Get the number of document info fields available in the PDF file.
getInfoFieldName: Get the name of a specified info field.
Layers
getNumLayers: Get the number of layers.
getLayer: Get a layer handle.
getLayerName: Get the name of a layer.
getLayerVisibility: Get the visibility state of a layer.
setLayerVisibility: Set the visibility state of a layer.
getLayerViewState: Get the suggested state of a layer for viewing mode.
getLayerPrintState: Get the suggested state of a layer for printing mode.
getLayerOrderRoot: Get the root of the layer display order tree.
getLayerOrderIsName: Check to see if a layer display order node is a name.
getLayerOrderName: Get the name of a layer display order node.
getLayerOrderLayer: Get the layer associated with a layer display order node.
getLayerOrderNumChildren: Get the number of children attached to a layer display order node.
getLayerOrderChild: Get a child of a layer display order node.
Embedded files
getNumEmbeddedFiles: Get the number of embedded files.
getEmbeddedFileName: Get the name of an embedded file.
saveEmbeddedFile: Save an embedded file.