pdfExtractTextFromPage

Extract text from a specified rectangle.
char *pdfExtractTextFromPage(PDFViewerHandle viewer, int page, double x0, double y0, double x1, double y1, int *length)
This function returns the text inside a specified rectangle on a specified page. The rectangle coordinates are in the PDF coordinate space (see pdfConvertWindowToPDFCoords2).

The text will be in the encoding set with pdfSetTextEncoding.

The string is null-terminated, but it may also contain embedded 0x00 bytes, depending on the current text encoding (see pdfSetTextEncoding), so functions such as strlen may stop before the end of the text.
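
Because of the embedded 0x00 bytes, the returned buffer is safer to handle by byte count than with C string functions. The following is a minimal sketch, assuming that the value stored through the length argument is the size of the returned text in bytes (the description here does not state this explicitly). copyExtractedText is a hypothetical helper, not part of the viewer API.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the viewer API): copy `length` bytes of
 * extracted text into a newly allocated buffer, preserving embedded 0x00 bytes.
 *
 * Assumption: `length` is the byte count reported through the `length`
 * argument of pdfExtractTextFromPage.
 *
 * Returns NULL on invalid input or allocation failure. The copy is allocated
 * with malloc, so release it with free(), not pdfFreeMemory.
 */
char *copyExtractedText(const char *text, int length)
{
    char *copy;

    if (text == NULL || length < 0) {
        return NULL;
    }

    /* Two trailing zero bytes terminate both 8-bit and 16-bit encodings. */
    copy = (char *)malloc((size_t)length + 2);
    if (copy == NULL) {
        return NULL;
    }

    /* memcpy copies by count, so it does not stop at embedded zero bytes. */
    memcpy(copy, text, (size_t)length);
    copy[length] = '\0';
    copy[length + 1] = '\0';
    return copy;
}
```

A caller would invoke copyExtractedText(text, length) on the returned string and then release the original with pdfFreeMemory(text), leaving the copy valid afterwards.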

The caller is responsible for calling pdfFreeMemory on the returned string.

Returns NULL if no file is open or if text extraction is not allowed (see pdfOkToExtractText).

C:
int page;
double x0, y0, x1, y1;
char *text;
int length;

if (pdfGetCurrentSelection2(viewer, &page, &x0, &y0, &x1, &y1)) {
    text = pdfExtractTextFromPage(viewer, page, x0, y0, x1, y1, &length);
    if (text) {
        /* do something with the text ... */
        pdfFreeMemory(text);
    } else {
        /* not allowed to extract text ... */
    }
} else {
    /* no current selection ... */
}

See also:
pdfSetTextEncoding
pdfSetDiscardDiagonalText
pdfOkToExtractText
pdfConvertWindowToPDFCoords2