pdfExtractTextFromPage
Extract text from a specified rectangle.
char *pdfExtractTextFromPage(PDFViewerHandle viewer, int page,
                             double x0, double y0, double x1, double y1,
                             int *length)
This function returns the text inside a specified rectangle on a
specified page. The rectangle coordinates are in the PDF coordinate
space (see pdfConvertWindowToPDFCoords2).
The text will be in the encoding set with pdfSetTextEncoding.
The string will be null-terminated, but note that it may contain 0x00
bytes (depending on the current text encoding). See pdfSetTextEncoding.
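Because the returned text may contain 0x00 bytes, functions such as strlen can stop early. The sketch below shows one way to handle the buffer by its byte count instead. It assumes that the value stored through the length argument is the number of bytes in the returned text, not counting the final terminator; this page does not state that explicitly. The helper names count_embedded_nuls and copy_text_bytes are hypothetical and are not part of the library.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count the 0x00 bytes within the first `length`
 * bytes of `text`.
 * Assumption: `length` is the byte count reported through the `length`
 * argument, excluding the final terminator. */
static int count_embedded_nuls(const char *text, int length)
{
    int count = 0;
    int i;

    for (i = 0; i < length; i++) {
        if (text[i] == '\0') {
            count++;
        }
    }
    return count;
}

/* Hypothetical helper: copy exactly `length` bytes into a malloc'd
 * buffer and append a single 0x00 byte. Uses memcpy rather than strcpy
 * so that embedded 0x00 bytes are preserved.
 * Returns NULL on invalid arguments or allocation failure.
 * The copy is released with free(), not pdfFreeMemory. */
static char *copy_text_bytes(const char *text, int length)
{
    char *copy;

    if (text == NULL || length < 0) {
        return NULL;
    }
    copy = malloc((size_t)length + 1);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, text, (size_t)length);
    copy[length] = '\0';
    return copy;
}
```

With these helpers, a caller could pass the pointer returned by pdfExtractTextFromPage together with the reported length to copy_text_bytes, and then release the original with pdfFreeMemory right away. If count_embedded_nuls returns a nonzero value, the text should not be passed to functions that expect an ordinary C string.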
The caller is responsible for calling pdfFreeMemory on the returned
string.
Returns NULL if no file is open or if text extraction is not allowed.
C:
int page;
double x0, y0, x1, y1;
char *text;
int length;

if (pdfGetCurrentSelection2(viewer, &page, &x0, &y0, &x1, &y1)) {
    text = pdfExtractTextFromPage(viewer, page, x0, y0, x1, y1, &length);
    if (text) {
        /* do something with the text ... */
        pdfFreeMemory(text);
    } else {
        /* no file open, or text extraction not allowed ... */
    }
} else {
    /* no current selection ... */
}