Images and Vector Graphics

Overview

After the column elements come the image, imagemask, fill, and stroke elements, which describe the non-text content on the page.

These elements appear in the order in which the objects were drawn in the PDF content stream (unlike the text, which is reordered into columns, paragraphs, etc.).

Images

Each placed image is represented by an image element:
<image imagedata="image-tag" llx="lower-left-x" lly="lower-left-y" urx="upper-right-x" ury="upper-right-y"/>
The bounding box (llx, lly, urx, ury) is identical in format to the word bounding box. Units are controlled by the -unit option.

The image tag in the imagedata attribute refers to an imagedata element in the resources element. Multiple images (on the same or different pages) can have the same image tag, i.e., they can share the same image data. The same image data can be drawn in different positions and/or with a different size.
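
Because several placements can share one image tag, it is often useful to group image elements by their imagedata attribute. The following is a minimal Python sketch under stated assumptions: the wrapping page element, the tag names img0 and img1, and all coordinate values are made up for illustration, and placements_by_tag is a hypothetical helper, not part of any tool.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample: two placements share the image tag "img0".
# The <page> wrapper and all values are assumptions for illustration.
SAMPLE_IMAGES = """
<page>
  <image imagedata="img0" llx="72" lly="600" urx="172" ury="700"/>
  <image imagedata="img0" llx="300" lly="100" urx="500" ury="300"/>
  <image imagedata="img1" llx="72" lly="72" urx="144" ury="144"/>
</page>
"""

def placements_by_tag(root):
    """Map each image tag to the list of (llx, lly, urx, ury) boxes using it."""
    groups = defaultdict(list)
    for img in root.iter("image"):
        box = tuple(float(img.get(k)) for k in ("llx", "lly", "urx", "ury"))
        groups[img.get("imagedata")].append(box)
    return dict(groups)

groups = placements_by_tag(ET.fromstring(SAMPLE_IMAGES))
for tag, boxes in groups.items():
    for llx, lly, urx, ury in boxes:
        # Same image data, possibly drawn at a different position and size.
        print(tag, "width", urx - llx, "height", ury - lly)
```

The widths and heights are in whatever units the -unit option selected; the sketch does not convert them.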

Image masks

Image masks are monochrome (1-bit) bitmaps that are filled with a particular color:
<imagemask imagedata="image-tag" llx="lower-left-x" lly="lower-left-y" urx="upper-right-x" ury="upper-right-y"> <color type="rgb" r="red" g="green" b="blue"/> </imagemask>
The bounding box and image tag have the same meaning as in image elements. An additional color child element gives the fill color (see the description of color for word elements).

Image mask data is a 1-bit image, with no inherent color. The same image mask data can be drawn in multiple places, at different sizes, and with different colors.
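
As a rough illustration, an imagemask element can be read into its image tag, bounding box, and fill color as sketched below. The sample values, the tag name mask0, and the helper read_imagemask are hypothetical; in particular, the numeric range of the r, g, and b attributes is not stated here, so the sketch only parses them as floats.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the attribute values are made up for illustration.
SAMPLE_MASK = """
<imagemask imagedata="mask0" llx="10" lly="20" urx="110" ury="70">
  <color type="rgb" r="1" g="0" b="0"/>
</imagemask>
"""

def read_imagemask(elem):
    """Return (image tag, bounding box, rgb tuple or None) for an imagemask."""
    box = tuple(float(elem.get(k)) for k in ("llx", "lly", "urx", "ury"))
    rgb = None
    color = elem.find("color")
    if color is not None and color.get("type") == "rgb":
        rgb = tuple(float(color.get(c)) for c in ("r", "g", "b"))
    return elem.get("imagedata"), box, rgb

mask_tag, mask_box, mask_rgb = read_imagemask(ET.fromstring(SAMPLE_MASK))
print(mask_tag, mask_box, mask_rgb)
```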

Fills

Fill operations are represented by fill elements:
<fill rule="fill-rule"> <color type="rgb" r="red" g="green" b="blue"/> <path>path</path> </fill>
The fill rule will be either "eo" for the even-odd rule, or "nzwn" for the non-zero winding number rule.

The color element is the same as used with words and image masks.

Paths are described below.
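
When converting fills to another vector format, the two fill rule values usually map one-to-one. The sketch below, given only as an example, translates them to the SVG fill-rule keywords "evenodd" and "nonzero"; the choice of SVG as a target and the helper name svg_fill_rule are assumptions, not something this format prescribes.

```python
import xml.etree.ElementTree as ET

# Mapping from the rule attribute to SVG fill-rule keywords (SVG is an
# assumed target format, chosen only for illustration).
SVG_FILL_RULE = {"eo": "evenodd", "nzwn": "nonzero"}

def svg_fill_rule(fill_elem):
    """Translate a fill element's rule attribute to an SVG fill-rule keyword."""
    rule = fill_elem.get("rule")
    if rule not in SVG_FILL_RULE:
        raise ValueError(f"unexpected fill rule: {rule!r}")
    return SVG_FILL_RULE[rule]

even_odd = svg_fill_rule(ET.fromstring('<fill rule="eo"/>'))
non_zero = svg_fill_rule(ET.fromstring('<fill rule="nzwn"/>'))
print(even_odd, non_zero)
```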

Strokes

Stroke operations (poly-lines) are represented by stroke elements:
<stroke width="line-width" cap="line-cap-style" join="line-join-style"> <dash phase="dash-phase"> <dashelem value="dash-element"/> ... </dash> <color type="rgb" r="red" g="green" b="blue"/> <path>path</path> </stroke>
The line width is given in the units specified by the -unit option.

The line cap style is one of:

The line join style is one of:

The dash element describes the line dash pattern. There will be zero or more dashelem children, each with a value describing the length of a dash or gap. See the PDF (or PostScript) reference manual for a more detailed explanation. A plain, undashed line will have an empty dash pattern:

<dash phase="0"> </dash>
The dashelem values are in the units specified by the -unit option.
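
The dash pattern can be read as a phase plus a list of alternating dash and gap lengths, with an empty list meaning a solid line. The following Python sketch assumes that reading; read_dash is a hypothetical helper and the sample values are made up.

```python
import xml.etree.ElementTree as ET

def read_dash(dash_elem):
    """Return (phase, [lengths...]) for a dash element; [] means undashed."""
    phase = float(dash_elem.get("phase", "0"))
    pattern = [float(d.get("value")) for d in dash_elem.findall("dashelem")]
    return phase, pattern

# A plain, undashed line: no dashelem children.
solid = read_dash(ET.fromstring('<dash phase="0"> </dash>'))

# Hypothetical dashed line: lengths 3 and 1, starting 2 units into the pattern.
dashed = read_dash(ET.fromstring(
    '<dash phase="2"> <dashelem value="3"/> <dashelem value="1"/> </dash>'
))

print(solid, dashed)
```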

Paths

Paths are used in both fill and stroke elements. A path consists of one or more subpaths. Each subpath starts with a moveto, which is followed by a sequence of lineto and/or curveto operations.
<path> <subpath closed="closed"> <moveto x="x" y="y"/> <lineto x="x" y="y"/> <curveto x1="x1" y1="y1" x2="x2" y2="y2" x3="x3" y3="y3"/> ... </subpath> ... </path>
Each subpath is marked as closed (closed="true") or open (closed="false"). (For fill operations, all subpaths will be implicitly closed, so the closed attribute can be ignored.)

All coordinates (x, y, x1, y1, x2, y2, x3, y3) are in the units specified by the -unit option.
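
To make the path structure concrete, the sketch below walks a path element and emits an SVG-style path data string. It assumes that curveto describes a cubic Bezier segment with control points (x1, y1) and (x2, y2) and end point (x3, y3), as with the PDF curve operator; that assumption, the SVG target, the sample coordinates, and the helper path_to_svg_d are all illustrative. Coordinates are passed through unchanged, so no unit conversion or axis flip is applied.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample path with one closed and one open subpath.
SAMPLE_PATH = """
<path>
  <subpath closed="true">
    <moveto x="0" y="0"/>
    <lineto x="10" y="0"/>
    <curveto x1="15" y1="0" x2="20" y2="5" x3="20" y3="10"/>
  </subpath>
  <subpath closed="false">
    <moveto x="1" y="1"/>
    <lineto x="2" y="2"/>
  </subpath>
</path>
"""

def path_to_svg_d(path_elem, force_closed=False):
    """Build an SVG path data string; force_closed=True suits fill paths."""
    parts = []
    for sub in path_elem.findall("subpath"):
        for op in sub:
            if op.tag == "moveto":
                parts.append(f"M {op.get('x')} {op.get('y')}")
            elif op.tag == "lineto":
                parts.append(f"L {op.get('x')} {op.get('y')}")
            elif op.tag == "curveto":
                keys = ("x1", "y1", "x2", "y2", "x3", "y3")
                parts.append("C " + " ".join(op.get(k) for k in keys))
        # Fills close every subpath implicitly; strokes honor the attribute.
        if force_closed or sub.get("closed") == "true":
            parts.append("Z")
    return " ".join(parts)

path_elem = ET.fromstring(SAMPLE_PATH)
stroke_d = path_to_svg_d(path_elem)
fill_d = path_to_svg_d(path_elem, force_closed=True)
print(stroke_d)
print(fill_d)
```

Passing force_closed=True mirrors the rule that the closed attribute can be ignored for fill operations.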