Fonts
Each PDF font object is converted to a font element, which word
elements reference to identify the font they use.
The font name is the name as given in the PDF font object. It may
include a font subset tag (e.g., "AAAAAA+Times-Roman").
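If the base font name is needed without the subset tag, the tag can be
stripped after the fact. The following is a minimal sketch, not part of
PDFdeconstruct; strip_subset_tag is a hypothetical helper name. It
relies on the PDF convention that a subset tag is six uppercase letters
followed by a plus sign.

```python
import re

# A subset tag is six uppercase ASCII letters followed by "+".
_SUBSET_TAG = re.compile(r"^[A-Z]{6}\+")


def strip_subset_tag(font_name: str) -> str:
    """Return font_name without a leading subset tag, if one is present."""
    return _SUBSET_TAG.sub("", font_name)


print(strip_subset_tag("AAAAAA+Times-Roman"))
print(strip_subset_tag("Helvetica"))
```

Names that do not start with a subset tag are returned unchanged.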
The font type is one of:
- "Type 1"
- "Type 1C"
- "Type 1C (OT)"
- "Type 3"
- "TrueType"
- "TrueType (OT)"
- "CID Type 0"
- "CID Type 0C"
- "CID Type 0C (OT)"
- "CID TrueType"
- "CID TrueType (OT)"
For embedded fonts, there will be an associated font file in the
output directory. Its name is given by the file attribute. The font
file will be in its native format, i.e., PDFdeconstruct does not do
any font type conversion.
For non-embedded fonts, there will be no file attribute.
The problematicForUnicode attribute value is either "yes" or "no". A
"yes" value indicates that the font is likely to be problematic when
converting text to Unicode. Note that this is a heuristic; it's
impossible to automatically detect problematic fonts with 100%
certainty.
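The file and problematicForUnicode attributes can be read with any XML
parser. The sketch below uses Python's standard library. Only those two
attribute names come from the description above; the fonts wrapper
element, the name and type attribute names, and the font file name in
the sample are assumptions made for illustration, so check them against
the actual output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: only the "file" and "problematicForUnicode"
# attributes come from the text above. The <fonts> wrapper, the "name"
# and "type" attribute names, and the file name are assumptions.
SAMPLE = """\
<fonts>
  <font name="AAAAAA+Times-Roman" type="Type 1C"
        file="font-0001.cff" problematicForUnicode="no"/>
  <font name="Helvetica" type="Type 1"
        problematicForUnicode="yes"/>
</fonts>
"""


def summarize_fonts(root):
    """Return one dict per font element found under root."""
    summary = []
    for font in root.iter("font"):
        font_file = font.get("file")  # None when the attribute is absent
        summary.append({
            "name": font.get("name"),
            "embedded": font_file is not None,
            "file": font_file,
            "problematic": font.get("problematicForUnicode") == "yes",
        })
    return summary


for entry in summarize_fonts(ET.fromstring(SAMPLE)):
    print(entry)
```

A font with no file attribute is reported as non-embedded, matching the
rule above.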
The size elements list all of the sizes at which this font was used,
i.e., all unique fontSize values from word elements referencing this
font.
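The relationship between size elements and word elements can be
illustrated by recomputing the size list from the words. This is a
sketch under assumptions, not PDFdeconstruct's own code: it assumes
that fontSize is an attribute of word, that each word carries a
hypothetical font attribute holding the font reference, and that the
words sit inside a page element. Only the word and fontSize names come
from the description above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample: the <page> wrapper and the "font" reference
# attribute are assumptions; "word" and "fontSize" come from the text.
SAMPLE = """\
<page>
  <word font="f1" fontSize="12">Hello</word>
  <word font="f1" fontSize="10">small</word>
  <word font="f1" fontSize="12">again</word>
  <word font="f2" fontSize="9">note</word>
</page>
"""


def sizes_by_font(root):
    """Map each font reference to its sorted list of unique sizes."""
    sizes = defaultdict(set)
    for word in root.iter("word"):
        font_ref = word.get("font")
        size = word.get("fontSize")
        if font_ref is None or size is None:
            continue  # skip words that lack either attribute
        sizes[font_ref].add(float(size))
    return {ref: sorted(values) for ref, values in sizes.items()}


print(sizes_by_font(ET.fromstring(SAMPLE)))
```

Duplicate sizes collapse into a single entry, so each font ends up with
one value per distinct size.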