Package org.apache.pdfbox.text
Class LegacyPDFStreamEngine
java.lang.Object
  org.apache.pdfbox.contentstream.PDFStreamEngine
    org.apache.pdfbox.text.LegacyPDFStreamEngine
Direct Known Subclasses:
PDFMarkedContentExtractor, PDFTextStripper
class LegacyPDFStreamEngine extends PDFStreamEngine
LEGACY text calculations which are known to be incorrect but are depended on by PDFTextStripper. This class exists only so that we don't break the code of users who have their own subclasses of PDFTextStripper. It replaces the mostly empty implementation of showGlyph() in PDFStreamEngine with a heuristic implementation which is backwards compatible. DO NOT USE THIS CODE UNLESS YOU ARE WORKING WITH PDFTextStripper. THIS CODE IS DELIBERATELY INCORRECT, USE PDFStreamEngine INSTEAD.
-
-
Field Summary
Fields
Modifier and Type                               Field
private GlyphList                               glyphList
private static org.apache.commons.logging.Log   LOG
private int                                     pageRotation
private PDRectangle                             pageSize
private Matrix                                  translateMatrix
-
Constructor Summary
Constructors
Constructor                Description
LegacyPDFStreamEngine()    Constructor.
-
Method Summary
Modifier and Type, Method, Description
void processPage(PDPage page)
    This will initialize and process the contents of the stream.
protected void processTextPosition(TextPosition text)
    A method provided as an event interface to allow a subclass to perform some specific functionality when text needs to be processed.
protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement)
    Called when a glyph is to be processed.
Methods inherited from class org.apache.pdfbox.contentstream.PDFStreamEngine
addOperator, applyTextAdjustment, beginMarkedContentSequence, beginText, decreaseLevel, endMarkedContentSequence, endText, getAppearance, getCurrentPage, getGraphicsStackSize, getGraphicsState, getInitialMatrix, getLevel, getResources, getTextLineMatrix, getTextMatrix, increaseLevel, operatorException, processAnnotation, processChildStream, processOperator, processOperator, processSoftMask, processTilingPattern, processTilingPattern, processTransparencyGroup, processType3Stream, registerOperatorProcessor, restoreGraphicsStack, restoreGraphicsState, saveGraphicsStack, saveGraphicsState, setLineDashPattern, setTextLineMatrix, setTextMatrix, showAnnotation, showFontGlyph, showFontGlyph, showForm, showGlyph, showText, showTextString, showTextStrings, showTransparencyGroup, showType3Glyph, showType3Glyph, transformedPoint, transformWidth, unsupportedOperator
Field Detail
-
LOG
private static final org.apache.commons.logging.Log LOG
-
pageRotation
private int pageRotation
-
pageSize
private PDRectangle pageSize
-
translateMatrix
private Matrix translateMatrix
-
glyphList
private final GlyphList glyphList
-
-
Method Detail
-
processPage
public void processPage(PDPage page) throws java.io.IOException
This will initialize and process the contents of the stream.
Overrides:
processPage in class PDFStreamEngine
Parameters:
page - the page to process
Throws:
java.io.IOException - if there is an error accessing the stream.
-
showGlyph
protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, Vector displacement) throws java.io.IOException
Called when a glyph is to be processed. The heuristic calculations here were originally written by Ben Litchfield for PDFStreamEngine.
Overrides:
showGlyph in class PDFStreamEngine
Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
displacement - the displacement (i.e. advance) of the glyph in text space
Throws:
java.io.IOException - if the glyph cannot be processed
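To make the relationship between the displacement and Trm parameters concrete, the following is a minimal sketch of how a text-space advance could be carried into device space. It is not the legacy heuristic implemented by this class. It uses java.awt.geom.AffineTransform as a stand-in for Matrix, and it assumes that Trm already folds in the font size and that the displacement is a unitless text-space vector. The class name GlyphAdvanceSketch and the method deviceAdvance are hypothetical.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

/** Hypothetical sketch; AffineTransform stands in for PDFBox's Matrix. */
class GlyphAdvanceSketch {

    /**
     * Maps a text-space displacement through the (stand-in) text rendering
     * matrix. deltaTransform applies only the linear part of the transform,
     * so the translation of Trm does not affect the resulting advance.
     */
    static Point2D deviceAdvance(AffineTransform trm, double dx, double dy) {
        return trm.deltaTransform(new Point2D.Double(dx, dy), null);
    }

    public static void main(String[] args) {
        // Assumed Trm: 12 pt font, no rotation, text origin at (72, 700).
        AffineTransform trm = new AffineTransform(12, 0, 0, 12, 72, 700);

        // Assumed glyph displacement of 0.5 text-space units along x.
        Point2D advance = deviceAdvance(trm, 0.5, 0.0);

        // 12 * 0.5 gives an advance of 6 units along x and 0 along y.
        System.out.println(advance.getX() + ", " + advance.getY());
    }
}
```

Under these assumptions the translation component of Trm locates the glyph origin, while the scaled displacement gives the distance to the next glyph origin.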
-
processTextPosition
protected void processTextPosition(TextPosition text)
A method provided as an event interface to allow a subclass to perform some specific functionality when text needs to be processed.
Parameters:
text - The text to be processed.
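The event-interface pattern described above can be sketched without PDFBox on the classpath. Every type below (Glyph, SketchEngine, CollectingEngine, TextHookDemo) is a hypothetical stand-in written for illustration, not a PDFBox class: the engine walks the glyphs and calls an overridable hook for each one, and a subclass overrides that hook to collect the text.

```java
import java.util.Arrays;
import java.util.List;

/** Demo entry point; all types in this file are hypothetical stand-ins. */
class TextHookDemo {
    public static void main(String[] args) {
        CollectingEngine engine = new CollectingEngine();
        engine.run(Arrays.asList(new Glyph("H", 10f, 700f), new Glyph("i", 18f, 700f)));
        System.out.println(engine.collected()); // prints "Hi"
    }
}

/** Stand-in for TextPosition: a decoded glyph and where it sits on the page. */
final class Glyph {
    final String unicode;
    final float x;
    final float y;

    Glyph(String unicode, float x, float y) {
        this.unicode = unicode;
        this.x = x;
        this.y = y;
    }
}

/** Stand-in for the engine: fires the hook once per glyph it processes. */
class SketchEngine {
    final void run(List<Glyph> glyphs) {
        for (Glyph glyph : glyphs) {
            processTextPosition(glyph);
        }
    }

    /** Event hook; does nothing in this sketch unless a subclass overrides it. */
    protected void processTextPosition(Glyph text) {
        // intentionally empty
    }
}

/** Subclass that uses the hook to accumulate the text it is handed. */
class CollectingEngine extends SketchEngine {
    private final StringBuilder buffer = new StringBuilder();

    @Override
    protected void processTextPosition(Glyph text) {
        buffer.append(text.unicode);
    }

    String collected() {
        return buffer.toString();
    }
}
```

In real code the same shape would apply to a subclass of PDFTextStripper overriding processTextPosition(TextPosition), since LegacyPDFStreamEngine itself is not meant to be used directly.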