Package org.apache.pdfbox.pdfparser
The pdfparser package contains classes to parse PDF documents and objects within the document.
Interface Summary
PDFXRef
SequentialSource
    A SequentialSource provides access to sequential data for parsing.
Class Summary
BaseParser
    This class contains parsing logic that is used by both the PDFParser and the COSStreamParser.
COSParser
    PDF parser which first reads startxref and the xref tables in order to know which objects are valid, and then parses only these objects.
EndstreamOutputStream
    This class is only for the readUntilEndStream method, to prevent a final CR LF or LF (but not a final CR!) from being written to the output, unless the beginning of the stream is assumed to be ASCII.
FDFParser
InputStreamSource
    A SequentialSource backed by an InputStream.
PDFObjectStreamParser
    This will parse a PDF 1.5 object stream and extract all of the objects from the stream.
PDFParser
PDFStreamParser
    This will parse a PDF byte stream and extract operands and such.
PDFXRefStream
PDFXRefStream.FreeReference
    A class representing a free reference.
PDFXRefStream.NormalReference
    A class representing a normal reference.
PDFXRefStream.ObjectStreamReference
    A class representing an object stream reference.
PDFXrefStreamParser
    This will parse a PDF 1.5 (or better) Xref stream and extract the xref information from the stream.
PDFXrefStreamParser.ObjectNumbers
RandomAccessSource
    A SequentialSource backed by a RandomAccessRead.
XrefTrailerResolver
    This class collects all XRef/trailer objects and creates correct xref/trailer information after all objects are read, using startxref and 'Prev' information (unused XRef/trailer objects are discarded).
XrefTrailerResolver.XrefTrailerObj
    A class which represents an xref/trailer object.
Enum Summary
XrefTrailerResolver.XRefType
    The XRefType of a trailer.
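
The COSParser entry above says that parsing starts from startxref. As a rough illustration of that first step only, the sketch below locates the last startxref keyword in a byte array and reads the offset that follows it. It uses only the Java standard library and is not PDFBox code: the class name StartXrefSketch and the methods findStartXref and samplePdf are hypothetical names chosen for this example, and the real parser handles many more cases (damaged files, xref streams, 'Prev' chains).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of org.apache.pdfbox.pdfparser.
public class StartXrefSketch {

    private static final String KEYWORD = "startxref";

    /**
     * Returns the decimal offset written after the last "startxref" keyword,
     * or -1 if the keyword is missing or is not followed by digits.
     */
    public static long findStartXref(byte[] pdf) {
        // ISO-8859-1 maps every byte to exactly one char, so string indices equal byte offsets.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        int pos = text.lastIndexOf(KEYWORD);
        if (pos < 0) {
            return -1;
        }
        int i = pos + KEYWORD.length();
        // Skip the whitespace (usually an end-of-line) between the keyword and the number.
        while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
            i++;
        }
        int start = i;
        // Limit to 18 digits so the value always fits in a long.
        while (i < text.length() && i - start < 18
                && text.charAt(i) >= '0' && text.charAt(i) <= '9') {
            i++;
        }
        if (i == start) {
            return -1;
        }
        return Long.parseLong(text.substring(start, i));
    }

    /** Builds a tiny, simplified PDF-like file whose startxref points at its xref table. */
    public static byte[] samplePdf() {
        String body = "%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n";
        int xrefOffset = body.length(); // the xref table starts right after the body
        String tail = "xref\n0 2\n"
                + "0000000000 65535 f \n"
                + "0000000009 00000 n \n"
                + "trailer\n<< /Size 2 /Root 1 0 R >>\n"
                + "startxref\n" + xrefOffset + "\n%%EOF\n";
        return (body + tail).getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] pdf = samplePdf();
        long offset = findStartXref(pdf);
        String atOffset = new String(pdf, (int) offset, 4, StandardCharsets.ISO_8859_1);
        System.out.println("startxref offset: " + offset + ", bytes there: " + atOffset);
    }
}
```

Searching from the end of the data matters because an incrementally updated file can contain several startxref keywords, and the last one is the entry point.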