Class Splitter

java.lang.Object
org.apache.pdfbox.multipdf.Splitter

public class Splitter extends Object
Split a document into several other documents.
  • Field Details

    • LOG

      private static final org.apache.commons.logging.Log LOG
    • sourceDocument

      private PDDocument sourceDocument
    • currentDestinationDocument

      private PDDocument currentDestinationDocument
    • splitLength

      private int splitLength
    • startPage

      private int startPage
    • endPage

      private int endPage
    • destinationDocuments

      private List<PDDocument> destinationDocuments
    • currentPageNumber

      private int currentPageNumber
    • memoryUsageSetting

      private MemoryUsageSetting memoryUsageSetting
  • Constructor Details

    • Splitter

      public Splitter()
  • Method Details

    • getMemoryUsageSetting

      public MemoryUsageSetting getMemoryUsageSetting()
      Returns:
      the current memory setting.
    • setMemoryUsageSetting

      public void setMemoryUsageSetting(MemoryUsageSetting memoryUsageSetting)
      Set the memory setting.
      Parameters:
      memoryUsageSetting - The memory setting.
    • split

      public List<PDDocument> split(PDDocument document) throws IOException
      Takes a document and splits it into several other documents.
      Parameters:
      document - The document to split.
      Returns:
      A list of all the split documents. These should all be saved before closing any documents, including the source document. Any further operations should be made after reloading them, to avoid problems due to resource sharing. For the same reason, they should not be saved with encryption.
      Throws:
      IOException - If an I/O error occurs.
    • setSplitAtPage

      public void setSplitAtPage(int split)
      Sets the number of pages that each split document should contain. The default is 1, so every page becomes a new document. With a value of 2, each document contains 2 pages; a 5-page source document is then split into 3 new documents: 2 documents containing 2 pages each and 1 document containing a single page.
      Parameters:
      split - The number of pages each split document should contain.
      Throws:
      IllegalArgumentException - if split is smaller than one.
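      The page counts described for setSplitAtPage can be checked with a little arithmetic. The sketch below is a plain-Java model of that description, not PDFBox code: the class SplitSizes and the method chunkSizes are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox: models the page counts
// described for setSplitAtPage.
class SplitSizes {

    // Returns the page count of each resulting document when a source with
    // totalPages pages is cut into chunks of at most splitLength pages.
    static List<Integer> chunkSizes(int totalPages, int splitLength) {
        if (splitLength < 1) {
            // Mirrors the documented contract: values below one are rejected.
            throw new IllegalArgumentException("split length must be at least 1");
        }
        List<Integer> sizes = new ArrayList<>();
        for (int remaining = totalPages; remaining > 0; remaining -= splitLength) {
            sizes.add(Math.min(splitLength, remaining));
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(chunkSizes(5, 2)); // prints [2, 2, 1]
        System.out.println(chunkSizes(3, 1)); // prints [1, 1, 1]
    }
}
```

      The first call reproduces the 5-page example: three documents with 2, 2 and 1 pages.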
    • setStartPage

      public void setStartPage(int start)
      This will set the start page.
      Parameters:
      start - the 1-based start page
      Throws:
      IllegalArgumentException - if the start page is smaller than one.
    • setEndPage

      public void setEndPage(int end)
      This will set the end page.
      Parameters:
      end - the 1-based end page
      Throws:
      IllegalArgumentException - if the end page is smaller than one.
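      As a rough illustration of the 1-based start and end pages, the following plain-Java sketch validates both values the way the documented contract describes and lists the selected pages. It is not PDFBox code. The names PageRangeSketch and selectedPages are hypothetical, and treating the end page as inclusive is an assumption that the text above does not state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not part of PDFBox.
class PageRangeSketch {

    // Rejects values below one, as documented for setStartPage and setEndPage.
    static void requireAtLeastOne(String name, int value) {
        if (value < 1) {
            throw new IllegalArgumentException(name + " must be at least 1, got " + value);
        }
    }

    // Returns the 1-based page numbers from start to end.
    // Assumption: end is inclusive and is capped at the document length.
    static List<Integer> selectedPages(int totalPages, int start, int end) {
        requireAtLeastOne("start", start);
        requireAtLeastOne("end", end);
        List<Integer> pages = new ArrayList<>();
        for (int page = start; page <= Math.min(end, totalPages); page++) {
            pages.add(page);
        }
        return pages;
    }

    public static void main(String[] args) {
        System.out.println(selectedPages(5, 2, 4));  // prints [2, 3, 4]
        System.out.println(selectedPages(5, 4, 10)); // prints [4, 5]
    }
}
```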
    • processPages

      private void processPages() throws IOException
      Interface method to handle the start of the page processing.
      Throws:
      IOException - If an IO error occurs.
    • createNewDocumentIfNecessary

      private void createNewDocumentIfNecessary() throws IOException
      Helper method for creating new documents at the appropriate pages.
      Throws:
      IOException - If there is an error creating the new document.
    • splitAtPage

      protected boolean splitAtPage(int pageNumber)
      Check if it is necessary to create a new document. By default a split occurs at every page. To split based on more complex logic, override this method. For example:

          protected boolean splitAtPage(int pageNumber)
          {
              // will split at pages with prime numbers only
              // (isPrime is a helper that the subclass has to provide)
              return isPrime(pageNumber);
          }
      Parameters:
      pageNumber - the 0-based page number to be checked as splitting page
      Returns:
      true if a new document should be created.
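      To see what a prime-number rule would do, the sketch below models the decision in plain Java, without PDFBox. The names SplitRuleSketch, isPrime and group are hypothetical. It also assumes that a new document starts at the first page and at every page for which the rule returns true, which is one reading of the description above rather than a statement about the library's internals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical model, not part of PDFBox.
class SplitRuleSketch {

    // Simple trial-division primality test; 0 and 1 are not prime.
    static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    // Groups the 0-based page numbers 0..pageCount-1. A new group starts at
    // the first page and wherever the rule returns true (assumed behaviour).
    static List<List<Integer>> group(int pageCount, IntPredicate splitAtPage) {
        List<List<Integer>> groups = new ArrayList<>();
        for (int page = 0; page < pageCount; page++) {
            if (groups.isEmpty() || splitAtPage.test(page)) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(page);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Prime rule over 8 pages.
        System.out.println(group(8, SplitRuleSketch::isPrime));
        // prints [[0, 1], [2], [3, 4], [5, 6], [7]]

        // A rule that always returns true models "a split occurs at every page".
        System.out.println(group(3, page -> true)); // prints [[0], [1], [2]]
    }
}
```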
    • createNewDocument

      protected PDDocument createNewDocument() throws IOException
      Create a new document to write the split contents to.
      Returns:
      the newly created PDDocument.
      Throws:
      IOException - If there is a problem creating the new document.
    • processPage

      protected void processPage(PDPage page) throws IOException
      Interface to start processing a new page.
      Parameters:
      page - The page that is about to get processed.
      Throws:
      IOException - If there is an error creating the new document.
    • processAnnotations

      private void processAnnotations(PDPage imported) throws IOException
      Throws:
      IOException
    • getSourceDocument

      protected final PDDocument getSourceDocument()
      The source PDF document.
      Returns:
      the PDF to be split
    • getDestinationDocument

      protected final PDDocument getDestinationDocument()
      The current destination PDF document.
      Returns:
      the current destination PDF