PDFoutline.py Help

Maintaining a PDF's Bookmarks and Meta Information

This is a Python script for maintaining a PDF document's bookmark tree, also known as its "table of contents" (TOC).

It is based on PyMuPDF version 1.9.1 (or above) and wxPython version 3.0.2 (or above).

Input

The script can be invoked from the command line with a PDF file name as its parameter. If it is invoked without a parameter, the system's standard file open dialog is presented so that you can locate a PDF file.
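
The parameter handling described above can be sketched roughly as follows. This is an illustration only, not the script's actual code: the function name resolve_input and the ask_user callback (a stand-in for the system's file open dialog) are hypothetical.

```python
def resolve_input(argv, ask_user):
    """Return the PDF path given on the command line, else ask the user.

    `ask_user` is a hypothetical callback standing in for the file open
    dialog; it returns a path, or None if the user cancels.
    """
    if len(argv) > 1:
        return argv[1]   # file name passed as CLI parameter
    return ask_user()    # no parameter: fall back to the dialog


# Usage with a simulated argument list and a dialog that is cancelled:
print(resolve_input(["PDFoutline.py", "book.pdf"], lambda: None))
```

Passing the dialog as a callback keeps the decision logic independent of the GUI toolkit.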

Encrypted files are supported for input: a password entry dialog asks for the password needed to access the file.

Output

Once you have finalized the new TOC version, you can save the result in the current file (the PDF incremental update technique is used in this case) or under a new filename. If the input has been decrypted or if it has an inconsistent PDF file structure, you must save your work under a new filename. Encryption is not supported as an output option.
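
The save rules above can be summarized in a small decision function. This is a sketch under stated assumptions: the function name and its flags are hypothetical, and the returned dictionary is meant to be passed as keyword arguments to PyMuPDF's Document.save, which accepts an incremental argument.

```python
import os


def choose_save_mode(source, target, was_decrypted=False, was_repaired=False):
    """Return keyword arguments for saving, following the rules above.

    Hypothetical helper: `was_decrypted` and `was_repaired` stand for
    "input was decrypted" and "input had an inconsistent structure".
    """
    same_file = os.path.abspath(source) == os.path.abspath(target)
    if not same_file:
        # Saving under a new file name writes a complete new file.
        return {"incremental": False}
    if was_decrypted or was_repaired:
        # In these cases the work must be saved under a new file name.
        raise ValueError("a new file name is required for this document")
    # Saving into the current file uses the incremental update technique.
    return {"incremental": True}


print(choose_save_mode("book.pdf", "book.pdf"))
```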

You may interrupt your work at any time: press the Check Data button before quitting. The current state of your work is then saved under <PDF-filename>.json in the PDF's directory, and the PDF itself remains unchanged. When you resume work, the program offers to restore the previous state from this parameter file. If you decline, the file is deleted. It is also deleted when you press SAVE. While the program is waiting for input, the parameter file is updated every 60 seconds ("auto-save feature").
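
A minimal sketch of such a parameter file, using only the standard library, might look like this. The function names, the layout of the saved state, and the choice to append ".json" to the full PDF file name are assumptions made for illustration; the 60-second timer is not shown.

```python
import json
import os


def state_file(pdf_path):
    # Assumption: the parameter file is the PDF's file name plus ".json".
    return pdf_path + ".json"


def save_state(pdf_path, toc_rows, metadata):
    """Write the current TOC rows and metadata next to the PDF."""
    with open(state_file(pdf_path), "w", encoding="utf-8") as f:
        json.dump({"toc": toc_rows, "meta": metadata}, f)


def load_state(pdf_path):
    """Return the saved state, or None if there is no parameter file."""
    path = state_file(pdf_path)
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def discard_state(pdf_path):
    """Delete the parameter file (after SAVE or a declined restore)."""
    path = state_file(pdf_path)
    if os.path.exists(path):
        os.remove(path)
```

An auto-save feature would simply call save_state from a periodic timer while the dialog is idle.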

Main Dialog

The program's dialog consists of the following main parts:

The right part displays the current PDF page. If bookmarks point to this page, each target location is marked by a horizontal red line. You can navigate in the file using the buttons, the mouse wheel, or by entering a specific page number.

The upper left part consists of a grid where the TOC is displayed in tabular format.

The lower left part displays file and document meta information.

Maintaining File Information

You can change some metadata fields (author, title, subject and keywords); others are set automatically.

Maintaining the TOC

Upon program start, existing bookmark (outline) entries are extracted using the PyMuPDF method getToC(simple=False). This information is displayed in the TOC grid in tabular format with four columns: Level, Title, Page and Height. Any of this information can be changed at any time without changing the underlying PDF. If the PDF currently contains no bookmarks, a dummy entry is displayed in the grid.

Maintaining TOC Rows

There is no separate function for adding a new row. If a PDF contains no outline at all, a dummy bookmark is displayed in the grid to serve as a template for new entries.

Maintaining Grid Cells

You can overtype any cell at any time. Pressing ENTER moves to the next cell to the right, wrapping around to the next row. Double-clicking a cell that is not currently selected displays the corresponding PDF page. The following explains any special cell behaviors:

Validating Input

Your input is checked for errors only when you press Check Data. The following will be checked:

Finishing Work

Changing TOC information disables the SAVE button. Press Check Data to validate your input and to re-enable the SAVE button. As mentioned above, pressing Check Data and then quitting saves your input but does not change the PDF.
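
The interplay between editing, Check Data and the SAVE button can be pictured as a small state holder. The class below is a hypothetical sketch, not taken from the script; in particular, the initial state of the SAVE button is an assumption.

```python
class SaveGate:
    """Tracks whether the SAVE button should be enabled (illustrative only)."""

    def __init__(self):
        # Assumption: SAVE starts disabled until the data has been checked.
        self.save_enabled = False

    def on_edit(self):
        # Any change to TOC information disables SAVE.
        self.save_enabled = False

    def on_check(self, errors):
        # Check Data enables SAVE only if validation found no errors.
        self.save_enabled = not errors
        return self.save_enabled


gate = SaveGate()
gate.on_check([])          # validation passed: SAVE is enabled
gate.on_edit()             # a later edit disables SAVE again
print(gate.save_enabled)
```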

Exception Handling

If an exception occurs while saving the PDF, traceback information is displayed and also saved under the file name <PDF-filename>.txt in the PDF's directory.
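
This kind of exception handling can be sketched with the standard traceback module. The function name and the save_func callback are hypothetical, and appending ".txt" to the full PDF file name is an assumption about how the report file is named.

```python
import traceback


def save_with_report(save_func, pdf_path):
    """Call save_func(); on failure, write the traceback next to the PDF.

    Returns None on success, or the traceback text on failure so that
    the caller can also display it.
    """
    try:
        save_func()
        return None
    except Exception:
        report = traceback.format_exc()
        # Assumption: the report file is the PDF's file name plus ".txt".
        with open(pdf_path + ".txt", "w", encoding="utf-8") as f:
            f.write(report)
        return report
```

Returning the text lets the dialog show the same information that was written to disk.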