
PDFextract is a convenient-to-use CLI wrapper for pdftk which enables the user to easily extract multiple page ranges from a PDF file.


mguenther/pdfextract


PDFextract

PDFextract is a convenient CLI wrapper for pdftk that lets the user easily extract multiple page ranges (contiguous or non-contiguous) from a given source PDF file. PDFextract saves the extracted ranges either as individual target PDF files (one per page range) or combined into a single target PDF.

Dependencies

pdftk must be installed on your system; otherwise, PDFextract will fail to execute.
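A quick way to confirm the dependency before running PDFextract is to look for the `pdftk` executable on your `PATH`. The snippet below is a minimal sketch of such a check; the helper name `tool_available` is hypothetical and is not part of PDFextract itself.

```python
import shutil


def tool_available(name="pdftk"):
    """Return True if an executable called `name` is found on PATH.

    Hypothetical helper for illustration; not part of PDFextract.
    """
    return shutil.which(name) is not None


if __name__ == "__main__":
    if not tool_available():
        print("pdftk not found on PATH; install it before running pdfextract")
```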

Installation

Clone the repository, then install the script by running:

$ python setup.py install

Examples

  1. Extract a single contiguous page range (pages 3 to 5) from source.pdf and save the output to target.pdf.

    pdfextract source.pdf target.pdf 3-5
    
  2. Extract non-contiguous page ranges (pages 3 to 5 and 7 to 12) from source.pdf. Without --join, this automatically yields several target PDFs, one per page range, each named after target.pdf and suffixed with the respective page range.

    pdfextract source.pdf target.pdf 3-5,7-12
    
  3. Extract non-contiguous page ranges (pages 3 to 5 and 7 to 12) from source.pdf and combine them into a single target.pdf.

    pdfextract source.pdf target.pdf 3-5,7-12 --join
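To illustrate how the three examples could map onto pdftk calls, here is a rough sketch rather than PDFextract's actual implementation. It relies on pdftk's `cat` operation (`pdftk in.pdf cat 3-5 output out.pdf`). The function names `parse_ranges` and `build_commands`, the `target_3-5.pdf` naming scheme, and the handling of a lone page number are assumptions made for illustration only.

```python
import os


def parse_ranges(spec):
    """Parse '3-5,7-12' into [(3, 5), (7, 12)].

    Assumption: a lone page number such as '4' is treated as (4, 4).
    """
    ranges = []
    for part in spec.split(","):
        first, _, last = part.strip().partition("-")
        start = int(first)
        end = int(last) if last else start
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        ranges.append((start, end))
    return ranges


def build_commands(source, target, spec, join=False):
    """Return the pdftk argument lists needed for the requested extraction."""
    ranges = [f"{a}-{b}" for a, b in parse_ranges(spec)]
    if join or len(ranges) == 1:
        # One output file containing every requested range, in order.
        return [["pdftk", source, "cat", *ranges, "output", target]]
    # One output file per range; the suffix format here is an assumption.
    stem, ext = os.path.splitext(target)
    return [
        ["pdftk", source, "cat", r, "output", f"{stem}_{r}{ext}"]
        for r in ranges
    ]


if __name__ == "__main__":
    for cmd in build_commands("source.pdf", "target.pdf", "3-5,7-12"):
        print(" ".join(cmd))
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` to perform the extraction. Building the commands separately from running them keeps the range handling easy to test without pdftk installed.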
    

License

This script is released under the MIT license.
