Using PDFtk

Exploring PDFtk utility

When working with PDF documents, you often need to merge them, rotate some pages, or select only some of them. This is helpful when you care about the environment and want to avoid printing unnecessary pages. Authorities quite often ask you to print a PDF document, sign just one page, and send back a scanned copy. In such situations, the PDFtk utility can be very useful. In this article, I describe some commands I use from time to time.

Installation

Before exploring the commands, I should say a few words about installing PDFtk. Personally, I use an Ubuntu flavour, so I will describe the installation steps for this OS. Basically, you can install the PDFtk binary either using snap or from an external package repository.

Installation Using Snap

In order to install PDFtk using snap, open a terminal and execute the following simple command:

$ sudo snap install pdftk

Currently (Ubuntu 18.04), snap cannot work with user home directories that are not under / or /home (please check this discussion thread), so you will not be able to work with documents on NFS shares, other drives, etc. Therefore, I recommend installing PDFtk using the second approach.

Installation from External Package Repository

I use PDFtk installed from an external package repository. Of course, if you do not trust the maintainer, you should not add the repository to your operating system's trusted sources or install software from it. However, I prefer this method over building the program from source (this is the third method, but I rarely use it):

$ sudo add-apt-repository ppa:malteworld/ppa # adding external repository
$ sudo apt update                            # updating package index
$ sudo apt install pdftk                     # installing PDFtk

How to Use PDFtk

After the installation, check that the software has been installed correctly. Open a terminal and run the command pdftk; you should see output similar to that in Figure 1.

Output of the command
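
If you prefer to check the installation from a script rather than by eye, a minimal sketch could look like the one below. The helper name have_tool is my own and is used only for illustration; the sketch relies on nothing more than command -v and pdftk --version.

```shell
# have_tool NAME: succeed if NAME can be found in PATH.
# (have_tool is a hypothetical helper name, not part of PDFtk.)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool pdftk; then
    pdftk --version    # prints the version of the installed PDFtk
else
    echo "pdftk is not installed or not in PATH" >&2
fi
```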

PDFtk can facilitate a lot of everyday operations. With this utility, you can (list taken from the help):

  • Merge PDF Documents or Collate PDF Page Scans
  • Split PDF Pages into a New Document
  • Rotate PDF Documents or Pages
  • Decrypt Input as Necessary (Password Required)
  • Encrypt Output as Desired
  • Fill PDF Forms with X/FDF Data and/or Flatten Forms
  • Generate FDF Data Stencils from PDF Forms
  • Apply a Background Watermark or a Foreground Stamp
  • Report PDF Metrics, Bookmarks and Metadata
  • Add/Update PDF Bookmarks or Metadata
  • Attach Files to PDF Pages or the PDF Document
  • Unpack PDF Attachments
  • Burst a PDF Document into Single Pages
  • Uncompress and Re-Compress Page Streams
  • Repair Corrupted PDF (Where Possible)

You can also read the complete help by executing the following command:

$ pdftk --help

In what follows, I assume that your working directory is the one where the original and resulting documents are located.

Splitting a Document

Sometimes you need to split a document into several documents. For instance, you may need to put the even and odd pages into separate documents. To do this, execute the following two commands:

# selects the odd pages (1, 3, ...) and puts them into odd.pdf
$ pdftk lorem.pdf cat 1-endodd output odd.pdf
# selects the even pages (2, 4, ...) and puts them into even.pdf
$ pdftk lorem.pdf cat 1-endeven output even.pdf

Here, you specify the page range (1-end, i.e., from the first page to the end of the document) and add a qualifier (odd or even) that defines which pages to select.

Using a similar command, you can also remove particular pages from a document:

$ pdftk lorem.pdf cat 1-2 4 output dropped.pdf

This command takes the lorem.pdf document as input, removes page 3 (actually, it takes pages 1-2 and 4 and {cat}enates them), and puts the result into the new document dropped.pdf. It is also possible to remove several pages from a document. In this case, you just need to specify which page ranges should remain in the resulting document.
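
Working out the ranges that should remain is easy to get wrong by hand, so here is a small sketch that computes them for the case of dropping a single page. The function name keep_ranges and its two arguments (total page count and the page to drop) are my own assumptions; the function only prints text and does not call PDFtk itself.

```shell
# keep_ranges TOTAL DROP: print the page ranges that keep every page
# of a TOTAL-page document except page DROP.
# (keep_ranges is a hypothetical helper, not a PDFtk feature.)
keep_ranges() {
    total=$1
    drop=$2
    ranges=""
    if [ "$drop" -eq 2 ]; then
        ranges="1"                       # only page 1 precedes the dropped page
    elif [ "$drop" -gt 2 ]; then
        ranges="1-$((drop - 1))"         # everything before the dropped page
    fi
    if [ "$drop" -lt "$total" ]; then
        # everything after the dropped page, up to the end of the document
        ranges="${ranges:+$ranges }$((drop + 1))-end"
    fi
    printf '%s\n' "$ranges"
}

# Possible use (left unquoted on purpose so each range is a separate argument):
#   pdftk lorem.pdf cat $(keep_ranges 4 3) output dropped.pdf
```

For a four-page document and page 3, the helper prints 1-2 4-end, which selects the same pages as the command above.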

Merging Documents

To merge two documents, use the following command:

$ pdftk lorem1.pdf lorem2.pdf cat output merged.pdf

This command appends the document lorem2.pdf to the end of lorem1.pdf and puts the result into merged.pdf. Sometimes, instead of appending, you need to merge documents by interleaving their pages, e.g., the first page of the first document, the first page of the second document, then the second page of the first document, the second page of the second document, and so on. In this case, use the shuffle operation:

$ pdftk lorem1.pdf lorem2.pdf shuffle output shuffled.pdf

Using these commands, it is also possible to concatenate only some pages into the resulting document. For instance, the following command takes page 2 from the first document and page 3 from the second document:

$ pdftk A=lorem1.pdf B=lorem2.pdf cat A2 B3 output merged.pdf

Notice that I added handles for the input documents (the letters A and B before the = sign) and later used them to specify which pages to take from which document. If you are working with a single document, handles are not compulsory; however, they become very handy when you work with several documents.
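
Handles can be combined with page ranges as well as with single pages. The sketch below wraps such a command in a function; the name merge_parts, the chosen ranges, and the PDFTK variable (which lets you substitute echo to preview the command without running it) are assumptions made for this example.

```shell
# merge_parts FIRST SECOND OUT: take pages 1-2 of FIRST, followed by
# page 3 up to the last page of SECOND, and write them to OUT.
# (merge_parts and the PDFTK override are hypothetical; set PDFTK=echo
# to print the arguments instead of running pdftk.)
merge_parts() {
    "${PDFTK:-pdftk}" A="$1" B="$2" cat A1-2 B3-end output "$3"
}

# Possible use:
#   merge_parts lorem1.pdf lorem2.pdf merged.pdf
```

This assumes that the first document has at least two pages and the second at least three.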

Changing Document Pages Order

Obviously, the cat operation can also change the order of the pages in a document if you list the pages one after another. However, a more efficient way is to use specific keywords. For instance, to reverse the order of the pages in a document, run the following command (here, the keyword end points to the last page of the document):

$ pdftk lorem.pdf cat end-1 output reversed.pdf
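
The same keywords help with smaller rearrangements. As a sketch, the function below moves the last page of a document to the front. It assumes a PDFtk version that understands the r prefix for counting pages from the end (r1 is the last page, r2 the one before it) and a document with at least two pages. The name last_page_first and the PDFTK variable are my own additions for illustration.

```shell
# last_page_first IN OUT: write OUT with the last page of IN moved to
# the front, i.e. the last page followed by page 1 through the
# next-to-last page.
# (last_page_first and the PDFTK override are hypothetical; set
# PDFTK=echo to print the arguments instead of running pdftk.)
last_page_first() {
    "${PDFTK:-pdftk}" "$1" cat end 1-r2 output "$2"
}

# Possible use:
#   last_page_first lorem.pdf reordered.pdf
```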

Rotating Pages

Using PDFtk, it is possible to rotate pages in a document. To do this, use the qualifiers north (0), east (90), south (180), west (270), left (-90), right (+90), and down (+180); the numbers in parentheses are the rotation in degrees. The first four set an absolute page rotation, while left, right, and down rotate relative to the current rotation. For instance, the following command sets the rotation of all pages in the document to 90 degrees:

$ pdftk lorem.pdf cat 1-endeast output rotated.pdf
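
A qualifier applies only to the range it is attached to, so it is also possible to rotate a single page and leave the rest untouched. Here is a sketch for the first page; the name rotate_first and the PDFTK variable are assumptions made for this example.

```shell
# rotate_first IN OUT: set the rotation of page 1 to 90 degrees (east)
# and copy the remaining pages unchanged.
# (rotate_first and the PDFTK override are hypothetical; set PDFTK=echo
# to print the arguments instead of running pdftk.)
rotate_first() {
    "${PDFTK:-pdftk}" "$1" cat 1east 2-end output "$2"
}

# Possible use:
#   rotate_first lorem.pdf rotated_first.pdf
```

Note that the untouched pages still have to be listed (2-end here), otherwise they would be left out of the result.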

Encrypting Document

Sometimes you need to set a password on a PDF document so that a password prompt is shown when it is opened. This is a very useful feature if you want to protect your document from prying eyes. To encrypt a document, you can use the following command:

$ pdftk lorem.pdf output lorem_protected.pdf user_pw "myuserpassword"

It is also possible to set an “owner” password, which protects different permissions on the document (e.g., copying the content or printing the document). However, these permissions are useful only if the software opening the document respects them. Otherwise, a user will be able to do anything with the document despite the permissions set. Therefore, this option does not seem valuable to me.
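
For completeness, here is a sketch of how an owner password and a permission could be set together with the user password, using the owner_pw and allow keywords of PDFtk. The function name protect_printable, the argument order, and the PDFTK variable are my own assumptions.

```shell
# protect_printable IN OUT OWNER_PW USER_PW: encrypt IN into OUT with
# both passwords and grant only the printing permission.
# (protect_printable and the PDFTK override are hypothetical; set
# PDFTK=echo to print the arguments instead of running pdftk.)
protect_printable() {
    "${PDFTK:-pdftk}" "$1" output "$2" owner_pw "$3" user_pw "$4" allow printing
}

# Possible use:
#   protect_printable lorem.pdf lorem_protected.pdf "myownerpassword" "myuserpassword"
```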

Currently, PDFtk can encrypt a document using the RC4 algorithm with a 128-bit (default) or 40-bit key. This encryption algorithm is no longer considered secure, so powerful agencies may be able to break it. If you need to protect very sensitive documents, it is better to use the 256-bit AES algorithm. Unfortunately, PDFtk currently does not support this algorithm. Instead, you can use qpdf to protect your document:

$ qpdf --encrypt "myuserpassword" " " 256 -- lorem.pdf lorem_protected.pdf

In this command, “myuserpassword” is the user password, while the second quoted string is the owner password (here set to a single space). If you check the properties of such a document using pdfinfo, you should get something like this:

$ pdfinfo -upw "myuserpassword" lorem_protected.pdf 
Title:          ProjectPlanLast
Creator:        Pages
Producer:       macOS Version 10.14.5 (Build 18F132) Quartz PDFContext
CreationDate:   Mon Jun 17 18:13:17 2019 +03
ModDate:        Mon Jun 17 18:13:17 2019 +03
Tagged:         no
UserProperties: no
Suspects:       no
Form:           none
JavaScript:     no
Pages:          3
Encrypted:      yes (print:yes copy:yes change:yes addNotes:yes algorithm:AES-256)
Page size:      595.28 x 841.89 pts (A4)
Page rot:       0
File size:      45754 bytes
Optimized:      no
PDF version:    1.7

You can see that the AES-256 algorithm is used to encrypt the document in this case.
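
If pdfinfo is not at hand, qpdf itself can report the encryption parameters through its --show-encryption option. The wrapper below is only a sketch; the name show_encryption and the QPDF variable are my own additions.

```shell
# show_encryption FILE USER_PW: ask qpdf to describe how FILE is encrypted.
# (show_encryption and the QPDF override are hypothetical; set QPDF=echo
# to print the arguments instead of running qpdf.)
show_encryption() {
    "${QPDF:-qpdf}" --show-encryption --password="$2" "$1"
}

# Possible use:
#   show_encryption lorem_protected.pdf "myuserpassword"
```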

Conclusion

Of course, I have not covered every operation that PDFtk supports in this article. However, these are the ones I use most often. In the future, I will extend the article with new ones; meanwhile, you can explore the utility on your own.
