I often have PDF files without page numbers, for example when I print exam questions. Nowadays I prepare my exams in R Markdown and compile them to HTML, which is the same format that my students will use. But when I print them from Google Chrome they have no page numbers, or worse: they have a (wrong) date and show the file's path on my computer. I used to change the date on my computer and upload the file to a secret folder on my blog, but this is too much trouble for such a small issue. So lately I have just been printing without page numbers.
I was resigned to this situation, until my wife asked me to put page numbers into some of her PDF documents. Then I had to find a way to do it. Here is how I solved it.
How to add page numbers to a PDF
Adobe has a paid solution, but it is not “command line friendly”. I found a good answer at Command Line FU, “Add page numbers to a PDF”. Their suggestion is:
enscript -L1 -b'||Page $% of $=' -o- < \
<(for i in $(seq "$(pdftk "$1" dump_data | grep "Num" | cut -d":" -f2)"); \
do echo; done) | ps2pdf - | pdftk "$1" multistamp - output "${1%.pdf}-header.pdf"
I had forgotten about enscript, a program that I used ten years ago to print my scripts on the PostScript printer. I tested the first part by doing
enscript -L1 -b'||Page $% of $=' -o- < <(for i in $(seq 1 429); do echo; done) | \
ps2pdf - test.pdf
and it worked like a charm. Then I upgraded pdftk, since the version I had installed never worked. The official webpage had only a version for Mac OS X 10.6, probably a 32-bit one. But there is a hidden version for 10.11 at https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/pdftk_server-2.02-mac_osx-10.11-setup.pkg.
I tested it with the option proposed to find the number of pages:
pdftk 'original.pdf' dump_data |grep Num
It worked, but the grep pattern has to be more specific: using "Num" yields too many answers. Instead I just opened the original.pdf file and took note of the number of pages.
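A narrower pattern would avoid opening the file by hand. The sketch below assumes that the dump_data output contains a line of the form "NumberOfPages: 429"; the helper name count_pages is made up for this example and is not part of pdftk.

```shell
# count_pages: read `pdftk ... dump_data` output on stdin and print the page count.
# Assumption: the output has a line like "NumberOfPages: 429".
# Anchoring the pattern at the start of the line skips other fields
# whose names merely contain "Num".
count_pages() {
    awk '/^NumberOfPages:/ { print $2; exit }'
}

# Usage with a real file (needs pdftk installed):
#   pdftk 'original.pdf' dump_data | count_pages
```

The result can then replace the hard-coded 429 in the commands that follow, for example as $(seq 1 "$(pdftk 'original.pdf' dump_data | count_pages)").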
The command I used to test was:
enscript -L1 -b'||Page $% of $=' -o- < \
<(for i in $(seq 1 429); do echo; done) | \
ps2pdf - | \
pdftk 'original.pdf' multistamp - output numbered.pdf
and the result was perfect… except for the location of the numbers.
Extra tricks learned
In bash we can use $(seq 1 429) instead of the regular backquote `seq 1 429`. It is easier to read, and sometimes easier to understand. In both cases the shell executes the command inside the parentheses or backquotes, and its standard output becomes command line arguments for the outer command.
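One place where the readability difference shows is nesting. This is a small sketch, not taken from the original recipe, that runs the same nested command in both styles; the variable names are only for illustration.

```shell
# Both forms run the inner command and paste its output into the outer command line.
new_style=$(echo $(seq 1 5))
old_style=`echo \`seq 1 5\``   # nested backquotes have to be escaped

echo "$new_style"
echo "$old_style"
```

Both variables end up with the same text; the $() version simply nests without any escaping.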
We can also use <(for i in $(seq 1 429); do echo; done) together with the < redirection to feed the standard output of one command into the standard input of another. In this case a plain pipe does the same job:
for i in $(seq 1 429); do echo; done | \
enscript -F Times-Roman10 --fancy-header=footer -L1 -b'||' --footer '||$%' -o - | \
ps2pdf - | pdftk 'original.pdf' multistamp - output numb.pdf
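Both shell expansions execute a command and capture its standard output. They differ in how they deliver that output to the next command. The $() syntax inserts the output into the command line as arguments. The <() syntax expands to a file name from which the output can be read; preceded by a redirection, as in < <(...), it arrives on standard input. They are alternatives to the classical backquotes and pipes, respectively.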
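To see the two deliveries side by side, here is a minimal sketch that needs bash (process substitution is not available in plain POSIX sh). It only uses seq, diff, wc and tr; the variable names are invented for the example.

```shell
#!/usr/bin/env bash
# <(...) stands for a file name, so a command that expects files can read from it.
same=no
if diff <(seq 1 3) <(printf '1\n2\n3\n') >/dev/null; then
    same=yes
fi
echo "diff saw identical inputs: $same"

# With an extra redirection, < <(...), the same output arrives on standard input.
# tr removes the padding that some wc implementations put around the number.
lines_via_redirect=$(wc -l < <(for i in $(seq 1 5); do echo; done) | tr -d ' ')

# The pipe form counts the same five blank lines.
lines_via_pipe=$(for i in $(seq 1 5); do echo; done | wc -l | tr -d ' ')

echo "redirect: $lines_via_redirect, pipe: $lines_via_pipe"
```

The diff call is the case where a pipe cannot help, because there are two inputs; for a single input, as in the page numbering command, the pipe is usually the simpler choice.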
Summary
My next exam will have page numbers in the printed copy.