• Find Text
• Merge
• Compare
• Save Web Page as PDF
• Change/View PDF Info
• Convert Scanned PDF to Searchable PDF (OCR)
• Reduce Size of PDF File
• Editors
Find text (for example, "Important") in pdf files recursively, showing the page number of each match:
$ pdfgrep -n -r --include "*.pdf" "Important"
Merge three pdf files together:
$ pdftk file1.pdf file2.pdf file3.pdf cat output newfile.pdf
Combine selected pages of two pdf files into a new document:
$ pdftk A=file1.pdf B=file2.pdf cat A1-3 B1-5 A4 output newfile.pdf
diffpdf compares two PDF files textually or visually.
$ diffpdf file1.pdf file2.pdf
wkhtmltopdf can save a web page to a PDF file, preserving formatting and hyperlinks. Some features (headers, margins, etc.) require a patched Qt. Most Linux distributions provide wkhtmltopdf without those features [FAQ]. The developers provide binaries with all features for major Linux distributions, Windows and macOS.
An example of using wkhtmltopdf:
$ wkhtmltopdf -s A3 -L 25mm -R 25mm --default-header --header-font-size 10 --header-spacing 5 http://example.com/ example.pdf
Firefox Bug #454059 - Creating PDF of web page: hyperlinks are lost. Opened on 2008-09-07.
exiftool can list and edit meta information. To view the tags:
$ exiftool file.pdf
To change the title of file.pdf:
$ exiftool -overwrite_original -Title="Title of PDF Document" file.pdf
The writable tags are Author, Creator, Keywords ("keyword1;keyword2"), Producer, Subject and Title [PDF Tags - exiftool.org].
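Several of the writable tags can be set in a single exiftool call by repeating the -TAG=VALUE argument. A minimal sketch, assuming only Title and Author need changing; the function name set_pdf_info is hypothetical and not part of exiftool:

```shell
# set_pdf_info FILE TITLE AUTHOR
# Hypothetical helper: writes the Title and Author tags of FILE in one pass.
# -overwrite_original stops exiftool from keeping a FILE_original backup copy.
set_pdf_info() {
    file=$1
    title=$2
    author=$3
    # Quote each -TAG=VALUE so values containing spaces stay one argument.
    exiftool -overwrite_original "-Title=$title" "-Author=$author" "$file"
}
```

Usage: set_pdf_info file.pdf "Title of PDF Document" "Author Name"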
Convert the pages of the scanned pdf to png images with ImageMagick's convert:
$ convert -density 300 scanned-pdf.pdf converted-png.png
If the pdf document has 2 pages, 2 png files will be created: converted-png-0.png and converted-png-1.png.
OCR the png files with tesseract, setting the proper language ("-l deu" for German text):
$ tesseract converted-png-0.png new-pdf-page1 -l deu pdf
$ tesseract converted-png-1.png new-pdf-page2 -l deu pdf
Merge the pages:
$ pdftk new-pdf-page1.pdf new-pdf-page2.pdf cat output new-pdf.pdf
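For documents with more than a couple of pages, the OCR and merge steps can be wrapped in a loop. The following is only a sketch: the function name ocr_pages and the PREFIX-ocr-N.pdf naming of the intermediate files are hypothetical, and it assumes the page images are numbered PREFIX-0.png, PREFIX-1.png, ... as in the two-page example above.

```shell
# ocr_pages PREFIX LANG OUTPUT
# Hypothetical helper: OCRs PREFIX-0.png, PREFIX-1.png, ... in page order
# and merges the resulting one-page pdf files into OUTPUT.
ocr_pages() {
    prefix=$1
    lang=$2
    output=$3
    i=0
    set --    # reuse the positional parameters as the list of page pdf files
    # Count upwards instead of globbing, so that page 10 sorts after page 9.
    while [ -f "${prefix}-${i}.png" ]; do
        # tesseract appends ".pdf" to the output base name itself.
        tesseract "${prefix}-${i}.png" "${prefix}-ocr-${i}" -l "$lang" pdf || return 1
        set -- "$@" "${prefix}-ocr-${i}.pdf"
        i=$((i + 1))
    done
    if [ "$#" -eq 0 ]; then
        echo "no page images found for prefix ${prefix}" >&2
        return 1
    fi
    pdftk "$@" cat output "$output"
}
```

Usage: ocr_pages converted-png deu new-pdf.pdf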
The size of a pdf file can be reduced with Ghostscript:
$ gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dColorImageDownsampleType=/Bicubic -dColorImageResolution=300 -dPDFSETTINGS=/screen -sOutputFile=fileout.pdf filein.pdf
-dPDFSETTINGS options [Milan Kupcevic]:
• /screen: screen-view-only quality, 72 dpi images;
• /ebook: low quality, 150 dpi images;
• /printer: high quality, 300 dpi images;
• /prepress: high quality, color preserving, 300 dpi images.
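To try the presets listed above without retyping the whole command, the gs call can be wrapped in a small function. This is a sketch only: the name shrink_pdf is hypothetical, and the explicit downsampling switches from the command above are left out so that the chosen preset alone decides the image resolution.

```shell
# shrink_pdf PRESET INPUT OUTPUT
# Hypothetical helper: PRESET is one of screen, ebook, printer, prepress.
shrink_pdf() {
    case $1 in
        screen|ebook|printer|prepress) ;;
        *) echo "unknown preset: $1" >&2; return 2 ;;
    esac
    gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
        "-dPDFSETTINGS=/$1" "-sOutputFile=$3" "$2"
}
```

Usage: shrink_pdf ebook filein.pdf fileout.pdf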
«Master PDF Editor is a proprietary application to edit PDF documents on Linux, Windows and macOS. It can create, edit (insert text or images), annotate, view, encrypt, and sign PDF documents. With version 5, Master PDF Editor has removed some features from its free to use version, like editing or adding text, inserting images, and more - when using such tools, the application adds a big watermark to the PDF document unless users buy the full version.» [linuxuprising.com]