- pdfgrep is a tool for searching text in PDF files. It supports POSIX as well as Perl-compatible regular expressions (PCRE). GitLab repo
- Apache PDFBox comes with a series of command-line utilities.
Some usage examples:
$ java -jar pdfbox-app-2.y.z.jar PDFMerger <Source PDF files (2 ..n)> <Target PDF file>
$ java -jar pdfbox-app-2.y.z.jar PDFSplit [OPTIONS] <PDF file>
$ java -jar pdfbox-app-2.y.z.jar ExtractImages [OPTIONS] <inputfile>
$ java -jar pdfbox-app-2.y.z.jar ExtractText [OPTIONS] <inputfile> [Text file]
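The synopses above can be wrapped in small shell helpers to save typing. The following is a minimal sketch, not an official interface. The function names (`pdfbox`, `split_every`, `extract_pages`, `search_pdfs`) and the `PDFBOX_JAR` variable are made up for illustration. It assumes a PDFBox 2.x app jar, whose `PDFSplit` accepts `-split` and whose `ExtractText` accepts `-startPage`/`-endPage`, and a pdfgrep that provides `-n`, `-i` and `-r`.

```shell
#!/bin/sh
# Assumption: path to the PDFBox 2.x app jar; replace y.z with the real version.
PDFBOX_JAR="${PDFBOX_JAR:-pdfbox-app-2.y.z.jar}"

# Run any PDFBox command-line utility, e.g.:
#   pdfbox PDFMerger part1.pdf part2.pdf merged.pdf
pdfbox() {
    java -jar "$PDFBOX_JAR" "$@"
}

# Split a PDF into chunks of N pages each.
#   split_every 5 merged.pdf
split_every() {
    pdfbox PDFSplit -split "$1" "$2"
}

# Extract a page range (first, last) from a PDF into a text file.
#   extract_pages 2 4 merged.pdf pages.txt
extract_pages() {
    pdfbox ExtractText -startPage "$1" -endPage "$2" "$3" "$4"
}

# Search all PDFs below a directory (default: current directory),
# ignoring case (-i) and printing page numbers (-n).
#   search_pdfs 'invoice' ~/documents
search_pdfs() {
    pdfgrep -n -i -r "$1" "${2:-.}"
}
```

Each helper only assembles a command line, so any option not shown here can still be passed through `pdfbox` directly. For Perl-compatible patterns, pdfgrep's `-P` option can be added to the `pdfgrep` call in `search_pdfs`.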