pdftotext tag description
pdftotext is a command-line utility for converting PDF files to plain text files, i.e. extracting the raw text contained in PDF files.
pdftotext is freely available and included by default with many Linux distributions, and is also available for Windows as part of the Xpdf Windows port. Poppler, which is derived from Xpdf, also includes an implementation of pdftotext, which is included as part of the poppler-utils package on most major Linux distributions.
However, there are also other CLI-based PDF text extraction tools with a similar or identical name. While they (for the most part) work in the same way, they may give different results. So only use this tag for CLI-based pdftotext tools and variants, and make sure to state your specific version and environment.
Do not use this tag if you use a different extraction tool, e.g. a GUI-based PDF-to-text converter, an online PDF-to-text converter, or another (commercial) tool.
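Because results can differ between variants, it helps to include the exact command line and the version banner when asking a question. The following is a minimal Python sketch of how that information might be collected, assuming a pdftotext binary on the PATH that accepts the -f, -l, -layout and -v options (both the Xpdf and Poppler variants document these). The helper names build_command and version_banner, and the file name example.pdf, are hypothetical and used only for illustration.

```python
import shutil
import subprocess


def build_command(pdf_path, first_page=None, last_page=None, layout=False):
    """Build a pdftotext argument list that writes the extracted text to stdout."""
    cmd = ["pdftotext"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]  # first page to convert
    if last_page is not None:
        cmd += ["-l", str(last_page)]   # last page to convert
    if layout:
        cmd.append("-layout")           # try to keep the physical page layout
    # Using "-" as the output file name sends the text to stdout.
    cmd += [pdf_path, "-"]
    return cmd


def version_banner():
    """Return the text printed by `pdftotext -v`, or None if it is not installed."""
    if shutil.which("pdftotext") is None:
        return None
    result = subprocess.run(["pdftotext", "-v"], capture_output=True, text=True)
    # Assumption: the banner usually goes to stderr; fall back to stdout otherwise.
    return (result.stderr or result.stdout).strip()
```

Calling build_command("example.pdf", first_page=1, last_page=2, layout=True) yields an argument list that can be passed to subprocess.run, and the string returned by version_banner() can be pasted into a question to identify which variant is in use.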