PDFiD
Scans PDF files for certain PDF keywords to identify potentially malicious documents. Helps detect features like JavaScript, encryption, or auto-actions without full parsing.
Description
PDFiD scans PDF files for specific keywords associated with suspicious or malicious content. It identifies documents containing JavaScript, OpenActions, AcroForms, or other features that can be abused for malware delivery. It also recognizes obfuscated names (for example, names written with hex-escaped characters), which counters a common evasion technique.
Use cases include malware analysis, document forensics, and preemptive security checks on PDFs from untrusted sources. It reports counts of structural keywords such as obj, stream, and xref, and of suspicious names such as /Encrypt, /JS, /JBIG2Decode, or /Launch.
Unlike full PDF parsers, PDFiD performs lightweight keyword scanning, allowing quick triage of files. It supports single files, directories, zip archives, URLs, and batch lists via @file.
How It Works
PDFiD scans files for PDF-specific keywords and structures without parsing the full document. It reports the %PDF header and counts occurrences of structural keywords (obj/endobj, stream/endstream, xref, trailer) and of dictionary keys such as /Page, /Encrypt, /JS, /JavaScript, /OpenAction, /AcroForm, /JBIG2Decode, /RichMedia, /Launch, and /EmbeddedFile, along with /Colors values greater than 2^24. A non-zero count for one of the suspicious keys marks the PDF for closer inspection; it does not by itself prove the file is malicious. The tool handles obfuscation by also matching encoded variations of these names, and it can be forced to scan files that lack a proper header.
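The keyword-counting idea can be illustrated with a minimal Python sketch. This is not PDFiD's actual implementation: the KEYWORDS list is a hypothetical subset of the names mentioned above, the function names are made up for illustration, and hex escapes are decoded across the whole byte string rather than only inside names, which is a simplification.

```python
import re

# Hypothetical subset of the names listed above; chosen for illustration only.
KEYWORDS = [b"/JS", b"/JavaScript", b"/OpenAction", b"/AcroForm",
            b"/Launch", b"/EmbeddedFile", b"/Encrypt"]

# In PDF names, "#xx" is a hex escape for a single byte (e.g. "#61" is "a").
_HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")


def normalize_names(data):
    """Decode #xx hex escapes so that /J#61vaScript becomes /JavaScript."""
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), data)


def count_keywords(data):
    """Count keyword occurrences in raw bytes without parsing the PDF."""
    clean = normalize_names(data)
    counts = {}
    for kw in KEYWORDS:
        # The lookahead keeps /JS from matching a longer name such as /JSFoo.
        pattern = re.escape(kw) + rb"(?![A-Za-z0-9])"
        counts[kw.decode()] = len(re.findall(pattern, clean))
    return counts


if __name__ == "__main__":
    sample = b"1 0 obj\n<< /OpenAction 2 0 R /J#61vaScript (x) >>\nendobj\n"
    for name, n in count_keywords(sample).items():
        print(f"{name:15} {n}")
```

Because the scan works on raw bytes, it stays fast and tolerant of malformed files, but it cannot see keywords hidden inside compressed streams.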
Installation
sudo apt install pdfid
Flags
Examples
pdfid /usr/share/doc/texmf/fonts/lm/lm-info.pdf
pdfid -h
pdfid file1.pdf file2.pdf
pdfid *.pdf
pdfid @filelist.txt
pdfid /path/to/directory
pdfid example.zip
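For batch triage, the text report can be post-processed with a short script. The sketch below assumes the report consists of lines of the form "name count", optionally followed by a second number in parentheses; that layout is an assumption and should be checked against the installed version's output. The SUSPICIOUS set and both function names are hypothetical and serve only as illustration.

```python
import re

# Names treated as suspicious here come from the list in this document;
# the selection is illustrative, not an official rule set.
SUSPICIOUS = {"/JS", "/JavaScript", "/OpenAction", "/AcroForm", "/Launch",
              "/EmbeddedFile", "/JBIG2Decode", "/RichMedia"}

# Assumed line layout: "<name> <count>" with an optional "(<n>)" suffix.
_LINE = re.compile(r"^\s*(\S+)\s+(\d+)(?:\(\d+\))?\s*$")


def parse_report(text):
    """Turn a plain-text report into a {name: count} dictionary.

    Lines that do not fit the assumed layout (banner, header line,
    multi-word names) are skipped rather than guessed at.
    """
    counts = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def suspicious_hits(counts):
    """Return only the suspicious names that occur at least once."""
    return {k: v for k, v in counts.items() if k in SUSPICIOUS and v > 0}
```

The report text could be captured with subprocess.run(["pdfid", path], capture_output=True, text=True).stdout, assuming pdfid is on the PATH; an empty result from suspicious_hits means none of the listed names were counted, not that the file is safe.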