
PDFiD

Scans PDF files for certain PDF keywords to identify potentially malicious documents. Helps detect features like JavaScript, encryption, or auto-actions without full parsing.

Description

PDFiD is a tool designed to scan PDF files for specific keywords associated with suspicious or malicious content. It identifies documents containing JavaScript, OpenActions, AcroForms, or other features that could be exploited for malware delivery. The tool also handles name obfuscation, so disguising a keyword's spelling does not by itself hide it from the scan.

Use cases include malware analysis, document forensics, and preemptive security checks on PDFs from untrusted sources. It reports counts of structural keywords such as obj, stream, and xref, and of suspicious elements such as /Encrypt, /JS, /JBIG2Decode, or /Launch.
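As a rough illustration of how such counts might feed a triage decision, the sketch below takes a dictionary of keyword counts and returns the suspicious names with a non-zero count. The `SUSPICIOUS` tuple and the `flag_suspicious` helper are hypothetical names introduced here, not part of PDFiD, and the keyword selection is only the set named in this description.

```python
# Keywords this page names as suspicious; an illustrative selection,
# not PDFiD's own list.
SUSPICIOUS = (
    "/Encrypt",
    "/JS",
    "/JavaScript",
    "/OpenAction",
    "/AcroForm",
    "/JBIG2Decode",
    "/Launch",
)


def flag_suspicious(counts):
    """Return the suspicious keywords whose count is greater than zero."""
    return [name for name in SUSPICIOUS if counts.get(name, 0) > 0]
```

A non-empty result only means the file deserves a closer look. Many benign PDFs use forms or encryption, so a hit is a prompt for further analysis rather than a verdict.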

Unlike full PDF parsers, PDFiD performs lightweight keyword scanning, allowing quick triage of files. It supports single files, directories, zip archives, URLs, and batch lists via @file.

How It Works

PDFiD scans files for PDF-specific keywords and structures without parsing the full document. It reports the %PDF header and counts occurrences of structural keywords (obj/endobj, stream/endstream, xref, trailer) and of dictionary keys such as /Page, /Encrypt, /JS, /JavaScript, /OpenAction, /AcroForm, /JBIG2Decode, /RichMedia, /Launch, and /EmbeddedFile, as well as /Colors values greater than 2^24. These counts let an analyst flag potentially malicious PDFs. It handles name obfuscation by also matching disguised spellings of a name, such as hex-escaped characters (for example, /J#61vaScript for /JavaScript), and the -f flag forces a scan of files that lack a proper %PDF header.
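The keyword-counting idea can be sketched in a few lines of Python. This is a minimal illustration, not PDFiD's actual implementation: the `KEYWORDS` subset, the tokenizing regular expression, and the function names `decode_name` and `count_keywords` are all assumptions made for the example. It treats the file as raw bytes, undoes `#xx` hex escapes in names, and records how many matches were obfuscated.

```python
import re

# A small, illustrative subset of the keywords listed above.
KEYWORDS = ["obj", "endobj", "stream", "endstream",
            "/JS", "/JavaScript", "/OpenAction", "/Launch"]

# In a PDF name, "#" followed by two hex digits encodes one byte.
HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")

# A token is either a name ("/" plus non-delimiter bytes) or a bare word.
TOKEN = re.compile(rb"/[^\s/<>\[\](){}%]+|[A-Za-z]+")


def decode_name(raw: bytes) -> bytes:
    """Undo #xx hex escapes, e.g. /J#61vaScript becomes /JavaScript."""
    return HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def count_keywords(data: bytes) -> dict:
    """Return {keyword: (total, obfuscated)} for the raw bytes of a file."""
    counts = {k: [0, 0] for k in KEYWORDS}
    for match in TOKEN.finditer(data):
        raw = match.group(0)
        decoded = decode_name(raw) if raw.startswith(b"/") else raw
        key = decoded.decode("latin-1")
        if key in counts:
            counts[key][0] += 1
            if decoded != raw:  # the name was written with hex escapes
                counts[key][1] += 1
    return {k: tuple(v) for k, v in counts.items()}
```

Because nothing is parsed or decompressed, a scan like this is fast but blind to keywords hidden inside compressed streams, which is the trade-off a lightweight keyword scanner accepts in exchange for quick triage.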

Installation

sudo apt install pdfid

Flags

--version       show program's version number and exit
-h, --help      show this help message and exit
-s, --scan      scan the given directory
-a, --all       display all the names
-e, --extra     display extra data, like dates
-f, --force     force the scan of the file, even without proper %PDF header
-d, --disarm    disable JavaScript and auto launch

Examples

Scans a sample PDF file and outputs counts of PDF elements such as obj (526), stream (151), and /Page (26); zero counts for /Encrypt and /JS show that those suspicious features are absent
pdfid /usr/share/doc/texmf/fonts/lm/lm-info.pdf
Displays the full help message with usage and all available options
pdfid -h
Scans multiple PDF files provided as arguments
pdfid file1.pdf file2.pdf
Scans all PDF files matching the wildcard pattern
pdfid *.pdf
Runs PDFiD on each file listed in the specified text file
pdfid @filelist.txt
Uses the --scan option to recursively scan all PDFs in a directory
pdfid -s /path/to/directory
Scans PDF files contained within a zip archive
pdfid example.zip
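
For scripted triage, the text report can be turned into a dictionary. The sketch below assumes that `pdfid` is on the PATH and that each report line consists of a keyword followed by a count, optionally with a parenthesized count of obfuscated occurrences. That layout is an assumption about the output format, and `run_pdfid` and `parse_report` are hypothetical helper names.

```python
import re
import subprocess

# Assumed report line: keyword, whitespace, count, optional "(n)" suffix.
LINE = re.compile(r"^\s*(.+?)\s+(\d+)(?:\((\d+)\))?\s*$")


def parse_report(text: str) -> dict:
    """Map each keyword in a PDFiD-style report to its integer count."""
    counts = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:  # lines that do not end in a count, such as headers, are skipped
            counts[m.group(1)] = int(m.group(2))
    return counts


def run_pdfid(path: str) -> dict:
    """Run pdfid on one file and return its parsed keyword counts."""
    result = subprocess.run(
        ["pdfid", path], capture_output=True, text=True, check=True
    )
    return parse_report(result.stdout)
```

Calling `run_pdfid("file1.pdf")` would then return a dictionary of counts that a script can test, for example by checking whether the `/JS` or `/Launch` entries are non-zero.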
Updated 2026-04-16. Source: kali.org