Information Gathering | osint · crawler · web · reconnaissance · urls · emails · files · keys

Photon

Photon is an incredibly fast crawler designed for open source intelligence (OSINT). It extracts URLs, intel like emails and social media, files, secret keys, JavaScript files, and more while crawling.

Description

Photon is a fast and flexible crawler tailored for open source intelligence gathering. It systematically crawls websites to extract valuable data, including:

- URLs (in-scope and out-of-scope)
- URLs with parameters
- Intel: emails, social media accounts, Amazon buckets
- Files: PDFs, PNGs, and other file types
- Secret keys: auth/API keys and hashes
- JavaScript files and the endpoints they reference
- Strings matching custom regex patterns
- Subdomains and DNS-related data
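As an illustration of the kind of pattern matching this involves, the sketch below pulls emails, Amazon S3 bucket hostnames, and AWS-style access key IDs out of page text with regular expressions. The patterns are rough approximations for demonstration only, not Photon's actual regexes:

```python
import re

# Illustrative patterns, similar in spirit to what an OSINT crawler
# matches; these are NOT Photon's exact expressions.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
S3_BUCKET_RE = re.compile(r"[\w-]+\.s3\.amazonaws\.com")
AWS_KEY_RE = re.compile(r"AKIA[0-9A-Z]{16}")  # AWS access key ID shape

def extract_intel(page_text):
    """Return a dict of intel-like strings found in a page's text."""
    return {
        "emails": EMAIL_RE.findall(page_text),
        "buckets": S3_BUCKET_RE.findall(page_text),
        "aws_keys": AWS_KEY_RE.findall(page_text),
    }

sample = ("Contact admin@example.com; assets at files.s3.amazonaws.com, "
          "key AKIAABCDEFGHIJKLMNOP")
print(extract_intel(sample))
```

Running the sketch over each fetched page and merging the results is, in essence, what the "intel" portion of a crawl produces.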

Use cases include reconnaissance during penetration testing, OSINT investigations, and security assessments where comprehensive web crawling is needed to map out a target's digital footprint. The tool organizes extracted information for easy analysis or exports it as JSON or CSV for further processing.

It supports customizable crawling levels, threading for speed, delays to respect rate limits, and various output options, making it versatile for both quick scans and deep dives into web applications.

How It Works

Photon operates as a multi-threaded web crawler starting from a root URL, recursively following links up to specified levels. It uses HTTP requests with optional cookies, custom user agents, and headers to mimic browser behavior, respecting timeouts and delays. During crawling, it parses responses to identify and extract targeted data types using pattern matching, regex, and heuristics for URLs, parameters, intel, files, keys, JS endpoints, and DNS/subdomains. Data is organized into directories or exported in structured formats like JSON.
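The crawl loop can be pictured as a level-bounded, multi-threaded breadth-first search. The sketch below is a simplified Python illustration of that idea, not Photon's actual implementation; the caller supplies `fetch` (in practice an HTTP GET with cookies, custom headers, and a timeout):

```python
import re
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urljoin

# Naive link pattern for illustration; a real crawler parses HTML properly.
LINK_RE = re.compile(r'href=["\'](.*?)["\']')

def crawl(root, fetch, levels=2, threads=4, delay=0.0):
    """Breadth-first crawl up to `levels` deep, fetching each level's
    pages in a thread pool. Returns the set of discovered URLs."""
    def paced_fetch(url):
        if delay:
            time.sleep(delay)  # crude per-request pacing in each worker
        return fetch(url)

    seen = {root}
    frontier = [root]
    for _ in range(levels):
        with ThreadPoolExecutor(max_workers=threads) as pool:
            pages = list(pool.map(paced_fetch, frontier))
        next_frontier = []
        for url, html in zip(frontier, pages):
            for link in LINK_RE.findall(html):
                absolute = urljoin(url, link)  # resolve relative links
                if absolute not in seen:
                    seen.add(absolute)
                    next_frontier.append(absolute)
        frontier = next_frontier
    return seen

# Exercise the crawler against an in-memory "site" instead of the network.
site = {
    "http://example.test/":  '<a href="/a">a</a> <a href="/b">b</a>',
    "http://example.test/a": '<a href="/c">c</a>',
}
found = crawl("http://example.test/", lambda u: site.get(u, ""), levels=2)
print(sorted(found))
```

The `-l`, `-t`, and `-d` flags map onto the `levels`, `threads`, and `delay` parameters of this sketch.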

Installation

```bash
sudo apt install photon
```

Flags

-h, --help                       show this help message and exit
-u, --url ROOT                   root URL
-c, --cookie COOK                cookie
-r, --regex REGEX                regex pattern
-e, --export {csv,json}          export format
-o, --output OUTPUT              output directory
-l, --level LEVEL                levels to crawl
-t, --threads THREADS            number of threads
-d, --delay DELAY                delay between requests
-v                               verbose output
-s, --seeds SEEDS [SEEDS ...]    seed URLs
--stdout STD                     stdout output
--user-agent USER_AGENT          custom user agent
--exclude EXCLUDE                exclude paths
--timeout TIMEOUT                request timeout
--clone                          clone resources
--headers                        extract headers
--dns                            DNS enumeration
--keys                           extract keys
--only-urls                      only extract URLs
--wayback                        use Wayback Machine

Examples

Display the help message and usage information for Photon.

```bash
photon -h
```

Crawl the root URL example.com and extract data like URLs, intel, files, and keys.

```bash
photon -u https://example.com
```

Crawl example.com up to 2 levels deep.

```bash
photon -u https://example.com -l 2
```

Crawl example.com and save organized output to the specified directory.

```bash
photon -u https://example.com -o output_dir
```

Crawl example.com and export extracted data as JSON.

```bash
photon -u https://example.com -e json
```

Crawl example.com and extract strings matching the custom regex pattern.

```bash
photon -u https://example.com -r "api_key:[a-zA-Z0-9]{32}"
```

Crawl example.com with 50 threads and a 0.1-second delay between requests.

```bash
photon -u https://example.com -t 50 -d 0.1
```

Crawl example.com with DNS enumeration, key extraction, and Wayback Machine integration.

```bash
photon -u https://example.com --dns --keys --wayback
```
Updated 2026-04-16 · kali.org