
CeWL

CeWL is a custom word list generator that spiders a given URL to a specified depth and extracts words for password cracking. It can also generate email addresses from mailto links and extract usernames from file metadata via FAB.

Description

CeWL (Custom Word List generator) is a Ruby application that spiders a given URL, up to a specified depth, and returns a list of words that can be fed to password crackers such as John the Ripper. Optionally, CeWL can follow external links. CeWL can also build a list of the email addresses found in mailto links; these addresses can then serve as usernames in brute-force attacks.
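As a rough illustration of the mailto harvesting described above, the sketch below pulls addresses out of mailto: links in a piece of HTML using standard shell tools. It is not CeWL's own code: the function name extract_mailto and the sample markup are invented for the example, and grep -o is assumed to be available (GNU and BSD grep both provide it).

```shell
# Illustrative only: approximates mailto harvesting, not CeWL's actual logic.
# extract_mailto reads HTML on stdin and prints each unique address once.
extract_mailto() {
  # Match "mailto:" up to a quote, query string, tag end, or space.
  grep -oiE "mailto:[^\"'?> ]+" |
    sed -e 's/^[^:]*://' |      # drop the "mailto:" prefix
    LC_ALL=C sort -u            # de-duplicate with a stable ordering
}

# Hypothetical sample markup.
extract_mailto <<'EOF'
<p>Contact <a href="mailto:alice@example.com">Alice</a> or
<a href='mailto:bob@example.com?subject=Hello'>Bob</a>.</p>
EOF
```

The query string after the second address (?subject=Hello) is cut off, so only the bare address is kept.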

The CeWL project also provides FAB (Files Already Bagged). FAB extracts the contents of the author/creator fields from the metadata of supported files to build lists of possible usernames, which can be paired with the password list generated by CeWL. FAB uses the same metadata extraction techniques as CeWL and currently processes pre-2007 Office, Office 2007, and PDF formats.
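To make the author/creator idea concrete, here is a minimal sketch for the Office 2007 case only. It assumes the document's core properties XML (stored as docProps/core.xml inside the Office 2007 zip container) is supplied on stdin, and it reads the dc:creator element. This is an approximation written for illustration, not FAB's implementation; the function name extract_creator and the sample XML are hypothetical.

```shell
# Illustrative only: not FAB's code.
# extract_creator reads core-properties XML on stdin and prints the
# text of the dc:creator element, if present.
extract_creator() {
  sed -n 's|.*<dc:creator>\([^<]*\)</dc:creator>.*|\1|p'
}

# Hypothetical sample of a core properties file.
extract_creator <<'EOF'
<cp:coreProperties><dc:title>Q3 report</dc:title><dc:creator>Jane Doe</dc:creator></cp:coreProperties>
EOF
```

With unzip installed, the same function could be fed from a real file, for example `unzip -p report.docx docProps/core.xml | extract_creator`, where report.docx is a placeholder name.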

CeWL is useful in security tests and forensic investigations. CeWL is pronounced 'cool'.

How It Works

CeWL spiders the target URL using Ruby libraries such as ruby-spider and ruby-nokogiri, crawling to the specified depth and extracting words that meet the minimum length. It parses the HTML content to build the word list and can optionally follow offsite links, authenticate with digest or basic authentication, route requests through a proxy, and send custom headers. The FAB component extracts metadata (author/creator fields) from Office and PDF files using ruby-mini-exiftool and ruby-zip.
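The word-extraction step can be approximated with a short shell pipeline. This is a simplified sketch under stated assumptions, not how CeWL is implemented: it strips tags with a naive sed expression instead of a real HTML parser, treats any run of letters as a word, and does no crawling. The function name extract_words and the sample page are invented for the example.

```shell
# Illustrative only: a rough stand-in for CeWL's word extraction.
# extract_words MIN_LEN reads HTML on stdin and prints the unique
# words that are at least MIN_LEN letters long.
extract_words() {
  min_len="${1:-3}"                  # 3 mirrors CeWL's documented default
  sed -e 's/<[^>]*>/ /g' |           # drop tags (naive: tags must not span lines)
    tr -cs '[:alpha:]' '\n' |        # split on anything that is not a letter
    awk -v n="$min_len" 'length($0) >= n' |
    LC_ALL=C sort -u                 # de-duplicate with a stable ordering
}

# Hypothetical sample page.
sample_html='<html><body><h1>Acme Widgets</h1><p>Our team in Springfield builds it.</p></body></html>'
printf '%s\n' "$sample_html" | extract_words 5
```

Raising the argument has the same effect as a larger -m value: short, low-value words drop out and the list shrinks.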

Installation

sudo apt install cewl

Flags

-h, --help: Show help.
-k, --keep: Keep the downloaded file.
-d <x>, --depth <x>: Depth to spider to, default 2.
-m, --min_word_length: Minimum word length, default 3.
-x, --max_word_length: Maximum word length, default unset.
-o, --offsite: Let the spider visit other sites.
-w, --write: Write the output to the given file.
--exclude: A file containing a list of paths to exclude.
--auth_type: Digest or basic.
--auth_user: Authentication username.
--auth_pass: Authentication password.
--proxy_host: Proxy host.
--proxy_port: Proxy port, default 8080.
--proxy_username: Username for proxy, if required.
--proxy_password: Password for proxy, if required.
--header, -H: Header in the format name:value; can be passed multiple times.
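The proxy and header flags can be combined with the spidering options in a single call. The sketch below only assembles and prints such a command (a dry run) so it can be reviewed before anything is sent. The proxy address, header names and values, and target are placeholder assumptions, not values taken from the CeWL documentation.

```shell
# Hypothetical values; replace with your own proxy, headers, and target.
target="https://example.com"
proxy_host="127.0.0.1"
proxy_port="8080"

# Build the argument list with "set --" so each value stays a single argument.
set -- -d 2 -m 5 \
  --proxy_host "$proxy_host" --proxy_port "$proxy_port" \
  -H "X-Scan-Id:demo" -H "Accept-Language:en" \
  "$target"

# Dry run: print the command instead of executing it.
cmd="cewl $*"
printf '%s\n' "$cmd"

# To run it for real (requires cewl and authorization to test the target):
#   cewl "$@"
```

Each -H adds one header in name:value form, matching the flag description above.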

Examples

Spider the given URL to a depth of 2 (-d 2), keep words of at least 5 characters (-m 5), and save them to a file (-w docswords.txt).
cewl -d 2 -m 5 -w docswords.txt https://example.com
Show help for CeWL.
cewl -h
Spider example.com to depth 2, using the default minimum word length of 3.
cewl -d 2 https://example.com
Generate a word list from example.com with a minimum word length of 5, using the default depth of 2.
cewl -m 5 https://example.com
Allow the spider to follow offsite links from example.com.
cewl -o https://example.com
Use basic authentication to spider example.com.
cewl --auth_type basic --auth_user user --auth_pass pass https://example.com
Show help for FAB metadata extractor.
fab-cewl -h
Extract metadata from a file or a list of files.
fab-cewl filename/list
Updated 2026-04-16. Source: kali.org