Web Application Analysis | robots.txt, disallow, bing, webaudit

Parsero

Parsero audits a target's robots.txt file by parsing its Disallow entries and checking the HTTP status code each disallowed path returns. It reveals potentially sensitive directories or files that search engines are instructed not to index.

Description

Parsero is a Python script that reads the Robots.txt file from a web server and examines the Disallow entries. These entries specify directories or files that should not be indexed by search engines like Google, Bing, or Yahoo. For instance, 'Disallow: /portal/login' prevents crawlers from indexing content at www.example.com/portal/login, helping administrators protect sensitive information from being shared publicly.
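A robots.txt file with Disallow directives looks like the following (the paths are illustrative): a `User-agent` line names which crawlers the rules apply to, and each `Disallow` line lists a path they should skip.

```
User-agent: *
Disallow: /portal/login
Disallow: /backup/
```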

The tool is useful for security audits, identifying exposed paths that might contain private data, admin panels, or other restricted areas. It can analyze a single URL or a list of domains, and can optionally query Bing to see which disallowed paths have been indexed despite the directives.

By simulating requests to these disallowed paths, Parsero reports HTTP status codes such as 200 OK, 404 Not Found, or redirects, providing insight into server configurations and potential misconfigurations.

How It Works

Parsero fetches the Robots.txt file from the target URL, parses the Disallow directives, and sends HTTP requests to each listed path. It reports the response status codes (e.g., 200 OK, 404 Not Found, 301 Moved Permanently). With -sb, it searches Bing for indexed Disallow entries from Robots.txt. The tool uses Python libraries like BeautifulSoup (bs4) and urllib3 for parsing and HTTP requests.
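That core loop can be sketched in a few lines of standard-library Python. This is an illustration, not Parsero's own code: the function names and User-Agent string are invented here, the real tool uses urllib3 and BeautifulSoup, and the Bing lookup (`-sb`) is omitted.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin


def parse_disallows(robots_text):
    """Collect the path from every 'Disallow:' directive in a robots.txt body."""
    paths = []
    for line in robots_text.splitlines():
        line = line.strip()
        if line.lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path:  # a bare 'Disallow:' (no path) disallows nothing
                paths.append(path)
    return paths


def audit(base_url, fetch=None):
    """Fetch robots.txt from base_url, request each disallowed path,
    and map each path to the HTTP status code it returned."""
    if fetch is None:  # default fetcher; injectable so the logic is testable offline
        def fetch(url):
            req = urllib.request.Request(url, headers={"User-Agent": "robots-audit-sketch"})
            try:
                with urllib.request.urlopen(req) as resp:
                    return resp.status, resp.read().decode(errors="replace")
            except urllib.error.HTTPError as err:
                return err.code, ""
    _, robots_body = fetch(urljoin(base_url, "/robots.txt"))
    return {path: fetch(urljoin(base_url, path))[0]
            for path in parse_disallows(robots_body)}
```

A path that answers 200 OK is reachable despite the directive, which is exactly the kind of finding Parsero highlights (and what `-o` filters for).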

Installation

```bash
sudo apt install parsero
```

Flags

-h, --help  show this help message and exit
-u URL      the URL which will be analyzed
-o          show only the "HTTP 200" status code
-sb         search in Bing indexed Disallows
-f FILE     scan a list of domains from a file

Examples

Search Bing for indexed Disallow entries of www.bing.com

```bash
parsero -u www.bing.com -sb
```

Analyze the robots.txt Disallows of www.bing.com without the Bing search

```bash
parsero -u www.bing.com
```

Analyze www.bing.com and show only paths that return HTTP 200

```bash
parsero -u www.bing.com -o
```

Scan a list of domains from a file

```bash
parsero -f domains.txt
```

Search Bing indexed Disallows for example.com and show only 200 OK responses

```bash
parsero -u example.com -sb -o
```

Display the help message and usage options

```bash
parsero -h
```
Updated 2026-04-16 (kali.org)