OPERATIONAL SECURITY & OSINT TACTICAL GUIDE

Google Dorking on .gov.uk

Google Dorking (the use of advanced search operators) is a powerful method for mapping attack surfaces, discovering forgotten subdomains, and finding leaked documents that contain sensitive metadata across the UK government's digital estate.

Technique 01 // Subdomain Discovery

Mapping the Perimeter

Most users only ever interact with www.gov.uk. However, hundreds of undocumented local, regional, and staging subdomains exist. The exclusion operator (-) removes the main portal from the results so that the remaining, less visible hosts surface.

This helps identify legacy platforms, internal dashboards that have accidentally been exposed to search engine spiders, and localized services.

ALL SUBDOMAINS EXCLUDING MAIN PORTAL
site:*.gov.uk -site:www.gov.uk
FIND STAGING / DEV / TEST SITES
site:*.gov.uk inurl:dev OR inurl:test OR inurl:staging
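The two queries above share one shape: a site: scope, optional -site: exclusions, and a set of OR-joined alternatives. A minimal sketch of composing such strings follows. The function name build_dork and its parameters are hypothetical and only illustrate how the pieces fit together; the function builds text and sends nothing to a search engine.

```python
def build_dork(site, exclude_sites=(), any_of=()):
    """Compose a search query string from a site scope and optional filters."""
    parts = [f"site:{site}"]
    # Each excluded host becomes its own -site: term.
    parts += [f"-site:{host}" for host in exclude_sites]
    # Alternatives are joined with OR so that any one of them may match.
    if any_of:
        parts.append(" OR ".join(any_of))
    return " ".join(parts)


print(build_dork("*.gov.uk", exclude_sites=["www.gov.uk"]))
print(build_dork("*.gov.uk", any_of=["inurl:dev", "inurl:test", "inurl:staging"]))
```

Keeping the scope, the exclusions, and the alternatives as separate arguments makes it easy to vary one part of a query while leaving the rest unchanged.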
Figure 1: Finding unlinked subdomains under the gov.uk parent space.

Technique 02 // Document & Metadata Harvesting

Extracting Hidden Metadata

Government organizations publish thousands of office documents annually. Older formats (such as .doc, .xls, and .ppt) store embedded document properties, which can include:

  • Author name & organization structure
  • Local file paths and template paths (which can reveal internal Active Directory usernames)
  • Software suites & OS build versions
  • Internal network printer paths
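To see what the properties listed above look like in practice, the following is a minimal sketch that reads them from a local file. It assumes a newer OOXML file (.docx, .xlsx, .pptx) rather than the legacy binary formats discussed here, because OOXML is a ZIP package that the Python standard library can open without a dedicated parser; legacy .doc, .xls, and .ppt files need an OLE reader instead. The function name read_properties and the selection of fields are illustrative choices, not a complete inventory.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by docProps/core.xml inside an OOXML package.
CORE_NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}
# Namespace used by docProps/app.xml (extended properties).
APP_NS = "{http://schemas.openxmlformats.org/officeDocument/2006/extended-properties}"


def read_properties(source):
    """Return selected properties from a local OOXML file (path or file object)."""
    props = {}
    with zipfile.ZipFile(source) as package:
        names = package.namelist()
        if "docProps/core.xml" in names:
            core = ET.fromstring(package.read("docProps/core.xml"))
            for label, tag in [("author", "dc:creator"),
                               ("last_modified_by", "cp:lastModifiedBy"),
                               ("created", "dcterms:created")]:
                node = core.find(tag, CORE_NS)
                if node is not None and node.text:
                    props[label] = node.text
        if "docProps/app.xml" in names:
            app = ET.fromstring(package.read("docProps/app.xml"))
            for label, tag in [("application", "Application"),
                               ("app_version", "AppVersion"),
                               ("company", "Company"),
                               ("template", "Template")]:
                node = app.find(APP_NS + tag)
                if node is not None and node.text:
                    props[label] = node.text
    return props
```

Calling read_properties("report.docx") on a hypothetical local file returns a dictionary containing whichever of these fields the file actually stores; missing fields are simply omitted.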
SEARCH FOR MICROSOFT OFFICE FILES
site:gov.uk filetype:doc OR filetype:xls OR filetype:ppt
EXPOSED CONFIGURATION AND DATABASE DUMPS
site:gov.uk filetype:sql OR filetype:env OR filetype:ini
Figure 2: Identifying indexed legacy .doc files under gov.uk domains.

Technique 03 // Sequential ID Discovery & IDOR Mapping

Brute-forcing Sequential Assets (e.g., Cafcass Case Study)

A common design pattern on CMS setups is hosting media attachments inside directories structured with auto-incrementing numerical IDs. A classic example is Cafcass (Children and Family Court Advisory and Support Service).

When file paths follow a structure such as https://www.cafcass.gov.uk/media/1042/, an analyst or attacker can easily automate a crawler to cycle from 1 to 10000. This can uncover PDFs, drafts, or media attachments that were uploaded to the backend but never linked on public navigation pages.

Google Dorks can be used to query these media directories directly. If the search engine has crawled them, we can view their listing even without direct links.

CRAWL INDEXED CAFCASS MEDIA AND ATTACHMENTS
site:cafcass.gov.uk/media/
Figure 3: Cafcass media files stored sequentially.

Directory Listing Indexes

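When a web server does not disable directory indexing, requesting a folder path that has no index page returns an auto-generated HTML listing of the files in that folder. These pages typically carry a title beginning with "Index of", so combining intitle:"index of" with a directory or file name helps threat researchers find exposed listings and the directory structures behind them.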

Figure 4: Index of directories indicating raw uploaded files.
INDEX OF SENSITIVE ARCHIVE FORMATS
site:gov.uk intitle:"index of" "backup" OR "zip"
WP-CONTENT UPLOADS DIRECTORY LISTINGS
site:gov.uk intitle:"index of" "wp-content/uploads"
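The listing pages described above can be recognised from their HTML alone. The following is a minimal sketch that works on a page already held as a string, assuming the common Apache and nginx convention of a title beginning with "Index of". The names ListingParser and parse_listing, the sample markup, and the rule for skipping parent and sort links are all illustrative assumptions; real listings vary by server and configuration.

```python
from html.parser import HTMLParser


class ListingParser(HTMLParser):
    """Collect the page title and link targets from an HTML document."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def parse_listing(html):
    """Return (is_listing, entries) for an HTML page given as a string."""
    parser = ListingParser()
    parser.feed(html)
    is_listing = parser.title.strip().lower().startswith("index of")
    # Skip the parent-directory link and column-sort links such as "?C=N;O=D".
    entries = [h for h in parser.links if h != "../" and not h.startswith("?")]
    return is_listing, (entries if is_listing else [])


sample = """<html><head><title>Index of /uploads</title></head><body>
<a href="?C=N;O=D">Name</a> <a href="../">Parent Directory</a>
<a href="2019/">2019/</a> <a href="notes.txt">notes.txt</a>
</body></html>"""
print(parse_listing(sample))
```

Site owners can run the same check against pages on their own servers to confirm that directory indexing is switched off where it should be.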