Ethical Hacking #OSINT#reconnaissance#ethical hacking

What Is OSINT? Open Source Intelligence for Ethical Hackers

Master OSINT techniques for reconnaissance: Google dorking, Shodan, Maltego, social media investigation, and public record searches.


OSINT (Open Source Intelligence) is the art of gathering intelligence from publicly available sources. For ethical hackers, OSINT is often the first and most critical phase of a security assessment. Unlike network scanning, OSINT requires no direct access to a system — it’s pure detective work using public data.

Why OSINT Matters

Reconnaissance takes up a large share of a penetration test, by some estimates 30-50% of the engagement time, and much of it happens before any packet reaches the target. OSINT provides:

  • Organization structure and employee lists (LinkedIn, company websites)
  • Infrastructure details (IP ranges, domain registrations, server locations)
  • Security misconfigurations (publicly exposed files, hardcoded credentials)
  • Social engineering intelligence (email formats, employee names, job titles)
  • Technology stack (frameworks, CMS versions, third-party services)

The more you know before touching the network, the more surgical and effective your testing becomes.

Core OSINT Techniques

1. Google Dorking

Google is a reconnaissance goldmine if you know the right syntax. Google dorking uses special search operators to find specific information.

Common operators:

Operator   | Purpose                                              | Example
site:      | Search within a domain                               | site:company.com filetype:pdf
intitle:   | Search page titles                                   | intitle:"admin" site:company.com
inurl:     | Search URLs                                          | inurl:admin login
filetype:  | Find specific file types                             | filetype:xlsx credentials
cache:     | View Google’s cached version (retired by Google in 2024; use a web archive instead) | cache:company.com
-          | Exclude results                                      | site:company.com -blog

Practical examples:

# Find exposed backup files
site:company.com filetype:sql OR filetype:bak

# Discover admin panels
site:company.com intitle:"admin" inurl:login

# Find exposed AWS S3 buckets
site:s3.amazonaws.com "company"

# Locate exposed API keys in public GitHub content
site:github.com "company" ("api_key" OR "secret")
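The queries above follow a regular pattern, so they can be generated rather than typed by hand. The following is a minimal sketch, not a standard tool: `build_dork` and its parameters are hypothetical names chosen for illustration, and it only assembles the query string (it does not contact Google).

```python
def build_dork(site=None, filetypes=(), intitle=None, inurl=None, terms=(), exclude=()):
    """Compose a search query string from common dork operators."""
    parts = []
    if site:
        parts.append(f"site:{site}")
    if filetypes:
        # Several file types are alternatives, so join them with OR and group them.
        alternatives = " OR ".join(f"filetype:{ext}" for ext in filetypes)
        parts.append(f"({alternatives})" if len(filetypes) > 1 else alternatives)
    if intitle:
        parts.append(f'intitle:"{intitle}"')
    if inurl:
        parts.append(f"inurl:{inurl}")
    # Free-text terms are quoted so they are matched as exact phrases.
    parts.extend(f'"{term}"' for term in terms)
    # A leading minus excludes results containing the word.
    parts.extend(f"-{word}" for word in exclude)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_dork(site="company.com", filetypes=["sql", "bak"]))
    print(build_dork(site="company.com", intitle="admin", inurl="login"))
    print(build_dork(site="company.com", exclude=["blog"]))
```

Keeping the generated queries in a file alongside your notes also makes it easy to show later exactly which searches were run.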

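Caution: Finding a URL through dorking does not authorize you to use it. Accessing pages or files that sit behind a login, or that fall outside your agreed scope, can be illegal even when a search engine has indexed them. Reading search result snippets and archived copies is generally treated as passive reconnaissance.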

2. Shodan (The “Google for Devices”)

Shodan scans the internet for exposed services and devices. Unlike Google, Shodan indexes service banners, headers, and metadata.

Setting up Shodan:

Visit https://www.shodan.io and create a free account. The CLI is optional but powerful.

Free account includes:

  • Basic search capabilities
  • Limited results per query
  • API access (limited requests)

Practical Shodan searches:

# Find all Apache servers for a company
org:"Company Name" Apache

# Discover exposed CCTV cameras
port:8080 http.title:"IP Camera"

# Locate Elasticsearch instances
port:9200 "elasticsearch"

# Find exposed databases
port:27017 mongodb

# Discover specific company infrastructure
org:"Company Name" country:"US"

Interpretation example:

IP: 203.0.113.45
Port: 80
Service: Apache/2.4.41 (Ubuntu)
Location: San Francisco, USA
Organization: Company Name Inc.

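This tells you what service and version the host advertised when it was scanned, roughly where it is hosted, and which organization the IP range is registered to. Banners can be stale or deliberately misleading, so treat each record as a lead to verify rather than a confirmed fact.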
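When you collect many such records, it helps to split the service string into structured fields. The sketch below is an illustration under assumptions: `parse_banner` is a hypothetical helper, and it only handles the common `Product/version (OS hint)` shape seen in the example record, not every banner format that exists.

```python
import re

# Matches strings such as "Apache/2.4.41 (Ubuntu)" or a bare "nginx".
BANNER_RE = re.compile(
    r"^(?P<product>[^/\s]+)"          # product name up to "/" or whitespace
    r"(?:/(?P<version>\S+))?"         # optional "/version"
    r"(?:\s+\((?P<os_hint>[^)]+)\))?"  # optional "(OS hint)"
)


def parse_banner(header):
    """Split a server banner into product, version and OS hint (None if absent)."""
    match = BANNER_RE.match(header.strip())
    if not match:
        return None
    return match.groupdict()


if __name__ == "__main__":
    print(parse_banner("Apache/2.4.41 (Ubuntu)"))
    print(parse_banner("nginx"))
```

The parsed version is what you would compare against published advisories, keeping in mind that administrators can alter or hide banners.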

3. Maltego (Visual Investigation)

Maltego is a visual OSINT platform that maps relationships between data points. It transforms scattered information into connected graphs.

Maltego is distributed as a desktop application, and a free Community Edition is available.

What Maltego can do:

  • Link email addresses to social accounts
  • Discover DNS records and MX servers
  • Map organizational relationships
  • Visualize domain connections
  • Identify related infrastructure

Basic workflow:

  1. Add an entity (domain, email, IP, person)
  2. Select transforms (lookup functions)
  3. Visualize the resulting graph
  4. Pivot on new discoveries

For example: start with “company.com” → discover MX records and mail servers → collect published addresses to infer the email format → search LinkedIn for employees → find their social media accounts.
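The pivot loop in this workflow is essentially a breadth-first walk over a graph of entities. The sketch below illustrates the idea only; it does not use Maltego or its API. `SAMPLE_TRANSFORMS`, `run_transform`, and `pivot` are hypothetical names, and the data is hard-coded and invented for the example.

```python
from collections import deque

# Invented results standing in for real transform output.
SAMPLE_TRANSFORMS = {
    "company.com": ["mail.company.com", "jane.doe@company.com"],
    "mail.company.com": ["203.0.113.45"],
    "jane.doe@company.com": ["github.com/janedoe"],
}


def run_transform(entity):
    """Stand-in for a transform: return the entities related to `entity`."""
    return SAMPLE_TRANSFORMS.get(entity, [])


def pivot(seed, max_depth=2):
    """Expand outward from a seed entity; returns {entity: [related entities]}."""
    graph = {}
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        entity, depth = queue.popleft()
        if depth >= max_depth:
            continue  # depth limit keeps the graph from growing without bound
        related = run_transform(entity)
        graph[entity] = related
        for item in related:
            if item not in seen:
                seen.add(item)
                queue.append((item, depth + 1))
    return graph


if __name__ == "__main__":
    for entity, related in pivot("company.com").items():
        print(entity, "->", related)
```

The depth limit matters in practice: each pivot multiplies the number of entities, and results far from the seed are increasingly likely to be unrelated to the target.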

4. DNS Reconnaissance

DNS contains enormous amounts of intelligence about infrastructure.

Tools and commands:

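# Query specific record types (many servers refuse or minimize ANY queries, per RFC 8482)
nslookup company.com
dig company.com NS +short
dig company.com MX +short
dig company.com TXT +short

# Enumerate subdomains by brute force (-t brt) using a wordlist
dnsrecon -d company.com -D subdomains.txt -t brt

# Zone transfer (succeeds only against misconfigured name servers)
dig @ns1.company.com company.com axfr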

# Reverse DNS lookup
nslookup 203.0.113.45

# Certificate-based subdomain discovery (print unique names only)
curl -s "https://crt.sh/?q=%25.company.com&output=json" | jq -r '.[].name_value' | sort -u

What to look for:

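  • Multiple DNS providers (may indicate redundancy or separately managed infrastructure)
  • Subdomains revealing internal structure (for example vpn, dev, or staging hosts)
  • MX records showing mail servers and the email provider
  • TXT records with SPF and DMARC policies, DKIM keys, and third-party service verification tokens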
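SPF records are worth a closer look because their `include:` entries name the third-party services allowed to send mail for the domain. The sketch below parses one SPF string; `parse_spf` is a hypothetical helper, the sample record is invented, and only the `include:`, `ip4:`, and `all` mechanisms are handled.

```python
def parse_spf(txt_record):
    """Extract include domains, IPv4 ranges and the 'all' policy from an SPF record.

    Returns None when the TXT record is not an SPF record.
    """
    tokens = txt_record.strip().strip('"').split()
    if not tokens or tokens[0].lower() != "v=spf1":
        return None
    result = {"includes": [], "ip4": [], "all": None}
    for token in tokens[1:]:
        lowered = token.lower()
        if lowered.startswith("include:"):
            result["includes"].append(token.split(":", 1)[1])
        elif lowered.startswith("ip4:"):
            result["ip4"].append(token.split(":", 1)[1])
        elif lowered in ("all", "+all", "-all", "~all", "?all"):
            result["all"] = lowered
    return result


if __name__ == "__main__":
    sample = "v=spf1 include:_spf.google.com include:sendgrid.net ip4:203.0.113.0/24 -all"
    print(parse_spf(sample))
```

In this invented record, the includes suggest hosted mail plus a bulk-mail service, and the `ip4` range points at infrastructure the organization runs itself.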

5. Email and Employee Discovery

LinkedIn is a primary source for organizational intelligence.

Manual reconnaissance:

  • Search employees by company
  • Note job titles, departments, and skills
  • Identify email format (firstname.lastname@, firstinitial.lastname@, etc.)
  • Document email addresses and names

Email validation tools:

  • Hunter
  • Clearbit

Example workflow:

  1. Identify company email format from LinkedIn
  2. Use Hunter or Clearbit to validate guessed addresses
  3. Cross-reference employees with social media (Twitter, GitHub)
  4. Document tech stack and insider knowledge shared publicly
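Steps 1 and 2 of this workflow can be sketched in a few lines: infer the format from one known address, then generate candidates for other names. This is an illustration only. `PATTERNS`, `candidate_emails`, and `infer_pattern` are hypothetical names, the pattern list is a small assumed sample, and the code generates strings without contacting any mail server or validation service.

```python
# A few common corporate address formats (an illustrative, incomplete list).
PATTERNS = {
    "first.last": "{first}.{last}",
    "f.last": "{f}.{last}",
    "flast": "{f}{last}",
    "firstl": "{first}{l}",
    "first": "{first}",
}


def candidate_emails(full_name, domain):
    """Return {pattern_name: address} for a person's name at a domain."""
    parts = full_name.lower().split()
    if len(parts) < 2:
        raise ValueError("expected at least a first and last name")
    first, last = parts[0], parts[-1]
    fields = {"first": first, "last": last, "f": first[0], "l": last[0]}
    return {name: f"{template.format(**fields)}@{domain}" for name, template in PATTERNS.items()}


def infer_pattern(known_email, full_name):
    """Return the pattern name that reproduces a known address, or None."""
    domain = known_email.split("@", 1)[1]
    for name, address in candidate_emails(full_name, domain).items():
        if address == known_email.lower():
            return name
    return None


if __name__ == "__main__":
    pattern = infer_pattern("jdoe@company.com", "Jane Doe")
    print(pattern)
    print(candidate_emails("John Smith", "company.com")[pattern])
```

Candidates produced this way are guesses until confirmed, which is why the workflow pairs them with a validation step.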

6. Certificate and SSL Analysis

SSL certificates contain identifying information and are searchable.

# Query Certificate Transparency logs
curl -s "https://crt.sh/?q=company.com&output=json" | jq

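# Examine the certificate a server presents (subject, issuer, SANs)
echo | openssl s_client -connect company.com:443 -servername company.com 2>/dev/null | openssl x509 -noout -text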

# Use SSL tools for broader discovery
echo company.com | python3 sslsubdomains.py

Certificates reveal:

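  • Hostnames listed in the CN and SAN fields (wildcard certificates hide individual subdomains)
  • The organization name, when the certificate is organization-validated (OV) or extended-validation (EV)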
  • Certificate history
  • Organizational structure hints
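Certificate Transparency results usually need cleaning before they are useful: entries repeat, some are wildcards, and one entry can hold several names. The sketch below works on JSON text you have already downloaded. `extract_hostnames` is a hypothetical helper, the sample data is invented, and the `name_value` field name is an assumption based on crt.sh's JSON output at the time of writing.

```python
import json


def extract_hostnames(crtsh_json, domain):
    """Return sorted unique hostnames under `domain` from crt.sh-style JSON text."""
    names = set()
    for entry in json.loads(crtsh_json):
        # One entry may list several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # wildcard entry: keep the base name
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return sorted(names)


if __name__ == "__main__":
    sample = json.dumps([
        {"name_value": "www.company.com\nmail.company.com"},
        {"name_value": "*.dev.company.com"},
        {"name_value": "WWW.company.com"},
        {"name_value": "unrelated.example.org"},
    ])
    print(extract_hostnames(sample, "company.com"))
```

The suffix check with a leading dot is deliberate: it keeps `mail.company.com` but rejects look-alikes such as `notcompany.com`.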

7. GitHub and Public Repositories

Developers often commit sensitive data to public repositories.

Search techniques:

# Find exposed credentials (replace "company" with the organization's GitHub handle)
org:company password
org:company api_key
org:company secret

# Locate internal tools
org:company internal
org:company tools

# Find configuration files
filename:config.php company
filename:.env

Tools:

  • Gitrob - Scans an organization's public GitHub repositories for potentially sensitive files
  • TruffleHog - Searches for secrets in git history

Always use these responsibly. If you find credentials, report them to the organization’s security team.
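To give a feel for what secret scanners look for, here is a heavily simplified sketch of pattern-based detection. It is not how Gitrob or TruffleHog are implemented. `SECRET_PATTERNS` and `scan_text` are hypothetical names, the two patterns are an assumed minimal set, and the sample key is the placeholder value used in AWS documentation.

```python
import re

SECRET_PATTERNS = {
    # AWS access key IDs start with "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Assignments such as api_key = "..." with a quoted value of 8+ characters.
    "generic_assignment": re.compile(
        r"(?i)\b(api_key|secret|password)\b\s*[:=]\s*['\"]([^'\"]{8,})['\"]"
    ),
}


def scan_text(text):
    """Return (line_number, pattern_name) pairs for lines that resemble secrets."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


if __name__ == "__main__":
    sample = (
        'DB_HOST = "localhost"\n'
        'api_key = "0123456789abcdef"\n'
        "aws_id = AKIAIOSFODNN7EXAMPLE\n"
    )
    print(scan_text(sample))
```

Pattern matching like this produces false positives, so each hit still needs a human to decide whether it is a live credential worth reporting.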

OSINT Workflow Example

Target: Assess company.com

  1. DNS enumeration → Find all subdomains and IP ranges
  2. Shodan/Certificate search → Identify exposed services
  3. LinkedIn research → Understand organizational structure, tech stack
  4. GitHub search → Look for publicly exposed code and configurations
  5. Google dorking → Discover exposed files, admin panels, public data
  6. Social media → Find employee accounts, technical details
  7. Whois/Domain registry → Registrant information, name servers
  8. Compile findings → Create attack surface map
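The final step, compiling findings, is easier if every discovery is recorded in the same shape from the start. The sketch below is one possible structure, assuming a simple record of asset, detail, and source. `Finding` and `build_surface_map` are hypothetical names, and the sample entries are invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    asset: str   # hostname, IP address, or email address
    detail: str  # what was observed
    source: str  # where it was found, so it can be re-checked later


def build_surface_map(findings):
    """Group findings by asset, dropping exact duplicates and keeping order."""
    surface = defaultdict(list)
    for finding in findings:
        if finding not in surface[finding.asset]:
            surface[finding.asset].append(finding)
    return dict(surface)


if __name__ == "__main__":
    findings = [
        Finding("mail.company.com", "MX record", "dig"),
        Finding("mail.company.com", "MX record", "dig"),
        Finding("mail.company.com", "Listed in certificate SAN", "crt.sh"),
        Finding("203.0.113.45", "Apache/2.4.41 on port 80", "Shodan"),
    ]
    for asset, items in build_surface_map(findings).items():
        print(asset)
        for item in items:
            print("  -", item.detail, f"({item.source})")
```

Recording the source with every finding means the same asset confirmed by two independent sources stands out, and a reviewer can trace each claim back to where it came from.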

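OSINT is mostly passive reconnaissance: you analyze public information rather than accessing the target's systems. A few of the techniques above, such as zone transfer attempts, subdomain brute forcing, and direct TLS connections, do send traffic to the target and belong inside an authorized scope. In all cases: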

  • Know your scope: Only research authorized targets
  • Document everything: Keep records of what you found and where
  • Don’t access unauthorized content: Finding a link doesn’t mean you should click it
  • Don’t impersonate: Never pretend to be someone to gather information
  • Report findings: If you find exposed credentials or data, notify the organization
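The "know your scope" rule can be enforced mechanically before any tool runs against a discovered hostname. The sketch below is a minimal guard, assuming scope is expressed as a list of authorized domains. `in_scope` is a hypothetical helper, and a real engagement may also define scope by IP range or explicit exclusions.

```python
def in_scope(hostname, authorized_domains):
    """Return True if hostname equals or is a subdomain of an authorized domain."""
    host = hostname.lower().rstrip(".")  # tolerate a trailing dot from DNS output
    for domain in authorized_domains:
        domain = domain.lower().rstrip(".")
        # The leading dot prevents "evilcompany.com" from matching "company.com".
        if host == domain or host.endswith("." + domain):
            return True
    return False


if __name__ == "__main__":
    scope = ["company.com"]
    for host in ["mail.company.com", "company.com.", "evilcompany.com", "company.com.attacker.net"]:
        print(host, in_scope(host, scope))
```

Filtering discovered hostnames through a check like this is a cheap way to avoid touching a third party whose name merely resembles the target's.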

Practical Lab Exercise

Safe practice:

  1. Research your own company or a public organization’s digital footprint
  2. Document what’s discoverable without direct access
  3. Identify potential security risks in publicly available information
  4. Practice using Google dorking, Shodan, and Maltego
  5. Note what information should be private

Conclusion

OSINT is reconnaissance at its purest — turning public information into actionable intelligence. It requires no special tools (though tools help), just curiosity, patience, and systematic thinking. Master OSINT and you’ll understand the organization’s attack surface before typing a single command on their network.

The best penetration test starts with the best reconnaissance.
