OSINT (Open Source Intelligence) is the art of gathering intelligence from publicly available sources. For ethical hackers, OSINT is often the first and most critical phase of a security assessment. Unlike network scanning, OSINT requires no direct access to a system — it’s pure detective work using public data.
Why OSINT Matters
Reconnaissance is often estimated to take 30-50% of a penetration tester's time, and most of it happens before any active testing begins. OSINT provides:
- Organization structure and employee lists (LinkedIn, company websites)
- Infrastructure details (IP ranges, domain registrations, server locations)
- Security misconfigurations (publicly exposed files, hardcoded credentials)
- Social engineering intelligence (email formats, employee names, job titles)
- Technology stack (frameworks, CMS versions, third-party services)
The more you know before touching the network, the more surgical and effective your testing becomes.
Core OSINT Techniques
1. Google Dorking (Advanced Search)
Google is a reconnaissance goldmine if you know the right syntax. Google dorking uses special operators to find specific information.
Common operators:
| Operator | Purpose | Example |
|---|---|---|
| site: | Search within a domain | site:company.com filetype:pdf |
| intitle: | Search page titles | intitle:"admin" site:company.com |
| inurl: | Search URLs | inurl:admin login |
| filetype: | Find specific file types | filetype:xlsx credentials |
| cache: | View Google’s cached version | cache:company.com |
| - | Exclude results | site:company.com -blog |
Practical examples:
# Find exposed backup files
site:company.com filetype:sql OR filetype:bak
# Discover admin panels
site:company.com intitle:"admin" inurl:login
# Find exposed AWS S3 buckets
site:s3.amazonaws.com "company"
# Locate exposed API keys
site:github.com "company" "api_key" OR "secret"
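Caution: Finding a page through dorking does not authorize you to access it, and entering restricted areas or downloading data without permission can be illegal. Viewing archived copies of public pages is generally acceptable for reconnaissance; note that Google retired its cache: operator in 2024, so archives such as the Wayback Machine now fill that role.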
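If you run many dorks against one target, composing them programmatically keeps the syntax consistent. The following is a minimal Python sketch, not an official tool: `build_dork` and its parameter names are hypothetical, and it only assembles the search string using the operators from the table above.

```python
def build_dork(site=None, filetypes=(), intitle=None, inurl=None,
               terms=(), exclude=()):
    """Compose a Google search string from common dorking operators."""
    parts = []
    if site:
        parts.append(f"site:{site}")
    if filetypes:
        # Alternatives are joined with an uppercase OR.
        parts.append(" OR ".join(f"filetype:{ft}" for ft in filetypes))
    if intitle:
        parts.append(f'intitle:"{intitle}"')
    if inurl:
        parts.append(f"inurl:{inurl}")
    # Quote free-text terms so multi-word phrases are matched exactly.
    parts.extend(f'"{term}"' for term in terms)
    # A leading minus excludes a term from the results.
    parts.extend(f"-{term}" for term in exclude)
    return " ".join(parts)


print(build_dork(site="company.com", filetypes=["sql", "bak"]))
print(build_dork(site="company.com", intitle="admin", inurl="login"))
```

The two calls reproduce the backup-file and admin-panel queries from the practical examples; you still paste the result into the search box yourself.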
2. Shodan (The “Google for Devices”)
Shodan scans the internet for exposed services and devices. Unlike Google, Shodan indexes service banners, headers, and metadata.
Setting up Shodan:
Visit https://www.shodan.io and create a free account. The CLI is optional but powerful.
Free account includes:
- Basic search capabilities
- Limited results per query
- API access (limited requests)
Practical Shodan searches:
# Find all Apache servers for a company
org:"Company Name" Apache
# Discover exposed CCTV cameras
port:8080 http.title:"IP Camera"
# Locate Elasticsearch instances
port:9200 "elasticsearch"
# Find exposed databases
port:27017 mongodb
# Discover specific company infrastructure
org:"Company Name" country:"US"
Interpretation example:
IP: 203.0.113.45
Port: 80
Service: Apache/2.4.41 (Ubuntu)
Location: San Francisco, USA
Organization: Company Name Inc.
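This tells you what the service reported when Shodan last scanned it, roughly where the IP address is located, and which organization the address is registered to. Banners can be stale or deliberately misleading, so treat them as leads to verify, not as facts.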
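The Service line in a result like the one above usually follows a Product/Version (Comment) shape. As a rough sketch, assuming that shape holds, you could split it in Python like this. The helper name `parse_server_banner` is made up for illustration and is not part of any Shodan API.

```python
import re

# Matches "Product/Version (Comment)"; version and comment are optional.
BANNER_RE = re.compile(
    r"^(?P<product>[^/\s]+)(?:/(?P<version>\S+))?(?:\s+\((?P<comment>[^)]*)\))?"
)


def parse_server_banner(banner):
    """Split a banner such as 'Apache/2.4.41 (Ubuntu)' into its parts."""
    match = BANNER_RE.match(banner.strip())
    if not match:
        return None
    return {
        "product": match.group("product"),
        "version": match.group("version"),  # None when the server hides it
        "comment": match.group("comment"),  # often an OS or distribution hint
    }


print(parse_server_banner("Apache/2.4.41 (Ubuntu)"))
```

Structured fields make it easier to compare versions across many hosts, but the values are only as trustworthy as the banner itself.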
3. Maltego (Visual Investigation)
Maltego is a visual OSINT platform that maps relationships between data points. It transforms scattered information into connected graphs.
Maltego Desktop is available in a free Community Edition.
What Maltego can do:
- Link email addresses to social accounts
- Discover DNS records and MX servers
- Map organizational relationships
- Visualize domain connections
- Identify related infrastructure
Basic workflow:
- Add an entity (domain, email, IP, person)
- Select transforms (lookup functions)
- Visualize the resulting graph
- Pivot on new discoveries
For example: Start with “company.com” → discover MX records → identify email format → search LinkedIn for employees with that format → find social media accounts.
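The entity, transform, pivot loop is essentially a breadth-first graph expansion. The sketch below illustrates that idea in plain Python; it does not use Maltego, and the lookup table `FAKE_TRANSFORMS` contains invented results standing in for real transforms.

```python
from collections import deque

# Hypothetical, hard-coded lookup results standing in for real transforms.
FAKE_TRANSFORMS = {
    "company.com": ["mail.company.com", "ns1.company.com"],
    "mail.company.com": ["203.0.113.45"],
    "ns1.company.com": ["203.0.113.53"],
}


def pivot(start, lookup, max_depth=2):
    """Breadth-first expansion: run the lookup on each newly found entity."""
    edges = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        entity, depth = queue.popleft()
        if depth >= max_depth:
            continue  # stop expanding once the depth limit is reached
        for found in lookup.get(entity, []):
            edges.append((entity, found))
            if found not in seen:
                seen.add(found)
                queue.append((found, depth + 1))
    return edges


for source, target in pivot("company.com", FAKE_TRANSFORMS):
    print(f"{source} -> {target}")
```

The depth limit matters in practice: every pivot multiplies the number of entities, and a graph that grows unchecked quickly drifts away from the authorized target.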
4. DNS Reconnaissance
DNS contains enormous amounts of intelligence about infrastructure.
Tools and commands:
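# Query DNS records (request each type; many servers refuse ANY per RFC 8482)
nslookup company.com
dig company.com NS +short
dig company.com MX +short
dig company.com TXT +short
# Enumerate subdomains by brute force (using wordlist)
dnsrecon -d company.com -D subdomains.txt -t brt
# Zone transfer (succeeds only if the name server is misconfigured)
dig @ns1.company.com company.com axfr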
# Reverse DNS lookup
nslookup 203.0.113.45
# Certificate-based subdomain discovery
curl -s "https://crt.sh/?q=%25.company.com&output=json" | jq -r '.[].name_value' | sort -u
What to look for:
- Multiple DNS providers (indicates complex infrastructure)
- Subdomains revealing internal structure
- MX records showing mail servers
- TXT records with SPF, DKIM, and DMARC policies, which often name third-party mail and SaaS providers
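SPF records are a good example of what a TXT record gives away: each include: mechanism names a service allowed to send mail for the domain. Here is a simplified Python sketch for pulling those out. It handles only the include, ip4, ip6, and all mechanisms, and the sample record is invented.

```python
def parse_spf(txt_record):
    """Return selected mechanisms of an SPF record grouped by type."""
    tokens = txt_record.split()
    if not tokens or tokens[0].lower() != "v=spf1":
        return None  # not an SPF record
    result = {"include": [], "ip4": [], "ip6": [], "all": None}
    for token in tokens[1:]:
        # An optional qualifier (+, -, ~, ?) may precede the mechanism.
        qualifier = token[0] if token[0] in "+-~?" else "+"
        mechanism = token.lstrip("+-~?")
        # Split on the first colon only, so IPv6 addresses stay intact.
        name, _, value = mechanism.partition(":")
        name = name.lower()
        if name in ("include", "ip4", "ip6") and value:
            result[name].append(value)
        elif name == "all":
            result["all"] = qualifier
    return result


# Invented sample record for illustration.
sample = "v=spf1 include:_spf.google.com include:sendgrid.net ip4:203.0.113.0/24 -all"
print(parse_spf(sample))
```

In this sample the include entries would suggest a hosted mailbox provider and a transactional email service, both useful context for the social engineering and technology stack notes.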
5. Email and Employee Discovery
LinkedIn is a primary source for organizational intelligence.
Manual reconnaissance:
- Search employees by company
- Note job titles, departments, and skills
- Identify email format (firstname.lastname@, firstinitial.lastname@, etc.)
- Document email addresses and names
Email validation tools: Hunter and Clearbit.
Example workflow:
- Identify company email format from LinkedIn
- Use Hunter or Clearbit to validate guessed addresses
- Cross-reference employees with social media (Twitter, GitHub)
- Document tech stack and insider knowledge shared publicly
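Once you have a list of names, producing candidate addresses for validation is mechanical. The sketch below is one possible approach in Python. Only the first two formats come from the list above; the remaining three are common conventions added as assumptions, and `email_candidates` is a hypothetical helper.

```python
def email_candidates(first, last, domain):
    """Generate common address formats for one employee name."""
    first, last = first.strip().lower(), last.strip().lower()
    if not first or not last:
        raise ValueError("both first and last name are required")
    local_parts = [
        f"{first}.{last}",     # firstname.lastname@
        f"{first[0]}.{last}",  # firstinitial.lastname@
        f"{first[0]}{last}",   # assumption: flastname@
        f"{first}{last}",      # assumption: firstnamelastname@
        first,                 # assumption: firstname@
    ]
    return [f"{local}@{domain}" for local in local_parts]


for address in email_candidates("Jane", "Doe", "company.com"):
    print(address)
```

Once one address is confirmed, you know the company's format and can drop the other candidates for every remaining employee.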
6. Certificate and SSL Analysis
SSL certificates contain identifying information and are searchable.
# Query Certificate Transparency logs
curl -s "https://crt.sh/?q=company.com&output=json" | jq
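# Examine certificate directly (active: this connects to the target)
openssl s_client -connect company.com:443 -servername company.com </dev/null 2>/dev/null | openssl x509 -noout -text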
# Use SSL tools for broader discovery
echo company.com | python3 sslsubdomains.py
Certificates reveal:
- Subdomains named in issued certificates (CN and SANs)
- Organization name (on OV and EV certificates)
- Certificate history
- Organizational structure hints
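The JSON that crt.sh returns is noisy: names repeat across certificates, wildcards appear, and one entry can carry several names. The following Python sketch deduplicates it. It assumes each entry has a `name_value` field with newline-separated names, which matches crt.sh output as commonly observed but is not a documented guarantee; the sample data is invented.

```python
import json


def subdomains_from_crtsh(raw_json, domain):
    """Collect unique hostnames under `domain` from crt.sh JSON output."""
    names = set()
    for entry in json.loads(raw_json):
        # One certificate can list several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return sorted(names)


# Invented sample in the assumed crt.sh shape.
sample = json.dumps([
    {"name_value": "www.company.com\n*.dev.company.com"},
    {"name_value": "mail.company.com"},
    {"name_value": "WWW.company.com"},
])
print(subdomains_from_crtsh(sample, "company.com"))
```

To use it on real data, save the curl output to a file and pass the file contents as `raw_json`.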
7. GitHub and Public Repositories
Developers often commit sensitive data to public repositories.
Search techniques:
# Find exposed credentials
org:company password
org:company api_key
org:company secret
# Locate internal tools
org:company internal
org:company tools
# Find configuration files
path:config.php company
path:.env company
Tools:
- Gitrob - Scans public GitHub repos for sensitive files
- TruffleHog - Searches for secrets in git history
Always use these responsibly. If you find credentials, report them to the organization’s security team.
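To see what secret scanners do at their core, here is a deliberately small Python sketch that matches a few patterns line by line. It is not how TruffleHog or Gitrob are implemented; the two rules are illustrative assumptions, and real scanners add many more rules plus verification to cut false positives.

```python
import re

# Illustrative patterns only; real scanners ship many more rules.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_assignment": re.compile(
        r"(?i)\b(api_key|secret|password)\b\s*[:=]\s*['\"]([^'\"]{8,})['\"]"
    ),
}


def find_secrets(text):
    """Return (line_number, rule_name) for each line matching a pattern."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, rule_name))
    return hits


# Placeholder values, not real credentials.
sample = 'DEBUG = True\nAPI_KEY = "not-a-real-key-123"\naws_id = AKIAIOSFODNN7EXAMPLE\n'
for line_number, rule_name in find_secrets(sample):
    print(f"line {line_number}: {rule_name}")
```

The function reports only line numbers and rule names, not the matched values, so your notes and reports don't end up holding copies of someone else's credentials.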
OSINT Workflow Example
Target: Assess company.com
- DNS enumeration → Find all subdomains and IP ranges
- Shodan/Certificate search → Identify exposed services
- LinkedIn research → Understand organizational structure, tech stack
- GitHub search → Look for publicly exposed code and configurations
- Google dorking → Discover exposed files, admin panels, public data
- Social media → Find employee accounts, technical details
- WHOIS/Domain registry → Registrant information, name servers
- Compile findings → Create attack surface map
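The final "compile findings" step is easier if every technique writes its results in one shape. The sketch below shows one possible structure in Python; the `Finding` fields and the sample entries are assumptions for illustration, not a standard format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    asset: str   # hostname, IP, repository, or person the finding is about
    source: str  # technique that produced it, e.g. "dns" or "shodan"
    detail: str  # what was observed


def build_surface_map(findings):
    """Group findings by asset so each asset lists everything known about it."""
    surface = defaultdict(list)
    for finding in findings:
        surface[finding.asset].append((finding.source, finding.detail))
    return dict(surface)


# Invented sample findings.
findings = [
    Finding("mail.company.com", "dns", "MX record"),
    Finding("203.0.113.45", "shodan", "Apache/2.4.41 (Ubuntu) on port 80"),
    Finding("mail.company.com", "crt.sh", "listed in certificate SAN"),
]
for asset, observations in build_surface_map(findings).items():
    print(asset, observations)
```

Grouping by asset rather than by tool shows where several independent sources agree, which is usually where testing should start.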
Ethical and Legal Considerations
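OSINT is largely passive reconnaissance: you analyze public information instead of interacting with the target's systems. A few techniques in this guide, such as zone transfer attempts, DNS brute forcing, and direct TLS connections, do send traffic to the target and belong inside an authorized scope. Whatever the technique: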
- Know your scope: Only research authorized targets
- Document everything: Keep records of what you found and where
- Don’t access unauthorized content: Finding a link doesn’t mean you should click it
- Don’t impersonate: Never pretend to be someone to gather information
- Report findings: If you find exposed credentials or data, notify the organization
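"Know your scope" can be enforced in tooling, not just in your head. As a minimal sketch, assuming your scope is expressed as a list of authorized domains, a check like the following can gate every lookup. The name `in_scope` is hypothetical.

```python
def in_scope(hostname, authorized_domains):
    """True when hostname equals, or is a subdomain of, an authorized domain."""
    host = hostname.strip().lower().rstrip(".")
    for domain in authorized_domains:
        domain = domain.strip().lower().rstrip(".")
        # Compare on a label boundary so "notcompany.com" does not match.
        if host == domain or host.endswith("." + domain):
            return True
    return False


scope = ["company.com"]
for candidate in ["dev.company.com", "notcompany.com", "company.com.evil.example"]:
    print(candidate, in_scope(candidate, scope))
```

Matching on the dot boundary is the important detail: a plain substring or suffix check would wrongly accept lookalike domains.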
Practical Lab Exercise
Safe practice:
- Research your own company or a public organization’s digital footprint
- Document what’s discoverable without direct access
- Identify potential security risks in publicly available information
- Practice using Google dorking, Shodan, and Maltego
- Note what information should be private
Conclusion
OSINT is reconnaissance at its purest — turning public information into actionable intelligence. It requires no special tools (though tools help), just curiosity, patience, and systematic thinking. Master OSINT and you’ll understand the organization’s attack surface before typing a single command on their network.
The best penetration test starts with the best reconnaissance.