COM379 Introduction - University of Sunderland

Google Hacking
University of Sunderland
CSEM02
Harry R Erwin, PhD
Peter Dunne, PhD
Basics
• Web Search
• Newsgroups
• Images
• Preferences
• Language Tools
Google Queries
• Not case sensitive
• * in a query stands for a word
• ‘.’ in a query is a single-character wildcard
• Automatic stemming
• Ten-word limit
• AND (+) is assumed; OR (|) and NOT (-) must be entered explicitly
• “” for a phrase
More Queries
• You can control the language of the pages
and the language of the reports
• You can restrict the search to specific
countries
Controlling Searches
• Intitle, allintitle
• Inurl, allinurl
• Filetype
• Allintext
• Site
• Link
• Inanchor
• Daterange
• Cache
• Info
• Related
• Phonebook
• Rphonebook
• Bphonebook
• Author
• Group
• Msgid
• Insubject
• Stocks
• Define
Controlling Searches (II)
• These operators can be used to restrict
searches.
• To restrict the search to the university:
site:sunderland.ac.uk
• Or to search for Seventh Moon merlot in the
UK: "seventh moon" merlot site:uk
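As a minimal sketch of how these operators combine, the Python helper below assembles a query string from a phrase, plain terms, excluded terms, and a site restriction. The function name build_query and its parameters are illustrative assumptions; they are not part of any Google API.

```python
def build_query(terms=(), phrase=None, site=None, exclude=()):
    """Compose a search query string from the operators on this slide."""
    parts = []
    if phrase:
        # Double quotes request an exact-phrase match.
        parts.append(f'"{phrase}"')
    # Plain terms are ANDed together implicitly.
    parts.extend(terms)
    # A leading minus sign is the NOT operator.
    parts.extend(f"-{word}" for word in exclude)
    if site:
        # site: restricts results to a domain or top-level domain.
        parts.append(f"site:{site}")
    return " ".join(parts)


print(build_query(phrase="seventh moon", terms=["merlot"], site="uk"))
print(build_query(site="sunderland.ac.uk"))
```

The first call reproduces the merlot query above; the second reproduces the university restriction.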
Typical Filetypes
• Pdf
• Ps
• Xls
• Ppt
• Doc
• Rtf
• Txt
Why Google
• You access Google, not the original website.
• Most crackers access any site, even Google,
via a proxy server.
• Why? If you access the cached web page
and it contains images, your browser fetches
the images from the original site, which
then sees your address.
Directory Listings
• Search for intitle:index.of
• Or intitle:index.of "parent directory"
• Or intitle:index.of name size
• Or intitle:index.of inurl:admin
• Or intitle:index.of filename
• This can then lead to a directory traversal
• Look for filetype:bak, too, particularly if you want
to expose SQL data generated on the fly
Commonly Available Sensitive Information
• HR files
• Helpdesk files
• Job listings
• Company information
• Employee names
• Personal websites and blogs
• E-mail and e-mail addresses
Network Mapping
• Site:domain name
• Site crawling, particularly by indicating
negative searches for known domains
• Lynx is convenient if you want lots of hits:
– lynx -dump "http://www.google.com/search?q=site:name+-knownsite&num=100" > test.html
• Or use a Perl script with the Google API
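The URL in the lynx example can also be assembled programmatically. The sketch below uses Python's standard urllib.parse module to build the same kind of URL; the function name search_url, its parameters, and the example.com hosts are hypothetical, and the code only builds the URL without sending any request.

```python
from urllib.parse import urlencode


def search_url(domain, known_hosts=(), num=100):
    """Build a search URL: site:domain minus each already-known host."""
    query = " ".join([f"site:{domain}"] + [f"-{host}" for host in known_hosts])
    # urlencode percent-escapes the query and turns spaces into '+'.
    return "http://www.google.com/search?" + urlencode({"q": query, "num": num})


print(search_url("example.com", ["www.example.com"]))
```

Repeating the call with each newly discovered host added to known_hosts mirrors the negative-search crawl described above.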
Link Mapping
• Explore the target site to see what it links to.
The owners of the linked sites may be
trusted and yet have weak security.
• The link operator supports this kind of
search.
• Also check the newsgroups for questions
from people at the organization.
Web-Enabled Network Devices
• The Google webspider often encounters
web-enabled devices. These allow an
administrator to query their status or
manage their configuration using a web
browser.
• You may also be able to access network
statistics this way.
Searches to Worry About
• Site:
• Intitle:index.of
• Error|warning
• Login|logon
• Username|userid|employee.ID|"your username is"
• Password|passcode|"your password is"
• Admin|administrator
• -ext:html -ext:htm -ext:shtml -ext:asp -ext:php
• Inurl:temp|inurl:tmp|inurl:backup|inurl:bak
• Intranet|help.desk
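One way to use this list defensively is to pair each pattern with a site: restriction for a domain you administer. The sketch below does only that string assembly; the names WORRYING_PATTERNS and audit_queries and the domain example.com are illustrative assumptions.

```python
# Patterns taken from the slide above, written in lower case.
WORRYING_PATTERNS = [
    "intitle:index.of",
    "error|warning",
    "login|logon",
    'username|userid|employee.ID|"your username is"',
    'password|passcode|"your password is"',
    "admin|administrator",
    "-ext:html -ext:htm -ext:shtml -ext:asp -ext:php",
    "inurl:temp|inurl:tmp|inurl:backup|inurl:bak",
    "intranet|help.desk",
]


def audit_queries(domain):
    """Return one site-restricted query per worrying pattern."""
    return [f"site:{domain} {pattern}" for pattern in WORRYING_PATTERNS]


for query in audit_queries("example.com"):
    print(query)
```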
Protecting Yourselves
• Solid security policy
• Public web servers are Public!
• Disable directory listings
• Block crawlers with robots.txt
• <META NAME="ROBOTS" CONTENT="NOARCHIVE">
• NOSNIPPET is similar.
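To check what a robots.txt file actually blocks, Python's standard urllib.robotparser module can evaluate rules offline. The sketch below is a minimal example; the rules, the paths, and the host www.example.com are hypothetical, and the helper name allowed is an assumption.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps all crawlers out of two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="Googlebot"):
    """Return True if the rules permit the agent to fetch the path."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


print(allowed("/index.html"))
print(allowed("/admin/users.txt"))
```

Note that robots.txt is only a request to well-behaved crawlers; it does not restrict access to the listed paths.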
More Protection
• Passwords
• Delete anything you don’t need from the
standard webserver configuration
• Keep your system patched.
• Hack yourself
• If sensitive data gets into Google, use the
URL removal tools to delete it.