Googling
Welcome !
While you are waiting, please…
find in your packet:
Exercise 6 - Questions for the Final Exercise
“What Do You Want Google to Tell You?”
begin writing down your questions in
three or more categories
Googling
Instructor: Joe Barker
[email protected]
An Infopeople Workshop
2005
Googling
This Workshop is Brought to You By the
Infopeople Project
Infopeople is a federally-funded grant project supported
by the California State Library. It provides a wide
variety of training to California libraries. Infopeople
workshops are offered around the state and are open
registration on a first-come, first-served basis.
For a complete list of workshops, and for other
information about the Project, go to the Infopeople Web
site at infopeople.org.
Introductions
Name
Library
Position
How do you use Google?
Workshop Overview
Google’s way of “thinking”
Taking charge of the driving
Using limits to find the hard-to-get
Finding information on a subject
Special Google databases and tools
What to do when Google doesn’t work
Go to:
bookmarks.infopeople.org
Click on extreme_googling_bk.htm
Make a bookmark of this page
Add to Favorites
Exercise 1
How does Google “think” about your
searches?
Please pause and wait for discussion when
you reach a
A Close Look at Google
Search Results
• Excerpt of page with your terms
• Matched terms in bold
• Which Google database used
• Approx. # of hits
• Terms actually searched on, as Dictionary links
• URL, size, date last crawled
• Link to Cached copy
• Pages supposedly like this one
• 2nd page from same site
• All Google pages from this site
Don’t believe the number of Results
They are approximate, changing, and not comprehensive
Default Matching on Search Terms
Default AND between terms
Google takes a FUZZY approach
only some of the words if a page is “important”
words may occur only in pages that link to the page
words occur somewhere on the site a page belongs to
Cached reveals the page as Google found it
may differ from the current page
Cached exists if a page is full-text indexed
About 1 billion pages in Google are not cached
Not fully searchable
no Cached if a page owner requests not to be cached
How Can You Know
Why Google Found a Page ?
Click the Cached link toward end of results
top area often explains what was matched
Stemming
Google
stems “when appropriate”
automatically detects word stem or root
retrieves with various endings
kite flying gets kite kites kiting
fly flying, flyers, flyer’s, flyers’
to turn off
+kite +flying
“kite flying”
single word searches not stemmed
Words Google Does Not Search
Common or “stop” words ignored
to be or not to be
no list of “common” terms
Google tells you below search box in results
to turn off
+to +be +or not +to +be
“to be or not to be”
single word searches possible on common words
Ranking of Results
Word order matters
favoring phrases (words together)
looks for phrases with something in place of
stop words
word repetition and proximity also count
Google ranking is a great mystery
PageRank combines many factors
popularity - links to a page and their importance
“importance” - a value of 0 (low) to 10 (high)
term placement - phrases, proximity, repetition
See Cheat Sheet #1
Google Preferences
Interface language
Selected languages for pages
SafeSearch filtering
“moderate” is default
Number of results returned
20 or 30 is best
Open new browser window for search results
Back of Cheat Sheet #1
The Google Toolbar
Search any Google databases
Search within a site
Pop-up blocker
Search history list
Set Google preferences quickly
Customizable in Options
download from
toolbar.google.com
Other browsers toolbar
download from
googlebar.mozdev.org
Exercise 2
Installing the Google Toolbar
Customizing Preferences
Taking Charge of Driving Google
OR
Getting the Most
from Google’s FUZZY Thinking
Improving Google’s
“FUZZY” Default AND
Problems with AND default:
words can occur anywhere in results pages
some pages may not contain all of your words
some may not have any of your words
Use quotation marks to require words together
may have different meanings or contexts
turns common words into unique search terms
“working mothers” 145,000 = 5% of working mothers 2,680,000
“dry cells” 11,500 = 1% of dry cells 1,010,000
Hyphen makes phrases and searches with and
without hyphens
bite-sized retrieves bite-sized, bite sized, bitesized
Force “FUZZY” with OR Searches
Singulars and plurals not covered by
stemming
parent OR parents
Equivalent or synonymous terms
parent OR guardian
Misspellings
libarian OR librarian
Apostrophes and their misuse
april's OR aprils OR april "fools day"
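The OR patterns above can be assembled mechanically. The following is a minimal Python sketch, assuming a hypothetical helper named `or_query` (it is not part of any Google tool). It quotes multi-word alternatives so they are searched as phrases.

```python
def or_query(*alternatives):
    """Join alternative terms with Google's uppercase OR operator.

    Multi-word alternatives are wrapped in quotation marks so they are
    searched as phrases. Hypothetical helper, for illustration only.
    """
    parts = []
    for term in alternatives:
        term = term.strip()
        # Quote anything containing a space, unless it is already quoted.
        if " " in term and not term.startswith('"'):
            term = f'"{term}"'
        parts.append(term)
    return " OR ".join(parts)


print(or_query("parent", "guardian"))
print(or_query("wild flowers", "wildflowers"))
```

The output strings can be pasted into the search box together with any other terms.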
Ask Google to be “FUZZY”
Synonym search
~ immediately before a word
sometimes “thinks” of very broad, related terms
~food
~facts
~help
recipes, nutrition, cooking
information, statistics
guide, tutorial, FAQ, manual
Often: Terms appear in links pointing to a retrieved page
Take advantage of stemming
Let stemming handle variant endings:
“wild flowers” OR wildflowers hike “point reyes” april OR may OR spring
hike retrieves hike, hikers, hiking, hikes
Ask for “FUZZY” Number Ranges
Numrange search uses .. (two periods, no spaces)
babe ruth 1921..1935
results have highlighted dates within this range
3..6 megapixels digital camera
most numbers will be associated with megapixels
DVD player $250..
can be open-ended -- any number above starting number
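A number range term is just two numbers joined by two periods. The following is a small Python sketch, assuming a hypothetical helper `numrange`. Repeating the unit prefix (such as $) on the upper bound is an assumption, not something the slide states.

```python
def numrange(low, high=None, prefix=""):
    """Format a numrange term such as 1921..1935 or $250..

    `prefix` is an optional unit marker such as "$". Leaving `high` as
    None produces an open-ended range. Hypothetical helper; repeating
    the prefix on the upper bound is an assumption.
    """
    upper = "" if high is None else f"{prefix}{high}"
    # Two periods, no spaces, between the lower and upper bounds.
    return f"{prefix}{low}..{upper}"


print("babe ruth " + numrange(1921, 1935))
print("DVD player " + numrange(250, prefix="$"))
```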
The Whole-Word Wildcard:
Allowing FUZZY within “ ”
Can’t remember the exact wording in a phrase?
Who wrote something like, “The stag at night drank his
fill”?
Try searching:
“the stag * * * his fill” OR “the stag * * * * his fill”
ANSWER: “The stag at eve had drunk his fill” - in most sources
--Sir Walter Scott, “Lady of the Lake”
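When the number of missing words is uncertain, the variants can be generated rather than typed by hand. The following is a minimal Python sketch, assuming a hypothetical helper `wildcard_phrases` that places a chosen number of whole-word wildcards between the known start and end of a phrase.

```python
def wildcard_phrases(start, end, min_gap, max_gap):
    """Return quoted phrases with min_gap..max_gap whole-word wildcards (*)
    between `start` and `end`. Hypothetical helper, for illustration only."""
    phrases = []
    for gap in range(min_gap, max_gap + 1):
        # One * stands in for exactly one unknown word.
        words = [start] + ["*"] * gap + [end]
        phrases.append('"' + " ".join(words) + '"')
    return phrases


query = " OR ".join(wildcard_phrases("the stag", "his fill", 3, 4))
print(query)
```

This prints the same search suggested on the slide; widening the gap range adds more OR alternatives.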
Construct proximity searches
Or
try GAPS
www.staggernation.com/cgi-bin/gaps.cgi
"george bush"
"george * bush"
"george * * bush"
"bush george"
"bush * george"
Excluding to Control “FUZZIness”
You want: Medical info about a pancreatitis
diet
Start with: pancreatitis diet
172,000
Eliminate undesirable words in results:
pancreatitis diet -cat -dog
132,000
pancreatitis -cat -dog -"support group"
128,000
Select exclusions carefully
Ask Google to be Very “FUZZY”:
Related & Similar
Two commands for the same function
click Similar at end of result
search related:www.infopeople.org
Sometimes hard to see how related
links to and from the target page
major words in and ranking of related pages
Possible uses
comparison shopping
find more sites like a site
related:www.econsumer.gov
use to evaluate a suspect page
Exercise 3
Taking Charge of Driving
Google
Limiting to Find
the Hard-to-Get
Limiting: Words in <Title>
intitle:
finds pages concentrated on your term
hybrid cars intitle:mileage 7,060
hybrid cars mileage 296,000
with quotes:
intitle:"cuban embargo" 581
"cuban embargo" 28,000
with OR:
intitle:"global warming" OR intitle:"greenhouse effect"
Use allintitle: to require all words in title
allintitle: hybrid cars mileage 86
can combine only with site:
allintitle: hybrid cars mileage -site:com 11
Exploiting a Page’s URL
Limiting to domain (edu, gov, etc):
site:edu OR site:gov OR site:ca.us
complete list at:
http://en.wikipedia.org/wiki/List_of_Internet_TLDs
Searching within a Site
site:
site:memory.loc.gov lincoln “sheet music”
works only in top/first part of URL
omit http:// and final /
makes Google into a search engine for pages that are indexed
in Google
inurl: less specific
term may be anywhere in URLs
inurl:lincoln “sheet music”
finds “lincoln” anywhere in any URL and “sheet music”
somewhere in the pages
Limiting to Types of Documents
filetype:
OR to find more than one
form 1040 filetype:pdf - finds forms
-filetype:
exclude certain filetypes
form 1040 -filetype:pdf - finds help with forms
View as HTML link can be useful
avoids viruses a document might carry if opened
allows viewing without the software or reader
Caveats for Limit Commands
Cannot always be combined
link: related: must stand alone
allintitle: allintext: allinanchor: allinurl: with site: only
You can mix all other limit commands, usually:
inurl:ucla intitle:admissions statistics
intitle:"thyroid disease" site:edu OR site:com
Be careful not to ask for the impossible:
site:ucla.edu -inurl:edu
site:com site:edu site:gov
Some require understanding HTML hypertext links:
inanchor:links looks for text in link tags in the HTML code:
<a href="http://www.pancreasweb.com">Pancreatitis links</a>
<a href="www.pancreaticdisease.com/links/links.htm">Links</a>
See Cheat Sheet #3
Advanced Web Search page
Restricted Opportunities
Useful if you want to:
Try limiting to pages updated in 3 mos, 6 mos, year
Change language of results pages
Select from list of filetype formats
Change content filtering (also in Preferences)
Not useful if you want to:
Construct complex searches
Use OR for more than one limiter: OR with phrases, multiple phrases, site:, filetype:, inurl:
Use intitle: inurl: (only the allin... commands in Advanced Search)
I almost never use it
Exercise 4
Limiting
Finding Info on a Subject
Finding Directories & Link Lists
EXAMPLE - looking for links or directories about:
“women’s history” “middle east”
Use words likely to occur in link-list or directory pages
links OR "directory of" OR guide “women’s history” “middle east”
“what’s new” OR “what’s cool” “women’s history” “middle east”
<Title> field limit to focus pages you want
intitle:links OR intitle:"directory of" OR intitle:"encyclopedia of"
"women's history" "middle east"
intitle:"women's history" intitle:directory "middle east"
Are there agencies or organizations with links on this topic?
inanchor:links society OR association
"middle east" "women's studies"
Be creative. Substitute database for “directory” to find searchable databases
Google’s Directory
1.5+ million pages (compare with 8+ billion in web search)
DMOZ Open Directory
Google “importance” ranking within directory
EXAMPLE:
women's history middle east OR eastern
Click on useful subject categories for more:
Science > Social Sciences > Area Studies > Middle Eastern Studies
Society > People > Women > Women's Studies > By Topic
Society > Issues > Human Rights and Liberties > Regional > Middle
East
Search Google for Weblogs
Current commentary, opinions, misc. musings
Google indexes “important” blogs frequently
more than most web pages
Thorough search impossible
blog OR weblog OR “web log” your subject words
inurl:blog OR inurl:weblog your subject words
If you know the software a blog is using:
“powered by blogger” your subject words
site:blogspot.com your subject words
“powered by geeklog” your subject words
Try searching the Google Directory
Search Google Groups for Info
Usenet news groups back to 1981
archive of UNevaluated public thoughts, advice &
opinions
some not found elsewhere
select threads with more than one article for context
Search differences:
search for a group by name
search within a group
+ required for common words even in “ ”
“hair loss” OR "loss +of hair" OR balding
group:alt.support.thyroid
use Advanced Search to limit by group or date posted
Create new mailing lists with registration
Google as Encyclopedic Glossary
Use the command define:[no space]
Google finds and ranks Web pages with definitions
define:internet
define:due diligence
Or build searches for pages with definitions:
internet “what is”
“what is the internet”
“internet stands +for”
internet ~beginners
internet ~FAQ
Also many common facts available:
population of japan
currency in algeria
birthplace of hitler
Exercise 5
Finding Info on a Subject
Brainstorming
How would you approach Google to solve each of the following problems?
1. I want to find some good collections of links and information for bird watching in Northern California.
2. How can I find websites directing me to good places [...] about migraine headaches?
3. What is the birthplace of Teddy Roosevelt?
4. Where can I find debates, from a wide range of perspectives, on what constitutes proofs that [...] of a near-death experience can be believed?
5. [...] blogs and California libraries [...] to keep in touch with other librarians and libraries in the state [...] how they’re using blogs?
6. What is the size of California?
7. What is the currency of Nepal, and how much of it could US $100 buy as of January 15, 2004?
Special Google Databases
and Tools
Shortcuts and Services
Shortcuts:
dictionaries and other definitions
phonebooks - white and yellow
movie showtimes
stocks with recent news
maps, weather
converters, math problem calculators, physical constants
number searches
UPS, FedEx, USPS, VIN, UPC codes, area codes,
airplane reg. #, patents, more
http://www.googleguide.com/shortcuts.html
Translate
click [Translate this page] or URL or enter text at
www.google.com/language_tools
Page Info - better to enter a URL @ alexa.com
Many search engines offer useful shortcuts & similar tools:
See Search Cheat Sheet #4 & Supplement
“Hacking” Google URLs
Structure of a Google search result URL
Your search is for:
“web searching” tutorial
http://www.google.com/search? - Google URL; ? indicates query
num=20& - Number of results per page
hl=en& - Interface language
lr=& - Search language blank (ALL)
safe=off& - SafeSearch off
q=%22web+searching%22+tutorial - Query search terms
%22 means quote mark
+ joins terms
Will vary according to your Preferences setting
You can modify results by changing values
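The parts of a result URL can also be taken apart and changed programmatically. The following is a short Python sketch using the standard library's `urllib.parse`. The URL is the one from the slide, and changing `num` to 50 is just an example value.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

url = ("http://www.google.com/search?"
       "num=20&hl=en&lr=&safe=off&q=%22web+searching%22+tutorial")

parts = urlsplit(url)
# keep_blank_values=True keeps the empty lr= parameter in the result.
params = parse_qs(parts.query, keep_blank_values=True)

# parse_qs decodes %22 to a quote mark and + to a space.
print(params["q"][0])

# Change one value and rebuild the URL.
params["num"] = ["50"]
new_url = urlunsplit(parts._replace(query=urlencode(params, doseq=True)))
print(new_url)
```

The first line printed is the search as typed, with its quotation marks restored; the second is the same URL with `num=50`.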
A “Hack” for Country Searches
Type the search: egypt history 1950..1970
http://www.google.com/search?num=20&hl=en&lr=&safe=off&q=egypt+history+1950..1970&restrict=countryEG
Append in Address/URL box (no spaces):
&restrict=countryEG
General format - capitalized country code:
&restrict=countryXX
Complete country codes list:
http://en.wikipedia.org/wiki/List_of_Internet_TLDs
More countries and pages than in Language Tools
search page
www.google.com/language_tools
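The country hack amounts to appending one more parameter to the result URL. The following is a minimal Python sketch, assuming a hypothetical helper `country_search_url`. The other parameter values are copied from the slide, and the sketch does not check whether Google still honors `restrict`.

```python
from urllib.parse import quote_plus


def country_search_url(query, country_code, num=20):
    """Build a Google result URL limited to one country, following the
    &restrict=countryXX pattern on the slide. Hypothetical helper."""
    base = "http://www.google.com/search"
    # quote_plus joins terms with + and encodes quote marks as %22.
    encoded = quote_plus(query)
    # The country code must be capitalized, e.g. EG for Egypt.
    return (f"{base}?num={num}&hl=en&lr=&safe=off"
            f"&q={encoded}&restrict=country{country_code.upper()}")


print(country_search_url("egypt history 1950..1970", "eg"))
```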
Google’s Other Proprietary Databases
Besides Web, Directory, and Groups
Images
1.3+ billion
SafeSearch filter only works in English language
News
4,500 news sources
30 days
international versions - other news slants
Use Advanced Search forms
Useful, specific limit settings
Froogle for shopping
shopping sites from Google - a subset
+ merchant uploads of catalogs not on the web
no fees, no pay for position
Catalogs (Google Labs still)
scanned mail-order catalogs (not web), text searchable
to navigate within a catalog, click an image and use the special catalogs navigation bar
Local Information
local.google.com
“businesses & services” from Google web database +
several yellow pages
topic box
address/location box
restrict to 1, 5, 15, 45 miles away
geographic proximity, maps
EXAMPLE:
vegetarian restaurants
100 Larkin St, San Francisco, CA
maps.google.com
draggable images, satellite view
local (yellow pages), driving directions
earth.google.com
requires download, 200 MB memory
exotic toy or useful tool?
Google Labs
More upcoming Google services (beta)
Print.google.com – search only in Print database
project to make full text books available online
Sets - create and explore sequences of things
Suggest - browse possible search terms
video.google.com – some TV programs
My search history – registration and privacy considerations
Scholar.google.com – special page to search from
scholarly articles (mostly) on the web
abstracts if full text not available
integrated with OCLC for library holdings
integrated with some college campuses
See Cheat Sheet #5
Exercise 6
Where would you look?
1.
Choose ONE or TWO questions to answer
2.
Write down what you did & learned
3.
It’s O.K. to talk, ask questions, and help
each other as needed
When Google Doesn’t Work
Other Effective Search Engines
Yahoo Search (3+ billion)
no 10-word limit
accepts ( ) around Boolean OR
(“global warming” OR “greenhouse effect”)
(site:edu OR site:gov OR site:uk)
pay-for-position sites not identified
Teoma (1+ billion)
popularity within subjects
sometimes finds link collections as Resources
Bookmarklets for Searching
JavaScript applications that reside in your Bookmarks or Favorites (Favlets)
Search engine tools:
run a search in another search engine
@Teoma @Yahoo!
search highlighted text in a search engine
Information and more about them at
searchengineshowdown.com/bmlets
Recommended Directories
By library people
LII.ORG
Academic Info
Infomine
Complement to searching
when search engines do not seem to work
when you know or have a hunch there is a site about your question
Thinking in Sync with Search Engines
Search engine balancing act:
Do we agree with Google’s “importance”?
tyrannical or democratic?
favors established more than new websites
favors trendy, high-speed, consumer, vroom & zoom
Are Google’s secretiveness & fuzziness trustable?
Have search engines changed us?
Do we accept “good enough” quicker?
Have we given up “thorough” and “certain”?
Will semantic & linguistic analysis help?
Or bring in a new age of “whatever” thinking
Exercise 7
Make your own Cheat Sheet
Write down up to seven things you want to
remember to do or practice
Circle the ONE you like most
Workshop Evaluation
infopeople.org/WS/eval