What's new in search tools?

Top Tips for Expert Searching
Inforum 2005, Prague
Karen Blakeman, RBA Information Services
[email protected]
rba.co.uk
21 May 2005
Karen Blakeman www.rba.co.uk
What will be covered?
■ What Google has been doing over the last year
■ Alternatives to Google
■ RSS and Blogs
■ Desktop Search
■ Storing, managing and sharing resources
■ Top Tips to reduce overload
Google
■ New services
  – Google Mail
  – Google Print
  – Google Scholar
  – Google Suggests
  – Google Local and Maps (only US, Canada and UK)
  – My Search History
  – Web Accelerator
  – Personalized News
  – Google Desktop
■ Look in labs.google.com
Google
■ Increased database to over 8 billion pages
  – more rubbish to sift through
  – need to use advanced search features to get more useful results
■ New search features
  – increased number of search terms to 32
  – numeric range search
    – toblerone 1..5 kg
    – DVD player $100..200
    – 2000..2005
  – synonym search (~ before a word)
Google: use advanced search
■ language
■ file format (filetype:)
■ date modified
■ domain (site:)
■ Similar pages (related:)
■ pages that link to a known page (link:)
■ Also define: for definitions
■ Use * to stand in for a word in a phrase
■ Use + sign before a word to stop automatic stemming
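The operators on the last two slides can be combined in a single query. The sketch below is only an illustration: build_query and its parameter names are made up for this example, and the search URL pattern is an assumption about how the query would be passed to Google.

```python
from urllib.parse import urlencode

def build_query(terms, site=None, filetype=None, num_range=None,
                synonyms=(), no_stem=()):
    """Join plain terms and the operators from the slides into one query."""
    parts = list(terms)
    parts += [f"~{word}" for word in synonyms]   # ~ asks for synonyms
    parts += [f"+{word}" for word in no_stem]    # + stops automatic stemming
    if num_range is not None:
        low, high = num_range
        parts.append(f"{low}..{high}")           # numeric range search
    if site:
        parts.append(f"site:{site}")             # limit to a domain
    if filetype:
        parts.append(f"filetype:{filetype}")     # limit to a file format
    return " ".join(parts)

query = build_query(["beer", '"market share"'], site="cz",
                    filetype="pdf", num_range=(2000, 2005))
print(query)
# beer "market share" 2000..2005 site:cz filetype:pdf

# Assumed URL pattern; urlencode takes care of quotes and spaces.
print("https://www.google.com/search?" + urlencode({"q": query}))
```

Building the string in one place makes it easy to switch a single operator on or off and compare the results.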
"Google sinker"
■ Works in most search engines (not MSN)
■ Repeat the most important word in your search several times e.g.
  – beer "market share" france germany czech
  – beer "market share" france germany czech czech
  – beer "market share" france germany czech czech czech
■ all give different results!
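The three example searches can be generated rather than typed by hand. A minimal sketch, with google_sinker as a made-up helper name:

```python
def google_sinker(query, word, repeats):
    """Return the query with `word` appended `repeats` extra times."""
    return " ".join([query] + [word] * repeats)

base = 'beer "market share" france germany czech'
for extra in range(3):
    # Prints the three variants listed on the slide.
    print(google_sinker(base, "czech", extra))
```

Each variant is then pasted into the search engine and the result lists compared.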
Google Scholar
■ scholar.google.com
  – "search specifically for scholarly literature, including peer-reviewed papers, theses, books, preprints, abstracts and technical reports from all broad areas of research"
  – no source list
  – includes citations and books
  – limited "advanced search" and author search
  – unpredictable
  – articles ranked by relevance only
  – many articles are priced
Google Print
■ Books supplied to Google by publishers
■ Google digitizes them
■ Search on:
  – books about......
  – e.g. books about hubbert
■ 3 books listed near the top of the results list
■ Search within the book
■ Limit on number of pages that can be viewed
■ Information about the book and links to book stores
Google Suggests
■ labs.google.com and click on Google Suggests
■ Start typing in your search and Google suggests additions to your search together with the number of results
Variations on Google
■ Lots of search tools based on Google
■ bananaslug.com
  – takes your search, adds a random term and searches Google
  – pulls up results that would usually be hidden far down your results list
  – can select a random term from a category
    – animals, random number, Shakespeare
    – random number works well when looking for statistics or market data
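The idea of adding a random term is easy to try yourself. The sketch below is not how bananaslug.com is implemented; the word lists and the add_random_term helper are invented for illustration.

```python
import random

# Invented sample word lists; a real tool would have far larger ones.
CATEGORIES = {
    "animals": ["otter", "heron", "lynx"],
    "shakespeare": ["Hamlet", "Falstaff", "Ophelia"],
}

def add_random_term(query, category, rng=random):
    """Append one random term from the chosen category to the query."""
    if category == "random number":
        extra = str(rng.randint(0, 999))
    else:
        extra = rng.choice(CATEGORIES[category])
    return f"{query} {extra}"

print(add_random_term("beer exports", "random number"))
print(add_random_term("beer exports", "animals"))
```

Running the same search with several different random terms surfaces pages that a plain query would bury.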
Why try alternatives?
■ Different coverage
■ Different way of sorting results
■ Different search features
■ Different types of resource
■ Compare some of the major search engines using Thumbshots Ranking
  – ranking.thumbshots.com
  – shows overlap in first hundred results
    – varies depending on the search
Dogpile Missing Pieces
■ missingpieces.dogpile.com/WhitePaper.pdf
■ missingpieces.dogpile.com/missingpiecestool.aspx
  – compares Google, Yahoo and Ask Jeeves
  – graphic shows how many results are in only 1, 2 or in all 3 search engines for a particular search and includes sponsored links from the top of the pages.
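The overlap figures behind such a graphic come down to counting, for each result, how many engines returned it. A minimal sketch, assuming you already have the result URLs from each engine; the URLs below are invented and overlap_counts is a made-up name.

```python
from collections import Counter

def overlap_counts(results_by_engine):
    """Count how many URLs appear in exactly 1, 2, ... engines."""
    seen = Counter()
    for urls in results_by_engine.values():
        for url in set(urls):      # count each URL once per engine
            seen[url] += 1
    return Counter(seen.values())

# Invented result lists for one search.
results = {
    "google": ["a.example", "b.example", "c.example"],
    "yahoo": ["b.example", "c.example", "d.example"],
    "ask": ["c.example", "e.example"],
}
counts = overlap_counts(results)
for engines in sorted(counts):
    print(f"found by {engines} engine(s): {counts[engines]} result(s)")
```

In this invented sample three results are unique to one engine, one is shared by two and one appears in all three.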
Yahoo!
■ search.yahoo.com
■ Launched at the beginning of 2004
■ ? billion pages
■ Search features very similar to Google
■ Key features
  – searches first 500K of a page (Google only 100K)
  – full Boolean search
  – link and linkdomain command better than Google
  – RSS/XML filetype search
  – News alerts available as RSS feed
Yahoo link and linkdomain commands
■ Google
  – link:www.zefix.ch (68 results)
■ Yahoo
  – link:http://www.zefix.ch/ (1100 results)
    – finds pages that link to this individual page
  – linkdomain:www.zefix.ch (1290 results)
  – can also exclude your starting point using -site:
MSN
■ www.msn.com
■ Launched in autumn 2004
■ 5 billion pages
■ "Search Builder" = advanced search options
■ No filetype search option in advanced search
■ Results tend to be "consumer" oriented
■ News Alerts available as RSS
Exalead
■ www.exalead.com
■ 1 billion pages
■ Full Boolean search
■ NEAR command - within 16 words of each other
■ Supports wild cards
  – middle and end of the word
Exalead (2)
■ Advanced Search
  – phonetic search
  – approximate spelling
  – automatic stemming
■ Pattern matching
  – good for solving (cheating at?) crossword puzzles
    – start pattern with a forward slash, represent each missing letter with a full stop and finish pattern with a forward slash e.g. /.h.s.c..n/
    – use a full stop followed by an asterisk to represent one or more letters e.g. /psych.*ist/
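These patterns look like ordinary regular expressions, so they can be tried out offline against a word list. The sketch below uses Python's re module as a stand-in; it does not call Exalead, the word list is invented, and exalead_to_regex is a made-up helper. One difference to note: in Python, .* matches zero or more characters, not one or more.

```python
import re

def exalead_to_regex(pattern):
    """Strip the surrounding slashes and compile the rest as a regex."""
    if len(pattern) < 3 or not (pattern.startswith("/") and pattern.endswith("/")):
        raise ValueError("pattern must be wrapped in forward slashes")
    return re.compile(pattern[1:-1], re.IGNORECASE)

# Invented word list standing in for the search engine's index.
words = ["physician", "musician", "psychologist", "psychiatrist", "psychic"]

crossword = exalead_to_regex("/.h.s.c..n/")
print([w for w in words if crossword.fullmatch(w)])
# ['physician']

ending = exalead_to_regex("/psych.*ist/")
print([w for w in words if ending.fullmatch(w)])
# ['psychologist', 'psychiatrist']
```

fullmatch is used so that the pattern has to cover the whole word, as in a crossword clue.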
Exalead (3)
■ Results
  – can be sorted by date (newest to oldest or oldest to newest)
    – select option on Advanced Search screen
  – results display thumbnail of page next to each entry
  – related terms displayed
  – can be viewed by file format e.g. PDF, DOC
Kartoo
■ kartoo.com
■ Meta-search tool
■ Graphical representation of results
■ Extracts related terms from documents
■ Different layouts available including straightforward text listing
Unique search features
■ Google
  – numeric range search, synonym search, define command, Google Suggests
■ Yahoo
  – RSS/XML format
■ Exalead
  – phonetic search, approximate spelling, pattern matching, wildcards, NEAR command, related terms
■ Kartoo
  – graphical representation of results, related terms
Which search tool
■ Synonyms and related terms
  – Google, Exalead, Kartoo
■ Wild cards, variations on words
  – Exalead wild card, phonetic search, approximate spelling, pattern matching
■ Proximity search
  – Exalead
■ Numeric range search
  – Google
Quick facts and reference queries
■ Answers.com
  – "topic-based snapshot"
  – 100 authoritative encyclopedias, dictionaries, glossaries and atlases
■ Wikipedia
  – www.wikipedia.org
  – free-content encyclopaedia that anyone can edit
  – editors required to compile a balanced article including references to other sources
  – several language versions
  – good for quick reference and for links to other related sources
Quick facts and reference queries (2)
■ More examples:
  – acronymfinder.com
  – dictionary.com
  – Encarta
  – encyclopedia.com
  – brainboost.com
Evaluated listings
■ Annotated directories on a particular subject, industry or type of information
■ Provide access to recommended resources on a topic
■ Expert human assessment of resources
■ Examples:
  – eco5.com for finance and economics
  – Biogate biogate.lub.lu.se for "1000 best links in the biological sciences"
  – Official Statistics on the Web www.library.auckland.ac.nz/subjects/stats/offstats/
Evaluated listings
■ How do you find them?
  – BUBL Link bubl.ac.uk
  – Pinakes, a subject launchpad www.hw.ac.uk/libWWW/irn/pinakes/pinakes.html
  – professional or trade association
  – by personal recommendation
  – by chance
Meta search tools
■ Take your search and run it in several search engines at once
■ For:
  – saves time and effort
  – combined results sometimes better than individual search tools
  – some arrange results into folders e.g. Killerinfo
■ Against:
  – cannot use the advanced search options of individual search tools
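How a meta search tool combines the lists it gets back varies from tool to tool. The sketch below shows one simple possibility, chosen for this example and not taken from any of the tools named here: each result scores 1/rank in every list it appears in, and the scores are added up. The result lists are invented and merge_results is a made-up name.

```python
def merge_results(ranked_lists):
    """Merge ranked URL lists, scoring each URL by summed reciprocal rank."""
    scores = {}
    for urls in ranked_lists:
        for rank, url in enumerate(urls, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Invented result lists from three engines for the same search.
engine_a = ["x.example", "y.example", "z.example"]
engine_b = ["y.example", "w.example"]
engine_c = ["y.example", "x.example"]

print(merge_results([engine_a, engine_b, engine_c]))
# ['y.example', 'x.example', 'w.example', 'z.example']
```

A page that several engines rank highly rises to the top, which is one reason combined results can beat any single list.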
Examples of meta search tools
■ kartoo.com
■ killerinfo.com
■ vivisimo.com
■ ixquick.com
■ dogpile.com
■ turboscout.com (searches one at a time)
■ turbo10.com (build your own!)
Turboscout
■ Interface to a range of different search tools and types of resources
  – 21 "standard" search tools e.g. Google, Teoma
  – 12 image search tools
  – 17 reference sources e.g. Wikipedia, Scirus
  – 10 news search tools
  – 13 product search e.g. Amazon
  – 9 blog tools e.g. DayPop, Technorati
  – 8 audio visual
■ Type in your search once and click on each tool in turn
Turbo10.com
■ Build your own meta search tool
  – select search tools from existing list of search engines and sites
  – add your own search tool or site
    – wizard takes you through the steps
    – not all tools and sites can be added
    – dependent on cookies to keep and display your collections so problems if you move from one PC to another
RSS and blogs
■ What is RSS?
  – a way of delivering headlines and stories
  – stands for Really Simple Syndication, or Rich Site Summary, or RDF Site Summary
  – more information at www.rba.co.uk/rss/rss.htm
■ Need a program to "read" the feeds
  – web based e.g. bloglines.com
    – Bloglines tutorial at tinyurl.com/ap42n
  – desktop program e.g. FeedReader, FeedDemon
  – list of readers at en.wikipedia.org/wiki/List_of_news_aggregators
"Raw" RSS feed
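A raw feed is just an XML file listing headlines and links. The sketch below shows a made-up minimal RSS 2.0 feed (the titles and news.example addresses are invented) and, roughly, what a feed reader does with it, using Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Invented feed for illustration; real feeds carry more fields per item.
RAW_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example news</title>
    <link>http://news.example/</link>
    <description>Made-up feed for illustration</description>
    <item>
      <title>First headline</title>
      <link>http://news.example/1</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://news.example/2</link>
    </item>
  </channel>
</rss>"""

def headlines(feed_xml):
    """Return (title, link) pairs for every item in the feed."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

for title, link in headlines(RAW_FEED):
    print(f"{title} -> {link}")
```

A feed reader repeats this for every feed you subscribe to and shows only the items it has not displayed before.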
RSS feeds in a Feedreader
RSS and blogs
■ Many news services now offer RSS feeds
  – Yahoo News
  – MSN News
  – Newstrove.com, Moreover.com and FeedDirect
  – look for the RSS or XML logo
■ Blogs
  – online journal or diary
  – can range from superficial irrelevances to extreme erudition
  – try and find blogs by industry "gurus"
RSS and blogs
■ Searching for RSS feeds and blogs
  – Yahoo Advanced Search, RSS/XML file format
  – syndic8.com
  – bloglines.com
  – blogdex.net
  – blogdigger.com
  – daypop.com
  – technorati.com
  – feedster.com
  – blogpulse.com
Desktop Search
■ Searches documents, emails, chat, IM messages, web pages etc. stored on your PC
  – search both file names and content
■ Can combine local search with web search
■ Indexes documents and folders on your PC
  – do not generally index documents on other machines on the network (some exceptions)
■ Useful for tracking down "lost" files but....
■ Not a replacement for structured, well managed document folders
Examples
■ Blinkx - www.blinkx.com
■ Ask Jeeves - sp.ask.com/docs/desktop/
■ Google Desktop - desktop.google.com
■ MSN - www.msn.com
■ Copernic Desktop - www.copernic.com
■ Yahoo desktop - desktop.yahoo.com
Desktop search - what is indexed?
■ Your documents, email, cached web pages etc
■ How?
  – varies with search tool e.g. Google first creates a cache and then indexes the cache, others create just an index
■ First time indexing can take a long time
■ Updates handled differently by different tools
  – scheduled e.g. every hour, once a day
  – continually as new or edited documents appear (dynamic indexing)
    – can interfere with work
    – need to be able to pause or "snooze" indexing if necessary
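What "creating an index" means can be shown on a small scale. The sketch below is a toy and not how any of the products named here work: it reads only .txt files, indexes file names and content, and has to be rerun to pick up changes. build_index and search are made-up names.

```python
import os
import re
from collections import defaultdict

def build_index(folder):
    """Map each lower-cased word to the set of .txt files that mention it."""
    index = defaultdict(set)
    for dirpath, _dirs, filenames in os.walk(folder):
        for name in filenames:
            if not name.lower().endswith(".txt"):
                continue                      # toy index: plain text only
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as handle:
                text = handle.read()
            # Index the file name as well as the content.
            for word in re.findall(r"\w+", (name + " " + text).lower()):
                index[word].add(path)
    return index

def search(index, query):
    """Return the files that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)
```

Usage would be along the lines of index = build_index("some/folder") followed by search(index, "beer market"). The first call reads every file, which is why first-time indexing is slow; later searches only consult the index.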
Document types supported
■ Varies depending on the desktop search tool
■ Crucial when choosing a tool
■ Usually at least MS Office, html, text files
  – may not support OpenOffice, Thunderbird, Eudora, Firefox, Netscape etc.
  – may only index default folders e.g. My documents
■ Preview will vary depending on type of document
■ Check treatment of secure web pages
  – https e.g. bank statements, Intranet pages
■ Check treatment of password protected docs
Also check...
■ Can the content of a particular document format be searched or is only title searched?
■ Can the tool support files without extensions?
■ Do you really need network or Enterprise Search rather than just local PC search?
■ Also check that you have at least the minimum spec on your machine to support the desktop search tool - some are resource hungry
■ Operating systems supported
  – MS, MAC, Linux?
Three useful references
■ UW E-Business Institute "Benchmark Study of Desktop Search Tools" - www.uwebi.org
  – free of charge
■ Desktop Search Handbook - an Office Watch guide - shop.office-watch.com/dsh/
  – US$ 14.95, e-book, updates are free
■ Desktop Detectives, Davey Winder, Information World Review, May 2005, Issue 213, pp.19-21
UWEBI criteria
■ Usability
  – how easy to use, intuitive?
■ Versatility
  – e.g. which document formats are supported?
■ Accuracy
■ Efficiency
  – memory usage, indexing time, index pause options
■ Security
■ Enterprise readiness
Blinkx
■ "blinkx changes the way you interact with all kinds of information by reading the content on your computer screen and automatically linking you to related information - Web sites, the latest news on the Web, even documents and email on your computer."
■ "Without having to actively or explicitly fire off a search or even choose words to search on, IQ uses intelligent analysis of the page a user is reading or writing to find related information, again regardless of its source, whether the local computer, the internet or television."
Blinkx
■ Limited range of file formats supported
■ Indexing is very slow
■ Memory hungry
■ Not intuitive
■ "Intelligent Analysis" loses the plot if you have a wide range of interests
Ask Jeeves
■ sp.ask.com/docs/desktop/
■ Document support limited
■ Not easy to pause indexing
■ Does not search content of PDF, ZIP or Excel files
■ No web history search
■ Can be unstable
  – regularly crashed my machine
Google Desktop Search
■ desktop.google.com
■ File formats supported improving
■ Third party plugins provide support for file formats
■ Indexes documents as you view them, even password protected files unless you tell it not to
■ Includes https files unless you tell it not to
■ Sends back anonymous info about your searches unless you tell it not to
■ Problems with persistent cache and indexes
  – remove function not easy to use
MSN Desktop
■ www.msn.com
■ Document formats support is limited
■ Spots when you are active on your PC and suspends indexing quite quickly
  – also a "snooze" button you can use to stop indexing immediately.
■ Good sort options
Copernic Desktop
■ www.copernic.com
■ Supports "the usual suspects" plus Firebird, Mozilla, Netscape and Thunderbird
  – Advanced options allow you to
    – add file types to index
    – add text file types to index
■ Searches as you type
■ Dynamic indexing
■ Has a good pause button to stop indexing
■ Uses Alltheweb for the Web search
Copernic Desktop (2)
■ Individual results are displayed in a Quick Preview pane below the search results
■ Results categorised into groups that change depending on search type e.g. email, file, music
■ Search terms are highlighted
■ Enterprise option - Coveo
Yahoo Desktop
■ desktop.yahoo.com
■ Based on X1 Desktop Search
■ Supports over 200 document types + media files
■ Searches inside zip files
■ Good preview
■ No dynamic indexing
■ Enterprise search - X1
Which Desktop Search?
Karen's choice
1. Yahoo
2. Copernic
3. MSN
4. Google
5. Ask Jeeves
6. Blinkx
UWEBI's choice
1. Copernic
2. Yahoo
4. MSN
5. Google
6. Ask Jeeves
11. Blinkx
Alternatives
■ Windows Search
■ Agent Ransack/File Locator Pro
  – www.mythicsoft.com
  – "Unlike other search products FileLocator Pro does not consider any file too small or insignificant to examine."
  – sophisticated (geekish?) search options
  – like Windows Search can take a long time to search your PC
Should you install desktop search?
■ Security issues to consider and conflicts with document management policies
■ Safest one so far seems to be Yahoo
■ All limited in file types supported
  – Yahoo most comprehensive
■ Do not rely on desktop search to find your documents
  – far better to have well organised, structured documents folders
The future for desktop search
■ Limited take up at present in the corporate environment
  – some are specifically prohibited
■ Many tools need to improve drastically
  – many will not bother
■ Microsoft to integrate desktop search into the operating system
■ Will still need tools for non-Microsoft platforms and non-Microsoft document formats
Storing and organising resources
■ Organise bookmarks/favorites
■ Add frequently used sites to links tool bar
  – www.rba.co.uk/search/toolbar.htm
■ Copy URLs and descriptions to your own web page or Word document
■ Firefox users
  – Copy URL extension
    – copy URL, page title and any highlighted text
    – paste into application of your choice
Storing and organising resources
■ Netsnippets
  – netsnippets.com
  – "Capture. Organize. Share"
  – stores selected text, whole web pages, files along with your comments
  – organise research and pages into folders or "projects"
  – share that information with colleagues and friends
  – help produce a report from your research
Storing and sharing bookmarks online
■ Store, comment on and organise resources via third party web site
■ Can keep your bookmarks private, make them totally public or share them with selected individuals
■ Ideal for sharing project resources amongst a group of widely dispersed co-workers
■ But service could vanish so make backups
■ Examples
  – Furl.net, Spurl.net, del.icio.us, de.lirio.us, Connotea.org
Contact details
Karen Blakeman
RBA Information Services
email: [email protected]
web: rba.co.uk
Tel: +44 118 947 2256
Fax: +44 20 8020 0253