Transcript Slide 1

NWLIP
Friday 21st November 2008
Internet Search Techniques
Karen Blakeman
RBA Information Services
Photo: http://www.flickr.com/photos/harry_manback/17204494/
18 July 2015
Karen Blakeman www.rba.co.uk
1
This work is licensed
under a Creative Commons Attribution 3.0
2.5 License
Search engine meltdown
 AlltheWeb Livesearch - gone
– AlltheWeb itself still alive but no further development (uses
Yahoo databases)
 Ask – gone downhill since latest makeover
 Live.com
– link and linkdomain commands – now you see ‘em, now you
don’t
– Academic Live, Live Books – gone
 Yahoo
– NOT command, parentheses, Mindset - gone
 Exalead
– Approximate spelling (transformed into smellslike, spell slike!)
– ‘regular expression’ internal masking of letters - gone
 Accoona – gone
– http://www.rba.co.uk/wordpress/2008/10/05/accoona-is-no-more/
What’s new?
 Google
– Knol, much improved Google Finance, lots of tweaks to existing
services
– Google SearchWiki now in action http://www.google.co.uk/ - sign in
with your Google account
 Ask
– Yet another makeover, new layout, return to “Ask a Question”
 MSE360
– http://www.mse360.com
 Silobreaker - http://www.silobreaker.com/
 Search visualisation tools
– e.g. Cluuz ……
 Lots of Web 2.0 ‘stuff’
 Cuil
– Cuil not so Cool
– http://www.rba.co.uk/wordpress/2008/07/28/cuil-not-so-cool/
Search techniques – a reminder
 Search engines still search for all of your terms by default
– but note that Google also looks for terms in ‘links to’
 Double quote marks around phrases
– e.g. “climate change”
 To exclude pages containing a term, precede the term with a
minus sign (-)
 Boolean search
– OR, AND, NOT
– must use capital letters for the operators
– only OR works in Google and even that does not work well
– Live.com, Exalead and MSE360 are best (Yahoo has withdrawn
NOT, and nested Boolean)
– for example chemical engineer AND (inurl:cv OR
intitle:cv) AND (oil OR petroleum)
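A nested Boolean strategy like the example above can be assembled from small pieces, which makes it easier to keep the operators in capitals and the parentheses balanced. This is only a sketch: the helper names any_of and all_of are made up for illustration, and whether an engine honours the result depends on its Boolean support as described above.

```python
def any_of(*terms):
    """Join alternatives with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*parts):
    """Join required parts with AND (operators must be in capitals)."""
    return " AND ".join(parts)


# Rebuild the example strategy from the slide.
query = all_of(
    "chemical engineer",
    any_of("inurl:cv", "intitle:cv"),
    any_of("oil", "petroleum"),
)
print(query)
```

Paste the printed string into an engine with nested Boolean support, such as those listed above.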
Search techniques – a reminder
 Repeat your key search terms in your strategy
– chocolate production UK france belgium
– chocolate production UK france belgium belgium belgium
• give different results
 Change the order of your terms
– chocolate production Belgium Switzerland
– production Belgium Switzerland chocolate
• different results
 See the summary and comparison chart for the major search
engines at http://www.rba.co.uk/search/compare.pdf and
http://www.rba.co.uk/search/compare.shtml
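Repeating and reordering terms is tedious by hand, so a few lines of script can list the variants to try. This is a rough sketch only: the function names are hypothetical, and it just produces query strings to paste into a search box. It says nothing about how any engine will rank them.

```python
from itertools import permutations


def reorderings(terms):
    """Return every ordering of the given search terms as a query string."""
    return [" ".join(order) for order in permutations(terms)]


def with_repeats(terms, key, times):
    """Append extra copies of one key term to the end of the strategy."""
    return " ".join(list(terms) + [key] * times)


variants = reorderings(["chocolate", "production", "Belgium", "Switzerland"])
repeated = with_repeats(
    ["chocolate", "production", "UK", "france", "belgium"], "belgium", 2
)
print(len(variants), "orderings to try")
print(repeated)
```

Four terms already give 24 orderings, so in practice it makes sense to try only a handful.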
File format search
 Use advanced search options to limit your search to file
types or format:
– pdf or doc for government or industry/market reports
– xls for data and statistics
– ppt or pdf for presentations
 Run in at least Google, Yahoo and Live
 Looking for experts on a topic or presentations?
– Slideshare http://www.slideshare.net/
– authorSTREAM http://www.authorstream.com/
– YouTube http://www.youtube.com/
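For engines that accept a command-line file type limit, the file format searches above can be generated in one go. This sketch assumes the filetype: command, which Google accepts. Other engines may use a different command or only offer the limit on their advanced search screens, so check before relying on it. The function name and the sample topic are illustrative only.

```python
def filetype_queries(topic, extensions):
    """Build one query per file extension using the filetype: command."""
    return [f"{topic} filetype:{ext}" for ext in extensions]


# Reports as pdf or doc, data as xls, presentations as ppt.
for q in filetype_queries("wind energy market report", ["pdf", "doc", "xls", "ppt"]):
    print(q)
```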
Unique Google search features
 Automatically looks for variations on your terms
– to stop it, precede your terms with plus signs
e.g. air +pollution or put your term in double quotes e.g. “Smyth”
 Synonym search
– precede your search terms with a tilde (~)
 Numeric range search
– now on advanced search page
– can be weights, distances, years, prices
– Command line syntax is
• search term(s) first value..second value unit of
measurement
– TV advertising spend forecasts 2005..2012
– toblerone 1..5 kg
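The numeric range syntax above (search terms, then first value..second value, then an optional unit) is easy to get wrong by adding spaces around the two dots. A small helper, with a made-up name, keeps the pattern consistent:

```python
def numeric_range_query(terms, low, high, unit=""):
    """Build 'terms low..high unit' with no spaces around the two dots."""
    query = f"{terms} {low}..{high}"
    if unit:
        query = f"{query} {unit}"
    return query


print(numeric_range_query("TV advertising spend forecasts", 2005, 2012))
print(numeric_range_query("toblerone", 1, 5, "kg"))
```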
Unique Google search features
 Proximity
– use the asterisk (*) to stand in for one or more terms
– macular * degeneration picks up
• macular retinal degeneration
• macula disciform degeneration
• macular choroidal degeneration
• macular vitelliform degeneration
• macular pigmentary degeneration
– adding extra * changes the results
– add, remove spaces between * * to change ranking of results
• why does it do that – who knows?
– no information on maximum number of terms of separation
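Since extra asterisks and the spacing between them both change the results, it can help to generate the variants systematically and run each one. This is a sketch under that assumption only. The function name is hypothetical and the code makes no claim about how Google interprets each form.

```python
def asterisk_variants(before, after, max_gap=3, spaced=True):
    """Build queries with 1..max_gap asterisks between two terms.

    spaced=True gives '* *', spaced=False gives '**'.
    """
    sep = " " if spaced else ""
    return [
        f"{before} {sep.join(['*'] * n)} {after}"
        for n in range(1, max_gap + 1)
    ]


for q in asterisk_variants("macular", "degeneration"):
    print(q)
for q in asterisk_variants("macular", "degeneration", spaced=False):
    print(q)
```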
Firefox – Customise Google Add-on
 Adds numbers to Google search results (position counter)
 Links to other search engines
 Stream search result pages
 Adds links to the Wayback Machine
Use something other than Google
Ask
 http://www.ask.com/, http://www.ask.co.uk/
 Suggestions for narrowing down or expanding your search
 Particularly good for blogs
 Big News gone
 Search interface and options revamped
– new Q&A tab
Exalead
 http://www.exalead.com/
 http://www.exalead.co.uk/
 Supports wild cards
– asterisk (*) at the end of a word
• pollut* finds pollute, pollutant, polluting etc.
 NEAR - finds words within 16 terms of one another
– NEAR/n finds words within n terms of one another
• climate NEAR/3 change
 Approximate spelling, phonetic search (?)
 Regular expression (internal masking of letters)
 Feedback from users is that there is more European content,
which seems to be given priority
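To see what a proximity operator such as NEAR/n is doing, it can be mimicked on a piece of local text. This is an illustration of the idea only, not Exalead's implementation. How Exalead counts terms (for example whether the limit is inclusive) is an assumption here, and the function name near is made up.

```python
import re


def near(text, first, second, n=16):
    """Return True if the two words occur within n words of one another."""
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(a - b) <= n for a in first_positions for b in second_positions
    )


print(near("Climate policy must change", "climate", "change", 3))
print(near("Climate and energy policy will not change", "climate", "change", 3))
```

The first sentence matches because the two words are three terms apart. The second does not, because they are six terms apart.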
Live Search
 http://www.live.com/
 Results tend to be more consumer oriented
 Has the most up to date database
 Possibly has the most extensive database of web pages
 Good image search option
 Feed command for locating RSS feeds on a specified web
site
– site:bbc.co.uk feed:bbc.co.uk
 Revamped interface but no improvement in advanced
search screen
 Link commands, Live Books and Academic Live axed
Yahoo!
 http://search.yahoo.co.uk/ http://search.yahoo.com/
 Boolean AND, OR
– NOT no longer available – use the minus sign.
– parentheses do not work
 Indexes first 500 K of a document (Google 101 K)
 Square brackets round terms to pick up terms on the
page in the order specified
– [carbon emissions trading]
 Region command (inherited from Inktomi)
– region: e.g. region:europe, region:mediterranean
– others are africa, asia, centralamerica, northamerica,
southamerica, mideast, southeastasia, downunder
MSE360.com
 http://www.mse360.com/
 See reviews at
– http://www.rba.co.uk/wordpress/2008/10/05/mse360-search/
– http://www.rba.co.uk/wordpress/2008/10/06/update-on-mse360/
 Full Boolean nested search options
 No advanced search screen but can use commands e.g.
filetype:, site:
 ‘Tiered’ results – Web, Wikipedia, blogs
 Customise results layout
 Tags sites that you have already visited (Firefox only at
present)
 Quick to respond to bug reports and fix problems
Zuula.com
Intelways.com
Science/Academic Search Engines
 RefSeek – http://www.refseek.com/
 Ten Science Search Engines http://hwlibrary.wordpress.com/2008/09/22/science-searchengines/
– Scirus – http://www.scirus.com/
– Scitopia.org – http://www.scitopia.org/
– Science.gov – http://www.science.gov/
– ScienceResearch.com - http://www.scienceresearch.com/
– Scitation - http://scitation.aip.org/
– WorldWideScience.org - http://worldwidescience.org/
– Science Accelerator - http://www.scienceaccelerator.gov/
– TechXtra – http://www.techxtra.ac.uk
– search.optics.org - http://search.optics.org/
 Google Scholar – http://scholar.google.com/
– use with care
Books
 Amazon
 Google Books
– can sometimes search inside the book and looks at individual pages
– useful for older texts
– suppliers of the book
 Project Gutenberg
– electronic versions of over 25,000 texts
– different editions may be available e.g. Darwin’s Origin of Species
 Book swap schemes
– Turning over an old leaf
– http://www.guardian.co.uk/environment/2008/may/01/ethicalliving.recycling
– e.g. http://www.bookmooch.co.uk/
News
 BBC – http://news.bbc.co.uk/
 Search engine news options e.g. Yahoo, Google
– have only the last 30 days of free news
– advanced search options limited and unreliable
– no source list, and sources frequently change
– key industry publications may not be included
 Google News Archive
http://www.google.com/archivesearch
– some sources going back 200 years
– many articles are priced (before you buy check other
sources)
 Silobreaker - http://www.silobreaker.com/
 Chipwrapper - http://www.chipwrapper.co.uk/
Silobreaker http://www.silobreaker.com
 covers free resources
 news, blogs, video, images
 market trends
 geographical location of stories
 people
 networks
Chipwrapper http://www.chipwrapper.co.uk
 Google Custom Search engine
 Searches everything available on 15 free UK News Sites
 No date sort option but typing in the year usually works
Yahoo Finance
Google Finance
Blog searching
 Google Blogsearch
– http://www.google.com/blogsearch
 Ask – Blogs and feeds
– http://www.ask.com/
 Exalead
– http://www.exalead.com/
– limit search to Site Type Blog
 Live Search
– http://www.live.com/ and select Feeds
 Blog and feed search engines
– Technorati.com, Blogpulse.com
Blogpulse search and trends
Click on the graph
to see ‘trends’
Blogpulse trends
pipl
 http://www.pipl.com/
 Review at
http://www.rba.co.uk/wordpress/2007/05/05/pipl-peoplesearch-beta/
 Searches ‘hidden’ web + Google search
– blog search, Google Groups, LinkedIn, Flickr, Google
Scholar, Electoral Roll, Directories, Amazon, Hoovers,
Zoominfo etc.
– Google web search results not the same as an ordinary
Google search – they incorporate terms such as resume,
CV
Zoominfo - Karen Blakeman’s verified profile
Information
‘verified’ by
Karen Blakeman
View the ‘references’
(web pages) to see the
information in context
LinkedIn
Facebook
Cluuz
 http://www.cluuz.com/
 “Cluuz … core technology understands the
relationship between the entities, terms, or persons
searched leading to more relevant, easy to
understand search results”
 Not totally intuitive but the network visualisation is
‘cool’
 The links in the network visualisation do not always
relate to the same person or organisation but they are
usually working in a similar field or subject area
 Results change from one day to the next, one hour to
the next, but still worth a look
Cluuz
Create your own search engine
 Examples:
– AlacraSearch
• http://www.alacra.com/alacrasearch
– pipl
• http://www.pipl.com/
– Chipwrapper
• http://www.chipwrapper.co.uk/
 Google Custom Search Engines
– http://www.google.com/coop/cse
– can be hosted on your own site or on Google
• http://www.rba.co.uk/sources/energy.shtml
• http://www.google.com/coop/cse?cx=014304212364962740038:tui4ebh5r_a
‘Disappearing’ pages
 Search engine cache copies
– Google, Yahoo, Live, Ask, Exalead
 Wayback machine
– http://www.archive.org/
– from 1996 to about 6 months ago
– navigate the archived site or type in the full URL of the
document if known
 Firefox users
– install the Resurrect Pages add-on
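If the full URL of the missing document is known, the Wayback Machine address can be built directly rather than navigating the archived site. The sketch below assumes the archive's web.archive.org/web/ address pattern, which is not given on the slide. The function names are made up, and the code only builds the links without checking whether a capture exists.

```python
def wayback_listing_url(url):
    """Link to the list of all captures the archive holds for a URL."""
    return f"http://web.archive.org/web/*/{url}"


def wayback_snapshot_url(url, timestamp):
    """Link to a capture near a timestamp (digits in YYYYMMDDhhmmss order)."""
    return f"http://web.archive.org/web/{timestamp}/{url}"


page = "http://www.rba.co.uk/search/compare.shtml"
print(wayback_listing_url(page))
print(wayback_snapshot_url(page, "20081121"))
```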
Wayback Machine
Karen Blakeman
Tel: 0118 947 2256, +44 118 947 2256
[email protected]
http://www.rba.co.uk/
blog: http://www.rba.co.uk/wordpress/
Facebook – Karen Blakeman
Twitter: karenblakeman
http://www.slideshare.net/KarenBlakeman
Photo: Dan Cupid http://commons.wikimedia.org/wiki/Image:Manchester_Piccadilly_Station_interior.jpg