Size of Internet/World Wide Web


Basic Internet Search Techniques

Or How to really find information on the internet

Shayna Keces, Reference Librarian
236-0301 ext. 441

August 2004

Agenda

Size of Internet
Types of search engines
Search strategies
Some hints on selecting search strategies
Interpretation of search results
Tutorials on searching and search engines

Size of Internet/World Wide Web

July 2000: 2.1 billion web pages; est. 4 billion pages by early 2001
(Some place it much higher if the invisible or deep web is counted)
Size of search engine databases:
  Google: 4.28 billion
  Fast (alltheweb): 2.1 billion
  AltaVista: 1.1 billion
  Yahoo: 2 million catalogued

Search strategies

Do not
  use the search button
  use a string of keywords without specifying Boolean properties
  use upper case unless it is part of your strategy
  use NOT or - unless you are absolutely sure it is necessary
    it can eliminate pages you did not anticipate
    the format is not standardized

Search Strategies

Do

Consider what type of resource will best answer your question and search for that resource (e.g. a dictionary or a certain type of web page)
Think of a list of keywords that will narrow or broaden your search, keeping in mind that on the internet, narrowing your search is usually better
Stick to a small list of search engines and learn the search syntax for the search engine you’re using

Types of search engines

Keyword or robot based (builds a database)
Directory based (categories indexed by people rather than by computer)
Annotated directory-based search engines
Meta indexes (can combine searches or allow you to search a variety of engines individually)
Specialized search engines

Keyword or robot based Search Engines

Large database of web pages
No human involvement and no quality control
You can submit a website, or the engine will find some on its own
Searches full text to a certain level; does not search the deep or invisible web
Google (www.google.com)
Alta Vista (www.altavista.com)
Fast (www.alltheweb.com)
Wisenut (www.wisenut.com)

Google (www.google.com)

Presently the largest database (ca. 4 billion)
Very sophisticated placement of results
Particularly good for popular sites and company sites
Advanced search can limit the search to the title of the page or to the URL
Implied AND
Use + for stop words
If you want OR, it needs to be expressed in caps
Not case sensitive
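The syntax rules on this slide can be sketched in a few lines of Python. This is only an illustration of the rules as the slide states them (implied AND, + in front of stop words, OR in capitals, no case sensitivity). The function name `build_query` and the short stop-word list are assumptions made for the example; they are not Google's own list or any Google interface.

```python
# Illustrative subset of stop words (an assumption, not Google's actual list).
STOP_WORDS = {"a", "an", "and", "i", "in", "of", "the", "to"}


def build_query(required, alternatives=()):
    """Build a query string following the rules on the slide.

    required     -- words that must all appear (AND is implied by a space)
    alternatives -- words of which any one may appear (joined with OR in caps)
    """
    parts = []
    for term in required:
        word = term.lower()  # the engine is not case sensitive, so lower case is enough
        # A stop word is only searched if it is prefixed with +
        parts.append("+" + word if word in STOP_WORDS else word)
    if alternatives:
        # OR must be written in capitals to be treated as an operator
        parts.append(" OR ".join(a.lower() for a in alternatives))
    return " ".join(parts)


print(build_query(["Star", "Wars", "Episode", "I"]))   # star wars episode +i
print(build_query(["hotel"], ["Ottawa", "Gatineau"]))  # hotel ottawa OR gatineau
```

The only word left in capitals is the OR operator, which matches the slide's advice not to use upper case unless it is part of the strategy.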

Google (www.google.com) cont.

No stemming or truncation (except on an ad hoc basis controlled by Google)
Description shows keywords in context
Cached pages are helpful for sites that are not working
Searches some formats not found in other search engines (e.g. Adobe Acrobat and PostScript files; Excel, PowerPoint, and Word files; and rich text files)
Innovative in new features (e.g. the ability to convert measurements, such as 4 miles in km)
See www.google.ca/help/features.html for a description of features.

AltaVista (www.altavista.com)

One of the larger search engines (1.3 billion pages/objects or more)
Particularly good for finding less popular sites
Implied “and” (but AltaVista is noted for changing this)
Case sensitive when the word is in quotations
Stemming with * at the end or in the middle of words
Has related terms, which help you focus your search
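To show what stemming with * does to a set of candidate words, here is a minimal sketch using Python's standard `fnmatch` module. It assumes that * stands for any run of characters, including none; the slide does not say how many characters AltaVista allows, so treat that as an assumption. The helper name `matches` and the word list are made up for the example.

```python
from fnmatch import fnmatchcase


def matches(pattern, words):
    """Return the words that fit a pattern in which * stands for any run of characters."""
    # Both sides are lower-cased so the comparison ignores case.
    return [w for w in words if fnmatchcase(w.lower(), pattern.lower())]


words = ["library", "libraries", "librarian", "liberty", "colour", "color", "collar"]

# * at the end of a word picks up different endings
print(matches("librar*", words))  # ['library', 'libraries', 'librarian']

# * in the middle of a word picks up spelling variants
print(matches("colo*r", words))   # ['colour', 'color']
```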

AltaVista Advanced Search

Has a “build a Boolean search” facility, or you can create your own
Can specify that pages be from a certain country; this is based on country codes, so it will not include .com sites etc.
Can specify dates of last modification

Directory-based Search Engines

Indexed by individuals, so subject searches will be more accurate
Smaller database than robot engines
Used mainly for finding a good site on a general topic
Yahoo (www.yahoo.com or ca.yahoo.com)
About (about.com)
Looksmart (www.looksmart.com)

Yahoo (ca.yahoo.com)

Most popular of the directory-based search engines
Many different versions (the international versions have the same pages as the others, but local options are supplied first)
Now has its own web search, which is competing with Google’s
Can search by categories and sub-categories

Annotated directory-based search engines

Because entries are annotated, the database is even smaller than that of a directory-based engine
Quality of web pages is better
Web pages are often rated
Librarian’s Index to the Internet (lii.org)
The Internet Public Library (www.ipl.org/)

Librarian’s Index to the Internet (lii.org)

Topical list of high-quality websites with abstracts and qualitative analysis
Can winnow down by topic or use the search capability
Only websites which meet the standards of the editors are included
Provides the date the site was added to the index as well as the date the lii entry was last updated

Meta indexes

One site searches more than one search engine
Results can be separated or combined
Sometimes there is a problem in interpreting the question equally effectively for all the search engines
Used if you are not sure which search engine will give you the best results, and/or for obscure topics
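The difference between separated and combined results can be sketched as follows. This is a simplified illustration, not how any particular meta index works: the engine names and URLs are hypothetical, and the merging rule (take each engine's results rank by rank and drop duplicates) is just one reasonable assumption.

```python
from itertools import zip_longest


def merge_results(results_by_engine):
    """Combine ranked result lists from several engines into one list.

    Results are taken rank by rank (every engine's first hit, then every
    engine's second hit, and so on), and a page already seen is skipped.
    """
    merged, seen = [], set()
    for tier in zip_longest(*results_by_engine.values()):
        for url in tier:
            # zip_longest pads shorter lists with None
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Hypothetical "separated" results, one ranked list per engine
separated = {
    "engine_a": ["example.org/a", "example.org/b", "example.org/c"],
    "engine_b": ["example.org/b", "example.org/d"],
}

combined = merge_results(separated)
print(combined)
# ['example.org/a', 'example.org/b', 'example.org/d', 'example.org/c']
```

Separated results keep each engine's own ranking visible; combined results are shorter to read because pages found by more than one engine appear only once.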

Meta indexes examples

Dogpile (www.dogpile.com)
Metacrawler (www.metacrawler.com/index.html)
Surfwax (www.surfwax.com)
Hotbot (www.hotbot.com)

Specialized Search Engines

Geographic based (www.altavistacanada.com, http://www.ottawastart.com/)
Phone directories (canada411.sympatico.ca/, www.infospace.com/canada/index.htm)
Newsgroup searching (groups.google.com)
News searching (news.google.ca)
Women’s information (wwwomen.com)
Different formats (www.gimpsy.com/, www.kartoo.com/)

Specialized sites

Ottawa Public Library (www.library.ottawa.on.ca)
Reference tools (see library reference sites, e.g. lii.org, www.ipl.org/ref)
Encyclopedias (www.britannica.com; Columbia Encyclopedia, www.bartleby.com/65/)
Canadian information (vrl.tpl.toronto.on.ca/; Canadian information by subject, www.nlc-bnc.ca/caninfo/ecaninfo.htm; Canadian Encyclopedia online, www.thecanadianencyclopedia.com/)

Some hints on selecting search strategies

For any page on a general topic to which you need an introduction, try a directory-based search engine.
If you do not need a specific level of quality, you can use the address bar search.
For the web page of a major company or organization, try Google or Alta Vista.
For a specific web page that would not necessarily be popular, try Alta Vista or Google.

Some hints on selecting search strategies cont.

For health topics, try a health website engine like www.medbroadcast.com or the Canadian Health Network (www.canadian-health-network.ca/customtools/homee.html), or the library’s health database, Health Source (www.library.ottawa.on.ca/electronic/index.htm), or the health links on the library’s web page (www.library.ottawa.on.ca/english/links/PublicAdults/index.htm).

Some hints on selecting search strategies cont.

For a very obscure topic, try Google or Alta Vista or one of the meta indexes.
For items in databases, try to find the correct host or search a special site for invisible websites (e.g. www.invisible-web.net/).

Interpretation of search results

Look at the results and reformulate the search using things like searching within results, Prisma, and adding new keywords.

Analytically choose which sites to look at in result list

Anatomy of a URL: domain name + type, i.e. the name of the organization followed by the type of organization. Some popular suffixes are: .com for commercial sites, .edu for university sites (mainly American), .org for non-profit organizations, .gov for U.S. government sites, and .gc.ca for Canadian government sites.
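Reading the suffix of a URL can also be done mechanically. The sketch below uses Python's standard `urllib.parse` module to pull the host name out of an address and look it up against the suffixes listed on this slide. The function name `site_type` is an assumption for the example, `www.example.gc.ca` is a hypothetical host, and the table covers only the suffixes mentioned above, so anything else is reported as unknown.

```python
from urllib.parse import urlparse

# Suffixes from the slide; .gc.ca is listed first so the longest suffix is tested first.
SUFFIXES = [
    (".gc.ca", "Canadian government"),
    (".com", "commercial"),
    (".edu", "university (mainly American)"),
    (".org", "non-profit organization"),
    (".gov", "U.S. government"),
]


def site_type(url):
    """Return the type of organization suggested by the suffix of the host name."""
    # urlparse only recognizes the host when the address has a scheme or starts with //
    if "//" not in url:
        url = "//" + url
    host = urlparse(url).hostname or ""
    for suffix, label in SUFFIXES:
        if host.endswith(suffix):
            return label
    return "unknown"


print(site_type("www.britannica.com"))        # commercial
print(site_type("http://www.ipl.org/ref"))    # non-profit organization
print(site_type("www.example.gc.ca"))         # Canadian government (hypothetical host)
print(site_type("www.library.ottawa.on.ca"))  # unknown (suffix not in the table)
```

The suffix is only a first hint about who is behind a page; the authority of the author and the purpose of the site still need to be judged by reading the page itself.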

Interpretation of search results cont.

Consider things like the authority of the author, the currency of the information, and the reason for creating the website (implications for bias).
Do not look through pages and pages of results. If the first three pages are not promising, refine the search (see the first point on interpreting the results).

Some useful tutorials for searching

See the “Learning to search” section of Collection of special search engines (appears under contents on the left-hand side of the page): www.leidenuniv.nl/ub/biv/specials.html
Web searching tips: www.searchenginewatch.com/facts/index.htm
Net tutor (gateway.lib.ohio-state.edu/tutor/les5/)

Some useful tutorials for searching cont.

In the links section of the Ottawa Public Library’s web site (www.library.ottawa.on.ca/english/links/PublicAdults/index.htm), look under the category WWW, under the subcategory Internet.

To find more info on search engines

Searchenginewatch (www.searchenginewatch.com)
Searchengineshowdown (www.searchengineshowdown.com)

For More Help on Searching

Contact the Reference Dept. of the Main Branch of OPL by phoning 236-0302, ext. 233, or email [email protected]

Consult this web page or other specialized web presentations on the library’s web page at http://www.library.ottawa.on.ca/english/services/reference/index.htm