Key Word in Context Finder
A Concordance Tool for Online
Learning and Research
William H. Fletcher
United States Naval Academy
Paper presented at the annual meeting of
EuroCALL, University of Abertay Dundee, Scotland,
31 August – 2 September 2000.
Info, info everywhere…
I’ve got no time to think!
Information Overload
Increasing information access
has led to information excess
leaving us no time to process
the information we do access.
InfoAge Paradox
The more information we have,
the less we seem to be able to do with it.
InfoAge Writer’s Block
Any intellectual property yet, honey?
Is the InfoAge leading us
toward disaster?
What does the future
hold for our students?
Actually I’m hoping that what I grow
up to be hasn’t been invented yet.
What can we do to
prepare our
students for the
future?
The Purpose of an Education
To learn to distinguish
Sense from Nonsense
The details change, but the
principles remain the same.
Help turn the
Information Age
into the
Knowledge Age
• Raw Bits into Information
• Information into Knowledge
The FAIRness Principle
• Find information
• Assess its relevance, significance,
and reliability
• Integrate it into one’s knowledge
base
• Retrieve it efficiently
Isn’t the Web wonderful?
The Dream…
With so much information at their fingertips,
students will attain new levels of insight and
synthesis
The Reality…
Random Cut & Paste (with attribution?)
replaces original research and creativity
Isn’t the Web wonderful?
The Dream…
Instant access to authentic materials from
around the globe
The Reality…
World Wide Wait + World Wide Haystack =
World Wide Frustration
Isn’t the Web wonderful?
The Dream…
With so many powerful Search Engines it
should be easy to find appropriate materials
online
The Reality…
Search Engines are designed to optimize the
number, not the relevance, of hits
KWiCFinder – Purpose
A research tool to…
• help formulate and focus queries
• automatically retrieve and excerpt
documents matching the search criteria
A search produces a Key Word in Context
concordance of the documents analyzed
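A Key Word in Context concordance lines up each occurrence of a search term with the text on either side of it. The following is a minimal sketch of that idea in Python, not KWiCFinder's own code; the function name `kwic` and the `width` parameter are assumptions made for illustration.

```python
import re

def kwic(text, keyword, width=30):
    """Return one concordance line per whole-word match of keyword.

    Each line holds up to `width` characters of left context, the key word
    in brackets, and up to `width` characters of right context.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the key words form a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines

sample = ("The Web is my corpus. A corpus this large needs a concordance "
          "tool, because no one can read a whole corpus.")
for line in kwic(sample, "corpus", width=20):
    print(line)
```

Aligning the key word in a fixed column is what lets a reader scan many citations quickly for collocates and recurring patterns.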
KWiCFinder – Genesis
KWiCFinder was motivated by…
• observation of and discussions with
students and colleagues conducting online
searches
• my own frustration with the inordinate
effort required to find suitable content for
instructional materials and research
KWiCFinder – AltaVista
KWiCFinder uses the AltaVista search engine
• Largest corpus indexed
– claims 340 million documents
• Useful features for highly-specific searches
– World-wide coverage; specify language
– Distinguishes case and special characters
– Best documentation of how Boolean and
other criteria are implemented
– No stopwords excluded from index
KWiCFinder Enhancements
Offers significant enhancements to AltaVista
for foreign languages:
• Intuitive approach to special char input
• Tamecards greatly increase query
specificity with limited user overhead
• Point-and-click inclusion of countries, date
ranges, domains, etc. for search criteria not
included in report
More KWiCFinder
Enhancements
• AltaVista supports NEAR, i.e. two
search terms must occur within 10
words of each other
• KWiCFinder adds BEFORE and
AFTER, and allows the user to
specify up to how many words
apart
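The three proximity operators can be sketched as a check on word positions. This is an illustration only, assuming that "within n words" means the two terms' word indexes differ by at most n; the function names and the `max_gap` and `order` parameters are hypothetical, not KWiCFinder's interface.

```python
import re

def word_positions(text, term):
    """Indexes of the words in text that equal term (case-insensitive)."""
    words = re.findall(r"\w+", text.lower())
    return [i for i, w in enumerate(words) if w == term.lower()]

def proximity(text, a, b, max_gap=10, order="NEAR"):
    """True if a and b occur within max_gap words of each other.

    order="NEAR"   : either order
    order="BEFORE" : a precedes b
    order="AFTER"  : a follows b
    """
    if order not in ("NEAR", "BEFORE", "AFTER"):
        raise ValueError("order must be NEAR, BEFORE, or AFTER")
    for i in word_positions(text, a):
        for j in word_positions(text, b):
            dist = j - i  # positive when a comes first
            if dist == 0 or abs(dist) > max_gap:
                continue
            if order == "NEAR":
                return True
            if order == "BEFORE" and dist > 0:
                return True
            if order == "AFTER" and dist < 0:
                return True
    return False

text = "the key word appears in context here"
print(proximity(text, "key", "context", max_gap=4, order="BEFORE"))
print(proximity(text, "key", "context", max_gap=4, order="AFTER"))
```

With the default `max_gap=10` and `order="NEAR"` the function behaves like the AltaVista operator described above; the other two settings add the direction and distance controls.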
KWiCFinder Query Screen
KWiCFinder Inclusion Screen
KWiCFinder Exclusion Screen
KWiCFinder Option Screen
KWiCFinder – “Sic” Option
AltaVista Wildcards – Implicit
• Any plain char matches all corresponding
chars with diacritics
• Any lower-case char matches the corresponding
UPPER-CASE char as well
- continuo always matches continuo,
continúo, continuó
- fahrt always matches fahrt, Fahrt, fährt
KWiCFinder offers a “sic” option to block
these built-in wildcards when desired
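The implicit matching rules and the effect of a "sic" switch can be sketched as follows. This is an assumption-laden illustration rather than AltaVista's or KWiCFinder's implementation: it strips diacritics via Unicode decomposition, so letters that do not decompose (such as ø) are not folded, and the names `word_matches` and `sic` are hypothetical.

```python
import unicodedata

def strip_marks(ch):
    """Remove combining diacritics from a character (ä -> a, ó -> o)."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def char_matches(q, t, sic=False):
    """Does query char q match text char t under the implicit rules?"""
    if sic:
        return q == t            # "sic": exact match only
    if strip_marks(q) == q:      # plain query char ignores diacritics in text
        t = strip_marks(t)
    if q.islower():              # lower-case query char ignores case in text
        t = t.lower()
    return q == t

def word_matches(query, word, sic=False):
    query = unicodedata.normalize("NFC", query)
    word = unicodedata.normalize("NFC", word)
    return len(query) == len(word) and all(
        char_matches(q, t, sic) for q, t in zip(query, word))

for candidate in ["fahrt", "Fahrt", "fährt"]:
    print(candidate, word_matches("fahrt", candidate),
          word_matches("fahrt", candidate, sic=True))
```

Note the asymmetry: the plain query fahrt finds fährt, but the query fährt does not find fahrt, because only plain and lower-case query characters act as wildcards.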
KWiCFinder – Wildcards
AltaVista Wildcards – Explicit
* (asterisk) matches 0-5 chars; must be
preceded by 3 chars minimum
KWiCFinder adds
% (percent) matches 0-1 chars
? matches 1 char
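One way to picture these explicit wildcards is as a translation into a regular expression. The sketch below assumes that each wildcard stands for word characters and that "preceded by 3 chars" means the asterisk cannot appear before the fourth character of the term; `wildcard_to_regex` and `matches` are names invented for this example.

```python
import re

def wildcard_to_regex(query):
    """Translate *, % and ? in a search term into a compiled regex."""
    parts = []
    for i, ch in enumerate(query):
        if ch == "*":
            if i < 3:
                raise ValueError("* must be preceded by at least 3 characters")
            parts.append(r"\w{0,5}")   # 0 to 5 characters
        elif ch == "%":
            parts.append(r"\w?")       # 0 or 1 character
        elif ch == "?":
            parts.append(r"\w")        # exactly 1 character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

def matches(query, word):
    return wildcard_to_regex(query).fullmatch(word) is not None

print(matches("colo%r", "colour"), matches("colo%r", "color"))
print(matches("wom?n", "women"), matches("wom?n", "womn"))
print(matches("learn*", "learning"), matches("learn*", "learnability"))
```

The narrower % and ? wildcards make it possible to cover spelling variants and inflected forms without the flood of unrelated words that a trailing asterisk can bring in.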
KWiCFinder
Additional Wildcards
KWiCFinder Wildcards – Implicit
Apostrophe ’ and hyphen - match forms
with these marks, space, or no space
- (ich) hab’s matches hab’s and habs
- on-line matches on-line, on line, online
- co-operate matches co-operate,
cooperate, coöperate
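The apostrophe and hyphen rule can also be sketched as a regular expression in which each mark becomes optional and interchangeable with a space. This is an illustration under stated assumptions, with the hypothetical helper name `separator_regex`; it covers only the separator itself, so a variant like coöperate, which also depends on the diacritic rule, is outside this sketch.

```python
import re

def separator_regex(query):
    """Let ' (or ’) and - in a term match the mark, a space, or nothing."""
    parts = []
    for ch in query:
        if ch in "'’":
            parts.append(r"['’ ]?")   # apostrophe, space, or nothing
        elif ch == "-":
            parts.append(r"[- ]?")    # hyphen, space, or nothing
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

pattern = separator_regex("on-line")
for candidate in ["on-line", "on line", "online", "on_line"]:
    print(candidate, pattern.fullmatch(candidate) is not None)
```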
KWiCFinder Tamecards
• Wildcards return any sequence of n chars
• Tamecards match only those combinations
of chars specified by the user.
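The difference from a wildcard can be shown by expanding a term into exactly the variants the user lists. How KWiCFinder writes tamecards in a query is not shown in these slides, so this sketch takes the alternatives as plain Python lists rather than guessing at the notation; the function name `expand` is hypothetical.

```python
from itertools import product

def expand(segments):
    """Expand a term into every user-specified variant.

    segments is a list whose items are either a literal string or a list
    of alternative strings allowed at that position.
    """
    options = [[s] if isinstance(s, str) else list(s) for s in segments]
    return ["".join(combo) for combo in product(*options)]

# Only the listed vowels are allowed; a one-character wildcard in the same
# position would also admit "song" and any other single character.
print(expand(["s", ["i", "a", "u"], "ng"]))
# An empty alternative makes a segment optional.
print(expand(["colo", ["", "u"], "r"]))
```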
KWiCFinder Tamecards – Plain
KWiCFinder Tamecards – Indexed
KWiCFinder – Latest Innovations
XML report format provides for…
• Changing report language or format
without new search
• Annotating, classifying, and deleting
citations or documents
• Merging search reports or
extracting subsets of them
KWiCFinder – Current Status
• Download for free at
http://miniappolis.com/KWiCFinder/
• Auto-update keeps up with changes in
AV search engine and KF program
• Caveats and Limitations
– still beta software, subject to bugs and changes
– observe system requirements (Win 9x,
technical specs, IE 5+; NT/2000??)
– “emerging” documentation
The Web is My Corpus,
I Shall Not Want…
• I shall not want for material
– Numerous documents can be found online in almost any language
• I shall not want for challenges
– Formulate a query that yields optimally
useful results
– Determine which documents are reliable
and representative in form and content
KWiCFinder as a
Teaching and Learning Tool
• Micro / Form Level
– Lexical: word use, vocabulary discovery
and verification, collocates, lexical functions
– Grammatical: examples of forms
and structures in context
• Macro / Content Level
– Assess usefulness (relevance, accessibility)
of many documents quickly and efficiently
Preparing Learners To Use
KWiCFinder
• Pre-Search Organizer
– What words or phrases do you need to see in
context? Which are essential? Which are
alternatives?
– What other factors (country, date, other
terms) might help narrow your search?
• Post-Search Evaluation
– Documents chosen: Why? Contribution?
– Documents rejected: Why?
Limitations
KWiCFinder is only as good as the
Search Engine it uses. Spurious
hits result because…
– AV is not always current due to “link rot”
and changed pages
– Documents returned by a general search
on AV may not meet the more-specific
criteria submitted to KWiCFinder