THE INTERSPACE PROTOTYPE An Analysis Environment for


Concept Switching in the Interspace:
Networking Infrastructure for
Community Knowledge
Bruce Schatz
CANIS Laboratory
Graduate School of Library and Information Science
University of Illinois at Urbana-Champaign
Graduate School of Informatics, Kyoto University
[email protected], www.canis.uiuc.edu
IEEE Knowledge Media Networking KMN’02
Keynote Address, CRL, Kyoto Japan, July 11, 2002
THE THIRD WAVE OF NET EVOLUTION
CONCEPTS
OBJECTS
PACKETS
CONCEPT SPACES

from Objects to Concepts

from Syntax to Semantics

Infrastructure is Interaction with Abstraction
Internet is packet transmission across computers
Interspace is concept navigation across repositories
LEVELS OF INDEXES
FORMAL (manual): Technology, Engineering, Electrical, IEEE
INFORMAL (automatic): communities, groups, individuals
THE DISTRIBUTED WORLD



Community Repositories in the Interspace
Peer to Peer Networking Infrastructure
Every Person performs Every Role
USER (request)
LIBRARIAN (reference)
INDEXER (classify)
PUBLISHER (quality)
AUTHOR (generate)
Meta Data
How to Represent the Community Knowledge
Automatic and Interactive Representation Techniques for Capturing the Fundamental Structure
Meta Maps
How to Locate the Community Knowledge
Automatic and Interactive Location Techniques for Capturing the Fundamental Landscape
CONCEPTS ACROSS THE INTERSPACE
SCALABLE SEMANTICS

Automatic indexing
Domain-Independent indexing
Statistical clustering

Compute Context of




concepts within documents
documents within repositories
CROSS-OVERS IN SEMANTIC INDEXING
COMPUTING CONCEPTS
‘92: 4,000 (molecular biology)
‘93: 40,000 (molecular biology)
‘95: 400,000 (electrical engineering)
‘96: 4,000,000 (engineering)
‘98: 40,000,000 (medicine)
SIMULATING A NEW WORLD

Obtain discipline-scale collection
  MEDLINE from NLM, 10M bibliographic abstracts
  human classification: Medical Subject Headings

Partition discipline into Community Repositories
  4 core terms per abstract for MeSH classification
  32K nodes with core terms (classification tree)
  Community is all abstracts classified by core term

Simulating World of Medical Communities
  40M abstracts containing 280M concepts
  concept spaces took 2 days on NCSA Origin 2000
  10K repositories with > 1K abstracts (1K w/ > 10K)
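
The partitioning step above can be sketched in a few lines. This is a minimal illustration, assuming each abstract carries a list of core classification terms; the record layout, the toy ids, and the function name `partition_by_core_term` are hypothetical and not taken from the Interspace software. An abstract with several core terms lands in several communities, which is consistent with 10M source abstracts becoming about 40M repository entries at 4 core terms each.

```python
from collections import defaultdict


def partition_by_core_term(abstracts):
    """Group abstract ids into community repositories, one per core term.

    `abstracts` maps an abstract id to its list of core terms
    (a hypothetical layout chosen for this sketch).
    """
    repositories = defaultdict(list)
    for abstract_id, core_terms in abstracts.items():
        for term in core_terms:
            # An abstract joins every community named by one of its core terms.
            repositories[term].append(abstract_id)
    return dict(repositories)


# Toy data: hypothetical ids and terms, not real MEDLINE records.
abstracts = {
    "a1": ["Arthritis, Rheumatoid", "Analgesics"],
    "a2": ["Analgesics", "Gastrointestinal Hemorrhage"],
    "a3": ["Arthritis, Rheumatoid"],
}
repos = partition_by_core_term(abstracts)

# Total repository entries exceed the number of source abstracts,
# because each abstract is counted once per core term.
total_entries = sum(len(ids) for ids in repos.values())
```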
COMMUNITY PROCESSING
Semantic Indexing

Extracting Concepts (AI)



Canonical noun phrases
Generic statistical parser
Computing Context (IR)


Co-occurrence frequency, in collection
Useful interactively, not strict ordering
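
Computing context from co-occurrence can be sketched as follows. This is a simplified illustration, assuming the noun phrases have already been extracted (the parser is not shown) and using raw pair counts; a production concept space would likely weight and normalize these counts. The function names and toy phrases are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_concept_space(documents):
    """Count how many documents each pair of concepts appears in together."""
    cooccur = Counter()
    for concepts in documents:
        # Use each concept once per document and store pairs in sorted order.
        for a, b in combinations(sorted(set(concepts)), 2):
            cooccur[(a, b)] += 1
    return cooccur


def related_concepts(cooccur, concept, top_n=3):
    """Return the concepts that co-occur most often with `concept`."""
    scores = Counter()
    for (a, b), count in cooccur.items():
        if a == concept:
            scores[b] += count
        elif b == concept:
            scores[a] += count
    # Highest count first; ties broken alphabetically for a stable listing.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Toy collection: each document is a list of already-extracted noun phrases.
docs = [
    ["rheumatoid arthritis", "analgesic", "pain"],
    ["analgesic", "gastrointestinal bleeding"],
    ["rheumatoid arthritis", "analgesic"],
]
space = build_concept_space(docs)
suggestions = related_concepts(space, "analgesic")
```

The ranked list is meant as a set of suggestions for interactive browsing, in line with the slide's note that the ordering is useful but not strict.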
System Side Infrastructure
Classification Technologies for Multimedia Documents

Phrases (multi-word nouns)
Concepts (generic phrases)
Types (identified concepts)
Clusters (grouped types)
Structures (semantic universals)
INTERSPACE NAVIGATION

Semantic Indexes for Community Repositories

Navigating Abstractions within Repository


concept space & category map
Interactive browsing by Community experts
*www.canis.uiuc.edu/interspace-prototype
Interspace Remote Access Client
Navigation in MEDSPACE
For a patient with Rheumatoid Arthritis


Find a drug that reduces the pain (analgesic)
but does not cause stomach (gastrointestinal) bleeding
Choose Domain
Concept Search
Concept Navigation
Retrieve Document
Navigate Document
Retrieve Document
Category Map
Category Navigation
Concept Navigation
User Side Infrastructure
Navigation Technologies for Search Interfaces

Exact Match (noun phrases)
Relationship List (concept suggestions)
Cluster Comparison (groups to groups)
Spreading Activation (group intersections)
Artificial Landscapes (semantic distances)
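
Spreading activation over a concept graph can be sketched as below. This is a generic textbook-style version under stated assumptions: the graph is a dictionary of weighted links, energy decays by a fixed factor per hop, and all concept names and weights are hypothetical. It is not presented as the Interspace implementation. Concepts reached from several seed concepts accumulate more energy, which is one way to read "group intersections".

```python
def spread_activation(graph, seeds, decay=0.5, steps=2):
    """Propagate activation from seed concepts along weighted links.

    `graph` maps a concept to {neighbor: link_weight}.
    Returns a dict of concept -> accumulated activation.
    """
    activation = {concept: 1.0 for concept in seeds}
    frontier = dict(activation)
    for _ in range(steps):
        next_frontier = {}
        for concept, energy in frontier.items():
            for neighbor, weight in graph.get(concept, {}).items():
                # Energy passed on shrinks with link weight and the decay factor.
                passed = energy * weight * decay
                next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + passed
        for concept, energy in next_frontier.items():
            activation[concept] = activation.get(concept, 0.0) + energy
        frontier = next_frontier
    return activation


# Hypothetical concept graph with illustrative weights.
graph = {
    "analgesic": {"drug A": 1.0, "drug B": 0.8},
    "gastrointestinal bleeding": {"drug A": 0.9},
}
result = spread_activation(graph, ["analgesic", "gastrointestinal bleeding"])
```

In this toy graph "drug A" is linked to both seeds, so it ends with more activation than "drug B", which is linked to only one.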
SWITCHING
In the Interspace…

each Community maintains its own repository

Switching is navigating Across repositories

use your vocabulary to search another specialty
Medicine Session
Categories and Concepts
Concept Switching
Document Retrieval
CONCEPT SWITCHING

“Concept” versus “Term”


set of “semantically” equivalent terms
Concept switching

region to region (set to set) match
[Diagram labels: semantic region, term, two Concept Spaces]
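
A region-to-region match can be sketched as a set comparison. In this illustration a region is a term plus its related terms in one concept space, and regions are compared with Jaccard overlap; the overlap measure, the function names, and the two toy concept spaces are assumptions made for the sketch, not details from the talk.

```python
def region(space, term):
    """A semantic region: the term plus the terms related to it in one space."""
    return {term} | set(space.get(term, ()))


def jaccard(a, b):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def switch_concept(term, source_space, target_space):
    """Rank target-space terms by how well their region matches the source region."""
    source_region = region(source_space, term)
    scored = [(jaccard(source_region, region(target_space, t)), t) for t in target_space]
    # Best overlap first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(t, score) for score, t in scored if score > 0]


# Two hypothetical community concept spaces: term -> related terms.
medicine = {"analgesic": ["pain relief", "inflammation", "drug therapy"]}
pharmacology = {
    "painkiller": ["pain relief", "drug therapy", "dosage"],
    "antibiotic": ["infection", "drug therapy"],
}
matches = switch_concept("analgesic", medicine, pharmacology)
```

Here the source term never appears in the target space, yet "painkiller" ranks first because its region shares the most terms with the region around "analgesic".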
ENGINEERING SESSION
Engineering Categories & Concepts
Further Concept Navigation
Searching via Concept Suggestion
Switching Across Repositories
Future Technologies

Concept Switching
  Spreading activation, type tagging

Dynamic Indexing
  On-the-fly collections, during session

Path Matching
  Aggregating indexes, many repositories
Semantic Analysis of Multimedia

Collections of Objects containing Units

Text: community repository (topic proximity)
document abstracts containing noun phrases

Image: aerial photograph (spatial proximity)
feature regions containing texture tiles


Units -- media-dependent (statistical parsers)
Indexes -- media-independent (statistical clusters)
Media Interoperability Model

text concept space & category map (geoscience)
  1M phrases in 500K abstracts from Georef and Petroleum Abstracts

image concept & category maps in aerial photos
  visual thesaurus maps for 200K regions in 800 images (6M tiles)

geographic map (where) v. semantic map (what)
  spatial gazetteer as bridge
  image<=>text<=>number
Text and Number Interoperability
Text and AVHRR Query: Show me information about the Santa Barbara area with mild temperature and high vegetation density.
Integrated Result: Within the bounding geographic location, 2 documents and 88 AVHRR records related to the integrated query are retrieved.
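
The integrated query can be pictured as a gazetteer lookup followed by two filters. The sketch below assumes a gazetteer that maps a place name to a bounding box, documents and numeric records tagged with coordinates, and illustrative thresholds for "mild temperature" and "high vegetation density". The field names, the bounding box, and the thresholds are all hypothetical, chosen only to make the example run.

```python
def in_box(lat, lon, box):
    """True if the point lies inside a (south, west, north, east) box."""
    south, west, north, east = box
    return south <= lat <= north and west <= lon <= east


def integrated_query(place, gazetteer, documents, records, record_filter):
    """Use the gazetteer as the bridge between text and numeric data."""
    box = gazetteer[place]
    docs = [d for d in documents if in_box(d["lat"], d["lon"], box)]
    recs = [r for r in records
            if in_box(r["lat"], r["lon"], box) and record_filter(r)]
    return docs, recs


# Illustrative data only; the box is a rough, made-up extent.
gazetteer = {"Santa Barbara": (34.3, -120.0, 34.6, -119.5)}
documents = [
    {"id": "d1", "lat": 34.42, "lon": -119.70},
    {"id": "d2", "lat": 37.77, "lon": -122.42},
]
records = [
    {"id": "r1", "lat": 34.45, "lon": -119.80, "temp_c": 18.0, "vegetation": 0.7},
    {"id": "r2", "lat": 34.40, "lon": -119.60, "temp_c": 35.0, "vegetation": 0.2},
    {"id": "r3", "lat": 40.00, "lon": -100.00, "temp_c": 18.0, "vegetation": 0.8},
]


def mild_and_green(record):
    # Hypothetical thresholds for "mild temperature" and "high vegetation density".
    return 10.0 <= record["temp_c"] <= 25.0 and record["vegetation"] >= 0.5


docs, recs = integrated_query("Santa Barbara", gazetteer, documents, records, mild_and_green)
```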
Image Concept Switching
Image Query: By browsing a texture (tile) catalog, show me information about residential and farm land areas.
Result: A set of related images is retrieved and shown in the Results Frame. The full-size image #368 is displayed with its place names and tile locations.
INFORMATION SPACEFLIGHT

Landscape as category map visualization


Valleys are semantic clusters
Hills are semantic distances

Traversal across multiple levels of abstraction
Category Maps
SELF-ORGANIZING MAPS (SOMs)
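
A self-organizing map can be sketched in plain Python as below. This is a small generic SOM, assuming a rectangular grid, a Gaussian neighborhood, and linearly shrinking learning rate and radius; these parameter choices and the toy vectors are assumptions for illustration and say nothing about how the category maps in the talk were configured. It requires Python 3.8 or later for `math.dist`.

```python
import math
import random


def best_matching_unit(grid, vector):
    """Return the grid position whose weight vector is closest to `vector`."""
    return min(grid, key=lambda node: math.dist(grid[node], vector))


def train_som(vectors, rows, cols, epochs=50, seed=0):
    """Train a rows x cols map; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    grid = {(r, c): [rng.random() for _ in range(dim)]
            for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)                   # learning rate shrinks over time
        radius = max(radius0 * (1.0 - frac), 0.5)   # so does the neighborhood
        for v in vectors:
            bmu = best_matching_unit(grid, v)
            for node, weights in grid.items():
                d = math.dist(node, bmu)            # distance on the grid, not in data space
                if d <= radius:
                    influence = math.exp(-(d * d) / (2 * radius * radius))
                    for i in range(dim):
                        weights[i] += rate * influence * (v[i] - weights[i])
    return grid


# Toy input: four vectors forming two loose groups.
vectors = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]
som = train_som(vectors, rows=2, cols=2)
placement = [best_matching_unit(som, v) for v in vectors]
```

Each input is placed at the grid cell that best matches it, so similar inputs tend to land in nearby cells, which is what lets a category map be drawn as a landscape.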
Flying through Cyberspace
THE NET OF THE 21st CENTURY



Beyond Objects to Concepts
Beyond Search to Analysis
Problem Solving via Cross-Correlating Multimedia Information across the Net

Every community has its own special library
Every community does semantic indexing

The Interspace is true Cyberspace

Subject Assignment
Improved Search by Identifying Subjects

Human Indexers classify Documents
From Subject Thesaurus and Knowledge

Interactive Support for Community Curators
(Subject Experts but Classification Amateurs)

Use Concept Spaces to Suggest Subjects
From Related Documents in the Collection

See Best Paper Nominee at ACM DL 98
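
Suggesting subjects from related documents can be sketched as a weighted vote. This simplified illustration assumes already-indexed documents that carry both extracted concepts and human-assigned subjects, and it measures relatedness by direct concept overlap rather than by a full concept space. The record layout, function name, and toy data are hypothetical.

```python
from collections import Counter


def suggest_subjects(new_concepts, indexed_docs, top_n=3):
    """Suggest subject headings for a new document.

    Each indexed document votes for its own subjects, weighted by how many
    concepts it shares with the new document.
    """
    new_set = set(new_concepts)
    votes = Counter()
    for doc in indexed_docs:
        overlap = len(new_set & set(doc["concepts"]))
        for subject in doc["subjects"]:
            votes[subject] += overlap
    # Most votes first; ties broken alphabetically; drop subjects with no support.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [subject for subject, count in ranked[:top_n] if count > 0]


# Hypothetical curated collection.
indexed = [
    {"concepts": ["analgesic", "pain"], "subjects": ["Analgesics"]},
    {"concepts": ["gastrointestinal bleeding", "ulcer"],
     "subjects": ["Gastrointestinal Hemorrhage"]},
    {"concepts": ["pain", "arthritis"], "subjects": ["Arthritis, Rheumatoid", "Pain"]},
]
suggested = suggest_subjects(["pain", "arthritis", "analgesic"], indexed)
```

The output is a short candidate list for a curator to accept or reject, which matches the interactive support role described above.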
Structure Assignment
Improved Search by Identifying Structures

Human Indexers classify Clusters
From Generic Structures beyond Subjects

Universal Structures Cross-Cultural

Interactive Support for Community Curators
(Subject Experts but Classification Amateurs)

Necessary for Peer-Peer Infrastructure
When Ordinary Persons form Communities
The Structures of Everyday Life

Bodies (individuals)
  Food and Clothes
Buildings (groups)
  Houses and Cities
Transportation (physical interactions)
  Rails (trains) and Roads (cars)
Communication (logical interactions)
  Phones (talking) and Computers (retrieving)
Navigating Universal Structures






A planet for every kid’s local environment
Federating the planets into a universe
Ordering all planets from kid’s Point Of View
Flying through the Kids Universe
Finding similar kids from different POVs
Connecting historically through museums