Metadata Strategy

Search, Browse, and Faceted Navigation
Tom Reamy
Chief Knowledge Architect
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com
Agenda
➢ Introduction
➢ Essentials of Facets / Faceted Navigation
➢ Facets in Government / Enterprise
– Differences
– Basic Design of Search / Browse / Facets
➢ Case Studies – Tale of Two Taxonomies
➢ Search / Browse / Facets – Web 2.0 & Future Trends
KAPS Group: General
➢ Knowledge Architecture Professional Services
➢ Virtual Company: Network of consultants – 12-15
➢ Partners – FAST/Convera, Inxight, SchemaLogic, etc.
➢ Consulting, Strategy, Knowledge architecture audit
➢ Taxonomies: Enterprise, Marketing, Insurance, etc.
➢ Services:
– Taxonomy development, consulting, customization
– Technology Consulting – Search, CMS, Portals, etc.
– Metadata standards and implementation
– Knowledge Management: Collaboration, Expertise, e-learning
– Applied Theory – Faceted taxonomies, complexity theory, natural categories
History of Facets
➢ S. R. Ranganathan – 1960’s (Taxonomies – Aristotle)
– Issue of Compound Subjects
– The Universe consists of PMEST
• Personality, Matter, Energy, Space, Time
➢ Classification Research Group – 1950’s, 1970’s
– Facet analysis as basis for all bibliographic classifications
– Based on Ranganathan, simplified
– Principles:
• Division – a facet must represent only one characteristic
• Mutual Exclusivity
– More flexible, less doctrinaire
➢ Classification Theory to Web Implementation
– An Idea waiting for a technology – Multiple Filters / dimensions
Essentials of Facets
➢ Facets are not categories
– Entities or concepts belong to a category
– Entities have facets
➢ Facets are metadata – properties or attributes
– Entities or concepts fit into one or more categories
– All entities have all facets – defined by set of values
➢ Facets are orthogonal – mutually exclusive – dimensions
– An event is not a person is not a document is not a place.
➢ Facets – variety – of units, of structure
– Numerical range (price), Location – big to small
– Alphabetical, Hierarchical – taxonomic
Essentials of Faceted Navigation
➢ Not a Yahoo-style Browse
– Computer Stores under Computers and Internet
– One value per facet per entity
➢ Faceted Navigation
– Facets are filters, multidimensional
– Browse within a facet, filter by multiple facets
➢ Facets are applied at search time – post-coordination, not pre-coordination [Advanced Search]
➢ Faceted Navigation is an active interface – dynamic combination of search and browse
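The idea of facets as filters combined at search time can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalog, the facet names (`color`, `region`) and the helper functions `filter_by_facets` and `facet_counts` are hypothetical and do not come from any particular search product.

```python
from collections import Counter

# Hypothetical catalog: every entity carries exactly one value per facet.
WINES = [
    {"name": "Wine A", "color": "red", "region": "Napa"},
    {"name": "Wine B", "color": "red", "region": "Rioja"},
    {"name": "Wine C", "color": "white", "region": "Napa"},
    {"name": "Wine D", "color": "red", "region": "Napa"},
]


def filter_by_facets(items, selections):
    """Keep items whose facet values match every selected facet value."""
    return [
        item for item in items
        if all(item[facet] == value for facet, value in selections.items())
    ]


def facet_counts(items, facet):
    """Count the remaining items per value of one facet (the browse links)."""
    return Counter(item[facet] for item in items)


# Post-coordination: "red AND Napa" is assembled when the user clicks,
# not stored in advance as a taxonomy node.
selected = filter_by_facets(WINES, {"color": "red", "region": "Napa"})
print([wine["name"] for wine in selected])
print(facet_counts(WINES, "color"))
```

Each click adds one entry to `selections`, so browsing within a facet and filtering across facets use the same mechanism.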
Faceted Navigation: Advantages
➢ More intuitive – easy to guess what is behind each door
• Simplicity of internal organization
• 20 questions – we know and use
➢ Dynamic selection of categories
• Allow multiple perspectives / no universal set needed
• Ability to Handle Compound Subjects
➢ Trick Users into “using” Advanced Search
• wine where color = red, price = x-y, etc.
• Click on color red, click on price x-y, etc.
➢ Systematic Advantages:
– Need fewer Elements
– 4 facets of 10 nodes each (40 nodes) = 10,000-node taxonomy (10 × 10 × 10 × 10 combinations)
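The arithmetic behind the "fewer elements" point can be checked directly. The sketch below assumes four independent facets with ten values each; the facet names are placeholders, not facets from the presentation.

```python
from itertools import product

# Hypothetical facets, each with 10 values (names are placeholders).
facets = {
    name: [f"{name}-{i}" for i in range(10)]
    for name in ("facet_a", "facet_b", "facet_c", "facet_d")
}

# Nodes someone has to create and maintain: 10 + 10 + 10 + 10.
nodes_to_maintain = sum(len(values) for values in facets.values())

# Distinct combinations a user can reach: 10 * 10 * 10 * 10.
combinations = sum(1 for _ in product(*facets.values()))

print(nodes_to_maintain)  # 40
print(combinations)       # 10000
```

A single pre-coordinated taxonomy would need one node per combination to offer the same coverage, which is where the 10,000 figure comes from.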
Faceted Navigation: Disadvantages
➢ Lack of Standards for Faceted Classifications
• Every project is unique customization
➢ Difficulty of expressing complex relationships
• Simplicity of internal organization
➢ Loss of Browse Context
• Difficult to grasp scope and relationships
➢ Essential Limit of Faceted Navigation
– Limited Domain Applicability – type and size
– Cost of tagging
➢ Trade-off between simplicity (power and ease of understanding) and complexity (real world)
Government & Enterprise Environment
➢ Agency Content – different world than eCommerce
– More Content, more kinds, more unstructured
– Not a catalog to start – less metadata and structured content
– Complexity – not just content but variety of users and activities
➢ Agency – Question of Balance / strategy
– More facets = more findability (up to a point)
– Fewer facets = lower cost to tag documents
➢ Facet structures are more complex than in eCommerce
– Multiple structures, more subject like
➢ Need to start with major research (KA Audit)
– Content, users, business activities, information technologies
Knowledge Architecture Audit: Knowledge Map
[Diagram: audit stages – Project Foundation, Contextual Interviews, Information Interviews, App/Content Catalog, User Survey, Knowledge Map. Labels in the diagram: meetings and work groups; overview; high level: process, community; info behaviors of business processes; technology and content; all 4 dimensions. Progression: General Outline, Broad Context, Deep Details, Complete Picture, New Foundation.]
Facets, Search, Browse
Enterprise Design Issues - General
➢ How many Facets do you need?
– “Can’t we start with just 1 or 2 facets and see how it works?”
➢ Balance of metadata overhead, findability, personalization
– Distributed model reduces cost – enables more facets
– ECM – publishing process, policy
– Distributed taggers – users, user communities (2.0), KM-Library
– Auto Populate – Organization, Location
– Software – entity extraction, summarization, auto-categorization
➢ Rule of Thumb:
– Small catalog of homogeneous items – 3-4
– Enterprise content – 4-8
Enterprise Environment – Case Studies
➢ A Tale of Two Taxonomies
– It was the best of times, it was the worst of times
➢ Basic Approach
– Initial meetings – project planning
– High level K map – content, people, technology
– Contextual and Information Interviews
– Content Analysis
– Draft Taxonomy – validation interviews, refine
– Integration and Governance Plans
Enterprise Environment – Case One – Taxonomy, 7 facets
➢ Taxonomy of Subjects / Disciplines:
– Science > Marine Science > Marine microbiology > Marine toxins
➢ Facets:
– Organization > Division > Group
– Clients > Federal > EPA
– Instruments > Environmental Testing > Ocean Analysis > Vehicle
– Facilities > Division > Location > Building X
– Methods > Social > Population Study
– Materials > Compounds > Chemicals
– Content Type – Knowledge Asset > Proposals
Enterprise Environment – Case One – Taxonomy, 7 facets
➢ Project Owner – KM department – included RM, business process
➢ Involvement of library – critical
➢ Realistic budget, flexible project plan
➢ Successful interviews – build on context
– Overall information strategy – where taxonomy fits
➢ Good Draft taxonomy and extended refinement
– Software, process, team – train library staff
– Good selection and number of facets
➢ Final plans and hand off to client
Enterprise Environment – Case Two – Taxonomy, 4 facets
➢ Taxonomy of Subjects / Disciplines:
– Geology > Petrology
➢ Facets:
– Organization > Division > Group
– Process > Drill a Well > File Test Plan
– Assets > Platforms > Platform A
– Content Type > Communication > Presentations
Enterprise Environment – Case Two – Taxonomy, 4 facets
➢ Location – not KM – tied to RM and software
• Solution looking for the right problem
• No Library or Training involvement
➢ Value of taxonomy understood, but not the complexity and scope
– Under-budgeted, understaffed
– Not enough research – and wrong people
➢ Not enough facets
– Wrong set of facets – business not information
– Ill-defined facets – too complex internal structure
➢ Wrong kind of project management
• Special needs of a taxonomy project
Facets and 2.0
➢ “It’s MySpace meets YouTube meets Wikipedia meets Google – on steroids.”
➢ “It’s ignorance meets egotism meets bad taste meets mob rule – on steroids.” – The Cult of the Amateur – Andrew Keen
➢ Revolution and Evolution
– Doesn’t anyone do evolution (Web 1.2 anyone?)
➢ Wikipedia – users can do it all – NOT
– With the help of 2,000 trusted editors and software, combating the passionate conviction and impact of money
➢ Wisdom of Crowds
– Good for guessing jelly beans, not useful tags
Folksonomies – Good and Bad
➢ Advantages
– Simple, Lower cost of categorization
– Can respond quickly to changes, User’s own terms
– Better than no tags at all (Not really)
– Getting people excited about metadata!
➢ Disadvantages
– They don’t work very well for finding
– No structure, no conceptual relationships
– Quality and Popularity are very different
– Issues of scale – popular tags already showing a million hits
– Errors – misspellings, single words or bad compounds, single use or idiosyncratic use
➢ Social mechanism – opposite of wisdom of crowds
– Tyranny of the majority
– Del.icio.us – Design – 1 Mil (computer design)
Facets and 2.0 – Evolving answers
Technology
➢ Integrated Evolving Solution: Technology, People, Semantics // with Feedback with consequences
➢ Enterprise Content Management
– Place to add metadata – of all kinds, not just keywords
– Policy support – important, part of job performance
– Add tag clouds to input page
– More sophisticated displays
• Tag clouds mapped to community map
• Tag clusters, taxonomy location
➢ Semantic Software – Inxight, Teragram, etc.
– Suggest terms based on text, on tag clouds
➢ Enterprise Search
– Search – Browse – Facets
Facets and 2.0 – Evolving answers
People
➢ New Relationship of Center and Crowd
– Not top down or bottom up
– More sophisticated support, more freedom, more suggestions, more user input
– New roles for users (taggers, part of a variety of communities – both distributed and central)
– New roles for central – create feedback system, tweak the evolution of the system, develop initial candidates
➢ Communities of Practice – apply to tagging, ranking
– Community Maps – formal and informal
– Map tags to communities – more useful suggestions
– Use tags to uncover communities
Facets and 2.0 – Evolving answers
Semantics
➢ Start and end with a formal taxonomy / Ontology
– Findability vastly superior
– Communication with others – share tags
– Take advantage of conceptual relationships
➢ Tagging experience – folksonomies plus
– Users can type any word – system looks it up – plurals, synonyms, preferred terms, spelling variations
– Software suggestions – based on content of bookmark, document and on popular user tags
• Cognitively simpler task than own value, complex hierarchy
– New terms flagged and routed to central team
➢ Feedback with consequences
– Rank quality of tags, quality of taggers
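The "folksonomies plus" tagging experience described above, where a typed word is looked up against the taxonomy and unknown words are routed to a central team, might look roughly like this. It is a minimal sketch: the vocabulary, the naive plural handling and the names `resolve_tag` and `review_queue` are all assumptions made for illustration, not a description of any vendor's software.

```python
# Hypothetical controlled vocabulary: variant or misspelling -> preferred term.
SYNONYMS = {
    "automobile": "car",
    "autos": "car",
    "labelling": "labeling",
}
PREFERRED_TERMS = {"car", "labeling", "taxonomy"}


def singularize(word):
    """Very naive plural handling, for illustration only."""
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def resolve_tag(raw_tag, review_queue):
    """Map a user's free-text tag to a preferred term, or flag it as new."""
    tag = raw_tag.strip().lower()
    for candidate in (tag, singularize(tag)):
        candidate = SYNONYMS.get(candidate, candidate)
        if candidate in PREFERRED_TERMS:
            return candidate
    review_queue.append(tag)  # new term: route to the central team
    return tag


queue = []
print(resolve_tag("Taxonomies", queue))
print(resolve_tag("folksonomy", queue))
print(queue)
```

The user still types freely; the lookup only decides whether the tag lands on an existing preferred term or goes into the review queue.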
Facets: Future Trends
➢ Facets and Facts / Ontologies
– Types of relationships: People have friends, family, bosses and employees, jobs
– Implications of those relationships – doctor has patients, salesman has customers
– Facets are a foundation for precise rules and relationships
• Define important types of relationships for each facet dimension.
➢ Advanced Applications – Text and Data Mining, Alerts
– Combining Subject Matter and Topical Facets
– Map Topics and Facets
• Quality control for drilling new well in region X
– Rules – Contains any of type x entity or facet (products), plus complex conceptual content, plus certain values within a facet (buying activity), then send alert
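The alert rule described above has three parts: an entity of a given type is present, the document is about a given concept, and a facet holds one of a set of values. One possible reading of that rule is sketched below. The `Document` structure, the `should_alert` function and the sample values are hypothetical; how entities and topics get extracted is outside the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical tagged document: extracted entities plus facet values."""
    title: str
    entities: set = field(default_factory=set)  # e.g. product or asset names
    topics: set = field(default_factory=set)    # conceptual categories assigned
    facets: dict = field(default_factory=dict)  # facet name -> value


def should_alert(doc, watched_entities, required_topic, facet, allowed_values):
    """Return True only when all three parts of the rule hold."""
    has_entity = bool(doc.entities & watched_entities)
    has_topic = required_topic in doc.topics
    has_facet_value = doc.facets.get(facet) in allowed_values
    return has_entity and has_topic and has_facet_value


doc = Document(
    title="Well test plan, region X",
    entities={"Platform A"},
    topics={"quality control"},
    facets={"process": "Drill a Well"},
)
print(should_alert(doc, {"Platform A"}, "quality control", "process", {"Drill a Well"}))
```

Because each part of the rule reads from a separate facet or tag set, rules can be tightened or loosened one dimension at a time.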
Conclusions: Facets not Folksonomies
➢ Facets are an important addition to Search / Browse
➢ Facets require adding lots of metadata – and that is a good thing
➢ Facets require that you understand your users – and that is a good thing
➢ Facets support the range of Government users – dynamic personalization – multiple interests, multiple info behaviors
➢ An integrated search-browse-facet user interface provides simple complexity
– Supports both quick answers to specific questions and deep research exploration
➢ You want a revolution? Integrate 2.0 with meaning (3.0)
– Dynamic dimensions – User and semantics
Questions?
Tom Reamy
[email protected]
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com