Taxonomy Development Workshop
Enterprise Semantic Infrastructure
Workshop
Tom Reamy
Chief Knowledge Architect
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com
Agenda
Introduction
Semantic Infrastructure
– Basic Concepts – Content, People, Business Processes, Technology
– Developing an Articulated Strategic Vision
– Benefits of an Infrastructure Approach
Development and Maintenance of a Semantic Infrastructure
– Semantic Tools – Capabilities & Acquisition Strategy
– Development Processes & Best Practices
Semantic Infrastructure Applications
– Enterprise Search
– Search Based Applications & Beyond
Discussion & Questions / Lunch
KAPS Group: General
Knowledge Architecture Professional Services
Virtual Company: Network of consultants – 8-10
Partners – SAS, Smart Logic, Microsoft, Concept Searching, etc.
Consulting, Strategy, Knowledge architecture audit
Services:
– Taxonomy/Text Analytics development, consulting, customization
– Technology Consulting – Search, CMS, Portals, etc.
– Evaluation of Enterprise Search, Text Analytics
– Metadata standards and implementation
– Knowledge Management: Collaboration, Expertise, e-learning
Applied Theory – Faceted taxonomies, complexity theory, natural categories
Questions?
Tom Reamy
[email protected]
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com
Semantic Infrastructure
Basic Concepts & Benefits
Agenda
Semantic Infrastructure – Basic Concepts
– Content & Content Structure
– People – Resources, Producers, Consumers
– Semantics in Business Processes
– Technology – Information, Text Analytics, Text Mining
Semantic Infrastructure – Strategic Foundation
– Knowledge Audit Plus
Semantic Infrastructure – Benefits of an Infrastructure Approach
– Infrastructure vs. Projects
– Semantics vs. Technology
Conclusion
Semantic Infrastructure: 4 Dimensions
Ideas – Content and Content Structure
– Map of Content – Tribal language silos
– Structure – articulate and integrate
People – Producers & Consumers
– Communities, Users, Central Team
Activities – Business processes and procedures
– Semantics, information needs and behaviors
Technology
– CMS, Search, portals, text analytics
– Applications – BI, CI, Semantic Web, Text Mining
Semantic Infrastructure: 4 Dimensions
Content and Content Structure
Map multiple types and sources of content
– Structured and unstructured, internal and external
Beyond Metadata and Taxonomy
– Keywords – poor performance
– Dublin Core: hard to implement
– Dublin Core: Too formal and not formal enough
Need structures that are more powerful and more flexible
– Model of framework and smart modules
Framework
– Faceted metadata
– Simple taxonomies with intelligence – categorization & extraction
– Ontology and Semantic Web
– Best bets and user metadata
Knowledge Structures
List of Keywords (Folksonomies)
Controlled Vocabularies, Glossaries
Thesaurus
Browse Taxonomies (Classification)
Formal Taxonomies
Faceted Classifications
Semantic Networks / Ontologies
Categorization Taxonomies
Topic Maps
Knowledge Maps
A Framework of Knowledge Structures
Level 1 – keywords, glossaries, acronym lists, search logs
– Resources, inputs into upper levels
Level 2 – Thesaurus, Taxonomies
– Semantic Resource – foundation for applications, metadata
Level 3 – Facets, Ontologies, semantic networks, topic maps, Categorization Taxonomies
– Applications
Level 4 – Knowledge maps
– Strategic Resource
Semantic Infrastructure: People
Communities / Tribes
– Different languages
– Different Cultures
– Different models of knowledge
Two needs – support silos and inter-silo communication
Types of Communities
– Formal and informal
– Variety of subject matters – vaccines, research, sales
– Variety of communication channels and information behaviors
Individual People – tacit knowledge / information behaviors
– Consumers and Producers of information – In Depth
– Map major types
Semantic Infrastructure Dimensions
People: Central Team
Central Team supported by software and offering services
– Creating, acquiring, evaluating taxonomies, metadata standards, vocabularies, categorization taxonomies
– Input into technology decisions and design – content management, portals, search
– Socializing the benefits of metadata, creating a content culture
– Evaluating metadata quality, facilitating author metadata
– Analyzing the results of using metadata, how communities are using it
– Researching metadata theory, user-centric metadata
– Facilitating knowledge capture in projects, meetings
Semantic Infrastructure Dimensions
People: Location of Team
KM/KA Dept. – Cross Organizational, Interdisciplinary
Balance of dedicated and virtual, partners
– Library, Training, IT, HR, Corporate Communication
Balance of central and distributed
Industry variation
– Pharmaceutical – dedicated department, major place in the organization
– Insurance – Small central group with partners
– Beans – a librarian and part time functions
Which design – knowledge architecture audit
Semantic Infrastructure Dimensions
Technology Infrastructure
Enterprise platforms: from creation to retrieval to application
– Semantic Infrastructure as the computer network
• Applications – integrated meaning, not just data
Semantic Structure
– Text Analytics – taxonomy, categorization, extraction
Integration Platforms – Content management, Search
– Add structure to content at publication
– Add structure to content at consumption
Infrastructure Solutions: Resources
Technology
Text Mining
– Both a structure technology – taxonomy development
– And an application
Search Based Applications
– Portals, collaboration, business intelligence, CRM
– Semantics add intelligence to individual applications
– Semantics add ability to communicate between applications
Creation – content management, innovation, communities of practice (CoPs)
– When, who, how, and how much structure to add
– Workflow with meaning, distributed subject matter experts (SMEs) and centralized teams
Infrastructure Solutions: Elements
Business Processes
Platform for a variety of information behaviors & needs
– Research, administration, technical support, etc.
– Types of content, questions
Subject Matter Experts – Info Structure Amateurs
Web Analytics – Feedback for maintenance & refinement
Enhance Basic Processes – Integrated Workflow
– Enhance Both Efficiency and Quality
Enhance support processes – education, training
Develop new processes and capabilities
– External Content – Text mining, smarter categorization
Semantic Infrastructure: The start and foundation
Knowledge Architecture Audit
Knowledge Map – Understand what you have, what you are, what you want
– The foundation of the foundation
Contextual interviews, content analysis, surveys, focus groups, ethnographic studies, Text Mining
Category modeling – “Intertwingledness”: learning new categories is influenced by other, related categories
Natural level categories mapped to communities, activities
• Novices prefer higher levels
• Balance of informativeness and distinctiveness
Living, breathing, evolving foundation is the goal
Knowledge Architecture Audit: Knowledge Map
Project Foundation – Meetings, work groups; Overview – General Outline
Contextual Interviews – High Level: Process, Community – Broad Context
Information Interviews – Info behaviors of Business processes – Deep Details
App/Content Catalog – Technology and content – Deep Details
User Survey – All 4 dimensions – Complete Picture
Strategy Document – Meetings, work groups – New Foundation
Semantic Infrastructure
Enterprise Taxonomies: Wrong Approach
Very difficult to develop – $100,000s
Even more difficult to apply
– Teams of Librarians or Authors/SMEs
– Cost versus Quality
Problems with maintenance
Cost rises in proportion with granularity
Difficulty of representing user perspective
Social media requires a framework – doesn’t create one
– Tyranny of the majority, madness of crowds
Semantic Infrastructure
Content Structures: New Approach
Simple Subject Taxonomy structure
– Easy to develop and maintain
Combined with categorization capabilities
– Added power and intelligence
Combined with Faceted Metadata
– Dynamic selection of simple categories
– Allow multiple user perspectives
• Can’t predict all the ways people think
• Monkey, Banana, Panda
Combined with ontologies and semantic data
– Multiple applications – Text mining to Search
– Combine search and browse
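To make "dynamic selection of simple categories" concrete, here is a minimal Python sketch of a simple subject taxonomy combined with faceted metadata. The Document class, the facet names (content_type, audience), and the sample records are all hypothetical; they are not taken from the workshop or from any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    subject: str  # node in a simple subject taxonomy
    facets: dict = field(default_factory=dict)  # facet name -> value


# Hypothetical sample content; facet names and values are illustrative only.
DOCS = [
    Document("Trial summary", "vaccines",
             {"content_type": "report", "audience": "research"}),
    Document("Pricing sheet", "vaccines",
             {"content_type": "spreadsheet", "audience": "sales"}),
    Document("Quarterly forecast", "sales",
             {"content_type": "report", "audience": "sales"}),
]


def filter_docs(docs, subject=None, **facets):
    """Keep documents matching the subject (if given) and every selected facet."""
    return [
        d for d in docs
        if (subject is None or d.subject == subject)
        and all(d.facets.get(name) == value for name, value in facets.items())
    ]


def facet_counts(docs, name):
    """Count documents per value of one facet, e.g. to label navigation links."""
    counts = {}
    for d in docs:
        if name in d.facets:
            value = d.facets[name]
            counts[value] = counts.get(value, 0) + 1
    return counts
```

A user can start from the subject ("vaccines") or from any facet (audience="sales") and reach the same documents, which is one way to allow multiple perspectives without predicting them in a single hierarchy.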
Semantic Infrastructure Design:
People, Technology, Business Processes
People (Central) – tagging, evaluating tags, fine-tuning rules and taxonomy
People (Users) – social tagging, suggestions
Software – Text analytics, auto-categorization, entity extraction
Software – Search, Content Management, Portals-Intranets
– Hybrid model – combination of automatic and human
Business Processes – integrated search with activities, text analytics based applications, intelligent routing
Semantic Infrastructure Benefits
Why Semantic Infrastructure
Unstructured content = 80% or more of all content
Limited Usefulness – database of unstructured content
Need to add (infra) structure to make it useful
Information is about meaning, semantics
Search is about semantics, not technology
Can’t Google do it?
– Link Algorithm – human act of meaning
– Doesn’t work in enterprise
– 1,000s of editors adding meaning
New technology makes it possible – Text Analytics
Semantic Infrastructure Benefits
General Time and Productivity
Time Savings – Too Big to Believe?
– Lost time searching – $12M a year per 1,000
– Cost of recreating lost information – $4.5M per 1,000
– Cost of not finding the right information – Years?
– 10% improvement = $1.2M a year per 1,000
Making Metrics Human
– Number of additional FTEs at no cost (enhanced productivity)
– Savings passed on to clients
– Spreadsheet of extra activities (e.g., training, working smarter)
– Build a more integrated, smarter organization
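The arithmetic behind these figures can be sketched in a few lines of Python. The $12M per 1,000 figure for lost search time comes from the slide; the headcount, the improvement rate, and the loaded cost per FTE passed to the functions are assumptions to be replaced with an organization's own numbers.

```python
# Figure from the slide: lost time searching, in dollars per year per 1,000 people.
LOST_SEARCH_COST_PER_1000 = 12_000_000


def annual_savings(headcount, improvement, cost_per_1000=LOST_SEARCH_COST_PER_1000):
    """Dollars saved per year if search time is cut by `improvement` (0.10 = 10%)."""
    return headcount / 1000 * cost_per_1000 * improvement


def equivalent_ftes(savings, loaded_cost_per_fte):
    """Express savings as additional FTEs; the cost per FTE is an assumed input."""
    return savings / loaded_cost_per_fte


if __name__ == "__main__":
    savings = annual_savings(headcount=1000, improvement=0.10)
    # 100_000 is a hypothetical loaded cost per FTE, not a figure from the slides.
    ftes = equivalent_ftes(savings, loaded_cost_per_fte=100_000)
    print(f"Savings: ${savings:,.0f} per year, about {ftes:.0f} additional FTEs")
```

Restating the dollar figure as FTEs is one way of "making metrics human": the same saving is easier to picture as extra staff at no cost.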
Semantic Infrastructure Benefits
Return on Existing Technology
Enterprise Content Management – $100K to $2M
– Underperforming – year after year, new initiative every 5 years
ECM as part of a Platform
– Enhance search – improved metadata, especially keywords
A Hybrid Model of ECM and Metadata
– Authors, editors-librarians, Text Analytics
– Submit a document -> TA generates metadata, extracts concepts, suggests categorization (keywords) -> author OKs (easy task) -> librarian monitors for issues
– Use results as input into analytics
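The hybrid workflow above (submit, text analytics suggests, author approves, librarian monitors) can be sketched as follows. This is a toy illustration: the TAXONOMY terms, the Submission class, and the keyword-matching rule stand in for a real text analytics engine and are all hypothetical.

```python
import re
from dataclasses import dataclass, field

# Hypothetical mini-taxonomy: category -> trigger terms (stand-in for TA rules).
TAXONOMY = {
    "vaccines": {"vaccine", "immunization"},
    "sales": {"quota", "pipeline", "revenue"},
}


@dataclass
class Submission:
    text: str
    suggested: list = field(default_factory=list)  # proposed by software
    approved: list = field(default_factory=list)   # confirmed by the author
    needs_review: bool = False                     # flagged for the librarian


def suggest_categories(text):
    """Software step: suggest every category whose terms appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(cat for cat, terms in TAXONOMY.items() if words & terms)


def submit(text, author_accepts):
    """Run one document through suggestion, author approval, and review flagging.

    `author_accepts` is a callable standing in for the author's yes/no decision.
    """
    sub = Submission(text=text, suggested=suggest_categories(text))
    sub.approved = [c for c in sub.suggested if author_accepts(c)]
    # The librarian only looks at exceptions: nothing suggested, or author disagreed.
    sub.needs_review = not sub.suggested or sub.approved != sub.suggested
    return sub
```

The design point is the division of labor: software does the tedious first pass, the author's task is reduced to confirming suggestions, and the librarian handles only the exceptions.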
Semantic Infrastructure Benefits
Return on Existing Technology
Enterprise Search – $100K to $2M
– Cost Effective and good quality keywords / categorization
– More metadata – faceted navigation
Work with ECM or dynamically generate categorization at search results time
Rich results – summaries, categorization, facets like date, people, organizations, etc. Tag clouds and related topics
Foundation for Search Based Applications – all need semantics
Semantic Infrastructure Benefits
Infrastructure vs. Projects
Strategic foundation vs. Short Term
Integrated solution – CM and Search and Applications
– Better results
– Avoid duplication
Semantics
– Small comparative cost
– Needed to get full value from all the above
ROI – asking the wrong question
– What is ROI for having an HR department?
– What is ROI for organizing your company?
Semantic Infrastructure Benefits
Selling the Benefits
CTO, CFO, CEO
– Doesn’t understand – wrong language
– Semantics is extra – harder work will overcome
– Not business critical
– Not tangible – accounting bias
– Does not believe the numbers
– Believes he/she can do it
Need stories and figures that will connect
Need to understand their world – every case is different
Need to educate them – Semantics is tough and needed
Conclusion
Semantic Infrastructure is not just a project
– Foundation and Platform for multiple projects
Semantic Infrastructure is not just about search
– It is about language, cognition, and applied intelligence
Strategic Vision (articulated by K Map) is essential
– Even for your under the radar vocabulary project
– Paying attention to theory is practical
Benefits are enormous – believe it!
Think Big, Start Small, Scale Fast
– Initial Project = +10%, All Other Projects = -50%
Questions?