Getting Started with Business Taxonomy Design


Taxonomy Strategies LLC
Getting Started with Business Taxonomy Design
Joseph A. Busch, Founder & Principal
Ron Daniel Jr., Principal
June 2, 2009
Copyright 2009 Taxonomy Strategies LLC. All rights reserved.
Workshop agenda

Time          Duration  Description
9:00-9:15     15 min    Introduction
9:15-9:30     15 min    Warm-up exercise
9:30-9:45     15 min    Taxonomy background
9:45-10:00    15 min    Taxonomy exercise
10:00-10:15   15 min    Taxonomy background continued
10:15-10:30   15 min    Coffee break
10:30-11:15   45 min    Taxonomy process
11:15-11:45   30 min    Taxonomy exercise
11:45-12:00   15 min    Q&A
Taxonomy Strategies LLC The business of organized information
Who we are: Joseph Busch
• Over 25 years in the business of organized information.
  - Founder, Taxonomy Strategies LLC
  - Director, Solutions Architecture, Interwoven
  - VP, Infoware, Metacode Technologies (acquired by Interwoven, November 2000)
  - Program Manager, Getty Foundation
  - Manager, Pricewaterhouse
• Metadata and taxonomies community leadership.
  - President, American Society for Information Science & Technology
  - Director, Dublin Core Metadata Initiative
  - Adviser, National Research Council Computer Science and Telecommunications Board
  - Reviewer, National Science Foundation Division of Information and Intelligent Systems
  - Founder, Networked Knowledge Organization Systems/Services
Who we are: Ron Daniel, Jr.
• Over 15 years in the business of metadata & automatic classification.
  - Principal, Taxonomy Strategies
  - Standards Architect, Interwoven
  - Senior Information Scientist, Metacode Technologies (acquired by Interwoven, November 2000)
  - Technical Staff Member, Los Alamos National Laboratory
• Metadata and taxonomies community leadership.
  - Chair, PRISM (Publishers Requirements for Industry Standard Metadata) working group
  - Acting chair, XML Linking working group
  - Member, RDF working groups
  - Co-editor, PRISM, XPointer, 3 IETF RFCs, and Dublin Core 1 & 2 reports.
Who are you?
What sectors do you work in?

Your Role
• Content Manager
• Editor
• Information Architect
• Usability Expert
• Librarian
• Records Manager
• Knowledge Engineer
• Ontologist
• Chief Information Officer
• Communications
• Administration

Industrial Sector
• Financial Services
  - Banking & Insurance
• High Tech
  - Computers, Software & Telecommunications
• Heavy Manufacturing
  - Steel, Automobiles, Aircraft, etc.
• Government
  - Federal, State or local
• Manufacturing
  - Consumer Products, etc.
• Medical & Health Care
• Mining & Refining
  - Petrochemicals, Oil & Gas
• Pharmaceuticals
  - Drugs, Biotech
Pop Quiz
On a blank piece of paper:
• What question(s) did you want to have answered by coming to today’s talks?
Flag one question to be discussed later.
You do NOT have to provide your name.
Please DO provide your job title, division, and either company name or company type.
Exercise 1: How do you organize your sock drawer?
Like this?
Or, like this?
Getting Started with Business Taxonomy Design: BACKGROUND
Simple definition of metadata and taxonomy

Metadata is data about data. In our case it is a set of fields of library catalog-like data about published content.

Metadata
• Title
• Author
• Department
• Audience
• Topic

The Taxonomy is the lists of values to go into the metadata fields.

Audience
• Internal
  - Executives
  - Managers
• External
  - Suppliers
  - Customers
  - Partners

Topics
• Employee Services
  - Compensation
  - Retirement
  - Insurance
  - Further Education
• Finance and Budget
• Products and Services
• Support Services
  - Infrastructure
  - Supplies
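The relationship between metadata fields and taxonomy values can be sketched in a few lines of Python. This is only an illustration: the field names and values come from the slide, while the function name `invalid_tags` and the sample record (title, author, and the out-of-list value "Pensions") are made up for the example.

```python
# Hypothetical sketch: metadata fields whose values are drawn from taxonomy lists.
# Field names and values mirror the slide; everything else is illustrative.

TAXONOMY = {
    "Audience": {"Executives", "Managers", "Suppliers", "Customers", "Partners"},
    "Topic": {
        "Compensation", "Retirement", "Insurance", "Further Education",
        "Finance and Budget", "Products and Services",
        "Infrastructure", "Supplies",
    },
}


def invalid_tags(record):
    """Return {field: [values]} for tagged values that are not in the taxonomy."""
    problems = {}
    for field, allowed in TAXONOMY.items():
        bad = [value for value in record.get(field, []) if value not in allowed]
        if bad:
            problems[field] = bad
    return problems


record = {
    "Title": "Retirement Plan Overview",   # free-text field (made-up sample)
    "Author": "J. Doe",                    # free-text field (made-up sample)
    "Audience": ["Managers"],              # controlled by the Audience list
    "Topic": ["Retirement", "Pensions"],   # "Pensions" is not a listed value
}

print(invalid_tags(record))
```

Title and Author stay free text here; only the fields that have a list in `TAXONOMY` are checked against it.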
Traditional v. business taxonomy: Side-by-side comparison

Traditional Taxonomy
Example: Pacific Gopher Snake (Pituophis catenifer)
  Kingdom: Animalia
  Phylum: Chordata
  Class: Reptilia
  Order: Squamata
  Family: Colubridae
  Genus: Pituophis
  Species: catenifer
• Detailed model for real world.
• Absolute Granularity and Ultimate Classification.
• Modern ‘cladistic’ approach yields far deeper hierarchies with very low fan-out.

Business Taxonomy
• Simple & Usable for common tasks
• Granularity is small groups, not individual items.
• Modern ‘faceted’ approach uses multiple small facets which combine to yield small groups.
Business taxonomy problem: How can a customer pick from >5,000 faucets without quitting?
Refine search by:
• Category
• Price
• Brand
• Color/Finish
• # Handles
• Series Name
• Water Filter?
• Faucet Spray
• Handle Shape
• Soap Dispenser?
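A rough sketch of how "refine search by" narrowing might work, assuming each product carries one value per facet. The three product records and the brand names are invented for illustration; only the facet names come from the list above.

```python
# Hypothetical product records; facet names follow the "Refine search by" list.
products = [
    {"name": "Faucet A", "Brand": "Acme", "Color/Finish": "Chrome", "# Handles": 1},
    {"name": "Faucet B", "Brand": "Acme", "Color/Finish": "Bronze", "# Handles": 2},
    {"name": "Faucet C", "Brand": "Zenith", "Color/Finish": "Chrome", "# Handles": 2},
]


def refine(items, selected):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count how many of the remaining items fall under each value of a facet."""
    counts = {}
    for item in items:
        counts[item[facet]] = counts.get(item[facet], 0) + 1
    return counts


chrome = refine(products, {"Color/Finish": "Chrome"})
print([p["name"] for p in chrome])         # items left after one click
print(facet_counts(chrome, "# Handles"))   # choices offered for the next click
```

Each click adds one entry to `selected`, so the customer shrinks the list step by step instead of scanning thousands of items.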
How business taxonomy translates into front-end interface

Metadata Field: Size
Taxonomy Values: 4.5, 5.5, 6, 6.5, 7, 8, …

Metadata Field: Type
Taxonomy Values: Athletic Inspired, Boots, Loafers and Slip-ons, Oxfords and More, Sandals

Metadata Field: Color
Taxonomy Values: Black, Blue, Brown, Green, Grey, Ivory, …

Metadata Field: Brand
Taxonomy Values: Antonio Maurizi, Bacco Bucci, Ben Sherman, Bruno Magli, …
How business taxonomy translates into front-end interface…for YOUR BUSINESS

XYZ Corp. Intranet
Departments: HR, Finance, IT, More…
Document Types: Forms, Policies, Reports, News, More…
Topics: Benefits, Manufacturing, Quality, Safety, More…
Regions: N. America, Europe, Asia, S. America, More…

Metadata Field: Department
Taxonomy Values: HR, Sales and Marketing, Communications, Shipping, …

Metadata Field: Document Type
Taxonomy Values: Forms, Policies, Procedures, Reports, News, …

Metadata Field: Topic
Taxonomy Values: Manufacturing, Benefits, Infrastructure, Quality, Safety, …

Metadata Field: Locale
Taxonomy Values: North America, Europe, Asia, South America, …
Exercise 2: High Level Taxonomy Identification
Metadata Field A:
___________________
Taxonomy Values:
_________________
_________________
_________________
_________________
_________________
_________________
_________________
Metadata Field B:
___________________
Taxonomy Values:
_________________
_________________
_________________
_________________
_________________
_________________
_________________
Your Org’s Site
Grouping A:
Lorem ipso
Factorum delos
Istab uno
Librea johe
More…
Grouping B:
Lorem ipso
Factorum delos
Istab uno
Librea johe
More…
Grouping C:
Lorem ipso
Factorum delos
Istab uno
Librea johe
More…
Grouping D:
Lorem ipso
Factorum delos
Istab uno
Librea johe
More…
Metadata Field C:
___________________
Taxonomy Values:
_________________
_________________
_________________
_________________
_________________
_________________
_________________
Metadata Field D:
___________________
Taxonomy Values:
_________________
_________________
_________________
_________________
_________________
_________________
_________________
Why use facets in a business taxonomy?
• Categorize in multiple, independent, categories.
• Allow combinations of categories to narrow the choice of items.
• 4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4)
  - Easier to maintain
  - Easier to reuse existing lists
  - Can be easier to navigate, if software supports it
  - Accommodates different needs and preferences
Main Ingredients: Chocolate, Dairy, Fruits, Grains, Meat & Seafood, Nuts, Olives, Pasta, Spices & Seasonings, Vegetables

Meal Type: Breakfast, Brunch, Lunch, Supper, Dinner, Snack

Cuisines: African, American, Asian, Caribbean, Continental, Eclectic/Fusion/International, Jewish, Latin American, Mediterranean, Middle Eastern, Vegetarian

Cooking Methods: Advanced, Bake, Broil, Fry, Grill, Marinade, Microwave, No Cooking, Poach, Quick, Roast, Sauté, Slow Cooking, Steam, Stir-fry

42 values to maintain (10+6+11+15)
9900 combinations (10x6x11x15)
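The arithmetic behind the facet argument can be checked with a short Python sketch. The facet sizes are the counts from the recipe example; the variable names are illustrative, and the combination count assumes exactly one value is chosen from each facet.

```python
from math import prod

# Number of values in each facet of the recipe example.
facet_sizes = {
    "Main Ingredients": 10,
    "Meal Type": 6,
    "Cuisines": 11,
    "Cooking Methods": 15,
}

# Maintenance cost grows with the SUM of the facet sizes...
values_to_maintain = sum(facet_sizes.values())
# ...while discriminatory power grows with their PRODUCT
# (assuming one value is picked per facet).
combinations = prod(facet_sizes.values())
print(values_to_maintain, combinations)

# The general claim: 4 independent facets of 10 values each.
print(4 * 10, 10 ** 4)
```

The gap between the sum and the product is the reason a few short lists can stand in for one very large hierarchy.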
Justification for business taxonomy
• Easier information management
• Flexibility to respond to changing needs
• Foundation for findability and usability
• Typical ROI Scenarios:
  - Greater sales on a public shopping site
  - Faster and more consistent responses by call center staff
  - Reduced regulatory and legal risk
  - Improved knowledge worker productivity
  - Improved overall staff productivity
• Don’t justify the taxonomy, justify the goal the taxonomy will help you achieve.
Effectiveness of applications of a business taxonomy
• For a product catalog, e.g., HomeDepot.com
  - Conversion rate increases
      - 20% increase. Petersen
  - Lift in average order size.
      - 20% increase. Petersen
• For knowledge workers, e.g., call center support staff
  - Time saved
      - 36% faster than search. Chen & Dumais.
• For knowledge workers, e.g., analysts
  - Increase in productivity
      - 25% productivity increase from not re-creating content. Taylor.
      - Estimated productivity loss exceeded $10M per year, about $500 per employee per year. Nielsen.
How do taxonomies improve search?
• Input (Query) Side
  - “Search” using a small set of pre-defined values instead of trying to guess what word or words might have been used in the content.
  - Have synonyms mapped together so searches for “car”, “auto”, and “automobile” return the same things.
• Output (Results) Side
  - Organize search results into groups of related items.
  - Sorting and filtering
  - Refining search results
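One way to picture the synonym mapping on the query side is a lookup from search words to a single preferred taxonomy term. This is a minimal sketch, not a description of any particular search engine: the preferred term "Automobiles", the document IDs, and the function names are all assumptions made for the example.

```python
# Hypothetical synonym ring: every variant points at one preferred term.
SYNONYMS = {
    "car": "Automobiles",
    "auto": "Automobiles",
    "automobile": "Automobiles",
}

# Hypothetical index keyed by preferred term (document IDs are made up).
INDEX = {"Automobiles": ["doc-1", "doc-7"]}


def normalize(query):
    """Map a query word to its preferred term; unknown words pass through."""
    return SYNONYMS.get(query.strip().lower(), query)


def search(query):
    """Look up content by preferred term rather than by the literal word typed."""
    return INDEX.get(normalize(query), [])


for word in ("car", "Auto", "automobile"):
    print(word, "->", search(word))
```

Because content is indexed under the preferred term only, all three spellings reach the same items.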
Google search on “pcb” returns > 28M items.
Taxonomy could suggest “Polychlorinated Biphenyls” vs. “Printed Circuit Boards” or “Pakistan Cricket Board”
169,169 items
Categorized results
Refine search by clicking on categories
Taxonomy in action on the results side: www.CareerBuilder.com search on IT positions
Results grouped: By Category, By Company, By City, By State
Intra-site navigation through metadata and taxonomy

Main content tagged with:
ORG = Parks, Recreation & Forestry Division/Parks Department
TOPIC = Leisure & Culture/Parks & Gardens; Transport & Infrastructure/Construction, Maintenance & Improvements
CONTENT TYPE = News and Announcements

Home > Dept. of Parks, Recreation & Forestry > Division of Parks > News & Announcements
Forest Park Master Plan
Breadcrumbs and Left-nav are dynamic and based on directory in which content is created.

Page regions dynamically populated with queries:
• SELECT thumbnail, URL WHERE Format = Video/* AND Org = Parks. Select a random result if list is long.
• Org = “Division of Parks” AND Type = “Online Forms”
• SELECT Title, Description, URL WHERE Org = “Parks Division” AND Type = HomePage
• Org = RELATED_ORG(Topic = “Parks & Gardens”, “Construction, Maintenance & Improvements”) AND Type = HomePage (Get Title and Description)
• Topic = AROUND(“Parks & Gardens”, “Construction, Maintenance & Improvements”)

Topic links shown on the page:
• Leisure & Culture > Parks & Gardens
• Transportation & Infrastructure > Construction, Maint, Impro

Sample page body:
Construction Update: July 2003
Aviation Field: The fields are complete and are open to the public. Work is still underway on the paths. The Forest Park Softball League is seeking teams for fall play. Contact Roger Berry at 289-5307.
Boathouse: Project is complete and open. The City of St. Louis awarded the contact for the operator to …
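The queries on this slide are written in an informal SQL-like shorthand. A minimal Python sketch of the same idea, filtering tagged content by its metadata, might look like the following. The content records, URLs, and the `select` helper are hypothetical; RELATED_ORG and AROUND are not modelled because the slide does not define them.

```python
import random

# Hypothetical tagged content items (titles and URLs invented for illustration).
CONTENT = [
    {"Title": "Forest Park Master Plan", "URL": "/parks/master-plan",
     "Org": "Parks", "Type": "News and Announcements", "Format": "text/html"},
    {"Title": "Parks Division", "URL": "/parks/",
     "Org": "Parks", "Type": "HomePage", "Format": "text/html"},
    {"Title": "Boathouse Tour", "URL": "/parks/boathouse.mpg",
     "Org": "Parks", "Type": "Media", "Format": "video/mpeg"},
    {"Title": "Street Paving", "URL": "/streets/paving.mpg",
     "Org": "Streets", "Type": "Media", "Format": "video/mpeg"},
]


def select(items, columns, where):
    """Return the requested columns of every item that satisfies `where`."""
    return [{c: item[c] for c in columns} for item in items if where(item)]


# Roughly: SELECT URL WHERE Format = Video/* AND Org = Parks
videos = select(
    CONTENT, ["URL"],
    lambda i: i["Format"].startswith("video/") and i["Org"] == "Parks",
)
# "Select a random result if list is long."
featured = random.choice(videos) if videos else None

# Roughly: SELECT Title, URL WHERE Org = Parks AND Type = HomePage
home = select(
    CONTENT, ["Title", "URL"],
    lambda i: i["Org"] == "Parks" and i["Type"] == "HomePage",
)
print(featured, home)
```

The page template only states which metadata it wants; whatever content carries matching tags fills the region.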
Getting Started with Business Taxonomy Design: TAXONOMY DEVELOPMENT METHODOLOGY
Taxonomy development methods

Method                        Description
Automated analysis            Munge, blast, crunch text to analyze corpus.
Workshopping                  Guide group in activities to identify key concepts.
Strawman                      Prepare best guess, then bring it to the table to discuss.
Adapt Existing Vocabularies   Customize internal terminology, industry standards, etc.
Hybrid                        Combination of some or all of these methods.
Key components to a successful taxonomy project
• Set-up taxonomy team
• Identify business case
• Planning & research
• Interview stakeholders
• Define use cases
• Build high-level taxonomy
• Build-out taxonomy detail
• Validation testing & review
• Migrate content
• Maintain & evolve taxonomy
Define business case: Business case examples
• Improve search and browsing to reduce the amount of time employees spend looking for information.
• Reduce business silos, foster collaboration and content reuse, and thereby reduce redundant work.
• Reduce the amount of time employees spend e-mailing basic information to each other.
• Build confidence that employees are getting the most up to date information, and increase employee loyalty by helping them stay “up to date” on the company.
Research & planning
• Identify target content to be focused on.
  - Provide a list of websites (and/or other target content file stores)
  - Prioritize this list for the purposes of the taxonomy project.
• Gather any query logs, usage statistics and usability surveys.
• Collect any existing documentation related to audience personas, content organization, metadata, keywords, and any other guidelines or standards.
• Identify and gather any internal classifications (org charts, sales regions, records retention schedule, code of conduct, product lists, etc.); and any relevant industry standard classifications (UNSPSC, NAICS, USPS, regulated activities, etc.)
Interview stakeholders
• Recruit people from business-critical functions such as marketing, public relations, product marketing, legal, etc.
  - Include people who have credibility, are early adopters, hold large amounts of content, and are “squeaky wheels” or “fans.”
• Conduct 10-20 interviews.
• The goal is for stakeholders to be the review board during the taxonomy development process, and beyond.
Define use cases: Intranet examples
• Content related to business areas or facilities
  - By geographic location, by type, by specific facility, by access restrictions, by audience, etc.
  - Use Case: Create a safety policies and procedures website for facilities organized by State.
  - Use Scenario: Find all safety policies and procedures related to facilities located in Ohio.
• Company-wide content
  - By business function, by topic, by access rights, etc.
  - Use Case: Locate any content that has policies and procedures around a particular topic.
  - Use Scenario: A company-wide policy regarding smoking has changed and references to outdated policies should be removed. Find official policies, as well as newsletters related to the smoking policy company-wide.
Define use cases: .com examples
• Web content managers
  - By content type, by topic, by location, etc.
  - Use Case: Find and recall all public-facing pages that describe a specific safety tip.
  - Use Scenario: Find and recall all public-facing pages that discuss gas safety.
• Public users seeking information
  - By topic, by location, etc.
  - Use Case: Provide search for dividend schedules, earnings statements and stock splits; and the corresponding press releases for a specific time period.
  - Use Scenario: An investor who recently sold stock is preparing taxes and would like to do a concise search so that they can find historical information about their holdings.
Build high-level taxonomy
• Identify the types of actors
  - Audiences, roles & access rights
• Identify the types of content
• Identify the types of activities
  - Business processes, applications & uses
• Identify the types of named entities
  - Products, services, projects, organizations, locations, etc.
• Topics will be everything else.
  - A business taxonomy should have no more than 6-10 broad divisions.
Build high level taxonomy: Oracle.com top-level taxonomy
• Person
• Organization
• Location
• Content Type
• Audience
• Products
• Product Line
• Technology
• Application
• Industry Solution
Callout: “Is a” groups of Products
The Oracle.com taxonomy has no explicit topics, only actors, content types, and named entities.
Build high level taxonomy: SGMS top-level taxonomy
http://mysearch.internet.gov.sg/
Topics
The SGMS (Singapore Government Metadata Standard) Taxonomy is much more focused on Topics.
Build-out taxonomy detail
• Get agreement on the broad divisions first, then build-out the detailed taxonomy.
• Use existing terminologies whenever they are available for business functions, locations, products & services, etc.
• Only build a vocabulary when no alternative authoritative source exists.
• Only create categories for which there already is content, or for which there is likely to be content soon.
• Keep the taxonomy broad and shallow.
  - Roll-up more specific terms into broader categories
  - A business taxonomy should have no more than 1,200 categories.
Build out taxonomy detail: NASA Taxonomy
http://nasataxonomy.jpl.nasa.gov/
Validation testing and review

Method: Walk-thru
  Process: Show & explain
  Who: Taxonomist, SME, Team
  Requires: Rough taxonomy
  Validation: Approach; Appropriateness to task

Method: Walk-thru
  Process: Check conformance to editorial rules
  Who: Taxonomist
  Requires: Draft taxonomy; Editorial Rules
  Validation: Consistent look and feel

Method: Usability Testing
  Process: Contextual analysis (card sorting, scenario testing, etc.)
  Who: Users
  Requires: Rough taxonomy; Tasks & Answers
  Validation: Tasks are completed successfully; Time to complete task is reduced

Method: User Satisfaction
  Process: Survey
  Who: Users
  Requires: Rough Taxonomy; UI Mockup; Search prototype
  Validation: Reaction to taxonomy; Reaction to new interface; Reaction to search results

Method: Tagging Samples
  Process: Tag sample content with taxonomy
  Who: Taxonomist, Team, Indexers
  Requires: Sample content; Rough taxonomy (or better)
  Validation: Content ‘fit’; Fills out content inventory; Training materials for people & algorithms; Basis for quantitative methods
Migrate content
• Prioritize content to be tagged
  - Identify and dispose of ROT (redundant, outdated, and trivial content).
• Use business rules to automate content tagging
  - Tag landing pages for major sections.
  - Lower-level pages inherit tags from top-level pages.
• Use workflow to enforce tagging
  - Require entry of simple tagging in order to submit an item into the content management system.
• Use templates to guide user tagging
  - Pre-populate template fields whenever possible.
  - Use context-sensitive pick lists.
  - Call-out to taxonomy service for more complex controlled vocabularies.
• Provide tagging incentives
  - Almost instantaneous feedback.
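The business rule that lower-level pages inherit tags from top-level pages can be sketched as a path-prefix lookup. This is one possible approach under stated assumptions, not a description of any specific content management system: the paths, tag values, and the function name `inherited_tags` are hypothetical.

```python
# Hypothetical tags applied by hand to the landing pages of major sections.
LANDING_PAGE_TAGS = {
    "/parks/": {"Org": "Parks", "Topic": "Parks & Gardens"},
    "/parks/forms/": {"Type": "Online Forms"},
}


def inherited_tags(path):
    """Collect tags from every landing page whose path is a prefix of `path`."""
    tags = {}
    # Apply shorter (higher-level) prefixes first so deeper sections can
    # add to or override what they inherit.
    for prefix in sorted(LANDING_PAGE_TAGS, key=len):
        if path.startswith(prefix):
            tags.update(LANDING_PAGE_TAGS[prefix])
    return tags


print(inherited_tags("/parks/forms/permit.html"))
print(inherited_tags("/streets/paving.html"))   # no tagged ancestor
```

Only the landing pages are tagged manually; everything beneath them picks up its tags from the rule.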
Maintain and evolve taxonomy
• Taxonomy building is iterative.
  - A taxonomy should be improved over time and maintained.
• Designate a taxonomy editor as the single point-of-contact for taxonomy changes.
• Log change requests and notify requestors.
• Prioritize taxonomy changes, e.g.
  - Improves information access, use and reuse.
  - Requires creating new data or metadata.
  - Affects program operations or has a financial impact.
  - Enables communication campaigns or organizational strategy.
  - Positive impact on users
Exercise: BUILD A HIGH LEVEL TAXONOMY
Exercise: Promo website taxonomy

What is Dunder Mifflin Promo?
The new online division that markets promotional products. DM Promo was designed to reinvent the business of selling promotional paper products.

The DMP website will provide:
• Product catalog browse by category, brand, cost, popularity, feature, etc.
• Product specs, series, schedule, imprinting, & colors.
• Various types of content such as product ideas, articles, testimonials, etc.
• Account information, shipping & returns.

DMP Products include:
• Logo binders & filing supplies
• Logo calendars & planners
• Logo paper, cardstock & pads
• Logo pens & pencils
• Logo promotional products (badges & lanyards, mugs, stress balls, tote bags, mp3 players)
• Logo trophies & novelties (custom money, banners & signs, origami, party hats, paper boats)
• Logo wear (shirts, t-shirts, sweatshirts, fleece, bathing suits, hats, bandanas, team uniforms, socks)
Exercise 3: Identify topics for Promo website taxonomy
1. Form groups
   - No more than 10 in a group.
   - Appoint recorder & reporter.
2. Brainstorm topics (10 min)
   - Write one topic on each Post-it
3. Sort Post-its into groups (5 min)
4. Present taxonomies (10 min)
5. Compare taxonomies (5 min)
Taxonomy Strategies LLC
Questions?
Joseph Busch, 415-377-7912, [email protected]
Ron Daniel, 925-691-8374, [email protected]