Transcript Slide 1

Taxonomy Strategies LLC

Data Governance Maturity:

When the business depends on clear description of fuzzy objects

Presented to San Francisco DAMA Sept. 10, 2008


Ron Daniel, Jr.

Copyright 2008 Taxonomy Strategies LLC. All rights reserved.

Bio: Ron Daniel, Jr.

• Over 15 years in the business of metadata & automatic classification
  • Principal, Taxonomy Strategies
  • Standards Architect, Interwoven
  • Senior Information Scientist, Metacode Technologies (acquired by Interwoven, November 2000)
  • Technical Staff Member, Los Alamos National Laboratory
• Metadata and taxonomies community leadership
  • Chair, PRISM (Publishers Requirements for Industry Standard Metadata) working group
  • Acting chair, XML Linking working group
  • Member, RDF working groups
  • Co-editor, PRISM, XPointer, 3 IETF RFCs, and Dublin Core 1 & 2 reports

Taxonomy Strategies

LLC

The business of organized information 2

Recent & current projects:

http://www.taxonomystrategies.com/html/clients.htm

Government Commercial Not-for-Profit


Goals for this talk

• Provide you with background on maturity models.
• Provide the results of our surveys of Search, Metadata, & Taxonomy practices and discuss interesting findings.
• Review the practices in use at stock photo houses, and compare them to methods that may be used in typical information management projects.
• Give you the tools to do a simple self-assessment of your organization’s metadata maturity.

Agenda

9:15   Metadata Definitions
9:30   Maturity Models
9:45   Metadata Maturity Model (ca. 2006)
10:15  Break
10:30  Stock Photo Business
10:40  Data Governance Practices in Stock Photo Agencies
11:40  Summary
11:45  Questions
12:00  Adjourn

Metadata Definitions


Taxonomy and metadata definitions

Metadata  “Data about data”.

 Different communities have very different assumptions about they types of data being described.

 I’m from the Information Science community, not the database, statistics, or massive storage communities. Taxonomy 1.

2.

3.

The classification of organisms in an ordered system that indicates natural relationships. The science, laws, or principles of classification; systematics. Division into ordered groups, categories, or hierarchies.

7 Taxonomy Strategies

LLC

The business of organized information

Examples of taxonomy used to populate metadata fields

Metadata: Title, Author, Department, Audience, Topic

Metadata Values (Facets within the overall Taxonomy)

Audience: Internal, Executives, Managers, External, Suppliers, Customers, Partners

Topics: Employee Services, Compensation, Retirement, Insurance, Further Education, Finance and Budget, Products and Services, Support Services, Infrastructure, Supplies
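A minimal sketch of how facets like these can be used to check tagged values. It assumes the Audience and Topics values from the slide form flat controlled vocabularies; the function name, the record layout, and the sample document are hypothetical and are not from the talk.

```python
# Hypothetical controlled vocabularies, copied from the Audience and Topics
# facets on the slide. The flat layout is an assumption for illustration.
FACETS = {
    "Audience": {"Internal", "Executives", "Managers",
                 "External", "Suppliers", "Customers", "Partners"},
    "Topic": {"Employee Services", "Compensation", "Retirement", "Insurance",
              "Further Education", "Finance and Budget",
              "Products and Services", "Support Services",
              "Infrastructure", "Supplies"},
}


def invalid_values(record):
    """Return {field: [values not in that field's facet]} for one record.

    Fields without a facet (Title, Author, ...) are free text and are skipped.
    """
    problems = {}
    for field, allowed in FACETS.items():
        bad = [v for v in record.get(field, []) if v not in allowed]
        if bad:
            problems[field] = bad
    return problems


# Made-up example document; "Staff" is not a value in the Audience facet.
record = {
    "Title": "2008 Retirement Plan Overview",
    "Author": "J. Smith",
    "Audience": ["Managers", "Staff"],
    "Topic": ["Retirement"],
}
print(invalid_values(record))
```

Only the fields backed by a facet are checked, which mirrors the slide: Title and Author stay free text, while Audience and Topic draw their values from the taxonomy.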

Example faceted taxonomy

ABC Computers.com

Content Type: Award; Case Study; Contract & Warranty; Demo; Magazine; News & Event; Product Information; Services; Solution; Specification; Technical Note; Tool; Training; White Paper; Other Content Type

Competency: Business & Finance; Interpersonal Development; IT Professionals Technical Training; IT Professionals Training & Certification; PC Productivity; Personal Computing Proficiency

Industry: Banking & Finance; Communications; E-Business; Education; Government; Healthcare; Hospitality; Manufacturing; Petrochemicals; Retail / Wholesale; Technology; Transportation; Other Industries

Service: Assessment, Design & Implementation; Deployment; Enterprise Support; Client Support; Managed Lifecycle; Asset Recovery & Recycling; Training

Product Family: Desktops; MP3 Players; Monitors; Networking; Notebooks; Printers; Projectors; Servers; Services; Storage; Televisions; Non-ABC Brands

Audience: All; Business; ABC Employee; Education; Gaming Enthusiast; Home; Investor; Job Seeker; Media; Partner; Shopper; First Time; Experienced; Advanced; Supplier

Line of Business: All; Home & Home Office; Gaming; Government, Education & Healthcare; Medium & Large Business; Small Business

Region, Country: All; Asia-Pacific; Canada; ABC EMEA; Japan; Latin America & Caribbean; United States

Manually tagged metadata sample

Attribute: Values

Title: Jupiter’s Ring System
URL: http://ringmaster.arc.nasa.gov/jupiter/
Description: Overview of the Jupiter ring system. Many images, animations and references are included for both the scientist and the public.
Content Types: Web Sites; Animations; Images; Reference Sources
Audiences: Educators; Students
Organizations: Ames Research Center
Missions & Projects: Voyager; Galileo; Cassini; Hubble Space Telescope
Locations: Jupiter
Business Functions: Scientific and Technical Information
Disciplines: Planetary and Lunar Science
Time Period: 1979-1999

Other things sometimes called Taxonomy

Type: Remarks

Synonym Ring
• Connects a series of terms together
• Treats them as equivalent for search purposes
• e.g. (Dog, Canine, Pooch, Mutt), (Cat, Feline, Kitty), …

Authority File
• Used to control variant names with a preferred term
• Typically used for names of countries, individuals, organizations
• e.g. (IBM, Big Blue, International Business Machines Inc.)

Classification Scheme
• A hierarchical arrangement of terms
• May or may not follow strict “is-a” hierarchy rules
• Usually enumerated, e.g. LC or Dewey

Thesaurus
• Expresses semantic relationships of:
  • Hierarchy (broader & narrower terms)
  • Equivalence (synonyms)
  • Associative (related terms)
• May include definitions

Ontology
• Resembles faceted taxonomy but uses richer semantic relationships among terms and attributes and strict specification rules
• A model of reality, allowing inferences to be made.
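The difference between a synonym ring and an authority file can be sketched in a few lines. This is an illustration only: the terms come from the slide's examples, the function names are hypothetical, and treating "IBM" as the preferred term is an assumption, since the slide does not say which variant is preferred.

```python
# Synonym rings: every term in a ring is treated as equivalent for search.
SYNONYM_RINGS = [
    {"dog", "canine", "pooch", "mutt"},
    {"cat", "feline", "kitty"},
]

# Authority file: variant names map to one preferred term.
# Assumption: "IBM" is the preferred term for this entry.
AUTHORITY_FILE = {
    "ibm": "IBM",
    "big blue": "IBM",
    "international business machines inc.": "IBM",
}


def expand_query(term):
    """Return all terms to search for, using the synonym rings."""
    key = term.lower()
    for ring in SYNONYM_RINGS:
        if key in ring:
            return set(ring)
    return {key}


def preferred_name(name):
    """Return the preferred term for a variant, or the name unchanged."""
    return AUTHORITY_FILE.get(name.lower(), name)


print(sorted(expand_query("Pooch")))
print(preferred_name("Big Blue"))
```

A ring has no preferred member, so expansion returns the whole set; an authority file always resolves to a single form, which is what makes it suitable for tagging names consistently.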

Pop Quiz

On a blank piece of paper:
• What question(s) did you want to have answered by coming to today’s talks?

Flag one question to be discussed later.

You do NOT have to provide your name.

Please DO provide your job title, division, and either company name or company type.

What do other people ask about?

• How to build a taxonomy? (development)
• Definitions of terms. (definitions)
• How to govern its use and maintenance? (governance)
• What’s the ROI? (ROI)
• What are they for? (basic taxo purpose)
• How do we put them to use? (usage)
• How do we link them to content? (tagging)
• How do they help search? (search)
• How do I sell management on a taxonomy project? (selling)
• How do we maintain them? (maint)

and many more…

Agenda

9:15   Metadata Definitions
9:30   Maturity Models
9:45   Metadata Maturity Model (ca. 2006)
10:15  Break
10:30  Stock Photo Business
10:40  Data Governance Practices in Stock Photo Agencies
11:40  Summary
11:45  Questions
12:00  Adjourn

Motivation behind the Metadata Maturity Model


Organizational benchmarking

• A common goal of organizations is to ‘benchmark’ themselves against other organizations.
• Different organizations have:
  • Different levels of sophistication in their planning, execution, and follow-up for CMS, Search, Portal, Metadata, and Taxonomy projects.
  • Different reasons for pursuing Search, Metadata, and Taxonomy efforts
  • Different cultures
• Benchmarks should be to similar organizations.

Is unnecessary capability harmful?

• Tool vendors continue to provide ever-more capable tools with ever-more sophisticated features.

• But we live in a world where a significant fraction of public, commercial web pages don’t have a &lt;title&gt; tag.</p> <p>• Organizations that can’t manage &lt;title&gt; tags stand a very poor chance of putting an entity extractor to use, which requires some ongoing management of the lists of entities to be extracted.</p> <p>• Organizations that can’t create and maintain clean metadata can’t put a faceted search UI to good use.</p> <p>• Unused capability is poor value-for-money.</p> <p>• Organizations over-spend on tools and under-spend on staff & processes.</p> <a id="p18" href="#"></a> <h3><b>Towards better benchmarking…</b></h3> <p>• Wanted a method to:</p> <p>• Generally identify good and bad practices.</p> <p>• Help clients identify the things they can do, and the things that stand an excellent chance of failing.</p> <p>• Predict likely sources of problems in engagements.</p> <p>• We have started to develop a Metadata Maturity Model, inspired by Maturity Models from the software industry.</p> <p>• To keep the model tied to reality, we are conducting surveys to determine the actual state of practice around search, metadata, taxonomy, and supporting business functions such as staffing and project management.</p> <a id="p19" href="#"></a> <h2><b>A Tale of Two Software Maturity Models</b></h2> <p>CMMI (Capability Maturity Model Integration) vs.</p> <p>The Joel Test</p> <a id="p20" href="#"></a> <h3><b>CMMI structure</b></h3> <p><b>Maturity Models are collections of <i>Practices</i>.</b></p> <p><b>Main differences in Maturity Models concern:</b></p> <p><b>• Descriptivist or Prescriptivist Purpose</b></p> <p><b>• Degree of Categorization of Practices</b></p> <p><b>• Number of Practices (~400 in
CMMI)</b></p> <p><b>Source: http://chrguibert.free.fr/cmmi</b></p> <a id="p21" href="#"></a> <h3><b>22 Process Areas, keyed to 5 Maturity Levels…</b></h3> <p>• Process Areas contain Specific and Generic Practices, organized by Goals and Features, and arranged into Levels</p> <p><b>Process Areas cover a broad range of practices beyond simple software development</b></p> <p>• CMMI Axioms:</p> <p>• Individual processes at higher levels are <i>AT RISK</i> from supporting processes at lower levels.</p> <p>• A Maturity Level is not achieved until <i>ALL</i> the Practices in that level are in operation.</p> <a id="p22" href="#"></a> <h3><b>CMMI Positives</b></h3> <p>• Independent audits of an organization’s level of maturity are a common service</p> <p>• Level 3 certification frequently required in bids</p> <p>• “…compared with an average Level 2 program, Level 3 programs have 3.6 times fewer latent defects, Level 4 programs have 14.5 times fewer latent defects, and Level 5 programs have 16.8 times fewer latent defects”. Michael Diaz and Jeff King, “How CMM Impacts Quality, Productivity, Rework, and the Bottom Line”</p> <p>• ‘If you find yourself involved in product liability litigation you're going to hear terms like "prevailing standard of care" and "what a reasonable member of your profession would have done". Considering the fact that well over a thousand companies world-wide have achieved level 3 or above, and the body of knowledge about the CMM is readily available, you might have some explaining to do if you claim ignorance’. Linda Zarate in a review of <i>A Guide to the Cmm: Understanding the Capability Maturity Model for Software</i> by Kenneth M.
Dymond</p> <a id="p23" href="#"></a> <h3><b>CMMI Negatives</b></h3> <p>• Complexity and Expense</p> <p>• Reading and understanding the materials</p> <p>• Putting it into action – identifying processes, mapping processes to model, gathering required data, …</p> <p>• Audits are expensive</p> <p>• CMMI does not scale down well to small shops</p> <p>• Has been accused of restraint of trade</p> <a id="p24" href="#"></a> <h3><b>At the other extreme, The Joel Test</b></h3> <p>• Developed by Joel Spolsky as reaction to CMMI complexity</p> <p>• Positives: Quick, easy, and inexpensive to use.</p> <p>• Negatives: Doesn’t scale up well:</p> <p>• Not a good way to assure the quality of nuclear reactor software.</p> <p>• Not suitable for scaring away liability lawyers.</p> <p>• Not a longer-term improvement plan.</p> <p><b>The Joel Test</b></p> <p>1. Do you use source control?</p> <p>2. Can you make a build in one step?</p> <p>3. Do you make daily builds?</p> <p>4. Do you have a bug database?</p> <p>5. Do you fix bugs before writing new code?</p> <p>6. Do you have an up-to-date schedule?</p> <p>7. Do you have a spec?</p> <p>8. Do programmers have quiet working conditions?</p> <p>9. Do you use the best tools money can buy?</p> <p>10. Do you have testers?</p> <p>11. Do new candidates write code during their interview?</p> <p>12. Do you do hallway usability testing?</p> <p>Scoring: 1 point for each ‘yes’.
Scores below 10 indicate serious trouble.</p> <p>24 <b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information</p> <a id="p25" href="#"></a> <h3><b>What does software development “Maturity” really mean?</b></h3> <p> A low score on a maturity audit DOES NOT mean that an organization can’t develop good software  It DOES mean that whether the organization will do a good job depends on the specific mix of people assigned to the project  In other words, it sets a floor for how bad an organization is likely to do, not a ceiling on how good they can do  Probability of failure is a good thing to know before spending a lot of time and money 25 <b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information</p> <a id="p26" href="#"></a> <h2><b>Towards a Metadata Maturity Model</b></h2> <p>Caveats:  Maturity is not a goal, it is a characterization of an organization’s methods for achieving its core goals.</p> <p> Mature processes impose expenses which must be justified by consequent cost savings, revenue gains, or service improvements.</p> <p><i>Nevertheless</i></p> <p>, Maturity Models are useful as collections of best practices and stages in which to try to adopt them.</p> <p>T AXONOMY S TRATEGIES The business of organized information 26</p> <a id="p27" href="#"></a> <h3><b>Basis for initial maturity model</b></h3> <p> CEN study on commercial adoption of Dublin Core  Small-scale phone survey  Organizations which have world-class search and metadata externally  Not necessarily the most mature overall processes or the best </p> <p><i>internal</i></p> <p>search and metadata  Literature review  Client experiences  Structure from software maturity models <b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 27</p> <a id="p28" href="#"></a> <h3><b>Initial Metadata Maturity Model (ca. 
May, 2005)</b></h3> <p><b>37 Practices, Categorized by Area, Level, and Importance</b></p> <p>Practice Area Maturity Level</p> <p><b>Basic Intermediate Advanced Bleeding Edge Limiting</b></p> <p>Search Capabilities Uniform Search Box Query Log Exam.</p> <p>System MD Stds.</p> <p>Index Multiple Repos.</p> <p>Best Bets Simple Grouping Organization MD Std.</p> <p>Reuse ERP Intranet Facet Navigation Improved Ranking Multipe Repos Comply Taxonomy Roadmap Highly Abstract Subject Taxos.</p> <p>Metadata and taxonomy standards Tools and tool selection Staff training and hiring Data creation and QA Requirements, then Tools Search Analyst Role CM Introduced Bakeoff Datasets Librarian Expertise ROT-Eliminatiion Budget for Bakeoffs Pre-hire Testing Hybrid Creation Model SME Catalogers Adaptive Qualification Quality Measures Unneeded Capabils.</p> <p>Tools, then Reqs.</p> <p>Project management Project Plan Std. Proj. Methodol.</p> <p>X-Functional Teams Communication Plan Multi-Year Plan Intranet ROI Model Executive support and ROI External Search ROI <b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information Early Termination CEO knows Search ROI Use it or Lose It Budgets 28</p> <a id="p29" href="#"></a> <h3><b>Shortcomings of the initial model</b></h3> <p> No idea of how it corresponds to actual practice across multiple organizations  Some indications that it over-emphasized the sophisticated practices and under-emphasized beginning practices.</p> <p> The initial metadata maturity model can be regarded as a hypothesis about how an organization progresses through various practices as it matures  How to test it? 
Let’s ask!</p> <p> Two surveys to date  Surveys are being run in stages because of large number of practices.</p> <p> Ask about future, current, and former practices to gather information on progression 29 <b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information</p> <a id="p30" href="#"></a> <h3><b>Agenda</b></h3> <p><b>9:15 9:30 9:45 10:15 10:30 10:40 11:40 11:45 12:00 Metadata Definitions Maturity Models Metadata Maturity Model (ca. 2006) Break Stock Photo Business Data Governance Practices in Stock Photo Agencies Summary Questions Adjourn</b></p> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 30</p> <a id="p31" href="#"></a> <h2><b>Survey 1: Search, Metadata, & Taxonomy Practices</b></h2> <p>The data in this section comes from a survey conducted in the autumn of 2005.</p> <p>31 T AXONOMY S TRATEGIES The business of organized information</p> <a id="p32" href="#"></a> <h3><b>Participants by Organization Size</b></h3> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 32</p> <a id="p33" href="#"></a> <h3><b>Participants by Job Role</b></h3> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 33</p> <a id="p34" href="#"></a> <h3><b>Participants by Industry</b></h3> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 34</p> <a id="p35" href="#"></a> <h3><b>Search Practices</b></h3> <p>Search Box in standard place on all web pages.</p> <p>Search engine indexes multiple repositories in addition to web sites.</p> <p>Spell Checking.</p> <p>Synonym Searching.</p> <p>Search results grouped by date, location, or other factors in addition to simple relevance score.</p> <p>Queries are logged and the logs are regularly examined Common queries identified, 'best' pages for those queries are found, and search engine configured to return them 
at the top. Advanced computation of relevance based on data in addition to the text of the document.</p> <p>A faceted search tool, such as Endeca, has been implemented for the organization's external site or product catalog search.</p> <p>A faceted search tool, such as Endeca, has been implemented for the organization's internal website(s) or portal.</p> <p>Not current practice 20% (12) 25% (15) 31% (19) 41% (25) 37% (22) 31% (19) 46% (28) 43% (26) 68% (41) 57% (34) Being developed 11% (7) 21% (13) 18% (11) 23% (14) 20% (12) 25% (15) 25% (15) 16% (10) 7% (4) 15% (9) <b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information In practice 62% (38) 44% (27) 38% (23) 30% (18) 37% (22) 31% (19) 21% (13) 25% (15) 10% (6) 17% (10) Former practice 2% (1) 2% (1) 0% (0) 0% (0) 0% (0) 5% (3) 0% (0) 0% (0) 0% (0) 0% (0) NA or Unknown 5% (3) 8% (5) 13% (8) 7% (4) 7% (4) 8% (5) 8% (5) 16% (10) 15% (9) 12% (7) 35</p> <a id="p36" href="#"></a> <h3><b>Metadata Practices</b></h3> <p><b>These two questions were the only ones with much correlation to organization size</b></p> <p>Not current practice Metadata standards are developed for the needs of each system with no overall attempt to unify them.</p> <p>An Organization-wide metadata standard exists and new systems consider it during development.</p> <p>The Organization-wide metadata standard is based on the Dublin Core.</p> <p>Multiple repositories comply with metadata standard.</p> <p>A Cataloging Policy document exists to teach people how to tag data in compliance with organizational metadata standard.</p> <p>The Cataloging Policy document is revised periodically.</p> <p>A centralized metadata repository exists to aggregate and unify metadata from disparate sources.</p> <p>Metadata is manually entered into web forms.</p> <p>Metadata is generated automatically by software.</p> <p>Metadata is generated automatically, then reviewed manually for correction.</p> <p>22% (13) 37% (22) 52% (30) 52% 
(31) 48% (29) 48% (29) 57% (34) 15% (9) 38% (23) 48% (29) <b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information Being developed 12% (7) 37% (22) 16% (9) 20% (12) 20% (12) 15% (9) 17% (10) 12% (7) 18% (11) 18% (11) In practice 37% (22) 20% (12) 10% (6) 20% (12) 21% (12) 17% (10) 20% (12) 17% (10) 17% (10) 61% (36) 27% (16) 17% (10) Former practice 0% (0) 0% (0) 0% (0) 0% (0) 0% (0) 0% (0) 3% (2) 2% (1) 2% (1) NA or Unknown 7% (4) 12% (7) 12% (7) 12% (7) 20% (12) 10% (6) 8% (5) 15% (9) 15% (9) 36</p> <a id="p37" href="#"></a> <h3><b>Taxonomy Practices</b></h3> <p>Not current practice Org Chart' Taxonomy - One based primarily on the structure of the organization.</p> <p>'Products' Taxonomy - One based primarily on the products and/or services offered by the organization.</p> <p>'Content Types' Taxonomy - One based primarily on the different types of documents.</p> <p>'Topical' Taxonomy - One based primarily on topics of interest to the site users. 
'Faceted' Taxonomy - One which uses several of the approaches above.</p> <p>The Taxonomy, or a portion of it, was licensed from an outside taxonomy vendor.</p> <p>The Taxonomy follows a written 'style guide' to ensure its consistency over time.</p> <p>The Taxonomy is maintained using a taxonomy editing tool other than MS Excel.</p> <p>The Taxonomy was validated on a representative sample of content during its development.</p> <p>A Roadmap for the future evolution of the Taxonomy has been developed.</p> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 36% (21) 37% (22) 28% (16) 20% (12) 32% (19) 75% (44) 47% (28) 35% (21) 28% (17) 38% (23) Being developed 10% (6) 10% (6) 21% (12) 36% (21) 29% (17) 3% (2) 22% (13) 17% (10) 22% (13) 40% (24) In practice 34% (20) 32% (19) 40% (23) 34% (20) 34% (20) 14% (8) 20% (12) 40% (24) 33% (20) 13% (8) Former practice 5% (3) 5% (3) 5% (3) 3% (2) 0% (0) 0% (0) 0% (0) 2% (1) 3% (2) 0% (0) NA or Unknown 15% (9) 15% (9) 7% (4) 7% (4) 5% (3) 8% (5) 10% (6) 7% (4) 13% (8) 8% (5) 37</p> <a id="p38" href="#"></a> <h2><b>Survey 2: Business Drivers, Processes, and Staffing</b></h2> <p>The data in this section comes from a survey conducted in the spring of 2006.</p> <p>38 T AXONOMY S TRATEGIES The business of organized information</p> <a id="p39" href="#"></a> <h3><b>Participants by Job Role</b></h3> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 39</p> <a id="p40" href="#"></a> <h3><b>Participants by Tenure</b></h3> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 40</p> <a id="p41" href="#"></a> <h3><b>Participants by Industry</b></h3> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 41</p> <a id="p42" href="#"></a> <h3><b>Participants by Organization Size</b></h3> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The 
business of organized information 42</p> <a id="p43" href="#"></a> <h3><b>Business Drivers: Search, Metadata, and Taxonomy (SMT) Applications</b></h3> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 43</p> <a id="p44" href="#"></a> <h3><b>Business Drivers: Desired Benefits</b></h3> <p><b>Other desired benefits: </b></p> <p>1 2 3 4 5 6 7 8 9 10 11 Innovation Core to our business product Clients do all the above </p> <p><i>[From a consultant]</i></p> <p>Better navigation to diverse State web sites Increased knowledge sharing across the corporation Interoperability Dynamic web applications Improved user search experience Improve R&D Higher value to members </p> <p><i>[From a non-profit membership org.]</i></p> <p>For organization to have better understanding of their content <b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 44</p> <a id="p45" href="#"></a> <h3><b>ROI: Cost Estimation</b></h3> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 45</p> <a id="p46" href="#"></a> <h3><b>Processes</b></h3> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information</p> <p><b>Use of search logs is improving Surprisingly sophisticated Basic data quality and communications need improvement Many solo operators</b></p> <p>46</p> <a id="p47" href="#"></a> <h3><b>Team Structures & Staffing</b></h3> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 47</p> <a id="p48" href="#"></a> <h3><b>Salary Survey</b></h3> <p>Experience Geography Co. 
Size Education Industry Role Time at current job 0.6 Nice to see it really counts.</p> <p>0.5 California and the Northeast have highest salaries.</p> <p>0.5 Not very reliable, big changes from one datapoint 0.4 Many taxonomists have MLS or above.</p> <p>0.4 Surprisingly, retail has high salaries for taxonomists.</p> <p>0.04 Taxonomists paid about like Information Architects -0.07</p> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 48</p> <a id="p49" href="#"></a> <h3><b>Notes from Participants</b></h3> <p> There is the constant struggle with individual [magazine] titles to hire trained librarians or data specialists instead of trying to save money by hiring an editor who can build articles AND create and assign metadata. This is a governance issue we have been struggling with since we have no monetary stake in the individual publications. We make recommendations, but have no higher level authority to require titles to hire trained staff for metadata.</p> <p> Reporting metrics have become a new area of confusion as we move to portalized pages consisting of objects in portlets, each with their own metadata.</p> <p> Key organizational issue is that the "problems" that stem from lack of systematic metadata/taxonomy creation are not "owned" by anyone, and consequently have no budget for their solution. 
49 <b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information</p> <a id="p50" href="#"></a> <h2><b>Interim Conclusions</b></h2> <p>T AXONOMY S TRATEGIES The business of organized information 50</p> <a id="p51" href="#"></a> <h3><b>Observations (1)</b></h3> <p> Practices which a single person or a small group can carry out are more commonly used  Not surprising  Very different than ERP/BPR, indicates that information management is not being sold to the “C-level” staff.</p> <p> People need to question how inclusive their “Organizational Metadata Standards” and “Taxonomy Roadmaps ” actually are.</p> <p> We have found Taxonomy Roadmaps to be an advanced practice, due to a dependence on knowing upcoming IT development schedule 51 <b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information</p> <a id="p52" href="#"></a> <h3><b>Observations (2)</b></h3> <p> Many of the basics are being skipped  More organizations doing “Spell Checking” than “Query Log Analysis”.</p> <p> 69% have a taxonomy change plan, but only 41% have a plan for revisiting data if the taxonomy changes.</p> <p> 64% have a communications plan, but only 56% have a website.</p> <p> This seems to be linked to the previous observation – things that are easy for an individual get done before things that need an organizational effort, despite their level of ‘sophistication’.</p> <p>52 <b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information</p> <a id="p53" href="#"></a> <h3><b>Interim Metadata Maturity Model (ca. 
May, 2006)</b></h3> <p>Practice Area</p> <p><b>Basic Intermediate Advanced Limiting</b></p> <p>Search Capabilities Metadata and taxonomy standards  Uniform Search Box  Query Log Exam.</p> <p> System MD Stds.</p> <p> Organization MD Std.</p> <p> Requirements, then Tools  Index Multiple Repos.</p> <p> Best Bets  Multipe Repos Comply w/ MD Std.</p> <p> Reuse ERP Taxos  Taxo Maint. Doc  Bakeoff Datasets  Facet Navigation UI  Taxonomy Roadmap   Highly Abstract Subject Taxos (e.g. “Moods”) Metadata Maint. Doc  Budget for Bakeoffs  Tools, then Reqs.</p> <p>Tools and tool selection Staff training and hiring  Librarian or IA Expertise  Search Analyst Role  Cross-Functional Taxonomy Creation  Cross-functional taxonomy maint.</p> <p> SME Catalogers  Pre-hire Testing Data creation and QA Project management  CM Introduced  Project Plan  X-Functional Teams Executive support and ROI  External Search ROI  SMT in separate silos  ROT-Eliminatiion  Semi-auto tagging  Std. Proj. 
Methodol.</p> <p> Multi-Year Plan  Communication Plan  SMT Business Manager, instead of IT Manager  Intranet ROI Model  Quality Measures  Early Termination  CEO knows Search ROI  Use it or Lose It Budgets 53 <b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information</p> <a id="p54" href="#"></a> <h3><b>Search and Metadata Maturity Quick Quiz</b></h3> <p><b>Basic</b></p> <p>1) Is there a process in place to examine query logs?</p> <p>2) 3)</p> <p><b>Intermediate</b></p> <p>4) Does the search engine index more than 4 repositories around the organization?</p> <p>5) 6) 7) 8) Is there a process for adding directories and content to the repository, or do people just do what they want?</p> <p>Is there an organization-wide metadata standard, such as an extension of the Dublin Core, for use by search tools, multiple repositories, etc.?</p> <p>Does the search engine integrate with the taxonomy to improve searches and organize results?</p> <p>Are there hiring and training practices especially for metadata and taxonomy positions?</p> <p>Is there an ongoing data cleansing procedure to look for ROT (Redundant, Obsolete, Trivial content)?</p> <p>Are tools only acquired after requirements have been analyzed, or are major purchases sometimes made to use up year-end money?</p> <p><b>Advanced</b></p> <p>9) Are there established qualitative and quantitative measures of metadata quality?</p> <p>10) Can the CEO explain the ROI for search and metadata?</p> <p>54 <b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information</p> <a id="p55" href="#"></a> <h3><b>Agenda</b></h3> <p><b>9:15 9:30 9:45 10:15 10:30 10:40 11:40 11:45 12:00 Metadata Definitions Maturity Models Metadata Maturity Model (ca. 
2006) Break Stock Photo Business Data Governance Practices in Stock Photo Agencies Summary Questions Adjourn</b></p> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 55</p> <a id="p56" href="#"></a> <h3><b>Agenda</b></h3> <p><b>9:15 9:30 9:45 10:15 10:30 10:40 11:40 11:45 12:00 Metadata Definitions Maturity Models Metadata Maturity Model (ca. 2006) Break Stock Photo Business Data Governance Practices in Stock Photo Agencies Summary Questions Adjourn</b></p> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 56</p> <a id="p57" href="#"></a> <h3><b>Stock Photo Business</b></h3> <p> Advertising, Editorial Content, Corporate Communications, and many other types of content rely on images to convey information and moods.</p> <p> When time and/or budget does not allow a commissioned shoot, stock photo houses can supply images.</p> <p> Fundamental problem for users: How to search for an image that conveys what you want?</p> <p> Fundamental problem for houses: How to describe images so that users can find them?</p> <p>57 <b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information</p> <a id="p58" href="#"></a> <h3><b>How would you search for this image?</b></h3> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 58</p> <a id="p59" href="#"></a> <h3><b>Tagging by emotions</b></h3> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 59</p> <a id="p60" href="#"></a> <h3><b>“silence”</b></h3> <p><b>Image Rights Criteria Objective criteria</b></p> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 60</p> <a id="p61" href="#"></a> <h3><b>Clarification: Finger on Lips</b></h3> <p><b>Taxonomy </b><i>Strategies</i></p> <p><i>LLC</i></p> <p>The business of organized information 61</p> <a id="p62" 
href="#"></a> <h3><b>Scrolling through results…</b></h3>
<p><b>This is more of the mood I’m looking for…</b></p>
<a id="p63" href="#"></a> <h3><b>More like this</b></h3>
<a id="p64" href="#"></a> <h3><b>Facets at gettyimages.com</b></h3>
<a id="p65" href="#"></a> <h3><b>Key Questions</b></h3>
<p>• Getty Images (and Corbis) have put a lot of effort into their websites for image purchase.*</p>
<p>• Internal staff at such organizations tell me that their intranets are nowhere near as easy to use.</p>
<p>• ROI is the reason why.</p>
<p>• Recall that retail had high salaries for taxonomists, because the ROI for a better shopping site is so clear.</p>
<p>• The front-ends are dependent on data. How is that data governed? How does that differ from how their intranets are governed?</p>
<p>* Licensing, not purchasing, to be pedantic.</p>
<a id="p66" href="#"></a> <h3><b>Agenda</b></h3>
<p><b>9:15 Metadata Definitions; 9:30 Maturity Models; 9:45 Metadata Maturity Model (ca. 2006); 10:15 Break; 10:30 Stock Photo Business; 10:40 Data Governance Practices in Stock Photo Agencies; 11:40 Summary; 11:45 Questions; 12:00 Adjourn</b></p>
<a id="p67" href="#"></a> <h3><b>Pop Quiz</b></h3>
<p><b>What is the #1 underused source of quantitative information on how to improve your metadata and taxonomy?</b></p>
<h1><b>Query Logs & Click Trails</b></h1>
<a id="p68" href="#"></a> <h3><b>Who are the users & what are they looking for?</b></h3>
<p>• Only 30-40% of organizations regularly examine their logs.</p>
<p>• Sophisticated software is available, but don’t wait.</p>
<p>• 80% of the value comes from basic reports.</p>
<a id="p69" href="#"></a> <h3><b>Query log & click trail examination: Click trail packages</b></h3>
<p>• iWebTrack</p> <p>• NetTracker</p> <p>• OptimalIQ</p> <p>• SiteCatalyst</p> <p>• <b>Visitorville</b></p> <p>• WebTrends</p>
<a id="p70" href="#"></a> <h3><b>Query log & click trail examination: Query log</b></h3>
<p><b>UltraSeek Reporting</b></p>
<p>• Top queries</p> <p>• Queries with no results</p> <p>• Queries with no click-through</p> <p>• Most requested documents</p> <p>• Query trend analysis</p> <p>• Complete server usage summary</p>
<a id="p71" href="#"></a> <h2><b>Examining the Stock Photo Agencies in Light of the Metadata Maturity Model</b></h2>
<a id="p72" href="#"></a> <h3><b>Maturity Model Recap</b></h3>
<p>Practice Area: Search Capabilities; Metadata and taxonomy standards; Tools and tool selection; Staff training and hiring</p> 
<p><b>Basic</b></p> <p>• Uniform Search Box • Query Log Exam.</p> <p>• System MD Stds.</p> <p>• Organization MD Std.</p> <p>• Requirements, then Tools • Librarian or IA Expertise • Search Analyst Role</p> <p>Practice Area (continued): Data creation and QA; Project management</p> <p>• CM Introduced • Project Plan • X-Functional Teams</p> <p>Practice Area (continued): Executive support and ROI</p> <p>• External Search ROI • SMT in separate silos</p> <p><b>Intermediate</b></p> <p>• Index Multiple Repos.</p> <p>• Best Bets • Multiple Repositories Comply w/ MD Std.</p> <p>• Reuse ERP Taxos • Taxo Maint. Doc • Bakeoff Datasets</p> <p><b>Advanced</b></p> <p>• Facet Navigation UI • Taxonomy Roadmap • Highly Abstract Subject Taxos (e.g. “Moods”) • Metadata Maint. Doc • Budget for Bakeoffs</p> <p><b>Limiting</b></p> <p>• Tools, then Reqs.</p> <p>• Cross-Functional Taxonomy Creation • Cross-functional taxonomy maint.</p> <p>• SME Catalogers • Pre-hire Testing • Quality Measures • ROT-Elimination • Semi-auto tagging • Std. Proj. 
Methodol.</p> <p>• Multi-Year Plan • Communication Plan • SMT Business Manager, instead of IT Manager • Intranet ROI Model • Early Termination • CEO knows Search ROI • Use it or Lose It Budgets</p>
<a id="p73" href="#"></a> <h3><b>Search capabilities</b></h3>
<p>Practice Area: Search Capabilities</p>
<p><b>Basic:</b> Uniform Search Box; Query Log Exam.</p>
<p><b>Intermediate:</b> Index Multiple Repos.; Best Bets</p>
<p><b>Advanced:</b> Facet Navigation UI</p>
<p><b>Limiting:</b> none listed</p>
<p>• <b>Uniform Search Box:</b> Both provide this.</p>
<p>• <b>Query Log Exam:</b> Both gathered logs but had only semi-formal review processes at time of interviews.</p>
<p>• <b>Index multiple repositories:</b> Both license picture ‘collections’ from disparate sources but bring them together for search and purchase.</p>
<p>• <b>Best Bets:</b> N/A in creative space.</p>
<p>• <b>Facet Navigation UI:</b> Used on gettyimages.com, but not on corbis.com.</p>
<a id="p74" href="#"></a> <h3><b>Data standards</b></h3>
<p>Practice Area: Metadata and taxonomy standards</p>
<p><b>Basic:</b> System MD Stds.; Organization MD Std.</p>
<p><b>Intermediate:</b> Multiple Repos. Comply w/ MD Std.; Reuse ERP Taxos; Taxo Maint. Doc</p>
<p><b>Advanced:</b> Taxonomy Roadmap; Highly Abstract Subject Taxos (e.g. “Moods”); Metadata Maint. 
Doc</p> <p><b>Limiting:</b> none listed</p>
<p>• <b>System MD Stds:</b> Both have moved beyond that level.</p>
<p>• <b>Organization MD Standard:</b> Both define core metadata standards with extensions for specific collections.</p>
<p>• <b>Multiple repositories comply w/ MD standard:</b> Collections are tagged to a common core at both vendors, plus extension elements in different collections.</p>
<p>• <b>Reuse ERP taxonomies:</b> N/A</p>
<p>• <b>Taxonomy Maint. Doc / Taxonomy Roadmap:</b> Corbis had a plan for facets to be added, but not keyed to other systems.</p>
<p>• <b>Highly abstract vocabularies:</b> Getty shows emotion tagging in action with their Moodstream offering.</p>
<p>• <b>Metadata maint. doc:</b> TBD</p>
<a id="p75" href="#"></a> <h3><b>Image Collections</b></h3>
<a id="p76" href="#"></a> <h3><b>Editorial rules standard</b></h3>
<p>• Abbreviations • Ampersands • Capitalization • General…, More…, Other… • Languages & character sets • Length limits • Multiple parents • Plural vs. singular form • Scope notes • Serial comma • Sources of terms • Spaces • Synonyms & acronyms • Term order (Alphabetic or …) • Term label order (Direct vs. inverted) • …</p>
<table>
<tr><th>Rule Name</th><th>Editorial Rule</th></tr>
<tr><td>Abbreviations</td><td>Abbreviations, other than colloquial terms and acronyms, shall not be used in term labels. Example: Public Information. NOT: Public Info.</td></tr>
<tr><td>Ampersands</td><td>The ampersand [&] character shall be used instead of the word ‘and’. Example: Licensing & Compliance. NOT: Licensing and Compliance.</td></tr>
<tr><td>Capitalization</td><td>Title case capitalization shall be used. Example: Customer Service. NOT: CUSTOMER SERVICE. NOT: Customer service. NOT: customer service.</td></tr>
<tr><td>General…, More…, Other…</td><td>The term labels “General…”, “More…”, and “Other…” shall be used for categories which contain content items that are not further classifiable. Example: “Other Property”, “Other Services”, “General Information”, “General Audience”.</td></tr>
<tr><td>…</td><td>…</td></tr>
</table>
<a id="p77" href="#"></a> <h3><b>Tools and Tool Selection</b></h3>
<p>Practice Area: Tools and tool selection</p>
<p><b>Basic:</b> Requirements, then Tools</p>
<p><b>Intermediate:</b> Bakeoff Datasets</p>
<p><b>Advanced:</b> Budget for Bakeoffs</p>
<p><b>Limiting:</b> Tools, then Reqs.</p>
<p>• <b>Requirements, then Tools:</b> Both are well into iterative additions of functionality based on feature requests.</p>
<p>• <b>Bakeoff Datasets:</b> Periodically they look at cataloging tools from outside vendors but none really automate image tagging to a notable degree. 
</p> <p>• <b>Budget for Bakeoffs:</b> N/A.</p>
<p>• <b>Tools, then Requirements:</b> Neither is susceptible, given the amount of custom code.</p>
<a id="p78" href="#"></a> <h3><b>Normal taxonomy editor functionality requirements</b></h3>
<p><b>• Standard and Custom Fields • Standard and Custom Relations • Data Typing and Restrictions • Consistency Enforcement • Flexible Reporting • Flexible Importing?</b></p>
<p><b>• Term Editing • UNICODE • Multiple Vocabulary Support • Inter-Vocabulary Relations • Unique IDs • ISO Codes not sufficient • Workflow • Voting • Change Request Mgmt.</b></p>
<p><b>• Stylistic rules enforcement • Programmability</b></p>
<p><b>• Hierarchy Browser</b></p>
<a id="p79" href="#"></a> <h3><b>Staff hiring and training</b></h3>
<p>Practice Area: Staff training and hiring</p>
<p><b>Basic:</b> Librarian or IA Expertise; Search Analyst Role</p>
<p><b>Intermediate / Advanced:</b> Cross-Functional Taxonomy Creation; Cross-functional taxonomy maint.; SME Catalogers; Pre-hire Testing</p>
<p><b>Limiting:</b> none listed</p>
<p>• <b>Librarian or IA expertise:</b> Both seek this in their cataloging and taxonomy hires, but seek additional things as well.</p>
<p>• <b>Search Analyst:</b> Was a goal for Getty at time of interview. Interviewee thought that would take Getty from a “7” to an “8” in terms of search sophistication.</p>
<p>• <b>Cross-functional taxonomy creation:</b> Not at time of interviews.</p>
<p>• <b>Cross-functional taxonomy maint:</b> Not at time of interviews.</p>
<p>• <b>SME Catalogers:</b> Yes, esp. Getty Images. Corbis had an art history emphasis; Getty looked for people with a variety of backgrounds, esp. 
science, and photographers.</p>
<p>• <b>Pre-hire testing:</b> Getty did some of this with interns.</p>
<a id="p80" href="#"></a> <h3><b>Data creation and QA</b></h3>
<p>Practice Area: Data creation and QA</p>
<p><b>Basic:</b> CM Introduced</p>
<p><b>Intermediate:</b> ROT-Elimination; Semi-auto tagging</p>
<p><b>Advanced:</b> Quality Measures</p>
<p><b>Limiting:</b> none listed</p>
<p>• <b>CM Introduced:</b> Both use strong database systems for cataloging.</p>
<p>• <b>ROT-Elimination:</b> Image collections are rarely removed unless licensing problems occur. Both have error detection and error correction processes.</p>
<p>• <b>Semi-auto tagging:</b> Both evaluate this technology periodically but neither has found it usable on images.</p>
<p>• <b>Cross-functional taxonomy maint:</b> Not at time of interviews.</p>
<p>• <b>Quality measures:</b> Both have quality control processes but neither mentioned analytic models.</p>
<a id="p81" href="#"></a> <h3><b>Taxonomy testing methods</b></h3>
<table>
<tr><th>Method</th><th>Process</th><th>Who</th><th>Requires</th><th>Validation</th></tr>
<tr><td>Walk-thru</td><td>Show & explain</td><td>Taxonomist; SME; Team</td><td>Rough taxonomy</td><td>Approach; Appropriateness to task</td></tr>
<tr><td>Walk-thru</td><td>Check conformance to editorial rules</td><td>Taxonomist</td><td>Draft taxonomy; Editorial Rules</td><td>Consistent look and feel</td></tr>
<tr><td>Usability Testing</td><td>Contextual analysis (card sorting, scenario testing, etc.)</td><td>Users</td><td>Rough taxonomy; Tasks & Answers</td><td>Tasks are completed successfully; Time to complete task is reduced</td></tr>
<tr><td>User Satisfaction</td><td>Survey</td><td>Users</td><td>Rough Taxonomy; UI Mockup; Search prototype</td><td>Reaction to taxonomy; Reaction to new interface; Reaction to search results</td></tr>
<tr><td>Tagging Samples</td><td>Tag sample content with taxonomy</td><td>Taxonomist; Team; Indexers</td><td>Sample content; Rough taxonomy (or better)</td><td>Content ‘fit’; Fills out content inventory; Training materials for people & algorithms</td></tr>
</table>
<p><b>Basis for quantitative methods</b></p>
<a id="p82" href="#"></a> <h3><b>Simple method: Closed Card Sort</b></h3>
<p>• Tests how people think about content; good for exposing ambiguity.</p>
<p>• Example from alpha test of a grocery site:</p>
<p>• 15 testers put each of 71 best-selling product types into one of 10 pre-defined categories.</p>
<p>• Categories where fewer than 14 of 15 testers put a product into the same category were flagged.</p>
<table>
<tr><th>% of Testers</th><th>Cumulative % of Products</th><th>With Poly Hierarchy</th></tr>
<tr><td>15/15</td><td>54%</td><td>69%</td></tr>
<tr><td>14/15</td><td>70%</td><td>83%</td></tr>
<tr><td>13/15</td><td>77%</td><td>93%</td></tr>
<tr><td>12/15</td><td>83%</td><td>100%</td></tr>
<tr><td>11/15</td><td>85%</td><td>100%</td></tr>
<tr><td>&lt;11/15</td><td>100%</td><td>100%</td></tr>
</table>
<p><b>“Cocoa Drinks – Powder” is best categorized in both “Beverages” and “Grocery”.</b></p>
<p><b>How to improve? Allow products in multiple categories. 
(Results are for minimum size = 4 votes)</b></p>
<a id="p83" href="#"></a> <h3><b>User interface survey: Which search UI is ‘better’?</b></h3>
<p>• Criteria: User satisfaction; Success completing tasks; Confidence in results; Fewer dead ends</p>
<p>• Methodology: Design tasks from specific to general; Time performance; Calculate success rates; Survey subjective criteria</p>
<p>• Pay attention to survey hygiene: Participant selection; Counterbalancing; T-scores</p>
<p>Source: Yee, Swearingen, Li, & Hearst</p>
<a id="p84" href="#"></a> <h3><b>User interface survey: Results (1)</b></h3>
<table>
<tr><th>Which Interface would you rather use for these tasks?</th><th>Google-like Baseline</th><th>Faceted Category</th></tr>
<tr><td>Find images of roses</td><td>15</td><td>16</td></tr>
<tr><td>Find all works from a certain period</td><td>2</td><td>30</td></tr>
<tr><td>Find pictures by 2 artists in the same media</td><td>1</td><td>29</td></tr>
</table>
<table>
<tr><th>Overall assessment:</th><th>Google-like Baseline</th><th>Faceted Category</th></tr>
<tr><td>More useful for your usual tasks</td><td>4</td><td>28</td></tr>
<tr><td>Easiest to use</td><td>8</td><td>23</td></tr>
<tr><td>Most flexible</td><td>6</td><td>24</td></tr>
<tr><td>More likely to result in dead-ends</td><td>28</td><td>3</td></tr>
<tr><td>Helped you learn more</td><td>1</td><td>31</td></tr>
<tr><td>Overall preference</td><td>2</td><td>29</td></tr>
</table>
<p>Source: Yee, Swearingen, Li, & Hearst</p>
<a id="p85" href="#"></a> <h3><b>User interface survey: Results (2)</b></h3>
<p>[Bar chart: ratings on a 0 to 9 scale for Google-like Baseline vs. Faceted Category. Categories: Easy to Use, Simple, Flexible, Tedious, Interesting, Easy to Browse, Enjoyable, Overwhelming. Data labels in extraction order: 6.0, 7.6, 6.7, 7.2, 4.7, 6.3, 4.6, 3.5, 5.8, 7.7, 5.5, 7.4, 6.0, 7.8, 4.0, 4.8.]</p>
<p>Source: Yee, Swearingen, Li, & Hearst</p>
<a id="p86" href="#"></a> <h3><b>Document distribution: How evenly does it divide the content?</b></h3>
<p>• Documents do not distribute uniformly across categories.</p>
<p>• Zipf (1/x) distribution is expected behavior.</p>
<p>• 80/20 rule in action (actually 70/20 rule).</p>
<p><b>Measured v Expected Distribution of Top 10 Content Types in Library of Congress Database</b></p>
<p>[Chart: y-axis 0 to 350,000; x-axis Top 10 Content Types. Recoverable labels: Congresses, Biography, Periodicals, Maps, Fiction, Exhibitions, Juvenile literature, Bibliography, Statistics. Callouts: “Leading candidate for splitting”; “Leading candidates for merging”.]</p>
<a id="p87" href="#"></a> <h3><b>Document distribution: How evenly does it divide the content?</b></h3>
<p>• <b>Methodology:</b> 115 randomly selected URLs from corporate intranet search index were manually categorized. Inaccessible files and ‘junk’ were removed.</p>
<p>• <b>Results:</b> Slightly more uniform than Zipf distribution. Above the curve is better than expected.</p>
<p><b>Measured v Expected Intranet Content Type Distribution</b></p>
<p>[Chart: y-axis 0 to 25; x-axis Content Type.]</p>
<a id="p88" href="#"></a> <h3><b>Document distribution: How does taxonomy “shape” match that of content?</b></h3>
<p>• Background: Hierarchical taxonomies allow comparison of “fit” between content and taxonomy areas.</p>
<p>• Methodology: 25,380 resources tagged with taxonomy of 179 terms. (Avg. 
of 2 terms per resource) Counts of terms and documents summed within taxonomy hierarchy.</p>
<p>• Results: Roughly Zipf distributed (top 20 terms: 79%; top 30 terms: 87%). Mismatches between term% and document% flagged.</p>
<table>
<tr><th>Term Group</th><th>% Terms</th><th>% Docs</th></tr>
<tr><td>Administrators</td><td>7.8</td><td>15.8</td></tr>
<tr><td>Community Groups</td><td>2.8</td><td>1.8</td></tr>
<tr><td>Counselors</td><td>3.4</td><td>1.4</td></tr>
<tr><td>Federal Funds Recipients and Applicants</td><td>9.5</td><td>34.4</td></tr>
<tr><td>Librarians</td><td>2.8</td><td>1.1</td></tr>
<tr><td>News Media</td><td>0.6</td><td>3.1</td></tr>
<tr><td>Other</td><td>7.3</td><td>2.0</td></tr>
<tr><td>Parents and Families</td><td>2.8</td><td>6.0</td></tr>
<tr><td>Policymakers</td><td>4.5</td><td>11.5</td></tr>
<tr><td>Researchers</td><td>2.2</td><td>3.6</td></tr>
<tr><td>School Support Staff</td><td>2.2</td><td>0.2</td></tr>
<tr><td>Student Financial Aid Providers</td><td>1.7</td><td>0.7</td></tr>
<tr><td>Students</td><td>27.4</td><td>7.0</td></tr>
<tr><td>Teachers</td><td>25.1</td><td>11.4</td></tr>
</table>
<p>Source: Courtesy Keith Stubbs, U.S. Dept. of Ed.</p>
<a id="p89" href="#"></a> <h3><b>Project Management</b></h3>
<p>Practice Area: Project management</p>
<p><b>Basic:</b> Project Plan; X-Functional Teams</p>
<p><b>Intermediate:</b> Std. Proj. Methodol.; Multi-Year Plan; Communication Plan; SMT Business Manager, instead of IT Manager</p>
<p><b>Advanced:</b> Early Termination</p>
<p><b>Limiting:</b> none listed</p>
<p>• <b>Project Plan:</b> Both companies are in a mode where maintaining the cataloging, terminology, and search tools is a matter of ongoing enhancement. Neither company discussed project management.</p>
<p>• <b>X-Functional Teams:</b> Very little cross-functional involvement was discussed. Some input from sales and cataloging for taxonomy revisions.</p>
<p>• <b>Std. 
Project Methodology:</b> Not at time of interviews.</p>
<p>• <b>Multi-year plan:</b> Not at time of interviews.</p>
<p>• <b>Communication Plan:</b> Not discussed.</p>
<p>• <b>SMT Business Manager:</b> Not discussed.</p>
<p>• <b>Early Termination:</b> Not discussed.</p>
<a id="p90" href="#"></a> <h3><b>Key Governance Aspects</b></h3>
<p>• Roles and Responsibilities: Managers; Reviewers</p>
<p>• Policies: For naming; Required Fields</p>
<p>• Procedures: For reviewing and approving metadata placement; For acting on poor metadata application</p>
<a id="p91" href="#"></a> <h3><b>Recommended Measure and Improve Mindset</b></h3>
<p>Measure – Determine current situation and what is wrong.</p>
<p>• Too many documents in a category? Too many categories? People complaining about not finding material that is on the site? People asking for materials not on the site? Common searches without results?</p>
<p>Decide – Decide how to change things to fix the problem.</p>
<p>• Change navigation list? Add new categories? Add synonyms to search? 
Create new content?</p>
<p>Confirm – Before rolling out changes, test them to make sure they will actually fix the problem.</p>
<p>• Usability tests, Card sorts, Internal functionality tests, …</p>
<p>Implement – Roll out the changes.</p>
<p>Repeat – Monitor people’s behavior on the site as well as responding to reported problems.</p>
<p>• Query log examination, Clicktrail examination, Google search result position, Stakeholder feedback, User surveys, Site analytics, etc.</p>
<a id="p92" href="#"></a> <h3><b>Taxonomy team: Generic roles</b></h3>
<p><b>Stakeholder Committee / Content Owners</b></p>
<p>• Keeps team on track with larger business objectives.</p>
<p>• Reality check on process change suggestions.</p>
<p>• Balances cost/benefit issues to decide appropriate levels of effort.</p>
<p>• Obtains needed resources if those on committee can’t accomplish a particular task.</p>
<p><b>Business Lead / Technical Specialist</b></p>
<p>• Estimates costs of proposed changes in terms of amount of data to be retagged, additional storage and processing burden, software changes, etc.</p>
<p>• Helps obtain data from various systems.</p>
<p>• Committee’s liaison to content creators.</p>
<p>• Estimates costs of proposed changes in terms of editorial process changes, additional or reduced workload, etc.</p>
<p><b>Content Specialist / Taxonomy Specialist</b></p>
<p>• Suggests potential taxonomy changes based on analysis of query logs, indexer feedback.</p>
<p>• Makes edits to taxonomy, installs into system with aid of IT specialist.</p>
<a id="p93" href="#"></a> <h3><b>Taxonomy governance environment</b></h3>
<p>[Diagram: Taxonomy Governance Environment. Recoverable labels: Change Requests & Responses; Published Facets; Consuming Applications; Custodians; Notifications; Vocabulary Management System; CVs; ISO 3166-1; ERP; Other External; Other Internal; Other Controlled Items; Web CMS; Archives; Intranet Search; Intranet Nav.; ERMS; DAM; …]</p>
<p>1: External vocabularies change on their own schedule, with some advance notice.</p>
<p>2: Team decides when to update facets within Taxonomy.</p>
<p>3: Team adds value via mappings, translations, synonyms, training materials, etc.</p>
<p>4: Updated versions of facets published to consuming applications.</p>
<p>CV (Controlled Vocabulary) – The list of values for one facet in the Taxonomy.</p>
<a id="p94" href="#"></a> <h3><b>Taxonomy maintenance processes</b></h3>
<p>• Different organizations will have different change processes.</p>
<p>• Organization 1: A custodian is responsible for the content, but checks facts with department heads before making changes.</p>
<p>• Organization 2: Marketing reps ask for a change, taxonomy editor makes demo, web representative approves it.</p>
<p>• Organization 3: Analysts suggest changes, editors approve, copyeditors verify consistency.</p>
<a id="p95" href="#"></a> <h3><b>Sample taxonomy maintenance workflow</b></h3>
<p>[Workflow diagram. Recoverable labels: roles Analyst, Editor, Copywriter, Sys Admin; steps Suggest new name/category, Review new name, Copy edit new name, Add to enterprise Taxonomy; two decision points labeled “Problem?” with Yes/No branches; system labeled Taxonomy Tool.]</p>
<a id="p96" href="#"></a> <h3><b>Where taxonomy change suggestions come from</b></h3>
<p><b>Application Logic; Query log analysis notes ‘missing’ concepts; Recommendations by Editor; 1. Small taxonomy changes (labels, synonyms) 2. 
Large taxonomy changes (retagging, application changes) 3. New “best bets” content.</b></p>
<p><b>Taxonomy Team; Team Considerations: 1. Business goals. 2. Changes in user experience. 3. Retagging cost.</b></p>
<p><b>Requests from other parts of the organization</b></p>
<a id="p97" href="#"></a> <h3><b>Executive Support</b></h3>
<p>Practice Area: Executive support and ROI</p>
<p><b>Basic:</b> External Search ROI; SMT in separate silos</p>
<p><b>Intermediate:</b> Intranet ROI Model</p>
<p><b>Advanced:</b> CEO knows Search ROI</p>
<p><b>Limiting:</b> Use it or Lose It Budgets</p>
<p>• <b>External Search ROI:</b> Both Corbis and Getty Images have very clear and compelling ROI stories for external search.</p>
<p>• <b>SMT in separate silos:</b> Both Corbis and Getty Images have moved beyond this practice.</p>
<p>• <b>Intranet ROI model:</b> Not at time of interviews.</p>
<p>• <b>CEO knows search ROI:</b> Yes, both Corbis and Getty Images have CEOs who know the ROI story for external search, but there was no ROI analysis for the intranet at the time of the interviews.</p>
<p>• <b>Use it or lose it budgets:</b> Neither Corbis nor Getty Images discussed budget details.</p>
<a id="p98" href="#"></a> <h3><b>Agenda</b></h3>
<p><b>9:15 Metadata Definitions; 9:30 Maturity Models; 9:45 Metadata Maturity Model (ca. 2006); 10:15 Break; 10:30 Stock Photo Business; 10:40 Data Governance Practices in Stock Photo Agencies; 11:40 Summary; 11:45 Questions; 12:00 Adjourn</b></p>
<a id="p99" href="#"></a> <h3><b>Recommended Reading</b></h3>
<p><b>CMMI: http://chrguibert.free.fr/cmmi (Official site is http://www.sei.cmu.edu/cmmi/, but that is not the most comprehensible.)</b></p>
<p><b>Joel Test: http://www.joelonsoftware.com/articles/fog0000000043.html</b></p>
<p><b>EIA Roadmap: http://www.louisrosenfeld.com/presentations/031013-KMintranets.ppt</b></p>
<p><b>Enterprise Search Report: http://www.cmswatch.com/EntSearch/</b></p>
<a id="p100" href="#"></a> <h3><b>Fun Questions</b></h3>
<p><b>The animals are divided into: (a) belonging to the emperor, (b) embalmed, (c) tame, (d) sucking pigs, (e) sirens, (f) fabulous, (g) stray dogs, (h) included in the present classification, (i) frenzied, (j) innumerable, (k) drawn with a very fine camelhair brush, (l) et cetera, (m) having just broken the water pitcher, (n) that from a long way off look like flies.</b></p>
<p><b>Jorge Luis Borges, "THE ANALYTICAL LANGUAGE OF JOHN WILKINS", Works in 3 volumes (in Russian). St. Petersburg, "Polaris", 1994. V. 2: 87.</b></p>
<p><b>This was created to be as bad a classification as possible. What makes it so bad?</b></p>
<a id="p101" href="#"></a> <p>Taxonomy Strategies LLC</p>
<p><b>Contact Info</b></p>
<p>Ron Daniel, Jr.</p>
<p>925-368-8371 rdaniel@taxonomystrategies.com</p>
<p>Sept. 10, 2008 Copyright 2008 Taxonomy Strategies LLC. 
All rights reserved.</p> </div> </section> </div> </div> </div> </main> </body> </html>