GlobalWisdom Software

Bravo™ Reviewer for
Online Editors
Abhijit Patil
Company Overview
• GlobalWisdom, Inc. sells software that dramatically
increases the ROI (Return on Investment) of
enterprise information management systems by
capturing, organizing and sharing individual and
collective knowledge.
• Software enhances content and knowledge
management systems by leveraging existing
workflows and continuously adapting information
structure to foster critical thinking
• Software is not smart; people are
• Software that captures the distributed expertise of
people working with content.
Bravo™ Reviewer for Online Editors
• Workflow-based solution for the online publishing
market.
• Effectively leverages the expertise of editors in
charge of adding metadata to content.
• Online publishers depend on high quality metadata to
make sure their readers find the information they
need.
• Bravo makes the process of adding metadata more
efficient and effective, so publishers can maximize
their return on investment when producing
value-added content.
The Value of Bravo™,
Software that Learns
• Producing high-quality metadata is expensive.
• The best metadata comes from editorial oversight by
employees who understand
– The content
– The taxonomy (i.e., subject matter hierarchy) being
applied to the content
– The customers whose business needs are being
served.
• Auto-classification
– Reduces cost
– Without human oversight and control, creates
metadata that makes more sense to an algorithm
than to a human being.
The Value of Bravo™,
Software that Learns
(Cont.)
• Bravo combines
– The efficiency of auto-classification
– The expertise of in-house editorial staff
• Editors train the auto-classification software within
their ordinary workflow. This improves the performance
of the auto-classification algorithms, so that businesses
can achieve a more cost-effective balance between
the two.
Bravo™ System
How?
• Auto-classifier recommends topics as metadata tags,
which reviewers can quickly approve or change.
• This allows publishers to off-load the rote work to the
software
• Gives reviewers tools to fine-tune the taxonomy while
reviewing content.
• Can add negative or positive examples to the training
set with a click of the mouse
• Easily create new topics or subtopics.
• These tools allow reviewers to produce better-quality
metadata.
Bravo™ Modules
• The Bravo Module
• Concept Based Search
– The difference between search and retrieval
• Content Classifier
• Content Indexer
• K42 Topic Map Engine
The Bravo Module
(Features)
Profiling
• Based on user/system interaction & direct input
– Gathers feedback from how people already work with
information
• Identification of subject area experts
• Recommendation of relevant topics or documents
– GlobalWisdom's search module is concept-based: results
are returned by the ideas expressed in a document, not
simply by keywords
• Information alerts on preset topics, or relevant to the
work at hand
• Automatic, accurate content routing
The Bravo Module
(Features)
Search
• Search by selecting an example, from a sentence to a
full document
• Merging two or more examples
• Free text
System
• Unlimited hierarchy depth
• Augments performance of existing content
management systems
• Integrates seamlessly with legacy data storage
systems
Concept-based Search
Module™
• Search Module is concept-based, not keyword-based.
• Can find highly relevant documents in which query
terms do not appear.
• Users can phrase queries in the ways that make
sense to them, rather than adapting to the limitations
of the search engine.
• Based on patented Latent Semantic Indexing (LSI),
which has demonstrated a 30% increase in accuracy
(Telcordia, 1999) over keyword techniques.
• Leverages user feedback to become fully adaptive.
Concept-based Search
Module™ (Cont.)
Sample Applications
• Corporate intranet - ensure that employees across
the organization can find the information they need.
Maximize reuse of information, minimize duplication
of effort.
• Workgroup collaboration - improve support for
research, analysis, project management, and more.
• Publishers - ensure that customers can locate the
valuable content they need.
• Corporate extranet - improve accessibility to timely,
accurate information for suppliers, end users and
partners
Concept-based Search
Module™ (Cont.)
Features:
• Search by example.
– Select a piece of text, from a sentence to an entire
document, to serve as a search query.
• Search by merging examples.
– Combine two examples to find documents that are relevant
to each.
• 100% automatic.
– Does not require thesaurus or controlled vocabulary (works
with multiple languages, without the need for translation).
• Concept-based relevancy ranking.
• Highly effective with Optical Character Recognition (OCR).
• Retains a high level of accuracy with small amounts of text,
e.g., photo captions.
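Search by example and search by merging examples can be illustrated with a small sketch. It assumes plain bag-of-words vectors and cosine ranking purely for brevity; the product describes a concept-based space instead, and the function names here (`vectorize`, `merge_examples`, `search`) are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words unit vector for a piece of text (any length)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {w: c / norm for w, c in counts.items()} if norm else {}


def merge_examples(*examples):
    """Sum the unit vectors of several example texts into one query vector."""
    merged = Counter()
    for text in examples:
        for word, weight in vectorize(text).items():
            merged[word] += weight
    return dict(merged)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query_vec, documents):
    """Return ids of matching documents, most similar first."""
    scored = [(cosine(query_vec, vectorize(text)), doc_id)
              for doc_id, text in documents.items()]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [doc_id for score, doc_id in ranked if score > 0]


documents = {
    "a": "fda drug approval",
    "b": "stock market rally",
    "c": "fda ruling moves drug stock prices",
}
# Merge two examples; the document relevant to both should rank first.
query = merge_examples("fda drug", "stock prices")
results = search(query, documents)
```

Normalizing each example before summing keeps a long example from drowning out a short one, which is one reasonable reading of "relevant to each".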
Latent Semantic Indexing
• Uses singular-value decomposition.
• We take a large matrix of term-document association data and
construct a "semantic" space wherein terms and documents that
are closely associated are placed near one another.
• Singular-value decomposition allows the arrangement of the
space to reflect the major associative patterns in the data, and
ignore the smaller, less important influences.
• As a result, terms that did not actually appear in a document
may still end up close to the document, if that is consistent with
the major patterns of association in the data.
• Position in the space then serves as a new kind of semantic
index. Retrieval proceeds by using the terms in a query to
identify a point in the space; documents in its neighborhood
are returned to the user.
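The steps on this slide can be sketched on a toy corpus. This is an illustration under stated assumptions, not the product's code: the six-term vocabulary and six documents are invented, and the singular-value decomposition is approximated with simple power iteration (adequate only for tiny matrices) so that the example needs nothing beyond the standard library.

```python
import math
import random


def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def truncated_svd(A, k, iters=100, seed=0):
    """Approximate the k largest singular triplets (sigma, u, v) of A
    by power iteration on R^T R, deflating the residual R each time."""
    rng = random.Random(seed)
    R = [row[:] for row in A]
    triplets = []
    for _ in range(k):
        Rt = transpose(R)
        v = [rng.gauss(0, 1) for _ in R[0]]
        for _ in range(iters):
            w = matvec(Rt, matvec(R, v))
            n = norm(w)
            if n == 0:
                break
            v = [x / n for x in w]
        Rv = matvec(R, v)
        sigma = norm(Rv)
        if sigma == 0:
            break
        u = [x / sigma for x in Rv]
        triplets.append((sigma, u, v))
        # Remove the component just found so the next pass finds the next one.
        R = [[R[i][j] - sigma * u[i] * v[j] for j in range(len(v))]
             for i in range(len(u))]
    return triplets


def project(term_vec, triplets):
    """Coordinates of a term-space vector in the k-dimensional semantic space."""
    return [sum(ui * x for ui, x in zip(u, term_vec)) for _, u, _ in triplets]


def cosine(a, b):
    na, nb = norm(a), norm(b)
    return sum(x * y for x, y in zip(a, b)) / (na * nb) if na and nb else 0.0


# Invented toy data: three vehicle documents and three plant documents.
TERMS = ["car", "automobile", "engine", "flower", "petal", "garden"]
DOCS = ["car engine", "automobile engine", "car automobile",
        "flower petal", "flower garden", "petal garden"]


def term_vector(text):
    words = text.split()
    return [float(words.count(t)) for t in TERMS]


def lsi_search(query, k=2):
    """Similarity of the query to every document in the reduced space."""
    A = transpose([term_vector(d) for d in DOCS])  # rows = terms, columns = documents
    triplets = truncated_svd(A, k)
    q = project(term_vector(query), triplets)
    return [cosine(q, project(term_vector(d), triplets)) for d in DOCS]


similarities = lsi_search("automobile")
```

In this toy corpus the query "automobile" shares no term with the document "car engine", so their keyword similarity is zero, yet the two land close together in the two-dimensional space because "car" and "automobile" co-occur with the same terms elsewhere. The plant documents stay far from the query.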
D-GPS™ Content
Classification Module
• Classifier uses a model-based approach
• Current algorithms simply analyze the text in a
document. But different people will interpret the
meaning of that information in their own way.
• Patent-pending algorithm, D-GPS, improves
accuracy and relevance by simultaneously analyzing
the text and the hierarchy, or hierarchies, that reflect
the understanding your business brings to
information.
• Scalability – Handles large document sets with ease
• Integrates with an existing hierarchy or with a custom one
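The general idea of scoring a topic from both the text and its place in a hierarchy can be sketched as below. D-GPS itself is not described in these slides, so this is not that algorithm: the `Topic` class, the keyword-overlap score, and the `parent_weight` discount are all hypothetical choices made only to show how a hierarchy can resolve an ambiguity that the text alone cannot.

```python
class Topic:
    """A node in a subject hierarchy (hypothetical structure)."""

    def __init__(self, name, keywords, parent=None):
        self.name = name
        self.keywords = set(keywords)
        self.parent = parent


def score(topic, words, parent_weight=0.5):
    """Word overlap with the topic plus a discounted score from its ancestors."""
    own = float(len(topic.keywords & words))
    if topic.parent is None:
        return own
    return own + parent_weight * score(topic.parent, words, parent_weight)


def classify(text, candidates, parent_weight=0.5):
    """Pick the candidate topic with the highest combined score."""
    words = set(text.lower().split())
    best = max(candidates, key=lambda t: score(t, words, parent_weight))
    return best.name


biotech = Topic("Biotech", ["fda", "drug"])
legal = Topic("Legal", ["court", "judge"])
clinical = Topic("Clinical trials", ["trial"], parent=biotech)
court = Topic("Court trials", ["trial"], parent=legal)

text = "the fda halted the drug trial"
words = set(text.split())
chosen = classify(text, [clinical, court])
```

With `parent_weight=0` both leaf topics match only the word "trial" and tie; once the parent topics contribute, the match on "fda" and "drug" under Biotech tips the decision toward "Clinical trials".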
GlobalWisdom Indexer
• Essential component of the Bravo™ engine, which
leverages user feedback to deliver fully adaptive
enterprise workflow solutions for publishers, content
delivery, knowledge management and enterprise
portals.
• Features:
– Fast, reliable and accurate indexing of content using Java
and SQL DBMS.
– 100% Java.
– Parallel threads using multiple DBMS connections.
– All common formats, including text, HTML, XML, etc. New
formats available on request.
– Optional Crawler API, with support for focused crawl
K42 Topic Map Engine
• Using the Topic Map standard, K42 captures the relationships
and associations that connect your data within a "knowledge
layer."
• Allows a greater level of meaning to be represented within a
content or knowledge management system, improving the
performance of critical information applications.
– For example, Topic Maps can represent that pending FDA
regulations have influenced a biotech company to launch a new
marketing campaign, and that both have contributed to a rise in
stock prices.
• The knowledge layer is a neutral format that works on top of any
proprietary system, easily bridging multiple formats and sources.
• When integrated with the Bravo™ engine, K42 leverages user
feedback to become fully adaptive.
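The FDA example on this slide can be written down as a small graph of topics and typed associations. The sketch below is a hypothetical in-memory structure, not K42's API; the Topic Maps standard also models names, occurrences and association roles, which are omitted here.

```python
from collections import defaultdict


class TopicMap:
    """Minimal store of topics and typed associations (illustrative only)."""

    def __init__(self):
        self.topics = set()
        self.outgoing = defaultdict(list)  # topic -> [(association type, other topic)]

    def associate(self, source, assoc_type, target):
        """Record that `source` is linked to `target` by `assoc_type`."""
        self.topics.update((source, target))
        self.outgoing[source].append((assoc_type, target))

    def related(self, topic, assoc_type=None):
        """Topics directly associated with `topic`, optionally filtered by type."""
        return [t for a, t in self.outgoing.get(topic, [])
                if assoc_type in (None, a)]

    def reachable(self, start):
        """All topics reachable from `start` by following associations."""
        seen, stack = set(), [start]
        while stack:
            current = stack.pop()
            for _, nxt in self.outgoing.get(current, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


tm = TopicMap()
tm.associate("pending FDA regulations", "influenced", "marketing campaign")
tm.associate("pending FDA regulations", "contributed to", "rise in stock prices")
tm.associate("marketing campaign", "contributed to", "rise in stock prices")

causes_of_campaign = tm.related("pending FDA regulations", "influenced")
```

Because the associations live apart from the documents themselves, the same structure can sit on top of content held in different systems, which is the role the slide assigns to the knowledge layer.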
References
• Project Home Page
– http://www.globalwisdom.org/products/index.htm
– http://www.globalwisdom.org/products/modules.htm
• Indexing by Latent Semantic Analysis
– http://lsi.argreenhouse.com/lsi/papers/JASIS90.pdf
– http://lsi.argreenhouse.com/lsi/papers/execsum.html