Dataware Products Direction Presented to BRS North American Users Group Meeting

August 27th, 1999
Dave Schubmehl
Dataware Product Suite
• BRS/NetAnswer
– Web and client/server based information retrieval for secure, richly structured text content.
• InQuery
– Probabilistic information retrieval, text mining, and filtering.
• Publisher
– Electronic publishing to CDs and the Web.
• Knowledge Management Suite
– A foundation for enterprise knowledge management applications.
• Query Server
– Meta-search engine for combining search results from multiple Web-based search services.
Product Development Mission
• “Develop Dataware’s best-of-breed products and technology into a suite of interoperating, distributed components that will serve as the foundation for high performance, highly scalable information management, retrieval, and publishing applications.”
Requirements
• Interoperability and extensibility using a common standards-based framework (XML, etc.)
• Support for extremely large, distributed collections
• Flexible application development and administration framework
• Cross-platform portability (Unix and NT)
• Multi-lingual/cross-lingual support
• Advanced IR and text mining features
– Concept extraction / document summarization
– Relationship analysis
– Auto-categorization and filtering
– Clustering
Product Vision (12 - 24 Months)
[Architecture diagram; label grouping is approximate. Layers: Applications (“Solutions”); Dataware Application Components; Core Server Components; Frameworks. Component labels: Text Mgr, Search Server, Filtering Server, Text Miner, Profiler, Categorization Server, Index, Parser, Source Cartridges, Agent, Publisher, Collaboration Server, Messaging Server, Multimedia Mgr, Event Tracker, HTML/XML Emitter, Query Processor, Directory Server, RDBMS, Web Server, Query Broker, 3rd Party.]
Frameworks: Administration, Distribution, Security, Internationalization, Interoperability, Exception Handling, Logging, Application Development
Getting There
[Roadmap diagram; timeline columns: Present, Q4 1999, Q2 2000, Q4 2000. Versions are listed in the order shown.]
Publisher: 3.0, 3.1, 4.0
BRS: 6.3, 7.0, 7.1
KMS: 2.0, 2.1, 2.5
InQuery: 5.1, 5.2, 5.3
Query Server: 2.0, 2.1, 2.2
Other labels in the diagram: Publisher, Doc Manager, Text Miner, Search Server, Profiler, Agent Server
BRS/Search 6.3 Improvements
• Patches
– currently at 90+
• New Features
– sort limit increased to 16 million documents
– HP-UX 11 32-bit and 64-bit port changes
– piece qualification by range
• Performance
– faster truncation and numeric range searching
– improved sort performance
– faster database loading
– faster NEAR searches
BRS/Search 6.3 Fixes
• inverted file growth when INVERTALIGN=8
• NT file locking
• pattern matching fixes
• opening more than 256 files on Solaris
• paragraph qualification fixes
Development Plans
• BRS/Search 7.0
– scheduled for Q4 1999
– incorporate 6.3 patch levels
– additional features
– incorporate NetAnswer II into release schedule
• BRS/Search 7.1
– planned for mid-year 2000
BRS/Search V7.0 Features
• Database
– auto loading / build error record file
– dynamic database concatenation
– improved error checking
• Search Engine / Thesaurus
– multi-level explosion of thesaurus terms
– case-sensitive searching
– improved ranking speed with large sets of query terms
• Easier Installation & Administration
– GUI-based administration tool
– database creation wizards
– simpler / faster wizard-based installation
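As an illustration of what “multi-level explosion of thesaurus terms” could mean in practice, here is a minimal sketch that expands a search term with its narrower terms down to a chosen number of levels. The thesaurus layout (a plain dictionary of narrower terms), the function name `explode`, and the sample terms are all assumptions made for this example; the slides do not describe how BRS/Search stores or traverses its thesaurus.

```python
from collections import deque


def explode(term, narrower, max_levels):
    """Return `term` plus its narrower terms, up to `max_levels` levels down.

    `narrower` is a hypothetical thesaurus: a dict mapping a term to the
    list of its immediately narrower terms. Terms are returned in
    breadth-first order and each term appears only once.
    """
    seen = {term}
    expanded = [term]
    queue = deque([(term, 0)])
    while queue:
        current, level = queue.popleft()
        if level == max_levels:
            continue  # do not descend below the requested depth
        for child in narrower.get(current, []):
            if child not in seen:
                seen.add(child)
                expanded.append(child)
                queue.append((child, level + 1))
    return expanded


# Hypothetical sample thesaurus, for illustration only.
thesaurus = {
    "vehicle": ["car", "truck"],
    "car": ["sedan", "coupe"],
    "sedan": ["compact sedan"],
}

one_level = explode("vehicle", thesaurus, 1)
two_levels = explode("vehicle", thesaurus, 2)
```

A single-level explosion would search on “vehicle”, “car”, and “truck”; a two-level explosion adds “sedan” and “coupe” as well. The expanded list would then typically be combined with OR in the query.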
NetAnswer II 7.0 Development Plans
• New release
– based on BRS/Search 7.0
– scheduled for Q4 1999
– new features include:
- soundex support
- faster ranking
- multiple term selection for index, thesaurus
– possible features include:
- multi-server support
- online update via web
- NSAPI / ISAPI support
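For readers unfamiliar with the “soundex support” item above, the sketch below shows the classic American Soundex code, which maps names that sound alike to the same four-character key. This is an assumption about the general technique only: the slides do not say which Soundex variant NetAnswer II implements or how it is exposed in queries.

```python
def soundex(name: str) -> str:
    """Return the four-character American Soundex code for `name`."""
    codes = {}
    for letters, digit in (
        ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
        ("L", "4"), ("MN", "5"), ("R", "6"),
    ):
        for ch in letters:
            codes[ch] = digit

    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0]                # the first letter is kept as-is
    prev = codes.get(letters[0], "")   # its code still suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue                   # H and W do not separate equal codes
        code = codes.get(ch, "")       # vowels (and Y) have no code
        if code and code != prev:
            result += code
        prev = code                    # a vowel resets the previous code
    return (result + "000")[:4]       # pad with zeros, truncate to 4


same_key = soundex("Robert") == soundex("Rupert")
```

With this scheme a search for “Robert” would also match “Rupert”, since both encode to the same key.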
BRS/Search & NetAnswer II
Platforms
• Sun Solaris 2.7
• HP-UX 11
• Compaq (Digital) Unix
• Windows NT 4
• IBM RS/6000 AIX
• Red Hat Linux?
BRS/Search 7.1 (Q2/Q3, 2000)
• Emphasis on componentization, with work in:
– Source Interface Management (“Cartridges”)
– Interoperability with KMS & InQuery components
– Administrative framework
– Distributed framework
– Large database performance
InQuery
Characteristics of Solutions
• Web-based probabilistic search (à la Infoseek, AltaVista, Lycos…)
• Intranet or domain-specific Internet search
• Casual search (Web search sites, broad user base, novice searchers)
• No stringent security requirements
• “Discovery” versus “Find” application
• Access to Web, File System, ODBC, and basic Lotus Notes sources
• English language only
InQuery Strengths
• Accurate probabilistic searching (relevance ranking)
• Distributed architecture gives good scalability (multi-host, cross-host)
• “Best passage” summarization
• Concept mining (find related people, companies, and concepts)
• Good Web crawling (including Domino content)
• Simple installation and Web-based administration
• Rich query language
KMS
Characteristics of Solutions
• Cross-repository content search and organization (“enterprise KM”)
• Hierarchical content navigation
• Document management (“contributions”)
• Relatively stringent security
• Windows NT
• Expert identification
• Notification agents
KMS Strengths
• Integrated KM solution
• Expert identification (“tacit knowledge discovery”)
• Integrated “lightweight” document management
• Sophisticated Lotus Notes Cartridge
• Security integrated with industry standards (LDAP)
• Built-in agent technology
• Complete enterprise KM solution
Knowledge Query Server
Characteristics of Solutions
• Complete results across Web search services
• Server-based meta-searching
• Need to search both external and internal sources of information, unifying the results
Query Server Strengths
• Server-based configuration for many users
• Multi-threaded searching for high performance
• Document results clustering
• Many options for uniform relevance ranking of results from different search services
• Simple configuration wizard for new search services
• International language support, including double-byte character sets (currently works in Japanese!)
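To make the idea of “uniform relevance ranking of results from different search services” concrete, here is a minimal sketch of one common approach: rescale each service's scores to the range 0 to 1 (min-max normalization), then merge the lists and keep the best score for any document returned by more than one service. This is only one of many possible schemes and is not taken from Query Server itself; the function names, result format, and sample URLs are hypothetical.

```python
def normalize(results):
    """Rescale one service's (url, score) pairs to the range 0.0 to 1.0."""
    scores = [score for _, score in results]
    low, high = min(scores), max(scores)
    span = high - low
    return [
        (url, 1.0 if span == 0 else (score - low) / span)
        for url, score in results
    ]


def merge(result_lists):
    """Merge normalized results, keeping each URL's best score.

    Returns (url, score) pairs sorted by descending score, with the URL
    as a tie-breaker so the ordering is deterministic.
    """
    best = {}
    for results in result_lists:
        if not results:
            continue  # a service that returned nothing contributes nothing
        for url, score in normalize(results):
            if score > best.get(url, -1.0):
                best[url] = score
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results from two services that use different score scales.
service_a = [("a.example/1", 900), ("a.example/2", 450), ("shared.example", 0)]
service_b = [("shared.example", 0.8), ("b.example/1", 0.2)]

merged = merge([service_a, service_b])
```

Here the two services score on incompatible scales (0 to 900 versus 0 to 1), yet after normalization the top hit from each service ranks equally, and the shared document is listed once with its better score.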
Questions?