Dataware Products Direction Presented to BRS North American Users Group Meeting
Dataware Products Direction
Presented to BRS North American Users Group Meeting
August 27th, 1999
Dave Schubmehl
Dataware Product Suite
BRS/NetAnswer
– Web and Client/Server based Information Retrieval for secure, richly structured text content.
InQuery
– Probabilistic Information Retrieval, Text Mining, and Filtering
Publisher
– Electronic publishing to CDs and the Web.
Knowledge Management Suite
– A foundation for enterprise knowledge management applications.
Query Server
– Meta-search engine for combining search results from multiple Web-based search services.
Product Development Mission
“Develop Dataware’s best-of-breed products
and technology into a suite of interoperating,
distributed components that will serve as the
foundation for high performance, highly scalable
information management, retrieval, and
publishing applications.”
Requirements
Interoperability and extensibility using common standards-based framework
(XML, etc.)
Support for extremely large, distributed collections
Flexible application development and administration framework
Cross-platform portability (Unix and NT)
Multi-lingual/Cross-lingual support
Advanced IR and text mining features
– Concept extraction / Document summarization
– Relationship analysis
– Auto-categorization and filtering
– Clustering
Product Vision (12 - 24 Months)
[Diagram: layered architecture. Layer and legend labels: Applications (“Solutions”); Application Components; Core Server Components; Dataware; 3rd Party; Frameworks.]
Component labels shown in the diagram:
– Text Mgr, Multimedia Mgr, Text Miner, Profiler, Agent, Event Tracker
– Search Server, Filtering Server, Categorization Server, Publisher Server, Collaboration Server, Messaging Server
– Source Cartridges, Index, Parser, Query Processor, Query Broker, HTML/XML Emitter
– Directory Server, RDBMS, Web Server
Frameworks
– Administration, Distribution, Security, Internationalization, Interoperability, Exception Handling, Logging, Application Development
Getting There
[Roadmap diagram. Timeline columns: Present; Q4, 1999; Q2, 2000; Q4, 2000.]
Release sequence shown for each product line:
– Publisher: 3.0, 3.1, 4.0
– BRS: 6.3, 7.0, 7.1
– KMS: 2.0, 2.1, 2.5
– InQuery: 5.1, 5.2, 5.3
– Query Server: 2.0, 2.1, 2.2
Components shown at the end of the timeline:
– Publisher, Doc Manager, Text Miner, Search Server, Profiler, Agent Server
BRS/Search 6.3 Improvements
Patches
– currently at 90+
New Features
– sort limit increased to 16 million documents
– HP-UX 11 32-bit and 64-bit port changes
– piece qualification by range
Performance
– faster truncation and numeric range searching
– improved sort performance
– faster database loading
– faster NEAR searches
BRS/Search 6.3 Fixes
inverted file growth when INVERTALIGN=8
NT file locking
pattern matching fixes
open more than 256 files on Solaris
paragraph qualification fixes
Development Plans
BRS/Search 7.0
– scheduled for Q4 1999
– incorporate 6.3 patch levels
– additional features
– incorporate NetAnswer II into release schedule
BRS/Search 7.1
– planned for mid-2000
BRS/Search V7.0 Features
Database
– auto loading / build error record file
– dynamic database concatenation
– improved error checking
Search Engine / Thesaurus
– multi-level explosion of thesaurus terms
– case sensitive searching
– improved ranking speed with large sets of query terms
Easier Installation & Administration
– GUI-based administration tool
– database creation wizards
– simpler / faster wizard based installation
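To illustrate what "multi-level explosion of thesaurus terms" could mean in practice, here is a minimal sketch in Python. It assumes a thesaurus stored as a plain mapping from a term to its narrower terms; the `explode` function, the `narrower` mapping, and the sample terms are all hypothetical and are not taken from BRS/Search.

```python
from collections import deque


def explode(term, narrower, max_levels):
    """Return `term` plus its narrower terms down to `max_levels` levels.

    `narrower` is a hypothetical thesaurus: term -> list of narrower terms.
    Terms are returned in breadth-first order, without duplicates.
    """
    seen = {term}
    expanded = [term]
    queue = deque([(term, 0)])
    while queue:
        current, level = queue.popleft()
        if level == max_levels:
            continue  # do not descend below the requested depth
        for child in narrower.get(current, []):
            if child not in seen:  # also guards against cycles
                seen.add(child)
                expanded.append(child)
                queue.append((child, level + 1))
    return expanded


# Hypothetical sample thesaurus
narrower = {
    "vehicle": ["car", "truck"],
    "car": ["sedan", "coupe"],
    "sedan": ["compact sedan"],
}

print(explode("vehicle", narrower, 1))
print(explode("vehicle", narrower, 2))
```

A single-level explosion adds only the direct narrower terms; raising `max_levels` pulls in deeper levels, and the resulting list would typically be ORed together in the search.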
NetAnswer II 7.0 Development Plans
New release
– based on BRS/Search 7.0
– scheduled for Q4 1999
– new features include:
- soundex support
- faster ranking
- multiple term selection for index, thesaurus
– possible features include:
- multi-server support
- online update via web
- NSAPI / ISAPI support
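The slides do not say which Soundex variant NetAnswer II would use. As a point of reference, the sketch below implements the widely used American Soundex rules in Python: keep the first letter, encode the remaining consonants as digits, collapse adjacent equal codes, and pad to four characters. The function name `soundex` is an illustrative choice.

```python
# Letter groups for American Soundex; vowels, Y, H and W have no code.
CODES = {}
for digit, letters in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word):
    """Return the four-character American Soundex key for `word`."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    key = [first]
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            key.append(code)
        prev = code  # a vowel resets prev, so a repeated code counts again
    return ("".join(key) + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))
```

Names that sound alike map to the same key, so a search on one spelling can also retrieve the other.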
BRS/Search & NetAnswer II
Platforms
Sun Solaris 2.7
HP-UX 11
Compaq (Digital) Unix
Windows NT 4
IBM RS/6000 AIX
Red Hat Linux?
BRS/Search 7.1 (Q2/Q3, 2000)
Emphasis on componentization, with work in:
– Source Interface Management (“Cartridges”)
– Interoperability with KMS & InQuery components
– Administrative framework
– Distributed framework
– Large database performance
InQuery
Characteristics of Solutions
Web-based probabilistic search (a la Infoseek,
Altavista, Lycos…)
Intranet or domain specific Internet search
Casual search (Web search sites, broad user
base, novice searchers)
No stringent security requirements
“Discovery” versus “Find” application.
Access to Web, File System, ODBC, and basic
Lotus Notes sources
English language only
InQuery Strengths
Accurate probabilistic searching (relevance
ranking)
Distributed architecture gives good scalability
(multi-host, cross-host)
“Best passage” summarization
Concept mining (find related people,
companies, and concepts)
Good Web crawling (including Domino content)
Simple installation and Web-based
administration
Rich query language
KMS
Characteristics of Solutions
Cross-repository content search and
organization (“enterprise KM”)
Hierarchical content navigation
Document management (“contributions”)
Relatively stringent security
Windows NT
Expert identification
Notification agents
KMS Strengths
Integrated KM solution
Expert Identification “tacit knowledge discovery”
Integrated “lightweight” document management
Sophisticated Lotus Notes Cartridge
Security integrated with industry standards
(LDAP)
Built-in agent technology
Complete enterprise KM solution
Knowledge Query Server
Characteristics of Solutions
Complete results across web search services
Server based meta-searching
Need to search both external and internal
sources of information, unifying the results
Query Server Strengths
Server based configuration for many users
Multi-threaded searching for high performance
Document results clustering
Many options regarding uniform relevance
ranking of results from different search services
Simple configuration wizard for new search
services
International language support including double
byte - currently works in Japanese!
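One common way to get uniform relevance ranking across search services is to rescale each service's scores to a shared range before merging. The Python sketch below shows that idea with min-max normalization. It is not Query Server's actual method, and the function names, URLs, and scores are hypothetical.

```python
def normalize(results):
    """Rescale one service's (url, score) pairs to the range 0.0 - 1.0."""
    scores = [score for _, score in results]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: treat every hit as equally relevant.
        return [(url, 1.0) for url, _ in results]
    return [(url, (score - lo) / (hi - lo)) for url, score in results]


def merge(result_lists):
    """Merge several services' results into one ranked list.

    A URL returned by more than one service keeps its best normalized score.
    Ties are broken alphabetically by URL so the output is deterministic.
    """
    best = {}
    for results in result_lists:
        if not results:
            continue
        for url, score in normalize(results):
            if score > best.get(url, -1.0):
                best[url] = score
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results: service A scores 0-100, service B scores 0-1.
service_a = [("a.example/1", 90), ("a.example/2", 50), ("shared.example", 10)]
service_b = [("shared.example", 0.8), ("b.example/1", 0.2)]

for url, score in merge([service_a, service_b]):
    print(f"{score:.2f}  {url}")
```

Min-max scaling is only one of many possible options. Rank-based merging, or weighting each service by how much it is trusted, are alternatives when raw scores are unavailable or not comparable.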
Questions?