OpenCyc - | University of Nebraska Omaha

OpenCyc
BY BENJAMIN SUSMAN
Introductions
Commonsense knowledge:
Outline
Background information
OpenCyc details
Cyc Ontology
CycL – Cyc’s language
NLP in Cyc
Setting up OpenCyc
Demo
What is Cyc?
1. Rich knowledge modeling language
2. Large corpus of formally modeled knowledge with a wide breadth
3. Efficient inference engine
Written in LISP and CycL (Cyc language)
Designed as a way to provide common sense or world knowledge and provide methods for
reasoning around this knowledge
~ 900 person-years of effort
Cycorp is headquartered in Austin, Texas, with offices in Ljubljana, Slovenia
Timeline
1984: Cyc Project founded by Dr. Douglas Lenat as a lead project in the Microelectronics and
Computer Technology Corporation (MCC).
1994: Cycorp was founded to further develop, commercialize, and apply the Cyc technology.
1996: Cycorp gets its first substantial government contract
2002: OpenCyc Version 1 Released: only included 6,000 concepts and 60,000 facts
2006: ResearchCyc Version 1 Released: adds additional knowledge base, English parsing and
generation tools, and interface for knowledge editing and querying
2012: OpenCyc Version 4 Released: Includes ~239,000 terms; ~2,093,000 triples
OpenCyc details
Available under Apache version 2 license (free software license)
OpenCyc includes a subset of functionality from Cyc
One way for accessing OpenCyc KB (knowledge base) is through OpenCyc KB Browser.
OpenCyc criticized for restrictiveness:
1. Missing instance-level knowledge
2. Lacking any reasoning capability
3. Being formally inconsistent
The third is the most difficult to address, since the knowledge base has been hand-crafted
Other problems addressed by ResearchCyc or EnterpriseCyc
CycL – The basics
Language that makes up most of Cyc
Some similarities to LISP
#$Collection = Class, something that has instances
#$Individual
◦ non-set/class (incl. relations, strings, numbers)
#$isa = instance of
#$genls = subclass of (“is-a”)
Operators are built-in code-supported #$Predicates:
◦ #$and #$or #$not #$implies #$arity #$thereExists
◦ #$assertedSentence #$knownSentence #$arg1Isa
◦ #$interArg[Isa/Genl/Reln/Format]N-M #$resultIsa
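The difference between #$isa (instance of) and #$genls (subclass of) can be sketched in a few lines of Python. This is a toy model, not the Cyc inference engine: the class TinyKB, its method names, and the instance Landing-001 are assumptions made for illustration. It shows that an instance of a collection is also an instance of every collection reachable through genls.

```python
from collections import defaultdict


class TinyKB:
    """Toy store for isa and genls assertions; not the Cyc inference engine."""

    def __init__(self):
        self.genls = defaultdict(set)  # collection -> direct supercollections
        self.isa = defaultdict(set)    # term -> collections asserted directly

    def assert_genls(self, sub, sup):
        self.genls[sub].add(sup)

    def assert_isa(self, term, collection):
        self.isa[term].add(collection)

    def all_genls(self, collection):
        """Return the collection plus every supercollection reachable via genls."""
        seen, stack = set(), [collection]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.genls.get(current, ()))
        return seen

    def is_a(self, term, collection):
        """True if term is an instance of collection, directly or via genls."""
        return any(collection in self.all_genls(c) for c in self.isa.get(term, ()))


kb = TinyKB()
kb.assert_genls("HelicopterLanding", "Event")
kb.assert_genls("Event", "Situation")
kb.assert_isa("Landing-001", "HelicopterLanding")  # hypothetical instance name

print(kb.is_a("Landing-001", "Situation"))
print(kb.is_a("Landing-001", "TimeInterval"))
```

The first query succeeds because HelicopterLanding generalizes to Event and Event to Situation; the second fails because nothing links the instance to TimeInterval.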
CycL – The basics (cont.)
Definitional Assertion of isa:
(isa isa BinaryPredicate)
(arg1Isa isa Thing)
(arg2Isa isa Collection)
Example
(isa Benjamin-Susman Individual)
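The arg1Isa and arg2Isa assertions above act as type constraints on the arguments of isa. A minimal sketch of such a check follows. The dictionaries ARG_ISA and KNOWN_ISA and the function well_formed are assumptions for illustration, and the membership facts in KNOWN_ISA are toy data rather than an extract from the KB.

```python
# Argument constraints from the slide: (arg1Isa isa Thing), (arg2Isa isa Collection).
ARG_ISA = {"isa": {1: "Thing", 2: "Collection"}}

# Hypothetical toy facts: which collections each term is known to belong to.
KNOWN_ISA = {
    "Benjamin-Susman": {"Individual", "Thing"},
    "Individual": {"Collection", "Thing"},
}


def well_formed(sentence):
    """Check each argument of (pred arg1 arg2 ...) against its argNIsa constraint."""
    predicate, *args = sentence
    constraints = ARG_ISA.get(predicate, {})
    for position, arg in enumerate(args, start=1):
        required = constraints.get(position)
        if required is not None and required not in KNOWN_ISA.get(arg, set()):
            return False
    return True


print(well_formed(("isa", "Benjamin-Susman", "Individual")))
print(well_formed(("isa", "Individual", "Benjamin-Susman")))
```

The second sentence is rejected because its second argument is not known to be a Collection.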
CycL Visualized
Cyc Ontology
Separated into 3 parts: upper, middle, and lower
Upper level
◦ Contains the most broad abstract concepts, universal truths
◦ Smallest but most widely referenced area of Cyc ontology
Middle level
◦ Not universal, but widely used abstraction layer
◦ Examples: geospatial relationships, broad knowledge of human interactions, or everyday items and
events
Lower level
◦ Domain specific knowledge
◦ Example: information about chemistry, biology, etc.
Cyc Ontology – Upper Level
(comment Event “An important specialization of Situation, and thus also of IntangibleIndividual and
TemporallyExistingThing. Each instance of Event is a dynamic situation in which the state of the world
changes; each instance is something one would say ‘happens’. Events are intangible because they are
changes per se, not tangible objects that effect and undergo changes. Notable specializations of Event
include Event-Localized, PhysicalEvent, Action, and GeneralizedTransfer. Events should not be
confused with TimeIntervals.”)
(isa Event TemporalStuffType)
(isa Event Collection)
(quotedIsa Event PublicConstant-CommentOK)
(quotedIsa Event VocabularyConstrainingAbstraction)
(genls Event Situation)
(disjointWith Event PositiveDimensionalThing)
Cyc Ontology – Upper Level (cont.)
(genls InstantaneousEvent Event)
(genls HelicopterLanding Event)
inferred knowledge
(genls (BecomingFn Intoxicated) Event)
(relationExistsAll victim Event Victim-UnfortunatePerson)
For every instance of the collection Victim-UnfortunatePerson, there exists an Event in which that
person was the victim—i.e., an event for which these statements hold:
(victim ?SOMEEVENT ?SOMEVICTIM)
(isa ?SOMEVICTIM Victim-UnfortunatePerson)
(isa ?SOMEEVENT Event)
Cyc Ontology – Middle Level
(comment SocialGathering “A specialization of SocialOccurrence. Each instance of SocialGathering is an
intentional social gathering of people who have the same or similar purposes in attending, and in which there is
communication between the participants. Specializations include BabyShower, Carnival, and Rally. Note that a
group of people waiting to board an elevator is not typically a SocialGathering, even though they share a common
purpose, since they are not expected to talk to each other.”)
(disjointWith SocialGathering SingleDoerAction)
(disjointWith SocialGathering ConflictEvent)
(disjointWith SocialGathering IntrinsicStateChangeEvent)
(keStrongSuggestionPreds SocialGathering dateOfEvent)
Although it is not semantically required, it is likely that getting a dateOfEvent assertion for any given instance of
SocialGathering would be appropriate or desirable.
(requiredActorSlots SocialGathering attendees)
In every social occasion something must play the role of attendees.
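The disjointWith and requiredActorSlots assertions above can be read as constraints on each instance of SocialGathering. The sketch below checks them against explicitly listed facts. The dictionary layout and the function check_instance are assumptions, and the check is closed-world: Cyc itself treats requiredActorSlots as saying that some filler exists, not that one must already be asserted.

```python
# Constraints taken from the slide; the dictionary layout is an assumption.
DISJOINT_WITH = {
    "SocialGathering": {"SingleDoerAction", "ConflictEvent", "IntrinsicStateChangeEvent"},
}
REQUIRED_ACTOR_SLOTS = {"SocialGathering": {"attendees"}}


def check_instance(collections, slots):
    """Return a list of human-readable problems for one event instance.

    collections: set of collections the instance is asserted to belong to
    slots: dict mapping slot name -> list of fillers
    """
    problems = []
    for collection in sorted(collections):
        for clash in sorted(DISJOINT_WITH.get(collection, set()) & collections):
            problems.append(f"{collection} is disjoint with {clash}")
        for slot in sorted(REQUIRED_ACTOR_SLOTS.get(collection, set())):
            if not slots.get(slot):
                problems.append(f"{collection} requires a filler for {slot}")
    return problems


print(check_instance({"SocialGathering"}, {"attendees": ["MartinLuther-ReligiousFigure"]}))
print(check_instance({"SocialGathering", "ConflictEvent"}, {}))
```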
Cyc Ontology – Lower Level
(comment ChemicalReaction “A collection of events; a subcollection of
PhysicalTransformationEvent. Each instance of ChemicalReaction is an event in which two or
more substances undergo a chemical change, i.e., some portions of the substances involved are
transformed into different ChemicalSubstanceTypes. The transformations are brought about by
purely chemical (including biochemical) means which affect chemical bonds between atoms in
the molecules of stuff. Examples of ChemicalReaction: instances of CombustionProcess;
instances of Photosynthesis-Generic.”)
(keGenlsStrongSuggestionPreds-RelationAllExists ChemicalReaction catalyst)
(genls ChemicalReaction PhysicalTransformationEvent)
(genls CombustionReaction ChemicalReaction)
(genls ExothermicReaction ChemicalReaction)
(genls ChemicalBonding ChemicalReaction)
Cyc Ontology – Lower Level (cont.)
(arg1Genl availableReactantTypeInReactionType ChemicalReaction)
The first argument to an assertion made using the predicate
availableReactantTypeInReactionType must generalize to some chemical reaction.
(comment CombustionReaction “A specialization of both ChemicalReaction and
CombustionProcess. Each instance of CombustionReaction is a rapid chemical reaction that
produces a flame. Many combustion reactions involve oxygen from the air as a reactant.”)
(genls CombustionReaction CombustionProcess)
(genls CombustionReaction ChemicalReaction)
(outputsCreated-TypeType CombustionReaction Flame)
Events which are CombustionReactions have members of the collection Flame as outputs.
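The arg1Genl constraint above says the first argument of availableReactantTypeInReactionType must be a collection that generalizes to ChemicalReaction. A rough sketch of that test, using only the genls assertions shown on these slides, is given below. The names GENLS, generalizes_to, and satisfies_arg1_genl are hypothetical, and genls is treated as reflexive.

```python
# genls links copied from the slides; the dictionary layout is an assumption.
GENLS = {
    "CombustionReaction": {"CombustionProcess", "ChemicalReaction"},
    "ExothermicReaction": {"ChemicalReaction"},
    "ChemicalBonding": {"ChemicalReaction"},
    "ChemicalReaction": {"PhysicalTransformationEvent"},
}


def generalizes_to(collection, target):
    """True if target is reachable from collection by following genls links."""
    pending, seen = [collection], set()
    while pending:
        current = pending.pop()
        if current == target:
            return True
        if current not in seen:
            seen.add(current)
            pending.extend(GENLS.get(current, ()))
    return False


def satisfies_arg1_genl(arg1, required="ChemicalReaction"):
    """Check the (arg1Genl availableReactantTypeInReactionType ChemicalReaction) constraint."""
    return generalizes_to(arg1, required)


print(satisfies_arg1_genl("CombustionReaction"))
print(satisfies_arg1_genl("SocialGathering"))
```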
CycL Example
(isa BurningOfPapalBull SocialGathering)
Because this is an instance of SocialGathering, it is known to be an instance of Event and to have
attendees.
(eventOccursAt BurningOfPapalBull CityOfWittenburgGermany)
(dateOfEvent BurningOfPapalBull
(DayFn 10 (MonthFn December (YearFn 1520))))
(attendees BurningOfPapalBull MartinLuther-ReligiousFigure)
Martin Luther is already represented in the KB, along with basic biographical information such as birth
and death date, country of residence, and native language.
(relationInstanceExistsMin BurningOfPapalBull attendees UniversityStudent 40)
At least forty university students attended the event. relationInstanceExistsMin is a rule macro
predicate.
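The date in the example is a nested functional term: YearFn builds a year, MonthFn a month within it, and DayFn a day within that month. The sketch below unpacks such a term into a Python date. The tuple encoding and the function to_date are assumptions for illustration, and Python's date type uses the proleptic Gregorian calendar, so this only shows the nesting and makes no claim about historical calendars.

```python
import datetime

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def to_date(term):
    """Convert ('DayFn', day, ('MonthFn', month, ('YearFn', year))) to a date."""
    day_fn, day, month_term = term
    month_fn, month, year_term = month_term
    year_fn, year = year_term
    if (day_fn, month_fn, year_fn) != ("DayFn", "MonthFn", "YearFn"):
        raise ValueError(f"not a DayFn term: {term!r}")
    return datetime.date(year, MONTHS.index(month) + 1, day)


# (DayFn 10 (MonthFn December (YearFn 1520))) written as nested tuples.
burning_date = to_date(("DayFn", 10, ("MonthFn", "December", ("YearFn", 1520))))
print(burning_date.isoformat())
```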
CycL Example (cont.)
(isa BurningOfPapalBull-Document CombustionProcess)
(properSubEvent BurningOfPapalBull-Document BurningOfPapalBull)
(relationInstanceExists inputsDestroyed BurningOfPapalBull-Document
(CopyOfConceptualWorkFn PapalBull-ExcommunicationOfLutherCW))
The thing destroyed is a member of the functionally defined collection “all copies of the conceptual work
PapalBull-ExcommunicationOfLutherCW”. The distinction between the conceptual artifact and the specific copy being burned prevents Cyc
from concluding that the conceptual work has been utterly destroyed (in the same way that burning a copy of Moby Dick does not destroy
the work Moby Dick generally).
(thereExists ?EVT (and (performedBy ?EVT MartinLuther-ReligiousFigure) (causes ?EVT BurningOfPapalBull-Document)))
The actual burning of a document was caused by some additional sub-event EVT (such as holding a match to it or throwing it into a fire).
Unlike the distinction between the social gathering and the actual burning of the document, both of which are reified events, this event is
defined implicitly, because there is nothing else to say about it.
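CycL formulas such as the thereExists sentence above are parenthesized prefix expressions, so they can be read into nested lists with a very small parser. The sketch below is not the Cyc reader: parse and variables are hypothetical helper names, and the tokenizer ignores quoted strings that contain spaces. It does reject unbalanced parentheses.

```python
import re

TOKEN = re.compile(r"[()]|[^\s()]+")


def parse(text):
    """Parse one CycL-style expression into nested Python lists of strings."""
    tokens = TOKEN.findall(text)
    position = 0

    def read():
        nonlocal position
        if position >= len(tokens):
            raise ValueError("unexpected end of input")
        token = tokens[position]
        position += 1
        if token == "(":
            items = []
            while position < len(tokens) and tokens[position] != ")":
                items.append(read())
            if position >= len(tokens):
                raise ValueError("missing closing parenthesis")
            position += 1  # consume ")"
            return items
        if token == ")":
            raise ValueError("unexpected closing parenthesis")
        return token

    result = read()
    if position != len(tokens):
        raise ValueError("trailing tokens after expression")
    return result


def variables(expr):
    """Collect the ?VARIABLE names used anywhere in a parsed expression."""
    if isinstance(expr, str):
        return {expr} if expr.startswith("?") else set()
    return set().union(*(variables(item) for item in expr))


formula = ("(thereExists ?EVT (and (performedBy ?EVT MartinLuther-ReligiousFigure) "
           "(causes ?EVT BurningOfPapalBull-Document)))")
tree = parse(formula)
print(tree[0], variables(tree))
```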
NLP in Cyc – The lexicon
◦ Backbone of NL system
◦ Links between English words and Cyc constants are stored
◦ Every word contained as a Cyc constant
◦ Constant designated with “#$”
NLP in Cyc – Syntactic Parser
Makes multiple parse trees for ambiguous sentences (outputs all parse trees allowed by the rules)
PARSE TREE 1
{:SENTENCE
  {:NP
    {:DETP {#$Determiner [the]}}
    {:N-BAR {#$SimpleNoun [man]}}}
  {:VP
    {#$Verb [saw]}
    {:NP {:DETP {#$Determiner [the]}}
         {:N-BAR {#$SimpleNoun [light]}}}
    {:PP {#$Preposition [with]}
         {:NP {:DETP {#$Determiner [the]}}
              {:N-BAR {#$SimpleNoun [telescope]}}}}}}
PARSE TREE 2
{:SENTENCE
  {:NP
    {:DETP {#$Determiner [the]}}
    {:N-BAR {#$SimpleNoun [man]}}}
  {:VP
    {#$Verb [saw]}
    {:NP {:DETP {#$Determiner [the]}}
         {:N-BAR
           {:N-BAR {#$SimpleNoun [light]}}
           {:PP {#$Preposition [with]}
                {:NP {:DETP {#$Determiner [the]}}
                     {:N-BAR {#$SimpleNoun [telescope]}}}}}}}}
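The two trees differ only in where the prepositional phrase attaches: to the verb phrase in the first, to the noun in the second. The sketch below enumerates every parse a small grammar allows for the same sentence. The grammar, lexicon, and function names are assumptions made for illustration and are not Cyc's actual rule set; the category labels only loosely follow the slide.

```python
from functools import lru_cache

# Hypothetical toy grammar: each symbol maps to its alternative right-hand sides.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("DETP", "N-BAR")],
    "N-BAR": [("Noun",), ("N-BAR", "PP")],
    "VP": [("Verb", "NP", "PP"), ("Verb", "NP")],
    "PP": [("Prep", "NP")],
}
LEXICON = {
    "the": "DETP", "man": "Noun", "light": "Noun", "telescope": "Noun",
    "saw": "Verb", "with": "Prep",
}


def all_parses(sentence):
    """Return every parse tree the toy grammar allows for the sentence."""
    words = tuple(sentence.split())

    @lru_cache(maxsize=None)
    def parse(symbol, start, end):
        """All trees that derive words[start:end] from symbol."""
        if symbol not in GRAMMAR:  # lexical category: must match exactly one word
            if end - start == 1 and LEXICON.get(words[start]) == symbol:
                return ((symbol, words[start]),)
            return ()
        trees = []
        for rhs in GRAMMAR[symbol]:
            for children in expand(rhs, start, end):
                trees.append((symbol,) + children)
        return tuple(trees)

    def expand(rhs, start, end):
        """Yield tuples of child trees that cover words[start:end] with rhs."""
        if len(rhs) == 1:
            for tree in parse(rhs[0], start, end):
                yield (tree,)
            return
        # Leave at least one word for each remaining symbol on the right.
        for split in range(start + 1, end - len(rhs) + 2):
            for first in parse(rhs[0], start, split):
                for rest in expand(rhs[1:], split, end):
                    yield (first,) + rest

    return parse("S", 0, len(words))


parses = all_parses("the man saw the light with the telescope")
print(len(parses))
```

Because the left-recursive rule for N-BAR always hands its first child a strictly shorter span, the enumeration terminates without any special handling.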
NLP in Cyc – Semantic Interpreter
Cyc-NL's semantic component transforms syntactic parses into CycL formulas
Semantic structures are built up piece-by-piece and combined into larger structures.
For each syntactic rule, there is a corresponding semantic procedure which applies.
Verb driven:
◦ Example: "Mary believes that the blue hat is pretty"
◦ (#$believes :SUBJECT :CLAUSE).
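The verb-driven step can be pictured as filling the keyword slots of a template with the interpretations of the matching constituents. The sketch below does this by plain string substitution. The function fill_template is hypothetical, and the bindings Mary and (pretty BlueHat-1) are placeholder strings, not real Cyc constants or the output of the Cyc-NL parser.

```python
import re


def fill_template(template, bindings):
    """Replace each :KEYWORD in the template with its bound CycL fragment."""

    def substitute(match):
        keyword = match.group(0)
        if keyword not in bindings:
            raise KeyError(f"no binding for {keyword}")
        return bindings[keyword]

    return re.sub(r":[A-Z][A-Z-]*", substitute, template)


# Placeholder bindings for "Mary believes that the blue hat is pretty".
formula = fill_template(
    "(#$believes :SUBJECT :CLAUSE)",
    {":SUBJECT": "Mary", ":CLAUSE": "(pretty BlueHat-1)"},
)
print(formula)
```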
NLP in Cyc – NL Generation
Cyc's natural language generation capabilities include (according to Cyc):
◦ Multi-lingual support
◦ Alternative paraphrasing based on desired verbosity, register, etc.
◦ Selective inclusion/exclusion of text based on expectations of the user's information needs
◦ Extensible lexifications for individual concepts
◦ Extensible generation templates for predicates and sentences
Can create output without requiring user to understand CycL
Setting up OpenCyc
System requirements
◦ 3 GB RAM
◦ 64-bit system (no 32-bit support)
◦ 1 GB disk space
Setup
◦ Download and unzip folder
◦ Available for Windows and Linux based systems
◦ Windows: opencyc-4.0\scripts\run-cyc.bat
◦ Linux: opencyc-4.0/scripts/run-cyc.sh
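A small helper can pick the right launch script for the current platform. This is only a convenience sketch: the function launch_script is hypothetical, the paths are the relative ones listed above, and it assumes the archive was unzipped into the current directory.

```python
import platform
from pathlib import PurePosixPath, PureWindowsPath


def launch_script(system=None):
    """Return the relative path of the OpenCyc launch script for a platform.

    system defaults to platform.system(); only Windows and Linux are listed
    on the slide, so anything else raises an error.
    """
    system = system or platform.system()
    if system == "Windows":
        return str(PureWindowsPath("opencyc-4.0", "scripts", "run-cyc.bat"))
    if system == "Linux":
        return str(PurePosixPath("opencyc-4.0", "scripts", "run-cyc.sh"))
    raise RuntimeError(f"no OpenCyc launch script listed for {system}")


print(launch_script("Linux"))
```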
Setting up OpenCyc (cont.)
Access
◦ Use a web browser: http://localhost:3602/cgi-bin/cyccgi/cg?cb-start
◦ The browser starts under the Guest account (this is poorly documented)
◦ Log in as CycAdministrator (note the typo in the prompt when OpenCyc starts)
◦ Make a new account; all features should now be available
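Before opening the browser, it can help to confirm that the server is answering on the URL above. The sketch below uses only the Python standard library. The function browser_is_up is hypothetical, and treating an HTTP 200 response as "ready" is an assumption, since the slides do not say what the server returns while it is still loading.

```python
import http.client
import urllib.error
import urllib.request

BROWSER_URL = "http://localhost:3602/cgi-bin/cyccgi/cg?cb-start"


def browser_is_up(url=BROWSER_URL, timeout=2.0):
    """Return True if the KB Browser URL answers with HTTP 200, else False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a malformed reply: treat as not ready.
        return False


if __name__ == "__main__":
    print("KB Browser reachable:", browser_is_up())
```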
Demo
Let’s run it!
Research Cyc Anyone?
http://www.cyc.com/downloads/ResearchCyc_License_Rev2.1.pdf
References
Primary source of info: http://www.cyc.com/platform/opencyc
Cynthia Matuszek, John Cabral, Michael Witbrock, John DeOliveira. An Introduction to the
Syntax and Content of Cyc:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.68.1357&rep=rep1&type=pdf
CycL Visualization courtesy of: http://www.webstructor.net/worlds/cycl.html
Used for timeline and finding release dates: http://en.wikipedia.org/wiki/Cyc
Gave perspective on Cycorp’s funding, long term goals, and organizational setup:
http://www.technologyreview.com/news/403803/cycorp-the-cost-of-common-sense/
Copy of ResearchCyc’s License agreement:
http://www.cyc.com/downloads/ResearchCyc_License_Rev2.1.pdf
NLP in Cyc: http://www.cyc.com/cyc/nl