How NAGA* uncoils:
Searching with Relations and Entities
{kasneci, suchanek, ramanath, weikum}@mpii.mpg.de
Gjergji Kasneci
Fabian Suchanek
Maya Ramanath
Gerhard Weikum
Motivation
What kind of research was Max Planck involved in?
Should I search with Google? Which keywords should I use?
Maybe: Research, Scientist, Max Planck
• The user should be able to define relations, entities, and concepts in the query:
[Figure: example query graph; recoverable labels: "Research", "is a", "Scientist", "Max Planck"]
• There is a need for a new framework.
The NAGA Data Model
Named entities and concepts are arranged, consistently and without redundancy, into a knowledge graph based on binary relationships.
• For each extracted fact we maintain the URLs of the pages on which it was found (its evidence pages). If a newly found fact is not yet contained in the knowledge graph, it is added; otherwise only the URL of its evidence page is recorded.
• For each fact f we attach a confidence value to the corresponding edge e_f in the knowledge graph, aggregated over the evidence pages p_f of f:
$\mathrm{confidence}(e_f) = \sum_{p_f} \mathrm{correct}(f, p_f) \cdot \mathrm{authority}(p_f)$
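The bookkeeping described for the data model (evidence pages per fact, plus a confidence value per edge) can be sketched roughly as follows. This is a minimal Python illustration, not NAGA's implementation: the class `KnowledgeGraph`, its methods, and the `correct` and `authority` callables are hypothetical names, and the aggregation over evidence pages is assumed to be a sum.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy fact store; a fact is a (subject, relation, object) triple."""

    def __init__(self):
        # Maps each fact to the set of URLs of its evidence pages.
        self.evidence = defaultdict(set)

    def add_fact(self, fact, url):
        """Record an extracted fact and return True if the fact was new.

        A known fact is not duplicated; only its evidence URL is recorded.
        """
        is_new = fact not in self.evidence
        self.evidence[fact].add(url)
        return is_new

    def confidence(self, fact, correct, authority):
        """Sum correct(f, p) * authority(p) over the evidence pages p of f.

        `correct` and `authority` are caller-supplied scoring functions
        (assumptions of this sketch).
        """
        pages = self.evidence.get(fact, ())
        return sum(correct(fact, p) * authority(p) for p in pages)
```

Calling `add_fact` twice with the same triple but different URLs leaves a single edge with two evidence pages, which is the behavior the first bullet describes.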
The Framework
The NAGA Answer Model
The answer to a query is a subgraph of the knowledge graph that matches the query:
[Figure: example answer subgraph; recoverable labels: "A", "type", "Q", "1795", "year"]
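The idea of an answer as a subgraph that matches the query can be illustrated with a small backtracking matcher. This is a sketch under stated assumptions rather than NAGA's matching algorithm: queries are lists of (source, label, target) edges, unlabeled nodes or edges are written as variables starting with `?` (a notation chosen here for illustration), and the function name `match` is hypothetical.

```python
def is_var(term):
    """Variables stand for unlabeled nodes or edges (assumed '?x' notation)."""
    return isinstance(term, str) and term.startswith("?")


def match(query, facts, binding=None):
    """Yield (binding, subgraph) for every way the query edges map onto facts."""
    binding = binding or {}
    if not query:
        yield binding, []
        return
    pattern, rest = query[0], query[1:]
    for fact in facts:
        new = dict(binding)
        ok = True
        for q, v in zip(pattern, fact):
            if is_var(q):
                # Bind the variable, or check it against an earlier binding.
                if new.setdefault(q, v) != v:
                    ok = False
                    break
            elif q != v:
                ok = False
                break
        if ok:
            for b, sub in match(rest, facts, new):
                yield b, [fact] + sub


facts = [
    ("Max Planck", "type", "Scientist"),
    ("Einstein", "type", "Scientist"),
    ("Max Planck", "bornIn", "Kiel"),
    ("Einstein", "bornIn", "Ulm"),
]
query = [("?x", "type", "Scientist"), ("?x", "bornIn", "Kiel")]
answers = list(match(query, facts))  # one answer, binding ?x to "Max Planck"
```

Each answer carries the matched facts themselves, so it is literally a subgraph of the fact set.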
The NAGA Query Model
A query is a directed graph G(V, E), where
• V is a set of (possibly labeled) nodes
• E is a set of (possibly labeled) edges
[Figure: example graphs; recoverable labels: "Paris", "New York", "Scientist", "Max Planck", "Einstein", "locatedIn", "type", "bornIn", "year", "establishedIn", "(means | type)"; confidence values: 0.83, 0.72]
If multiple subgraphs match the query, the best subgraph S is determined based on:
• Overall confidence:
$\mathrm{CONF}(S) = \prod_{f \in S} \mathrm{confidence}(f)$
The formula above is based on the assumption that the facts
were extracted independently of one another.
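Under the independence assumption, the overall confidence of a candidate answer is the product of the confidences of its facts, and the best answer is the candidate with the largest product. A minimal Python sketch follows; the function names and the fact identifiers `f1` to `f3` are placeholders, and the pairing of the values 0.99, 0.83, and 0.72 with particular facts is purely illustrative.

```python
from math import prod


def conf(subgraph, confidence):
    """Overall confidence of a subgraph: product of its facts' confidences."""
    return prod(confidence[f] for f in subgraph)


def best_subgraph(candidates, confidence):
    """Return the candidate subgraph with the highest overall confidence."""
    return max(candidates, key=lambda s: conf(s, confidence))


# Illustrative per-fact confidence values.
confidence = {"f1": 0.99, "f2": 0.83, "f3": 0.72}
candidates = [["f1", "f2"], ["f1", "f3"]]
best = best_subgraph(candidates, confidence)  # ["f1", "f2"]: 0.99*0.83 > 0.99*0.72
```

Because every confidence is at most 1, adding facts to a subgraph can only lower the product, so this score favors small answers built from well-supported facts.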
RELATEDNESS QUERIES (RQ)
• A set of discovery queries connected to each other
by edges labeled with regular expressions
over relationships
DISCOVERY QUERIES (DQ)
• A connected directed graph whose nodes and edges
may be unlabeled.
[Figure: example query graph; recoverable labels: "invented", "1795", "Paris"]
EVIDENCE QUERIES (EQ)
• A connected directed graph whose nodes and edges
are all labeled.
[Figure: example query graphs; recoverable labels: "type", ".*", "1795", "Paris", "locatedIn", "year", "New York"]
Obviously, it holds that $EQ \subseteq DQ \subseteq RQ$.
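The containment of the three query classes can be made concrete with a small classifier that returns the most specific class a query belongs to. This is only a sketch: queries are assumed to be lists of (source, label, target) edges, unlabeled nodes or edges are written as `?`-prefixed variables, regular-expression edge labels are represented as compiled `re` patterns, and connectivity of the graph is not checked. The function name `classify` is hypothetical.

```python
import re


def is_var(term):
    """Unlabeled nodes or edges are written as '?name' (assumed notation)."""
    return isinstance(term, str) and term.startswith("?")


def classify(query):
    """Return the most specific class (EQ, DQ, or RQ) of a query."""
    # Any regular-expression edge label makes it a relatedness query.
    if any(isinstance(label, re.Pattern) for _, label, _ in query):
        return "RQ"
    # Otherwise, any unlabeled node or edge makes it a discovery query.
    if any(is_var(term) for edge in query for term in edge):
        return "DQ"
    # Fully labeled: an evidence query.
    return "EQ"


eq = [("Max Planck", "bornIn", "Kiel")]
dq = [("?x", "bornIn", "Kiel")]
rq = [("Max Planck", re.compile(".*"), "Einstein")]
```

An evidence query is a discovery query that happens to use no variables, and a discovery query is a relatedness query that happens to use no regular-expression edges, which is the chain of inclusions stated above.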
* NAGA is a very large mythological snake.
Here it is used as a symbol for the large size and
the diversity of the unstructured Web information.
The NAGA System & Preliminary Results
NAGA is implemented in Java, and the facts of the knowledge
graph are stored in an Oracle database.
NAGA vs. Google: