FIBO Semantics Initiative David Newman Strategic Planning Manager, Vice President Enterprise Architecture, Wells Fargo Bank January 2012


FIBO Semantics Initiative
David Newman
Strategic Planning Manager, Vice President
Enterprise Architecture, Wells Fargo Bank
January 2012
"We can't solve problems by using the
same kind of thinking we used when
we created them." —Albert Einstein
1/30/2012
Agenda
1) Mission of joint EDM Council/Object Management Group Semantics OTC Derivatives Proof of Concept
2) Business and Regulatory Drivers
3) Briefing on Semantics as an Enabling Technology for Expressing and Operationalizing Financial Data Standards
4) OTC Derivatives POC Demonstration
5) Discussion and Next Steps
Industry Team Collaborating on
Semantics OTC Derivatives POC
Name                     Organization             Role
David Newman             Wells Fargo              Lead
Mike Bennett             EDM Council              Core Team
Elisa Kendall            Thematix                 Core Team
Jim Rhyne                Thematix                 Core Team
Mike Atkin               EDM Council              Stakeholder
Anthony Coates           Londata                  Subject Matter Expert
David Gertler            Super Derivatives        Subject Matter Expert
Marc Gratacos            ISDA                     Subject Matter Expert
Andrew Jacobs            UBS                      Subject Matter Expert
Dave McComb              Semantic Arts            Subject Matter Expert
Pete Rivett              Adaptive                 Subject Matter Expert
Martin Sexton            London Market Systems    Subject Matter Expert
Harsh Sharma             Citi                     Subject Matter Expert
Kevin Tyson              JP Morgan Chase          Subject Matter Expert
Marcelle von Wendland    Fincore                  Subject Matter Expert
Key Regulatory Requirements
Influencing Semantics OTC POC
1) Define Uniform and Expressive Financial Data Standards
Ability to enable standardized terminology and uniform meaning of financial data for interoperability across messaging protocols and data sources for data rollups and aggregations
2) Classify Financial Instruments into Asset Classes*
Ability to classify financial instruments into asset classes and taxonomies based upon the characteristics and attributes of the instrument itself, rather than relying on descriptive codes
3) Electronically Express Contractual Provisions**
Ability to encode concepts in machine readable form that describe key provisions specified in contracts in order to identify levels of risk and exposures
4) Link Disparate Information for Risk Analysis*
Ability to link disparate information based upon explicit or implied relationships for risk analysis and reporting, e.g. legal entity ownership hierarchies for counter-party risk assessment
5) Meet Regulatory Requirements, Control IT Costs, Incrementally Deploy
Ability to define data standards, store and access data, flexibly refactor data schemas and change assumptions without risk of incurring high IT costs and delays, evolve incrementally
*Swap Data Recordkeeping and Reporting Requirements, CFTC, Dec 8, 2010
*Report on OTC Derivatives Data Reporting and Aggregation Requirements, the International Organization of Securities Commissions (IOSCO), August 2011
**Joint Study on the Feasibility of Mandating Algorithmic Descriptions for Derivatives, SEC/CFTC, April 2011
Semantics OTC Derivatives POC Mission
Mission Statement:
Demonstrate to the financial industry and the regulatory community how:
• utilizing semantic technology and the Financial Industry Business Ontology (FIBO) can be a prudent strategic investment to realize:
  – data standardization
  – data integration
  – data linkage
  – data classification
• using currently available data sources and messaging protocols
Data challenges for entity and instrument
identification, classification and relationships
How can we evolve from a state of data disorder to data order?
[Image: Jackson Pollock, “Convergence”]
Current State of Financial Data
• Limited data standards
• Data rationalization problems
• Data incongruity and fragmentation
• Opaque data silos limit integration
• Cryptic codes, programs, brittle data schemas and fixed taxonomies
Target State of Financial Data
• Pervasive data standards
• Data precision, clarity, consistency
• Data alignment and linkage
• Data integration despite silos
• Flexible and intelligent data schemas and dynamic classifications
How can we supplement our existing investments in data management to resolve these challenges and achieve these goals?
Semantic Web Technology Can Help Organizations
Mature their Data Management Capabilities
• The true value of an information management system is ultimately based upon the intelligence and expressive power of its data schema or model
• Semantic web technology provides highly advanced data schemas (ontologies) and tools that can help organizations better define, link, integrate and classify their data
Financial Industry Business Ontology (FIBO)
[Diagram: FIBO, built in industry standards (FpML, MISMO, FIX, ISO, OMG, XBRL); subject areas: Securities, Business Entities, Derivatives, Loans, Corporate Actions]
• Industry initiative to extend financial industry data standards using semantic web principles for heightened data expressivity, consistency, linkage and rollups
• Semantics is synergistic, complementary and additive to existing data standards and technology investments in data management!
What is Semantic Technology?
A data management technology for the 21st century that provides a layer of intelligence over disparate data structures that is used:
• to precisely express the meanings, concepts, and relationships implied by the data
• in ways that both humans and machines can understand
• in order to maximize data organization, integration and classification
[Figure: Semantic Web Stack]
What are Semantic Data Schemas (Ontologies)?
• Schemas based on a formal symbolic logic (Description Logics) that
• specifies a set of mathematically verifiable and repeatable logical patterns that are understood by machines
• and can be used to represent complex relations between entities
• in order to automatically describe real world concepts that are meaningful to humans
[Diagram: Semantic Schema (ontology) with two "Understands" arrows]
Semantic Technology Basics
• Describes concepts in terms of:
  – Classes (Entities, Unary predicates)
  – Relationships (Properties, Binary predicates)
  – Individuals (instances)
• Makes inferencing possible
  – A “Reasoner” infers new data relationships and classifications after applying semantically defined rules and logical patterns to instances of data
Aligns linguistically with how we think and speak!
[Diagram: Subject, Predicate, Object pattern (<<Class>> <<Property>> <<Class>>)]
Person workFor Company
David type Person
Wells Fargo type Company
David isEmployedBy Wells Fargo
isEmployedBy subPropertyOf workFor
isEmployedBy inverse employs
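The subject, predicate, object pattern on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how a production reasoner works: triples are plain tuples, and `infer` is a hypothetical helper that applies just the sub-property and inverse rules until no new facts appear.

```python
# Triples from the slide, stored as (subject, predicate, object) tuples.
triples = {
    ("David", "type", "Person"),
    ("Wells Fargo", "type", "Company"),
    ("David", "isEmployedBy", "Wells Fargo"),
    ("isEmployedBy", "subPropertyOf", "workFor"),
    ("isEmployedBy", "inverse", "employs"),
}


def infer(facts):
    """Apply sub-property and inverse rules until no new triple appears."""
    facts = set(facts)
    while True:
        new = set()
        for s, p, o in facts:
            # Look for schema triples that describe the predicate p.
            for prop, rule, target in facts:
                if prop != p:
                    continue
                if rule == "subPropertyOf":
                    new.add((s, target, o))   # s p o  =>  s target o
                elif rule == "inverse":
                    new.add((o, target, s))   # s p o  =>  o target s
        if new <= facts:
            return facts
        facts |= new


inferred = infer(triples)
```

After the call, `inferred` contains the two facts that were never stated directly: David workFor Wells Fargo, and Wells Fargo employs David.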
Semantic Intelligence Utilizes Underlying
Machine Based Logical Patterns
Example: Transitive Relations
Underlying machine-based logical pattern (axiom):
A causes B, B causes C, C causes D
Inference: A causes D
Human concept expressed by the pattern:
Inference: Humans cause Forest Fires
Use Cases: Ancestry, Dependency, Impact, Link Analysis
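The transitive pattern on this slide amounts to computing a transitive closure. A minimal sketch, assuming the relation is held as a set of (cause, effect) pairs; the function name `transitive_closure` is illustrative only:

```python
def transitive_closure(pairs):
    """Add (a, d) whenever (a, b) and (b, d) are present, until stable."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# The chain from the slide: A causes B, B causes C, C causes D.
causes = {("A", "B"), ("B", "C"), ("C", "D")}
closure = transitive_closure(causes)
```

The closure holds the three stated pairs plus the inferred ones (A, C), (B, D) and (A, D). The same routine serves the ancestry, dependency and impact use cases by swapping in a different relation.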
Some Examples of Semantic Axioms that Allow
Machines to Represent Human Concepts
Subsumption: A class (or property) is a sub-set of another class (or property)
    Mother subClassOf Parent
    Marge type Mother -> Marge type Parent
Functional Properties: Property can have only one unique value
    Person hasBirthMother Mother
    Lisa hasBirthMother Marge
Symmetric Properties: Property relation holds true in both directions of the relationship
    Person marriedTo Person
    Homer marriedTo Marge -> Marge marriedTo Homer
Transitive Properties: If A has a relation with B, and B has a relation with C, then A also has a relation with C
    Person hasAncestor Person
    Bart hasAncestor Homer and Homer hasAncestor Abraham -> Bart hasAncestor Abraham
Property Chains: Properties can be linked together to form a chain of meaningful relationships
    Person hasParent Person : Person hasSister FemalePerson -> Person hasAunt FemalePerson
    Bart hasParent Marge : Marge hasSister Selma -> Bart hasAunt Selma
Restriction Classes: Describes new class by associating multiple classes, properties and values together
    NuclearFamily equivalentClass = hasFather exactly 1 Father and hasMother exactly 1 Mother and hasChild some Child
    Simpsons type NuclearFamily -> hasFather Homer and hasMother Marge and hasChild (Bart, Lisa, Maggie)
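Three of these axioms (subsumption, symmetric properties and property chains) can be sketched as one pass over a set of triples. This is an illustrative sketch only: the lookup tables `SUBCLASS_OF`, `SYMMETRIC` and `CHAINS` and the function `apply_axioms` are assumed names, and a real OWL reasoner handles far more than this.

```python
# Stated facts, taken from the examples on the slide.
facts = {
    ("Marge", "type", "Mother"),
    ("Homer", "marriedTo", "Marge"),
    ("Bart", "hasParent", "Marge"),
    ("Marge", "hasSister", "Selma"),
}

SUBCLASS_OF = {"Mother": "Parent"}                 # subsumption
SYMMETRIC = {"marriedTo"}                          # symmetric properties
CHAINS = {("hasParent", "hasSister"): "hasAunt"}   # property chains


def apply_axioms(facts):
    """Return the stated facts plus one round of inferred facts."""
    inferred = set(facts)
    for s, p, o in facts:
        if p == "type" and o in SUBCLASS_OF:
            inferred.add((s, "type", SUBCLASS_OF[o]))
        if p in SYMMETRIC:
            inferred.add((o, p, s))
        # Chain rule: s p o  and  o p2 o2  =>  s chained o2
        for s2, p2, o2 in facts:
            if o == s2 and (p, p2) in CHAINS:
                inferred.add((s, CHAINS[(p, p2)], o2))
    return inferred


result = apply_axioms(facts)
```

The result adds Marge type Parent, Marge marriedTo Homer and Bart hasAunt Selma to the four stated facts.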
Ontology Spectrum*
[Figure: ontology spectrum, from weak semantics to strong semantics]
Taxonomy: "Is Sub-Classification of"; Relational Model, XML; Syntactic Interoperability
Thesaurus: "Has Narrower Meaning Than"; ER, Extended ER, DB Schemas, XML Schema; Structural Interoperability
Conceptual Model: "Is Subclass of"; RDF/S, XTM, UML; Semantic Interoperability
Logical Theory: "Is Disjoint Subclass of with transitivity property"; DAML+OIL, OWL, Description Logic, First Order Logic, Modal Logic
*courtesy of Dr. Leo Obrst, The Mitre Corporation
Semantics Offers Differentiating Value
Compared to Conventional Technologies
[Diagram labels. XML and Relational: Swap, SwapAssoc, Party, Swapstream, SwapParty, LegalEntity. Semantics: Interest Rate Swap Contract, Basis Swap Contract and Vanilla Interest Rate Swap Contract linked by SubClassOf; Vanilla Interest Rate Swap Contract Equivalent Class: Swap_Leg some Fixed_Interest_Rate and Swap_Leg some Variable_Interest_Rate; Swap_100001234 hasSwapStream SwapStream_1… and SwapStream_2…, each of Type Swap_Leg]
XML
• Lingua franca of web service messaging payloads following W3C standards
• Used to tag data elements with standard labels that conform to a predefined schema
• Forms structured data hierarchies
• Document hierarchy can be queried
• While XML tags associate data to labels, the meaning of the labels is not inherently understood by the computer, requiring custom program logic to process each label
Relational
• Dominant database implementation
• Highly mature software and tools
• Data is physically organized within tables and accessed by matching related columns in different tables that fulfill various conditions
• Knowledge within application logic
• Hard-wired and brittle schema/data
• Design, construction, access, mgt are labor, time, resource intensive
Semantics
• Emerging form of knowledge representation offers highly intelligent form of data organization
• Conceptually describes the meaning of data and its relationships in a way that both people and computers can understand
• Supports classification, reasoning and agility
• Limited, but growing, set of software, tools
• Can supplement XML and relational database
• Can begin with knowledge representation and evolve towards operational implementations
Semantics Supplements Existing Data
Standards: Descriptively and Operationally
[Diagram: an Ontology describes and rationalizes an XML Message and a Relational Data Base, each holding Swap, SwapAssoc, Party, Swapstream, SwapParty and LegalEntity. Ontology labels: Interest Rate Swap Contract, Basis Swap Contract and Vanilla Interest Rate Swap Contract linked by SubClassOf; Vanilla Interest Rate Swap Contract Equivalent Class: Swap_Leg some Fixed_Interest_Rate and Swap_Leg some Variable_Interest_Rate; Swap_100001234 hasSwapStream SwapStream_1 and SwapStream_2…, each of Type Swap_Leg]
Descriptive: Precisely describes data elements for better human understanding
Operational: Provides data mapping, linkage and classification
Operational: Provides data integration and advanced queries across disparate data sources
Semantic Technology: How is it beneficial?
Conventional Technology
[Diagram: Application Software (Business Rules in Code), Data Schema, Physical Database; New Data Entity: Define, Update, Access; New Physical Table for New Entity]
Challenges:
• Knowledge encapsulated in opaque software
• Data organization tightly coupled with schema
• Multiple complex tables and data relationships
• Awareness of physical organization of data required
• Schemas enforce limited data integrity
-> High costs, longer TTM
Semantic Technology
[Diagram: Application Software (Some Business Rules Migrated to Ontology), Ontology / Semantic Schema (Some Business Rules Added to Ontology), Physical Database, Inferred Data; New Data Entity: Define, Schema Update, Access; Physical Format Unchanged after New Data Entity Added]
Improvements:
• Standard vocabulary and knowledge representation
• Data organization decoupled from schema
• Inferencing creates new knowledge
• Consistent rules based on standard data elements ensured across domain
• All data is Web addressable
-> Lower costs, faster TTM
Comparative Analysis
[Chart: XML, Relational and Semantics each rated Low, Medium or High against the criteria below; the ratings are shown graphically]
• Describes Concepts, Taxonomies, Rich Data Relationships
• Concepts Understandable to Both Humans and Machines
• Multiple Classifications and Categorizations of Data
• Logical consistency and constraint checking
• Reasoning and Inference Capabilities
• Ability to change schema/model with low impact/cost
• Potential to Deliver Faster TTM and Lower TCO
• Operational Scalability, Efficiency and Optimization
• Industry Adoption and Prevalence of Skilled Resources
• Maturity of Tools and Software
• Current Ease of Mastery of Technology and Skills
Potential Benefits of using Semantic
Technology
Evolve Global Data Standards, Enable Data Integration and Classification
• Provides model and infrastructure to define the meaning of information in order to represent the semantics of data standards; as well as integrate, link and classify incongruent data
Reduce Complexity
• Reduces reliance on arcane legacy data structures and cryptic codes by using more meaningful, natural language friendly constructs
Reduce Costs (People and Technology)
• As understanding of data increases, costly data reconciliation efforts by analysts can be reduced
• Improved data federation and reduced data management costs can potentially be realized
Improve Agility
• As regulatory/industry views and assumptions change, semantics allows data schemas to rapidly reflect change without incurring massive data and application program restructuring efforts
Increase Functionality using Reasoning and Inferencing Capabilities
• Using logically consistent rules and semantic definitions, programs called reasoners can infer data to be classified into special business defined categories and relationships
Business and Operational Ontologies
Requirement #1: Define Uniform and Expressive Financial Data Standards
Business Ontology (AKA “conceptual model”)
• Defines Transaction types
• Defines contract types
• Defines leg roles
• Defines contract terms
• Model from Sparx Systems Enterprise Architect
Operational Ontology (Semantic Web)
• Includes only those terms which have corresponding instance data
• Narrowed for Operational use
[Diagram: Business Ontology "provides source for" Operational Ontology; graph labels in both: Agreement, is a, IR Swap, has party, swaps, IR Stream]
Anatomy of a Semantic Data Standard
Requirement #1: Define Uniform and Expressive Financial Data Standards
[Diagram: ODM Model and Semantic Metadata Model, annotated with RDF type, OWL versionInfo, SKOS altLabel, DC source, RDFS seeAlso, SKOS definition]
RDF, RDFS, OWL: W3C Semantic languages
DC: Dublin Core Metadata Elements
SKOS: Simple Knowledge Organization System
Semantic Metadata
rdf:type           LegalEntityIdentifier
skos:altLabel      LEI
skos:definition    A legal entity identifier (LEI) is a unique ID associated with a single corporate entity
dc:source          SIFMA (Securities Industry and Financial Markets Association) overview discussion of Legal Entity Identifier (http://www.sifma.org)
owl:versionInfo    Version 1.0.0
rdfs:seeAlso       Office of Financial Research; Statement on Legal Entity Identification for Financial Contracts
Community Access to Standards
• Multiple access options over the web via the authoritative standards body
• Hyperlink to semantic web standard from documents
• Community participation and interaction
• Query access via formal semantics repository including links and synonymous terms for knowledge
• Improved governance
• Provenance and evolution recorded
• Model files for download in multiple tools
Semantics can operationally classify
undifferentiated Swaps and show relationships
Requirement #2: Classify Financial Instruments into Asset Classes
• Classes are inferred using rules that query the content of the data
• Data is linked together via relationships called properties
Vanilla_IR_Swap: has_Swap_Legs some Variable_Interest_Terms and has_Swap_Legs some Fixed_Interest_Terms
* Gruff 3.0 courtesy of Franz, Inc.
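The Vanilla_IR_Swap definition above is an existential restriction: a swap qualifies if at least one leg has variable interest terms and at least one leg has fixed interest terms. A minimal sketch of that membership test follows; the instance names (`swap_a`, `leg_a1`, and so on) are hypothetical and do not come from the POC data.

```python
# Hypothetical instance data: each swap lists its legs, each leg has one type.
swap_legs = {
    "swap_a": ["leg_a1", "leg_a2"],
    "swap_b": ["leg_b1", "leg_b2"],
}
leg_type = {
    "leg_a1": "Fixed_Interest_Terms",
    "leg_a2": "Variable_Interest_Terms",
    "leg_b1": "Variable_Interest_Terms",
    "leg_b2": "Variable_Interest_Terms",
}


def is_vanilla_ir_swap(swap):
    """True if some leg is fixed and some leg is variable ("some ... and some ...")."""
    types = {leg_type[leg] for leg in swap_legs[swap]}
    return {"Fixed_Interest_Terms", "Variable_Interest_Terms"} <= types


# Classification is derived from the data itself, not from a descriptive code.
classified = {swap: is_vanilla_ir_swap(swap) for swap in swap_legs}
```

Here `swap_a` is classified as a vanilla interest rate swap and `swap_b` is not, because both of its legs are variable.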
Semantic Representation of Contractual
Provisions for Risk Classification
Requirement #3: Electronically Express Contractual Provisions
[Diagram: flow from contract sources through ontologies to risk analytics]
Sources: Transaction Repository, et al.; ISDA Master Agreement and Schedules; Credit Support Annex and Schedules; OTC Derivative Confirm; Market Reference Data; Counterparties
Ontologies: FIBO Ontology, Operational Ontology
Steps: Capture Semantics of Contract Provisions; Define Axioms; Classify Contract Type; Identify Key Contractual Events; Identify Key Contractual Actions; Infer Counterparty Exposures; Classify Counterparties into Risk Categories for Analytics (Risk Analyst)
Event labels: Credit Rating Agency Downgrade Counterparty Credit; Reduce Value of Collateral; Default Events; Termination Events
Action labels: Increase Collateral; Transfer Payments
Note: OTC POC Phase 2 in process
*Report on OTC Derivatives Data Reporting and Aggregation Requirements, the International Organization of Securities Commissions (IOSCO), August 2011
**Joint Study on the Feasibility of Mandating Algorithmic Descriptions for Derivatives, SEC/CFTC, April 2011
Semantics offers Advanced Query Capabilities
Requirement #4: Link Disparate Information for Risk Analysis
[Diagram: a Risk Analyst sends the same graph pattern to Transaction Repositories X, Y and Z. Pattern labels: Legal Entity, ?legalName, ?entity, ?parent, ?swap, type, notional Amount AtRisk, ?amount. Class labels: Basis Swap subClassOf Interest Rate Swap; Vanilla Interest Rate Swap subClassOf Interest Rate Swap]
Query all Transaction Repositories to report on the sum total of aggregate exposure for all counterparties and their parents involved in all swaps associated with an interest rate swap taxonomy
• Data is queried using graph pattern matching techniques vs. relational joins
• Queries can process inferred data and highly complex and abstract data structures
• Queries can federate across semantic endpoints (using SPARQL 1.1)
• Data can be aggregated and summarized (using SPARQL 1.1)
Note: TBD in future phase of POC
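The query described on this slide can be approximated in plain Python to show what graph pattern matching with aggregation does. This is a stand-in for SPARQL 1.1, not SPARQL itself: the repository contents, the property names (`notionalAmountAtRisk`, `hasParty`, `hasParent`) and the `match` helper are all assumptions made for the sketch, and federation is simulated by concatenating three in-memory lists.

```python
from collections import defaultdict

# Hypothetical contents of Transaction Repositories X, Y and Z.
repo_x = [
    ("swap1", "type", "VanillaInterestRateSwap"),
    ("swap1", "notionalAmountAtRisk", 100),
    ("swap1", "hasParty", "entityA"),
    ("entityA", "hasParent", "parentP"),
]
repo_y = [
    ("swap2", "type", "BasisSwap"),
    ("swap2", "notionalAmountAtRisk", 250),
    ("swap2", "hasParty", "entityB"),
    ("entityB", "hasParent", "parentP"),
]
repo_z = [
    ("swap3", "type", "EquitySwap"),
    ("swap3", "notionalAmountAtRisk", 999),
    ("swap3", "hasParty", "entityC"),
    ("entityC", "hasParent", "parentQ"),
    ("swap4", "type", "BasisSwap"),
    ("swap4", "notionalAmountAtRisk", 50),
    ("swap4", "hasParty", "entityC"),
]

# Taxonomy from the slide: both swap kinds are subclasses of Interest Rate Swap.
SUBCLASS_OF = {
    "BasisSwap": "InterestRateSwap",
    "VanillaInterestRateSwap": "InterestRateSwap",
}


def is_a(cls, ancestor):
    """Walk up the subClassOf chain looking for ancestor."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every triple pattern."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(patterns[0], triple):
            if term.startswith("?"):
                # A variable must keep the value it was first bound to.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(patterns[1:], triples, candidate)


QUERY = [
    ("?swap", "type", "?cls"),
    ("?swap", "notionalAmountAtRisk", "?amount"),
    ("?swap", "hasParty", "?entity"),
    ("?entity", "hasParent", "?parent"),
]

# "Federate" by querying the union of all repositories, then roll up by parent.
totals = defaultdict(int)
for row in match(QUERY, repo_x + repo_y + repo_z):
    if is_a(row["?cls"], "InterestRateSwap"):
        totals[row["?parent"]] += row["?amount"]
exposure = dict(totals)
```

With this sample data the rollup gives 350 for `parentP` and 50 for `parentQ`; the equity swap is excluded because it does not fall under the interest rate swap taxonomy.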
Semantics Offers Federation via Linked Data
Requirement #4: Link Disparate Information for Risk Analysis
• Semantically defined data that is Web addressable and “inter-linked”
• Transcends organizational boundaries and provides universal access to data wherever it resides internally within the network (and externally via “Linked Open Data”)
• Obtains data directly from its source (transparent to location, platform, schema, format)
• Can support access, queries and rollups across Swap Data Repositories
[Diagram: a Risk Analyst issues an Aggregated Linked Data Query through a Semantic Enterprise Information Integration (EII) Platform with Ontologies; the Linked OTC Data Cloud connects Swap Data Repository Databases and a Legal Entity Data Provider; Linked Open Data Cloud (External)]
Note: TBD in future phase of POC
Semantic Usage Patterns can be Deployed
Incrementally and in Tandem with Existing Technology
Requirement #5: Meet Regulatory Requirements, Control IT Costs, Deploy Incrementally
Reference Ontologies (Conceptual Ontology: Business Semantics Conceptual Models)
• Primarily for Human consumption
• Conceptual, design-time, and non-operational
• Community engagement and update process when warranted
• Standard terminology, concepts and descriptions for reference, knowledge, data reconciliation, rationalization and governance
• Integrated ontologies, Upper ontologies for broader meaning
Operational Ontology
• Primarily for Machine consumption
• Ontologies narrowed for operational usage
• Supplements and operates in tandem with conventional technology
• Runtime access to knowledge, reference data, metadata
• Canonical domain models for mappings and interoperability
• Semantic graph pattern matching queries and automated reasoning
Inferencing and Classification of Source Data
• Heterogeneous source data ingested, validated for inconsistencies, and transformed by Semantic Reasoner into domain ontology to fulfill mapping rules
• Source data inferred by Reasoner, using formal axioms or rules, into abstract classifications, new data relationships/linkages
• Semantic rules engine can be optionally accessed
• Query time reasoning can be optionally utilized
Data Federation and Linked Open Data
• Data semantically linked, integrated and accessed both internally and externally using RDF linked URIs which are Web addressable
• Federated query of semantic and non-semantic data stores using canonical semantic domain model for data interoperability and inferencing
[Diagram labels for each pattern: Semantic Application, Application, Operational Ontology (TBox: Schema; ABox: Data, Inferred Data), Rules Engine, and source data in XML, RDBMS/Relational and Unstructured form]
OTC POC Semantic Building Blocks and
Methodology
OTC POC Operational Ontologies
1) Build conceptual ontology for Swaps in FIBO
2) Build operational ontology for Swaps from FIBO
3) Build operational ontology for Swaps from FpML
4) Build operational ontology for Legal Entities
5) Build bridging ontologies that tie together individual ontologies
6) Ingest FpML Swap data into FpML Swap ontology
7) Ingest Legal Entity data into Legal Entity Ontology
8) Invoke Reasoner to
   a. associate data in FpML Swap Ontology to FIBO Swap Ontology
   b. classify Swap Contracts into taxonomy levels according to their attributes
9) Perform queries to fulfill regulatory use cases and reports
[Diagram labels: Upper Ontologies; FIBO Model Knowledge; Legal Entity; FIBO Swap; FpML Swap; Legal Entity FIBO Swap Bridge; FIBO-FpML Swap Bridge; Legal Entity FpML Swap Bridge; LE Instances; FpML Instances; Reasoning; SPARQL Queries]
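Steps 5 and 8a, where a bridging ontology lets the reasoner restate FpML-sourced data in FIBO terms, can be illustrated with a simple term mapping. This is a deliberately simplified sketch: the prefixed names (`fpml:Swap`, `fibo:SwapContract`, and so on), the `BRIDGE` table and `apply_bridge` are hypothetical, and the POC's actual bridge axioms are not shown in this deck.

```python
# Hypothetical bridge: FpML-derived terms mapped to equivalent FIBO terms.
BRIDGE = {
    "fpml:Swap": "fibo:SwapContract",
    "fpml:swapStream": "fibo:hasSwapLeg",
}


def apply_bridge(triples, bridge):
    """Return the original triples plus copies restated in bridged terms."""
    bridged = {tuple(bridge.get(term, term) for term in t) for t in triples}
    return set(triples) | bridged


# Step 6: instance data ingested in the source (FpML) vocabulary.
fpml_data = {
    ("trade_1", "type", "fpml:Swap"),
    ("trade_1", "fpml:swapStream", "stream_1"),
}

# Step 8a: the same data becomes visible in the target (FIBO) vocabulary.
fibo_view = apply_bridge(fpml_data, BRIDGE)
```

The source triples are kept alongside the bridged ones, so queries written against either vocabulary still find the data.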
OTC Derivatives Semantic
POC Demonstration
• Swap Ontology
• Classification and Reasoning
• Semantic Query
Semantic Building Blocks for Financial Data
Standards and Risk Management
[Diagram: Semantic Foundations for Financial Data Management]
Foundation: Higher Level Concepts (Upper Ontologies); Semantic Descriptions of Financial Data, Concepts, Relationships and Rules
Conceptual Ontologies (Descriptive, Human Facing): Data and Knowledge Representation (FIBO); Financial Industry Business Ontology covering Mortgages, Securities, Derivatives, ...; Financial Data Standards; Expressivity; Data Taxonomies; Asset and Risk Categories; Data Traceability; Data Consistency
Operational Implementation Ontologies (Machine Facing): Runtime Knowledge; Reasoning and Inferencing; Inferred Conclusions and Data Linkage; Data Classification and Categorization; Data Mapping and Integration; Holistic Data Linkages and Bridges; Data Federation; Graph Pattern Matching; Advanced Queries; Data Rollups and Aggregation
Outcomes: Trust; Transparency; Systemic Risk Analysis
Semantic Technology: Making the Investment
Adoption of semantic technology can be an evolutionary process that may lead to strategic value.
Semantic technology:
• Is still early in its lifecycle; tools are relatively immature, language standards are still evolving, and vendors are small
• Does require a learning curve to understand how the “semantic reasoner” thinks in order to best utilize the technology, which can take time and investment to develop
• Will not necessarily replace current object oriented and relational database technology in the foreseeable future, but can be used to better enable and enhance conventional technology
• Positions users that are adopters of its knowledge representation and reasoning capabilities to achieve valuable benefits not easily achievable using conventional technologies by themselves
By embracing semantic technology and FIBO as a basis for enhancing financial industry data standards we are making a strategic investment to improve our data management capabilities by using the tools of the 21st century
Invitation to Financial Regulators, Market
Authorities and the Financial Industry
• Financial regulators to support and participate in a formal collaboration with financial industry participants and standards organizations such as ISDA, ISO, XBRL, FIX, MISMO, OMG, etc. to refine and implement FIBO as the standard financial instrument and entity ontology for regulatory reporting, business processing and risk analysis
• Financial regulators to act as catalysts in forming a public/private partnership to create best practice reference architectures for operational semantic implementations
• Continued extension of the semantic proof-of-concept work to support the analytical requirements of regulators, market authorities and financial institutions:
  – OTC Derivatives (Contractual Provisions, Credit Default Swaps)
  – Asset Backed Securities (Mortgage Backed Securities, Collateralized Debt Obligations)