FIBO Semantics Initiative David Newman Strategic Planning Manager, Vice President Enterprise Architecture, Wells Fargo Bank January 2012
"We can't solve problems by using the same kind of thinking we used when we created them." - Albert Einstein
1/30/2012

Agenda
1) Mission of joint EDM Council/Object Management Group Semantics OTC Derivatives Proof of Concept
2) Business and Regulatory Drivers
3) Briefing on Semantics as an Enabling Technology for Expressing and Operationalizing Financial Data Standards
4) OTC Derivatives POC Demonstration
5) Discussion and Next Steps

Industry Team Collaborating on Semantics OTC Derivatives POC
Name | Organization | Role
David Newman | Wells Fargo | Lead
Mike Bennett | EDM Council | Core Team
Elisa Kendall | Thematix | Core Team
Jim Rhyne | Thematix | Core Team
Mike Atkin | EDM Council | Stakeholder
Anthony Coates | Londata | Subject Matter Expert
David Gertler | Super Derivatives | Subject Matter Expert
Marc Gratacos | ISDA | Subject Matter Expert
Andrew Jacobs | UBS | Subject Matter Expert
Dave McComb | Semantic Arts | Subject Matter Expert
Pete Rivett | Adaptive | Subject Matter Expert
Martin Sexton | London Market Systems | Subject Matter Expert
Harsh Sharma | Citi | Subject Matter Expert
Kevin Tyson | JP Morgan Chase | Subject Matter Expert
Marcelle von Wendland | Fincore | Subject Matter Expert

Key Regulatory Requirements Influencing Semantics OTC POC
1) Define Uniform and Expressive Financial Data Standards
Ability to enable standardized terminology and uniform meaning of financial data for interoperability across messaging protocols and data sources for data rollups and aggregations
2) Classify Financial Instruments into Asset Classes*
Ability to classify financial instruments into asset classes and taxonomies based upon the characteristics and attributes of the instrument itself, rather than relying on descriptive codes
3) Electronically Express Contractual Provisions**
Ability to encode concepts in machine readable form that describe key provisions specified
in contracts in order to identify levels of risk and exposures
4) Link Disparate Information for Risk Analysis*
Ability to link disparate information based upon explicit or implied relationships for risk analysis and reporting, e.g. legal entity ownership hierarchies for counterparty risk assessment
5) Meet Regulatory Requirements, Control IT Costs, Incrementally Deploy
Ability to define data standards, store and access data, flexibly refactor data schemas and change assumptions without risk of incurring high IT costs and delays, evolve incrementally
*Swap Data Recordkeeping and Reporting Requirements, CFTC, Dec 8, 2010
*Report on OTC Derivatives Data Reporting and Aggregation Requirements, the International Organization of Securities Commissions (IOSCO), August 2011
**Joint Study on the Feasibility of Mandating Algorithmic Descriptions for Derivatives, SEC/CFTC, April 2011

Semantics OTC Derivatives POC Mission
Mission Statement: Demonstrate to the financial industry and the regulatory community how utilizing semantic technology and the Financial Industry Business Ontology (FIBO) can be a prudent strategic investment to realize:
• data standardization
• data integration
• data linkage
• data classification
using currently available data sources and messaging protocols

Data challenges for entity and instrument identification, classification and relationships
How can we evolve from a state of data disorder to data order?
Jackson Pollock “Convergence”
Current State of Financial Data
• Limited data standards
• Data rationalization problems
• Data incongruity and fragmentation
• Opaque data silos limit integration
• Cryptic codes, programs, brittle data schemas and fixed taxonomies
Target State of Financial Data
• Pervasive data standards
• Data precision, clarity, consistency
• Data alignment and linkage
• Data integration despite silos
• Flexible and intelligent data schemas and dynamic classifications
How can we supplement our existing investments in data management to resolve these challenges and achieve these goals?

Semantic Web Technology Can Help Organizations Mature their Data Management Capabilities
• The true value of an information management system is ultimately based upon the intelligence and expressive power of its data schema or model
• Semantic web technology provides highly advanced data schemas (ontologies) and tools that can help organizations better define, link, integrate and classify their data

Financial Industry Business Ontology (FIBO)
[Diagram labels: FIBO; Industry Standards (FpML, MISMO, FIX, ISO, OMG, XBRL); Securities; Business Entities; Derivatives; Loans; Corporate Actions; Built in]
Industry initiative to extend financial industry data standards using semantic web principles for heightened data expressivity, consistency, linkage and rollups
Semantics is synergistic, complementary and additive to existing data standards and technology investments in data management!

What is Semantic Technology?
A data management technology for the 21st century that provides a layer of intelligence over disparate data structures that is used to precisely express the meanings, concepts, and relationships implied by the data in ways that both humans and machines can understand in order to maximize data organization, integration and classification
[Figure: Semantic Web Stack]

What are Semantic Data Schemas (Ontologies)?
• Schemas based on a formal symbolic logic (Description Logics) that
• specifies a set of mathematically verifiable and repeatable logical patterns that are understood by machines
• and can be used to represent complex relations between entities
• in order to automatically describe real world concepts that are meaningful to humans
[Diagram labels: Semantic Schema (ontology); Understands; Understands]

Semantic Technology Basics
• Describes concepts in terms of:
– Classes (Entities, Unary predicates)
– Relationships (Properties, Binary predicates)
– Individuals (instances)
• Makes inferencing possible
– A “Reasoner” infers new data relationships and classifications after applying semantically defined rules and logical patterns to instances of data
Aligns linguistically with how we think and speak!
Subject | Predicate | Object
<<Class>> | <<Property>> | <<Class>>
Person | workFor | Company
[Diagram labels: David type Person; Wells Fargo type Company; David isEmployedBy Wells Fargo; isEmployedBy subPropertyOf workFor; inverse employs]

Semantic Intelligence Utilizes Underlying Machine Based Logical Patterns
Example: Transitive Relations
Underlying Machine Based Logical Pattern (Axiom): A causes B, B causes C, C causes D
Inference: A causes D
The axiom expresses a Human Concept
Inference: Humans cause Forest Fires
Use Cases: Ancestry, Dependency, Impact, Link Analysis

Some Examples of Semantic Axioms that Allow Machines to Represent Human Concepts
Subsumption: A class (or property) is a sub-set of another class (or property)
Mother subClassOf Parent
Example: Marge type Mother -> Marge type Parent
Functional Properties: Property can have only one unique value
Person hasBirthMother Mother
Symmetric Properties: Property relation holds true in both directions of the relationship
Person marriedTo Person
Transitive Properties: If A has a relation with B, and B has a relation with C, then A also has a relation with C
Person hasAncestor Person
Property Chains: Properties can be linked together to form a chain of meaningful relationships
Person hasParent Person: Person hasSister
FemalePerson -> Person hasAunt FemalePerson
Example: Bart hasParent Marge : Marge hasSister Selma -> Bart hasAunt Selma (Property Chains)
Restriction Classes: Describes new class by associating multiple classes, properties and values together
NuclearFamily equivalentClass = hasFather exactly 1 Father and hasMother exactly 1 Mother and hasChild some Child
Example: Simpsons type NuclearFamily -> hasFather Homer and hasMother Marge and hasChild (Bart, Lisa, Maggie)
Further examples:
Functional Properties: Lisa hasBirthMother Marge
Symmetric Properties: Homer marriedTo Marge -> Marge marriedTo Homer
Transitive Properties: Bart hasAncestor Homer and Homer hasAncestor Abraham -> Bart hasAncestor Abraham

Ontology Spectrum*
[Figure: spectrum running from weak semantics to strong semantics
Tiers: Taxonomy (Is Sub-Classification of; Syntactic Interoperability); Thesaurus (Has Narrower Meaning Than; Structural Interoperability); Conceptual Model (Is Subclass of; Semantic Interoperability); Logical Theory (Is Disjoint Subclass of with transitivity property)
Technologies, weak to strong: Relational Model, XML; DB Schemas, XML Schema; ER; Extended ER; XTM; RDF/S; UML; DAML+OIL, OWL; Description Logic; First Order Logic; Modal Logic]
*courtesy of Dr.
Leo Obrst, The Mitre Corporation

Semantics Offers Differentiating Value Compared to Conventional Technologies
XML
• Lingua franca of web service messaging payloads following W3C standards
• Used to tag data elements with standard labels that conform to a predefined schema
• Forms structured data hierarchies
• Document hierarchy can be queried
• While XML tags associate data to labels, the meaning of the labels is not inherently understood by the computer requiring custom program logic to process each label
Relational
[Diagram labels: tables Swap, LegalEntity, SwapAssoc, Party, Swapstream, SwapParty]
• Dominant database implementation
• Highly mature software and tools
• Data is physically organized within tables and accessed by matching related columns in different tables that fulfill various conditions
• Knowledge within application logic
• Hard-wired and brittle schema/data
• Design, construction, access, mgt are labor, time, resource intensive
Semantics
[Diagram labels: Interest Rate Swap Contract; Basis Swap Contract SubClassOf; Vanilla Interest Rate Swap Contract SubClassOf; Equivalent Class: Swap_Leg some Fixed_Interest_Rate and Swap_Leg some Variable_Interest_Rate; Swap_100001234 Type; hasSwapStream SwapStream_1…; hasSwapStream SwapStream_2…; Type Swap_Leg]
• Emerging form of knowledge representation offers highly intelligent form of data organization
• Conceptually describes the meaning of data and its relationships in a way that both people and computers can understand
• Supports classification, reasoning and agility
• Limited, but growing, set of software, tools
• Can supplement XML and relational database
• Can begin with knowledge representation and evolve towards operational implementations

Semantics Supplements Existing Data Standards: Descriptively and Operationally
[Diagram labels: Operational; XML Message; Relational Data Base (Swap, LegalEntity, SwapAssoc, Party, Swapstream, SwapParty); Provides data mapping, linkage and classification]
[Diagram labels, continued: SwapParty; Rationalizes; Descriptive: Precisely describes data elements for better human understanding; Describes; Ontology (Interest Rate Swap Contract; Basis Swap Contract SubClassOf; Vanilla Interest Rate Swap Contract SubClassOf; Equivalent Class: Swap_Leg some Fixed_Interest_Rate and Swap_Leg some Variable_Interest_Rate; Swap_100001234 Type; hasSwapStream SwapStream_1; hasSwapStream SwapStream_2…; Type Swap_Leg); Operational: Provides data integration and advanced queries across disparate data sources]
Note: Run with Animation

Semantic Technology: How is it beneficial?
Conventional Technology
[Diagram labels: Business Rules in Code; Application Software; Data Schema; Physical Database; Access; Define; Update; New Data Entity; New Physical Table for New Entity]
Challenges:
• Knowledge encapsulated in opaque software
• Data organization tightly coupled with schema
• Multiple complex tables and data relationships
• Awareness of physical organization of data required
• Schemas enforce limited data integrity
-> High costs, longer TTM
Semantic Technology
[Diagram labels: Some Business Rules Migrated to Ontology; Application Software; Ontology / Semantic Schema; Physical Database; Access; Schema Update; Inferred Data; New Data Entity; Define; Some Business Rules Added to Ontology; Physical Format Unchanged after New Data Entity Added]
Improvements:
• Standard vocabulary and knowledge representation
• Data organization decoupled from schema
• Inferencing creates new knowledge
• Consistent rules based on standard data elements ensured across domain
• All data is Web addressable
-> Lower costs, faster TTM

Comparative Analysis
Columns: XML | Relational | Semantics
• Describes Concepts, Taxonomies, Rich Data Relationships
• Concepts Understandable to Both Humans and Machines
• Multiple Classifications and Categorizations of Data
• Logical consistency and constraint checking
• Reasoning and Inference Capabilities
• Ability to change schema/model with low impact/cost
• Potential to Deliver Faster TTM and Lower TCO
• Operational Scalability, Efficiency and
Optimization
• Industry Adoption and Prevalence of Skilled Resources
• Maturity of Tools and Software
• Current Ease of Mastery of Technology and Skills
Legend: Low, Medium, High

Potential Benefits of using Semantic Technology
Evolve Global Data Standards, Enable Data Integration and Classification
• Provides model and infrastructure to define the meaning of information in order to represent the semantics of data standards; as well as integrate, link and classify incongruent data
Reduce Complexity
• Reduces reliance on arcane legacy data structures and cryptic codes by using more meaningful, natural language friendly constructs
Reduce Costs (People and Technology)
• As understanding of data increases, costly data reconciliation efforts by analysts can be reduced
• Improved data federation and reduced data management costs can potentially be realized
Improve Agility
• As regulatory/industry views and assumptions change, semantics allows data schemas to rapidly reflect change without incurring massive data and application program restructuring efforts
Increase Functionality using Reasoning and Inferencing Capabilities
• Using logically consistent rules and semantic definitions, programs called reasoners can infer data to be classified into special business defined categories and relationships

Business and Operational Ontologies
Requirement #1: Define Uniform and Expressive Financial Data Standards
Business Ontology (AKA “conceptual model”)
• Defines Transaction types
• Defines contract types
• Defines leg roles
• Defines contract terms
• Model from Sparx Systems Enterprise Architect
Operational Ontology (Semantic Web)
• Includes only those terms which have corresponding instance data
• Narrowed for Operational use
[Diagram labels: provides source for; Agreement; is a; has party; IR Swap; swaps; IR Stream]

Anatomy of a Semantic Data Standard
Requirement #1: Define Uniform and Expressive Financial Data Standards
ODM Model RDF Type Semantic Metadata Model OWL
versionInfo SKOS altLabel DC source RDFS seeAlso SKOS definition
RDF, RDFS, OWL: W3C Semantic languages
DC: Dublin Core Metadata Elements
SKOS: Simple Knowledge Organization System
Semantic Metadata
rdf:type | LegalEntityIdentifier
skos:altLabel | LEI
skos:definition | A legal entity identifier (LEI) is a unique ID associated with a single corporate entity
dc:source | SIFMA (Securities Industry and Financial Markets Association) overview discussion of Legal Entity Identifier (http://www.sifma.org)
owl:versionInfo | Version 1.0.0
rdfs:seeAlso | Office of Financial Research; Statement on Legal Entity Identification for Financial Contracts
Community Access to Standards
• Multiple access options over the web via the authoritative standards body
• Hyperlink to semantic web standard from documents
• Community participation and interaction
• Query access via formal semantics repository including links and synonymous terms for knowledge
• Improved governance
• Provenance and evolution recorded
• Model files for download in multiple tools

Semantics can operationally classify undifferentiated Swaps and show relationships
Requirement #2: Classify Financial Instruments into Asset Classes
• Classes are inferred using rules that query the content of the data
• Data is linked together via relationships called properties
Vanilla_IR_Swap: has_Swap_Legs some Variable_Interest_Terms and has_Swap_Legs some Fixed_Interest_Terms
* Gruff 3.0 courtesy of Franz, Inc.

Semantic Representation of Contractual Provisions for Risk Classification
Requirement #3: Electronically Express Contractual Provisions
Transaction Repository, et al.
[Diagram labels: ISDA Master Agreement Schedules; Credit Support Annex Schedules; OTC Derivative Confirm; Market Reference Data; Credit Rating Agency; Counterparties; FIBO Ontology; Operational Ontology; Risk Analyst]
Steps shown: Capture Semantics of Contract Provisions; Define Axioms; Identify Key Contractual Events; Identify Key Contractual Actions; Classify Contract Type; Infer Counterparty Exposures; Classify Counterparties into Risk Categories for Analytics
Events and actions shown: Downgrade Counterparty Credit; Reduce Value of Collateral; Default Events; Termination Events; Increase Collateral; Transfer Payments
*Report on OTC Derivatives Data Reporting and Aggregation Requirements, the International Organization of Securities Commissions (IOSCO), August 2011
**Joint Study on the Feasibility of Mandating Algorithmic Descriptions for Derivatives, SEC/CFTC, April 2011
Note: OTC POC Phase 2 in process

Semantics offers Advanced Query Capabilities
Requirement #4: Link Disparate Information for Risk Analysis
[Diagram: one graph pattern repeated over Transaction Repository X, Transaction Repository Y and Transaction Repository Z for a Risk Analyst. Labels: Legal Entity; ?entity; legal Name ?legal Name; ?parent; ?swap; type; notional Amount AtRisk ?amount; Basis Swap subClassOf Interest Rate Swap; Vanilla Interest Rate Swap subClassOf Interest Rate Swap]
Note: TBD in future phase of POC
Query all Transaction Repositories to report on the sum total of aggregate exposure for all counterparties and their parents involved in all swaps associated with an interest rate swap taxonomy
• Data is queried using graph pattern matching techniques vs.
relational joins
• Queries can process inferred data and highly complex and abstract data structures
• Queries can federate across semantic endpoints (using SPARQL 1.1)
• Data can be aggregated and summarized (using SPARQL 1.1)

Semantics Offers Federation via Linked Data
Requirement #4: Link Disparate Information for Risk Analysis
• Semantically defined data that is Web addressable and “inter-linked”
• Transcends organizational boundaries and provides universal access to data wherever it resides internally within the network (and externally via “Linked Open Data”)
• Obtains data directly from its source (transparent to location, platform, schema, format)
• Can support access, queries and rollups across Swap Data Repositories
[Diagram labels: Risk Analyst; Aggregated Linked Data Query; Semantic Enterprise Information Integration (EII) Platform; Ontologies; Linked OTC Data Cloud; Swap Data Repository Database; Swap Data Repository Database; Legal Entity Data Provider; Linked Open Data Cloud (External)]
Note: TBD in future phase of POC

Semantic Usage Patterns can be Deployed Incrementally and in Tandem with Existing Technology
Requirement #5: Meet Regulatory Requirements, Control IT Costs, Deploy Incrementally
[Diagram labels: Conceptual Ontology; Business Semantics; Conceptual Models; Operational Ontology; Schema; Relational Data Schema; Inferred Data; XML; RDBMS; Relational; Rules Engine; Unstructured]
Inferencing and Classification of Source Data
• Heterogeneous source data ingested, validated for inconsistencies, and transformed by Semantic Reasoner into domain ontology to fulfill mapping rules
• Source data inferred by Reasoner, using formal axioms or rules, into abstract classifications, new data relationships/linkages
• Semantic rules engine can be optionally accessed
• Query time reasoning can be optionally utilized
Reference Ontologies
• Primarily for Human consumption
• Conceptual, design-time, and non-operational
• Community engagement and update process when
warranted
• Standard terminology, concepts and descriptions for reference, knowledge, data reconciliation, rationalization and governance
• Integrated ontologies, Upper ontologies for broader meaning
[Diagram labels: Semantic Application; Business Semantics; Conceptual Models; Operational Ontology; XML]
Operational Ontology
• Primarily for Machine consumption
• Ontologies narrowed for operational usage
• Supplements and operates in tandem with conventional technology
• Runtime access to knowledge, reference data, metadata
• Canonical domain models for mappings and interoperability
• Semantic graph pattern matching queries and automated reasoning
[Diagram labels: Operational Ontology; TBox; ABox; Semantic Application; RDBMS; Relational Schema; Inferred; XML; Rules Engine; Unstructured Data]
Data Federation and Linked Open Data
• Data semantically linked, integrated and accessed both internally and externally using RDF linked URIs which are Web addressable
• Federated query of semantic and non-semantic data stores using canonical semantic domain model for data interoperability and inferencing

OTC POC Semantic Building Blocks and Methodology
[Diagram labels: OTC POC Operational Ontologies; Upper Ontologies; FIBO Model Knowledge; Legal Entity; Legal Entity FIBO Swap Bridge; FIBO Swap; FIBO-FpML Swap Bridge; SPARQL Queries]
1) Build conceptual ontology for Swaps in FIBO
2) Build operational ontology for Swaps from FIBO
4) Build operational ontology for Legal Entities
5) Build bridging ontologies that tie together individual ontologies
9) Perform queries to fulfill regulatory use cases and reports
8) Invoke Reasoner to
a. associate data in FpML Swap Ontology to FIBO Swap Ontology
b.
classify Swap Contracts into taxonomy levels according to their attributes
3) Build operational ontology for Swaps from FpML
6) Ingest FpML Swap data into FpML Swap ontology
7) Ingest Legal Entity data into Legal Entity Ontology
[Diagram labels: Reasoning; LE Instances; Legal Entity FpML Swap Bridge; FpML Swap; FpML Instances]

OTC Derivatives Semantic POC Demonstration
• Swap Ontology
• Classification and Reasoning
• Semantic Query

Semantic Building Blocks for Financial Data Standards and Risk Management
[Diagram labels: Reasoning and Inferencing; Data Federation; Graph Pattern Matching; Inferred Conclusions and Data Linkage; Data Classification and Categorization; Data Mapping and Integration; Financial Industry Business Ontology; Holistic Data Linkages and Bridges; Runtime Knowledge; Data Traceability; Data Consistency; Mortgages; Securities; Derivatives; ...; Semantic Descriptions of Financial Data, Concepts, Relationships and Rules; Financial Data Standards; Expressivity; Data Taxonomies; Asset and Risk Categories; Trust; Conceptual Ontologies; Descriptive Human Facing Data and Knowledge Representation (FIBO); Systemic Risk Analysis; Transparency; Operational Machine Facing Implementation Ontologies; Advanced Queries; Data Rollups and Aggregation; Higher Level Concepts (Upper Ontologies); Semantic Foundations for Financial Data Management]

Making the Investment in Semantic Technology Adoption can be an Evolutionary Process that may Lead to Strategic Value
Semantic Technology:
• Is still early in its lifecycle; tools are relatively immature and language standards are still evolving, vendors are small
• Does require a learning curve to understand how the “semantic reasoner” thinks in order to best utilize the technology; which can take time and investment to develop
• Will not necessarily replace current object oriented and relational database technology in the foreseeable future; but can be used to better enable and enhance conventional technology
• Positions users that are adopters of its knowledge
representation and reasoning capabilities to achieve valuable benefits not easily achievable using conventional technologies by themselves
By embracing semantic technology and FIBO as a basis for enhancing financial industry data standards we are making a strategic investment to improve our data management capabilities by using the tools of the 21st century

Invitation to Financial Regulators, Market Authorities and the Financial Industry
• Financial regulators to support and participate in a formal collaboration with financial industry participants and standards organizations such as ISDA, ISO, XBRL, FIX, MISMO, OMG, etc. to refine and implement FIBO as the standard financial instrument and entity ontology for regulatory reporting, business processing and risk analysis
• Financial regulators to act as catalysts in forming a public/private partnership to create best practice reference architectures for operational semantic implementations.
• Continued extension of the semantic proof-of-concept work to support the analytical requirements of regulators, market authorities and financial institutions
– OTC Derivatives (Contractual Provisions, Credit Default Swaps)
– Asset Backed Securities (Mortgage Backed Securities, Collateralized Debt Obligations)
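
Illustration: the semantic axioms and swap classification as code
The axioms named in this deck (subsumption, symmetric and transitive properties, property chains) and the Vanilla_IR_Swap restriction class can be sketched with a few lines of plain Python. This is a toy forward-chaining loop over (subject, predicate, object) triples, not the POC's reasoner and not a real OWL reasoner: it only checks asserted data, so it ignores open-world semantics. The rule encodings, the function names, the hypothetical Swap_B and its legs, and the choice of which leg of Swap_100001234 is fixed are all assumptions made for illustration.

```python
"""Toy forward-chaining reasoner (illustrative sketch, not an OWL reasoner)."""
from itertools import product

# Axioms taken from the deck's examples; encoding them as Python
# sets and dicts is an assumption of this sketch.
SUBCLASS_OF = {("Mother", "Parent")}
SYMMETRIC = {"marriedTo"}
TRANSITIVE = {"hasAncestor"}
PROPERTY_CHAINS = {("hasParent", "hasSister"): "hasAunt"}


def infer(triples):
    """Apply the axioms repeatedly until no new triple is produced."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p == "type":  # subsumption: instance of subclass is instance of superclass
                new |= {(s, "type", sup) for sub, sup in SUBCLASS_OF if sub == o}
            if p in SYMMETRIC:  # relation holds in both directions
                new.add((o, p, s))
        for (s1, p1, o1), (s2, p2, o2) in product(facts, repeat=2):
            if o1 != s2:
                continue  # the two triples do not join on a shared node
            if p1 == p2 and p1 in TRANSITIVE:
                new.add((s1, p1, o2))
            if (p1, p2) in PROPERTY_CHAINS:
                new.add((s1, PROPERTY_CHAINS[(p1, p2)], o2))
        if new <= facts:  # fixpoint reached
            return facts
        facts |= new


def classify_vanilla_ir_swaps(facts):
    """Vanilla_IR_Swap: has_Swap_Legs some Fixed_Interest_Terms
    and has_Swap_Legs some Variable_Interest_Terms (asserted data only)."""
    types, legs = {}, {}
    for s, p, o in facts:
        if p == "type":
            types.setdefault(s, set()).add(o)
        elif p == "has_Swap_Legs":
            legs.setdefault(s, set()).add(o)
    required = {"Fixed_Interest_Terms", "Variable_Interest_Terms"}
    result = set()
    for swap, swap_legs in legs.items():
        leg_types = set().union(*(types.get(leg, set()) for leg in swap_legs))
        if required <= leg_types:
            result.add(swap)
    return result


FACTS = {
    ("Marge", "type", "Mother"),
    ("Homer", "marriedTo", "Marge"),
    ("Bart", "hasAncestor", "Homer"),
    ("Homer", "hasAncestor", "Abraham"),
    ("Bart", "hasParent", "Marge"),
    ("Marge", "hasSister", "Selma"),
    # Leg typing below is assumed for the example.
    ("Swap_100001234", "has_Swap_Legs", "SwapStream_1"),
    ("Swap_100001234", "has_Swap_Legs", "SwapStream_2"),
    ("SwapStream_1", "type", "Fixed_Interest_Terms"),
    ("SwapStream_2", "type", "Variable_Interest_Terms"),
    # Hypothetical swap with two variable legs: should not classify as vanilla.
    ("Swap_B", "has_Swap_Legs", "Leg_B1"),
    ("Swap_B", "has_Swap_Legs", "Leg_B2"),
    ("Leg_B1", "type", "Variable_Interest_Terms"),
    ("Leg_B2", "type", "Variable_Interest_Terms"),
}

inferred = infer(FACTS)
vanilla = classify_vanilla_ir_swaps(inferred)
print(sorted(vanilla))  # ['Swap_100001234']
```

Running the sketch derives Marge type Parent, Marge marriedTo Homer, Bart hasAncestor Abraham and Bart hasAunt Selma from the asserted facts, and classifies only Swap_100001234 as a vanilla interest rate swap. A production setup would more likely express the same axioms in OWL and delegate inference to a reasoner, as the deck describes.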