Metropolis: Buildings and Applications

Data on the Inside versus Data on the Outside
Pat Helland
Architect
Microsoft Corporation
Outline
Introduction
Data: Then and Now
Data on the Outside
Data on the Inside
Representations of Data
Conclusion
Slide 2
Slide 3
Service Oriented Architectures
Service-Orientation
Independent Services
Chunks of Code and Data
Interconnected via Messaging
Actually, we’ve been doing this for years!
We’ve just been making it more pervasive…
Services Communicate with Messages
Nothing Else
No Other Knowledge about Partner
May Be Heterogeneous
Service-A
Slide 4
Service-B
Bounding Trust via Encapsulation
Services Only Do Limited Things for Their Partners
This Is How They Bound Their Trust
Encapsulation Is About Bounding Trust
Business Logic Ensures Only the Desired Operations Happen
No Changes to the Data Occur Except Through Locally Controlled Business Logic!
Service
Things I’ll Do for Outsiders
• Deposit
• Withdrawal
• Transfer
• Account Balance Check
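A minimal sketch of the idea on this slide, assuming a hypothetical `BankService` class (the class, method names, and in-memory storage are illustrative and not from the talk). Outsiders can reach the data only through the four listed operations, and the business logic decides whether each one is allowed.

```python
class InsufficientFunds(Exception):
    """Raised when a withdrawal or transfer would overdraw an account."""


class BankService:
    """Hypothetical service: partners may call only the four operations below."""

    def __init__(self):
        # Private internal data; never handed out by reference.
        self._balances = {}

    def deposit(self, account, amount):
        self._require_positive(amount)
        self._balances[account] = self._balances.get(account, 0) + amount

    def withdraw(self, account, amount):
        self._require_positive(amount)
        if self._balances.get(account, 0) < amount:
            raise InsufficientFunds(account)
        self._balances[account] -= amount

    def transfer(self, source, target, amount):
        # Both halves run under the service's own business logic.
        self.withdraw(source, amount)
        self.deposit(target, amount)

    def balance(self, account):
        # Returns a value, not a handle to the internal state.
        return self._balances.get(account, 0)

    @staticmethod
    def _require_positive(amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
```

The only way to change a balance is through a method that checks the request first, which is how the service bounds its trust in this sketch.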
Slide 5
Encapsulating Both Change and Reads
Encapsulating Change
Ensures Integrity of the Service’s Work
Ensures Integrity of the Service’s Data
Encapsulating Exported Data for Read
Ensures Privacy by Controlling What’s Exported
Allows Planning for Loose Coupling and Expirations
E.g. Wednesday’s Price-List
[Diagram labels: Business Request; Private Internal Data; Sanitized Data for Export; Exported Data]
Slide 6
Trust and Transactions
For This Talk, Services Do Not Share Transactions!
This Ends Up Being a Definitional (Terminology) Issue
Clearly Some Bodies of Code Are Distrusting of Each Other
Those Bodies of Code Will Not Hold Locks for the Partner
Services With Intermittent Connectivity Won’t Do 2-Phase Commit
We Are Considering the Implications of These Cases
The Word Service Is Being Used for Not Sharing Transactions!
Service-A
Slide 7
Atomic “ACID” Transaction
Service-B
Data Inside and Outside Services
Data Is Different Inside from Outside
Outside the Service
Passed in Messages
Understood by Sender and Receiver
Independent Schema Definition Important
Extensibility Important
Inside the Service
Private to Service
Encapsulated by Service Code
[Diagram labels: MSG (Data Outside the Service); Data in SQL (Data Inside the Service)]
Slide 8
Operators and Operands
Messages Contain Operators
Requests a Business Operation
Operators Provide Business Semantics
Part of the Contract between the Two Services
Operator Messages Contain Operands
Details Needed To Do the Business Operation
The Sending Service Must Put Them into the Message
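As a rough illustration of operators and operands, the sketch below models a message as an operator name plus a read-only mapping of operands. The `OperatorMessage` type, the `make_deposit_request` and `handle` helpers, and the field names are assumptions made for this example only.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class OperatorMessage:
    operator: str               # the requested business operation, e.g. "Deposit"
    operands: MappingProxyType  # details the receiver needs to do the work


def make_deposit_request(account, amount):
    # The sending service puts every needed operand into the message.
    operands = MappingProxyType({"account": account, "amount": amount})
    return OperatorMessage("Deposit", operands)


def handle(message, balances):
    # The receiving service interprets the operator per the shared contract.
    if message.operator != "Deposit":
        raise ValueError(f"unknown operator: {message.operator}")
    account = message.operands["account"]
    balances[account] = balances.get(account, 0) + message.operands["amount"]
```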
Service
Deposit
Operands
Slide 9
Operator
Outline
Introduction
Data: Then and Now
Data on the Outside
Data on the Inside
Representations of Data
Conclusion
Slide 10
Transactions and Inside Data
Transactions Make You Feel Alone
No One Else Manipulates the Data When You Are
Transactional Serializability
The Behavior Is As If a Serial Order Exists
[Diagram: Transaction Serializability. Transactions Ta through To are arranged around Ti: some Precede Ti, some Follow Ti, and Ti Doesn’t Know About the rest and They Don’t Know About Ti]
Slide 11
Life in the “Now”
Transactions Live in the “Now” Inside Services
Time Marches Forward
Transactions Commit
Advancing Time
Transactions See the Committed Transactions
Service
A Service’s Biz-Logic Lives in the “Now”
Each Transaction Only Sees a Simple Advancing of Time with a Clear Set of Preceding Transactions
Slide 12
Sending Unlocked Data Isn’t “Now”
Messages Contain Unlocked Data
Assume No Shared Transactions
Unlocked Data May Change
Unlocking It Allows Change
Messages Are Not From the “Now”
They Are From the Past
There Is No Simultaneity At a Distance!
• Similar to Speed of Light
• Knowledge Travels at Speed of Light
• By the Time You See a Distant Object It May Have Changed!
• By the Time You See a Message, the Data May Have Changed!
Services, Transactions, and Locks Bound Simultaneity!
• Inside a Transaction, Things Appear Simultaneous (to Others)
• Simultaneity Only Inside a Transaction!
• Simultaneity Only Inside a Service!
Slide 13
Outside Data: a Blast from the Past
All Data From Distant Stars Is From the Past
• 10 Light Years Away; 10 Year Old Knowledge
• The Sun May Have Blown Up 5 Minutes Ago
• We Won’t Know for 3 Minutes More…
All Data Seen From a Distant Service Is From the “Past”
By the Time You See It, It Has Been Unlocked and May Change
Each Service Has Its Own Perspective
Inside Data Is “Now”; Outside Data Is “Past”
My Inside Is Not Your Inside; My Outside Is Not Your Outside
Going to SOA Is Like Going From Newtonian to Einsteinian Physics
• Newton’s Time Marched Forward Uniformly
• Instant Knowledge
• Before SOA, Distributed Computing Tried to Make Many Systems Look Like One
• RPC, 2-Phase Commit, Remote Method Calls…
• In Einstein’s World, Everything Is “Relative” To One’s Perspective
• SOA Has “Now” Inside and the “Past” Arriving in Messages
Slide 14
Versioned Images of a Single Source
A Sequence of Versions Describing Changes to Data
Updates From One Service
Owner Controlled
Owner Changes the Data
Sends Changes as Messages
Data Is Seen As Advancing Versions
[Diagram: the Data Owning Service publishes Monday’s, Tuesday’s, and Wednesday’s Price-Lists; Listening Partner Service-1, Service-5, Service-7, and Service-8 each hold one or more of these versions]
Slide 15
Operators: Hope for the Future
Messages May Contain Operators
Requests for Business Functionality Part of the Contract
Service-B Sends an Operator to Service-A
If Service-A Accepts the Operator, It Is Part of Its Future
It Changes the State of Service-A
Service-B Is Hopeful
It Wants Service-A To Do the Work
When It Receives a Reply, Its Future Is Changed!
[Diagram: Invoking Partner Service-B sends an Operator Request to Invoked Partner Service-A and receives an Operator Response]
Service-B callouts:
• Hopeful for the Future… Decides to Issue Request
• Ever Hopeful, Waiting for a Response
• Hopes Fulfilled, the Future Is Now
Service-A callouts:
• Blithely Ignorant and Minding Its Own Business
• A Future Forever Altered by the Processing of the Request from Service-B
Slide 16
Operands: Past and Future
Operands May Live in the Past
Values Published As Reference Data
Come From Service-A’s Past
Service-B Preparing a Request for Service-A
[Diagram labels: Deposit Operator with Operands; Friday’s Price-List, Published: 11PM Thursday]
On Friday, Operands Are Extracted from the Price-List Published on Thursday
Operands May Live in the Future
They May Contain a Proposed Value Submitted to Service-A
Slide 17
Between Services: Life in the “Then”
Everything Between Services Lives in the Past or Future
Operators Live in the Future
Operands Live in the Past or the Future
It’s Not Meaningful to Speak of “Now” Between Services
No Shared Transactions → No Simultaneity
[Diagram: Service-1, Service-2, Service-3, and Service-4, with these callouts]
• Life in the “Then”: Past or Future, Not Now
• Each Service Has a Separate “Now”
• Different Temporal Environments!
• No Notion of “Now” in Between Services!
Slide 18
Services: Dealing with “Now” and “Then”
Services Make the “Now” Meet the “Then”
Each Service Lives in Its Own “Now”
Messages Come and Go Dealing with the “Then”
The Business-Logic of the Service Must Reconcile This!!
Example: Accepting an Order
• A Biz Publishes Daily Prices
• Probably Want to Accept Yesterday’s Prices for a While
• Tolerance for Time Differences Must Be Programmed
Example: “Usually Ships in 24 Hours”
• Order Processing Has Old Info
• Available Inventory Not Accurate
• Deliberately “Fuzzy”
• Allows Both Sides to Cope with Difference in Time Domains!
The World Is No Longer Flat!
• SOA Is Recognizing That There Is More Than One Computer
• Multiple Machines Mean Multiple Time Domains
• Multiple Time Domains Mandate We Cope with Ambiguity to Allow Coexistence, Cooperation, and Joint Work
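The "Accepting an Order" example can be sketched as code. This is only an illustration under stated assumptions: the `PRICE_LISTS` data, the one-day `TOLERANCE`, and the `accept_order` function are hypothetical, and prices are held in integer cents.

```python
from datetime import date, timedelta

# Hypothetical published price lists (in cents), keyed by publication date.
PRICE_LISTS = {
    date(2005, 1, 4): {"widget": 1000},
    date(2005, 1, 5): {"widget": 1100},
}

# Assumed business rule: keep honoring a price list for one day after it is replaced.
TOLERANCE = timedelta(days=1)


def accept_order(item, quoted_price, price_list_date, today):
    """Accept an order only if it quotes a recent enough published price."""
    age = today - price_list_date
    if age < timedelta(0) or age > TOLERANCE:
        return False  # the quoted list is from the future or too old
    published = PRICE_LISTS.get(price_list_date, {})
    return published.get(item) == quoted_price
```

The order names the price list it was built from, so the receiving service can decide in its own "now" whether that piece of the past is still acceptable.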
Slide 19
Outline
Introduction
Data: Then and Now
Data on the Outside
Data on the Inside
Representations of Data
Conclusion
Slide 20
Immutable And/Or Versioned Data
Data May Be Immutable
Once Written, It Is Unchangeable
• Windows NT4, SP1: The Same Set of Bits Every Time
Immutable Data Needs an ID
From the ID, Comes the Same Data
No Matter When, No Matter Where
Versions Are Immutable
Each New Version Is Identified
Given the Identifier, the Same Data Comes
Version Independent Identifiers
Let You Ask for a Recent Version
• New York Times; 1/6/05: Specific Version of the Paper -- Contents Don’t Change
• Recent NY Times (Version Independent): Maybe Today’s, Maybe Yesterday’s
• Latest SP of NT4 (Version Independent): Definitely NT4, Results Vary Over Time
Slide 21
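The difference between a versioned identifier and a version independent identifier can be sketched with a small in-memory store. The `VersionedStore` class and its method names are assumptions for illustration, not something the talk defines.

```python
class VersionedStore:
    """Hypothetical store: each (name, version) pair is written exactly once."""

    def __init__(self):
        self._versions = {}  # (name, version) -> immutable content
        self._latest = {}    # name -> highest version published so far

    def publish(self, name, version, content):
        key = (name, version)
        if key in self._versions:
            raise ValueError(f"{key} already published; versions are immutable")
        self._versions[key] = content
        self._latest[name] = max(version, self._latest.get(name, version))

    def get(self, name, version):
        # Versioned identifier: the same data, no matter when or where.
        return self._versions[(name, version)]

    def get_latest(self, name):
        # Version independent identifier: the answer varies over time.
        return self._versions[(name, self._latest[name])]
```

A caller holding `("NT4-SP", 1)` always gets the same bits, while a caller asking for the latest service pack may get a different answer after the next `publish`.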
Immutability of Messages
Retries are a Fact of Life
Zero or more delivery semantics
Messages Must Be Immutable
Retries Must Not See Differences…
Once It’s Sent, You Can’t Un-send!
Service-A
Once It’s Outside,
It’s Immutable!
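One common way to cope with retries is for the receiver to remember which message IDs it has already processed. The sketch below assumes that technique; the talk itself only states that messages must be immutable and that retries must not see differences. The `Receiver` class and the message ID scheme are hypothetical.

```python
class Receiver:
    """Hypothetical receiving service that tolerates redelivered messages."""

    def __init__(self):
        self._replies = {}  # message_id -> reply produced the first time
        self.balance = 0

    def receive(self, message_id, amount):
        # A retry carries the same ID and the same immutable contents,
        # so the original reply is returned and the work is not repeated.
        if message_id in self._replies:
            return self._replies[message_id]
        self.balance += amount
        reply = f"deposited {amount}, balance {self.balance}"
        self._replies[message_id] = reply
        return reply
```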
Slide 22
Stability Of Data
Immutability Isn’t Enough!
We Need a Common Understanding
President Bush in 1990 vs. President Bush in 2005
Stable Data Has a Clearly Understood Meaning
The Interpretation of Values Must Be Unambiguous
Suggestion
• Timestamping or Versioning Makes Stable Data
Advice
• Don’t Recycle Customer-IDs
Observation
• A Monthly Bank Statement Is Stable Data
Observation
• Anything Called “Current” Is Not Stable
Slide 23
Schema and Immutable Messages
When a Message Is Sent, It Must Be Immutable
It Is Crossing Temporal Boundaries
Retries Mustn’t Give Different Results
The Message’s Schema Must Be Immutable
It Makes a Mess If the Interpretation of the Message Changes
[Diagram labels: Service-A; Immutable Message; Immutable Schema for the Message]
Slide 24
Schema Versions Are Immutable
• A Message Should Reference a Specific Version of Its Schema
• The Schema Can Then Evolve Without Invalidating the Schema for the Existing Messages…
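A small sketch of a message that names the exact schema version it was written against. The `SCHEMAS` registry, the field names, and the `interpret` function are assumptions for illustration.

```python
# Hypothetical registry: each (schema name, version) entry is never modified.
SCHEMAS = {
    ("order", 1): ("item", "quantity"),
    ("order", 2): ("item", "quantity", "currency"),  # added later; v1 is untouched
}


def interpret(message):
    """Read a message using the schema version it references."""
    fields = SCHEMAS[(message["schema"], message["schema_version"])]
    missing = [name for name in fields if name not in message["body"]]
    if missing:
        raise ValueError(f"message lacks fields: {missing}")
    return {name: message["body"][name] for name in fields}
```

A message written against version 1 keeps the same interpretation after version 2 appears, because it still resolves to the version 1 entry.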
Reference-Based Data, Immutability, and Directed Acyclic Graphs
Messages Must Be Interpreted Correctly Across Time
Stable Values Are Essential
References to Other Data Must Be Unambiguous Across Time
Immutable and Stable Contents
Referenced Structures Can’t Change in Content or Interpretation
Only Works to Reference Pre-Existing Stuff that Doesn’t Change
Version Independent References
Can Be Used with Caution
The Semantics of a Structure with Version Independent References Will Change over Time… Be Careful!
[Diagram: Data “A” through Data “H”, Msg-I, and Msg-J, linked by references]
Slide 25
DAGs of History
[Diagram: versioned data items A1, A1.1, A2; B1, B2, B3; C1, C2, C2.1, C3; D1, D1.1, D1.2, D2, D2.1, D3, spread across Service-1, Service-2, Service-3, and Service-4]
Slide 26
Outline
Introduction
Data: Then and Now
Data on the Outside
Data on the Inside
Representations of Data
Conclusion
Slide 27
Storing Incoming Data
When Data Arrives from the Outside, You Store It Inside
Most Services Keep Incoming Data
Keep for Processing
Keep for Auditing
[Diagram labels: Incoming Data; Inside Data]
Slide 28
SQL, DDL, and Serializability
SQL’s DDL (Data Definition Language) is Transactional
Changes Are Made Using Transactions
The Structure of the Data May Be Changed
The Interpretation After the DDL Change Is Different
DDL Lives Within the Time Scope of the Database
The Database’s Shape Evolves Over Time
DDL Is the Change Agent for This Evolution
SQL Lives in the “Now”
Each Transaction’s Execution Is Meaningful Only Within the Schema Definition at the Moment of Its Execution
Serializability Makes This Crisp and Well-Defined
Slide 29
Extensibility versus Shredding
Shredding the Message
The Incoming Data Is Broken Down to Relational Form
Empowers Query and Business Intelligence
Auditing Considerations
Typically, Don’t Want to Change the Message Image
Preserve for Auditing
May Keep Unshredded Version Also for Non-Repudiation
Extensibility
The Sender Added Stuff You Didn’t Expect
May or May Not Know How to Utilize Extensions
Extensibility Fights Shredding!
Hard To Map Extensions To Planned Relational Tables
OK To Partially Shred
Yields Partial Query Benefits
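Partial shredding can be sketched with Python's built-in `sqlite3` and `json` modules. The table layout, the `KNOWN_FIELDS` tuple, and the choice of JSON as the message encoding are assumptions made for this example (the talk discusses XML messages). Known fields go into relational columns for query, and the unmodified message image is kept alongside for auditing and for extensions the receiver does not understand.

```python
import json
import sqlite3

# Hypothetical fields the receiver planned relational columns for.
KNOWN_FIELDS = ("order_id", "item", "quantity")


def open_store():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE orders ("
        "order_id TEXT PRIMARY KEY, item TEXT, quantity INTEGER, "
        "raw TEXT NOT NULL)"  # raw = the unshredded message image
    )
    return db


def store_message(db, raw_message):
    doc = json.loads(raw_message)
    # Shred only the expected fields; extensions stay in the raw image.
    row = [doc.get(field) for field in KNOWN_FIELDS]
    db.execute("INSERT INTO orders VALUES (?, ?, ?, ?)", (*row, raw_message))
    db.commit()
```

Queries can use the shredded columns, while an unexpected field added by the sender remains recoverable from the `raw` column.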
Slide 30
Encapsulation of Inside Data
Inside Data Is Encapsulated Behind the Business Logic of the Service
Access To the Data Can Be Through the Logic
Occasionally, Subsets of the Inside Data Are Filtered and Shipped Outside
Inside Data
Slide 31
Outline
Introduction
Data: Then and Now
Data on the Outside
Data on the Inside
Representations of Data
Conclusion
Slide 32
XML, SQL, and Objects
XML
Schematized Representation of Messages
Hierarchical Structure
Schema Supports Independent Definition and Extensibility
SQL
Stores Relational Data by Value
Allows You to “Relate” Fields by Values
Incredible Query Capabilities
Rectangular Representation
Objects
Very Powerful Software Engineering Tool
Based on Encapsulation
Slide 33
[Diagram labels: Data; SQL]
Bounded and Unbounded Data Representations
Relational Is Bounded
Operations Within the Database
Value Comparisons Only Meaningful Inside
Tightly Managed Schema
XML-Infoset Is Unbounded
Open (Extensible) Schema
Contributions to Schema from Who-Knows-Where
References (Not Just Values)
URIs Known to Be Unique
XML-Infosets Can Be Interpreted Anywhere
Slide 34
Encapsulation and Anti-Encapsulation
SQL Is Anti-Encapsulated
UPDATE WHERE
Query/Update by Joining Anything with Anything
Triggers/Stored-Procs Are Not Strongly Tied to Protected Data
XML Is Anti-Encapsulated
Please Examine My Public Schema!
Components/Objects Offer Encapsulation
Long Tradition of Cheating:
Reference Passing to Shared Objects
Whacking on Shared Database
Slide 35
A Service’s View of Encapsulation
Anti-Encapsulation Is OK in Its Place
SQL’s Anti-Encapsulation Is Only Seen by the Local Biz-Logic
XML’s Anti-Encapsulation Only Applies to the “Public” Behavior and Data of the Service
Encapsulation Is Strongly Enforced by the Service
No Visibility Is Allowed to the Internals of the Service!
The Service Is a Black Box!
[Diagram labels: Business Request; Private Internal Data; Sanitized Data for Export; Exported Data]
Slide 36
What About Persistent Objects?
Persistent Objects
Encapsulated by Logic
Kept in SQL
Uses Optimistic Concurrency (Low Update)
Stored as Collection of Records
May Use Records in Many Tables
Keys of Records Prefixed with Unique ID
This is the Object ID
Encapsulation by Convention
Encapsulation Broken by Business Intelligence
[Diagram: Persistent Object ID=Y stored in SQL. Table-A holds records whose Database-Key is ID-X <key>, ID-Y <key>, or ID-Z <key>. Table-B holds records whose Database-Key is ID-X <key1>, ID-X <key2>, ID-X <key3>, ID-Y <key1>, or ID-Y <key2>. Each key maps to a <record>]
Slide 37
Characteristics of Inside versus Outside
Temporal Nature
• Inside Data: NOW
• Outside Data: THEN
Schema Definition
• Inside Data: Tightly Defined (within DB Bounds; within a Transaction)
• Outside Data: Independent Definition; Compose-able from Independent Pieces
Need for Encapsulation
• Inside Data: Encapsulation at the Service Boundary; Services Are Big So We Need Objects Inside ‘Em
• Outside Data: Just Data; No Behavior
Updateability
• Inside Data: Classic DB Stuff; Assume We Need Normalization
• Outside Data: Write Once; Read Many
Queryability
• Inside Data: Classic DB Stuff
• Outside Data: Must Integrate Schemas; What Are Cross-Schema Semantics?
Slide 38
Today’s Ruling Triumvirate
SQL: It is fantastic to compare anything to anything and combine anything with anything in Relational (within the bounded database).
XML: It is possible to have independent definition of schema and data in XML-Infosets. You can independently extend, too.
Components/Objects: Provide encapsulation of data behind logic. Ensure enforcement of business rules. Eases composition of logic.
Strengths and Weaknesses
SQL (Bounded Schema)
• Arbitrary Queries: Outstanding
• Independent Data Definition: Impossible (Centralized Schema)
• Encapsulation (Controls Data): Impossible (Not via SQL; Enforced by DBA)
XML (Unbounded Schema)
• Arbitrary Queries: Problematic (Schema inconsistency)
• Independent Data Definition: Outstanding
• Encapsulation (Controls Data): Impossible (Open Schema)
Objects (Encapsulated Data)
• Arbitrary Queries: Impossible (Can’t see the data!)
• Independent Data Definition: Impossible (Can’t see the data!)
• Encapsulation (Controls Data): Outstanding
Each model’s strength is simultaneously its weakness!
You can’t enhance one to add features of the other without breaking it!
Slide 39
Footnote: Arguably, SQL constrains the data semantics to avoid problems and XML is a superset allowing the flexibility to get into problems SQL avoids.
Outline
Introduction
Data: Then and Now
Data on the Outside
Data on the Inside
Representations of Data
Conclusion
Slide 40
Putting It All Together!
Today, Services Need All Three!
XML-Infosets: Between the Services
Objects:
Implementing the Business Logic
SQL:
Storing Private Data and Messages
[Diagram callouts]
• XML-InfoSets for Messages Between Services
• Objects Implement the Biz Logic
• SQL Holds the Data
Slide 41
Data Inside and Outside Services
Data Is Different Inside from Outside
Outside the Service
Passed in Messages
Understood by Sender and Receiver
Independent Schema Definition Important
Extensibility Important
Inside the Service
Private to Service
Encapsulated by Service Code
[Diagram labels: MSG (Data Outside the Service); Data in SQL (Data Inside the Service)]
Slide 42
Resources
http://msdn.microsoft.com/architecture
www.PatHelland.com
http://blogs.msdn.com/PatHelland
Slide 43