CS440 (Advanced Information Modeling) Lecture 1


Model Driven Development with ORM 2 and NORMA Terry Halpin

Neumont University


www.orm.net

© 2007 T. Halpin & Neumont University

• ORM Features, History, and Tool Support
• The Data Modeling Process
• ORM’s Graphical Language
• Comparison with ER and UML
• Case Study
• Relational Mapping and Tool Demos

ORM Features, History, and Tool Support

Object-Role Modeling (ORM)

• A conceptual approach for modeling, querying, and transforming data
• Fact-oriented (attribute-free): all facts are modeled as relationships (unary, binary, ternary, …)¹
• Semantic stability (no remodeling or requerying to talk about an attribute)
• Facilitates validation by verbalization and population
• Richly expressive graphical constraint language (compared with industrial ER or UML class diagrams)

¹ The OMG’s SBVR approach is also fact-oriented.

Modeling Approach = Modeling Procedure(s) + Modeling Language(s)

ORM includes procedures for conceptual modeling and for mapping (transforming) ORM models and queries to attribute-based models (e.g. Relational, OO, ER, UML, XSD, OWL).

ORM includes graphical and textual modeling languages and a textual query language.

ORM History and Tool Support

• Originated in the 1970s in Europe
• Various flavors (NIAM2007, ORM 2, FCO-IM, etc.)
• First ORM tools developed at Control Data labs in Brussels (IAST, RIDL*)
• USA tools: InfoDesigner, InfoModeler, ActiveQuery, and the ORM Source Model solution in Microsoft Visio for Enterprise Architects
• Current European tools: CaseTalk, Infagon, CogNIAM
• ORM 2 tool under development: NORMA (open source plug-in to Visual Studio .NET)

This presentation focuses on ORM 2 (second-generation ORM).

The Data Modeling Process

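Going from a process use case to a data model

[Figure: an ATM use case diagram (actor Client; use cases Deposit, Withdraw, Transfer between accounts, Get Balance) beside an ORM schema with object types Client (.nr), Account (.nr), AccountType (.code), Transaction (.nr), TransactionType (.code), and MoneyAmount (USD), connected by the predicates uses, has, holds, is on, involves, and is of.]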

Data Use Cases

• For data modeling, we need DATA use cases (cases of data being used), e.g.
  – Sample reports
  – Sample input forms
  – Sample queries
• How to go from a data use case to a data model?
  – Have the domain expert verbalize the data
  – Rephrase this as unambiguous, elementary facts
  – Add and validate the business rules constraining the data

Analysis is a Joint Activity

• The domain expert best understands the business domain
• The modeler elicits and formalizes this understanding
• The modeler assists the domain expert to identify the business rules related to the data (constraints or derivation rules)
• The modeler validates the model with the client by
  – Verbalizing the model in natural language
  – Populating the model with positive/negative examples

Data (uninterpreted syntax): Patient# 571, Temperature 100

Fact (proposition taken to be true): The Patient with Patient# ‘571’ has a Temperature of 100 °F.

Information = data + semantics

Elementary Facts

An elementary fact is an assertion that
• an object has a property*, or
• one or more objects participate in a relationship**
where the fact cannot be split into simpler facts with the same objects (without information loss).

* plays a role by itself
** play different roles in the same association

[Figure: Person smokes (unary); Person drives Car (binary).]

A unary fact: The Person named ‘Jack Smith’ smokes.

A binary fact: The Executive named ‘William Portals’ climbed the Mountain named ‘Mt Rainier’.

A ternary fact: The Person named ‘Don Bradman’ played the Sport named ‘Cricket’ for the Country named ‘Australia’.

Facts may also be of higher arity (4 or more roles).

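An Old but Fun Example

[Figure: a sample table of names for beer serves by size (4, 10, 20, … fl oz) and state (QLD, NSW, WA, …), with entries such as Shetland Pony, Pot, Middy, Ten, and Pint, modeled as a ternary fact type with the reading “… calls … by …” over State (.code), BeerServe (fl oz:), and CommonName. Sample facts include QLD, 10, Pot and NSW, 10, Middy.]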

Overall Procedure: Information systems life cycle

• Feasibility study
• Requirements analysis
• Conceptual design (data, process)
• Logical design (data, process)
• External design (data, process)
• Prototyping
• Internal design and implementation
• Testing, validation, and maintenance

Large projects are often developed iteratively.

Conceptual analysis

usually involves:
• high level service (essential business process) modeling
• information modeling

For large applications:
• divide the UoD (universe of discourse) into manageable sub-sections
• prioritize the order in which sub-sections will be modeled
• apply the Conceptual Schema Design Procedure (CSDP) to each sub-section
• integrate the subschemas into a global conceptual schema

Many applications build on existing applications:
• reverse-engineer the existing model(s) to a conceptual model
• refine the conceptual model to fit the new business needs

Conceptual schema design procedure

1. Transform familiar examples into elementary facts and apply quality checks
2. Draw the fact types and apply a population check
3. Check for entity types that should be combined and note any arithmetic derivations
4. Add uniqueness constraints and check arity of fact types
5. Add mandatory role constraints and check for logical derivations
6. Add value, set comparison, and subtyping constraints
7. Add other constraints and perform final checks

ORM’s Graphical Language

ORM 2 Graphical Modeling Language

Object = Entity or Value

Entity = Object that is identified by a definite description.
Entities typically change their state over time.
Entities may be concrete or abstract, e.g.
• The Country that has CountryCode ‘AU’.
• The President named ‘Abraham Lincoln’.
• The Course with course code ‘CS542’.

Entity types are depicted as named, soft rectangles. As a configuration option, soft rectangles may be replaced by hard rectangles or ellipses.

[Figure: the entity types Country and President drawn in the alternative shapes.]

Value = Lexical Constant (typically a character string or number).
Values are literal and cannot change their state, e.g.
• The CountryCode ‘AU’.
• The PresidentName ‘Abraham Lincoln’.
• The CourseCode ‘CS542’.
• The RoomNumber ‘207’.
• The SerialNumber 1090.

Value types are depicted as named, dashed rectangles. Optionally, ellipses or hard rectangles may be used instead.

[Figure: the value types CountryCode, PresidentName, CourseCode, RoomNr, and SerialNumber drawn as dashed shapes.]

Many entities are identified by their relationship to a simple value.

If this is true for all instances of their entity type, the reference (identification) scheme for their entity type may be displayed as a reference mode in parentheses.

The reference mode may be popular, unit-based, or general.

A popular reference mode has a corresponding value type that is used to identify entities of one type only, and is preceded by a dot, e.g.

Country (.code) Course (.code) President (.name) Product (.name) Building (.nr) Employee (.nr)

The value type name appends the reference mode name to the entity type name, with a user-definable format that may include a separator, e.g.

CountryCode, CourseCode, PresidentName, etc.
Country_Code, Course_Code, President_Name, etc.

A unit-based (or measurement) reference mode uses a unit based on some unit dimension (whose display is often suppressed).¹ A colon “:” is appended to the unit, e.g.

Height (cm:) Width (cm:) Tax (USD:) CostPrice (USD:) MoneyAmount (USD:)

The value type name appends “Value” to the reference mode name (if the language is English), with a user-definable format, e.g.

cmValue, USDValue
cm_Value, USD_Value

If desired, the unit type may be displayed after the colon, e.g.

Height (cm: Length) Tax (USD: Money)

¹ Support for unit-based reference etc. is expected by end 2007.

Different units based on the same unit dimension are permitted in the same model, e.g.

Tax (USD: Money) Fee (XEU: Money) Height (cm: Length) Distance (km: Length)

General reference modes have the same name as their value type. The value type may be used to reference multiple entity types, e.g.

Book (ISBN) Website (URL) Link (URL)
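The naming rules for the three kinds of reference mode can be sketched in a few lines of Python. This is only an illustration of the rules stated above: the function name `value_type_name` and its parameters are hypothetical and are not part of NORMA or any ORM tool.

```python
def value_type_name(entity_type: str, ref_mode: str, kind: str, separator: str = "") -> str:
    """Return the value type name implied by a reference mode (illustrative sketch only)."""
    if kind == "popular":
        # Entity type name + reference mode name, e.g. Country (.code) -> CountryCode
        return f"{entity_type}{separator}{ref_mode[0].upper()}{ref_mode[1:]}"
    if kind == "unit":
        # Reference mode name + "Value", e.g. Height (cm:) -> cmValue
        return f"{ref_mode}{separator}Value"
    if kind == "general":
        # Same name as the reference mode, e.g. Book (ISBN) -> ISBN
        return ref_mode
    raise ValueError(f"unknown reference mode kind: {kind}")


print(value_type_name("Country", "code", "popular"))         # CountryCode
print(value_type_name("President", "name", "popular", "_"))  # President_Name
print(value_type_name("Height", "cm", "unit"))               # cmValue
print(value_type_name("Website", "URL", "general"))          # URL
```

Because a general reference mode ignores the entity type name, Website (URL) and Link (URL) both resolve to the same value type, which is why it can reference multiple entity types.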

An independent object type may have instances that exist in the model without participating in any other relationships. Independent object types have a “!” placed after their name, e.g. Country ! (.code)

If an object type shape is duplicated in the diagram (either on the same page or on different pages), this is shown by a shadow.

An external object type is defined in another model. The display notation “^” is tentative, e.g. Address^

Predicates (relationships) have one or more roles, each played by instances of a single object type.

[Figure: predicate shapes with readings R, R / S, and S, and a predicate R on object type A with named roles [role1] and [role2].]

Predicate readings may be shown in mixfix notation¹ using “…” as an object placeholder, e.g. “… introduced … to …”. For unary and binary predicates with no leading or trailing text, the placeholder may be omitted, e.g. “smokes” (i.e. “… smokes”) and “likes” (i.e. “… likes …”).

Roles may also be named.

Duplicate predicate shapes are shadowed.

¹ Mixfix allows natural verbalization of predicates of any arity, and of non-infix predicates (common in many foreign languages).
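As a rough illustration of how a mixfix reading turns into a sentence, the following Python sketch fills the “…” placeholders with object terms, and restores omitted placeholders for unary and binary predicates. The function `verbalize` is a hypothetical helper written for this example, not NORMA's verbalizer.

```python
PLACEHOLDER = "…"


def verbalize(reading: str, objects: list[str]) -> str:
    """Fill the object placeholders of a mixfix predicate reading (illustrative sketch)."""
    if PLACEHOLDER not in reading:
        # Unary and binary readings with no leading or trailing text may omit placeholders.
        if len(objects) == 1:
            reading = f"{PLACEHOLDER} {reading}"
        elif len(objects) == 2:
            reading = f"{PLACEHOLDER} {reading} {PLACEHOLDER}"
        else:
            raise ValueError("placeholders may only be omitted for unary or binary predicates")
    parts = reading.split(PLACEHOLDER)
    if len(parts) - 1 != len(objects):
        raise ValueError("number of placeholders must equal the number of objects")
    sentence = parts[0]
    for obj, text in zip(objects, parts[1:]):
        sentence += obj + text
    return sentence.strip() + "."


print(verbalize("smokes", ["Person"]))
print(verbalize("likes", ["Person", "Person"]))
print(verbalize("… introduced … to …", ["Person", "Person", "Person"]))
```

The same function handles any arity, because the number of roles is simply the number of placeholders in the reading.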

For binary predicates, forward and inverse readings may be shown separated by “/”. Alternatively, arrow tips may be used.

Combining a predicate with its object type(s) forms an elementary or compound fact type.

An elementary fact can’t be split into smaller facts with the same objects without information loss, e.g.
• Person smokes
• Person was born in Country
• Department [employer] employs Person [employee]
• Person reports to / manages Person [manager]
• Person played Sport for Country (reading “… played … for …”)

A compound fact type includes two or more fact types, and if used in a model must be declared to be derived, e.g. Person drives Car and Car is imported from Country may be combined as the derived (*) fact type with the reading “… drives … imported from …”.

An existential fact (or reference) simply asserts the existence of an object, e.g. There exists a Country that has CountryCode ‘US’.

Existential fact types are displayed either using a reference mode or an explicit relationship, e.g.

Country (.code)
Country has / refers to CountryCode

This includes constraints (see later).

An elementary fact type may be objectified, resulting in another object type, e.g. Student enrolled in Course may be objectified as “Enrollment”.

A fact type may be:
• asserted (base fact type), shown as R
• fully derived, shown as R *
• derived and stored, shown as R **
• semi-derived, shown as R +

Uniqueness constraints require instances of their role or role sequence to be unique in the role or role sequence population.

Internal uniqueness constraints apply to a single predicate and are depicted by bar(s) over the constrained role(s).

External uniqueness constraints apply to roles from different predicates and are depicted by circled bars connected to the roles.

If the constraint applies to role(s) used to provide the preferred identification scheme for an object type, a double-bar is used.


A simple mandatory role constraint requires its role to be played by all instances of its object type’s population and is shown by a solid dot at either end of the role-type connector.

An inclusive-or (or disjunctive mandatory role) constraint requires at least one of its roles to be played by all instances of its object type’s population and is shown by a circled, solid dot connected to the roles.

e.g.

[Figure: ORM schema fragments with object types Room, Building (.nr), RoomNr, Language (.name), Consultant (.nr), Passport (.nr), and DriverLicense (.nr), and predicates is in, has, and is spoken by.]

Room (.nr) … at HourSlot (dhCode) … is booked for Activity (.code)

Room  HourSlot    Activity
20    Mon 9 a.m.  ORC
20    Tue 2 p.m.  ORC
33    Mon 9 a.m.  XQC
33    Fri 5 p.m.  STP
20    Mon 9 a.m.  XQC
33    Mon 9 a.m.  ORC

Activity (.code) has / refers to ActivityName

Activity  ActivityName
ORC       ORM class
STM       Staff Meeting
STP       Staff Party
XQC       XQuery class
STP       Staff Planning
SPY       Staff Party

Object Value Constraints

Enumeration: A {a1, a2, a3}
Range: A {a1 .. an}
Semi-bounded discrete range: {a ..}
Bounded continuous range:
  {[a1 .. a2]}  includes both end values
  {(a1 .. a2)}  excludes both end values
  {[a1 .. a2)}  includes first value
  {(a1 .. a2]}  includes last value

Role Value Constraints

A {a1, a2} (same patterns as above)
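The bracket conventions for ranges can be made concrete with a small Python sketch. The `Range` class and `satisfies` function are hypothetical names chosen for this example; the sketch only assumes the inclusive/exclusive semantics listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Range:
    """A value range; a bound of None means the range is unbounded on that side."""
    low: Optional[float] = None
    high: Optional[float] = None
    include_low: bool = True    # "[" includes the first value, "(" excludes it
    include_high: bool = True   # "]" includes the last value, ")" excludes it

    def allows(self, value: float) -> bool:
        if self.low is not None:
            if value < self.low or (value == self.low and not self.include_low):
                return False
        if self.high is not None:
            if value > self.high or (value == self.high and not self.include_high):
                return False
        return True


def satisfies(value, enumeration=(), ranges=()) -> bool:
    """True if the value is enumerated or falls inside one of the ranges."""
    return value in enumeration or any(r.allows(value) for r in ranges)


grade = Range(1, 5)                          # {1..5}
half_open = Range(0, 1, include_high=False)  # {[0 .. 1)}
print(grade.allows(5), grade.allows(6))
print(half_open.allows(0), half_open.allows(1))
print(satisfies("M", enumeration={"M", "F"}))
```

An object value constraint would apply such a check to every instance of the object type, while a role value constraint would apply it only to the instances playing that role.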

Subset Constraints

Simple:
[Figure: a subset constraint between single roles 1 and 2.]

Contiguous role-pair:
Each object pair that plays the role sequence 1.1, 1.2 also plays the role sequence 2.1, 2.2.

Other cases:
Each object tuple that plays the first role sequence also plays the second role sequence.

ORM 2 also displays subset constraints over join paths.

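Equality Constraints

2 role-sequences (of 1 or more roles): populations of the role-sequences must be equal.
[Figure: equality constraints between roles 1 and 2, and between role pairs (1.1, 1.2) and (2.1, 2.2).]

3 or more role-sequences, e.g.
[Figure: an equality constraint over the role pairs (1.1, 1.2), (2.1, 2.2), and (3.1, 3.2).]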

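Exclusion Constraints

Populations of 2 or more role-sequences must be mutually exclusive.
[Figure: exclusion constraints over roles 1 … n and over role pairs (1.1, 1.2) … (n.1, n.2).]

Exclusive-Or Constraints

Each instance in A’s population plays exactly one of the n attached roles (n > 1).
[Figure: two alternative notations for an exclusive-or constraint over roles 1 … n of object type A.]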
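The set-comparison constraints reduce to ordinary set operations on role populations. The sketch below shows one way to check them in Python; the function names and the vehicle data are hypothetical and used only for illustration.

```python
from itertools import combinations


def subset(first, second) -> bool:
    """Subset constraint: every tuple in the first population is also in the second."""
    return set(first) <= set(second)


def equal(*populations) -> bool:
    """Equality constraint: all role-sequence populations are the same set."""
    sets = [set(p) for p in populations]
    return all(s == sets[0] for s in sets[1:])


def exclusive(*populations) -> bool:
    """Exclusion constraint: the populations are pairwise disjoint."""
    sets = [set(p) for p in populations]
    return all(a.isdisjoint(b) for a, b in combinations(sets, 2))


def exclusive_or(type_population, *role_populations) -> bool:
    """Exclusive-or: exclusion plus inclusive-or over the object type's population."""
    everyone_plays_a_role = set().union(*role_populations) == set(type_population)
    return everyone_plays_a_role and exclusive(*role_populations)


# Hypothetical populations: each vehicle is either leased or purchased, never both.
vehicles = {"V1", "V2", "V3"}
leased = {"V1", "V2"}
purchased = {"V3"}
print(exclusive(leased, purchased))
print(exclusive_or(vehicles, leased, purchased))
```

Treating exclusive-or as exclusion combined with inclusive-or keeps the two checks independent, so a violation report can say which half failed.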

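Subtyping

B is a proper subtype of A (its primary supertype) and C (a secondary supertype).

[Figure: subtype B with supertypes A and C, followed by three diagrams of A with subtypes B and C showing the Exclusive, Total, and Partition subtype constraints.]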

Frequency Constraints

• Each instance that plays role 1 does so f times.
• Each instance pair that plays roles 1, 2 does so f times.

[Figure: frequency constraints on a single role and on role pairs, within one predicate and across predicates.]

The frequency specification f may be any of the following (n a positive integer):

n      exactly n
≥ n    at least n
≤ n    at most n
n..m   at least n and at most m
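A frequency constraint can be checked by counting how often each instance (or instance combination) appears in the constrained roles. The following Python sketch assumes facts are stored as tuples; `frequency_violations` and the sample data are hypothetical.

```python
from collections import Counter


def frequency_violations(population, roles, low=1, high=None):
    """Return the role-value combinations whose count falls outside low..high.

    population: iterable of fact tuples
    roles:      positions of the constrained roles within each tuple
    high=None:  no upper bound (an "at least low" constraint)
    """
    counts = Counter(tuple(fact[i] for i in roles) for fact in population)
    return {
        combo: count
        for combo, count in counts.items()
        if count < low or (high is not None and count > high)
    }


# Hypothetical population: (reviewer, paper), where each reviewer
# who reviews at all must review exactly 2 papers.
reviews = [("Ann", "P1"), ("Ann", "P2"), ("Bob", "P1")]
print(frequency_violations(reviews, roles=(0,), low=2, high=2))
```

Note that the constraint only applies to instances that actually play the role, which is why instances with a count of zero are never reported.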

Ring Constraints

[Figure: a ring predicate on object type A, with an icon for each ring constraint.]

• Irreflexive
• Asymmetric
• Intransitive
• Antisymmetric
• Acyclic
• Asymmetric + Intransitive
• Acyclic + Intransitive
• Symmetric
• Purely Reflexive
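Several of these ring constraints have simple set-based definitions over a binary relation whose two roles are played by the same object type. The Python sketch below uses the standard mathematical definitions; the function names and the sample relation are assumptions made for illustration.

```python
def is_irreflexive(r) -> bool:
    """No object is related to itself."""
    return all(a != b for a, b in r)


def is_symmetric(r) -> bool:
    """If a R b then b R a."""
    r = set(r)
    return all((b, a) in r for a, b in r)


def is_asymmetric(r) -> bool:
    """If a R b then not b R a (this also rules out a R a)."""
    r = set(r)
    return all((b, a) not in r for a, b in r)


def is_antisymmetric(r) -> bool:
    """If a R b and a differs from b, then not b R a."""
    r = set(r)
    return all(a == b or (b, a) not in r for a, b in r)


def is_intransitive(r) -> bool:
    """If a R b and b R c then not a R c."""
    r = set(r)
    return all((a, d) not in r for a, b in r for c, d in r if b == c)


def is_acyclic(r) -> bool:
    """No chain of one or more steps leads from an object back to itself."""
    edges = set(r)
    nodes = {x for pair in edges for x in pair}
    while nodes:
        # A sink has no outgoing edge to any remaining node, so it cannot lie on a cycle.
        sinks = {n for n in nodes if not any(a == n and b in nodes for a, b in edges)}
        if not sinks:
            return False
        nodes -= sinks
    return True


# Hypothetical population of "Person is a parent of Person".
parent_of = {("Ann", "Bob"), ("Bob", "Cy")}
print(is_irreflexive(parent_of), is_asymmetric(parent_of))
print(is_intransitive(parent_of), is_acyclic(parent_of))
```

The combined constraints in the list (for example Acyclic + Intransitive) can be checked by conjoining the corresponding functions.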

The previous constraints are all alethic (necessarily true for each state).

ORM 2 also supports deontic versions of all these constraints. Deontic constraints are colored blue rather than violet. Most include an “o” for “obligatory”. Deontic ring constraints instead use dashed lines.

[Figure: deontic constraint shapes for Uniqueness; Mandatory; Subset, Equality, Exclusion; Frequency (f); and the ring constraints Irreflexive, Asymmetric, Intransitive, Antisymmetric, Purely Reflexive, Acyclic, Asym-Intrans, Acyclic-Intrans, and Symmetric.]

Model Validation

Moon (.name) orbits / is orbited by Planet (.name)

Moon    Planet
Phobos  Mars
Deimos  Mars
Io      Jupiter
Io      Mars     (counter-example)

Uniqueness constraint on first role:
• +ve form: Each Moon orbits at most one Planet. Illustrated by a satisfying fact population.
• -ve form: It is impossible that the same Moon orbits more than one Planet. Test with a counter-example.
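Validation by population can be mimicked in code: a satisfying population passes the uniqueness check, and the counter-example is rejected. The Python sketch below uses the moon data from this slide; the function `uniqueness_violations` is a hypothetical helper, not part of any ORM tool.

```python
from collections import defaultdict


def uniqueness_violations(population, roles):
    """Group facts by the values in the constrained roles and return groups with duplicates."""
    groups = defaultdict(set)
    for fact in population:
        key = tuple(fact[i] for i in roles)
        groups[key].add(fact)
    return {key: facts for key, facts in groups.items() if len(facts) > 1}


# Facts of the form (Moon, Planet) for "Moon orbits Planet".
orbits = [("Phobos", "Mars"), ("Deimos", "Mars"), ("Io", "Jupiter")]

# Each Moon orbits at most one Planet: the satisfying population has no violations.
print(uniqueness_violations(orbits, roles=(0,)))

# Adding the counter-example makes the same Moon orbit more than one Planet.
print(uniqueness_violations(orbits + [("Io", "Mars")], roles=(0,)))

# The second role has no uniqueness constraint: Mars is orbited by two moons.
print(uniqueness_violations(orbits, roles=(1,)))
```

If the domain expert accepts the counter-example as a possible state of affairs, the constraint is wrong and should be removed; if they reject it, the constraint is confirmed.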

Moon (.name) orbits / is orbited by Planet (.name)

Moon    Planet
Phobos  Mars
Deimos  Mars
Io      Jupiter

The absence of a uniqueness constraint on the second role may be verbalized using default form: It is possible that the same Planet is orbited by more than one Moon.

Illustrated by a satisfying fact population.

Sample screenshot showing automated verbalization (+ve plus some default) for some selected aspects.

Currently about 80% of constraints are verbalized.

The rest should be implemented in a few months.


In ORM 2, rules may be assigned different modalities.

Patient (.nr) is a husband of / is a wife of Patient

Alethic:
It is possible that more than one Patient is a husband of the same Patient and that more than one Patient is a wife of the same Patient.
Each Patient, Patient combination occurs at most once in the population of Patient is a husband of Patient.

Deontic:
It is obligatory that each Patient is a husband of at most one Patient.
It is obligatory that each Patient is a wife of at most one Patient.

Comparison with ER and UML

ER and UML class diagrams are attribute-based, leading to more compact diagrams that are closer to implementation schemas.

UML also includes many other diagram types to deal with process modeling etc.

ORM’s attribute-free nature facilitates validation by verbalization and population and semantic stability.

ORM’s graphic language is far richer for data modeling than that of ER and UML, and its textual languages are far easier for non-technical users to understand than UML’s OCL.

ORM’s graphical language is also orthogonal and unambiguous (unlike UML).


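UML’s multiplicity notation is fine for binaries but not for n-aries, e.g.

ORM: Room (.nr) … at HourSlot (dhCode) … is booked for Activity (.code); Activity has / is of ActivityName.

[Figure: the corresponding UML class diagram, with classes Room (nr {P}), HourSlot (dhCode {P}), and Activity (code {P}, name {U1}) joined by a ternary Booking association with multiplicities 0..1, *, and 0..1. The note “Each activity has a booking” is marked as something UML can’t express as a multiplicity.]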

UML’s xor is defined between associations, not association roles, so this is ambiguous.

[Figure: UML classes Vehicle and Company joined by two associations with an {xor} constraint between them: vehicleLeased (*) to lessor (0..1), and vehicleSold (*) to seller (0..1).]

ORM correctly defines the constraint between roles and treats it as a combination of exclusion and inclusive-or.

[Figure: two ORM diagrams of Vehicle is leased from Company and Vehicle was purchased from Company, showing the constraint attached to roles.]

Semantic stability

ORM models are immune to changes that reshape attributes as entity types or relationships.

The meaning of a query is not changed if we change a constraint or add a new fact type.

ORM queries respect this principle and hence facilitate schema evolution. ER and OO queries do not:
• such changes can cause attributes to be remodeled;
• hence, existing queries need to be reformulated.

ORM schema: Person (SSN) has Title; Person is of Gender (.code).

Query: List titled people and their gender.

ORM query:
  ✓ Person
    --- has Title
    --- is of ✓ Gender

UML class: Person with attributes SSN {P}, gender, title [0..1]

SQL query:
  select SSN, gender
  from Person
  where title is not null

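[Figure: the ORM schema as before, with the added predicate precedes on Title. In the UML version, title is no longer an attribute: class Person (SSN {P}, gender) has a many-to-many association “has” with class Title (name {P}), and Title has a many-to-many association “precedes”.]

Query: List titled people and their gender.

ORM query:
  ✓ Person
    --- has Title
    --- is of ✓ Gender

SQL query:
  select Person.SSN, gender
  from Person join PersonTitle
  on Person.SSN = PersonTitle.SSN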
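The effect of this schema change on SQL can be reproduced with Python's built-in sqlite3 module. In the sketch below, the table layouts follow the two UML versions above, but the column types, the sample rows, and the added `distinct` (which avoids repeating a person who has several titles) are assumptions made for illustration.

```python
import sqlite3


def titled_people_v1():
    """Title modeled as an optional attribute of Person."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        create table Person (SSN text primary key, gender text not null, title text);
        insert into Person values ('111', 'M', 'Dr'), ('222', 'F', null);
    """)
    return con.execute(
        "select SSN, gender from Person where title is not null"
    ).fetchall()


def titled_people_v2():
    """Title remodeled as a separate type, so a person may have many titles."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        create table Person (SSN text primary key, gender text not null);
        create table PersonTitle (
            SSN text not null references Person,
            title text not null,
            primary key (SSN, title)
        );
        insert into Person values ('111', 'M'), ('222', 'F');
        insert into PersonTitle values ('111', 'Dr'), ('111', 'Prof');
    """)
    return con.execute("""
        select distinct Person.SSN, gender
        from Person join PersonTitle on Person.SSN = PersonTitle.SSN
    """).fetchall()


print(titled_people_v1())
print(titled_people_v2())
```

Both functions answer the same question, but the SQL text had to be rewritten after the remodeling, whereas the ORM query is worded identically in both versions.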

Have your cake and eat it too by using ORM for conceptual analysis and mapping it to ER or UML views as desired.

It is expected that the NORMA tool will provide automatic, live generation of both ER and UML views by the end of 2008.


Case Study

Specify an ORM schema for this report from a book publisher.

ISBN Title

1-33456-012-3 Mizu no Kokoro 2-55860-123-6 Mind Like Water 3-540-25432-2 Informatics 4-567-12345-3 Informatics 5-123-45678-5 Semantics

Published

2002 2004 2005 2006

Translation of Year

2003 2004 2005 1-33456-012-3 2004 2005

Sales Nr

5000 6000 5000 3000 3000

Total

16000 6000 2005 2000 2000

Best Seller?

Y N N

[ORM schema: Book (ISBN) has BookTitle; Book is translated from Book; Book was published in Year (CE); PublishedBook is a subtype of Book; PublishedBook in Year (CE) sold NrCopies [copiesSoldInYear]; PublishedBook sold total NrCopies * [totalCopiesSold]; PublishedBook is a best seller *.]

* Each PublishedBook is a Book that was published in some Year.
* For each PublishedBook, totalCopiesSold = sum(copiesSoldInYear).
* PublishedBook is a best seller iff PublishedBook sold total NrCopies >= 10000.
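The two derivation rules translate directly into code. In this Python sketch the function names are hypothetical, and the pairing of yearly sales figures with ISBNs is an assumption about the flattened report, used only as sample data.

```python
# Sample data (assumed pairing): copies sold per year for each published book.
copies_sold_in_year = {
    "1-33456-012-3": {2003: 5000, 2004: 6000, 2005: 5000},
    "2-55860-123-6": {2004: 3000, 2005: 3000},
}

BEST_SELLER_THRESHOLD = 10000


def total_copies_sold(isbn: str) -> int:
    """Derivation rule: totalCopiesSold = sum(copiesSoldInYear)."""
    return sum(copies_sold_in_year.get(isbn, {}).values())


def is_best_seller(isbn: str) -> bool:
    """Derivation rule: best seller iff total copies sold >= 10000."""
    return total_copies_sold(isbn) >= BEST_SELLER_THRESHOLD


for isbn in copies_sold_in_year:
    print(isbn, total_copies_sold(isbn), is_best_seller(isbn))
```

Because both fact types are fully derived, neither the total nor the best-seller flag needs to be stored; they can always be recomputed from the asserted yearly sales.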

Model this report from the same business domain.

PNr

1 2 3 4 5 6 7 8 9

Name

John Smith Don Bradchap Sue Yakamoto Yoko Ohyes Isaac Seldon Ann Gables John Smith Ann Jones Selena Moore

Title

Mr Sir Mrs Dr Dr Ms Mr Ms Mrs

Gender

M M F F M F M F F

Books authored

1-33456-012-3 2-55860-123-6 3-540-25432-2, 5-123-45678-5 4-567-12345-3 5-123-45678-5

[ORM schema: Person (.nr) has / is of PersonName; Person is of Gender (.code) {‘M’, ‘F’}; Person has PersonTitle; PersonTitle is restricted to Gender; Person authored Book (ISBN).]

Model this final report from the same business domain.

ISBN Title

1-33456-012-3 2-55860-123-6 3-540-25432-2 4-567-12345-3 Mizu no Kokoro Mind Like Water Informatics Informatics

PNr

1 4 2 5 6 1 7 1 5

Review Assignment Name

John Smith Yoko Ohyes Don Bradchap Isaac Seldon Ann Gables John Smith John Smith John Smith Isaac Seldon

Result

4 5 5 5 4 4 5

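[ORM schema: Book (ISBN) has BookTitle; Book is authored by / authored Person (.nr); Book is assigned for review by Person, objectified as “ReviewAssignment !”, with a frequency constraint with bound 2; ReviewAssignment resulted in Grade (.nr) {1..5}.]

The full schema

[ORM schema combining the three reports: Person (.nr) has / is of PersonName; Person is of Gender (.code) {‘M’, ‘F’}; Person has PersonTitle; PersonTitle is restricted to Gender; Book (ISBN) has BookTitle; Book is translated from Book; Book is authored by Person; Book is assigned for review by Person, objectified as “ReviewAssignment !”, with a frequency constraint with bound 2; ReviewAssignment resulted in Grade (.nr) {1..5}; Book was published in Year (CE); PublishedBook is a subtype of Book; PublishedBook in Year (CE) sold NrCopies [copiesSoldInYear]; PublishedBook sold total NrCopies * [totalCopiesSold]; PublishedBook is a best seller *.]

* Each PublishedBook is a Book that was published in some Year.
* For each PublishedBook, totalCopiesSold = sum(copiesSoldInYear).
* PublishedBook is a best seller iff PublishedBook sold total NrCopies >= 10000.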

Relational Mapping and Tool Demos

Relational Mapping

Conceptual schema: Patient (.nr) has PatientName; Patient smokes; Patient is allergic to Drug (.name) [allergy].

Relational schema:
Patient ( patientNr [PK], patientName, smokes )
Allergy ( patientNr [PK, FK1], drugName [PK] )

The Rmap procedure generates 5th normal form by default.

External (forms) and Logical (relational) schemas provide different structures for grouping elementary (conceptual) facts.

(a) Patient form
    * PatientNr: 1025
    * Name: Ann Jones
    Smokes
    Allergies: Penicillin, Codeine

(b) Patient form
    * PatientNr: 1056
    * Name: John B. Smith
    Smokes
    Allergies:

Patient
patientNr  patientName    smokes
1025       Ann Jones      true
1056       John B. Smith  false

Allergy
patientNr  drugName
1025       Penicillin
1025       Codeine
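To show how the same elementary facts are grouped differently at the logical and external levels, the following Python sketch builds the two tables with the built-in sqlite3 module and then regroups them into a form-like record. The SQL column types and the helper `patient_form` are assumptions made for this example; they are not output of the Rmap procedure.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Functional facts about a patient are grouped into one table.
    create table Patient (
        patientNr   integer primary key,
        patientName text not null,
        smokes      integer not null      -- unary fact stored as a Boolean flag
    );
    -- The many-to-many allergy fact type gets its own table.
    create table Allergy (
        patientNr integer not null references Patient,
        drugName  text not null,
        primary key (patientNr, drugName)
    );
    insert into Patient values (1025, 'Ann Jones', 1), (1056, 'John B. Smith', 0);
    insert into Allergy values (1025, 'Penicillin'), (1025, 'Codeine');
""")


def patient_form(patient_nr: int) -> dict:
    """Regroup the relational rows into the structure shown on the input form."""
    name, smokes = con.execute(
        "select patientName, smokes from Patient where patientNr = ?", (patient_nr,)
    ).fetchone()
    allergies = [
        row[0]
        for row in con.execute(
            "select drugName from Allergy where patientNr = ? order by drugName",
            (patient_nr,),
        )
    ]
    return {"patientNr": patient_nr, "name": name, "smokes": bool(smokes), "allergies": allergies}


print(patient_form(1025))
print(patient_form(1056))
```

The form nests the allergy list inside the patient, while the relational schema splits it out; both are just different groupings of the same elementary facts.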

Tool Demos

(1) Microsoft Visio ORM Source Model Solution (2) NORMA

Microsoft Visio for Enterprise Architects supports:
• Entry of ORM 1 schemas
• Forward engineering of ORM schemas to relational schemas
• Forward engineering of ORM updates to relational updates
• Direct entry of relational schemas
• Multiple styles for relational schemas (pure relational, IDEF1X, …)
• Reverse engineering of relational schemas to ORM schemas
• Report generation

NORMA (Neumont ORM Architect)

• NORMA¹ is the first tool to support ORM 2
• Coded in C#, XML and XSLT
• Open source plug-in to Microsoft Visual Studio .NET 2005, utilizing Microsoft’s Domain Specific Language (DSL) toolkit
• Supports entry of ORM 2 models
• Automated live error checking and verbalization
• Automatic transformation to implementation artifacts
• Can import ORM models from Visio (via free Orthogonal Toolbox)
• Currently pre-beta. A usable version for industry is expected in 2008

¹ Public version downloadable from http://sourceforge.net/projects/orm. A newer build will soon be provided. See NORMA Labs 1-5 to start.

• NORMA supports mappings to various implementation artifacts

[Figure: diagram of NORMA mappings among n-ary ORM, binary ORM, OIAL, DCIL, DDIL, PLiX, XSD, DTD, WSDL, OWL, EDM, DSL, C#, VB.NET, Java, PHP, .NETTiers, and SQL targets (SQL:2003, MS SQL Server, IBM DB2, Oracle, PostgreSQL, MySQL), with a legend distinguishing mid-stage development from early development.]

OIAL: ORM Intermediate Abstraction Language
DCIL: Database Conceptual Intermediate Language
DDIL: Data Definition Intermediate Language
PLiX: Programming Language in XML

Main current limitations of NORMA
• Forward engineering to new schemas only (generation of incremental schema updates is coming)
• Verbalization of only about 75% of constraints (full verbalization expected early 2008)
• No reverse engineering (we have a basic prototype; reverse engineering is expected in a 2008 release)
• Constraint code generation only for basics (ma, unique, value) (full code generation of graphic constraints expected late 2008; code generation for formal textual constraints expected later)

Future plans include
• complete application generation (including forms)
• ER, UML etc. views (editable)
• multi-model support (including model component reuse)
• integration with conceptual process modeling

Further resources

www.orm.net -- my ORM website
www.ORMFoundation.net -- ORM Foundation website
www.inConcept.com -- Journal of Conceptual Modeling, …
www.ORMcentral.com -- COM API details for VEA, …
www.objectrolemodeling.com/ -- Orthogonal Toolbox, …
www.brcommunity.com/ -- Business Rules Journal articles

Halpin, T. 2001, Information Modeling and Relational Database Design, Morgan Kaufmann.

Halpin, T. et al. 2003, Database Modeling with Microsoft Visio for Enterprise Architects, Morgan Kaufmann.