Intro to Data Modeling
Download
Report
Transcript Intro to Data Modeling
Introduction to
Data Modeling
Presented by Bert Scalzo, PhD
[email protected]
About the Author
Oracle DBA for 18+ years, versions 4 through 10g
Worked for Oracle Education & Consulting
Holds several Oracle Masters (DBA & CASE)
BS, MS, PhD in Computer Science and also an MBA
LOMA insurance industry designations: FLMI and ACS
Books
– The TOAD Handbook (Feb 2003)
– Oracle DBA Guide to Data Warehousing and Star Schemas (Mar 2003)
Articles
–
–
–
–
–
–
Oracle Magazine
Oracle Technology Network (OTN)
Oracle Informant
PC Week (now E-Magazine)
Linux Journal
www.Linux.com
About Quest Software
We provide application management solutions
that enable customers to Design, Develop,
Deploy, Manage and Maintain enterprise
applications without downtime
What is Application Management?
A holistic approach of managing the entire
application not individual components
How Do We Do It?
We surround the application infrastructure:
Application Monitoring
Database Management
High Availability
Microsoft Infrastructure Management
Purpose and Overview
Purpose
–
–
–
–
Introduce very basic data modeling concepts
Explain how data models relate to the business
Explain how data models relate to databases
Show some data modeling tools and their use
Overview
–
–
–
–
–
–
–
–
Why do we model data
What is data modeling
What are entities and attributes
What are unique identifiers
What are relationships
Special constructs: Inheritance & Dependencies
How to translate from logical to physical
How to generate a basic physical database
Why do we Model?
Would you build an office without a blueprint?
The Architect will create the first high
level drawings to validate the concept
with the client and then make a more
detailed plan (i.e. the blueprint ) for the
Contractor …
The Contractor will take this
blueprint and optimise it based
on technical constraints. The Contractor
will then create the actual office.
World of Modeling …
• End-user
• IT Partner/Liaison Business Process Modeling
(BPM)
• Business Analyst
• Improve process efficiency
• Define/document Bus. Processes
- create correct and complete
application requirements
• System Architect
• System Analyst
• App Developer
• Support for all UML diagrams
- Analyze requirements
- Design application
• Reverse/forward engineer code
• Bus. Analyst
• Data Architect
• Data Analyst
• DBA
• DB Developer
• DB Architect
Object-Oriented Modeling
(OOM - UML)
Conceptual Data Modeling
(CDM – E/R)
Physical Data Modeling
(PDM)
QDesigner synchronizes models
from all levels in a single tool
• Identify all data & relationships
- E/R (Entity/Rel’ship) diagrams
- DB independent view
• Business Rules?
•
•
•
•
DB-specific model
Reverse engineer existing DB
Create/Update DB from model
Data Warehouse Modeling
What is Data Modeling?
Conceptual (sometimes called Logical)
– Gather data requirements for business functions
being analyzed
– Think “WHAT???” – “What data necessary to do the
business function”
– It’s all about the business, do not impose
“Information System” terms or techniques
• No such business things as tables, keys, indexes
and the like
• Probably best done by business analysts in sync
with BPM
• In some shops, data modelers can bridge the
business/IT gap
What is Data Modeling?
Physical
– Translate logical model into an implementation plan
for a database
– Think “HOW???” – “How to construct database
objects to realize this“
– It’s all about the target database or other technology
being adopted
• Now it’s time to think about tables, keys, indexes
and the like
• Probably best done by data architects, data
modelers, and DBA’s
Where in Development Lifecycle
Analysis
Design
Conceptual
Some shops
just treat this
as one big
“Design” task
Reengineer
Physical
Develop
Deploy
Monitor
&
Maintain
Simple ERD Scenario
We’ll use a very simple ERD scenario to
explain the techniques
Basic business requirements for a super
simple bank:
– Customers interact with Employees to
• Open Accounts
• Make Transactions
– There are two kinds of Accounts
• Checking
• Savings (offers ATM access)
Looks simple, right? Wait till you see the
issues with just this …
What are Entities
According to Chen – a thing which can be
distinctly identified
– Peter Chen invented ER diagramming technique in 1976
Look again at the two critical aspects of an
entity
A business pertinent or relative noun
– Person, place, thing, or concept that has characteristics of
interest to the enterprise
– To find entities, simply look for nouns in your business
analysis …
NOTE – An entity is not synonymous with a
table
Entity Meta-Data
Notation for Entities
General Entity Conventions:
– Represented by a box
– Name inside top of box
– Name is
• Unique across model
• Business meaningful
• In its singular form
• In all upper case
– Displays no meta-data!!! – (cannot see comment & number)
What are Attributes
Property or characteristic of an entity
Elementary piece of information about an entity
An entity pertinent or relative adjective (i.e. modifies
noun by the use of describing, identifying, or quantifying
words)
– To find attributes, simply look for descriptive nouns that won’t be
used as entities and/or adjectives in your business analysis
– Each entity will generally have several to many attributes – and
each attribute may have a little to a lot of meta-data
• Take your time
NOTE – An attribute is not synonymous with a column
What are Unique ID’s
When defining entities and their attributes, the
following question should be asked…
– “What makes one instance or occurrence of an entity
different from another?”
• In other words, how can business people differentiate entity
data?
For example in our bank scenario, we are told that
Customers can be differentiated because no two
Customers have the same:
– Customer ID
– Federal Tax ID
– Concatenation of First Name + Last Name + Zip Code
NOTE – This has nothing to do with database
keys/indexes
Attribute Meta-Data
Comment
Business Data Type
Optional or Mandatory
Part of Unique Identifier(s)
Checks
–
–
–
–
–
–
–
–
–
Minimum
Maximum
Default
Units
Format
Upper/Lowercase
Cannot Modify
List of Legal Values
Business Rules
Failure to discover & document
attribute meta-data is where
many data models fail
Failure to capture this meta-data
results in ineffective databases:
that’s where bad data is allowed
Next few slides show Quest’s
QDesigner – other modeling
tools will have similar screens
NOTE - This represents the bulk of the work when data modeling
Meta-Data - Basic
Meta-Data – UID’s
Meta-Data - Checks
Meta-Data – Rules 1
Meta-Data – Rules 2
Notation for Attributes
General Attribute Conventions:
– Name inside bottom of entity box
– Name is
• Unique across entity
• Business meaningful
• In its singular form
• In Init-Cap case
• Spaces OK (no “–” or “_”)
– Primary unique id is underlined
– Primary unique id marked by <pi>
– Alternate unique id’s marked by <ai#>
– Data type shown with length
– Mandatory indicated by <M>
– Displays limited meta-data!!! –
(cannot see comment or any of checks)
Super and Sub Types
In the business world, we often find that things (i.e.
entities) do not fall into simple, clear cut classifications
– We sometimes find that things are very similar - with just a few
differences
In data modeling, this can be referred to as:
– Entity Inheritance
– Super and Sub Types
– Generalization Hierarchies
The key question to ask is whether these different
classifications are:
– Exclusive – must be one way or the other
– Inclusive – can actually be all ways at once
http://www.quest.com/whitepapers/Reality_Requires-QDesigner.pdf
Notation for Super/Sub
X means Exclusive
Super/Sub
Meta-Data
This Super/Sub meta-data only
important for transforming
conceptual to physical model…
(next data modeling e-seminar)
Now the Relationships
It’s called ER diagramming
– We did the E for “Entities” first
– But it’s really all about the R for “Relationships” that really matters
Experience has shown that while many people can readily
define entities and their unique identifiers, not as many are
as successful modeling the business relationships between
those entities
Definition:
– A business pertinent or relative connection between two entities
– A reason of relevance to the enterprise why entities may be
associated
NOTE – A relationship is not synonymous with a foreign key
http://www.quest.com/whitepapers/Its_Really_all_about-QDesigner.pdf
Ask Lots of Questions
The key to quality relationship analysis is to ask a zillion
business questions, model the answers, and repeat the
process until done …
Back to our banking example:
– How are Customers related to Accounts
• Can a Customer have one or more than one Account (yes)
• Can more than one Customer have the same Account (yes - joint)
– How are Employees related to Accounts
• Can more than one Employee open an Account (let’s say no)
• Can an Employee open more than one Account (obvious - yes)
– How are Transactions related to Accounts
• Woops – we somehow missed an entity earlier (that will happen)
• Can a Transaction apply to more than one Account (obvious - no)
• Can an Account have many Transactions over time (obvious - yes)
Relationship Meta-Data
Read the Relationships
To verify that you’ve got it right, read each relationship
to the business analysts or end users to be sure it’s
right. And you need to read it for both directions. Yes,
that’s lot’s of work …
Sentence Format:
EACH entity1 MAY2
MUST2
role_name3
1 OR MORE4
1 AND ONLY 14
3
1
entity5
4
2
5
Example:
– Each employee may open one or more account
– Each account must be opened by one and only one employee
Dependent Relationships
In a dependent association,
one entity is partially
identified by another. Each
entity must have an
identifier. In some cases,
however, the attributes of an
entity are not sufficient to
identify an occurrence of the
entity. For these entities,
their identifiers incorporate
the identifier of another
entity with which they have
a dependent association.
Relationships Results
Conceptual -> Physical
Check List:
– Verify everything with the business analysts and end users
– Verify everything with the business analysts and end users
– Verify everything with the business analysts and end users
Use your software’s model checking utilities and/or
reports
–
–
–
–
Every entity must have unique identifier (as per Chen)
Resolve many-to-many relationships (cannot be built)
Double check isolated entities (i.e. no relationships)
Look for very common modeling patterns (next slide)
Use your software’s generate physical model utility
NOTE – Generated physical model will require DBA
review …
Final Conceptual Model
Employee’s can be
Customers – right?
Initial Physical Model
Physical Model Work
Check List:
– Verify nothing got lost in translation from Conceptual to
Physical
– Add table(s) required for implementation, but not modeled
– Use your software’s model checking utilities and/or reports
• Every table should have primary key
• Add foreign key relationship meta-data (next slide)
• Add indexes to support data access needs (lots of work)
– Use your software’s generate SQL or DDL script utility
– Review the script (i.e. never just run SQL without looking)
Foreign Key Meta-Data
Final Physical Model
Physical -> Database
Carefully pick the DBMS
generation options
Example
DDL Script
Remember the
Conceptual model’s
meta-data?
Parting Thoughts
Data Modeling or CASE tools do not automatically = good
design
– Must do complete business analysis
– Must do adequate Conceptual -> Physical transformation
– Must add required physical meta-data (tuning & insight)
Some of worst databases built result from failure to do the
above
There are many other modeling issues – this was just a start …
–
–
–
–
–
–
–
Breaking models into sub-models
Conceptual-Physical Model compare and sync
Physical Model-Database compare and sync
Repository based collaborative modeling
Normalization and Denormalization
Data Warehousing (Star Schema design)
etc, etc, etc …
More Information
Free 45-Day Trial of QDesigner™
– www.quest.com/QDesigner
– Compete enterprise modeling solution
• Business Process Modeling (BPM)
• Object-Oriented (UML) Modeling (OOM)
• Conceptual Data Modeling (CDM)
• Physical Data Modeling (PDM)
Save $$$: Upgrade your TOAD/SQL Nav license to the
Suite
– Suites include PDM!!!
Modeling White Papers
– http://www.quest.com/whitepapers/qdesigner/index.asp
• Data Modeling: Common Mistakes and Their Impact
• Data Modeling: It's Really All About the Relationships
• Data Modeling: Reality Requires Super and Sub Types
QUESTIONS & ANSWERS