Database basics
Download
Report
Transcript Database basics
CTFS Workshop
Shameema Esufali
Asian data coordinator and technical resource
for the network
[email protected]
CTFS Workshop
Relational database basics
Why relational databases?
Why MySQL?
What about R?
Relational Theory
In order to work with MySQL it is necessary to
understand the basics of relational theory. i.e
how and why data is stored and managed in a
relational database.
The guiding principle behind a relational
database is to store data once and only once.
What is a Relation?
A table. Columns are fields (attributes) of data
related to other fields on the
same row (tuple).
Primary Key
Identifies the row of a table without
duplicates.
Tells you what the row contains
Eg. If treeid is the primary key then the row
has information about that tree
Candidate Primary Key
Any attribute(s) which together would serve as
the primary key.
Must uniquely identify a row of data.
Each part of the key must be essential to unique
identification. No redundancy.
Foreign Key
A foreign key is a column in a table that
matches the primary key column of another
table. Its function is to link the basic data of
two entities on demand, i.e. when two tables are
joined using the common key.
First Normal Form
One piece of information per column. No repeated rows. Eliminate fused
data eg Code1,Code2
Tag
Species
Code
1234
SHORME
A, BA
Tag
Species
Code
1234
SHORME
A
1234
SHORME
BA
Wrong!
Right
Second Normal Form
Each column depends on the entire primary key.
Wrong
Tag
Census
1234 1
Tag
Species
Seedsize X
SHORTR
Medium
Species
1234 SHORTR
Seedsize X
Medium
Y
DBH
11.3 15.4 12
Y
11.3 15.4
Right
Third Normal Form
Each column depends ONLY on the primary key. i.e.
there are no transitive dependencies
Wrong
Tag
Species
Seedsize
X
1234
SHORTR
Medium
11.3 15.4
Tag
Species
X
Y
1234
SHORTR
11.3 15.4
Right
Y
Fourth Normal Form
The table must contain no more than one
multi-valued dependency
Tag
DBH
Cod
e
1234
10
A
1234
11
A
1234
11
BA
Entity Relationship diagram (ERD)
Shows in a diagram how entities (tables) are
related to one another.
One to One
One to many
Many to many
One to one
Extension of number of attributes in a single
table
Rarely required
Tree
More tree
attributes
One to Many
Most
common
Child
Parent
Requires two
tables.
Linked by
Foreign Key
Family
Genus
Species
Many to many
Need to break down
to one to many
Measurement
Tree
Requires three tables
Measurement
Associative table
provides common
key
Code
Code
Reassembling data
Data was broken down into tables to preserve
integrity
How can we put it together to derive
information?
Use Structured Query Language (SQL) to JOIN
tables using a common attribute
Joins
Two tables may be
joined when they
share at least one
common attribute
The Primary key of
the Parent table is
stored in the Child
table as a cross
reference. This is
called a Foreign Key.
Primary Key in Parent
GenusID
Genus
FamilyID
1
Acacia
4
2
Acalypha
3
3
Adelia
3
4
Aegiphila
3
5
Alchornea
3
SpeciesID
Species
Foreign Key
in Child
GenusId
1
melanoceras
1
2
diversifolia
2
3
macrostachya
2
4
triloba
3
5
panamensis
4
6
costaricensis
5
7
latifolia
5
Table joined on Foreign Key
GenusID
SpeciesID
1
2
3
4
5
6
7
Species
melanoceras
diversifolia
macrostachya
triloba
panamensis
costaricensis
latifolia
GenusId
1
2
2
3
4
5
5
⇿
⇿
⇿
⇿
⇿
⇿
⇿
GenusID
1
2
2
3
4
5
5
Genus
Acacia
Acalypha
Acalypha
Adelia
Aegiphila
Alchornea
Alchornea
The Genus ID in the Species table is used to pick
up information for the corresponding Genus. It
looks for a row with the matching Primary Key
FamilyID
4
3
3
3
3
3
3
Extend to join many tables
With SQL you can join as many tables as you
need to in order to get the set of information
you need. Thus the previous example can be
extended to include Family which is a parent
table of Genus and/or extended in the another
direction to include Tree which is a child of
Species as long as there is a linking attribute.
This attribute is called a Foreign Key.