Sampling Populations
• Ideal situation
- Perfect knowledge
• Not possible in many cases
- Size & cost
• Not necessary
- An appropriate subset → adequate estimates
• Sampling
- A representative subset
Sampling Concepts
• Sampling unit
- The smallest sub-division of the population
• Sampling error
- Sampling error decreases as the sample size increases
• Sampling bias
- systematic tendency
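The link between sample size and sampling error can be illustrated with a small simulation. This is only a sketch and not part of the original slides: the population (10,000 normally distributed values), the function name `mean_abs_sampling_error`, and the number of trials are all assumptions chosen for illustration.

```python
import random
import statistics

def mean_abs_sampling_error(population, sample_size, trials=500, seed=0):
    """Average absolute gap between a sample mean and the population mean."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    errors = []
    for _ in range(trials):
        # Simple random sample without replacement
        sample = rng.sample(population, sample_size)
        errors.append(abs(statistics.fmean(sample) - true_mean))
    return statistics.fmean(errors)

# Hypothetical population (an assumption, not data from the slides)
_rng = random.Random(42)
population = [_rng.gauss(50, 10) for _ in range(10_000)]

for n in (10, 100, 1000):
    print(n, round(mean_abs_sampling_error(population, n), 3))
```

The printed error should shrink as n grows, which is the behavior the sampling error bullet describes.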
Steps in Sampling
1. Definition of the population
- Any inferences apply only to that population
2. Construction of a sampling frame
- This involves identifying all the individual sampling units within a population so that the sample can be drawn from them
Steps in Sampling Cont.
3. Selection of a sampling design
- Critical decision
4. Specification of information to be collected
- What data we will collect and how
5. Collection of the data
Sampling designs
• Non-probability designs
- Not concerned with being representative
• Probability designs
- Aim to be representative of the population
Non-probability Sampling Designs
• Volunteer sampling
- Self-selecting
- Convenient
- Rarely representative
• Quota sampling
- Fulfilling counts of sub-groups
• Convenience sampling
- Availability/accessibility
• Judgmental or purposive sampling
- Preconceived notions
Probability Sampling Designs
• Random sampling
• Systematic sampling
• Stratified sampling
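The three probability designs can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method from the slides: the function names, the proportional allocation used for the stratified case, and the `stratum_of` callback are all hypothetical choices.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every unit in the sampling frame has the same chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Pick a random start, then take every k-th unit (k = len(frame) // n)."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(frame, stratum_of, n, rng):
    """Sample each stratum in proportion to its share of the frame.

    Rounding the per-stratum share means the total can differ slightly
    from n when the proportions do not divide evenly.
    """
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for units in strata.values():
        share = round(n * len(units) / len(frame))
        sample.extend(rng.sample(units, share))
    return sample

# Hypothetical sampling frame: 100 unit IDs, the first 60 labelled "urban"
rng = random.Random(1)
frame = list(range(100))
label = lambda unit: "urban" if unit < 60 else "rural"

print(simple_random_sample(frame, 10, rng))
print(systematic_sample(frame, 10, rng))
print(stratified_sample(frame, label, 10, rng))
```

With this frame, the stratified sample always contains 6 urban and 4 rural units, while the simple random sample only matches those proportions on average.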
Tobler’s Law and Independence
Everything is related to everything else, but near
things are more related than distant things.
• Sampled locations in close proximity are likely to
have similar characteristics, thus they are
unlikely to be independent
Spatial Patterns
• Point Pattern Analysis
- Location information
- Point data
• Geographic Patterns in Areal Data
- Attribute values
- Polygon representations
Point Pattern Analysis
• Regular
• Random
• Clustered
Point Pattern Analysis
1. The Quadrat Method
2. Nearest Neighbor Analysis
The Quadrat Method
1. Divide a study region into m cells of equal size
2. Find the mean number of points per cell ($\bar{x}$)
3. Find the variance of the number of points per cell ($s^2$):
$$s^2 = \frac{\sum_{i=1}^{m} (x_i - \bar{x})^2}{m - 1}$$
where $x_i$ is the number of points in cell i
The Quadrat Method
4. Calculate the variance to mean ratio (VMR):
$$\mathrm{VMR} = \frac{s^2}{\bar{x}}$$
5. Interpret VMR
- VMR < 1 → Regular (uniform)
- VMR = 1 → Random
- VMR > 1 → Clustered
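Steps 1 to 5 can be sketched in Python as follows. This is a minimal illustration, assuming a rectangular study region with its origin at (0, 0); the function names and the two example point sets are hypothetical, not taken from the slides.

```python
import statistics

def quadrat_counts(points, width, height, nx, ny):
    """Count points in each cell of an nx-by-ny grid over the study region."""
    counts = [0] * (nx * ny)
    for x, y in points:
        col = min(int(x / width * nx), nx - 1)   # clamp points on the far edge
        row = min(int(y / height * ny), ny - 1)
        counts[row * nx + col] += 1
    return counts

def variance_mean_ratio(counts):
    """VMR = s^2 / mean, with s^2 computed using the m - 1 divisor."""
    mean = statistics.fmean(counts)
    s2 = statistics.variance(counts)
    return s2 / mean

# Hypothetical patterns on a 3 x 3 region split into 9 quadrats
regular = [(c + 0.5, r + 0.5) for r in range(3) for c in range(3)]
clustered = [(0.1 * k, 0.1 * k) for k in range(1, 10)]

print(variance_mean_ratio(quadrat_counts(regular, 3, 3, 3, 3)))    # 0.0
print(variance_mean_ratio(quadrat_counts(clustered, 3, 3, 3, 3)))  # 9.0
```

One point per cell gives a variance of zero (VMR < 1, regular), while nine points in a single cell give a VMR of 9 (clustered).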
The Quadrat Method
6. Test the significance of the variance to mean ratio (VMR) with a chi-square statistic:
$$\chi^2 = \frac{(m - 1)\, s^2}{\bar{x}} = (m - 1) \times \mathrm{VMR}$$
Compare the test statistic to critical values from the $\chi^2$ distribution with df = (m - 1)
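The significance test can be sketched as below. The quadrat counts are hypothetical, and the critical values are the standard tabulated chi-square values for df = 8 at a two-tailed significance level of 0.05; a different grid size would need different table values.

```python
import statistics

def quadrat_chi_square(counts):
    """Return (VMR, chi-square statistic, degrees of freedom) for quadrat counts."""
    m = len(counts)
    mean = statistics.fmean(counts)
    s2 = statistics.variance(counts)      # divides by m - 1
    vmr = s2 / mean
    return vmr, (m - 1) * vmr, m - 1

# Hypothetical counts for a 3 x 3 grid of quadrats (m = 9, df = 8)
counts = [0, 0, 1, 0, 7, 1, 0, 1, 0]
vmr, chi2, df = quadrat_chi_square(counts)

# Tabulated chi-square critical values for df = 8 (lower and upper 2.5% tails)
LOWER_CRIT, UPPER_CRIT = 2.180, 17.535

if chi2 > UPPER_CRIT:
    verdict = "clustered"
elif chi2 < LOWER_CRIT:
    verdict = "regular"
else:
    verdict = "not distinguishable from random"

print(f"VMR = {vmr:.2f}, chi-square = {chi2:.1f}, df = {df}: {verdict}")
```

For these counts the VMR is 4.6 and the statistic is 36.8, well above the upper critical value, so the pattern is classed as clustered.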
Quadrat Method Example
The Effect of Quadrat Size
• Quadrat size
- Too small → empty cells
- Too large → misses patterns that occur within a single cell
• Suggested optimal sizes
- Either 2 points per cell (McIntosh, 1950)
- Or 1.6 points per cell (Bailey and Gatrell, 1995)
2. Nearest Neighbor Analysis
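• An alternative approach
- Uses the distance between any given point and its nearest neighbor
• The average distance between each point and its nearest neighbor ($R_O$):
$$R_O = \frac{\sum_{i=1}^{n} d_i}{n}$$
where $d_i$ is the distance from point i to its nearest neighbor and n is the number of points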
The Nearest Neighbor Statistic
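• Expected distance (for a random pattern):
$$R_E = \frac{1}{2\sqrt{\lambda}}$$
where $\lambda$ is the number of points per unit area
• Nearest neighbor statistic (R):
$$R = \frac{R_O}{R_E} = \frac{\bar{x}}{1 / (2\sqrt{\lambda})}$$
where $\bar{x}$ is the average observed distance $d_i$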
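The observed distance, expected distance, and R can be computed from raw coordinates as sketched below. This is an illustrative brute-force version that ignores edge effects; the function names and the four example points are assumptions, not material from the slides.

```python
import math

def nearest_neighbor_distances(points):
    """Brute-force distance from each point to its nearest other point."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]

def nearest_neighbor_statistic(points, area):
    """Return (R_O, R_E, R) for a point pattern in a region of the given area."""
    n = len(points)
    r_obs = sum(nearest_neighbor_distances(points)) / n   # observed mean distance
    density = n / area                                    # points per unit area
    r_exp = 1 / (2 * math.sqrt(density))                  # expected mean distance
    return r_obs, r_exp, r_obs / r_exp

# Hypothetical pattern: four evenly spaced points in a 2 x 2 region
points = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
print(nearest_neighbor_statistic(points, area=4))   # (1.0, 0.5, 2.0)
```

Every point here is exactly 1 unit from its nearest neighbor while the expected distance is 0.5, so R = 2, towards the uniform end of the scale.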
Interpreting the Nearest Neighbor Statistic
• Values of R:
- 0 → all points are coincident
- 1 → a random pattern
- 2.1491 → a perfectly uniform pattern
• Through the examination of many random point patterns, the variance of the mean distances between neighbors has been found to be:
$$V[R_E] = \frac{4 - \pi}{4\pi\lambda n}$$
where n is the number of points
Interpreting the Nearest Neighbor Statistic
• Test statistic:
$$Z_{test} = \frac{R_O - R_E}{\sqrt{V[R_E]}} = \frac{R_O - R_E}{\sqrt{(4 - \pi)/(4\pi\lambda n)}} = 3.826\,(R_O - R_E)\sqrt{\lambda n}$$
• Compare $Z_{test}$ against the standard normal distribution
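The constant 3.826 comes from inverting the square root of the variance and pulling the numeric part out in front. A short worked derivation, with intermediate values rounded to four decimals:

```latex
\frac{1}{\sqrt{V[R_E]}}
  = \sqrt{\frac{4\pi\lambda n}{4 - \pi}}
  = \sqrt{\frac{4\pi}{4 - \pi}}\,\sqrt{\lambda n}
  \approx \sqrt{\frac{12.5664}{0.8584}}\,\sqrt{\lambda n}
  \approx 3.826\,\sqrt{\lambda n}
```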
Nearest Neighbor Analysis Example
• Observed mean distance ($R_O$):
$R_O = (1 + 1 + 2 + 3 + 3 + 3) / 6 = 13/6 = 2.167$
• Expected mean distance ($R_E$):
$R_E = 1/(2\sqrt{\lambda}) = 1/(2\sqrt{6/42}) = 1.323$
• Use these values to calculate the nearest neighbor statistic (R):
$R = R_O / R_E = 2.167/1.323 = 1.638$
• Because R is greater than 1, this suggests the points are somewhat uniformly spaced
Z-test for the Nearest Neighbor
Statistic Example
1. Research question: Is the point pattern random?
2. H0: $R_O \approx R_E$ (point pattern is approximately random)
HA: $R_O \neq R_E$ (pattern is uniform or clustered)
3. Select $\alpha = 0.05$, two-tailed because HA is non-directional
4. We have already calculated $R_O$ and $R_E$, and together with the sample size (n = 6) and the number of points per unit area ($\lambda = 6/42$), we can calculate the test statistic:
$Z_{test} = 3.826\,(R_O - R_E)\sqrt{\lambda n}$
$= 3.826\,(2.167 - 1.323)\sqrt{(6/42)(6)}$
$= 2.99$
Z-test for the Nearest Neighbor
Statistic Example
5. For $\alpha = 0.05$ and a two-tailed test, $Z_{crit} = 1.96$
6. $Z_{test} > Z_{crit}$, therefore we reject H0 and accept HA, finding that the point pattern is significantly different from a random point pattern; more specifically, it tends towards a uniform pattern because $Z_{test}$ exceeds the positive $Z_{crit}$ value
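The worked example can be checked numerically with the short script below. It is a sketch that takes the six nearest-neighbor distances from the example and assumes a study area of 42 square units, the value implied by the expected-distance calculation; the variable names are illustrative.

```python
import math

distances = [1, 1, 2, 3, 3, 3]   # nearest-neighbor distances from the example
n = len(distances)
area = 42                        # assumed study-area size (points per unit area = 6/42)

r_obs = sum(distances) / n                              # observed mean distance
density = n / area                                      # points per unit area
r_exp = 1 / (2 * math.sqrt(density))                    # expected mean distance
r = r_obs / r_exp                                       # nearest neighbor statistic
variance = (4 - math.pi) / (4 * math.pi * density * n)  # variance of the mean distance
z = (r_obs - r_exp) / math.sqrt(variance)               # test statistic

print(f"R_O = {r_obs:.3f}, R_E = {r_exp:.3f}, R = {r:.3f}, Z = {z:.2f}")
# R_O = 2.167, R_E = 1.323, R = 1.638, Z = 2.99
print("reject H0" if abs(z) > 1.96 else "fail to reject H0")
```

Computing Z directly from the variance formula gives the same 2.99 as the 3.826 shortcut, which confirms the two forms of the test statistic agree.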