Models of migration - University of Leeds


Models of migration
Observations and judgments
In: Raymer and Willekens, 2008, International migration
in Europe, Wiley
Introduction: models
• To interpret the world, we use models (mental
schemes; mental structures)
• Models are representations of portions of the real
world
• Explanation, understanding, prediction, policy
guidance
• Models of migration
Introduction: migration
• Migration: change of residence (relocation)
• Migration is situated in time and space
– Conceptual issues
• Space: administrative boundaries
• Time: duration of residence or intention to stay
– Lifetime (Poland); one year (UN); 8 days (Germany)
– Measurement issues
• Event: 'migration' (event-based approach; movement approach)
• Person: 'migrant' (status-based approach; transition approach)
=> Data types and conversion
Introduction: migration
• Multistate approach
– Place of residence at x = state (state occupancy)
– Life course is sequence of state occupancies
– Change in place of residence = state transition
• Continuous vs discrete time
– Migration takes place in continuous time
– Migration is recorded in continuous time or discrete
time
• Continuous time: direct transition or event (Rajulton)
• Discrete time: discrete-time transition
Introduction: migration
• Level of measurement or analysis
– Micro: individual
• Age at migration, direction of migration, reason for
migration, characteristic of migrant
– Macro: population (or cohort)
• Age structure, spatial structure, motivational
structure, covariate structure
• Structure is represented by models
• Structures exhibit continuity and change
Probability models
• Models include
– Structure (systematic factors)
– Chance (random factors)
• Variate = random variable
– Not able to predict its value because of chance
• Types of data (observations) => models
– Counts: Poisson variate => Poisson models
– Proportions: binomial variate => logit models (logistic)
– Rates: counts / exposure => Poisson variate with offset
Model 1: state occupancy
• $Y_k$: state occupied by individual k
• $\pi_{ki} = \Pr\{Y_k = i\}$: state probability
– Identical individuals: $\pi_{ki} = \pi_i$ for all k
– Individuals differ in some attributes: Z = covariates, $\pi_{ki} = \pi_i(Z)$
• E.g. probability of residing in region i, by region of birth
• Statistical inference: MLE of $\pi_i$
– Multinomial distribution
$$\Pr\{N_1 = n_1, N_2 = n_2, \ldots\} = \frac{m!}{\prod_{i=1}^{I} n_i!} \prod_{i=1}^{I} \pi_i^{n_i}$$
Model 1: state occupancy
• Statistical inference: MLE of state probability $\pi_i$
– Multinomial distribution
$$\Pr\{N_1 = n_1, N_2 = n_2, \ldots\} = \frac{m!}{\prod_{i=1}^{I} n_i!} \prod_{i=1}^{I} \pi_i^{n_i}$$
– Likelihood function
$$L = \prod_{i=1}^{I} \pi_i^{n_i}$$
– Log-likelihood function
$$l = \ln(L) = \sum_{i=1}^{I} n_i \ln \pi_i$$
– MLE
$$\hat{\pi}_i = \frac{n_i}{m}$$
– Expected number of individuals in i: $E[N_i] = \pi_i m$
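A minimal Python sketch of the estimator above, assuming made-up counts by region of residence (the numbers and the function name `multinomial_mle` are illustrative, not from the source). It computes the MLE $\hat{\pi}_i = n_i / m$ and the expected counts $E[N_i] = \pi_i m$.

```python
# Hypothetical counts of individuals by state occupied (illustrative only).
counts = {"Northeast": 46, "Midwest": 55, "South": 70, "West": 40}


def multinomial_mle(counts):
    """Return the MLE pi_hat_i = n_i / m for each state i."""
    m = sum(counts.values())
    return {state: n / m for state, n in counts.items()}


pi_hat = multinomial_mle(counts)
m = sum(counts.values())

# Expected number of individuals in each state: E[N_i] = pi_i * m.
expected = {state: p * m for state, p in pi_hat.items()}
```

With the MLE plugged in, the expected counts reproduce the observed counts, which is a quick check that the estimator is coded correctly.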
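Model 1: State occupancy with covariates
$$\operatorname{logit} \pi_i(Z) = \ln \frac{\pi_i(Z)}{\pi_n(Z)} = \eta_i = \beta_{i0} + \beta_{i1} Z_1 + \beta_{i2} Z_2 + \beta_{i3} Z_3 + \ldots$$
$$\pi_i = \frac{\exp(\eta_i)}{\sum_{j=1}^{I} \exp(\eta_j)} = \frac{\exp(\eta_i)}{\exp(\eta_1) + \exp(\eta_2) + \ldots + 1 + \ldots}$$
where n is the reference category ($\eta_n = 0$, which gives the term 1 in the denominator)
multinomial logistic regression model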
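The mapping from linear predictors to state probabilities can be sketched as follows. The coefficients, the single covariate `z1` and the choice of "West" as reference category are assumptions made for illustration only.

```python
import math


def state_probabilities(eta):
    """pi_i = exp(eta_i) / sum_j exp(eta_j), for a dict of linear predictors."""
    top = max(eta.values())  # subtract the maximum for numerical stability
    weights = {i: math.exp(v - top) for i, v in eta.items()}
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}


# Hypothetical coefficients (beta_i0, beta_i1); the reference category has eta = 0.
beta = {
    "Northeast": (0.2, -0.5),
    "Midwest": (0.4, 0.1),
    "South": (0.6, 0.3),
    "West": (0.0, 0.0),  # reference category
}
z1 = 1.0
eta = {i: b0 + b1 * z1 for i, (b0, b1) in beta.items()}
pi = state_probabilities(eta)

# The log-odds relative to the reference category recover eta_i.
log_odds_south = math.log(pi["South"] / pi["West"])
```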
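Count data
Poisson model:
$$\Pr\{N_i = n_i\} = \frac{\lambda_i^{n_i}}{n_i!} \exp[-\lambda_i]$$
Covariates:
$$E[N_i] = \lambda_i = \exp[\beta_{i0} + \beta_{i1} Z_1 + \beta_{i2} Z_2 + \ldots]$$
$$\ln \lambda_i = \beta_{i0} + \beta_{i1} Z_1 + \beta_{i2} Z_2 + \ldots$$
The log-rate model is a log-linear model with an offset:
$$E\left[\frac{N_i}{PY_i}\right] = \frac{\lambda_i}{PY_i} = \exp[\beta_{i0} + \beta_{i1} Z_1 + \beta_{i2} Z_2 + \ldots]$$
$$E[N_i] = \lambda_i = PY_i \exp[\beta_{i0} + \beta_{i1} Z_1 + \beta_{i2} Z_2 + \ldots]$$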
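As a small illustration of the offset, the sketch below computes an expected count from an exposure and a linear predictor. The exposure of 1,000 person-years and the coefficients are hypothetical values chosen for the example.

```python
import math


def expected_count(person_years, betas, z):
    """E[N] = PY * exp(beta_0 + beta_1*Z_1 + ...); ln(PY) plays the role of the offset."""
    eta = betas[0] + sum(b * zk for b, zk in zip(betas[1:], z))
    return person_years * math.exp(eta)


def poisson_pmf(n, lam):
    """Pr{N = n} = lam**n / n! * exp(-lam)."""
    return lam ** n / math.factorial(n) * math.exp(-lam)


# Hypothetical inputs: 1,000 person-years, baseline log-rate -4, one covariate.
person_years = 1000.0
lam = expected_count(person_years, betas=[-4.0, 0.5], z=[1.0])
rate = lam / person_years  # the rate depends on the covariates only
```

Changing `person_years` rescales the expected count but leaves `rate` unchanged, which is the point of treating exposure as an offset rather than a covariate.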
Model 2: Transition probabilities
Age x
• State probability $\pi_{ki}(x, Z) = \Pr\{Y_k(x, Z) = i \mid Z\}$
• Transition probability
$$\Pr\{Y(x+1) = j \mid Y(x), Y(x-1), \ldots; Z\} = \Pr\{Y(x+1) = j \mid Y(x); Z\}$$
$$\Pr\{Y(x+1) = j \mid Y(x) = i\} = p_{ij}(x)$$
discrete-time transition probability
Migrant data; Option 2
Model 2: Transition probabilities
• Transition probability as a logit model
$$\operatorname{logit}[\pi_j(x+1)] = \beta_{j0}(x) + \beta_{j1}(x) Y_i(x)$$
$$p_{ij}(x) = \frac{\exp[\beta_{j0}(x) + \beta_{j1}(x) Y_i(x)]}{\sum_{r=1}^{I} \exp[\beta_{r0}(x) + \beta_{r1}(x) Y_i(x)]}$$
with $\beta_{j0}(x)$ = logit of residing in j at x+1 for the reference category (not residing in i at x) and $\beta_{j0}(x) + \beta_{j1}(x)$ = logit of residing in j at x+1 for a resident of i at x.
Model 2: Transition probabilities with covariates
$$p_{ij}(x) = \frac{\exp[\eta_{ij}(x)]}{\sum_{r=1}^{I} \exp[\eta_{ir}(x)]}$$
with
$$\eta_{ij}(x) = \beta_{ij0}(x) + \beta_{ij1}(x) Z_1 + \beta_{ij2}(x) Z_2 + \beta_{ij3}(x) Z_3 + \ldots$$
e.g. $Z_k = 1$ if k is the region of birth ($k \neq i$); 0 otherwise.
$\beta_{ij0}(x)$ is the logit of residing in j at x+1 for someone who resides in i at x and was born in i.
multinomial logistic regression model
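Model 3: Transition rates
$$\mu_{ij}(x) = \lim_{(y-x) \to 0} \frac{p_{ij}(x, y)}{y - x} \quad \text{for } i \neq j$$
$\mu_{ii}(x)$ is defined such that
$$\mu_{ii}(x) - \sum_{j \neq i} \mu_{ij}(x) = 0$$
Hence
$$\mu_{ii}(x) = \sum_{j \neq i} \mu_{ij}(x) = \lim_{(y-x) \to 0} \frac{1 - p_{ii}(x, y)}{y - x}$$
Force of retention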
Transition rates: matrix of intensities
$$\boldsymbol{\mu}(x) = \begin{bmatrix}
\mu_{11}(x) & -\mu_{21}(x) & \cdots & -\mu_{I1}(x) \\
-\mu_{12}(x) & \mu_{22}(x) & \cdots & -\mu_{I2}(x) \\
\vdots & \vdots & \ddots & \vdots \\
-\mu_{1I}(x) & -\mu_{2I}(x) & \cdots & \mu_{II}(x)
\end{bmatrix}$$
Discrete-time transition probabilities:
$$\mathbf{P}(x, y) = \begin{bmatrix}
p_{11}(x, y) & p_{21}(x, y) & \cdots & p_{I1}(x, y) \\
p_{12}(x, y) & p_{22}(x, y) & \cdots & p_{I2}(x, y) \\
\vdots & \vdots & \ddots & \vdots \\
p_{1I}(x, y) & p_{2I}(x, y) & \cdots & p_{II}(x, y)
\end{bmatrix}$$
$$\frac{d\mathbf{P}(x)}{dx} = -\boldsymbol{\mu}(x)\mathbf{P}(x)$$
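Transition rates: piecewise constant transition intensities (rates)
Exponential model:
$$\mathbf{P}(x, y) = \exp[-(y - x)\mathbf{M}(x, y)]$$
$$\exp(\mathbf{A}) = \mathbf{I} + \mathbf{A} + \frac{1}{2!}\mathbf{A}^2 + \frac{1}{3!}\mathbf{A}^3 + \ldots$$
$$\exp[-(y - x)\mathbf{M}(x, y)] = \mathbf{I} - (y - x)\mathbf{M}(x, y) + \frac{(y - x)^2}{2!}\mathbf{M}(x, y)^2 - \frac{(y - x)^3}{3!}\mathbf{M}(x, y)^3 + \ldots$$
Linear approximation:
$$\mathbf{P}(x, y) \approx \left[\mathbf{I} + \tfrac{y - x}{2}\mathbf{M}(x, y)\right]^{-1}\left[\mathbf{I} - \tfrac{y - x}{2}\mathbf{M}(x, y)\right]$$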
Transition rates: generation and distribution
$$\mu_{ij}(x) = \mu_i(x)\,\xi_{ij}(x)$$
where $\xi_{ij}(x)$ is the probability that an individual who leaves i selects j as the destination. It is the conditional probability of a direct transition from i to j.
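Competing risk model
$$\begin{bmatrix}
\mu_{11}(x) & -\mu_{21}(x) & \cdots & -\mu_{I1}(x) \\
-\mu_{12}(x) & \mu_{22}(x) & \cdots & -\mu_{I2}(x) \\
\vdots & \vdots & \ddots & \vdots \\
-\mu_{1I}(x) & -\mu_{2I}(x) & \cdots & \mu_{II}(x)
\end{bmatrix}
=
\begin{bmatrix}
\xi_{11}(x) & -\xi_{21}(x) & \cdots & -\xi_{I1}(x) \\
-\xi_{12}(x) & \xi_{22}(x) & \cdots & -\xi_{I2}(x) \\
\vdots & \vdots & \ddots & \vdots \\
-\xi_{1I}(x) & -\xi_{2I}(x) & \cdots & \xi_{II}(x)
\end{bmatrix}
\begin{bmatrix}
\mu_1(x) & 0 & \cdots & 0 \\
0 & \mu_2(x) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \mu_I(x)
\end{bmatrix}$$
where $\mu_i(x)$ is the rate of leaving i and the diagonal elements $\xi_{ii}(x)$ equal 1.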
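The decomposition into an exit rate and destination probabilities can be sketched numerically. The destination counts below are the 1975–1980 flows out of the Northeast reported in Table A of this transcript; the exit rate `mu_i` is a hypothetical value, and the function name is illustrative.

```python
def transition_rates(exit_rate, destination_probs):
    """mu_ij = mu_i * xi_ij for every destination j other than i."""
    return {j: exit_rate * xi for j, xi in destination_probs.items()}


# Out-migrants from the Northeast by destination, 1975-1980 (Table A).
leavers = {"Midwest": 462, "South": 1800, "West": 753}
total_leavers = sum(leavers.values())

# Destination probabilities: share of leavers choosing each destination.
xi = {j: n / total_leavers for j, n in leavers.items()}

mu_i = 0.013  # hypothetical exit rate per year, not taken from the source
mu_ij = transition_rates(mu_i, xi)
```

Because the destination probabilities sum to one, the destination-specific rates add up to the exit rate again.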
Transition rates: generation and distribution with covariates
Let $\mu_i$ be constant during the interval => $\mu_i = m_i$
Log-linear model
$$m_i = \exp[\beta_{i0} + \beta_{i1} Z_1 + \beta_{i2} Z_2 + \ldots]$$
$$\ln m_i = \beta_{i0} + \beta_{i1} Z_1 + \beta_{i2} Z_2 + \ldots$$
Cox model
$$m_i(x) = m_{i0}(x) \exp[\beta_{i0} + \beta_{i1} Z_1 + \beta_{i2} Z_2 + \ldots]$$
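From transition probabilities to transition rates
The inverse method (Singer and Spilerman)
$$\mathbf{P}(x, y) \approx \left[\mathbf{I} + \tfrac{y - x}{2}\mathbf{M}(x, y)\right]^{-1}\left[\mathbf{I} - \tfrac{y - x}{2}\mathbf{M}(x, y)\right]$$
$$\mathbf{M}(x, y) = \frac{2}{y - x}\left[\mathbf{I} - \mathbf{P}(x, y)\right]\left[\mathbf{I} + \mathbf{P}(x, y)\right]^{-1}$$
From 5-year probability to 1-year probability:
$$\mathbf{P}(x, x+1) = \exp[-\mathbf{M}(x, x+1)]$$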
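A rough sketch of the conversion for two regions follows. The 5-year probabilities in `P5` are hypothetical, the rates are assumed constant over the 5-year interval, and the small matrix helpers are written for this example only.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def expm_series(a, terms=30):
    """Truncated Taylor series exp(A) = I + A + A^2/2! + ..."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, a)]
        result = [[u + v for u, v in zip(ru, rv)] for ru, rv in zip(result, term)]
    return result


# Hypothetical 5-year transition probabilities (column = origin, row = destination).
P5 = [[0.64, 0.18],
      [0.36, 0.82]]
h = 5.0

I_minus_P = [[(1.0 if i == j else 0.0) - P5[i][j] for j in range(2)] for i in range(2)]
I_plus_P = [[(1.0 if i == j else 0.0) + P5[i][j] for j in range(2)] for i in range(2)]

# Inverse method: M = 2 / (y - x) * [I - P] [I + P]^-1, in annual units.
M = [[2.0 / h * v for v in row] for row in matmul(I_minus_P, inverse_2x2(I_plus_P))]

# One-year probabilities under the exponential model: P(x, x+1) = exp[-M].
P1 = expm_series([[-v for v in row] for row in M])
```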
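Incomplete data
Poisson model:
$$\Pr\{N_{ij} = n_{ij}\} = \frac{\lambda_{ij}^{n_{ij}}}{n_{ij}!} \exp[-\lambda_{ij}]$$
Data availability:
Expectation (E):
$$E[N_{ij}] = \lambda_{ij} = \alpha_i \beta_j$$
The maximization (M) of the probability is equivalent to maximizing the log-likelihood
$$l = \sum_{i,j} n_{ij} \ln[\alpha_i \beta_j] - \sum_{i,j} \alpha_i \beta_j$$
$$\hat{\alpha}_i = \frac{n_{i+}}{\sum_j \hat{\beta}_j} \qquad \hat{\beta}_j = \frac{n_{+j}}{\sum_i \hat{\alpha}_i}$$
The EM algorithm results in the well-known expression
$$\hat{\lambda}_{ij} = \frac{n_{i+}\, n_{+j}}{n_{++}}$$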
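The two updating equations can be iterated directly when only the marginal totals are known. The sketch below does this with the 1980–1985 origin and destination totals listed in this transcript and compares the result with the closed-form expression; the function name and the starting values of 1 are choices made for the example.

```python
def fit_independence(row_totals, col_totals, iterations=50):
    """Alternate alpha_i = n_i+ / sum_j beta_j and beta_j = n_+j / sum_i alpha_i."""
    alpha = [1.0] * len(row_totals)
    beta = [1.0] * len(col_totals)
    for _ in range(iterations):
        beta_sum = sum(beta)
        alpha = [r / beta_sum for r in row_totals]
        alpha_sum = sum(alpha)
        beta = [c / alpha_sum for c in col_totals]
    # Expected flows lambda_ij = alpha_i * beta_j.
    return [[a * b for b in beta] for a in alpha]


row_totals = [47084, 55735, 71272, 42019]  # origin totals, 1980-1985
col_totals = [46059, 54214, 73168, 42669]  # destination totals, 1980-1985

lam = fit_independence(row_totals, col_totals)

# Closed form: lambda_ij = n_i+ * n_+j / n_++.
n_total = sum(row_totals)
closed_form = [[r * c / n_total for c in col_totals] for r in row_totals]
```

Without prior information on the interaction between origin and destination, the fitted flows carry no origin-destination association at all, which motivates bringing in an earlier matrix.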

Incomplete data: Prior information
Gravity model
$$E[N_{ij}] = \lambda_{ij} = k\, a_i\, \beta_j \exp[-\gamma c_{ij}]$$
Log-linear model
$$\ln \lambda_{ij} = u + u_i^A + u_j^B + u_{ij}^{AB}$$
Model with offset
$$E[N_{ij}] = \lambda_{ij} = \alpha_i^{*}\, \beta_j^{*}\, m_{ij}^{0}$$
A. Data, 1975–1980 (rows: region of origin; columns: region of destination)

Origin       Northeast   Midwest     South      West     Total
Northeast       43,123       462     1,800       753    46,138
Midwest            350    51,136     1,845     1,269    54,600
South              695     1,082    67,095     1,141    70,013
West               287       677     1,120    37,902    39,986
Total           44,455    53,357    71,860    41,065   210,737

Odds of South rather than West, Midwest origin: 1845 / 1269 = 1.454
Odds of South rather than West, Northeast origin: 1800 / 753 = 2.390
Odds ratio: 2.390 / 1.454 = 1.644
B. Data, 1980–1985 (rows: region of origin; columns: region of destination)

Origin       Northeast   Midwest     South      West     Total
Northeast       44,845       379     1,387       473    47,084
Midwest            326    52,311     1,954     1,144    55,735
South              651       855    68,742     1,024    71,272
West               237       669     1,085    40,028    42,019
Total           46,059    54,214    73,168    42,669   216,110

C. Predicted, 1980–1985: flows predicted based on marginal totals and 1975-80 matrix

Origin       Northeast   Midwest     South      West     Total
Northeast       44,445       393     1,614       632    47,084
Midwest            431    52,055     1,977     1,272    55,735
South              814     1,047    68,324     1,087    71,272
West               369       719     1,253    39,678    42,019
Total           46,059    54,214    73,168    42,669   216,110

Odds ratio: [1614/632] / [1977/1272] = 1.644
Interaction effect is 'borrowed'
Source: Rogers et al. (2003a)
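One common way to fit a model with an offset to known marginal totals is iterative proportional fitting (IPF). The sketch below uses the 1975–1980 matrix as the offset and the 1980–1985 marginal totals as constraints. It is an illustration of the idea under that assumption, not necessarily the procedure used by Rogers et al., and the function names are made up for the example.

```python
def ipf(seed, row_totals, col_totals, iterations=200):
    """Rescale rows and columns of `seed` in turn until both sets of margins match."""
    flows = [row[:] for row in seed]
    for _ in range(iterations):
        for i, target in enumerate(row_totals):
            factor = target / sum(flows[i])
            flows[i] = [v * factor for v in flows[i]]
        for j, target in enumerate(col_totals):
            factor = target / sum(row[j] for row in flows)
            for row in flows:
                row[j] *= factor
    return flows


def odds_ratio(table):
    """Odds of South vs West for Northeast origin, relative to Midwest origin."""
    return (table[0][2] / table[0][3]) / (table[1][2] / table[1][3])


# 1975-1980 flows (rows: origin; columns: destination), order NE, MW, South, West.
seed = [[43123, 462, 1800, 753],
        [350, 51136, 1845, 1269],
        [695, 1082, 67095, 1141],
        [287, 677, 1120, 37902]]
row_totals = [47084, 55735, 71272, 42019]  # 1980-1985 origin totals
col_totals = [46059, 54214, 73168, 42669]  # 1980-1985 destination totals

predicted = ipf(seed, row_totals, col_totals)
```

Row and column scaling leaves every odds ratio of the seed matrix unchanged, so `odds_ratio(predicted)` equals `odds_ratio(seed)`: the interaction structure is carried over from the earlier period while the margins are updated.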
Adding judgmental data
• Techniques developed in judgmental
forecasting: expert opinions
• Expert opinion viewed as data, e.g. as
covariate in regression model with known
coefficient (Knudsen, 1992)
• Introduce expert knowledge on age
structure or spatial structure through model
parameters that represent these structures
Adding judgmental data
• US interregional migration
• 1975-80 matrix + migration survey in West
• Judgments
– Attractiveness of West diminished in early 1980s
– Increased propensity to leave Northeast and Midwest
• Quantify judgments
– Odds that a migrant selects the South rather than the West increase by 20%
– Odds that a migrant into the West originates from the Northeast (rather than the West) are 90% higher. For the Midwest they are 20% higher.
D. Predicted, 1980–1985: flows predicted based on West survey and 1975-80 matrix

Origin       Northeast   Midwest     South      West     Total
Northeast       21,181       272     1,037       473    22,963
Midwest            247    43,135     1,526     1,144    46,052
South              488       909    55,235     1,024    57,656
West               237       669     1,085    40,028    42,019
Total           22,153    44,985    58,883    42,669   168,690

E. Predicted, 1980–1985: flows predicted based on West survey, 1975-80 matrix and judgmental data

Origin       Northeast   Midwest     South      West     Total
Northeast       40,243       516     2,365       899    44,023
Midwest            296    51,762     2,197     1,373    55,628
South              488       909    66,282     1,024    68,703
West               237       669     1,302    40,028    42,236
Total           41,264    53,856    72,146    43,324   210,590
Source: Rogers et al. (2003a)
Conclusion
• Unified perspective on modeling of migration:
probability models of counts, probabilities
(proportions) or rates (risk indicators)
• State occupancies and state transitions
– Transition rate = exit rate * destination probabilities
• Judgments
– Timing of event
– Direction of change