Transcript Document
Sufficient Statistics
Dayu 11.11
Some Abbreviations
• i.i.d. : independent, identically distributed
Content
• Estimator, Unbiased, Mean Square Error (MSE) and Minimum-Variance Unbiased Estimator (MVUE)
• When is the MVUE unique? The Lehmann–Scheffé Theorem
  – Unbiased
  – Complete
  – Sufficient (the Neyman-Fisher factorization criterion)
• How to construct the MVUE?
• Rao-Blackwell theorem
Estimator
• The probability mass function (or density) of X is partially unknown, i.e. of the form f(x;θ), where θ is a parameter varying in the parameter space Θ.
• An estimator is a statistic t(X), computed from the observed sample, that is used to infer the value of θ.
Unbiased
• An estimator θ̂ = t(X) is unbiased if its expectation equals the parameter it estimates, i.e. E{t(X)} = θ.
• E.g. using the mean of a sample to estimate the mean of the population: x̄ is unbiased, since
  E(x̄) = E((1/n) Σ xi) = (1/n) E(Σ xi) = (1/n) Σ E(xi) = (1/n) · n · μ = μ.
Mean Squared Error (MSE)
• The MSE of an estimator T of an unobservable parameter θ is MSE(T) = E[(T − θ)²].
• Since E(Y²) = V(Y) + [E(Y)]², MSE(T) = var(T) + [bias(T)]², where bias(T) = E(T − θ) = E(T) − θ.
• For an unbiased estimator, MSE(T) = V(T), since bias(T) = 0.
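The decomposition above is just the identity E(Y²) = V(Y) + [E(Y)]² applied with Y = T − θ; a minimal LaTeX rendering of that step:

```latex
\begin{align*}
\mathrm{MSE}(T) &= E\big[(T-\theta)^2\big] \\
                &= \operatorname{var}(T-\theta) + \big[E(T-\theta)\big]^2
                   && \text{identity with } Y = T-\theta \\
                &= \operatorname{var}(T) + \big[\mathrm{bias}(T)\big]^2
                   && \text{since } \theta \text{ is a constant.}
\end{align*}
```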
Examples
• Two estimators for σ²:
  – (1/n) Σ (xi − x̄)², the result from the MLE (under a normal model): biased, but smaller variance.
  – s² = (1/(n−1)) Σ (xi − x̄)²: unbiased, but bigger variance.
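A small simulation sketch of this bias/variance trade-off; the normal model, the sample size n = 10, and the true σ² = 4 below are illustrative assumptions, not from the transcript:

```python
import numpy as np

# Illustrative assumption: i.i.d. normal data with known true variance sigma2.
rng = np.random.default_rng(0)
n, sigma2, reps = 10, 4.0, 100_000

samples = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(reps, n))
xbar = samples.mean(axis=1, keepdims=True)
ss = ((samples - xbar) ** 2).sum(axis=1)

mle = ss / n              # MLE of sigma^2: biased, smaller variance
unbiased = ss / (n - 1)   # sample variance: unbiased, bigger variance

for name, est in [("MLE 1/n", mle), ("unbiased 1/(n-1)", unbiased)]:
    bias = est.mean() - sigma2
    var = est.var()
    print(f"{name:>18}: bias={bias:+.3f}  var={var:.3f}  MSE={var + bias**2:.3f}")
```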
Minimum-Variance Unbiased Estimator (MVUE)
• An unbiased estimator of minimum MSE also has minimum variance.
• The MVUE is an unbiased estimator of the parameters whose variance is minimized for all values of the parameters.
• Two theorems:
  – Lehmann-Scheffé theorem: can show that the MVUE is unique.
  – Rao-Blackwell theorem: constructing an MVUE.
Lehmann–Scheffé Theorem
• Any estimator that is complete, sufficient, and unbiased is the unique best unbiased estimator of its expectation.
• The Lehmann-Scheffé Theorem states that if a complete and sufficient statistic T exists, then the UMVU estimator of g(θ) (if it exists) must be a function of T.
Completeness
• Suppose a random variable X has a probability distribution belonging to a known family of probability distributions, parameterized by θ.
• A function g(X) is an unbiased estimator of zero if the expectation E(g(X)) remains zero regardless of the value of the parameter θ (by the definition of unbiasedness).
• Then X is a complete statistic precisely if it admits (up to a set of measure zero) no such unbiased estimator of zero except 0 itself (restated in symbols below).
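The same definition restated compactly in symbols (a standard formulation, added here for reference; T denotes the statistic):

```latex
% T is complete for the family { f(.;theta) : theta in Theta } if
\[
E_\theta\big[g(T)\big] = 0 \ \text{for all } \theta \in \Theta
\quad\Longrightarrow\quad
P_\theta\big(g(T) = 0\big) = 1 \ \text{for all } \theta \in \Theta .
\]
```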
Example of Completeness
• Suppose X1, X2 are i.i.d. random variables, normally distributed with expectation θ and variance 1.
• Not complete: X1 − X2 is an unbiased estimator of zero. Therefore the pair (X1, X2) is not a complete statistic.
• Complete: on the other hand, the sum X1 + X2 can be shown to be a complete statistic. That means that there is no non-zero function g such that E(g(X1 + X2)) remains zero regardless of changes in the value of θ.
Detailed Explanations
• X1 + X2 ~ N(2θ, 2); the argument that no non-zero g can have E(g(X1 + X2)) ≡ 0 is sketched below.
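A sketch of the standard argument behind the completeness claim (not spelled out in the transcript), via uniqueness of the Laplace transform:

```latex
% With S = X_1 + X_2 \sim N(2\theta, 2):
\begin{align*}
E_\theta\big[g(S)\big]
  &= \int_{-\infty}^{\infty} g(s)\,\frac{1}{2\sqrt{\pi}}\,
     e^{-(s-2\theta)^2/4}\,ds = 0 \quad \text{for all } \theta \\
  &\;\Longrightarrow\;
     \int_{-\infty}^{\infty} g(s)\, e^{-s^2/4}\, e^{\theta s}\,ds = 0
     \quad \text{for all } \theta,
\end{align*}
% and uniqueness of the two-sided Laplace transform then forces
% g(s) e^{-s^2/4} = 0, i.e. g = 0, almost everywhere.
```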
Sufficiency
• Consider an i.i.d. sample X1, X2, .., Xn.
• Two people, A and B:
  – A observes the entire sample X1, X2, .., Xn.
  – B observes only one number T, T = T(X1, X2, .., Xn).
• Intuitively, who has more information?
• Under what condition will B have as much information about θ as A has?
Sufficiency
• Definition: – A statistic T(X) is sufficient for θ precisely if the conditional probability distribution of the data X given the statistic T(X) does not depend on θ.
• How to find one? The Neyman-Fisher factorization criterion:
• If the probability density function of X is f(x;θ), then T satisfies the factorization criterion if and only if functions g and h can be found such that f(x;θ) = g(T(x), θ) · h(x), where
  – h(x): a function that does not depend on θ
  – g(T(x), θ): a function that depends on the data only through T(x)
• E.g. T = x1 + x2 + .. + xn is a sufficient statistic for p for the Bernoulli distribution B(p): the joint density factors as g(T(x), p) · 1, i.e. h(x) = 1 (written out below).
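The Bernoulli factorization written out (a standard computation, consistent with the g(T(x), p)·1 and h(x) = 1 noted above):

```latex
\begin{align*}
f(x;p) &= \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i}
        = p^{\sum_i x_i}\,(1-p)^{\,n-\sum_i x_i} \\
       &= \underbrace{p^{T(x)}(1-p)^{\,n-T(x)}}_{g(T(x),\,p)}
          \cdot \underbrace{1}_{h(x)},
\qquad T(x) = x_1 + x_2 + \dots + x_n .
\end{align*}
```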
Example 2
• Test T = x1 + x2 + .. + xn for the Poisson distribution Π(λ): the joint density factors into g(T(x), λ) times an h(x) that is independent of λ (see the factorization below).
• Hence, T = x1 + x2 + .. + xn is sufficient!
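The Poisson factorization written out (the standard computation behind the claim above):

```latex
\begin{align*}
f(x;\lambda) &= \prod_{i=1}^{n} \frac{\lambda^{x_i} e^{-\lambda}}{x_i!}
             = \lambda^{\sum_i x_i}\, e^{-n\lambda}\,
               \frac{1}{\prod_{i=1}^{n} x_i!} \\
             &= \underbrace{\lambda^{T(x)}\, e^{-n\lambda}}_{g(T(x),\,\lambda)}
               \cdot
               \underbrace{\frac{1}{\prod_{i=1}^{n} x_i!}}_{h(x),\ \text{free of } \lambda},
\qquad T(x) = x_1 + x_2 + \dots + x_n .
\end{align*}
```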
Notes on Sufficient Statistics
• Note that the sufficient statistic is not unique: any one-to-one function of a sufficient statistic is also sufficient. If T(x) is sufficient, so are T(x)/n and log(T(x)).
Rao-Blackwell theorem
• Named after:
  – C.R. Rao (1920- ), a famous Indian statistician and currently professor emeritus at Penn State University
  – David Blackwell (1919- ), Professor Emeritus of Statistics at UC Berkeley
• Describes a technique that can transform an arbitrarily crude estimator into an estimator that is optimal by the mean-squared-error criterion, or any of a variety of similar criteria.
Rao-Blackwell theorem
• Definition: A Rao–Blackwell estimator δ1(X) of an unobservable quantity θ is the conditional expected value E(δ(X) | T(X)) of some estimator δ(X), given a sufficient statistic T(X).
  – δ(X): the "original estimator"
  – δ1(X): the "improved estimator"
• The mean squared error of the Rao–Blackwell estimator does not exceed that of the original estimator.
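A one-line justification of the MSE claim (the standard argument via conditional Jensen's inequality, not shown in the transcript):

```latex
% With \delta_1(X) = E[\delta(X) \mid T(X)], conditional Jensen's inequality gives
\[
E\big[(\delta_1 - \theta)^2\big]
  = E\Big[\big(E[\,\delta - \theta \mid T\,]\big)^2\Big]
  \le E\Big[E\big[(\delta - \theta)^2 \mid T\big]\Big]
  = E\big[(\delta - \theta)^2\big].
\]
```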
Conditional Expectation
• For a discrete random variable X and an event {X ∈ B} with P(X ∈ B) > 0:
  E{X | X ∈ B} = Σ_{x ∈ B} x · f(x | X ∈ B),
  where the conditional probability mass function is
  f(x | X ∈ B) = P(X = x) / P(X ∈ B) for x ∈ B, and 0 for x ∉ B.
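A tiny numerical illustration of these two formulas; the fair six-sided die and the event B = {2, 4, 6} are invented purely for illustration:

```python
from fractions import Fraction

# Fair six-sided die: f(x) = 1/6 for x in {1, ..., 6}.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
B = {2, 4, 6}                                   # conditioning event {X in B}

p_B = sum(pmf[x] for x in B)                    # P(X in B)
cond_pmf = {x: pmf[x] / p_B for x in B}         # f(x | X in B), zero outside B
cond_exp = sum(x * cond_pmf[x] for x in B)      # E{X | X in B}

print("P(X in B)     =", p_B)        # 1/2
print("E[X | X in B] =", cond_exp)   # 4
```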
Example I
• Phone calls arrive at a switchboard according to a Poisson process at an average rate of λ per minute.
• λ is not observable.
• Observe: the numbers of phone calls that arrived during n successive one-minute periods.
• It is desired to estimate the probability e^(−λ) that the next one-minute period passes with no phone calls.
Original estimator: a crude estimate based on the first one-minute period alone (see the sketch below); t = x1 + x2 + .. + xn is sufficient.
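A sketch of the Rao-Blackwell step for this example, assuming (as in the standard version of this example) that the original crude estimator is δ0 = 1 if X1 = 0 and δ0 = 0 otherwise:

```latex
% Given T = X_1 + ... + X_n = t, each of the t calls falls in the first minute
% independently with probability 1/n, so X_1 | T = t ~ Binomial(t, 1/n). Hence
\[
\delta_1(t) = E[\delta_0 \mid T = t]
            = P(X_1 = 0 \mid T = t)
            = \Big(1 - \frac{1}{n}\Big)^{t},
\]
% an unbiased, improved estimator of e^{-\lambda}.
```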
Example II
• To estimate λ from X1 … Xn, i.i.d. with Xi ~ P(λ).
• Original estimator: X1. We know t = X1 + … + Xn is sufficient.
• Improved estimator by the R-B theorem: E[X1 | X1 + … + Xn = t]. This cannot be computed directly, but we know
  Σi E(Xi | X1 + … + Xn = t) = E(Σi Xi | X1 + … + Xn = t) = t.
• Since X1 … Xn are i.i.d., every term is equal, so each is t/n. In fact, the improved estimator is x̄ = t/n.
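A simulation sketch comparing the original estimator X1 with the improved estimator x̄ = t/n; the values of λ, n, and the random seed are arbitrary illustrative choices:

```python
import numpy as np

# Compare the crude estimator X1 with its Rao-Blackwellized version xbar = t/n.
rng = np.random.default_rng(1)
lam, n, reps = 3.0, 20, 100_000

x = rng.poisson(lam, size=(reps, n))
crude = x[:, 0].astype(float)   # original estimator: X1 (unbiased, variance lambda)
improved = x.mean(axis=1)       # improved estimator: E[X1 | sum] = t/n (variance lambda/n)

for name, est in [("X1", crude), ("xbar = t/n", improved)]:
    print(f"{name:>11}: MSE ~ {np.mean((est - lam) ** 2):.4f}")
```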