
Deriving Parameter Estimators

1. Properties of the Sample Mean and Variance


In several important cases distributional parameters are directly related to the mean
and variance of y. For example, for the exponential distribution a = E[y], and for the
normal distribution μ = E[y] and σ^2 = Var[y]. In such cases, estimates of the mean
and variance can be used directly as estimates of the distributional parameters.

The sample mean and variance (computed by the MATLAB functions mean and var)
are often used to estimate the true mean and variance of a random variable y from the
random sample y1, y2, ..., yn.
The definition of the sample mean my is:

my = (1/n)(y1 + y2 + ... + yn)

The definition of the sample variance sy2 depends on whether the true mean is known. If the true mean
E[y] is known and is used in the sample variance calculation, the sample variance definition is:

sy2 = [ (y1 - E[y])^2 + (y2 - E[y])^2 + ... + (yn - E[y])^2 ] / n

If the sample mean (rather than the true mean) is used in the sample variance calculation, the
definition is:

sy2 = [ (y1 - my)^2 + (y2 - my)^2 + ... + (yn - my)^2 ] / (n - 1)

Note that the second definition divides the sum by n - 1 rather than n. This ensures that the
variance estimate is unbiased. We will use this second definition (since we will generally not know the
true mean). If n is large the two definitions are practically equivalent.
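
The two definitions are easy to compare in MATLAB. The short sketch below is not part of the original notes (the values mu, sigma, and n are arbitrary); it also shows that the built-in var function uses the n - 1 definition:

% Sketch: comparing the two sample variance definitions
mu = 2; sigma = 3; n = 10;             % hypothetical values, not from the notes
y = normrnd(mu, sigma, 1, n);          % random sample of size n
m_y = mean(y);                         % sample mean
s2_known = sum((y - mu).^2)/n          % true mean known: divide by n
s2_sample = sum((y - m_y).^2)/(n - 1)  % sample mean used: divide by n-1
s2_builtin = var(y)                    % MATLAB var uses the n-1 definition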
In order to assess the properties of these estimators we evaluate their means and
variances.
Mean and Variance of my:
The mean and variance of my are obtained by applying the linearity properties of the
expectation and by noting that the measurements in a random sample are
independent and identically distributed (so the covariances between different
measurements are zero):

E[my] = (1/n)(E[y1] + E[y2] + ... + E[yn]) = E[y]
Var[my] = (1/n^2)(Var[y1] + Var[y2] + ... + Var[yn]) = Var[y] / n

These results show that the sample mean is unbiased (its expectation equals the true
mean) and consistent (its variance goes to zero as n increases).
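
These results are easy to verify with a short stochastic simulation. The sketch below is illustrative only; the uniform limits (a = 1, b = 4) and sample size n = 20 are arbitrary choices:

% Sketch: check E[my] = E[y] and Var[my] = Var[y]/n by simulation
nrep = 10000; n = 20;                  % replication count and sample size
a = 1; b = 4;                          % uniform limits (arbitrary)
m = zeros(1, nrep);                    % preallocate sample means
for i = 1:nrep
    m(i) = mean(unifrnd(a, b, 1, n));  % sample mean of each replicate
end
mean(m)   % close to E[y] = (a+b)/2 = 2.5
var(m)    % close to Var[y]/n = ((b-a)^2/12)/n = 0.0375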
Mean and Variance of sy2:
The mean and variance of sy2 are obtained by applying the same properties.
The derivation of E[sy2] is complicated by the dependence of the sample mean (inside
the variance summation) on the measurements. However, it is possible after some
manipulation to show that the 1/(n-1) version of the sample variance is an unbiased
estimate of the true variance.
The derivation of Var[sy2] is more difficult but can be found in advanced statistics
texts. The result indicates that the variance of sy2 is proportional to 1/n for large n, so
the sample variance is a consistent estimator of the true variance.
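
A small simulation sketch (with an arbitrary normal distribution and sample size) illustrates both the bias of the 1/n version and the unbiasedness of the 1/(n-1) version:

% Sketch: expectations of the two variance estimators
nrep = 10000; n = 10;                  % arbitrary values
sigma2 = 4;                            % true variance
s2n  = zeros(1, nrep);
s2n1 = zeros(1, nrep);
for i = 1:nrep
    y = normrnd(0, sqrt(sigma2), 1, n);
    s2n(i)  = var(y, 1);               % 1/n definition
    s2n1(i) = var(y);                  % 1/(n-1) definition
end
mean(s2n)    % close to sigma2*(n-1)/n = 3.6 (biased low)
mean(s2n1)   % close to sigma2 = 4 (unbiased)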

2. Deriving Parameter Estimators with the Method of Moments


The sample mean and variance can be used to estimate distributional parameters
whenever a relationship can be derived between the true mean and variance and the
parameters of interest. The estimation procedure is based on the method of moments,
which can be applied to many distributions with a small number of unknown
parameters (usually one or two).

Suppose that we wish to estimate the parameters a and b of a uniform probability density fy(y) from a random
sample y1, y2, ..., yn:

fy(y) = 1/(b - a) for a <= y <= b (and 0 otherwise)

The method of moments proceeds as follows:

1. Express the mean and variance of the random variable in terms of the unknown
parameters:

E[y] = (a + b) / 2
Var[y] = (b - a)^2 / 12

2. Replace the unknown parameters in these expressions by the estimates â and b̂, and
replace the true mean and variance by the sample mean my and the sample variance sy2:

my = (â + b̂) / 2
sy2 = (b̂ - â)^2 / 12

3. Solve the moment equations for the two unknowns â and b̂:

â = my - sqrt(12 * sy2) / 2 = my - sqrt(3) * sy
b̂ = my + sqrt(12 * sy2) / 2 = my + sqrt(3) * sy
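
For example (a worked check using the true moments of a uniform density with a = 1 and b = 4): if my = 2.5 and sy2 = 0.75, then sy = 0.866, sqrt(3) * sy = 1.5, and the estimates are â = 2.5 - 1.5 = 1.0 and b̂ = 2.5 + 1.5 = 4.0.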


In order to see whether these estimators are unbiased and consistent we need to derive the mean and variance
of the estimation error. This is difficult to do symbolically but easy to do with stochastic simulation (for a
specific pair of true a and b values). The program method_moments plots the univariate probability densities
of â and b̂ for a = 1, b = 4, and n = 20:
function method_moments
% METHOD_MOMENTS  Stochastic simulation of the method of moments
% estimators ahat and bhat for the limits of a uniform distribution
close all
nsamp = 20;                 % sample size n
nrep = 10000;               % number of replicates
a = 1;                      % true lower limit
b = 4;                      % true upper limit
ahat = zeros(1, nrep);      % preallocate estimate arrays
bhat = zeros(1, nrep);
for i = 1:nrep
    y = unifrnd(a, b, 1, nsamp);              % random sample from U(a,b)
    ahat(i) = mean(y) - sqrt(12)*std(y)/2;    % method of moments estimate of a
    bhat(i) = mean(y) + sqrt(12)*std(y)/2;    % method of moments estimate of b
end
mahat = mean(ahat)          % simulated mean of ahat (displayed)
vahat = var(ahat)           % simulated variance of ahat
mbhat = mean(bhat)          % simulated mean of bhat
vbhat = var(bhat)           % simulated variance of bhat
% normalized histogram (approximate probability density) of ahat
[ncount, xbin] = hist(ahat, 50);
figure
dx = xbin(2) - xbin(1);
bar(xbin, ncount/(nrep*dx), 1)
% normalized histogram of bhat
[ncount, xbin] = hist(bhat, 50);
figure
dx = xbin(2) - xbin(1);
bar(xbin, ncount/(nrep*dx), 1)
return

Note the tendency of the estimates to cluster around the true values a = 1 and b = 4. The estimates are
essentially unbiased, with simulated means E[â] = 1.0092 and E[b̂] = 3.9917. Individual estimation errors
can still be significant (as large as 0.5) when n = 20.
A plot of the variances of â and b̂ vs. sample size n indicates that the method of moments estimates are
consistent, since both variances decrease as 1/n.

The method of moments extends to more than two parameters and can be used with
other probability distributions. However, there is no guarantee that the approach will
give either an unbiased or minimum variance estimate.
The primary alternative to the method of moments is maximum likelihood
estimation. Maximum likelihood estimators have desirable large sample properties
but can be biased and may not yield minimum variance estimates for small samples.
In practice, ad hoc approaches for deriving estimators may prove best in certain
applications. The performance of such estimators should be checked with stochastic
simulation.

3. Large Sample Properties and Standardized Statistics


The Central Limit Theorem suggests that the probability density of most estimators
with a reasonable number of samples (usually greater than around 20) will be nearly
normal. This appears to be true, for example, for the uniform distribution a and b
estimators shown above.

A good approximation to the probability density of a parameter estimate â can usually
be found if we can compute the mean and variance of â and then substitute these
values into the expression for a univariate normal probability density. This is the
so-called large sample approximation.
It is usually convenient to express the large sample probability density of a parameter
estimate in terms of a normalized variable called a standardized statistic:

za = (â - E[â]) / sqrt(Var[â])

For large samples this statistic has a unit normal probability distribution, with a mean of 0.0 and a
variance of 1.0.
The unit normal cumulative distribution function is widely tabulated in statistics texts
and reference books.

Note that if â is unbiased (so that E[â] = a) the standardized statistic is:

za = (â - a) / sqrt(Var[â])
The figure below compares the probability distribution of the standardized statistic za
(derived above with stochastic simulation, blue line) to a unit normal distribution
(dashed red line). Normal distributions plot as straight lines on the normal probability
scale used in this figure.

The unit normal provides a reasonable approximation to probabilities away from the
tails of the distribution. For example, the probability that za is between -1 and +1 is
close to 0.68 for both the original standardized statistic and the unit normal
approximation.
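
This check is easy to reproduce from the simulation results. The sketch below assumes the ahat array produced by method_moments is still in memory; it standardizes the estimates using their simulated mean and standard deviation and compares the empirical probability to the unit normal value:

% Sketch: empirical vs. unit normal probability that za lies in [-1, 1]
za = (ahat - mean(ahat))/std(ahat);   % standardized statistic for ahat
p_emp = mean(abs(za) <= 1)            % empirical probability
p_norm = normcdf(1) - normcdf(-1)     % unit normal value, about 0.6827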
