$$Q = \frac{1}{5} \sum_{i=1}^{20} (N_i - 5)^2,$$
and under the null hypothesis, $Q$ will be approximately distributed as $\chi^2$ with 19 degrees of freedom.
Suppose that we want to test whether a random sample of observations comes from a particular distribution. The following procedure can be adopted:
(i) Partition the entire real line, or any particular interval that has probability 1, into a finite number $k$ of disjoint subintervals. Generally, $k$ is chosen so that the expected number of observations in each subinterval is at least 5 if $H_0$ is true.
(ii) Determine the probability $p_i^{(0)}$ that the particular hypothesized distribution would assign to the $i$th subinterval, and calculate the expected number $n p_i^{(0)}$ of observations in the $i$th subinterval, $i = 1, \ldots, k$.
(iii) Count the number $N_i$ of observations in the sample that fall within the $i$th subinterval.
(iv) Calculate the value of $Q$ as defined in (1). If the hypothesized distribution is correct, then $Q$ will approximately follow a $\chi^2$ distribution with $k - 1$ degrees of freedom.
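Steps (i) to (iv) can be sketched in code. The following is a minimal illustration rather than a definitive implementation: it assumes that $Q$ in (1) is the usual statistic $\sum_{i=1}^{k} (N_i - n p_i^{(0)})^2 / (n p_i^{(0)})$, and the names `chi2_sf` and `gof_test`, the Uniform(0, 1) null hypothesis, and the simulated sample are all hypothetical choices made for the example.

```python
import math
import random
from bisect import bisect_right


def chi2_sf(x, df):
    """Approximate P(X > x) for X ~ chi-square(df), using the regularized
    incomplete gamma function with a = df/2 and z = x/2."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    scale = math.exp(a * math.log(z) - z - math.lgamma(a))
    if z < a + 1.0:
        # Power series for the lower tail; converges quickly when z < a + 1.
        term = total = 1.0 / a
        n = 0
        while term > total * 1e-15:
            n += 1
            term *= z / (a + n)
            total += term
        return max(0.0, 1.0 - scale * total)
    # Continued fraction (modified Lentz) for the upper tail when z >= a + 1.
    tiny = 1e-30
    b = z + 1.0 - a
    c, d = 1.0 / tiny, 1.0 / b
    h = d
    for i in range(1, 10000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return scale * h


def gof_test(sample, edges, cdf):
    """Chi-square goodness-of-fit test for cells [edges[i], edges[i+1]).

    Assumes every observation lies in [edges[0], edges[-1]], an interval
    with probability 1 under H0. Returns (Q, degrees of freedom, p-value)."""
    n, k = len(sample), len(edges) - 1
    # Step (ii): cell probabilities p_i^(0) and expected counts n * p_i^(0).
    expected = [n * (cdf(edges[i + 1]) - cdf(edges[i])) for i in range(k)]
    # Step (iii): observed counts N_i (the right endpoint goes in the last cell).
    counts = [0] * k
    for x in sample:
        counts[min(bisect_right(edges, x) - 1, k - 1)] += 1
    # Step (iv): Q = sum_i (N_i - n p_i)^2 / (n p_i), referred to chi-square(k - 1).
    q = sum((c - e) ** 2 / e for c, e in zip(counts, expected))
    return q, k - 1, chi2_sf(q, k - 1)


# Step (i): 20 equal cells on [0, 1]; with n = 100 each expected count is 5.
edges = [i / 20 for i in range(21)]
rng = random.Random(0)  # fixed seed, purely for reproducibility
sample = [rng.random() for _ in range(100)]  # simulated data, not from the notes
q, df, p_value = gof_test(sample, edges, lambda x: x)  # H0: Uniform(0, 1)
print(f"Q = {q:.3f}, df = {df}, p-value = {p_value:.3f}")
```

The tail probability is computed by hand only to keep the sketch free of third-party dependencies; a statistics library's chi-square distribution would serve the same purpose.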
In order to apply Wilks' theorem (Theorem 9.1.4 in the book), the parameter space must be an open set in $k$-dimensional space. This is not true for the multinomial distribution if we let $p$ be the parameter (as $\sum_{i=1}^{k} p_i = 1$). The set of probability vectors lies in a $(k-1)$-dimensional subset of $\mathbb{R}^k$.
However, we can effectively treat the vector $\theta = (p_1, \ldots, p_{k-1})$ as the parameter, as $p_k = 1 - p_1 - \cdots - p_{k-1}$ is a function of $\theta$. As long as we believe that all the coordinates of $p$ are strictly between 0 and 1, the set of possible values of the $(k-1)$-dimensional parameter $\theta$ is open.
Therefore, by Wilks' theorem, $-2 \log \Lambda(X)$ is approximately $\chi^2$ with $k - 1$ degrees of freedom.
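For reference, the likelihood ratio statistic can be written out explicitly in the multinomial setting. The following is a sketch under the assumption of a simple null hypothesis $H_0 : p_i = p_i^{(0)}$ for $i = 1, \ldots, k$, with $N_i$ the observed cell counts, $n = \sum_{i=1}^{k} N_i$, and the convention that terms with $N_i = 0$ contribute zero.

```latex
\begin{align*}
L(p) &\propto \prod_{i=1}^{k} p_i^{N_i},
  \qquad \hat{p}_i = \frac{N_i}{n}, \\
\Lambda(X) &= \frac{L(p^{(0)})}{L(\hat{p})}
  = \prod_{i=1}^{k} \left( \frac{n p_i^{(0)}}{N_i} \right)^{N_i}, \\
-2 \log \Lambda(X) &= 2 \sum_{i=1}^{k} N_i \log \frac{N_i}{n p_i^{(0)}} .
\end{align*}
```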
Exercise: Suppose that $Y_1, \ldots, Y_n$ is a random sample from a population with density function given by
$$f(y \mid p) = \begin{cases} p_j & \text{if } y = j, \text{ where } j = 1, 2, 3, \\ 0 & \text{otherwise}, \end{cases}$$
where $p = (p_1, p_2, p_3)$ is the vector of parameters such that $p_1 + p_2 + p_3 = 1$ and $p_j \ge 0$ for $j = 1, 2, 3$. Use the likelihood ratio test for testing $H_0 : p_1 = p_2 = p_3$ versus $H_1 : H_0$ is not true. Use the level $\alpha = 0.05$.
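One way to check a hand-derived answer numerically is sketched below. It assumes the standard result that the unrestricted maximum likelihood estimate is $\hat{p}_j = N_j / n$, where $N_j$ is the number of observations equal to $j$, so that $-2 \log \Lambda = 2 \sum_j N_j \log(3 N_j / n)$, referred to a $\chi^2$ distribution with 2 degrees of freedom. The function name `lrt_equal_probs` and the counts are hypothetical and are not taken from the exercise.

```python
import math


def lrt_equal_probs(counts):
    """Return -2 log Lambda for H0: p_1 = ... = p_k = 1/k, given category counts."""
    n, k = sum(counts), len(counts)
    # Unrestricted MLE: p_hat_j = N_j / n. Under H0: p_j = 1/k.
    # -2 log Lambda = 2 * sum_j N_j * log(N_j / (n / k)); empty categories contribute 0.
    return 2.0 * sum(c * math.log(c * k / n) for c in counts if c > 0)


counts = [30, 40, 50]  # hypothetical counts of Y = 1, 2, 3
alpha = 0.05
stat = lrt_equal_probs(counts)
# For 2 degrees of freedom the chi-square upper tail is exp(-x / 2),
# so the level-alpha critical value is -2 * log(alpha).
critical = -2.0 * math.log(alpha)
p_value = math.exp(-stat / 2.0)
print(f"-2 log Lambda = {stat:.4f}, critical value = {critical:.4f}, p-value = {p_value:.4f}")
print("reject H0" if stat > critical else "do not reject H0")
```

The closed-form tail $e^{-x/2}$ is specific to 2 degrees of freedom, which is why no general $\chi^2$ routine is needed here.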
1.3 Goodness-of-fit for composite hypothesis
We can extend the goodness-of-fit test to deal with the case in which the null hypothesis is that the distribution of our data belongs to a particular parametric family. The alternative hypothesis is that the data have a distribution that is not a member of that parametric family.
Thus, in the statistic $Q$, the probabilities $p_i^{(0)}$