Part I

For questions (a), (b) and (c), use the normal distribution. For questions (d) and
(e), do not use the normal distribution.

(a) A pundit has claimed that at least 60% of UK soccer matches end in victory for the
home team. Use the dataset “UKSoccer_1” to test this hypothesis using a level of
significance of 5%.

(b) Conduct the same test as in question (a), but using the larger sample in
“UKSoccer_2”.

(c) The IT department at UCC has estimated that there are 23.7 hacking attempts
against their network per day on average. If hacking attempts follow a Poisson
distribution, then what is the probability of there being 30 or more hacking attempts
during a day?

(d) A publication claims that 25% of babies are born prematurely. Conduct a test of
this claim with a level of significance of 10%. (See the Preemie variable in “Babysamp 98”.)

(e) The advertising for a diet product claims that at least 20% of men have a body fat
percentage greater than 30%. Test this claim with a level of significance of 5%. Dataset:
“Bodyfat”.
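
The one-sided proportion tests in this part can be approached either with the normal approximation, as in (a) and (b), or with the exact binomial distribution, as in (e). The following is a minimal sketch only, assuming a null hypothesis of the form "p is at least p0" tested against the lower tail. The function names and the counts (24 home wins in 50 matches) are hypothetical and are not taken from any of the named datasets. A two-sided claim such as the one in (d) would need a two-tailed p-value instead.

```python
from math import comb, sqrt
from statistics import NormalDist


def z_test_lower(successes, n, p0):
    """One-sided z-test of H0: p >= p0 against H1: p < p0 (normal approximation)."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)       # standard error computed under H0
    z = (p_hat - p0) / se
    return z, NormalDist().cdf(z)      # test statistic and lower-tail p-value


def exact_binomial_lower(successes, n, p0):
    """Exact lower-tail p-value P(X <= successes) for X ~ Binomial(n, p0)."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(successes + 1))


# Hypothetical counts, NOT taken from any of the named datasets.
n, home_wins = 50, 24
z, p_normal = z_test_lower(home_wins, n, 0.60)
p_exact = exact_binomial_lower(home_wins, n, 0.60)
print(f"z = {z:.3f}, normal p-value = {p_normal:.4f}, exact p-value = {p_exact:.4f}")
```

In either version, the null hypothesis is rejected when the p-value falls below the stated level of significance.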

Part II

(a) In a region of Cork there are 1.42 traffic accidents per day on average. Suppose
that traffic accidents follow a Poisson distribution. Find the probabilities for the five
outcomes indicated in the table above.

(b) Discuss reasons why data may not follow the Poisson distribution. Include at least
two examples in your answer.

(c) Explain what is meant by the “variance” of a probability distribution and then
explain the formula for the variance of the number of successes in a binomial
distribution.
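
The Poisson calculations in Part I (c) and Part II (a) can be sketched as follows. This is an illustration under stated assumptions rather than a model answer: the outcome table for Part II (a) is not reproduced here, so the outcomes 0 to 4 are assumed, and `poisson_pmf` is a hypothetical helper name. For Part I (c), the sketch shows the exact tail probability next to the normal approximation with a continuity correction.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


# Part II (a): the outcomes k = 0..4 are an ASSUMPTION (the table is not shown).
accident_probs = {k: poisson_pmf(k, 1.42) for k in range(5)}
for k, p in accident_probs.items():
    print(f"P(X = {k}) = {p:.4f}")

# Part I (c): P(X >= 30) when the mean is 23.7 attempts per day.
lam = 23.7
exact_tail = 1 - sum(poisson_pmf(k, lam) for k in range(30))
# Normal approximation N(lam, lam), with a continuity correction at 29.5.
approx_tail = 1 - NormalDist(mu=lam, sigma=sqrt(lam)).cdf(29.5)
print(f"exact = {exact_tail:.4f}, normal approximation = {approx_tail:.4f}")
```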

Part III

(a) Explain fully the meaning of a 95% confidence interval.

(b) Using the “ChildSpeaks” sample dataset, find the mean age (in months) at which a
child first speaks. Then construct a 90% confidence interval around that mean.

(c) Explain briefly why it may be better to use the Student’s t distribution rather than
the normal distribution when carrying out a hypothesis test. (Max. 150 words)
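
The interpretation asked for in (a) can be illustrated by simulation: if many samples are drawn from the same population and a 95% interval is built from each, roughly 95% of those intervals should contain the true mean. The sketch below is a simplification, not the method required in (b). It assumes a normal population with a known standard deviation, so it uses a normal-based interval rather than a t-based one, and the population values and sample size are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(0)                                # fixed seed so the run is repeatable
MU, SIGMA, N, TRIALS = 12.0, 3.0, 25, 2000    # hypothetical population and sample size
z = NormalDist().inv_cdf(0.975)               # about 1.96 for a 95% interval

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    centre = mean(sample)
    half_width = z * SIGMA / sqrt(N)          # sigma is treated as known here
    if centre - half_width <= MU <= centre + half_width:
        covered += 1

coverage = covered / TRIALS
print(f"{coverage:.3f} of the intervals contain the true mean")
```

The 95% describes how often the procedure captures the true mean across repeated samples, not the probability that any single computed interval contains it.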

Part IV

(a) Test the hypothesis that men and women are equally likely to be politically liberal.
Use a level of significance of 5%.

Dataset: “Student survey”

(b) We are interested in whether a training programme for Little League baseball
players improves their ability to throw a strike (causing the batter to swing and miss

