School of Computing Sciences
Module: CMP-5017B/5019B/7008B Applied Statistics
Assignment: Coursework 1
Set by : Dr Aristidis K. Nikoloulopoulos e-mail: A.Nikoloulopoulos@uea.ac.uk
Date set : 22 January 2020
Value : 25%
Date due : 6 February 2020 by 15:00
Returned by : 13 February 2020
Submission : Hard copy to Hub
Learning outcomes
Become familiar with the R interface and language.
Specification
Overview
To improve understanding of the material by working in teams (at most 3 students per team)
on problems based on the material introduced in class.
Description
For each of the following problems: (a) implement the programs and test them; (b) add comments,
with the symbol #, that explain the steps required to obtain the correct solution and the
important principles used; (c) provide the output of your programs (commented out with
the symbol #); (d) do not use loops (e.g., for, while), the if, else, or ifelse statements, or the
functions which, which.max, which.min, cumsum, or match.
1. Simulate a random vector a with 100 elements from the uniform distribution on the interval
(−15, 15). [marks 2]
For this vector, write a program:
(a) To calculate the sum of the positive elements of a. [marks 4]
(b) To calculate the sum of the elements of a up to the first appearance of a
positive element in a. [marks 8]
2. (a) Write a program for simulating n = 1000 observations from the Student-t distribution
with 7 degrees of freedom. Denote these observations by $x_i$, $i = 1, \ldots, n$. [marks 2]
(b) Then write a program for calculating the truncated mean,
$$y_{tr} = \frac{\sum_{i=1}^{k} y_i}{k},$$
where $y_i$ are the observations that remain after excluding those below the 5% and above
the 95% quantiles of the entire random sample $x_i$ from the Student-t distribution. [marks 12]
(c) Finally, write a program for calculating the following statistic,
$$d = \sum_{i=1}^{m} c_{(i)} x_{(i)}.$$
To derive $x_{(i)}$, use only the negative observations among $x_i$, $i = 1, \ldots, n$, and then order
them, i.e.,
$$x_{(1)} < x_{(2)} < \ldots < x_{(m-1)} < x_{(m)}.$$
Note that the number of negative observations is m ≤ n. To calculate the
constants $c_{(i)}$, $i = 1, \ldots, m$, use the following formula:
$$c_{(i)} = \left(\frac{m - (i)}{m}\right)^{1/2} \left(\frac{m - (i) + 1}{m}\right)^{1/2} \frac{1}{m},$$
where (i) is the rank of the negative $x_i$. [marks 14]
3. Suppose that A is a matrix with dimension p × p. A number q < p can be used to
partition the matrix A as follows:
$$A = \begin{pmatrix} A_1 & A_2 \\ A_3 & A_4 \end{pmatrix},$$
where $A_1$ is the upper-left submatrix of A with dimension q × q, $A_2$ is the upper-right
submatrix of A with dimension q × (p − q), $A_3$ is the lower-left submatrix of A with dimension
(p − q) × q, and $A_4$ is the lower-right submatrix of A with dimension (p − q) × (p − q).
Write a program to calculate the matrix
$$B = \begin{cases} A_1 - A_2 A_4^{-1} A_3, & \text{if } A \text{ is a symmetric and positive definite matrix,} \\ 0_q, & \text{else,} \end{cases}$$
where $0_q$ is a q × q matrix with all “zero” elements. The program should have as input a
matrix A and a number q.
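An example of such a partition is given below:
$$A = \begin{pmatrix} 1 & 0 & 0.5 & -0.3 & 0.2 \\ 0 & 1 & 0.1 & 0 & 0 \\ 0.5 & 0.1 & 1 & 0.3 & 0.7 \\ -0.3 & 0 & 0.3 & 1 & 0.4 \\ 0.2 & 0 & 0.7 & 0.4 & 1 \end{pmatrix},$$
$$A_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 0.5 & -0.3 & 0.2 \\ 0.1 & 0 & 0 \end{pmatrix},$$
$$A_3 = \begin{pmatrix} 0.5 & 0.1 \\ -0.3 & 0 \\ 0.2 & 0 \end{pmatrix}, \quad A_4 = \begin{pmatrix} 1 & 0.3 & 0.7 \\ 0.3 & 1 & 0.4 \\ 0.7 & 0.4 & 1 \end{pmatrix}.$$
[marks 14]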
4. It is known that the expected value of a continuous random variable with density function
f(x) is given by
$$E(X) = \int x f(x)\, dx.$$
In practice, to estimate the expected value we use the sample mean $\bar{x} = \sum_{i=1}^{n} x_i / n$,
where $x_i$, $i = 1, \ldots, n$, are observations from the distribution with density f(x).
Using this idea we can numerically calculate any integral of the form $\int \phi(x)\, dx$ if we write
this integral as an expected value, i.e.,
$$\int \phi(x)\, dx = \int \frac{\phi(x)}{h(x)} h(x)\, dx = E\left[\frac{\phi(x)}{h(x)}\right],$$
and use the following algorithm:
(a) Simulate a random vector $x = (x_1, \ldots, x_n)$ with a large number of elements
(say n = 1000) from h(x).
(b) Set $y = \phi(x) / h(x)$.
(c) The value of the integral is then estimated by $\bar{y} = \sum_{i=1}^{n} y_i / n$.
Write such a program for calculating $\int_0^1 x(x^5 - 1)\, dx$, where h(x) is the uniform density on
the unit interval. [marks 14]
5. (a) Plot, in the same figure, the functions f(x) = sin(x) + π/4, −2π ≤ x ≤ 2π, and
$$g(x) = \begin{cases} \sin(x), & 0 \le x \le \pi \text{ or } -2\pi \le x \le -\pi, \\ -\pi/4, & \text{elsewhere.} \end{cases}$$
Legends should be used. [marks 12]
(b) Plot, in the same figure, the densities of $N(\mu = 1, \sigma^2 = 1)$ and
$N(\mu = -3.5, \sigma^2 = 3/4)$. Legends should be used. [marks 6]
(c) Create a figure comparing the probability mass function of the binomial distribution
with 30 trials and probability of success p = 0.3 with its approximation by the normal
distribution. [marks 12]
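As an illustration of the three-step algorithm described in Problem 4, the following is a minimal R sketch applied to a different, hypothetical integrand, φ(x) = x² on (0, 1), whose exact integral is 1/3. The names phi, n and estimate, the fixed seed, and the choice of integrand are assumptions made for this illustration only; they are not part of the assignment. The sketch is vectorised and avoids loops and conditional statements.

```r
# Monte Carlo integration sketch (illustrative integrand, not the one set in Problem 4).
set.seed(1)                     # assumption: fixed seed so the run is reproducible
n <- 1000                       # number of simulated points
phi <- function(x) x^2          # hypothetical integrand; exact integral on (0, 1) is 1/3

x <- runif(n, 0, 1)             # step (a): simulate from h, the uniform density on (0, 1)
y <- phi(x) / dunif(x, 0, 1)    # step (b): y = phi(x) / h(x); here h(x) = 1 on (0, 1)
estimate <- mean(y)             # step (c): the sample mean of y estimates the integral

estimate                        # compare with the exact value 1/3
```

Because h(x) = 1 on the unit interval, dividing by dunif(x, 0, 1) changes nothing numerically; it is kept so that the code mirrors step (b) and still works if h is replaced by another density.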
Relationship to formative assessment
We hand out exercises for you to do, the aim being to give you an opportunity to test your understanding
and exercise your skills; the assignment’s questions are similar.
Deliverables
One member of the team, on behalf of the team, should submit a piece of coursework in the
following way:
1. Print an assignment cover sheet from your portal.
2. Provide the student number of every member of the team.
3. Attach your written work to the cover sheet.
(a) An *.R file with the programs and their testing evaluation (commented). Name this
file using your student numbers, e.g. 3902269-3902270-3902271.R. The structure
of the *.R file should be similar to the solutions of the lab sessions.
#1.
# Simulate from uniform …
x<-…

# (a)

# (b)

#2.

#9.

(b) The printed figures from problem 5.
4. Upload the *.R file to Blackboard under Assignment 1.
5. If you fail to complete any of the above steps, your coursework will not be assessed.
Resources
1. Blackboard notes.
2. If you have any questions, drop by my office S.211A (Mondays from 10.00 a.m. to 12.00
p.m.) or e-mail me at a.nikoloulopoulos@uea.ac.uk.
3. If you get stuck with an R error, you should first test your code systematically to isolate
the fragment causing the problem; if you still can’t see it, then ask. Stack Overflow is a
valuable resource, but you may not post questions there (or on any other similar forum)
asking for help with this assignment. If you use code copied from any source you must
acknowledge the source (e.g. by including a comment at the top of the reused code,
with the URL and author if appropriate). Failure to acknowledge code written by others is
plagiarism, which is not allowed (General Regulation 18).
Marking scheme
• Understanding of material displayed.
• Clarity of explanations of working.
• Quality of reporting.
• As always, credit is given for a persuasive argument.