
School of Computing Sciences

Module: CMP-5017B/5019B/7008B Applied Statistics

Assignment: Coursework 1

Set by : Dr Aristidis K. Nikoloulopoulos e-mail: A.Nikoloulopoulos@uea.ac.uk

Date set : 22 January 2020

Value : 25%

Date due : 6 February 2020 by 15:00

Returned by : 13 February 2020

Submission : Hard copy to Hub

Learning outcomes

Become familiar with the R interface and language.

Specification

Overview

To improve understanding of the material by working in teams (at most 3 students per team) on problems based on the material introduced in class.

Description

For each of the following problems: (a) implement the programs and test them; (b) add comments, using the symbol #, that explain the steps required to obtain the correct solution and the important principles used; (c) provide the output of your programs (commented out with the symbol #); (d) do not use loops (e.g., for, while), the if, else, or ifelse statements, or the functions which, which.max, which.min, cumsum, or match.

1. Simulate a random vector a with 100 elements from the uniform distribution on the interval (−15, 15). [marks 2]

For this vector write a program:

(a) To calculate the sum of the positive elements in vector a. [marks 4]

(b) To calculate the sum of the elements of a until the first appearance of a positive element in a. [marks 8]

2. (a) Write a program for simulating $n = 1000$ observations from the Student-t distribution with 7 degrees of freedom. Denote these observations by $x_i$, $i = 1, \ldots, n$. [marks 2]

(b) Then write a program for calculating the truncated mean,

$$y_{tr} = \frac{\sum_{i=1}^{k} y_i}{k},$$

where $y_i$ are the observations that remain after excluding those below the 5% and above the 95% percentiles of the entire random sample $x_i$ from the Student-t distribution. [marks 12]

(c) Finally, write a program for calculating the following statistic,

$$d = \sum_{i=1}^{m} c_{(i)} x_{(i)}.$$

To derive $x_{(i)}$ use only the negative observations from $x_i$, $i = 1, \ldots, n$, and then order them, i.e.,

$$x_{(1)} < x_{(2)} < \ldots < x_{(m-1)} < x_{(m)}.$$

Note that the number of negative observations is $m \le n$. To calculate the constants $c_{(i)}$, $i = 1, \ldots, m$, use the following formula:

$$c_{(i)} = \left(\frac{m - (i)}{m}\right)^{1/2} - \left(\frac{m - (i) + 1}{m}\right)^{1/2} - \frac{1}{m},$$

where $(i)$ is the rank of the negative $x_i$. [marks 14]

3. Suppose that $A$ is a matrix with dimension $p \times p$. A number $q < p$ can be used to partition the matrix $A$ as follows:

$$A = \begin{pmatrix} A_1 & A_2 \\ A_3 & A_4 \end{pmatrix},$$

where $A_1$ is the upper-left submatrix of $A$ with dimension $q \times q$, $A_2$ is the upper-right submatrix of $A$ with dimension $q \times (p - q)$, $A_3$ is the lower-left submatrix of $A$ with dimension $(p - q) \times q$, and $A_4$ is the lower-right submatrix of $A$ with dimension $(p - q) \times (p - q)$.

Write a program to calculate the matrix

$$B = \begin{cases} A_1 - A_2 A_4^{-1} A_3, & \text{if } A \text{ is a symmetric and positive definite matrix} \\ 0_q, & \text{else,} \end{cases}$$

where $0_q$ is a $q \times q$ matrix with all “zero” elements. The program should have as input a matrix $A$ and a number $q$.

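An example of such a partition is given below:

$$A = \begin{pmatrix} 1 & 0 & 0.5 & -0.3 & 0.2 \\ 0 & 1 & 0.1 & 0 & 0 \\ 0.5 & 0.1 & 1 & 0.3 & 0.7 \\ -0.3 & 0 & 0.3 & 1 & 0.4 \\ 0.2 & 0 & 0.7 & 0.4 & 1 \end{pmatrix},$$

$$A_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 0.5 & -0.3 & 0.2 \\ 0.1 & 0 & 0 \end{pmatrix},$$

$$A_3 = \begin{pmatrix} 0.5 & 0.1 \\ -0.3 & 0 \\ 0.2 & 0 \end{pmatrix}, \quad A_4 = \begin{pmatrix} 1 & 0.3 & 0.7 \\ 0.3 & 1 & 0.4 \\ 0.7 & 0.4 & 1 \end{pmatrix}.$$

[marks 14]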
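The example partition above can be reproduced with plain index vectors. The following is only a minimal sketch of the block indexing, not a solution for B: the names top, bottom, is_sym and eig are illustrative assumptions, and the eigenvalue test is one common way to check positive definiteness, not necessarily the one expected here.

```r
# Example matrix from the specification, entered row by row.
A <- matrix(c( 1.0, 0.0, 0.5, -0.3, 0.2,
               0.0, 1.0, 0.1,  0.0, 0.0,
               0.5, 0.1, 1.0,  0.3, 0.7,
              -0.3, 0.0, 0.3,  1.0, 0.4,
               0.2, 0.0, 0.7,  0.4, 1.0),
            nrow = 5, byrow = TRUE)
q <- 2
p <- nrow(A)

# Hypothetical helper index vectors (valid because q < p).
top    <- seq_len(q)      # rows/columns 1..q
bottom <- (q + 1):p       # rows/columns (q+1)..p

# drop = FALSE keeps each block a matrix even when it has one row or column.
A1 <- A[top,    top,    drop = FALSE]   # q x q
A2 <- A[top,    bottom, drop = FALSE]   # q x (p - q)
A3 <- A[bottom, top,    drop = FALSE]   # (p - q) x q
A4 <- A[bottom, bottom, drop = FALSE]   # (p - q) x (p - q)

# Ingredients for the symmetry / positive-definiteness condition (assumed approach):
is_sym <- isSymmetric(A)
eig    <- eigen(A, symmetric = TRUE, only.values = TRUE)$values
```

A symmetric matrix is positive definite exactly when all of its eigenvalues are strictly positive, so the values in eig carry the information the condition needs.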

4. It is known that the expected value of a continuous random variable with density function $f(x)$ is given by

$$E(X) = \int x f(x)\,dx.$$

In practice, to estimate the expected value we use the sample mean $\bar{x} = \sum_{i=1}^{n} x_i / n$, where $x_i$, $i = 1, \ldots, n$, are observations from the distribution with density $f(x)$.

Using this idea we can calculate numerically any integral of the form $\int \phi(x)\,dx$, if we write this integral as an expected value, i.e.,

$$\int \phi(x)\,dx = \int \frac{\phi(x)}{h(x)}\, h(x)\,dx = E\left(\frac{\phi(x)}{h(x)}\right),$$

and use the following algorithm:

(a) Simulate a random vector $x = (x_1, \ldots, x_n)$ with a large number of elements (say $n = 1000$) from $h(x)$.

(b) Set $y = \phi(x) / h(x)$.

(c) The value of the integral can be estimated by $\bar{y} = \sum_{i=1}^{n} y_i / n$.

Write such a program for calculating $\int_0^1 x(x^5 - 1)\,dx$ where $h(x)$ is the uniform density on the unit interval. [marks 14]
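Steps (a) to (c) of the algorithm can be sketched on a different integrand. The example below is an illustration under stated assumptions, not the required program: the integrand phi(x) = x^2 (exact integral 1/3 on the unit interval), the seed, and the larger sample size are all choices made here for demonstration.

```r
# Illustrative Monte Carlo integration of phi(x) = x^2 over (0, 1).
# This is NOT the assignment integrand; its exact value is 1/3.
set.seed(1)                  # arbitrary seed, for reproducibility only
n   <- 100000                # larger than n = 1000 to reduce simulation noise
phi <- function(x) x^2       # hypothetical integrand for illustration

x <- runif(n)                # step (a): simulate from h, the Uniform(0, 1) density
y <- phi(x) / dunif(x)       # step (b): y = phi(x) / h(x); here h(x) = 1 on (0, 1)
estimate <- mean(y)          # step (c): sample mean of y estimates the integral
estimate
```

Because the code is vectorised, it needs no loops; the estimate fluctuates around 1/3 with a standard error that shrinks like 1/sqrt(n).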

5. (a) Plot, in the same figure, the functions $f(x) = \sin(x) + \pi/4$, $-2\pi \le x \le 2\pi$, and

$$g(x) = \begin{cases} \sin(x), & 0 \le x \le \pi \text{ or } -2\pi \le x \le -\pi \\ -\pi/4, & \text{elsewhere.} \end{cases}$$

Legends should be used. [marks 12]

(b) Plot, in the same figure, the densities of $N(\mu = 1, \sigma^2 = 1)$ and $N(\mu = -3.5, \sigma^2 = 3/4)$. Legends should be used. [marks 6]

(c) Create a figure comparing the probability mass function of the binomial distribution with 30 trials and probability of success $p = 0.3$ with its approximation by the normal distribution. [marks 12]
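For part (c), the parameters of the approximating normal distribution are not stated. Assuming the usual moment-matching approximation is intended (an assumption, since the specification does not say so), they follow from the binomial mean and variance:

$$X \sim \mathrm{Bin}(n = 30,\ p = 0.3): \qquad \mu = np = 30 \times 0.3 = 9, \qquad \sigma^2 = np(1 - p) = 30 \times 0.3 \times 0.7 = 6.3.$$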

Relationship to formative assessment

We hand out exercises for you to do, the aim being to give you an opportunity to test your understanding and exercise your skills; the assignment’s questions are similar.

Deliverables

One member of the team, on behalf of the team, should submit a piece of coursework in the following way:

1. Print an assignment cover sheet from your portal.

2. Provide the student number of every member of the team.

3. Attach your written work to the cover sheet.

(a) An *.R file with the programs and their testing evaluation (commented). Name this file using your student numbers, e.g. 3902269-3902270-3902271.R. The structure of the *.R file should be similar to the solutions of the lab sessions.

#1.

# Simulate from uniform …

x<-…

…

# (a)

…

# (b)

…

#2.

…

…

#9.

…

(b) The printed figures from problem 5.

4. Upload the *.R file to Blackboard under Assignment 1.

5. If you fail to complete any of the above steps, your coursework won’t be assessed.

Resources

1. Blackboard notes.

2. If you have any questions, drop by my office S.211A (Mondays from 10.00 a.m. to 12.00 p.m.) or e-mail me at a.nikoloulopoulos@uea.ac.uk.

3. If you get stuck with an R error, you should first test your code systematically to isolate the fragment causing the problem; if you still can’t see it, then ask. Stack Overflow is a valuable resource, but you may not post questions there (or on any other similar forum) asking for help with this assignment. If you use code copied from any source, you must acknowledge the source (e.g. by including a comment at the top of the reused code, with the URL and author if appropriate). Failure to acknowledge code written by others is plagiarism, which is not allowed (General Regulation 18).

Marking scheme

• Accuracy of answers.

• Understanding of material displayed.

• Clarity of explanations of working.

• Quality of reporting.

• As always credit is given for a persuasive argument.
