This assignment is an applied statistics coursework to be completed in R.

Module: CMP-5017B/5019B/7008B Applied Statistics
Assignment: Coursework 1
Learning outcomes
Become familiar with the R interface and language.
Specification
Overview
To improve understanding of the material by working in teams (at most 3 students per team) on problems based on the material introduced in class.
Description
In each of the following problems: (a) implement the programs and test them; (b) give comments, with the symbol #, that demonstrate the steps required to obtain the correct solution and the important principles used; (c) provide the output of your programs (commented out with the symbol #); (d) don’t use loops (e.g., for, while), the if, else, and ifelse statements, or the functions which, which.max, which.min, cumsum, and match:
1. Simulate a random vector a with 100 elements from the uniform distribution on the interval (−15, 15). [marks 2]
For this vector write a program:
(a) To calculate the sum of the positive elements in vector a. [marks 4]
(b) To calculate the sum of the elements of a until the first appearance of a positive element in a. [marks 8]
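The logical-indexing idea behind problem 1 can be sketched as follows. This is only an illustration under stated assumptions: the seed and the names `sum_pos`, `seen_pos` and `sum_before` are hypothetical, and part (b) is read as summing the elements that come before the first positive one (excluding it), which the wording does not settle.

```r
set.seed(1)                            # illustrative seed, for reproducibility only
a <- runif(100, min = -15, max = 15)   # 100 draws from U(-15, 15)

# (a) logical indexing keeps only the positive elements
sum_pos <- sum(a[a > 0])

# (b) cummax() turns the 0/1 indicator into 0,...,0,1,...,1,
#     switching to 1 at the first positive element (assumed reading: exclude it)
seen_pos <- cummax(as.integer(a > 0))
sum_before <- sum(a[seen_pos == 0])    # sum of the leading non-positive run
```

If the first element is already positive, the selected run is empty and `sum_before` is 0.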
2. (a) Write a program for simulating n = 1000 observations from the Student-t distribution with 7 degrees of freedom. Denote these observations by $x_i$, $i = 1, \ldots, n$. [marks 2]
(b) Then write a program for calculating the truncated mean, $y_{tr} = \frac{\sum_{i=1}^{k} y_i}{k}$, where $y_i$ are the observations excluding those below the 5% and above the 95% quantile of the entire random sample $x_i$ from the Student-t distribution. [marks 12]

(c) Finally, write a program for calculating the following statistic, $d = \sum_{i=1}^{m} c_{(i)} x_{(i)}$.
To derive $x_{(i)}$, use only the negative observations from $x_i$, $i = 1, \ldots, n$, and then order them, i.e., $x_{(1)} < x_{(2)} < \ldots < x_{(m-1)} < x_{(m)}$.
Note that the number of negative observations is $m \le n$. To calculate the constants $c_{(i)}$, $i = 1, \ldots, m$, use the following formula: $c_{(i)} = \left(\frac{m - (i)}{m}\right)^{1/2} - \left(\frac{m - (i) + 1}{m}\right)^{1/2} - \frac{1}{m}$, where $(i)$ is the rank of the negative $x_i$. [marks 14]
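The steps of problem 2 can be sketched as below. Several points are assumptions rather than part of the specification: the seed, the variable names, the use of `quantile()` with its default settings to define the 5% and 95% cut-offs, and the reading of the constants as c(i) = sqrt((m − (i))/m) − sqrt((m − (i) + 1)/m) − 1/m, with (i) running from 1 to m over the ordered negative observations.

```r
set.seed(2)                        # illustrative seed
n <- 1000
x <- rt(n, df = 7)                 # (a) n draws from Student-t with 7 df

# (b) keep the observations between the 5% and 95% sample quantiles
q <- unname(quantile(x, probs = c(0.05, 0.95)))
y <- x[x >= q[1] & x <= q[2]]
y_tr <- sum(y) / length(y)         # truncated mean over the k retained values

# (c) order the negative observations and apply the weights
x_neg <- sort(x[x < 0])            # x_(1) < ... < x_(m)
m <- length(x_neg)
r <- seq_len(m)                    # ranks (i) = 1, ..., m of the ordered negatives
c_i <- sqrt((m - r) / m) - sqrt((m - r + 1) / m) - 1 / m   # assumed reading
d <- sum(c_i * x_neg)
```

Under this reading every weight is negative, so `d` is positive whenever at least one observation is negative.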
3. Suppose that A is a matrix with dimension p × p. A number q < p can be used to take a partition of the matrix A as follows:
$$A = \begin{pmatrix} A_1 & A_2 \\ A_3 & A_4 \end{pmatrix},$$
where A1 is the upper-left submatrix of A with dimension q × q, A2 is the upper-right submatrix of A with dimension q × (p − q), A3 is the lower-left submatrix of A with dimension (p − q) × q, and A4 is the lower-right submatrix of A with dimension (p − q) × (p − q).
Write a program to calculate the matrix
$$B = \begin{cases} A_1 - A_2 A_4^{-1} A_3, & \text{if } A \text{ is a symmetric and positive definite matrix}, \\ 0_q, & \text{else}, \end{cases}$$
where $0_q$ is a q × q matrix with all “zero” elements. The program should have as input a matrix A and a number q.
An example of such a partition, with p = 5 and q = 2, is given below:
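$$A = \begin{pmatrix} 1 & 0 & 0.5 & -0.3 & 0.2 \\ 0 & 1 & 0.1 & 0 & 0 \\ 0.5 & 0.1 & 1 & 0.3 & 0.7 \\ -0.3 & 0 & 0.3 & 1 & 0.4 \\ 0.2 & 0 & 0.7 & 0.4 & 1 \end{pmatrix},$$
$$A_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 0.5 & -0.3 & 0.2 \\ 0.1 & 0 & 0 \end{pmatrix},$$
$$A_3 = \begin{pmatrix} 0.5 & 0.1 \\ -0.3 & 0 \\ 0.2 & 0 \end{pmatrix}, \quad A_4 = \begin{pmatrix} 1 & 0.3 & 0.7 \\ 0.3 & 1 & 0.4 \\ 0.7 & 0.4 & 1 \end{pmatrix}.$$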
[marks 14]
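One way the partition and the two-case definition of B might be coded without if/else is sketched below. The function name `schur_block`, the eigenvalue-based positive-definiteness check, and the trick of selecting between two small functions by index are all assumptions made for illustration, not the method prescribed by the module.

```r
# Hypothetical helper: A1 - A2 A4^{-1} A3 when A is symmetric positive definite,
# otherwise a q x q zero matrix.
schur_block <- function(A, q) {
  p <- nrow(A)
  top <- seq_len(q)                # rows/columns belonging to A1
  bot <- seq_len(p - q) + q        # rows/columns belonging to A4
  # Each outcome is wrapped in a function so that only the selected one is evaluated.
  zero <- function() matrix(0, q, q)
  schur <- function() {
    A4_inv <- solve(A[bot, bot, drop = FALSE])
    A[top, top, drop = FALSE] -
      A[top, bot, drop = FALSE] %*% A4_inv %*% A[bot, top, drop = FALSE]
  }
  # Symmetric with all eigenvalues positive means symmetric positive definite;
  # && short-circuits, so eigen() is only called on symmetric input.
  is_spd <- isSymmetric(A) &&
    all(eigen(A, symmetric = TRUE, only.values = TRUE)$values > 0)
  list(zero, schur)[[is_spd + 1]]()
}

# The example matrix from the problem statement, with q = 2
A <- matrix(c( 1.0, 0.0, 0.5, -0.3, 0.2,
               0.0, 1.0, 0.1,  0.0, 0.0,
               0.5, 0.1, 1.0,  0.3, 0.7,
              -0.3, 0.0, 0.3,  1.0, 0.4,
               0.2, 0.0, 0.7,  0.4, 1.0), nrow = 5, byrow = TRUE)
B <- schur_block(A, 2)
```

Indexing a list of functions keeps the inverse of A4 from being computed when A fails the check, which matters because A4 may then be singular.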

4. It is known that the expected value of a continuous random variable with density function $f(x)$ is given by $E(X) = \int x f(x)\,dx$.
In practice, to estimate the expected value we use the sample mean $\bar{x} = \sum_{i=1}^{n} x_i / n$, where $x_i$, $i = 1, \ldots, n$, are observations from the distribution with density $f(x)$.
Using this idea we can calculate numerically any integral of the form $\int \phi(x)\,dx$, if we write this integral as an expected value, i.e., $\int \phi(x)\,dx = \int \frac{\phi(x)}{h(x)} h(x)\,dx = E\left(\frac{\phi(x)}{h(x)}\right)$, and use the following algorithm:
(a) Simulate a random vector $x = (x_1, \ldots, x_n)$ with a large number of elements (say n = 1000) from $h(x)$.
(b) Set $y = \frac{\phi(x)}{h(x)}$.
(c) The integration value can be given by $\bar{y} = \sum_{i=1}^{n} y_i / n$.
Write such a program for calculating $\int_0^1 x(x^5 - 1)\,dx$ where $h(x)$ is the uniform density in the unit interval. [marks 14]
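The three-step algorithm above can be sketched in a few lines. The seed and the names `phi`, `estimate` and `exact` are illustrative, and the integrand is taken to be x(x^5 − 1) on the unit interval, which is an assumed reading of the problem statement.

```r
set.seed(4)                        # illustrative seed
n <- 1000
phi <- function(x) x * (x^5 - 1)   # integrand (assumed reading)

x <- runif(n)                      # (a) draws from h, the U(0, 1) density
y <- phi(x) / dunif(x)             # (b) h(x) = 1 on the unit interval
estimate <- sum(y) / n             # (c) Monte Carlo estimate of the integral

exact <- 1 / 7 - 1 / 2             # integral of x^6 - x over [0, 1], for comparison
```

With n = 1000 the estimate should land close to the closed-form value of −5/14, and the gap shrinks roughly like 1/sqrt(n) as n grows.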
5. (a) Plot, in the same figure, the functions $f(x) = \sin(x) + \pi/4$, $-2\pi \le x \le 2\pi$, and
$$g(x) = \begin{cases} \sin(x), & 0 \le x \le \pi \text{ or } -2\pi \le x \le -\pi, \\ -\pi/4, & \text{elsewhere}. \end{cases}$$
Legends should be used. [marks 12]
(b) Plot, in the same figure, the densities of $N(\mu = 1, \sigma^2 = 1)$ and $N(\mu = -3.5, \sigma^2 = 3/4)$. Legends should be used. [marks 6]
(c) Create a figure for comparing the probability mass function of the binomial distribution with 30 trials and probability of success p = 0.3 with its approximation by the normal distribution. [marks 12]
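Parts (a) and (c) might be approached along the following lines. This is a sketch only: the grid size, colours, line types, legend positions and variable names are arbitrary choices, and the normal approximation is assumed to be the one with mean np and variance np(1 − p).

```r
# (a) piecewise g built from a logical indicator instead of ifelse()
x <- seq(-2 * pi, 2 * pi, length.out = 401)
f <- sin(x) + pi / 4
in_sin <- (x >= 0 & x <= pi) | (x >= -2 * pi & x <= -pi)
g <- in_sin * sin(x) + (!in_sin) * (-pi / 4)

plot(x, f, type = "l", col = "blue", ylim = range(f, g), ylab = "y")
lines(x, g, col = "red", lty = 2)
legend("topright", legend = c("f(x)", "g(x)"),
       col = c("blue", "red"), lty = c(1, 2))

# (c) Binomial(30, 0.3) pmf against the N(np, np(1 - p)) density
size <- 30
p <- 0.3
k <- 0:size
pmf <- dbinom(k, size, p)
mu <- size * p
sigma <- sqrt(size * p * (1 - p))

plot(k, pmf, type = "h", xlab = "k", ylab = "probability")
curve(dnorm(x, mean = mu, sd = sigma), add = TRUE, col = "red")
legend("topright", legend = c("Binomial pmf", "Normal approximation"),
       col = c("black", "red"), lty = 1)
```

Part (b) follows the same pattern with two calls to `dnorm()`, remembering that `dnorm()` takes the standard deviation, so a variance of 3/4 corresponds to `sd = sqrt(3 / 4)`.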
Relationship to formative assessment
We hand out exercises for you to do, the aim being to give you an opportunity to test your understanding and exercise your skills; the assignment’s questions are similar.
Deliverables
One member of the team, on behalf of the team, should submit a piece of coursework in the following way:
1. Print an assignment cover sheet from your portal.
2. Provide the student number of every member of the team.
3. Attach your written work to the cover sheet.
(a) An *.R file with the programs and their testing evaluation (commented). Name this file using your student numbers, e.g. 3902269-3902270-3902271.R. The structure of the *.R file should be similar to the solutions of the lab sessions.
#1.
# Simulate from uniform …
x<-…

# (a)

# (b)

#2.


#9.

(b) The printed figures from problem 5.
4. Upload the *.R file to Blackboard under Assignment 1.
5. If you fail to complete any of the above steps, your coursework won’t be assessed.
Resources
1. Blackboard notes.
2. If you have any questions, drop by my office S.211A (Mondays from 10.00 a.m. to 12.00 p.m.) or e-mail me at a.nikoloulopoulos@uea.ac.uk.
3. If you get stuck with an R error, you should first test your code systematically to isolate the fragment causing the problem; if you still can’t see it, then ask. Stack Overflow is a valuable resource, but you may not post questions there (or on any other similar forum) asking for help with this assignment. If you use code copied from any source, you must acknowledge the source (e.g. by including a comment at the top of the reused code, with the URL and author if appropriate). Failure to acknowledge code written by others is plagiarism, which is not allowed (General Regulation 18).
Marking scheme
• Accuracy of answers.
• Understanding of material displayed.
• Clarity of explanations of working.
• Quality of reporting.
• As always, credit is given for a persuasive argument.