This is a mathematics assignment-writing sample from the United States covering the Assessment Task 2 questions.

 

Q1. DATA SET WBC (Marks = 60 marks)

In an imaging experiment, 50 white blood cells (WBCs) from a non-diseased patient group and 50 white blood cells from a diseased patient group were analysed.

On each WBC, the following characteristics of the WBC image were measured:

  • eccen Cell eccentricity
  • arn Cell area
  • perin Perimeter of the cell
  • soln Solidity of the cell

  • ext Extent of the cell
  • diam Diameter of the cell

Part A:

The aim of the study on WBCs is to test whether the two groups have identical population mean vectors.

NOTE: Show your SAS code and relevant formulae, outputs, and interpretation.

a) Find the group specific mean vectors, and the variance-covariance and the correlation matrices. (5 marks)

b) State your null and alternative hypotheses, show the mathematical formulae for the appropriate test statistic and the formulation for finding the critical value and associated p values. (5 marks)

c) Carry out appropriate multivariate procedures to determine whether the WBC cells differ across the diseased and non-diseased populations. Show your SAS code and relevant formulae, outputs, and interpretation. (15 marks)

d) What underlying assumptions are involved in this test procedure above? (5 marks)

e) Now test which of the WBC characteristics differ significantly between the diseased cells compared to the non-diseased. Use the pooled variance of the relevant variables. (5 marks)

f) Obtain the 95% simultaneous confidence intervals of the differences, state your alpha value. Create a table of the resultant confidence intervals. (5 marks)

g) Obtain the analogous Bonferroni 95% Confidence Intervals of the differences. Create a table of the resultant confidence intervals. (5 marks)

h) On which, if any, of the WBC measures do the diseased and non-diseased cells differ significantly, based on your answers in f) and g)? (5 marks)
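The formulae requested in parts b), f) and g) can be sketched as follows. This is a sketch under stated assumptions, not a model answer: it assumes the standard two-sample Hotelling's T-squared setting with a common population covariance matrix, and the symbols (n_1, n_2, p, S_p, s_{ii,p}) are notation introduced here rather than taken from the question. With n_1 = n_2 = 50 cells and p = 6 measured characteristics, the F reference distribution would have 6 and 93 degrees of freedom.

```latex
\begin{align*}
H_0 &: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2
\quad \text{versus} \quad
H_1 : \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2 \\[4pt]
\mathbf{S}_p &= \frac{(n_1 - 1)\,\mathbf{S}_1 + (n_2 - 1)\,\mathbf{S}_2}{n_1 + n_2 - 2} \\[4pt]
T^2 &= \frac{n_1 n_2}{n_1 + n_2}\,
(\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)^{\top}
\mathbf{S}_p^{-1}
(\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2) \\[4pt]
F &= \frac{n_1 + n_2 - p - 1}{(n_1 + n_2 - 2)\,p}\,T^2
\;\sim\; F_{p,\;n_1 + n_2 - p - 1} \quad \text{under } H_0 \\[4pt]
\text{Simultaneous:}\quad
& (\bar{x}_{1i} - \bar{x}_{2i}) \pm
\sqrt{\frac{(n_1 + n_2 - 2)\,p}{n_1 + n_2 - p - 1}\,
F_{p,\;n_1 + n_2 - p - 1}(\alpha)}\;
\sqrt{\left(\frac{1}{n_1} + \frac{1}{n_2}\right) s_{ii,p}} \\[4pt]
\text{Bonferroni:}\quad
& (\bar{x}_{1i} - \bar{x}_{2i}) \pm
t_{n_1 + n_2 - 2}\!\left(\frac{\alpha}{2p}\right)
\sqrt{\left(\frac{1}{n_1} + \frac{1}{n_2}\right) s_{ii,p}}
\end{align*}
```

Here s_{ii,p} is the i-th diagonal element of S_p, F(α) and t(α/(2p)) denote upper-tail critical values, and the p-value is the upper-tail probability of the observed F under the stated F distribution.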

Part B:

NOTE: Show your SAS code and relevant formulae, outputs, and interpretation.

a) For each group, plot the pairwise 90% prediction ellipses using PROC CORR for the pair of variables most significantly negatively correlated and the pair of variables most significantly positively correlated. (5 marks)

b) Produce plots to test for multivariate normality of your data for each group. Is the data normal? (5 marks)

NOTE: In all parts of the question ensure you show your SAS code and relevant formulae, outputs, and interpretation.
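For Part B b), one common graphical check of multivariate normality is a chi-square plot of the ordered squared Mahalanobis distances. The sketch below is written in Python rather than SAS, uses only the standard library, and every function name in it is hypothetical. It returns the sorted distances for one group; these would then be plotted against chi-square quantiles with p degrees of freedom (for example at probability levels (j - 0.5)/n), which are not computed here.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def mean_vector(rows: Sequence[Sequence[float]]) -> List[float]:
    """Column means of an n x p data set."""
    n, p = len(rows), len(rows[0])
    return [sum(r[j] for r in rows) / n for j in range(p)]


def covariance_matrix(rows: Sequence[Sequence[float]], mean: List[float]) -> Matrix:
    """Sample covariance matrix with divisor n - 1."""
    n, p = len(rows), len(mean)
    return [
        [sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def squared_mahalanobis(rows: Sequence[Sequence[float]]) -> List[float]:
    """Sorted squared distances d_j^2 = (x_j - xbar)' S^{-1} (x_j - xbar)."""
    mean = mean_vector(rows)
    cov = covariance_matrix(rows, mean)
    distances = []
    for r in rows:
        dev = [r[j] - mean[j] for j in range(len(mean))]
        weights = solve(cov, dev)  # S^{-1} (x_j - xbar) without forming S^{-1}
        distances.append(sum(d * w for d, w in zip(dev, weights)))
    return sorted(distances)
```

If the points of the resulting plot lie close to a straight line through the origin with unit slope, the data are broadly consistent with multivariate normality; marked curvature or a few very large distances suggest otherwise.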

Q2. DATA SET WBC (Marks = 70 marks)

For the data set analysed in Question 1 on WBCs:

For the diseased group:

a) Perform a principal component analysis (PCA). Show your full SAS code and all SAS output (5 marks)

b) Give (write out) the formulation of the first 3 principal components Prin j, j = 1, …, 3 (PC1, PC2, PC3). (3 marks)

c) Find the variance and the cumulative proportion explained by each of the full suite of principal components. (3 marks)

d) Create the Principal Component Pattern Profile plot and interpret all the Principal Components. Justify your answers carefully according to your Principal Component Pattern Profile plot. (8 marks)

e) How many principal components (PCs) would you retain based on the scree plot? Justify your answer. (2 marks)

f) Perform formal statistical tests to ascertain the optimal number of principal components to retain. HINT: Test the significance of the “larger” components, that is, the components corresponding to the larger eigenvalues. (4 marks)

g) Construct the 95% CI for λ1. Show your formula and working along with the result. (2 marks)

h) Construct the 95% CI for λ2. Show your formula and working along with the result. (2 marks)

i) Which variables contribute the most to PC2? (1 mark)

For the non-diseased group:

a) Perform a principal component analysis (PCA). Show your full SAS code and all output (5 marks)

b) Give (write out) the formulation of the first 3 principal components Prin j, j = 1, …, 3 (PC1, PC2, PC3). (3 marks)

c) Find the variance and the cumulative proportion explained by each of the full suite of principal components. (3 marks)

d) Create the Principal Component Pattern Profile plot and interpret all the Principal Components. Justify your answers carefully according to your Principal Component Pattern Profile plot. (8 marks)

e) How many principal components (PCs) would you retain based on the scree plot? Justify your answer. (2 marks)

f) Perform formal statistical tests to ascertain the optimal number of principal components to retain. HINT: Test the significance of the “larger” components, that is, the components corresponding to the larger eigenvalues. (4 marks)

g) Construct the 95% CI for λ1. Show your formula and working along with the result. (2 marks)

h) Construct the 95% CI for λ2. Show your formula and working along with the result. (2 marks)

i) Which variables contribute the most to PC2? (1 mark)

j) Comment on the differences and similarities between the PCA results for the diseased and non-diseased groups, based on the PC pattern profiles and the first 2 PCs found. (10 marks)

NOTE: In all parts of the question ensure you show your SAS code and relevant formulae, working out, outputs, and write your conclusions and interpretation carefully.
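Parts g) and h) ask for 95% confidence intervals; assuming these refer to the first two population eigenvalues, a minimal sketch of the usual large-sample interval is given below. It is written in Python with the standard library only, and the function name `eigenvalue_ci` is hypothetical. The interval is λ̂ / (1 + z√(2/n)) to λ̂ / (1 − z√(2/n)), where z is the upper α/2 standard normal quantile. It relies on multivariate normal data with distinct eigenvalues and is derived for eigenvalues of the sample covariance matrix, so it is only approximate when the PCA is run on the correlation matrix.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def eigenvalue_ci(lambda_hat: float, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Large-sample (1 - alpha) confidence interval for one eigenvalue.

    lambda_hat is the sample eigenvalue and n the number of observations.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 normal quantile
    half = z * sqrt(2.0 / n)
    if half >= 1:
        # The upper limit is undefined when z * sqrt(2 / n) reaches 1.
        raise ValueError("sample too small for the large-sample interval")
    return lambda_hat / (1 + half), lambda_hat / (1 - half)


# Usage with a made-up eigenvalue, for illustration only (n = 50 cells per group).
lower, upper = eigenvalue_ci(3.2, 50)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

The interval is not symmetric about the sample eigenvalue, because the normal approximation is applied on a multiplicative scale.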

Q3. DATA SET TWIN (Marks = 25 marks)

The personality traits (TCIs) of a sample of identical twins, as discussed in a psychometric case study, were investigated.

A total of 30 twin pairs were questioned. The following questions were put to the twins.

  • X1: What is the level of Novelty Seeking (NS) you observe in your twin?
  • X2: What level of Novelty Seeking (NS) does your twin see in you?
  • X3: What is the level of Harm Avoidance (HA) you observe in your twin?
  • X4: What level of Harm Avoidance (HA) does your twin see in you?

Responses were recorded on a five-point scale with the following rank values:

1. None of the trait in question
2. Very low level of the trait in question
3. Some level of the trait in question
4. A great deal of the trait in question
5. Huge level of the trait in question

The aim of the study was to ascertain whether the twins accurately perceive/rank the NS and HA levels of their twin.

a) Perform the appropriate Hotelling’s T-squared test. Show your formula, SAS code, SAS output, the hypotheses being tested, the test statistic and the p value. (10 marks)

b) Provide via SAS the sample means and variances of the differences in responses between the twins. (5 marks)

c) Does the first twin accurately perceive the level of NS or HA of the second twin? Justify your conclusion. (10 marks)

NOTE: In all parts of the question ensure you show your SAS code or IML code and relevant formulae, outputs, working and interpretation. Write your conclusions out carefully.
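The paired Hotelling's T-squared computation behind part a) can be sketched as follows, in Python rather than SAS and with the standard library only. The sketch assumes the analysis is run on the vector of within-pair differences d_j, with H0 stating that the mean difference vector is zero; which responses are differenced is a modelling decision the question leaves to the reader. The function names are hypothetical. The statistic is T² = n d̄' S_d⁻¹ d̄, and (n − p) / ((n − 1) p) · T² is referred to an F distribution with p and n − p degrees of freedom. The p-value is not computed because the F distribution is not in the standard library.

```python
from typing import List, Sequence, Tuple


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix of the differences is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def paired_hotelling_t2(
    first: Sequence[Sequence[float]], second: Sequence[Sequence[float]]
) -> Tuple[float, float, Tuple[int, int]]:
    """Return (T^2, F statistic, (df1, df2)) for H0: mean difference = 0.

    first[j] and second[j] hold the p paired responses for pair j.
    """
    n, p = len(first), len(first[0])
    diffs = [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(first, second)]
    dbar = [sum(d[j] for d in diffs) / n for j in range(p)]
    # Sample covariance matrix of the differences (divisor n - 1).
    s_d = [
        [sum((d[i] - dbar[i]) * (d[j] - dbar[j]) for d in diffs) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]
    weights = solve(s_d, dbar)  # S_d^{-1} dbar
    t2 = n * sum(m * w for m, w in zip(dbar, weights))
    f_stat = (n - p) / ((n - 1) * p) * t2
    return t2, f_stat, (p, n - p)
```

With a single difference variable (p = 1) the statistic reduces to the square of the ordinary paired t statistic, which gives a quick sanity check on any implementation.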