MATH4068: Coursework 2021

Coursework
The file gap.csv is available on Moodle, and contains the GDP per capita, and the life expectancy for 142
different countries from 1952 to 2007. This data is from gapminder.org.
Load the data into R using the commands
gap.raw <- read.csv("gap.csv")
gap <- gap.raw
gap[,3:14] <- log(gap.raw[,3:14])
Note that for GDP per capita, it is best to work with log(GDP) when doing statistical analysis, as the values
vary over several orders of magnitude between countries. For ease of plotting, it may be useful to split the
data into two data frames, one containing GDP per capita, and the other life expectancy data.
gdp <- exp(gap[,3:14])
years <- seq(1952, 2007, 5)
colnames(gdp) <- years
rownames(gdp) <- gap[,2]
lifeExp <- gap[,15:26]
colnames(lifeExp) <- years
rownames(lifeExp) <- gap[,2]
In this project, you will analyse this data using the methods we have looked at during the module.
• Begin by creating some basic exploratory data analysis plots, showing how GDP and life expectancy
have changed over the period covered by the data (1952 to 2007).
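As a starting point, the sketch below shows one possible pair of exploratory plots: a line per country over time, and a boxplot per time point. It runs on a small synthetic matrix (lifeExp_toy, a made-up stand-in that does not contain real gapminder values) so that it is self-contained; the same calls should apply to lifeExp or log(gdp) once the data are loaded.

```r
# Synthetic stand-in for the lifeExp data frame (NOT real gapminder values):
# 5 hypothetical countries observed at the 12 time points used in gap.csv.
set.seed(1)
years <- seq(1952, 2007, 5)
lifeExp_toy <- t(sapply(1:5, function(i) {
  40 + 5 * i + 0.3 * (years - 1952) + rnorm(length(years))
}))
colnames(lifeExp_toy) <- years
rownames(lifeExp_toy) <- paste0("Country", 1:5)

# One line per country. matplot() draws the columns of its second argument,
# so the countries-by-years matrix is transposed first.
matplot(years, t(lifeExp_toy), type = "l", lty = 1,
        xlab = "Year", ylab = "Life expectancy (years)")

# Distribution across countries at each time point (one box per column).
boxplot(lifeExp_toy, xlab = "Year", ylab = "Life expectancy (years)")
```

With the real data, as.matrix(lifeExp) would take the place of lifeExp_toy.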
Principal component analysis
• Carry out principal component analysis on the log(GDP) data and on the life-expectancy data using
your preferred choice of the sample covariance matrix S or the sample correlation matrix R.
• Calculate the proportion of variation explained by each of the principal components, and provide a
scree plot. Discuss how many principal components you would choose to retain in each case.
• Look at the leading principal components for the log(GDP) and the life expectancy data, and provide
an interpretation for each component you have chosen to retain.
• Provide scatter plots of combinations of the first three principal component scores, indicating on the
plot the names of the countries. Colour the data points by the continent they belong to. Identify and
discuss any countries that have interesting characteristics based on your analysis. Can you explain what
happened in any of these countries?
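The steps listed above can be sketched with prcomp() from base R. This is only one possible approach, shown on synthetic data: the matrix X, the row labels and the continent factor are all made up for illustration, and with the real data the continent labels would have to come from whichever column of gap holds them (not stated in this handout). Setting scale. = FALSE bases the analysis on the covariance matrix, and scale. = TRUE on the correlation matrix.

```r
# Synthetic stand-in for a countries-by-years matrix (NOT real data).
set.seed(2)
years <- seq(1952, 2007, 5)
n <- 20
X <- matrix(rnorm(n * 12), n, 12) + outer(rnorm(n, sd = 3), rep(1, 12))
colnames(X) <- years
rownames(X) <- paste0("C", 1:n)
continent <- factor(rep(c("A", "B", "C", "D"), each = 5))  # hypothetical labels

pca_S <- prcomp(X, scale. = FALSE)  # PCA based on the covariance matrix
pca_R <- prcomp(X, scale. = TRUE)   # PCA based on the correlation matrix

# Proportion of variance explained by each component, and the running total.
prop_var <- pca_S$sdev^2 / sum(pca_S$sdev^2)
cum_var <- cumsum(prop_var)

# Scree plot.
plot(prop_var, type = "b", xlab = "Component",
     ylab = "Proportion of variance explained")

# Loadings of the first three components, for interpretation.
loadings3 <- pca_S$rotation[, 1:3]

# Scores on PC1 and PC2, labelled by name and coloured by continent.
plot(pca_S$x[, 1], pca_S$x[, 2], type = "n", xlab = "PC1", ylab = "PC2")
text(pca_S$x[, 1], pca_S$x[, 2], labels = rownames(X),
     col = as.integer(continent), cex = 0.7)
legend("topright", legend = levels(continent),
       col = seq_along(levels(continent)), pch = 1)
```

The same plotting calls can be repeated for the other pairs of scores (PC1 against PC3, and PC2 against PC3).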
Multidimensional scaling
• Perform multidimensional scaling using the combined dataset of log(GDP) and life expectancy, i.e.,
using
gap[,3:26]

Find and plot a 2-dimensional representation of the data. As before, colour each data point by the continent
it is on. Discuss the similarity of this plot with your previous plots.
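A minimal sketch of classical multidimensional scaling with cmdscale() is given below, again on synthetic data (the matrix Z stands in for gap[,3:26], and the continent labels are hypothetical). Standardising the columns with scale() before computing distances is an assumption made here because log(GDP) and life expectancy are on different scales; other choices of distance are equally possible.

```r
# Synthetic stand-in for the combined 24-column data set (NOT real data).
set.seed(3)
n <- 20
Z <- matrix(rnorm(n * 24), n, 24)
rownames(Z) <- paste0("C", 1:n)
continent <- factor(rep(c("A", "B", "C", "D"), each = 5))  # hypothetical labels

# Euclidean distances between the standardised rows.
D <- dist(scale(Z))

# Classical MDS: a 2-dimensional configuration of the n points.
mds <- cmdscale(D, k = 2)

plot(mds, type = "n", xlab = "Coordinate 1", ylab = "Coordinate 2")
text(mds, labels = rownames(Z), col = as.integer(continent), cex = 0.7)

# With Euclidean distances, classical MDS reproduces the first two principal
# component scores of the same standardised data, up to the sign of each axis.
pc <- prcomp(scale(Z))$x[, 1:2]
max_diff <- max(abs(abs(mds) - abs(pc)))
```

The final comparison is relevant when discussing how similar the MDS plot is to the earlier principal component plots.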
