This is an assignment-help sample for an R assignment on statistical machine learning.

Overview of Dataset

Our dataset is a credit card approval prediction dataset. It consists of two CSV files: application_record.csv and credit_record.csv. The application_record.csv file contains applicants’ personal information, which we can use as features for prediction. The credit_record.csv file records users’ credit card behavior.

We found this dataset on Kaggle and have already downloaded both CSV files. The link for the dataset
is: https://www.kaggle.com/rikdifos/credit-card-approval-prediction?select=application_record.csv

The application_record.csv file contains 438557 observations of 18 variables, and the credit_record.csv file contains 1048575 observations of three variables. There are no missing values in credit_record, but about one third of the occupation type values are missing in application_record, which leaves us with about 300000 complete records. We think that is enough to model the credit card approval pattern. Alternatively, we could simply drop the occupation type variable from the model. We will need to look into that later.
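The two options mentioned above (keeping only complete rows, or dropping the occupation variable) could be compared with a few lines of base R. This is only a sketch: it runs on a tiny toy file rather than the real application_record.csv, and the column names (ID, AMT_INCOME_TOTAL, OCCUPATION_TYPE) and the assumption that missing occupations appear as empty strings are ours, not stated above.

```r
# Toy stand-in for application_record.csv. Column names and values are
# assumptions made for illustration only.
toy_csv <- tempfile(fileext = ".csv")
writeLines(c(
  "ID,AMT_INCOME_TOTAL,OCCUPATION_TYPE",
  "1,427500,",
  "2,112500,Security staff",
  "3,270000,Sales staff",
  "4,283500,",
  "5,135000,Managers",
  "6,157500,Laborers"
), toy_csv)

# Treat empty strings as missing so that blank occupations become NA.
applications <- read.csv(toy_csv, na.strings = c("", "NA"),
                         stringsAsFactors = FALSE)

# Share of missing values in each column.
missing_share <- colMeans(is.na(applications))
print(round(missing_share, 3))

# Option 1: keep only rows with no missing values.
complete_rows <- applications[complete.cases(applications), ]

# Option 2: drop the occupation column and keep every row.
keep_cols <- setdiff(names(applications), "OCCUPATION_TYPE")
without_occupation <- applications[, keep_cols]

cat("rows kept (complete cases):", nrow(complete_rows), "\n")
cat("rows kept (column dropped):", nrow(without_occupation), "\n")
```

On the real data, the same two row counts would show how much is lost by each choice before committing to one.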

Overview of Research Questions

Our project focuses on predicting credit card approval. The main research question is: what criteria determine whether a credit card is issued? Among the 18 variables in application_record, we expect annual income, number of children, education level, age, days employed, and number of family members to be the predictors most likely to have a larger effect on the approval decision.

We expect the question to be best answered with both regression and classification approaches, since the 18 variables include both numeric (quantitative) and character (qualitative) values.

One thing to note: the choice between regression and classification methods depends on the form of the
outcome variable. Since it sounds like your outcome variable is categorical (approved or not), you’ll
most likely end up using classification machine learning models.
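To make the point concrete, here is a minimal sketch of a classifier for a binary outcome, using logistic regression from base R. The data are simulated, the names `approved`, `income`, and `children` are hypothetical, and the relationship between them is invented purely so the example runs. Logistic regression is just one possible classification model, not necessarily the one the project should use.

```r
set.seed(1)

# Simulated stand-in data. The variable names and the relationship below
# are made up for illustration and do not come from the real dataset.
n <- 500
income   <- round(runif(n, 30, 300))   # annual income, in thousands
children <- rpois(n, 1)                # number of children
log_odds <- -2 + 0.015 * income - 0.3 * children
approved <- rbinom(n, size = 1, prob = plogis(log_odds))
applicants <- data.frame(approved, income, children)

# A categorical (0/1) outcome calls for a classifier; logistic regression
# is one of the simplest choices.
fit <- glm(approved ~ income + children, data = applicants, family = binomial)

# Predicted approval probabilities, thresholded at 0.5 to get class labels.
prob <- predict(fit, type = "response")
pred <- as.integer(prob > 0.5)

# Confusion matrix and accuracy on the training data. Accuracy measured on
# the data used for fitting is optimistic, so treat it as a sanity check.
print(table(predicted = pred, actual = applicants$approved))
accuracy <- mean(pred == applicants$approved)
cat("training accuracy:", round(accuracy, 3), "\n")
```

Note that the predictors can be numeric or categorical in either setting; it is the type of the outcome that decides between regression and classification.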

Report Contents

The project report should be written like a paper, with figures, code, and results included
throughout to illustrate your points and findings. Text should be included to guide the reader. I
recommend reading through the example report to get an idea of this layout. More specifically, your
report should contain:

– An introduction section: Describes the data, the research questions, provides any background
readers need to understand your project, etc.

– A conclusion section: Discusses the outcome(s) of models you fit. Which models performed well,
which performed poorly? Were you surprised by model performance? Next steps? General
conclusions?

– A table of contents

– A section for exploratory data analysis: This should contain at least 3 to 5 visualizations and/or tables
and their interpretation/discussion. At minimum, your group should create a univariate visualization of
the outcome(s) and a bivariate or multivariate visualization of the relationship(s) between the outcome
and select predictors. Part of an EDA involves asking questions about your data and exploring
your data to find the answers.

– A section discussing data splitting and cross-validation: Describe your process of splitting data into
training, test, and/or validation sets. Describe the process of cross-validation.

– A section discussing model fitting: Describe the types of models you fit, their parameter values, and
the results.

– Model selection and performance: A table and/or graph describing the performance of your best
fitting model on testing data. Describe your best-fitting model however you choose, and the quality of
its predictions, etc.
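The data splitting and cross-validation item above could be sketched in base R as follows. This is an illustration under stated assumptions rather than a required procedure: the data are simulated, `approved` and `income` are hypothetical names, and the 80/20 split, the 5 folds, and the 0.5 threshold are arbitrary choices.

```r
set.seed(2)

# Simulated stand-in data with a hypothetical 0/1 outcome `approved`.
n <- 400
income   <- runif(n, 30, 300)
approved <- rbinom(n, size = 1, prob = plogis(-2 + 0.015 * income))
applicants <- data.frame(approved, income)

# Hold out 20% of the rows as a test set that is used only once, at the end.
test_idx <- sample(n, size = round(0.2 * n))
train <- applicants[-test_idx, ]
test  <- applicants[test_idx, ]

# Randomly assign each training row to one of k folds of equal size.
k <- 5
fold <- sample(rep(seq_len(k), length.out = nrow(train)))

# For each fold: fit on the other k - 1 folds, score on the held-out fold.
cv_accuracy <- sapply(seq_len(k), function(i) {
  fit  <- glm(approved ~ income, data = train[fold != i, ], family = binomial)
  prob <- predict(fit, newdata = train[fold == i, ], type = "response")
  mean(as.integer(prob > 0.5) == train$approved[fold == i])
})
cat("mean CV accuracy:", round(mean(cv_accuracy), 3), "\n")

# Refit on the full training set and evaluate once on the test set.
final_fit <- glm(approved ~ income, data = train, family = binomial)
test_prob <- predict(final_fit, newdata = test, type = "response")
test_accuracy <- mean(as.integer(test_prob > 0.5) == test$approved)
cat("test accuracy:", round(test_accuracy, 3), "\n")
```

Cross-validation accuracy is what you would use to compare candidate models; the test set accuracy is reported only for the model finally selected.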

R Assignment Help | R Machine Learning

Posted on 2022-03-03 by easydue
