This assignment uses R to analyze customer data and predict customer churn.

T81-576 Spring 2020, Telecommunications Customer Churn Case Study
T81-576
Analytics Applications
Case Study
Telecommunications Customer Churn
Due: March 17, 2020 5:29pm
Prof. Farmer
https://www.kaggle.com/blastchar/telco-customer-churn
Introduction
Simply put, customer churn occurs when customers or subscribers stop doing business with a
company or service. Also known as customer attrition, customer churn is a critical metric
because it is much less expensive to retain existing customers than it is to acquire new
customers – earning business from new customers means working leads all the way through
the sales funnel, utilizing your marketing and sales resources throughout the process. Customer
retention, on the other hand, is generally more cost-effective as you’ve already earned the trust
and loyalty of existing customers.
Customer churn impedes growth, so companies should have a defined method for calculating
customer churn in a given period of time. By being aware of and monitoring churn rate,
organizations are equipped to determine their customer retention success rates and identify
strategies for improvement.
Organizations calculate customer churn rate in a variety of ways. Churn rate may
represent the total number of customers lost, the percentage of customers lost compared to
the company’s total customer count, the value of recurring business lost, or the percent of
recurring value lost. Other organizations calculate churn rate for a certain period of time, such
as quarterly periods or fiscal years. One of the most commonly used methods for calculating
customer churn is to divide the number of customers lost during a specified time period by
the total number of customers the company had at the beginning of that period.
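One common convention takes the churn rate for a period as the number of customers lost during the period divided by the number of customers at the start of the period. The sketch below illustrates that arithmetic only. The function name, its arguments, and the example figures are assumptions made for illustration and are not part of the assignment.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Return the fraction of starting customers lost during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return customers_lost / customers_at_start


# Hypothetical quarter: 2,000 customers at the start, 150 lost.
quarterly_rate = churn_rate(2000, 150)
print(f"{quarterly_rate:.1%}")  # prints 7.5%
```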
Case Study Objective
The objective is to predict customer behavior in order to retain customers. You should analyze
all relevant customer data and develop a model to predict the likelihood of a customer churning.
The final output of the model should be a score of the customer’s likelihood to churn. You must,
at a minimum, use a neural network in your model.
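To make the required output concrete, here is a minimal pure-Python sketch of a neural network that returns a churn score between 0 and 1. It is an illustration under stated assumptions, not a recommended solution: the class name, the two input features, and the eight training rows are all made up, and a real submission would more likely use a library such as Keras or scikit-learn on the full data set.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class TinyChurnNet:
    """One tanh hidden layer and a sigmoid output that yields a churn score."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        score = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, score

    def score(self, x):
        """Return the predicted churn likelihood, a value between 0 and 1."""
        return self._forward(x)[1]

    def train(self, xs, ys, epochs=500, lr=0.1):
        """Stochastic gradient descent on the binary cross-entropy loss."""
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                hidden, score = self._forward(x)
                d_out = score - y  # loss gradient at the output pre-activation
                for j, h in enumerate(hidden):
                    # Backpropagate through tanh before w2[j] is updated.
                    d_hidden = d_out * self.w2[j] * (1.0 - h * h)
                    self.w2[j] -= lr * d_out * h
                    self.b1[j] -= lr * d_hidden
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * d_hidden * xi
                self.b2 -= lr * d_out


# Made-up training rows: [tenure scaled to 0..1, 1.0 if month-to-month else 0.0].
xs = [[0.03, 1.0], [0.08, 1.0], [0.15, 1.0], [0.20, 1.0],
      [0.55, 0.0], [0.70, 0.0], [0.85, 0.0], [1.00, 0.0]]
ys = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = churned, 0 = stayed

net = TinyChurnNet(n_inputs=2, n_hidden=4)
net.train(xs, ys)
risky_score = net.score([0.05, 1.0])  # new month-to-month customer
safe_score = net.score([0.90, 0.0])   # long-tenured contract customer
print(f"risky: {risky_score:.2f}, safe: {safe_score:.2f}")
```

The sigmoid output is what makes the result usable as a likelihood score rather than a hard Yes/No label.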
Data Content
Each row represents a customer, and each column contains a customer attribute described in the
column metadata. The raw data contains 7043 rows (customers) and 21 columns (features).
The “Churn” column is our target.
The data set includes information about:
• Customers who left within the last month – the column is called Churn
• Services that each customer has signed up for – phone, multiple lines, internet, online
security, online backup, device protection, tech support, and streaming TV and movies
• Customer account information – how long they’ve been a customer, contract, payment
method, paperless billing, monthly charges, and total charges
• Demographic info about customers – gender, age range, and if they have partners and
dependents
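A first sanity check on the data is to confirm its shape and the balance of the Churn target. The sketch below uses only the standard library and a tiny made-up sample. The function name is an assumption, and the sample carries just 4 of the 21 columns.

```python
import csv
import io
from collections import Counter


def summarize_churn(csv_file):
    """Count rows, columns, and Churn labels from an open CSV file."""
    reader = csv.DictReader(csv_file)
    rows = list(reader)
    n_columns = len(reader.fieldnames or [])
    churn_counts = Counter(row["Churn"] for row in rows)
    return len(rows), n_columns, churn_counts


# Tiny made-up sample with only 4 of the 21 columns, for illustration.
sample = io.StringIO(
    "customerID,tenure,MonthlyCharges,Churn\n"
    "A-001,1,70.35,Yes\n"
    "A-002,34,56.95,No\n"
    "A-003,45,42.30,No\n"
)
n_rows, n_cols, counts = summarize_churn(sample)
print(n_rows, n_cols, dict(counts))
```

Run against the downloaded file (opened with `open(path, newline="")`, where `path` is wherever you saved it), the function should report 7043 rows and 21 columns if the file matches the description above.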
Columns and Related Information
Field             Field Type   Description
customerID        Text/String  Unique ID for each customer
gender            Text/String  Whether the customer is male or female
SeniorCitizen     Numeric      Whether the customer is a senior citizen or not (1, 0)
Partner           Text/String  Whether the customer has a partner or not (Yes, No)
Dependents        Text/String  Whether the customer has dependents or not (Yes, No)
tenure            Numeric      Number of months the customer has stayed with the company
PhoneService      Text/String  Whether the customer has a phone service or not (Yes, No)
MultipleLines     Text/String  Whether the customer has multiple lines or not (Yes, No, No phone service)
InternetServices  Text/String  Customer’s internet service provider (DSL, Fiber optic, No)
OnlineSecurity    Text/String  Whether the customer has online security or not (Yes, No, No internet service)
OnlineBackup      Text/String  Whether the customer has online backup or not (Yes, No, No internet service)
DeviceProtection  Text/String  Whether the customer has device protection or not (Yes, No, No internet service)
TechSupport       Text/String  Whether the customer has tech support or not (Yes, No, No internet service)
StreamingTV       Text/String  Whether the customer has streaming TV or not (Yes, No, No internet service)
StreamingMovies   Text/String  Whether the customer has streaming movies or not (Yes, No, No internet service)
Contract          Text/String  The contract term of the customer (Month-to-month, One year, Two year)
Paperless         Text/String  Whether the customer has paperless billing or not (Yes, No)
PaymentMethod     Text/String  The customer’s payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic))
MonthlyCharges    Numeric      The amount charged to the customer monthly
TotalCharges      Numeric      The total amount charged to the customer
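Because most fields are text, they generally need to be converted to numbers before a neural network can use them. The sketch below shows one common option, one-hot encoding, for three of the categorical fields plus two numeric ones. The field names and levels are copied from the table above and should be checked against the column headers in the actual file. The helper name and the sample record are made up for illustration.

```python
# Levels copied from the field table; only three categorical fields are shown.
CATEGORICAL_LEVELS = {
    "Partner": ["Yes", "No"],
    "Contract": ["Month-to-month", "One year", "Two year"],
    "InternetServices": ["DSL", "Fiber optic", "No"],
}
NUMERIC_FIELDS = ["tenure", "MonthlyCharges"]


def encode_customer(record):
    """One-hot encode categorical fields, then append numeric fields as floats."""
    features = []
    for field, levels in CATEGORICAL_LEVELS.items():
        value = record[field]
        if value not in levels:
            raise ValueError(f"unexpected level {value!r} for {field}")
        features.extend(1.0 if value == level else 0.0 for level in levels)
    for field in NUMERIC_FIELDS:
        features.append(float(record[field]))
    return features


# Made-up customer record for illustration.
customer = {
    "Partner": "No",
    "Contract": "Month-to-month",
    "InternetServices": "Fiber optic",
    "tenure": "2",
    "MonthlyCharges": "70.70",
}
encoded = encode_customer(customer)
print(encoded)
# [0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 2.0, 70.7]
```

Numeric fields such as tenure and MonthlyCharges are left unscaled here. Scaling them to a comparable range before training is a common further step.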
Deliverables
• R or Python code
• Any graphs, plots, etc.
• One-page summary of your work (think of this as a one-page executive summary of the
findings)
• Insights into how you think you can improve on your model (e.g., more data, more
advanced models, etc.)
Final Remarks
• All work should be your own work.
• Please follow the same rules as we discussed for the mid-term exam.
• Any questions should be directed to me.