Predicting Total Spend

The aim of this project was to build a predictive model that explains the total amount customers spend on their credit cards. The dataset consisted of 5,000 rows, each corresponding to one customer, along with a number of variables. A customer could hold up to two credit cards, one primary and one secondary. The focus was on the total amount spent on both cards combined.

Following initial pre-processing of the data, linear regression was used to build a model describing what drives customer spending. The data was split 70/30 into development and validation sets. The models developed on the development and validation sets gave numerically close R-squared values, indicating that the model was reliable. The main drivers of total spend were found to be the sector in which the customer was employed, whether they had a car on lease, and, unsurprisingly, their income.
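To make the validation idea concrete, here is a minimal, dependency-free sketch of a 70/30 split followed by an R-squared comparison. It is not the project's actual code. The data is synthetic, the column names (`income`, `car_lease`, `sector`) and their coefficients are made up for illustration, and it assumes one particular reading of the check: fit on the development set, then score on both sets.

```python
import random

def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        for i in range(k):
            if i != c:
                f = A[i][c] / A[c][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]  # [intercept, b1, b2, ...]

def predict(beta, X):
    return [beta[0] + sum(b * x for b, x in zip(beta[1:], r)) for r in X]

def r_squared(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for the 5,000-customer dataset (all values hypothetical).
rng = random.Random(0)
n = 5000
X, y = [], []
for _ in range(n):
    income = rng.uniform(20, 200)   # hypothetical income, in thousands
    car_lease = rng.randint(0, 1)   # 1 if the customer leases a car
    sector = rng.randint(0, 1)      # simplified to binary; a real sector
                                    # variable would need dummy coding
    spend = 50 + 3.0 * income + 40 * car_lease + 25 * sector + rng.gauss(0, 30)
    X.append([income, car_lease, sector])
    y.append(spend)

# 70/30 split into development and validation sets
idx = list(range(n))
rng.shuffle(idx)
cut = int(0.7 * n)
dev, val = idx[:cut], idx[cut:]
X_dev, y_dev = [X[i] for i in dev], [y[i] for i in dev]
X_val, y_val = [X[i] for i in val], [y[i] for i in val]

# Fit on the development set only, then score both sets
beta = fit_ols(X_dev, y_dev)
r2_dev = r_squared(y_dev, predict(beta, X_dev))
r2_val = r_squared(y_val, predict(beta, X_val))
print(f"R^2 development: {r2_dev:.3f}")
print(f"R^2 validation:  {r2_val:.3f}")
```

If the two R-squared values are close, the model is probably not overfitting the development data. A validation score far below the development score would be a warning sign. In practice a library such as scikit-learn or statsmodels would replace the hand-written solver.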

Visit the GitHub repository here.
Directly download the final report with complete analysis and results here.