5e982f5 Dec 4, 2019

Capstone Project: Predicting Delinquencies with Flask

Project Goal

Predict credit delinquencies as accurately as possible and build a Flask web interface where users can enter credit information and receive the probability of a delinquency.

Data

The data comes from a Kaggle competition (link below) and contains bank information on 250,000 borrowers, split into a training set and a test set. The training set shows which customers experienced a delinquency of 90 days past due or worse. The test set does not show which customers experienced delinquency.

https://www.kaggle.com/c/GiveMeSomeCredit/overview

EDA

The data set had many issues. Monthly income and number of dependents had many missing values. There were also numerous values that did not make sense, such as borrowers recorded as being past due 96 or 98 times.
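The two issues above can be handled in a few lines of pandas. The following is a minimal sketch, not the notebook's actual code: the column names are assumed to match the Kaggle CSV headers, and filling gaps with the column median is an assumed strategy, since the README does not say how missing values were populated.

```python
import pandas as pd

# Assumed column names; adjust them to match the actual CSV headers.
PAST_DUE_COLS = [
    "NumberOfTime30-59DaysPastDueNotWorse",
    "NumberOfTime60-89DaysPastDueNotWorse",
    "NumberOfTimes90DaysLate",
]
MISSING_COLS = ["MonthlyIncome", "NumberOfDependents"]
IMPLAUSIBLE_COUNTS = [96, 98]


def clean_credit_data(df):
    """Return a cleaned copy of df; the input frame is left unchanged."""
    cleaned = df.copy()
    for col in PAST_DUE_COLS:
        # Blank out the implausible counts (they become NaN) ...
        cleaned[col] = cleaned[col].mask(cleaned[col].isin(IMPLAUSIBLE_COUNTS))
        # ... then fill them with the median of the remaining values.
        # Median imputation is an assumption, not the author's stated method.
        cleaned[col] = cleaned[col].fillna(cleaned[col].median())
    for col in MISSING_COLS:
        cleaned[col] = cleaned[col].fillna(cleaned[col].median())
    return cleaned
```

The same function would be applied to both the training and the test set so that the model sees consistently prepared features.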

Project Solution

The best model for prediction was a neural network classifier. I also tried a grid search over the neural network's settings, but the results did not differ meaningfully from the standalone model. I found that the choice of features had much more effect on predictive power.

I also used Logistic Regression and a Gradient Boosting Classifier. The Gradient Boosting Classifier got an AUC score of 0.70 on Kaggle. The Neural Network Classifier (using the same features) increased my score to 0.86.
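For readers unfamiliar with the AUC scores quoted above: AUC can be read as the probability that a randomly chosen delinquent borrower receives a higher predicted score than a randomly chosen non-delinquent one. The sketch below computes it directly from that definition. The function name `roc_auc` is illustrative, and the pairwise loop is meant for small examples only, not for scoring a full submission.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 values (1 = delinquent).
    scores: predicted probabilities, in the same order as labels.
    Tied scores count as half a correct ranking.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")

    correct = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                correct += 1.0
            elif pos_score == neg_score:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Three of the four positive/negative pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a sense of the gap between 0.70 and 0.86.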

Business Problem and Solution

Banks have a difficult time predicting defaults. This tool can be deployed for people who are underwriting loans to assist with their process.

Flask

I used Flask to create a site where a user can provide data in three ways: enter it through the site, upload an Excel sheet, or upload a PDF. The site then displays a table showing the probability of a delinquency and offers the option of downloading the table to Excel.
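As a rough illustration of the first input path (data entered directly), the sketch below shows a single Flask route that accepts borrower fields as JSON and returns a probability. It is not the deployed application: the `/predict` route, the `StubModel` class, and the response field name are hypothetical, and the stub returns a fixed value where the real app would call the trained classifier.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


class StubModel:
    """Hypothetical stand-in for the trained classifier.

    It mimics the scikit-learn predict_proba shape:
    one [P(no delinquency), P(delinquency)] pair per input row.
    """

    def predict_proba(self, rows):
        return [[0.5, 0.5] for _ in rows]


model = StubModel()  # assumption: the real app would load a fitted model here


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        return jsonify({"error": "expected a JSON object of borrower fields"}), 400
    # Use the values in the order they were sent; a real app would enforce
    # the feature order the model was trained with.
    row = list(payload.values())
    probability = model.predict_proba([row])[0][1]
    return jsonify({"delinquency_probability": probability})


if __name__ == "__main__":
    app.run(debug=True)
```

The Excel and PDF upload paths would need extra parsing steps before reaching the same prediction call, which this sketch leaves out.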

AWS

I used AWS to publish my Flask application so other people can use it. A PDF and an Excel file ready to be used with the site are in the folder flask_inputs.

http://ec2-54-172-108-107.compute-1.amazonaws.com/

Jupyter Notebooks

EDA

  • Went through the training and test sets to populate missing values and add features.

Model

  • Used three different models. Created submissions for Kaggle.

PowerPoint Presentation

The PowerPoint presentation is listed above under the name "Predicting_Defaults.pdf".