This repo explores the variables that are pivotal in predicting a borrower's loan outcome.

quzeem91/Communicate_Data_Findings-Prosper-Loan-Dataset-Exploration

Prosper Loan Data Exploration

by OLAMIDE QUZEEM O.

Installations

  • NumPy
  • pandas
  • Matplotlib
  • Seaborn

Dataset

This dataset contains information on 113,937 loans with 81 variables from Prosper Loan, a peer-to-peer personal loan lending company. The dataset was explored to help identify pivotal variables in loan completion.

You can download the dataset from this URL, and the dataset description file can be accessed through this link.

For this analysis, the original 113,937 observations and 81 variables were subset to 55,089 observations and 17 variables.

This reduction in the dataset was done to focus on exploring variables that might help predict the outcome of a loan (completed, charged off, defaulted, canceled) to determine which loan applications should be approved.

Of the 81 variables, 17 appeared pivotal to this objective and were selected for further exploration.
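The subsetting step described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names and status labels (`LoanStatus`, `Chargedoff`, and so on) are assumptions based on the public Prosper data dictionary, only a handful of the 17 selected columns are shown, and the small `demo` frame is made up to stand in for the real CSV.

```python
import pandas as pd

# Assumed column names and status labels; check them against the data dictionary.
OUTCOME_COL = "LoanStatus"
FINAL_OUTCOMES = ["Completed", "Chargedoff", "Defaulted", "Cancelled"]
# Illustrative selection only; the analysis keeps 17 columns in total.
KEEP_COLS = [
    OUTCOME_COL,
    "Term",
    "IsBorrowerHomeowner",
    "Recommendations",
    "DebtToIncomeRatio",
]


def subset_loans(loans: pd.DataFrame) -> pd.DataFrame:
    """Keep only loans that reached a final outcome, and only the columns of interest."""
    finished = loans[loans[OUTCOME_COL].isin(FINAL_OUTCOMES)]
    return finished[KEEP_COLS].reset_index(drop=True)


# Made-up rows standing in for the real dataset.
demo = pd.DataFrame(
    {
        "LoanStatus": ["Completed", "Current", "Defaulted", "Chargedoff"],
        "Term": [12, 36, 36, 60],
        "IsBorrowerHomeowner": [True, False, True, False],
        "Recommendations": [0, 0, 1, 0],
        "DebtToIncomeRatio": [0.10, 0.25, 0.60, 0.45],
        "ListingKey": ["a", "b", "c", "d"],
    }
)

subset = subset_loans(demo)
print(subset.shape)
```

Filtering on the outcome first drops loans that are still in progress, so every remaining row has a known result to compare against the other variables.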

Summary of Findings

Based on the analysis, the following findings were observed:

  1. Borrowers whose listing category and employment status are recorded as "Not available", and whose income range is recorded as "Not displayed", are prone to default on loans.
  2. Homeowners and non-homeowners have a similar distribution in loan completion and defaults.
  3. The number of recommendations a borrower receives is positively correlated with loan completion.
  4. Borrowers with a good (low) debt-to-income (DTI) ratio tend to complete their loans.
  5. Loans with a 1-year duration have the highest completion rate with fewer defaults.
  6. Borrowers with a listing category of "Auto" and auto-related values such as motorcycles, boats, and RVs also have a good completion rate.
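Findings such as the completion rate per loan duration can be checked with a simple grouped proportion. The sketch below assumes the outcome lives in a `LoanStatus` column with the label `Completed` and the duration in a `Term` column measured in months; both names are assumptions, the helper `completion_rate_by` is hypothetical, and the rows are made up for illustration.

```python
import pandas as pd


def completion_rate_by(
    loans: pd.DataFrame, group_col: str, outcome_col: str = "LoanStatus"
) -> pd.Series:
    """Share of completed loans within each value of group_col, highest first."""
    completed = loans[outcome_col].eq("Completed")
    return completed.groupby(loans[group_col]).mean().sort_values(ascending=False)


# Made-up rows; the real analysis would pass the 55,089-row subset instead.
demo = pd.DataFrame(
    {
        "Term": [12, 12, 36, 36, 36, 60],
        "LoanStatus": [
            "Completed",
            "Completed",
            "Completed",
            "Defaulted",
            "Chargedoff",
            "Chargedoff",
        ],
    }
)

rates = completion_rate_by(demo, "Term")
print(rates)
```

The same helper could be pointed at any categorical column, for example listing category or homeownership status, to compare completion rates across groups.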

Key Insights for Presentation

For the presentation, the following key insights will be highlighted:

  1. Variables suspected to influence loan outcome were selected and subsetted for analysis.
  2. The variable of interest, loan outcome, was introduced along with other variables that individually influence its values, such as listing category, borrower homeownership status, number of recommendations, and debt-to-income ratio.
  3. These variables were explored and plotted against each other using clustered bar charts, faceted histograms, point plots, and scatter plots. Patterns were observed, such as borrowers with good DTI and a low number of recommendations being more likely to complete their loans. Additionally, a select group of homeowners who took loans above $25,000 without defaulting was identified.
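A clustered bar chart of this kind is typically built from a cross-tabulation of one categorical variable against the loan outcome. The sketch below shows that table for homeownership status. The column names `IsBorrowerHomeowner` and `LoanStatus` are assumptions, and the rows are made up.

```python
import pandas as pd

# Made-up rows standing in for the subsetted loan data.
demo = pd.DataFrame(
    {
        "IsBorrowerHomeowner": [True, True, True, False, False, False],
        "LoanStatus": [
            "Completed",
            "Completed",
            "Defaulted",
            "Completed",
            "Chargedoff",
            "Chargedoff",
        ],
    }
)

# Rows are homeowner groups, columns are outcomes; normalize="index" turns the
# counts into within-group shares so groups of different sizes are comparable.
shares = pd.crosstab(
    demo["IsBorrowerHomeowner"], demo["LoanStatus"], normalize="index"
)
print(shares)
```

With Matplotlib installed, calling `shares.plot(kind="bar")` on this table draws one cluster of bars per homeowner group, with one bar per outcome.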

By presenting these key insights, the audience will gain a clear understanding of the factors that contribute to loan completion and the potential predictors of loan outcomes.
