Developed a Decision Tree model (84.07% accuracy) for a finance company that automates credit score categorization (High Risk, Low Risk) and reduces manual effort. Identified the key drivers and recommended focusing lending on the top 20% of low-risk customers, which raises the chance of successful repayment by a factor of 1.7.
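A figure such as "1.7 times" for the top 20% of customers is usually a lift: the repayment rate inside the top-ranked slice divided by the repayment rate across all customers. The sketch below shows that arithmetic on invented toy data. The function name `top_fraction_lift`, the score and outcome lists, and the assumption that customers are ranked by a model-estimated low-risk score are all illustrative and not taken from the project.

```python
def top_fraction_lift(scores, outcomes, fraction=0.2):
    """Lift of the top `fraction` of customers, ranked by score, over the base rate.

    scores   -- hypothetical model-estimated probability of being low risk
    outcomes -- 1 if the customer actually repaid, else 0
    """
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("scores and outcomes must be non-empty and of equal length")
    base_rate = sum(outcomes) / len(outcomes)
    if base_rate == 0:
        raise ValueError("lift is undefined when no customer repaid")
    # Rank customers from highest to lowest score and keep the top slice.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:top_n]) / top_n
    return top_rate / base_rate


# Toy data, invented for illustration: 10 customers, 5 of whom repaid.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
outcomes = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]
lift = top_fraction_lift(scores, outcomes, fraction=0.2)
print(f"Lift in the top 20%: {lift:.2f}")
```

On this toy data both customers in the top 20% repaid while the overall repayment rate is 50%, so the lift is 2.0. A lift of 1.7 would be read the same way.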
About the dataset
1.1. Problem Statement
Over the years, a global finance company has collected basic bank details and a large amount of credit-related information about its customers. Management wants an intelligent system that sorts people into credit score brackets and so reduces manual effort. The business value lies in knowing each customer's credit classification: using this dataset, we classify each customer's ‘credit score’ from the other variables, such as number of bank accounts, interest rate, annual income, and outstanding debt.
1.2. Task
Given a person’s credit-related information, build a machine learning model that can classify the credit score into categories of ‘Bad’, ‘Standard’ and ‘Good’.
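As a rough illustration of this task, the sketch below trains a scikit-learn `DecisionTreeClassifier` to predict the three brackets. It does not reproduce the project's pipeline: the data is synthetic, the three features are stand-ins named after dataset columns, the labeling rule is invented so the tree has something to learn, and `max_depth=5` is an arbitrary choice.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 600

# Synthetic stand-ins for three numeric columns (not the real data).
outstanding_debt = rng.uniform(0, 5000, n)
interest_rate = rng.uniform(1, 35, n)
num_delayed_payment = rng.integers(0, 25, n)
X = np.column_stack([outstanding_debt, interest_rate, num_delayed_payment])

# Invented labeling rule: more debt, higher rates and more delays mean a worse bracket.
risk = outstanding_debt / 5000 + interest_rate / 35 + num_delayed_payment / 25
y = np.where(risk < 1.0, "Good", np.where(risk < 2.0, "Standard", "Bad"))

# Hold out a stratified test set so every bracket appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = DecisionTreeClassifier(max_depth=5, random_state=0)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy on synthetic data: {accuracy:.2%}")
```

The accuracy printed here says nothing about the real dataset. It only confirms that the train, predict and score steps fit together.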
1.3. Dataset size
100,000 Rows × 28 Columns
1.4. Column descriptions
Column Name | Description |
---|---|
ID | A unique identifier for each record in the dataset |
Customer_ID | A unique identifier for each customer |
Month | The month in which the record was created or updated |
Name | The name of the individual |
Age | The age of the individual |
SSN | The Social Security Number |
Occupation | The occupation or job title of the individual |
Annual_Income | The annual income of the individual |
Monthly_Inhand_Salary | The monthly salary the individual receives |
Num_Bank_Accounts | The number of bank accounts the individual holds |
Num_Credit_Card | The number of credit cards the individual owns |
Interest_Rate | The interest rate on the individual's primary loan |
Num_of_Loan | The number of loans the individual currently has |
Type_of_Loan | The type of loan |
Delay_from_due_date | The number of days the individual is delayed on their loan payment |
Num_of_Delayed_Payment | The number of payments the individual has delayed |
Changed_Credit_Limit | The change in the individual's credit limit |
Num_Credit_Inquiries | The number of inquiries made into the individual's credit |
Credit_Mix | The types of credit the individual has |
Outstanding_Debt | The total amount of debt the individual currently has |
Credit_Utilization_Ratio | The ratio of current credit card balances to credit limits |
Credit_History_Age | The age of the individual's oldest credit line |
Payment_of_Min_Amount | Whether the individual paid only the minimum amount due |
Total_EMI_per_month | The total monthly EMI (Equated Monthly Installment) the individual is responsible for |
Amount_invested_monthly | The amount the individual invests on a monthly basis |
Payment_Behaviour | The individual's payment behavior |
Monthly_Balance | The average monthly balance in the individual's bank accounts |
Credit_Score | The individual's credit score bracket (the classification target) |
1.5. Link to the original data
https://www.kaggle.com/datasets/parisrohan/credit-score-classification?select=train.csv
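Once `train.csv` has been downloaded from the link above, a quick sanity check is to confirm that its shape matches the size reported in section 1.3. The sketch below uses only the standard library. `csv_shape` is a hypothetical helper name, and the tiny temporary file with invented rows stands in for the real download so the example runs on its own.

```python
import csv
import os
import tempfile


def csv_shape(path):
    """Return (data_rows, columns) for a CSV file that has a header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        n_rows = sum(1 for _ in reader)  # count records without keeping them in memory
    return n_rows, len(header)


# Stand-in file with invented rows; replace sample_path with the path to train.csv.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline="", encoding="utf-8"
) as tmp:
    writer = csv.writer(tmp)
    writer.writerow(["ID", "Customer_ID", "Credit_Score"])
    writer.writerow(["row_1", "customer_a", "Good"])
    writer.writerow(["row_2", "customer_a", "Standard"])
sample_path = tmp.name

sample_shape = csv_shape(sample_path)
os.remove(sample_path)
print(sample_shape)
```

For the real file, `csv_shape("train.csv")` should return `(100000, 28)` if the download matches the size stated in section 1.3.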
1.6. Link to the column names dictionary