Develop a comprehensive machine learning model to predict fraudulent credit card transactions. This project includes data analysis, cleaning, exploratory data analysis (EDA), model training, evaluation, and comparison, with detailed visualizations and metrics.

Jotis86/Credit-Card-Fraud-Detection-Project

📊 Credit Card Fraud Detection Project

🎯 Objective

  • The objective of this project is to develop a machine learning model that can predict whether a credit card transaction is fraudulent or not. 🕵️‍♂️💳

⚙️ Functionality

  • Data analysis and cleaning 🧹
  • Data exploration and visualization 🔍📊
  • Training multiple machine learning models 🤖
  • Evaluating and comparing models 📈
  • Visualizing results and metrics 📉

🛠️ Tools Used

  • Python 🐍
  • Pandas for data manipulation 🐼
  • NumPy for numerical operations 🔢
  • Matplotlib and Seaborn for data visualization 📊🎨
  • Scikit-learn for machine learning modeling and evaluation 🤖

🛤️ Development Process

  1. Data Analysis and Cleaning 🧹
     • Data Loading: Import the credit card transactions dataset 📥
     • Basic Analysis: Explore the structure and basic statistics of the dataset 📊
     • Duplicate Check: Identify and remove duplicate rows 🗑️
     • Data Cleaning: Check for and handle missing values 🚫
  2. Exploratory Data Analysis (EDA) 🔍
     • Target Variable Distribution: Visualize the distribution of fraudulent and non-fraudulent transactions 📊
     • Correlation Matrix: Analyze the correlation between variables 🔗
     • Feature Distribution: Visualize the distribution of each feature 📈
  3. Data Preprocessing 🧪
     • Feature and Target Separation: Split the dataset into features (X) and target variable (y) ✂️
     • Dataset Splitting: Divide the data into training and testing sets 🧩
     • Feature Scaling: Standardize the features so that scale-sensitive models such as Logistic Regression and SVM are not dominated by large-valued features 📏
  4. Machine Learning Modeling 🤖
     • Logistic Regression: Train and evaluate a logistic regression model 📉
     • Random Forest: Train and evaluate a Random Forest model 🌳
     • Support Vector Machine (SVM): Train and evaluate an SVM model 🧠
  5. Model Evaluation and Comparison 📈
     • ROC Curve and AUC: Compare models using the ROC curve and area under the curve (AUC) 📊
     • Precision-Recall Curve: Compare the precision-recall trade-off of the models across decision thresholds 📉
     • Additional Metrics: Calculate precision, recall, F1-score, and accuracy for each model 📏
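
The steps above can be sketched end to end roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is a synthetic stand-in generated with `make_classification`, the target column name `Class` is hypothetical, and the split ratio, random seeds, and model hyperparameters are all assumptions. Fitting the scaler on the training split only is one common way to avoid leaking test-set statistics; the notebook may do this differently.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, average_precision_score, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for the transactions table (assumption:
# numeric features plus a binary target; "Class" is a hypothetical name).
X_raw, y_raw = make_classification(n_samples=2000, n_features=10,
                                   weights=[0.95, 0.05], random_state=42)
df = pd.DataFrame(X_raw, columns=[f"f{i}" for i in range(10)])
df["Class"] = y_raw
df = pd.concat([df, df.head(20)], ignore_index=True)  # inject duplicates for the demo

# 1. Cleaning: drop duplicate rows and rows with missing values.
df = df.drop_duplicates().dropna()

# 3. Preprocessing: separate X and y, split with stratification, then scale.
X, y = df.drop(columns="Class"), df["Class"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
scaler = StandardScaler().fit(X_train)  # statistics come from the training split only
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# 4. Modeling: the three model families listed above, with illustrative settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "SVM": SVC(random_state=42),
}

# 5. Evaluation: threshold metrics from predictions, ranking metrics from scores.
rows = []
for name, model in models.items():
    model.fit(X_train_s, y_train)
    pred = model.predict(X_test_s)
    # Use class probabilities when available, otherwise the decision function.
    if hasattr(model, "predict_proba"):
        score = model.predict_proba(X_test_s)[:, 1]
    else:
        score = model.decision_function(X_test_s)
    rows.append({
        "model": name,
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
        "accuracy": accuracy_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, score),
        "avg_precision": average_precision_score(y_test, score),
    })

results = pd.DataFrame(rows).set_index("model")
print(results.round(3))
```

On heavily imbalanced data, accuracy alone can look high even when most fraud cases are missed, which is why the sketch reports it next to precision, recall, and F1.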

📈 Results

  • Logistic Regression: Good performance in precision but lower recall 📉
  • Random Forest: Best balance between precision and recall 🌳
  • SVM: High precision but lower recall compared to Random Forest 🧠

📊 Visualizations

  • Target Variable Distribution: Bar chart showing the distribution of fraudulent and non-fraudulent transactions 📊
  • Correlation Matrix: Heatmap showing the correlation between variables 🔗
  • Feature Distribution: Histograms showing the distribution of each feature 📈
  • ROC Curve: Plot comparing the ROC curves of the models 📉
  • Precision-Recall Curve: Plot comparing the precision and recall of the models 📊
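
As a rough illustration of the last two plots, the sketch below draws ROC and precision-recall curves for two models side by side. It is not the notebook's plotting code: the data is synthetic (`make_classification`), only two of the three models are shown to keep it short, and the figure layout, labels, and hyperparameters are assumptions.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, precision_recall_curve, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the real transactions (assumption).
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

fig, (ax_roc, ax_pr) = plt.subplots(1, 2, figsize=(11, 4))
roc_aucs = {}
for name, model in models.items():
    # Probability of the positive (fraud) class for each test transaction.
    scores = model.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    fpr, tpr, _ = roc_curve(y_test, scores)
    precision, recall, _ = precision_recall_curve(y_test, scores)
    roc_aucs[name] = auc(fpr, tpr)
    ax_roc.plot(fpr, tpr, label=f"{name} (AUC = {roc_aucs[name]:.2f})")
    ax_pr.plot(recall, precision, label=name)

# Diagonal reference line: the ROC curve of a random classifier.
ax_roc.plot([0, 1], [0, 1], linestyle="--", color="gray", label="Chance")
ax_roc.set(xlabel="False positive rate", ylabel="True positive rate", title="ROC Curve")
ax_pr.set(xlabel="Recall", ylabel="Precision", title="Precision-Recall Curve")
ax_roc.legend()
ax_pr.legend()
fig.tight_layout()
# Call plt.show() in an interactive session, or fig.savefig(...) to write a file.
```

With a rare positive class, the precision-recall plot tends to separate models more clearly than the ROC plot, because it is not diluted by the large number of true negatives.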

🗂️ Project Structure

  • Notebook

📝 Conclusions

  • Random Forest is the most balanced model for detecting credit card fraud 🌳
  • Feature standardization and duplicate removal are crucial steps in data preprocessing 🧹
  • Evaluating multiple metrics is essential for a comprehensive model comparison 📏

📬 Contact

  • For any inquiries or collaborations, you can contact me at: jotaduranbon.com 📧
