Tumor Prediction with Machine Learning

This project uses machine learning to predict whether a tumor is benign or malignant from biomedical data. Several classification models are compared with cross-validation, and the best-performing one is then tuned.

Project Overview

The project follows these steps:

Data Loading: Downloading the dataset from OpenML.
Data Exploration and Preparation:
- Analyzing dataset dimensions.
- Assigning descriptive names to columns.
- Visualizing the distribution of target classes.
Model Training:
- Training and evaluating several classification models:
  - Random Forest
  - Logistic Regression
  - Support Vector Machine (SVM)
  - Gradient Boosting
- Cross-validation to compare model performance.
Hyperparameter Tuning:
- Using GridSearchCV to optimize hyperparameters of the Gradient Boosting model, selected as the best-performing model.
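
The model comparison step above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn's bundled breast cancer dataset as an offline stand-in for the OpenML download, and the 5-fold setting, the scaling pipelines, and the variable names (`models`, `cv_scores`) are choices made for this example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: bundled dataset used as a stand-in; the notebook loads from OpenML.
X, y = load_breast_cancer(return_X_y=True)
print("Dimensions:", X.shape)
print("Samples per class:", np.bincount(y))

# The four model families named in this README. Logistic Regression and SVM
# are wrapped in a scaling pipeline because both are sensitive to feature scale.
models = {
    "Random Forest": RandomForestClassifier(random_state=42),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}

# Mean 5-fold cross-validated accuracy for each model.
cv_scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

# Print the models from highest to lowest mean accuracy.
for name, score in sorted(cv_scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

The ranking on this stand-in data may differ from the one reported in the notebook, since the dataset source and preprocessing are not identical.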

Requirements

To run the project, you need the following dependencies:

Python 3.7+
Python libraries:
- openml
- pandas
- numpy
- scikit-learn
- seaborn
- matplotlib

Install the dependencies using:

pip install openml pandas numpy scikit-learn seaborn matplotlib

Notebook Structure

The project is documented in a Jupyter notebook. The main structure includes:

Introduction: Brief description of the problem and objectives.
Data Loading and Exploration:
- Loading the dataset from OpenML.
- Initial exploratory analysis (dimensions, data types, class distribution).
Model Training and Evaluation:
- Comparing different models using metrics such as precision, recall, and F1-score.
- Visualizing results with plots.
Best Model Optimization:
- Hyperparameter tuning with GridSearchCV to improve the accuracy of the Gradient Boosting model.
Conclusions: Summary of results and future steps.
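
As a rough sketch of the optimization step, the snippet below runs GridSearchCV over a Gradient Boosting classifier. The parameter grid, the 80/20 split, the 3-fold setting, and the stand-in dataset (scikit-learn's bundled breast cancer data rather than the OpenML download) are all assumptions for illustration; the notebook's actual grid may differ.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split

# Assumption: bundled dataset used as a stand-in; the notebook loads from OpenML.
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, stratify=data.target, random_state=42
)

# Hypothetical grid, kept small so the search finishes quickly.
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

# Exhaustive search over the grid, scored by cross-validated accuracy.
search = GridSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Best CV accuracy:", round(search.best_score_, 3))

# Precision, recall, and F1-score per class on the held-out test split.
print(classification_report(
    y_test, search.predict(X_test), target_names=data.target_names
))
```

Holding out a test split before the search keeps the final classification report independent of the data used to pick the hyperparameters.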

Usage

Follow these steps to run the notebook:

Clone the repository:

git clone https://github.com/Unai1117/BreastCancerBenignOrMalignant.git

Navigate to the project directory:
```
cd BreastCancerBenignOrMalignant
```
Open the notebook in Jupyter:
```
jupyter notebook tumor_prediction.ipynb
```
Execute the cells sequentially to reproduce the analysis and results.

Results

The optimized model (Gradient Boosting) achieved the following results:

Key Metrics:
- Accuracy: 98%
- Recall: 97%
- F1-Score: 97%

Detailed insights can be found in the hyperparameter tuning section and the classification reports generated by the models.
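
For readers unfamiliar with how these metrics relate, the sketch below computes accuracy, precision, recall, and F1-score from the four cells of a binary confusion matrix. The helper `binary_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not the project's results.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts with "malignant" treated as the positive class.
metrics = binary_metrics(tp=40, fp=2, fn=3, tn=69)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a tumor-screening setting, recall on the malignant class is often the metric to watch, because a false negative (a missed malignant tumor) is usually costlier than a false positive.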

Contributions

Contributions are welcome. To improve the project or add new features, follow these steps:

Fork the repository.
Create a branch for your feature:
```
git checkout -b new_feature
```
Commit your changes:
```
git commit -m "Description of changes"
```
Push your changes and submit a pull request.
