All the Data Science and Machine Learning projects I have done will be available here.
🚀 Breast Cancer Prediction Using R: A Comprehensive Analysis 🌟
🛠️ Project Overview
In this project, I analyzed a breast cancer dataset and predicted diagnoses (benign or malignant) using the K-Nearest Neighbors (KNN) algorithm.
🔍 Data Analysis and Visualization
Data Preparation: Loaded and processed the dataset, transforming categorical diagnosis labels into factors for analysis.
Visualization:
Pie Chart: Illustrated the distribution of benign and malignant diagnoses.
Histogram: Showed the distribution of the normalized area_mean feature.
Pair Plot: Visualized relationships between selected features and diagnosis categories.
Data Normalization: Normalized the dataset features to prepare for model training.
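The normalization step could look like the following minimal sketch. The README does not say which scaling method was used, so min-max scaling is an assumption here, and the built-in iris measurements stand in for the breast cancer features, which are not shown. The names normalize and features_n are hypothetical.

```r
# Min-max scaling: maps a numeric vector onto the interval [0, 1].
# Assumption: the project does not state its normalization method.
normalize <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    return(rep(0, length(x)))  # constant column: avoid division by zero
  }
  (x - rng[1]) / (rng[2] - rng[1])
}

# Stand-in data: the four numeric iris columns replace the real features.
features <- iris[, 1:4]

# Apply the scaling column by column and rebuild a data frame.
features_n <- as.data.frame(lapply(features, normalize))

# Every column should now span exactly 0 to 1.
print(sapply(features_n, range))
```

Scaling matters for KNN because the algorithm relies on distances, so a feature with a large numeric range would otherwise dominate the others.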
🔬 Model Training and Evaluation
Train-Test Split: Divided the data into training and testing sets.
KNN Algorithm: Applied the KNN algorithm to predict diagnoses.
Performance Evaluation:
Cross Table: Assessed the model's performance in terms of prediction accuracy.
Confusion Matrix: Provided insights into true positives, false positives, true negatives, and false negatives.
Accuracy: 98%
Sensitivity: 100%
Specificity: 91.3%
Kappa: 0.9418
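As a rough illustration of how these steps and metrics fit together, here is a minimal sketch using knn() from the class package. Several things are assumptions, since the README does not give them: the data (two iris species stand in for the two diagnosis classes), the 80/20 split, the seed, k = 5, and the choice of positive class. The metrics are computed by hand from a base R table rather than with gmodels or caret, so this will not reproduce the figures reported above.

```r
library(class)  # provides knn()

# Stand-in binary problem: versicolor vs. virginica instead of benign vs. malignant.
dat <- droplevels(subset(iris, Species != "setosa"))
normalize <- function(x) (x - min(x)) / (max(x) - min(x))
features <- as.data.frame(lapply(dat[, 1:4], normalize))
labels <- dat$Species

# Train-test split (80/20 and the seed are assumptions).
set.seed(42)
train_idx <- sample(nrow(features), size = round(0.8 * nrow(features)))

# KNN prediction (k = 5 is an assumption).
pred <- knn(train = features[train_idx, ],
            test  = features[-train_idx, ],
            cl    = labels[train_idx],
            k     = 5)

# Confusion matrix: rows are predictions, columns are true labels.
cm <- table(Predicted = pred, Actual = labels[-train_idx])
print(cm)

# Treat "virginica" as the positive class (an arbitrary choice here).
positive <- "virginica"
negative <- "versicolor"
tp <- cm[positive, positive]  # predicted positive, actually positive
tn <- cm[negative, negative]  # predicted negative, actually negative
fp <- cm[positive, negative]  # predicted positive, actually negative
fn <- cm[negative, positive]  # predicted negative, actually positive

n <- sum(cm)
accuracy    <- (tp + tn) / n
sensitivity <- tp / (tp + fn)  # share of actual positives that were found
specificity <- tn / (tn + fp)  # share of actual negatives that were found

# Cohen's kappa: agreement beyond what chance alone would produce.
p_expected  <- sum(rowSums(cm) * colSums(cm)) / n^2
cohen_kappa <- (accuracy - p_expected) / (1 - p_expected)

print(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity, kappa = cohen_kappa))
```

Kappa is worth reporting next to accuracy because it discounts the agreement a classifier would get by chance, which matters when one class is much more common than the other.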
📊 Tools and Libraries
Utilized libraries including ggplot2, caret, class, gmodels, lattice, and GGally for visualization and analysis.
🌟 Exploring Naive Bayes Classification with the Iris Dataset in R 🌟
I'm excited to share my recent project, where I implemented the Naive Bayes classification algorithm using the Iris dataset! 🌸
📊 Highlights:
Dataset: The Iris dataset consists of 150 samples from three species of iris flowers (Setosa, Versicolor, and Virginica), with four features: sepal length, sepal width, petal length, and petal width.
Objective: To classify the species of iris flowers based on their features using the Naive Bayes algorithm.
🔍 Key Steps:
Data Preparation: Loaded the dataset and explored its structure.
Data Partitioning: Used caret's createDataPartition to split the data into training (70%) and testing (30%) sets.
Model Training: Trained the Naive Bayes model using the e1071 package.
Model Evaluation: Evaluated the model's performance using a confusion matrix.
Visualization: Created a confusion matrix heatmap to visualize the model's performance.
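The key steps above can be sketched roughly as follows, using createDataPartition and confusionMatrix from caret and naiveBayes from e1071. The seed, the variable names, and the use of ggplot2 with geom_tile for the heatmap are assumptions, not details taken from the project, so the accuracy this prints may differ from the reported result.

```r
library(caret)    # createDataPartition(), confusionMatrix()
library(e1071)    # naiveBayes()
library(ggplot2)  # heatmap (assumed plotting library)

# Data preparation: iris ships with R.
str(iris)

# Data partitioning: stratified 70/30 split on the species label.
set.seed(123)  # arbitrary seed, for reproducibility only
idx <- createDataPartition(iris$Species, p = 0.7, list = FALSE)
train_set <- iris[idx, ]
test_set  <- iris[-idx, ]

# Model training: Naive Bayes with all four measurements as predictors.
model <- naiveBayes(Species ~ ., data = train_set)

# Model evaluation: predict on the held-out rows and tabulate the results.
pred <- predict(model, newdata = test_set)
cm <- confusionMatrix(data = pred, reference = test_set$Species)
print(cm$table)
print(cm$overall["Accuracy"])

# Visualization: confusion matrix as a heatmap.
cm_df <- as.data.frame(cm$table)  # columns: Prediction, Reference, Freq
heatmap_plot <- ggplot(cm_df, aes(x = Reference, y = Prediction, fill = Freq)) +
  geom_tile() +
  geom_text(aes(label = Freq), colour = "white") +
  labs(title = "Naive Bayes confusion matrix", fill = "Count")
print(heatmap_plot)
```

Stratifying the split on Species keeps the three classes equally represented in both subsets, which is useful with only 50 samples per species.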
📈 Results:
Achieved a model accuracy of 91%.