Brain-stroke-classification

This repository contains the implementation and analysis of a machine learning project aimed at classifying brain strokes into ischemic and hemorrhagic types. Using Random Forest, K-Nearest Neighbors (KNN), and Decision Tree algorithms, I achieve high overall classification accuracy, which points to a potential tool for aiding the rapid diagnosis and treatment of strokes.

Dataset

I utilized a publicly available dataset from Kaggle, which includes demographic information, medical history, lifestyle factors, and socioeconomic variables. Key features include:

Age, Gender, Hypertension, Heart disease, Blood sugar levels, Body Mass Index (BMI), Smoking status, Work type, Residence type

The target variable is the presence or absence of a stroke.
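
A minimal sketch of how these features might be prepared for the models, assuming pandas. The column names (`avg_glucose_level`, `residence_type`, and so on), the helper `prepare_features`, and the median fill for missing BMI values are assumptions for illustration; the actual Kaggle file may differ. The four rows are made-up values, not records from the dataset.

```python
import pandas as pd

# Made-up rows for illustration only; column names are assumptions.
df = pd.DataFrame({
    "age": [67, 45, 80, 33],
    "gender": ["Male", "Female", "Female", "Male"],
    "hypertension": [0, 1, 0, 0],
    "heart_disease": [1, 0, 1, 0],
    "avg_glucose_level": [228.7, 95.1, 105.9, 88.0],
    "bmi": [36.6, None, 32.5, 24.0],
    "smoking_status": ["formerly smoked", "never smoked", "smokes", "never smoked"],
    "work_type": ["Private", "Self-employed", "Private", "Govt_job"],
    "residence_type": ["Urban", "Rural", "Rural", "Urban"],
    "stroke": [1, 0, 1, 0],
})

CATEGORICAL = ["gender", "smoking_status", "work_type", "residence_type"]


def prepare_features(frame):
    """Return (X, y): a numeric feature matrix and the binary stroke target."""
    frame = frame.copy()
    # Assumption: missing BMI values are filled with the column median.
    frame["bmi"] = frame["bmi"].fillna(frame["bmi"].median())
    y = frame.pop("stroke")
    # One-hot encode categorical columns so all features are numeric.
    X = pd.get_dummies(frame, columns=CATEGORICAL)
    return X, y


X, y = prepare_features(df)
print(X.columns.tolist())
```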

Data Analysis

Age Distribution: Stroke incidence increases with age.

Smoking Status: Higher stroke prevalence among smokers.

Heart Disease: Strong correlation between heart disease and stroke incidence.

Correlation Analysis: Explored relationships between features to understand stroke risk factors.
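
A correlation analysis of this kind can be sketched with pandas as follows. The data here is synthetic (stroke probability is made to rise with age purely for illustration), and the column names are assumptions, so the printed values say nothing about the real dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000

age = rng.uniform(20, 90, n)
# Synthetic assumption: stroke probability grows linearly with age.
stroke = (rng.random(n) < (age - 20) / 140).astype(int)

df = pd.DataFrame({
    "age": age,
    "bmi": rng.normal(28, 5, n),  # generated independently of stroke
    "stroke": stroke,
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()

# Rank features by their correlation with the target.
print(corr["stroke"].sort_values(ascending=False))
```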

Models and Performance

I evaluated three machine learning models:

Random Forest

Accuracy: 95.3%, Precision: 0.478, Recall: 0.498, F1 Score: 0.488

K-Nearest Neighbors (KNN)

Accuracy: 95.6%, Precision: 0.478, Recall: 0.499, F1 Score: 0.489

Decision Tree

Accuracy: 95.5%, Precision: 0.955, Recall: 0.478, F1 Score: 0.489
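
A comparison like the one above could be set up with scikit-learn roughly as follows. This is a sketch, not the project's actual pipeline: it uses a synthetic dataset with about 5% positives as a stand-in for the stroke data, default hyperparameters, an 80/20 stratified split, and macro-averaged metrics, all of which are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the stroke data: roughly 5% positive class (assumption).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    # Macro averaging weights both classes equally, regardless of their size.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, pred, average="macro", zero_division=0
    )
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
    print(name, results[name])
```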

Conclusion

All three models achieved high overall accuracy (above 95%), but class imbalance limited their ability to identify stroke cases: most of the reported precision, recall, and F1 values fall between 0.48 and 0.50. Future work will focus on addressing this imbalance and incorporating additional clinical data to improve model performance.
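
To illustrate why accuracy alone can mislead on imbalanced data, the sketch below scores a baseline that always predicts "no stroke". It is a worked example under stated assumptions, not a reproduction of the results above: the test set of 1,000 samples with 44 stroke cases is hypothetical, the metrics are assumed to be macro-averaged (the averaging used in this project is not stated), and `macro_metrics` is a helper written for this example.

```python
def macro_metrics(y_true, y_pred, labels=(0, 1)):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Undefined ratios (no predictions or no samples) are scored as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    n = len(labels)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical test set: 956 non-stroke samples and 44 stroke samples.
y_true = [0] * 956 + [1] * 44
# Baseline that never predicts a stroke.
y_pred = [0] * 1000

accuracy, precision, recall, f1 = macro_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Under these assumptions the baseline reaches 0.956 accuracy with macro precision 0.478, recall 0.500, and F1 of about 0.489, without detecting a single stroke case. Reporting per-class recall for the stroke class alongside accuracy makes this kind of failure visible.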