AOUAD Mohamed, Jad
Mini Projet Intro ML

Repository: https://gitlab.imt-atlantique.fr/m21aouad/mini-projet-intro-ml.git



Binary Classification Workflow

Introduction
This project applies binary classification to two datasets: Banknote Authentication and Chronic Kidney Disease. It covers the full machine learning workflow, from data import and preprocessing to model training, validation, and analysis of the results.

Datasets


Banknote Authentication Dataset: UCI Machine Learning Repository


Chronic Kidney Disease Dataset: Kaggle


Installation
To run this project, you need Python installed on your machine along with the following libraries:

pandas
NumPy
scikit-learn
Matplotlib
seaborn

You can install these packages using pip:

pip install pandas numpy scikit-learn matplotlib seaborn


File Description


binary_classification_workflow.py: Contains all functions for data preprocessing, model training, validation, and result display.

main.ipynb: Jupyter Notebook demonstrating the application of the workflow to the datasets.


Usage

Clone the repository from GitLab: https://gitlab.imt-atlantique.fr/m21aouad/mini-projet-intro-ml.git

Download the datasets and place them in the project directory.
Open main.ipynb in Jupyter and run it to see the workflow in action.
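
Once the datasets are in the project directory, loading one of them might look like the sketch below. It is an illustration rather than the notebook's actual code: the column names, the helper load_banknote, and the file name data_banknote_authentication.txt are assumptions based on how the UCI repository distributes the Banknote Authentication data (comma-separated, no header row, class label last). The two sample rows are made up so the snippet runs without the real file.

```python
import io

import pandas as pd

# Assumed column names: the UCI file has no header row, so we supply our own.
BANKNOTE_COLUMNS = ["variance", "skewness", "curtosis", "entropy", "class"]


def load_banknote(source):
    """Read a headerless banknote CSV (path or file-like) into a DataFrame."""
    return pd.read_csv(source, header=None, names=BANKNOTE_COLUMNS)


# Two made-up rows standing in for the real file; with the dataset downloaded,
# pass its path instead, e.g. load_banknote("data_banknote_authentication.txt").
sample = io.StringIO("3.6,8.7,-2.8,-0.4,0\n-1.4,-3.5,5.1,-0.6,1\n")
df = load_banknote(sample)

# Separate the four features from the binary target.
X = df.drop(columns="class")
y = df["class"]
```

The Chronic Kidney Disease data from Kaggle has a different layout (a header row and categorical columns), so it would need its own loader.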


Functions Overview


Data Preprocessing: Functions for cleaning, filling missing values, scaling, and normalizing data.

Model Training and Validation: Functions for splitting data, handling categorical features, selecting features, and comparing model performance.

Utility Functions: Helpers for checking skewness, identifying outliers, and visualizing data.
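
The function names in binary_classification_workflow.py are not listed here, so the sketch below uses scikit-learn directly to show how the preprocessing, splitting, and model comparison steps described above could fit together. Everything in it is an assumption made for illustration: the synthetic data, the median imputation, the two candidate classifiers, and the helper build_model are not taken from the project.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a real dataset (assumption: numeric features only).
X, y = make_classification(n_samples=300, n_features=6, random_state=0)

# Blank out roughly 5% of the values to mimic missing entries.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Hold out 25% of the rows for testing, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def build_model(classifier):
    """Chain median imputation, standardization, and a classifier."""
    return make_pipeline(
        SimpleImputer(strategy="median"), StandardScaler(), classifier
    )


# Hypothetical candidates; the project may compare different models.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

scores = {}
for name, clf in candidates.items():
    model = build_model(clf)
    # 5-fold cross-validated accuracy on the training split.
    cv_accuracy = cross_val_score(model, X_train, y_train, cv=5).mean()
    # Refit on the full training split and score on the held-out rows.
    test_accuracy = model.fit(X_train, y_train).score(X_test, y_test)
    scores[name] = (cv_accuracy, test_accuracy)
    print(f"{name}: cv={cv_accuracy:.3f} test={test_accuracy:.3f}")
```

Wrapping the imputer and scaler in the same pipeline as the classifier means they are fitted only on the training folds during cross-validation, which avoids leaking information from the validation data.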


Acknowledgements
This project is part of the course "Intro ML" at IMT Atlantique.