Titanic Data Preprocessing

Project Overview

This project cleans and preprocesses the Titanic dataset to prepare it for further analysis or machine learning tasks. The dataset contains information about individual passengers; the preprocessing handles missing values, converts categorical data to numerical format, and normalizes numeric features.

About the Project

Dataset used: Titanic - Machine Learning from Disaster (Kaggle)
Tools & Technologies: Python, Pandas, scikit-learn
Main steps performed:
- Dropped columns with excessive missing values
- Filled the remaining missing values with the column mean (numeric columns) or mode (categorical columns)
- Encoded categorical variables (Sex and Embarked) into numeric values
- Normalized numerical columns (Age, Fare, SibSp, Parch)
- Saved the cleaned dataset for future use
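
The steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of `data_preprocessing.py`: the `preprocess` function, the 50% missing-value cutoff, the integer codes chosen for `Sex` and `Embarked`, and the use of `StandardScaler` for normalization are all assumptions made for the example. A small inline sample stands in for the Kaggle CSV so the snippet runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler


def preprocess(df: pd.DataFrame, max_missing: float = 0.5) -> pd.DataFrame:
    """Clean a Titanic-style DataFrame and return a new DataFrame."""
    df = df.copy()

    # 1. Drop columns whose share of missing values exceeds the cutoff.
    #    The 0.5 default is an assumption; the README states no threshold.
    missing_share = df.isna().mean()
    df = df.drop(columns=missing_share[missing_share > max_missing].index)

    # 2. Fill remaining gaps: mean for numeric columns, mode otherwise.
    for col in df.columns:
        if df[col].isna().any():
            if pd.api.types.is_numeric_dtype(df[col]):
                df[col] = df[col].fillna(df[col].mean())
            else:
                df[col] = df[col].fillna(df[col].mode().iloc[0])

    # 3. Encode Sex and Embarked as integers (these codes are assumed).
    df["Sex"] = df["Sex"].map({"male": 0, "female": 1})
    df["Embarked"] = df["Embarked"].map({"S": 0, "C": 1, "Q": 2})

    # 4. Standardize numeric columns to zero mean and unit variance.
    #    StandardScaler is one option; MinMaxScaler would also fit the README.
    num_cols = ["Age", "Fare", "SibSp", "Parch"]
    df[num_cols] = df[num_cols].astype(float)
    df[num_cols] = StandardScaler().fit_transform(df[num_cols])
    return df


# Tiny made-up sample with the same column names as the Kaggle data.
sample = pd.DataFrame({
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, 38.0, np.nan, 35.0],
    "SibSp": [1, 1, 0, 0],
    "Parch": [0, 0, 0, 1],
    "Fare": [7.25, 71.28, 7.92, 8.05],
    "Cabin": [np.nan, "C85", np.nan, np.nan],
    "Embarked": ["S", "C", None, "S"],
})
cleaned = preprocess(sample)
print(cleaned)
```

To persist the result, a call such as `cleaned.to_csv("titanic_cleaned.csv", index=False)` would work; the output filename here is hypothetical.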

How to Use

1. Clone or download this repository.
2. Make sure Python and the required libraries (pandas, scikit-learn) are installed.
3. Run the preprocessing script:
```
python data_preprocessing.py
```
