This project analyzes the Titanic dataset to predict passenger survival using machine learning techniques. It applies extensive feature engineering and compares several classification models.
Random Forest showed the best performance with an accuracy of 84.13%, followed by XGBoost and SVM with 82.73%.
The passenger's title and gender were the most influential factors in survival prediction.
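As a rough illustration of how a title feature can be derived, the sketch below pulls the honorific out of a passenger name. It assumes names follow the "Surname, Title. Given names" layout used in the public Kaggle Titanic data; the helper name `extract_title` and the `"Unknown"` fallback are hypothetical and not necessarily what this project uses.

```python
import re

# Assumption: names look like "Surname, Title. Given names".
# The title is the text between the comma and the first period.
TITLE_PATTERN = re.compile(r",\s*([^.]+)\.")


def extract_title(name: str) -> str:
    """Return the honorific in a passenger name, or "Unknown" if none is found."""
    match = TITLE_PATTERN.search(name)
    return match.group(1).strip() if match else "Unknown"


names = [
    "Braund, Mr. Owen Harris",
    "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
    "Heikkinen, Miss. Laina",
]
titles = [extract_title(n) for n in names]
print(titles)  # ['Mr', 'Mrs', 'Miss']
```

Rare titles are often grouped into a single category before modeling; whether that helps depends on the model and is left out of this sketch.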
First-class passengers had a much higher survival rate (63%) than third-class passengers (24%).
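Per-class survival rates like these are simply the share of survivors within each class. The sketch below shows the calculation on a handful of invented records; the `records` list is toy data for illustration only, not rows from the Titanic dataset, and `survival_rate_by_class` is a hypothetical helper.

```python
from collections import defaultdict

# Toy (pclass, survived) pairs, invented for illustration only.
records = [(1, 1), (1, 1), (1, 0), (2, 1), (2, 0), (3, 0), (3, 0), (3, 1), (3, 0)]


def survival_rate_by_class(rows):
    """Map each passenger class to the fraction of its passengers who survived."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for pclass, survived in rows:
        totals[pclass] += 1
        survivors[pclass] += survived  # survived is 0 or 1
    return {pclass: survivors[pclass] / totals[pclass] for pclass in sorted(totals)}


rates = survival_rate_by_class(records)
for pclass, rate in rates.items():
    print(f"class {pclass}: {rate:.0%}")
```

With a pandas DataFrame, the same figure is typically obtained by grouping on the class column and taking the mean of the survival column.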
The age distributions of survivors and non-survivors differ only slightly.
The model shows a good balance between true positives and true negatives, with relatively few false positives and false negatives.
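To make the four outcome types concrete, here is a minimal sketch that counts them for a binary survival label (1 = survived, 0 = did not survive) and derives accuracy from the counts. The labels and predictions are made up for illustration and are not this project's results; `confusion_counts` is a hypothetical helper.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = survived)."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["tp" if actual == 1 else "fp"] += 1
        else:
            counts["tn" if actual == 0 else "fn"] += 1
    return counts


# Made-up labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
accuracy = (counts["tp"] + counts["tn"]) / len(y_true)
print(counts)    # {'tp': 3, 'tn': 3, 'fp': 1, 'fn': 1}
print(accuracy)  # 0.75
```

Accuracy alone hides how errors split between false positives and false negatives, which is why the four counts are worth reporting alongside it.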
✓ The passenger's title and gender were the most important predictors of survival.
✓ Passenger class had a significant impact on survival chances.
✓ Random Forest outperformed other models with an accuracy of 84.13%.
✓ Family size and fare were also important factors in prediction.