Titanic: Machine Learning from Disaster is the "101"-style introductory machine learning competition that Kaggle has hosted since it started. The task is to predict which passengers survived the disaster, given each individual's age, gender, socio-economic status (class), and various other features.
How to read the graph:
- Each rectangle in the graph represents a passenger on the Titanic. Yellow means that the passenger survived the disaster, and blue means that the passenger did not.
- Multiple people can share the same age, gender, and class values, so I set the opacity of the rectangles to 20%. Places on the graph that appear solid yellow therefore mark groups of passengers with a higher chance of surviving, whereas solid blue indicates danger.
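To see why overlapping rectangles read as solid color, here is a minimal sketch of how 20% opacity accumulates. It assumes the renderer uses standard source-over alpha blending, which is an assumption on my part since the tool that draws the graph is not named here. The function name `stacked_opacity` is made up for illustration.

```python
def stacked_opacity(n_layers, alpha=0.2):
    """Effective opacity of n identical layers drawn on top of each other.

    Assumes standard source-over blending: each layer lets (1 - alpha)
    of whatever is beneath it show through.
    """
    return 1 - (1 - alpha) ** n_layers


# How opaque does a spot look when 1, 2, 5, or 10 passengers overlap?
for n in (1, 2, 5, 10):
    print(n, round(stacked_opacity(n), 3))
```

Under that assumption a single rectangle is 20% opaque, five stacked rectangles reach about 67%, and ten reach about 89%, so a spot only looks solid when many passengers with the same outcome share it.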
Based on this visualization we can see that:
- females (young or old, except those around age 25) and young males (under age 15) from the middle and upper classes tend to survive.
- the overall survival rate for female passengers is higher than that for male passengers.
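The same kind of comparison can be checked numerically by computing a survival rate per group. The sketch below uses a handful of made-up records (not real Titanic data) and a hypothetical helper, `survival_rate_by`, purely to show the calculation; with the competition data you would build the records from its rows instead.

```python
from collections import defaultdict

# Made-up records for illustration only, NOT real Titanic data.
# Each tuple is (sex, passenger_class, survived).
passengers = [
    ("female", 1, True),
    ("female", 3, False),
    ("female", 2, True),
    ("male", 1, False),
    ("male", 3, False),
    ("male", 2, True),
]


def survival_rate_by(records, key_index):
    """Return {group: fraction survived}, grouping on the given tuple index."""
    counts = defaultdict(lambda: [0, 0])  # group -> [survivors, total]
    for record in records:
        group = record[key_index]
        counts[group][0] += int(record[2])  # True counts as 1
        counts[group][1] += 1
    return {group: survived / total for group, (survived, total) in counts.items()}


rates = survival_rate_by(passengers, 0)  # group by sex
print(rates)
```

Passing `1` as `key_index` groups by class instead, which is the other axis the observations above rely on.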
So, without all the drama shown in the classic movie, this visualization basically predicts that Jack will most likely not make it, but Rose will survive...
Update: Jan 14
I made a couple of changes to the visualization over the last few days. A newer version is now available here.