I started working on the Titanic Survival Prediction data set on Kaggle on Dec. 20. I couldn’t figure out how to start the project, and then I found a course on DataCamp that helped me run a basic analysis of it. I followed the DataCamp guide, which includes predicting survival with a Decision Tree and a Random Forest, and completed it the same day. Working in DataCamp was simple since I only had to read the explanations and fill in blanks on the code lines.
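
For reference, here is a minimal sketch of the kind of Decision Tree and Random Forest workflow that guide walks through. The column names (Pclass, Sex, Fare, Survived) come from the Kaggle Titanic data; the file path and model parameters are my own assumptions, not the DataCamp code.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# Load the Kaggle training data (the path is an assumption).
train = pd.read_csv("train.csv")

# Minimal preprocessing: encode Sex as a number so scikit-learn can use it.
train["Sex"] = train["Sex"].map({"male": 0, "female": 1})

features = train[["Pclass", "Sex", "Fare"]]
target = train["Survived"]

# Fit a single decision tree and a random forest on the same features.
tree = DecisionTreeClassifier(max_depth=3, random_state=1).fit(features, target)
forest = RandomForestClassifier(n_estimators=100, random_state=1).fit(features, target)

print("Tree accuracy on training data:", tree.score(features, target))
print("Forest accuracy on training data:", forest.score(features, target))
```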

After the coursework, I read some articles about the Titanic project on Kaggle. They helped me understand the importance of feature engineering. One of them was Titanic Survival Predictions (Beginner). So far, I have skimmed the article and understood what the author did in her work.
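
As an illustration of the kind of feature engineering those articles emphasize, here is a small, hypothetical sketch: extracting a passenger’s title from the Name column and building a family-size feature. The exact features the author used may differ; this is only my own example on the standard Titanic columns.

```python
import pandas as pd

train = pd.read_csv("train.csv")  # path is an assumption

# Pull the title (Mr, Mrs, Miss, ...) out of names like "Braund, Mr. Owen Harris".
train["Title"] = train["Name"].str.extract(r" ([A-Za-z]+)\.", expand=False)

# Group rare titles together so the category is not too sparse.
common = ["Mr", "Mrs", "Miss", "Master"]
train["Title"] = train["Title"].where(train["Title"].isin(common), "Rare")

# Family size: siblings/spouses + parents/children + the passenger themselves.
train["FamilySize"] = train["SibSp"] + train["Parch"] + 1

print(train[["Name", "Title", "FamilySize"]].head())
```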

Today, I tried to build the code on my local computer by myself and find out what weaknesses I have. From this work I realized some problems that did not come up in DataCamp, and most of them are related to my knowledge of the Pandas and NumPy packages, which means I spent a lot of time just manipulating the data.
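
To give a concrete idea of the data manipulation that slowed me down, here is a short sketch of typical Pandas/NumPy cleanup steps for this data set. This is my own illustration of the kind of work involved, assuming the standard train.csv columns, not the exact code I wrote.

```python
import numpy as np
import pandas as pd

train = pd.read_csv("train.csv")  # path is an assumption

# Check how many values are missing in each column.
print(train.isnull().sum())

# Fill missing ages with the median and missing ports with the most common one.
train["Age"] = train["Age"].fillna(train["Age"].median())
train["Embarked"] = train["Embarked"].fillna(train["Embarked"].mode()[0])

# Turn categorical columns into numbers so a model can use them.
train["Sex"] = np.where(train["Sex"] == "male", 0, 1)
train = pd.get_dummies(train, columns=["Embarked"], drop_first=True)

print(train.head())
```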

So, I decided to learn Pandas and NumPy first (it took a while to get used to R before, too) and then read some more work on Kaggle. I don’t expect to find a new methodology that produces a better result on the project, but I can learn something more from the study.