Cleaning Airline Data with Python

In this project, a messy dataset of airline flights is cleaned so that it can be used for further analysis. The project uses the Python pandas library for this purpose.

Continue reading

Exploratory Data Analysis of Facebook Ad Data with Python

In this project, Facebook ad data is analysed by means of an exploratory data analysis. Metrics commonly used in ad analysis are implemented and investigated. It is assumed that business performance is driven by absolute return on advertising spend, and as such the ROAS metric is targeted. This preliminary analysis suggests that further campaigns should focus on the 30-34 age group, particularly males. The advertising spend is least effectively targeted on the 45-49 age group. However, the number of clicks associated with these conclusions is in some cases low, and it is therefore suggested that further work aim to show the statistical significance of targeting these groups.
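As a rough illustration of the ROAS metric mentioned above (revenue divided by advertising spend), the sketch below aggregates it per age group and gender. The record layout, the function name `roas_by_group`, and all values are hypothetical and are not taken from the project's dataset.

```python
from collections import defaultdict

# Hypothetical ad records: (age_group, gender, spend, revenue).
# These values are made up for illustration only.
records = [
    ("30-34", "M", 100.0, 250.0),
    ("30-34", "F", 100.0, 150.0),
    ("45-49", "M", 200.0, 100.0),
    ("45-49", "F", 0.0, 0.0),
]


def roas_by_group(rows):
    """Return {(age_group, gender): revenue / spend}; None where spend is zero."""
    totals = defaultdict(lambda: [0.0, 0.0])  # [total spend, total revenue]
    for age, gender, spend, revenue in rows:
        totals[(age, gender)][0] += spend
        totals[(age, gender)][1] += revenue
    return {
        key: (revenue / spend if spend > 0 else None)
        for key, (spend, revenue) in totals.items()
    }


roas = roas_by_group(records)
for key, value in sorted(roas.items()):
    print(key, value)
```

Groups with zero spend are reported as `None` rather than as zero, so they are not mistaken for poorly performing segments.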

Continue reading

Using Docker and Kubernetes to produce a scalable fraud detection API

In this report, a simple logistic regression model is used to classify credit card transactions as fraudulent or legitimate. A recall of 0.8 and a precision of 0.7 are obtained at a false positive rate of 0.0005. However, for a model to be useful from a business perspective, it is important to understand how to deploy it in the real world. Docker and Kubernetes are investigated for this purpose.
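For readers unfamiliar with the metrics quoted above, the following minimal sketch shows how precision, recall, and false positive rate follow from confusion-matrix counts. The function name `classification_metrics` and the counts are hypothetical and do not reproduce the report's results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute precision, recall and false positive rate from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # flagged cases that are truly fraud
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # fraud cases that were caught
    fpr = fp / (fp + tn) if (fp + tn) else 0.0        # legitimate cases wrongly flagged
    return {"precision": precision, "recall": recall, "fpr": fpr}


# Hypothetical counts chosen for illustration only.
metrics = classification_metrics(tp=8, fp=2, tn=88, fn=2)
print(metrics)
```

Because fraud is rare, the false positive rate can be very small even when precision is modest: the denominator of the former is dominated by the large number of legitimate transactions.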

Continue reading

Visualizing Japan's Population Data using D3.js, Flask and MongoDB

In this project an interactive data visualisation is created using a dataset containing Japan's population between 1871 and 2015. The data is broken down by geographic region in a number of ways, including by island, prefecture, region and capital.

Continue reading