Web Scraping text data with Scrapy and Beautiful Soup

There is a wealth of information available on the internet. Web scraping is the process which enables people to collate and start to organise this data into a more structured format for further analysis. This project investigates this process using data provided by the UK Parliment, in particular the financial interests of members of the House of Commons. Attributes of the html data motivate the use of two Python libraries, Beautiful Soup and Scrapy, for this work.

Continue reading