Use web scraping to collect datasets that describe Rwanda's popularity online, then perform an exploratory data analysis to visualize how that popularity has grown.
This project has three parts: web scraping, analysis, and presentation. I will scrape social media, news outlets, and travel review websites to build the datasets used in the analysis stage, analyze the data in Jupyter Notebooks, and present the findings in PowerPoint.
Using Beautiful Soup, scrape all mentions of 'Rwanda' from the search tab of the BBC website.
Dataset acquired: BBC_Data
Webscraping Program: Rwanda BBC Webscraping.py
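As a rough illustration of this step, the sketch below parses a search-results page with Beautiful Soup and keeps the links whose text mentions 'Rwanda'. The HTML snippet, the choice of `a` tags, and the function name `extract_mentions` are assumptions made for the example; the real BBC markup and the logic in `Rwanda BBC Webscraping.py` may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded search-results page; the real
# BBC markup is not shown in this description and will differ.
SAMPLE_HTML = """
<ul>
  <li><a href="/news/a1">Example headline about Rwanda</a></li>
  <li><a href="/news/a2">Example headline about another topic</a></li>
  <li><a href="/news/a3">Second example mentioning rwanda</a></li>
</ul>
"""


def extract_mentions(html, keyword="Rwanda"):
    """Return (headline, link) pairs for links whose text mentions keyword."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for link in soup.find_all("a", href=True):
        headline = link.get_text(strip=True)
        # Case-insensitive match so "rwanda" and "Rwanda" both count.
        if keyword.lower() in headline.lower():
            results.append((headline, link["href"]))
    return results


if __name__ == "__main__":
    for headline, href in extract_mentions(SAMPLE_HTML):
        print(headline, href)
```

The same function could be pointed at the Al Jazeera or Euro News search pages, provided the tag and attribute choices are adjusted to each site's markup.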
Using Beautiful Soup, scrape all mentions of 'Rwanda' from the search tab of the Al Jazeera website.
Dataset acquired: Aljazeera Data
Webscraping Program: Rwanda Aljazeera Webscraping.py
Using Beautiful Soup, scrape all mentions of 'Rwanda' from the search tab of the Euro News website.
Dataset acquired: Euro News Data
Webscraping Program: Rwanda_euro_news Webscraping.py
Retrieve the information from the subreddit 'r/Rwanda' through the Reddit API, which is faster than scraping the subreddit's pages directly.
Dataset acquired: Reddit Data
Webscraping Program: Rwanda Reddit Scrape.py
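The description does not say which Reddit client is used, so the sketch below is only one possible approach. It assumes Reddit's public JSON listing format, in which posts sit under `data.children`. The endpoint URL, the User-Agent string, and the function names `fetch_listing` and `posts_from_listing` are assumptions for illustration; `Rwanda Reddit Scrape.py` may instead use an authenticated client such as PRAW.

```python
import json
import urllib.request

# Assumed endpoint: Reddit serves a listing as JSON when ".json" is appended.
LISTING_URL = "https://www.reddit.com/r/Rwanda/new.json?limit=100"


def fetch_listing(url=LISTING_URL):
    """Download one page of a subreddit listing (needs network access)."""
    # Reddit asks clients to send a descriptive User-Agent; this one is a placeholder.
    request = urllib.request.Request(
        url, headers={"User-Agent": "rwanda-popularity-sketch/0.1"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def posts_from_listing(listing):
    """Flatten a listing into one dict per post, keeping a few fields."""
    rows = []
    for child in listing["data"]["children"]:
        post = child["data"]
        rows.append({
            "title": post.get("title"),
            "score": post.get("score"),
            "num_comments": post.get("num_comments"),
            "created_utc": post.get("created_utc"),
        })
    return rows
```

To walk through more than one page, the `after` value found in `listing["data"]["after"]` can be sent back as the `after` query parameter of the next request.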
Using Selenium, scrape information on all stays in Rwanda listed on Booking.com.
Dataset acquired: Booking.com Data
Webscraping Program: booking.com scraping.py
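A minimal sketch of how a Selenium scrape of listing cards might look is given below. `SEARCH_URL` and `CARD_SELECTOR` are placeholders, not Booking.com's real search URL or markup, and the helper names are hypothetical; the actual logic lives in `booking.com scraping.py`. Selenium is imported inside `scrape_cards` so that the text-parsing helper can be tried without a browser driver installed.

```python
# Placeholder values: the real search URL and card selector must be taken
# from the live site; they are not given in this description.
SEARCH_URL = "https://www.booking.com/"
CARD_SELECTOR = "div.listing-card"


def card_text_to_row(text):
    """Turn the visible text of one listing card into a small dict."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return {"name": lines[0] if lines else None, "details": lines[1:]}


def scrape_cards(url=SEARCH_URL, selector=CARD_SELECTOR, timeout=15):
    """Open the page in Chrome, wait for cards, and return one row per card."""
    # Imported here so card_text_to_row works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Explicit wait: the cards are rendered by JavaScript after page load.
        cards = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, selector))
        )
        return [card_text_to_row(card.text) for card in cards]
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

The explicit wait is the reason to use Selenium here rather than a plain HTTP request: listing pages of this kind typically build their results in the browser, so the cards are not present in the initial HTML.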
Using Selenium, scrape information on all stays in Rwanda listed on the Airbnb website.
Dataset acquired: AirBnB Data
Webscraping Program: airbnb scraper.py
Using Selenium, scrape information on all stays in Rwanda listed on the TripAdvisor website.
Dataset acquired: TripAdvisor Data
Webscraping Program: Tripadvisor Rwanda stays.py
Perform an initial analysis of the data collected so far. This helps us understand the datasets and prepares us for the main exploratory analysis.
Notebook: Analysis
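A first pass of this kind might look like the pandas sketch below, which checks the size of a dataset, counts missing values, and tallies rows per year. The inline CSV and the column names `title` and `date` are made up for illustration; the actual schema of the scraped datasets is not given here, and the Analysis notebook may take a different approach.

```python
import io

import pandas as pd

# Made-up rows standing in for one scraped dataset; the real column
# names and values are not given in this description.
SAMPLE_CSV = io.StringIO(
    "title,date\n"
    "Example article A,2019-03-01\n"
    "Example article B,2020-07-15\n"
    "Example article C,2020-11-30\n"
    ",2021-01-05\n"
)


def mentions_per_year(df, date_column="date"):
    """Count rows per calendar year, sorted by year."""
    years = pd.to_datetime(df[date_column]).dt.year
    return years.value_counts().sort_index()


df = pd.read_csv(SAMPLE_CSV)
print(df.shape)                # number of rows and columns
print(df.isna().sum())         # missing values per column
print(mentions_per_year(df))   # rows per year, a first look at growth
```

Running the same checks on each dataset in turn gives a quick view of which sources have gaps or inconsistent dates before any plotting is done.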