Have a look at the result in a "blog" format here!
Our project extension examines how the quantity, quality and distribution of a person's friends differ depending on where that person lives. Cultural factors, geography and political decisions may all influence the result, so we analyse the problem across different countries. To do so, we use the Brightkite and Gowalla datasets. From them, we compute each user's nationality and friendship characteristics such as proximity, number of friends and frequency of meeting. With these data, we can test whether regional disparities are statistically significant. We then investigate the underlying causes of those disparities using additional data that describe each country (mobility, wealth, ...). This study may reveal the factors that increase the number of friends or bring friends closer together, which could point to ways of countering the growing threat of depression and isolation.
- Are there differences in friendship characteristics depending on the country?
- What are the factors that impact the number of friends?
- What are the factors that impact the proximity of friends?
- What are the factors that impact the quality of friendships?
- Is there a relation between the number and quality of friendships?
- How do friendship characteristics impact happiness?
- Gowalla dataset: https://snap.stanford.edu/data/loc-gowalla.html. A dataset from the paper. User check-ins and friendship relationships.
- Brightkite dataset: https://snap.stanford.edu/data/loc-Brightkite.html. A dataset from the paper. User check-ins and friendship relationships.
- Compilation of UNData: https://www.kaggle.com/sudalairajkumar/undata-country-profiles. Different characteristics for each country.
- Compilation of USGovt: https://www.kaggle.com/fernandol/countries-of-the-world. More characteristics for each country.
- happiness2020.pkl and countries_info.csv from "tutorial 01- Handling data". More characteristics for each country.
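As a minimal sketch of how the check-in and friendship files could be turned into per-user friendship characteristics, the snippet below parses two tiny inline samples. It assumes the tab-separated layout described on the SNAP pages (user, check-in time, latitude, longitude, location id for check-ins; one pair of user ids per line for friendships). The sample values, the function names, and the "mean of check-ins" home estimate are illustrative assumptions, not the project's actual method.

```python
import math
from collections import defaultdict

# Tiny inline samples in the ASSUMED SNAP layout (values are made up).
CHECKINS = (
    "0\t2010-10-19T23:55:27Z\t30.0\t-97.0\t22847\n"
    "0\t2010-10-18T22:17:43Z\t30.2\t-97.2\t420315\n"
    "1\t2010-10-17T23:42:03Z\t40.0\t-74.0\t316637\n"
    "2\t2010-10-16T18:50:42Z\t30.1\t-97.1\t16516\n"
)
EDGES = "0\t1\n1\t0\n0\t2\n2\t0\n"


def parse_checkins(text):
    """Return {user: [(lat, lon), ...]} from tab-separated check-in lines."""
    points = defaultdict(list)
    for line in text.splitlines():
        user, _time, lat, lon, _loc = line.split("\t")
        points[int(user)].append((float(lat), float(lon)))
    return points


def parse_edges(text):
    """Return {user: set(friends)}; both directions are added, so the
    result is symmetric whether or not the file lists each edge twice."""
    friends = defaultdict(set)
    for line in text.splitlines():
        a, b = (int(x) for x in line.split("\t"))
        friends[a].add(b)
        friends[b].add(a)
    return friends


def home(points):
    """Naive home estimate (assumption): mean latitude/longitude of the
    user's check-ins. This breaks for users spread across the antimeridian."""
    lats, lons = zip(*points)
    return sum(lats) / len(lats), sum(lons) / len(lons)


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


homes = {u: home(pts) for u, pts in parse_checkins(CHECKINS).items()}
friends = parse_edges(EDGES)

# Quantity: number of friends per user.
n_friends = {u: len(fs) for u, fs in friends.items()}
# Proximity: mean home-to-home distance to friends, in km.
mean_dist = {
    u: sum(haversine_km(homes[u], homes[f]) for f in fs) / len(fs)
    for u, fs in friends.items()
}
```

With the real files, the same functions could be fed the decompressed file contents instead of the inline strings; a more robust home estimate would likely be needed for users who travel a lot.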
- Mapping (home position to country belonging)
- Clustering
- ANOVA test and Kruskal-Wallis test
- Correlation matrix
- Pair plots
- PCA
- Linear regression
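To illustrate how the Kruskal-Wallis test listed above compares a friendship characteristic across countries, here is a small pure-Python sketch of its H statistic. The group values are made-up numbers, the function names are hypothetical, and the tie correction is omitted for brevity. In practice a library routine such as `scipy.stats.kruskal` would be the natural choice, since it also applies the tie correction and returns a p-value.

```python
from itertools import chain


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_h(*groups):
    """Kruskal-Wallis H statistic, without tie correction:
    H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1),
    where R_i is the rank sum of group i and n_i its size."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical friend counts per user in three countries (made-up numbers).
country_a = [1, 2, 3]
country_b = [4, 5, 6]
country_c = [7, 8, 9]
h = kruskal_h(country_a, country_b, country_c)
```

Under the null hypothesis that all countries share the same distribution, H approximately follows a chi-square distribution with (number of groups - 1) degrees of freedom, so a large H suggests a regional disparity. Because it works on ranks, the test does not require normally distributed data, which is why it complements ANOVA.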
Week 12
- Compute the nationalities
- Compute friendship characteristics (number of friends, their proximity, frequency of meetings)
Week 13
- Find similarities and disparities between countries based on country characteristics
- Find similarities and disparities between countries based on friendship characteristics
Week 14
- Infer the factors that influence friendship characteristics for each country
- Infer the relations between quality and quantity of friends for each user
- Revising and commenting code
- Plotting and reporting results
- Compute the nationalities [Week 12] (Iván-Daniel)
- Compute friendship characteristics (number of friends, their proximity, frequency of meetings) [Week 12] (Thibault & Andrés)
- Find similarities and disparities between countries based on country characteristics [Week 13] (Iván-Daniel & Thibault)
- Find similarities and disparities between countries based on friendship characteristics [Week 13] (Iván-Daniel & Thibault)
- Infer the factors that influence friendship characteristics for each country [Week 14] (Andrés & Iván-Daniel)
- Infer the relations between quality and quantity of friends for each user [Week 14] (Andrés)
- Revising and commenting code [Week 14] (All members in collaboration)
- Plotting and reporting results [Week 14] (All members in collaboration)