Mass selection into groups of like-minded individuals may be fragmenting and polarizing online society, particularly with respect to partisan differences1,2,3,4.
However, our ability to measure the social makeup of online communities and, in turn, to understand the social organization of online platforms is limited by the pseudonymous, unstructured and large-scale nature of digital discussion.
Here we develop a neural-embedding methodology to quantify the positioning of online communities along social dimensions by leveraging large-scale patterns of aggregate behaviour.
Applying our methodology to 5.1 billion comments made in 10,000 communities over 14 years on Reddit, we measure how the macroscale community structure is organized with respect to age, gender and US political partisanship.
Examining political content, we find that Reddit underwent a significant polarization event around the 2016 US presidential election.
Contrary to conventional wisdom, however, individual-level polarization is rare; the system-level shift in 2016 was disproportionately driven by the arrival of new users.
Political polarization on Reddit is unrelated to previous activity on the platform and is instead temporally aligned with external events.
We also observe a stark ideological asymmetry, with the sharp increase in polarization in 2016 being entirely attributable to changes in right-wing activity.
This methodology is broadly applicable to the study of online interaction, and our findings have implications for the design of online platforms, understanding the social contexts of online behaviour, and quantifying the dynamics and mechanisms of online polarization.
a, A two-dimensional t-distributed stochastic neighbour embedding (t-SNE) projection of the 10,006 subreddits in our Reddit community embedding, with points coloured by clusters found by hierarchical clustering. b, An illustration of our methodology to generate social dimensions. c, The distribution of partisan scores for the 10,006 most popular Reddit communities. The x-axis shows the number of standard deviations from the mean partisan score (z-score). Communities vary from far left-wing to far right-wing and are coloured by z-score. d, Top, communities most associated with the left-wing and right-wing ends of the dimension (for community descriptions, see Supplementary Table 1). Bottom, words most associated with the left-wing and right-wing ends of the dimension, considering only word usage in political communities in 2017 as quantified by the partisan-ness dimension (Extended Data Fig. 6).
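To make the z-scored partisan axis in panel c more concrete, here is a minimal sketch of how communities could be scored along a social dimension. It is not the authors' implementation, which this excerpt does not describe. It assumes that a dimension is the normalized difference between the vectors of two seed communities and that a community's raw score is its cosine similarity with that direction. The function names and the random toy embedding are hypothetical.

```python
import numpy as np


def social_dimension(emb, left_idx, right_idx):
    """Unit vector pointing from one seed community to the other (assumed construction)."""
    diff = emb[right_idx] - emb[left_idx]
    return diff / np.linalg.norm(diff)


def dimension_z_scores(emb, axis):
    """Cosine similarity of every community with the axis, standardized to z-scores."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    raw = unit @ axis
    return (raw - raw.mean()) / raw.std()


# Toy stand-in for a community embedding: 6 communities, 8 dimensions.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(6, 8))

# Rows 0 and 1 play the role of the two seed communities.
axis = social_dimension(embeddings, left_idx=0, right_idx=1)
z = dimension_z_scores(embeddings, axis)
print(np.round(z, 2))
```

With a real embedding, each row would be one subreddit, and the z-scores would correspond to the number of standard deviations from the mean score described in the caption.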
Quantifying social organization and political polarization in online platforms
Nature volume 600, pages 264–268 (2021)
https://www.nature.com/articles/s41586-021-04167-x
Data availability
All data are available from the pushshift.io Reddit archive28 at http://files.pushshift.io/reddit/
Source data are provided with this paper. The Reddit community embedding, social dimension vectors and community scores are available at https://github.com/CSSLab/social-dimensions
Code availability
All code is available at https://github.com/CSSLab/social-dimensions. Analyses were performed with Python v3.7, pandas v1.3.3 and Spark v3.0.