Abstract
Statistical laws of information avalanches in social media appear, at least according to existing empirical studies, not robust across systems. As a consequence, radically different processes may represent plausible driving mechanisms for information propagation. Here, we analyze almost one billion time-stamped events collected from several online platforms – including Telegram, Twitter and Weibo – over observation windows longer than ten years, and show that the propagation of information in social media is a universal and critical process. Universality arises from the observation of identical macroscopic patterns across platforms, irrespective of the details of the specific system at hand. Critical behavior is deduced from the power-law distributions, and corresponding hyperscaling relations, characterizing size and duration of avalanches of information. Statistical testing on our data indicates that a mixture of simple and complex contagion characterizes the propagation of information in social media. Data suggest that the complexity of the process is correlated with the semantic content of the information that is propagated.
Introduction
Social media have dramatically changed the way people produce, access and consume information (Ref. 1), and there is increasing evidence that online discussions have the potential to impact society in unprecedented ways (Ref. 2). For example, the public debate around the COVID-19 pandemic has been accompanied by the so-called Infodemic, which is affecting the outcome of the vaccination campaign by increasing hesitancy (Refs. 3, 4, 5). Also, online discussions in the Reddit channel r/wallstreetbets induced many individuals to buy GameStop shares in opposition to the shorting operation carried out by hedge funds and professional investors. As a result, the market capitalization of the company increased by more than $22 billion in just a few days (Ref. 6). The renewed scientific interest in understanding the mechanisms that drive information propagation is therefore not surprising.
Analyses of the propagation of information in social media reveal, at least qualitatively, similarities with other natural phenomena such as the firing of neurons (Refs. 7, 8) and earthquakes (Ref. 9). These processes are characterized by bursty activity patterns. The activity consists of point-like events in time, and bursts (or avalanches) of activity are defined as sequences of close-by events. Bursts are separated by long periods of low activity. Activity can be characterized at the macroscopic level by the distributions P(S) and P(T) of the size S and the duration T of avalanches (Refs. 10, 11, 12, 13, 14, 15). In real-world systems P(S) and P(T) have a power-law decay for large values of their argument, i.e., \(P(S) \sim S^{-\tau}\) and \(P(T) \sim T^{-\alpha}\) (Refs. 7, 8, 9, 12, 16, 17, 18). This property is interpreted as evidence of the system operating at, or in the vicinity of, a critical point. This statement is supported by the theory of absorbing phase transitions according to which, if the avalanche dynamics is at a critical point, then P(S) and P(T) must decay as power laws, see Eq. (3). Furthermore, in a process operating at criticality, the average size of avalanches with given duration must obey the hyperscaling relation \(\langle S\rangle \sim T^{\gamma}\), with γ = (α − 1)/(τ − 1) (Refs. 16, 19, 20). The specific values of the exponents τ and α typically differ for classes of systems. Their actual values are fundamental for the characterization of systems into universality classes, i.e., an ontology of processes with conceptual and practical relevance (Ref. 21).
Universality is the notion that nearly identical avalanche statistics are observed for a multitude of systems governed by different dynamical laws that nevertheless share some basic core mechanisms. Criticality instead refers to the fact that avalanche statistics are characterized by algebraic distributions. Classifying a system within a universality class is informative about the basic core mechanisms that drive the unfolding of the avalanches. Where information propagation (in general, and in online social media) is concerned, the issue of the existence of well-defined universality classes is far from settled. Existing analyses typically study data collected from a single source and over short observation windows. It is often found that distributions of avalanche size and duration obey power laws, but the estimated values of the exponents vary across studies: τ values range between τ ≃ 2 and τ ≃ 4 (Refs. 13, 14, 22, 23, 24), whereas α ≃ 3.6 (Ref. 25) or α ≃ 2.5 (Refs. 26, 27). Also, empirical studies reporting on correlations between size and duration of avalanches fail to find a power law (Refs. 28, 29). This variability might be ascribed to multiple operative definitions of avalanches, which can be given in terms of hashtag time series (Refs. 22, 28) as well as reply trees or retweet chains (Refs. 13, 24, 30). Furthermore, regardless of the definition, the temporal resolution can affect the avalanche distribution (Refs. 12, 31).
As a consequence of the variability in the distributions inferred, uncertainty about representative theoretical models remains. In particular, it is an open problem to determine when and if models based on simple contagion are more appropriate to describe the spreading of information online than those based on complex contagion. Stemming from the similarity between the spreading of disease and information, a widely accepted paradigm is that information propagates according to a simple contagion process, where only a single exposure to activity may be sufficient for its diffusion (Refs. 10, 13, 22, 28, 32, 33). Simple contagion is at the core of many theoretical models of information propagation used in the literature, all displaying critical properties of the mean-field branching process (BP), i.e., τ = 3/2 and α = 2 (Refs. 34, 35, 36, 37), see Methods. However, there are quite a few studies in favor of the complex contagion paradigm (Refs. 38, 39, 40, 41). As originally introduced by Centola and Macy, in a complex contagion process the involvement of an individual in the propagation of information requires exposure from multiple acquaintances (Ref. 42). Complex contagion is exemplified by some models, such as the linear threshold model and the Random Field Ising Model (RFIM; Refs. 19, 43), see Methods. Distinguishing between simple and complex contagion and, possibly, comprehending how they coexist within the same population (Ref. 44), is fundamental to understanding the spreading of (mis)information in online social media (Refs. 38, 45).
In this work, we perform a large-scale study of (hash)tag time series from Twitter, Telegram, Weibo, Parler, StackOverflow and Delicious [see Methods and Supplementary Information (SI) A for details about the data sets]. We consider a total of 206,972,692 time series. In our study, a time series consists of all posts that carry the same topic identifier, such as a hashtag on Twitter. Taken cumulatively, our time series consist of 905,377,009 events, collected over periods even longer than 10 years. The Twitter data, collected specifically for this work, are fully available together with codes to reproduce the results of this paper (Refs. 46, 47). To define avalanches in a principled fashion we adopt the approach inspired by percolation theory proposed in Ref. 31, see Methods. We provide evidence that social media share universal statistics of avalanches that are well described by power-law distributions. We also develop a novel statistical technique able to determine the level of criticality and complexity of individual time series, see Methods. We find that nearly 20% of the time series are less than 5% away from criticality. These account for 53% of all events in our data sets. At the aggregate level, each social medium displays a critical behavior that is compatible with the RFIM, indicating that, plausibly, processes compatible with complex contagion may play a preponderant role in information diffusion. A more detailed analysis reveals a more nuanced scenario, where about 50% of the individual time series are better explained in terms of a complex rather than a simple contagion process. A qualitative analysis of the most popular hashtags suggests that information concerning conversational topics, e.g., music or TV shows, spreads according to the rules of simple contagion, whereas information concerning political/societal controversies shows signatures of an underlying complex contagion process.
Results
Selection of temporal resolution
Here, an avalanche is defined as a maximal subset of contiguous events in a time series such that any two consecutive events are separated by a time interval no larger than Δ. A proper choice Δ* of the time resolution Δ for the specific data set at hand is necessary to avoid significant distortion in the resulting avalanche statistics. This is true for synthetic time series generated by temporal point processes (Ref. 31), but also for empirical time series such as those analyzed in this paper (see SI E for details). To determine the value of Δ* we use the principled method developed in Ref. 31, which identifies Δ* as the critical point of a one-dimensional percolation model, see Methods for details. Results are presented in Fig. 1. Values of Δ* for each data set are reported in SI A; they vary substantially across data sets, from Δ* ≃ 1500 s for Twitter to Δ* ≃ 30,000 s for Telegram (Fig. 1b).
Fig. 1. a In the main panel, we show the percolation strength P∞ as a function of the temporal resolution Δ, see Eq. (1). The unit of measurement for Δ is one second. Different colors/symbols refer to different social media: Twitter (TWT), Telegram (TLG), Parler (PARL), Weibo (WEI), StackOverflow (STCK), and Delicious (DEL). In the inset, we plot the same data as in the main panel, but with the horizontal axis rescaled as Δ → Δ/Δ*. b In the main panel, we plot the susceptibility χ as a function of the time resolution for the same data as in a, see Eq. (1). The optimal resolution Δ* is identified as the location of the peak of the susceptibility, see Eq. (2). In the inset, we plot the same data as in the main panel, but with the rescaling Δ → Δ/Δ*. For the sake of comparison, each curve has been normalized to its maximum χ*, see Methods.
Once the time resolution is rescaled according to Δ → Δ/Δ*, the curves of the percolation strength for the different data sets exhibit a nearly identical quantitative behavior, see insets of Fig. 1. This fact suggests the possibility of seeing the propagation of information in social media as a universal process, with Δ* representing the natural resolution for observing information avalanches. Figure 2a, b shows the distributions of avalanche size and duration obtained by setting Δ = Δ*. Figure 2c shows the relation between average size and duration. The collapse of the curves for the different data sets onto a single curve hints once more, at least when data are considered at the aggregate level, at processes belonging to the same universality class.
Fig. 2. a Avalanche size distribution. Different colors/symbols indicate data obtained from different social media. Acronyms are defined as in Fig. 1. Dotted lines represent the maximum likelihood estimators of the exponent τ obtained by fitting the Random Field Ising Model (RFIM), in red, and the branching process (BP), in teal. The RFIM was fitted using \(N = 10^{9}\), R = 0.8 and considering the same number of avalanches as the Twitter sample. The BP was fitted using n = 1.0 and sampling \(10^{6}\) avalanches. Distributions are displayed via logarithmic binning of the data. To make the distributions collapse on top of one another, the size is multiplied by the factor 1/〈S〉 and probabilities are multiplied by the factor 〈S〉. Cumulative distributions are reported in SI E. b Distribution of avalanche duration for the same data as in panel a. To make the distributions collapse on top of one another, duration is multiplied by the factor 1/〈T〉 and probabilities are multiplied by the factor 〈T〉. Dashed lines represent the maximum likelihood estimators of the exponent α obtained by fitting the RFIM (red) and the BP (teal). c Average size of avalanches with given duration. Data are the same as in a and b. To make the curves collapse on top of one another, the abscissa of each curve is rescaled as T/〈T〉 and the ordinate is rescaled as 〈S〉(T)/〈S〉. Solid lines represent the hyperscaling exponent \((\hat{\alpha }-1)/(\hat{\tau }-1)\) obtained using the maximum likelihood estimators of τ and α for the RFIM (red) and for the BP (teal). d Maximum likelihood estimates of the exponents \(\hat{\tau }\), \(\hat{\alpha }\) and \(\hat{\gamma }\), see SI C for details. We also display the ratio \((\hat{\alpha }-1)/(\hat{\tau }-1)\). Error bars are always smaller than the size of the symbol. Dotted lines correspond to the best fit of the exponent τ to the RFIM (red) and to the BP (teal), as shown in panel a. The same holds for the dashed lines, which represent the best fit of α shown in b, and for the solid lines, which represent the hyperscaling relation shown in c.
Criticality and universality of avalanche statistics
The avalanche statistics of Fig. 2a–c seem well described by power laws, indicating that the underlying process is (nearly) critical, and that its universality class can be identified by estimating the value of the critical exponents τ, α, and γ, see Eq. (3) and Ref. 21. We rely on maximum likelihood estimation for τ and α (Ref. 48); linear regression on the logarithm of the relation \(\langle S\rangle \sim T^{\gamma}\) is used to estimate γ. Results are reported in Fig. 2d, see SI C for details. The estimated exponent \(\hat{\tau }\) is compatible with that of the mean-field RFIM universality class, i.e., τ = 9/4 (Ref. 19). The compatibility of the avalanche statistics with those of a homogeneous mean-field model is not surprising given that in some social media there is no underlying network among users and in others there are mechanisms for the propagation of information that bypass it. For example, in Telegram all users who subscribe to a channel receive all messages sent from any other user of that channel, meaning that there is an all-to-all network among all users of the channel as in the mean-field version of the RFIM. In StackOverflow there is no underlying network as users do not follow each other; rather, they search for content using common tools offered by the platform. Even in Twitter, where users have follower–followee relationships, the network can be easily bypassed by the way the platform manages users’ feeds. There is an apparent mismatch between our estimates \(\hat{\alpha }\) and \(\hat{\gamma }\) and the RFIM predictions α = 7/2 and γ = 2 due to finite-size effects. To properly address this issue, we performed numerical simulations of the RFIM, and measured the maximum likelihood estimators of τ and α. For consistency, we performed the same operation for the BP too. The results of Fig. 2 reveal that, overall, our data are compatible with the phenomenology of the RFIM and not with the phenomenology of the BP.
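As a rough illustration of the two estimators mentioned above, the sketch below applies the continuous-variable maximum likelihood estimator of a power-law exponent, \(\hat{\tau } = 1 + n/\sum_i \ln (s_i/s_{\min })\), and an ordinary least-squares fit in log-log space for γ. The function names, the continuous approximation and the synthetic inputs are assumptions made for illustration only; the procedure actually used for the figures is the one described in SI C.

```python
import math
import random


def powerlaw_mle(values, xmin):
    """Continuous-approximation MLE of tau for P(x) ~ x^(-tau), x >= xmin."""
    tail = [x for x in values if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    return 1.0 + len(tail) / log_sum


def loglog_slope(durations, mean_sizes):
    """Least-squares slope of log<S> versus log T, i.e. an estimate of gamma."""
    xs = [math.log(t) for t in durations]
    ys = [math.log(s) for s in mean_sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Synthetic inputs (assumed values, not data from the paper).
rng = random.Random(0)
tau_true = 2.25
# Inverse-transform sampling of a continuous power law with xmin = 1.
sample = [(1.0 - rng.random()) ** (-1.0 / (tau_true - 1.0)) for _ in range(100_000)]
tau_hat = powerlaw_mle(sample, xmin=1.0)

# An exact relation <S> = T^2 should return a slope of 2.
toy_durations = [1, 2, 4, 8, 16]
gamma_hat = loglog_slope(toy_durations, [t ** 2 for t in toy_durations])
```

For discrete avalanche sizes and small lower cutoffs the continuous estimator is biased, so a discrete likelihood would be the safer choice in practice.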
The proximity of exponents estimated across different data sets points to the existence of a genuine and distinctive universality class for information propagation in social media when considered at the aggregate level. In particular, this class seems to be different from that of the BP, often invoked as a representative model for phenomena related to information diffusion. This universal scaling is a genuine feature of social media: if we repeat the same analysis on time series describing activity in very different types of systems, e.g., brain networks and earthquakes, avalanche duration and size still decay in a power-law fashion, but with radically different exponent values, see SI D for details. In particular, for neuronal avalanches in the brain, we recover exponents compatible with previous studies (Refs. 8, 49, 50, 51).
Complexity of avalanche statistics
To assess if the statistical properties obtained on aggregate data are representative of individual time series, we develop a maximum likelihood method to fit the time series against the BP and the RFIM. The technique is inspired by the work of Ref. 48, see Methods for details. The method supports three different tests. First, it establishes the regime of a time series, depending on how the best estimate of the branching ratio parameter \(\hat{n}\) compares to the critical value nc = 1 for the BP, or how the best estimate of the disorder parameter \(\hat{R}\) compares to the critical value \({R}_{c}=\sqrt{2/\pi }\simeq 0.8\) for the RFIM. Second, it evaluates the goodness of the individual fits via their p values. Similarly to the prescription of Ref. 48, we set the threshold for statistical significance equal to p = 0.1. We verified, however, that the outcome of the analysis is not greatly affected by the choice of the threshold value, see SI J. Third, it establishes whether a time series is better modeled by the BP or by the RFIM by comparing their likelihood.
Results of our analysis are reported in Figs. 3 and 4. Our method is applied only to time series that contain at least two avalanches larger than Smin = 10. These two avalanches must also have different sizes, so that P(S) has at least two non-zero values. Tests of robustness for different Smin values are reported in SI J. In all systems we find that the best fitting parameter assumes values over a broad range, encompassing a large portion of the subcritical phase and the critical point of the models (Fig. 3a, b). The majority of events belong to a minority of time series giving rise to the largest avalanches. As a consequence, the large-scale behavior of each system is mainly determined by those few time series that are fitted in a narrow region of the parameter space close to the critical point for both the BP and the RFIM (insets of Fig. 3a, b). Also, our tests indicate that the vast majority of time series are well described by at least one of the two models (Fig. 4a). The model selection indicates that individual time series are divided into two nearly equally populated classes, one better described by the BP and the other by the RFIM (Fig. 4a). Simple and complex contagion thus coexist in social media, with only a mild dominance of complex over simple contagion (Fig. 3c). The individual-level analysis is not incompatible with the results obtained for the aggregate data (Fig. 2). If we aggregate data only from the time series that we attributed to the class of complex contagion, we consistently recover a power-law scaling compatible with that class for all avalanche sizes, see Fig. 3d. However, the aggregation of time series that are classified in the BP class generates a distribution characterized by a neat crossover from BP scaling for small avalanches to RFIM scaling for large avalanches (Fig. 3d). The mixture produces a universal distribution that is overall more compatible with the RFIM universality class than with the BP class (Fig. 2c).
Fig. 3. a We fit each individual time series against the Random Field Ising Model (RFIM) to determine the best estimator of the disorder parameter \(\hat{R}\). We then compute the distribution of \(\hat{R}\) for all time series of a given data set. Acronyms of the data sets are defined as in Fig. 1. We fit only avalanches whose size is at least equal to Smin = 10. The dashed vertical gray line denotes Rc, i.e., the critical value of the RFIM parameter. The inset shows the same data as in the main panel, but each time series contributes to the histogram with a weight equal to its total number of events. b Same analysis as in a, but obtained by fitting individual time series against the branching process (BP) to determine the best estimator of the branching ratio \(\hat{n}\). c Probability that the log-likelihood ratio test favors the RFIM over the BP (teal), or vice versa the BP over the RFIM (red). Only time series that are sufficiently well fitted by both models are considered in the analysis, see Fig. 4b. Error bars represent σ/N, where N is the sample size and \(\sigma =\sqrt{0.25\,N}\) is the standard deviation of a binomial distribution with probability of success equal to 1/2. Asterisks are used to denote significant deviations from the unbiased binomial model, i.e., three asterisks indicate p < 0.001, and one asterisk stands for p < 0.1. d We use the classification of panel c to divide time series in two distinct classes. We then consider only time series whose best estimators are sufficiently close to the critical value of the model representing their class, i.e., \(| \hat{R}-{R}_{c}| /{R}_{c}\le 0.05\) or \(| \hat{n}-{n}_{c}| /{n}_{c}\le 0.05\), to compute the distribution of avalanche size for each class. Full symbols are used for the RFIM class, empty markers are used to display the distributions of the BP class. The dotted lines correspond to the best fit of the exponent τ to the RFIM (red) and to the BP (teal).
Fig. 4. a We consider avalanches with size S ≥ Smin = 10 and fit them against the branching process (BP) and the Random Field Ising Model (RFIM). For each time series, we establish whether the fits against the individual models are statistically significant or not; if neither fit can be rejected, we then select the best model by means of the log-likelihood ratio. We report the fraction of time series that are classified in the RFIM class. A time series falls in this class either because the RFIM fit cannot be rejected whereas the BP fit is rejected, or because neither fit can be rejected but the RFIM is favored over the BP in terms of log-likelihood ratio. The fraction of time series that are classified as BP is defined in an analogous manner. The fraction of time series that are classified as neither BP nor RFIM is represented by the bar labeled as “None.” Finally, some time series pass both statistical tests. Their fraction is denoted by the label “Both” in the figure. In this case, the log-likelihood ratio test is required for model selection, see Fig. 3c. b We restrict our attention to Twitter hashtags containing characters from the English alphabet only, and display the 30 most popular hashtags classified either in the RFIM (blue) or the BP (red) classes. The font size is proportional to the rank of the hashtag in each class. Hashtags of both classes are selected among those that are sufficiently critical, i.e., \(| \hat{R}-{R}_{c}| /{R}_{c}\le 0.05\) for a time series in the RFIM class or \(| \hat{n}-{n}_{c}| /{n}_{c}\le 0.05\) for a time series in the BP class.
Discussion
We showed that temporal patterns characterizing bursts of activity in online social media are conveniently classified in two universality classes. This finding suggests that few core mechanisms determine the large-scale behavior of information diffusion and that many peculiarities that characterize individual platforms are far less relevant. Also, in contrast with the vast majority of previous studies where purely diffusive models have been considered (Ref. 37), we showed that information propagation in social media is often better described by complex contagion dynamics. Complex contagion is here exemplified by the RFIM, an agent-based model of activation originally formulated to describe the para-to-ferromagnetic phase transition in metals (Ref. 19). Recast in the language proper to the description of information propagation (Ref. 52), the RFIM prescribes that each agent (i) has a personal opinion, (ii) is subject to the social influence exerted by the agents she interacts with, and (iii) is also driven by an external force representing the public information about exogenous events. These appear reasonable assumptions for modeling many realistic discussions happening in social media. Figure 4b shows the 30 most popular Twitter hashtags identified by our method either in the simple or in the complex contagion classes. In the category of simple contagion, we find conversational topics, mostly related to music or cinema/TV shows. Hashtags belonging to the class of complex contagion either display periodic patterns or are related to political/controversial themes. This suggests the existence of a relation between the semantics of hashtags and the universality class of the corresponding time series. This qualitative picture fits with previous studies that have explicitly focused on the semantics of different hashtags in Twitter (Ref. 45).
For both classes of information avalanches, we inferred the dynamics underlying their generation as critical, a fact that provides theoretical ground for the surprising but remarkable robustness of our findings. The presence of a large portion of social media content that acquires popularity via complex contagion dynamics calls for a reconsideration of predictive algorithms relying on the temporal characteristics of the signal only, because these algorithms often neglect the semantics of hashtags and, even more frequently, the characteristics of the network over which they spread (Refs. 53, 54, 55, 56, 57). Both aspects are important for the successful characterization of the process underlying the propagation of information (Refs. 38, 45, 58, 59). We further speculate that our results extend beyond the six platforms considered here. If so, there must be a mechanism that explains the universality shown by the data, involving critical dynamics that is independent of the peculiarities implemented in the individual platforms. Understanding where this mechanism is rooted and how to exploit it for the prediction of the propagation of information in online social media remain open challenges for future research.
Methods
Data
We build a time series for each (hash)tag appearing in the data at our disposal. A time series contains the times, i.e., {t1, t2, …}, when the (hash)tag is observed in the data.
Specifically, the Twitter data set is composed of 2,353,192,777 tweets corresponding to a 10% random sample of all tweets posted on Twitter during the observation window from October 1 to November 30, 2019. These data were collected via the Indiana University OSoME Decahose stream (Refs. 60, 61). Telegram time series are extracted from a total of 317,224,715 messages, originally collected in Ref. 62. Parler time series are extracted from a total of 183,062,974 posts, originally collected in Ref. 63. Weibo time series are extracted from 226,841,249 posts, originally collected in Ref. 64. StackOverflow time series are extracted from a total number of 46,947,635 questions and answers. Delicious time series are extracted from 7,034,524 user actions, originally collected in Ref. 65. Timestamps always have the temporal resolution of the second, except for the StackOverflow data set, whose temporal resolution is the millisecond.
We pre-process the data so that the number of events per unit time is roughly constant over the whole temporal window considered (see SI A for details) to obtain a corpus of 206,972,692 time series consisting of 905,377,009 total events.
Selection of the temporal resolution
We follow the same procedure as in Ref. 31. Given a time series \(\{t_1, t_2, \ldots\}\), we define an avalanche starting at \(t_b\) as a sequence of events \(\{t_b, t_{b+1}, \ldots, t_{b+S-1}\}\) such that \(t_b - t_{b-1} > \Delta\), \(t_{b+S} - t_{b+S-1} > \Delta\) and \(t_{b+i} - t_{b+i-1} \le \Delta\) for all \(i = 1, \ldots, S-1\), where Δ is the resolution parameter. The size S of an avalanche is the number of events within it and the duration T is the time lag between the first and last event in the avalanche, i.e., \(T = t_{b+S-1} - t_b\). Depending on the value of Δ, the same time series is composed of different avalanches.
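The definition above translates directly into a single pass over the sorted event times. The following is a minimal sketch; the function name `find_avalanches` and the toy timestamps are illustrative assumptions, not part of the released code.

```python
def find_avalanches(times, delta):
    """Split sorted event times into avalanches.

    Consecutive events stay in the same avalanche when their gap is <= delta.
    Returns a list of (size, duration) pairs, with duration = last - first.
    """
    avalanches = []
    if not times:
        return avalanches
    start = prev = times[0]
    size = 1
    for t in times[1:]:
        if t - prev <= delta:
            size += 1
        else:
            # Gap larger than delta: close the current avalanche.
            avalanches.append((size, prev - start))
            start, size = t, 1
        prev = t
    avalanches.append((size, prev - start))
    return avalanches


# Toy time series (seconds); the values are made up for illustration.
events = [0, 5, 8, 100, 103, 400]
fine = find_avalanches(events, delta=10)     # [(3, 8), (2, 3), (1, 0)]
coarse = find_avalanches(events, delta=100)  # [(5, 103), (1, 0)]
```

The two calls show how the same time series decomposes into different avalanches as Δ changes.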
We identify the optimal resolution Δ* as the critical point of a one-dimensional percolation model that is used to describe the time series. Each time series in a data set is considered as an instance of the one-dimensional percolation model. We measure the size SM of the largest avalanche within each time series. We define the percolation strength P∞ and its associated susceptibility χ, respectively, as
where 〈SM〉 and \(\langle {S}_{M}^{2}\rangle\) are, respectively, the first and second moments of the distribution of the size of the largest avalanche SM across all time series in a data set. Δ* is computed as the resolution maximizing χ, i.e.,
\[\Delta^{*} = \arg\max_{\Delta}\, \chi(\Delta). \qquad (2)\]
Time series with only one event introduce an offset in the measure of P∞ and are not informative with respect to the optimal resolution Δ*, because SM = 1 for any Δ in these time series. We therefore remove them from the sample and compute P∞ and χ considering only time series composed of at least two events.
Values of the optimal resolution Δ* are reported in SI A. Note that the avalanche statistics reported in Fig. 2 are obtained considering all avalanches, excluding the largest one of each time series. This choice is due to the well-known fact that in percolation theory the largest cluster obeys different statistics from those of finite clusters (Ref. 66).
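The search for Δ* can be sketched as a scan over candidate resolutions. The snippet below assumes the ratio \(\chi = (\langle S_M^2\rangle - \langle S_M\rangle^2)/\langle S_M\rangle\), a form of susceptibility commonly used in percolation studies; this formula, the function names and the toy data are assumptions of the sketch and should be checked against Eq. (1) before any real use.

```python
def largest_avalanche(times, delta):
    """Size of the largest run of events whose consecutive gaps are <= delta."""
    best = run = 1
    for prev, cur in zip(times, times[1:]):
        run = run + 1 if cur - prev <= delta else 1
        best = max(best, run)
    return best


def susceptibility(series_list, delta):
    """Assumed form: (<S_M^2> - <S_M>^2) / <S_M>, over series with >= 2 events."""
    largest = [largest_avalanche(ts, delta) for ts in series_list if len(ts) >= 2]
    mean = sum(largest) / len(largest)
    mean_sq = sum(s * s for s in largest) / len(largest)
    return (mean_sq - mean * mean) / mean


def optimal_resolution(series_list, deltas):
    """Candidate resolution with the largest susceptibility (first one on ties)."""
    return max(deltas, key=lambda d: susceptibility(series_list, d))


# Toy data: one fast series, one slow series, one single-event series (ignored).
toy_series = [[0, 1, 2, 3], [0, 10, 20, 30], [7]]
delta_star = optimal_resolution(toy_series, [0.5, 1, 5, 10])
```

In the toy example the fluctuations of the largest avalanche peak at intermediate resolutions, where the fast series has already merged into a single avalanche while the slow one has not.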
The branching process
In the BP an individual initially active spreads activity to a random number of peers, who can in turn spread activity further (Ref. 34). The process continues for a number T of time steps or generations, until there is a generation in which no individual further spreads activity. T is the duration of the avalanche. The size S of the avalanche is the total number of individuals activated during the avalanche. The average number of individuals who are activated from a single spreader is the branching ratio n and the model is critical for n = nc = 1. The branching ratio is the only tunable parameter of the model.
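A minimal simulation of this process is sketched below. It assumes Poisson-distributed offspring with mean n and counts the seed generation in the duration; both choices, as well as the function names and the size cap, are assumptions of the sketch rather than specifications taken from the text.

```python
import math
import random


def poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def branching_avalanche(n, rng, max_size=10**6):
    """Return (size, duration) of one avalanche with branching ratio n.

    Assumptions: Poisson offspring, duration = number of generations
    (seed included), and a hard cap on the size to bound the run time.
    """
    active, size, duration = 1, 1, 1
    while active > 0 and size < max_size:
        offspring = sum(poisson(n, rng) for _ in range(active))
        if offspring == 0:
            break
        active = offspring
        size += offspring
        duration += 1
    return size, duration


rng = random.Random(42)
# Subcritical example: the mean size should approach 1 / (1 - n) = 2.
sizes = [branching_avalanche(0.5, rng)[0] for _ in range(20_000)]
mean_size = sum(sizes) / len(sizes)
```

The subcritical mean size 1/(1 − n) is a standard result for branching processes and provides a quick sanity check before moving to n close to 1, where avalanches become heavy-tailed.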
Finite avalanches of activity in the BP obey the laws
where 〈⋅〉 is the average over different avalanches, and P(S) and P(T) are the probability distributions of S and T, respectively. The functions \(\mathcal{D}_{S}\) and \(\mathcal{D}_{T}\) are known as scaling functions and introduce corrections at small values of their argument, where we have defined the reduced distance from the critical point \(n^{\prime} =| n-{n}_{c}| /{n}_{c}\). The BP is characterized by the exponents τ = 3/2, α = 2 and γ = 2. The above exponents are not independent; rather, they are related by γ = 1/(σzν) = (α − 1)/(τ − 1). Here σ, z and ν are additional critical exponents that we do not explicitly consider in our analysis.
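The relation between the exponents can be checked with exact arithmetic. The short sketch below evaluates (α − 1)/(τ − 1) for the BP values quoted above and for the mean-field RFIM values quoted in the text (τ = 9/4, α = 7/2); the helper name is an illustrative assumption.

```python
from fractions import Fraction


def hyperscaling_gamma(tau, alpha):
    """gamma = (alpha - 1) / (tau - 1), evaluated exactly."""
    return (alpha - 1) / (tau - 1)


gamma_bp = hyperscaling_gamma(Fraction(3, 2), Fraction(2))       # (1) / (1/2)
gamma_rfim = hyperscaling_gamma(Fraction(9, 4), Fraction(7, 2))  # (5/2) / (5/4)
```

Both classes give γ = 2, so the hyperscaling exponent alone cannot separate them; the distinction rests on τ and α.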
The Random Field Ising Model
We consider the mean-field formulation of the zero-temperature RFIM. Agent i is characterized by the state variable yi = ±1 indicating whether the agent is active, yi = +1, or not, yi = −1. Each agent i has a propensity hi to become active, with hi ∈ (−∞, +∞). A large value of hi indicates that the agent is particularly prone to become active. Agents interact by means of ferromagnetic interactions that model social pressure, i.e., active neighbors push an inactive agent to become active. The whole system is further affected by public information that all agents have access to and that pushes users toward becoming active with intensity H ∈ (−∞, +∞). In the initial configuration, all agents are inactive. The external pressure H grows till the agent with the largest hi value becomes active. This change of state can trigger an avalanche of activity in the other nodes. Specifically, agent j becomes active if the following condition is met
where N is the system size and the mean-field formulation is expressed by the all-to-all interaction. Once in the active state, agents cannot change their state back to inactive. When an avalanche ends, the external pressure H grows again until a new user becomes active and triggers a new avalanche. The field is frozen during the unfolding of avalanches, meaning that avalanches are characterized by a time scale much shorter than the one characterizing external pressure. In the long-term limit, when H = +∞, all agents become active. The size S of an avalanche is given by the number of users that are activated during the avalanche; its duration T is given by the activation rounds characterizing the avalanche.
The stochasticity of the model comes from the random nature of the propensities hi, extracted from a normal distribution with zero mean and variance R. The choice of the normal distribution is quite standard both for ferromagnets and social systems (Ref. 52). R is the only tunable parameter of the model, and the model is critical for \(R={R}_{c}=\sqrt{2/\pi }\). Avalanche statistics obey laws similar to those of Eq. (3). The functional form of the scaling functions, however, is not the same as in the BP; also, their argument is given in terms of the distance from the critical point of the RFIM, i.e., \(n^{\prime} =| n-{n}_{c}| /{n}_{c}\) is replaced by \(R^{\prime} =| R-{R}_{c}| /{R}_{c}\). The values of the critical exponents are τ = 9/4, α = 7/2 and γ = 2 (Ref. 19). In SI F, we show that the peculiar form of the scaling function \(\mathcal{D}_{T}\) introduces strong preasymptotic corrections to the functions P(T) and 〈S〉(T), affecting the measure of α and γ obtained through numerical simulations of the model.
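The dynamics described in this subsection can be sketched compactly in the mean-field setting, because agents necessarily activate in decreasing order of propensity. The snippet assumes the activation condition \(h_j + H + J m \ge 0\), with \(m = \frac{1}{N}\sum_i y_i\) and unit coupling J, and it treats `disorder` as the standard deviation of the propensities (the convention under which the mean-field critical point at unit coupling is \(\sqrt{2/\pi }\)). These choices and all names are assumptions of the sketch and should be compared with Eq. (4).

```python
import bisect
import random


def rfim_avalanches(num_agents, disorder, coupling=1.0, seed=0):
    """Return (size, rounds) for every avalanche of a mean-field RFIM sweep."""
    rng = random.Random(seed)
    # Negated propensities in ascending order: index k is the (k+1)-th most prone agent.
    neg_h = sorted(-rng.gauss(0.0, disorder) for _ in range(num_agents))
    avalanches = []
    active = 0  # number of active agents; they always form a prefix of neg_h
    while active < num_agents:
        start = active
        magnetization = (2 * active - num_agents) / num_agents
        # Raise H just enough for the most prone inactive agent to activate.
        field = neg_h[active] - coupling * magnetization
        active += 1
        rounds = 1
        while True:
            magnetization = (2 * active - num_agents) / num_agents
            # Agent j is active when h_j + H + J*m >= 0, i.e. -h_j <= H + J*m.
            reached = bisect.bisect_right(neg_h, field + coupling * magnetization)
            if reached <= active:
                break
            active = reached
            rounds += 1
        avalanches.append((active - start, rounds))
    return avalanches


weak_disorder = rfim_avalanches(2000, disorder=0.01)
near_critical = rfim_avalanches(2000, disorder=0.8)
```

With disorder far below the critical value a single avalanche spans almost the whole system, whereas close to the critical value avalanche sizes spread over a broad range.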
Model selection
To ascribe each time series to a dynamical model, we first fit each model individually by maximizing its likelihood. We then evaluate the p value of each fit and, if neither hypothesis can be rejected, we select the best fit via the log-likelihood ratio test.
To perform the fit, we compare the probability distribution P(S) of the avalanche sizes identified in the time series with the conditional distributions of the avalanche size QRFIM(S∣R) and QBP(S∣n), respectively, obtained for the RFIM and the BP for a given value of the parameters R and n. The construction of the model distributions Q requires discretizing the parameter space of the models. In this study R varies in the interval [0.025, 2.7] by steps of length dR = 0.025 and n varies in [0.02, 1.7] by steps of length dn = 0.015. dR (dn) represents the uncertainty on the parameter. Instead of sampling avalanches from the model at a precisely given value of R (n), we consider model instances corresponding to R (n) values uniformly distributed over an interval of length dR (dn) centered at R (n). The distribution Q corresponding to a specific value of the parameter model is constructed as the superposition of 500 distributions whose parameter values are randomly sampled from the corresponding interval. Fitting a time series to a model means estimating the best parameter with an accuracy of dR (dn) for the RFIM (BP).
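As an illustration of how a model distribution Q can be assembled on the parameter grid, the sketch below pools avalanches generated at parameter values drawn uniformly from an interval of the prescribed width. It is a sketch under stated assumptions: the avalanche generator `poisson_bp_size` is a toy Galton–Watson process with Poisson offspring used only as a stand-in, and the names `smeared_distribution`, `n_avalanches`, `s_max` and `cap` are hypothetical, as is the choice of collecting very large avalanches in an overflow bin.

```python
import numpy as np

def poisson_bp_size(n, rng, cap=10**6):
    """Total progeny of a Galton-Watson tree with Poisson(n) offspring (toy stand-in)."""
    size, frontier = 1, 1
    while frontier > 0 and size < cap:
        # The sum of `frontier` independent Poisson(n) draws is Poisson(n * frontier).
        frontier = rng.poisson(n * frontier)
        size += frontier
    return size

def smeared_distribution(center, width, sample_size_fn, rng,
                         n_params=500, n_avalanches=200, s_max=10**4):
    """Size distribution pooled over n_params parameter values drawn
    uniformly from [center - width / 2, center + width / 2]."""
    counts = np.zeros(s_max + 1)
    params = rng.uniform(center - width / 2, center + width / 2, size=n_params)
    for p in params:
        for _ in range(n_avalanches):
            s = min(sample_size_fn(p, rng), s_max)  # overflow bin at s_max
            counts[s] += 1
    return counts / counts.sum()
```

Drawing the same number of avalanches for every sampled parameter value makes the pooled histogram an equal-weight superposition of the individual distributions. A grid such as `np.arange(0.02, 1.7 + 1e-9, 0.015)` could then be looped over to tabulate Q at every grid point.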
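Given the empirical distribution P and the model distributions Q, we evaluate the log-likelihood function
\[\mathcal{L}(\theta )=\sum_{S\ge {S}_{\min }}P(S)\,\log Q(S| \theta ),\qquad (5)\]
where θ denotes the model parameter, i.e., R for the RFIM and n for the BP.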
The summation is performed over all avalanches with S ≥ Smin, a parameter we vary in our analysis. To account for this restriction, the distributions P and Q are normalized over the interval [Smin, ∞). The best fit is the parameter value that maximizes the log-likelihood of Eq. (5); this maximization is equivalent to the minimization of the cross-entropy of the distribution Q relative to the distribution P. To avoid numerical problems in the estimation of the likelihood, we smooth the function Q. Details are provided in SI G.
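A minimal sketch of this fitting step is given below. It takes the log-likelihood to be the sum of P(S) log Q(S) over the tail, which is the form implied by the cross-entropy equivalence stated above. The distributions are assumed to be arrays indexed by avalanche size, and the small constant `eps` is only a crude stand-in for the smoothing of Q described in SI G. All function names are hypothetical.

```python
import numpy as np

def tail_normalize(pmf, s_min):
    """Zero out sizes below s_min and renormalize over [s_min, inf)."""
    q = np.array(pmf, dtype=float)
    q[:s_min] = 0.0
    return q / q.sum()

def log_likelihood(P, Q, s_min, eps=1e-12):
    """Sum over S >= s_min of P(S) * log Q(S), both normalized on the tail."""
    Pt = tail_normalize(P, s_min)
    Qt = tail_normalize(Q, s_min)
    mask = Pt > 0                      # sizes never observed do not contribute
    return float(np.sum(Pt[mask] * np.log(Qt[mask] + eps)))

def best_parameter(P, Q_by_param, s_min):
    """Grid search: parameter value whose model distribution maximizes the likelihood."""
    return max(Q_by_param, key=lambda th: log_likelihood(P, Q_by_param[th], s_min))
```

Here `Q_by_param` is a dictionary that maps each grid value of the parameter to its tabulated distribution, so the maximization reduces to an exhaustive search over the grid.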
To assign a p value to a fit, we follow the prescription of Ref. 48. Indicating with Ztail/Z the fraction of avalanches with S ≥ Smin in the fitted time series, a synthetic sample of Z avalanches is created by sampling avalanches with S ≥ Smin from the selected model Q with probability Ztail/Z and by sampling avalanches with S < Smin from the empirical distribution with complementary probability. Each of these synthetic samples is fitted analogously to the original sample obtained from the time series. We compute the Kolmogorov–Smirnov (KS) distance between the empirical distribution P and the selected model Q, as well as between the synthetic samples and their best model. The p value of the fit is defined as the fraction of synthetic samples whose KS distance from the selected model is larger than the KS distance between the real sample and its best model. The hypothesis that the sample has been generated by a certain dynamical model, say the RFIM, cannot be rejected if the p value of the fit to the RFIM is larger than a pre-established significance threshold. We set the threshold to 0.1 in the main text, following the prescription of Ref. 48. Tests of robustness against the choice of this parameter value are reported in SI J.
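The resampling procedure can be sketched as follows. This is an illustrative sketch rather than the code used in the study: it assumes that model distributions are arrays indexed by avalanche size, that the KS distance is measured on the tail S ≥ Smin only, and that sizes beyond the tabulated range are clipped to the last bin. The helper names `tail_pmf`, `ks_distance`, `fit` and `fit_p_value` are hypothetical.

```python
import numpy as np

def tail_pmf(weights, s_min):
    """Zero out sizes below s_min and renormalize over [s_min, inf)."""
    w = np.array(weights, dtype=float)
    w[:s_min] = 0.0
    return w / w.sum()

def ks_distance(P, Q, s_min):
    """Largest gap between the two tail cumulative distributions."""
    gap = np.cumsum(tail_pmf(P, s_min)) - np.cumsum(tail_pmf(Q, s_min))
    return float(np.max(np.abs(gap)))

def fit(P, models, s_min, eps=1e-12):
    """Key of the model maximizing sum_S P(S) log Q(S) over the tail."""
    Pt = tail_pmf(P, s_min)
    def loglik(Q):
        return np.sum(Pt * np.log(tail_pmf(Q, s_min) + eps))
    return max(models, key=lambda k: loglik(models[k]))

def fit_p_value(sizes, models, s_min, rng, n_synth=200):
    """Fraction of synthetic samples that fit worse than the real sample."""
    s_max = len(next(iter(models.values()))) - 1
    sizes = np.minimum(np.asarray(sizes, dtype=int), s_max)
    P = np.bincount(sizes, minlength=s_max + 1)
    best = fit(P, models, s_min)
    d_real = ks_distance(P, models[best], s_min)
    body = sizes[sizes < s_min]                    # empirical avalanches below s_min
    Z, frac_tail = sizes.size, np.mean(sizes >= s_min)
    q_tail = tail_pmf(models[best], s_min)
    n_larger = 0
    for _ in range(n_synth):
        # Each synthetic avalanche comes from the model tail with probability Z_tail / Z.
        n_tail = rng.binomial(Z, frac_tail)
        parts = [rng.choice(s_max + 1, size=n_tail, p=q_tail)]
        if n_tail < Z:
            parts.append(rng.choice(body, size=Z - n_tail))
        P_syn = np.bincount(np.concatenate(parts), minlength=s_max + 1)
        best_syn = fit(P_syn, models, s_min)       # refit every synthetic sample
        n_larger += ks_distance(P_syn, models[best_syn], s_min) > d_real
    return n_larger / n_synth
```

Refitting every synthetic sample before measuring its KS distance matters: comparing synthetic samples against the originally selected parameter would ignore the freedom the fit has to adapt to fluctuations and would bias the p value.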
If one of the two hypotheses can be rejected but the other cannot, the non-rejected model automatically becomes the selected one. If both hypotheses can be rejected, the time series is classified as “None.” If, however, neither hypothesis can be rejected, we select as the best model the one with the largest likelihood (ref. 48). We neglect the possibility that a single time series could be described by a mixture of models. Empirical data are fitted only if the time series contains at least 50 events and at least 10 avalanches.
We validate our fitting procedure by applying it to synthetic distributions P generated by the RFIM or by the BP. Results are shown in SI I and confirm the ability of our procedure to identify the ground-truth model and the correct value of the parameter.
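The decision rule just described can be summarized in a short function. This is a sketch with hypothetical names: the function `select_model` and its arguments are illustrative, and the way ties in the likelihood are broken (in favor of the BP) is an arbitrary assumption not specified in the text.

```python
def select_model(p_rfim, p_bp, loglik_rfim, loglik_bp,
                 n_events, n_avalanches, threshold=0.1):
    """Classify a time series as 'RFIM', 'BP' or 'None'.

    Returns None (the Python object) when the series is too short to be fitted.
    """
    if n_events < 50 or n_avalanches < 10:
        return None                      # not enough data: the series is not fitted
    rfim_ok = p_rfim > threshold         # RFIM hypothesis cannot be rejected
    bp_ok = p_bp > threshold             # BP hypothesis cannot be rejected
    if rfim_ok and bp_ok:
        # Neither hypothesis is rejected: keep the model with the larger likelihood.
        return "RFIM" if loglik_rfim > loglik_bp else "BP"
    if rfim_ok:
        return "RFIM"
    if bp_ok:
        return "BP"
    return "None"                        # both hypotheses rejected
```

For example, `select_model(0.5, 0.02, -3.0, -2.0, 100, 20)` returns `"RFIM"`: the BP has the larger likelihood, but it is rejected by the p value test, so the likelihood comparison is never reached.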
More details about the fitting and model selection protocol, including tests of robustness against the threshold on the p value and on Smin, are given in the SI.
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Data availability
The Twitter data generated in this study have been deposited in the Zenodo (https://zenodo.org/record/5779063#.Yg_aP-7MLCV) and GitHub (https://github.com/DaniMuzi/SocialMedia) repositories (refs. 46, 47). Telegram, Parler, Weibo, StackOverflow, and Delicious data used in this study were generated in other works. URLs to each of these data sets are provided in SI A.
Code availability
The Python and C codes used for this project are available on Zenodo (https://zenodo.org/record/5779063#.Yg_aP-7MLCV) and GitHub (https://github.com/DaniMuzi/SocialMedia) (refs. 46, 47).
References
Ahmad, A. N. Is Twitter a useful tool for journalists? J. Media Pract. 11, 145–155 (2010).
Kwak, H., Lee, C., Park, H. & Moon, S. What is Twitter, a social network or a news media? In Web Conf. 2010 – Proc. World Wide Web Conf. WWW 2010, 591–600 (2010).
Pierri, F. et al. The impact of online misinformation on US COVID-19 vaccinations. Preprint at arXiv:2104.10635 (2021).
Yang, K.-C., Torres-Lugo, C. & Menczer, F. Prevalence of low-credibility information on Twitter during the COVID-19 outbreak. Proc. ICWSM Intl. Workshop on Cyber Social Threats (CySoc) https://doi.org/10.36190/2020.16 (2020).
Yang, K.-C. et al. The COVID-19 infodemic: Twitter versus Facebook. Big Data Soc. 8, 20539517211013861 (2021).
Phillips, M. & Lorenz, T. ‘Dumb money’ is on GameStop, and it’s beating Wall Street at its own game. The New York Times (2021).
Dalla Porta, L. & Copelli, M. Modeling neuronal avalanches and long-range temporal correlations at the emergence of collective oscillations: continuously varying exponents mimic M/EEG results. PLoS Comput. Biol. 15, e1006924 (2019).
Beggs, J. M. & Plenz, D. Neuronal avalanches in neocortical circuits. J. Neurosci. 23, 11167–11177 (2003).
Bak, P., Christensen, K., Danon, L. & Scanlon, T. Unified scaling law for earthquakes. Phys. Rev. Lett. 88, 178501 (2002).
Gleeson, J. P., Ward, J. A., O’Sullivan, K. P. & Lee, W. T. Competition-induced criticality in a model of meme popularity. Phys. Rev. Lett. 112, 048701 (2014).
Barabási, A.-L. The origin of bursts and heavy tails in human dynamics. Nature 435, 207–211 (2005).
Karsai, M., Kaski, K., Barabási, A.-L. & Kertész, J. Universal features of correlated bursty behaviour. Sci. Rep. 2, 1–7 (2012).
Nishi, R. et al. Reply trees in Twitter: data analysis and branching process models. Soc. Netw. Anal. Min. 6, 26 (2016).
Wegrzycki, K., Sankowski, P., Pacuk, A. & Wygocki, P. Why do cascade sizes follow a power-law? In Web Conf. 2017 – Proc. World Wide Web Conf. WWW 2017, 569–576 (2017).
Lerman, K. & Ghosh, R. Information contagion: an empirical study of the spread of news on Digg and Twitter social networks. In 4th Int. AAAI Conf. Web Soc. Media ICWSM 2010, vol. 4 (2010).
Munoz, M. A., Dickman, R., Vespignani, A. & Zapperi, S. Avalanche and spreading exponents in systems with absorbing states. Phys. Rev. E 59, 6175 (1999).
Onnela, J.-P. & Reed-Tsochas, F. Spontaneous emergence of social influence in online systems. Proc. Natl. Acad. Sci. USA. 107, 18375–18380 (2010).
Munoz, M. A. Colloquium: criticality and dynamical scaling in living systems. Rev. Mod. Phys. 90, 031001 (2018).
Sethna, J. P., Dahmen, K. A. & Myers, C. R. Crackling noise. Nature 410, 242–250 (2001).
Colaiori, F. Exactly solvable model of avalanches dynamics for Barkhausen crackling noise. Adv. Phys. 57, 287–359 (2008).
Ódor, G. Universality classes in nonequilibrium lattice systems. Rev. Mod. Phys. 76, 663 (2004).
Sreenivasan, S., Chan, K. S., Swami, A., Korniss, G. & Szymanski, B. K. Information cascades in feed-based networks of users with limited attention. IEEE Trans. Netw. Sci. Eng. 4, 120–128 (2016).
Zhou, F., Xu, X., Trajcevski, G. & Zhang, K. A survey of information cascade analysis: Models, predictions, and recent advances. ACM Comput. Surv. 54, 1–36 (2021).
Cao, Q., Shen, H., Cen, K., Ouyang, W. & Cheng, X. Deephawkes: bridging the gap between prediction and understanding of information cascades. In Proc. ACM Int. Conf. Inf. Knowl. Manag., 1149–1158 (2017).
Oliveira, D. F. & Chan, K. S. Diffusion of information in an online social network with limited attention. Inf. Secur. 43, 362–374 (2019).
Bild, D. R., Liu, Y., Dick, R. P., Mao, Z. M. & Wallach, D. S. Aggregate characterization of user behavior in Twitter and analysis of the retweet graph. ACM Trans. Internet Technol. 15, 1–24 (2015).
Weng, L., Flammini, A., Vespignani, A. & Menczer, F. Competition among memes in a world with limited attention. Sci. Rep. 2, 335 (2012).
Gleeson, J. P., O’Sullivan, K. P., Baños, R. A. & Moreno, Y. Effects of network structure, competition and memory time on social spreading phenomena. Phys. Rev. X 6, 021019 (2016).
Szabo, G. & Huberman, B. A. Predicting the popularity of online content. Commun. ACM 53, 80–88 (2010).
Li, W., Cranmer, S. J., Zheng, Z. & Mucha, P. J. Infectivity enhances prediction of viral cascades in Twitter. PLoS One 14, e0214453 (2019).
Notarmuzi, D., Castellano, C., Flammini, A., Mazzilli, D. & Radicchi, F. Percolation theory of self-exciting temporal processes. Phys. Rev. E 103, L020302 (2021).
O’Brien, J. D., Aleta, A., Moreno, Y. & Gleeson, J. P. Quantifying uncertainty in a predictive model for popularity dynamics. Phys. Rev. E 101, 062311 (2020).
Crane, R. & Sornette, D. Robust dynamic classes revealed by measuring the response function of a social system. Proc. Natl. Acad. Sci. USA. 105, 15649–15653 (2008).
Watson, H. W. & Galton, F. On the probability of the extinction of families. J.R. Anthropol. Inst. G.B. Irel. 4, 138–144 (1875).
Harris, T. E. et al. The Theory of Branching Processes, Vol. 6 (Springer Berlin, 1963).
Liggett, T. M. Interacting Particle Systems, Vol. 276 (Springer Science & Business Media, 2012).
Radicchi, F., Castellano, C., Flammini, A., Muñoz, M. A. & Notarmuzi, D. Classes of critical avalanche dynamics in complex networks. Phys. Rev. Res. 2, 033171 (2020).
Weng, L., Menczer, F. & Ahn, Y.-Y. Predicting successful memes using network and community structure. In 8th Int. AAAI Conf. Web Soc. Media ICWSM 2014, Vol. 8 (2014).
Vasconcelos, V. V., Levin, S. A. & Pinheiro, F. L. Consensus and polarization in competing complex contagion processes. J. R. Soc. Interface 16, 20190196 (2019).
State, B. & Adamic, L. The diffusion of support in an online social movement: evidence from the adoption of equal-sign profile pictures. In CSCW 2015 – Companion 2015 ACM Conf. Comput. Support. Coop. Work Soc. Comput., 1741–1750 (2015).
Hodas, N. O. & Lerman, K. The simple rules of social contagion. Sci. Rep. 4, 1–7 (2014).
Centola, D. & Macy, M. Complex contagions and the weakness of long ties. Am. J. Sociol. 113, 702–734 (2007).
Dodds, P. S. & Watts, D. J. A generalized model of social and biological contagion. J. Theor. Biol. 232, 587–604 (2005).
Guilbeault, D., Becker, J. & Centola, D. Complex contagions: a decade in review. Complex Spreading Phenomena in Social Systems 3–25 (2018).
Romero, D. M., Meeder, B. & Kleinberg, J. Differences in the mechanics of information diffusion across topics: idioms, political hashtags, and complex contagion on Twitter. In Web Conf. 2011 – Proc. World Wide Web Conf. WWW 2011, 695–704 (2011).
Notarmuzi, D., Castellano, C., Flammini, A., Mazzilli, D. & Radicchi, F. GitHub. https://github.com/DaniMuzi/SocialMedia (2021).
Notarmuzi, D., Castellano, C., Flammini, A., Mazzilli, D. & Radicchi, F. Zenodo. https://zenodo.org/record/5779063#.Ybhyi33P1Yg (2021).
Clauset, A., Shalizi, C. R. & Newman, M. E. Power-law distributions in empirical data. SIAM Rev. 51, 661–703 (2009).
Haldeman, C. & Beggs, J. M. Critical branching captures activity in living neural networks and maximizes the number of metastable states. Phys. Rev. Lett. 94, 058101 (2005).
Friedman, N. et al. Universal critical dynamics in high resolution neuronal avalanche data. Phys. Rev. Lett. 108, 208102 (2012).
Shriki, O. et al. Neuronal avalanches in the resting MEG of the human brain. J. Neurosci. 33, 7079–7090 (2013).
Michard, Q. & Bouchaud, J.-P. Theory of collective opinion shifts: from smooth trends to abrupt swings. Eur. Phys. J. B 47, 151–159 (2005).
Kobayashi, R. & Lambiotte, R. Tideh: time-dependent Hawkes process for predicting retweet dynamics. In 10th Int. AAAI Conf. Web Soc. Media ICWSM 2016, Vol. 10 (2016).
Zhao, Q., Erdogdu, M. A., He, H. Y., Rajaraman, A. & Leskovec, J. Seismic: a self-exciting point process model for predicting tweet popularity. In Proc. 21th ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., 1513–1522 (2015).
Matsubara, Y., Sakurai, Y., Prakash, B. A., Li, L. & Faloutsos, C. Rise and fall patterns of information diffusion: model and implications. In Proc. 18th ACM SIGKDD Int. Conf. Knowl. Discov. Data Min, 6–14 (2012).
Rizoiu, M.-A. et al. Expecting to be hip: Hawkes intensity processes for social media popularity. In Web Conf. 2017 – Proc. World Wide Web Conf. WWW 2017, 735–744 (2017).
Haimovich, D., Karamshuk, D., Leeper, T. J., Riabenko, E. & Vojnovic, M. Scalable prediction of information cascades over arbitrary time horizons. Preprint at arXiv:2009.02092 (2020).
Barzel, B. & Barabási, A.-L. Universality in network dynamics. Nat. Phys. 9, 673–681 (2013).
Hens, C., Harush, U., Haber, S., Cohen, R. & Barzel, B. Spatiotemporal signal propagation in complex networks. Nat. Phys. 15, 403–412 (2019).
Indiana University. OSoMe, Observatory on Social Media. https://osome.iu.edu (2020).
Twitter. Decahose stream. https://developer.twitter.com/en/docs/twitter-api/v1/tweets/sample-realtime/overview/decahose.
Baumgartner, J., Zannettou, S., Squire, M. & Blackburn, J. The Pushshift Telegram dataset. Proc. Int. AAAI Conf. Web Soc. Media 14, 840–847 (2020).
Aliapoulios, M. et al. An early look at the Parler online social network. Preprint at arXiv:2101.03820 (2021).
Fu, K.-w., Chan, C.-h. & Chau, M. Assessing censorship on microblogs in China: discriminatory keyword analysis and the real-name registration policy. IEEE Internet Comput. 17, 42–50 (2013).
Basile, V., Peroni, S., Tamburini, F. & Vitali, F. Topical tags vs non-topical tags: towards a bipartite classification? J. Inf. Sci. 41, 486–505 (2015).
Stauffer, D. & Aharony, A. Introduction to Percolation Theory (CRC Press, 2018).
Acknowledgements
F.R. acknowledges support from the National Science Foundation (CMMI-1552487). D.N. was partially funded by the National Science Foundation NRT grant 1735095. Any opinions, findings, and conclusions or recommendations expressed in this work are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. A.F. acknowledges support from DARPA award HR001121C0169.
Author information
Contributions
D.N., C.C., A.F., D.M., and F.R. designed the experiments and wrote the paper. D.N. performed the data collection and the experiments.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Baruch Barzel, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Notarmuzi, D., Castellano, C., Flammini, A. et al. Universality, criticality and complexity of information propagation in social media. Nat Commun 13, 1308 (2022). https://doi.org/10.1038/s41467-022-28964-8
DOI: https://doi.org/10.1038/s41467-022-28964-8