A Generative Model for Extremely Sparse
Edge-Exchangeable Networks
Abstract
We propose a graph generative model for sequences of extremely sparse, edge-exchangeable networks. Models for sparse graphs often face a trade-off between desirable properties like exchangeability and the ability to capture the sparsity observed in real-world networks. While models based on vertex or edge exchangeability have successfully generated sparse graphs, achieving the ”extremely sparse” regime, where the number of edges scales near-linearly with the number of nodes, has remained a challenge. Recently, a novel Completely Random Measure (CRM) was introduced, demonstrating that this rate could be achieved within the vertex-exchangeable framework of Caron and Fox. This paper extends this work by demonstrating that this new CRM can be integrated into the alternative edge-exchangeable framework to achieve extreme sparsity.
1 Introduction
The proliferation of large-scale network data across diverse scientific domains, from microbiology and ecology to economics and the social sciences, has revealed a set of unifying structural properties. Among these, sparsity stands out as one of the most fundamental. For a simple graph with nodes, the number of possible edges scales quadratically, specifically as . However, empirical networks rarely approach this theoretical maximum. We formalize this observation by defining a graph sequence as dense if its edge count scales quadratically with its node count, and sparse if the scaling is sub-quadratic. Real-world networks are overwhelmingly sparse. While they also display other important features like power-law degree distributions, the small-world phenomenon, and community structures, our focus here is squarely on the challenge of modeling sparsity.
Constructing a statistically coherent model for sparse networks is notoriously difficult, as sparsity is often incompatible with desirable modeling assumptions. For instance, nodes exchangeability—the property that a model’s distribution is invariant to the relabelling of nodes—is a desirable statistical property for a generative model. However, the Aldous-Hoover theorem (Aldous, 1981; Hoover, 1979) presents a major obstacle, demonstrating that any graph model satisfying this property must be either dense or empty (see also Orbanz and Roy, 2015).
To circumvent this limitation, several distinct modeling paradigms have been developed. Preferential attachment models (Barabási and Albert, 1999; Bollobás, 2001), for example, abandon exchangeability entirely to generate sparse graphs.
Other lines of research have focused on weakening the exchangeability assumption. One prominent approach, initiated by Caron and Fox (Caron and Fox, 2017), substitutes nodes exchangeability with the more flexible notion of Kallenberg exchangeability (Kallenberg, 2005; Veitch and Roy, 2015; Borgs et al., 2018). This framework has spurred a rich body of work modeling various network phenomena, such as overlapping communities (Todeschini et al., 2020; Miscouridou et al., 2026), clustering (Caron et al., 2023), dynamic evolution (Naik et al., 2022), and core-periphery structures (Naik et al., 2021). An alternative strategy involves replacing nodes exchangeability with edge exchangeability (Crane and Dempsey, 2015, 2016; Broderick and Cai, 2016; Cai et al., 2016). Models in this class also achieve sparsity. Both of these approaches employ Bayesian nonparametric tools like Completely Random Measures (CRMs). These models have been shown to generate graph sequences where the number of edges, , scales as for .
A recent breakthrough by Kilian et al. (2025) has redefined the achievable level of sparsity within the Caron-Fox framework. They introduced a novel CRM that produces graph sequences where , with being a slowly varying function. This behavior is termed extremely sparse. This paper builds directly on that insight. Our central contribution is to adapt the methodology from Kilian et al. (2025) to achieve this same extreme sparsity within the edge-exchangeable model class.
The remainder of this paper is organized as follows. Section 2 provides a review of the edge-exchangeable network model. Section 3 and Section 4 contain our primary results: we first generalize the sparsity theorem from Cai et al. (2016) to encompass the extremely sparse regime, and then we introduce the Beta process with rapid variation, demonstrating its utility in constructing an extremely sparse, edge-exchangeable model. Finally, Section 6 presents simulation studies validating our theoretical findings. All deferred proofs and definitions of asymptotic notation are available in the appendix.
2 Edge-exchangeable graph sequences
The notion of an edge-exchangeable graph appeared around 2015, in the work of Crane and Dempsey (Crane and Dempsey, 2015, 2016) and of Broderick, Cai, and Campbell (Broderick and Cai, 2016; Cai et al., 2016), with some results presented in workshops prior to publication. In what follows, we are inspired by the presentation in Cai et al. (2016). This section provides an introduction to the topic; we refer the reader to the aforementioned references for more details.
2.1 Permutation Invariance to Edge Arrival Order
Let be a sequence of (multi)graphs, where each graph consists of a finite set of vertices and a finite multiset of timestamped edges . We assume the sequence is projective (or growing) i.e., and .
Each (timestamped) edge is a tuple whose first entry is a set of two vertices in and whose second entry is a timestamp corresponding to the step at which this edge first appeared. As in the Caron-Fox model, we are only interested in vertices involved in at least one edge. Consequently, we can define as the set of all vertices that appear in the edges of , in which case the graph is completely defined by its edge set .
Definition 2.1 (Cai et al., 2016 Definition 2.4).
Consider a random graph sequence defined as above with . The sequence is (infinitely) edge-exchangeable if for every and for every permutation of the steps , , where has the edge set .
In other word, the distribution of the generative model is invariant under any finite permutation of the order in which the edges arrived. This model (and its inherent symmetry) can be connected to partition, feature allocation and trait allocation; these connections are explained in Broderick and Cai (2016) and Campbell et al. (2018).
2.2 A Bayesian Nonparametric Model
We now present a graph generative model that exhibits edge-exchangeability, which we will later prove can generate extremely sparse graphs. This model, referred to as the graph frequency model in Cai et al. (2016), is similar to the Caron-Fox model from Caron and Fox (2017) but with several key differences. These differences explain why one model is edge-exchangeable while the other is Kallenberg-exchangeable. A comparison of these two models is provided in Cai et al. (2016), where the authors also prove that the two notions are distinct.
In the graph frequency model, we consider a countably infinite set of latent vertices, indexed by the positive integers . Associated with these vertices is an infinite collection of edge labels with values in and edge probabilities with values in . For any given (potentially random) choice of these labels and probabilities, we define the measure on as:
If either the labels or the probabilities (or both) are random, then is a discrete random measure on . Given , the graph sequence is constructed recursively. We initialize with . Then, at each step , we form a new edge set, , by sampling the multiplicity for every possible edge according to:
We then add copies of the edge with timestamp to . Finally, the edge set is updated as . This process generates multigraphs, potentially adding multiple edges at each time step.
Proposition 2.2 (Cai et al., 2016).
The sequence of multigraphs generated via the preceding method is edge-exchangeable.
Proof.
Conditional on , the formation of an edge at any step is an independent event with an identical probability distribution. The exchangeability of the time stamps follows directly. ∎
This model can be implemented with a random measure constructed from a Poisson point process. Let be a Poisson process on with a non-atomic, -finite rate measure that satisfies and 111These two conditions on ensure that is a countably infinite collection of weights in and that their sum a.s.. We can then use to define the set of edge probabilities as for , and (to prevent self-loops), where each . The edge labels can be sampled independently and uniformly from . With this setup, becomes a homogeneous CRM on with no deterministic or fixed atomic components (Kingman, 1967, 1993; Lijoi and Prünster, 2010). The choice of the CRM , and therefore the choice of the measure , strongly influences the properties of the resulting edge-exchangeable graph sequence.
While this model generates multigraphs, it can be readily adapted to produce simple graphs . This is done by setting and defining as the set of unique edges in (i.e., those with a multiplicity of at least one). Although the resulting simple graphs do not inherit the edge-exchangeability property, they retain many characteristics of the original multigraphs, most notably their sparsity. In what follows, we consider only multigraphs.
3 Extreme Sparsity
We denote the number of vertices and edges in the multigraph, respectively, as:
where is the multiplicity of the edge at time . The following theorem states that an edge-exchangeable model constructed with a measure that has a rapidly varying tail is extremely sparse.
Theorem 3.1.
Assume that, as ,
| (1) |
where , is slowly varying, and
| (2) |
for all sufficiently large . If then
| (3) |
In particular, if , then
| (4) |
This result extends Theorem 5.3 in Cai et al. (2016) to the extremely sparse regime. The main difference is that we consider rapidly varying measures, whereas they consider regularly varying measures that are not rapidly varying. This crucial distinction allows us to achieve a better sparsity rate. The proof methods rely on conditioning on the point process , after which the approach is similar to that of (Janson, 2018).
There exist several families of slowly varying functions that satisfy all the required assumptions in the theorem. The most useful one is the logarithmic family for . However, these also include more log-based families, such as the iterated log family for , where is the logarithm iterated times, and the logarithmic exponential family for . Additionally, there are Lambert -based families, such as the Lambert family for , where is the principal branch of the Lambert function, or more generally the iterated Lambert family , where is the iterated principal branch of the Lambert function.
4 Beta process with rapid variation
We define the Beta process with rapid variation (RapidBeta) as the process whose rate measure on has the density:
| (5) |
where the parameters satisfy and . For the special case where , this measure recovers the mixture of stable densities from Kilian et al. (2025), restricted to the range . Figure 2 displays the RapidBeta density for and illustrates the effects of the parameters , , and . The effect of can be understood by adapting Proposition 1 from Kilian et al. (2025):
Proposition 4.1.
If , the density of the rate measure satisfies:
If , the density of the rate measure satisfies:
Additionally, one can verify that the two necessary conditions for the Poisson process construction hold: and (see appendix for the proof). It follows that an edge-exchangeable graph sequence constructed using the RapidBeta measure with satisfies the condition of Theorem 3.1. Therefore, this construction yields an extremely sparse graph sequence.
While the asymptotic sparsity regime of the RapidBeta construction is entirely determined by , the lower endpoint plays a non-negligible role at finite sample sizes. Indeed, the Lévy density is obtained by integrating contributions of the form over , so that components with larger (i.e., closer to 1) dominate the small- behavior responsible for the asymptotic regime. However, when is substantially below 1, the mixture includes terms with smaller exponents, which decay more slowly and can materially influence the effective behaviour over the range of weights accessible in finite simulations. As a consequence, although all choices with are asymptotically equivalent and lead to extreme sparsity, selecting close to 1 concentrates the mixture on exponents near the critical boundary and suppresses lower-order contributions, thereby accelerating the emergence of the extreme-sparse regime in practice.
5 Simulation of the RapidBeta Process
To simulate an edge-exchangeable graph sequence as described in Section 2, we must first generate the Poisson point process
When is a RapidBeta process, the associated Lévy measure satisfies , implying that is almost surely infinite. Consequently, exact simulation is impossible, and we instead consider a truncated version.
5.1 Truncated RapidBeta Process
For a fixed , define the truncated Poisson process
| (6) |
which contains only finitely many atoms a.s. Let us define the intensity measure as
| (7) |
Sampling from is equivalent to sampling a 2D Poisson point process on with intensity , and discarding the first dimension. Sampling from is performed analogously but with the 2D Poisson point process taken on .
5.2 Partitioned Thinning Scheme
To construct an efficient exact sampler for , we partition into two regions:
| (8) |
where is fixed. On , we use the bound
| (9) |
where . On , we use the bound
| (10) |
These bounds define valid dominating measures on each region, allowing exact sampling via thinning as described in detail in Algorithm 1.
5.3 Correctness
Theorem 5.1.
The set of weights produced by the thinning procedure described above is a realization of a Poisson point process on with intensity i.e. the -truncated RapidBeta process.
5.4 Truncation error induced by the -approximation
The -approximation replaces the full weight collection by the truncated set . A natural measure of the approximation error is the discarded total mass
| (11) |
Proposition 5.2 (Discarded mass under truncation).
Assume , then, as ,
| (12) |
In particular, the total mass discarded by truncating the CRM at level vanishes at a logarithmic rate.
5.5 Diagnostics
To asses the quality of our algorithm in practice we perform two diagnostics. The first diagnostic compares the empirical tail counts , averaged over repeated samples, to the theoretical tail measure , leveraging the property that for a Poisson process. Agreement between empirical and theoretical curves indicates that the marginal tail behavior and intensity are correctly captured. The second diagnostic examines the transformed order statistics by plotting against the rank k. For a correctly specified Poisson point process, these transformed values align with the identity line, reflecting the fact that the ordered points map to approximately unit-rate arrivals. Together, these diagnostics provide complementary validation: the former targets first-moment properties of the tail, while the latter probes the global distributional and structural consistency of the point process. In practice, our algorithm provides good performance; see Figure 3.
6 Simulations
Once the weight generation procedure is available, we can generate the sequence of graphs following the graph frequency model (see Algorithm 2).
We compare our method to graph sequences generated by a three-parameter Beta process, as described in (Cai et al., 2016). We sample the weights of the three-parameter Beta process using a Poisson thinning construction based on its Lévy measure. Recall that the Beta process is a discrete random measure whose atom weights are generated by a Poisson point process with Lévy density
| (13) |
where and . We simulate the associated Poisson process by first drawing candidate jumps from a dominating stable Lévy density proportional to , whose inverse tail admits a closed-form expression. Each proposed jump is then accepted with probability , yielding an exact thinning scheme whenever . This approach avoids the numerical instabilities of stick-breaking representations (Broderick et al., 2012) when is close to one, while preserving the correct distribution of large weights.
The simulations are displayed in Figure 4, as expected, the RapidBeta process produces sparser graph sequences than the three Beta process. We also provide a comparison to the Barabási-Albert model (Barabási and Albert, 1999), which produces even sparser sequences () but lacks exchangeability.
The code used to generate the edge-exchangeable graphs can be found at: https://github.com/ValentinKil/ExtremelySparseEdgesExchangeable.git.
7 Conclusion
In this work, we presented a graph generative model for edge-exchangeable networks. We first showed that the extremely sparse regime can be achieved through a sequence of edge-exchangeable graphs. We then proposed a practical generative model that enables the simulation of these sequences. Our experimental results demonstrate that, despite necessary computational approximations, the model effectively captures the extremely sparse regime.
While Li and Campbell (2021) provide rigorous quantitative bounds on the error for fixed-rank truncation of edge-exchangeable networks, equivalent theoretical guarantees for the threshold -truncation employed in our sampler are not yet available. Extending their error analysis to weight-based thresholds is a promising direction for future work.
Extreme sparsity for edge-exchangeable simple graphs has previously been studied by Janson (2018), who proposed two examples that reach exact linear dependence (). However, the produced graphs exhibit some unwanted properties, such as bounded degrees or a structure composed mostly of stars. By relaxing the strict linearity requirement and accepting linear scaling up to a slowly varying function, we are able to generate a much richer, more realistic internal graph structure with the RapidBeta process; indeed, experiments show the emergence of a giant connected component. A direct application of Theorem 14 in Eriksson (2025) demonstrates that the RapidBeta graphs are almost surely not eventually forever connected, meaning there will always be some edges that do not join the giant component. To the best of our knowledge, there are no more precise results regarding the giant component of edge-exchangeable graphs, although Eriksson (2025) suggests some interesting directions.
Other properties of the generative model, such as the power-law behaviour of the degree distribution, the clustering coefficient, and the small-world property, remain to be explored. Additionally, developing inference procedures for RapidBeta edge-exchangeable graphs is a compelling avenue for future research.
Acknowledgements
I would like to thank François Caron for his insightful advice and thorough proofreading of this manuscript. I am funded by the Clarendon Scholarship.
References
- Representations for partially exchangeable arrays of random variables. Journal of Multivariate Analysis 11 (4), pp. 581–598. External Links: ISSN 0047-259X, Document Cited by: §1.
- Emergence of scaling in random networks. Science 286 (5439), pp. 509–512. Cited by: §1, §6.
- Regular variation. Vol. 27, Cambridge university press. Cited by: §G.4, §G.4, Proposition G.3, Proposition G.5, Theorem G.6, Lemma G.9, Appendix G.
- Random Graphs. Cambridge University Press. External Links: ISBN 978-0-521-79722-1 Cited by: §1.
- Sparse exchangeable graphs and their limits via graphon processes. JMLR 18 (210), pp. 1–71. Cited by: §1.
- Edge-exchangeable graphs and sparsity. arXiv:1603.06898. External Links: 1603.06898 Cited by: §1, §2.1, §2.
- Beta Processes, Stick-Breaking and Power Laws. Bayesian Analysis 7 (2), pp. 439–476. External Links: ISSN 1936-0975, 1931-6690 Cited by: §6.
- Edge-exchangeable graphs and sparsity. In Advances in Neural Information Processing Systems, Vol. 29. Cited by: §1, §1, §2.2, Definition 2.1, Proposition 2.2, §2, §3, §6.
- Exchangeable trait allocations. Electronic Journal of Statistics 12 (2), pp. 2290–2322. External Links: ISSN 1935-7524, 1935-7524 Cited by: §2.1.
- Sparse graphs using exchangeable random measures. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 79, pp. 1295–1366. Cited by: §1, §2.2.
- On sparsity, power-law, and clustering properties of graphex processes. Advances in Applied Probability 55 (4), pp. 1211–1253. External Links: ISSN 0001-8678, 1475-6064 Cited by: §1.
- A framework for statistical network modeling. arXiv:1509.08185. External Links: 1509.08185 Cited by: §1, §2.
- Edge exchangeable models for network data. arXiv:1603.04571. External Links: 1603.04571 Cited by: §1, §2.
- Edge Exchangeable Graphs: Connectedness, Gaussianity and Completeness. arXiv:2501.09511. External Links: 2501.09511, Document Cited by: §7.
- An introduction to probability theory and its applications. Vol. 2, John Wiley & Sons. Cited by: §G.4.
- On Tail Probabilities for Martingales. The Annals of Probability 3 (1), pp. 100–118. External Links: ISSN 0091-1798, 2168-894X, Document Cited by: Theorem H.2.
- Notes on the occupancy problem with infinitely many boxes: general asymptotics and power laws. Probab. Surv 4 (146-171), pp. 88. Cited by: Lemma G.8.
- Relations on Probability Spaces and Arrays of Random Variables. Institute for Advanced Study, Princeton. External Links: NJ08540 Cited by: §1.
- On Edge Exchangeable Random Graphs. Journal of Statistical Physics 173 (3), pp. 448–484. External Links: Document, ISSN 1572-9613 Cited by: Appendix C, §3, §7.
- Probabilistic Symmetries and Invariance Principles. Springer Science & Business Media. External Links: ISBN 978-0-387-28861-1 Cited by: §1.
- Rapidly Varying Completely Random Measures for Modeling Extremely Sparse Networks. arXiv:2505.13206. Cited by: §1, §4.
- Completely random measures. Pacific Journal of Mathematics 21 (1), pp. 59–78. Cited by: §2.2.
- Poisson processes. Vol. 3, Oxford University Press, USA. Cited by: §F.1.1, §2.2.
- Truncated simulation and inference in edge-exchangeable networks. Electronic Journal of Statistics 15 (2), pp. 5117–5157. External Links: ISSN 1935-7524, 1935-7524 Cited by: §7.
- Models beyond the Dirichlet process. In Bayesian Nonparametrics, N. L. Hjort, C. Holmes, P. Müller, and S. G. Walker (Eds.), Cited by: §2.2.
- Dynamic sparse graphs with overlapping communities. arXiv:2512.10717. Cited by: §1.
- Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press. External Links: ISBN 978-0-521-83540-4 Cited by: Appendix H.
- Bayesian Nonparametrics for Sparse Dynamic Networks. arXiv. External Links: 1607.01624, Document Cited by: §1.
- Sparse networks with core-periphery structure. Electronic Journal of Statistics 15 (1), pp. 1814–1868. External Links: ISSN 1935-7524, 1935-7524 Cited by: §1.
- Bayesian Models of Graphs, Arrays and Other Exchangeable Random Structures. IEEE Transactions on Pattern Analysis and Machine Intelligence 37 (2), pp. 437–461. External Links: ISSN 0162-8828, 2160-9292, Document Cited by: §1.
- Exchangeable random measures for sparse and modular graphs with overlapping communities. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 82 (2), pp. 487–520. External Links: Document, ISSN 1467-9868 Cited by: §1.
- Freedman’s inequality for matrix martingales. Electronic Communications in Probability 16 (none). External Links: ISSN 1083-589X, Document Cited by: Theorem H.2.
- The Class of Random Graphs Arising from Exchangeable Random Measures. arXiv:1512.03099. Cited by: §1.
Appendix A Asymptotic notation
Throughout this article, we write:
-
•
if there exists such that, for in a neighbourhood of , ;
-
•
if there exists such that, for in a neighbourhood of , ;
-
•
if both and hold;
-
•
if .
where . When and are random variables, the notation indicates that the relation holds almost surely. This usage of should not be confused with the standard distributional notation , which indicates that the random variable has distribution .
Appendix B Main lemma
Lemma B.1.
Let be the edge-exchangeable multigraph sequence constructed from the weights , and let and denote its associated vertex and edge counts. For , define the quenched means
Then, for almost every realization of such that, for every ,
| (14) |
we have
Proof.
Fix a realization of such that
| (15) |
This event has probability one by the assumptions on . Throughout the proof, we work conditionally on this realization of , so the numbers are deterministic.
Write
| (16) |
Since , we have
| (17) |
In particular, the conditional means below are finite for every fixed .
Conditionally on , the edge multiplicities , , are independent and satisfy
| (18) |
Therefore
| (19) |
Hence
| (20) |
Edge count.
For each , write
| (21) |
where the variables are independent Bernoulli random variables with success probability . Thus, is a countable sum of independent Bernoulli random variables with total mean . Applying Theorem H.1, we obtain, for every ,
Since almost surely, the right-hand side is summable in . Hence, by the Borel–Cantelli lemma,
| (22) |
almost surely, conditionally on .
Vertex count.
The vertex indicators are not conditionally independent. Indeed, the indicators that vertices and are active both depend on the edge variable . Therefore, one cannot prove the vertex part using the same Chernoff bound as for the edge count.
For , define
| (23) |
Then
| (24) |
We shall use a martingale bounded-differences argument.
First, truncate the edge set to the finite collection
| (25) |
Let be the number of active vertices in the graph obtained by keeping only the edge indicators with , and let
| (26) |
The variables are independent. Reveal them one at a time, in any deterministic order, and let be the Doob martingale
| (27) |
where is the chosen ordering of .
Changing a single edge multiplicity can change the number of active vertices by at most , because only the two endpoints of that edge can change their active/inactive status. Hence, the martingale increments satisfy
| (28) |
Moreover, if , then the conditional variance of the th martingale increment is bounded by
| (29) |
Indeed, conditionally on the previously revealed variables, the only remaining randomness in this step is the binomial variable . The functional depends on this variable only through whether or . Thus the conditional expectation after revealing can take two possible values, whose difference is at most , since changing the status of the edge can affect only the two endpoints and .
Therefore, the predictable quadratic variation of the martingale is bounded by
| (30) |
We can therefore apply Freedman’s inequality (Theorem H.2) to the martingale
| (31) |
with and . For every ,
| (32) |
Applying the same argument to yields the lower-tail bound
| (33) |
Combining the two inequalities, we obtain
| (34) |
Since
| (35) |
it follows that
| (36) |
Therefore, by Fatou’s lemma,
| (37) |
Applying the previous Freedman bound yields
| (38) |
Taking , we obtain
| (39) |
By the summability assumption (14) and the Borel–Cantelli lemma,
| (40) |
almost surely, conditionally on . ∎
Appendix C Proof of Theorem 3.1
Conditionally on the weights , our model is closely related to the rank-1 edge-exchangeable multigraph defined by (Janson, 2018). As a result, the proof strategy is formally parallel to that of (Janson, 2018): edge variables are independent, while vertex indicators are dependent only through shared incident edges.
We prove the following sharper result.
Theorem C.1.
Assume that, as , we have where , is slowly varying, and
| (41) |
for all sufficiently large . Let
| (42) |
Then and a.s., and
| (43) | ||||
| (44) |
Consequently, if , then
| (45) |
If moreover then
| (46) |
In particular, if with , then
| (47) |
Proof.
We write for the Poisson process of weights. Since we have almost surely, . Moreover,
| (48) |
Preliminary.
We first evaluate the asymptotic behaviour of the realised Poisson process.
| (49) |
Let us show that
| (50) |
Let . By Campbell’s theorem,
| (51) |
For any , the integral over is bounded by . Since as , this bounded term is . Thus, it is sufficient to analyse the behaviour near .
By hypothesis, for any , there exists such that for all :
| (52) |
Let . With the change of variables , we obtain
| (53) |
We partition this integral at , assuming that is sufficiently large so that .
For the first part, , we use the bound to obtain
| (54) |
Let us show that this upper bound is .First by Proposition G.3, We must now establish that . Recall that , and by assumption, this integral is finite. According to Proposition G.5, since is slowly varying and its tail integral converges, is also slowly varying and satisfies:
| (55) |
By inverting this relationship, we conclude that the first integral is bounded by .
For the second part, , we apply the inequalities for . Setting , we get:
| (56) |
The upper bound evaluates exactly to . For the lower bound, Proposition G.3, yields . Therefore, the subtractive error term scales as , which is .
Consequently,
| (57) |
and it follows that
| (58) |
To establish , we derive a specific concentration bound for . Since is a functional of the Poisson point process , its moment generating function is given by Campbell’s theorem: for any ,
| (59) |
For any , the term lies in the interval . By the convexity of the function , we have the bound . Applying this inequality to the integrand yields:
| (60) |
So the moment generating function of is bounded above by that of a Poisson random variable with parameter . Consequently, Chernoff’s inequality yields, for any ,
| (61) |
Let us define a geometric grid for a fixed . Since is regularly varying with index 1, it grows asymptotically faster than for any . Thus, grows exponentially with , meaning the sequence of probabilities decays exponentially and is therefore summable. By the Borel-Cantelli lemma, almost surely as .
To extend this convergence to the continuous parameter , we rely on the monotonicity of . For any , there exists an index such that . As is nondecreasing and is strictly increasing in , we have
| (62) |
Taking the limit as (and therefore ), and utilizing the regular variation of which guarantees , we obtain almost surely:
| (63) |
Since this holds for any arbitrarily small , taking the intersection of these almost sure events over a countable sequence proves that a.s. This establishes (50).
Edge count.
By the first part of Lemma B.1, we have established that, almost surely,
| (64) |
This is exactly Equation 43.
Vertex count.
We now study the number of active vertices. Conditionally on , the probability that vertex is active after interactions is
| (65) |
Hence We shall show that
| (66) |
Fix . The number of atoms with is finite almost surely, and their total contribution to is , which is negligible compared to . Thus, it is enough to consider atoms with . For such atoms, write
| (67) |
Then To bound , we use the inequalities , valid for . Applying the lower bound with yields:
| (68) |
For the upper bound, recall that we are only considering atoms where . Because the total sum of the weights is , every individual weight satisfies . This implies that for any pair, the product is bounded by . Since almost surely, we can choose sufficiently small such that . Applying the second inequality to the sum gives:
In total
| (69) |
Since , it follows that . Thus,
| (70) |
We also have , and since ,
| (71) |
Hence,
| (72) |
As the function is strictly monotonically increasing for , we have
| (73) |
Summing these inequalities over all atoms satisfying , we obtain
| (74) |
Recall that is almost surely finite. Consequently,
for every as . By substituting into the left-hand side, into the right-hand side, and incorporating the terms, we obtain the final bounded inequalities:
| (75) |
Using (50) and the slow variation of , we obtain
| (76) |
and
| (77) |
Letting proves (66).
Now, we apply Lemma B.1 conditionally on . Recall that the summability condition requires . We know that and . Because , grows strictly slower than , meaning the denominator of the exponent is dominated by . Consequently, the exponent scales asymptotically as . To lower bound this growth, we apply Lemma G.9 to the slowly varying function , which states that . This means that for any , and for all sufficiently large ,
| (78) |
Thus the summability condition holds. Therefore
| (79) |
Sparsity results.
We now prove Equation 45. Let Since is slowly varying, is regularly varying with index . According to Theorem G.6, any regularly varying function with index possesses an asymptotic inverse that is regularly varying with index . Therefore, has an asymptotic inverse which is regularly varying with index , satisfying as . We have proven
| (80) |
Because is regularly varying with index , it preserves asymptotic equivalence and satisfies for any constant . Applying to both sides yields:
| (81) |
Since and are strictly positive and finite almost surely, we can rearrange this asymptotic equivalence to isolate :
| (82) |
This implies the exact asymptotic bound:
| (83) |
Since , it follows that
| (84) |
We now prove Equation 46: assume that Since slow variation gives
| (85) |
Therefore
| (86) |
Since , we conclude that
| (87) |
Appendix D Proof of Proposition 4.1
When , consider
using the change of variables . Let . We have, as ,
Using the Tauberian Theorem G.7, we obtain, as
It follows that, as ,
The result when follows immediately as is a multiplicative constant.
Appendix E Conditions
We now prove that the RapidBeta measure satisfies the assumption and .
Divergence of
The integral is:
| (91) |
By Fubini’s theorem, we can swap the order of integration:
| (92) |
The inner integral over is the Beta function, . We identify the parameters by setting , which gives .
The Beta integral converges only if its parameters are positive. This requires , which implies .
However, the given constraint is . For any , we have . This violates the necessary condition . Because the inner integral
| (93) |
diverges for all in the domain, the entire expression for diverges.
Convergence of .
We want to show that By definition of , we have
| (94) |
Combining the powers of gives
| (95) |
By Tonelli’s theorem,
| (96) |
For , the inner integral is the Beta integral with parameters and . Since and , it is finite and satisfies
| (97) |
Therefore, for ,
| (98) |
Thus,
| (99) |
This identity is immediate when . If , the Beta integral itself is singular at the single point , but this point has Lebesgue measure zero. Moreover, has a continuous extension to , since
| (100) |
Hence the endpoint does not affect the value or finiteness of the integral. Finally, the function is continuous on the closed interval . Indeed, for all ,
| (101) |
and the gamma function is finite, strictly positive, and continuous on . Consequently,
| (102) |
Therefore,
| (103) |
Appendix F Sampling algorithm
F.1 Proofs
F.1.1 Proof of Theorem 5.1
On each region, and , the proposal intensity dominates the target intensity. By the thinning theorem for Poisson point processes (Kingman, 1993, Section 5.1), the accepted points on each region form independent Poisson point processes with an intensity equal to the target intensity restricted to that region.
Since and are disjoint, their superposition yields a Poisson point process on with intensity . Finally, marginalising out the latent dimension (which corresponds to the projection property of Poisson processes) yields a Poisson point process on with intensity:
| (104) |
This establishes that is a realisation of the -truncated RapidBeta process.
F.1.2 Proof of Proposition 5.2
By Campbell’s theorem for Poisson point processes,
| (105) |
Using Proposition 4.1 with , there exist constants and such that for all ,
| (106) |
Hence, for all sufficiently small ,
| (107) |
Now make the change of variables , so that . Then
| (108) |
Combining the upper and lower bounds yields
| (109) |
as claimed.
Appendix G Background on rapidly varying function
Most of the background material in this section originates from the book of Bingham et al. (1987).
G.1 Definitions
Definition G.1 (Slowly varying function).
A function is slowly varying at infinity if for all ,
Examples of slowly varying functions include , for , and functions converging to a constant .
Definition G.2 (Regularly varying function).
A function is regularly varying at infinity with exponent if for some slowly varying function . We note the set of all functions regularly varying at infinity with exponent . A function is regularly varying at if is regularly varying at infinity that is, for some .
G.2 Karamata theorems
The following propositions and corollaries relate to integrals of regularly varying functions.
Proposition G.3 (Karamata theorem).
See (Bingham et al., 1987, Propositions 1.5.8 and 1.5.10). Let for some locally bounded slowly varying function . Then
-
(i)
If
-
(ii)
If
The following corollaries will be useful.
Proposition G.4.
Let for some locally bounded slowly varying function .
-
(i)
If
-
(ii)
If
If is slowly varying and then converges and
| (110) |
Proof.
Let . We have . Writing , we obtain, using Proposition G.3(ii)
| (111) |
or
| (112) |
The case follows similarly, using Proposition G.3(i). ∎
Proposition G.5 (Proposition 1.5.9b, Bingham et al., 1987).
Let be slowly varying and suppose . Then is slowly varying and
| (113) |
G.3 Asymptotic inverse
Theorem G.6 (Theorem 1.5.12, Bingham et al., 1987).
Let with . Then there exists with
| (114) |
Here is unique to within asymptotic equivalence, and one version of is the generalized inverse .
G.4 Tauberian theorem
The following theorem is a variation of Bingham et al. (1987, Theorem 1.7.6 p.46), where the two limits at 0 and infinity are exchanged. The proof is similar. See also Feller (1971, Chapter XIII).
Theorem G.7 (Tauberian theorem).
Assume , , , convergent for , and a slowly varying function. Then
| (115) |
implies
| (116) |
Proof.
Write (this is finite for any as is convergent for any ), then is non-decreasing and by Proposition G.4
| (117) |
Then by Bingham et al. (1987, Theorem 1.7.1, p.38), this is equivalent to
| (118) |
Finally, note that . Thus the above equation is equivalent to
| (119) |
∎
G.5 Other useful results on regular variation
Lemma G.8 (Lemma 14, Gnedin et al., 2007).
For slowly varying, the relation
implies
with is another slowly varying function defined by .
Lemma G.9 (Proposition 1.3.6, Bingham et al., 1987).
If varies slowly then .
Appendix H More background results
Theorem H.1.
Let be a sequence of independent Bernoulli random variables with parameters such that . Let
| (120) |
Then, for every ,
| (121) |
Proof.
For , let
| (122) |
The sequence converges to .
Since
| (123) |
the first Borel–Cantelli lemma implies that the events occur only finitely often almost surely. Consequently, is almost surely finite, and the partial sums converge almost surely to .
Fix and consider the events
For any sample path such that , if , then for all sufficiently large ,
| (124) |
also holds, since . Therefore, the indicator functions satisfy
| (125) |
Taking expectations on both sides and applying Fatou’s lemma yields
| (126) |
By Corollary 4.6 of Mitzenmacher and Upfal (2005), for every ,
| (127) |
Consequently,
| (128) |
Combining the previous inequalities completes the proof. ∎