diff --git a/docs/index.rst b/docs/index.rst
index 247402e..c6badbc 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -96,6 +96,14 @@ Advanced examples
 
     auto_examples/index
 
+***********************
+Stochastic mux analysis
+***********************
+.. toctree::
+    :maxdepth: 2
+
+    muxanalysis
+
 *************
 API Reference
 *************
diff --git a/docs/muxanalysis.rst b/docs/muxanalysis.rst
new file mode 100644
index 0000000..4cd4cb9
--- /dev/null
+++ b/docs/muxanalysis.rst
@@ -0,0 +1,195 @@
+.. _muxanalysis:
+
+Analysis of Stochastic Mux
+==========================
+
+:ref:`mux` objects (*mux* for short, *muxen* for plural) allow multiple :ref:`Streamer` objects to
+be combined into a single stream by selectively sampling from each constituent stream.
+The different kinds of mux objects provide different behaviors, and among
+them, ``StochasticMux`` is the most complex.
+This section provides an in-depth analysis of ``StochasticMux``'s behavior.
+
+
+
+Stream activation and replacement
+---------------------------------
+
+``StochasticMux`` differs from other muxen (``ShuffledMux``, ``RoundRobinMux``, etc.) by
+maintaining an **active set** of streamers from the full collection it is multiplexing.
+At any given time, samples are drawn only from the active set, while the remaining streamers are
+**inactive**.
+Each active streamer may produce only a limited (possibly random) number of samples, after which it is removed from
+the active set and replaced by a new streamer selected at random; hence the name **StochasticMux**.
+
+A key quantity to understand when using ``StochasticMux`` is the streamer replacement rate: how
+often should we expect streamers in the active set to be replaced, as a function of samples
+generated by the mux?
+This quantity is important for a couple of reasons:
+
+    * If we want the distribution of samples produced by ``StochasticMux`` to be a good
+      approximation of what we would get if all streamers were active simultaneously (i.e.,
+      ``ShuffledMux`` behavior), then the streamer replacement rate should be small.
+    * If we have large startup costs involved with activating a streamer (e.g., loading data
+      from disk), then streamer replacement should be infrequent to ensure high throughput.
+      What's more, replacement events should be spread out among the active set, to avoid having several replacement events in a short period of time.
+
+In the following sections, we'll analyze replacement rates for the different choices of rate
+distributions (``constant``, ``poisson``, and ``binomial``).
+We'll focus the analysis on a single (active) streamer at a time.
+The question we'll analyze is specifically: how many samples :math:`N` must we generate (in
+expectation) before a specific streamer is deactivated and replaced?
+Understanding the distribution of :math:`N` (its mean and variance) will help us understand how often
+we should expect to see streamer replacement events.
+
+
+Notation
+--------
+
+Let :math:`A` denote the size of the active set, let :math:`r` denote the number of samples
+generated by a particular streamer, and let :math:`p` denote the probability of selecting the
+active streamer in question.
+We'll make the simplifying assumption that the ``weights`` attached to all streamers are
+uniform, i.e., :math:`p = 1/A`.
+
+
+Constant distribution
+---------------------
+
+When using the ``constant`` distribution, the sample limit :math:`r` is fixed in advance.
+Our question about the number of samples generated by ``StochasticMux`` can then be rephrased
+slightly:
+how many samples :math:`K` must we draw from *all other active streamers* before drawing the
+:math:`r`\ th sample from the streamer under analysis?
+
+This number :math:`K` is a random variable, modeled by the negative binomial distribution:
+
+.. math::
+
+    \text{Pr}[K = k] = {k + r - 1 \choose k} {(1-p)^k p^r}
+
+
+It has expected value
+
+.. math::
+
+    \text{E}[K] = r \cdot \frac{1-p}{p},
+
+and variance
+
+.. math::
+
+    \text{Var}[K] = r \cdot \frac{1-p}{p^2}.
+
+
+The total number of samples produced by the mux before the streamer is replaced is now a random
+variable :math:`N = K + r`.
+We can use linearity of expectation to compute its expected value as
+
+.. math::
+
+    \text{E}[N] = \text{E}[K] + r = r \cdot\frac{1-p}{p} + r = \frac{r}{p}.
+
+
+Since :math:`N` and :math:`K` differ only by a constant (:math:`r`), they have the same
+variance:
+
+.. math::
+
+    \text{Var}[N] = \text{Var}[K].
+
+
+If we apply the simplifying assumption that streamers are selected uniformly at random
+(:math:`p = 1/A`), then we get the following:
+
+    * :math:`\text{E}[N] = r \cdot A`, and
+    * :math:`\text{Var}[N] = r \cdot A \cdot (A-1)`.
+
+In plain language, this says that the expected number of samples before a streamer is replaced scales like the product of the size of the active set and the number of samples per streamer.
+Making either of these values large implies that we should expect to wait longer to replace an active streamer.
+However, the variance of replacement event times is approximately **quadratic** in the size of the active set.
+This means that making the active set larger will increase the dispersion of replacement events away from the expected value.
+
+
+Poisson distribution
+--------------------
+
+In pescador version 2 and earlier, the sample limit :math:`r` was not a constant value, but a
+random variable :math:`R` drawn from a Poisson distribution with rate parameter :math:`\lambda`.
+The analysis above can mostly be carried over to handle this case, though it does not lead to a
+closed-form expression for :math:`\text{E}[N]` or :math:`\text{Var}[N]` because we must now
+marginalize over the variable :math:`R`:
+
+.. math::
+
+    \text{Pr}[K=k] &= \sum_{r=0}^{\infty} \text{Pr}[K=k, R = r]\\
+    &= \sum_{r=0}^{\infty} \text{Pr}[K=k ~|~ R = r] \times \text{Pr}[R=r]\\
+    &= \sum_{r=0}^{\infty} {k + r - 1 \choose k} {(1-p)^k p^r} \times \frac{\lambda^r e^{-\lambda}}{r!}
+
+
+While this distribution is still supported, it has been replaced as the default by a binomial
+distribution mode that is more amenable to analysis.
+
+Binomial distribution
+---------------------
+
+In the binomial distribution mode, :math:`R` is a random variable governed by a binomial
+distribution with parameters :math:`(m, q)`:
+
+.. math::
+
+    \text{Pr}[R=r] = {m \choose r} q^r (1-q)^{m-r}.
+
+(We will come back to determining values for :math:`(m, q)` later.)
+
+This distribution can be combined with the negative binomial distribution above to yield a
+straightforward computation of :math:`\text{Pr}[N=n]`:
+
+.. math::
+
+    \text{Pr}[N=n] &= \sum_{r=0}^{\infty} \text{Pr}[K=n-r ~|~ R= r] \times \text{Pr}[R=r]\\
+    &= \sum_{r=0}^{\infty} {n-1 \choose n-r} {\left(1-p\right)}^{n-r} p^r \cdot {m \choose r} q^r {(1-q)}^{m-r}.
+
+If we set :math:`q = 1-p`, this simplifies as follows:
+
+.. math::
+
+    \text{Pr}[N=n] &= \sum_{r=0}^{\infty} {n-1 \choose n-r} {\left(1-p\right)}^{n-r} p^r \cdot {m \choose r} {(1-p)}^r p^{m-r}\\
+    &= \sum_{r=0}^{\infty} {n-1 \choose n-r} {\left(1-p\right)}^n p^m {m \choose r}\\
+    &= {\left(1-p\right)}^n p^m {n + m - 1\choose n}.
+
+This distribution again has the form of a negative binomial with parameters :math:`(m, 1-p)`.
+If we further set
+
+.. math::
+
+    m = \frac{\lambda}{1-p}
+
+for an expected rate parameter :math:`\lambda > 0` (as in the Poisson case above), then
+:math:`N` is distributed as
+
+.. math::
+
+    N \sim \text{NB}\left(\frac{\lambda}{1-p}, 1-p\right),
+
+where NB denotes the negative binomial distribution.
+This yields:
+
+    - :math:`\text{E}[R] = \lambda`,
+    - :math:`\text{E}[N] = \lambda / p`, and
+    - :math:`\text{Var}[N] = \lambda / p^2`.
+
+The expected value matches the constant-mode case above, with the number of samples per streamer now a random variable with expectation :math:`\lambda`.
+The variance exceeds the constant-mode value :math:`\lambda \frac{1-p}{p^2}` by :math:`\text{Var}[R] / p^2 = \lambda / p`, which reflects the randomness of :math:`R`.
+In the special case where :math:`p=1/A`, we obtain
+
+    - :math:`\text{E}[N] = \lambda A`, and
+    - :math:`\text{Var}[N] = \lambda A^2`.
+
+
+Limiting case :math:`p=1`
+-------------------------
+
+
+Discussion and recommendations
+------------------------------
+