Let \(f\left(x\right)\) be a density and \(\pi\left(x\right)\) be a function satisfying \(0\leq\pi\left(x\right)\leq1\). In other words, \(\pi\left(x\right)\) is a probability for every \(x\). Then \(g\left(x\right)\propto f\left(x\right)\pi\left(x\right)\) is density, since \(\rho=\int f\left(x\right)\pi\left(x\right)dx<1\). This is an example of a , a class of models introduced by Rao (1965). Since \(p\left(x\right)\) is a probability, we can call this a . This note views rejection sampling (Neumann 1951) as sampling from a particular sort of probability weighted density. The intuitively correct way to sample from a probability weighted density is by using a natural form of rejection sampling. The sampling consists of two steps: First sample a proposal \(y\) from \(f\), then accept it with probability \(\pi\left(y\right)\).


The following methods gives samples from the probability weighted density \(g=\rho^{-1}f\left(x\right)\pi\left(x\right)\): i.) Sample a proposal \(y\) from \(f\), ii.) accept it with probability \(\pi\left(y\right)\). Moreover, if \(T\) is the number of proposals \(y\) until success, then \(T\) is geometric with probability parameter \(\rho\).


Let \(\left\{ Y_{i}\right\} _{i=1}^{\infty}\) be a sequence of proposals sampled from \(f\), let \(\left\{ Z_{i}\right\} _{i=1}^{\infty}\) be conditionally \(\textrm{Bernoulli}\left(\pi\left(Y_{i}\right)\right)\), and let \(T\) be the first time \(Z_{i}=1\). Letting \(V=Y_{T}\), the mathematical formalization of the algorithm above is to let \(V\) be our sample. Our goal is to prove that \(p\left(v\right)=g\left(v\right)\), where \(p\left(v\right)\) is the density of \(V\). First observe that \(p\left(T=1,v\right)=\pi\left(v\right)f\left(v\right)\). Using this observation we see that \(p\left(T=1\right)=\int f\left(y\right)\pi\left(y\right)dy=\rho\), and since the attempts are independent, the distribution of \(T\) is geometric. As \(V\) is independent of \(T\), \(p\left(v\mid T=1\right)=p\left(v\right)\), and since \(p\left(v\mid T=1\right)=p\left(T=1,v\right)\rho^{-1},\)we get that \(p\left(v\right)=\rho^{-1}\pi\left(v\right)f\left(v\right)\) as claimed.

Connection to rejection sampling

Assume \(f,g\) are densities such that \(f\left(x\right)\leq Mg\left(x\right)\) for every \(x\). Then \(f\) is equal to the probability weighted model \(f\left(x\right)\propto g\left(x\right)p\left(x\right)\), where \(p\left(x\right)=\frac{f\left(x\right)}{Mg\left(x\right)}\). The usual rejection sampling methodology follows, together with the common remark (in e.g. Robert and Casella (2013)) that \(M\) is the mean number of proposals before rejection.


Neumann, John von. 1951. “Various Techniques Used in Connection with Random Digits.” Applied Math Series 12 (36-38): 1.

Rao, C Radhakrishna. 1965. “On Discrete Distributions Arising Out of Methods of Ascertainment.” Sankhyā: The Indian Journal of Statistics, Series A, 311–24.

Robert, Christian, and George Casella. 2013. Monte Carlo Statistical Methods. Springer Science & Business Media.