Channel: Sum of 'the first k' binomial coefficients for fixed $N$ - MathOverflow

Answer by Vili for Sum of 'the first k' binomial coefficients for fixed $N$

Since you are interested in information theory, you might want to consider the following bound. In the limit of large $N$ and setting $k/N = p$, there is a simple trick (I learned it from "Information, Physics, and Computation" by Mézard and Montanari, but it seems to be well known in the statistical physics community):

${N \choose k} \sim 2^{N H_2(p)}$

where $H_2(p) = -p\log_2(p) - (1-p)\log_2(1-p)$ is the binary entropy function. You can do most of the computations from this approximation (it also serves as an upper bound for $6\leq Nk(N-k)$) and get rather simple results, but if you want to be more precise you can work with Stirling's approximation and get

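$\frac{\sqrt{2\pi}}{e^2}\sqrt{\frac{1}{Np(1-p)}} 2^{N H_2(p)} \leq {N \choose k} \leq \frac{e}{2\pi}\sqrt{\frac{1}{Np(1-p)}} 2^{N H_2(p)}$

(valid for $1\leq k\leq N-1$; the constants come from applying the factorial bounds $\sqrt{2\pi}\,n^{n+1/2}e^{-n}\leq n!\leq e\,n^{n+1/2}e^{-n}$ to $N!$, $k!$ and $(N-k)!$).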
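As a quick sanity check of the entropy approximation, here is a minimal Python sketch that compares $\log_2 {N \choose k}$ with $N H_2(p)$ and with the two Stirling-type bounds. The helper names (`binary_entropy`, `log2_binomial_bounds`) are made up for this illustration, and the constants $\sqrt{2\pi}/e^2$ and $e/(2\pi)$ are the ones that follow from $\sqrt{2\pi}\,n^{n+1/2}e^{-n}\leq n!\leq e\,n^{n+1/2}e^{-n}$; everything is done in $\log_2$ to avoid overflow.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H_2(p) in bits, with H_2(0) = H_2(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def log2_binomial_bounds(N: int, k: int):
    """Return (lower, exact, upper) for log2 C(N, k), assuming 1 <= k <= N - 1."""
    p = k / N
    exponent = N * binary_entropy(p)                # log2 of 2^{N H_2(p)}
    log2_root = -0.5 * math.log2(N * p * (1 - p))   # log2 of sqrt(1 / (N p (1-p)))
    lower = math.log2(math.sqrt(2 * math.pi) / math.e ** 2) + log2_root + exponent
    upper = math.log2(math.e / (2 * math.pi)) + log2_root + exponent
    exact = math.log2(math.comb(N, k))              # math.log2 accepts big integers
    return lower, exact, upper


for N, k in [(20, 5), (100, 30), (1000, 250)]:
    lower, exact, upper = log2_binomial_bounds(N, k)
    print(f"N={N:5d} k={k:4d}  lower={lower:9.3f}  exact={exact:9.3f}  "
          f"upper={upper:9.3f}  N*H2={N * binary_entropy(k / N):9.3f}")
```

The gap between the exact value and $N H_2(p)$ grows only like $\frac{1}{2}\log_2 N$, which is why the plain entropy approximation is good enough for most exponent-level computations.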

In the case where $p<1/2 \iff k< N/2$, you can replace the sum by an integral. The terms closest to $i=k$ dominate, so the square-root prefactor can be evaluated at $p$, and the change of variable $x=i/N$ contributes a factor $N$:

$\sum_{i=0}^{k}{N \choose i}\lesssim \frac{e}{2\pi}\sqrt{\frac{1}{Np(1-p)}} \sum_{i=0}^{k} 2^{N H_2(i/N)}\approx \frac{e}{2\pi}\sqrt{\frac{1}{Np(1-p)}}\, N\int_{x=0}^p 2^{N H_2(x)}dx$

The integral can be bounded using the concavity of $H_2$, which keeps it below its tangent at $p$: $H_2(x)\leq H_2(p) + \left.\frac{dH_2(x)}{dx}\right|_{x=p}(x-p) = H_2(p) + r (x-p)$, where $r = -\log_2{\frac{p}{1-p}}>0$. Then

$\int_{x=0}^p 2^{N H_2(x)}dx \leq 2^{N H_2(p)-Nrp}\int_{x=0}^p 2^{Nrx}dx = 2^{N H_2(p)} \dfrac{1-\left(\frac{p}{1-p}\right)^{pN}}{N\ln\left(\frac{1-p}{p}\right)}$.

Note that the approximation becomes better and better with larger $N$, because most of the mass of the integral is at the approximation point. With large $k$, the term $\left(\frac{p}{1-p}\right)^{pN}$ is negligible and

$\sum_{i=0}^{k}{N \choose i} \lesssim \frac{e}{2\pi}\sqrt{\frac{1}{Np(1-p)}}\, \frac{2^{N H_2(p)}}{\ln\left(\frac{1-p}{p}\right)}$

which corresponds to extending the lower limit of the exponential integral to $-\infty$. Since replacing the sum by an integral is itself an approximation, read this as an estimate with the correct exponential rate rather than as a strict bound.
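To see how the $p<1/2$ estimate behaves, here is a minimal Python sketch that compares the exact partial sum with $\frac{e}{2\pi}\sqrt{\frac{1}{Np(1-p)}}\,2^{N H_2(p)}/\ln\frac{1-p}{p}$, i.e. the tangent-line integral with its lower limit pushed to $-\infty$ and the factor $N$ from the change of variable $x=i/N$ included. The function names are hypothetical, and the comparison is done in $\log_2$ so that large $N$ does not overflow.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H_2(p) in bits, for 0 < p < 1."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def log2_partial_sum_exact(N: int, k: int) -> float:
    """log2 of sum_{i=0}^{k} C(N, i), computed with exact integers."""
    return math.log2(sum(math.comb(N, i) for i in range(k + 1)))


def log2_partial_sum_estimate(N: int, k: int) -> float:
    """log2 of the tangent-line estimate; assumes 0 < k < N / 2."""
    p = k / N
    prefactor = (math.e / (2 * math.pi)) / math.sqrt(N * p * (1 - p))
    return math.log2(prefactor / math.log((1 - p) / p)) + N * binary_entropy(p)


for N, k in [(100, 25), (1000, 250), (1000, 400)]:
    exact = log2_partial_sum_exact(N, k)
    estimate = log2_partial_sum_estimate(N, k)
    print(f"N={N:5d} k={k:4d}  log2(exact)={exact:9.3f}  "
          f"log2(estimate)={estimate:9.3f}  ratio={2 ** (estimate - exact):.3f}")
```

The two exponents agree closely, while the ratio of estimate to exact sum stays of order one; it can fall below $1$ because an integral of an increasing function undershoots the corresponding sum.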

If $k>N/2$, we can solve this by Laplace's method, which is very similar, except that the expansion is done around $x=1/2$, since that is the region that contains most of the mass. There $H_2(1/2)=1$, the first derivative vanishes, and $\left.\frac{d^2H_2(x)}{dx^2}\right|_{x=1/2}=-\frac{4}{\ln 2}$, so

$\int_{x=0}^p 2^{N H_2(x)}dx \leq 2^{N}\int_{x=0}^p e^{\frac{N \ln(2)}{2}\left.\frac{d^2H_2(x)}{dx^2}\right|_{x=1/2}(x-0.5)^2}dx= 2^{N}\int_{x=0}^p e^{-2N(x-0.5)^2}dx \leq 2^{N}\sqrt{\frac{\pi}{2N}}\,\Phi\left(2\sqrt{N}(p-0.5)\right)$

where $\Phi$ is the CDF of the standard Gaussian distribution. Again, when $N$ is large and $p-0.5\gg1/\sqrt{N}$, $\Phi\left(2\sqrt{N}(p-0.5)\right)$ is almost $1$, so we can ignore it. After multiplying by the factor $N$ that arises when the sum over $i$ is written as an integral over $x=i/N$, and by the $O(1/\sqrt{N})$ Stirling prefactor at $x=1/2$, the powers of $N$ cancel and the whole thing is of order $2^N$, which simply means that most of the mass of that sum is on the central values ($k\approx N/2$).
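The Gaussian picture for $k>N/2$ can be checked numerically. The sketch below, with hypothetical helper names, compares the exact fraction $2^{-N}\sum_{i\leq k}{N \choose i}$ with $\Phi\left(2\sqrt{N}(p-0.5)\right)$, assuming the usual normalization in which the Gaussian has standard deviation $\sqrt{N}/2$ in $k$ and all $N$-dependent prefactors have cancelled.

```python
import math


def gaussian_cdf(z: float) -> float:
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def exact_fraction(N: int, k: int) -> float:
    """sum_{i=0}^{k} C(N, i) / 2^N, using exact integer arithmetic."""
    total = sum(math.comb(N, i) for i in range(k + 1))
    return total / 2 ** N  # big-int true division returns a float


def gaussian_fraction(N: int, k: int) -> float:
    """Gaussian (Laplace-method) approximation Phi(2 sqrt(N) (p - 1/2))."""
    p = k / N
    return gaussian_cdf(2.0 * math.sqrt(N) * (p - 0.5))


for N, k in [(1000, 505), (1000, 520), (1000, 600)]:
    print(f"N={N} k={k}  exact={exact_fraction(N, k):.6f}  "
          f"gaussian={gaussian_fraction(N, k):.6f}")
```

Once $p-0.5$ is several multiples of $1/\sqrt{N}$, both numbers are essentially $1$, which is the statement that the sum is of order $2^N$.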

