# Poisson Error Bars Histogram

Then we are given a random variable $M$ with some finite mean and variance, $\mu_M$ and $\sigma_M^2$, and want to talk about the distribution of the sum of $M$ of these ...

Try it out: the following Mathematica code generates bigN ($N$ above, and 10000 here by default) samples from a histogram bucket - the distribution given has 30% of data points in ...

Where do the error bars go?

So instead of thinking about our $N_k$ weights $W_j$, let's think about $N$ weights, $N_k$ of which are the original $W_j$, and the rest of which are...

## Histogram Errors

We only have one set of real data, so I tried using one model and generating a bunch (~30,000) of simulated images, drawing the pixel values from the model and randomizing ...

This makes perfect sense, really - every sample from the original population contributes to the sum of weights in this bin, it's just that lots of them contribute zero.
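The zero-padding view can be turned into a small calculation. The following is a minimal Python sketch, assuming the per-sample contributions are independent and identically distributed; the function name `bin_sum_and_variance` and the example numbers are hypothetical, chosen only for illustration. It uses the identity that the variance of a sum of $N$ such contributions is $N$ times the variance of one contribution, with the zeros entering only through $N$.

```python
import statistics


def bin_sum_and_variance(weights_in_bin, n_total):
    """Sum of weights in one bin, plus an estimate of that sum's variance.

    Each of the n_total samples contributes its weight if it landed in
    the bin and zero otherwise (assumption: contributions are i.i.d.).
    """
    s1 = sum(weights_in_bin)
    s2 = sum(w * w for w in weights_in_bin)
    # N * (mean(Z^2) - mean(Z)^2) for the zero-padded contributions Z_j;
    # the zeros add nothing to either sum, so only n_total is needed.
    variance = s2 - s1 * s1 / n_total
    return s1, variance


if __name__ == "__main__":
    n_total = 100
    in_bin = [1.0] * 30  # hypothetical bin holding 30 unit weights
    total, variance = bin_sum_and_variance(in_bin, n_total)

    # The same quantity computed directly from the explicitly padded list.
    padded = in_bin + [0.0] * (n_total - len(in_bin))
    direct = n_total * statistics.pvariance(padded)
    print(total, variance, direct)
```

The shortcut and the explicitly padded computation should agree, which is the point of the paragraph above: the samples outside the bin matter only through the total count.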

Plugging this into the boxed formula above, we get ... Phew.

Finally, for the variance of $N_k$, we note that we want essentially the variance of the fraction $N_k/N$ of data points in this bin, which suggests using the multinomial distribution, which ...
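For reference, the standard result for a single multinomial count is sketched below, where $p_k$ (a symbol assumed here) is the probability that one observation lands in bin $k$, estimated by $N_k/N$:

$$
\mathrm{Var}[N_k] = N p_k (1 - p_k) \approx N_k \left(1 - \frac{N_k}{N}\right)
$$

With, say, $N = 100$ and $N_k = 30$, this gives $30 \times 0.7 = 21$ rather than the Poisson value of $30$.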

## Poisson Errors

Here are two ways of looking at the problem:

And now if I take the PDF from one of those and compare it to the predicted model PDF, it gives a good chi-squared using the errors from the simulation, but ...

In addition, they look nicer when plotting with yscale='log'.

I then compute the mean and the standard deviation of each bin from these simulations. I'm fitting just using chi-square/least squares, but if I use $\sqrt{N}$ as the error bars then there is basically never a "good" fit (reduced chi-squared of 3 or 4 for ~200 ...

And now you know.

It creates metaSamples such histograms, and then compares the actual standard deviation of the gathered distribution with the (average) estimate of the variance according to the above formula.

The number of pixels in a bin is dependent upon the underlying model and how well your particular realization/image samples that underlying distribution.

Summary of random sum formulae

The two results we have proven are ... Note that we can explicitly see here how the two sources of error identified above contribute to the ...
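For reference, the standard mean and variance of a random sum can be written down by conditioning on $M$. This is a sketch in assumed notation: $S = X_1 + \dots + X_M$, where the $X_i$ are taken to be i.i.d. with mean $\mu_X$ and variance $\sigma_X^2$ and independent of $M$ (the symbols $S$, $\mu_X$ and $\sigma_X^2$ are introduced here for illustration).

$$
\begin{aligned}
\mathrm{E}[S] &= \mathrm{E}\big[\mathrm{E}[S \mid M]\big] = \mathrm{E}[M \mu_X] = \mu_M \mu_X, \\
\mathrm{Var}[S] &= \mathrm{E}\big[\mathrm{Var}[S \mid M]\big] + \mathrm{Var}\big[\mathrm{E}[S \mid M]\big] = \mu_M \sigma_X^2 + \sigma_M^2 \mu_X^2 .
\end{aligned}
$$

The first term of the variance comes from the spread of the individual summands, the second from the randomness in how many of them there are.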

... which is not a very flabbergasting result.

Suppose you are taking $N$ observations in total, and pick some particular bin, $k$.

This is pretty easy to work out using a so-called conditional expectation - the idea is that the expected value of an expression $f(M, X_1, \ldots)$ depending on a random variable $M$ can ...

This manifests itself if we consider $N_k/N$ small, as then the above variance is approximately $N_k$.

There is a marker in the center of each bin and each bin has the requisite Poisson error bar.
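The small-$N_k/N$ claim can be checked numerically. The following is a minimal Python sketch, not the Mathematica code described in the text. The function names `simulate_bin_counts` and `compare_variances`, the bin probabilities, and the sample sizes are assumptions chosen for illustration; the 30% bin echoes the bucket mentioned earlier. It draws many histograms, counts one bin in each, and compares the observed variance of the count with the averaged estimates $N_k(1 - N_k/N)$ and $N_k$.

```python
import random
import statistics


def simulate_bin_counts(p_bin, n_total, meta_samples, seed=0):
    """Return meta_samples counts of how many of n_total draws fall in one bin."""
    rng = random.Random(seed)
    counts = []
    for _ in range(meta_samples):
        # Each observation lands in the bin independently with probability p_bin.
        counts.append(sum(1 for _ in range(n_total) if rng.random() < p_bin))
    return counts


def compare_variances(p_bin, n_total=1000, meta_samples=1000, seed=0):
    """Observed variance of N_k versus two per-histogram variance estimates."""
    counts = simulate_bin_counts(p_bin, n_total, meta_samples, seed)
    observed = statistics.pvariance(counts)
    # Average, over the simulated histograms, of N_k * (1 - N_k / N).
    binomial = statistics.mean([c * (1 - c / n_total) for c in counts])
    # Average of the plain Poisson estimate, N_k.
    poisson = statistics.mean(counts)
    return observed, binomial, poisson


if __name__ == "__main__":
    for p in (0.3, 0.01):
        observed, binomial, poisson = compare_variances(p)
        print(f"p={p}: observed={observed:.1f}  "
              f"N_k(1-N_k/N)={binomial:.1f}  N_k={poisson:.1f}")
```

With the 30% bin, the plain $N_k$ estimate should come out noticeably larger than the observed variance, while for the 1% bin the two estimates should be close to each other and to the observed value.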

Because we no longer have any mention of $N_k$, and more importantly no horrible random sums.

