## Toss a coin – The Central Limit theorem

The central limit theorem of statistics is one of those facts of mathematics where it isn’t at all obvious why it should be true. It is, nevertheless, ubiquitous. Everything from mundane traffic patterns to the intensity level of a laser beam is governed by this law of nature.

The central limit theorem states that the many-times convolution with itself of any distribution with a finite mean and a finite variance will tend to a Gaussian. Put another way, if you add up many independent draws from such a distribution, the sum (suitably shifted and rescaled) is distributed like a Gaussian. If the jargon in that last sentence is unfamiliar, fret not. I’ll try to explain it with an example (which was on a homework for some course. Herr Roy, I want credit for this.)
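For those who prefer symbols, here is the standard textbook form of that statement for independent, identically distributed variables. The symbols $X_i$, $f$, $\mu$, $\sigma^2$ and $S_n$ are notation introduced here for illustration; none of them come from the homework.

```latex
S_n = X_1 + X_2 + \dots + X_n, \qquad
f_{S_n} = \underbrace{f * f * \dots * f}_{n \text{ copies}}, \qquad
\frac{S_n - n\mu}{\sigma\sqrt{n}} \;\xrightarrow{d}\; \mathcal{N}(0, 1)
\quad \text{as } n \to \infty
```

For a fair coin with $X_i = 1$ for heads and $0$ for tails, $\mu = 1/2$ and $\sigma^2 = 1/4$, so the number of heads in 100 tosses should be roughly Gaussian with mean 50 and variance 25.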

Let’s say you decide to toss a (fair) coin. You toss this coin a hundred times, note down how it lands each time, and tally up the results. You have a bunch of friends do the same thing, and have them give you their results as well. Or you realise that the number of friends you have isn’t nearly enough for a good experiment, and decide to have MATLAB do the coin-tossing experiment for you.

Either way, you note down how many times your coin landed heads-up out of 100, for each person/trial. This done, you plot a histogram of the results: for how many trials does the number of heads that turned up lie in certain bins? If you did get that statement of the central limit theorem, you would’ve realised that the number of heads in one trial is a sum of 100 individual tosses, so its distribution (a binomial) is just the single-toss distribution convolved with itself many times over (as many times as your number of tosses). The trials are repeated samples from that distribution, and the histogram is your estimate of it.
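The convolution picture can be checked directly, without tossing anything. The sketch below builds the distribution of the number of heads in 100 tosses by convolving the single-toss distribution with itself, using MATLAB's built-in `conv`, and compares it with a Gaussian of mean 50 and variance 25. The variable names are my own.

```matlab
single = [0.5 0.5];        % one fair toss: P(0 heads), P(1 head)
pmf = 1;                   % zero tosses: certainly 0 heads
nTosses = 100;
for t = 1:nTosses
    pmf = conv(pmf, single);   % distribution after one more toss
end
heads = 0:nTosses;         % pmf(h+1) is the probability of h heads
gauss = exp(-(heads - 50).^2 / (2*25)) / sqrt(2*pi*25);  % mean 50, variance 25
fprintf('largest gap between the two curves: %.2e\n', max(abs(pmf - gauss)));
plot(heads, pmf, 'o', heads, gauss, '-')
legend('100-fold convolution', 'Gaussian')
```

The circles are exact probabilities, not simulation results, so any gap between them and the line is the error of the Gaussian approximation itself.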

I, obviously, had to go the MATLAB way. Which isn’t really all that bad – it took me four lines of code.
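The original four lines are not reproduced in this post. What follows is only a guess at what such a simulation might look like, with the number of trials (`nTrials`) picked arbitrarily.

```matlab
nTrials = 10000;                               % number of people/trials (assumed value)
nTosses = 100;                                 % tosses per trial, as in the post
heads = sum(rand(nTosses, nTrials) < 0.5, 1);  % heads count for each trial
hist(heads, 0:nTosses)                         % one bin per possible heads count
xlabel('Number of heads out of 100'); ylabel('Number of trials')
```

Each column of the random matrix is one trial, so summing down the columns gives one heads count per trial.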

[Figure: histogram of the number of heads per trial, which looks Gaussian]

It is worth noting that the number of trials matters a lot more towards getting a decent-looking Gaussian histogram than the number of tosses in each trial: more trials smooth out the sampling noise in the histogram, while 100 tosses is already plenty for the underlying distribution to be close to a Gaussian. It is also worth remembering that the distribution we sampled from was a binomial distribution – the number of heads from a (fair) coin tossed a given number of times. Many processes in nature have Gaussian distributions to start with, as well.

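It can also be shown that in the limit of a large number of tosses and a fixed mean (which means the probability of heads on each toss has to shrink as the number of tosses grows, so this is no longer a fair coin), the binomial distribution ‘becomes’ a Poisson distribution. Further, when its mean is large, the Poisson distribution ‘looks’ Gaussian around that mean.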
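A quick numerical check of the binomial-to-Poisson limit, as a sketch: the mean `lambda = 5` and the values of `n` are arbitrary choices of mine, and the probabilities are computed through `gammaln` so that nothing beyond base MATLAB is needed.

```matlab
lambda = 5;                    % fixed mean (assumed value for illustration)
k = 0:20;                      % counts at which to compare the two distributions
for n = [10 100 10000]         % number of tosses; p shrinks so that n*p stays at lambda
    p = lambda / n;
    kk = k(k <= n);            % the binomial only lives on 0..n
    logBinom = gammaln(n+1) - gammaln(kk+1) - gammaln(n-kk+1) ...
               + kk*log(p) + (n-kk)*log(1-p);
    logPois = kk*log(lambda) - lambda - gammaln(kk+1);
    gap = max(abs(exp(logBinom) - exp(logPois)));
    fprintf('n = %5d: largest gap between binomial and Poisson = %.2e\n', n, gap);
end
```

The printed gap should shrink as `n` grows, which is the sense in which the binomial ‘becomes’ a Poisson.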

Addendum: Anubhab Roy points out that the tails of the distribution formed by many-times convolving a distribution with itself are only approximately described by a Gaussian. He says the Gaussian over-predicts the tails. I did not know this. Thank you, Meister.
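For the fair coin, the tails can be compared directly. The sketch below prints the ratio of the Gaussian value to the exact binomial probability at a few heads counts of my own choosing; a ratio above 1 means the Gaussian over-predicts there.

```matlab
n = 100;                           % tosses per trial, as in the post
k = [50 60 70 80 90 100];          % heads counts, moving away from the mean
% exact log-probability of k heads in n fair tosses
logBinom = gammaln(n+1) - gammaln(k+1) - gammaln(n-k+1) - n*log(2);
% log of a Gaussian density with mean n/2 and variance n/4
logGauss = -(k - n/2).^2 / (2*n/4) - 0.5*log(2*pi*n/4);
ratio = exp(logGauss - logBinom);  % Gaussian value divided by exact probability
fprintf('%3d heads: Gaussian/exact = %.3g\n', [k; ratio]);
```

At the extreme end (all 100 heads) the Gaussian value exceeds the exact probability of 2^-100 by several orders of magnitude. Beyond 100 heads the exact probability is zero while the Gaussian is not.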

End. Fini. Kaputski.

Aashish said, on November 3, 2010 at 11:04 am:

The Gaussian-looking histogram reminds me of this image (it’s some church in Iceland!):

http://www.stuckincustoms.com/2010/09/27/the-labyrinth-rocket/

Croor Singh said, on November 3, 2010 at 11:26 am:

Aashish means this church. It looks like a nice place to visit. The picture is brilliant too.

Nair said, on November 3, 2010 at 4:33 pm:

Which pattern is more common, HTH or HTT? I know the answer, but just wanted to know what happens over the course of a million tosses or so.

Ananth said, on November 3, 2010 at 8:12 pm:

“…that the tails of the distribution formed by many-times convolving a distribution with itself are only approximately described by a Gaussian.”

Why would the tails be any different from the heads? Bizarre!

Croor Singh said, on November 3, 2010 at 8:18 pm:

LK, One thu for you for that!

The deviation of the convolution from a Gaussian is larger for the regions of the sample space that are further away from the mean.

Croor Singh said, on November 4, 2010 at 8:01 pm:

@Nair’s question:

HHT and HTH are equally frequent in the long term. However, the average number of tosses required to get the first HHT is smaller than for the first HTH: 8 versus 10.

I wrote about it: http://croor.wordpress.com/2010/11/04/coin-toss-conundrum/
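The waiting-time claim is easy to test by simulation. In the sketch below, `nRuns` and the variable names are arbitrary choices of mine. It tosses a fair coin until the pattern first appears and averages the number of tosses over many runs.

```matlab
patterns = {'HHT', 'HTH'};
nRuns = 20000;                      % repeated experiments per pattern (assumed value)
for p = 1:numel(patterns)
    target = patterns{p};
    total = 0;
    for r = 1:nRuns
        last3 = '   ';              % the three most recent tosses
        tosses = 0;
        while ~strcmp(last3, target)
            if rand() < 0.5, c = 'H'; else, c = 'T'; end
            last3 = [last3(2:3) c]; % slide the window by one toss
            tosses = tosses + 1;
        end
        total = total + tosses;
    end
    fprintf('%s: average tosses until first appearance = %.2f\n', target, total / nRuns);
end
```

The two averages should come out close to 8 and 10, up to simulation noise.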

Jay said, on November 25, 2010 at 2:50 pm:

Can you attach the code that you used to produce the histogram?

Thank you

Croor Singh said, on November 25, 2010 at 11:41 pm:

@Jay, too much trouble. MATLAB has a random number generator. It also has a tool for histograms. Use them.

Jay said, on March 14, 2011 at 3:03 pm:

I’ve tried to do this in MATLAB. All I am getting is the histogram but no smooth line on top.

See code below:

    numExperiments = 1000;                  % number of experiments
    numTosses = 100;                        % coin tosses per experiment
    toplot = zeros(1, numExperiments);      % final tally of each experiment
    for m = 1:numExperiments
        trial = zeros(1, numTosses + 1);    % running tally; trial(1) = 0 is the starting point
        for n = 2:numTosses + 1             % for each coin toss (2:101 = 100 coin tosses)
            a = rand();                     % choose random number, a, between 0 and 1
            if a < 0.5                      % smaller than 0.5: assign it 'Heads'
                trial(n) = trial(n-1) + 1;  % therefore increase the tally by one
            else                            % bigger than or equal to 0.5: assign it 'Tails'
                trial(n) = trial(n-1) - 1;  % therefore decrease the tally by one
            end
        end
        toplot(m) = trial(end);             % tally after the last toss (trial(n-1) stopped one toss short)
    end
    centers = min(toplot):2:max(toplot);    % after 100 tosses the tally is always even
    figure(1)
    hist(toplot, centers)                   % one bin per possible tally
    title('Histogram of 1,000 Experiments of 100 Coin Tosses')
    xlabel('Heads minus tails after 100 tosses')
    ylabel('Number of experiments')
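On the missing smooth line: `hist` only draws bars, so the curve has to be plotted separately. One option is to overlay the Gaussian that the central limit theorem predicts. This is a sketch of that, not the code used for the post. It assumes the quantity being histogrammed is heads minus tails after 100 tosses, as in the code above, which has mean 0 and variance 100. The data are regenerated here in vectorised form so that the snippet runs on its own.

```matlab
numExperiments = 1000;
numTosses = 100;
% heads minus tails for each experiment (+1 for heads, -1 for tails)
toplot = sum(2*(rand(numTosses, numExperiments) < 0.5) - 1, 1);
centers = -numTosses:2:numTosses;   % possible tallies are the even numbers
binWidth = 2;
hist(toplot, centers)
hold on
x = linspace(-numTosses, numTosses, 400);
gauss = exp(-x.^2 / (2*numTosses)) / sqrt(2*pi*numTosses);  % mean 0, variance numTosses
% scale the density to expected counts per bin: experiments times bin width
plot(x, numExperiments * binWidth * gauss, 'r', 'LineWidth', 2)
hold off
xlim([-40 40])
xlabel('Heads minus tails after 100 tosses'); ylabel('Number of experiments')
```

The scaling by `numExperiments * binWidth` is what puts the curve on the same vertical scale as the bar heights. Without it the density would sit almost flat along the bottom of the plot.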