## Quick facts about the normal curve

#### Also called the Bell curve (see note 1) or Gaussian distribution

This page was produced somewhat off-the-cuff in response to two e-mail queries. Purists will find it oversimplified. Any basic statistics textbook will give you much more depth, but here's a quick list of features:

• The formula for the curve is $y = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$. The shape of the curve is like the silhouette of a bell, hence the name "the bell curve".
• This curve lies entirely above the horizontal axis, and that axis is an asymptote in both horizontal directions (i.e. as x grows large and positive or large and negative, the curve approaches arbitrarily close to the axis, but never reaches it).

• The area between the curve and the horizontal axis is exactly 1. Note that this is the area of a region that is infinitely wide, since the curve never actually touches the axis.

• (Perhaps most important): Many, many, many chance experiments, if repeated long enough, will generate histograms that approximate the shape of the normal curve.  For example:

• The number of heads obtained in 100 tosses of a coin (even if the coin is moderately biased): Count the number of heads in each of many 100-toss trials, and make a histogram of the number of heads obtained in each trial. Do it long enough, and the histogram will be approximately the normal curve, except (for a fair coin) shifted right 50 units and stretched horizontally by a factor of 5. Also, the histogram will be a bit "jaggy", since one can obtain 50 or 51 or 52 heads in a 100-toss trial, but not 51.4 heads.

• The total number of spots seen in 50 rolls of a die (similar to above).

• Choose 200 people randomly from the Philadelphia population. Repeat 99 more times to get 100 different samples of 200 people each. Find the average weight of the 200 people in each sample, to get 100 different averages. Make a histogram of those averages. The histogram will be approximately the normal curve, though shifted right (I'd guess by about 140 pounds) and stretched horizontally by a factor of about, say, 5 (in pounds). The stretch is small because averages tend to cancel out extremes of variation.
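The coin-toss experiment above is easy to try on a computer. The following is a minimal sketch, not a definitive implementation: the function name `simulate_head_counts`, the 10,000 trials, and the fixed seed are illustrative assumptions, and only the Python standard library is used. If the claim above holds, the printed mean should come out close to 50 and the standard deviation (the "stretch") close to 5.

```python
import random
import statistics
from collections import Counter

def simulate_head_counts(trials=10_000, tosses=100, p_heads=0.5, seed=0):
    """Return the number of heads seen in each of `trials` runs of `tosses` tosses."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.random() < p_heads for _ in range(tosses))
        for _ in range(trials)
    ]

counts = simulate_head_counts()
mean = statistics.mean(counts)
sd = statistics.pstdev(counts)
print(f"mean = {mean:.2f}, standard deviation = {sd:.2f}")

# Crude text histogram: one '#' per 20 trials that produced that many heads.
histogram = Counter(counts)
for heads in range(35, 66, 3):
    print(f"{heads:3d} {'#' * (histogram[heads] // 20)}")
```

Changing `p_heads` to, say, 0.6 lets you see the "moderately biased" case: the bell shape persists, but its center moves.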

• This last fact (#4) is why the normal curve is so important to statisticians.  Many, many calculations about chance can be approximated very well with the normal curve.  This is particularly important when looking at a sample from a population (of people, cats, or computer chips).

• For example, if a random sample of 500 Philadelphia voters shows 74% will vote for Candidate A, there is a chance that this sample is completely non-representative of the overall Philadelphia voting population. In fact, if the sample was really random (everyone in Philly having an equal chance of being picked for the sample), the chance of the sample not representing the population reasonably well is low...

• And the punchline: the normal curve is a tool a statistician can use to tell how far the sample is likely to be off from the overall population, i.e. how big a "margin of error" there is likely to be in his/her poll.

• Another example: I test 200 tires from a production run, by wearing them out, to see how many miles they last. I select those 200 at random from the entire production run. I can't test the entire production run (because I can't sell tested, i.e. worn-out, tires). Again, my sample may be unrepresentative, but the normal curve will give me a way to estimate the likely margin of error.
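For the voter poll above (500 people sampled, 74% for Candidate A), the margin of error can be estimated with the usual normal-approximation formula for a sample proportion. This is a rough sketch under stated assumptions: the sample is a simple random sample, and the 95% confidence level (multiplier `z = 1.96`) is my choice, since the text above does not name a level. The function name `margin_of_error` is likewise just illustrative.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion.

    p_hat: proportion observed in the sample (0.74 for the poll above)
    n:     sample size
    z:     normal-curve multiplier; 1.96 corresponds to about 95% (an assumption)
    """
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return z * standard_error

moe = margin_of_error(0.74, 500)
print(f"74% plus or minus {100 * moe:.1f} percentage points")
```

Under these assumptions the margin comes out to a bit under 4 percentage points. Quadrupling the sample size only halves it, because `n` sits under a square root.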

The normal curve is often called the Gaussian distribution, after Carl Friedrich Gauss, who discovered many of its properties. Gauss, commonly viewed as one of the greatest mathematicians of all time (if not the greatest), is properly honored by Germany on its 10 Deutschmark bill.

On the bill, the normal curve appears to his left. There the formula for the curve has been modified to shift its center to $\mu$ on the x-axis, and to arrange that its inflection points are at $\mu - \sigma$ and $\mu + \sigma$: $y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}$. The factor $\frac{1}{\sigma\sqrt{2\pi}}$ in front arranges that the area under the curve remains equal to 1.
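The two claims about the shifted and stretched curve (area still 1, inflection points one "stretch" either side of the center) can be checked numerically. The sketch below writes `mu` for the center and `sigma` for the distance from the center to each inflection point, which are the conventional symbols. The values 50 and 5, the helper names, and the step sizes are illustrative assumptions.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve centered at mu and stretched by sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the area under f between a and b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def second_derivative(f, x, h=1e-3):
    """Central finite-difference estimate of the curvature f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

mu, sigma = 50.0, 5.0  # illustrative values only

def curve(x):
    return normal_pdf(x, mu, sigma)

# The tails beyond 10 sigma are negligible, so this should print about 1.
total_area = area(curve, mu - 10 * sigma, mu + 10 * sigma)
print(f"area = {total_area:.6f}")

# At an inflection point the curvature changes sign: the curve bends
# downward just inside mu + sigma and upward just outside it.
inside = second_derivative(curve, mu + 0.9 * sigma)
outside = second_derivative(curve, mu + 1.1 * sigma)
print(f"curvature inside: {inside:.2e}, outside: {outside:.2e}")
```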

### Other miscellany:

Some numbers from the graph of the curve:

| x | -3 | -2 | -1 | -0.5 | 0 | 0.5 | 1 | 2 | 3 |
|---|---|---|---|---|---|---|---|---|---|
| y | 0.0044 | 0.054 | 0.242 | 0.352 | 0.399 | 0.352 | 0.242 | 0.054 | 0.0044 |

Some areas under the curve:

| horizontal interval | -1 to 1 | -2 to 2 | -3 to 3 |
|---|---|---|---|
| area under curve for that interval | 0.68 | 0.95 | 0.997 |

(All numbers above are rounded.)
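The heights and areas listed above can be reproduced with a few lines of Python. This is a small sketch using only the standard library. The names `phi` and `area_between` are my own, and the areas use the standard identity relating the normal curve's cumulative area to the error function `math.erf`.

```python
import math

def phi(x):
    """Height of the standard normal curve at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def area_between(a, b):
    """Area under the standard normal curve between a and b."""
    def cumulative(t):
        # Area to the left of t, expressed through the error function.
        return 0.5 * (1 + math.erf(t / math.sqrt(2)))
    return cumulative(b) - cumulative(a)

for x in (0, 0.5, 1, 2, 3):
    print(f"x = {x:>3}: y = {phi(x):.4f}")

for k in (1, 2, 3):
    print(f"area from -{k} to {k}: {area_between(-k, k):.4f}")
```

Because the curve is symmetric about 0, the heights at negative x match those at the corresponding positive x.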

#### Notes:

1) In the 1990s the name "Bell Curve" acquired a bad reputation due to a controversial book of the same title. The name had probably been around for a century before the book was written, and it will remain long after the book is forgotten.