Normal Distribution Approximation of the Binomial Distribution

Author: Sophia Tutorial
Description:

Calculate the mean, standard deviation, or variance using the normal distribution approximation of a binomial distribution. 


Tutorial
what's covered
This tutorial will cover the normal distribution approximation of the binomial distribution. Our discussion breaks down as follows:

  1. Binomial Distribution
  2. Normal Distribution Approximation of the Binomial Distribution


1. Binomial Distribution

First, let's review the binomial distribution itself:

$$P(X = k) = \binom{n}{k} \, p^{k} \, (1 - p)^{n - k}$$

$$\binom{n}{k} = \frac{n!}{k! \, (n - k)!}$$

Using that formula, you can create a probability distribution for all the values of k: zero successes, one success, two successes, all the way up to n successes.

k          0          1          2          3          ...   n
P(X = k)   P(X = 0)   P(X = 1)   P(X = 2)   P(X = 3)   ...   P(X = n)

That can be made into a histogram, where the x-axis represents the values of k, the number of successes, and the y-axis is the probability (the long-run relative frequency) of each number of successes. The bar for each value of k rises to the height corresponding to its probability.
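
As a rough illustration of the table and histogram described above, the sketch below builds the full distribution from the binomial formula and prints a text histogram. The function names and the example values (6 trials, probability 1/6) are illustrative choices, not part of the original tutorial.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k): probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_distribution(n: int, p: float) -> list[float]:
    """Probabilities for k = 0, 1, ..., n, in that order."""
    return [binomial_pmf(k, n, p) for k in range(n + 1)]


# Illustrative values (an assumption for this sketch): 6 trials, p = 1/6.
probabilities = binomial_distribution(6, 1 / 6)
for k, prob in enumerate(probabilities):
    bar = "#" * round(prob * 50)  # bar length proportional to the probability
    print(f"{k:2d}  {prob:.4f}  {bar}")
```

The probabilities in the list add up to 1, which is a quick sanity check on any table built this way.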

Just like all distributions, that histogram is going to have a mean and a standard deviation. The mean is fairly obvious to calculate.

EXAMPLE

Suppose you rolled a die six times. How many threes would you expect? What if you rolled it 60 times or 600 times? How many threes would you expect?

What you should be thinking is if you rolled it six times, you'd expect one of those six rolls to be a three. If you rolled it 60 times you'd expect about 10 threes. If you rolled it 600 times, you'd expect about 100 threes.

In each case you multiplied the number of rolls by 1/6, because 1/6 is the probability of rolling a three: 6 times 1/6 is 1, 60 times 1/6 is 10, and 600 times 1/6 is 100. Each product is the expected value.

big idea
The average, or the expected value, is going to be the number of trials times the probability of success.
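
The die-rolling arithmetic can be checked with a short sketch. The function name is hypothetical, and exact fractions are used only so the results come out as whole numbers.

```python
from fractions import Fraction


def expected_successes(n_trials, p_success):
    """Expected number of successes: number of trials times probability."""
    return n_trials * p_success


p_three = Fraction(1, 6)  # probability of rolling a three on a fair die
for n_rolls in (6, 60, 600):
    print(n_rolls, "rolls ->", expected_successes(n_rolls, p_three), "threes expected")
```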

For a binomial distribution, we can identify the mean, standard deviation, and variance:

formula
Mean of a Binomial Distribution
$$\mu_x = np$$
Standard Deviation of a Binomial Distribution
$$\sigma_x = \sqrt{npq}$$

Variance of a Binomial Distribution
$$\sigma_x^2 = npq$$

Here q = 1 - p is the probability of failure on a single trial.
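
To see that these shortcut formulas agree with the definitions of mean and variance, the sketch below computes both ways and compares them. The function names and the example values (60 rolls, probability 1/6) are assumptions made for illustration.

```python
from math import comb, isclose, sqrt


def binomial_summary(n: int, p: float) -> tuple[float, float, float]:
    """Mean, variance, and standard deviation from np, npq, and sqrt(npq)."""
    q = 1 - p  # probability of failure on a single trial
    variance = n * p * q
    return n * p, variance, sqrt(variance)


def summary_from_pmf(n: int, p: float) -> tuple[float, float, float]:
    """The same three quantities, computed directly from the probabilities."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * prob for k, prob in enumerate(pmf))
    variance = sum((k - mean) ** 2 * prob for k, prob in enumerate(pmf))
    return mean, variance, sqrt(variance)


# Illustrative values: 60 rolls of a die, counting threes.
shortcut = binomial_summary(60, 1 / 6)
direct = summary_from_pmf(60, 1 / 6)
print("shortcut formulas:", shortcut)
print("direct from pmf:  ", direct)
print("agree:", all(isclose(a, b, rel_tol=1e-9) for a, b in zip(shortcut, direct)))
```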

2. Normal Distribution Approximation of the Binomial Distribution

Every distribution has three key features:

  1. Center
  2. Spread
  3. Shape

You just dealt with center and spread by finding the mean and the standard deviation.

formula
Center of a Binomial Distribution
$$\mu_x = np$$
Spread of a Binomial Distribution
$$\sigma_x = \sqrt{npq}$$

But what about the shape? The shape of this distribution is affected by two things: n and p.

Look at the following distributions for a high, low, and middle probability. Notice how the distribution changes as the probability and the number of trials change.

High Probability (p = 0.90)
[Figure: distribution for a probability of 0.90, 10 trials]
[Figure: distribution for a probability of 0.90, 100 trials]

When the probability of success is very high, the distribution is skewed very heavily to the left when the number of trials is low. But as the number of trials increases, the distribution is nearly symmetric. It's a little skewed to the left, but not heavily skewed as with the lower number of trials.

Low Probability (p = 0.15)
[Figure: distribution for a probability of 0.15, 10 trials]
[Figure: distribution for a probability of 0.15, 100 trials]

When the probability of success is very low, the distribution is skewed very heavily to the right when the number of trials is low. But as the number of trials increases, the distribution is now only slightly skewed to the right.

Middle Probability (p = 0.50)
[Figure: distribution for a probability of 0.50, 10 trials]
[Figure: distribution for a probability of 0.50, 100 trials]

When the probability of success is in the middle, near 0.50, the distribution is already nearly symmetric when the number of trials is low. As the number of trials increases, the distribution stays symmetric.

These patterns are the basis for the normal distribution approximation of the binomial distribution.

big idea
When n is high, the distribution is approximately normal. The only exceptions are when the value of p is very low or very high. When n is low, the skew, if any, is more prominent.
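
One way to put numbers on "skewed" is the skewness coefficient of a binomial distribution, (q - p) / sqrt(npq). That formula is a standard result brought in for this sketch, not something stated in the tutorial. Negative values indicate left skew, positive values indicate right skew, and values near zero indicate near symmetry.

```python
from math import sqrt


def binomial_skewness(n: int, p: float) -> float:
    """Skewness of a binomial distribution: (q - p) / sqrt(n * p * q)."""
    q = 1 - p
    return (q - p) / sqrt(n * p * q)


# The six cases pictured above: three probabilities, two sample sizes each.
for p in (0.90, 0.15, 0.50):
    for n in (10, 100):
        print(f"p = {p:.2f}, n = {n:3d}: skewness = {binomial_skewness(n, p):+.3f}")
```

For a fixed p, the size of the skewness shrinks as n grows, which matches the pattern in the pictures.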

This is a critical concept. When you have a large number of trials, the distribution of binomial probabilities is nearly normal, with a mean of np and a standard deviation of the square root of npq. In other words, the binomial distribution with parameters n and p looks a lot like the normal distribution with that same mean and that same standard deviation.

The number of trials has to be large enough to satisfy these two conditions:

  1. The mean, np, has to be at least 10.
  2. The expected number of failures, nq, has to be at least 10.

These conditions keep the bulk of the distribution away from both ends of the range: far enough from 0 on the left-hand side and far enough from n on the right-hand side. The distribution looks normal when it sits safely in the middle and not near either end. Both conditions have to be satisfied. When they are, many of these problems become much easier to work.
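
The two conditions can be written as a small check. The function name and the `threshold` parameter are hypothetical; the cutoff of 10 comes from the conditions listed above.

```python
def normal_approximation_ok(n: int, p: float, threshold: float = 10) -> bool:
    """True when both np and nq are at least the threshold."""
    q = 1 - p  # probability of failure
    return n * p >= threshold and n * q >= threshold


# A few illustrative cases (chosen for this sketch).
for n, p in [(95, 0.28), (10, 0.50), (100, 0.15)]:
    print(f"n = {n:3d}, p = {p:.2f}: np = {n * p:.1f}, nq = {n * (1 - p):.1f}, "
          f"ok = {normal_approximation_ok(n, p)}")
```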

EXAMPLE

Suppose a baseball player gets a hit 28% of the time when he comes to bat. What’s the probability that he gets over 30 hits in his next 95 at bats?

Using the old way, you would have to find the probability that he gets exactly 31 hits, plus the probability that he gets exactly 32, and so on, all the way up to the probability that he gets exactly 95 hits. If you completed all 65 calculations, you would get a probability of 0.185, or 18.5%.

$$P(X > 30) = \underbrace{P(X = 31) + P(X = 32) + \cdots + P(X = 95)}_{65 \text{ individual calculations}} = 0.185$$
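
The long sum is tedious by hand but short in code. The sketch below adds up the exact binomial probabilities for 31 through 95 hits. The function name is an illustrative choice, and the printed value is whatever the exact sum works out to, so it may differ slightly from a rounded figure.

```python
from math import comb


def binomial_tail_above(n: int, p: float, threshold: int) -> float:
    """P(X > threshold): sum of exact probabilities for threshold + 1 through n."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(threshold + 1, n + 1)
    )


n_at_bats, p_hit = 95, 0.28
print("terms in the sum:", len(range(31, n_at_bats + 1)))
print("P(X > 30) =", round(binomial_tail_above(n_at_bats, p_hit, 30), 4))
```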

The new way uses the normal approximation. First, check to see if the distribution is large enough to satisfy the two conditions above. We know there are 95 trials, the probability of success is 0.28, and the probability of failure is 1 minus 0.28, or 0.72.

$$\begin{aligned}
np &\geq 10 \\
(95)(0.28) &\geq 10 \\
26.6 &\geq 10
\end{aligned}
\qquad
\begin{aligned}
nq &\geq 10 \\
(95)(0.72) &\geq 10 \\
68.4 &\geq 10
\end{aligned}$$

Both conditions are satisfied: np and nq are both at least 10. We can now find the mean and standard deviation.

$$\begin{aligned}
\mu_x &= np = (95)(0.28) = 26.6 \\
\sigma_x &= \sqrt{npq} = \sqrt{(95)(0.28)(0.72)} = \sqrt{19.152} \approx 4.376
\end{aligned}$$

With a mean of 26.6 and a standard deviation of about 4.376, the binomial distribution looks very much like the normal distribution below, and we can use it to find the probability of getting over 30 hits.

[Figure: Normal Distribution: Mean of 26.6, Standard Deviation of 4.376]

Based on this normal distribution, calculate a z-score and use a z-table to find the probability.

$$\begin{aligned}
P(X > 30) &= P\left(z > \frac{30 - 26.6}{4.376}\right) \\
P(X > 30) &= P(z > 0.78) \\
P(X > 30) &= 1 - P(z < 0.78) \\
P(X > 30) &= 1 - 0.7823 \\
P(X > 30) &= 0.2177
\end{aligned}$$

This is in the same range as the 0.185 you would get from the binomial calculations, though about three percentage points higher. Most of that gap comes from fitting a continuous curve to whole-number counts: starting the tail at 30.5 instead of 30 gives a z-score of about 0.89 and a probability of about 0.187.
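
The z-score steps can also be done without a z-table. The sketch below computes the mean and standard deviation from n and p, then evaluates the standard normal curve through the error function in Python's `math` module. The function names are hypothetical. The results are unrounded, so they can differ in the third decimal place from values read off a table.

```python
from math import erf, sqrt


def standard_normal_cdf(z: float) -> float:
    """P(Z < z) for a standard normal variable, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def normal_approx_tail_above(n: int, p: float, x: float) -> tuple[float, float]:
    """z-score of x and the normal-approximation probability of exceeding x."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    z = (x - mean) / sd
    return z, 1 - standard_normal_cdf(z)


# The baseball example: 95 at bats, 28% chance of a hit, more than 30 hits.
z, tail = normal_approx_tail_above(95, 0.28, 30)
print(f"z = {z:.3f}, P(X > 30) is approximately {tail:.4f}")
```

Because the cut point `x` is a parameter, the same function can be reused for other thresholds in similar problems.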
term to know

Normal Distribution Approximation of the Binomial Distribution
If a random variable has a binomial distribution, the number of trials is sufficiently large, and the probability of success is not too close to 0 or 1, the variable's distribution can be approximately modeled using a normal distribution.

summary
The normal distribution is a good approximation for the binomial under certain conditions: n has to be large, and p has to be not too extreme, meaning not too high and not too low. You can use the mean and standard deviation of the binomial as the mean and standard deviation for the normal, and use z-scores to find the probabilities. This simplifies the problem.

Good luck!

Source: Adapted from Sophia tutorial by Jonathan Osters.
