Hi. This tutorial covers the normal distribution approximation of the binomial distribution. So let's try to make a little bit of sense of what we're even talking about here. So if a random variable has a binomial distribution, the number of trials is sufficiently large, and the probability of success is not too close to 0 or 1, the variable's distribution can be approximately modeled with a normal distribution.
So if we already know we have a binomial distribution, the number of trials is large, and the probability of success is not too close to 0 or 1, so somewhere in the middle, then instead of using a binomial distribution, we can use a normal distribution. Remember, we know a lot of good things about normal distributions. We know the 68-95-99.7 rule. We have z-tables that can help us calculate probabilities. So it's going to be nice to be able to use a normal distribution instead of having to use a binomial distribution.
So before we get too far, let's make sure that we're working with the right variables here. So what we're going to do is, we're going to let n represent the number of trials in a binomial setting. We're going to let p represent the probability of success. And then we're going to let q equal the probability of failure. Since we are going to have a binomial setting here, we are only going to have two outcomes, success or failure. So if we know the probability of success, we can take 1 minus that probability to get q, which represents the probability of failure. In other words, q equals 1 minus p.
Now, we said n needs to be large and p has to be not too close to 0 or 1. So two conditions that kind of help us decide if n is big enough and if the probability of success is not too close to 0 or 1 are these two. So a variable with a binomial distribution can be approximated with a normal distribution if n times p is greater than or equal to 10 and n times q is greater than or equal to 10. So if these conditions are met, we can use a normal distribution to model a binomial distribution.
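If you'd like to see that check as code, here is a minimal Python sketch. The function name `normal_approx_ok` and the `threshold` parameter are made up for illustration, and the numbers in the usage lines are hypothetical, not from the tutorial.

```python
def normal_approx_ok(n, p, threshold=10):
    """Return True when both n*p and n*q reach the threshold (a rule of thumb)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be a probability between 0 and 1")
    q = 1 - p  # probability of failure
    return n * p >= threshold and n * q >= threshold


# Hypothetical numbers, just to show the check passing and failing.
print(normal_approx_ok(200, 0.5))  # n*p = 100 and n*q = 100, so both are at least 10
print(normal_approx_ok(20, 0.1))   # n*p = 2, which is below 10
```

Keeping the threshold as a parameter is just a convenience, in case you want to try a different cutoff than the 10 used here.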
Now, if these conditions are met, the following will be true for the normal distribution we use as the approximation. So mu, which is the mean, is equal to n times p. And the variance, sigma squared, is equal to n times p times q. Generally when we're describing a distribution, it's better to use the standard deviation. So the standard deviation, sigma, is just simply going to be the square root of n times p times q. So we'll probably use the standard deviation more frequently than we'll use the variance there.
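Written out in symbols, with n, p, and q as defined earlier, those statements look like this:

```latex
\begin{align*}
q &= 1 - p \\
\mu &= np \\
\sigma^2 &= npq \\
\sigma &= \sqrt{npq}
\end{align*}
```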
So let's take a look at an experiment, chance experiment. So we're going to randomly select a Visa credit card holder. We're then going to ask the person if they pay their credit card bill in full each month. So in full means you pay off the entire balance. You don't leave any balance for the next month to pay.
So suppose that the probability of a person paying their credit card bill in full each month is 25%. Consider 100 trials of this experiment. So basically, we're going to randomly select 100 different people and ask them the question here. So we want to know, is this a binomial setting?
So let's remember what the four conditions are-- so condition one, you need to have a fixed number of trials with only two outcomes. Is that satisfied here? Yes. We have 100 trials, and only two outcomes. They're either going to pay their credit card bill in full each month or they're not going to. So condition one is met.
Condition two-- the trials are independent. Yeah, chances are if you ask somebody about their credit card balance and then you ask somebody else about their credit card balance, the response of the first person is going to have no effect on the response of the second. So I would say trials are independent here.
Condition three is that the probability of the outcomes does not change. So we're going to assume that 25% is the probability for all the trials here. And condition four is that the variable of interest is the number of successes. So yeah, we would be counting how many people pay their bill in full. So out of 100, we would want to know how many pay their bill in full.
So it is a binomial distribution because those four conditions were met. Now, can this binomial distribution be approximated with a normal distribution? So remember, we need to test n times p greater than or equal to 10 and n times q greater than or equal to 10. So we know that n is 100. p, the probability of success, is 0.25. And q, the probability of failure, is the complement, so 0.75.
So let's figure out what n times p is first. So n times p will be 100 times 0.25. And 100 times 0.25 is equal to 25. So 25 is greater than or equal to 10. Well, actually it's just greater than 10.
Now let's do n times q, so 100 times 0.75. That's going to equal 75, which is greater than 10. So both conditions were met. So the answer to that question is yes. We can model this binomial distribution with the normal distribution because n times p and n times q are both greater than or equal to 10.
So if so, what are some features of this normal distribution? So the features we're talking about here are the mean and the standard deviation. So the mean, we know, is n times p. We already calculated 100 times 0.25 up here, so that means the mean would be 25 people.
We would expect that out of the 100 trials, the 100 people we ask, about 25 will pay their credit card bill in full. That number should make sense, because 25% of 100 people is 25 people.
All right, let's also do the standard deviation. So the standard deviation is going to be the square root of n times p times q, so 100 times 0.25 times 0.75. Let's go to the calculator to do this. I'll multiply the numbers first, so 100 times 0.25 times 0.75. That gives me 18.75.
And then I need to do the square root of that number. So the square root ends up being about 4.33. So standard deviation is approximately equal to 4.33.
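If you don't have a calculator handy, the same arithmetic can be sketched in a few lines of Python. The variable names here are my own choices, not anything from the tutorial.

```python
import math

n = 100    # number of trials in the example
p = 0.25   # probability of paying the bill in full
q = 1 - p  # probability of not paying in full

mean = n * p                   # mu = n * p
variance = n * p * q           # sigma squared = n * p * q
std_dev = math.sqrt(variance)  # sigma = square root of n * p * q

print(mean)               # 25.0
print(variance)           # 18.75
print(round(std_dev, 2))  # 4.33
```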
So now, if we were to make a picture of this distribution, I'm going to draw my normal distribution. This is going to represent x. x is the number of people that pay their bill in full.
And we'll mark off a couple of the important landmarks of a normal distribution. So the mean, 25, goes in the center. If we add one standard deviation, so 25 plus 4.33, this then will be 29.33, one standard deviation above the mean. Now, we'll go ahead and add one more standard deviation there. And that'll give me 33.66.
So now to get one standard deviation below the mean, let's do 25 minus 4.33. And that's 20.67. And then we'll subtract another standard deviation, and get 16.34.
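Those landmarks can also be generated in one go. This is a small Python sketch under the same numbers as the example (n of 100, p of 0.25); the name `landmarks` is just for illustration. It uses the unrounded standard deviation, which happens to give the same values at two decimal places.

```python
import math

mean = 100 * 0.25                       # 25.0
std_dev = math.sqrt(100 * 0.25 * 0.75)  # about 4.33

# Landmarks at -2, -1, 0, +1, +2 standard deviations from the mean.
landmarks = [round(mean + k * std_dev, 2) for k in range(-2, 3)]
print(landmarks)  # [16.34, 20.67, 25.0, 29.33, 33.66]
```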
So now this helps us determine a couple of things about people paying their credit card bill. According to the 68-95-99.7 rule, we know that about 68% of the time that we sample 100 people, we're going to get between 20.67 and 29.33 people that pay their credit card bill in full each month. Again, if we sampled 100 people, about 95% of the time, we're going to get between 16.34 and 33.66 people that pay their credit card bill in full. So there's a lot of different information we can get by using this normal distribution.
So that has been the tutorial on the normal distribution approximation of the binomial distribution. Thanks for watching.
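If you're curious how close the approximation is, one way to check is to add up the exact binomial probabilities inside each interval and put them next to the 68% and 95% figures. This is only a sketch: the helper names `binom_pmf`, `exact_within`, and `normal_within` are made up for illustration, and it relies on `math.comb` and `statistics.NormalDist` from the Python standard library (version 3.8 or later).

```python
import math
from statistics import NormalDist

n, p = 100, 0.25
mean = n * p
std_dev = math.sqrt(n * p * (1 - p))


def binom_pmf(k):
    """Exact binomial probability of exactly k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_within(num_sd):
    """Exact probability that the count lands within num_sd standard deviations of the mean."""
    low = math.ceil(mean - num_sd * std_dev)    # smallest whole count inside the interval
    high = math.floor(mean + num_sd * std_dev)  # largest whole count inside the interval
    return sum(binom_pmf(k) for k in range(low, high + 1))


def normal_within(num_sd):
    """Probability that a normal variable lands within num_sd standard deviations of its mean."""
    z = NormalDist()  # standard normal distribution
    return z.cdf(num_sd) - z.cdf(-num_sd)


for num_sd in (1, 2):
    print(num_sd, round(exact_within(num_sd), 3), round(normal_within(num_sd), 3))
```

The exact figures won't land precisely on 68% and 95%, partly because the number of people can only be a whole number, so the interval from 20.67 to 29.33 really means 21 through 29 people.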
Terms to Know
Normal Distribution Approximation of the Binomial Distribution
If a random variable has a binomial distribution, the number of trials is sufficiently large, and the probability of success is not too close to 0 or 1, the variable's distribution can be approximately modeled using a normal distribution.