Lyndon Appalsamy Posted September 9, 2014

Hi Guys,

I'm hoping one of you can assist me. I would like the text below translated into layman's terms so the average person (me) can understand the calculation:

If 50% of all the people in a population of 20000 people drink coffee in the morning, and if you were to repeat the survey of 377 people ("Did you drink coffee this morning?") many times, then 95% of the time, your survey would find that between 45% and 55% of the people in your sample answered "Yes".

The remaining 5% of the time, or for 1 in 20 survey questions, you would expect the survey response to be more than the margin of error away from the true answer.

When you survey a sample of the population, you don't know that you've found the correct answer, but you do know that there's a 95% chance that you're within the margin of error of the correct answer.

Try changing your sample size and watch what happens to the alternate scenarios. That tells you what happens if you don't use the recommended sample size, and how the margin of error (M.O.E.) and confidence level (that 95%) are related.

To learn more if you're a beginner, read Basic Statistics: A Modern Approach and The Cartoon Guide to Statistics. Otherwise, look at the more advanced books.

In terms of the numbers you selected above, the sample size n and margin of error E are given by

x = Z(c/100)^2 * r(100 - r)
n = N x / ((N - 1)E^2 + x)
E = Sqrt[(N - n)x / (n(N - 1))]

where N is the population size, r is the fraction of responses that you are interested in (entered as a percentage, like E), and Z(c/100) is the critical value for the confidence level c.

If you'd like to see how we perform the calculation, view the page source. This calculation is based on the Normal distribution, and assumes you have more than about 30 samples.
About Response distribution: If you ask a random sample of 10 people if they like donuts, and 9 of them say, "Yes", then the prediction that you make about the general population is different than it would be if 5 had said, "Yes", and 5 had said, "No". Setting the response distribution to 50% is the most conservative assumption. So just leave it at 50% unless you know what you're doing. The sample size calculator computes the critical value for the normal distribution. Wikipedia has good articles on statistics.
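For anyone who wants to try the numbers, here is a minimal Python sketch of the three formulas quoted above. It rests on assumptions that are not stated in the quoted text: r and E are entered as percentages (which is what r(100 - r) suggests), the critical value is taken from the standard library's `NormalDist`, and the result is rounded up to a whole person. The function names are made up for illustration and are not the calculator's own code.

```python
from math import ceil, sqrt
from statistics import NormalDist


def critical_value(c):
    """Two-sided critical value Z for a confidence level c given in percent."""
    return NormalDist().inv_cdf(0.5 + c / 200)


def sample_size(N, r, E, c):
    """Recommended sample size n; r and E are percentages (50 means 50%)."""
    x = critical_value(c) ** 2 * r * (100 - r)
    return N * x / ((N - 1) * E ** 2 + x)


def margin_of_error(N, n, r, c):
    """Margin of error E (in percentage points) for a sample of size n."""
    x = critical_value(c) ** 2 * r * (100 - r)
    return sqrt((N - n) * x / (n * (N - 1)))


# How the recommended sample size changes with the response distribution r.
for r in (10, 30, 50, 70, 90):
    n = sample_size(20000, r, 5, 95)
    print(f"r = {r}%: n = {ceil(n)}")

# Going the other way: margin of error for the sample of 377.
print(f"E for n = 377: {margin_of_error(20000, 377, 50, 95):.2f}")
```

With N = 20000, E = 5 and c = 95, the r = 50% row gives the 377 quoted in the post, and every other value of r gives a smaller n. That is the sense in which 50% is the most conservative setting.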
studiot Posted September 9, 2014

If 50% of all the people in a population of 20000 people drink coffee in the morning, and if you were to repeat the survey of 377 people ("Did you drink coffee this morning?") many times, then 95% of the time, your survey would find that between 45% and 55% of the people in your sample answered "Yes".

Let us break it down into easy steps. Did you understand this statement, ignoring how the numbers themselves were calculated?
imatfaal Posted September 9, 2014

! Moderator Note

Please stop posting on sampling bias. That branch has been moved to applied maths.
Lyndon Appalsamy Posted September 10, 2014 Author

Hi Studiot,

No, in actual fact I don't understand it completely. My boss wants an explanation of the actual formula in layman's terms. We know the end result, but we want to know what happens in between to get to that result:

x = Z(c/100)^2 * r(100 - r)
n = N x / ((N - 1)E^2 + x)
E = Sqrt[(N - n)x / (n(N - 1))]
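As a sketch of "what happens in between", the arithmetic can be written out with the numbers from the first post: N = 20000, r = 50, E = 5, c = 95. The value Z ≈ 1.96 is the usual rounded critical value for 95% confidence; it is an assumption here, since the thread does not state which value the calculator uses, and r and E are taken as percentages.

```latex
\begin{align*}
Z(0.95) &\approx 1.96 \\
x &= Z^2 \, r(100 - r) = 1.96^2 \times 50 \times 50 = 3.8416 \times 2500 = 9604 \\
n &= \frac{N x}{(N-1)E^2 + x}
   = \frac{20000 \times 9604}{19999 \times 25 + 9604}
   = \frac{192\,080\,000}{509\,579} \approx 376.9 \\
E &= \sqrt{\frac{(N - n)\,x}{n\,(N-1)}}
   = \sqrt{\frac{19623 \times 9604}{377 \times 19999}} \approx 5.0
\end{align*}
```

Rounding 376.9 up to a whole person gives the 377 in the quoted statement, and putting n = 377 back into the last formula returns a margin of error of about 5 percentage points, which is the "between 45% and 55%" range.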
studiot Posted September 10, 2014

OK, step 1.

We are told that the 'population' is 20,000 and that 50% of them drink coffee in the morning. This means that if we asked every one of them, 10,000 would say yes and 10,000 would say no.

We don't, however, want to expend the effort of asking every one, so we only ask some of them. We call the group that we ask the sample.

Unfortunately, it is entirely possible that all the ones we ask could be coffee drinkers (perhaps we make the mistake of asking in the coffee lounge), or non-drinkers (perhaps we only asked those under 6 years of age). Clearly this is not a good way to go about things. We want a systematic method. In particular, we want our results to reflect the actual split between drinkers and non-drinkers in some rational manner.

So we find a way of selecting members of our 20,000 population to ask at random. This means that every member has the same probability (1 in 20,000) of being selected. This is our definition of random.

Now another way of saying "This means that if we asked every one of them, 10,000 would say yes and 10,000 would say no" is to say that the probability of a yes answer is exactly 0.5 for a randomly chosen population member.

I expect you understand these intuitively, but what I have said so far provides a formal basis (puts things on a firm footing) for step 2.

How are we doing so far?
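Step 1 can also be tried out by brute force. The sketch below is a simulation under stated assumptions, not anything from the calculator: it builds a population of 20,000 with exactly 10,000 coffee drinkers, repeatedly picks 377 members at random (each member equally likely, nobody asked twice), and counts how often the sample's "yes" share lands between 45% and 55%. The function name, the number of trials, and the seed are arbitrary choices for illustration.

```python
import random


def coverage(pop_size=20000, yes_count=10000, sample_size=377,
             margin=0.05, trials=2000, seed=1):
    """Fraction of repeated surveys whose 'yes' share lands within the margin."""
    rng = random.Random(seed)
    # 1 = drinks coffee in the morning, 0 = does not.
    population = [1] * yes_count + [0] * (pop_size - yes_count)
    true_share = yes_count / pop_size
    hits = 0
    for _ in range(trials):
        # rng.sample draws without replacement; every member is equally likely.
        sample = rng.sample(population, sample_size)
        share = sum(sample) / sample_size
        if abs(share - true_share) <= margin:
            hits += 1
    return hits / trials


print(f"Surveys within 45%-55%: {coverage():.1%}")
```

The printed share should come out close to 95%, which is the "95% of the time" in the quoted statement. Calling `coverage(sample_size=50)` shows how much less often a small sample lands inside the same range.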