DylsexicChciken Posted November 20, 2014 (edited)

So the hypothesis test procedure pits the null hypothesis ([math]H_0[/math]) against the alternative hypothesis ([math]H_a[/math]). If a given sample provides enough evidence that [math]H_a[/math] is true, then [math]H_0[/math] is rejected. If there isn't enough evidence to reject [math]H_0[/math], then we can't conclude anything. Two errors result from this:

Type I error: the error of rejecting [math]H_0[/math] when [math]H_0[/math] is true.
Type II error: the error of failing to reject [math]H_0[/math] when [math]H_0[/math] is false.

1. First example:

"The U.S. Bureau of Transportation Statistics reports that for 2009, 72% of all domestic passenger flights arrived on time (meaning within 15 minutes of the scheduled arrival). Suppose that an airline with a poor on-time record decides to offer its employees a bonus if, in an upcoming month, the airline's proportion of on-time flights exceeds the overall 2009 industry rate of .72. Let p be the actual proportion of the airline's flights that are on time during the month of interest. A random sample of flights might be selected and used as a basis for choosing between

H0: p = .72 and Ha: p > .72

In this context, a Type I error (rejecting a true H0) results in the airline rewarding its employees when in fact the actual proportion of on-time flights did not exceed .72. A Type II error (not rejecting a false H0) results in the airline employees not receiving a reward that they deserved."

The book claims that type II error, the failing to reject [math]H_0[/math] when [math]H_0[/math] is false, means that the airline employees did not receive the reward that they deserve. But isn't that not always true? Failing to reject [math]H_0[/math] means that [math]p < 0.72[/math] or [math]p > 0.72[/math], i.e. [math]p \neq 0.72[/math], hence there could be less than 72% of domestic passenger flights being on time as well as there being more than 72% of domestic passenger flights being on time, yet the book only considers one possibility. Is there a reason for considering only one outcome? This is repeated in other examples that analyze type I and type II errors.

2. Another example I don't quite understand:

The probability of a Type I error is denoted by a and is called the significance level of the test. For example, a test with a = .01 is said to have a significance level of .01. The probability of a Type II error is denoted by b.

"Women with ovarian cancer usually are not diagnosed until the disease is in an advanced stage, when it is most difficult to treat. The paper “Diagnostic Markers for Early Detection of Ovarian Cancer” (Clinical Cancer Research [2008]: 1065–1072) describes a new approach to diagnosing ovarian cancer that is based on using six different blood biomarkers (a blood biomarker is a biochemical characteristic that is measured in laboratory testing). The authors report the following results using the six biomarkers:

• For 156 women known to have ovarian cancer, the biomarkers correctly identified 151 as having ovarian cancer.
• For 362 women known not to have ovarian cancer, the biomarkers correctly identified 360 of them as being ovarian cancer free.

We can think of using this blood test to choose between two hypotheses:

H0: woman has ovarian cancer
Ha: woman does not have ovarian cancer

Note that although these are not “statistical hypotheses” (statements about a population characteristic), the possible decision errors are analogous to Type I and Type II errors. In this situation, believing that a woman with ovarian cancer is cancer free would be a Type I error—rejecting the hypothesis of ovarian cancer when it is in fact true. Believing that a woman who is actually cancer free does have ovarian cancer is a Type II error—not rejecting the null hypothesis when it is in fact false. Based on the study results, we can estimate the error probabilities. The probability of a Type I error, a, is approximately 5/156 ≈ .032. The probability of a Type II error, b, is approximately 2/363 ≈ .006."

From the above example, the type II error exists when we fail to reject [math]H_0[/math] when [math]H_0[/math] is false. This means that as the statistician conducting the research, the statistician could accept [math]H_0[/math] or reject [math]H_0[/math] because failure to reject [math]H_0[/math] means [math]H_0[/math] could be true or false. Therefore if by luck the statistician rejects [math]H_0[/math], then no error is made. The only error made in type II error is if he accepts [math]H_0[/math] when it is false. So why doesn't this factor into the calculation of the type II error probability b?

3. Another example: From the book:

"The Environmental Protection Agency (EPA) has adopted what is known as the Lead and Copper Rule, which defines drinking water as unsafe if the concentration of lead is 15 parts per billion (ppb) or greater or if the concentration of copper is 1.3 parts per million (ppm) or greater. With m denoting the mean concentration of lead, the manager of a community water system might use lead level measurements from a sample of water specimens to test

H0: m = 15 versus Ha: m > 15

The null hypothesis (which also implicitly includes the m > 15 case) states that the mean lead concentration is excessive by EPA standards. The alternative hypothesis states that the mean lead concentration is at an acceptable level and that the water system meets EPA standards for lead. (How is this correct?) ..."

Shouldn't [math]H_a[/math] be m < 15? Because if [math]H_a[/math]: m > 15, then it's still not safe by EPA standards. So we actually want [math]H_a[/math], right?

Edited November 20, 2014 by DylsexicChciken
imatfaal Posted November 20, 2014

What's the book? The first example doesn't read as I remember the concept.

BTW - Philip Stark at UC Berkeley has an excellent online stats textbook. It covers his first-year Stats option for Scientists, iirc.
DylsexicChciken Posted November 20, 2014

What's the book? The first example doesn't read as I remember the concept. BTW - Philip Stark at UC Berkeley has an excellent online stats textbook. It covers his first-year Stats option for Scientists, iirc.

This is my book: http://www.amazon.com/Introduction-Statistics-Analysis-Available-Titles/dp/0840054904/ref=sr_1_1?s=books&ie=UTF8&qid=1416491336&sr=1-1&keywords=Introduction+to+Statistics+and+Data+Analysis+peck
imatfaal Posted November 20, 2014

I think you need to reread section 10.1 - especially passages such as this:

"As a consequence, the conclusion when testing H0: m = 8000 versus Ha: m < 8000 is always the same as the conclusion for a test where the null hypothesis is H0: m >= 8000. For these reasons, it is customary to state the null hypothesis H0 as a claim of equality."
Cap'n Refsmmat Posted November 20, 2014

The book claims that type II error, the failing to reject [math]H_0[/math] when [math]H_0[/math] is false, means that the airline employees did not receive the reward that they deserve. But isn't that not always true? Failing to reject [math]H_0[/math] means that [math]p < 0.72[/math] or [math]p > 0.72[/math], i.e. [math]p \neq 0.72[/math], hence there could be less than 72% of domestic passenger flights being on time as well as there being more than 72% of domestic passenger flights being on time, yet the book only considers one possibility. Is there a reason for considering only one outcome? This is repeated in other examples that analyze type I and type II errors.

Failing to reject H0 does not mean that [math]p \neq 0.72[/math]; it means p could be 0.72. If you reject H0, yes, it could be that the proportion is smaller than 0.72. But you'd use a one-sided hypothesis test which only rejects when [math]p > 0.72[/math].

2. Another example I don't quite understand:

The probability of a Type I error is denoted by a and is called the significance level of the test. For example, a test with a = .01 is said to have a significance level of .01. The probability of a Type II error is denoted by b.

"Women with ovarian cancer usually are not diagnosed until the disease is in an advanced stage, when it is most difficult to treat. The paper “Diagnostic Markers for Early Detection of Ovarian Cancer” (Clinical Cancer Research [2008]: 1065–1072) describes a new approach to diagnosing ovarian cancer that is based on using six different blood biomarkers (a blood biomarker is a biochemical characteristic that is measured in laboratory testing). The authors report the following results using the six biomarkers:

• For 156 women known to have ovarian cancer, the biomarkers correctly identified 151 as having ovarian cancer.
• For 362 women known not to have ovarian cancer, the biomarkers correctly identified 360 of them as being ovarian cancer free.

We can think of using this blood test to choose between two hypotheses:

H0: woman has ovarian cancer
Ha: woman does not have ovarian cancer

Note that although these are not “statistical hypotheses” (statements about a population characteristic), the possible decision errors are analogous to Type I and Type II errors. In this situation, believing that a woman with ovarian cancer is cancer free would be a Type I error—rejecting the hypothesis of ovarian cancer when it is in fact true. Believing that a woman who is actually cancer free does have ovarian cancer is a Type II error—not rejecting the null hypothesis when it is in fact false. Based on the study results, we can estimate the error probabilities. The probability of a Type I error, a, is approximately 5/156 ≈ .032. The probability of a Type II error, b, is approximately 2/363 ≈ .006."

From the above example, the type II error exists when we fail to reject [math]H_0[/math] when [math]H_0[/math] is false. This means that as the statistician conducting the research, the statistician could accept [math]H_0[/math] or reject [math]H_0[/math] because failure to reject [math]H_0[/math] means [math]H_0[/math] could be true or false. Therefore if by luck the statistician rejects [math]H_0[/math], then no error is made. The only error made in type II error is if he accepts [math]H_0[/math] when it is false. So why doesn't this factor into the calculation of the type II error probability b?

I don't understand your reasoning here.
Are you suggesting that the statistician sees a statistically insignificant result and hence fails to reject H0, but since the null might be true or false, might then decide to reject anyway? Because that's not how testing is done. I can't tell if you're trying to distinguish between "accepts H0" and "fails to reject H0". They're synonymous, though the latter is a better description.

3. Another example: From the book:

"The Environmental Protection Agency (EPA) has adopted what is known as the Lead and Copper Rule, which defines drinking water as unsafe if the concentration of lead is 15 parts per billion (ppb) or greater or if the concentration of copper is 1.3 parts per million (ppm) or greater. With m denoting the mean concentration of lead, the manager of a community water system might use lead level measurements from a sample of water specimens to test

H0: m = 15 versus Ha: m > 15

The null hypothesis (which also implicitly includes the m > 15 case) states that the mean lead concentration is excessive by EPA standards. The alternative hypothesis states that the mean lead concentration is at an acceptable level and that the water system meets EPA standards for lead. (How is this correct?) ..."

Shouldn't [math]H_a[/math] be m < 15? Because if [math]H_a[/math]: m > 15, then it's still not safe by EPA standards. So we actually want [math]H_a[/math], right?

Yeah, I think they got mixed up here. Typically you'd make the hypotheses as they describe them, so the alternative hypothesis represents an unacceptable level of lead. When you reject the null, you know something is wrong with the water.
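If it helps to see the mechanics, here is a rough sketch in Python of the one-sided test for the airline example. The sample counts are made up (the book doesn't give any), and it uses the usual large-sample normal approximation:

[code]
from math import sqrt
from statistics import NormalDist

# Hypothetical data: 450 on-time flights out of 600 sampled (illustration only).
n, on_time = 600, 450
p0 = 0.72                      # proportion under H0
p_hat = on_time / n            # sample proportion, 0.75 here

# z statistic with the standard error computed under H0.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# One-sided (upper-tail) p-value: we only reject for LARGE sample proportions.
p_value = 1 - NormalDist().cdf(z)
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, one-sided p-value = {p_value:.3f}")
[/code]

Try it with 420 on-time flights instead of 450: p_hat drops below 0.72, z goes negative, the p-value comes out large, and you never reject. That's the sense in which a one-sided test simply ignores the "fewer than 72% on time" direction.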
DylsexicChciken Posted November 20, 2014 (edited)

Failing to reject H0 does not mean that [math]p \neq 0.72[/math]; it means p could be 0.72. If you reject H0, yes, it could be that the proportion is smaller than 0.72. But you'd use a one-sided hypothesis test which only rejects when [math]p > 0.72[/math].

The book's example assumes the case when it is true that we have committed a type II error, in that p = 0.72 is false and we fail to reject it. Then the actual value of p would be [latex] p< 0.72 [/latex] or [latex] p> 0.72 [/latex]. So the proportion of on-time flights can be greater or less than 0.72, but the book only considers the case when [latex] p> 0.72 [/latex]. So the question is why did they choose only one of the consequences? Is it because they just happen to choose one example of consequence?

I don't understand your reasoning here. Are you suggesting that the statistician sees a statistically insignificant result and hence fails to reject H0, but since the null might be true or false, might then decide to reject anyway? Because that's not how testing is done. I can't tell if you're trying to distinguish between "accepts H0" and "fails to reject H0". They're synonymous, though the latter is a better description.

Now that I thought about it again, it makes a little more sense. This was my problem, if you're still interested: When we fail to reject [latex] H_0 [/latex] given that [latex] H_0 [/latex] is false, the fact that [latex] H_0 [/latex] is false is something a statistician won't know for certain. The statistician only knows the probability that [latex] H_0 [/latex] is false and he fails to reject it. This error is denoted by b, the probability of a type II error. So therefore, the statistician could have either chosen to accept [latex] H_0 [/latex] or reject [latex] H_0 [/latex] because the samples he collected are inconclusive about the truth of [latex] H_0 [/latex]. The probability formula for b is (# of failed rejections given that [latex] H_0 [/latex] is false) / sample size. The sample size should be 362, so I believe the book made another error. As you can see, this calculation does not take into account the fact that the statistician could have accepted or rejected it. But I now understand that b is just the probability when we fail to reject a false [latex] H_0 [/latex], therefore whether or not the statistician ultimately accepts [latex] H_0 [/latex] or rejects [latex] H_0 [/latex], the error of failing to reject a false [latex] H_0 [/latex] is still made, and the formula for b is only concerned with failing to reject a false [latex] H_0 [/latex], regardless of what happens afterwards.

Edited November 20, 2014 by DylsexicChciken
studiot Posted November 20, 2014

I don't know this book. Has the text said anything about the acceptance criteria or decision rules or whatever? These are an indispensable part of hypothesis testing, and your text in green doesn't seem to include them, so I don't see how you can come to any conclusion.

As regards one-tailed and two-tailed tests, it is possible for only one tail of a two-tailed test to fall within the area considered for Type II error.
Cap'n Refsmmat Posted November 21, 2014

The book's example assumes the case when it is true that we have committed a type II error, in that p = 0.72 is false and we fail to reject it. Then the actual value of p would be [latex] p< 0.72 [/latex] or [latex] p> 0.72 [/latex]. So the proportion of on-time flights can be greater or less than 0.72, but the book only considers the case when [latex] p> 0.72 [/latex]. So the question is why did they choose only one of the consequences? Is it because they just happen to choose one example of consequence?

If [math]p < 0.72[/math] but we fail to reject, I wouldn't consider that a type II error. That's intentional, since we're testing for the alternative that [math]p > 0.72[/math]. A one-tailed test would specifically try not to reject when [math]p < 0.72[/math].

When we fail to reject [latex] H_0 [/latex] given that [latex] H_0 [/latex] is false, the fact that [latex] H_0 [/latex] is false is something a statistician won't know for certain. The statistician only knows the probability that [latex] H_0 [/latex] is false and he fails to reject it.

The statistician does not know the probability that H0 is false; that's not what a p value is.

So therefore, the statistician could have either chosen to accept [latex] H_0 [/latex] or reject [latex] H_0 [/latex] because the samples he collected are inconclusive about the truth of [latex] H_0 [/latex].

That's not the typical practice. If the sample is inconclusive, we "accept" (fail to reject) H0. We don't have the choice to reject it -- we have no evidence to justify the choice. Significance testing is designed to avoid rejecting H0 unless we're really sure. When the evidence is inconclusive, we don't reject.

I think you're making this much more complicated than it needs to be by interpreting "fail to reject" as "I could accept or reject." That's not the case. "Accept" and "fail to reject" are synonymous, and if you fail to reject, you don't have any choice of what to do. You fail to reject. You're done.
DylsexicChciken Posted November 21, 2014 (edited)

I don't know this book. Has the text said anything about the acceptance criteria or decision rules or whatever? These are an indispensable part of hypothesis testing, and your text in green doesn't seem to include them, so I don't see how you can come to any conclusion. As regards one-tailed and two-tailed tests, it is possible for only one tail of a two-tailed test to fall within the area considered for Type II error.

I looked through the book and I don't think I see anything called acceptance criteria or decision rules. I am still reading an early section on hypothesis testing, so it might be somewhere later on.

The statistician does not know the probability that H0 is false; that's not what a p value is.

I was referring to the value b (of the test procedure on the variable being tested, p), the probability of a Type II error. Type I error, a, and Type II error, b, are inversely proportional. The statistician only has control over the Type I error, a, therefore the statistician can control the Type II error, b, indirectly by minimizing or maximizing the Type I error, a. So even knowing this, in practice the statistician still won't be able to find the value of b (so far I haven't learned how to calculate b, if there even is a way)? I am still in an early part of the hypothesis testing chapter, so the information I learned so far should be about basic concepts of error and interpreting a problem.

Edited November 21, 2014 by DylsexicChciken
studiot Posted November 21, 2014

I'm just shutting down for the night, so look again when I have had a chance to post something.
Cap'n Refsmmat Posted November 21, 2014

I was referring to the value b (of the test procedure on the variable being tested, p), the probability of a Type II error. Type I error, a, and Type II error, b, are inversely proportional. The statistician only has control over the Type I error, a, therefore the statistician can control the Type II error, b, indirectly by minimizing or maximizing the Type I error, a. So even knowing this, in practice the statistician still won't be able to find the value of b (so far I haven't learned how to calculate b, if there even is a way)? I am still in an early part of the hypothesis testing chapter, so the information I learned so far should be about basic concepts of error and interpreting a problem.

The probability of type II error depends on the size of the true effect, so you can't calculate it. You can, however, calculate it for different assumed sizes of true effect, so you could say "If the true effect is this big, then I have a 50% chance of detecting it."
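To put numbers on that, here is a rough sketch for the airline example. The normal approximation, the sample size, the significance level, and the candidate "true" proportions are all made-up illustration values, not from the book:

[code]
from math import sqrt
from statistics import NormalDist

def type_ii_error(p_true, p0=0.72, n=600, alpha=0.05):
    """Approximate beta for the one-sided test H0: p = p0 vs Ha: p > p0."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    # Reject H0 whenever the sample proportion exceeds this cutoff.
    cutoff = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)
    # beta = chance the sample proportion still lands below the cutoff
    # even though the true proportion is p_true.
    se_true = sqrt(p_true * (1 - p_true) / n)
    return NormalDist().cdf((cutoff - p_true) / se_true)

for p_true in (0.73, 0.75, 0.78):
    beta = type_ii_error(p_true)
    print(f"true p = {p_true}: beta = {beta:.2f}, power = {1 - beta:.2f}")
[/code]

The closer the assumed true proportion is to 0.72, the larger beta gets, which is why beta only has a value once you assume a particular effect size.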
DylsexicChciken Posted November 23, 2014 (edited)

If the Type I error probability a for a hypothesis test is 0.05 and the p-value = 0.01, we reject [latex] H_0 [/latex] because the p-value is [latex] \leq [/latex] a. What is the reasoning or intuition for this?

Edited November 23, 2014 by DylsexicChciken
studiot Posted November 23, 2014

OK, I said I'd post some more. I have tried to show things in diagrams and I am assuming normal distributions. Bignose will no doubt wish to generalise this if he comments again.

If we go through my diagrams, hopefully it will become clear. Most texts show a diagram like sketch A, but do not make it clear that there are two distributions in play, not one. And few show you the sequence from sketch B through sketch F. I cannot stress this enough.

My sketch A refers to the distribution of all possible sample means (for samples of the size we are taking). All the others refer to the population distribution. Unlike in your previous thread, we either have the population mean and standard deviation or we are assuming it as H0. So in my example the population mean is postulated as being 8.0.

So in this case the mean of all possible sample means should be the mean of the population, and we will take this as our null hypothesis and examine the errors that may arise if this is not true. This allows us to set the cutoff points, which are also called critical values. In my example they are 7.9 and 8.1. These cutoff points are where the acceptance criteria / decision rules I mentioned arise.

If we take a sample and the mean of the values falls between 7.9 and 8.1, we accept H0. Since H0 and H1 are mutually exclusive, accepting H0 means rejecting H1, so we don't test for H1. That is, our acceptance criterion is [math]7.9 \le {\mu _{sample}} \le 8.1[/math]. Outside this acceptance range we reject H0, and the area of the tails gives us the probability of a TYPE I error. This reappears in sketch D.

OK, so now we ask what happens if the population mean is not 8, because there is something wrong that requires action. In sketches B, C, D, E and F I have successively moved the population curve along the axis to show it in various positions in relation to the critical values I have projected down from above by the dashed lines. Note these have not moved from the original sample basis in sketch A.

So in sketch B, if [math]{\mu _{pop}} = 7.7[/math] then only the right-hand tail enters the acceptance region. That is, there is a small probability that a sample drawn from this population could have a mean within the acceptance region. This only occurs for a small % of cases, but if our sample mean lies between 7.9 and 8.1 we will (wrongly) accept it. This is a TYPE II error. The area that this tail intrudes into the acceptance region yields the % or probability of this. This is the reason you were asking why a one-tailed value was calculated in one of your examples.

In sketch C I have moved the population curve's mean along to the first critical value at 7.9. Now there is a considerable probability that a sample mean drawn from this population could fall within the acceptance region. The right-hand tail may even extend beyond the upper critical value.

In sketch D the curve has moved back to a mean of 8 and we are back to TYPE I error possibilities. Sketches E and F are simple mirror images of C and B as the lower tail moves past the acceptance region.
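If it helps, you can put rough numbers on the sketches. Purely for illustration I will assume the standard error of the sample mean is 0.05 (my sketches don't fix a value), keeping the critical values at 7.9 and 8.1:

[code]
from statistics import NormalDist

se = 0.05                        # assumed standard error of the sample mean (illustration only)
accept_lo, accept_hi = 7.9, 8.1  # the critical values from the sketches

def p_accept(true_mean):
    """Chance a sample mean lands inside the acceptance region."""
    dist = NormalDist(true_mean, se)
    return dist.cdf(accept_hi) - dist.cdf(accept_lo)

# Sketch D: H0 true (mean 8.0). Type I error = chance of landing OUTSIDE the region.
print(f"Type I error = {1 - p_accept(8.0):.3f}")

# Sketches B and C: H0 false. Type II error = chance of still landing INSIDE the region.
for mu in (7.7, 7.9):
    print(f"true mean {mu}: Type II error = {p_accept(mu):.3g}")
[/code]

With that assumed spread the Type I error comes out near 5%, sketch C (mean sitting on the critical value) gives a Type II error of about 50%, and sketch B (mean at 7.7) gives a tiny one, matching the areas in the diagrams.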
Cap'n Refsmmat Posted November 24, 2014

If the Type I error probability a for a hypothesis test is 0.05 and the p-value = 0.01, we reject [latex] H_0 [/latex] because the p-value is [latex] \leq [/latex] a. What is the reasoning or intuition for this?

"If H0 were true, we would get this sort of data less than 5% of the time. But we got it. H0 must be wrong."

You can think of it as nearly a proof by contradiction. If [math]p = 0[/math] exactly, then it is a proof by contradiction: if H0 were true, we would never get this data, but we did, so H0 must be false.

Another way to phrase it is "Either we're very lucky and got unlikely results, or H0 is wrong." At some point you're more willing to reject the null than assume you have incredible luck.
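You can also see the "either we're very lucky or H0 is wrong" reading by simulation. A quick sketch with made-up airline-style numbers (the true proportion is set to exactly 0.72, so H0 really is true, and the p-value uses the same normal approximation as before):

[code]
import random
from math import sqrt
from statistics import NormalDist

random.seed(1)
p0, n, alpha, trials = 0.72, 600, 0.05, 5000
norm = NormalDist()

false_rejections = 0
for _ in range(trials):
    # Simulate a month of n flights with H0 actually true.
    on_time = sum(random.random() < p0 for _ in range(n))
    p_hat = on_time / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 1 - norm.cdf(z)
    false_rejections += p_value <= alpha

# Should come out near alpha: data this extreme shows up only ~5% of the time under H0.
print(f"rejected a true H0 in {false_rejections / trials:.1%} of simulated samples")
[/code]

Out of all the simulated months where H0 holds, only about one in twenty produces data extreme enough to reject. When your one real sample does, bad luck is a possible explanation, but a false H0 is usually the better bet.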
DylsexicChciken Posted November 26, 2014 (edited)

"If H0 were true, we would get this sort of data less than 5% of the time. But we got it. H0 must be wrong."

You can think of it as nearly a proof by contradiction. If [math]p = 0[/math] exactly, then it is a proof by contradiction: if H0 were true, we would never get this data, but we did, so H0 must be false.

Another way to phrase it is "Either we're very lucky and got unlikely results, or H0 is wrong." At some point you're more willing to reject the null than assume you have incredible luck.

I was thinking of it in terms of the definition of a. It took some time, but I formulated the below intuition:

If p-value [latex] \leq [/latex] a (the equal sign under the inequality is not showing on this forum for some reason): getting our observed statistic from the sample is at most as likely as the chance of rejecting a true [latex] H_0 [/latex]. In other words, we very likely have a higher chance that [latex] H_0 [/latex] is false than the chance of rejecting a true [latex] H_0 [/latex]. This simplifies to: we very likely have a higher chance of rejecting a false [latex] H_0 [/latex] than of rejecting a true [latex] H_0 [/latex].

Edited November 26, 2014 by DylsexicChciken
Cap'n Refsmmat Posted November 26, 2014

This simplifies to: we very likely have a higher chance of rejecting a false [latex] H_0 [/latex] than of rejecting a true [latex] H_0 [/latex].

Not necessarily true. The chance of rejecting a false H0 is the power of the test. It depends on how different HA is from H0. If it's not very different, it may be very difficult to tell the difference, so you have a very small chance of rejecting a false null.