sangui Posted April 21, 2020
Hi, I have to do some statistics, and I don't understand how the Shapiro-Wilk test or the Bartlett test works. I think I understand why we use them (Shapiro-Wilk to check whether a sample follows a normal distribution, Bartlett to check whether our variances are equal). But I am completely unable to do the math. Can somebody help me? Thanks
taeto Posted April 21, 2020
On 4/21/2020 at 8:16 AM, sangui said: Hi, I have to do some statistics, and I don't understand how the Shapiro-Wilk test or the Bartlett test works. I think I understand why we use them (Shapiro-Wilk to check whether a sample follows a normal distribution, Bartlett to check whether our variances are equal). But I am completely unable to do the math. Can somebody help me? Thanks
Could you show an example of what you are unable to do? The Shapiro-Wilk and Bartlett tests are not really that closely related. So if you bunch them together like this, it may mean that you need some of the basic concepts that underlie them both.
sangui Posted April 21, 2020 Author
For example, I have to check whether these samples follow a normal distribution.

Temperature_1  Temperature_2  Temperature_3  Temperature_4  Temperature_5
2.56           2.28           2.73           2.44           2.54
2.92           2.78           2.97           2.81           2.67
2.00           2.74           2.00           2.08           2.43
2.83           2.47           2.13           2.90           2.10
2.61           2.52           2.09           3.05           2.78
3.10           2.16           1.90           2.69           2.85
2.42           2.70           3.04           3.03           2.76
2.28           2.70           2.57           2.91           2.47

I'm supposed to use Shapiro-Wilk. On this one I need to find out whether the variances are equal.

Temperature_1  Temperature_2  Temperature_3
2.42           3.05           1.95
2.83           2.21           2.23
2.25           2.18           2.54
3.02           2.35           2.56

I think I must use Bartlett.
Dagl1 Posted April 21, 2020
On 4/21/2020 at 12:13 PM, taeto said: The Shapiro-Wilk and Bartlett tests are not really that closely related. So if you bunch them together like this, it may mean that you need some of the basic concepts that underlie them both.
Isn't it common practice to check for both normal distribution and equal variance before applying tests to see whether the null hypothesis can be rejected? I remember using Levene's and Shapiro-Wilk (and, on the one occasion I had a large sample size, Kolmogorov-Smirnov). Aren't they related in the sense that you need these assumptions for follow-up tests (ANOVAs, for instance)?
sangui Posted April 21, 2020 Author
On 4/21/2020 at 1:01 PM, Dagl1 said: Aren't they related in the sense that you need these assumptions for follow-up tests (ANOVAs, for instance)?
Actually, that is exactly why I need to understand those tests ^^
taeto Posted April 21, 2020
On 4/21/2020 at 1:01 PM, Dagl1 said: Isn't it common practice to check for both normal distribution and equal variance before applying tests to see whether the null hypothesis can be rejected? I remember using Levene's and Shapiro-Wilk (and, on the one occasion I had a large sample size, Kolmogorov-Smirnov). Aren't they related in the sense that you need these assumptions for follow-up tests (ANOVAs, for instance)?
Yes, that seems to be what the assignment is about. But sangui said that he cannot do the math, and the math differs a lot. Bartlett is a standard $\chi^2$ test, whereas the Shapiro-Wilk test statistic W has to be looked up in a table. That makes the mechanics a little different. The value of W is actually not that easy to compute, so it would be a place to start, if necessary.
Edited April 21, 2020 by taeto
Dagl1 Posted April 21, 2020
On 4/21/2020 at 1:17 PM, taeto said: Yes, that seems to be what the assignment is about. But sangui said that he cannot do the math, and the math differs a lot. Bartlett is a standard $\chi^2$ test, whereas the Shapiro-Wilk test statistic W has to be looked up in a table. That makes the mechanics a little different. The value of W is actually not that easy to compute, so it would be a place to start, if necessary.
Ahh I see, yes, I didn't really consider the math differences when reading your post, apologies!
sangui Posted April 21, 2020 Author
On 4/21/2020 at 1:17 PM, taeto said: Yes, that seems to be what the assignment is about. But sangui said that he cannot do the math, and the math differs a lot. Bartlett is a standard $\chi^2$ test, whereas the Shapiro-Wilk test statistic W has to be looked up in a table. That makes the mechanics a little different. The value of W is actually not that easy to compute, so it would be a place to start, if necessary.
It's my fault, I haven't been clear. I'm supposed to learn how to use those tests, and my teacher didn't give me more information. So I don't really know the difference.
taeto Posted April 21, 2020
On 4/21/2020 at 1:29 PM, Dagl1 said: Ahh I see, yes, I didn't really consider the math differences when reading your post, apologies!
Don't be silly. I am happy you pointed out the obvious connection. When I see things from a math viewpoint, I sometimes forget the, well, obvious 🧐
On 4/21/2020 at 1:32 PM, sangui said: It's my fault, I haven't been clear. I'm supposed to learn how to use those tests, and my teacher didn't give me more information. So I don't really know the difference.
At least Bartlett is not so difficult, so let us go through it 🙂. But first, what is the meaning of your first table? We see five columns T1 to T5 (where T stands for "Temperature_") and eight rows with an entry for every column. Do you have to test whether all 40 entries in the rows and columns follow the same normal distribution? Or just that the entries in the same column follow a normal distribution, possibly not the same one for all columns? Or the entries in the same row? And then, what is the corresponding meaning of the second table? The two tables are not related in any way, such as coming from the same experiment, is that right?
sangui Posted April 21, 2020 Author
On 4/21/2020 at 1:35 PM, taeto said: But first, what is the meaning of your first table? We see five columns T1 to T5 (where T stands for "Temperature_") and eight rows with an entry for every column. Do you have to test whether all 40 entries in the rows and columns follow the same normal distribution? Or just that the entries in the same column follow a normal distribution, possibly not the same one for all columns? Or the entries in the same row?
We must check whether all entries follow the same normal distribution. And it's the same for the second table. I don't think those tables are related (but I'm not sure; I just have the exercise and the Shapiro-Wilk value for the first and the Bartlett value for the second). I'm sorry I can't be more precise, but I don't have a lot of information (I need to understand these tests for the rest of my studies, but my teacher chose not to work on them). Thank you for your help.
Edited April 21, 2020 by sangui
taeto Posted April 21, 2020
On 4/21/2020 at 1:55 PM, sangui said: We must check whether all entries follow the same normal distribution. I don't think those tables are related (but I'm not sure; I just have the exercise and the Shapiro-Wilk value for the first and the Bartlett value for the second).
If we take Bartlett first, then the purpose of the test is to figure out, for several sets of data and assuming that each set is normally distributed, whether they also have the same variance.
If we think about the second table, it is possible that it represents four sets of data, one for each row, each set containing three values, for T1, T2 and T3. Or, more likely, the table represents three sets of data, each containing four values. Let us say that the second table represents experiments in which four measurements were made for each of three temperatures T1, T2, T3. Then for T1 the values 2.42, 2.83, 2.25, 3.02 were measured, for T2 they were 3.05, 2.21, 2.18, 2.35, and for T3 they were 1.95, 2.23, 2.54, 2.56.
We can calculate the estimated variance of each of these samples in the standard way, as 1/3 of the sum of the squared deviations from the sample average (equivalently, 1/3 of the sum of the squares minus 4 times the square of the average). I trust that this is familiar to you? Then we have three estimated variances $V_1, V_2, V_3$, one for each T.
We also have to compute the estimate of the common variance V in case they were actually all equal. That will be $V = \frac{1}{12-3} \sum_{i=1,2,3} (4-1) V_i$, where the 3 means that we have three data sets, the 4 means that we have four data points in each set, and 12 is the total number of data points in the table. I have not made the computations, since I have no good calculator handy, and I would probably make confusing mistakes, sorry.
Finally you have to compute the Bartlett test statistic itself. First we need the number $D = (12-3)\log V - \sum_{i=1,2,3} (4-1)\log V_i$, where $\log$ is the natural logarithm.
We can see from the formula above for V that it would be a pretty good match if $V_1, V_2, V_3$ were all the same, because then V would be equal to all of them, and D would be zero. So a small value of D is good.
To compute the final Bartlett test statistic we also need $C = 1 + \frac{1}{3(3-1)} \left( \sum_{i=1,2,3} \frac{1}{4-1} - \frac{1}{12-3} \right)$. The test statistic becomes $B = D/C$.
Now you have to check B against a $\chi^2$ distribution with 3 - 1 = 2 degrees of freedom. Having typed all of that, maybe it is not as easy as I first thought. But try to compute as many of the numbers as you are able to.
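The steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of the original posts: the function name `bartlett_statistic` is made up for this example, the second table is read column-wise (one sample per temperature), `log` is taken to be the natural logarithm, and the closed-form p-value used at the end holds only for 2 degrees of freedom.

```python
import math
from statistics import variance  # sample variance, divides by n - 1

# Second table from the thread, read column-wise (assumption: one sample per temperature).
samples = [
    [2.42, 2.83, 2.25, 3.02],  # Temperature_1
    [3.05, 2.21, 2.18, 2.35],  # Temperature_2
    [1.95, 2.23, 2.54, 2.56],  # Temperature_3
]


def bartlett_statistic(groups):
    """Return (B, degrees_of_freedom) following the V, D, C, B steps above."""
    k = len(groups)                       # number of data sets (3 here)
    sizes = [len(g) for g in groups]      # n_i (4 each here)
    n_total = sum(sizes)                  # 12 here
    variances = [variance(g) for g in groups]  # V_1, V_2, V_3

    # Pooled variance V = 1/(n - k) * sum (n_i - 1) * V_i
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)

    # D = (n - k) * ln V - sum (n_i - 1) * ln V_i   (natural logarithm)
    d = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )

    # C = 1 + (sum 1/(n_i - 1) - 1/(n - k)) / (3 * (k - 1))
    c = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)) / (3 * (k - 1))

    return d / c, k - 1


B, df = bartlett_statistic(samples)
# For exactly 2 degrees of freedom the chi-square upper tail is exp(-x / 2).
p_value = math.exp(-B / 2)
print(f"B = {B:.3f}, df = {df}, p = {p_value:.2f}")  # B is about 0.305
```

A small B (and hence a large p-value) means the data give no reason to reject equal variances; for other numbers of data sets the p-value would need a general chi-square routine instead of the `exp(-B / 2)` shortcut.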
taeto Posted April 21, 2020
On 4/21/2020 at 6:46 PM, sangui said: Stupid question but: what is frac?
On 4/21/2020 at 9:39 PM, taeto said: That is not a stupid question at all. It seems there is a particular quirk to this site, which sometimes forces you to reload a page to view something that was typeset in LaTeX. But if you reload the page and this still persists, then please reply with precise information about the piece of text where it occurs, typically somewhere that was supposed to be mathematical.
Edited April 21, 2020 by taeto
sangui Posted April 22, 2020 Author
Thanks. Is it possible that D is negative? I don't understand why we use the sum for C; we don't have anything to sum? $\sum_{i=1,2,3}$
Edited April 22, 2020 by sangui
taeto Posted April 22, 2020
On 4/22/2020 at 7:24 AM, sangui said: Is it possible that D is negative? I don't understand why we use the sum for C; we don't have anything to sum? $\sum_{i=1,2,3}$
No, D cannot be negative. And the sum for C is $\sum_i \frac{1}{n_i-1}$, where $n_i$ is the number of elements in the i-th data set. In our case each T contains 4 numbers. So we have to add 1/3 to itself three times, and the sum adds up to 1. Once you have that sum, you subtract $\frac{1}{n-3}$, where n is the sum of the $n_i$ and 3 is the number of data sets. Sorry that this was not so clearly written.
Edited April 22, 2020 by taeto
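Written out for the second table (three data sets of four values each, so $n = 12$), the correction term described in the post above works out as follows. This is only the arithmetic of the stated formula, added here for illustration:

```latex
C = 1 + \frac{1}{3(3-1)} \left( \sum_{i=1,2,3} \frac{1}{4-1} - \frac{1}{12-3} \right)
  = 1 + \frac{1}{6} \left( 1 - \frac{1}{9} \right)
  = 1 + \frac{4}{27}
  = \frac{31}{27} \approx 1.148
```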
sangui Posted April 22, 2020 Author
I must be wrong somewhere, because I found this: my B = D/C = 0.13257072, and in my correction: Bartlett's K-squared = 0.305.
Attachment: bertlett.xlsx
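One possible explanation for the mismatch, offered as an assumption rather than something confirmed in the thread or checked against the attached spreadsheet, is the base of the logarithm. The sketch below recomputes the statistic from the second table with the logarithm passed in as a parameter (the helper name `bartlett_with_log` is made up for this example). With the natural logarithm it gives roughly 0.305, and with the base-10 logarithm roughly 0.1326, which is what a spreadsheet function such as `LOG` or `LOG10` would produce in place of `LN`.

```python
import math
from statistics import variance  # sample variance, divides by n - 1

# Second table from the thread, one sample per temperature column.
samples = [
    [2.42, 2.83, 2.25, 3.02],
    [3.05, 2.21, 2.18, 2.35],
    [1.95, 2.23, 2.54, 2.56],
]


def bartlett_with_log(groups, log):
    """Bartlett's B = D / C, with the logarithm function supplied by the caller."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    variances = [variance(g) for g in groups]
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    d = (n_total - k) * log(pooled) - sum(
        (n - 1) * log(v) for n, v in zip(sizes, variances)
    )
    c = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)) / (3 * (k - 1))
    return d / c


b_natural = bartlett_with_log(samples, math.log)    # natural logarithm
b_base10 = bartlett_with_log(samples, math.log10)   # base-10 logarithm

print(f"natural log: B = {b_natural:.4f}")
print(f"base-10 log: B = {b_base10:.4f}")
print(f"ratio        = {b_natural / b_base10:.4f}")  # equals ln(10), about 2.3026
```

Since only D depends on the logarithm, the two results differ exactly by the factor ln(10); multiplying 0.13257072 by 2.302585 gives about 0.305, matching the value in the correction.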