Shapiro-Wilk and Bartlett test

sangui · April 21, 2020

Hi,

I have to do some statistics, and I don't understand how the Shapiro-Wilk test or the Bartlett test works.

I think I understand why we use them (Shapiro-Wilk to see whether a sample follows a normal distribution, and Bartlett to see whether our variances are equal).

But I am completely unable to do the math. Can somebody help me?

Thanks

taeto · April 21, 2020

On 4/21/2020 at 8:16 AM, sangui said:

Hi,

I have to do some statistics, and I don't understand how the Shapiro-Wilk test or the Bartlett test works.

I think I understand why we use them (Shapiro-Wilk to see whether a sample follows a normal distribution, and Bartlett to see whether our variances are equal).

But I am completely unable to do the math. Can somebody help me?

Thanks

Could you show some example of what you are unable to do?

Shapiro-Wilk and Bartlett tests are not really that closely related. So if you bunch them together like this, it may mean that you are in need of some of the basic concepts that underlie them both.

sangui · April 21, 2020

For example, I have to check whether these samples follow a normal distribution.

Temperature_1	Temperature_2	Temperature_3	Temperature_4	Temperature_5
2.56	2.28	2.73	2.44	2.54
2.92	2.78	2.97	2.81	2.67
2.00	2.74	2.00	2.08	2.43
2.83	2.47	2.13	2.90	2.10
2.61	2.52	2.09	3.05	2.78
3.10	2.16	1.90	2.69	2.85
2.42	2.70	3.04	3.03	2.76
2.28	2.70	2.57	2.91	2.47

I'm supposed to use Shapiro-Wilk.

For this one I need to find out whether the variances are equal.

Temperature_1	Temperature_2	Temperature_3
2.42	3.05	1.95
2.83	2.21	2.23
2.25	2.18	2.54
3.02	2.35	2.56

I think I must use Bartlett.

Dagl1 · April 21, 2020

On 4/21/2020 at 12:13 PM, taeto said:

Shapiro-Wilk and Bartlett tests are not really that closely related. So if you bunch them together like this, it may mean that you are in need of some of the basic concepts that underlie them both.


Isn't it common practice to check for both normal distribution and equal variance before applying tests to see whether the null hypothesis can be rejected? I remember using Levene's and Shapiro (and on the one occasion I had a large sample size, Kolmogorov-Smirnov).

Aren't they related in the sense that you need these assumptions for follow-up tests (ANOVAs, for instance)?

sangui · April 21, 2020

On 4/21/2020 at 1:01 PM, Dagl1 said:

Aren't they related in the sense that you need these assumptions for follow-up tests (ANOVAs, for instance)?

Actually, that's exactly why I need to understand those tests ^^

taeto · April 21, 2020

On 4/21/2020 at 1:01 PM, Dagl1 said:

Isn't it common practice to check for both normal distribution and equal variance before applying tests to see whether the null hypothesis can be rejected? I remember using Levene's and Shapiro (and on the one occasion I had a large sample size, Kolmogorov-Smirnov).

Aren't they related in the sense that you need these assumptions for follow-up tests (ANOVAs, for instance)?

Yes, that seems to be what the assignment is about. But sangui said that he cannot do the math. And the math differs a lot. Bartlett is a standard $\chi^2$ test, whereas the Shapiro-Wilk test statistic W needs looking up in a table. That makes the mechanics a little different. The value of W is actually not so easy to compute, so it would be a place to start, if necessary.
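
For reference, the Shapiro-Wilk statistic is usually defined as below. This is the textbook definition rather than something stated in this thread, so treat it as a sketch; $x_{(1)} \le \dots \le x_{(n)}$ denotes the sorted sample and $\bar{x}$ its average.

$W = \frac{\left(\sum_{i=1}^{n} a_i x_{(i)}\right)^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$

The coefficients $a_i$ depend only on $n$ and come from the expected values and covariances of normal order statistics, which is why they are normally taken from a table or from software rather than worked out by hand. Values of W close to 1 are consistent with normality; small values speak against it.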

Edited April 21, 2020 by taeto

Dagl1 · April 21, 2020

On 4/21/2020 at 1:17 PM, taeto said:

Yes, that seems to be what the assignment is about. But sangui said that he cannot do the math. And the math differs a lot. Bartlett is a standard $\chi^2$ test, whereas the Shapiro-Wilk test statistic W needs looking up in a table. That makes the mechanics a little different. The value of W is actually not so easy to compute, so it would be a place to start, if necessary.

Ahh I see, ye I didn't really consider the math differences when reading your post, apologies!

sangui · April 21, 2020

On 4/21/2020 at 1:17 PM, taeto said:

Yes, that seems to be what the assignment is about. But sangui said that he cannot do the math. And the math differs a lot. Bartlett is a standard $\chi^2$ test, whereas the Shapiro-Wilk test statistic W needs looking up in a table. That makes the mechanics a little different. The value of W is actually not so easy to compute, so it would be a place to start, if necessary.

It's my fault, I haven't been clear.

I'm supposed to learn how to use those tests, and my teacher didn't give me any more information. So I don't really know the difference.

taeto · April 21, 2020

On 4/21/2020 at 1:29 PM, Dagl1 said:

Ahh I see, ye I didn't really consider the math differences when reading your post, apologies!


Don't be silly. I am happy you point out the obvious connection. When I see things from a math viewpoint, I sometimes forget the, well, obvious 🧐

On 4/21/2020 at 1:32 PM, sangui said:

It's my fault, I haven't been clear.

I'm supposed to learn how to use those tests, and my teacher didn't give me any more information. So I don't really know the difference.

At least Bartlett is not so difficult, so let us go through it 🙂.

But first, what is the meaning of your first table? We see five columns T1 to T5 (where T stands for "Temperature_") and eight rows with an entry for every column. Do you expect that you have to test whether all the 40 entries in the rows and columns follow the same normal distribution? Or just that the entries in the same column follow a normal distribution, possibly not the same for all columns? Or the entries in the same row?

And then, what is the similar meaning of the second table?

The two tables are not related in any way by coming from the same experiment or anything like that, is that right?

sangui · April 21, 2020

On 4/21/2020 at 1:35 PM, taeto said:

But first, what is the meaning of your first table? We see five columns T1 to T5 (where T stands for "Temperature_") and eight rows with an entry for every column. Do you expect that you have to test whether all the 40 entries in the rows and columns follow the same normal distribution? Or just that the entries in the same column follow a normal distribution, possibly not the same for all columns? Or the entries in the same row?


We must see if all entries follow the same normal distribution.

And it's the same for the second table.

I don't think those tables are related (but I'm not sure, I just have the exercise and the value of Shapiro for the first and Bartlett for the second).

I'm sorry I can't be more precise, but I don't have a lot of information (I need to understand these tests for the rest of my studies, but my teacher chose not to work on them).

Thank you for your help.

Edited April 21, 2020 by sangui

taeto · April 21, 2020

On 4/21/2020 at 1:55 PM, sangui said:

We must see if all entries follow the same normal distribution.

I don't think those tables are related (but I'm not sure, I just have the exercise and the value of Shapiro for the first and Bartlett for the second).

If we take Bartlett first, then the purpose of the test is to figure out, for several sets of data and assuming that each set is normally distributed, whether they also have the same variance. If we think about the second table, it is possible that it represents four sets of data, one for each row, each set containing three values, for T1, T2 and T3.

Or, more likely, the table represents three sets of data, each containing four values.

Let us say that it makes sense that the second table represents experiments in which four measurements were made for each of three temperatures T1, T2, T3. Then for T1 it means that the values 2.42, 2.83, 2.25, 3.02 were measured, for T2 they were 3.05, 2.21, 2.18, 2.35, and for T3 they were 1.95, 2.23, 2.54, 2.56.

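We can calculate the estimated variance of each of these samples in the standard way: subtract the sample average from each value, square the differences, add them up, and divide by 3 (one less than the number of values). Equivalently, it is 4/3 of the difference between the average of the squares and the square of the average. I trust that this is familiar to you? Then we have three estimated variances V1, V2, V3, one for each T.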
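
A minimal Python sketch of this step, assuming the second table is read column-wise as three samples of four values each. The names `groups` and `sample_variance` are made up for illustration.

```python
from statistics import mean

# Second table, read column-wise: four measurements per temperature.
groups = {
    "T1": [2.42, 2.83, 2.25, 3.02],
    "T2": [3.05, 2.21, 2.18, 2.35],
    "T3": [1.95, 2.23, 2.54, 2.56],
}

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

variances = {name: sample_variance(xs) for name, xs in groups.items()}
for name, v in variances.items():
    print(name, round(v, 4))
# T1 0.1269
# T2 0.1668
# T3 0.0837
```

The standard library's `statistics.variance` computes the same quantity and could replace the helper.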

We also have to compute the estimate of the common variance V in case they were actually all equal. That will be $V = \frac{1}{12 - 3} \sum_{i=1,2,3} (4-1)V_i$, where the 3 means that we have three data sets, the 4 means that we have four data points in each set, and 12 is the total number of data points in the table.
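
Because every set has the same size here, the weights are all equal and the formula reduces to the plain average of the three variances. As a worked sketch, with the variances computed from the table and rounded to four decimals (worth re-checking by hand):

$V = \frac{3V_1 + 3V_2 + 3V_3}{9} = \frac{V_1 + V_2 + V_3}{3} \approx \frac{0.1269 + 0.1668 + 0.0837}{3} \approx 0.1258$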

I have not made the computations, since I have no good calculator handy, and I would probably make confusing mistakes, sorry.

Finally you have to compute the Bartlett test statistic itself. First we need the number $D = (12-3)\log V - \sum_{i=1,2,3} (4-1)\log V_i$, where $\log$ is the natural logarithm. We can see from the formula above for V that it would be a pretty good match if all the V1, V2, V3 are the same, because then V would be equal to all of them, and this $D$ would be zero. So $D$ having a small value is good.

To compute the final Bartlett test statistic we also need to have $C = 1 + \left(\sum_{i=1,2,3} \frac{1}{4-1} - \frac{1}{12-3}\right)/(3(3-1)).$ The test statistic becomes $B = D/C$.

Now you have to check $B$ against a $\chi^2$ distribution with 3 - 1 = 2 degrees of freedom.
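
The steps above can be sketched in Python as follows. The function name `bartlett_statistic` is hypothetical, `math.log` is the natural logarithm, and the p-value line relies on the fact that for exactly 2 degrees of freedom the upper tail of the $\chi^2$ distribution is $e^{-B/2}$; it would not be valid for a different number of groups.

```python
import math
from statistics import variance

def bartlett_statistic(groups):
    """Return B = D / C for a list of samples (one list per group)."""
    k = len(groups)                      # number of data sets
    sizes = [len(g) for g in groups]     # n_i
    n_total = sum(sizes)                 # total number of values
    variances = [variance(g) for g in groups]  # V_i, divisor n_i - 1
    # Pooled variance V: weighted average of the V_i.
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    # D compares log of the pooled variance with the individual logs.
    d = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    # C is the correction factor.
    c = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)) / (3 * (k - 1))
    return d / c

groups = [
    [2.42, 2.83, 2.25, 3.02],  # T1
    [3.05, 2.21, 2.18, 2.35],  # T2
    [1.95, 2.23, 2.54, 2.56],  # T3
]
b = bartlett_statistic(groups)
p_value = math.exp(-b / 2)  # chi-square upper tail, valid for 2 df only
print(round(b, 3), round(p_value, 3))  # 0.305 0.858
```

A p-value this large gives no reason to reject the hypothesis of equal variances for these three samples.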

Having typed all of that, maybe it is not as easy as I first thought. But try to compute as many of the numbers as you are able to.

sangui · April 21, 2020

Stupid question, but: what is frac?

taeto · April 21, 2020

On 4/21/2020 at 6:46 PM, sangui said:

Stupid question, but: what is frac?


That is not a stupid question at all. It seems there is a particular quirk to this site, which sometimes forces you to reload a page to view something that was typeset in LaTeX. But if you reload the page and the problem still persists, then please reply with precise information about the piece of text where it occurs, typically somewhere that was supposed to be mathematical.

Edited April 21, 2020 by taeto

sangui · April 22, 2020

Thanks

Is it possible that D is negative?

I don't understand why we use the sum for C. We don't have anything to sum? $\sum_{i=1,2,3}$

Edited April 22, 2020 by sangui

taeto · April 22, 2020

On 4/22/2020 at 7:24 AM, sangui said:

Is it possible that D is negative?

I don't understand why we use the sum for C. We don't have anything to sum? $\sum_{i=1,2,3}$

No, D cannot be negative.
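
A short sketch of why, using the general notation $n_i$ for the size of the $i$-th data set, $n$ for the total number of values and $k$ for the number of data sets (so $n_i = 4$, $n = 12$, $k = 3$ here). The pooled variance is a weighted average of the individual variances:

$V = \sum_i w_i V_i, \qquad w_i = \frac{n_i - 1}{n - k}, \qquad \sum_i w_i = 1$

Because the logarithm is concave, the log of a weighted average is at least the weighted average of the logs:

$\log V \ge \sum_i w_i \log V_i$

Multiplying by $n - k$ gives $D = (n-k)\log V - \sum_i (n_i - 1)\log V_i \ge 0$, with equality only when all the $V_i$ are equal. A negative value in a hand calculation therefore points to an arithmetic slip.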

And the sum for C is $\sum_i \frac{1}{n_i-1}$, where $n_i$ is the number of elements in the $i$-th data set. In our case each T contains 4 numbers. So we have to add 1/3 to itself three times, and the sum adds up to 1. And then when you have that sum, you subtract $\frac{1}{n-3}$, where $n$ is the sum of the $n_i$ and 3 is the number of data sets. Sorry that this was not so clearly written.
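
Written out with the numbers for this table (three data sets of four values each, twelve values in total), the correction factor works out as:

$C = 1 + \frac{\left(\frac{1}{3} + \frac{1}{3} + \frac{1}{3}\right) - \frac{1}{9}}{3(3-1)} = 1 + \frac{8/9}{6} = \frac{31}{27} \approx 1.148$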

Edited April 22, 2020 by taeto

sangui · April 22, 2020

I must be wrong somewhere because I found this.

My B=D/C=0.13257072

And in the solution I was given: Bartlett's K-squared = 0.305
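
One possible explanation, offered as an assumption because the spreadsheet itself is not shown here: the statistic is defined with natural logarithms, while a spreadsheet `LOG` function typically defaults to base 10. Since C does not involve logarithms, using base 10 for D would shrink B by a factor of $\ln 10$. The quick check below rescales the posted value; the variable names are made up for illustration.

```python
import math

b_posted = 0.13257072   # B = D / C as posted above
b_expected = 0.305      # Bartlett's K-squared from the given solution

# If D was computed with log10, multiplying by ln(10) converts it
# to the natural-log version of the statistic.
b_rescaled = b_posted * math.log(10)
print(round(b_rescaled, 3))  # 0.305
```

The rescaled value agrees with the given solution to three decimals, which is consistent with a base-10 logarithm having been used.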

bertlett.xlsx
