Posted (edited)

The question:

"
Mean=38.68, Standard deviation=11.42

Inside what interval is it guaranteed that at least 89% of the concentration observations will lie?
"

I solved for k in Chebyshev's rule equation:

 

[math]100(1-\Big(\dfrac{1}{k^2}\Big))=89[/math]

 

I get k= 3.015...

So at least 89% of all data are within 3.015... standard deviations from the mean.
The bounds are 4.247 to 73.113, rounded to 3 decimal places.
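The arithmetic above can be checked with a short Python sketch. The variable names are illustrative only; the inputs are the mean, standard deviation, and 89% target from the question.

```python
import math

mean = 38.68
std_dev = 11.42
target = 0.89  # required minimum proportion inside the interval

# Solve 1 - 1/k^2 = target for k (taking the positive root).
k = math.sqrt(1.0 / (1.0 - target))

lower = mean - k * std_dev
upper = mean + k * std_dev

print(f"k = {k:.3f}")                          # k = 3.015
print(f"bounds: {lower:.3f} to {upper:.3f}")   # bounds: 4.247 to 73.113
```

This only reproduces the numbers; it says nothing about whether the endpoints belong to the interval.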

The answer bound is (4.247, 73.113) and not [4.247, 73.113]. Why does it not include the boundaries?

Edited by DylsexicChciken
Posted

First, it is an inequality, not an equation.

Chebyshev's inequality specifies that no more than 1/k^2 of a distribution's values are more than k standard deviations from the mean. My reading of that is that exactly 1/k^2 of a distribution's values can be k standard deviations from the mean, but no more than that. When you reverse it to find out how much is inside k standard deviations, you must treat the boundary the same way as you did in the original statement. If 1/k^2 can be on the boundary, then the remainder cannot be on the boundary.

Posted (edited)

So:

 

[math]100(1/k^2)\geq[/math] (percentage of data outside k standard deviations from the mean)

[math]100(1-\Big(\dfrac{1}{k^2}\Big))\leq[/math] (percentage of data inside k standard deviations from the mean)

 

The question asks for the interval within which at least 89% of the data lie, that is, the data within some k standard deviations of the mean. Since k = 3.015, at least 89% of the data are within 3.015 standard deviations of the mean. "At least" means greater than or equal to, so I expected the boundaries to be included, but the answer doesn't include them. So I am still confused. I got everything correct on this homework except the brackets (I chose the inclusive square brackets, "[x, y]").

Edited by DylsexicChciken
Posted

What difference do you think including the boundaries of the interval makes? Assuming a continuous underlying distribution, what percentage of events lies exactly on the boundary?

Posted (edited)

So you're asking what percentage of the entire data is located at an instantaneous point on the graph? Wouldn't that be 0.00000..., which can be approximated by 0? So even if the equation/inequality includes the boundaries, we don't include it anyway?

Edited by DylsexicChciken
Posted

I'd go as far as to claim that 0.0000... equals zero. So it simply doesn't matter, as long as nothing special happens at the interval borders (e.g. if you have discrete possible outcomes that happen to lie exactly on the interval borders). Note that this is only the statement of someone with a little background in mathematics and statistics. I cannot tell you about conventions in fields that use this Chebyshev's rule that I never heard about.
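The discrete case mentioned above can be made concrete with a small sketch. The three-point distribution below is a hypothetical example, not something from the question, and k = 2 is an arbitrary choice. It puts probability mass exactly on the interval borders, so the open and closed intervals capture different amounts.

```python
from fractions import Fraction

k = 2  # illustrative choice of k, assumed for this example
# Hypothetical three-point distribution: mass 1/(2k^2) at -1 and +1, the rest at 0.
tail = Fraction(1, 2 * k**2)
dist = {Fraction(-1): tail, Fraction(0): 1 - 2 * tail, Fraction(1): tail}

mean = sum(x * p for x, p in dist.items())
variance = sum((x - mean) ** 2 * p for x, p in dist.items())

# Compare squared distances with (k*sigma)^2 so the arithmetic stays exact.
threshold_sq = k**2 * variance

p_open = sum(p for x, p in dist.items() if (x - mean) ** 2 < threshold_sq)
p_closed = sum(p for x, p in dist.items() if (x - mean) ** 2 <= threshold_sq)
bound = 1 - Fraction(1, k**2)

print(p_open, p_closed, bound)  # 3/4 1 3/4
```

Here the open interval holds exactly the guaranteed proportion and the closed interval holds everything, so both satisfy "at least 1 - 1/k^2". For a continuous distribution the two probabilities would coincide.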

Posted (edited)

Chebyshev's inequality is that given a random variable [math]X[/math] with expected value [math]\mu[/math] and standard deviation [math]\sigma[/math], we have

[math]P(|X-\mu| \geq k\sigma) \leq \frac{1}{k^2}[/math],

 

where [math]k[/math] is a real number greater than 0. That is, we're looking at the case where [math]|X-\mu|[/math] is greater than *or equal to* [math]k\sigma[/math], so the boundaries are included. Thus, when looking at the values within [math]k\sigma[/math] of [math]\mu[/math], we're looking at [math]P(|X-\mu| < k\sigma)[/math], so the boundaries are not included.

 

timo is also correct in saying that the inclusion of boundaries doesn't matter in general with a continuous distribution.

And 0.000... = 0 for sure. If you were intending to mean something like "infinitely many zeroes with a 1 at the end," then such a statement is invalid.

Edited by John
Posted

The book is giving me a conflicting definition of Chebyshev's rule with what John has posted.

 

This is derived from the probability form of Chebyshev's rule:

 

[latex]P(|X-\mu| \geq k\sigma) \leq \frac{1}{k^2}[/latex]

 

Is this converse true?

[latex]P(|X-\mu| < k\sigma) < 1 - \frac{1}{k^2}[/latex]

 

This is what I translated from my book (Peck, Introduction to Statistics & Data Analysis, 4th edition, page 201). You can find similar statements around the web, such as here: http://www.csus.edu/indiv/s/seria/LectureNotes/Chebyshev.htm

 

I edited the percentage into probability:

 

"

Consider any number k, where k is greater than or equal to 1. Then the (probability) of observations that are within k standard deviations of the mean is at least:

 

[latex](1-\Big(\dfrac{1}{k^2}\Big))[/latex]

 

"

 

Therefore, translating the definition from the book into probability form:

[latex]P(|X-\mu| \leq k\sigma) \geq 1 - \frac{1}{k^2}[/latex]

 

Did I do something wrong in translating from the sentence to the formula, or is something wrong here? The two formulas contradict each other:

 

[latex]P(|X-\mu| < k\sigma) < 1 - \frac{1}{k^2}[/latex]

 

[latex]P(|X-\mu| \leq k\sigma) \geq 1 - \frac{1}{k^2}[/latex]

Posted (edited)

This is really annoying.

 

I lost the entire post when I went to preview.

 

Anyway I will try again.

 

 

The answer bound is (4.247, 73.113) and not [4.247, 73.113]. Why does it not include the boundaries?

 

It really depends upon your point of view, either could be correct.

 

Both Timo and John are correct, but they are making different points.

 

Timo is saying that it depends upon your point of view.

 

John is making the point that when dividing the entire universal population set into regions of interest, you must be careful to avoid counting the boundary twice.

Since John has presented the standard Chebyshev inequality, which includes equality, the region of interest in the question should be denoted by strict inequalities when Chebyshev is reversed.

 

This is what I think your book has done.

 

However, many statisticians start with the reversed inequality and include the equality there. The region of interest to your question is then denoted by inequalities that include equality, and the outer region by strict inequalities.

 

This is because the inner region is normally the one of interest.

Here is an excerpt from my old textbook (Clark and Schkade 1969), who adopt this latter approach.

 

[Attached images of the textbook excerpt: post-74263-0-50786000-1411815339_thumb.jpg, post-74263-0-37984700-1411815383.jpg]

 

The important thing is to be consistent when dividing up the universal population set into regions.

Edited by studiot
Posted (edited)

The book is giving me a conflicting definition of Chebyshev's rule with what John has posted.

 

This is derived from the probability form of Chebyshev's rule:

 

[latex]P(|X-\mu| \geq k\sigma) \leq \frac{1}{k^2}[/latex]

 

Is this converse true?

[latex]P(|X-\mu| < k\sigma) < 1 - \frac{1}{k^2}[/latex]

Not quite. Chebyshev's inequality, in terms of values within [math]k\sigma[/math] of [math]\mu[/math], is equivalent to the following:

 

[math]P(|X - \mu| < k\sigma) \geq 1 - \frac{1}{k^2}[/math].

 

It may seem strange at first, but if you think of dividing up the interval [0, 1] (since probabilities must be between 0 and 1), then Chebyshev's inequality states that *at most* [math]\left(0, \frac{1}{k^2} \right)[/math] is occupied by values outside [math]\mu \pm k\sigma[/math], which leaves *at least* [math]\left( \frac{1}{k^2}, 1 \right)[/math] to be occupied by values inside [math]\mu \pm k\sigma[/math]. I don't know if that clears things up extremely well, but there you have it. I think this addresses the rest of your post, too.
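The "at least" form with a strict inequality can also be checked numerically. The sketch below is only an illustration under assumptions of my own: the exponential sample, the seed, and the list of k values are arbitrary choices, and the population standard deviation (dividing by n) is used so that Chebyshev's inequality applies exactly to the sample treated as its own distribution.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable
# Skewed sample, chosen arbitrarily for illustration.
data = [random.expovariate(1.0) for _ in range(10_000)]

mu = statistics.fmean(data)
sigma = statistics.pstdev(data)  # population form (divide by n)


def fraction_strictly_inside(k):
    """Share of observations with |x - mu| < k * sigma (open interval)."""
    return sum(abs(x - mu) < k * sigma for x in data) / len(data)


for k in (1.5, 2, 3, 3.015):
    inside = fraction_strictly_inside(k)
    print(f"k={k}: inside={inside:.4f}, bound={1 - 1 / k**2:.4f}")
```

In every row the observed share inside the open interval should be at least the bound, usually by a wide margin, since Chebyshev's inequality is a worst-case guarantee.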

 

Was this a test question you answered that was marked wrong, or is it simply the answer in the back of your book? The reason I ask is that, in the latter case, it's likely just a matter of convention. If it's the former, then it may be a case of the instructor wanting the tightest interval for which the conditions of the exercise are satisfied.

 

That is to say, the way the question is worded in your original post, [math](-\infty, \infty)[/math] is also technically correct, and so is the answer you provided, but the same interval without inclusion of the boundaries is also correct and is in some sense the "smallest" valid interval.

Edited by John
Posted (edited)

It was a homework question on Webassign. The book for the Webassign didn't cover boundaries of Chebyshev's bounds, but it still gave us a question on boundaries of Chebyshev's bounds and whether to include the end points.

 

Yes, the [0, 1] number line example cleared things up. It helped when I drew a number line and marked arbitrary values within [latex][ 0, \dfrac{1}{k^2} ][/latex]; the remainder would then always be greater than or equal to [latex]1 - \dfrac{1}{k^2} [/latex]. Thanks for the help.

Edited by DylsexicChciken
