
Can you help me find a generally recognised mathematical rule/law/curve which most closely matches a fixed set of numbers?



Posted

Hi, I hope someone can help me with this problem! I have a distribution of 2000 numbers but I only know the first 10. The first 10 numbers are: 
2025, 1000, 335, 300, 187.5, 135, 99.5, 20, 17.5, and 13.5. I know that the total of all 2000 numbers is about 12,000. 

What I am looking for is a recognised long-tail curve model that, using only the first 10 numbers, predicts a total for all 2000 numbers in the sequence. Ideally the total would be between 11,000 and 13,000. I did look at Zipf's law but I'm not sure it works here.

I am not a mathematician but would appreciate any recognised models/distribution curves/laws you could suggest and what the model you suggest would give as a total for all 2000 numbers in the sequence knowing only the first 10 numbers. 

The key thing I am trying to do is fit these numbers to a recognised curve / long-tail model. I am less bothered about the total (although ideally it would be in the range 11,000 to 13,000) and more bothered about the model being something that mathematicians globally would recognise.

I hope this is clear. Thank you so much in advance to anyone who tries to solve this for me!  

Posted

You haven't said whether these numbers are the result of a specific calculation, or whether they are experimental observations. They very much appear to be exponential decay, but not all the numbers fit perfectly. My scientific calculator suggests that a reasonable mathematical model (assuming exponential decay) could be y = 2870 * (0.576)^x, where x = 1 for the first term (the 2025 term). Exponential decay does have a long tail as you suggested.
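For anyone who wants to reproduce that fit and see what it implies for the total, here is a minimal sketch. It assumes the calculator did an ordinary least-squares fit on the logarithms of the ten values (the usual "exponential regression" on a scientific calculator, but that is an assumption), and the variable names are only illustrative.

```python
import math

# The ten known costs from the original post, largest first.
known = [2025, 1000, 335, 300, 187.5, 135, 99.5, 20, 17.5, 13.5]
n_total = 2000  # number of costs stated in the thread

# Least-squares line through (x, ln y), with x = 1 for the first term.
xs = list(range(1, len(known) + 1))
ys = [math.log(v) for v in known]
x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)
slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
intercept = y_mean - slope * x_mean

# Back-transform to the form y = a * b**x.
a = math.exp(intercept)
b = math.exp(slope)

# Sum the fitted model over all 2000 positions.
total = sum(a * b ** x for x in range(1, n_total + 1))

print(f"y = {a:.0f} * {b:.3f}^x")
print(f"model total over {n_total} terms: {total:.0f}")
print(f"sum of the ten known values: {sum(known):.0f}")
```

Under that assumption the fitted coefficients come out close to the 2870 and 0.576 quoted above. The model total over all 2000 terms is roughly 3,900, which is below the 4,133 that the ten known values already add up to, so a pure exponential decays too fast to reach a total near 12,000.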

Posted

Hi, thanks for your reply. To give you some context, the numbers are actually costs. I know that approximately 2000 costs were incurred. I don't know the exact total spent but believe it was approximately $12,000; most of the costs were just small amounts.

Lots of people are estimating the total cost at $12,000 (we will never know for sure the total cost as we can't find out what every single one of the 2000 costs incurred actually were). I want to see if I can add a mathematical dimension to all the people just sticking their finger in the air and guessing what the total is. 

What I'm trying to get to is a statement that says something like "If the 10 known costs follow an XYZ distribution then the total costs incurred would be XXXXX", where XXXXX is a total cost of around $11,000 to $13,000. I don't know if this is even possible but hopefully someone can pin this to a recognized distribution/rule/law/model/curve.
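To illustrate how a statement of that form could be produced, here is a minimal sketch using a Zipf-style power law, cost(r) = C * r^(-s), where r is the rank of the cost. The fitting method (least squares on log rank against log cost) and all the names are assumptions for illustration only; nothing in this thread establishes that the costs actually follow such a law.

```python
import math

# The ten known costs, ranked from largest (rank 1) to smallest (rank 10).
known = [2025, 1000, 335, 300, 187.5, 135, 99.5, 20, 17.5, 13.5]
n_total = 2000  # approximate number of costs incurred

# Least-squares line through (ln rank, ln cost).
lx = [math.log(r) for r in range(1, len(known) + 1)]
ly = [math.log(v) for v in known]
lx_mean = sum(lx) / len(lx)
ly_mean = sum(ly) / len(ly)
slope = (sum((x - lx_mean) * (y - ly_mean) for x, y in zip(lx, ly))
         / sum((x - lx_mean) ** 2 for x in lx))

s = -slope                                  # Zipf exponent
C = math.exp(ly_mean - slope * lx_mean)     # fitted cost at rank 1

# Total predicted by the fitted power law over all 2000 ranks.
zipf_total = sum(C * r ** (-s) for r in range(1, n_total + 1))

print(f"cost(r) = {C:.0f} * r^(-{s:.2f})")
print(f"If the 10 known costs follow this power law, "
      f"the total over {n_total} costs would be about {zipf_total:.0f}")
```

Whatever total this prints is a consequence of the fit, not a target: the model either lands in the 11,000 to 13,000 range or it does not, and picking a curve because it gives the desired total would defeat the purpose.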

Hope this explains it! And thanks again for your reply. Apologies, I am not in any way an expert in maths, so I really appreciate any help I can get!

Jamie. 

Posted (edited)

The 10 costs you showed add up to about 34% of the total of $12,000. But the 10 costs represent only 0.5% of the total number of costs. That means to me that your 10 cost sample is not representative of the total.
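The arithmetic behind those percentages can be checked with a short sketch. The 12,000 total and the count of 2000 are the approximate figures from the thread, not measured values, and the variable names are only illustrative.

```python
# The ten known costs from the original post.
known = [2025, 1000, 335, 300, 187.5, 135, 99.5, 20, 17.5, 13.5]
estimated_total = 12_000   # approximate total quoted in the thread
n_total = 2000             # approximate number of costs

known_total = sum(known)
share_of_value = known_total / estimated_total   # fraction of the money
share_of_count = len(known) / n_total            # fraction of the items

# Average size the remaining costs would need to have to reach the estimate.
remaining_avg = (estimated_total - known_total) / (n_total - len(known))

print(f"known total: {known_total:.0f}")
print(f"share of value: {share_of_value:.1%}")
print(f"share of count: {share_of_count:.1%}")
print(f"average of remaining costs: {remaining_avg:.2f}")
```

So for the $12,000 estimate to hold, the other 1990 costs would have to average a little under $4 each, which is far smaller than any of the ten known values.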

I do not see how you could get anything that is useful out of these numbers since your sample is not representative of the total.  I am not a statistician so maybe there is some "magic" they could do to give you something useful.

Good luck. 

Edited by Bufofrog
