usO Posted April 19, 2012
• I am about to start work on the final effluent of a wastewater treatment plant. I will collect one sample per month for 12 months, with collections spread across autumn, winter, spring and summer. Physicochemical parameters (temperature, pH, electrical conductivity (EC), turbidity, total dissolved solids (TDS) and dissolved oxygen (DO)) will be measured alongside microbiological components: bacteria (faecal coliforms, E. coli and Vibrio) and viruses (enterovirus, rotavirus, norovirus and adenovirus). The data are meant to assess the quality of the final effluent and to show what interactions exist between the physicochemical and microbiological components.
• I want to know which statistical analysis method would best analyse the data collected, and why.
• If more than one statistical method can be used, could all responders please explain their choice of method simply and clearly?
Advice and suggestions will be highly appreciated. Thank you.
Moontanman Posted April 19, 2012
One sample a month? I would want at least one sample a day to do that effectively...
CharonY Posted April 20, 2012
First of all, I agree that sampling once a month increases the variability of the data set and makes detailed comparisons very complicated. To assess what kind of analyses may be useful, you need to add the single most important bit to your post: what do you actually want to see? What are the parameters? Do you want to find associations between parameters (e.g. temperature and bacterial content)? Or seasonal changes, or...? The more parameters you collect, the more likely it is that you will find random associations that come out as significant in specific tests. Thus, instead of a fishing expedition (which is prone to false positive identifications), you should think about which hypotheses you want to test. Random data mining rarely yields results that survive scrutiny. In that regard I would recommend reading the Ioannidis paper (PLoS Med, 2005).
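To put a rough number on the point about random associations, here is a minimal sketch in Python (the thread itself uses SPSS, so the language is only for illustration). It counts how many pairwise tests the parameters listed in the opening post would allow and estimates the chance of at least one false positive at a 5% significance level. It assumes the tests are independent, which is a simplification: measurements taken from the same samples are correlated, so treat the figures as a ballpark only. The function name `familywise_error_rate` is made up for this example.

```python
from math import comb

# Parameters as listed in the opening post.
physicochemical = ["temperature", "pH", "EC", "turbidity", "TDS", "DO"]
microbiological = ["faecal coliforms", "E. coli", "Vibrio",
                   "enterovirus", "rotavirus", "norovirus", "adenovirus"]


def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across n_tests tests.

    Assumes the tests are independent (a simplification for illustration).
    """
    return 1.0 - (1.0 - alpha) ** n_tests


# Every parameter tested against every other parameter.
all_pairs = comb(len(physicochemical) + len(microbiological), 2)
# Only physicochemical parameters against microbiological ones.
cross_pairs = len(physicochemical) * len(microbiological)

for label, n_tests in [("all pairs", all_pairs),
                       ("physicochemical vs microbiological", cross_pairs)]:
    rate = familywise_error_rate(n_tests)
    print(f"{label}: {n_tests} tests, "
          f"P(at least one false positive) = {rate:.2f}")
```

Even the restricted set of comparisons leaves a high chance of at least one spurious "significant" result, which is why narrowing the list to a few pre-stated hypotheses matters.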
sheldysheldon Posted April 20, 2012
Here's something that may help. It looks like a lot of work, but it is interesting. I have often thought that statistical analysis is a great tool for figuring out what the heck is really going on. The ultimate truth filter. Here are some links:
http://davidmlane.com/hyperstat/Statistical_analyses.html#gen
http://spotfire.tibco.com/products/s-plus/statistical-analysis-software.aspx
http://www.ehow.com/how_5666129_conduct-water-quality-statistical-analysis.html
usO (Author) Posted April 20, 2012
Quoting Moontanman: "One sample a month? I would want at least one sample a day to do that effectively..."
The reason for collecting one sample per month is the distance from the treatment works to the laboratory. The journey takes about 5 hours.
Quoting sheldysheldon: "Here's something that may help. Looks like a lot of work but interesting. I have often thought that statistical analysis is a great tool to figure out what the heck is really going on. The ultimate truth filter. here's links http://davidmlane.co...alyses.html#gen http://spotfire.tibc...s-software.aspx http://www.ehow.com/...l-analysis.html"
Thank you for the links. The first two are not really relevant, since I will be using SPSS; the third was helpful. I would welcome more guides like it.
Quoting CharonY: "First of all I agree that once a month increases the variability of the data set and makes detailed comparisons very complicated. To assess what kind of analyses may be useful you need to add the single most important bit to your post. What do you actually want to see? What are the parameters? Do you want to find associations between parameters (e.g. temperature and bacterial content)? Or seasonal changes, or...? The more parameters you collect the more likely it is that you find random associations that will become significant with specific tests. Thus, instead of a fishing expedition (which is prone to false positive identifications) you should think about what hypotheses you want to test. Random data mining rarely yields results that survive scrutiny. In that regard I would recommend reading the Ioannidis paper (PLoS Med, 2005)."
I thought I had stated the parameters (pH, BOD, etc.), and yes, I want to find associations between the parameters. As for collecting more parameters, the research is limited to the time stipulated. Could you please shed more light on "Random data mining..."? (I am just reading the Ioannidis paper.) You sound well informed about the use of statistics, so please do give more guidance. Thank you.
CharonY Posted April 20, 2012
Stating a list of parameters without a hypothesis about their interdependencies is precisely one of the problems. In this case it is the multiple-hypothesis problem. If you test every parameter against everything else (which is very tempting, as it is rather easy to amass a lot of data), you are basically treating the parameters as independent. However, they all come from the same sample. That is, if you test enough parameters you are bound to find something that correlates with something else by pure chance.
Think of flipping a fair coin. If you do it 20 times, you expect roughly 50% heads and 50% tails (or whatever is on the coin). However, if you let, say, hundreds of people do it, you are likely to find at least a few for whom the ratio is skewed. If you had tested blindly, you would have assumed that those coins were loaded, when the result is really due to repeated testing (in this particular case one would have had to average over all individuals, something you cannot do with the different parameters of the water sample). Note that you will find many publications (especially epidemiological or environmental engineering papers) in which this is totally ignored.
What you could do is adjust the outcome of, e.g., regression analyses according to the number of parameters tested. The most rigorous adjustment is the Bonferroni correction. However, since it throws away a lot of true positives, one of the modified versions is usually used. In my opinion, though, the correct way to do these kinds of analyses is to pre-define the dependencies that you expect (e.g. pH is temperature dependent), and then use only the relevant data and test for those. The corrections are smaller that way, as you test fewer hypotheses on the same data set.
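As an illustration of the corrections mentioned above, here is a minimal Python sketch of the Bonferroni correction next to Holm's step-down procedure. The post does not say which "modified version" is meant, so Holm is an assumption chosen only because it is a common, less conservative alternative. The p-values are hypothetical and do not come from any real effluent data.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm's step-down procedure.

    Compare the smallest p-value with alpha / m, the next with
    alpha / (m - 1), and so on; stop at the first one that fails.
    """
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every larger p-value is retained as well
    return reject


# Hypothetical p-values from five pre-defined tests (illustration only).
p_values = [0.001, 0.011, 0.02, 0.04, 0.30]

print("uncorrected:", [p <= 0.05 for p in p_values])
print("Bonferroni: ", bonferroni(p_values))
print("Holm:       ", holm(p_values))
```

With these made-up numbers, four tests look significant without correction, Bonferroni keeps one, and Holm keeps two. Both corrections become less severe as the number of tests shrinks, which is the practical benefit of pre-defining a small set of hypotheses.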
ewmon Posted April 20, 2012
Quoting usO: "The reason for collecting a sample per month is because of the distance of the treatment work to the laboratory ... takes about 5 hours journey"
One sample a month for 12 months gives 12 samples, and depending on the statistics you use, 12 samples may be too few to draw reliable conclusions from. I have always understood 30 samples to be a minimum. For you, that would be one sample every 12 days. As for showing cause and effect, would a Pareto chart be appropriate?
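One way to see what 12 versus 30 samples means in practice is to ask how strong a correlation must be before it counts as significant. The Python sketch below uses the Fisher z-transform approximation, which is an assumption on my part (it presumes a Pearson correlation on roughly bivariate-normal data and is not a method proposed in the thread). The function name `critical_r` is made up for this example.

```python
from math import sqrt, tanh
from statistics import NormalDist


def critical_r(n: int, alpha: float = 0.05) -> float:
    """Approximate smallest |r| that is significant (two-sided) for n pairs.

    Uses the Fisher z-transform: when the true correlation is zero,
    atanh(r) is roughly normal with standard error 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("need more than 3 paired samples")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return tanh(z / sqrt(n - 3))


# Monthly, every 12 days, and daily sampling over one year.
for n in (12, 30, 365):
    print(f"n = {n:3d}: |r| must exceed about {critical_r(n):.2f}")
```

Under these assumptions, only fairly strong correlations can be detected with 12 monthly samples, and the threshold drops noticeably with 30. This is before any correction for multiple testing is applied.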
usO (Author) Posted April 21, 2012
I thank you all for your contributions. I will come around again!