Water analysis

usO · April 19, 2012

• I am about to start working on the final effluent of a wastewater treatment plant. I will be collecting samples over a period of 12 months, one sample per month, so the collections will be spread across autumn, winter, spring and summer. Physicochemical parameters such as temperature, pH, electrical conductivity (EC), turbidity, total dissolved solids (TDS) and dissolved oxygen (DO) will be measured, along with microbiological components: bacteria (faecal coliforms, E. coli and Vibrio) and viruses (enterovirus, rotavirus, norovirus and adenovirus). The data are meant to assess the quality of the final effluent and to show what interactions exist between the physicochemical and microbiological components.

• I want to know which statistical method would best analyse the data collected, and why.

• If more than one statistical method can be used, could all responders please explain their choice of method simply and clearly? Advice and suggestions will be highly appreciated. Thank you.

Moontanman · April 19, 2012

One sample a month? I would want at least one sample a day to do that effectively...

**CharonY** · April 20, 2012

First of all, I agree that sampling once a month increases the variability of the data set and makes detailed comparisons very complicated. To assess what kind of analyses may be useful, you need to add the single most important bit to your post: what do you actually want to see? What are the parameters? Do you want to find associations between parameters (e.g. temperature and bacterial content)? Or seasonal changes, or...? The more parameters you collect, the more likely it is that you will find random associations that come out as significant in specific tests.

Thus, instead of a fishing expedition (which is prone to false positives), you should think about which hypotheses you want to test. Random data mining rarely yields results that survive scrutiny. In that regard I would recommend reading the Ioannidis paper (2005, PLoS Med).

sheldysheldon · April 20, 2012

Here's something that may help. Looks like a lot of work but interesting. I have often thought that statistical analysis is a great tool to figure out what the heck is really going on. The ultimate truth filter.

Here are the links:

http://davidmlane.com/hyperstat/Statistical_analyses.html#gen

http://spotfire.tibco.com/products/s-plus/statistical-analysis-software.aspx

http://www.ehow.com/how_5666129_conduct-water-quality-statistical-analysis.html

usO · April 20, 2012

> One sample a month? I would want at least one sample a day to do that effectively...

The reason for collecting one sample per month is the distance from the treatment works to the laboratory. The journey takes about 5 hours.

> Here's something that may help. Looks like a lot of work but interesting. I have often thought that statistical analysis is a great tool to figure out what the heck is really going on. The ultimate truth filter.
>
> here's links
>
> http://davidmlane.co...alyses.html#gen
>
> http://spotfire.tibc...s-software.aspx
>
> http://www.ehow.com/...l-analysis.html

Thank you for the links. The first two are not really relevant since I will be using SPSS, but the third was helpful. I would welcome more guides like it.

> First of all, I agree that sampling once a month increases the variability of the data set and makes detailed comparisons very complicated. To assess what kind of analyses may be useful, you need to add the single most important bit to your post: what do you actually want to see? What are the parameters? Do you want to find associations between parameters (e.g. temperature and bacterial content)? Or seasonal changes, or...? The more parameters you collect, the more likely it is that you will find random associations that come out as significant in specific tests.
>
> Thus, instead of a fishing expedition (which is prone to false positives), you should think about which hypotheses you want to test. Random data mining rarely yields results that survive scrutiny. In that regard I would recommend reading the Ioannidis paper (2005, PLoS Med).

I thought I had stated the parameters (pH, BOD, etc.), and yes, I want to find associations between the parameters. As for collecting more parameters, the research is limited to the time stipulated. Could you please throw more light on "Random data mining..."? (I am just reading the Ioannidis paper.) You sound well informed on the use of statistics, so please do give more guidance. Thank you.

**CharonY** · April 20, 2012

Stating a list of parameters without having a hypothesis about their interdependencies is precisely one of the problems. In this case it would be the multiple-hypothesis problem. What happens is that if you test every parameter against everything else (which is very tempting, as it is rather easy to amass a lot of data), then you are basically treating the parameters as independent. However, they are all from the same sample. In other words, if you test enough parameters you are bound to find something that correlates, by pure chance, with something else.
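To put a rough number on this, here is a minimal sketch that counts the pairwise tests implied by the parameter list in the opening post and estimates how likely at least one false positive becomes. The split into two lists, the 0.05 threshold, and the independence of the tests are all simplifying assumptions for illustration; the real tests share the same samples, so the true figure will differ.

```python
from itertools import product

# Parameter names come from the opening post; grouping them this way is an assumption.
physicochemical = ["temperature", "pH", "EC", "turbidity", "TDS", "DO"]
microbiological = ["faecal coliform", "E. coli", "Vibrio",
                   "enterovirus", "rotavirus", "norovirus", "adenovirus"]

alpha = 0.05  # assumed per-test significance threshold

# Every physicochemical parameter tested against every microbiological one
pairs = list(product(physicochemical, microbiological))
n_tests = len(pairs)

# Chance of at least one false positive if every null hypothesis were true
# and the tests were independent (a simplification)
fwer = 1 - (1 - alpha) ** n_tests

# Expected number of false positives under the same assumptions
expected_false_positives = alpha * n_tests

print(f"{n_tests} tests, P(at least one false positive) = {fwer:.3f}, "
      f"expected false positives = {expected_false_positives:.1f}")
```

Under those assumptions, 42 uncorrected tests give roughly an 88% chance of at least one spurious "significant" association, even if nothing in the effluent is actually related.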

Think of flipping a fair coin. If you do it 20 times, you expect roughly 50% heads and 50% tails (or whatever is on the coin). However, if you let, say, hundreds of people do it, you are likely to find at least a few for whom the ratio is skewed. If you had tested blindly, you would have concluded that those coins were loaded, when the result is simply due to repeated testing. (In this particular case one would have to average over all individuals, something you cannot do with the different parameters of a water sample.)
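The coin example can be worked out exactly rather than simulated. This is only a sketch: the group size of 200 people and the cut-off for "skewed" (5 or fewer, or 15 or more, heads out of 20) are assumed values chosen for illustration.

```python
from math import comb

def two_sided_tail(n_flips, k_extreme):
    """P(heads <= k_extreme or heads >= n_flips - k_extreme) for a fair coin.

    Uses the symmetry of the fair-coin binomial; valid when k_extreme < n_flips / 2.
    """
    lower = sum(comb(n_flips, k) for k in range(k_extreme + 1))
    return 2 * lower / 2 ** n_flips

n_flips = 20
n_people = 200  # hypothetical group size

# Probability that one fair coin looks "loaded" (at most 5 or at least 15 heads)
p_skewed = two_sided_tail(n_flips, 5)

# Expected number of people with such a skewed result
expected_skewed = n_people * p_skewed

# Probability that at least one person gets a skewed result
p_at_least_one = 1 - (1 - p_skewed) ** n_people

print(f"P(skewed) = {p_skewed:.4f}")
print(f"expected skewed results among {n_people} people = {expected_skewed:.1f}")
print(f"P(at least one skewed result) = {p_at_least_one:.4f}")
```

A single fair coin gives such a lopsided result only about 4% of the time, yet with 200 people flipping you would expect around eight of them to see it.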

Note that you will find many publications (especially epidemiological or environmental engineering papers) in which this is totally ignored. What you could do is adjust the outcome of, e.g., regression analyses according to the number of parameters tested. The most rigorous adjustment is the Bonferroni correction. However, since it throws away a lot of true positives, one of the modified versions is usually used instead.
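For illustration, here is a minimal sketch comparing the Bonferroni correction with the Holm step-down procedure, which is one commonly used modification (picked here as an example, not necessarily the one meant above). The p-values are made up purely to show the mechanics.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def holm(p_values, alpha=0.05):
    """Holm step-down: compare sorted p-values to alpha/m, alpha/(m-1), ...

    Stops at the first p-value that fails its threshold; all larger
    p-values are then not rejected either.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject

# Made-up p-values from five hypothetical tests on the same data set
p_values = [0.001, 0.011, 0.02, 0.04, 0.30]

print("uncorrected:", [p <= 0.05 for p in p_values])
print("Bonferroni: ", bonferroni(p_values))
print("Holm:       ", holm(p_values))
```

With these numbers four tests look significant uncorrected, Bonferroni keeps one and Holm keeps two, which shows why the less conservative variants are often preferred.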

However, in my opinion the correct way to do these kinds of analyses is to pre-define the dependencies that you expect (e.g. pH is temperature dependent), and then use only the relevant data and test for those. The corrections are smaller that way, as you test fewer hypotheses on the same data set.
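As a sketch of what testing one pre-defined dependency could look like, the code below computes a Spearman rank correlation with a permutation p-value for a single pair of variables. Everything specific here is an assumption: the choice of Spearman, the choice of pair (temperature against E. coli counts), and above all the twelve monthly values, which are invented placeholders and not real measurements.

```python
import random

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided p-value: how often shuffled data give |rho| at least as large."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    y_shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)
        if abs(spearman(x, y_shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# INVENTED placeholder values for 12 monthly samples, for illustration only
temperature_c = [12, 14, 17, 20, 23, 25, 26, 24, 21, 18, 15, 13]
log10_ecoli = [2.1, 2.3, 2.2, 2.8, 3.0, 3.4, 3.3, 3.1, 2.9, 2.5, 2.4, 2.0]

rho = spearman(temperature_c, log10_ecoli)
p_value = permutation_p_value(temperature_c, log10_ecoli)
print(f"rho = {rho:.3f}, permutation p = {p_value:.4f}")
```

Because only one hypothesis is tested, no multiple-testing correction is needed for it; each additional pre-defined pair would add one more test to correct for.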

ewmon · April 20, 2012

> The reason for collecting a sample per month is because of the distance of the treatment work to the laboratory ... takes about 5 hours journey

One sample a month for 12 months gives 12 samples, and depending on the statistics you use, 12 samples may be too few to draw reliable conclusions. I always understood 30 samples to be the minimum. For you, that would be roughly one sample every 12 days.
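One way to see what 12 versus 30 samples means in practice is to ask how strong a Pearson correlation must be before it reaches significance. The sketch below assumes a two-sided test at the 5% level and uses critical t values copied from standard tables; it is an illustration of the sample-size point, not a recommendation of Pearson correlation for these data.

```python
from math import sqrt

# Two-sided 5% critical values of Student's t (standard table values),
# keyed by degrees of freedom
T_CRIT_975 = {10: 2.228, 28: 2.048}

def critical_r(n):
    """Smallest |r| that is significant at the 5% level for n paired samples."""
    df = n - 2
    t = T_CRIT_975[df]
    return t / sqrt(t * t + df)

for n in (12, 30):
    print(f"n = {n}: |r| must exceed {critical_r(n):.3f}")
```

With 12 samples only correlations stronger than about 0.58 can be detected, while 30 samples bring the threshold down to about 0.36.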

As to showing cause and effect, would a Pareto Chart be appropriate?

usO · April 21, 2012

I thank you all for your contributions. I will come around again!!!

