Water analysis

April 19, 2012

• I am about to start work on the final effluent of a wastewater treatment plant. I will collect samples over a period of 12 months, one sample per month, so that sampling is spread across autumn, winter, spring and summer. Physicochemical parameters (temperature, pH, electrical conductivity (EC), turbidity, total dissolved solids (TDS) and dissolved oxygen (DO)) will be measured, along with microbiological components: bacteria (faecal coliforms, E. coli and Vibrio) and viruses (enterovirus, rotavirus, norovirus and adenovirus). The data are meant to assess the quality of the final effluent and to show what interactions exist between the physicochemical and the microbiological components.

• I want to know which statistical analysis method would best suit the data collected, and why.

• If more than one statistical method can be used, could responders please explain their choice of method simply and clearly? Advice and suggestions will be highly appreciated. Thank you.

April 19, 2012

One sample a month? I would want at least one sample a day to do that effectively...

April 20, 2012

First of all, I agree that sampling once a month increases the variability of the data set and makes detailed comparisons very complicated. To assess what kind of analyses may be useful, you need to add the single most important bit to your post: what do you actually want to see? What are the parameters? Do you want to find associations between parameters (e.g. temperature and bacterial content)? Or seasonal changes, or...? The more parameters you collect, the more likely it is that you will find random associations that come out as significant in specific tests.

Thus, instead of a fishing expedition (which is prone to false positives), you should think about which hypotheses you want to test. Random data mining rarely yields results that survive scrutiny. In that regard I would recommend reading the Ioannidis paper (PLoS Med, 2005).

April 20, 2012

Here's something that may help. Looks like a lot of work but interesting. I have often thought that statistical analysis is a great tool to figure out what the heck is really going on. The ultimate truth filter.

Here are the links:

http://davidmlane.com/hyperstat/Statistical_analyses.html#gen

http://spotfire.tibco.com/products/s-plus/statistical-analysis-software.aspx

http://www.ehow.com/how_5666129_conduct-water-quality-statistical-analysis.html

April 20, 2012

Author

One sample a month? I would want at least one sample a day to do that effectively...

The reason for collecting one sample per month is the distance from the treatment works to the laboratory. The journey takes about 5 hours.

Here's something that may help. Looks like a lot of work but interesting. I have often thought that statistical analysis is a great tool to figure out what the heck is really going on. The ultimate truth filter.

here's links

http://davidmlane.co...alyses.html#gen

http://spotfire.tibc...s-software.aspx

http://www.ehow.com/...l-analysis.html

Thank you for the links. The first two are not really relevant since I will be using SPSS, but the third was helpful. I would welcome more guides like it.

First of all, I agree that sampling once a month increases the variability of the data set and makes detailed comparisons very complicated. To assess what kind of analyses may be useful, you need to add the single most important bit to your post: what do you actually want to see? What are the parameters? Do you want to find associations between parameters (e.g. temperature and bacterial content)? Or seasonal changes, or...? The more parameters you collect, the more likely it is that you will find random associations that come out as significant in specific tests.

Thus, instead of a fishing expedition (which is prone to false positives), you should think about which hypotheses you want to test. Random data mining rarely yields results that survive scrutiny. In that regard I would recommend reading the Ioannidis paper (PLoS Med, 2005).

I thought I had stated the parameters (pH, BOD, etc.), and yes, I want to find associations between the parameters. As for collecting more parameters, the research is limited to the time stipulated. Could you please shed more light on "Random data mining..."? (I am just reading the Ioannidis paper.) You sound more knowledgeable about the use of statistics, so please do give more guidance. Thank you.

April 20, 2012

Stating a list of parameters without a hypothesis about their interdependencies is precisely one of the problems. In this case it is the multiple-hypothesis (multiple-comparisons) problem. If you test every parameter against everything else (which is very tempting, as it is rather easy to amass a lot of data), you are basically treating the parameters as independent. However, they all come from the same sample. That is, if you test enough parameters you are bound to find something that correlates with something else by pure chance.
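To put a rough number on this, the sketch below counts the pairwise tests implied by the 13 parameters named in the opening post. The 5% significance level and the independence of the tests are simplifying assumptions made only for illustration; tests on one data set are not strictly independent.

```python
from math import comb

# Parameters listed in the opening post (13 in total).
parameters = [
    "temperature", "pH", "EC", "turbidity", "TDS", "DO",
    "faecal coliform", "E. coli", "Vibrio",
    "enterovirus", "rotavirus", "norovirus", "adenovirus",
]

alpha = 0.05  # assumed per-test significance level

# Number of tests if every parameter is tested against every other one.
n_pairs = comb(len(parameters), 2)

# Expected number of "significant" results if no real association exists.
expected_false_positives = n_pairs * alpha

# Chance of at least one false positive, under the simplifying
# assumption that the tests are independent of each other.
p_at_least_one = 1 - (1 - alpha) ** n_pairs

print(n_pairs, round(expected_false_positives, 1), round(p_at_least_one, 3))
```

Under these assumptions, 78 pairwise tests would be expected to produce about four spurious "findings" even if nothing in the effluent were related to anything else.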

Think of flipping a fair coin. If you do it 20 times, you expect roughly half heads and half tails (or whatever is on the coin). However, if you let, say, hundreds of people do it, you are likely to find at least a few for whom the ratio is skewed. If you had tested each coin blindly, you would have concluded that those coins were loaded, yet the skew is simply a consequence of repeated testing. (In this particular case one would have to average over all individuals, something you cannot do with the different parameters of a water sample.)
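The coin example is easy to simulate. This is a minimal sketch; the group size of 500 people and the random seed are arbitrary choices, not figures from the thread. The cutoff of 5 or fewer, or 15 or more, heads out of 20 is where an exact two-sided binomial test at the 5% level would flag a single coin.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

N_PEOPLE = 500  # hypothetical number of people flipping
N_FLIPS = 20    # flips per person, as in the example above


def count_heads(n_flips):
    """Flip a fair coin n_flips times and return the number of heads."""
    return sum(random.random() < 0.5 for _ in range(n_flips))


heads = [count_heads(N_FLIPS) for _ in range(N_PEOPLE)]

# Coins that a single two-sided test at the 5% level would call "loaded".
flagged = [h for h in heads if h <= 5 or h >= 15]

print(f"{len(flagged)} of {N_PEOPLE} fair coins look loaded")
```

Every coin in the simulation is fair, so each flagged coin is a false positive produced purely by testing many times.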

Note that you will find many publications (especially epidemiological or environmental engineering papers) in which this is totally ignored. What you could do is adjust the outcome of, e.g., regression analyses according to the number of parameters tested. The most stringent adjustment is the Bonferroni correction. However, since it throws away a lot of true positives, one of its modified versions is usually used instead.
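As a rough illustration, here is the Bonferroni correction next to the Holm step-down procedure, picked here as one example of a less conservative modification (the post does not name a specific one). The p-values are made up for the example and do not come from any real data.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm step-down: compare the k-th smallest p-value with alpha / (m - k)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for k, i in enumerate(order):  # k = 0 for the smallest p-value
        if p_values[i] <= alpha / (m - k):
            reject[i] = True
        else:
            break  # stop at the first non-rejection
    return reject


# Made-up p-values for illustration only, not results from any real data.
p_values = [0.001, 0.012, 0.015, 0.040, 0.300]

print(bonferroni(p_values))  # [True, False, False, False, False]
print(holm(p_values))        # [True, True, True, False, False]
```

With these five p-values Bonferroni keeps one result and Holm keeps three, while both control the chance of any false positive at 5%.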

However, imo the correct way to do these kinds of analyses is to pre-define the dependencies that you expect, e.g. that pH is temperature dependent, and then use only the relevant data and test for those. The corrections are smaller that way (as you test fewer hypotheses on the same data set).

April 20, 2012

The reason for collecting a sample per month is because of the distance of the treatment work to the laboratory ... takes about 5 hours journey

One sample a month for 12 months results in 12 samples, and depending on the statistics you use, 12 samples may be an insufficient amount from which to draw reliable conclusions. I always understood 30 samples to be a minimum number. This would be one every 12 days for you.
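One way to see what 12 samples can and cannot show is to ask how strong a Pearson correlation must be before it reaches significance. The sketch below assumes a two-sided test at the 5% level and takes the critical t values from standard tables; it uses the relation r = t / sqrt(t^2 + n - 2).

```python
from math import sqrt

# Two-sided 5% critical values of Student's t, copied from standard tables
# for the two sample sizes discussed (degrees of freedom = n - 2).
T_CRIT = {12: 2.228, 30: 2.048}


def min_significant_r(n):
    """Smallest |r| that a Pearson correlation test calls significant at 5%."""
    t = T_CRIT[n]
    df = n - 2
    return t / sqrt(t * t + df)


for n in (12, 30):
    print(n, round(min_significant_r(n), 3))
```

With 12 samples only correlations of roughly |r| > 0.58 can reach significance, against roughly |r| > 0.36 with 30, and that is before any correction for multiple tests.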

As to showing cause and effect, would a Pareto Chart be appropriate?

April 21, 2012

Author

I thank you all for your contributions. I will come around again!!!

