Dak Posted June 6, 2006

Could someone please explain to me why the following is statistically fallacious:

The prosecutor's fallacy

The DNA profile found at the scene matches the suspect's. The probability of a randomly chosen person having the same DNA profile is calculated as 1/100.

So... if the suspect left the DNA at the scene of the crime, the probability that the DNA from the crime scene matches the suspect's DNA is 1. If some random person left the DNA at the scene of the crime, the chance of the DNA matching the suspect's is 1/100.

Therefore, the fact that the DNA from the crime scene matches the suspect's is 100 times more probable if the suspect left the DNA at the crime scene than if some unknown person left it.

(This next bit is apparently the fallacy.)

It is therefore 100 times more probable that the suspect left the DNA at the crime scene than some unknown person.

As I see it:

A --> C
B --> C

Both A and B can result in C (A being the suspect leaving DNA at the scene, B being some random blokey leaving DNA at the scene who happens to have the same DNA profile as A, and C being a DNA profile retrieved from the scene that matches A's).

C

We have observed C (i.e. found DNA at the crime scene, profiled it, and found it to match the suspect's profile).

Either: A --> C or B --> C

Because C is true, either A or B must be true to have caused it (i.e. the DNA profile was found, so we can deduce/abduce that either the suspect left it, or someone who coincidentally has a matching DNA profile left it).

P(A | A --> C) = 1
P(B | B --> C) = 1/100*

A is calculated as being 100 times more likely to result in C than B is.

P(A) != 100*P(B)

Why can we not now say that because we have observed C, and A is 100 times more likely than B to result in C, A is 100 times more likely to have been the case than B is?

===

* In case I've got my notation wrong: I'm taking P(X | Y) = Z to mean 'the probability of Y, given that X is true, equals Z'.

Aeternus Posted June 6, 2006

But isn't the problem that a) the suspect could be "the random person" and some other guy could be the actual culprit, and b) 1/100 means that out of 100 people 1 person who wasn't guilty would match (I'm assuming here, otherwise you could simply eliminate the problem by retesting); however, that one person's chance of matching will be 1, not 1/100, because they will always match, not 1 time in 100 but every time (i.e. both the random person and the culprit are equally likely to match). Maybe I'm misunderstanding, I'm tired.

Dak Posted June 6, 2006 Author

Aeternus said: But isn't the problem that a) the suspect could be "the random person" and some other guy could be the actual culprit and

Sort of, yeah. But note that he couldn't be the 'random person' in the example, 'cos that's the actual perpetrator.

Aeternus said: b) 1/100 means that out of 100 people 1 person would match (I'm assuming here, otherwise you could simply eliminate the problem by retesting) who wasn't guilty,

Indeed.

Aeternus said: however that one person's chance of matching will be 1, not 1/100, because they will always match, not 1/100 times, every time (i.e. both the random person and the culprit are equally likely to match).

Not quite sure what you mean, but:

If some random person left the DNA at the scene of the crime, the chance of the DNA matching the suspect's is 1/100.

In other words, if 1/100 people have DNA matching the suspect's, and one (random) person left DNA at the crime scene, there is a 1/100 chance that that DNA would match the suspect's.

Which is why I'm not getting the 'we can't say that he's 100 times more likely to be the originator of the crime-scene DNA than not'.

ecoli Posted June 6, 2006

It seems to me that you can't relate statistics and probability here. I think that just because something is statistically true doesn't mean you can say that it's probabilistically true. You're kind of working backwards in a way that only leaves you with half-truths... does that make any sense?

Aeternus Posted June 6, 2006

Why can't he be the random person? You are assuming that you already know the suspect is the culprit? What I'm saying is: say they picked as their suspect a person whose DNA matched but who wasn't the culprit (i.e. they were that 1 out of 100 people but by some fluke they were suspected). That person's chance of matching was always 1, because they match. The chance that a completely random person you picked up would match is 1/100, but the fact that a person's DNA matched doesn't imply that they aren't that 1 in 100.

Tartaglia Posted June 6, 2006

Assuming only one person is arrested and tested, the probability of guilt is given by Bayes' theorem:

P(A|C) = P(A and C) / (P(A and C) + P(B and C)) = 1/1.01

However, if a whole bunch of people are tested until one is found positive, then other distributional assumptions need to be made and the calculation is more complicated. A geometric distribution could be used here.

Edit: mistake in this post, see post below.

ecoli Posted June 6, 2006

OK, sorry. Let me put it this way.

Probability of suspect: 1
Probability of random: 1/100

It is 100X more likely that the sample matches the suspect than a random person. That doesn't mean that it is 100X more likely that the suspect left the sample, because the matched sample could still have been left by somebody else, even if it matches. I believe this is what it's saying.

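One way to write the distinction down, as a sketch only: take the standard convention that P(X | Y) is the probability of X given Y (the reverse of the footnote in the first post), and assume A and B are the only two possibilities. Bayes' theorem in odds form then reads:

```latex
\[
\frac{P(A \mid C)}{P(B \mid C)}
  = \frac{P(C \mid A)}{P(C \mid B)} \times \frac{P(A)}{P(B)}
  = 100 \times \frac{P(A)}{P(B)}
\]
```

The DNA match fixes only the first factor, the 100. The ratio P(A)/P(B) has to come from somewhere else, and "A is 100 times more likely than B" follows only if that ratio happens to be 1.
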
Aeternus Posted June 6, 2006

http://en.wikipedia.org/wiki/Prosecutor's_fallacy seems to have some examples and such explaining why it doesn't work.

Dak Posted June 6, 2006 Author

@ Aeternus

Ah, I see. We're each taking 'random person' to refer to different things. You're taking it to refer to a non-guilty suspect; I'm taking it to refer to a non-suspect perpetrator.

Focus on the suspect's DNA: if the suspect is actually guilty and actually left his own DNA at the scene, then the chance of the scene DNA matching the suspect's will be 1 ('cos it's his). If the suspect is innocent and someone else left the DNA at the scene, then that person will basically be a random person from the population, and the chance that their DNA will coincidentally match the suspect's is 1/100 (hence why the suspect can't be that random person: if he was, he'd be the guilty one).

Logically equivalent to what you said, but that's why I said he can't be the random person in my example.

Dak Posted June 6, 2006 Author

Aeternus said: http://en.wikipedia.org/wiki/Prosecutor's_fallacy seems to have some examples and such explaining why it doesn't work.

Ah, I assumed that the 'prosecutor's fallacy' term was made up by my tutor, so didn't bother googling. Cheers for the link. I kinda understand now. (And sorry for getting your name wrong earlier.)

Tartaglia: what values were you using for P(A) and P(B)?

Tartaglia Posted June 6, 2006

Dak - I made a mistake when I put it up. In my calculation they would be the same, which clearly would not be true, but this does lead you into how to demonstrate the fallacy mathematically. When I put the post up I was thinking in terms of testing a long line of suspects.

Aeternus Posted June 6, 2006

Dak said: Ah, I assumed that the 'prosecutor's fallacy' term was made up by my tutor, so didn't bother googling. Cheers for the link. I kinda understand now. (And sorry for getting your name wrong earlier.) Tartaglia: what values were you using for P(A) and P(B)?

Heh, it's OK, you should see what Klaynos ends up calling me sometimes.

Dak Posted June 6, 2006 Author

Cheers. Would this be right?

Note that I have finally realised that my notation was wrong. Henceforth, P(A|B) means the prob of A given B.

A = suspect being at the scene
C = DNA profile that matches the suspect's

As forensic scientists are required to be unbiased, assume a prior P(A) of 0.5.

P(A|C) = P(C|A)P(A)/P(C)
P(A|C) = 1*0.5/0.01 = 50

... umm... OK, that'd be 'no, Dak, that's not right' then.

What'd I do wrong?

Tartaglia Posted June 6, 2006

P(A|C) = P(C|A)*P(A) / (P(C|A)*P(A) + P(C|B)*P(B))

where P(C|A) = 1, P(C|B) = 0.01, and P(A) and P(B) are chosen appropriately.

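A minimal Python sketch of this formula, assuming B means "someone other than the suspect left the DNA", so that P(B) = 1 - P(A). The function name `posterior` and the example priors other than 0.5 are illustrative choices, not values from the thread.

```python
def posterior(prior_a, p_c_given_a=1.0, p_c_given_b=0.01):
    """Return P(A|C) via Bayes' theorem.

    Assumption: A and B are the only two possibilities, so P(B) = 1 - P(A).
    Defaults are the thread's values: P(C|A) = 1 and P(C|B) = 1/100.
    """
    prior_b = 1.0 - prior_a
    numerator = p_c_given_a * prior_a
    return numerator / (numerator + p_c_given_b * prior_b)


if __name__ == "__main__":
    # Hypothetical priors, chosen only to show how much the answer moves.
    for prior_a in (0.001, 0.01, 0.5, 0.9):
        print(f"P(A) = {prior_a:<5} -> P(A|C) = {posterior(prior_a):.4f}")
```

With P(A) = 0.5 this gives 1/1.01, about 0.99. The smaller priors give much lower values, so the result depends heavily on what is "chosen appropriately".
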
Dak Posted June 6, 2006 Author

P(A|C) = P(C|A)*P(A)/(P(C|A)*P(A) + P(C|B)*P(B))

P(A|C) = 1*0.5/1*0.5 + 0.01*0.5 = 0.99

Umm... that's just the same as saying that if there's only a 1/100 chance of someone else having the profile, there's a 99/100 chance of him being guilty, which is pretty much the prosecutor's fallacy?

Tartaglia Posted June 6, 2006

You have forgotten a bracket. Assuming P(A) = P(B) = 0.5 (which is not necessarily a good assumption), your answer should be 1/1.01.

Dak Posted June 6, 2006 Author

I think I noticed that and edited about the same time you noticed it.

And it's a necessary assumption in forensics (according to my lecture notes... I found a bit that briefly touches on Bayes' theorem, though it's not much help).

Tartaglia Posted June 6, 2006

Clearly, if the police are just rounding up suspects, P(A) is a lot less than 0.5, but if they are making a single arrest after a lot of detective work, then P(A) could be a lot more than 0.5.

Dak Posted June 6, 2006 Author

Hmm... actually, I didn't go to that lecture, and the notes that I picked up off someone else are a tad confusingly worded, so maybe you're right.

How would one deal with that? Could Bayes' theorem not be applied if we can't estimate a prior P(A)?

Tartaglia Posted June 6, 2006

Your estimate of P(A) would be a prior estimate. After applying Bayes' theorem you have your posterior estimate P(A|C).

Dak Posted June 6, 2006 Author

Yeah, but how would I work out P(A)? I can't take other evidence into account, otherwise it becomes tautologous: 'assuming the guy is probably guilty, then this is probably his DNA, indicating that he's probably guilty'. I can't take on the prosecutor's or the defendant's view, otherwise P(A) would have to be set at either 0 or 1. And if I take the view that I strictly speaking should (i.e. unbiased, no prior assumptions), then both P(A) and P(B) are going to be 0.5, and, given that P(C|A) is always going to be 1, it becomes 1/(1+P(C|B)), which seems to return awfully high results regardless of P(C|B).

Can I not, then, use Bayes' theorem in this case?

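For reference, the expression 1/(1+P(C|B)) does follow from the formula Tartaglia posted, assuming P(C|A) = 1 and that B is simply "not A", so that P(B) = 1 - P(A):

```latex
\[
P(A \mid C) = \frac{P(A)}{P(A) + P(C \mid B)\,\bigl(1 - P(A)\bigr)}
\qquad\Longrightarrow\qquad
P(A \mid C)\,\Big|_{P(A) = 0.5} = \frac{1}{1 + P(C \mid B)}
\]
```

Since P(C|B) is at most 1, the flat-prior result can never drop below 0.5, which is why it looks high whatever match probability goes in. The high values come from fixing P(A) at 0.5 rather than from Bayes' theorem itself; a smaller P(A) pulls the left-hand expression down.
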
Tartaglia Posted June 6, 2006

That's where experience comes into it. You would have to collate data from similar situations in the past, or guess.

matt grime Posted June 7, 2006

Whenever I hear of something like this I tend to think of this case:

(Fallacy) There is a 1/1,000,000 chance of someone's DNA test returning as a match. If your DNA matches the criminal's, then because the chance of a match is 1/1,000,000, you must be that criminal 999,999 times out of 1,000,000.

Now, really, what you should think is this: there are then (in the UK) 65 people who match that DNA sample. So if that's the only evidence they have, the suspect really has only a 1/65 chance of being the correct person.

Dak Posted June 7, 2006 Author

That would be the defendant's fallacy.

The reason it's invalid is that it assumes each person with that DNA profile was in the area, and thus capable of leaving their DNA, when in actual fact it's very unlikely that the other 64 were anywhere near the crime scene.

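The last two posts can be put on the same footing with a rough sketch. It assumes, which the thread does not state, that exactly one person from a pool of possible sources left the DNA, that everyone in the pool is equally likely beforehand, and that the true source always matches. The name `posterior_from_pool` and the pool of 1,000 are hypothetical; 65,000,000 is just the population implied by 65 matches at 1/1,000,000.

```python
def posterior_from_pool(pool_size, match_prob):
    """P(suspect left the DNA | suspect matches), with a uniform prior.

    Assumptions (not from the thread): exactly one of `pool_size` people
    is the source, each equally likely a priori; the source matches with
    probability 1 and anyone else matches with probability `match_prob`.
    """
    prior = 1.0 / pool_size
    return prior / (prior + match_prob * (1.0 - prior))


if __name__ == "__main__":
    match_prob = 1 / 1_000_000
    # Whole-country pool versus a hypothetical pool of people near the scene.
    for pool_size in (65_000_000, 1_000):
        p = posterior_from_pool(pool_size, match_prob)
        print(f"pool of {pool_size:>10,}: P(source | match) = {p:.4f}")
```

With the whole-country pool the result is roughly 1/66 (the suspect plus an expected 65 other matching people), close to the 1/65 figure. Shrinking the pool to people who could plausibly have been at the scene pushes it towards 1, which is the objection to the defendant's fallacy.
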