Anecdotal evidence and probability theory
Hi, my question is generally about a scenario like this:
Say in this scenario someone claimed they won a lottery where the odds of winning were 1 in a billion
Person 1 Claim: "I won the lottery, my friend saw the ticket and can confirm"
Person 2 Claim: "I won the lottery, 10 people saw the ticket and can confirm"
Does Person 2's claim have a slightly higher probability of being correct because of the number of eyewitnesses they claim are available? I'm just talking about the claim on its own, without any further investigation such as questioning the witnesses or any other analysis.
That was my non-supernatural example. What got me thinking about it was assessing biblical accounts of miracles that claim many eyewitnesses and wondering if the claim of more eyewitnesses adds any credibility to the claim or not.
Comments (46)
According to a Bayesian view of evidence, the answer is yes, though this of course does not alter the likelihood of winning the lottery in and of itself. An alternative interpretation is to say Person 2's theory has higher predictive power (it predicts the statements of 10 people) and is easier to falsify (you can ask any of the 10).
The scenarios when it's somewhere close to the number of reports are when you have something like a controlled randomised double-blind experiment. The scenarios where it's close to 1 are when each subject has either been conjured into existence or is likely to have exactly the same reaction to the treatment/question due to unobserved factors; correlating their responses/beliefs strongly.
I leave it up to you whether the anecdotal evidence in the bible resembles more strongly people passing along and writing down stories with a selection mechanism that makes only strong believers write stories or a controlled and randomised double blind experiment.
First, the only way we could establish that the number of witnesses testifying to something implies that it has a greater probability of being the case would be if we had a large set of data showing, for multiple scenarios, that there is some correlation to how many witnesses there are relative to whether something turned out to be the case, where the latter was checked via independent means.
In other words, we'd need actual frequentist data to plausibly support a probability claim, in my opinion.
Even with the frequentist data, however, there would still be a number of problems to overcome. That's because there are so many different variables that can come into play. Making a probability claim on this sort of frequentist data implies that we're parsing the witnesses as ideal--no sort of bias, no sort of hidden agenda, no perceptual problems, ideally intelligent and rational, etc., and it also implies that we're assuming they have a more or less ideal access to information. Otherwise there would be no way to establish that the correlation is implicational, and that's what you'd be looking for here.
Not to be too nit-picky, but these statements are equivalent. In both cases, the only information provided is your claim. If adequate documentation from the other observers is provided, which it hasn't been here, that will probably change.
So, do you have independent confirmation of the biblical observations by the people who observed, or only claims by the writer?
While @fdrake has already provided a fairly in-depth post on the value of multiple accounts, for most everyday examples it seems fairly self-evident that multiple witnesses increase the probability of the event having occurred. It's unlikely that multiple people hallucinate similar observations.
Quoting Terrapin Station
The witnesses need not be ideal. It's sufficient that every individual witness account has a non-zero probability of relating the true event.
There's no difference, because in both cases it is just a claim that YOU are making. If it's a lie, it's just a somewhat bigger lie to claim 10 people have confirmed.
On the other hand, if 10 people actually tell me they saw your winning ticket, that increases the epistemic probability to me that you actually won.
Good catch.
Which would also be the answer to:
Quoting coolguy8472
Just cause Matthew says so and so many people saw miracle X, doesn't mean they did.
Additionally, the probability nears zero when the allegedly witnessed X (miracle or lottery ticket) is logically or practically impossible. Like, say, you claim to have won the lottery and people claim to have seen you, but you didn't actually play. Or you lost the ticket. Or there was no lottery.
Jesus returning sight to the blind or walking on water, or Moses dividing the Red Sea... those are pretty impossible things, and so eyewitness claims are not as convincing.
That's even worse, because the author of Matthew was not even an eyewitness. He's just passing along hearsay.
Basically you're restating the common belief that witnesses matter re the probability of something being the case. I'm aware of the belief. I addressed it. You didn't address anything I said. You're just restating the status quo.
Your argument is that we'd need a large set of data. I say we already have a large set of data for everyday occurrences. Our knowledge of current events essentially relies on witness reports.
I also don't see how you can deny, in principle, that a witness report of an event is more likely in a world where that event happened.
That was the beginning of the sentence. The rest was:
Quoting Terrapin Station
And I said that this was just the start of what we'd need to do.
Stated differently, for that reason I can buy that P(Person 1 won the lottery | Person 1 is being truthful) < P(Person 2 won the lottery | Person 2 is being truthful). That part makes the case that claiming more people observed it raises the probability of the claim.
But there are multiple scenarios to consider besides that one I put in bold
P(Person 1 won the lottery | Person 1 is being truthful) + P(Person 1 won the lottery | Person 1 is not being truthful) + P(Person 1 did not win the lottery | Person 1 is being truthful) + P(Person 1 did not win the lottery | Person 1 is not being truthful) = 1
P(Person 2 won the lottery | Person 2 is being truthful) + P(Person 2 won the lottery | Person 2 is not being truthful) + P(Person 2 did not win the lottery | Person 2 is being truthful) + P(Person 2 did not win the lottery | Person 2 is not being truthful) = 1
Conversely, someone is more likely to have actually won the lottery and not be mistaken if more people look at the ticket and can confirm it.
Stated differently, for that reason I can buy that P(Person 2 did not win the lottery | Person 2 is being truthful) < P(Person 1 did not win the lottery | Person 1 is being truthful).
P(Person 1 won the lottery | Person 1 is not being truthful) and P(Person 2 won the lottery | Person 2 is not being truthful) seem negligible or about the same low value.
The main thing that seems to determine how P(Person 1 won the lottery) compares with P(Person 2 won the lottery) is how P(Person 1 did not win the lottery | Person 1 is not being truthful) compares with P(Person 2 did not win the lottery | Person 2 is not being truthful). But it seems to me that if someone were trying to be dishonest, they would choose to be as convincing as possible, making P(Person 2 did not win the lottery | Person 2 is not being truthful) > P(Person 1 did not win the lottery | Person 1 is not being truthful), for the same reason that someone who bluffs in poker might bet more if their goal is to deceive others.
If Person 2 is more likely to lie big when lying, and Person 2 is also more likely to have more impressive evidence when being honest, I don't know if that helps, since it just leads back to comparing how likely Person 1 and Person 2 each are to be honest rather than deceptive.
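The tension described above can be made concrete with a small Bayes' rule sketch. Everything numeric here other than the 1-in-a-billion odds is an assumption chosen for illustration: the rate at which non-winners make a false claim (`FALSE_CLAIM_RATE`), the share of liars who pick the "10 people" version (`liar_prefers_ten`), and the idea that honest winners are equally likely to cite 1 or 10 witnesses. Note also that the four cases listed earlier only sum to 1 when read as joint probabilities (won and truthful, and so on) rather than conditionals.

```python
def posterior_won(prior_won, p_claim_if_won, p_claim_if_not_won):
    """Bayes' rule: P(won | this particular claim was made)."""
    numerator = prior_won * p_claim_if_won
    denominator = numerator + (1.0 - prior_won) * p_claim_if_not_won
    return numerator / denominator


PRIOR = 1e-9             # lottery odds from the thread: 1 in a billion
FALSE_CLAIM_RATE = 1e-7  # assumed: chance a non-winner falsely claims a win
HONEST_SPLIT = 0.5       # assumed: winners cite 1 or 10 witnesses equally often


def compare(liar_prefers_ten):
    """Return (P(won | '1 friend' claim), P(won | '10 people' claim)).

    liar_prefers_ten is the assumed fraction of false claimants who
    choose the '10 people' wording over the '1 friend' wording.
    """
    p_one = posterior_won(
        PRIOR, HONEST_SPLIT, FALSE_CLAIM_RATE * (1.0 - liar_prefers_ten)
    )
    p_ten = posterior_won(
        PRIOR, 1.0 - HONEST_SPLIT, FALSE_CLAIM_RATE * liar_prefers_ten
    )
    return p_one, p_ten


if __name__ == "__main__":
    for share in (0.2, 0.5, 0.8):
        p_one, p_ten = compare(share)
        print(f"liars choosing '10 people': {share:.0%} -> "
              f"P(won|1)={p_one:.4f}, P(won|10)={p_ten:.4f}")
```

Under these assumptions the "10 people" claim is more credible only when liars favour it less than honest winners do, which is the comparison the post says everything leads back to.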
Not in the least. The sole arbiter is the issuing authority of the lottery, whether a country or a state or a church group. No number of non-arbiters, no matter how large, can confirm a win. If a thousand people see your winning ticket, the lottery authority can always claim machine error. Here is a real life case. https://www.npr.org/sections/thetwo-way/2017/12/28/574070736/how-the-glitch-stole-christmas-s-c-lottery-says-error-caused-winning-tickets
And now you're the one just repeating what you already said without engaging with the substance of my reply. Perhaps I am fundamentally misunderstanding you. Could you rephrase the argument that you think I am not addressing?
Quoting coolguy8472
Unless of course the ticket is fake or otherwise invalid. No amount of witnesses will modify that probability.
Quoting coolguy8472
But it's harder to find 10 people willing to lie for you than it's to find 2, so even if they were willing to forge more evidence, the evidence still increases the probability of them being truthful. You can always construct reasons to not consider any single piece of evidence convincing, but it's still evidence and you still need to take it into account.
I think you’ve misused “probability” here? The numbers are irrelevant in the manner you’ve presented the problem. The chance of ONE person being “correct” would rise for sure.
To expand a little, let us assume that some people are “duped” by what they see and some are not. It would seem to me, psychologically speaking, that there is a threshold where, once a certain proportion believe something to be true, the casual bystander will just agree out of the psychological need to “fit in”. This kind of effect can easily be demonstrated in experiments where a group of people repeatedly and purposefully give the wrong answers and the one left in the dark starts to agree with them against their better judgement.
Quoting Echarmion
Yeah, I've considered that when determining "P(Person 1 won the lottery | Person 1 is being truthful) < P(Person 2 won the lottery | Person 2 is being truthful)". I would agree P(Person 1 did not win the lottery | Person 1 is being truthful and the ticket is invalid or fake) >= P(Person 2 did not win the lottery | Person 2 is being truthful and the ticket is invalid or fake). But I was thinking there are more scenarios of individual people thinking they won the lottery and just being mistaken due to human error within P(Person 1 won the lottery | Person 1 is being truthful). In cases of human error I consider more people making the same verification a way to minimize that.
Quoting Echarmion
Except we don't know if the 10 people exist when considering the probability. They could just be saying there are 10 people that can verify and be making it up. That's the part I'm tripped up on the most: determining the likelihood that someone is being untruthful, and then the probability that they would make a claim like "1 other person can verify" versus "10 other people can verify" if their goal is to be as convincing as possible.
Oh, the scenario was supposed to be just a claim? Well in that case the answer is that a statement alleging more witnesses is less likely to be true, by virtue of alleging extra facts. For a reasonable number of witnesses, the probability of the statements is roughly identical and only depends on the likelihood the person is lying in the first place.
I would have thought the more witnesses with consistent answers adds credibility. Assuming honesty and the existence of the witnesses in order for them to be mistaken every witness has to be wrong. The likelihood of all witnesses being wrong approaches 0 with the more witnesses you have.
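As a rough sketch of the point about all witnesses being wrong: if each of n witnesses is mistaken independently with the same probability p (both the independence and the value of p are assumptions, and the thread itself questions independence), the chance that every one of them is wrong is p to the power n.

```python
def prob_all_wrong(p_wrong, n_witnesses):
    """Chance that every witness is mistaken, assuming each one errs
    independently with the same probability p_wrong."""
    if not 0.0 <= p_wrong <= 1.0:
        raise ValueError("p_wrong must be a probability")
    if n_witnesses < 0:
        raise ValueError("n_witnesses must be non-negative")
    return p_wrong ** n_witnesses


if __name__ == "__main__":
    # p_wrong = 0.1 is an illustrative value, not a measured error rate.
    for n in (1, 2, 10):
        print(n, prob_all_wrong(0.1, n))
```

If the witnesses' errors are strongly correlated (for example, all were shown the same forged ticket), this product no longer applies and the probability can stay close to p regardless of n.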
We can see this in real life all the time when rumors and accusations spread. Like if someone just claims they were assaulted versus someone who claims they were assaulted with many eyewitnesses, according to them. Or if someone makes a claim about the government versus someone making a claim about the government that was corroborated by many anonymous sources, according to them. Do people correctly assign more likelihood to the event being true when more facts like that are introduced? Whether the person expects to be fact-checked, how disprovable the facts are, and how intelligent the person is all play a part too.
as already said, lots of variables involved
Quoting Terrapin Station
Because we don't know a lot of the facts it makes it difficult to make a probability judgement. But maybe there's some kind of ambient probability like a weighted average of people who would tell a big lie that has more to attack versus a small lie that's harder to verify.
But you just said that we are dealing with merely the claim of witnesses, not actual witness testimony. Only actual witnesses add credibility.
Quoting coolguy8472
People are generally bad at intuitively assigning correct probabilities. There is a tendency to evaluate how vivid and plastic a story is when determining whether it's likely. This is, however, a mistake. A naked claim is more likely than one with added details (such as alleged additional witnesses) because every detail is also an additional claim.
In the P("I own a car") > P("I own a red car") sense yeah.
More detail can increase the likelihood too like:
P("I own a red car given that I own something that's red, it makes noise, and has lights on it") > P("I own a red car")
But the original scenario is different than that example because we're dealing with claims and not "givens". But I'm thinking often times we can see that a statement is more likely to be true when it's claimed versus when it's not claimed if we can determine that it's more likely to not be fabricated. Maybe an example of that would be if I forgot what day of the week it was and asked someone then they told me "Wednesday", then that should raise the probability of it being "Wednesday" from 1 in 7 to something pretty close to 100% even though all that's changed is the introduction of someone else claiming it's Wednesday.
Yes, the difference is between P(X and Y) and P(X, given Y). When we're looking at the content of a claim, we have P(X and Y). When we are looking at a claim within a specific situation, we are additionally dealing with P(X, given Y). The resolution of these will also depend on whether the probabilities are independent (as in your first example) or not (as in your second one).
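Written out, the two cases look like this (a sketch using only the standard product rule; X and Y are whatever two statements are being compared):

```latex
% Adding a detail Y to the content of a claim can only lower (or keep) its probability:
P(X \wedge Y) = P(X)\,P(Y \mid X) \le P(X)

% Conditioning on Y as a given can move the probability of X in either direction:
P(X \mid Y) = \frac{P(X \wedge Y)}{P(Y)}, \qquad
P(X \mid Y) > P(X) \iff P(Y \mid X) > P(Y)
```

When X and Y are independent, P(X | Y) = P(X), so the given changes nothing, while the conjunction still shrinks to P(X) P(Y).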
Quoting coolguy8472
When we evaluate the likelihood of a person being truthful, we need to evaluate both the content of their claim and the fact that they make the claim given what we know about the person and the situation. Given that there are no ordinary reasons why a random person should lie to us about the current date, their claim has a high likelihood of being true. This is a case of P(X, given Y). Mathematically, we take the chance that it's Wednesday (i.e. P(X), 1/7) and modify it with the chance a random person would lie (or be mistaken etc.) about the date (i.e. P(Y, given ~X), say 1/10). The prior likelihood was 1/7, the new likelihood is roughly 2/3 (1-(6/7 * 1/10)).
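For comparison, here is the Wednesday example run through Bayes' rule explicitly. The prior of 1/7 and the 1/10 chance of hearing "Wednesday" when it is not Wednesday come from the post; the 9/10 chance of hearing "Wednesday" when it really is Wednesday is an added assumption. The expression given in parentheses in the post is evaluated separately so the two figures can be compared.

```python
from fractions import Fraction

prior_wed = Fraction(1, 7)           # P(X): it is Wednesday
p_say_if_not_wed = Fraction(1, 10)   # P(Y | not X), from the post
p_say_if_wed = Fraction(9, 10)       # P(Y | X): assumed, not stated in the post

# Bayes' rule: P(X | Y) = P(X) P(Y|X) / [P(X) P(Y|X) + P(not X) P(Y|not X)]
numerator = prior_wed * p_say_if_wed
posterior = numerator / (numerator + (1 - prior_wed) * p_say_if_not_wed)

# The shortcut expression written in the post: 1 - (6/7 * 1/10)
shortcut = 1 - Fraction(6, 7) * Fraction(1, 10)

if __name__ == "__main__":
    print("Bayes posterior:", posterior, float(posterior))
    print("Shortcut value:", shortcut, float(shortcut))
```

Under these assumptions the Bayes posterior is 3/5, while the shortcut expression evaluates to 32/35 (about 0.91), so the "roughly 2/3" figure sits closer to the former.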
Yet in scientific practice Bayes' rule is usually used for prescriptive induction; it is often the case that g is derived from a different data-set from that used to derive h, such that the product of g and h constructs an unseen joint-distribution that is used to make novel inferences. As with all induction, no statistical justification for this can be given and Bayesian statisticians should remain silent.
Of course, g and h are rarely known explicitly and are more naturally represented in terms of computer programs representing our physical knowledge and assumptions from which we can simulate a distribution of pseudo-data for comparison against new real-world data.
But none of that should detract from the fact that g together with h are synonymous with empirical knowledge + empirical assumptions ; for whatever we are ignorant about can play no role in our predictions or calculations.
Returning to your question, it is under-determined without reference to a distribution correlating independent witness reports to the identity of lottery winners. Of course, we might say that we know this intuitively and are prepared to make an induction, but this further serves to illustrate why Bayesian statistics is pretty useless as a formalism for directly expressing prescriptive induction.
Person 1 Claim: "I won the lottery, my friend saw the ticket and can confirm"
Person 2 Claim: "I won the lottery, 10 people saw the ticket and can confirm"
My guess is that in a lottery where the odds are 1 in a billion:
P(Person 1 won the lottery given they claimed "I won the lottery, my friend saw the ticket and can confirm") = 1%
P(Person 2 won the lottery given they claimed "I won the lottery, 10 people saw the ticket and can confirm") = 1.01%
It depends on the setting, odds of invalid tickets, odds of being mistaken, odds that they were joking, etc... I didn't specify what would be random variables.
In terms of Christian apologetics, making an unconfirmable claim that many people witnessed a miracle, versus not making that claim, probably slightly increases the likelihood of the claim being true, but the increase in probability from making that claim is so minuscule that it's really not worth mentioning as evidence that the claim is true. But people cite it as an argument that miracles occurred, and it seems to have persuasive power for some, so that leads me to think double and triple hearsay carry some slight amount of weight.
When it comes to the lottery the chance of winning, or guessing that someone will win, is the same for everyone. Guesswork doesn’t change this; it only narrows the margin down that SOMEONE will guess correctly.
Witnessed experiences (illusory or otherwise) are not in the same ballpark.
Unless this really is a completely random guess, can you give any reason for your numbers, or for the difference between them? What does make the second uncorroborated claim more probable (however slightly) than the first uncorroborated claim?
Isn't the Bayesian position that there is no qualitative distinction between assumptions and knowledge? It's all just probabilities with different values.
It's my best guess. Because the claim that claims more eyewitnesses has more persuasive power to some people. Double and triple hearsay is a persuasive enough topic for courts to at least discuss the issue before rejecting the idea of it being valid persuasive evidence.
There are piecewise functions involved to weed out these absurdities, I'm sure. Like if someone said a trillion people confirmed something versus a trillion and 1 people confirmed something, which is more likely to be true? Well, there aren't even that many people on Earth.
Regarding your absurdity example, it's conceivable that there exists a person with super powers that can devour an elephant in 2 seconds. We see feats like that performed in fictional writings. The likelihood is some low number like 10^-999999999999999999999999999. But more probable than square circles I would estimate.
Hearsay only provides evidence of the overheard (or otherwise recorded) statement being made. It's not evidence for the content of the claim.
Some people? What do you think? What are your reasons? Isn't this why you opened a discussion on a philosophy forum?
Quoting coolguy8472
Look up what "hearsay" means. "Double hearsay" would be something like "My cousin heard from her hairdresser that X won the lottery." Your case is completely different.
Statistically speaking yes. Impossible is 0. Improbable is near 0.
Quoting Echarmion
Why can't evidence of the overheard also be considered evidence of the content of the claim?
Quoting SophistiCat
I was interested in what others had to say, not myself. I don't find it convincing. Others do. Oftentimes when there are stark contrasts between two sides and the truth then comes out, I find that there was some truth on both sides. So I figure the likelihood on average is a little higher when someone makes a claim with more sources to back it up versus otherwise.
Quoting SophistiCat
Someone says that someone else will say that they have a winning ticket. That's called double hearsay.
Even frequentist probability doesn't require a qualitative distinction between assumptions and knowledge, for assumptions can be represented as "pseudo frequencies" to augment actually obtained frequencies and applied to a given likelihood function. No frequentist statistician would object to this, provided one can provide a real-world justification for those pseudo-frequencies.
The reason Bayesian probability has been so controversial is in its non-frequentist interpretations and usage of "prior" distributions, for when "prior" distributions are non-controversially applied they ironically represent objective posterior knowledge. And it makes no sense whatsoever to interpret flat priors as representing the state of ignorance of an experimenter, unless that prior is redundant in playing no role whatsoever in subsequent inferences.
If an assertion of ignorance was to influence the calculation of an expectation, then by definition the assertion isn't of ignorance but of knowledge or assumption.
By definition, a hearsay witness has no information on the actual event in question. Hearing a claim does not make that claim more or less likely (unless the claim is about being overheard).
Quoting sime
Sorry, that's a bit too technical for me. In what way is a prior supposed to represent the ignorance of the experimenter? Why does this ignorance influence the expectation?
Impossible is “as good as 0” not just, and only, zero. Improbable is a very low chance not an insignificantly minute chance (that is called “impossible”).
It sure is strange terminology compared to day-to-day speech!
I would say instead "has no certain information on the actual event in question"; it has possible information on the actual event in question. That the claim on its own cannot be scientifically verified does not imply that the likelihood of the hearsay being true is unchanged.
I think that numbers matter here. 10 witnesses make for a stronger case than just 1 or no witness at all. Multiple accounts that agree in content make it objective because it is unlikely that so many people are wrong about something. One person alone could be mistaken, hallucinating, deluded, etc.
However, it seems that the value of witnesses is relative. 10 witnesses may be better than one or no witness, but 100 witnesses are better than just 10. I think the number of witnesses should fit the nature of the claim. The more out-of-the-ordinary the claim, the more witnesses required.
The motivations for lying play a factor. Someone probably wouldn't have a reason to convince me that they won the lottery unless it was a scam of some kind. It reminds me of this:
All things being equal, though, I do think that more eyewitnesses make the claim slightly more likely. Unfortunately this is why people exaggerate or make stuff up to deceive others. For that reason I'd also say the more unlikely the claim and the more incentive to lie, the less the odds improve when more evidence is claimed within the claim.
To assess the credibility of a claim made by a person the following conditions need to be satisfied:
1. Appropriate credentials: the person claiming something must have an established reputation in the area the claim is about.
2. Appropriate area of expertise: a physicist must make claims only about physics for example
3. Lack of bias: no ideological, monetary, etc. reasons for making a claim
4. Expert consensus: experts in the area the claim is about must agree on it. I think this is where your question comes in.
Hearsay evidence can increase the reliability of the witness, and thereby increase the likelihood of the claim being true. But it's not about the substance of the claim, that's what "hearsay" means.
Quoting coolguy8472
You still haven't explained how this is supposed to work. Just claiming to have witnesses is just another claim.
I'm talking about in probability theory, not about practical persuasion in a court of law. If you were to:
1) grab the population of people who ever claimed to have won the lottery and have an eyewitness to back them up who actually won the lottery
2) divide it by the total number of people who claimed to have won the lottery and have an eyewitness to back them up
3) grab the population of people who claimed to have ever won the lottery and have 10 eyewitnesses to back them up who actually won the lottery
4) divide it by the total number of people who claimed to have won the lottery and have 10 eyewitnesses to back them up
5) compare these ratios
6) I think you would find that people who claim more eyewitness testimony also have a slightly higher rate of being correct in their claim
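The procedure in steps 1 to 5 could be sketched as follows. The record format and the sample data are entirely hypothetical (no such dataset is cited in the thread); each record pairs the number of eyewitnesses a claimant cited with whether the win was later confirmed.

```python
from collections import defaultdict


def win_rate_by_witness_count(records):
    """Map each claimed witness count to the fraction of those claimants
    who actually won.

    records: iterable of (claimed_witnesses, actually_won) pairs.
    """
    totals = defaultdict(int)
    wins = defaultdict(int)
    for claimed_witnesses, actually_won in records:
        totals[claimed_witnesses] += 1
        if actually_won:
            wins[claimed_witnesses] += 1
    return {n: wins[n] / totals[n] for n in totals}


# Made-up records purely to exercise the function; they say nothing
# about real lottery claimants.
sample_records = [
    (1, False), (1, False), (1, True), (1, False),
    (10, True), (10, False),
]
rates = win_rate_by_witness_count(sample_records)

if __name__ == "__main__":
    print("1 witness claimed:", rates[1])
    print("10 witnesses claimed:", rates[10])
```

Step 5 is then just comparing `rates[1]` with `rates[10]`; whether real data would show the pattern guessed at in step 6 is exactly what the thread leaves open.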
Quoting Echarmion
Because the entities making the claim are people. They're not 8-balls. People make observations and can accurately report those observations. While there's still the possibility of error and deception involved if you analyzed the psychology or statistics of it all I think you would find the claims that claim to have more evidence are more likely to be true than not.
This is pure speculation though. There is no data suggesting it, and as speculative psychology it's not terribly convincing. Possible, yes, but I wouldn't bet on it.
Yeah, as I said, it's my best guess. This source may help: https://www.apa.org/science/about/psa/2017/08/gut-truth