Saturday, August 15, 2015

Disproving Carrier's Proving History

Richard Carrier presumes to use Bayesian probability theory to disprove Bible history. One problem is that he's a dilettante. Both Tim and Lydia McGrew have documented what a hack he is–and they were just scratching the surface. But because he's shameless, and he has a sycophantic following, he keeps right on doing it. 

When he first got into this, he made so many mistakes that he had to revise the little treatise he put online. It's just a way to make himself look fancy. I have it on good authority that he has someone who has served as technical backup for him and helped him to catch and correct some of his more egregious blunders.

For those of you who wish to get into the weeds, here's a technical critique of his book Proving History:

And here's a follow-up, in which Hendrix responds to Carrier at length:

Tim Hendrix  says
Hi Richard,
That’s quite a long response which raises many points I am not sure I can cover here, at least without a lot of repetition from the review. I do think many of the things you write amount to a misunderstanding of what I wrote or at least intended to say. If there are particular points in my review you feel I did not address and would like my response to, please just raise them as questions.
Re. “does all historical reasoning reduce to Bayes’ theorem”
I raised a few points regarding the claim all of historical reasoning reduces to Bayes theorem.
Firstly, that probability theory is one way to handle uncertainty or vagueness (for simplicity I will use these terms interchangeably even if some may not consider this completely accurate), and there exist other valid theories for uncertainty (the term is understood broadly), for instance Dempster-Shafer and multi-valued logics (For instance Petr Hajek has written extensively on this subject). As I believe it is the case these have a place when analyzing uncertainty it raises the question why we can distinctly rule these, or some other notion we might come up with in the future, out as *not* being relevant for history (I will accept I would not know how to practically apply them). You responded:
I’ll just say, if you can’t define it, then you can’t answer it. So these kinds of unanswerable questions are moot. But even if we do define the terms usefully in a question like this and what we end up with is not a factual statement but an evaluative statement, then we are no longer making a claim about history. We are making a claim about what value people should assign to something.
However, even if we could not point to any another method for handling uncertainty that does not mean our current method is the only one, and in fact we have several. Broadly speaking, do you accept we have several mathematical theories for handling uncertainty/vagueness?
You continue:
For example, “the teachings of Jesus were widely valued in historical period P” is a hypothesis that will have a probability value derivable from Bayes’ Theorem
The issue is this type of proposition would be a textbook example of a proposition with graded truth value (because of the term: “widely valued”) and would, strictly speaking, fall outside the scope of those propositions probability theory analyses. See for instance the first two chapters of Jaynes 2003 (which is referenced in PH) where he limits himself to classical (Boolean) truth-values or see:
for further information.
Do you accept this type of proposition could plausibly (by proponents of fuzzy logic) be said to have a graded truth value?
Secondly, as I understand this comment (from your conclusion):
Meanwhile his disjunctive alternative, that Proving History does not demonstrate “anything which one would not already accept as true if a Bayesian view of probabilities is accepted as true,” is wholly circular: that historians who already accept that their conclusions should follow a Bayesian model do not need it proved to them. That’s true…as a conditional statement floating around in Plato’s realm of ideas. But it’s irrelevant. Because historians have yet to be convinced that their conclusions should follow a Bayesian model. So they do need it to be proved to them.
Now, I suspect this conclusion might rest upon a misunderstanding and I agree that your summary of my argument is circular. What I considered in my review was this alternative statement:  Bayesian inference describes the relationship between probabilities of various propositions (c.f. Jaynes, 2003). In particular it applies when the propositions are related to historical events.
To take the first part of this statement, I take it you accept Bayes theorem as not in dispute for a historian, that is, we can grant that a historian accepts the Bayesian account of probabilities. If he does not, would the proof not have to first set out to establish that Bayes theorem was true?
However if that is the case, my argument is very simple: Bayes theorem also applies when the propositions are historical events. Can you point out which parts of my argument are circular and where your argument departs from mine?
I agree it is worthwhile (as a matter of instruction) to point out to historians how historical arguments can be mapped to a probabilistic form (Bayes theorem if you like), and it would follow the various elements of a historical argument would then find a probabilistic counterpart as priors/likelihoods, however I see this more as a matter of explanation than a formal proof.
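The mapping Hendrix describes, in which the elements of a historical argument become priors and likelihoods, can be made concrete with a minimal Python sketch of Bayes' theorem for a single binary hypothesis. The function name `posterior` and all the input numbers are hypothetical and chosen only to show the arithmetic; nothing here is taken from the review or from Proving History.

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return P(h | e) for a binary hypothesis h using Bayes' theorem."""
    numerator = prior * p_e_given_h
    # Total probability of the evidence over h and not-h.
    evidence = numerator + (1.0 - prior) * p_e_given_not_h
    return numerator / evidence


# Hypothetical inputs: an even prior, and evidence four times as
# expected if h is true as if h is false.
example = posterior(prior=0.5, p_e_given_h=0.8, p_e_given_not_h=0.2)
print(round(example, 3))
```

The theorem itself only enforces consistency among the three inputs and the output; where the inputs come from is the part a historian would still have to argue for.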
Re. the scope of PH and the applicability of Bayes theorem to history
Hendrix is concerned that I don’t prove any new facts about history by applying the theorem. In fact that wasn’t the function of Proving History. A test application on a serious problem is in the sequel, On the Historicity of Jesus, as is repeatedly stated in PH
My concern is more that I would have hoped to see more practical applications of Bayes theorem to existing historical questions to see how Bayes theorem is imagined to apply in practice. It is of course your choice what to include in the book, however there were many issues with applying Bayes theorem in a setting such as history that I was and am not sure how to address even after reading the book (I give some examples). However I think these issues are best discussed with a practical example in mind.
One thing I would like to point out on this note, regarding my analysis of the criterion of embarrassment, is that as far as I understood your analysis it did not have ways to specify if a text was actually embarrassing or not to the person doing the preservation, and as far as I could tell your analysis pointed to the conclusion the text would always have a probability less than 0.5 of being true. I tried to re-do an analysis where I allowed embarrassment to enter as a variable and got a (qualitatively) different result than yours. My main point was that, as best as I could tell, our results were in qualitative disagreement, so who is the more correct? What do we do from here in practical terms?
Also, I admit I had a very hard time telling how you defined the various Boolean propositions that were used in your analysis. Would it be possible to have them specified here explicitly for further reference?
The unification of Bayesian/frequentist view of probabilities
I understand the section to provide an account for how probabilities should be defined which is considered different than what e.g. Jaynes or Cox does.
It should be stressed the question how probabilities should be “defined” and its relationship to (and meaning of) subjectivity has different answers in the literature which I cannot survey here (for instance an axiomatic approach such as Jaynes and Cox vs. a rational choice approach of e.g. de Finetti). I have taken Jaynes (2003) to be the standard reference because of the way it is cited in PH and because it gives a fairly accurate description of my own view.
However this makes it sometimes difficult to understand the definition of terms. For instance I tried to provide the reader with a definition of probability (is the “degree of belief” in your response your definition of probabilities?) and I claim I think at least some of the discussion may be circular, to which you reply:
I confess I found little on point in what Hendrix attempts to say about this. He goes weird right away by saying that the demarcation of physical and epistemic probabilities is circular because they both contain the word probability. That makes no sense (“mammalian cats” and “robotic cats” is a valid distinction; it does not become circular because the word “cat” is used in both terms).
The quote from PH I was referring to was this:
…by probability here I mean epistemic probability, which is the probability that we are correct when affirming a claim is true. Setting aside for now what this means or how they’re related, philosophers have recognized two different kinds of probabilities: physical and epistemic. A physical probability is the probability that an event x happened. An epistemic probability is the probability that our belief that x happened is true
Where it was the first phrase which puzzled me: “by probability here I mean epistemic probability, which is the probability that we are correct when affirming a claim is true.”
Consider the phrase: “By Bar I mean the Foo-Bar, which is the Bar that…”. I think this phrase may actually be circular since Bar is Foo-Bar which is Bar.
As you mentioned, the chapter is quite difficult to follow, and I would rather not spend too much time discussing issues which relate to a misunderstanding on my part rather than the proposed unification of Bayesian and frequentist probabilities. However I would like to address some items from your response:
But some of Hendrix’s complaints miss the point. For example, he objects to my saying that when we roll a die the probability of it rolling, say, a 1, will either be the actual frequency (rolling the actual die a limited number of times and counting them up) or the hypothetical frequency (what we can predict will happen, from the structure of the die and the laws of physics, if the die were rolled forever and counted up). Why does he object to so obviously correct a statement?
I can accept my complaints might miss a point; however it is important to be very precise when we offer a definition of these terms. When I consider the above statement I think, regarding the first situation, that I can imagine rolling it (say) 100 times and it coming up 1 for instance 14 times, and so the “probability of it rolling 1” is the actual frequency or 14/100.
Then we can consider the second situation. In this we consider what we predict will happen, and let’s just assume I know nothing about the die (which is the case) and conclude the probability is 1/6.
What confuses me about this sentence is the probability is now at least two things (if the sentence is taken at face value). I say at least because in the first definition it was not stated what “a limited number of times” was and I could readily roll the die again to get (for instance) 18/100 (did you mean the limit frequency?).
So what is “the probability” (singular)? Do you believe probability refers to different things and has a situation-dependent definition?
It is very important to keep these terms absolutely precise for us to understand each other and *I still do not understand what probability truly refers to*
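The ambiguity Hendrix points to can be seen in a small simulation. This is a minimal Python sketch, assuming a fair six-sided die simulated with the standard `random` module; the function name `observed_frequency`, the seed, and the batch size of 100 are illustrative choices, not anything from the exchange.

```python
import random


def observed_frequency(rolls: int, face: int, rng: random.Random) -> float:
    """Fraction of `rolls` simulated fair-die throws that show `face`."""
    hits = sum(1 for _ in range(rolls) if rng.randint(1, 6) == face)
    return hits / rolls


rng = random.Random(2015)  # fixed seed so the run is repeatable
# Five separate batches of 100 rolls: five "actual frequencies" of rolling a 1.
batches = [observed_frequency(100, 1, rng) for _ in range(5)]
# The value a symmetry argument assigns when nothing is known about the die.
symmetry_value = 1 / 6

print(batches)
print(symmetry_value)
```

Each batch yields a multiple of 1/100, and batches will typically differ from one another, whereas 1/6 is not a multiple of 1/100 at all. No finite batch of 100 rolls can therefore return exactly the symmetry value, which is why it matters which of the two quantities "the probability" is supposed to name.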
To take another case:
“But what is the true frequency of the 8th digit in pi being a 9? Why should we think there is such a thing? How would we set out to prove it exists? What is the true value of the true frequency?” This is just a really strange thing to say.
I agree, however I was trying to apply the vocabulary used in chapter 6 where it equates (or so I read it) probability with “true frequency”.
He is asking about a statement of (I presume epistemic) probability, that he believes it is 80% likely that the 8th digit of pi is a nine.
To be clear about the terminology, I am considering the probability of the proposition
H : “the 8th digit of pi is nine”
And the information stated in my review amounts to p(H) = 0.8. My question is simply how we define this probability if we equate probability with “true frequency” (as the first half of your above quote does). You continue:
Okay. Let’s walk him through it. What does he mean by “it is 80% likely that the 8th digit of pi is a nine”? He must mean that given the data available to him, he is fully confident (I suppose to near a 99% certainty) that there is an 80% chance of his being right. To be so uncertain that you know you have only a dismal 80% chance of being right about this, I can only imagine some scenario whereby he doesn’t know how to calculate that result, and thus is reliant on, let’s say, a textbook, and apparently this is a post-apocalyptic world where it’s the only textbook that mentions pi that survives, and the text in the textbook is damaged at that point, and damaged in such a way that there is an 80% chance the smudged or worm-eaten symbol on the page is a 9 and a 20% chance it’s 6.
What you are suggesting amounts to considering a different proposition:
H1 : “the 8th digit of pi in a textbook which is damaged in a post-apocalyptic world will read as a 9 provided it is so damaged about 2 in 10 people will read it wrongly.”
For which we can say: p(H1) = 0.8. However notice my question was related to how we define the probability *of the proposition I proposed*. Is the proposal for defining probability then to switch the proposition? Do we agree your proposition (which I write as H1), and mine (H) are quite different?
I want to stress that on a Bayesian view, where probabilities are subjective and refer to a lack of knowledge, the example is easy to treat without hypothetical situations. If I considered it right now I would say: p(H) = 1/10 because I cannot remember the 8th digit of pi at all. This has nothing to do with frequencies or any other beliefs I may hold and can in fact be derived from a symmetry argument (see Jaynes). The important thing is on this account probabilities are subjective and *not* as such *rooted* in what will actually turn out to be true or false in the real world. This is the key point in Jaynes.
Now, returning to the more substantial points which can hopefully clarify how probability should be defined. In PH as quoted in my review there is the statement:
So when you say you are only about 75% sure you’ll win a particular hand of poker, you are saying that of all the beliefs you have that are based on the same physical probabilities available to you in this case, 1 in 4 of them will be false without your knowing it
Try to walk me through this example: Suppose I believe 4 things with probability 0.512. Then, can we agree that it will be the case either: 0, 1, 2, 3 or 4 of these will be correct limiting the frequency at which I am correct to five values? My point is simply how you define probability, for the probability of 0.512, in this particular situation? Do you envision some sort of limiting procedure? Do you define the word frequency differently than Jaynes or I (see the review for a definition)?
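The counting behind this question is easy to check. A minimal Python sketch, using exact fractions so no rounding is involved; the helper name `achievable_frequencies` is hypothetical.

```python
from fractions import Fraction


def achievable_frequencies(m: int) -> list[Fraction]:
    """All values k/m that 'fraction of m beliefs that are correct' can take."""
    return [Fraction(k, m) for k in range(m + 1)]


freqs = achievable_frequencies(4)
target = Fraction(512, 1000)  # the stated probability 0.512

print(freqs)
print(target in freqs)
```

With four beliefs the only achievable frequencies are 0, 1/4, 1/2, 3/4 and 1, so a probability of 0.512 cannot be read off as the realized frequency of that finite set. Some further ingredient, such as a limiting procedure, has to be supplied, which is what Hendrix is asking about.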
Then, as an introductory point I pointed out a fraction-based definition of probabilities would have a problem representing probabilities like 1/sqrt(2) which cannot be written as fractions, and it is very easy to imagine situations where these arise (I gave one example in the review), to which you respond:
Nor will I bother with his silly attempt to insist we need to account for infinities and irrational fractions in probability theory. Nope.
I am only trying to explore the definition you provide and I am sorry you will not take the example seriously. I take your response to agree with my point that there are a great many probabilities which cannot be represented using your definitions (in fact, those which can be represented have measure 0). Probability theory already accounts for these probabilities very well no matter which definition one subscribes to; do you feel the work on probability theory which relies on probabilities as being a subset of the reals (this would be any textbook I have seen) can be dismissed?
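One simple situation where the irrational probability 1/sqrt(2) arises is sketched below in Python. This is an illustrative example chosen here, not the one given in Hendrix's review: if X is uniform on [0, 1], then P(X^2 < 1/2) = P(X < 1/sqrt(2)) = 1/sqrt(2). The function name `estimate`, the seed, and the sample size are assumptions.

```python
import math
import random


def estimate(samples: int, seed: int = 0) -> float:
    """Monte Carlo estimate of P(X**2 < 0.5) for X uniform on [0, 1)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples) if rng.random() ** 2 < 0.5)
    return hits / samples


exact = 1 / math.sqrt(2)    # irrational, so not expressible as k/n
approx = estimate(200_000)  # always a ratio of two integers

print(exact)
print(approx)
```

Any finite run produces a rational estimate that can approach 1/sqrt(2) but never equal it, while the real-valued treatment in standard probability theory handles the exact value without difficulty.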
Tim Hendrix  says
Re. Richard,
Just to clarify, I do not disagree at all that you can make probabilistic statements about frequencies or that frequencies can inform probabilities (technically, we can infer for instance the probability* a coin comes up heads from a sequence of flips and prior information). I gave an instance of the former type of inference in my review, and examples of the latter can be found in any book on Bayesian data analysis (this is one of the few places where I do not think Jaynes is the more apt reference for more advanced aspects) and is in fact what I do every day.
But I think it is very convincingly argued in the literature that this is the more accurate relationship between probabilities and frequencies, especially when considering one-time events like history.
* technically a probability density.
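The inference Hendrix mentions, going from a sequence of flips plus prior information to a probability density over a coin's heads probability, is commonly done with a Beta prior. The following is a minimal Python sketch under that assumption; the uniform Beta(1, 1) prior, the data (7 heads, 3 tails), and the function names are all hypothetical choices for illustration.

```python
import math
from fractions import Fraction


def beta_posterior(heads: int, tails: int, a: int = 1, b: int = 1) -> tuple[int, int]:
    """Parameters of the Beta posterior after observing the flips, given a Beta(a, b) prior."""
    return a + heads, b + tails


def beta_mean(a: int, b: int) -> Fraction:
    """Mean of a Beta(a, b) density."""
    return Fraction(a, a + b)


def beta_pdf(x: float, a: int, b: int) -> float:
    """Density of Beta(a, b) at x, for 0 < x < 1."""
    norm = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return x ** (a - 1) * (1 - x) ** (b - 1) / norm


a_post, b_post = beta_posterior(heads=7, tails=3)
mean = beta_mean(a_post, b_post)
print(a_post, b_post, mean)
```

Here the observed frequency (7/10) informs the density without being identified with the probability: the posterior mean is 2/3, and the full density expresses how uncertain that estimate still is.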
Tim Hendrix  says
That we can model uncertainty in different ways does not answer the demonstration in Chapter 4 that all historical arguments nevertheless still reduce to Bayes’ Theorem. As I note of Dempster-Shafer in PH (p. 303, n. 19), it’s simply far too complicated to be of any use to historians. Likewise other modes. I think you would catch on to this if you treated the syllogism as a syllogism and sought out a premise in it that you could challenge
Treating it as a syllogism, from a technical standpoint, I think we should first clarify what we mean by “a historical method”. Very generally speaking, I take it the argument assumes a historical method is something which allows us to reason in situations where we cannot be certain (specifically given partial information and about statements which cannot be known to be true or false). There are different theories for reasoning in situations where we cannot be certain, and these operate under very different assumptions and sometimes about different types of propositions. Then considering the conclusion of the argument “not C” where
C = There is some historical method which is logically valid but contradicts BT
however from this it does not follow:
“(b) there is at least one valid historical method that does not contradict BT but that nevertheless entails a different epistemic probability than BT”
because such a method would not necessarily be expressed in the semantics of (epistemic) probabilities (for instance a fuzzy logic). In addition one could consider hybrid approaches and a range of different things. I accept the claim (as stated in my review) if you by historical method limit yourself to a consistency requirement for probabilities, but then I am not sure I see why we should need the argument.
Keep in mind I am not claiming you should apply fuzzy logic to history; do what gets the job done! I just don’t think it has been, or can be, proven that Bayes is the unique historical method.
Tim Hendrix  says
Hi Richard,
A historical method is something which allows us to ascertain how likely a given claim about history is to be true.
But that’s just it. You assume a probabilistic semantics (how likely a claim is). I could say:
A historical method is something which allows us to ascertain the truth of a given claim about history.
and to the extent either of these two quotes is saying something definite I am simply asserting a historical method is about the (degree of) truth. This does not amount to a proof of anything, just an expression of personal opinion.
This comment is unintelligible to me. Analogy: It does not matter whether it’s in German. If you can write it in German, you can write it in English
Yes, but your analogy is false since different notions of uncertainty/vagueness are not equivalent.
Secondly, fuzzy logic is just another way of manipulating confidence levels and intervals.
I simply have to disagree. From the Stanford Encyclopedia:
this makes fuzzy logic to something distinctly different from probability theory since the latter is not truth-functional (the probability of conjunction of two propositions is not determined by the probabilities of those propositions).
If you operate under the premise probability theory and fuzzy logic express the same thing, well, I think we will simply have to agree to disagree.
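The truth-functionality point in the quoted passage can be checked directly. A minimal Python sketch: two joint distributions over propositions A and B share the same marginals yet give different probabilities for "A and B", whereas a fuzzy conjunction is a fixed function of the two truth degrees. The use of `min` (the Gödel t-norm) is an assumption made here; other t-norms exist, and the names below are illustrative.

```python
def fuzzy_and(a: float, b: float) -> float:
    """Fuzzy conjunction under the minimum (Goedel) t-norm, one common choice."""
    return min(a, b)


# Two joint distributions over (A, B), both with P(A) = P(B) = 0.5.
independent = {(True, True): 0.25, (True, False): 0.25,
               (False, True): 0.25, (False, False): 0.25}
identical = {(True, True): 0.5, (True, False): 0.0,
             (False, True): 0.0, (False, False): 0.5}


def marginal_a(joint: dict) -> float:
    return sum(p for (a, _), p in joint.items() if a)


def marginal_b(joint: dict) -> float:
    return sum(p for (_, b), p in joint.items() if b)


def p_and(joint: dict) -> float:
    return joint[(True, True)]


for joint in (independent, identical):
    print(marginal_a(joint), marginal_b(joint), p_and(joint))
print(fuzzy_and(0.5, 0.5))
```

The marginals alone do not fix P(A and B), which is 0.25 in one case and 0.5 in the other, while the fuzzy conjunction of two degrees of 0.5 always comes out the same under a given t-norm. That is the formal sense in which the two systems differ.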
Tim Hendrix  says
Untrue. That you can gradate things does not mean you have to.
You just define the boundary of “widely” (e.g. “with more than trivial frequency,” “found in more than half of materials in each genre,” etc.) and then it’s a straightforward binary question of whether the evidence matches that boundary or does not
The statement we are discussing is:
H : “the teachings of Jesus were widely valued in historical period P”
The problem is when you introduce boundary of widely to say for instance:
H1 : “the teachings of Jesus were said to be ‘widely valued’ by more than 50% of historical documents in the historical period P”
then H is not H1 (an even better example of constructing a binary proposition from a fuzzy one: H2 : ‘Celsus said H’; still the case that H is not H2). I am very well aware the distinction between probability and (graded) truth appears superficial and pedantic when one first encounters it, but it has very deep formal roots. You can check out the Wikipedia page or the Stanford Encyclopedia, which you yourself cited previously, for more information. I stress I am not claiming you are doing something *wrong* by focusing on probabilities, that’s what I do as well; however, I want to highlight that this is not the only notion of vagueness and in fact for statements such as H it is most likely not the appropriate one. Anyway this is a digression.
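The difference between fixing a boundary for "widely" and treating it as graded can be sketched in Python. Both functions below are hypothetical illustrations: the 50% threshold follows the example in the exchange, while the linear ramp between 20% and 80% is an arbitrary membership function chosen only for this sketch.

```python
def crisp_widely_valued(share: float, threshold: float = 0.5) -> bool:
    """Binary reading: 'widely valued' means the share exceeds a fixed boundary."""
    return share > threshold


def graded_widely_valued(share: float, low: float = 0.2, high: float = 0.8) -> float:
    """Graded reading: degree of truth rises linearly from 0 at `low` to 1 at `high`."""
    if share <= low:
        return 0.0
    if share >= high:
        return 1.0
    return (share - low) / (high - low)


for share in (0.49, 0.51):
    print(share, crisp_widely_valued(share), round(graded_widely_valued(share), 3))
```

Shares of 49% and 51% receive opposite answers under the crisp reading but nearly the same degree under the graded one. Thresholding yields a proposition that probability theory can handle, but it is a different proposition from the graded original, which is Hendrix's point that H is not H1.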
Tim Hendrix  says
Well, the point still remains that my original proposition
H : “the teachings of Jesus were widely valued in historical period P”
is not the same as the proposition you introduced:
H1 : “if the teachings of Jesus were said to be valued by more than 50% of historical documents in the historical period P, then the teachings of Jesus were ‘widely valued’ in the sense intended by historian Z”
(actually it is difficult to see why H1 is a binary proposition at all and not expressing a definition, but I digress).
Maybe you didn’t realize that the H1 I just reconstructed is what it means to reduce the problem to a binary question of probability?
Once again, what the example illustrates is we have two different types of systems for handling uncertainty and they operate on different types of propositions. That you can take a proposition expressed in one system and point to an expression in another system which expresses something which you feel is reasonably similar (but not equivalent) is not bringing us closer to anything which would amount to a proof that only one system is needed. It’s simply not a proof; however, once again I think we have arrived at a point where we might disagree about the fundamental premises in the argument (what amounts to a proof and if multi-valued logic and probability theory are equivalent) and I am not sure how I can proceed at this point.
  • Tim Hendrix  says
    Okay, I think we can close this discussion. It seems we agree the arguments in PH have as a premise that BT is true. That is, my alternative statement:
    “Bayesian inference describes the relationship between probabilities of various propositions (c.f. Jaynes, 2003). In particular it applies when the propositions are related to historical events.”
    holds and what the proof in PH sets out to establish is the second sentence. I don’t really want to argue if this is easy to see or not for someone in the humanities.
    Tim Hendrix  says
    Finally we are getting to the interesting stuff…
    Just to clarify, I have not retracted anything I have written and I am sorry if something I have written has given this impression (English is not my first language and I am partly dyslexic in my primary language).
    To begin to clarify the argument in PH. You are carrying out a Bayesian analysis over Boolean propositions. I just wonder if you could write out what those Boolean propositions are? In particular, if the word “embarrassing” refers to if a Christian TODAY believes the story is embarrassing, or if it refers to the same story being embarrassing to someone writing in the first/second century.
    Secondly, as I understand your position, sometimes the criterion of embarrassment does work, right? Isn’t it more believable when someone tells us something private and embarrassing than if he tells us something which makes him look good? The point of my analysis was to allow this situation.
    Tim Hendrix  says
    But “the word “embarrassing” means a christian TODAY believes the story is embarrassing” is immediately eliminated as irrelevant. A historian cannot judge the past by the background knowledge of the present (…) So someone has to be able to show that a story was embarrassing then. And if they can’t, then they cannot assign the attribute “embarrassing” to that story.
    I know this is taking it slightly out of context but try to read the last sentences again :-). Here is what I am getting at (and please keep in mind I am not a historian). Consider one of the letters we both agree Paul wrote with near certainty. We can imagine the following process: Paul writes the letter, Paul gets old and dies, then the later church preserves, redacts and mangles Paul’s letters, often to fit its theology, and this is what we have today.
    As you have pointed out in OHOJ (and I think very convincingly) Paul says things in his letters which are likely at odds with the later church (the first century church). For example in Romans 15:3-4 he says he learns everything from revelation, and as far as I understand it the writer of Acts (i.e. the later church) properly goes out of the way to contradict this. So we can imagine we define:
    T : Paul wrote Romans 15:3-4
    Tem : Romans 15:3-4 was truly embarrassing to the 1st century church
    Em : Romans 15:3-4 seems embarrassing to us today.
    Pres : Romans 15:3-4 was preserved
    Then, in my understanding, it would be fair to say that the fact Romans seems embarrassing (Em) increases the chance that Paul actually wrote the passage (and it is not a later invention), T. Symbolically:
    p(T | Em, Pres) > p(T | Pres)
    The point is in evaluating the above expression we have to use Tem (I show how in the review) because *that’s what’s relevant for the early church which did the redaction*, but we do not have access to Tem directly but only indirectly from Em (it appears embarrassing today), which is especially relevant for passages which appear embarrassing today but might have had a literary point in the first century (such as the gospel examples you provided). I included both variables because it appears to me to be needed to make the analysis work in both cases.
    You are free to disagree with any of this and I am sure you can reduce the expression I discussed to give qualitatively consistent results with yours. I simply wanted to make the point in the review that I did not feel there was a variable which actually expressed the same as Em and this had an effect on the result, as well as discuss certain issues relating to introducing such a variable. But I suggest postponing this discussion until I have read OHOJ. If you wish to discuss the relationship between our two expressions further I would still hope you could provide a list of the Boolean variables you make use of such that the argument can be put in standard “P(X|Y) = …” form. Perhaps it is a language issue but I am not sure I can do so correctly as it is (is my translation wrong in the review?)
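The role of the four variables T, Tem, Em and Pres can be illustrated with a toy model. This is a minimal Python sketch and not the analysis in Hendrix's review, which is not reproduced in this post: the dependency structure (Pres depends on T and Tem, Em depends only on Tem) and every number below are assumptions invented for illustration.

```python
from itertools import product

# All numbers are hypothetical.
P_T = 0.5     # prior that the passage is authentic
P_TEM = 0.3   # prior that it was truly embarrassing to the early church
# P(Pres | T, Tem): the church rarely invents and keeps an embarrassing passage.
P_PRES = {(True, True): 0.6, (True, False): 0.9,
          (False, True): 0.05, (False, False): 0.9}
# P(Em | Tem): how well "seems embarrassing today" tracks "was embarrassing then".
P_EM = {True: 0.9, False: 0.2}


def joint(t: bool, tem: bool, em: bool, pres: bool) -> float:
    """Joint probability of one assignment to (T, Tem, Em, Pres)."""
    p = (P_T if t else 1 - P_T) * (P_TEM if tem else 1 - P_TEM)
    p *= P_PRES[(t, tem)] if pres else 1 - P_PRES[(t, tem)]
    p *= P_EM[tem] if em else 1 - P_EM[tem]
    return p


def prob_t_given(em=None, pres=None) -> float:
    """P(T | evidence) by enumeration; None means the variable is not observed."""
    num = den = 0.0
    for t, tem, e, pr in product([True, False], repeat=4):
        if em is not None and e != em:
            continue
        if pres is not None and pr != pres:
            continue
        weight = joint(t, tem, e, pr)
        den += weight
        if t:
            num += weight
    return num / den


p_pres_only = prob_t_given(pres=True)
p_em_and_pres = prob_t_given(em=True, pres=True)
print(p_em_and_pres > p_pres_only)
```

Under these particular assumptions p(T | Em, Pres) exceeds p(T | Pres), matching the inequality above, but the direction depends on the numbers: if an invented embarrassing passage were as likely to be preserved as an authentic one, Em would no longer raise the probability of T. The sketch also shows why Tem cannot be dropped, since Em bears on T only through it.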
    Tim Hendrix  says
    I will respond to the past 3 posts here.
    Firstly, what is your definition of the word “frequency” as it is used for defining the terms relevant to probability? (you can compare to my definition, which corresponds to that of Jaynes, in the review)
    To take my example:
    Try to walk me through this example: Suppose I believe 4 things with probability 0.512. Then, can we agree that it will be the case either: 0, 1, 2, 3 or 4 of these will be correct limiting the frequency at which I am correct to five values? My point is simply how you define probability, for the probability of 0.512, in this particular situation? Do you envision some sort of limiting procedure? Do you define the word frequency differently than Jaynes or I (see the review for a definition)?
    to which you respond:
    As I noted before, this whole response is unintelligible.
    You don’t believe 4 things. You have a vast quantity of beliefs. But more importantly, the number of beliefs you have is irrelevant. We are talking about conditional probability, not unconditional. Thus, the question is “if you had evidence of kind x, then how often would you be wrong about x?” which means “how often” in an extended hypothetical set of infinite runs. You could have an answer to that question without ever polling how many beliefs you have.
    All probabilistic statements I am making are conditional (I follow the same convention as in PH). Now, as I understand the reply, you are saying that when we make sense of the statement: p(H|x) = 0.512 for some hypothesis H and background evidence x, the actual number of beliefs I have is irrelevant (in your words). So according to this definition, we imagine an “extended hypothetical set of infinite runs” and (I assume) consider the limit of the frequency of the true propositions to the total number of propositions in this series. As I mentioned in my review, I anticipated we would be talking about infinities at some point and I guess this is now :-). The issue is that this view — that probabilities are defined from repeatable events — is just the frequentist view which is what (e.g.) Jaynes argues against. If you have not had a chance to read his book I would strongly suggest you do so. To reiterate some of those points, for this definition to make sense requires two things:
    1) what is this series exactly?
    2) it must be defined (including the limiting procedure) without making references to probabilities or probabilistic language (otherwise the definition would be circular). Notice this includes “random”, “chance”, etc.
    It is in particular for question (2) the issue of having a clear and unambiguous definition of “frequency” is important.
    To take an example, consider:
    H : “The 8th digit of pi is 9”
    What is then this “extended hypothetical set of infinite runs” exactly? Is it a possible-world scenario where in some worlds pi is different? Is it something else, for instance random flips of a biased coin? (In this case would we not have to say: the coin is biased to produce heads with probability 0.521?)
    On the standard Bayesian view there are no such problems. I simply have a degree of belief* of 0.521 of H for the example of pi (or more relevantly, 0.1 because I really don’t know). I don’t have to imagine anything else like a hypothetical series of infinite runs (of what?) and moreover, *I can actually use this definition to derive the rules of probability theory* (as Cox argues). This is a major point. The definition is not just a semantics we can place on top of an existing mathematical framework, it has a normative effect.
    To re-consider the relationship between probabilities and frequencies from a Bayesian view, as I pointed out, if I believe m things “on the same strength of evidence”, e.g. with the same probability, then I can compute the probability of the statement:
    H_n : n of the propositions H1, … Hm are true.
    and e.g. the probability of n/m (the “frequency of being right” if you will). The formula is in the review. This requires *no* consideration of infinite runs or anything else and follows completely naturally from the basic laws of probability theory (iirc. this is chapter 3 of Jaynes). I would argue this is far more natural.
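The computation Hendrix refers to can be sketched in Python. The formula from his review is not reproduced in this post, so the version below makes a loudly stated assumption: the m propositions are treated as mutually independent, each true with the same probability p, which gives the binomial distribution. The function name `prob_n_true` and the values m = 4, p = 0.512 (borrowed from the earlier example) are illustrative.

```python
from math import comb


def prob_n_true(n: int, m: int, p: float) -> float:
    """P(exactly n of m propositions are true), assuming independence and a common p."""
    return comb(m, n) * p ** n * (1 - p) ** (m - n)


m, p = 4, 0.512
dist = [prob_n_true(n, m, p) for n in range(m + 1)]
# Expected "frequency of being right", i.e. the mean of n/m under this distribution.
expected_frequency = sum((n / m) * dist[n] for n in range(m + 1))

for n, prob in enumerate(dist):
    print(n, round(prob, 4))
print(round(expected_frequency, 3))
```

Under these assumptions the frequency n/m is itself an uncertain quantity with a computable distribution, and its expected value equals p. No infinite series of runs is needed, which is the relationship between probabilities and frequencies that Hendrix calls more natural.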
    On a final note, regarding the quote about confidence intervals, I wonder if you have read about the Bayesian procedure for obtaining the same result (i.e. inferring the probability density of, for instance, the probability that a coin comes up heads)? I did write a few pages about this in my review which I subsequently removed. If you are interested in how this is normally done I could send them.
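    For readers unfamiliar with the procedure mentioned above, here is a minimal sketch of one common textbook version, not necessarily the one in the removed pages. It assumes a uniform Beta(1, 1) prior on theta, the probability of heads, so that after observing some heads and tails the posterior is Beta(heads + 1, tails + 1). The function names and the data (7 heads, 3 tails) are hypothetical.

```python
from math import exp, lgamma, log

def beta_posterior_pdf(theta: float, heads: int, tails: int) -> float:
    """Posterior density of theta = P(heads), for 0 < theta < 1.

    Assumes a uniform Beta(1, 1) prior, giving a Beta(heads + 1, tails + 1)
    posterior. Computed in log space for numerical stability.
    """
    a, b = heads + 1, tails + 1
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    return exp(log_norm + (a - 1) * log(theta) + (b - 1) * log(1 - theta))

def posterior_mean(heads: int, tails: int) -> float:
    """Mean of the Beta(heads + 1, tails + 1) posterior."""
    return (heads + 1) / (heads + tails + 2)

# Hypothetical data: 7 heads and 3 tails.
print(posterior_mean(7, 3))
print(beta_posterior_pdf(0.7, 7, 3))
```

    The result is a full density over theta rather than a single interval, from which a credible interval can be read off by integrating the density.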
    Tim Hendrix  says
    Okay, I think we are finally getting to the heart of the matter.
    Epistemic probability is: “when I have z scale of evidence, then I will be right that x is true at frequency f.”
    That is not a measure of past success/fails in being right/wrong with z scale evidence. That is a prospective measure of infinite future runs: every time you will ever have z scale of evidence, you will be right that x is true at frequency f. That is what you literally mean when you say “I believe there is an 80% chance x is true.”
    Let's break this down. The first issue is the definition of “frequency” as “rate of occurrence”. The issue is that the phrase “rate of occurrence” has a probabilistic connotation, especially when you take it in the context of a hypothetical “infinite run”. For instance the first technical reference on Google for “rate of occurrence” is:
    where you can tell rate of occurrence is defined using probabilities. This is a general feature whenever you have words like “expected”, “average”, “random”, “limit frequency”, etc.: it is very hard to give these a precise definition without resorting to probabilities. This is a very old problem and you can find plenty of instances where people give circular definitions. I am sorry to ask the question again, but what *exactly* is the frequency/rate of occurrence defined as *in this exact situation*?
    The second issue here is what “scale of evidence” is. Presumably a “scale” is something which is measured using a single number, which sounds very much like a probability, but let's leave this issue aside for the moment.
    These lead to the main difficulty here, which is what the above definition refers to exactly. I am sorry that I keep asking about this, but it is not clear at this point, especially how you consider the “infinite runs” and what the events are. When I attempt to consider the above definition from a formal perspective I end up with something along these lines:
    When I claim the probability of H is 0.521, p(H|x) = 0.521, what I mean is: if I consider an infinite sequence of events H1, H2, H3, …, such that each event is taken to be independent and conditional on background evidence of the same strength as H, and if I let m be the number of the first n events H1, H2, … that are true, then m/n converges to 0.521.
    However, these events must from a formal perspective be *random variables*. And for the limit to converge to 0.521 they must each be true with *probability* 0.521 (for simplicity I have assumed independence). *What other definitions which are exact could be offered?* But in that case the statement is now circular: it is defining probability from a (true!) statement about probabilities. But this is not a definition any more than it is a definition of a number x to say it is equal to twice x divided by two (I discuss this in the review).
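    The circularity can be made concrete with a small simulation. This is only an illustrative sketch: the function name, seed, and run lengths are hypothetical, and independence is assumed as in the paragraph above. The thing to notice is that the “infinite run” cannot even be set up without supplying the probability p as an input.

```python
import random

def running_frequency(p: float, n: int, seed: int = 0) -> float:
    """Fraction m/n of n simulated independent events that come out true.

    The probability p has to be given up front: the limiting frequency
    is a consequence of p, not something that defines it.
    """
    rng = random.Random(seed)
    m = sum(1 for _ in range(n) if rng.random() < p)
    return m / n

# The ratio m/n drifts toward 0.521 only because 0.521 was put in.
for n in (100, 10_000, 100_000):
    print(n, running_frequency(0.521, n))
```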
    A way out is to define probabilities by making use of a physical process. That is, the probability a die comes up 3 is obtained by repeating a certain experiment, and thus it is a physical feature of the die. This definition is not circular, but it is problematic for other reasons which Jaynes discusses. At any rate *THAT* definition can only apply to repeatable events, because that’s how it avoids the circularity.
    The case of Pi is fairly interesting as I think it might highlight some of the differences in our way of thinking. I can tell you have cheated a bit and looked up Pi. Please try not to cheat and consider, as you are sitting in the chair right now, the following proposition:
    H : “Digit 117 of Pi is 9”
    What is then p(H | b) where b is all of your current background evidence available?
    In my view p(H | b) = 0.1. This is because *I do not know what the 117th digit of pi is* More about this in a moment.
    You wrote:
    If instead we have somehow found ourselves in a place where we actually have a legitimate epistemic probability of 80% (and not “near zero”) that the 8th digit of pi is 9, even though in fact it isn’t (and it is logically impossible for it to be), then we have to be in some extremely bizarre science fiction contrafactual scenario. I struggled hard to invent one as best I could. But the thought experiment entails that we would have to have evidence greatly misleading us as to what the 8th digit of pi is
    Should I take it from this quote that the only way to arrive at the answer p(H|b) = 0.1 rationally, on your view, is if we are in “some extremely bizarre science fiction counterfactual scenario”, and thus that you do not think this is a rational assignment of probability for me to have? What is your assignment of probability? (Don’t use Google!)
    I wrote:
    I simply have a degree of belief* of 0.521 of H for the example of pi (or more relevantly, 0.1 because I really don’t know).
    To which you responded:
    You do not “simply have a degree of belief.” It comes from somewhere. It is not conjured by magic.
    I agree of course. But I have set out very precisely where it comes from. If you look at the section “numerical values” in Jaynes, chapter 2, it sets out the idea behind the symmetry arguments which allow us to derive that the assignment of probability should be 0.1.


  1. Hey Steve, what about "On the Historicity of Jesus"? Do you know any refutation?

    1. Google James McGrath, Christ Hallquist, Mark Goodacre, and David Marshall.