Let’s say you work in a hospital. One random day, I come to your office and ask, “Is it going to rain today?” You look outside and see that it’s not particularly cloudy but it’s not one of those bright sunny days either. You have seen days like this pass without a single drop of rain and you have seen days that start like this and end in a flood. So you say, “I am not sure.”

Since I like asking questions, I don’t stop here and instead, I say, “What’s the probability that it will rain today?” You think for a while. You try to remember the number of days which started like this in your life and the fraction that ended up in rain. But to your disappointment, you don’t have that sharp a memory. You try to estimate this fraction, but very soon realize that you have absolutely no clue what it is. So you say, “I am not sure.”

Now let’s pause here and ponder for a while. Whenever someone says ‘I am not sure’ for any question, it is only reasonable to ask them to assign a probability distribution to the set of possible answers to the question. For example, if it is a yes/no question, then the natural thing to ask in reply is, “So what’s the probability that the answer is yes and what’s the probability that it’s no?” If the set of possible answers is, say {1, 2, 3}, that is, the set comprising the numbers 1, 2 and 3, then you will want to ask, “What’s the probability that the answer is 1, what’s the probability that it’s 2 and what’s the probability that it’s 3?” The set of answers may not be countable, that is, it may occupy a continuum. But you can still ask to assign a probability distribution to the set of answers. If you don’t understand how, then this post will not make any sense to you.

Anyway, the point is, when I ask you to assign a probability to the event that it rains today and you say you are not sure, I can ask you to assign a probability distribution to the different values of probability with which it might rain today. This means that I can ask you for the probability that the probability that it rains today lies in a small interval around x, where x is a real number between 0 and 1. And if you claim you are still not sure, you know what I can ask next. Let me still state it, just for the kicks. The question I will ask next is this – “What is the probability that the probability that the probability that it rains today lies in a small interval around x, where x is a real number between 0 and 1?”

I am going to state something really interesting in this paragraph. But before that, I will need to number these questions so that I am able to state the interesting thing in a precise way. So let’s say the question “Is it going to rain today?” was my zeroth question. The question “What’s the probability that it rains today?” was the first question and so on. So here comes the interesting thing. If you choose to answer my nth question, instead of just saying that you are not sure, then from the values that you give me, I can calculate your answer to the (n-1)th question (with a simple integration), and from there, I can calculate the answer to the (n-2)th question and so on till the first (not the zeroth) question. Or in other words, if you are sure about the nth question in the series for any value of n, then you must be sure about the first question as well, or else you don’t understand probability. Or, stated in the contrapositive, if you are not sure about the first question, then you cannot be sure about any of the questions in the series. Or, finally, to summarize everything, you either know the answer to all questions in the series (the set of questions from the first question to the last question) or you don’t know the answer to any one of them.

Some time ago, I was thinking about these things and was trying to figure out what was wrong when I realized something that made it all clear. What I realized was as follows. When I ask you what’s the probability that a certain event will happen, I am not asking you to give me some magic number so that if I perform a certain experiment in similar conditions a hundred times, then the event in question will happen that many number of times. What I am actually asking you is to give me an estimate of your own uncertainty about the event, based only on the information that you currently have. Different people might assign different values to the probability. But it won’t mean that any one of them is wrong. The probability that one assigns to a certain event’s occurrence shows his own uncertainty about it.

This is why Bayes’ Theorem makes sense. It gives you a precise way to update your own uncertainty about something based on the knowledge you receive and the evidence you assimilate.

So, to sum it up, the problem I tried to explain in the post was that you either know the answer to an infinite number of questions or you don’t know the answer to any one of them. The way it is resolved is that you actually know the answer to all the questions, because the reason the questions sounded unanswerable earlier was that you had not understood them correctly.

This is good understanding for the start. I plan to read E.T. Jaynes’ Probability Theory: The Logic of Science some time soon. I hope I will be wiser after that.

“When I ask you what’s the probability that a certain event will happen, I am not asking you to give me some magic number so that if I perform a certain experiment in similar conditions a hundred times, then the event in question will happen that many number of times. What I am actually asking you is to give me an estimate of your own uncertainty about the event, based only on the information that you currently have. So if you have absolutely no idea if it will rain today or not, you should assign a probability of half to the event that it rains.”

Actually, both these interpretations are valid ways of looking at probability. And whether one is more appropriate than the other is a controversy that’s been going on for a while now. Look up the frequentist and the Bayesian/subjectivist interpretations of probability.

But yeah, I do tend to side with the Bayesian camp. And so do you, apparently.

Signed,

Guy-who-had-to-figure-this-stuff-out-for-his-BTP

Bayes rules!

Very well-written post. Actually it is interesting you mentioned the “probability of rain” problem, because this is actually something I am looking at for my research as well, which will hopefully try to explore anticipatory systems in a Bayesian framework.

I don’t know if you have read stuff on predictive control/ preview control. But the models used in that also use past knowledge and uncertain “previews” of future to achieve certain output goals. Its quite elegant.

Interesting…

Actually you will ask “What is the probability that the `probability that the $probability that it rains today$ lies in a small interval around x` lies in a small interval around y”