One of the cases that we face in econometrics is when the response variable is qualitative. In the simplest case, the response can be either 0 or 1. Some examples:
- Given some parameters about a person: will he vote Democrat or Republican? Will he subscribe to some magazine? Will she leave her job if pregnant? Will the student decide to cheat? Will she vote at all?
- Or, given some parameters about a bond, how will it be classified? It will surely depend on volatility, leverage, assets…
- A teacher wants to know if a student will pass his exam.
- A financial company wants to know if the sub-prime mortgage it is about to issue will default.
In those situations we don’t have a value to predict directly as in a regular estimation. We’ll estimate the probability of something happening. P0 that it won’t happen, P1 that it will happen. P0+P1=1. That means that we’ll have a Bernoulli probability distribution with a mathematical expectation P1.
Imagining a two-dimensional situation, we’d have something like this:
[Figure: fitting two parallel lines with a diagonal line, that should be tough.]
That means that if we estimate with an ordinary least squares regression we’ll get a bad approximation, for three reasons: the shape of the data doesn’t resemble a line, the fitted line will go above 1 and below 0 (impossible probabilities), and R² stops being a useful measure of goodness of fit, because it mostly gets higher the more separated the two dot clouds are.
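To make the second problem concrete, here is a minimal sketch in plain Python. The data is made up by me (30 points with response 0 on the left, 30 with response 1 on the right), and the names `xs`, `ys` and `fitted` are just for illustration. It fits a straight line by ordinary least squares and looks at the range of the fitted "probabilities":

```python
# Made-up data: 30 points with y = 0 on the left, 30 points with y = 1 on the right.
n = 60
xs = [-3 + 6 * i / (n - 1) for i in range(n)]
ys = [1.0 if x > 0 else 0.0 for x in xs]

# Closed-form ordinary least squares for the line y = a + b * x.
mean_x = sum(xs) / n
mean_y = sum(ys) / n
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
a = mean_y - b * mean_x

# Fitted values of the line at every observed x.
fitted = [a + b * x for x in xs]
print(f"slope = {b:.3f}, intercept = {a:.3f}")
print(f"lowest fitted 'probability'  = {min(fitted):.3f}")
print(f"highest fitted 'probability' = {max(fitted):.3f}")
```

With this toy data the line dips below 0 at the left end and climbs above 1 at the right end, which is exactly what a probability is not allowed to do.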
The solution: using the logit model. We need something like this:
The function that we are trying to fit is the standard logistic function, a special case of the general logistic function; its inverse is the logit function, which gives the model its name. It is S-shaped: it asymptotically tends to one on the right side and to zero on the left side. That would mean that our regression would now be:
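Written out for a single explanatory variable x, with an intercept β0 and a slope β1 (this notation is an assumption of mine, it may not match the original figure), the regression would look something like this:

```latex
P_1 = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}, \qquad P_0 = 1 - P_1
```

Whatever values β0 and β1 take, P1 stays strictly between 0 and 1.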
In practice you need a statistical package to do that. You can even use the cumulative distribution function of a Gaussian instead of the logistic curve. That would be the probit model. Daniel McFadden earned the 2000 Nobel Prize in Economics partly because of his development of discrete choice econometrics based on the logit regression.
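To give a rough idea of what such a package does under the hood, here is a small sketch that fits a one-variable logit by maximum likelihood with Newton-Raphson, using only the standard library. Everything in it is an assumption for illustration: the data (hours studied and pass/fail) is invented, and `fit_logit`, `b0` and `b1` are my own names, not anybody's official API. A real package would also give you standard errors, handle many regressors and cope with badly behaved data.

```python
import math

# Made-up illustration: hours studied (x) and whether the student passed (y).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
passed = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]


def logistic(z):
    """Standard logistic function, maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(xs, ys, iterations=25):
    """Maximum-likelihood fit of P1 = logistic(b0 + b1 * x) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        ps = [logistic(b0 + b1 * x) for x in xs]
        # Gradient of the log-likelihood with respect to b0 and b1.
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Entries of the 2x2 information matrix (minus the Hessian).
        ws = [p * (1.0 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 linear system by hand.
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
    return b0, b1


b0, b1 = fit_logit(hours, passed)
for h in (1.0, 3.0, 5.0):
    print(f"{h} hours -> estimated P(pass) = {logistic(b0 + b1 * h):.2f}")
```

The toy data is deliberately not perfectly separated (some students who studied more still failed); with perfectly separated data the coefficients would grow without bound and this naive loop would not settle.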
So, what’s the relationship with bits? In communications the basic unit of information is the bit. As you already know, a bit just expresses one simple, basic idea: “0” or “1”. That could mean “on” or “off”, but also “democrat” or “republican”, “male” or “female”. It is the simplest piece of information and, by combining several of them, you can express more complex ideas: height, number of children, passport number, marital status.
When you’re sending bits, you’re also performing a Bernoulli trial. Each bit you send is either a zero or a one. You don’t know whether they’ll be ordered or not, but usually you assume that both values have similar probabilities.
A measure of the “order” or “disorder” is the entropy. If there’s a similar quantity of zeros and ones, the entropy will be at its maximum; if they’re all ones, or all zeros, at its minimum. There wouldn’t be any disorder if you only sent zeros. There would be no information either. So entropy must be kept at a maximum to send as much information as possible. (It also helps keep the channel working; that’s what randomisers are for.)
There’s a function that measures the entropy of a Bernoulli trial:
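Using the P0 and P1 from before, and measuring in bits (logarithms in base 2, which is my assumption about the units), the binary entropy function reads:

```latex
H(P_1) = -P_1 \log_2 P_1 - P_0 \log_2 P_0, \qquad P_0 = 1 - P_1
```

By the usual convention, a term with probability zero counts as zero.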
The maximum, of course, is when both probabilities are the same, when you are sending 0’s and 1’s with the same probability.
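If you want to play with it, here is a small sketch of the binary entropy in Python. The function name `binary_entropy` is mine, and I am assuming base-2 logarithms so that the result comes out in bits:

```python
import math


def binary_entropy(p1):
    """Entropy in bits of a Bernoulli trial with P1 = p1 and P0 = 1 - p1."""
    total = 0.0
    for p in (p1, 1.0 - p1):
        if p > 0.0:  # convention: 0 * log(0) counts as 0
            total -= p * math.log2(p)
    return total


for p1 in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"P1 = {p1:.1f} -> H = {binary_entropy(p1):.3f} bits")
```

Sending only zeros or only ones gives zero entropy, and the fair 50/50 case gives exactly one bit per symbol, the most a single binary digit can carry.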
Now with coincidences
So where’s the funny thing? First let’s rewrite the logit regression function in log form (solve for the odds P1/P0 and take logarithms on both sides):
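Assuming the logistic form with a single regressor x and coefficients β0 and β1 (the names are my own choice), the algebra goes roughly like this:

```latex
\begin{aligned}
P_1 &= \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}} \\
\frac{P_1}{1 - P_1} &= e^{\beta_0 + \beta_1 x} \\
\ln\frac{P_1}{1 - P_1} &= \beta_0 + \beta_1 x
\end{aligned}
```

The left-hand side of the last line is the log of the odds, which is the logit of P1.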
Now, let’s take the entropy of the Bernoulli trial and differentiate it with respect to P1:
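Step by step, and assuming natural logarithms to keep the constants clean (working in base 2 only multiplies everything by 1/ln 2), it would be:

```latex
\begin{aligned}
H(P_1) &= -P_1 \ln P_1 - (1 - P_1)\ln(1 - P_1) \\
\frac{dH}{dP_1} &= -\ln P_1 - 1 + \ln(1 - P_1) + 1 \\
&= -\ln\frac{P_1}{1 - P_1}
\end{aligned}
```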
Do you see it? Here there is some coincidence going on. Both are the same function. (Well, there’s that minus sign, but who cares).
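If you don’t trust the algebra, a quick numerical check is easy. This is just a sketch with names of my own (`entropy_nats`, `logit`, `entropy_slope`), assuming natural logarithms so that the entropy is in nats. It compares a finite-difference slope of the entropy with minus the log-odds:

```python
import math


def entropy_nats(p):
    """Bernoulli entropy using natural logarithms, valid for 0 < p < 1."""
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def logit(p):
    """Log-odds ln(p / (1 - p)), the left-hand side of the logit regression."""
    return math.log(p / (1.0 - p))


def entropy_slope(p, h=1e-6):
    """Central finite-difference estimate of dH/dp."""
    return (entropy_nats(p + h) - entropy_nats(p - h)) / (2.0 * h)


for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p = {p}: dH/dp = {entropy_slope(p):+.5f}, -logit(p) = {-logit(p):+.5f}")
```

The two columns agree up to the finite-difference error, minus sign included.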
So, again, is Economics a Science?
Why do I care, you might ask. Well, lately I’ve been thinking about whether Economics is solid enough as a science or not. There are a lot of things to say about that.
First I thought it wasn’t. What kind of science could have its rules changed by politicians? I posted my negative opinion here: Economics did not have solid enough foundations to say what was true or not.
But were the other sciences solid enough long ago? Maybe it’s just a perception of mine that Physics is more solid than Economics. The rules of reality can’t be changed (for now), but they can’t be totally determined either. Do you know about the Heisenberg uncertainty principle? It’s one of the greatest concepts that humanity has achieved. I have to blog about it some day.
There’s something deep going on here, laying the foundations of Economics as a science. The more I think about it, the more I can grasp or feel it. Even in planned communist economies, there were still cycles. Even when the Phillips curve did not work anymore, people were still making the same decisions: work more if there was more reward for it, hire cheaper if there was more workforce to choose from.
The more I think about it, the more I see Economics is a solid science in its own right.