Why is it more difficult to predict the second in the races?

cosmicsports · Oct 14, 2023

Try this experiment with a friend:

First study all the race cards carefully. Then the friend locks you in a sealed room and after the races are over he asks you to identify the winners.
You do that and you get some of them right, say 25%.

Now perform another experiment.
This time the friend announces the winners to you and you are asked to name the horses that finished second.
As you have N-1 horses to pick from now, it should be easier.
But it's not. You will find maybe 23% correct (if 25% was your score for the winners).

Of course, if you pick by chance in both experiments, you will find more seconds than firsts (1/(N-1) as opposed to 1/N).
But those will be low scores.
The point is that you do this after handicapping the horses, and that is when the deficit shows up.
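For reference, the pure-chance baseline can be written out as a tiny sketch. The function name `chance_hit_rates` and the field sizes are only for illustration; it just evaluates 1/N, 1/(N-1) and 1/(N-2).

```python
from fractions import Fraction


def chance_hit_rates(n_runners):
    """Hit rates for pure guessing in a field of n_runners (needs n_runners >= 3)."""
    first = Fraction(1, n_runners)       # winner picked from N horses
    second = Fraction(1, n_runners - 1)  # second picked from N-1, winner known
    third = Fraction(1, n_runners - 2)   # third picked from N-2, first two known
    return first, second, third


# Illustrative field sizes only.
for n in (8, 12, 16):
    first, second, third = chance_hit_rates(n)
    print(n, float(first), float(second), float(third))
```

By chance alone the rates always increase from first to second to third, which is why a handicapped score for seconds that falls below the score for winners is surprising.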

You may apply some common corrections, such as: if the front runner is not the winner, then something fishy went on and he didn't finish second either.
Still you will end up with a small deficit.
If however you play this game for the third placed horse (i.e. the friend gives you the winner and the second), then you score higher.

gerry · Oct 15, 2023

Well at the minute I am definitely finding more seconds than winners from the start, and I would say I have always found that over the years.

cosmicsports · Oct 15, 2023

gerry said:
Well at the minute I am definitely finding more seconds than winners from the start, and I would say I have always found that over the years.

Wrong count.

cosmicsports · Oct 19, 2023

How do we compute the win probabilities of the horses in a given card in a speed model?
The common approach is this:

- compute the expected speed for each horse using histogram analysis
- compute for each horse a value of the form p ~ exp(-(t - t0)/τ) and normalise so the p values sum to 1.

In this formula t is the horse's expected speed (time from wire to wire), t0 is the best expected speed in the group and τ is a time constant characteristic of the distance.
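A minimal sketch of the second step, assuming the expected times are already available and that "best" means the lowest time. The name `win_probabilities`, the sample times and the value of `tau` are made up for illustration; they are not fitted to anything.

```python
import math


def win_probabilities(expected_times, tau):
    """Turn expected times into win probabilities via p ~ exp(-(t - t0)/tau)."""
    t0 = min(expected_times)  # best expected time in the group (assumed: lowest)
    weights = [math.exp(-(t - t0) / tau) for t in expected_times]
    total = sum(weights)
    return [w / total for w in weights]  # normalise so the p values sum to 1


# Hypothetical expected times in seconds and a hypothetical time constant.
times = [72.0, 72.4, 72.9, 73.5]
probs = win_probabilities(times, tau=0.8)
print([round(p, 3) for p in probs])
```

A smaller τ concentrates the probability on the fastest horse, a larger τ flattens the distribution towards 1/N.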

There are more elaborate ways, including the use of a multiple integral to compute the p values, but to a first approximation this is it.

Now suppose we want the probabilities for seconds and thirds, so we can go for the forecasts and tricasts.
How do we do this?

We use Harville's law.
In its bare form this is grossly inaccurate, but we can make corrections.
First, the histogram analysis has to be modified; second, the time constant is not the same.
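For readers who have not met it, here is a sketch of Harville's law in its bare form only, with none of the corrections mentioned. Given that horse i has won, horse j is second with probability p_j / (1 - p_i). The function names and the sample probabilities are illustrative assumptions.

```python
def harville_second(win_probs, winner):
    """Bare Harville: P(j finishes second | winner) = p_j / (1 - p_winner)."""
    rest = 1.0 - win_probs[winner]
    return [0.0 if j == winner else p / rest for j, p in enumerate(win_probs)]


def harville_forecast(win_probs, first, second):
    """Bare Harville probability of the exact order first, second."""
    return win_probs[first] * win_probs[second] / (1.0 - win_probs[first])


# Hypothetical win probabilities for a three-horse field.
p = [0.5, 0.3, 0.2]
print(harville_second(p, winner=0))
print(harville_forecast(p, first=0, second=1))
```

The same renormalisation, applied once more with the first two horses removed, gives the third place probabilities needed for tricasts.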

After all this work suppose:

We find a mean probability Q1 for the winners.
We find for the second finishers a mean probability Q2, this being the mean probability with the winners supposedly known (before we add the lost using Harville to work out our p's).
We find for the third finishers a mean probability Q3, this being with the winners and the second finishers supposedly known.
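To make the three quantities concrete, here is a rough sketch of how Q1, Q2 and Q3 could be computed over a set of races. It assumes the bare Harville renormalisation for the conditional probabilities, not the corrected version described above, and the record layout `(win_probs, first, second, third)` is a hypothetical one chosen for the example.

```python
def placed_probabilities(win_probs, first, second, third):
    """Probabilities the model gave to the actual first, second and third."""
    q1 = win_probs[first]
    # Second: winner taken as known, remaining horses renormalised (bare Harville).
    q2 = win_probs[second] / (1.0 - win_probs[first])
    # Third: winner and second taken as known.
    q3 = win_probs[third] / (1.0 - win_probs[first] - win_probs[second])
    return q1, q2, q3


def mean_q(races):
    """Mean Q1, Q2, Q3 over races given as (win_probs, first, second, third)."""
    totals = [0.0, 0.0, 0.0]
    count = 0
    for win_probs, first, second, third in races:
        for k, q in enumerate(placed_probabilities(win_probs, first, second, third)):
            totals[k] += q
        count += 1
    return tuple(t / count for t in totals)


# Two made-up races, indices are positions in each probability list.
races = [
    ([0.4, 0.3, 0.2, 0.1], 0, 1, 2),
    ([0.5, 0.25, 0.15, 0.1], 1, 0, 3),
]
print(mean_q(races))
```

With real data the list `races` would hold one record per race in the database; the two records here only show the layout.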

We expect that Q3 > Q2 > Q1, because for the second finishers we have to deal with N-1 horses and for the third finishers with N-2, that is, fewer each time.

But in all the databases I have analyzed it turns out that it is not so.
Q2 is less than Q1!
Q3 is better than both Q1 and Q2 by a margin (as expected).

Because the same thing was observed every time I did this kind of analysis, it cannot be chance. It's a property of nature (horse racing nature).
Why do you think that is?
