
Combining predictions for one outcome

AustinDillon75

Yearling
I've posted in this sub-forum before, so if it's in the wrong place please do move it.

What I've got is two sets of predictions. They all relate to AW Handicaps at Wolverhampton, run between 2016 and 2018.

I've got HRB rankings, which are customised and levelled, to say this:

1597765595263.png
So the top ranked win 20.9% of the time and return a loss of 6.4%.

For the same set of races, I've taken the SPs, which are completely independent of the HRB data, and it comes out with this:

1597765660726.png
So the top rated horse wins 30.4% of the time and returns a loss of 1.6%.
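For anyone wanting to reproduce the two headline figures, here is a rough sketch of how strike rate and level-stakes ROI could be worked out for the top rated horses. It assumes 1-unit stakes and decimal SPs; the function name and the sample results are made up purely for illustration and are not the data behind the tables above.

```python
def strike_rate_and_roi(results):
    """results: list of (won, decimal_sp), one entry per top rated horse, 1-unit stakes."""
    bets = len(results)
    wins = sum(1 for won, _ in results if won)
    # Decimal odds include the stake, so a winner at 3.0 returns 3 units.
    returns = sum(sp for won, sp in results if won)
    strike_rate = wins / bets
    roi = (returns - bets) / bets
    return strike_rate, roi

# Made-up sample, purely for illustration
sample = [(True, 3.0), (False, 4.5), (False, 2.75), (True, 4.0), (False, 6.0)]
sr, roi = strike_rate_and_roi(sample)
print(f"strike rate {sr:.1%}, ROI {roi:+.1%}")
```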

Given both predictions are completely independent of one another, I'm convinced I can improve on this. I'll return later with ideas on how, but if anyone else has thoughts I'd welcome them.
 

AustinDillon75

Yearling
What I tried to do with these two totals was to rank all the horses by percent rank in each of the two categories, so the top 5% of horses in the sample scored 1, the next 5% scored 0.95, etc.

I found that a horse in the first 5% on my HRB rating would beat 70.7% of all horses home.
I then found that a horse in the first 5% on odds rating would beat 81.2% of horses home - the SP is obviously going to be more accurate, but can we get HRB to help improve it?

It emerges that if a horse fell in the top 5% in both categories, it would beat home 85.5% of horses.
If in the first 5% on odds but position 5-10% on HRB, it would beat home 81.7% of horses.
If in the first 5% on HRB but position 5-10% on odds, it would beat home 82.3% of horses.
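The 5% banding described above could be sketched roughly like this. It assumes a higher rating is better and that ties are simply left in the order they arrive; `band_scores` and the 40-horse sample are hypothetical, not the actual sheet.

```python
def band_scores(ratings, band_pct=5):
    """Top 5% of the sample scores 1.0, the next 5% scores 0.95, and so on."""
    n = len(ratings)
    # Indices of the horses, best rating first
    order = sorted(range(n), key=lambda i: ratings[i], reverse=True)
    bands = 100 // band_pct
    scores = [0.0] * n
    for position, idx in enumerate(order):
        band = position * bands // n  # 0 for the top band
        scores[idx] = (100 - band_pct * band) / 100
    return scores

ratings = list(range(40, 0, -1))  # hypothetical sample of 40 ratings, best first
scores = band_scores(ratings)
print(scores[:4])
```

The same function would be run once on the HRB ratings and once on the odds (with the odds negated or inverted first, since a shorter price is better), giving each horse a pair of band scores to look up.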

I've then applied it to a separate race at Wolverhampton from 2018 and it gave this suggested outcome - so Passing Star might beat 73.2% of all horses, and anything over 50% might be expected to finish in the first half of the field. An odds line could probably be created from this, but that's for another day:

1597840609854.png

So when we tested the combined forecasts for a load more races this came up:

1597840905619.png
But when these were combined, as per above, we got this:
1597840954867.png
So a dip in winner percentage but the ROI improves on both of them.

So the aim is to try and take these percentages and combine them into a forecast that will get perhaps nearer to a third accuracy, and some sort of profit.
 

AustinDillon75

Yearling
LukeyBoy

Your suggestion of Impact Values is interesting. I think what you suggest looks a bit like this:

1597841144564.png

So top rated in both categories would get the square root of 2.02 x 2.94 (2.44) and if 7th in HRB but top odds it would be 0.80 x 2.94 (1.27)
Is that where you would be going with it?
If so I can try them in combination to see what occurs in a new out of sample test?
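As a rough sketch of the combination being asked about: look up the impact value for each rank and take the square root of the product. Only the two top-rank values quoted in the post are filled in here; the rest of the table lives in the image, so the dictionaries are deliberately incomplete and their names are just for illustration.

```python
import math

# Only these two entries are quoted in the post; the full table is in the image.
HRB_IV = {1: 2.02}
ODDS_IV = {1: 2.94}

def combined_iv(hrb_rank, odds_rank):
    """Geometric mean of the HRB impact value and the odds impact value."""
    return math.sqrt(HRB_IV[hrb_rank] * ODDS_IV[odds_rank])

print(round(combined_iv(1, 1), 2))  # 2.44
```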
 

AustinDillon75

Yearling
So I've combined the impact values together and this is the result - training data from Wolverhampton handicaps from 2016-2018, the test is data from 2018 to the present.

1597853269410.png
It could be said we've improved on the HRB forecasts, but we've now got a -10.8% ROI instead. The issue, to a degree, is that a second ranked horse rated close to the first horse gets the same value as any other second ranked horse, whereas we probably need to weight the impact values a bit. So if a horse in second has a similar rating to the first horse, it will maybe get an IV of 2 rather than 1.36, which is a significant drop between best and second best. Additionally, if the favourite has an IV of nearly 3, what if it's 5/2, when a horse that's 11/4 gets an IV of nearly 2 instead?
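To put a number on the 5/2 versus 11/4 point, here is a small sketch converting fractional odds to implied probability (ignoring any overround). The helper name is made up for illustration.

```python
from fractions import Fraction

def implied_probability(fractional_odds):
    """'5/2' -> 2/7: the chance implied by the price, with no overround adjustment."""
    num, den = (int(x) for x in fractional_odds.split("/"))
    return Fraction(den, num + den)

for price in ("5/2", "11/4"):
    p = implied_probability(price)
    print(price, f"{float(p):.1%}")
```

The two prices imply chances of roughly 28.6% and 26.7%, which is a far smaller gap than a drop in IV from nearly 3 to nearly 2 suggests.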

I'll try and refine it a bit on this basis.

Maybe training over 800 odd races isn't right, and maybe testing over 750 is too many as well. Is there merit in training over 200, then testing over 100 and repeatedly "folding the results" together, to get a better predictor?
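One way the "train on 200, test on 100, then repeat" idea could be laid out is as a rolling (walk-forward) split over races sorted by date. This is only a sketch of the windowing, assuming the races are already in date order; the function name and default sizes are illustrative.

```python
def walk_forward_splits(n_races, train_size=200, test_size=100):
    """Yield (train_range, test_range) index windows over races sorted by date."""
    start = 0
    while start + train_size + test_size <= n_races:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # slide forward by one test block

splits = list(walk_forward_splits(800))
for train, test in splits:
    print(f"train {train.start}-{train.stop - 1}, test {test.start}-{test.stop - 1}")
```

Each test block is always later than the races it was trained on, so the results from every window can be pooled without the test races leaking into the training figures.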

Obviously the fact it includes SPs also means you would in practice be working to odds available at a point in time. In other words, if it's 5/1 in the morning, it might go off at 2/1 and it's then got a totally different rating. I'm wondering if SPs, accurate as they are, should be the percentage of success we need to aim for, rather than being inherent in the figures we create. Sorry if I'm making zero sense. The aim is to get a predictor that will maybe get the winner from the top couple on the rating more than 50% of the time. At present, the HRB only delivers 34%.
 

LukeyBoy

Yearling
It's just one of those things that always happens: Rating 1 is good, Rating 2 is good, splice them together and they are shi.....
I've always found it the same when using rankings because, like you said, they are more of a generalisation, and from SP ranked 1 to ranked 2 could be a difference of who knows what.
What about A/E instead of IV? But use BFSP instead of standard ISP because it's a lot more accurate.
So sum together all the chances of, say, Ranked 1 for HRB, which is Expected (=1/BFSP), and count the number of winners, which is Actual.
So Actual/Expected for, e.g., Rank 1 HRB: you might have 179.99 expected winners and 189 actual winners, giving 189/179.99, and then multiply the same way as before with the IVs.
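A quick sketch of that A/E sum, assuming decimal BFSPs and one (won, BFSP) pair per runner in the rank group. The function name and the three-runner sample are made up for illustration.

```python
def actual_over_expected(runs):
    """runs: list of (won, bfsp_decimal) for every horse in one rank group."""
    expected = sum(1 / bfsp for _, bfsp in runs)  # each runner's chance is 1/BFSP
    actual = sum(1 for won, _ in runs if won)
    return actual / expected

# Made-up group: chances of 0.5 + 0.25 + 0.25 = 1 expected winner, 1 actual winner
sample_runs = [(True, 2.0), (False, 4.0), (False, 4.0)]
print(actual_over_expected(sample_runs))

# The figures quoted in the post
print(round(189 / 179.99, 2))  # 1.05
```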

Strike rate prob be down but ROI up.
 

LukeyBoy

Yearling
IMO you will struggle to get the top 2 hitting 50% without some sort of feedback from the market. The top 2 in your table for SP are at 50%, and just using HRB to try and beat that is difficult. And doing so without losing ROI is harder!
 

AustinDillon75

Yearling
I agree it would be difficult without some sort of information from the market. Ultimately the market is best, and it's a question of whether we can use something else to isolate those top rated that win 35% of the time as against those winning, say, 25% of the time. I might well see if BFSPs make any sort of difference, it's easily done. Will look at A/E as well.

The only other thing is that when we want to predict for real, we are reliant on data that isn't available till near the off. You'd probably want to input Betfair's own likely estimation rather than doing your own, so doing them at 11am is going to rate things a bit differently to what would be the case once the smart money is down nearer the off.
 

AustinDillon75

Yearling
The latest attempt has done a couple of things:
1. We've reduced the training data to all races run at Wolverhampton AW between 23 April 2016 and 31 December 2016.
2. I've swapped the SP for BFSP.
3. Instead of just going with impact values I've derived a formula that will weight the IV more heavily the bigger the advantage. So in other words, if something is ranked 3rd, but 40 HRB pts above the race average, it'll have a better IV than a horse ranked 2nd which is only 20 HRB above average.

1597864366271.png
Combining the data above has produced these results:
1597864768815.png

I've then run a series of races at Wolverhampton, from January-March 2017, and it has produced the following results - again completely uninfluenced by the training data. Will have to try and figure whether it achieves a profit to SP, but the percentage for horses 1 and 2 is really good.

1597864535469.png

Struggling a bit to explain myself now, but we've now given ourselves two distinct tables above, against which we could run a test of races from April to July, and then combine the pair of these in a similar way, to improve our percentages even further?
 


LukeyBoy

Yearling
Nice work dude. Be good to see how it fares in real life. The winning percentages are pretty linear, which is a good sign. If you had 2nd beating 1st and 4th beating 3rd etc. you'd know you're on the wrong path. Did you create your own HRB ratings? I've tried to in the past but can never make sense of the calculations via the website.
 

AustinDillon75

Yearling
LukeyBoy, I have made my own ratings but they are pretty heavily based on HRB Standard. I have however added in a speed rating and jockey/stallion stats to allow an estimated rating to be made for unraced horses.

It's not hugely better than the inbuilt one but it just stops instances of a good horse always being bottom just because it's unraced.
 

AustinDillon75

Yearling
It's been a good day's racing, so time for me to introduce more tedium and number crunching to the place.

I've reduced my testing so I have two lots of data in separate sheets:
TEST1 - this contains 218 races from Wolverhampton AW between 23 April 2016 to the end of the year, identifying horse rankings based on HRB and the BFSP.
TEST2 - this tests 250 races from Wolverhampton AW between 1 January 2017 to end of March, using the statistics contained in TEST1.
TEST3 - this now trials 61 races from April 2017 onwards, with the data carried from both TEST1 and TEST2 sheets. We are now getting the following outcome.

1598022900683.png
Admittedly, the test does assume that BFSP is known already, when it can't be known until near the off, but this is seeming quite promising. I'm not convinced it will profit with ISP but all the same, the strike rates for the first three are pretty impressive, nearing 74%.

The good thing about it is we can "shift" the data around too, so over time the TEST3 races will drop into TEST2 and eventually TEST1, so we are making forecasts that take account of the more recent happenings at the track. There's a bit of fluidity to it.

Sorry if it's a bit mind numbing, but the real test is going to be getting the prices as close as possible to BFSP for the data entry, alongside the HRB data.

My HRB formula calculates a standardised total, so if a horse has had fewer than 10 runs, it will estimate a total based on what it might achieve after 10 runs. If a horse has no runs it estimates based on the trainer, jockey, stallion and today stats. Feels like this is a piece of work I won't regret doing. It's also easy to fetch the required data, and with HRB we could be building different sheets either track by track, or by race code, race type, etc.
 

AustinDillon75

Yearling
Here's an example of one of the TEST3 races. I've picked one which got us the winner (surprise), but this is rating the runners from the combination of TEST1 and TEST2 data.

1598024353338.png
The HRB column is my own rating, and the ODDS is essentially what the market would give if it was running HRB, so with Whitecliff Park being a solid favourite, the market would suggest it has an HRB of 240, down to Sailor Malan, which gets a 178 but also figures pretty poorly on my HRB too.

The Total column is essentially multiplying a number of impact values derived from the TEST1 and TEST2 sheets; so in effect, the assessment is not contaminated in any way, shape or form - it's taking past results and form on board. We won't get many top rated winners at big prices, because if we are factoring odds into the total, outsiders are pretty unlikely to do so well, but what we might get is a fairly steady stream of winners.

If Whitecliff Park was 8/1 in the morning, and Sir Dylan was 3/1, our calculation at that time would be completely different. So we have to get as near as we can to the final market price which is likely to be one of the issues going forward; we've tested on the assumption that we know the BFSP early in the day, but this needn't be a completely insurmountable problem, obviously.

Question here: I've combined impact values and this race gives totals of less than 1. What I can't seem to do is find a sensible way of making 1 the average. In other words, if I have two IV figures, 1 and 0.8, I take the square root of these after multiplying, which is SQRT (1 x 0.8) = 0.89.
Does anyone know how I do that for four numbers? It wouldn't change the rankings above, just the actual totals.
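One possible answer, offered as a sketch rather than the definitive method: the square root of a product of two numbers generalises to the n-th root of a product of n numbers (the geometric mean), so for four IVs it would be the fourth root. If the aim is for the field to average 1, each total can then be divided by the race average, which leaves the rankings untouched. Both function names are made up for illustration, and `math.prod` needs Python 3.8 or later.

```python
import math

def combine_ivs(ivs):
    """Geometric mean: the n-th root of the product of n impact values."""
    return math.prod(ivs) ** (1 / len(ivs))

def rescale_to_average_one(totals):
    """Divide every runner's total by the race average so the field averages 1."""
    mean = sum(totals) / len(totals)
    return [t / mean for t in totals]

print(round(combine_ivs([1, 0.8]), 2))       # 0.89, matching SQRT (1 x 0.8)
print(combine_ivs([1, 0.8, 1.2, 0.9]))       # four numbers: fourth root of the product
print(rescale_to_average_one([0.5, 1.0, 1.5]))
```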
 

AustinDillon75

Yearling
If every horse had an IV of 1, they all have the same chance of winning.

If it has an IV of 2, it's twice as likely as the average.

If it has an IV of 0.5, it's half as likely as the average.
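For reference, a minimal sketch of how an impact value is commonly worked out, assuming the usual definition (a group's share of the winners divided by its share of the runners). The thread doesn't spell out how the figures in the tables were derived, so treat this as an assumption; the function name and the counts are made up.

```python
def impact_value(group_wins, total_wins, group_runners, total_runners):
    """Share of winners divided by share of runners for one group (e.g. top ranked)."""
    share_of_winners = group_wins / total_wins
    share_of_runners = group_runners / total_runners
    return share_of_winners / share_of_runners

# Made-up counts: a group with 10% of the runners and 20% of the winners
print(impact_value(20, 100, 100, 1000))
# Made-up counts: a group with 10% of the runners and 5% of the winners
print(impact_value(5, 100, 100, 1000))
```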
 

AustinDillon75

Yearling
It can be better estimated nearer the off but it is never known until the race has started.
Well yes, it's the inevitable issue when you factor the market opinion into a rating. If you had something that picked off an estimated BFSP 1 minute prior to the off, it'd be ideal.
 