Backtesting Basics

Organised chaos · Apr 7, 2023

Morning everyone,

Can someone with experience of back testing steer me in the direction of the most productive way of achieving this? I recall reading about a figure of 10k races being used but given I’m a novice any advice would be appreciated.

Many thanks in advance and happy Easter.

OC

dave58 · Apr 7, 2023

Good morning Organised chaos.

Not too sure what exactly you are asking, but for what it's worth this is how I work.

The first thing I do is try to think of a reason why a horse won or came close in a particular race, and whether there is any pattern in its entry - for example, does the trainer enter a horse in the same race each year and if so, why?
Crazy example, but is the race named after someone the trainer wants to remember, or maybe he's giving his owners a day out and wants to impress them with a winner?

Come up with an idea, spend a few days thinking about it and what factors are involved in it, then and only then back test it (which is just looking at previous results to see if it works).

Keep asking yourself questions at each stage - for example, in your post I would ask 'why the 10K cut-off point?'

It's a long process but once done you can just let it tick along, just monitoring it now and again.

Good luck.

mick · Apr 7, 2023

Good post dave58. To Organised chaos I would say be very cautious re any back-fitted finds. If they appear positive then seek racing reasons to explain them, and if these cannot be found then think very hard indeed before deciding to risk money on your finds. The astute betting stables will be well aware that many punters will be looking at their stats and I suspect that they sometimes use this to put us away. :eek:

Sandhog · Apr 7, 2023

Perhaps the 10K figure is an attempt to ensure adequate data for making a conclusion about the hypothesis, dave58?

I remember elsewhere that stats experts would often level the accusation that an idea didn't have sufficient evidence/data to support it.
My only comment is that building enough data takes time and, over time, things change.
This could mean that the scheme is out-dated before it gets started.

davejb · Apr 7, 2023

10k races sounds to me to be one of those statements intended to absolve whoever said it from any comeback. The number of races needed in a data sample will vary with the type of race/racehorse/jockey/trainer/whatever being considered and what it is you are trying to find.

Is the trainer (i.e. stable) in good form - look back 2 years (10,000 races in the UK is about that long)? You are having a laugh - what use is 2-year-old data when looking for something that changes on a daily basis? Okay, so is a trainer targeting the same race each year? Well then, 2 years of data is insufficient. Both examples were chosen to make the point that what you are after is going to determine how far back you need to look.

It would be an error, I think, to overdo the look back period - every aspect of racing, from the participants in the industry to the places they compete at, changes over time. A course might have its surface changed, or drainage changed, and the type of horse that gets sent to a trainer may change as owners move on or are attracted to somebody who seems like they might be going places. Different sires become fashionable, older sires retire or expire! Suppose I want to figure out which courses are good for front running tactics, and which are bad - how many races at each distance do I need to collect data for to decide that? You can easily amass data for hundreds of runs over 5f to a mile in just a season or two, but you'll have to go back 10 years or more to get even a modest sample of data for staying distances.

I'd suggest that if there's a trend visible in the past 100 races, that's giving you current(ish) info that is less likely to have been influenced by different conditions from the past. If there is no trend visible, then going further back is probably just going to encourage you to find a trend that doesn't actually exist, or is no longer in play. If you are looking for longer term effects then fine, but bear in mind that the further back you go the less reliable the data can be (courses have all been remeasured in the UK in recent years, and prior to that race distances were pretty much all in error, sometimes by large amounts).
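
To put rough numbers on the sample size question, here is a minimal sketch. It assumes each bet is an independent win/lose trial with the same underlying strike rate, which racing will not honour exactly, and it uses a simple normal approximation. The function name and the 23% strike rate are only illustrative.

```python
import math

def strike_rate_interval(wins, bets, z=1.96):
    """Approximate 95% interval for a strike rate (normal approximation)."""
    p = wins / bets
    se = math.sqrt(p * (1 - p) / bets)  # standard error of the proportion
    return max(0.0, p - z * se), min(1.0, p + z * se)

# Same observed strike rate, different sample sizes.
for bets in (100, 1000, 10000):
    wins = round(0.23 * bets)
    low, high = strike_rate_interval(wins, bets)
    print(f"{bets:>6} bets: {low:.1%} to {high:.1%}")
```

On these assumptions, 100 bets leaves a 23% strike rate anywhere between roughly 15% and 31%, while 10,000 bets narrows it to about 22% to 24%. The catch is the one already made: the bigger sample reaches further back into conditions that may no longer apply.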

Dave

Sandhog · Apr 7, 2023

I tend to agree, davejb.

Personally, I very much doubt that I look back even as far as 100 races.
But, I'm more of a gambler than a stats man.

Punters have to decide for themselves, I think.

And, if I had some up-to-date information about a runner in an up-coming race, I doubt I'd look back at all.

In other words, I'm totally unable to give any expert recommendation to Organised chaos.
Maybe he should approach those professionals who produce a database?

Alien · Apr 7, 2023

mick said:
i would say be very cautious Re any Back fitted finds.

Hi Organised chaos.
Following up on the post by mick, how do you avoid backfitting?
Put your systems together for say the last five years prior to your test year. So use the data for 2017, 2018, 2019, 2020, 2021.
Then run it for 2022 - I reckon 90% of the time it will show a loss.
OK maybe it was a difficult year, rebuild the systems using the data for 2016, 2017, 2018, 2019, 2020.
Then run it for 2021.
And so on.
If the test year shows losses all the time, the system is no good and needs scrapping pronto.

That's how I do things anyway.

Sometimes I will just use the previous three years before the test year. This will no doubt alter qualifiers so it's all about finding the right combination - that's the tricky part.
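
The build-then-test routine above can be sketched in a few lines. This is only a toy under stated assumptions: the records are made up, `build_system` uses a deliberately crude rule (follow trainers who showed a profit in the build years), and the names `build_system` and `walk_forward` are hypothetical rather than taken from any real package.

```python
from collections import defaultdict

def build_system(bets, train_years):
    """Toy rule: pick trainers that showed a profit over the build years."""
    profit = defaultdict(float)
    for b in bets:
        if b["year"] in train_years:
            profit[b["trainer"]] += b["profit"]
    return {t for t, p in profit.items() if p > 0}

def walk_forward(bets, test_years, window=5):
    """For each test year, build on the preceding `window` years, then test."""
    results = {}
    for year in test_years:
        train_years = set(range(year - window, year))
        qualifiers = build_system(bets, train_years)
        # Profit in the unseen test year from qualifying trainers only.
        results[year] = sum(
            b["profit"] for b in bets
            if b["year"] == year and b["trainer"] in qualifiers
        )
    return results

# Made-up records: profit is the return to a 1 point stake on each bet.
bets = [
    {"year": 2019, "trainer": "A", "profit": 4.0},
    {"year": 2020, "trainer": "A", "profit": -1.0},
    {"year": 2020, "trainer": "B", "profit": -1.0},
    {"year": 2021, "trainer": "A", "profit": -1.0},
    {"year": 2021, "trainer": "B", "profit": 6.0},
    {"year": 2022, "trainer": "A", "profit": -1.0},
    {"year": 2022, "trainer": "B", "profit": -1.0},
]
results = walk_forward(bets, test_years=[2021, 2022], window=2)
print(results)  # {2021: -1.0, 2022: -1.0}
```

In this made-up data each trainer looks profitable in the build years and then loses in the test year, which is the pattern to watch for. Changing `window` from 5 to 3 corresponds to using only the previous three years.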

Good luck in your search for profit.

dave58 · Apr 7, 2023

Good post Alien.

Just for fun I came up with a system based on nothing in particular apart from random things that have been mentioned here on the forum over the years.

Betting on the UK flat in £10K+ races on a Tuesday afternoon, following 8 certain trainers, gives results since 2016 of a strike rate of 23% and an ROI at BFSP of 184%.

Sounds good, yes?

Am I going to follow it?

Of course not, it's absolute garbage! There's absolutely no logic behind it, it just goes to show how you can manipulate facts.

Lies, damned lies and statistics
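
A rough way to see how easily this happens is to simulate lots of arbitrary "systems" that have no edge at all and then pick the best one. The sketch below assumes every bet is at fair decimal odds of 5.0, so the true long-run ROI is zero. All the numbers (200 systems, 60 bets each) are made up for illustration and `simulate_roi` is a hypothetical helper.

```python
import random

def simulate_roi(n_bets, odds, rng):
    """ROI of level-stake bets on fairly priced runners (true edge is zero)."""
    win_prob = 1 / odds  # fair odds: expected profit per bet is 0
    profit = 0.0
    for _ in range(n_bets):
        profit += (odds - 1) if rng.random() < win_prob else -1.0
    return profit / n_bets

rng = random.Random(1)  # fixed seed so the run is repeatable
# 200 arbitrary "systems", each with 60 bets at decimal odds of 5.0.
rois = [simulate_roi(60, 5.0, rng) for _ in range(200)]
best = max(rois)
average = sum(rois) / len(rois)
print(f"average ROI {average:+.1%}, best ROI {best:+.1%}")
```

The average should sit near zero while the best of the bunch can look like a winning system, even though none of them has any logic behind it. The more filters you try, the better the best one will look.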

mick · Apr 8, 2023

The only back fitting I do uses my own past bets as the data source. I have detailed records for the past 30 yrs but tend to use the ten most recent yrs to allow for any more recent changes in racing or my own MO. An in-depth audit including race type, distance and even courses can prove helpful when evidencing my more current strengths and needs, but my back fitting is specific and mostly involves testing new negative filters, with the objective being to improve my S/R.

In truth this is a frustrating exercise which most often fails. The norm is that I will apply the filter where able to my past bets, and often things appear to be going well, with its use negating some of my losers, and then you find a 20/1 winner which would also have been swerved if you had used the filter! The other aspect to keep in mind is that in seeking the perfection you will never find, you can also end up backing nothing.
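
Here is a small sketch of that trap, using made-up bets at fractional odds and level 1 point stakes. The `flagged` field stands in for whatever a hypothetical negative filter would have dropped; none of these figures come from real records.

```python
def summarise(bets):
    """Return (strike rate, profit) for level 1 point stakes at fractional odds."""
    wins = sum(1 for b in bets if b["won"])
    profit = sum(b["odds"] if b["won"] else -1.0 for b in bets)
    return wins / len(bets), profit

# Made-up past bets; "flagged" marks bets the filter would have swerved.
past_bets = [
    {"odds": 3.0,  "won": True,  "flagged": False},
    {"odds": 4.0,  "won": False, "flagged": True},
    {"odds": 5.0,  "won": False, "flagged": True},
    {"odds": 2.0,  "won": True,  "flagged": False},
    {"odds": 6.0,  "won": False, "flagged": True},
    {"odds": 20.0, "won": True,  "flagged": True},   # the 20/1 winner
    {"odds": 3.0,  "won": False, "flagged": False},
    {"odds": 4.0,  "won": False, "flagged": False},
]
kept = [b for b in past_bets if not b["flagged"]]

before = summarise(past_bets)
after = summarise(kept)
print(f"before filter: S/R {before[0]:.1%}, profit {before[1]:+.1f} pts")
print(f"after filter:  S/R {after[0]:.1%}, profit {after[1]:+.1f} pts")
```

In this example the filter lifts the strike rate from 37.5% to 50% but cuts the profit from 20 points to 3, because the one big-priced winner went out with the losers. It is worth checking both figures, not just the S/R.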

Tbf, over the years an occasional find has proved worth the wait, and because all of my looking is done via the horses I have actually backed, this gives the additional confidence needed to implement the filter as part of my future process.

I do not use systems, but if I created one which appeared profitable off my back-fitted rules then I would need to justify with racing reasons why each of those rules worked and how it meshed with the others to provide the end product. During my 13 yrs on this forum dave58 is the only member I have seen make any mention of this, although apologies to any who have and I missed it.

What I have seen is some members make good use of HRB, but most often this involves them thinking about and even challenging what those numbers may be telling them.

Organised chaos · Apr 8, 2023

Thank you everyone.

Your posts have made fascinating and insightful reading.

Have a great bank holiday.

Best wishes,

Paul
