
Early days

davejb

Mare
After faffing about with 'Class Figures' and the like I gave up fairly quickly. So, way back long ago I used to spread the results section of the Sporting Chronicle Handicap Book (the RP Weekender does the same thing these days of course) and handicap 3 mile+ chasers by dint of a great deal of cross referencing of form lines. It wasn't too bad actually, lots and lots of work but I didn't do too badly - so having given up on Class Figures I decided to resurrect my earlier (1970s) methods but get my computer to do the hard and boring bits.
So I've been quite busy, I had to teach myself to program again - must be the 8th or 9th language I've learned to code in my usual brutish fashion...Python this time.

Anyhow, the attached is for info, it's an early work in progress and some of the ratings (very much based on the time factor/distance beaten/average OR at the meeting method) can get a bit screwy - I had a horse rated 759 last week, must have been Frankel sellotaped to a couple of Secretariats, so please don't do anything silly... I'm putting this up for interest and to show I didn't just join, waffle for a week, then give up!

As far as I can tell I've got the form lines on the card to behave - occasional spelling variations in horse names had previously caused some runs to be ignored (another reason to apply liberal doses of salt here). My aim was to produce a race card with the headline info for each race, a block of form for each runner below, and to automatically highlight 'good' bits. For previous runs over the same distance as today, a horse that won will have *** by the distance (so 6*** denotes a win over 6f, which is today's distance), while a "+" symbol denotes a place. The same markers apply to a win/place on the same going, in the same or a higher class of race, and with the same jockey on board as today.
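A minimal sketch of that highlighting rule for the distance column, in Python. The function and argument names are hypothetical, and treating a "place" as finishing 2nd or 3rd regardless of field size is an assumption made for illustration, not the card compiler's actual rule.

```python
def mark_distance(run_distance_f, position, today_distance_f):
    """Return the distance column text for one past run.

    Assumption: a 'place' means finishing 2nd or 3rd, whatever the field size.
    """
    text = f"{run_distance_f:g}"
    if run_distance_f != today_distance_f:
        return text          # only runs over today's distance are flagged
    if position == 1:
        return text + "***"  # win over today's distance
    if position in (2, 3):
        return text + "+"    # place over today's distance
    return text


print(mark_distance(6, 1, 6))  # 6***
print(mark_distance(6, 3, 6))  # 6+
print(mark_distance(7, 1, 6))  # 7
```

The same shape of check would work for going, class and jockey by swapping the field being compared.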

As and when I can get a window style interface going maybe I'll make it prettier, for now it's all CSV so you can use Excel or similar to view stuff and justify it etc....like I said, it really is a work in progress. I'm hoping to get this into a friendlier process and layout, then start work on a sort of trainers' formbook. The data is from a couple of sources I've mangled together, jumps data only goes back to 1 April, Flat to 1 April 2016.... even the jumps data produces a fair old stack of stuff. Currently there'll be a few on flat/jumps cards that have run on both codes, I'll be updating my programs (there are 5 or so that do different bits of the whole job in turn) to make flat and NH data a bit easier to spot.... generally speaking anything running over 14f on the flat that has a 3m 2f form line for Uttoxeter should be regarded as possibly not showing 100% flat-only form lines.

Anyhow, feel free to take a look - and apologies if there's anything really odd in these cards that I've not programmed a trapping routine to catch.

Oh, I forgot to mention that jumps and flat are both rated with respect to 9-0, ie 126 lbs, so ratings in form lines show the rating earned based on that weight - the best rating of the past 200 days is used as the rating in the upper '1 line per runner info' section, and this rating is adjusted for the weight carried. WFA is not used.
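As a rough sketch of that headline figure in Python: take the best 9-0 rating from the last 200 days, then adjust for today's weight. The post does not say how the weight adjustment is made, so the scale (one rating point per lb) and the direction (more weight lowers the figure) are assumptions, as are the function name and the sample data.

```python
from datetime import date, timedelta

BASE_WEIGHT_LB = 126  # 9-0, the weight all form-line ratings are expressed at


def headline_rating(runs, today, weight_today_lb, window_days=200):
    """Best rating from the last `window_days`, adjusted for today's weight.

    `runs` is a list of (run_date, rating_at_9_0) tuples.
    Assumption: one rating point equals 1 lb, and extra weight lowers the figure.
    Returns None when no run falls inside the window.
    """
    cutoff = today - timedelta(days=window_days)
    recent = [rating for run_date, rating in runs if run_date >= cutoff]
    if not recent:
        return None
    return max(recent) - (weight_today_lb - BASE_WEIGHT_LB)


# Made-up runs: the 95 is older than 200 days, so only the 80 counts.
runs = [(date(2017, 6, 1), 80), (date(2016, 11, 1), 95)]
print(headline_rating(runs, date(2017, 7, 1), weight_today_lb=130))  # 76
```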
 


davejb

Mare
Cheers,
no doubt my stuff will change loads over the next few weeks (after all, this was a blank sheet 5 or 6 weeks ago) but I think I'm making some progress - I'm trying to get the computer to do all the time consuming stuff so I can concentrate on looking at the data I want, in the format I want it. Even when it's all working it isn't meant to be an automatic selection system, the idea is to find top rated/2nd top rated runners that are backed up by having form that shows they can do the distance, are in a sensible class, can handle the going and have a jockey on board who knows the horse.

Okay, here's tomorrow's cards - I can churn these out pretty quickly but I don't want to be a problem by uploading tons of stuff nobody wants, so if this is any sort of a problem let me know....

Changes to the information supplied - to make it easier to spot things like form lines from NH racing (just in case 3m 2f at Uttoxeter doesn't suggest NH racing) I've added a code to the end of each form line, they're pretty obvious - Flat, AW for a run on AW, for the jumps it's NH C for a chase, NH H for hurdles, and NH F for a bumper. After the number of runners I've added 'Hcap' where applicable to make handicaps a bit easier to spot, as the race name tends to disappear in Excel, being (usually) really lengthy these days. The ratings are now in two columns to show which are on turf, which are on AW.
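For illustration, the form line codes could be produced by something like the Python below. The codes are the ones listed above; the function and its argument names are hypothetical and say nothing about how the real data is stored.

```python
def race_code(is_jumps, surface="turf", jump_type=None):
    """Return the short code appended to a form line.

    The argument names are hypothetical; the codes are the ones listed above.
    """
    if is_jumps:
        suffix = {"chase": "C", "hurdle": "H", "bumper": "F"}[jump_type]
        return f"NH {suffix}"
    return "AW" if surface == "aw" else "Flat"


print(race_code(False))                        # Flat
print(race_code(False, surface="aw"))          # AW
print(race_code(True, jump_type="hurdle"))     # NH H
```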

There's a new bit at the end of the racecard style lines (ie the upper part of each race's info) under the heading 'Lark' - this is (fingers crossed) for those who like systems, or who think the combination of form info and a bit of input from a system might be beneficial - Lark is the rating calculation for each horse using the larkspur system, from Tony Flannagan's 'The Larkspur Method' (Raceform, 2014) - instead of using RP ratings etc it uses the program's own ratings however. (Chiefly because the only way to add RP ratings would be for me to hand crank them in, that may change but to be honest the results from using my own ratings seem to have been okay in the minimal amount of testing I carried out so far).

Have fun,
Dave
 


davejb

Mare
Damn bugs - been squashing them all day....there's probably a couple left so don't stick the mortgage money on anything.
Files for tomorrow below. A new file called simply "ratings.csv" has a list of all racecard races for all tracks, and lists the top two rated on my ratings alongside the top two for the Larkspur ratings, with a little info about the race (course, time, class, distance, hcap or non hcap) and the runners - for them it's just headgear check and C, D, or CD info.

Due to bug squashing my top 2 list (ratings.csv) for today has changed a bit, and to simplify things it takes the last 6 ratings of the horse regardless of how far back that might be - so it gave Exclusive Waters top rating at Pontefract in the 1700 race..... coming in at 33/1..... now if ONLY there'd been a big flashing light and a discreet siren to tell me to actually back that one!

I don't want to pull ratings in from too far back, but Exclusive Waters' high score is from a run at Sha Tin on 27 Nov last year, and there's been one run on the AW in Feb since. So I'm trying to decide whether to stick with my current 200 day limit on ratings used, change that figure, or go with 'last x runs' - any thoughts? I think it's safe to ignore this one as a bit of a fluke, so I'm not looking to tweak stuff to accommodate random rare events like this winner today, but how far back is it reasonable to consider a rating as reliable?
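To make the two options concrete, here is a small Python comparison of a day-based cut-off against a 'last x runs' rule. The dates and rating values are made up to mimic the situation described (one big figure from long ago, one modest recent run); they are not the horse's real figures.

```python
from datetime import date, timedelta


def ratings_in_window(runs, today, days=200):
    """Ratings from runs no older than `days` before `today`."""
    cutoff = today - timedelta(days=days)
    return [rating for run_date, rating in runs if run_date >= cutoff]


def ratings_last_n(runs, n=6):
    """Ratings from the most recent `n` runs, however old they are."""
    newest_first = sorted(runs, key=lambda run: run[0], reverse=True)
    return [rating for _, rating in newest_first[:n]]


# Hypothetical horse: a high figure in November, a lower one in February.
runs = [(date(2016, 11, 27), 110), (date(2017, 2, 10), 70)]
today = date(2017, 7, 3)

print(max(ratings_in_window(runs, today)))  # 70: the November run is too old
print(max(ratings_last_n(runs)))            # 110: age is ignored
```

The window rule drops the old figure automatically; the 'last x runs' rule keeps it unless the horse has run often enough since.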

Answers on a postcard....

I can produce these cards with little effort (barring the programming that continues of course), so it's really not hard to cover every meeting as I have. But how about if I go with a couple of cards and the top ratings listing from now on, just to save filling the entire server to overflowing inside a week? (Or would folks rather I kept doing the lot?)
 


ArkRoyal

Administrator
Is there any particular reason, apart from it being easy to do, that you're calculating Larkspur ratings?
I found them to perform very poorly when I tested them a few years back.
 

davejb

Mare
Hi,
I've seen rather a lot of systems, I prefer form or speed ratings as a basis for picking what to bet on. The aim here is to pick horses using the ratings, then use the filters to reduce the number that are actually chosen as bets for the day. The filtration process is to look at class, going, distance to make sure the horse is suitably placed to reproduce their better ratings, and I don't like to see an unknown jockey on board. It occurred to me that I might use a simple system as an added filter, so I added it to the end to see if perhaps finding a horse heading both sets of ratings was an improvement. I'm collecting results to decide if it should stay in or not.

I also intend to add information on trainers, where the trainer has a significant performance in similar races, and to improve the ratings if I can..... it's the nature of speed ratings to bounce around a bit, and I don't intend to do anything to alter that, but there are times when things like rail movements have at least the potential to affect the result and I want to look at how to account for that without making the ratings too much of a guessing game.
Dave
 

davejb

Mare
Hi,
yes I've read through that - I read a fair number of threads/posts on here when I joined, then I went quiet for a good think before starting my program. One of the things I really like on here is the quality, which is why I kept quiet for a couple of months...nothing to say worth listening to!

I have standard times in an Excel sheet, which my program accesses to work out the speed figure. As railmove info is available, I am looking to add extra lines to my program so that an extra fraction of the standard time is added for a race run over longer than the official distance. The trick will be to make the recalculated standard time a more reliable yardstick than the current 'sometimes not quite over the correct distance' version as supplied by the RP results page.
That's a bit different to @TheBluesBrother's way of handling it, but I suspect we'll end up with very similar figures....

It's on the 'to do' list for the near future, at the moment I'm adding the trainer info and looking to have the upper section of race info in my cards a bit more useful by flagging up the class/going/distance stuff I'm looking for.... not to bypass form reading, but to highlight races and runners that are likely to provide a better return on time invested.

I'd be embarrassed at this stage to bother anyone considered an expert frankly, my skill is more in bludgeoning the PC into doing some of the boring bits - via Excel I got speed figure compilation down to maybe an hour or so a day, typing the results into RI and the RP site (just in case) took another half hour or so, it now takes about 20 seconds to compile them all and store them in a database on my PC, and the PC makes fewer mistakes than I did using Excel!

Dave
 
@davejb

I don't have any programming skills, if I did, scraping data from the Racing Post website wouldn't be an option for me, due to the excessive rail movements and incorrect times that pop up daily,
for example from the last meeting at 4:05 Uttoxeter – 3m 47.7s, the Racing Post had the time at 3m 57.0s, 10 seconds out...

Then you have the rail movements, an example from Saturday's meeting at Chester:

Race 1- 7f 140y(+13y)
Race 2- 5f 26y(+11y)
Race 3- 7f 14y(+13y)
Race 4- 5f 26y(+11y)
Race 5- 1m 4f 83y(+20y)
Race 6- 5f 26y(+11y)
Race 7- 2m 2y(+26y)

The worst racecourse for excessive rail movements is at Fakenham, where they have rail movements of anything up to +257yds (17.13s).

The standard times list that I maintain is updated most days, especially in Ireland, where their race distances are just works of fiction, the worst racecourse being Tramore, I just recently
asked the clerk of the course, when did you last measure your race distances?

Mike.
 

davejb

Mare
Wednesday -
I've done cards for Bath, Perth (for the NH fans) and Thirsk, the ratings file covers all the meetings, if anybody particularly wants all meetings (I can do a single file with all the meetings on as well) I can do that, just say so, it only takes a minute or so per meeting.

As always this is work in progress, the trainer addition is coming along okay and when that's done I'll look at trying to improve the ratings.

Dave
@davejb

I don't have any programming skills, if I did, scraping data from the Racing Post website wouldn't be an option for me, due to the excessive rail movements and incorrect times that pop up daily,
for example from the last meeting at 4:05 Uttoxeter – 3m 47.7s, the Racing Post had the time at 3m 57.0s, 10 seconds out...

...

The standard times list that I maintain is updated most days, especially in Ireland, where their race distances are just works of fiction, the worst racecourse being Tramore, I just recently
asked the clerk of the course, when did you last measure your race distances?

Mike.
My scraper - so far - has been aimed at the Timeform results page. Originally it was easier to grab RP results to feed into Excel, but moving to a program of my own devising allows me to tailor things to other sources if I think they are likely to be more reliable.

I notice for the race at Uttoxeter RP and ATR are reporting 3:47.7, Timeform 3:48.7 currently. I've seen some fairly horrendous rail movements too, which is why I'm looking to do some work on this area... could I ask what you consider to be the best source for actual racetimes, and how are you deciding on a standard time for a course and distance? For example are you calculating your own version, or are you obtaining them from elsewhere?

For the railmove calculation I was thinking of doing it by multiplying the standard time over the official race distance by (actual distance covered divided by published race distance), so if a 5f race actually ran over 5.5f and the 5f standard time was 60s, I'd increase that race's standard time to 66s to compensate for the extra 10% distance actually covered.
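The proportional idea above can be written as a short Python function. The name `scaled_standard_time` and the choice to work in yards are assumptions for illustration; 220 yards to the furlong is the only constant used.

```python
YARDS_PER_FURLONG = 220


def scaled_standard_time(standard_s, official_yards, rail_move_yards):
    """Scale a standard time by (actual distance / published distance)."""
    actual_yards = official_yards + rail_move_yards
    return standard_s * actual_yards / official_yards


# The example above: a 5f race actually run over 5.5f, 60s standard time.
official = 5 * YARDS_PER_FURLONG   # 1100 yards
extra = 0.5 * YARDS_PER_FURLONG    # 110 yards added by the rail movement
print(scaled_standard_time(60.0, official, extra))  # 66.0
```

Note that this scales with the pace implied by the standard time itself, so the correction per yard differs from race to race.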

Obviously speed ratings require good data to start with, which is why I'm still programming rather than betting right now, but I would like to make a decent job of it!

Thanks for the info, I've read your stuff avidly!
Dave
 


For the railmove calculation
It is very simple to do, just divide the rail movement by 15: a rail movement of +13yds = 0.87s, which you subtract; if the rail movement is -13yds you add 0.87s.
Or you can use the 14.0s ratio Excel table (see below) that Dave Edwards asked me to do for him, very similar to my 15 method.
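In Python the "15 method" comes down to one division. This is only a sketch: the function names are hypothetical, and applying the correction directly to the recorded race time is an assumption about where it is used.

```python
def rail_adjustment_seconds(rail_move_yards, divisor=15.0):
    """Signed correction in seconds: negative means subtract, positive means add.

    +13 yards of extra ground gives -0.87s; -13 yards gives +0.87s.
    """
    return -rail_move_yards / divisor


def adjusted_time(race_time_s, rail_move_yards):
    """Assumption: the correction is applied to the recorded race time."""
    return race_time_s + rail_adjustment_seconds(rail_move_yards)


print(round(rail_adjustment_seconds(13), 2))   # -0.87
print(round(rail_adjustment_seconds(-13), 2))  # 0.87
```

The divisor is left as a parameter so a different yards-per-second figure can be tried without changing the code.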

For the results carry on using Timeform.

Sorry for the late reply, on Wednesdays I go fishing.

Mike.
 


davejb

Mare
Thanks a lot,
I do appreciate the help, yds/15 is fine for me!
I've added the trainer bits I wanted, again finding some of the web racing sites a little confusing when I tried to check my calcs were correct. Essentially the program steps through all runners since 1 April this year looking at the trainer's name, then adding up totals for number of runs, number of wins, places and so on... looking on RP and ATR (couldn't find similar tables on TF) they don't seem to give a start date for their counts, other than '1 year data' for RP (so, 365 calendar days from today, or from the official flat season start date in 2016...?) So, I'm going with my numbers until proven otherwise.

Trainer stats come after the Larkspur rating: 7 day results showing wins, places, runs and win percentage (this may vary a little from values on web sites/papers, as it's 7 days from the date I ran the card compiler, but it still gives the recent view of trainer form which is all I want from it), then win/place/run/win% since 1 April, then there are a few filtered sets:
AT - figures for wins, runs, win% for this age of horse running in the same type of race (handicap or non handicap) as the race being analysed, for this trainer (AT - age, type) since 1 April
AD - figures as AT but for matching age and distance, so if the race is a 6f race and the trainer has a 4 year old entered you see how their 4yos have done over 6f since 1 April
CD - as the above pair, but the paired factors are course and distance - how has the trainer done in other races here over this distance since 1 April this year
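The three filtered sets are all the same count with different pairs of conditions, which a sketch like the Python below makes explicit. The dictionary keys (`trainer`, `position`, `age`, `handicap`, `distance_f`, `course`), the trainer names and the sample rows are all hypothetical; the real database layout is not described in the post.

```python
def trainer_stats(runs, trainer, **filters):
    """Wins, runs and win% for one trainer, optionally filtered.

    `runs` is a list of dicts with hypothetical keys; each keyword filter
    must match the run exactly for it to be counted.
    """
    selected = [
        r for r in runs
        if r["trainer"] == trainer
        and all(r[key] == value for key, value in filters.items())
    ]
    wins = sum(1 for r in selected if r["position"] == 1)
    total = len(selected)
    win_pct = round(100 * wins / total, 1) if total else 0.0
    return wins, total, win_pct


# Made-up runs since 1 April.
runs = [
    {"trainer": "A Trainer", "position": 1, "age": 4, "handicap": True,
     "distance_f": 6, "course": "Thirsk"},
    {"trainer": "A Trainer", "position": 5, "age": 4, "handicap": True,
     "distance_f": 7, "course": "Bath"},
    {"trainer": "A Trainer", "position": 2, "age": 3, "handicap": False,
     "distance_f": 6, "course": "Thirsk"},
    {"trainer": "B Trainer", "position": 1, "age": 4, "handicap": True,
     "distance_f": 6, "course": "Thirsk"},
]

print(trainer_stats(runs, "A Trainer", age=4, handicap=True))         # AT: (1, 2, 50.0)
print(trainer_stats(runs, "A Trainer", age=4, distance_f=6))          # AD: (1, 1, 100.0)
print(trainer_stats(runs, "A Trainer", course="Thirsk", distance_f=6))  # CD: (1, 2, 50.0)
```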

As I said, I am almost certainly counting from a different start date, so you can expect to see slightly different trainer stats elsewhere - if I spot any glaring errors I'll fix them asap.

Next up is to improve the ratings, railmove here I come.

Late due to bug squashing, apologies for the delay if anyone is looking at this stuff - I did three meetings for tomorrow....
Dave
 


davejb

Mare
Cheers,
I didn't check against HRB but their figures are closer to mine, mine have Flat and AW combined and HRB totals for that are either the same as mine or perhaps 1 day different (my figures are from end of day 4th, HRB have the results for 5th in there as well it looks like which I still have to add). I can use HRB to double check, but it looks like it's me and HRB against the racing papers who I can only assume are counting from a different base date.... that's fine, as long as my stuff isn't make believe I'm happy to have a slightly different view to the RP ;-)

Thanks for the help, as I've built my own trainer database now it's not a problem but I didn't like to see the variations from RP/ATR, even though I was sure my data was correct - you've shown me the most reliable place to check! (For info I loaded my 'runners' database into Excel for a quick visual check, I have for example 66 lines of runners trained by Sir Mark Prescott and 11 of them show as having won - HRB has exactly the same).

Dave
 

davejb

Mare
I've programmed the railmove information in, and recalculated all ratings back to 1/4/16. That has changed a number of the top2 selections for today, so I'm posting the new version and will see how the new ratings compare to the old ones....only a very small sample of course, but having made the change it's worth comparing results. I thought I'd go back over the last few days too, but then realised that having the actual results of those races in my database might tend to skew the selections a bit...dohhhh!
(At least I realised it was a stupid idea before I actually did it....)
Dave
 


@davejb

I started adjusting my speed figures for rail movements back in Dec 2015, the difference it made to my figures was a real eye opener,
I came up with my "15 method" by playing about with linear regression in SPSS.

STRATFORD-ON-AVON/TUE 04 JUL

Rails: The 2m bend is shared, all others divided making the following alterations to advertised distances:

Races 1 & 6 (+42 yards) - subtract 2.8s
Race 2 (-6 yards) - add 0.4s
Races 3 & 5 (-24 yards) - add 1.6s
Race 4 (+72 yards) - subtract 4.8s

I make the adjustments in the Comparison per furlong column

Stratford.png

Note: I use my own standard times for Stratford as I found the Racing Post's standard time table to be 11.0s out.

Food for thought...

Mike.
 

davejb

Mare
Cheers,
I'm about to go out so will reply properly later, but one point (if it's not giving anything too precious away) how are you calculating your own standard time? I can think of a few different ways I'd consider doing it. Many years ago I handicapped the local greyhounds by comparing winner times against my own calculation of standard - as far as I remember I took the first three home in each race for a given distance (at Blackpool, I seem to recall, that was 460yds) and averaged times out using 0.06s/length. When I started the program up I considered taking 5 years of times and doing a very similar thing with the first three home.

Your ratings method is the one I'm using, with the HRB daily results file for the data - I want to scrape Timeform as a comparison/backup, but I have to learn how to do it first...

I'll get back to you later, curry and a pint await!
Dave
 
Many years ago I handicapped the local greyhounds by comparing winner times against my own calculation of standard
That is exactly how I started, back in the early 90's I was system admin for Swindon greyhounds and started doing greyhound ratings.

The problem with compiling standard times in this country is the rail movements, you have racecourses like Uttoxeter who very rarely run to the re-measured BHA distances, imagine
trying to work out a set of standard times for Fakenham where the race distance can be increased by +257yds.

I have simplified the way I compile standard times, and then check them using linear curve regression using SPSS.

25 samples - 15th percentile.
20 samples - 20th percentile
15 samples - 25th percentile
10 samples - 30th percentile

Try and ignore times achieved on fast surfaces like firm going.

Lingfield AW 5f

56.67
56.77
56.80
56.95
57.12
57.14
57.15
57.32
57.40
57.40

Statistics
N            Valid     10
             Missing   0
Percentiles  30th      56.8450 (56.9s)

So using only 10 samples the 30th percentile of the data is 56.9s, the Racing Post's standard time for the 5f AW is 57.10s.
When selecting a group of data on the AW, you have to take into account the different variations of surface speed.
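The percentile step can be reproduced without SPSS. The Python below is a sketch that assumes the (n + 1) × p weighted-average definition of a percentile; that choice is an assumption, but it does give 56.8450 for the ten Lingfield times listed above. The handling of sample sizes that fall between the listed bands is also an assumption.

```python
def percentile_for_sample_size(n):
    """Percentile to use for a given number of sample times (table above).

    Assumption: sizes between the listed ones use the next band down,
    and fewer than 10 samples is treated as too few.
    """
    for size, pct in ((25, 15), (20, 20), (15, 25), (10, 30)):
        if n >= size:
            return pct
    raise ValueError("need at least 10 sample times")


def percentile(values, pct):
    """Percentile by the (n + 1) * p weighted-average definition."""
    ordered = sorted(values)
    position = (len(ordered) + 1) * pct / 100
    lower = int(position)            # 1-based rank of the lower neighbour
    fraction = position - lower
    if lower < 1:
        return ordered[0]
    if lower >= len(ordered):
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


times = [56.67, 56.77, 56.80, 56.95, 57.12, 57.14, 57.15, 57.32, 57.40, 57.40]
pct = percentile_for_sample_size(len(times))   # 30 for ten samples
print(round(percentile(times, pct), 4))        # 56.845
```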

AW going allowance table:
Fast +0.50s/f
Stand/Fast +0.18s/f to +0.40s/f
Standard -0.15s/f to +0.15s/f
Stand/Slow -0.48s/f to -0.18s/f
Slow -0.70s/f to -0.50s/f
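As a lookup, the table might be coded like this in Python. The bands are copied from the table; reading "Fast" as +0.50s/f or more is an assumption, and because the printed bands leave small gaps (for example between +0.15 and +0.18) the sketch returns `None` for values that fall in one rather than guessing.

```python
AW_BANDS = [
    # (label, low s/f, high s/f), copied from the table above
    ("Fast", 0.50, float("inf")),   # assumption: +0.50s/f or more
    ("Stand/Fast", 0.18, 0.40),
    ("Standard", -0.15, 0.15),
    ("Stand/Slow", -0.48, -0.18),
    ("Slow", -0.70, -0.50),
]


def aw_going(allowance_s_per_f):
    """Return the going label for an allowance in seconds per furlong."""
    for label, low, high in AW_BANDS:
        if low <= allowance_s_per_f <= high:
            return label
    return None  # value falls in a gap between bands or off the scale


print(aw_going(0.0))    # Standard
print(aw_going(0.25))   # Stand/Fast
print(aw_going(0.16))   # None
```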

Mike.



 


davejb

Mare
I think SPSS might be a little pricey for me - however I could imagine an alternative method of deriving this so I'll give it a try and see what I come up with.... I was really just looking to confirm that you were calculating your own version from raw time data. Well, time to go play with numbers again,
Dave
 

davejb

Mare
The pace of development is about to slow right down, as I've got the basic program(s) doing most of what I want - changes now are likely to be more a tweak as I take some time to sit back and think hard about improving the ratings. Just in case anyone is watching the figures going past I'll keep posting a couple of cards and the 'all meetings' ratings top 2 list, for what it's worth today's list (as posted above) ended up with 49 top rated (in one race both top and second top were withdrawn) runners and a return to 1pt win of somewhere in the region of 75 pts, so whilst I wouldn't dream of betting on these blind it is encouraging to see something other than a 49 pt loss! (greatly helped by a 20/1, a 14/1, and a 9/1 - the returns without them look rather worse!)
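For anyone wanting to check a level-stakes figure like that, a small Python sketch follows. The results list is made up, and odds are taken as the fractional price expressed as a number (20/1 is 20.0), so a winner returns the stake times (odds + 1).

```python
def level_stakes(results, stake=1.0):
    """Total returned and profit for level-stake win bets.

    `results` is a list of (won, fractional_odds) pairs, e.g. (True, 20.0)
    for a 20/1 winner.
    """
    staked = stake * len(results)
    returned = sum(stake * (odds + 1) for won, odds in results if won)
    return returned, returned - staked


# Made-up day: four 1pt bets, a 20/1 winner and a 9/1 winner.
results = [(True, 20.0), (False, 5.0), (True, 9.0), (False, 2.0)]
print(level_stakes(results))  # (31.0, 27.0)
```

Dropping the biggest-priced winner from the list shows quickly how much a total depends on one or two results.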

Attached are two versions of the card for Sandown and Doncaster tomorrow - there's a card like I've been posting, then the second card for each named sandown2 and Doncaster2 - these 2nd versions are more the sort of thing I am thinking of using, they show the races that runners have won or placed in only, making it a bit simpler to spot trends in the returns and spot whether a horse has run well on a particular going/distance/class/jockey basis. My master plan is to use the ratings to pick potential bets, then use the reduced form display to decide if the horse is placed to have a chance of winning.

I've amended the ratings - the full form card ratings were ignoring figures older than 200 days while the top2 ratings list took the last 6 ratings regardless of age. One or two recent runners have emerged from the mists of time with top ratings, and I don't think it's a great idea to rely on a figure from 18 months back - so now the ratings list is, like the main cards, looking back 200 days for rating values.

Dave
 
