Well, here's day 3 of the project.
(DAY 2 Southwell Results and Summary Updated)
I find when using Bayesian and other tissue calculator variants that if the top rated were poor value, I'd tend not to lay them, but rather look for other, better-value runners at a fair price, or leave the race alone altogether. (Just a personal view.)
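To put rough numbers on that, here's a minimal sketch of the check I mean, assuming your tissue gives each runner a win probability. The function names, the 5% edge threshold and the prices are all made up for illustration, not anyone's actual method.

```python
def fair_odds(probability: float) -> float:
    """Decimal odds at which a back bet with this win probability breaks even."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in (0, 1]")
    return 1.0 / probability


def value_edge(probability: float, market_odds: float) -> float:
    """Expected profit per unit staked on a back bet: p * odds - 1."""
    return probability * market_odds - 1.0


def shortlist(runners: dict, min_edge: float = 0.05) -> list:
    """Runners whose edge meets the threshold, best first.

    `runners` maps name -> (tissue probability, market decimal odds).
    An empty list means: leave the race alone.
    The 0.05 threshold is an arbitrary assumption.
    """
    edges = {name: value_edge(p, odds) for name, (p, odds) in runners.items()}
    keep = [name for name, edge in edges.items() if edge >= min_edge]
    return sorted(keep, key=lambda name: edges[name], reverse=True)


# Hypothetical race: the top rated (Runner A) is too short in the market.
race = {
    "Runner A": (0.40, 2.2),  # fair odds 2.5, so 2.2 is poor value
    "Runner B": (0.25, 5.0),  # fair odds 4.0, so 5.0 is value
    "Runner C": (0.20, 4.5),
    "Runner D": (0.15, 6.0),
}
print(shortlist(race))  # -> ['Runner B']
```

Note it never turns the poor-value top rated into a lay. It only looks for a back at a fair price or returns nothing.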
There will be plenty of races where the ratings show a poorly rated runner priced up as favourite. A further check on the data sheet can show where the weaknesses in its profile are, and then a decision can be made. I also focus on smaller fields, max 10 runners. The overrounds are smaller, there's less excuse for getting boxed in during a race, and in some cases the draw has less impact.
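If you want to check the overround on a race yourself, it's just the sum of the implied probabilities (1 / decimal odds) minus 1. A quick sketch, with hypothetical function names and whatever prices you feed in:

```python
def overround(decimal_odds: list) -> float:
    """Book margin as a fraction: sum of implied probabilities minus 1.

    0.0 is a perfectly fair book; 0.15 is a 115% book.
    """
    if any(o <= 1.0 for o in decimal_odds):
        raise ValueError("decimal odds must be greater than 1.0")
    return sum(1.0 / o for o in decimal_odds) - 1.0


def implied_probabilities(decimal_odds: list) -> list:
    """Market probabilities with the overround stripped out proportionally.

    Proportional scaling is the simplest assumption; other methods exist.
    """
    raw = [1.0 / o for o in decimal_odds]
    total = sum(raw)
    return [p / total for p in raw]


# Made-up prices for a three-runner race.
prices = [2.0, 2.0, 4.0]
print(f"overround: {overround(prices):.1%}")
print([round(p, 3) for p in implied_probabilities(prices)])
```

Running it on a few small and big fields from your own results would show whether the smaller overround holds on the tracks you follow.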
Your audit (I like that) @ Southwell summarised how well your data sheet found backs and lays. At this time of year results can be inconsistent, as there are cheeky unexposed 3-year-olds improving and wrecking handicap calculations.
One area where I seem to second-guess myself is the draw/pace angles. When you get it to the stage (you might already be there) where your data sheet consistently identifies where the pace will be, and the relevant draw bias, that alone will be some sort of edge.
The pedigree/breeding is another one that not many (me included) have really got on top of, but if you get consistency with that data, that's an edge right there too.
There's also a saying, "Value is in the eye of the beholder" (if not, there should be). How you produce your own rating, or any other one that is used, will be the key mid-to-long term. You have to give your own rating(s) a fair test. If you find something that works better on the all-weather but not on the turf, then re-testing that on previous data is worth doing before making an actual change.
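For the re-testing, something as simple as level-stakes ROI split by surface on your past selections would do as a first pass. This is only a sketch: the record layout (surface, odds, won) and the sample bets are invented, and it ignores commission.

```python
from collections import defaultdict


def roi_by_surface(bets: list) -> dict:
    """Level-stakes back ROI per surface.

    Each bet is a dict with hypothetical keys:
    'surface' (str), 'odds' (decimal), 'won' (bool).
    One unit staked per bet, commission ignored.
    """
    totals = defaultdict(lambda: {"bets": 0, "profit": 0.0})
    for bet in bets:
        t = totals[bet["surface"]]
        t["bets"] += 1
        t["profit"] += (bet["odds"] - 1.0) if bet["won"] else -1.0
    return {
        surface: {"bets": t["bets"], "roi": t["profit"] / t["bets"]}
        for surface, t in totals.items()
    }


# Invented history, just to show the shape of the output.
history = [
    {"surface": "AW", "odds": 3.0, "won": True},
    {"surface": "AW", "odds": 4.0, "won": False},
    {"surface": "AW", "odds": 5.0, "won": False},
    {"surface": "Turf", "odds": 3.0, "won": False},
    {"surface": "Turf", "odds": 2.0, "won": False},
]
for surface, stats in roi_by_surface(history).items():
    print(surface, stats["bets"], f"{stats['roi']:.1%}")
```

With a handful of bets per surface the numbers mean very little, so it's the bet count next to each ROI that tells you whether a difference is worth acting on.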
For me, I enjoy the process and try to learn new things, be it in Excel, GPT or sports. Some things take longer than others, but I've learned not to be hard on myself if I get it wrong from time to time.