Just to add to this, as I see I've posted a very similar question without reading back through the comments (apologies).
I think chi and A/E help a lot, but they don't fully account for selection bias. I have a number of systems in HRB that are very profitable over 10 years but (barring luck) would probably have made a loss in 2021. These are mostly ones that rely on picking individual humans/horses that have done well (e.g. trainers/jockeys/stallions) rather than on concepts (e.g. ratings/CD winners). Even with a good chi and A/E score, there's insufficient reason to think the same criteria and qualifiers would perform the same way the following year.
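For anyone who wants to sanity-check those figures themselves, here's a minimal sketch of how A/E and a simple chi-square statistic can be computed from a list of bets, taking each selection's expected win probability as 1 / decimal odds. The odds and results below are made up purely for illustration, and this is one common way of defining these stats rather than necessarily HRB's exact method:

```python
# Each bet is (decimal_odds, won). Expected win probability is
# assumed to be 1 / decimal_odds (i.e. the market's implied chance).

def a_over_e(bets):
    """A/E = actual winners divided by expected winners."""
    actual = sum(1 for odds, won in bets if won)
    expected = sum(1 / odds for odds, won in bets)
    return actual / expected

def chi_square(bets):
    """Simple chi-square comparing actual vs expected counts of
    winners and losers across the whole sample."""
    n = len(bets)
    actual = sum(1 for _, won in bets if won)
    expected = sum(1 / odds for odds, _ in bets)
    return ((actual - expected) ** 2) / expected + \
           (((n - actual) - (n - expected)) ** 2) / (n - expected)

# Hypothetical sample: five bets, two winners.
bets = [(4.0, True), (6.0, False), (3.0, True), (10.0, False), (5.0, False)]
print(round(a_over_e(bets), 2))   # A/E above 1.0 means beating the market
print(round(chi_square(bets), 2))
```

An A/E well above 1.0 on a sample this small means very little, which is really the point of the post: the stat can look great in-sample and still tell you nothing about next year.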
As an example, I ran with an idea that I think has some mileage. I entered the criteria, along with some basic control variables, and broke it down by the trainers who were most successful with the idea over 2011-2019. The results below for 2011-2019 were very promising: two bets a week, good strike rate, good ROI, good A/E, good chi, etc.
Testing that same system on 2020 gives the results below. Very poor by comparison, and certainly not reliable.
My point is that backtesting in some shape or form is the best check I can think of (short of proofing your system into the future). Run your system and its criteria to generate qualifiers as if it were a year ago, and see how it would have performed in 2020. HRB allows you to do that.
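To make that out-of-sample idea concrete, here's a rough sketch outside HRB: pick "profitable" trainers using only pre-2020 records, then measure ROI on their 2020 qualifiers alone. All records and field names here are hypothetical, and the "system" (any trainer profitable in the training window) is deliberately crude to show the mechanics:

```python
# Hypothetical bet records: year, trainer, decimal odds, result.
history = [
    {"year": 2015, "trainer": "A", "odds": 5.0, "won": True},
    {"year": 2016, "trainer": "A", "odds": 4.0, "won": True},
    {"year": 2017, "trainer": "B", "odds": 6.0, "won": False},
    {"year": 2020, "trainer": "A", "odds": 5.0, "won": False},
    {"year": 2020, "trainer": "A", "odds": 3.0, "won": True},
]

def roi(bets):
    """Return on investment for level 1-unit stakes at decimal odds."""
    returns = sum(b["odds"] for b in bets if b["won"])
    staked = len(bets)
    return (returns - staked) / staked

# Split: build the system on pre-2020 data only, then test on 2020.
train = [b for b in history if b["year"] < 2020]
test = [b for b in history if b["year"] == 2020]

# Crude "system": trainers profitable in the training window.
profitable = {t for t in {b["trainer"] for b in train}
              if roi([b for b in train if b["trainer"] == t]) > 0}

out_of_sample = [b for b in test if b["trainer"] in profitable]
print(round(roi(out_of_sample), 2))
```

The key design point is that the 2020 rows never influence which trainers get selected, so the final ROI figure is an honest (if noisy) estimate of what the system would actually have done, which is exactly what the in-sample 2011-2019 figures are not.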