• Hi Guest Just in case you were not aware I wanted to highlight that you can now get a 20% discount on Inform Racing.
    Simply enter the coupon code ukbettingform when subscribing here.
    We have a lot of members who are existing users of Inform Racing so help is always available if needed.
    Best Wishes
    AR
  • Hi Guest Just in case you were not aware I wanted to highlight that you can now get a free 7 day trial of Horseracebase here.
    We have a lot of members who are existing users of Horseracebase so help is always available if needed, as well as dedicated section of the fourm here.
    Best Wishes
    AR

Any Geeks, Software Developers here ?

MrDamon

Colt
Wondering how many of us dabble in software development and how many use it to enhance their sports betting ability.

I use vb.net and have done for years. Not sure where I would be without it now to be honest. Depend on code for so many things..
 
I need to bite the bullet and move to a DB rather than huge unweildly spreadsheets.
I would love some pointers, and I am currently wrestling with how many tables to use.

AR
 
I'm with you there AR.. Excel all the way.. I'm daunted by DB Queries etc but I have been told time and again that I will never look back once i make the switch..
 
I also know that Ray (rpjd99) and Jason (jrob09) do their own code as well.
Perhaps I should create a new section on the forum?
 
MS SQL Server is a heavyweight professional db that I used as a developer in my last job for many years. I now use the Developer's Edition of SQL Server 2008, which cost me £50 and is fully-featured, crucially including SSIS (Integration Services) which allows the creation of reusable packages to import and manipulate data from external sources (in my case, on-line flat files). The only restriction is to use in a development or test environment, ie not as the back end of a live website. The current Developer edition is 2012 and is the same price.

If you're not intending to hold millions of rows of data, MS Access should also do the job, though others here are probably better versed in its use than I am.

Naturally, both can be queried using SQL, and Access has its own graphic query builder.


MrDamon said:
I'm daunted by DB Queries etc but I have been told time and again that I will never look back once i make the switch..

100% true. If you're good with Excel you'll find the jump to SQL an easy one, and it's massively time-saving.
 
arkroyal said:
I also know that Ray (rpjd99) and Jason (jrob09) do their own code as well.
Perhaps I should create a new section on the forum?

Might be an idea to give it a try and see if we can work together with ideas etc


To be honest, if I am to get my head around databases, I think I will need the For Dummies guidebook! I can grasp many things but the last few times I have looked at online examples of databases I have left confused..
 
A new section might be useful for people to get help with development/query problems: I work from home every day and am very willing to browse such a section regularly to see if there's any db or SQL help I can give.
 
jalfrezi said:
..... I now use the Developer's Edition of SQL Server 2008, which cost me £50 and is fully-featured, crucially including SSIS (Integration Services) which allows the creation of reusable packages to import and manipulate data from external sources (in my case, on-line flat files)....

Do you put your flat files into one table or do create seperate tables from the single flat file?
 
I first import the whole flat file into a simple table, then split the data out (using SQL) into separate tables.

A horse racing db might have these tables:

Horse
Jockey
Trainer
Course
Race
Horse_Race
etc

The tables are self-explanatory apart from the last, which is a table that links a row in Horse to a row in Race, so a row in Horse_Race would contain the primary key of the row in Horse in one column, and the primary key of the associated Race in a second column (when held in another table (Horse_Race), these keys are called Foreign Keys).
 
I'd really appreciate a SQL forum to exchange ideas.

I'm in the throes of getting my own head around all that MS SQL has to offer. Had been using VBA in Excel for years and was very comfortable with it. But then I decided to build a database of French racing ('cos I live there), as there's no decent one to be had here for love nor money. Courtesy of Paris-Turf (= Racing Post) and France Galop (= BHA + Jockey Club), I managed to get myself a 30-year database in a load of text files. I initially had a look at the Proform SQL code, but was finding it tough slogging to make full sense of it. I decided, therefore, to stick with VBA for Excel.

However a chance posting by arkroyal on the Proform forum ended up with us exchanging emails on the use of MS SQL and it reawakened my interest. (I thought he was still continuing with it, but it looks like he was off setting up excellent forums, or other silly nonsense like that. Tch, tch, tch...) I loaded up all of my text files into a SQL database using the same server as the Proform database on my desktop, using just two tables: a race table and a horse table. At first, each (simple) query was taking about 20 seconds to execute, but this worsened to about 40 seconds and remained at that. I was disappointed, but decided to persevere nonetheless. However, I installed the 2012 Developer edition after Christmas, using a different server, and the performance difference is incredible. I don't think any query has taken more than 3 seconds to execute. I'm hooked and there's no way I'm going to move away.

Well, my learning curve is still in the vertical, but I've got loads of time now, as I retired last November. I've found some great learning resources along the way. The first is a forum essentially for professional database administrators, but it also caters for neophytes, such as myself, with excellent articles, hand-holding guides and even a newbie forum:
Code:
http://www.sqlservercentral.com/

Second is a literature resource, with hundreds of books on the different flavours of SQL, as well as other programming languages. The free downloads can be done at one per hour:
Code:
http://www.wowebook.be/

Third is an application called QlikView. It can import data from any database, text, .CSV or Excel files and with a little deft programming some pretty good analysis can be done. I'm not familiar enough with it to give it the description it deserves, so it's best to check it out for yourselves. It's free for personal use, meaning that the document files you save from within the application are tied to your particular machine and cannot be opened by any one else. Some books on it can be found at the link above and there are some tutorial videos on YouTube. It can be found at:
Code:
http://www.qlikview.com/

Allowing for the fact that I'm still very much feeling my way around in the dark as far as MS SQL is concerned, I think the main thing I've learned is that my database of two tables with approx 0.25 and 2.5 million rows (37 and 40 columns), respectively, is really miniscule when compared with the industrial DBs in use today. So I doubt if putting constraints on these tables, such as primary/foreign keys, and creating indexes will make any difference to the performance. Would it not be better to keep everything in one table and not worry about joins and all that? Maybe someone who knows more about it can point out the error of my thinking. The more simple we can keep it, the easier it will be to analyse the data.

Anyway, that's where I'm coming from. I'd really enjoy having a coding forum dedicated to horse racing to come to.

Ray
 
rpjd99 said:
Allowing for the fact that I'm still very much feeling my way around in the dark as far as MS SQL is concerned, I think the main thing I've learned is that my database of two tables with approx 0.25 and 2.5 million rows (37 and 40 columns), respectively, is really miniscule when compared with the industrial DBs in use today. So I doubt if putting constraints on these tables, such as primary/foreign keys, and creating indexes will make any difference to the performance. Would it not be better to keep everything in one table and not worry about joins and all that? Maybe someone who knows more about it can point out the error of my thinking. The more simple we can keep it, the easier it will be to analyse the data.

What you have, in essence, is two big Excel spreadsheets held in SQL Server. It's not the "correct" way to do it because the fact that you're storing the same data in many different places (repeated course names, horse names, jockey names etc) makes it very inefficient. SQL is a set based technology/language and is optimised for joining tables on keys, which is therefore a very efficient way of extracting data by queries.

However, if your database already has 30 years of data I guess it won't expand a lot in size over the coming years, so your queries should continue to run quickly, though I'd be surprised if a table of 40 columns and 2.5 million rows didn't benefit from having some indexes.
 
There are plenty of good free databases too, like My SQL. For those that want something free, small and easy to get going without all the complicated DB Admin strings SQLite is worth a look. It is an excellent first database.
 
jalfrezi said:
What you have, in essence, is two big Excel spreadsheets held in SQL Server. It's not the "correct" way to do it because the fact that you're storing the same data in many different places (repeated course names, horse names, jockey names etc) makes it very inefficient. SQL is a set based technology/language and is optimised for joining tables on keys, which is therefore a very efficient way of extracting data by queries.

Yes, I agree that it's not the "correct" way of doing it and that what I'm proposing is a "relational" database with no relations (an "orphanal" database?). I accept that very large DBs will require the degree of optimisation that you describe in order to be truly efficient, but if the SQL engine sees my database as a piddling little glorified text file, why go to all the trouble of making it complex? Is it possible that the complexity necessary for large DBs puts people off using a very powerful analysis engine?


jalfrezi said:
However, if your database already has 30 years of data I guess it won't expand a lot in size over the coming years, so your queries should continue to run quickly, though I'd be surprised if a table of 40 columns and 2.5 million rows didn't benefit from having some indexes.

I'd like to be able to say that it would or would not benefit from having indexes in place, but I simply don't know. Only one way to find out of course... I'll continue with the way it is for the moment and see what the performance is like when I start to run much more complex queries than I've been running up to now. I have planned for but not implemented all the indexes, so I can set them up if needed in the future.

Ray
 
pika said:
There are plenty of good free databases too, like My SQL. For those that want something free, small and easy to get going without all the complicated DB Admin strings SQLite is worth a look. It is an excellent first database.

Good point. I haven't used either of those , but I have used PostgreSQL which is really good and free (open source too).
 
rpjd99 said:
I'd like to be able to say that it would or would not benefit from having indexes in place, but I simply don't know. Only one way to find out of course... I'll continue with the way it is for the moment and see what the performance is like when I start to run much more complex queries than I've been running up to now. I have planned for but not implemented all the indexes, so I can set them up if needed in the future.

A relational version of your db would have normalised out the data, so instead of having 2.5M rows containing horses for instance, a relational Horse table might only contain a few thousand rows.

When you run complex queries you will have to effectively join your 2.5 million row table to itself (perhaps several times), when the relational db would instead be joining a big table to much smaller tables (eg Horse).

If it works well there's no reason to change it, as long as writing the queries isn't overly time consuming - as you say, it all depends what sort of queries you intend to run against it. I'd be interested to hear how this works out.
 
pika said:
There are plenty of good free databases too, like My SQL. For those that want something free, small and easy to get going without all the complicated DB Admin strings SQLite is worth a look. It is an excellent first database.

Thanks for that pika, I have a few tutorials queued up on Youtube for later so hopefully it will give me the push I need.

I have a fair bit of footy data that i want to make more use of before too long.
 
For me the speed and easy of defining a system is much more important speed of execution. I don't use SQL directly I've my own Domain Specific Language on top of that. Typically I can define a system in one line of code in a matter of seconds but it usually takes 3+ minutes to run. (the inverse of what you might get from direct SQL queries) Properly factored Relational Database is not always intuitive to achieve or use. I wanted something easy so I can think about my system without having to think about how the tables relate. Research type databases are essentially different from your normal Customer Relationship Management databases. A flat file or single table database might be a hassle to update because of the redundancies but it does make research queries easy.
 
Back
Top