The home of intelligent horse racing discussion

54,000+ Horse Race Database.

    #22601
    TheBluesBrother
    Participant
    • Total Posts 1085

    I put together a 54,000+ race database in Excel (.xlsx) covering every winner in GB and Ireland from 1/1/2005 up to 5/9/12.

    You will need Excel 2007 or 2010 to read the .xlsx file (5.8MB), or an Excel .xlsx reader/converter.

    The database contains: race date, country code, course name, pattern class, race distance, race title, horse name, horse age, winner time, time/secs, weight, BHA ratings, RPR/Raceform ratings and official going.

    This might be useful to somebody who wants to compile their own standard times.
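
    As a rough sketch of what compiling standard times could look like once the rows are out of the spreadsheet: group the winners by course and distance and take a typical winning time for each group. The field names and the sample rows below are made up for illustration, and using the median is only one simple choice; the thread does not say how a standard should be calculated.

```python
from collections import defaultdict
from statistics import median

# Made-up rows for illustration only; the real file has 54,000+ winners.
# Field names are assumptions, not the spreadsheet's actual headers.
rows = [
    {"course": "Lingfield AW", "distance_f": 5, "time_secs": 58.9},
    {"course": "Lingfield AW", "distance_f": 5, "time_secs": 59.4},
    {"course": "Lingfield AW", "distance_f": 5, "time_secs": 58.6},
    {"course": "Lingfield AW", "distance_f": 6, "time_secs": 71.8},
    {"course": "Lingfield AW", "distance_f": 6, "time_secs": 72.5},
]


def standard_times(rows):
    """Return {(course, distance): median winning time in seconds}."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["course"], row["distance_f"])].append(row["time_secs"])
    return {key: median(times) for key, times in groups.items()}


standards = standard_times(rows)
for (course, distance_f), secs in sorted(standards.items()):
    print(f"{course} {distance_f}f: {secs:.2f}s")
```

    In practice you would probably also want to split by going and race class before trusting any figure, since a plain median mixes fast and slow ground together.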

    http://tinyurl.com/btco724

    EDIT: I am working my way through the list slowly, correcting the lines that are out of sync.

    #412647
    MoleHorse
    Member
    • Total Posts 127

    Just taken a look at this. One big problem is that you can’t distinguish between maidens and handicaps.

    #412649
    itsawar
    Member
    • Total Posts 213

    I didn’t know you could export such a big sample. I tried to export every trainer in cats A-D but RI kept saying "overflow" and crashing. Do I need to do something different for big samples?

    #412650
    TheBluesBrother
    Participant
    • Total Posts 1085

    I will amend the file to add race types.

    I omitted the official going which has just been added.

    http://tinyurl.com/btco724

    #412654
    MoleHorse
    Member
    • Total Posts 127

    Thank you The Blues Brother, maybe add the weight of the winner and the finishing position of other runners if you can, then we can predict what sort of time you need to finish in to run different places.

    Now let’s get to work on making some reliable standards. I am going to start with Lingfield AW.

    #412661
    TheBluesBrother
    Participant
    • Total Posts 1085

    Just added the race title to the file.

    http://tinyurl.com/btco724 (5.8MB)

    I will add the weight of the winner to the file, but not the finishing positions of the other runners, as the file would end up being massive. 8)

    #412674
    TheBluesBrother
    Participant
    • Total Posts 1085

    Just amended the file to show the winners’ weights 8)

    http://tinyurl.com/btco724 (5.8MB)

    #412678
    MoleHorse
    Member
    • Total Posts 127

    Do you know how to project what time a horse should be running on the basis of their BHA Rating?

    I’ve just quickly done Ascot, but you seem to have only 3 runnings of the King’s Stand despite the data going back to 2005?

    Anyhow, this is how you do it: filter on a distance you’d like to use, copy the Time/Secs column alongside the BHA Rating from the data file and put the data into SPSS.

    Go to Variable View and make sure both variables are "Numeric" and measured as "Scale". Then click Data > Define Variable Properties > BHA > (un-tick "Limit number of values displayed to") > Continue > tick Missing on the value "0", then OK.

    Then find Analyze at the top > Regression > Curve Estimation. BHA goes into Independent Variable and Time goes into Dependent Variable. Click Linear, ensure Plot Models and Constant in equation are both ticked, then Save > Predicted values.

    Go to Data View, find the new predicted values beside your BHA figures and copy them into Excel. Sort in ascending order of time and you get something like this (BHA rating, then predicted time in seconds).

    116 59.80
    114 59.88
    113 59.92
    110 60.04
    108 60.12
    103 60.31
    102 60.35
    102 60.35
    101 60.39
    101 60.39
    101 60.39
    100 60.43
    98 60.51
    98 60.51
    98 60.51
    98 60.51
    96 60.59
    96 60.59
    96 60.59
    95 60.62
    94 60.66
    94 60.66
    94 60.66
    92 60.74
    89 60.86
    88 60.90
    88 60.90
    88 60.90
    88 60.90
    85 61.02
    85 61.02
    83 61.09
    83 61.09
    82 61.13
    81 61.17
    79 61.25
    79 61.25
    77 61.33
    74 61.45
    73 61.48
    72 61.52
    71 61.56
    71 61.56
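
    For anyone without SPSS, the same idea can be sketched in plain Python: drop the rows where the rating is 0 (treated as missing), fit a straight line of time against BHA rating by ordinary least squares, then read predicted times off the line. The four rated pairs below are copied from the predicted values above, so the fit simply recovers that line; the pair with a rating of 0 is invented to show the missing-value step. With the real file you would feed in the actual winning times.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# (BHA rating, time in seconds). Four pairs come from the list above;
# the (0, 61.0) pair is an invented example of an unrated winner.
pairs = [(116, 59.80), (100, 60.43), (0, 61.0), (85, 61.02), (71, 61.56)]

# Treat a rating of 0 as missing, as in the SPSS step.
rated = [(bha, secs) for bha, secs in pairs if bha != 0]

slope, intercept = fit_line([b for b, _ in rated], [s for _, s in rated])


def predicted_time(bha):
    """Predicted time in seconds for a given BHA rating."""
    return intercept + slope * bha


print(f"slope={slope:.4f} intercept={intercept:.2f}")
print(f"BHA 90 -> {predicted_time(90):.2f}s")
```

    The slope comes out at roughly -0.04, i.e. each extra BHA point is worth about four hundredths of a second on this line.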

    #412701
    cormack15
    Keymaster
    • Total Posts 9232

    Good stuff BB.

    Is the BHA rating the rating awarded for that race or the rating going into that race (i.e. its best rating to that point)?

    #412745
    TheBluesBrother
    Participant
    • Total Posts 1085

    It would be the BHA rating going into the race :D

    #412746
    TheBluesBrother
    Participant
    • Total Posts 1085

    Nice work here, I might download the latest version of SPSS and have a go myself 8)

    The problem with compiling Ascot standard times comes at a mile, where there are two different courses: one on the straight course and the other on the round course :shock:

    #412772
    Marginal Value
    Participant
    • Total Posts 703

    Great data BB. Thanks

    On this particular bit of the topic:

    If you want to predict Time, then Time is the dependent variable and BHA is the independent variable, which is how the Curve Estimation step is set up. For the sake of precision, though, why not first find the coefficient of correlation between the two variables to check the strength of the relationship? Then look at the equivalent coefficient using more than one independent variable, most probably BHA and Official Going, and produce a multiple regression analysis that will predict Times from both BHA and Official Going.

    Otherwise, predicted Times for a BHA rating will be based on the average Official Going, giving a Time only for average Official Going. That is probably not what you want if the race you are predicting for is run on non-average going.
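
    A rough sketch of both suggestions in plain Python: a correlation coefficient between BHA rating and time, then a least-squares regression of time on two predictors. Everything here is an assumption for illustration. The data are synthetic, generated from a made-up formula so the fit can be checked, and coding Official Going as a number (0, 1, 2) is a simplification that the real going descriptions may not suit.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_two_predictors(x1, x2, y):
    """Least squares for y = b0 + b1*x1 + b2*x2 using centred sums."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # zero if the predictors are collinear
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Synthetic data: ratings, a hypothetical numeric going code, and times
# generated from a made-up formula (not taken from the database).
bha = [110, 100, 95, 90, 85, 80, 75, 70]
going = [0, 1, 2, 0, 1, 2, 1, 0]
time_secs = [64.0 - 0.04 * b + 0.5 * g for b, g in zip(bha, going)]

r = pearson_r(bha, time_secs)
b0, b1, b2 = fit_two_predictors(bha, going, time_secs)

print(f"r(BHA, time) = {r:.3f}")
print(f"time = {b0:.2f} + {b1:.3f}*BHA + {b2:.2f}*going")
```

    Because the times were generated from the formula, the fit should hand back the same coefficients; on real data the point would be to see how much the going term improves on BHA alone.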

    #412775
    TheBluesBrother
    Participant
    • Total Posts 1085

    itsawar wrote: "I didn’t know you can export such a big sample, i tried to every trainer in cats A-D but RI kept saying ‘overflow’ and crashing. Do i need to do something different for big samples?"

    Do your project in blocks of about 20,000 lines and you will have no problem with a stack overflow; Raceform Interactive’s memory cannot handle any more. 8)
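
    To show how the blocks would fall for a sample of this size, here is a small sketch that splits a row count into ranges of at most 20,000. It only works out the ranges; how you select each block inside Raceform Interactive is not covered here, and the function name is hypothetical.

```python
def blocks(total_rows, block_size=20_000):
    """Yield (start, end) row ranges, end exclusive, covering total_rows."""
    for start in range(0, total_rows, block_size):
        yield start, min(start + block_size, total_rows)


ranges = list(blocks(54_000))
print(ranges)
```

    For 54,000 rows that gives three exports: two full blocks and a final one of 14,000.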

    #415131
    TheBluesBrother
    Participant
    • Total Posts 1085

    This database has now been cleaned up and the rows that were out of sync have been corrected.

    The database has been updated and now covers 1/1/2005 to 1/10/2012 8)
