The home of intelligent horse racing discussion

50,000+ Horse Race Database

  • #16869
    TheBluesBrother
    Participant
    • Total Posts 1085

    I have cut the original 150,000-race database down to a more manageable file of 50,000+ races (4MB) and saved it as .xls, so that older versions of Excel (Excel 97 onwards) can open it.

    The data covers 2006 to 2010 and was originally put together to produce some standard times and to learn web scraping.
    It will be updated on a regular basis if anybody is interested.

    Download:
    http://tinyurl.com/23yymyn

    #331886
    closer
    Participant
    • Total Posts 2

    Hi, yes, I’m interested. Updates would be good.

    Thanks for making this available. I’ve been looking through it and there seem to be quite a few errors. The extra-fast and extra-slow times are easy enough to spot, but mistakes within normal parameters aren’t. If possible (!) it might be a good idea to scrape Racing Post (RP) data and compare the two.
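
    A minimal sketch of that kind of cross-check, assuming both sources have already been loaded into dictionaries keyed by date, course and off time, with the winning time in seconds as the value. The key layout, the `find_mismatches` name, the 0.5 second tolerance and the sample figures are all made up for illustration; nothing here reflects the actual file layout.

```python
from typing import Dict, List, Tuple

# Hypothetical race key: (ISO date, course, off time).
RaceKey = Tuple[str, str, str]


def find_mismatches(
    source_a: Dict[RaceKey, float],
    source_b: Dict[RaceKey, float],
    tolerance: float = 0.5,
) -> List[Tuple[RaceKey, float, float]]:
    """Return races present in both sources whose times differ by more than tolerance."""
    mismatches = []
    # Only races that appear in both sources can be compared.
    for key in sorted(source_a.keys() & source_b.keys()):
        time_a, time_b = source_a[key], source_b[key]
        if abs(time_a - time_b) > tolerance:
            mismatches.append((key, time_a, time_b))
    return mismatches


# Invented sample data, not taken from either site.
first_source = {
    ("2010-12-10", "Cheltenham", "14:05"): 245.3,
    ("2010-12-10", "Doncaster", "13:30"): 301.0,
}
second_source = {
    ("2010-12-10", "Cheltenham", "14:05"): 245.3,
    ("2010-12-10", "Doncaster", "13:30"): 310.0,
}

for key, time_a, time_b in find_mismatches(first_source, second_source):
    print(key, time_a, time_b)
```

    A disagreement only shows that one of the two sources is wrong, not which one, so flagged races would still need checking by hand.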

    The date format is a little awkward. If it were along the lines of 2010-12-10 (for example), it would make re-ordering easier and enable a unique numbering of each entry (per spreadsheet), which is useful when correcting mistakes.
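
    A rough sketch of the conversion and numbering, assuming the dates are currently stored as DD/MM/YYYY text (the thread does not say what format the file actually uses, so `source_format` is a guess to be adjusted). The field names `date`, `off_time` and `course` are hypothetical.

```python
from datetime import datetime


def to_iso(date_text: str, source_format: str = "%d/%m/%Y") -> str:
    """Convert a date string to ISO form (YYYY-MM-DD), which sorts correctly as text."""
    return datetime.strptime(date_text, source_format).date().isoformat()


def number_rows(rows):
    """Return copies of the rows with ISO dates, sorted, and numbered from 1."""
    converted = [dict(row, date=to_iso(row["date"])) for row in rows]
    # ISO dates sort chronologically even when compared as plain strings.
    converted.sort(key=lambda r: (r["date"], r["off_time"], r["course"]))
    for entry_id, row in enumerate(converted, start=1):
        row["id"] = entry_id
    return converted


# Invented sample rows.
sample = [
    {"date": "10/12/2010", "off_time": "14:05", "course": "Cheltenham"},
    {"date": "03/01/2006", "off_time": "13:30", "course": "Doncaster"},
]
for row in number_rows(sample):
    print(row["id"], row["date"], row["course"])
```

    The numbering is only stable for as long as no rows are added or removed, so it would need regenerating (or replacing with a fixed key) after each update.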

    Probably the most serious omission from the data is the class of race; I’m not sure how you can make accurate standard times without it. If you take an average from all races, you are not accounting for the different average class peculiar to each course. It would also be useful (at least from a research perspective) to have age, weight, official rating (OR) and winning distance data (if possible!).
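
    To illustrate the point about class, here is a small sketch that groups times by course, distance and class before summarising them. It assumes each race record carries a `race_class` field, which the current file lacks, and it uses the median as the summary; that is one possible choice, not the method used to build the original standard times. All field names and figures are invented.

```python
from collections import defaultdict
from statistics import median


def standard_times(races):
    """Median winning time in seconds per (course, distance in furlongs, class)."""
    groups = defaultdict(list)
    for race in races:
        key = (race["course"], race["distance_f"], race["race_class"])
        groups[key].append(race["time_s"])
    return {key: median(times) for key, times in groups.items()}


# Invented sample data: the same course and distance, two different classes.
sample = [
    {"course": "Ascot", "distance_f": 8, "race_class": 1, "time_s": 98.0},
    {"course": "Ascot", "distance_f": 8, "race_class": 1, "time_s": 99.0},
    {"course": "Ascot", "distance_f": 8, "race_class": 1, "time_s": 100.0},
    {"course": "Ascot", "distance_f": 8, "race_class": 5, "time_s": 102.0},
    {"course": "Ascot", "distance_f": 8, "race_class": 5, "time_s": 104.0},
]
for key, value in sorted(standard_times(sample).items()):
    print(key, value)
```

    Averaging all five sample races together would give a single figure sitting between the two classes, which is the distortion described above.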

    Whatever, it’s an admirable and inspirational project, I wish you well.

    #331941
    TheBluesBrother
    Participant
    • Total Posts 1085

    Yes, you are right about there being mistakes. If you click on the link for any race, it will take you to the Sporting Life archive page for that race. There is nothing wrong with the scraping; it’s a fault in the Sporting Life database.

    I have dropped the project anyway, as web scraping the Sporting Life data is a waste of time.

    #331970
    Anonymous
    Inactive
    • Total Posts 17716

    Don’t give up, there were only a few minor errors!

    Have you tried the Racing Post?
