drbryan/ncaa_baseball

 
 


ncaa_baseball

Baseball data and data science tools for NCAA Divisions 1-3, 2012-2014.

Currently this project contains data from ~2200 team-seasons, including rosters, schedules, box scores, and play-by-play (unparsed). All files are in CSV form, but fields are separated by whitespace rather than commas, so they might not open cleanly in Excel.
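Since the fields are whitespace-separated, a standard comma-based CSV reader will not split them correctly. Below is a minimal Ruby sketch of one way to read such a file. It assumes the first non-empty line is a header row and that no field contains embedded whitespace; neither assumption is confirmed for the files in this project. The helper name `parse_whitespace_table` and the sample columns are hypothetical and used only for illustration.

```ruby
# Parse whitespace-delimited text into an array of hashes keyed by header name.
# Assumptions (not confirmed for this project's files):
#   - the first non-empty line is a header row
#   - no field contains embedded whitespace
def parse_whitespace_table(text)
  lines = text.each_line.map(&:strip).reject(&:empty?)
  return [] if lines.empty?

  header = lines.first.split            # split on any run of whitespace
  lines.drop(1).map do |line|
    header.zip(line.split).to_h         # pair each header with its field
  end
end

# Hypothetical sample data; real column names and values may differ.
sample = <<~DATA
  team_id  year  wins  losses
  101      2012  30    25
  102      2013  41    17
DATA

rows = parse_whitespace_table(sample)
rows.each do |row|
  puts "#{row['team_id']} #{row['year']}: #{row['wins']}-#{row['losses']}"
end
```

To read an actual file, the same helper could be called as `parse_whitespace_table(File.read(path))`. All values come back as strings, so numeric columns would need an explicit conversion such as `Integer(row['wins'])`.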

Running the code yourself on Windows will require Cygwin (cygwin.com) and RubyInstaller for Windows (rubyinstaller.org). The files should be readable in any text editor (I prefer Notepad++).

Obviously this is nowhere near as fun as the Retrosheet database, and there is a lot of useful information still missing. Below is a partial list of stuff I want/need to add to make these data more useful:

  • Conference affiliations
  • Pitcher ID and batter position in lineup
  • Handedness of batter and pitcher
  • Fielder IDs (low priority)
  • Relational database compatibility

I'm also open to requests/bug reports. You can find me on Twitter at @Doctor_Bryan.

-Bryan Cole
Feb. 5, 2015

Updates

  • Jan. 27: Initial version
  • Feb. 5: First crack at parsing events into base/out state, hit type, play description, etc.

