Sports Data Sets

Football

NFL

pro-football reference – Pro-football reference includes NFL data, dating back to 1967. This data includes player statistics, all-time leaders, draft history, coaches, and much more. Statistics are updated every week, no later than Tuesday at 6 pm. Additional data can be found behind a paid subscription.

NFLSavant – This website provides a CSV of NFL play-by-play data from 2013-2024. The data is divided into several categories, including team, play, down, formation, play type, etc. This data can be useful if you’re looking to do an NFL-related project.

NFL Concussions – This website provides a CSV data set on concussions in the NFL from 2012-2014. This data includes the player, the game, position, number of weeks missed, etc. This information is useful for researching concussions in the NFL. It could also be compared with youth football participation rates to see how concussions affect youth participation in football.

NFL and betting – This website provides a data set comparing NFL game scores to the betting projections. This data shows the betting favorite, as well as the final score. This data set would be useful for comparing game outcomes to the betting odds.
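The comparison described above can be sketched in a few lines of Python. This is only a minimal illustration: the rows are made up, and the column names (`home_team`, `away_team`, `home_score`, `away_score`, `favorite`) are assumptions rather than the real file's layout, so they would need to be adjusted to match the downloaded data.

```python
import csv
import io

# Made-up rows for illustration only. The column names are assumptions,
# not the actual schema of the betting data set.
SAMPLE_CSV = """home_team,away_team,home_score,away_score,favorite
A,B,24,17,A
C,D,10,20,C
E,F,31,28,F
G,H,14,13,G
"""


def favorite_win_rate(csv_text):
    """Return the fraction of decided games won by the betting favorite."""
    wins = 0
    games = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        home = int(row["home_score"])
        away = int(row["away_score"])
        if home == away:
            continue  # ties have no winner, so skip them
        winner = row["home_team"] if home > away else row["away_team"]
        games += 1
        if winner == row["favorite"]:
            wins += 1
    return wins / games if games else 0.0


rate = favorite_win_rate(SAMPLE_CSV)
print(f"Favorite won {rate:.0%} of the sample games")
```

With a real file, the same function could be fed the file's contents, for example `favorite_win_rate(open(path).read())`, where `path` is wherever the CSV was saved.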

NFL PBP 2009-2016 – This website provides CSV data sets on play-by-play data in the NFL from 2009-2016. This data includes the basic statistics such as down, quarter, and yard line. It also includes more specific data like QB hit, expected points result, and air yards. This data would be useful in looking at NFL play-by-play trends from 2009-2016.

NFLscrapR – This GitHub repository hosts an R package that makes scraping in-game NFL data much easier. Much of this data is play-by-play data. The package covers almost every statistic tracked during an NFL game.

NFL Data Analytics – This website provides a few basic CSV data sets for the NFL. This data includes games, plays, players, and weekly data. This data is pretty basic; it would mostly be used if you’re trying to do some basic NFL research.

NFL Play – This website provides NFL data in several different CSV files. These include data from the NFL combine, draft, and a variety of performance metrics. This detailed data set allows you to research a wide range of NFL statistics.

puntr – This package is for importing, manipulating, analyzing, and visualizing data related to football punting. puntr is a great resource if you want to do a deep dive into punting analytics.

nflverse – This is a set of packages containing NFL data. This includes play-by-play data back to 1999, resources for season simulations, 4th down analysis, and more.

College Football

sports-reference college – Sports-reference college has data from college football since 1956. This data includes statistics, Heisman winners, all-time leaders, bowl history, and more.

College Football Data – This website provides data for college football. This data consists of play-by-play data, drive results, and historical ratings, as well as predictive statistics like EPA and WPA. This website is very useful in providing a variety of college football data.

cfbfastR – This R package makes accessing data from College Football Data (above) much easier.

College Football Game Stats – This Kaggle dataset contains data from every CFB box score between 2002 and 2025. 

Baseball

Baseball Reference – Baseball Reference is a source of baseball data dating back to 1871. This data includes players, teams, statistics, all-time leaders, and much more. Baseball Reference has a feature where you can filter out specific players or teams to look at their data. Additional data can be found behind a paid subscription.

Baseball Savant – Baseball Savant is a source for various baseball data. This includes more advanced statistics such as xwOBA, barrel%, and much more. Baseball Savant also allows you to create a CSV file and a visual with select statistics of your choice.

BaseballOdds – This website provides data on baseball games in comparison to the betting odds entering the game. The data provided includes the opening and closing lines. This is an interesting data source if you want to compare baseball scores with the projected result. Data ranges from 2010-2021.

Baseball Databank – This website provides CSV data sets on baseball. This data covers a wide range of baseball topics. It includes data specifically for a player’s performance in the postseason. This would be very useful for any research related to baseball.

2016 Sportradar – This website provides a CSV data set for the 2016 MLB season. This data includes every pitch, steal, and lineup event for the regular and postseason in 2016. This data would be very useful if you wanted to research baseball events from the 2016 season.

Retrosheet – This website provides data for several different baseball events. This includes postponed games, ejections, protested games, and no-hitters. There is a lot of data here, covering a variety of very specific topics. This would be a great resource to research specific events that have happened in baseball.

KBO Pitching Data – This website provides CSV data sets on pitching in the KBO (Korea Baseball Organization) from 1982-2021. The statistics used, such as ERA, BB, and K, are very similar to those used in MLB. The KBO became more popular in the United States during 2020, while MLB was not playing due to the pandemic. This data set could be used to research and collect data about the KBO.

KBO Batting Data – This website provides a CSV data set on KBO batting data from 1982-2021. These statistics are very similar to those used in MLB. These include runs, hits, home runs, and OPS. The KBO became more popular in the United States during 2020, while MLB was not playing due to the pandemic. This data set could be used to research and collect data about the KBO.

Baseball Height/Weight – This website provides a CSV data set on over 1000 MLB players' height and weight. This data also includes their position. This allows us to explore the correlation between height/weight and position.
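One simple way to start exploring the relationship mentioned above is to average height and weight within each position. The sketch below is only illustrative: the rows are invented, and the column names (`position`, `height_in`, `weight_lb`) are assumptions that would need to be matched to the actual CSV.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Invented rows for illustration only. The column names are assumptions,
# not the actual schema of the height/weight data set.
SAMPLE_CSV = """name,position,height_in,weight_lb
Player 1,Catcher,72,210
Player 2,Catcher,74,220
Player 3,Shortstop,71,185
Player 4,Shortstop,73,195
"""


def averages_by_position(csv_text):
    """Return {position: (mean height, mean weight)} for the given CSV text."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["position"]].append(
            (float(row["height_in"]), float(row["weight_lb"]))
        )
    return {
        position: (
            mean(height for height, _ in values),
            mean(weight for _, weight in values),
        )
        for position, values in groups.items()
    }


for position, (height, weight) in averages_by_position(SAMPLE_CSV).items():
    print(f"{position}: {height:.1f} in, {weight:.1f} lb")
```

Comparing these per-position averages is a starting point; a fuller analysis might look at the spread within each position as well.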

Fangraphs – Fangraphs is a website that provides baseball data. This data includes standings, projections, scores, and teams. Fangraphs also has more player-based data, such as AVG, K%, and wOBA. This is one of the most well-known baseball data sources. This source would be very helpful in researching any baseball data.

PyBaseball – This website provides data for baseball that can be used in Python. This source uses statistics from Statcast to analyze baseball. This data includes pitch type, launch angle, and wOBA. This data source uses advanced baseball analytics. This website would be helpful if you are a Python user and are very familiar with baseball analytics.

Baseballr – Baseballr is an R package that can be used to analyze baseball data. Baseballr has scraped data from various sources and created an R package so it can easily be used. This data includes box scores, standings, sabermetrics, etc.

Hitters – This website provides a CSV data set of MLB hitters. This data includes hits, home runs, RBIs, years, and more. This data set only covers more basic MLB statistics. This website would be good if you’re researching MLB hitter statistics.

64analytics – While much of this website is behind a paywall, 64analytics is a great source for NCAA player data. The website includes players from D1, D2, and D3. 

Basketball

Professional

Basketball Reference – Basketball-reference has data from basketball, both NBA and WNBA. The data has been tracked since 1946. This data includes statistics, leaders, teams, and draft history. Additional data can be found behind a paid subscription.

Basketball Dataset – This website provides data sets on basketball. This data includes drafts, players, salary, game officials, etc. This website also provides data on the draft combine, so you can compare a player’s combine score to their draft position. This source would be great for researching basketball data and statistics.

NBA Play-by-Play – This website provides individual data sets for NBA play-by-play data from 2000-2020. This data is very detailed in going over each play from each of these seasons. This data would be useful if you were trying to compare shot and score data trends over the last 20 years. These data sets would also allow you to compare a player’s trends over the last 20 years.

NBA Games Data – This website provides CSV data sets on NBA game data. This data includes teams, rankings, games, and players. These data sets provide details about positions, minutes played, and the conference of the teams. This data would be helpful in researching NBA game data.

NBA Player Stats – This website provides data for NBA players since 1999. This data can be separated into regular season or playoffs. The data set provides each player’s name, team, field goals made, three-pointers made, etc. This data would be useful in comparing a player’s statistics over their career.

Player Statistics – This website provides data for NBA player statistics since 2008. This data includes the basic statistical data such as PPG, APG, and RPG. This data also includes more advanced statistics such as eFG%, USG%, and VI. This source would be great to research NBA player data.

MJ, Kobe, and LeBron – This website provides CSV data sets on Michael Jordan, LeBron James, and Kobe Bryant. This data compares these players’ statistics based on their age. Some of these statistics include TS%, USG%, PER, etc. This source would be very useful if you wanted to compare some of the all-time great NBA players.

Steph Curry – This website provides data on Steph Curry, widely considered the best shooter of all time, from 2009 to 2023. The data is divided into preseason, regular season, and postseason. This source allows you to see his progression as a player.

Player Statistics – This website provides CSV data sets for NBA players’ individual statistics. This data mainly consists of more basic statistics such as points, rebounds, assists, steals, blocks, and fouls. While these statistics aren’t very advanced, they’re very useful in evaluating NBA players at a basic level.

WeHoop – This website provides data sets for the WNBA and women’s CBB. This data includes play-by-play data, which can be very useful in analyzing WNBA or women’s CBB. This data would be useful if you are researching women’s basketball.

NBA Travel Data – This website provides data on NBA teams’ travel schedules. This data includes time zones, cities, and rest days. NBA schedules have been a growing issue over the past few years; this data would allow a researcher to analyze the NBA travel schedule.

Basketball – This website provides data on games and players within the NBA. This includes games, drafts, players, and teams. This data is mostly information based, not analytically based. This website would be helpful if you’re researching NBA information.

NBA 1991-2021 – This website provides CSV data sets on NBA data since 1991. This data includes MVPs, teams, and players. This data set also includes player statistics for their MVP season. This website would allow you to research NBA data since 1991.

College

Sports-reference CBB – Sports-reference CBB has data from college basketball since 1892. This data includes scores, leaders, tournament history, awards, and more. This page allows you to filter by the conference and school you want to look at.

NCAA Basketball – This website provides data on NCAA men’s basketball teams. This data includes mascots, teams, play-by-play data, historical games, etc. This data would be very useful in researching information about men’s college basketball.

Soccer

FBref – FBref has soccer data going back to the 1800s. This website has data from all the major soccer leagues in the world, and it allows you to filter by the league you want to view. This data includes statistics, standings, all-time leaders, and more.

Who Scored – Who Scored is a source that provides soccer-based data. This includes live scores, offensive statistics, defensive statistics, their own player grades, and more. This source also provides information on upcoming games and events.

World Cup – This source provides datasets on FIFA World Cup tournaments from 1930-2014. This data includes game stadium, result, city, etc. This data can be useful for general research about the history of World Cup games.

2022 FIFA World Cup – This website provides CSV data sets for the 2022 FIFA World Cup. The statistics are provided at both the team and individual level. The data tracked includes goals, assists, yellow cards, shots, etc. This data would be helpful in researching or analyzing games and statistics from the 2022 FIFA World Cup.

Metrica Sports – This GitHub source provides tracking and event data for soccer. The data comes in the form of a CSV, along with a glossary of the definitions for the data labels. The data can come in the form of an entire game summary, or isolated based on each team.

World FootballR – This source provides information on an R package commonly used for soccer data. The package is titled ‘worldfootballR’. You can install the CRAN version of ‘worldfootballR’ or the version of the package that JaseZiv maintains on GitHub.

Tyrone Mings – Tyrone Mings uses this GitHub source to provide data in an R package he created. The goal of this package is to help make data more easily accessible. The information in this package includes players, clubs, leagues, and market value.

Euro Soccer – This website provides data sets for European Soccer. This data includes country, league, match, and team. This website also provides basic information on the players. This website would be good if you’re researching European Soccer data.

Indian Premier League – This website provides CSV data sets on the Indian Premier League. These data sets include match, player, season, and team. The data begins with the 2008 season. This website would be helpful if you’re researching Indian Premier League data.

Last5Games – Last5Games provides a combination of live soccer scores, historic head-to-head data, recent results, and betting odds for soccer teams and matches all over the world. At any given time it has information for matches that day and up to one week ahead, refreshing once daily.

Statsbomb Open Data – Statsbomb is one of the premier providers of data in soccer. Much of their data is not available to the public, but they release some subsets of data (Euro 2025, Lionel Messi data, etc.) on this GitHub. This data can be accessed easily using the Statsbomb packages in R or Python.

Hockey

Hockey Reference – Hockey Reference provides hockey data since 1917. This data includes players, statistics, records, awards, and more. Hockey Reference provides a way to filter out specific teams and players if you want to look at their data. Additional data can be found behind a paid subscription.

Hockey database – This website provides several CSV data sets for professional hockey. These data sets include teams, scoring, goalies, the Hall of Fame, etc. This data covers a wide range of hockey topics. This website would be very useful in researching professional hockey data.

NHL Salaries Predictions – This website provides CSV data sets on NHL players used to predict their salaries. This data set uses several pieces of data, including goals, assists, and position, to predict their salary. This website would be great to use if you’re researching NHL players’ salaries. These data sets can allow you to see if there is a correlation between player statistics and their salaries.

NHL Playoffs – This website provides a CSV data set on the Stanley Cup Playoffs from 1918-2022. These include the team, playoff win %, year, and goal differential. This website would be helpful in researching data on the Stanley Cup Playoffs.

NHL Draft – This website provides a CSV data set of NHL draft data. This data includes team, year, overall pick, etc. This data would be helpful if you are researching NHL draft data.

NHL Play-by-Play data – This website provides CSV data sets on NHL play-by-play data since 2007. This data includes a description of the play to better envision what happened. This data also includes all the players on the ice from each team. This data would be good if you’re researching NHL play-by-play data.

Money Puck – This website provides a wide variety of hockey data. This data includes playoff odds, teams, and players. This data is from 2008-2023. All this data comes in a CSV file. This website would be useful if you’re researching any hockey data over the past 15 years.

Stat Trick – This website provides data on NHL games as they occur. This data includes scores, shots, and expected goals. This website also tracks high-danger chances. You can also view more in-depth data of each game. This website would be good if you’re researching NHL data each day.

NHL Game Data – This website provides CSV data sets on NHL game data. This data includes team, player, and game statistics. This data also includes the venue of the game. This website would be useful if you’re researching NHL data.

Tennis

ATP Matches – This website provides CSV data sets on ATP matches from 2000-2017. This data includes the tournament, seeds, winner, rank, etc. This data also includes more specific data, such as games won and 1st serve %. This data would be useful if you’re researching ATP match data.

WTA Matches – This website provides CSV data sets on WTA matches from 2000-2016. This data includes tournament, surface, and winner. This data is very useful if you’re researching WTA match data.

Australian Open 2019 – This website provides CSV data sets on the 2019 Australian Open. This data includes statistics from every rally for the tournament. While this data set only includes data from the 2019 tournament, this data is very detailed. This data would be very useful if you’re researching data from the 2019 Australian Open.

Tennis Betting Odds – This website provides CSV data sets for men's and women's tennis. These data sets provide the betting odds entering the match. These data sets would be useful in comparing tennis match results to the betting odds.

Motor Sports

F1 Race Data – This website provides CSV datasets on Formula 1 races. This data includes circuits, drivers, races, results, seasons, etc. This data is very precise; it includes data such as pit stop time and the driver’s birth date. These datasets are very useful if you’re doing research on F1 data.

F1 World Championships – This website provides CSV datasets on the F1 World Championships since 1950. These datasets include drivers, lap times, results, etc. These datasets can be used for a variety of reasons involving F1 research.

MMA

Ultimate UFC Datasets – This website provides CSV data on UFC fights. This data includes the fighters, betting odds, winner, etc. There is also a data set that compares the winner to the betting odds. These data sets can be used for research on UFC results.

MMA Grappler GitHub – This GitHub provides CSV data sets on MMA fighters. This data includes fighters, rankings, and data on each fight. These data sets would be useful for someone doing research on MMA fighters and their rankings and fighting results.

Conor McGregor – This website provides a CSV data set on Conor McGregor. This data includes time, round, fighter, etc. This data set only includes 10 variables. This data set can be used to research Conor McGregor’s fight history.

UFC Refactored – This website provides a CSV data set of UFC fights. There are over 400 variables in this data set. This data set provides many advanced UFC statistics. You can use this data to examine how statistics recorded during a fight correlate with the winner of the fight.

UFC Data – This website provides UFC data from 1993-2021. This data set has 144 variables covering a wide range of topics, including date, weight class, fighters, and referee. This data would be useful if you’re researching UFC data.

Golf

PGA Tour 2015-2022 – This website provides a CSV data set for the PGA Tour from 2015-2022. This data includes the golfer’s name, hole, strokes, etc. This data can be useful if you’re hoping to research golfers on the PGA Tour.

PGA Tour Data – This website provides a CSV data set for the PGA Tour. This data is mostly based on golf statistics. This data includes fairway percentage, average score, and wins. This data is very useful if you are trying to research golfers from the PGA Tour based on their statistics.

PGA Tour Rankings – This website provides a CSV data set on PGA Tour golfers based on their rankings. This data shows rankings, events played, and points gained and lost. This is a great data set to look at trends of golfers and how their ranking has moved.

Track & Field

World Athletics – World Athletics is the go-to data source for professional track & field. Diamond League meets, national meets, and many more competitions are recorded on the website. You can also look at toplists, world records, and world rankings. 

TFRRS – TFRRS is the most reliable source for collegiate track & field and cross country data. The website contains data going back to 2009 and is constantly updating with new meets. Explore the website by competition, team, conference, or toplist. The website works for D1, D2, D3, NAIA, NJCAA, and NCCAA.

Athletic.net – Athletic.net is the best source of high school track & field and cross country data. Containing meets from across the country and the nation’s top times, this is a great resource for researching high school running.

Outside the Game

NSASS – The National Sports and Society Survey.

EADA – EADA provides equity data for school athletics. This source allows you to compare multiple schools, view trend data, get data for a specific school, and download custom data. The data includes the number of participants and the number of teams, as well as financial information for the sports.

HS Participation – This source provides data on high school participation in sports. The data dates back to 1969. This data includes every single sport. The data is divided into several categories, including state, gender, and the number of schools that offer each sport.

NBA vs. WNBA salary – This GitHub repository provides data comparing NBA and WNBA players' salaries. The data comes as a CSV file, so it can easily be loaded into R or any other language. This allows you to see how NBA players are paid compared to WNBA players.
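As a rough illustration of the comparison described above, the Python sketch below computes the median salary per league from CSV text. The rows are placeholder values, not real salaries, and the column names (`league`, `player`, `salary`) are assumptions that may not match the repository's file.

```python
import csv
import io
from statistics import median

# Placeholder values for illustration only, not real salaries. The column
# names are assumptions, not the actual schema of the repository's CSV.
SAMPLE_CSV = """league,player,salary
NBA,Player A,2000000
NBA,Player B,4000000
NBA,Player C,6000000
WNBA,Player D,80000
WNBA,Player E,100000
WNBA,Player F,120000
"""


def median_salary_by_league(csv_text):
    """Return {league: median salary} for the given CSV text."""
    salaries = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        salaries.setdefault(row["league"], []).append(float(row["salary"]))
    return {league: median(values) for league, values in salaries.items()}


medians = median_salary_by_league(SAMPLE_CSV)
ratio = medians["NBA"] / medians["WNBA"]
print(f"Median NBA salary is {ratio:.0f}x the median WNBA salary in this sample")
```

The median is used here instead of the mean because a handful of very large contracts can pull an average upward; either summary could be reasonable depending on the question.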

NIL Data – This article provides data on NIL revenue at colleges. NIL (name, image, and likeness) allows college athletes to profit from their name without suffering on-field consequences. This data is separated into categories by sport, method of compensation, and position.

Highest Paid NIL – This website provides data on the most valuable college athletes. The data provided includes their school, sport, and endorsement potential. NIL allows college athletes to profit from their name. This data would be useful if you are researching the most valuable college athletes.

Injuries in Sports – This website provides data on sports-related injuries. The data is divided into categories by injury rate, where the injury takes place, and sport. This data mostly focuses on children ages 5-14.

Participation – This study provides data on sports participation. These sports include football, baseball, basketball, cross country, and volleyball. Data was also gathered on how many years the sport has been played and how many hours a week were spent on the sport. This website would be great to use for researching participation in various sports.

Covid Participation – This website provides data for sports participation during 2020. This data was mostly compared to 2019. This data allows us to see the negative impact the pandemic had on sports. The only sport to see an increase in participation was ultimate frisbee. This data would be great to use in researching sports and the pandemic.

Olympic Athletes – This website provides data on all the athletes who have participated in the Olympics over 120 years. The data is very broad; it just names the person, country, event, medal, etc. This data would be useful in researching Olympic athletes.

Injury Analytics – This website provides data sets on injuries in sports. This website takes workload into account when providing data on the injuries. The injury analysis is very advanced; it includes data such as hip mobility, groin squeeze, and rest period. This website would be good to use for advanced sports injury research analysis.

Injuries by Sport – This website provides data on injuries by each sport. These sports included cycling and softball. The data also showed which body part was injured most often by each sport. This data mostly focused on people over the age of 25. 

Injuries by age – This website provides data on injuries by sport and by age. This data includes a wide range of sports and activities, such as ATVs, fishing, and trampolines. The age range begins at 5 and younger, and it goes up to 65 and older.

Stadiums – This website provides data on MLB, NBA, NHL, NFL, and MLS stadiums. This data includes team, league, division, latitude, and longitude. 

NCAA Academics – This website provides data on NCAA athletes from 2004 to 2014. This data includes their sport, school, and conference. This data also provides academic scores. This data set looks at the athletes' academic progress rate. This data would be useful if you’re comparing schools and their academic score.

High School Women's Soccer – This website provides data on women’s high school soccer participation. This data includes sport, state, year, and participation. This data allows you to compare trends in women’s high school soccer participation over the years.

Female Olympians – This website provides data on female Olympians. This data includes sports, events, and the percentage of participants who are women. This data would be useful if you’re researching female Olympians.

Biathlon – This website provides data on Olympic Biathlon from 1960-2022. The Biathlon is an Olympic event that combines skiing and rifle shooting. This data includes athlete, country, medal, and year. This data would be helpful if you’re researching Biathlon data.