nfl_data_py is a Python library for interacting with NFL data sourced from nflfastR, nfldata, dynastyprocess, and Draft Scout.
It includes import functions for play-by-play data, weekly data, seasonal data, rosters, win totals, scoring lines, officials, draft picks, draft pick values, schedules, team descriptive info, combine results, and ID mappings across various sites.
Use the package manager pip to install nfl_data_py.
pip install nfl_data_py
import nfl_data_py as nfl
Working with play-by-play data
nfl.import_pbp_data(years, columns, downcast=True, cache=False, alt_path=None)
Returns play-by-play data for the years and columns specified
years : required, list of years to pull data for (earliest available is 1999)
columns : optional, list of columns to pull data for
downcast : optional, converts float64 columns to float32, reducing memory usage by ~30%. Will slow down initial load speed ~50%
cache : optional, determines whether to pull pbp data from the GitHub repo (False) or from the local cache generated by nfl.cache_pbp() (True)
alt_path : optional, path to the local cache; needed only if nfl.cache_pbp() was called with an alternate path instead of the default cache location
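As a minimal sketch of the call above, the helper below builds an inclusive list of seasons and passes it to nfl.import_pbp_data. The names season_range and load_pbp are hypothetical and not part of nfl_data_py; the package is imported inside the function only so that the year check can be used on its own.

```python
def season_range(first, last):
    """Return an inclusive list of seasons, rejecting years before 1999."""
    if first < 1999:
        raise ValueError("play-by-play data starts in 1999")
    if last < first:
        raise ValueError("last season must not precede the first")
    return list(range(first, last + 1))


def load_pbp(first, last, columns=None):
    """Fetch play-by-play rows for the given seasons."""
    # Lazy import (an illustration choice): season_range() stays usable
    # even where nfl_data_py is not installed.
    import nfl_data_py as nfl

    return nfl.import_pbp_data(season_range(first, last), columns, downcast=True)
```

For example, load_pbp(2020, 2021, columns=["game_id", "posteam", "epa"]) would request two seasons restricted to three columns. Those column names are assumed here and are worth confirming against nfl.see_pbp_cols().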
nfl.see_pbp_cols()
Returns a list of the columns available in the play-by-play dataset
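One possible use of nfl.see_pbp_cols() is to check a column selection before starting a large download. The sketch below assumes the returned value can be iterated as column-name strings; missing_columns and unknown_pbp_columns are hypothetical helper names, not library functions.

```python
def missing_columns(requested, available):
    """Return the requested names that are absent from `available`, in order."""
    known = set(available)
    return [name for name in requested if name not in known]


def unknown_pbp_columns(requested):
    """Compare a requested column list against the play-by-play dataset."""
    # Lazy import (an illustration choice) so missing_columns() works alone.
    import nfl_data_py as nfl

    return missing_columns(requested, nfl.see_pbp_cols())
```

An empty result from unknown_pbp_columns(...) would mean every requested name exists, so the same list can be passed as the columns argument of nfl.import_pbp_data.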
Working with weekly data
nfl.import_weekly_data(years, columns, downcast)
Returns weekly data for the years and columns specified
years : required, list of years to pull data for (earliest available is 1999)
columns : optional, list of columns to pull data for
downcast : optional, converts float64 columns to float32, reducing memory usage by ~30%. Will slow down initial load speed ~50%
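The ~30% figure depends on how much of the frame is stored as float64: a float32 value takes 4 bytes instead of 8, so only the float share of memory is halved. The short calculation below illustrates this. The function name and the 60% float share are assumptions chosen for illustration, not measurements of the weekly dataset.

```python
FLOAT64_BYTES = 8
FLOAT32_BYTES = 4


def downcast_saving(float64_share):
    """Fraction of total memory saved when every float64 column becomes float32.

    float64_share is the fraction of the frame's memory held in float64 columns.
    """
    if not 0.0 <= float64_share <= 1.0:
        raise ValueError("float64_share must be between 0 and 1")
    return float64_share * (1 - FLOAT32_BYTES / FLOAT64_BYTES)


# Hypothetical frame in which float64 columns hold 60% of the memory.
print(f"{downcast_saving(0.6):.0%}")  # prints "30%"
```

Frames with more string or integer columns would see a smaller saving, since those columns are not touched by the downcast.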
nfl.see_weekly_cols()
Returns a list of the columns available in the weekly dataset
Working with seasonal data
nfl.import_seasonal_data(years)
Returns seasonal data, including various calculated market share stats
years : required, list of years to pull data for (earliest available is 1999)
Additional data imports
nfl.import_rosters(years, columns)
Returns roster information for years and columns specified
years : required, list of years to pull data for (earliest available is 1999)
columns : optional, list of columns to pull data for
nfl.import_win_totals(years)
Returns win total lines for years specified
years : optional, list of years to pull
nfl.import_sc_lines(years)
Returns scoring lines for years specified
years : optional, list of years to pull
nfl.import_officials(years)
Returns official information by game for the years specified
years : optional, list of years to pull
nfl.import_draft_picks(years)
Returns list of draft picks for the years specified
years : optional, list of years to pull
nfl.import_draft_values()
Returns relative values by generic draft pick according to various popular valuation methods
nfl.import_team_desc()
Returns dataframe with color/logo/etc. information for all NFL teams
nfl.import_schedules(years)
Returns dataframe with schedule information for years specified
years : required, list of years to pull data for (earliest available is 1999)
nfl.import_combine_data(years, positions)
Returns dataframe with combine results for years and positions specified
years : optional, list or range of years to pull data from
positions : optional, list of positions to be pulled (standard format - WR/QB/RB/etc.)
nfl.import_ids(columns, ids)
Returns dataframe with mapped ids for all players across most major NFL and fantasy football data platforms
columns : optional, list of columns to return
ids : optional, list of ids to return
nfl.import_ngs_data(stat_type, years)
Returns dataframe with specified NGS data
stat_type : required, type of data (passing, rushing, receiving)
years : optional, list of years to return data for
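Because the first argument of nfl.import_ngs_data accepts only three values, it can be worth validating it before making the request. The following is a sketch under that assumption; NGS_STAT_TYPES, normalize_stat_type and load_ngs are hypothetical names, and the case-insensitive matching is a convenience added here, not library behaviour.

```python
NGS_STAT_TYPES = ("passing", "rushing", "receiving")


def normalize_stat_type(stat_type):
    """Lower-case and validate an NGS stat type, raising ValueError if unknown."""
    cleaned = stat_type.strip().lower()
    if cleaned not in NGS_STAT_TYPES:
        raise ValueError(
            f"stat_type must be one of {NGS_STAT_TYPES}, got {stat_type!r}"
        )
    return cleaned


def load_ngs(stat_type, years=None):
    """Fetch NGS data for one stat type; years is passed through unchanged."""
    # Lazy import (an illustration choice) so validation works on its own.
    import nfl_data_py as nfl

    return nfl.import_ngs_data(normalize_stat_type(stat_type), years)
```

A call such as load_ngs("Passing", [2021]) would then reach the library as import_ngs_data("passing", [2021]), while a typo such as "pasing" fails immediately with a clear message.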
nfl.import_depth_charts(years)
Returns dataframe with depth chart data
years : optional, list of years to return data for
nfl.import_injuries(years)
Returns dataframe of injury reports
years : optional, list of years to return data for
nfl.import_qbr(years, level, frequency)
Returns dataframe with QBR history
years : optional, years to return data for
level : optional, competition level to return data for, nfl or college, default nfl
frequency : optional, frequency to return data for, weekly or season, default season
nfl.import_pfr_passing(years)
Returns dataframe of PFR passing data
years : optional, years to return data for
nfl.import_snap_counts(years)
Returns dataframe with snap count records
years : optional, list of years to return data for
Additional features
nfl.cache_pbp(years, downcast=True, alt_path=None)
Caches play-by-play data locally so that later loads do not need to download it again. Years that have already been cached are overwritten, so when using the cache in-season, re-cache once per week to pick up the most recent data
years : required, list or range of years to cache
downcast : optional, converts float64 columns to float32, reducing memory usage by ~30%. Will slow down initial load speed ~50%
alt_path : optional, alternate path to store the pbp cache; the default is a program-created folder under the user's Local folder
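The cache is written by nfl.cache_pbp and read back by nfl.import_pbp_data with cache=True, and both calls need the same alt_path when a non-default location is used. A minimal sketch of that pairing follows; refresh_and_load_pbp is a hypothetical wrapper, not part of the package.

```python
def refresh_and_load_pbp(years, columns=None, alt_path=None):
    """Re-cache the given seasons, then read them back from the local cache."""
    # Lazy import (an illustration choice): resolved only when called.
    import nfl_data_py as nfl

    years = list(years)  # accept either a list or a range
    # Seasons already in the cache are overwritten, which refreshes
    # in-season data.
    nfl.cache_pbp(years, downcast=True, alt_path=alt_path)
    # Pass the same alt_path again so the read looks where the cache
    # was written.
    return nfl.import_pbp_data(
        years, columns, downcast=True, cache=True, alt_path=alt_path
    )
```

In practice the caching step would usually run once per week rather than before every load; the two calls are combined here only to show how the arguments line up.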
nfl.clean_nfl_data(df)
Runs descriptive data (team name, player name, etc.) through various cleaning processes
df : required, dataframe to be cleaned
I'd like to recognize Ben Baldwin, Sebastian Carl, and Lee Sharpe for making this data freely available and easy to access. I'd also like to thank Tan Ho, who has been an invaluable resource as I've worked through this project, and Josh Kazan for the resources and assistance he's provided.
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.