
Python code for working with NFL play by play data.


ztaylor96/nfl_data_py

 
 


nfl_data_py

nfl_data_py is a Python library for interacting with NFL data sourced from nflfastR, nfldata, dynastyprocess, and Draft Scout.

Includes import functions for play-by-play data, weekly data, seasonal data, rosters, win totals, scoring lines, officials, draft picks, draft pick values, schedules, team descriptive info, combine results and id mappings across various sites.

Installation

Use the package manager pip to install nfl_data_py.

pip install nfl_data_py

Usage

import nfl_data_py as nfl

Working with play-by-play data

nfl.import_pbp_data(years, columns, downcast=True, cache=False, alt_path=None)

Returns play-by-play data for the years and columns specified

years : required, list of years to pull data for (earliest available is 1999)

columns : optional, list of columns to pull data for

downcast : optional, converts float64 columns to float32, reducing memory usage by ~30% but slowing the initial load by ~50% (default True)

cache : optional, if True, reads pbp data from the local cache generated by nfl.cache_pbp() instead of the GitHub repo (default False)

alt_path : optional, needed only when the cache was created by calling nfl.cache_pbp() with an alternate path instead of the default cache location

nfl.see_pbp_cols()

Returns a list of the columns available in the play-by-play dataset
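
As a rough illustration of the two calls above, the sketch below loads a few columns for a handful of seasons. The helper names valid_seasons and load_pbp are hypothetical and not part of nfl_data_py, and the column names in the usage note are assumed to exist in the dataset; nfl.see_pbp_cols() is the way to confirm them.

```python
FIRST_PBP_SEASON = 1999  # earliest season available, per the parameter notes above


def valid_seasons(years):
    """Return the requested seasons as a sorted list, dropping any before 1999."""
    return sorted(y for y in set(years) if y >= FIRST_PBP_SEASON)


def load_pbp(years, columns=None):
    """Hypothetical wrapper: fetch play-by-play data for the valid seasons only."""
    # Imported lazily so valid_seasons() can be used without the package installed.
    import nfl_data_py as nfl

    return nfl.import_pbp_data(valid_seasons(years), columns=columns, downcast=True)
```

Calling load_pbp(range(2019, 2022), ["game_id", "posteam", "epa"]) would then request the 2019 through 2021 seasons with only those three columns, assuming they appear in the list returned by nfl.see_pbp_cols().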

Working with weekly data

nfl.import_weekly_data(years, columns, downcast)

Returns weekly data for the years and columns specified

years : required, list of years to pull data for (earliest available is 1999)

columns : optional, list of columns to pull data for

downcast : optional, converts float64 columns to float32, reducing memory usage by ~30% but slowing the initial load by ~50%

nfl.see_weekly_cols()

Returns a list of the columns available in the weekly dataset
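
To see what the downcast option trades away, here is a small standard-library sketch. It does not call nfl_data_py and is not meant to mirror the library's implementation; it only compares the storage size and rounding error of the same values held as 64-bit and 32-bit floats. The sample values are made up for illustration.

```python
from array import array

values = [0.1, 1234.5678, -3.25]  # made-up sample values

as_float64 = array("d", values)  # double precision, 8 bytes per value
as_float32 = array("f", values)  # single precision, 4 bytes per value

bytes_64 = as_float64.itemsize * len(as_float64)
bytes_32 = as_float32.itemsize * len(as_float32)

# Largest rounding error introduced by storing the values as float32
max_error = max(abs(a - b) for a, b in zip(as_float64, as_float32))

print(bytes_64, bytes_32, max_error)
```

Each float column takes half the space after the conversion, at the cost of a small rounding error. The overall saving quoted above is smaller than half, presumably because not every column in the dataset is float64.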

Working with seasonal data

nfl.import_seasonal_data(years)

Returns seasonal data, including various calculated market share stats

years : required, list of years to pull data for (earliest available is 1999)

Additional data imports

nfl.import_rosters(years, columns)

Returns roster information for years and columns specified

years : required, list of years to pull data for (earliest available is 1999)

columns : optional, list of columns to pull data for

nfl.import_win_totals(years)

Returns win total lines for years specified

years : optional, list of years to pull

nfl.import_sc_lines(years)

Returns scoring lines for years specified

years : optional, list of years to pull

nfl.import_officials(years)

Returns official information by game for the years specified

years : optional, list of years to pull

nfl.import_draft_picks(years)

Returns list of draft picks for the years specified

years : optional, list of years to pull

nfl.import_draft_values()

Returns relative values by generic draft pick according to various popular valuation methods

nfl.import_team_desc()

Returns dataframe with color, logo, and other descriptive information for all NFL teams

nfl.import_schedules(years)

Returns dataframe with schedule information for years specified

years : required, list of years to pull data for (earliest available is 1999)

nfl.import_combine_data(years, positions)

Returns dataframe with combine results for years and positions specified

years : optional, list or range of years to pull data from

positions : optional, list of positions to be pulled (standard format - WR/QB/RB/etc.)
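
A minimal sketch of calling the combine import with tidied position codes follows. The helpers normalize_positions and load_combine are hypothetical names introduced here, and whether the library itself tolerates lower-case codes is not stated above, so the sketch upper-cases them first as a precaution.

```python
def normalize_positions(positions):
    """Upper-case and de-duplicate position codes, keeping their original order."""
    seen = []
    for pos in positions:
        code = pos.strip().upper()
        if code and code not in seen:
            seen.append(code)
    return seen


def load_combine(years=None, positions=None):
    """Hypothetical wrapper around nfl.import_combine_data."""
    # Imported lazily so normalize_positions() works without the package installed.
    import nfl_data_py as nfl

    if positions is not None:
        positions = normalize_positions(positions)
    return nfl.import_combine_data(years, positions)
```

For example, load_combine(range(2018, 2021), ["wr", "qb"]) would ask for WR and QB results from the 2018 through 2020 combines.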

nfl.import_ids(columns, ids)

Returns dataframe with mapped ids for all players across most major NFL and fantasy football data platforms

columns : optional, list of columns to return

ids : optional, list of ids to return

nfl.import_ngs_data(stat_type, years)

Returns dataframe with specified NGS data

stat_type : required, type of data (passing, rushing, receiving)

years : optional, list of years to return data for
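
Since there are only three stat types, one possible pattern is to load them all into a dictionary keyed by type. The function load_all_ngs and its loader parameter are hypothetical; the loader defaults to nfl.import_ngs_data and exists only so the helper can be exercised without network access.

```python
NGS_STAT_TYPES = ("passing", "rushing", "receiving")  # the types listed above


def load_all_ngs(years=None, loader=None):
    """Return {stat_type: result} for every NGS stat type."""
    if loader is None:
        # Imported lazily; only needed when no custom loader is supplied.
        import nfl_data_py as nfl

        loader = nfl.import_ngs_data
    return {stat_type: loader(stat_type, years) for stat_type in NGS_STAT_TYPES}
```

With the default loader, load_all_ngs([2020])["rushing"] would hold the rushing NGS dataframe for the 2020 season.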

nfl.import_depth_charts(years)

Returns dataframe with depth chart data

years : optional, list of years to return data for

nfl.import_injuries(years)

Returns dataframe of injury reports

years : optional, list of years to return data for

nfl.import_qbr(years, level, frequency)

Returns dataframe with QBR history

years : optional, years to return data for

level : optional, competition level to return data for, nfl or college, default nfl

frequency : optional, frequency to return data for, weekly or season, default season
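
The level and frequency arguments each accept one of two values, so a typo is easy to catch before making the request. The sketch below is one way to do that; qbr_options and load_qbr are hypothetical helpers, and how the library itself reacts to an unknown value is not described above.

```python
QBR_LEVELS = ("nfl", "college")
QBR_FREQUENCIES = ("weekly", "season")


def qbr_options(level="nfl", frequency="season"):
    """Validate the two options and return them as keyword arguments."""
    if level not in QBR_LEVELS:
        raise ValueError(f"level must be one of {QBR_LEVELS}, got {level!r}")
    if frequency not in QBR_FREQUENCIES:
        raise ValueError(f"frequency must be one of {QBR_FREQUENCIES}, got {frequency!r}")
    return {"level": level, "frequency": frequency}


def load_qbr(years=None, **options):
    """Hypothetical wrapper around nfl.import_qbr with validated options."""
    # Imported lazily so qbr_options() works without the package installed.
    import nfl_data_py as nfl

    return nfl.import_qbr(years, **qbr_options(**options))
```

For example, load_qbr([2020, 2021], frequency="weekly") would request weekly NFL QBR for those two seasons.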

nfl.import_pfr_passing(years)

Returns dataframe of PFR passing data

years : optional, years to return data for

nfl.import_snap_counts(years)

Returns dataframe with snap count records

years : optional, list of years to return data for

Additional features

nfl.cache_pbp(years, downcast=True, alt_path=None)

Caches play-by-play data locally to speed up subsequent loads. Seasons that have already been cached are overwritten, so when working in-season, re-cache the current season about once per week to pick up the most recent data

years : required, list or range of years to cache

downcast : optional, converts float64 columns to float32, reducing memory usage by ~30% but slowing the initial load by ~50% (default True)

alt_path : optional, alternate path to store the pbp cache; the default is a program-created folder in the user's Local directory
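
The caching workflow pairs nfl.cache_pbp() with the cache and alt_path arguments of nfl.import_pbp_data(). A minimal sketch, assuming the same alt_path must be passed to both calls: cache_then_load is a hypothetical helper, and its nfl parameter is an illustration-only hook for substituting a stand-in module.

```python
def cache_then_load(years, alt_path=None, nfl=None):
    """Cache the given seasons locally, then read them back from the cache."""
    if nfl is None:
        # Imported lazily; skipped when a stand-in module is supplied.
        import nfl_data_py as nfl

    years = list(years)
    # Overwrites any previously cached copy of these seasons.
    nfl.cache_pbp(years, downcast=True, alt_path=alt_path)
    # cache=True reads from the local cache rather than the GitHub repo.
    return nfl.import_pbp_data(years, cache=True, alt_path=alt_path)
```

Calling cache_then_load(range(2020, 2022)) would cache the 2020 and 2021 seasons in the default location and return them from there.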

nfl.clean_nfl_data(df)

Runs descriptive data (team name, player name, etc.) through various cleaning processes

df : required, dataframe to be cleaned

Recognition

I'd like to recognize Ben Baldwin, Sebastian Carl, and Lee Sharpe for making this data freely available and easy to access. I'd also like to thank Tan Ho, who has been an invaluable resource as I've worked through this project, and Josh Kazan for the resources and assistance he's provided.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT
