Add fare metrics from NTD using the new transit class #8
Unfortunately, the NTD values do not represent what the passenger actually paid.
There does not seem to be a way to distill the passenger-paid fare from the dataset, so we turn to the GTFS feed and the fare info provided in that standard. There are two tested providers of GTFS information: transit.land and MobilityDatabase.

Transit.land and MobilityDatabase comparison

Take Gwinnett County Transit for transit.land. However, MobilityDatabase returns the link:

MobilityDatabase appears to be better because it has no rate limit and higher-quality data.

Number of agencies with fare data in their GTFS source

There are a few slight exceptions here and there: for instance, transit.land accurately shows a 0 fare for Athens Clarke County (https://www.accgov.com/1770/Fare-free-Transit), whereas MobilityDatabase does not contain that agency.
The NTD has a data source that shows Passenger Paid Fare, separate from Organization Paid Fare, at https://www.transit.dot.gov/ntd/data-product/2022-annual-database-fare-revenues

If anything, we would join the two datasets, GTFS and NTD, to get route stops and timings from the former and the true average price paid from the latter. However, I will not do this immediately, as I am prioritizing the coordinates-to-fare program logic.

NTD Sanity Check

Take the Passenger Paid Fare for Gainesville's MB mode (bus), which is 316,285. Divide 316,285 by 4,302,010 and you get an average fare for any arbitrary rider of $0.07. Reasonable, because many people ride free!

RTD in Colorado (Denver Regional Transportation District), directly operated bus system: Passenger Paid Fare is $16,716,726 and its Total Number of Trips taken is 25,317,651. Notice how Fare Revenues Earned in the metrics, which we initially used, was wrong because it included Organization Paid Fares. RTD average bus ride fare: $0.66, likely because a purchased extended pass gives greater value than paying the base fare every time.

https://data.transportation.gov/Public-Transit/2022-NTD-Annual-Data-Metrics/ekg5-frzt/explore

Several Dynamics
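The arithmetic in the NTD sanity check above can be reproduced with a few lines of Python. This is only a sketch using the figures quoted in the comment; the helper name `average_fare` is hypothetical, and it assumes that 4,302,010 is Gainesville's trip count, by analogy with the RTD "Total Number of Trips" figure.

```python
# Figures quoted in the comment above (2022 NTD, bus mode).
# Assumption: 4,302,010 is the Gainesville trip count; the comment does
# not label it, but the RTD example divides fare revenue by trips.
SANITY_CHECKS = {
    "Gainesville MB": (316_285, 4_302_010),
    "RTD Denver MB (directly operated)": (16_716_726, 25_317_651),
}


def average_fare(passenger_paid_fare, trips):
    """Return passenger-paid fare revenue per trip, or None if there are no trips."""
    if trips <= 0:
        return None
    return passenger_paid_fare / trips


for name, (fare, trips) in SANITY_CHECKS.items():
    print(f"{name}: ${average_fare(fare, trips):.2f} per trip")
```

Using Passenger Paid Fare rather than Fare Revenues Earned as the numerator keeps Organization Paid Fares out of the per-trip figure.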
So it sounds like you want to do:
Hi @JGreenlee, this is great. I see how I can add my logic right into that same ntd.ipynb notebook. Should I add on to your ntd.ipynb, or move my notebook to the scripts dir?
Add to |
A bug has been identified and fixed, where the service Year 2022 has
As seen in this csv, the column with a forward slash

What Happened?

It seems that NTD has updated their data, as we have very slight variations in the calculations, such as changes of 20-30 in the carbon metrics. I know this because I did not change any calculation logic.

It's also very much possible that I just made a typo with "Actual Vehicle/Passenger Car Miles" and/or forgot to re-create the output files the last time the script was adjusted.
Issue found thanks to #8. That PR will fix this, but in the meantime I am putting the fix on master to get a release out.
I have gone ahead and done

    total_upt = 0
    total_fare = 0
    total_records = 0
    agency_mode_fueltypes = []
    for entry in intensities_data['records']:
        # skip entries that don't match the requested modes or UACE
        if (modes and entry["Mode"] not in modes) or (uace and entry["UACE Code"] != uace):
            continue
        total_records += 1
        if 'Average Fare' in entry:
            total_fare += entry['Average Fare']
        ...
    intensities['average_fare'] = total_fare / total_records if total_records > 0 else None

Now the
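For experimenting with this logic outside the repository, a self-contained sketch along the same lines might look like the following. The function name `mean_average_fare` and the sample records are hypothetical; only the keys `Mode`, `UACE Code`, and `Average Fare` come from the snippet above. Note one deliberate difference, which is an assumption on my part and not the PR's behavior: records without an `Average Fare` are left out of the denominator, so a missing fare does not count as a $0 fare.

```python
def mean_average_fare(records, modes=None, uace=None):
    """Unweighted mean of 'Average Fare' over the records matching the filters."""
    total_fare = 0.0
    fare_records = 0
    for entry in records:
        # skip entries that don't match the requested modes or UACE
        if (modes and entry["Mode"] not in modes) or (uace and entry["UACE Code"] != uace):
            continue
        fare = entry.get("Average Fare")
        if fare is None:
            # assumption: a record with no fare is excluded, not treated as free
            continue
        total_fare += fare
        fare_records += 1
    return total_fare / fare_records if fare_records > 0 else None


# Hypothetical records, only for illustration (not real NTD rows).
sample_records = [
    {"Mode": "MB", "UACE Code": "00001", "Average Fare": 1.00},
    {"Mode": "MB", "UACE Code": "00001", "Average Fare": 0.50},
    {"Mode": "MB", "UACE Code": "00001"},  # no fare reported
    {"Mode": "HR", "UACE Code": "00002", "Average Fare": 2.00},
]

print(mean_average_fare(sample_records, modes=["MB"]))
```

Whether missing fares should be skipped or counted as zero depends on what a missing value means in the source data, so this choice is worth confirming against the NTD documentation.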
That is good for now. We may want to split it up later, depending on what can be extracted or generalized.
This PR now has logic to launch an OpenTripPlanner (OTP) instance, and to retrieve GTFS data from OpenMobilityData to give to that OTP instance. We provide a Python interface to query the transit times from OTP.
This PR is made in response to a comment about finding an accurate $/PMT value for transit. e-mission/em-public-dashboard#31 (comment)
NTD Data is taken from https://data.transportation.gov/Public-Transit/2022-NTD-Annual-Data-Metrics/ekg5-frzt/explore and contains columns that tell us Average Passenger Fare, Total Passenger Miles, and number of Passenger Trips (among others)
The notebook in this PR shows that:
"Weighted" means that the number of passengers that use the service affects the final fare, so the number leans more towards the average across all NTD services
However, we are assuming some relations between NTD mode and OpenPath's modes.
https://github.com/jpfleischer/e-mission-common/blob/74a5684458d2cac66f4a6bc14c3250509be1464c/src/emcommon/metrics/transit/transit.py#L30-L37
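To make the "weighted" idea described above concrete, here is a minimal sketch that contrasts a trip-weighted average fare with a plain mean. The field names `average_fare` and `trips` and the two sample services are hypothetical; this is not the notebook's code and these are not NTD column names.

```python
def weighted_average_fare(services):
    """Average fare weighted by trips: total fare paid divided by total trips."""
    total_fare_paid = 0.0
    total_trips = 0
    for svc in services:
        total_fare_paid += svc["average_fare"] * svc["trips"]
        total_trips += svc["trips"]
    return total_fare_paid / total_trips if total_trips > 0 else None


def unweighted_average_fare(services):
    """Plain mean of per-service average fares, ignoring ridership."""
    if not services:
        return None
    return sum(svc["average_fare"] for svc in services) / len(services)


# Hypothetical services, only for illustration (not real NTD rows).
services = [
    {"name": "small service", "average_fare": 2.00, "trips": 1_000},
    {"name": "large service", "average_fare": 0.50, "trips": 9_000},
]

print(f"weighted:   {weighted_average_fare(services):.2f}")
print(f"unweighted: {unweighted_average_fare(services):.2f}")
```

In this toy data the weighted figure sits much closer to the large service's fare, which is the effect of letting ridership influence the result.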
It is possible to further differentiate the $/PMT by metropolitan area. However, we are assuming that there are individuals that will not take transit because it is too expensive. In reality, individuals do not take transit for other reasons.
We must relocate the files if we do not want them compiled to JavaScript, but for now let's appraise whether the code is suitable to incorporate into the analysis.
@shankari @JGreenlee @Abby-Wheelis