Jun open data + May rerun open data #1146

Merged
merged 7 commits into from
Jun 14, 2024

@tiffanychu90 tiffanychu90 commented Jun 14, 2024

jun open data

  • Run the gtfs_funnel, hqta, open_data, segment_speeds, and rt_vs_schedule pipelines for June.
  • May values were very low in the quarterly performance metrics because a weekend date had been selected. Rerun May for a Wednesday.
  • Add logger.remove(). Wait until the next time the scripts are run to see whether this cleans up duplicate log entries and stops all logs from being written into the first log file opened, instead of into a separate log file for each script.
  • hqta datasets are not published to the geoportal: Muni BRT is a big part of that dataset, so stay with an older version that shows it. Other datasets are updated. Ticket submitted 6/14/24.
  • Epic - Open Data Publishing 2024 #991
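
The weekend-date problem noted for May could be caught with a small check before a pipeline run. This is a minimal sketch, assuming analysis dates are ISO-formatted strings. The helper name `is_midweek` and the Tuesday-to-Thursday window are illustrative assumptions, not part of the existing pipelines:

```python
from datetime import date

# Hypothetical helper, not part of the existing pipelines.
# Assumption: a "typical" service day is Tuesday through Thursday.
MIDWEEK = {1, 2, 3}  # date.weekday(): Monday is 0, Sunday is 6


def is_midweek(analysis_date: str) -> bool:
    """Return True if an ISO date string (YYYY-MM-DD) falls on Tue, Wed, or Thu."""
    return date.fromisoformat(analysis_date).weekday() in MIDWEEK


# 2024-06-12 (the service_date queried in this PR) is a Wednesday.
print(is_midweek("2024-06-12"))  # True
# 2024-06-15 is a Saturday, used here only as a weekend example.
print(is_midweek("2024-06-15"))  # False
```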

hqta

  • Future-proof Muni BRT: since some stops were already not getting picked up, add stop_name to Muni's desired BRT stops too. If stop_ids stop working in the future, stop_name may be more stable (similar to how route_ids change over a long enough time horizon, while route_short_name or route_long_name might stay the same).
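
The stop_id-or-stop_name idea can be sketched roughly as below. This is not the hqta implementation: the function name `match_desired_stops`, the dict-based data shape, and the name normalization are all assumptions for illustration, and the stop values are made up.

```python
from typing import Dict, List

# Hypothetical data shape: each stop is a dict with "stop_id" and "stop_name".
Stop = Dict[str, str]


def match_desired_stops(stops: List[Stop], desired: List[Stop]) -> List[Stop]:
    """Keep stops whose stop_id or (normalized) stop_name is in the desired list."""
    desired_ids = {d["stop_id"] for d in desired}
    # Normalize names so case and surrounding whitespace do not block a match.
    desired_names = {d["stop_name"].strip().lower() for d in desired}
    return [
        s for s in stops
        if s["stop_id"] in desired_ids
        or s["stop_name"].strip().lower() in desired_names
    ]


# Made-up values for illustration only; these are not real Muni stops.
desired = [{"stop_id": "100", "stop_name": "Example Ave & 1st St"}]
feed_stops = [
    {"stop_id": "900", "stop_name": "Example Ave & 1st St"},  # stop_id changed
    {"stop_id": "200", "stop_name": "Other St & 2nd St"},
]
matched = match_desired_stops(feed_stops, desired)
print([s["stop_id"] for s in matched])  # ['900']
```

Matching on either field means a changed stop_id is still picked up as long as the name holds. The tradeoff is that two different stops sharing a name would both match.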

TODO: Muni missing service?

  • The hqta BRT check showed that Muni's BRT stops were not getting picked up.
  • Checked the trips table and couldn't find Muni. Checked in BQ whether the service summary for Muni showed anything: it showed zero trips and zero service hours. Need to debug why.
  • Can rule out that we filtered it out for "bad quality data": dim_gtfs_datasets.data_quality_pipeline=True
Muni's gtfs_dataset_key = "7cc0cb1871dfd558f11a2885c145d144"

SELECT 
  gtfs_dataset_key,
  ttl_service_hours,
  n_trips,
  n_routes
FROM cal-itp-data-infra.mart_gtfs.fct_daily_feed_scheduled_service_summary
WHERE service_date = '2024-06-12' AND feed_key = 'dd202e1bd834c92db5b6da25d555b581'
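
When debugging, it may help to separate a feed that has no summary row at all from one whose row reports zero service, since the two could have different causes. This is a minimal sketch that assumes the query results have been loaded as a list of dicts. The function `service_status` and its labels are hypothetical, and the second sample row is made up:

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int, float]]


def service_status(rows: List[Row], gtfs_dataset_key: str) -> str:
    """Classify a feed as 'missing', 'zero_service', or 'ok' from summary rows."""
    feed_rows = [r for r in rows if r["gtfs_dataset_key"] == gtfs_dataset_key]
    if not feed_rows:
        return "missing"  # no row for this feed on the queried date
    if all(r["n_trips"] == 0 and r["ttl_service_hours"] == 0 for r in feed_rows):
        return "zero_service"  # a row exists but reports no service
    return "ok"


MUNI_KEY = "7cc0cb1871dfd558f11a2885c145d144"
rows = [
    # Mirrors what this PR reports for Muni: zero trips, zero service hours.
    {"gtfs_dataset_key": MUNI_KEY, "n_trips": 0, "ttl_service_hours": 0},
    # Made-up row for contrast; not real data.
    {"gtfs_dataset_key": "hypothetical_key", "n_trips": 10, "ttl_service_hours": 5.5},
]
print(service_status(rows, MUNI_KEY))  # zero_service
print(service_status(rows, "not_in_results"))  # missing
```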

@tiffanychu90 tiffanychu90 changed the title Jun open data Jun open data + May rerun open data Jun 14, 2024
@tiffanychu90 tiffanychu90 merged commit efca836 into main Jun 14, 2024
2 checks passed
@tiffanychu90 tiffanychu90 deleted the jun-open-data branch June 14, 2024 18:24