PNW Census: Pipeline development #63

Open
7 tasks
MichaelHanksSF opened this issue Nov 6, 2024 · 1 comment

MichaelHanksSF commented Nov 6, 2024

This issue tracks the creation of the PNW Census pipeline.
The pipeline will need to:

  • validate a single CSV file
  • incorporate the new "year_month" element from the filename in exactly the same way "year" is currently handled, so that different returns can be told apart (e.g. PNW Census will allow 2024_Jan, 2024_Feb, ... where 903 only has 2024); add the column "year_month" instead of "year" to all files
  • produce a cleanfile per input, following a pipeline.json to be defined with the client
  • hash the unique identifier values in the census, and apply the same hash to 903 identifiers, so that a single data model can be created at the end
  • concatenate cleanfiles at the LA level
  • create a reports output for each list for the region
  • make the usual logs and outputs available in the standard places in the infrastructure
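
As a rough sketch of the "year_month" and hashing requirements above, something like the following might work. Everything here is an assumption for illustration: the filename pattern (a token such as `2024_Jan` somewhere in the name), the function names `year_month_from_filename` and `hash_identifier`, and the choice of keyed HMAC-SHA256 are all hypothetical, since the issue does not specify the filename convention or the hash algorithm. The only point it demonstrates is that using the same key for both datasets gives matching hashes for matching identifiers.

```python
import hashlib
import hmac
import re
from pathlib import Path
from typing import Optional

# Assumed filename convention (not confirmed by the issue): the file name
# contains a token such as "2024_Jan" or "2024_feb".
YEAR_MONTH_RE = re.compile(
    r"(20\d{2})_(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)",
    re.IGNORECASE,
)


def year_month_from_filename(filename: str) -> Optional[str]:
    """Return a normalised 'YYYY_Mon' token from the filename, or None."""
    match = YEAR_MONTH_RE.search(Path(filename).stem)
    if match is None:
        return None
    year, month = match.groups()
    return f"{year}_{month.capitalize()}"


def hash_identifier(value: str, key: bytes) -> str:
    """Hash one identifier with HMAC-SHA256 (an assumed algorithm).

    Passing the same key for census and 903 identifiers means equal
    identifiers map to equal hashes, so the datasets can still be joined.
    """
    cleaned = value.strip().encode("utf-8")
    return hmac.new(key, cleaned, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    # Hypothetical file name and key, for demonstration only.
    print(year_month_from_filename("pnw_census_2024_Jan.csv"))
    print(hash_identifier("A123", key=b"example-key"))
```

In practice the key would need to live in the infrastructure's secret storage rather than in code, and the real pattern would follow whatever filename rules are agreed for the census returns.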

MichaelHanksSF (Author) commented

3 days
