The dataset seems to be outdated. Where can we help with updating the dataset? #19

Open
rlam3 opened this issue Feb 15, 2019 · 13 comments

Comments

@rlam3

rlam3 commented Feb 15, 2019

The dataset seems to be outdated. Where can we help with updating the dataset?

@MacHu-GWU
Owner

This dataset is collected from multiple trusted data sources, and I run the crawler every 6 months. A lot of manual work is needed to clean and merge the dataset.

I am also trying to come up with a way for others to send me .csv or .json data files if they have better data.

Any suggestions?

@rlam3
Author

rlam3 commented Feb 16, 2019

Not sure if this is one of your data sources for housing units, but here is another source link you can probably use.

https://factfinder.census.gov/bkmk/table/1.0/en/ACS/17_5YR/B25001/0100000US.86000

Updating it once per year is probably more than enough. The 2017 dataset is ready to go, but it would need to be merged with the other datasets to work.

There are a couple of options:

  1. Firebase Database in NoSQL / JSON format, per zipcode.
  2. NoSQL database with a GraphQL API to pull only the necessary data, with no over-fetching or under-fetching.
  3. Leave as is...

The benefit of the first two options is historical tracking.

Should I assume the data of this package is outdated then? I see most data is coming from 2012.

@MacHu-GWU
Owner

@rlam3 most of the demographic statistics are up to date, but the zipcode, total population, land area, and water area are from the 2012 census.

@MacHu-GWU
Owner

@rlam3 I don't want to host the database. I think I can offer a utility tool that lets users load the data into their own database (MongoDB, an RDBMS, or a GraphQL backend). How's that?

@rlam3
Author

rlam3 commented Feb 27, 2019

@MacHu-GWU could you share your sources of demographic statistics? Thanks!

I think that offering a tool to switch from relational to NoSQL would be nice. A big NoSQL / JSON file sounds nice and easily parsable. :)
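To make the "relational to one big JSON file" idea concrete, here is a minimal sketch using only the Python standard library. The `zipcodes` table, its columns, and the sample rows are hypothetical stand-ins, not the package's actual schema or data, and `rows_to_zipcode_documents` is a name made up for this illustration.

```python
import json
import sqlite3


def rows_to_zipcode_documents(conn, table):
    """Return {zipcode: {column: value, ...}} for every row of `table`."""
    conn.row_factory = sqlite3.Row
    # The table name is interpolated into the SQL, so pass trusted names only.
    cursor = conn.execute(f"SELECT * FROM {table}")
    return {row["zipcode"]: dict(row) for row in cursor}


# Hypothetical stand-in for the relational data (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE zipcodes (zipcode TEXT PRIMARY KEY, city TEXT, housing_units INTEGER)"
)
conn.executemany(
    "INSERT INTO zipcodes VALUES (?, ?, ?)",
    [("00001", "Exampleville", 120), ("00002", "Sampleton", 340)],
)

documents = rows_to_zipcode_documents(conn, "zipcodes")
# One JSON object keyed by zipcode; write it to a file or load it into a document store.
as_json = json.dumps(documents, indent=2, sort_keys=True)
```

Keying the output by zipcode keeps each record independently addressable, which is roughly what a per-zipcode document store would expect.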

@sarania86

> @rlam3 most of the demographic statistics are up to date, but the zipcode, total population, land area, and water area are from the 2012 census.

This is an amazing dataset. By the above reply, did you mean that all of the population-related data is outdated? What about the real estate or employment sections?

@wenima

wenima commented Jan 18, 2020

First, thank you for the tool. Is there a way to help contribute missing data, even just basic data like the actual zipcode (example: 75036)?

@lscarmic

I echo everyone's appreciation for this tool. It's fabulous. Thanks a ton!

Just curious, based on "I run the crawler every 6 months", can we assume population_by_age is most recent, i.e., 6 months old?

I see a fairly large discrepancy between the population_by_year numbers (which go through 2015) and the sum of the population_by_age numbers. Is that just reporting error? Maybe I can take the distribution of population_by_age, add 5 years, and apply it to the most recent total population number to get an estimate.
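Roughly, the estimate I have in mind would look like the sketch below. The input shape (a dict from the lower bound of each age bracket to a head count) and all numbers are assumptions made up for illustration, not the package's actual population_by_age format, and `estimate_population_by_age` is a hypothetical helper.

```python
def estimate_population_by_age(age_counts, latest_total, years_elapsed=5):
    """Rescale an older age distribution to a newer total and age each bracket.

    age_counts: {lower bound of age bracket: head count} from the older data.
    Returns {shifted lower bound: estimated head count}.
    """
    old_total = sum(age_counts.values())
    if old_total == 0:
        raise ValueError("age_counts must contain at least one person")
    return {
        # Keep each bracket's share of the population, but move it up in age.
        age + years_elapsed: latest_total * count / old_total
        for age, count in age_counts.items()
    }


# Made-up numbers: 1,000 people in the older data, 1,200 in the latest total.
older_counts = {0: 200, 5: 300, 10: 500}
estimate = estimate_population_by_age(older_counts, latest_total=1200)
```

This ignores births, deaths, and migration, so the youngest bracket ends up empty and the result is only a rough approximation.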

Just wanted to get your thoughts on the data from these datasets. Thanks!

@MacHu-GWU
Owner

@lscarmic if the data source I crawl has not been updated in the last 6 months, then sadly the data does not improve.

@MacHu-GWU
Owner

Hi all, I will keep this issue open long-term and redirect similar questions to it.

@weber-stephen

This is such an amazing library! One thing holding me back from using it is trusting the sources the data came from. I don't see a list of the data sources you are pulling from. Is there any way you can provide a list?

@MacHu-GWU
Owner

MacHu-GWU commented Jun 11, 2021

@weber-stephen send me an email directly. You can find my address at https://pypi.org/project/uszipcode/

@MacHu-GWU
Owner

Hi all, I just released a new version, 1.0.1, with some 2020 census data.

The data sources are also published at https://uszipcode.readthedocs.io/index.html#about-the-data. You can explore them yourself and cross-validate the search engine results.
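One way to do that cross-validation is sketched below, assuming you have already turned a search result and the matching figures from a published source into plain dicts. The field names, the numbers, the 5% tolerance, and the `find_mismatches` helper are all hypothetical and not part of the library.

```python
def find_mismatches(record, reference, rel_tol=0.05):
    """Return {field: (record value, reference value)} for fields that disagree.

    Numeric fields may differ by up to rel_tol relative to the reference value;
    all other fields must match exactly. Fields missing from `record` are
    reported with None as the record value.
    """
    mismatches = {}
    for field, expected in reference.items():
        actual = record.get(field)
        both_numeric = all(
            isinstance(value, (int, float)) and not isinstance(value, bool)
            for value in (actual, expected)
        )
        if both_numeric:
            agrees = abs(actual - expected) <= rel_tol * abs(expected)
        else:
            agrees = actual == expected
        if not agrees:
            mismatches[field] = (actual, expected)
    return mismatches


# Made-up values standing in for a search result and a published source table.
search_result = {"zipcode": "00001", "population": 1000, "housing_units": 400}
published_source = {"zipcode": "00001", "population": 1030, "housing_units": 520}
mismatches = find_mismatches(search_result, published_source)
```

Here only `housing_units` is flagged, because the population figures are within the chosen tolerance of each other.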


6 participants