This repository has been archived by the owner on Apr 17, 2018. It is now read-only.

[WIP] handle housenumbers that look like postcodes #31

Open
wants to merge 1 commit into
base: master
from

Conversation

dianashk
Contributor

No description provided.


@missinglink missinglink left a comment

would be nice to have code comments explaining why this behaviour is important (possibly with an example?), otherwise LGTM

@dianashk
Contributor Author

dianashk commented Apr 4, 2017

looks like the new libpostal release will handle this case. will leave this open until we can verify that no additional handling is needed.

@dianashk
Contributor Author

new libpostal is good at this and doesn't make these mistakes anymore.

@dianashk dianashk closed this Apr 24, 2017
@trescube
Contributor

libpostal still occasionally gets the number/postcode flipped. I'm going through Orange County, CA right now and found the following examples where the house number is parsed as a postcode:

  • 21161 HILLSDALE HUNTINGTON BEACH CA
  • 24241 PORTO NUOVO DANA POINT CA
  • 24855 CROWN ROYALE LAGUNA NIGUEL CA

It might be a good idea to keep this logic around.

@dianashk dianashk reopened this Apr 25, 2017
@dianashk
Contributor Author

These cases are slightly different: the input contains only a housenumber, not both a housenumber and a postalcode. Here we can't assume libpostal got it wrong, because it may not have. We might actually need to send queries with both interpretations and return anything that matches. Most likely, sending both will reveal a clear winner.
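
A minimal sketch of the "send both" idea, not the code in this PR. It assumes the parse is a flat dict with the `postalcode` key seen in libpostal's parsed output and a `housenumber` key for the house number. The function name `candidate_parses` is hypothetical. Only the numeric field is moved; other labels (for example a street parsed as a neighbourhood) are left untouched.

```python
def candidate_parses(parsed_text):
    """Return the original parse plus, when exactly one of
    postalcode/housenumber is present, a copy with that value
    moved to the other field."""
    candidates = [dict(parsed_text)]

    has_postcode = "postalcode" in parsed_text
    has_number = "housenumber" in parsed_text

    # Only ambiguous when one of the two fields is present, not both or neither.
    if has_postcode != has_number:
        src, dst = (
            ("postalcode", "housenumber")
            if has_postcode
            else ("housenumber", "postalcode")
        )
        alternative = {k: v for k, v in parsed_text.items() if k != src}
        alternative[dst] = parsed_text[src]
        candidates.append(alternative)

    return candidates
```

Each candidate would then be sent as its own query, and the results compared to pick the winner.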

@trescube
Contributor

In the US we might be able to work some magic since each state is allotted a range for postal codes:

https://en.wikipedia.org/wiki/ZIP_Code#/media/File:ZIP_Code_zones.svg

If libpostal returns a state, it wouldn't be hard to determine whether a zip code is likely a house number instead.
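
A rough sketch of that check, under stated assumptions: it uses only the leading digit of the ZIP code (the zones shown in the linked map), the table below is deliberately partial and illustrative, and the names `ZIP_FIRST_DIGITS` and `zip_matches_state` are hypothetical. A real table would need every state plus the known exceptions.

```python
# Partial, illustrative table: leading ZIP digit for a few US states.
ZIP_FIRST_DIGITS = {
    "ca": {"9"},
    "fl": {"3"},
    "il": {"6"},
    "md": {"2"},
}


def zip_matches_state(postcode, state):
    """Return True/False if the postcode's leading digit fits the state,
    or None when the state is not in the table or the postcode is not numeric."""
    digits = ZIP_FIRST_DIGITS.get(state.lower())
    if digits is None or not postcode[:1].isdigit():
        return None
    return postcode[0] in digits
```

For the first example above, `zip_matches_state("21161", "ca")` returns `False`, which hints that 21161 is more likely a house number than a California postcode.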

@dianashk
Contributor Author

add unit tests and acceptance tests and merge

@missinglink

missinglink commented Jun 21, 2017

@dianashk please add code comments before merging.

I feel like, looking back on this in 6 months, it wouldn't be clear what the original purpose of this code block was.

@dianashk dianashk self-assigned this Jul 27, 2017
@dianashk
Contributor Author

dianashk commented Dec 1, 2017

move this over to API now that text-analyzer is no longer being used.

@dianashk
Contributor Author

dianashk commented Dec 4, 2017

Looks like this is still an issue and the fix should be merged.

21161 HILLSDALE HUNTINGTON BEACH CA

      {
        "controller:libpostal": {
          "parsed_text": {
            "postalcode": "21161",
            "neighbourhood": "hillsdale",
            "city": "huntington beach",
            "state": "ca"
          }
        }
      },


3 participants