From b175ee554c4b9072f5b7976b44b110009a371a77 Mon Sep 17 00:00:00 2001
From: rmglennon
Date: Fri, 7 Oct 2016 14:43:44 -0700
Subject: [PATCH 1/2] add address searching topic for libpostal release

---
 addresses.md | 43 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)
 create mode 100644 addresses.md

diff --git a/addresses.md b/addresses.md
new file mode 100644
index 0000000..db77e04
--- /dev/null
+++ b/addresses.md
@@ -0,0 +1,43 @@
+# Address search accuracy and results
+
+Finding an address is one of the most common functions of a geocoder, but also one of the more complex because of analysis required on the constituent parts of the input text. The search integrates an address-parsing library, known as libpostal, to improve the results when you are looking for an address.
+
+## Accuracy in address results
+
+When finding addresses, you can see an indication of the [confidence](response.md#confidence) of the result in the response. The `confidence` is a numerical value increasing from 0 to 1 that estimates how closely this result matches the query. In relation to an address search, if the input text looks like an address, but the house number of the result does not match the house number that was parsed from the input text, the confidence score is lower.
+
+Properties that are related to confidence include `accuracy` and `match_type`. The `accuracy` is an indication of the geometry of the result, which can be either a `point` or a `centroid`. The `match_type` represents the kind of matching that happened for this address. An `exact` match means the search precisely found your entry, but `fallback` means the result is a less-than-perfect match. The match type is only shown for queries that include an address.
+
+Here is an example resulting from a search for the text, `30 W 26th street, New York, NY`:
+
+```
+"properties": {
+  [...]
+  "name": "30 West 26th Street",
+  "housenumber": "30",
+  "street": "West 26th Street",
+  "postalcode": "10010",
+  "confidence": 1,
+  "match_type": "exact",
+  "accuracy": "point",
+  [...]
+}
+```
+
+With an accuracy of point and an exact match, the confidence score is closer to 1, and you will see fewer results in the response. The confidence value decreases when centroid accuracy and a fallback match occurs. When that happens, you may see multiple results appear so you can choose the one you intended.
+
+## Partial matches and fallbacks
+
+In some scenarios, good matches cannot be found for what you enter, so fallback behavior occurs. Some examples where this occurs includes if a street is misspelled, the street name changes (such as `West Broadway` turns into `East Main Street`), or the street name does not exist in a city.
+
+When this happens, the approach is to first try the most specific combination of analyzed fields, then fall back to coarser combinations until a result is returned. For example, if you enter a street address that is not in the city you specified, the house number and street are dropped, and the search attempts to match the city and state names only.
+
+The search currently supports only address points and not house number interpolation. This means that if a house number is not an address point in the data being searched, the behavior is to fall back to the street name. For example, `32 W 26th Street, New York, NY` is not an address point in the available data, but `W 26th Street, New York, NY` does exist. Therefore, only a street result is returned.
+
+Sometimes the search input contains a street, city, and state but the street is either misspelled or does not exist in the city. For example, if you enter `Calle de Lago, New York, NY`, the `Calle de Lago` is identified as a street, but one that does not exist in New York City. When the street lookup fails, the city is returned.
+
+If you enter a city that is not found in a particular state, the results will fall back to the state name you entered. Similar behavior happens for provinces and other administrative regions around the world.
+
+## Poor address search results
+
+If the search is unable to return any results based on the address, it functions more as a geographic search engine than a geocoder. When this happens, you may see fuzzy text-matching behavior. For example, the input `10 Main Street, United States of America` is parsed as a street and country but the search only supports `United States` and `USA`, so no results would be returned. In this case, you may see results that match some of the inputs, including `10 Main Street, Fair Haven, VT, USA` and `10 Main Street, Swanland, England, United Kingdom`.

From 72b04e1d1a8ff46617fe59f3669e9515ad2fc82a Mon Sep 17 00:00:00 2001
From: Rhonda Glennon
Date: Mon, 10 Oct 2016 10:56:27 -0700
Subject: [PATCH 2/2] rework based on feedback

---
 addresses.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/addresses.md b/addresses.md
index db77e04..87646d9 100644
--- a/addresses.md
+++ b/addresses.md
@@ -24,7 +24,7 @@ Here is an example resulting from a search for the text, `30 W 26th street, New
 }
 ```
 
-With an accuracy of point and an exact match, the confidence score is closer to 1, and you will see fewer results in the response. The confidence value decreases when centroid accuracy and a fallback match occurs. When that happens, you may see multiple results appear so you can choose the one you intended.
+With an accuracy of point and an exact match, the confidence score is closer to 1. The confidence value decreases when centroid accuracy and a fallback match occurs. When that happens, multiple results may be returned so you can choose the one you intended.
 
 ## Partial matches and fallbacks