The `scale` parameter controls how quickly scores decrease from the
maximum as the distance from the `center_point` to the record in
question increases.
Set this to 50km, the same value that search uses.
Connects https://github.com/pelias/api/issues/1206
Linear scoring, by design, gives all records the same score past a
certain point.
This has the disadvantage that identical records that are very far away
cannot be sorted by distance.
By using exponential scoring, we can achieve decent sorting of even very
far away records. This is very helpful for cities and postalcodes.
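
As a rough illustration of the difference, here is a minimal sketch of
the two decay shapes, following the formulas documented for
Elasticsearch's `function_score` decay functions. The helper names, the
`decay` value of 0.5, and the zero offset are assumptions for the
example; only the 50km `scale` comes from the text above.

```javascript
// Sketch only: offset is assumed to be 0 and decay 0.5.

function linearDecay(distanceKm, scaleKm, decay = 0.5) {
  // Score is `decay` at `scale` and reaches exactly 0 at s = scale / (1 - decay).
  const s = scaleKm / (1 - decay);
  return Math.max(0, (s - distanceKm) / s);
}

function expDecay(distanceKm, scaleKm, decay = 0.5) {
  // Score is `decay` at `scale` and keeps shrinking without ever reaching 0.
  const lambda = Math.log(decay) / scaleKm;
  return Math.exp(lambda * distanceKm);
}

const scaleKm = 50;
for (const distanceKm of [0, 50, 150, 300]) {
  console.log(distanceKm, linearDecay(distanceKm, scaleKm), expDecay(distanceKm, scaleKm));
}
```

With these values, the linear score is 0 at both 150km and 300km, so
the two records tie, while the exponential score is still smaller for
the record that is further away.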
Connects https://github.com/pelias/api/issues/1206
This adds a structured and detailed log line for each Elasticsearch
query.
It includes information like the total number of Elasticsearch hits, how
long Elasticsearch took to process the request, query parameters, etc.
This is extremely useful for later analysis, as the structured nature of
the log line allows for powerful filtering.
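
A minimal sketch of what such a log entry could look like. The helper
name `buildQueryLog` and the output field names are hypothetical; the
`took`, `timed_out`, and `hits` properties are the standard fields of an
Elasticsearch search response.

```javascript
// Hypothetical helper: builds one structured log entry per Elasticsearch query.
function buildQueryLog(queryType, params, response) {
  // hits.total is a plain number in older Elasticsearch versions and an
  // object ({ value, relation }) in newer ones, so handle both shapes.
  const total = response.hits.total;
  const isObject = total !== null && typeof total === 'object';
  return {
    event: 'elasticsearch_query',
    queryType,
    params,
    es_took_ms: response.took,
    es_timed_out: response.timed_out,
    es_hits_total: isObject ? total.value : total,
    es_hits_returned: response.hits.hits.length
  };
}

// Made-up response, trimmed to the fields used above.
const sampleResponse = {
  took: 12,
  timed_out: false,
  hits: { total: 3, hits: [{}, {}, {}] }
};

// Emitting the entry as a single JSON line keeps it easy to filter later.
console.log(JSON.stringify(buildQueryLog('autocomplete', { text: 'berlin', size: 10 }, sampleResponse)));
```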
It's possible for the `text` input to /v1/autocomplete to be of non-zero
length after trimming whitespace and quotes, but still be insufficient
to use for geocoding.
One common case is that it contains only commas, slashes, or other
delimiters.
Our query logic currently does not handle this case, and will generate
Elasticsearch queries that do not have a primary `must` clause and end
up searching every document in the index. These queries are slow, take
up cluster resources, and are not useful.
By detecting unsubstantial inputs, we can prevent this.
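
One way such a check might look, as a sketch: the function name
`isSubstantial` and the exact set of characters treated as delimiters
are assumptions for illustration, not the actual sanitizer code.

```javascript
// Matches strings made up entirely of whitespace, quotes, and common
// delimiters. The character list is an assumption for this sketch.
const DELIMITERS_ONLY = /^[\s"',\/\\;:|.\-_]*$/;

// Hypothetical check: true only if something other than delimiters remains.
function isSubstantial(text) {
  return typeof text === 'string' && !DELIMITERS_ONLY.test(text);
}

console.log(isSubstantial(',,, //'));      // false
console.log(isSubstantial('main st, nyc')); // true
```

An input that fails the check can then be rejected before any
Elasticsearch query is built.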
By definition, a record either matches the boundary.country value
exactly or does not match at all, so every match scores identically.
Thus, it does not make sense to put the query clause for
boundary.country in the `must` section of the query.
In theory, because our queries would generally combine this `must`
clause with others, there shouldn't be any performance improvement (or
regression) from this change.
However, semantically, this clause fits better as a `filter`, and in the
case of a bug causing a degenerate query with the `boundary.country`
query clause as the only one under the `must` section, this would have a
big impact.
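
A sketch of the resulting query shape. The builder function and the
field names (`name.default`, `parent.country_a`) are assumptions for
illustration; the `bool` query with `must` and `filter` sections is
standard Elasticsearch query DSL, where `filter` clauses restrict the
result set without affecting the score.

```javascript
// Hypothetical query builder; field names are assumptions for this sketch.
function buildQuery(text, countryCode) {
  const query = {
    bool: {
      // Scored clause: how well does the record match the input text?
      must: [{ match: { 'name.default': { query: text } } }],
      filter: []
    }
  };

  if (countryCode) {
    // Yes/no condition: it only restricts the result set, so it goes
    // in `filter` and contributes nothing to the score.
    query.bool.filter.push({ match: { 'parent.country_a': { query: countryCode } } });
  }

  return { query };
}

console.log(JSON.stringify(buildQuery('paris', 'FRA'), null, 2));
```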
This condition will cause Elasticsearch to throw an error, so we should
catch it ourselves first.
The error is more friendly than the case where min>max, but still an
error.
Connects https://github.com/pelias/api/pull/1050
If bounding box lat/lon values are outside the correct range,
Elasticsearch throws very alarming errors.
With a little validation code we can provide more friendly and
actionable error messages.
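
A minimal sketch of this kind of range validation. The function name,
the short parameter names, and the message wording are assumptions for
the example; the limits are the usual ones (latitude within ±90,
longitude within ±180).

```javascript
// Hypothetical validator: returns a list of readable error messages,
// empty when all four values are within range.
function validateBoundingBox({ min_lat, max_lat, min_lon, max_lon }) {
  const errors = [];

  const check = (name, value, limit) => {
    if (!Number.isFinite(value) || Math.abs(value) > limit) {
      errors.push(`${name} must be a number between -${limit} and ${limit}, got ${value}`);
    }
  };

  check('min_lat', min_lat, 90);
  check('max_lat', max_lat, 90);
  check('min_lon', min_lon, 180);
  check('max_lon', max_lon, 180);

  return errors;
}

console.log(validateBoundingBox({ min_lat: -91, max_lat: 10, min_lon: 0, max_lon: 200 }));
```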
Fixes https://github.com/pelias/pelias/issues/750
This is a somewhat roundabout fix to #1179,
as a way to deal with the persistent npm ls
and commit hook troubles we were dealing with
due to dependencies of the iso3166 package.
Additionally, it should give us faster
ISO lookups, since the existing
approaches were implemented using
linear scans through an array rather than
map-based lookups.
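
To illustrate the difference, a small sketch with
a three-entry sample table. The function and
constant names are hypothetical, and a real table
would hold every ISO 3166-1 entry.

```javascript
// Tiny illustrative sample of alpha-2 / alpha-3 country codes.
const COUNTRIES = [
  { alpha2: 'DE', alpha3: 'DEU' },
  { alpha2: 'FR', alpha3: 'FRA' },
  { alpha2: 'US', alpha3: 'USA' }
];

// Linear scan: walks the array on every lookup.
function alpha3ByScan(alpha2) {
  const found = COUNTRIES.find((country) => country.alpha2 === alpha2);
  return found ? found.alpha3 : undefined;
}

// Map-based: the index is built once, then each lookup is a single get().
const ALPHA2_TO_ALPHA3 = new Map(COUNTRIES.map((country) => [country.alpha2, country.alpha3]));

function alpha3ByMap(alpha2) {
  return ALPHA2_TO_ALPHA3.get(alpha2);
}

console.log(alpha3ByScan('FR'), alpha3ByMap('FR'));
```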
Fixes #1132
The Quattroshapes source has been deprecated for
some time now, and any users for whom the deprecation
notice was helpful have presumably either upgraded
or moved on by now.
Previously, our text sanitizer code did not trim whitespace before
checking that the string was non-empty. This led to strings consisting
only of whitespace being treated as valid. Not all our downstream
services (such as libpostal) accept whitespace-only input, so this
causes a rather harsh error.
This PR builds upon the code in https://github.com/pelias/api/pull/1170
and moves the trimming code above the nonEmptyString check. Now, a
whitespace-only input string produces the normal error for empty input.
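
The ordering can be sketched as follows. `nonEmptyString` is the check
named above, reimplemented here in simplified form; `sanitizeText` and
the error message wording are assumptions for the example.

```javascript
// Simplified stand-in for the nonEmptyString check.
function nonEmptyString(value) {
  return typeof value === 'string' && value.length > 0;
}

// Hypothetical sanitizer: trim first, then validate the trimmed value.
function sanitizeText(raw) {
  const text = typeof raw === 'string' ? raw.trim() : raw;

  if (!nonEmptyString(text)) {
    // Whitespace-only input now ends up here, like any other empty input.
    return { errors: ['invalid text: must not be empty'] };
  }

  return { text, errors: [] };
}

console.log(sanitizeText('   '));
console.log(sanitizeText('  30 w 26th st  '));
```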
Fixes https://github.com/pelias/api/issues/1158
This query extends the standard focus query view with a hardcoded list of
layers to which the query applies. The intent was to apply the focus scoring
only to non-admin areas, but the list of layers was already out of date,
as it was missing streets.
The query is fundamentally problematic with custom layers as well.
When several conditions are met together, the API can return results
that are not sorted as intended.
These conditions are:
* over 10 results total were returned from Elasticsearch
* the interpolation middleware was called
* not all street results end up with possible interpolated address
matches, and some of those streets come before other interpolated
address records, necessitating a re-sorting of the results in the
interpolation middleware
In these cases, the ordering of streets as defined by Elasticsearch,
such as by linguistic match or distance from a focus point, will no
longer be respected in the results.
This is because Node.js's `Array.prototype.sort` uses an
[*un*stable QuickSort for arrays of size 11 or greater](https://github.com/nodejs/node/blob/master/deps/v8/src/js/array.js#L670).
The solution is to switch to a sorting algorithm that is always stable.
This ensures that whatever ordering was specified in the Elasticsearch
queries is observed, without duplicating any of that logic in the
middleware, where it could later conflict with the queries.
Stable sorting is provided by the [stable](http://npmjs.com/stable) NPM
package.
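
To show what stability buys us, here is a self-contained sketch. It does
not use the `stable` package; instead it gets the same guarantee by
breaking ties on the original index. The `stableSort` helper, the
comparator, and the sample records are all hypothetical.

```javascript
// Stable sort that does not rely on the engine's sort being stable:
// items that compare as equal keep their original relative order.
function stableSort(items, compare) {
  return items
    .map((item, index) => ({ item, index }))
    .sort((a, b) => compare(a.item, b.item) || a.index - b.index)
    .map(({ item }) => item);
}

// Hypothetical comparator: addresses before streets, nothing else.
const rank = (record) => (record.layer === 'address' ? 0 : 1);
const byLayer = (a, b) => rank(a) - rank(b);

// Made-up results, in the order Elasticsearch returned them.
const results = [
  { id: 's1', layer: 'street' },
  { id: 'a1', layer: 'address' },
  { id: 's2', layer: 'street' },
  { id: 'a2', layer: 'address' },
  { id: 's3', layer: 'street' }
];

console.log(stableSort(results, byLayer).map((record) => record.id));
// → [ 'a1', 'a2', 's1', 's2', 's3' ]
```

The streets stay in their original s1, s2, s3 order, so the ranking
chosen by Elasticsearch survives the re-sort.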