pelias/api - api - CogtoSrc

Commit Graph

Author	SHA1	Message	Date
Peter Johnson	4c6797b695	feat(dedupe_improved_parent_matching): only check parent fields above the layer of the least granular match	6 years ago
missinglink	b2069606f2	feat(dedupe): improved handling of cases where "name", "parent" or "address_parts" properties are not set	6 years ago
Julian Simioni	f153e543c4	Add failing test case for one postcode deduping	6 years ago
Julian Simioni	a31f1a8561	Add failing test case for one postcode deduping	6 years ago
Peter Johnson	c0a0663e21	feat(dedupe): treat all non-canonical layers as analogous to a venue, prefer non-canonical records	6 years ago
Julian Simioni	2a668612ed	feat(confidence): Add support for new query names	6 years ago
Julian Simioni	c548a73cb9	fix(confidence): Update query type names in confidence score code `4adf4b3dd7` renamed some queries to be quite a bit more informative, however it wasn't obvious that these query names were used elsewhere in the code. With those changes, no confidence score middleware was running, which this should help fix.	6 years ago
Julian Simioni	f3b11a16eb	Use default config in search and autocomplete tests The configurable boosts feature can cause other unit tests to fail if a user has customizations in their `pelias.json`. This adds proxyquire to all tests that might be affected, which forces the default config, preventing such failures.	6 years ago
Peter Johnson	a06683ff68	feat(query): Modify custom boosts feature to use function_score queries	6 years ago
Julian Simioni	32684e0013	feat(query): Add support for custom boosts to search endpoint This adds support for custom boosts to the addressit-style search queries. The newer libpostal based queries do not include this functionality since they can only query for addresses.	6 years ago
Julian Simioni	bb605acb3f	Add tests for autocomplete custom boosts	6 years ago
Julian Simioni	9679c14152	Test handling of undefined configuration	6 years ago
Julian Simioni	9080feef05	WIP: Configurable boosts for sources and layers This is a work in progress to enable customizing boosts for sources and layers. For now, the config must be hardcoded in query/autocomplete.js, but it will eventually be driven by `pelias.json` and take effect on all endpoints.	6 years ago
Julian Simioni	a1add3656e	feat(log): Use structured logs for place endpoint	6 years ago
Julian Simioni	a7932d0b8c	feat(log): move retry info to structured logs	6 years ago
Julian Simioni	0056c0749a	feat(log): Add structured logs for coarse_reverse	6 years ago
Julian Simioni	4adf4b3dd7	feat(queries): Normalize all query names They should start with the endpoint (ideally), and address_search_using_ids should not have the same query name as 'search_fallback'.	6 years ago
Julian Simioni	d681a114d6	feat(log): Remove most unstructured controller logs These are noise now; the structured logs have much better info	6 years ago
Julian Simioni	f5c6dcf882	feat(log): Add structured logs for interpolation service	6 years ago
Julian Simioni	8c37ee63dd	feat(log): Add JSON log for Elasticsearch queries This adds a structured and detailed log line for each Elasticsearch query. It includes information like the total number of Elasticsearch hits, how long Elasticsearch took to process the request, query parameters, etc. This is extremely useful for later analysis as the structured nature of the query allows for powerful filtering.	6 years ago
Peter Johnson	9d4c773ce1	testing: add test case: incorrect parsing of diagonal directionals - no subsequent element	6 years ago
Peter Johnson	4e8a5385f4	feat(libpostal_patch): additional tests	6 years ago
Peter Johnson	27a9e1d900	feat(libpostal_patch): enable field mapping for "unit"	6 years ago
Peter Johnson	69ddbaf3be	feat(libpostal_patch): correctly parse australian-style unit numbers	6 years ago
Peter Johnson	19eb0b57d1	feat(libpostal_patch): add a libpostal patch which allows recasting labels	6 years ago
Julian Simioni	b1107a0c8f	fix(autocomplete): detect the case where input text is unsubstantial It's possible for the `text` input to /v1/autocomplete to be of non-zero length after trimming whitespace and quotes, but still be insufficient to use for geocoding. One common case is that it contains only commas, slashes, or other delimiters. Our query logic currently does not handle this case, and will generate Elasticsearch queries that do not have a primary `must` clause and end up searching every document in the index. These queries are slow, take up cluster resources, and are not useful. By detecting unsubstantial inputs, we can prevent this.	6 years ago
Julian Simioni	c20737f458	fix(boundary.country): use boundary.country as filter By definition, all boundary.country query matches will either be identical, or not a match. Thus, it does not make sense to put the query clause for boundary.country in the `must` section of the query. In theory, because our queries would generally combine this `must` clause with others, there shouldn't be any performance improvement (or regression) from this change. However, semantically, this clause fits better as a `filter`, and in the case of a bug causing a degenerate query with the `boundary.country` query clause as the only one under the `must` section, this would have a big impact.	6 years ago
Julian Simioni	7e4559fdc2	fix(sanitizer): Trim whitespace in addressit queries This is a followup PR to https://github.com/pelias/api/pull/1171 and https://github.com/pelias/api/pull/1170. Apparently we have two different `text` sanitizers, and autocomplete queries were treating a single space as valid input. This had a particularly bad outcome as it would end up generating queries (see an [example](https://gist.github.com/orangejulius/2cc26c7eed39311b6eaf1fb0175c13e6)) that had no main query clause. This caused them to match basically every document in the index. Looking at the geocode.earth slowlog, these queries took __8 seconds per shard__.	6 years ago
Julian Simioni	40e6054523	chore(geo_common): assert that a specific exception was thrown Without checking the message when asserting an exception is thrown, it's possible for the test to pass when undesired behavior is occurring.	6 years ago
Julian Simioni	1553dfb103	fix(geo_common): check for invalid bbox where min=max This condition will cause Elasticsearch to throw an error, so we should catch it ourselves first. The error is more friendly than the case where min>max, but still an error. Connects https://github.com/pelias/api/pull/1050	6 years ago
Julian Simioni	76bc5c654d	fix(geo_common): check bbox parameters are within range If bounding box lat/lon values are outside the correct range, Elasticsearch throws very alarming errors. With a little validation code we can provide more friendly and actionable error messages. Fixes https://github.com/pelias/pelias/issues/750	6 years ago
Julian Simioni	ff5c66a269	fix(test): Use default pelias-config for type tests Without this change, a user's own `pelias.json` customizations can cause the unit tests to fail.	6 years ago
Julian Simioni	d5a0b9fc86	Remove unused library	6 years ago
Julian Simioni	c9f89fee3d	Whitespace	6 years ago
Julian Simioni	76d88b62b4	fix(tests): Sanitizer tests should use faked config Otherwise a user's own pelias.json settings can cause tests to fail	6 years ago
Peter Johnson	4d9ee0053b	feat(libpostal): patch parser output when confused by directionals	6 years ago
Joxit	7d9e3e29fd	feat(findbyid): Add lang query param for placeholder This will reduce network transfer and speed up requests Related: https://github.com/pelias/placeholder/pull/128	6 years ago
Julian Simioni	800eb8ca03	Move lots of logging from info to debug These are log lines that are not really useful in a production context, and just create a lot of noise.	6 years ago
Horace Williams	ed0a96cff2	Migrate to internal iso3166 definitions This is a somewhat roundabout fix to #1179, as a way to deal with the persistent npm ls and commit hook troubles we were dealing with due to dependencies of the iso3166 package. Additionally it should give us a faster definition of these ISO lookups, since the existing approaches were implemented using linear scans through an array rather than map-based lookups.	6 years ago
Horace Williams	c7ae37980d	Remove Quattroshapes deprecation notice Fixes #1132 The Quattroshapes source has been deprecated for some time now, and any users for whom the deprecation notice was helpful have presumably either upgraded or moved on by now.	7 years ago
Julian Simioni	9a0f182fb2	fix(whitespace): Trim whitespace and quotes before checking text length Previously, our text sanitizer code did not trim whitespace before checking that the string was non-empty. This led to strings consisting only of whitespace being treated as valid. Not all our downstream services (such as libpostal) accept whitespace-only input, so this causes a rather harsh error. This PR builds upon the code in https://github.com/pelias/api/pull/1170 and moves the trimming code above the nonEmptyString check. Now, a whitespace-only input string produces the normal error for empty input. Fixes https://github.com/pelias/api/issues/1158	7 years ago
Julian Simioni	435cfd2c4a	Fix linter error This snuck in in https://github.com/pelias/api/pull/1165	7 years ago
Tyler Pedelose	52a66e1bc0	Added new predicates to aid placeholderGeodisambiguationShouldExecute	7 years ago
Peter Johnson	d38d4b1fa8	feat(text_sanitizer): trim whitespace and quotation marks from a range of natural languages	7 years ago
Julian Simioni	561f07950b	Remove focus_selected_layers query view This query extends the standard focus query view with hardcoded layers for which the query applies. The intent was to apply the focus scoring only to non-admin areas, but the list of layers was already out of date, as it was missing streets. The query is fundamentally problematic with custom layers as well.	7 years ago
Julian Simioni	10241047a6	fix(interpolation): Ensure proper sorting of streets with interpolated addresses In cases where several conditions are met, it is possible for results to be returned from the API that are not sorted as they were intended. These conditions are: * over 10 results total were returned from Elasticsearch * the interpolation middleware was called * not all street results end up with possible interpolated address matches, and some of those streets come before other interpolated address records, necessitating a re-sorting of the results in the interpolation middleware In these cases, the ordering of streets as defined by Elasticsearch, such as by linguistic match or distance from a focus point, will no longer be respected in the results. This is because Node.js's `Array.prototype.sort` uses an [unstable QuickSort for arrays of size 11 or greater](https://github.com/nodejs/node/blob/master/deps/v8/src/js/array.js#L670). The solution is to switch to a sorting algorithm that is always stable. This ensures that whatever ordering was specified in the Elasticsearch queries is observed, without any of that logic having to be duplicated, and then possibly conflict. Stable sorting is provided by the [stable](http://npmjs.com/stable) NPM package.	7 years ago
Peter Johnson	e15aa52f63	feat(type_mapping): rename function to avoid confusion with elasticsearch API	7 years ago
missinglink	b1cfd091ed	type_mapping: add unit tests	7 years ago
missinglink	63c962503c	test: improved type_mapping tests	7 years ago
semhul	a0aa5ed7f2	Adding support for unit field in structured search. If libpostal finds it in input use it for searching	7 years ago
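
The bounding-box commits above (76bc5c654d and 1553dfb103) describe validating lat/lon ranges and rejecting boxes where min=max or min>max before the query reaches Elasticsearch. A minimal sketch of that kind of check follows; the function name, field names, and error strings are assumptions made for illustration and are not taken from the pelias/api source.

```javascript
// Illustrative sketch only: validateBoundingBox and its field names are
// hypothetical and are not taken from the pelias/api source.
function validateBoundingBox({ minLat, maxLat, minLon, maxLon }) {
  const errors = [];

  // Each value must be a finite number inside its valid range.
  const checks = [
    ['minLat', minLat, 90],
    ['maxLat', maxLat, 90],
    ['minLon', minLon, 180],
    ['maxLon', maxLon, 180],
  ];
  for (const [name, value, limit] of checks) {
    if (!Number.isFinite(value) || Math.abs(value) > limit) {
      errors.push(`${name} must be a number between -${limit} and ${limit}`);
    }
  }

  // Reject degenerate boxes (min equal to max) as well as inverted ones.
  if (minLat >= maxLat) errors.push('minLat must be less than maxLat');
  if (minLon >= maxLon) errors.push('minLon must be less than maxLon');

  return errors;
}
```

An empty array means the box passed every check; otherwise each entry is a message that could be returned to the caller instead of the raw Elasticsearch error.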

951 Commits (dedupe_improved_parent_matching)