Primarily as a performance optimization, but also to attempt to return
more relevant results, only admin and POI layers were queried when the
text input consisted of only one or two tokens and contained no
numbers. However, as shown in #194, that is a bit too optimistic, mostly
in countries other than the USA.
Fixes #194
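For illustration, the old heuristic amounts to roughly the following. This is a minimal Python sketch, not the actual implementation: the function name and the layer names are assumptions made up for the example.

```python
import re

# Hypothetical layer names, used only for illustration.
ADMIN_AND_POI_LAYERS = ["admin", "poi"]
ALL_LAYERS = ["admin", "poi", "address"]


def layers_for_text(text):
    """Return the layers to query under the old short-input heuristic."""
    tokens = text.split()
    has_number = re.search(r"\d", text) is not None
    if 1 <= len(tokens) <= 2 and not has_number:
        # One or two tokens and no digits: assume a place or region name.
        return ADMIN_AND_POI_LAYERS
    return ALL_LAYERS
```

Under this sketch, any two-token input without a number never reaches the other layers, which is the kind of assumption #194 shows to be too optimistic.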
Not strictly required for this change, but I noticed that lots of unit
tests for the reverse endpoint pass an input parameter. Reverse
doesn't take an input (or text) parameter at all, so these are just
extra, and probably came from a copy/paste in the tests.
This is just in a unit test, so it technically passes, but geonames is
not a valid layer option (geoname, singular, is used instead), so it's
best to correct.
After refactoring, this flag is no longer needed, as all areas of the
code that care about layers do so by setting a key within clean.types,
and then the types helper intelligently combines those together later.
The address parser currently does two things:
1.) make some intelligent guesses as to possible admin regions to
explicitly search against to improve the quality of results returned
2.) make some intelligent guesses as to when no part of the query needs
to search against anything other than admin regions. This somewhat
improves the quality of results returned but mostly improves the speed
of the Elasticsearch query since it's searching significantly fewer
records.
These two concerns are now split into two separate methods within the
query_parser helper module. They are mostly independent today, but don't
have to be in the future.
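The split can be pictured as two independent functions. This is a minimal Python sketch under stated assumptions: the function names, the region table, and the matching rules are all hypothetical and much simpler than what the query_parser module actually does.

```python
# Hypothetical region table; a real parser would use far richer data.
KNOWN_ADMIN_REGIONS = {"ny": "New York", "ca": "California"}


def _parts(text):
    """Split the text on commas and normalise each part."""
    return [p.strip().lower() for p in text.split(",")]


def guess_admin_regions(text):
    """Concern 1: pick out admin regions worth searching against explicitly."""
    return [KNOWN_ADMIN_REGIONS[p] for p in _parts(text) if p in KNOWN_ADMIN_REGIONS]


def is_admin_only_query(text):
    """Concern 2: guess whether only admin regions need to be searched."""
    # Assumption: the query is admin-only when every part is a known region.
    return all(p in KNOWN_ADMIN_REGIONS for p in _parts(text))
```

Because neither function calls the other, either guess can be changed later without touching the other one.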
Modifying these sanitiser tests became extremely hard because almost all
of them looped over lots of individual test cases, which bakes in
assumptions about the common behavior of potentially very different test
cases, and made assertions about huge swaths of output when
only a small amount of that output was really under test.
Hopefully these changes will make our tests easier to modify, and not
really lose any ability to catch bugs.
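As a rough picture of the intended test style, each test below covers one case and asserts only on the field it is about. This is a Python sketch for illustration only: the sanitize function, its fields, and the default size are hypothetical stand-ins, not the real sanitisers.

```python
DEFAULT_SIZE = 10  # hypothetical default, for illustration only


def sanitize(raw):
    """Hypothetical stand-in for a sanitiser: returns a cleaned parameter dict."""
    size = raw.get("size", DEFAULT_SIZE)
    return {
        "text": raw.get("text", "").strip(),
        "size": size if isinstance(size, int) and size > 0 else DEFAULT_SIZE,
    }


def test_text_is_trimmed():
    clean = sanitize({"text": "  london  "})
    # Only the field under test is asserted; the rest of `clean` is ignored.
    assert clean["text"] == "london"


def test_invalid_size_falls_back_to_default():
    clean = sanitize({"text": "london", "size": -5})
    assert clean["size"] == DEFAULT_SIZE
```

A change to how size is handled then only touches the size test, instead of a shared loop that compares the whole output object.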
This code doesn't seem like it will be triggered very often (because it
compares space-delimited words with comma-delimited words from the text
field), and it also has the potential to cause quite a bit of weird
behavior.
This moves the list of types created by sanitising the layer API
parameter from clean.layers to clean.types.from_layers. In subsequent
commits, types created from address parsing and from the
yet-to-be-implemented source parameter will also live in the clean.types
object.
This will allow moving the logic that sets cmd.type out of controllers and
into separate logic that can be a little smarter. Also, it will no
longer require the clean.default_layers_set flag to be passed all around
like a nasty global variable.
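To make the shape of clean.types concrete, here is a minimal Python sketch. Only from_layers comes from this commit; the second key, the type names other than geoname, and the intersection rule in combine_types are assumptions chosen for illustration, not the actual combining logic.

```python
def combine_types(types):
    """Hypothetical combiner: intersect every non-empty list in clean.types."""
    lists = [set(v) for v in types.values() if v]
    if not lists:
        return []
    return sorted(set.intersection(*lists))


clean = {
    "types": {
        # Produced by sanitising the layers API parameter.
        "from_layers": ["geoname", "type_a", "type_b"],
        # Hypothetical key standing in for types from address parsing.
        "from_address_parser": ["type_a", "type_b", "type_c"],
    }
}

# One place decides the final type list, so no flag has to be passed around.
cmd_type = combine_types(clean["types"])
```

Each producer only writes its own key, and the combining step is the single place that reads them all.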