# Considerations for full-planet builds

Pelias is designed to work with data ranging from a small city to the entire planet. Small cities do not require particularly significant resources and should be easy. However, full-planet builds present many challenges of their own.

Current [full planet builds](https://pelias-dashboard.geocode.earth) weigh in at around 550 million documents, and require about 375GB of total storage in Elasticsearch.

Fortunately, thanks to services like AWS and the scalability of Elasticsearch, full-planet builds are possible without too much extra effort. The process is no different from a smaller build; it just requires more hardware and takes longer.

To set expectations: a cluster of 4 [r4.xlarge](https://aws.amazon.com/ec2/instance-types/) AWS instances (30GB RAM each) running Elasticsearch, plus one m4.4xlarge instance running the importers and the PIP service, can complete a full-planet build in about two days.

## Recommended processes

### Use Docker containers and orchestration

We strongly recommend using Docker to run Pelias. All our services include Dockerfiles, and the resulting images are pushed to [Docker Hub](https://hub.docker.com/r/pelias/) by our CI. Using these images will drastically reduce the amount of work it takes to set up Pelias and will ensure you are on a known-good configuration, minimizing the number of issues you encounter.

Additionally, there are many great tools for managing container workloads. Simple ones like [docker-compose](https://github.com/pelias/docker/) can be used for small installations, and more complex tools like [Kubernetes](https://github.com/pelias/kubernetes) can be great for larger installations. Pelias is extensively tested on both.

### Use separate Pelias installations for indexing and production traffic

The requirements for a performant and reliable Elasticsearch cluster are very different when importing new data compared to serving queries. It is _highly_ recommended to use one cluster to do imports, save the resulting Elasticsearch index into a snapshot, and then load that snapshot into the cluster used to perform actual geocoding.
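As a sketch of what this flow looks like using the Elasticsearch snapshot API (the repository name, snapshot name, and shared filesystem location here are all hypothetical, and the index is assumed to use the default `pelias` name):

```bash
# on the importing cluster: register a shared-filesystem snapshot repository
curl -XPUT 'http://localhost:9200/_snapshot/pelias_backup' -d '{
  "type": "fs",
  "settings": { "location": "/mnt/backups/pelias" }
}'

# snapshot the freshly built index
curl -XPUT 'http://localhost:9200/_snapshot/pelias_backup/full_planet?wait_for_completion=true'

# on the query cluster (with the same repository registered): restore it
curl -XPOST 'http://localhost:9200/_snapshot/pelias_backup/full_planet/_restore'
```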
### Shard count

Historically, Mapzen Search used 24 Elasticsearch shards for its builds. However, the latest guidance from the Elasticsearch team is that shards should be no larger than 50GB, but that otherwise having as few shards as possible is best. At [geocode.earth](https://geocode.earth) we are experimenting with 12-shard builds, and may eventually move to 6. We would appreciate performance feedback from anyone doing large builds.

The `elasticsearch` section of `pelias.json` can be used to configure the shard count:

```js
{
  "elasticsearch": {
    "settings": {
      "index": {
        "number_of_shards": "5"
      }
    }
  }
}
```

### Force merge your Elasticsearch indices

Pelias Elasticsearch indices are generally static, as we do not recommend querying from and importing to an Elasticsearch cluster simultaneously. In such cases, the highest levels of performance can be achieved by [force-merging](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-forcemerge.html) the Elasticsearch index.
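For example, a force merge down to a single segment per shard after an import has finished might look like this (assuming the index is named `pelias` and Elasticsearch is reachable on `localhost:9200`):

```bash
# merge each shard down to one segment; this can take a long time on a full-planet index
curl -XPOST 'http://localhost:9200/pelias/_forcemerge?max_num_segments=1'
```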
## Recommended hardware

For a production-ready instance of Pelias, capable of supporting a few hundred queries per second across a full-planet build, a setup like the following should be sufficient.

### Elasticsearch cluster for importing

The main requirement for the Elasticsearch cluster is plenty of disk: 400GB across the cluster is a good minimum. Increased CPU power is useful for achieving higher query throughput, but is not as important as RAM.

### Elasticsearch cluster for querying

For queries, essentially the only bottleneck is CPU, although more RAM is helpful so Elasticsearch data can be cached. On AWS, `c5` instances are significantly more performant than even the `c4` instances, and should be used if high performance is needed.

_Example configuration:_ 4 `c5.4xlarge` (16 CPU, 32GB RAM) to serve 250 RPS

### Importer machine

The importers are each single-threaded Node.js processes, and each requires around 8GB of RAM with admin lookup enabled. Faster CPUs will help increase the import speed. Running multiple importers in parallel is recommended if the importer machine has enough RAM and CPU to support them.

_Example configuration:_ 1 `c4.4xlarge` (16 CPU, 30GB RAM), running two parallel importers

### Pelias services

Each Pelias service has different memory and CPU requirements. Here are some rough guidelines:

#### API
RAM: 200MB per instance
CPU: Single-threaded; one instance can serve around 500 RPS
Disk: None

#### Placeholder
RAM: 200MB per instance
CPU: Single-threaded; supports [clustering](https://nodejs.org/api/cluster.html)
Disk: Requires about 2GB for a full-planet index

#### Libpostal
RAM: 3GB per instance
CPU: Multi-threaded, but extremely fast; a single core can serve 8000+ RPS
Disk: About 2-3GB of data storage required

#### PIP
RAM: ~6GB
CPU: 2 cores per instance recommended, which is enough to serve 5000-7000 RPS

#### Interpolation
RAM: 3GB per instance currently (please follow our efforts to [un-bundle libpostal](https://github.com/pelias/interpolation/issues/106) from the interpolation service)
CPU: Single core; one instance can serve around 200 RPS
Disk: 40GB needed for a full-planet interpolation dataset
## Getting started with Pelias

Looking to install and set up Pelias? You've come to the right place. We have several different tools and pieces of documentation to help you.

### Installing for the first time?

We _strongly_ recommend using our [Docker](http://github.com/pelias/docker/) based installation for your first install. It removes the need to deal with most of the complexity and dependencies of Pelias. On a fast internet connection you should be able to get a small city like Portland, Oregon installed in under 30 minutes.

### Want to go more in depth?

The Pelias Docker installation should work great for any small area, and is great for managing the different Pelias services during development. However, we understand that not everyone can or wants to use Docker, and that people want more details on how things work.

For this, we have our [from scratch installation guide](pelias_from_scratch.md).

### Installing in production?

By far the most well-tested way to install Pelias is to use [Kubernetes](https://github.com/pelias/kubernetes). Kubernetes is perfect for managing systems that have many different components, like Pelias.

We would love to add additional, well-tested ways to install Pelias in production. Reach out to us if you have something to share or want to get started.

### Doing a full planet build?

Running Pelias for a city or small country is pretty easy. However, due to the amount of data involved, a full planet build is harder to pull off.

See our [full planet build guide](full_planet_considerations.md) for some recommendations on how to make it easier and more performant.
# Installing Pelias from Scratch

These instructions will help you set up the Pelias geocoder from scratch. We strongly recommend using our [Docker](http://github.com/pelias/docker/) tools for your first Pelias installation.

However, for more in-depth usage, or to learn more about the internals of Pelias, use this guide.

It assumes some knowledge of the command line and Node.js, but we'd like as many people as possible to be able to install Pelias, so if anything is confusing, please don't hesitate to reach out. We'll do what we can to help and also improve the documentation.

## Installation Overview

These are the steps for fully installing Pelias:

1. [Check that the hardware and software requirements are met](#system-requirements)
1. [Decide which datasets to use and download them](#choose-your-datasets)
1. [Download the Pelias code](#download-the-pelias-repositories)
1. [Customize the Pelias configuration file `~/pelias.json`](#customize-pelias-config)
1. [Install the Elasticsearch schema using pelias-schema](#set-up-the-elasticsearch-schema)
1. [Use one or more importers to load data into Elasticsearch](#run-the-importers)
1. [Install and start the Pelias services](#install-and-start-the-pelias-services)
1. [Start the API server to begin handling queries](#start-the-api)

## System Requirements

See our [software requirements](requirements.md) and ensure all of them are installed before moving forward.

### Hardware recommendations

* At a minimum, 50GB of disk space to download, extract, and process data
* Lots of RAM; 8GB is a good minimum for a small import like a single city or small country. A full North America OSM import just fits in 16GB of RAM

## Choose your datasets

Pelias can currently import data from [four different sources](data-sources.md), using five different importers.

Only one dataset is _required_: [Who's on First](https://whosonfirst.org/). This dataset is used to enrich all data imported into Pelias with [administrative information](glossary.md). For more on this process, see the [wof-admin-lookup](https://github.com/pelias/wof-admin-lookup) documentation.

**Note:** You don't have to run the `whosonfirst` importer, but you do have to have Who's on First data available on disk for use by the other importers.

Here's an overview of how to download each dataset.

### Who's on First

The [Who's on First](https://github.com/pelias/whosonfirst#downloading-the-data) importer can download all the Who's on First data quickly and easily.
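At the time of writing, the importer's README (linked above) documents a download script; assuming that's available, fetching the data can be as simple as:

```bash
cd whosonfirst     # the importer repository, cloned later in this guide
npm install
npm run download   # downloads Who's on First data to imports.whosonfirst.datapath
```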
### Geonames

The [pelias/geonames](https://github.com/pelias/geonames/#installation) importer contains code and instructions for downloading Geonames data automatically. Individual countries, or the entire planet (1.3GB compressed), can be specified.

### OpenAddresses

The Pelias OpenAddresses importer can [download specific files from OpenAddresses](https://github.com/pelias/openaddresses/#data-download).

Additionally, the [OpenAddresses](https://results.openaddresses.io/) project includes numerous download options, all of which are `.zip` downloads. The full dataset is just over 6GB compressed (the extracted files are around 30GB), but there are numerous subdivision options.

### OpenStreetMap

OpenStreetMap (OSM) has a nearly limitless array of download options, and any of them should work as long as they're in [PBF](http://wiki.openstreetmap.org/wiki/PBF_Format) format. Generally the files will have the extension `.osm.pbf`. Good sources include [download.geofabrik.de](http://download.geofabrik.de/), [Nextzen Metro Extracts](https://metro-extracts.nextzen.org/), [Interline OSM Extracts](https://www.interline.io/osm/extracts/), and planet files listed on the [OSM wiki](http://wiki.openstreetmap.org/wiki/Planet.osm). A full planet PBF file is about 41GB.
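For example, downloading a single-region Geofabrik extract into the datapath used later in this guide might look like this (the region and target directory are just illustrations):

```bash
mkdir -p /mnt/pelias/openstreetmap
wget http://download.geofabrik.de/north-america/us/oregon-latest.osm.pbf \
  -O /mnt/pelias/openstreetmap/extract.osm.pbf
```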
#### Street Data (Polylines)

To download and import [street data](https://github.com/pelias/polylines#download-data) from OSM, a separate importer is used that operates on a preprocessed dataset derived from the OSM planet file.

## Installation

### Download the Pelias repositories

At a minimum, you'll need

1. [Pelias schema](https://github.com/pelias/schema/)
2. The Pelias [API](https://github.com/pelias/api/) and other Pelias services
3. Importer(s)

Here's a bash snippet that will download all the repositories (they are all small enough that you don't have to worry about the space of the code itself), check out the production branch (which is probably the one you want), and install all the Node module dependencies:

```bash
for repository in schema whosonfirst geonames openaddresses openstreetmap polylines api placeholder \
                  interpolation pip-service; do
  git clone https://github.com/pelias/${repository}.git # clone from GitHub
  pushd $repository > /dev/null                         # switch into the repository directory
  git checkout production                               # or remove this line to stay on master
  npm install                                           # install npm dependencies
  popd > /dev/null                                      # return to the code directory
done
```

<details>
<summary>Not sure which branch to use?</summary>

Pelias uses three different branches as part of our release process.

`production` **(recommended)**: contains only code that has been well tested, generally against a full-planet build. This is the "safest" branch and it will change the least frequently, although we generally release new code at least once a week.

`staging`: these branches contain the code that is currently being tested against a full planet build for imminent release. It's useful for tracking what code will be going out in the next release, but not much else.

`master`: master branches contain the latest code that has passed code review and unit/integration tests, and is reasonably functional. While we try to avoid it, the nature of the master branch is that it will sometimes be broken. That said, these are the branches to use for development of new features.
</details>

### Customize Pelias Config

Nearly all configuration for Pelias is driven through a single config file: `pelias.json`. By default, Pelias will look for this file in your home directory, but you can configure where it looks. For more details, see the [pelias-config](https://github.com/pelias/config) repository.
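For example, pelias-config supports a `PELIAS_CONFIG` environment variable for pointing at a config file elsewhere (the path here is just an illustration):

```bash
# point a Pelias process at a shared config file instead of ~/pelias.json
PELIAS_CONFIG=/etc/pelias/pelias.json npm start
```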
#### Where on the network to find Elasticsearch

Pelias will by default look for Elasticsearch on `localhost` at port 9200 (the standard Elasticsearch port).

Take a look at the [default config](https://github.com/pelias/config/blob/master/config/defaults.json#L2). You can see the Elasticsearch configuration looks something like this:

```js
{
  "esclient": {
    "hosts": [{
      "host": "localhost",
      "port": 9200
    }]
  },
  ... // rest of config
}
```

If you want to connect to Elasticsearch somewhere else, change `localhost` as needed. You can specify multiple hosts if you have a large cluster. In fact, the entire `esclient` section of the config is sent along to the [elasticsearch-js](https://github.com/elastic/elasticsearch-js) module, so any of its [configuration options](https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/configuration.html) are valid.
|
|
||||||
|
#### Where to find the downloaded data files |
||||||
|
The other major section, `imports`, defines settings for each importer. `adminLookup` has it's own section and its value applies to all importers. The defaults look like this: |
||||||
|
|
||||||
|
```json |
||||||
|
{ |
||||||
|
"imports": { |
||||||
|
"adminLookup": { |
||||||
|
"enabled": true |
||||||
|
}, |
||||||
|
"geonames": { |
||||||
|
"datapath": "/mnt/pelias/geonames", |
||||||
|
}, |
||||||
|
"openstreetmap": { |
||||||
|
"datapath": "/mnt/pelias/openstreetmap", |
||||||
|
"leveldbpath": "/tmp", |
||||||
|
"import": [{ |
||||||
|
"filename": "planet.osm.pbf" |
||||||
|
}] |
||||||
|
}, |
||||||
|
"openaddresses": { |
||||||
|
"datapath": "/mnt/pelias/openaddresses", |
||||||
|
"files": [] |
||||||
|
}, |
||||||
|
"whosonfirst": { |
||||||
|
"datapath": "/mnt/pelias/whosonfirst" |
||||||
|
}, |
||||||
|
"polyline": { |
||||||
|
"datapath": "/mnt/pelias/polyline", |
||||||
|
"files": [] |
||||||
|
} |
||||||
|
} |
||||||
|
} |
||||||
|
``` |
||||||
|
|
||||||
|
Note: The datapath must be an _absolute path._ |
||||||
|
As you can see, the default datapaths are meant to be changed. |
||||||
|
|
||||||
|
### Install Elasticsearch |
||||||
|
|
||||||
|
Please refer to the [official 2.4 install docs](https://www.elastic.co/guide/en/elasticsearch/reference/2.4/setup.html) for how to install Elasticsearch. |
||||||
|
|
||||||
|
Be sure to modify the Elasticsearch [heap size](https://www.elastic.co/guide/en/elasticsearch/guide/2.x/heap-sizing.html) as appropriate to your machine. |
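On Elasticsearch 2.x this is most easily done with the `ES_HEAP_SIZE` environment variable; the value below is only an example (a common rule of thumb is half the machine's RAM, but no more than about 30GB):

```bash
# give Elasticsearch a 16GB heap, then start it
export ES_HEAP_SIZE=16g
./bin/elasticsearch
```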
Make sure Elasticsearch is running and connectable, and then you can continue with the Pelias-specific setup and importing. Using a plugin like [Sense](https://github.com/bleskes/sense) [(Chrome extension)](https://chrome.google.com/webstore/detail/sense-beta/lhjgkmllcaadmopgmanpapmpjgmfcfig?hl=en), [head](https://mobz.github.io/elasticsearch-head/), or [Marvel](https://www.elastic.co/products/marvel) can help monitor Elasticsearch as you import data.

### Set up the Elasticsearch Schema

Pelias requires specific configuration settings for both performance and accuracy reasons. Fortunately, now that your `pelias.json` file is configured with how to connect to Elasticsearch, the schema repository can automatically create the Pelias index and configure it exactly as needed:

```bash
cd schema # assuming you have just run the bash snippet to download the repos from earlier
node scripts/create_index.js
```

The Elasticsearch schema is analogous to the layout of a table in a traditional relational database, like MySQL or PostgreSQL. While Elasticsearch attempts to auto-detect a schema that works when inserting new data, this generally leads to non-optimal results. In the case of Pelias, inserting data without first applying the Pelias schema will cause all queries to fail completely.
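To sanity-check that the schema was applied before you start importing, you can inspect the index directly (this assumes the default index name, `pelias`):

```bash
# should return the Pelias mapping, not a 404 or an empty auto-detected mapping
curl 'http://localhost:9200/pelias/_mapping'
```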
### Run the importers

Now that the schema is set up, you're ready to begin importing data.

For each importer, you can start the import process with the `npm start` command:

```bash
cd importer_directory; npm start
```

Depending on how much data you've imported, now may be a good time to grab a coffee. You can expect around 800-2000 inserts per second.

The order of imports does not matter, and multiple importers can be run in parallel to speed up the setup process. Each of our importers operates independently of the data that is already in Elasticsearch; for example, you can import OSM data without importing WOF data first.
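As a sketch, assuming you've downloaded data for these importers and the machine has RAM to spare, running several in parallel from the directory containing the repositories might look like:

```bash
# run three importers concurrently and wait for all of them to finish
for importer in whosonfirst openaddresses openstreetmap; do
  (cd $importer && npm start) &
done
wait
```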
#### Aside: When to delete the data already in Elasticsearch

If you have previously run a build and are looking to start another one, it is generally a good idea to delete the existing Pelias index and re-create it. Here's how:

```bash
# !! WARNING: this will remove all your data from Pelias !!
node scripts/drop_index.js   # it will ask for confirmation first
node scripts/create_index.js
```

When is this necessary? Here's a guideline: when in doubt, delete the index, re-create it, and start fresh.

This is because Elasticsearch has no analog to the schema migrations of a relational database, and all the importers start over when re-run.

The only time this isn't necessary is when both of the following conditions are true:

1. You are trying to re-import the exact same data again (for example, because the build failed, or you are testing changes to an importer)
2. The Pelias schema has not changed

## Install and start the Pelias Services

Pelias is made up of several different services, each providing a specific aspect of Pelias's functionality.

The [list of Pelias services](services.md) describes the functionality of each service, and can be used to determine whether you need to install that service. It also includes links to setup instructions for each service.

When in doubt, install everything except the interpolation engine (it requires a long download or build process).

### Configure `pelias.json` for services

The Pelias API needs to know about each of the other services available to it. Once again, this is configured in `pelias.json`. The following section will tell the API to use all services running locally and on their default ports:

```js
{
  "api": {
    "services": {
      "placeholder": {
        "url": "http://localhost:3000"
      },
      "libpostal": {
        "url": "http://localhost:8080"
      },
      "pip": {
        "url": "http://localhost:3102"
      },
      "interpolation": {
        "url": "http://localhost:3000"
      }
    }
  }
}
```

### Start the API

Now that the API knows how to connect to Elasticsearch and all other Pelias services, all that is required to start the API is:

```bash
npm start
```

## Geocode with Pelias

Pelias should now be up and running and will respond to your queries.

For a quick check, a request to `http://localhost:3100` should display a link to the documentation for handy reference.

*Here are some queries to try:*

[http://localhost:3100/v1/search?text=london](http://localhost:3100/v1/search?text=london): a search for the city of London.

[http://localhost:3100/v1/autocomplete?text=londo](http://localhost:3100/v1/autocomplete?text=londo): another query for London, but using the autocomplete endpoint, which supports partial matches and is intended to be sent queries as a user types (note the query is for `londo`, but London is returned).

[http://localhost:3100/v1/reverse?point.lon=-73.986027&point.lat=40.748517](http://localhost:3100/v1/reverse?point.lon=-73.986027&point.lat=40.748517): a reverse geocode for results near the Empire State Building in New York City.
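The same queries work from the command line; for example, using `curl` (and `jq`, if you have it installed) to pull out just the top result's label:

```bash
curl -s 'http://localhost:3100/v1/search?text=london' | jq '.features[0].properties.label'
```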
For information on everything Pelias can do, see our [documentation index](README.md).

Happy geocoding!
# Pelias Software requirements

This is the list of all software requirements for Pelias. We highly recommend using our [Docker images](https://hub.docker.com/r/pelias/) to avoid having to even attempt to correctly install all our dependencies yourself.

## Node.js

Version 6 or newer

Most Pelias code is written in Node.js. Node.js 8 is recommended. Node.js 10 is not yet as well tested with Pelias, but should offer notable performance increases and may become the recommendation soon.

We will probably drop support for Node.js 6 in the near future, so that we can use the many features supported only in version 8 and above.

## Elasticsearch

Version 2.3 or 2.4

The core data storage for Pelias is Elasticsearch. We recommend the latest release in the 2.4 line.

We do not _yet_ support Elasticsearch 5 or 6, but work is [ongoing](https://github.com/pelias/pelias/issues/461).

## SQLite

Version 3.11 or newer

Some components of Pelias need a relational database, and Elasticsearch does not provide good relational support. We use SQLite in these cases since it's simple to manage and quite performant.

## Libpostal

Pelias relies heavily on the [Libpostal](https://github.com/openvenues/libpostal#installation) address parser. Libpostal requires about 4GB of disk space to download all the required data.

## Windows Support

Pelias is not well tested on Windows, but we do wish to support it, and will accept patches to fix any issues with Windows support.
# Pelias services

A running Pelias installation is composed of several different services. Each service is well suited to a particular task.

## Service Use Cases

Here's a list of which services provide which features in Pelias. If you don't need everything Pelias does, you may be able to get by without installing and running all the Pelias services.

| Service | /v1/search | /v1/autocomplete | /v1/reverse | /v1/reverse (coarse) | Multiple language support (any endpoint) |
| ------ | ----- | ----- | --------- | ------- | ----- |
| API | **required** | **required** | **required** | **required** | **required** |
| Placeholder | **required** | | | | **required** |
| Libpostal | **required** | | | | |
| PIP | | | recommended | **required** | |
| Interpolation | optional | | | | |

## Descriptions

### [API](https://github.com/pelias/api)

This is the core of Pelias. It talks to all other services (if available) and Elasticsearch, and provides the interface for all queries to Pelias.

### [Placeholder](https://github.com/pelias/placeholder)

Placeholder is used specifically to handle the relational component of geocoding. Placeholder understands, for example, that Paris is a city in a country called France, but that there is another city called Paris in the state of Texas, USA.

Placeholder also stores the translations of administrative areas in multiple languages. It is therefore required if any support for multiple languages is desired.

Currently, Placeholder is used only for forward geocoding on the `/v1/search` endpoint. In the future, it will also be used for autocomplete.

### Libpostal

Libpostal is a library that provides an address parser using a statistical natural language processing model trained on OpenStreetMap, OpenAddresses, and other open data. It is quite good at parsing fully specified input, but cannot handle autocomplete very well.

The data required for Libpostal to run is around 3GB, and has to be loaded into memory, so this service is fairly expensive to run, even for small installations.

Unlike the other Pelias services, we didn't actually write a Pelias Libpostal service. We recommend using the [go-whosonfirst-libpostal](https://github.com/whosonfirst/go-whosonfirst-libpostal) service created by the [Who's on First](https://whosonfirst.org) team.

### [Point-in-Polygon (PIP)](https://github.com/pelias/pip-service)

The PIP service loads polygon data representing the boundaries of cities, states, regions, countries, etc. into memory, and can perform calculations on that geometric data. It's used to determine whether a given point lies in a particular polygon, and is thus highly recommended for reverse geocoding.

### [Interpolation](https://github.com/pelias/interpolation)

The interpolation service combines street geometries with known addresses and address ranges, to allow estimating the position of addresses that might exist but aren't in existing open data sources. It is only used by the `/v1/search` endpoint, but [autocomplete support may be added in the future](https://github.com/pelias/interpolation/issues/131).