From 5cd3e61d2051a5adb33e59cfdcf993a5eaa4f199 Mon Sep 17 00:00:00 2001
From: Julian Simioni
Date: Thu, 29 Sep 2016 11:33:47 -0400
Subject: [PATCH 1/6] Update config and running instructions for importers

More of them behave in the same way now, hooray!
---
 installing.md | 22 ++++++----------------
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/installing.md b/installing.md
index 1c3723a..72d64f5 100644
--- a/installing.md
+++ b/installing.md
@@ -221,6 +221,7 @@ The other major section, `imports`, defines settings for each importer. The defa
 },
 "openaddresses": {
 "datapath": "/mnt/pelias/openaddresses",
+"adminLookup": false,
 "files": []
 },
 "whosonfirst": {
@@ -233,16 +234,6 @@ The other major section, `imports`, defines settings for each importer. The defa
 As you can see, the default datapaths are meant to be changed. This is also where you can enable admin lookup by overriding the default value.
-Two caveats to this config section. First, the array structure of the OpenStreetMap `import` section
-suggests you can specify multiple files to import. Unfortunately, you can't, although we'd like to
-[support that in the future](https://github.com/pelias/openstreetmap/issues/55).
-Second, note that the OpenAddresses section does _not_ have an `adminLookup` flag. The OpenAddresses
-importer only supports controlling this option by a command line flag currently. Again this is
-something [we'd like to fix](https://github.com/pelias/openaddresses/issues/51). See the importer
-[readme](https://github.com/pelias/openaddresses/blob/master/README.md) for details on how to
-configure admin lookup and deduplication for OpenAddresses.
-
 ### Install Elasticsearch
 Other than requiring Elasticsearch 2.3, nothing special in the Elasticsearch setup is required for
@@ -292,13 +283,12 @@ reindex all your data after making schema changes.
 Now that the schema is set up, you're ready to begin importing data!
 Our [goal](https://github.com/pelias/pelias/issues/255) is that eventually you'll be able to run all
-the importers with simply `cd $importer_directory; npm start`. Unfortunately only the Who's on First
-and OpenStreetMap importers works that way right now.
+the importers with simply `cd $importer_directory; npm start`. We are now really close, and all but
+one importer follows this pattern!
-For [Geonames](https://github.com/pelias/geonames/) and [OpenAddresses](https://github.com/pelias/openaddresses),
-please see their respective READMEs, which detail the process of running them. By the way, we'd
-love to see pull requests that allow them to read configuration from `pelias.json` like the other
-importers.
+That importer is the [Geonames](https://github.com/pelias/geonames/) importer, please see its README file
+for the most up to date instructions. By the way, we'd love to see a pull request to allow it to
+read configuration from `pelias.json` like the other impoters.
 Depending on how much data you've imported, now may be a good time to grab a coffee. Without admin lookup, the fastest speeds you'll see are around 10,000 records per second. With admin lookup,

From e5bb770a6ff090d3a87d8baae4ade441c01dd382 Mon Sep 17 00:00:00 2001
From: Julian Simioni
Date: Thu, 29 Sep 2016 11:38:52 -0400
Subject: [PATCH 2/6] Update info on Metro Extracts
---
 installing.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/installing.md b/installing.md
index 72d64f5..5ae9f77 100644
--- a/installing.md
+++ b/installing.md
@@ -71,9 +71,9 @@ directory, or only selected files.
 OpenStreetMap has a nearly limitless array of download options, and any of them should work as long as they're in [PBF](http://wiki.openstreetmap.org/wiki/PBF_Format) format. Generally the files will have the extension `.osm.pbf`.
 Good sources include the [Mapzen Metro Extracts](https://mapzen.com/data/metro-extracts/)
-(feel free to submit pull requests for additional cities or regions if needed), and planet files
-listed on the [OSM wiki](http://wiki.openstreetmap.org/wiki/Planet.osm). A full planet PBF is about
-36GB.
+(which has popular cities available immediately, or custom areas that take only
+a few minutes to build), and planet files listed on the [OSM wiki](http://wiki.openstreetmap.org/wiki/Planet.osm).
+A full planet PBF file is about 36GB.
 ## Choose your import settings

From 9fa2ba6f0547453450e0c4cfd51cb5708380a0bc Mon Sep 17 00:00:00 2001
From: Julian Simioni
Date: Thu, 29 Sep 2016 11:39:04 -0400
Subject: [PATCH 3/6] Clarify size of OA data

The compressed file is small but it extracts to be quite large
---
 installing.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/installing.md b/installing.md
index 5ae9f77..441796e 100644
--- a/installing.md
+++ b/installing.md
@@ -61,10 +61,10 @@ instructions for downloading Geonames data automatically. Individual countries,
 ### OpenAddresses
 The OpenAddresses project includes [numerous download options](https://results.openaddresses.io/),
-all of which are `.zip` downloads. The full dataset is just over 6 gigabytes compressed, but there
-are numerous subdivision options. In any case, the `.zip` files simply need to be extracted to a
-directory of your choice, and Pelias can be configured to either import every `.csv` in that
-directory, or only selected files.
+all of which are `.zip` downloads. The full dataset is just over 6 gigabytes compressed (the
+extracted files are around 30GB), but there are numerous subdivision options. In any case, the
+`.zip` files simply need to be extracted to a directory of your choice, and Pelias can be configured
+to either import every `.csv` in that directory, or only selected files.
 ### OpenStreetMap

From 0c8caea90111443960064c3ae2f1bcbf87fdb26d Mon Sep 17 00:00:00 2001
From: Katie Kowalsky
Date: Fri, 30 Sep 2016 11:46:10 -0700
Subject: [PATCH 4/6] matching to style guide
---
 installing.md | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/installing.md b/installing.md
index 441796e..58f86a2 100644
--- a/installing.md
+++ b/installing.md
@@ -280,15 +280,13 @@ reindex all your data after making schema changes.
 ### Run the importers
-Now that the schema is set up, you're ready to begin importing data!
+Now that the schema is set up, you're ready to begin importing data.
 Our [goal](https://github.com/pelias/pelias/issues/255) is that eventually you'll be able to run all
-the importers with simply `cd $importer_directory; npm start`. We are now really close, and all but
-one importer follows this pattern!
+the importers with simply `cd $importer_directory; npm start`.
 That importer is the [Geonames](https://github.com/pelias/geonames/) importer, please see its README file
-for the most up to date instructions. By the way, we'd love to see a pull request to allow it to
-read configuration from `pelias.json` like the other impoters.
+for the most up to date instructions.
 Depending on how much data you've imported, now may be a good time to grab a coffee. Without admin lookup, the fastest speeds you'll see are around 10,000 records per second. With admin lookup,

From 5a6fcb70f2508495cfc91bd95e2436890518e803 Mon Sep 17 00:00:00 2001
From: Julian Simioni
Date: Mon, 3 Oct 2016 12:17:09 -0400
Subject: [PATCH 5/6] Rewrite importing section

This should follow the style guide but also give info about the different importers.
---
 installing.md | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/installing.md b/installing.md
index 58f86a2..218d323 100644
--- a/installing.md
+++ b/installing.md
@@ -282,11 +282,17 @@ reindex all your data after making schema changes.
 Now that the schema is set up, you're ready to begin importing data.
-Our [goal](https://github.com/pelias/pelias/issues/255) is that eventually you'll be able to run all
-the importers with simply `cd $importer_directory; npm start`.
+For all importers except for Geonames, you can start the import process with the `npm start`
+command:
-That importer is the [Geonames](https://github.com/pelias/geonames/) importer, please see its README file
-for the most up to date instructions.
+```bash
+cd $importer_directory; npm start
+```
+
+For the [Geonames](https://github.com/pelias/geonames/) importer, please see its
+[README](https://github.com/pelias/geonames/blob/master/README.md) file for the most up to date
+instructions. We are working towards making all the importers have [the same interface](https://github.com/pelias/pelias/issues/255),
+so hopefully you won't have to do anything special for Geonames soon.
 Depending on how much data you've imported, now may be a good time to grab a coffee. Without admin lookup, the fastest speeds you'll see are around 10,000 records per second. With admin lookup,

From a07f5922dc5ebe29e648a5ed071cfc1b24f431a2 Mon Sep 17 00:00:00 2001
From: Julian Simioni
Date: Mon, 3 Oct 2016 14:18:51 -0400
Subject: [PATCH 6/6] Even better wording for Geonames importer
---
 installing.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/installing.md b/installing.md
index 218d323..643acd6 100644
--- a/installing.md
+++ b/installing.md
@@ -292,7 +292,7 @@ cd $importer_directory; npm start
 For the [Geonames](https://github.com/pelias/geonames/) importer, please see its
 [README](https://github.com/pelias/geonames/blob/master/README.md) file for the most up to date
 instructions. We are working towards making all the importers have [the same interface](https://github.com/pelias/pelias/issues/255),
-so hopefully you won't have to do anything special for Geonames soon.
+so the Geonames importer will behave the same as the others soon.
Depending on how much data you've imported, now may be a good time to grab a coffee. Without admin lookup, the fastest speeds you'll see are around 10,000 records per second. With admin lookup,