Added instructions on installing psycopg2 using psycopg2-binary #239
```
git clone https://github.com/hotosm/raw-data-api.git && cd raw-data-api/backend
```

## Installation Guide

### psycopg2

The easiest way to install psycopg2 is via the pre-built psycopg2-binary package:

```bash
pip install psycopg2-binary
```

Alternatively, refer to the [official psycopg2 installation documentation](https://www.psycopg.org/docs/install.html#build-prerequisites), which details the build prerequisites and steps for building from source.
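
Either route should leave an importable `psycopg2` module. A quick, non-destructive way to verify the install (a sketch; it only prints a hint if the module is missing):

```shell
# Print psycopg2's version if it imports cleanly, otherwise suggest the fix.
# Works the same whether psycopg2 or psycopg2-binary was installed.
python3 -c 'import psycopg2; print(psycopg2.__version__)' 2>/dev/null \
  || echo "psycopg2 not installed - run: pip install psycopg2-binary"
```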

## Install Requirements

Install [psycopg2](https://pypi.org/project/psycopg2/), [osmium](https://pypi.org/project/osmium/), [dateutil](https://pypi.org/project/python-dateutil/), and wget in your Python environment. You can also install them all from `requirements.txt`:

```
pip install -r requirements.txt
```
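
A small sanity check for the dependencies above (a sketch; it prints a hint for anything missing rather than failing):

```shell
# Confirm the Python deps import and the wget binary is on PATH.
python3 -c 'import osmium, dateutil; print("python deps OK")' 2>/dev/null \
  || echo "missing python deps - run: pip install -r requirements.txt"
command -v wget >/dev/null 2>&1 \
  && echo "wget OK" \
  || echo "wget missing - install it with your package manager"
```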

- Start the Process

You can either export your DB parameters as environment variables, pass them to the script as options, or put them in a `.env` file and run `source .env`:

```
export PGHOST=localhost
export PGPORT=5432
export PGUSER=admin
export PGPASSWORD=admin
export PGDATABASE=postgres
```
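
The `.env` route can be sketched like this (example credentials only; substitute your own):

```shell
# Write the connection parameters to a reusable .env file.
cat > .env <<'EOF'
export PGHOST=localhost
export PGPORT=5432
export PGUSER=admin
export PGPASSWORD=admin
export PGDATABASE=postgres
EOF

# Load them into the current shell before running the script.
source .env
echo "connecting to $PGUSER@$PGHOST:$PGPORT/$PGDATABASE"
```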

Choose your source for the Import

> You can download the planet PBF file [here](https://planet.osm.org/pbf/), use a Geofabrik PBF file with full metadata [here](https://osm-internal.download.geofabrik.de/index.html) (tested with .pbf files), or pass a download link to the script itself. See `-h` for help.

- Test with the sample data

```
python raw_backend --insert
```

> This will insert the sample data provided in the code. You can use it to verify your backend setup.

- For example: to import Turkey and enable replication later on

```
python raw_backend --insert --replication --source https://download.geofabrik.de/europe/turkey-latest.osm.pbf --country 127
```


> Here the `--insert` option downloads the .osm.pbf file from the source and imports it; you can also supply the file path of an already-downloaded file to `--source`. `--replication` prepares the tables for replication and starts replicating right away until the data is up to date. The `--country` parameter is the fid from /backend/countries, which ensures that replication data is maintained only for that country.


- Import your OSM file and run replication only for your custom geojson boundary

```
python raw_backend --replication --boundary pokhara.geojson
```
Example geojson:
```
{
  "type": "Feature",
  "properties": {},
  "geometry": {
    "coordinates": [
      [
        [83.90895770965955, 28.279633888511327],
        [83.90895770965955, 28.131383546395526],
        [84.10168708213502, 28.131383546395526],
        [84.10168708213502, 28.279633888511327],
        [83.90895770965955, 28.279633888511327]
      ]
    ],
    "type": "Polygon"
  }
}
```
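
Before pointing `--boundary` at a file, it can help to sanity-check the geojson: valid JSON, `Polygon` geometry, and a closed ring (first point equals last). A sketch using the example above (the file name `pokhara.geojson` follows the docs):

```shell
# Write the example boundary to disk.
cat > pokhara.geojson <<'EOF'
{"type": "Feature", "properties": {},
 "geometry": {"type": "Polygon", "coordinates": [[
   [83.90895770965955, 28.279633888511327],
   [83.90895770965955, 28.131383546395526],
   [84.10168708213502, 28.131383546395526],
   [84.10168708213502, 28.279633888511327],
   [83.90895770965955, 28.279633888511327]]]}}
EOF

# Validate structure and ring closure with the stdlib json module.
python3 - <<'EOF'
import json

with open("pokhara.geojson") as f:
    gj = json.load(f)

ring = gj["geometry"]["coordinates"][0]
assert gj["geometry"]["type"] == "Polygon", "geometry must be a Polygon"
assert ring[0] == ring[-1], "polygon ring must be closed"
print("boundary OK:", len(ring) - 1, "unique vertices")
EOF
```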

- To run replication only
```
python raw_backend --replication
```

> By default this command runs replication until the data becomes up to date, then exits. You can run the script at your own frequency by setting up a cron job (or any preferred scheduler) to wake the script, do the job, and sleep.
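
The cron approach mentioned above can be sketched as follows (the 15-minute interval, log path, and virtualenv location are illustrative assumptions, not project defaults):

```shell
# Write a crontab entry that wakes the replication run every 15 minutes.
cat > rawdata.cron <<'EOF'
*/15 * * * * cd /opt/raw-data-api/backend && ./venv/bin/python raw_backend --replication >> /var/log/raw-backend.log 2>&1
EOF

cat rawdata.cron
# Install it with: crontab rawdata.cron
```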

Options to Run the Script:

```
-h, --help show this help message and exit
--source SOURCE Data source link or file path
--host HOST DB host
--port PORT DB port
--user USER DB user
--password PASSWORD DB password
--database DATABASE DB name
--include_ref Include ref in output tables
--replication Prepare tables for replication and Runs Replication
--country COUNTRY id of the country , if you are loading country , it will filter replication data
--boundary Takes geojson file path or geojson string itself to keep replication within the region
--insert Run osm2pgsql to insert data , Initial Creation Step
--update Run Update on table fields for country info
--download_dir DOWNLOAD_DIR
The directory to download the source file to
--post_index Run Post index only on table
```
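
The DB flags mirror the `PG*` environment variables, so a fully explicit invocation can be assembled from the options above. A sketch that writes the command to a helper script without executing it (`run_import.sh` and `/tmp/osm-downloads` are illustrative names; the database and source URL are assumed to exist):

```shell
# Save a fully explicit import command for later use.
cat > run_import.sh <<'EOF'
#!/bin/sh
python raw_backend --insert --replication \
  --host localhost --port 5432 --user admin --password admin --database postgres \
  --source https://download.geofabrik.de/europe/turkey-latest.osm.pbf \
  --country 127 --download_dir /tmp/osm-downloads
EOF
chmod +x run_import.sh

cat run_import.sh
```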

If you are interested in a manual setup, find the guide [here](./Manual.md).

## Running the backend service via Systemd

- Create a systemd unit file for raw-data-backend service

```
$ sudo systemctl edit --full --force raw-data-backend.service
```

```

[Unit]
Description=Raw Data Backend Service
Documentation=https://github.com/hotosm/raw-data-api/blob/develop/backend/Readme.md
WatchdogSec=43200

[Install]
WantedBy=multi-user.target

```
- Start your service and check its status

```
$ sudo systemctl start raw-data-backend.service
$ sudo systemctl status raw-data-backend.service
```

```

● raw-data-backend.service - Raw Data Backend Service
Loaded: loaded (/etc/systemd/system/raw-data-backend.service; disabled; vendor preset: enabled)
Active: active (running) since Mon 2023-02-13 14:30:03 UTC; 4min 25s ago
Docs: https://github.com/hotosm/raw-data-api/blob/develop/backend/Readme.md
Main PID: 50561 (python)
Tasks: 9 (limit: 4700)
Memory: 94.7M
CPU: 14.996s
CGroup: /system.slice/raw-data-backend.service
β”œβ”€50561 /opt/raw-data-api/backend/venv/bin/python app --replication --run_minutely
β”œβ”€50563 python /opt/raw-data-api/backend/replication update -s raw.lua --max-diff-size 10
└─50704 osm2pgsql --append --slim --prefix planet_osm --output=flex --extra-attributes --style raw.lua -d app_backend -U adm_app_backend -H rawdat.postgres.database.azure.com >

Feb 13 14:30:03 raw-data-backend-production systemd[1]: Started Raw Data Backend Service.
Feb 13 14:30:04 raw-data-backend-production python[50562]: 2023-02-13 14:30:04 [INFO]: Initialised updates for service 'https://planet.openstreetmap.org/replication/minute'.
Feb 13 14:30:04 raw-data-backend-production python[50562]: 2023-02-13 14:30:04 [INFO]: Starting at sequence 5348603 (2022-12-06 00:59:10+00:00).
Feb 13 14:30:05 raw-data-backend-production python[50563]: 2023-02-13 14:30:05 [INFO]: Using replication service 'https://planet.openstreetmap.org/replication/minute'. Current sequence 5348603 (2>
Feb 13 14:30:10 raw-data-backend-production python[50704]: 2023-02-13 14:30:10 osm2pgsql version 1.6.0
Feb 13 14:30:10 raw-data-backend-production python[50704]: 2023-02-13 14:30:10 Database version: 14.6
Feb 13 14:30:10 raw-data-backend-production python[50704]: 2023-02-13 14:30:10 PostGIS version: 3.2

```

- Set up your every-minute update timer
```
$ sudo systemctl edit --full --force raw-data-backend.timer
```

```

[Unit]
Description=Trigger a rawdata database update

OnUnitActiveSec=5min

[Install]
WantedBy=timers.target

```

- Enable the timer and reload systemd

```
$ sudo systemctl enable raw-data-backend.timer
$ sudo systemctl daemon-reload
```
