260 add automatic processing of detected markings #269

Draft · wants to merge 27 commits into base: main
Changes from all commits (27 commits):
77f2294
Add UI for downloading further analyses' results
mhhd2020 Jul 11, 2023
7c17ecc
Re-add QGIS to dependencies and add dummy function to process GeoJSON…
mhhd2020 Jul 11, 2023
1945d4b
Update count_overlaps.py: Add GeoJSON and QGIS project to ZIP
mhhd2020 Jul 11, 2023
e5eae98
Add QGIS scripts from Waterproofing Data project
mhhd2020 Jul 11, 2023
891da27
Update count_overlaps.py: Add dummy call of processing.run() and stor…
mhhd2020 Jul 11, 2023
e6a1edf
Revert "Update client-src\digitize\index.js: Fix #262"
mhhd2020 Jul 11, 2023
2c098ef
Update count_overlaps.py: Run overlap counting script
mhhd2020 Jul 11, 2023
230a127
Update count_overlaps.py: Add resulting layer to QGIS project
mhhd2020 Jul 11, 2023
27de854
Update count_overlaps.py: Give fitting names to GeoJSONs included in ZIP
mhhd2020 Jul 11, 2023
ec9aff2
Update count_overlaps.py: Add code to create a heatmap based on the o…
mhhd2020 Jul 12, 2023
00362ac
Add heatmaps to upload processing outputs
mhhd2020 Jul 12, 2023
0bd1ad2
Improve legend of heatmap
mhhd2020 Jul 12, 2023
8b60844
Update poetry.lock
mhhd2020 Jul 13, 2023
beb6877
Revert "Revert "Update client-src\digitize\index.js: Fix #262""
mhhd2020 Jul 13, 2023
d8b48dd
Run autoformatters and refactor
mhhd2020 Jul 13, 2023
28051ac
Correct typos
mhhd2020 Jul 13, 2023
b7fac83
Refactor count_overlaps.py and improve documentation
mhhd2020 Jul 13, 2023
705d1a8
Avoid duplicate marking detection
mhhd2020 Jul 13, 2023
d207164
Update routes.py
mhhd2020 Jul 14, 2023
65d4cfa
Refactor scripts
mhhd2020 Jul 15, 2023
d6a7ec9
Run autoformatters
mhhd2020 Jul 15, 2023
ee1b670
Update qgis_scripts\split_count_merge.py: Refactor
mhhd2020 Jul 15, 2023
081a988
wip
mhhd2020 Jul 22, 2023
79fdc62
wip
mhhd2020 Jul 22, 2023
2e8a9ec
wip
mhhd2020 Jul 23, 2023
16eb77a
wip
mhhd2020 Jul 23, 2023
15f2b90
Run autoformatters
mhhd2020 Jul 23, 2023
9 changes: 8 additions & 1 deletion Dockerfile
@@ -9,7 +9,9 @@ RUN mkdir -p /sketch_map_tool/static/bundles
RUN npm run build


FROM ubuntu:22.04
# currently based on docker image ubuntu:22.04
# this image comes with gdal preinstalled
FROM qgis/qgis:release-3_22

# install libzbar (necessary for pyzbar to read the QR codes)
# install gdal
@@ -21,8 +23,12 @@ RUN apt-get update \
libzbar0 \
libgdal-dev \
libgl1 \
python3-qgis \
&& rm -rf /var/lib/apt/lists/*

# to prevent poetry from running into a version bug
RUN apt-get remove -y python3-distro-info

# update C env vars so compiler can find gdal
ENV CPLUS_INCLUDE_PATH=/usr/include/gdal
ENV C_INCLUDE_PATH=/usr/include/gdal
@@ -38,6 +44,7 @@ ENV PATH=$PATH:/home/smt/.local/bin

COPY --chown=smt:smt pyproject.toml pyproject.toml
COPY --chown=smt:smt poetry.lock poetry.lock
COPY --chown=smt:smt poetry.toml poetry.toml
COPY --chown=smt:smt setup.cfg setup.cfg
# install Python dependencies
RUN pip3 install --no-cache-dir poetry
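
The switch of the base image from ubuntu:22.04 to qgis/qgis:release-3_22 makes GDAL and the QGIS Python bindings available system-wide inside the container. A minimal smoke test for that assumption, run inside the built image (the file name and initialisation pattern are illustrative, not part of this PR):

    # check_qgis.py -- hypothetical smoke test, not part of this PR
    from qgis.core import Qgis, QgsApplication  # provided by the qgis/qgis base image

    app = QgsApplication([], False)  # False = headless, no GUI event loop
    app.initQgis()
    print("QGIS version:", Qgis.QGIS_VERSION)
    app.exitQgis()
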
2 changes: 2 additions & 0 deletions client-src/digitize-results/index.js
@@ -2,8 +2,10 @@ import { getUUIDFromURL, poll, handleMainMessage } from "../shared";

const vectorResultsUrl = `/api/status/${getUUIDFromURL()}/vector-results`;
const rasterResultsUrl = `/api/status/${getUUIDFromURL()}/raster-results`;
const qgisResultsUrl = `/api/status/${getUUIDFromURL()}/qgis-data`;

Promise.all([
poll(rasterResultsUrl, "raster-data"),
poll(vectorResultsUrl, "vector-data"),
poll(qgisResultsUrl, "qgis-data"),
]).then(handleMainMessage);
2 changes: 1 addition & 1 deletion locust/locustfile.py
@@ -74,7 +74,7 @@ def digitize(self):
digitize_post = self.client.post("/digitize/results", files=files)
digitize_uuid = digitize_post.url.split("/")[-1]
validate_uuid(digitize_uuid)
for result_type in ("raster-results", "vector-results"):
for result_type in ("raster-results", "vector-results", "qgis-data"):
download_url = self.status_loop(digitize_uuid, result_type)
request_name = "/api/download/[uuid]/{}".format(result_type)
result = self.client.get(download_url, name=request_name)
257 changes: 255 additions & 2 deletions poetry.lock

Large diffs are not rendered by default.

3 changes: 3 additions & 0 deletions poetry.toml
@@ -0,0 +1,3 @@
[virtualenvs]
[virtualenvs.options]
system-site-packages = true
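
Setting system-site-packages = true lets the Poetry-managed virtual environment fall back to the system site-packages, where the base image's QGIS bindings live. A quick hedged check from inside the container's Poetry environment (the snippet is an assumption, not part of this PR):

    # hypothetical check: the venv should resolve the system-wide QGIS package
    import sys
    import qgis  # importable only because system-site-packages is enabled

    print(qgis.__file__, "imported by", sys.executable)
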
1 change: 1 addition & 0 deletions pyproject.toml
@@ -23,6 +23,7 @@ PyMuPDF = {extras = ["Pillow"], version = "^1.21.0"}
psycopg2 = "^2.9.5"
plotly = "^5.15.0"
kaleido = "0.2.1" # Not working with '^', cf. https://github.com/plotly/Kaleido/issues/125
geopandas = "^0.13.2"

# these dependencies are maintained by your local setup and have to be fixed for now, since poetry and (py)gdal packages can't work together
# if you change these versions, please change them in development-setup.md, Dockerfile and .github/workflows/python.yml as well
2 changes: 1 addition & 1 deletion sketch_map_tool/definitions.py
@@ -10,7 +10,7 @@

# Types of requests
REQUEST_TYPES = Literal[
"quality-report", "sketch-map", "raster-results", "vector-results"
"quality-report", "sketch-map", "raster-results", "vector-results", "qgis-data"
]
# Colors to be detected
COLORS = ["red", "blue", "green", "yellow", "turquoise", "pink"]
Expand Down
26 changes: 23 additions & 3 deletions sketch_map_tool/routes.py
@@ -4,7 +4,7 @@

import geojson

# from celery import chain, group
from celery import chain
from flask import Response, redirect, render_template, request, send_file, url_for

from sketch_map_tool import celery_app, definitions
@@ -23,7 +23,11 @@
)
from sketch_map_tool.helpers import to_array
from sketch_map_tool.models import Bbox, PaperFormat, Size
from sketch_map_tool.tasks import digitize_sketches, georeference_sketch_maps
from sketch_map_tool.tasks import (
analyse_markings,
digitize_sketches,
georeference_sketch_maps,
)
from sketch_map_tool.validators import validate_type, validate_uuid


@@ -118,6 +122,7 @@ def digitize_results_post() -> Response:
uuids = [args_["uuid"] for args_ in args]
bboxes = [args_["bbox"] for args_ in args]
map_frames = dict()
map_frame_buffer = BytesIO()
for uuid in set(uuids): # Only retrieve map_frame once per uuid to save memory
map_frame_buffer = BytesIO(db_client_flask.select_map_frame(UUID(uuid)))
map_frames[uuid] = to_array(map_frame_buffer.read())
@@ -126,15 +131,25 @@
.apply_async()
.id
)

map_frame_buffer.seek(0)
marking_detection_analyses_chain = chain(digitize_sketches.s(ids, file_names, uuids, map_frames, bboxes),
analyse_markings.s(bboxes, map_frame_buffer)).apply_async()

result_id_2 = (
digitize_sketches.s(ids, file_names, uuids, map_frames, bboxes).apply_async().id
marking_detection_analyses_chain.parent.id
)
result_id_3 = (
marking_detection_analyses_chain.id
)

# Unique id for current request
uuid = str(uuid4())
# Mapping of request id to multiple tasks id's
map_ = {
"raster-results": str(result_id_1),
"vector-results": str(result_id_2),
"qgis-data": str(result_id_3),
}
db_client_flask.set_async_result_ids(uuid, map_)
id_ = uuid
@@ -226,6 +241,11 @@ def download(uuid: str, type_: REQUEST_TYPES) -> Response:
download_name = type_ + ".geojson"
if task.successful():
file = BytesIO(geojson.dumps(task.get()).encode("utf-8"))
case "qgis-data":
mimetype = "application/zip"
download_name = type_ + ".zip"
if task.successful():
file = task.get()
return send_file(file, mimetype, download_name=download_name)


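
The POST handler now builds a Celery chain: the FeatureCollection returned by digitize_sketches is prepended to the arguments of analyse_markings, the chain's own id identifies the final qgis-data result, and .parent.id identifies the intermediate vector-results one. A minimal sketch of that pattern with placeholder tasks (broker, backend and task bodies are assumptions, not the project's):

    from celery import Celery, chain

    app = Celery("demo", broker="memory://", backend="cache+memory://")

    @app.task
    def digitize(ids, bboxes):
        # stands in for digitize_sketches; its return value feeds the next task in the chain
        return {"type": "FeatureCollection", "features": []}

    @app.task
    def analyse(markings, bboxes, map_frame):
        # stands in for analyse_markings; receives the previous result as its first argument
        return b"zip-bytes"

    result = chain(digitize.s([1], ["bbox"]), analyse.s(["bbox"], b"jpeg")).apply_async()
    vector_results_id = result.parent.id  # id of the digitize step
    qgis_data_id = result.id  # id of the analyse step
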
122 changes: 99 additions & 23 deletions sketch_map_tool/tasks.py
@@ -1,4 +1,6 @@
from io import BytesIO
from tempfile import NamedTemporaryFile
from typing import List, Tuple
from uuid import UUID
from zipfile import ZipFile

@@ -19,8 +21,10 @@
from sketch_map_tool.upload_processing import (
clean,
clip,
create_qgis_project,
detect_markings,
enrich,
generate_heatmaps,
georeference,
merge,
polygonize,
@@ -81,7 +85,10 @@ def generate_quality_report(bbox: Bbox) -> BytesIO | AsyncResult:


# 2. DIGITIZE RESULTS
#
# TODO: Avoid duplication across the three tasks; instead, let the later ones wait for the
# first to finish and directly reuse its intermediate results


@celery.task()
def georeference_sketch_maps(
file_ids: list[int],
Expand All @@ -95,7 +102,7 @@ def process(sketch_map_id: int, uuid: str, bbox: Bbox) -> BytesIO:

:param sketch_map_id: ID under which the uploaded file is stored in the database.
:param uuid: UUID under which the sketch map was created.
:bbox: Bounding box of the AOI on the sketch map.
:param bbox: Bounding box of the AOI on the sketch map.
:return: Georeferenced image (GeoTIFF) of the sketch map .
"""
# r = interim result
@@ -123,6 +130,34 @@ def zip_(files: list, file_names: list[str]) -> BytesIO:
)


def process_marking_detection(
sketch_map_id: int, name: str, bbox: Bbox, map_frame: NDArray
) -> FeatureCollection:
"""
Process a Sketch Map by extracting the markings on it.

:param sketch_map_id: ID under which the uploaded file is stored in the database.
:param name: Original name of the uploaded file.
:param bbox: Bounding box of the AOI on the sketch map.
:param map_frame: Image of the unmarked map frame.
:return: Feature collection containing all detected markings.
"""
r = db_client_celery.select_file(sketch_map_id)
r = to_array(r)
r = clip(r, map_frame)
r = prepare_img_for_markings(map_frame, r)
geojsons = []
for color in COLORS:
r_ = detect_markings(r, color)
r_ = georeference(r_, bbox)
r_ = polygonize(r_, color)
r_ = geojson.load(r_)
r_ = clean(r_)
r_ = enrich(r_, {"color": color, "name": name})
geojsons.append(r_)
return merge(geojsons)


@celery.task()
def digitize_sketches(
file_ids: list[int],
@@ -131,29 +166,70 @@ def digitize_sketches(
map_frames: dict[str, NDArray],
bboxes: list[Bbox],
) -> AsyncResult | FeatureCollection:
def process(
sketch_map_id: int, name: str, uuid: str, bbox: Bbox
) -> FeatureCollection:
"""Process a Sketch Map."""
# r = interim result
r = db_client_celery.select_file(sketch_map_id)
r = to_array(r)
r = clip(r, map_frames[uuid])
r = prepare_img_for_markings(map_frames[uuid], r)
geojsons = []
for color in COLORS:
r_ = detect_markings(r, color)
r_ = georeference(r_, bbox)
r_ = polygonize(r_, color)
r_ = geojson.load(r_)
r_ = clean(r_)
r_ = enrich(r_, {"color": color, "name": name})
geojsons.append(r_)
return merge(geojsons)

return merge(
[
process(file_id, name, uuid, bbox)
process_marking_detection(file_id, name, bbox, map_frames[uuid])
for file_id, name, uuid, bbox in zip(file_ids, file_names, uuids, bboxes)
]
)


@celery.task()
def analyse_markings(
markings_collection: FeatureCollection,
bboxes: list[Bbox],
map_frame_template: BytesIO,
) -> AsyncResult | BytesIO:
"""
Create a QGIS project containing the detected markings as a layer and count the overlapping markings (from
different sketch maps) per colour. Include these overlap counts in an additional layer in the QGIS project and
use them to create heatmaps.

:param markings_collection: Feature collection of all detected markings (the output of 'digitize_sketches').
:param bboxes: Bounding boxes of the AOI on the sketch maps. Needs to be the same for all uploaded sketch maps.
:param map_frame_template: Image of the map frame to be used as background for the heatmaps.
:return: ZIP file including
a) another ZIP file with a QGIS project file and the GeoJSONs for the marking and the overlap count
layers.
b) Images showing for each detected marking colour the overlap counts in a heatmap.
"""
if len(set(bboxes)) != 1:
raise ValueError(
"Because the map frame is used as background for the heatmaps, this process only works "
"when uploading sketch maps covering exactly the same area."
)

def zip_(qgis_project: BytesIO, heatmaps: List[Tuple[str, BytesIO]]) -> BytesIO:
buffer = BytesIO()
with ZipFile(buffer, "w") as zip_file:
zip_file.writestr("qgis_project.zip", qgis_project.read())
for colour, heatmap in heatmaps:
zip_file.writestr(f"heatmap_{colour}.jpg", heatmap.read())
buffer.seek(0)
return buffer

markings = geojson.dumps(
markings_collection
).encode("utf-8")
qgis_project, overlaps = create_qgis_project(BytesIO(markings))
geojson_overlaps_file = NamedTemporaryFile(suffix=".geojson")
map_frame_template_file = NamedTemporaryFile(suffix=".jpg")
with open(geojson_overlaps_file.name, "wb") as fw:
fw.write(overlaps.read())
with open(map_frame_template_file.name, "wb") as fw:
fw.write(map_frame_template.read())
return zip_(
qgis_project,
generate_heatmaps(
geojson_overlaps_file.name,
bboxes[0].lon_min,
bboxes[0].lat_min,
bboxes[0].lon_max,
bboxes[0].lat_max,
map_frame_template_file.name,
),
)
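
A local sketch of how the task's ZIP output could be inspected (the feature collection, bounding boxes and JPEG bytes below are placeholders, not project fixtures; calling a Celery task directly runs it synchronously in-process):

    from io import BytesIO
    from zipfile import ZipFile

    # hypothetical inputs: markings from digitize_sketches, identical bboxes, a JPEG map frame
    result_zip = analyse_markings(markings_fc, [bbox, bbox], BytesIO(map_frame_jpeg))
    with ZipFile(result_zip) as zf:
        print(zf.namelist())  # expected: ['qgis_project.zip', 'heatmap_<colour>.jpg', ...]
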