feat: Add Mapillary Image downloader, add GeoPandas Parser #18

Merged Mar 14, 2024 (7 commits). Showing changes from 3 commits.

3 changes: 1 addition & 2 deletions .github/workflows/tests.yml
@@ -23,12 +23,11 @@ jobs:
           cache: "pip"
           cache-dependency-path: |
             pyproject.toml
-            requirements.txt

       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
-          pip install -r requirements.txt
+          pip install .

       - name: Lint package
         run: |
2 changes: 1 addition & 1 deletion Makefile
@@ -21,7 +21,7 @@ endif
 ## Install Python Dependencies
 requirements: test_environment
 	$(PYTHON_INTERPRETER) -m pip install -U pip setuptools wheel
-	$(PYTHON_INTERPRETER) -m pip install -r requirements.txt
+	$(PYTHON_INTERPRETER) -m pip install .

 ## Make Dataset
 data: requirements
21 changes: 16 additions & 5 deletions README.md
@@ -38,7 +38,7 @@ If you are interested in joining the project, please check out [`CONTRIBUTING.md
- You can use the shortcut command `make create_environment`.
2. Install requirements.
```bash
-pip install -r requirements.txt
+pip install .
```
- You can use the shortcut command `make requirements` to do the same thing.
3. Put your raw OpenStreetMap road vector data in `data/raw`.
@@ -69,6 +69,20 @@ python -m src.create_points --help

Both the input files and output files support any file formats that geopandas supports, so long as it can correctly infer the format from the file extension. See the [geopandas documentation](https://geopandas.org/en/stable/docs/user_guide/io.html) for more details.

### 2. Download an image for each point

Next, fetch a 360° image for each sampled point. The [`mapillary.py`](./src/mapillary.py) script searches a small bounding box around each point for a panoramic image and downloads the first match to local file storage.

#### Example

For example, if you're continuing from the example in previous steps and already generated a `Three_Rivers_Michigan_USA_points.gpkg` file:

```bash
python -m src.mapillary "[MAPILLARY_CLIENT_TOKEN]" data/interim/Three_Rivers_Michigan_USA_points.gpkg data/interim/images/
```

To download images from [Mapillary](https://www.mapillary.com/) you will need to create a (free) account and replace `[MAPILLARY_CLIENT_TOKEN]` with your own token. See the "Setting up API access and obtaining a client token" section on this [Mapillary help page](https://help.mapillary.com/hc/en-us/articles/360010234680-Accessing-imagery-and-data-through-the-Mapillary-API). You only need to enable READ access scope on your token.
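
After a run, the script in this PR also writes an `image_data.json` file into the image folder, with one record per point (`latitude`, `longitude`, `image_id`, `image_path`). The following is a minimal sketch, not part of the PR, of how those records could be summarized to see which points got no image. The function name `summarize_image_data` is hypothetical, and the record keys are assumed to match what `src/mapillary.py` writes.

```python
import json
from pathlib import Path


def summarize_image_data(json_path):
    """Count how many sampled points ended up with a Mapillary image.

    Assumes the file is a JSON list of records with the keys
    "latitude", "longitude" and "image_id", as written by the script.
    """
    records = json.loads(Path(json_path).read_text())
    # A record without an image has image_id set to null in the JSON.
    missing = [
        (r["latitude"], r["longitude"])
        for r in records
        if r.get("image_id") is None
    ]
    return {
        "total": len(records),
        "with_image": len(records) - len(missing),
        "missing_points": missing,
    }
```

With the example command above, the file would be at `data/interim/images/image_data.json`, so `summarize_image_data("data/interim/images/image_data.json")` would report the totals for that run.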

## Project Organization

├── LICENSE
@@ -83,10 +97,7 @@ Both the input files and output files support any file formats that geopandas su
│ the creator's initials, and a short `-` delimited description, e.g.
│ `1.0-jqp-initial-data-exploration`.
├── requirements.txt <- The requirements file for reproducing the analysis environment, e.g.
│ generated with `pip freeze > requirements.txt`
├── setup.py <- makes project pip installable (pip install -e .) so src can be imported
├── pyproject.toml <- Single source of truth for dependencies, build system, etc
└── src <- Source code for use in this project.
   └── __init__.py <- Makes src a Python module

12 changes: 12 additions & 0 deletions pyproject.toml
@@ -13,6 +13,18 @@ authors = [{ name = "The American National Red Cross" }]
classifiers = [
]

dependencies = [
"folium",
"geopandas",
"mapclassify",
"matplotlib",
"numpy",
"ruff",
"requests",
"shapely",
"typer"
]

## TOOLS ##

[tool.ruff]
14 changes: 0 additions & 14 deletions requirements.txt

This file was deleted.
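
With `requirements.txt` removed, the `dependencies` list in `pyproject.toml` is what `pip install .` installs. As a rough sketch (not part of the PR), the standard-library `importlib.metadata` module can confirm that those distributions are present in the active environment. The helper name `installed_versions` is hypothetical; the package names are copied from the list added in this PR.

```python
from importlib.metadata import PackageNotFoundError, version

# Names copied from the dependencies list in this PR's pyproject.toml.
DECLARED = [
    "folium", "geopandas", "mapclassify", "matplotlib", "numpy",
    "ruff", "requests", "shapely", "typer",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(DECLARED).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Running this after `pip install .` should list a version for every entry; any `NOT INSTALLED` line points at a dependency that did not get installed.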

Empty file added src/data_parsing/__init__.py
Empty file.
22 changes: 22 additions & 0 deletions src/data_parsing/geopandas.py
@@ -0,0 +1,22 @@
import logging
from pathlib import Path

import geopandas as gpd
from shapely.geometry import Point

log = logging.getLogger(__name__)
log.setLevel(logging.INFO)
log.addHandler(logging.StreamHandler())


class GeoPandasParser:
    def __init__(self, gpkg_path: Path):
        self.gdf = gpd.read_file(gpkg_path)
        log.debug(self.gdf)

    def get_coordinates(self) -> list[Point]:
        log.info("Get Coordinates")
        points = self.gdf["geometry"]
        log.debug(points)

        return list(points)
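
The `main` command in `src/mapillary.py` picks a parser from the input file's suffix and currently accepts only `.gpkg`, raising `ValueError` for anything else. A minimal sketch of how that dispatch could be kept in one table if more formats are added later is shown here. Everything in it is an assumption for illustration: the `PARSERS` registry, the `load_points` helper, and the CSV reader (which expects `longitude` and `latitude` columns and returns plain tuples rather than shapely `Point` objects) are not part of the PR.

```python
import csv
from pathlib import Path
from typing import Callable, Dict, List, Tuple

Coordinate = Tuple[float, float]  # (longitude, latitude)


def read_csv_points(path: Path) -> List[Coordinate]:
    """Hypothetical parser: expects 'longitude' and 'latitude' columns."""
    with open(path, newline="") as f:
        return [
            (float(row["longitude"]), float(row["latitude"]))
            for row in csv.DictReader(f)
        ]


# Hypothetical registry: file suffix -> function that returns coordinates.
PARSERS: Dict[str, Callable[[Path], List[Coordinate]]] = {
    ".csv": read_csv_points,
}


def load_points(path: Path) -> List[Coordinate]:
    """Look up a parser by suffix, mirroring the error raised in main."""
    try:
        parser = PARSERS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"Unsupported File Extension: {path.suffix}") from None
    return parser(path)
```

Adding a format then means adding one entry to the table, and the unsupported-extension error stays in a single place.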
145 changes: 145 additions & 0 deletions src/mapillary.py
@@ -0,0 +1,145 @@
import json
import logging
from multiprocessing import Pool
from pathlib import Path
from typing import Annotated, Optional

from requests import RequestException, Session
from requests.adapters import HTTPAdapter
from shapely import Point
from typer import Argument, Option, Typer

from src.data_parsing.geopandas import GeoPandasParser

log = logging.getLogger(__name__)
log.setLevel(logging.INFO)
log.addHandler(logging.StreamHandler())
app = Typer()


class Mapillary:
    url = "https://graph.mapillary.com/images"

    def __init__(
        self,
        access_token,
        basepath=Path(Path(__file__).parent.parent, "data/raw/mapillary"),
    ):
        self.access_token = access_token
        self.basepath = basepath
        self.basepath.mkdir(parents=True, exist_ok=True)
        # Reuse one HTTP session and retry failed connections up to 3 times.
        self.client = Session()
        self.client.mount("https://", HTTPAdapter(max_retries=3))

    def get_image_from_coordinates(self, point: Point) -> dict:
        longitude, latitude = point.x, point.y
        log.info("Get Image From Coordinates: %s, %s", latitude, longitude)
        try:
            response = self.client.get(
                self.url,
                params={
                    "access_token": self.access_token,
                    "fields": "id,thumb_original_url",
                    "is_pano": "true",
                    "bbox": self._bounds(latitude, longitude),
                },
            )
            response.raise_for_status()
        except RequestException as e:
            log.error(e)
            return {
                "latitude": latitude,
                "longitude": longitude,
                "image_id": None,
                "image_path": None,
            }

        images = response.json()["data"]
        log.debug("Successfully Retrieved Image Data: %s", images)
        if len(images) == 0:
            log.debug(
                "No Images in Bounding Box: %s", self._bounds(latitude, longitude)
            )
            return {
                "latitude": latitude,
                "longitude": longitude,
                "image_id": None,
                "image_path": None,
            }

        # Take the first image returned for the bounding box.
        image_id = images[0]["id"]
        image_url = images[0]["thumb_original_url"]
        image_path = self._download_image(image_url, image_id)

        return {
            "latitude": latitude,
            "longitude": longitude,
            "image_id": image_id,
            # Keep None (not the string "None") when the download failed.
            "image_path": str(image_path) if image_path else None,
        }

    def _bounds(self, latitude, longitude) -> str:
        # Pad the point by 10 / 111_111 degrees (about 10 m of latitude).
        left = longitude - 10 / 111_111
        bottom = latitude - 10 / 111_111
        right = longitude + 10 / 111_111
        top = latitude + 10 / 111_111
        return f"{left},{bottom},{right},{top}"

    def _download_image(self, image_url, image_id) -> Optional[Path]:
        log.info("Downloading Image: %s", image_id)
        try:
            response = self.client.get(image_url, stream=True)
            response.raise_for_status()
        except RequestException as e:
            log.error(e)
            return None
        image_content = response.content
        log.debug("Successfully Retrieved Image: %s", image_id)
        image_path = Path(self.basepath, f"{image_id}.jpeg")
        log.debug("Writing Image To: %s", image_path)

        # Do not overwrite an image that is already on disk.
        if not image_path.is_file():
            with open(image_path, "wb") as img:
                img.write(image_content)
            log.debug("Successfully Wrote Image: %s", image_path)

        return image_path


@app.command()
def main(
    client_token: Annotated[
        str,
        Argument(help="Mapillary Client Token from Developer Portal"),
    ],
    points_file: Annotated[
        Path,
        Argument(help="Path to Input Points File"),
    ],
    image_path: Annotated[
        Path,
        Argument(help="Folder to Write Image Data"),
    ] = Path(Path(__file__).parent.parent, "data/raw/mapillary"),
    verbose: Annotated[bool, Option()] = False,
):
    if verbose:
        log.setLevel(logging.DEBUG)

    # Only GeoPackage input is supported for now.
    if points_file.suffix == ".gpkg":
        parser = GeoPandasParser(points_file)
    else:
        raise ValueError(f"Unsupported File Extension: {points_file.suffix}")

    mapillary = Mapillary(client_token, image_path)
    coordinates = parser.get_coordinates()

    # Fetch images for all points in parallel worker processes.
    with Pool() as pool:
        image_data = list(pool.map(mapillary.get_image_from_coordinates, coordinates))
        log.debug(image_data)

    with open(Path(image_path, "image_data.json"), "w") as f:
        json.dump(image_data, f)


if __name__ == "__main__":
    app()
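
The `_bounds` method pads each point by `10 / 111_111` degrees on every side, which is roughly 10 m of latitude. A degree of longitude covers less ground away from the equator (it shrinks by about the cosine of the latitude), so the same padding gives a box that is narrower east to west than north to south. The sketch below works through that arithmetic. It is an illustration under stated assumptions, not the PR's implementation: the `bounds` and `to_bbox_param` helpers and the `scale_longitude` option are hypothetical, and the spherical-Earth cosine correction is an approximation.

```python
import math

METERS_PER_DEGREE = 111_111  # same approximation used by _bounds


def bounds(latitude, longitude, radius_m=10.0, scale_longitude=False):
    """Return (left, bottom, right, top) around a point, in degrees.

    With scale_longitude=False this matches the padding in _bounds.
    With scale_longitude=True the east-west padding is widened by
    1 / cos(latitude) so the box is roughly square on the ground.
    """
    dlat = radius_m / METERS_PER_DEGREE
    dlon = dlat
    if scale_longitude:
        dlon = dlat / math.cos(math.radians(latitude))
    return (longitude - dlon, latitude - dlat, longitude + dlon, latitude + dlat)


def to_bbox_param(box):
    """Format a box as the comma-separated string passed as the bbox parameter."""
    return ",".join(str(value) for value in box)


if __name__ == "__main__":
    # At 60 degrees latitude cos(60) = 0.5, so the scaled box is about
    # twice as wide in degrees of longitude as the unscaled one.
    print(to_bbox_param(bounds(60.0, 10.0)))
    print(to_bbox_param(bounds(60.0, 10.0, scale_longitude=True)))
```

Whether the correction matters depends on where the points are; near the equator the two boxes are almost identical, so the simpler padding in the PR may be good enough there.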