feat: Add Mapillary Image downloader, add GeoPandas Parser #18
…equirements.txt with dependencies in pyproject.toml
Is this what you wanted for bullet number 3? |
Yeah the Currently I extract the list of points (longitude/latitude) from the geopackage before sending them to the mapillary client, so by the time they get to the mapillary client they already don't have access to the original geopackage. Let me think of a way to send the entire geopackage over. |
@jayqi do you have any suggestions for how to track each point through this step so that the image details can be added back to the correct point feature in the geodata? |
[edited based on @jayqi response below] @dragonejt |
I strongly recommend that we write a new file. The workflow is much easier to reason about and reproduce if we treat the data outputs at each step as immutable, and the overall workflow as a directed acyclic graph (DAG). Here's some discussion about why this is a useful framing. |
So I did some experimentation with this, and GeoDataFrames only support having one column with geospatial data. This means that I cannot add the coordinates of the downloaded image as another column in the GeoDataFrame. Adding the image ID and image path works fine, as they are both strings. Do we want to ignore the coordinates of the image then, or encode them as strings/numbers? The error I was getting is |
Looks like you edited to remove this? We don't want to use an image more than once. In our Mapillary query we should be restricting the search to within 10 m of the sample point, and the sample points are 20 m apart. However, they are 20 m apart along the road line features (not a plane), and roads may curve or two roads may be closer than 20 m. Regardless of the above, because of the complexities of recording points on the lumpy, curved surface of the Earth and projecting to a 2D representation, I don't think we could rely on the bounding box to eliminate the possibility of duplicate images. Can you calculate the distance from the sample point to the image point as you download the images? Then, when you are done, go through and, for each image key that is matched to more than one sample point, throw out the image key for all but the sample point + image key pair with the shortest distance. The coordinates are in
For example, in the below mock-up (ignore the NULL values in the For the image point, can you encode it as a string/numbers? A WKT format string would look like |
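The second pass described above can be sketched in plain Python. This is a minimal illustration, not the code in this PR: the function names, the `(point_id, image_id, distance_m)` tuple layout, and the use of a haversine distance are all assumptions made for the example. It also shows one way to encode an image point as a WKT string.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def to_wkt(lon, lat):
    """Encode a point as a WKT string, x (longitude) first, then y (latitude)."""
    return f"POINT ({lon} {lat})"


def dedupe_matches(matches):
    """Keep each image only for the sample point closest to it.

    `matches` is a list of (point_id, image_id, distance_m) tuples, one per
    sample point (a hypothetical layout). Returns {point_id: image_id or None}.
    """
    best = {}  # image_id -> (distance_m, point_id) of the closest point seen so far
    for point_id, image_id, distance_m in matches:
        if image_id not in best or distance_m < best[image_id][0]:
            best[image_id] = (distance_m, point_id)
    return {
        point_id: image_id if best[image_id][1] == point_id else None
        for point_id, image_id, _ in matches
    }
```

Points that lose their image in this pass end up with `None`; whether they should then be re-queried for another image is left open here.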
For that, I tested again and there was an issue with my code. Now it only uses one image per set of coordinates, and returns nothing if there isn't an image within that bounding box. I can take a look at finding the closest image, since currently it returns all images within the bounding box and doesn't sort them by closeness to the original point. |
Hey Dan, so the way I'm currently avoiding using the same picture for multiple points is to store a set of all downloaded image IDs while I'm downloading, and to filter out the existing images before searching for the closest image. I think this is a simpler method and avoids the two passes. However, with this method, while that image may be the closest unique image for a point, that point may not be the closest point to that image. Is this method OK, or do you want to ensure that an image is tied to its closest point (probably requiring a second pass)? |
Evan, if you've got that method working, let's stick with it and continue on towards a completed MVP. We can log an issue to remember to take another look at the method later. Thanks! |
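For reference when revisiting this later, the single-pass approach agreed on above might look roughly like the following. This is a hedged sketch rather than the committed implementation: `assign_images_greedy` and the `find_candidates` callback (standing in for the Mapillary bounding-box query) are hypothetical names.

```python
def assign_images_greedy(points, find_candidates):
    """Assign each point its closest image that has not already been used.

    `points` is an iterable of (point_id, lon, lat) tuples.
    `find_candidates(lon, lat)` is a hypothetical callback returning a list of
    (image_id, distance_m) tuples for images near that location.
    Returns {point_id: image_id or None}.
    """
    used = set()  # image IDs already assigned to an earlier point
    assignments = {}
    for point_id, lon, lat in points:
        # Filter out images that an earlier point has already claimed.
        candidates = [c for c in find_candidates(lon, lat) if c[0] not in used]
        if not candidates:
            assignments[point_id] = None
            continue
        image_id, _ = min(candidates, key=lambda c: c[1])
        used.add(image_id)
        assignments[point_id] = image_id
    return assignments
```

Because points are processed in order, an earlier point can claim an image that a later point is actually closer to. That is the trade-off raised in the question above, and the reason a second pass may be worth logging as a follow-up issue.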
c70c34e to 751bae0
Implementation changes have been committed, will do documentation changes sometime later |
i think using a |
I'll work on documentation changes today |
…_images.py command
there's a few things to revisit at a later date, but i think this is great to move forward with. thanks!
Changes
Point
).Points
from the .gpkg GeoDataFrame file outputted by create_points.py
requirements.txt with a dependencies section in pyproject.toml, update all installation commands to refer to pyproject.toml
Testing
python -m src.mapillary "[MAPILLARY_CLIENT_TOKEN]" [POINTS_FILE] [IMAGE_PATH]
succeeds on the Three_Rivers_Michigan_USA_points.gpkg generated by create_points.py and downloads images to the selected image path
Risks