-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add MapillaryClient to download images from a latitude and longitude #12
feat: Add MapillaryClient to download images from a latitude and longitude #12
Conversation
Awesome! Thanks for working on this! Can you briefly describe how to test this? E.g. how does one provide the API token, etc? Maybe we can chat as a group at tomorrow's project night how this bit interacts with the rest of the toolchain. That is, does the code that loops through the collection of geographic points from the previous step go into this file or somewhere else, etc? |
Yeah, currently the API token is gotten from the environment variables. You could either set it manually with an I tested it just by running |
Thanks for working on this @dragonejt I am looking into this. So the module (creates_points.py) outputs a .zip file. Could you please add some documentation how a user would work with this file in your module (mapillary_client.py)? I think the outputs are clear but it doesn't hurt to add what the output is and where it would be stored. Again thanks so much for all your hard work on this. cc: @danbjoseph |
So as of right now the Mapillary client and create_points are not integrated at all. In order to use the client, you would likely import it into wherever it is needed, instantiate it, and call it with the latitude and longitude as is done in the dunder method. I have not read through create_points to figure out how to integrate it, but integration should probably be done by importing the Mapillary client in create_points. |
return pool.map(self.download_image, image_ids) | ||
|
||
def download_image(self, image_id: int) -> Path: | ||
mly.set_access_token(self.access_token) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you authenticate already in __init__
, you shouldn't need to do it again, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I am using multiprocessing for concurrency, the access token is not set when the image downloads are run concurrently.
image_content = get(image_url).content | ||
with open(filepath, "wb") as image_file: | ||
image_file.write(image_content) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to use requests.raise_for_status
in case of responses with error codes, like 404 or 500 or whatever. Otherwise, we'd end up writing the content of a 404 Not Found HTML page to the image.
https://requests.readthedocs.io/en/latest/user/quickstart/#response-status-codes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this was a WIP so I was focused on just downloading the images.
self.log = logging.getLogger(self.__class__.__name__) | ||
self.log.addHandler(logging.StreamHandler()) | ||
self.log.setLevel(log_level) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a strongly held view, but I personally like loguru a lot as a logging library. It has a lot of nice defaults and lets you avoid needing to set up boilerplate.
https://github.com/Delgan/loguru
Also, I've more commonly seen loggers initialized at the module level, and the level controlled with global log level settings. Attaching the logger to the client class feels unnecessary to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am more familiar with Java's Slf4J, which attaches the logger at the class level to provide the context class and method names at each log. Evidently, this did not happen with the self.__class__.__name__
, so I am open to suggestions.
|
||
|
||
class MapillaryClient: | ||
def __init__(self, access_token: str, log_level: int = logging.INFO, basepath: str = getcwd()) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're using a convention of saving data to data/raw
so consider having a default base directory like data/raw/mapillary/
https://github.com/AmericanRedCross/street-view-green-view?tab=readme-ov-file#project-organization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have implemented the option to set a basepath
, so this should be possible by modifying the default basepath
|
||
|
||
if __name__ == "__main__": | ||
client = MapillaryClient(getenv("MAPILLARY_API_KEY")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to have a .env.example
file that contains
MAPILLARY_API_KEY=somefakeapikey
to help document that this is needed. This is a common convention that I've seen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would things like the default basepath
go in that file as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is possible. Although maybe it should be renamed to image_path
or something since the scripts will be run separately.
Was able to run the code and download LOTS of images from. Some notes
|
I'm considering renaming this |
@dragonejt do you have a clear path forward to continue work on this? or do you need more group input? |
I haven't gotten response on the issue I opened about the I'm wondering if it would be better to just directly make REST API calls to Mapillary using requests, rather than using their Python SDK. |
Hey sorry I haven't been able to work on this recently. I think I'll reimplement the client with the API then. |
Changes
Add a
MapillaryClient
to call the Mapillary Python SDK to get some images from a latitude or longitude, by following these steps:Steps 3 and 4 execute concurrently through Python
multiprocessing
.Testing
Running
mapillary_client.py
's dunder method, which does the steps above for a selected longitude and latitude, succeeds in downloading images.