API: rewrite fetching functions in script `download`

"API" change for module `download`.
The module `download` is not considered part of
the API of the package `dd`.

- API: rewrite `download.fetch()`:

  - API: rename parameter to `filename` (was named `fname`)

  - API: do not return any value (was returning the filename)

  - DOC: add docstring to `fetch()`

  - UI: print more detailed messages

  - BUG: catch a `None` that can be returned by the function
    `urllib.request.urlopen()` in rare circumstances.
    Quoting the [documentation][urlopen]:

    > Note that `None` may be returned if no handler handles the
    > request (though the default installed global `OpenerDirector`
    > uses `UnknownHandler` to ensure this never happens).

  - BUG: call the `close()` method of the
    [`http.client.HTTPResponse`][http_response] instance
    that is returned from the function
    [`urllib.request.urlopen()`][urlopen]

    Do so using a `with` statement, which
    [is supported by `HTTPResponse` objects][http_with]
    (see also these [examples][howto_urllib2]).

    In order to handle `URLError` exceptions separately from
    exceptions related to the local file,
    `urllib.request.urlopen()` is called within a `try` statement,
    and the response is later used in a `with` statement,
    within which the method [`HTTPResponse.read()`][http_read]
    is called.

    The `HTTPResponse` and the opened file are used as
    two context managers within a single `with` statement,
    by writing two [`with_item`s][with_item].

  - UI: catch `urllib.error.URLError` and chain it with
    a `RuntimeError` that points to relevant documentation.

    [PEP 3134](https://www.python.org/dev/peps/pep-3134/)
    introduced exception chaining. Exception chaining
    [happens automatically within `except` sections](https://docs.python.org/3/tutorial/errors.html#exception-chaining),
    but the message differs from that of explicit exception chaining
    (i.e., `raise RuntimeError('...') from url_error`).
    This is why explicit exception chaining has been used.

  - API: check whether the CUDD tarball has already been downloaded
    and has the expected hash. If yes, then do not re-download.

    NOTE: if the hash differs, raise an error,
    instead of re-downloading.

- REF: extract part of function `download.fetch()` as
  the new function `download._assert_sha()`
  (which checks the SHA-256, and raises
  a more detailed exception message)


## Writing to a file before checking the hash

The downloaded data is first written to a file, and the file is
then read into a `bytes` object to check the hash value.
This round trip could be avoided by checking the hash of
the `bytes` object returned by the method `HTTPResponse.read()`,
and then writing that `bytes` object to a file.

Nonetheless, first writing to a file, then reading from
the file to check the hash, facilitates diagnosing
the causes of errors. For example, if the hash does not match,
or any other exception is raised in Python code,
the downloaded data has already been written to disk.


## `ConnectionError` upon reading

Note that the method `HTTPResponse.read()` can raise
a [`ConnectionError`][connection_error]. This is not expected
to happen in the script `download`, because `read()` is called
almost immediately after the `HTTPResponse` is created.
More details are described next.

From experimenting outside the script `download`,
a [`ConnectionResetError`][connetion_reset_error] is observed
when a relatively long time interval elapses between
the call to the function `urllib.request.urlopen()` and
the call to the method `read()` of the `HTTPResponse` object
that has been returned by `urlopen()`.


## About importing `urllib.request`

Note that `import urllib` does not import `urllib.request`.
Within `ipython`, `urllib.request` *is* imported upon startup.

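
## Sketch of the download flow

The steps listed above can be illustrated with a minimal sketch.
This is not the code of this commit: the names `fetch`, `_assert_sha`,
and `filename` come from the description above, whereas the parameters
`url` and `expected_sha`, and all message texts, are assumptions made
for illustration. The sketch imports `urllib.request` explicitly,
for the reason given in the previous section.

```python
import hashlib
import os
import urllib.error
import urllib.request  # `import urllib` alone does not import this


def _assert_sha(filename, expected_sha):
    """Raise `RuntimeError` if SHA-256 of `filename` differs."""
    with open(filename, 'rb') as f:
        data = f.read()
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha:
        raise RuntimeError(
            f'SHA-256 of `{filename}` is {actual}, '
            f'but expected {expected_sha}')


def fetch(url, filename, expected_sha):
    """Download `url` to `filename`, unless present with expected hash.

    `url` and `expected_sha` are hypothetical parameters.
    """
    if os.path.isfile(filename):
        # a mismatch raises an error, instead of re-downloading
        _assert_sha(filename, expected_sha)
        print(f'File `{filename}` exists and has the expected hash.')
        return
    print(f'Attempting to download: {url}')
    try:
        response = urllib.request.urlopen(url)
    except urllib.error.URLError as url_error:
        # explicit exception chaining (PEP 3134)
        raise RuntimeError(
            f'Could not open `{url}`. Read the documentation '
            'of `urllib.request` for details.') from url_error
    if response is None:
        raise RuntimeError(
            f'No handler handled the request to `{url}`.')
    # two `with_item`s: the response and the local file
    # are both closed when the `with` statement exits
    with response, open(filename, 'wb') as f:
        f.write(response.read())
    # the data is on disk before the hash is checked,
    # so it can be inspected if the check fails
    _assert_sha(filename, expected_sha)
    print(f'Downloaded and checked: {filename}')
```

The sketch can be tried without network access by passing
a `file://` URL of a local file, since `urllib.request.urlopen()`
also handles that scheme.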
[http_response]: https://docs.python.org/3/library/http.client.html#http.client.HTTPResponse
[http_with]: https://docs.python.org/3/library/http.client.html#httpresponse-objects
[http_read]: https://docs.python.org/3/library/http.client.html#http.client.HTTPResponse.read
[urlopen]: https://docs.python.org/3/library/urllib.request.html#urllib.request.urlopen
[howto_urllib2]: https://docs.python.org/3/howto/urllib2.html
[connection_error]: https://docs.python.org/3/library/exceptions.html#ConnectionError
[connetion_reset_error]: https://docs.python.org/3/library/exceptions.html#ConnectionResetError
[with_item]: https://docs.python.org/3/reference/compound_stmts.html#the-with-statement