Tiled server timeout #96

Open
hannahker opened this issue Sep 1, 2023 · 11 comments
Labels
bug Something isn't working

Comments

@hannahker
Collaborator

hannahker commented Sep 1, 2023

I've noticed some timeout errors in the Tiled HTTPS requests since updating the Tiled server version: httpx.ReadTimeout: The read operation timed out. This happens both locally and on the deployed app (deployed on Plotly's servers). More testing is needed to see whether it can be reproduced reliably.

@hannahker hannahker added the bug Something isn't working label Sep 1, 2023
@hannahker
Collaborator Author

FYI @Wiebke

@hannahker
Collaborator Author

hannahker commented Sep 5, 2023

Hit this again multiple times today while running the app locally, after panning around to view different slices of the lobster claw dataset.

Traceback (most recent call last):
  File "/Users/hannahker/Desktop/mlex/mlex_highres_segmentation/venv/lib/python3.9/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
    yield
  File "/Users/hannahker/Desktop/mlex/mlex_highres_segmentation/venv/lib/python3.9/site-packages/httpcore/_backends/sync.py", line 28, in read
    return self._sock.recv(max_bytes)
  File "/Users/hannahker/miniconda3/lib/python3.9/ssl.py", line 1226, in recv
    return self.read(buflen)
  File "/Users/hannahker/miniconda3/lib/python3.9/ssl.py", line 1101, in read
    return self._sslobj.read(len)
socket.timeout: The read operation timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/hannahker/Desktop/mlex/mlex_highres_segmentation/venv/lib/python3.9/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
    yield
  File "/Users/hannahker/Desktop/mlex/mlex_highres_segmentation/venv/lib/python3.9/site-packages/httpx/_transports/default.py", line 218, in handle_request
    resp = self._pool.handle_request(req)
  File "/Users/hannahker/Desktop/mlex/mlex_highres_segmentation/venv/lib/python3.9/site-packages/httpcore/_sync/connection_pool.py", line 262, in handle_request
    raise exc
  File "/Users/hannahker/Desktop/mlex/mlex_highres_segmentation/venv/lib/python3.9/site-packages/httpcore/_sync/connection_pool.py", line 245, in handle_request
    response = connection.handle_request(request)
  File "/Users/hannahker/Desktop/mlex/mlex_highres_segmentation/venv/lib/python3.9/site-packages/httpcore/_sync/connection.py", line 96, in handle_request
    return self._connection.handle_request(request)
  File "/Users/hannahker/Desktop/mlex/mlex_highres_segmentation/venv/lib/python3.9/site-packages/httpcore/_sync/http11.py", line 121, in handle_request
    raise exc
  File "/Users/hannahker/Desktop/mlex/mlex_highres_segmentation/venv/lib/python3.9/site-packages/httpcore/_sync/http11.py", line 99, in handle_request
    ) = self._receive_response_headers(**kwargs)
  File "/Users/hannahker/Desktop/mlex/mlex_highres_segmentation/venv/lib/python3.9/site-packages/httpcore/_sync/http11.py", line 164, in _receive_response_headers
    event = self._receive_event(timeout=timeout)
  File "/Users/hannahker/Desktop/mlex/mlex_highres_segmentation/venv/lib/python3.9/site-packages/httpcore/_sync/http11.py", line 200, in _receive_event
    data = self._network_stream.read(
  File "/Users/hannahker/Desktop/mlex/mlex_highres_segmentation/venv/lib/python3.9/site-packages/httpcore/_backends/sync.py", line 28, in read
    return self._sock.recv(max_bytes)
  File "/Users/hannahker/miniconda3/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/hannahker/Desktop/mlex/mlex_highres_segmentation/venv/lib/python3.9/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ReadTimeout: The read operation timed out

The above exception was the direct cause of the following exception:

httpx.ReadTimeout: The read operation timed out

@cleaaum
Collaborator

cleaaum commented Sep 5, 2023

I can confirm this has been happening as well on my end during local development.

@cleaaum
Collaborator

cleaaum commented Sep 6, 2023

And every once in a while I get this error:

Traceback (most recent call last):
  File "/Users/cleaaum/Documents/mlex_highres_segmentation/app.py", line 3, in <module>
    from components.control_bar import layout as control_bar_layout
  File "/Users/cleaaum/Documents/mlex_highres_segmentation/components/control_bar.py", line 4, in <module>
    from utils import data_utils
  File "/Users/cleaaum/Documents/mlex_highres_segmentation/utils/data_utils.py", line 106, in <module>
    client = from_uri(TILED_URI, api_key=API_KEY)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/tiled/client/constructors.py", line 61, in from_uri
    context, node_path_parts = Context.from_any_uri(
                               ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/tiled/client/context.py", line 273, in from_any_uri
    context = cls(
              ^^^^
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/tiled/client/context.py", line 151, in __init__
    self.http_client.get(
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/httpx/_client.py", line 1041, in get
    return self.request(
           ^^^^^^^^^^^^^
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/httpx/_client.py", line 814, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/httpx/_client.py", line 901, in send
    response = self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/httpx/_client.py", line 929, in _send_handling_auth
    response = self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/httpx/_client.py", line 966, in _send_handling_redirects
    response = self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/httpx/_client.py", line 1002, in _send_single_request
    response = transport.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/tiled/client/transport.py", line 85, in handle_request
    response = self.transport.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/httpx/_transports/default.py", line 217, in handle_request
    with map_httpcore_exceptions():
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [Errno 8] nodename nor servname provided, or not known

@dylanmcreynolds
Member

It sounds to me like the read timeout is hitting its default of 5 seconds for these requests, and that we're not seeing it because we're closer to the service, maybe?

Can you try changing the default timeout? The tiled client uses the httpx package for its HTTP communication and lets you set the timeout, so you could set it to something large as a test:

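import httpx
from tiled.client import from_uri
# Replace <key> with your API key; 60 seconds is deliberately large for this test.
client = from_uri("http://localhost:8000/", api_key="<key>", timeout=httpx.Timeout(60.0))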

@cleaaum
Collaborator

cleaaum commented Sep 13, 2023

Another (third) type of error I've been getting:

Traceback (most recent call last):
  File "/Users/cleaaum/Documents/mlex_highres_segmentation/app.py", line 4, in <module>
    from callbacks.control_bar import *
  File "/Users/cleaaum/Documents/mlex_highres_segmentation/callbacks/control_bar.py", line 28, in <module>
    from utils.data_utils import (
  File "/Users/cleaaum/Documents/mlex_highres_segmentation/utils/data_utils.py", line 107, in <module>
    client = from_uri(TILED_URI, api_key=API_KEY, timeout=httpx.Timeout(30.0))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/tiled/client/constructors.py", line 69, in from_uri
    return from_context(
           ^^^^^^^^^^^^^
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/tiled/client/constructors.py", line 134, in from_context
    content = handle_error(
              ^^^^^^^^^^^^^
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/tiled/client/utils.py", line 18, in handle_error
    response.raise_for_status()
  File "/Users/cleaaum/opt/miniconda3/envs/lbl/lib/python3.11/site-packages/httpx/_models.py", line 749, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Server error '500 Internal Server Error' for url 'https://mlex-segmentation.als.lbl.gov/api/v1/metadata/'
For more information check: https://httpstatuses.com/500

@dylanmcreynolds
Member

Sorry about that. It looks like an issue in Tiled. I think I have gotten around it by reducing the number of Tiled pods in our setup to just 1. (If you're curious, I reported the issue here.)

Can you try again?

@hannahker
Collaborator Author

@dylanmcreynolds @Wiebke I'm also still getting that 500 Internal Server Error mentioned above. We're also still encountering some timeouts and quite long wait times to retrieve data from the server. I've bumped up the Timeout param in the Tiled client, which has helped, but we're still finding this to be a blocker for development. It still seems to be intermittent -- sometimes things are snappy, other times it seems almost every request times out.

Are there any other strategies that we can look into to improve the consistency of performance on the Tiled server? Client-side caching might help a bit here.
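
One complementary idea for the intermittent timeouts, not proposed by anyone in this thread, is to retry a failed request a few times with a growing delay. The following is a minimal sketch using only the standard library; `call_with_retry` and all of its parameters are hypothetical names chosen for illustration, and it retries on the built-in `TimeoutError` by default.

```python
import time


def call_with_retry(func, *, retries=3, base_delay=0.5,
                    retry_on=(TimeoutError,), sleep=time.sleep):
    """Call func(); on a retryable exception, wait and try again.

    The delay doubles after each failed attempt (exponential backoff).
    Once all attempts are used up, the last exception is re-raised.
    """
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
```

To use it with the Tiled client, one would presumably pass `retry_on=(httpx.ReadTimeout,)` and wrap the call that fetches data. Retries only smooth over occasional failures; they will not help if nearly every request is timing out.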

@dylanmcreynolds
Member

@hannahker, for the 500 errors, there is a clear path, but it will take some time, and I am unfortunately on the road next week.

For the timeouts and data rates, there are two strategies: one that we can tackle on the server, and one that you can probably tackle in Dash.

  • Server caching: we can increase the cache size pretty high. The server-side cache in Tiled can help with repetitive file reads, reducing the number of times the server has to open and close files. I have increased the cache size and turned on caching with these settings:
object_cache:
  available_bytes: 2_000_000_000
  log_level: DEBUG
  • Client-side caching: Dash is the client to Tiled. I'm working on a recommendation, and will update soon.
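
As a rough illustration of the client-side idea (not the recommendation mentioned above, which is still pending), repeated requests for the same slice could be served from an in-memory LRU cache inside the Dash process. This is a minimal sketch: `make_cached_fetch` and the `fetch` callable are hypothetical, with `fetch` standing in for whatever function retrieves a slice from Tiled by index.

```python
from functools import lru_cache


def make_cached_fetch(fetch, maxsize=128):
    """Wrap a slice-fetching callable in an in-memory LRU cache.

    `fetch` is assumed to take a single hashable slice index and
    return the data for that slice.
    """
    @lru_cache(maxsize=maxsize)
    def cached(slice_index):
        # Only reached on a cache miss; hits return the stored result.
        return fetch(slice_index)

    return cached
```

The cache lives in one process, so a deployment with several workers would keep one cache per worker, and `maxsize` has to be chosen with the size of each slice in mind.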

@hannahker
Collaborator Author

We've addressed this for now by enhancing the data available when running a local Tiled server. This is a suitable workaround for our local development, but we're still hitting these issues frequently on the apps deployed to our servers, making it difficult to properly test our work in a deployed environment.

@Wiebke
Member

Wiebke commented Oct 3, 2023

See additional comments on client-side caching in #133
