I've got a system with #4595, where some crucible server components are mismatched with their clients. I managed to import an image by URL:
but when I tried to create an instance, it failed. That's probably not surprising given the mismatches. But I'm trying to confirm it's a mismatch issue. First I found these Nexus log entries:
root@oxz_nexus_f45dcde1-e10d-440b-ade6-a9ab6f283652:~# looker -lwarn < $(svcs -L nexus)
...
17:44:04.961Z WARN f45dcde1-e10d-440b-ade6-a9ab6f283652 (ServerContext): Region requested, not yet created. Retrying in 314.103191ms
    file = nexus/src/app/sagas/common_storage.rs:74
    region = 9d805060-833f-4b77-8a06-4334795b85c1
    saga_id = da8ba743-eb8f-4af1-81dd-5d4574fdda7f
    saga_name = instance-create
17:44:04.961Z WARN f45dcde1-e10d-440b-ade6-a9ab6f283652 (ServerContext): Region requested, not yet created. Retrying in 186.191992ms
    file = nexus/src/app/sagas/common_storage.rs:74
    region = b5afc932-36ad-4491-b6df-339291703c27
    saga_id = da8ba743-eb8f-4af1-81dd-5d4574fdda7f
    saga_name = instance-create
17:44:04.961Z WARN f45dcde1-e10d-440b-ade6-a9ab6f283652 (ServerContext): Region requested, not yet created. Retrying in 270.103584ms
    file = nexus/src/app/sagas/common_storage.rs:74
    region = 204b1e81-6c7a-49e9-a084-2ab26f3f03a4
    saga_id = da8ba743-eb8f-4af1-81dd-5d4574fdda7f
    saga_name = instance-create
17:44:28.584Z ERRO f45dcde1-e10d-440b-ade6-a9ab6f283652 (ServerContext): received error from instance PUT
    error = Error Response: status: 500 Internal Server Error; headers: {"content-type": "application/json", "x-request-id": "a550b4ff-4230-4232-a58e-79c2f480c303", "content-length": "124", "date": "Fri, 01 Dec 2023 17:44:28 GMT"}; value: Error { error_code: Some("Internal"), message: "Internal Server Error", request_id: "a550b4ff-4230-4232-a58e-79c2f480c303" }
    file = nexus/src/app/instance.rs:1097
    instance_id = 499b6192-97b2-42eb-85c4-8a6fa186241b
17:44:28.591Z ERRO f45dcde1-e10d-440b-ade6-a9ab6f283652 (ServerContext): attempted to set instance to Failed after bad put
    file = nexus/src/app/instance.rs:1145
    instance_id = 499b6192-97b2-42eb-85c4-8a6fa186241b
    result = Ok(true)
17:45:43.968Z WARN f45dcde1-e10d-440b-ade6-a9ab6f283652 (dropshot_external): client disconnected before response returned
    file = /home/build/.cargo/git/checkouts/dropshot-a4a923d29dccc492/ff87a01/dropshot/src/server.rs:927
    local_addr = 172.30.2.6:443
    method = GET
    remote_addr = 172.20.17.110:63986
    req_id = 8bcf9942-be5e-4193-a31c-fae19f2100ff
    uri = https://recovery.sys.madrid.eng.oxide.computer/v1/vpcs/841268c9-9e92-4146-87ca-33c3945d9281
17:45:43.970Z WARN f45dcde1-e10d-440b-ade6-a9ab6f283652 (dropshot_external): client disconnected before response returned
    file = /home/build/.cargo/git/checkouts/dropshot-a4a923d29dccc492/ff87a01/dropshot/src/server.rs:927
    local_addr = 172.30.2.6:443
    method = GET
    remote_addr = 172.20.17.110:63986
    req_id = 71ae4f24-8084-4860-833e-cd1c48911ed6
    uri = https://recovery.sys.madrid.eng.oxide.computer/v1/vpc-subnets/6d76b3c6-3d39-41f0-bba0-c00eecf591ad
17:45:43.973Z WARN f45dcde1-e10d-440b-ade6-a9ab6f283652 (dropshot_external): client disconnected before response returned
    file = /home/build/.cargo/git/checkouts/dropshot-a4a923d29dccc492/ff87a01/dropshot/src/server.rs:927
    local_addr = 172.30.2.6:443
    method = GET
    remote_addr = 172.20.17.110:63986
    req_id = ce31fc78-9868-4c0e-9991-e3648255c870
    uri = https://recovery.sys.madrid.eng.oxide.computer/v1/instances/test-instance/external-ips?project=test-project
17:45:45.802Z ERRO f45dcde1-e10d-440b-ade6-a9ab6f283652 (dropshot_external): Error returned from handler: ServiceUnavailable { internal_message: "instance is in state InstanceState(Stopped) and has no active serial console server" }
    file = /home/build/.cargo/git/checkouts/dropshot-a4a923d29dccc492/ff87a01/dropshot/src/websocket.rs:235
    local_addr = 172.30.2.6:443
    method = GET
    remote_addr = 172.20.17.110:64017
    req_id = e59d1de8-c2c3-45b6-8f8d-caeaf4c79d59
    upgrade = websocket
    uri = /v1/instances/test-instance/serial-console/stream?project=test-project&most_recent=262144
The 500 error from "instance PUT" seems like a good bet. But I can't tell from the log message which sled agent it was. I did find it by grepping for the req_id in all the sled agent logs:
Now I think this is saying that the sled agent returned a 500 error because it received a 500 error from something else. But I have no idea from this message what that other thing was. Short of grepping every log file in the system for e663bfe9-c174-4378-bac3-18e1583ae2d9, I'm not sure how to proceed.
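To illustrate the manual correlation step described above, here is a rough sketch in Rust of pulling every log entry that mentions a given ID out of looker-style text. It assumes each entry starts with an `HH:MM:SS...` timestamp line followed by `key = value` lines, as in the Nexus output in this issue. The function names are made up for illustration and are not part of looker or omicron.

```rust
/// Heuristic (an assumption about the log layout): an entry header starts
/// with a timestamp such as "17:44:28.584Z".
fn is_entry_header(line: &str) -> bool {
    let b = line.as_bytes();
    b.len() >= 3 && b[0].is_ascii_digit() && b[1].is_ascii_digit() && b[2] == b':'
}

/// Group the lines of `log` into entries and return only the entries in
/// which at least one line contains `needle` (for example a request ID).
pub fn entries_mentioning<'a>(log: &'a str, needle: &str) -> Vec<Vec<&'a str>> {
    let mut entries: Vec<Vec<&'a str>> = Vec::new();
    for line in log.lines() {
        if is_entry_header(line) || entries.is_empty() {
            // Start a new entry on every header line (or on the very first line).
            entries.push(vec![line]);
        } else if let Some(last) = entries.last_mut() {
            // Anything else is treated as a continuation of the current entry.
            last.push(line);
        }
    }
    entries
        .into_iter()
        .filter(|entry| entry.iter().any(|l| l.contains(needle)))
        .collect()
}
```

Calling `entries_mentioning(&contents, "a550b4ff-4230-4232-a58e-79c2f480c303")` on the contents of each log file would return the whole entry, not just the matching line. That still means searching every file, which is the problem this issue is about.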
omicron_common's Error's internal_context() (analogous to anyhow::Context) should make it easy to attach extra context here about what it was trying to do.
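As a sketch of the idea, the snippet below wraps a downstream failure with a description of what the caller was doing. `ApiError`, `call_downstream`, and `ensure_instance` are hypothetical stand-ins written for this example. They are not omicron_common's actual `Error` type or sled agent code, and the real `internal_context()` may differ in signature and behavior.

```rust
use std::fmt;

/// Hypothetical stand-in for an internal API error (NOT omicron_common's `Error`).
#[derive(Debug)]
pub struct ApiError {
    pub internal_message: String,
}

impl ApiError {
    /// Prefix the internal message with what was being attempted, in the
    /// spirit of `anyhow::Context`.
    pub fn internal_context<C: fmt::Display>(self, context: C) -> Self {
        ApiError {
            internal_message: format!("{}: {}", context, self.internal_message),
        }
    }
}

impl fmt::Display for ApiError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.internal_message)
    }
}

/// Hypothetical downstream request that always fails with a bare 500.
fn call_downstream(service: &str) -> Result<(), ApiError> {
    Err(ApiError {
        internal_message: format!("{} returned 500 Internal Server Error", service),
    })
}

/// Hypothetical handler: attaches context so the logged error names the
/// operation and the service that failed.
pub fn ensure_instance(instance_id: &str, service: &str) -> Result<(), ApiError> {
    call_downstream(service).map_err(|e| {
        e.internal_context(format!("ensuring instance {} via {}", instance_id, service))
    })
}
```

With something like this in place, the logged message would say which operation and which downstream service produced the 500, so nobody has to go hunting for a request ID across every log file.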