
If the IOC reboots, WeTest should retry the last test stage if the failure reason was "unable to connect to setter..." #56

Open
vnadot opened this issue Jun 8, 2022 · 4 comments
Assignees
Labels
enhancement New feature or request

Comments

@vnadot
Contributor

vnadot commented Jun 8, 2022

Description of the case:

  • Every test stage succeeded up to stage "400". The IOC then rebooted during stage "400", so WeTest raised an alarm on that stage: "unable to connect to setter...". The IOC then restarted and the connection was regained. Finally, WeTest carried on with the other test stages (800.0 and so on), and they worked fine.

What I'm expecting:

  • If communication is regained, retry stage "400" when the reason for the failure was "unable to connect to setter...".

image

@vnadot vnadot added bug Something isn't working enhancement New feature or request labels Jun 8, 2022
@gohierf
Contributor

gohierf commented Jun 9, 2022

Hi,

How can WeTest know that the issue is that the IOC has rebooted?

WeTest only interfaces with the PVs using caget and caput. I'm not sure how it could tell whether the IOC is restarting, whether there is a network issue, or whether the PV name is wrong.

Aside from this specific situation, we could implement a feature that goes like this:

  • the test has failed (using retry or not)
  • the test is paused (using on_failure with pause to get user input)
  • the user can currently choose to continue or abort; we could add a retry option, which is the same as continue except that the last test is run again.
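
The flow in the list above could be sketched roughly as follows. This is only an illustration, not WeTest's actual code: `run_with_prompt`, `ask_user`, and the `"continue"`/`"abort"`/`"retry"` strings are hypothetical names, and each test is modelled as a plain callable that returns True on success.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical sketch: none of these names come from WeTest itself.
# A test is a (name, callable) pair; the callable returns True on success.
Test = Tuple[str, Callable[[], bool]]


def run_with_prompt(tests: Iterable[Test],
                    ask_user: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Run tests in order; on failure, pause and let the user choose.

    ask_user(test_name) is expected to return "continue", "abort" or "retry".
    Returns a list of (test name, outcome) pairs.
    """
    results = []
    for name, test in tests:
        while True:
            if test():
                results.append((name, "success"))
                break
            choice = ask_user(name)  # the run is paused here, waiting for input
            if choice == "retry":
                continue  # same as "continue", but the last test runs again
            results.append((name, "failure"))
            if choice == "abort":
                return results
            break  # "continue": move on to the next test
    return results
```

With this shape, a stage that failed because the IOC was rebooting could be retried by hand once the connection is back, without WeTest having to guess the cause of the failure.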

@gohierf
Contributor

gohierf commented Jun 9, 2022

I fail to see why this issue is tagged as "bug". WeTest did try to set and get a PV which didn't answer; that is the expected behavior, isn't it?

@gohierf gohierf removed the bug Something isn't working label Jun 9, 2022
@vnadot
Contributor Author

vnadot commented Jun 13, 2022

I think pyepics provides a callback for PV connection changes. I think Stéphane is using that for gengiscan. If a connection issue shows up while a test is running (on a PV that was connected before), you can just pause/retry the test.
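
A minimal sketch of that idea, assuming the pyepics `connection_callback` argument of `epics.PV`, which is called with `pvname` and `conn` keyword arguments. `ConnectionTracker`, `dropped_since` and `watch` are hypothetical helper names, not part of pyepics or WeTest.

```python
import time


class ConnectionTracker:
    """Records connection changes reported by a pyepics connection callback."""

    def __init__(self):
        self.connected = {}  # PV name -> last known connection state
        self.last_drop = {}  # PV name -> monotonic time of the last disconnect

    def on_connection(self, pvname=None, conn=None, **kws):
        # pyepics passes pvname and conn as keyword arguments.
        was_connected = self.connected.get(pvname, False)
        if was_connected and not conn:
            # Only count a drop if the PV was connected before.
            self.last_drop[pvname] = time.monotonic()
        self.connected[pvname] = bool(conn)

    def dropped_since(self, pvname, start):
        """True if the PV lost its connection at or after time `start`."""
        return self.last_drop.get(pvname, float("-inf")) >= start


def watch(pvname, tracker):
    """Create a pyepics PV that reports connection changes to the tracker."""
    import epics  # imported lazily so the tracker can be used without pyepics

    return epics.PV(pvname, connection_callback=tracker.on_connection)
```

A test runner could note `start = time.monotonic()` before a stage and, if the stage fails while `tracker.dropped_since(pvname, start)` is true, pause or retry instead of recording a plain failure.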

I put the "bug" label because a PV connection issue is, to me, not a reason to mark the test as failed.

@stephane-cea
Collaborator

Hi,

Actually with pyepics you might want to simply use the timeout parameter of the get(...) function, like so:

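import epics

timeout = 10.0  # placeholder value; pick something longer than an IOC reboot

pv = epics.pv.get_pv("pv_name")
# get() returns None if no data arrives before the timeout expires
value = pv.get(timeout=timeout)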

The timeout will indicate the maximum time to wait (in seconds I think) for data before returning None (more details here).

With a long enough timeout, would it be OK?
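
If a single long timeout turns out not to be enough, one possible variation is to repeat the `get` a few times before giving up. The sketch below is an assumption about how that might look, not existing WeTest behaviour: `get_with_retry` and its `retries` and `delay` parameters are hypothetical, and `get` stands for any callable that accepts a `timeout` keyword and returns None when no data arrived, as `pv.get` does.

```python
import time
from typing import Any, Callable, Optional


def get_with_retry(get: Callable[..., Any], timeout: float = 5.0,
                   retries: int = 3, delay: float = 1.0) -> Optional[Any]:
    """Call get(timeout=...) until it returns a value or retries run out."""
    for attempt in range(retries):
        value = get(timeout=timeout)
        if value is not None:
            return value
        if attempt < retries - 1:
            time.sleep(delay)  # give the IOC some time to come back
    return None  # still no answer: the caller decides whether this is a failure
```

With pyepics this could be called as `get_with_retry(pv.get, timeout=10.0)`, with `pv` obtained as in the snippet above.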

3 participants