Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Channels "time" never update / polling never finishes #86

Open
kingspride opened this issue Apr 22, 2024 · 6 comments
Open

Channels "time" never update / polling never finishes #86

kingspride opened this issue Apr 22, 2024 · 6 comments

Comments

@kingspride
Copy link
Contributor

kingspride commented Apr 22, 2024

Hello,

I have a question regarding the recceivers' behaviour.

see this excerpt from loglevel debug:

2024-04-23T01:15:22+0200 [-] DEBUG:recceiver.cfstore Polling begins...
2024-04-23T01:15:22+0200 [-] DEBUG:recceiver.cfstore Find existing channels by IOCID: 10.0.0.15:34980
2024-04-23T01:15:22+0200 [-] DEBUG:urllib3.connectionpool http://localhost:8080 "GET /ChannelFinder/resources/channels?iocid=10.0.0.15%3A34980 HTTP/1.1" 200 None
2024-04-23T01:15:22+0200 [-] DEBUG:recceiver.cfstore Find existing channels by name: FHIFEL:GagePort:Port1_value|FHIFEL:GagePort:Port2_value|FHIFEL:GagePort:Port1_count|FHIFEL:GagePort:Port2_count
2024-04-23T01:15:22+0200 [-] DEBUG:urllib3.connectionpool http://localhost:8080 "GET /ChannelFinder/resources/channels?~name=FHIFEL%3AGagePort%3APort1_value%7CFHIFEL%3AGagePort%3APort2_value%7CFHIFEL%3AGagePort%3APort1_count%7CFHIFEL%3AGagePort%3APort2_count HTTP/1.1" 200 None
2024-04-23T01:15:22+0200 [-] DEBUG:recceiver.cfstore Add existing channel with different IOC: {'name': 'FHIFEL:GagePort:Port1_value', 'owner': 'admin', 'properties': [{'name': 'hostName', 'owner': 'cfstore', 'value': 'fel03'}, {'name': 'iocName', 'owner': 'cfstore', 'value': 'iocGagePort'}, {'name': 'iocid', 'owner': 'cfstore', 'value': '10.0.0.15:34980'}, {'name': 'pvStatus', 'owner': 'cfstore', 'value': 'Active'}, {'name': 'time', 'owner': 'cfstore', 'value': '2024-04-23 01:15:22.941623'}, {'name': 'recceiverID', 'owner': 'cfstore', 'value': 'fel-recceiver'}, {'name': 'recordType', 'owner': 'cfstore', 'value': 'ai'}], 'tags': []}
2024-04-23T01:15:22+0200 [-] DEBUG:recceiver.cfstore Add existing channel with different IOC: {'name': 'FHIFEL:GagePort:Port2_value', 'owner': 'admin', 'properties': [{'name': 'hostName', 'owner': 'cfstore', 'value': 'fel03'}, {'name': 'iocName', 'owner': 'cfstore', 'value': 'iocGagePort'}, {'name': 'iocid', 'owner': 'cfstore', 'value': '10.0.0.15:34980'}, {'name': 'pvStatus', 'owner': 'cfstore', 'value': 'Active'}, {'name': 'time', 'owner': 'cfstore', 'value': '2024-04-23 01:15:22.941623'}, {'name': 'recceiverID', 'owner': 'cfstore', 'value': 'fel-recceiver'}, {'name': 'recordType', 'owner': 'cfstore', 'value': 'ai'}], 'tags': []}
2024-04-23T01:15:22+0200 [-] DEBUG:recceiver.cfstore Add existing channel with different IOC: {'name': 'FHIFEL:GagePort:Port1_count', 'owner': 'admin', 'properties': [{'name': 'hostName', 'owner': 'cfstore', 'value': 'fel03'}, {'name': 'iocName', 'owner': 'cfstore', 'value': 'iocGagePort'}, {'name': 'iocid', 'owner': 'cfstore', 'value': '10.0.0.15:34980'}, {'name': 'pvStatus', 'owner': 'cfstore', 'value': 'Active'}, {'name': 'time', 'owner': 'cfstore', 'value': '2024-04-23 01:15:22.941623'}, {'name': 'recceiverID', 'owner': 'cfstore', 'value': 'fel-recceiver'}, {'name': 'recordType', 'owner': 'cfstore', 'value': 'ai'}], 'tags': []}
2024-04-23T01:15:22+0200 [-] DEBUG:recceiver.cfstore Add existing channel with different IOC: {'name': 'FHIFEL:GagePort:Port2_count', 'owner': 'admin', 'properties': [{'name': 'hostName', 'owner': 'cfstore', 'value': 'fel03'}, {'name': 'iocName', 'owner': 'cfstore', 'value': 'iocGagePort'}, {'name': 'iocid', 'owner': 'cfstore', 'value': '10.0.0.15:34980'}, {'name': 'pvStatus', 'owner': 'cfstore', 'value': 'Active'}, {'name': 'time', 'owner': 'cfstore', 'value': '2024-04-23 01:15:22.941623'}, {'name': 'recceiverID', 'owner': 'cfstore', 'value': 'fel-recceiver'}, {'name': 'recordType', 'owner': 'cfstore', 'value': 'ai'}], 'tags': []}
2024-04-23T01:15:22+0200 [-] INFO:recceiver.cfstore Total channels to update: 4

what I understand that happens is:

  • recceiver server announces presence
  • reccaster client connects
  • client uploads records
  • server checks for existing matching channels in channelfinder, by "iocid".
  • does not find any since the client port changed since last time (iocid contains IP:Port)
  • updates channel completely "update channel with different IOC"
  • client and server keep connection up, and play pingpong
  • nothing else happens.

my question now is:
is it intended that the time, and pvStatus properties never update? why?
I noticed that the debug log states "Polling begins", but never "polling complete", or "update channel with same IOC".
shouldnt the time and pvStatus be regularly updated as some kind of heartbeat, and put a channel "inactive", when a client disconnects / disappears?

For example, I have a couple of channels in my channelfinder marked as active, but they are offline right now. You cant tell from the channelfinder and would assume they are usable.

I need some clarification :D

thanks in advance,
William

@jacomago
Copy link
Contributor

Hi William,

Yes, pvStatus is the status the last time the channel changed. The heartbeat (ping pong) is to update the status of the IOC to Inactive if the IOC goes down. I don't know if we ever want to change this @shroffk ?

If you have channels that are marked as active but they are down, then a possible situation is the ChannelFinder and/or recceiver was down when the ioc went down. You can use the cleanOnStart configuration in the recceiver to mark all channels Inactive on boot to avoid this problem. Otherwise you've likely found a bug :) If you can reproduce please let us know!

I hope that helps!
Sky

@kingspride
Copy link
Contributor Author

hm, okay.
I'll rewrite my custom webinterface / list to make clear its "last changed".

also, I have weird behavior on my channelfinder itself, as I just noticed...
It says it has 24753 channels in total, but when I try to get any channel after the 10,000th, I get an Error 500 and "all shards failed" in the log.

so far, channelfinder and all its surroundings are not very pleasant to maintain or use, unfortunately. :(
I appreciate the idea and concept but somehow a lot can go wrong and make stuff kind of unreliable...
sorry :)

@jacomago
Copy link
Contributor

@kingspride Don't worry I agree. We've had some long debug sessions of recsync up to ChannelFinder, it can get confusing. More help in making it more stable is always appreciated.

The 10 000 is a limit on the Elasticsearch side, but I'm surprised by the error. Can you give me more details on the problem in the ChannelFinderService repo?

@kingspride
Copy link
Contributor Author

kingspride commented Apr 23, 2024

sure. see you on the other side :)

ChannelFinder/ChannelFinderService#139

@shroffk
Copy link
Collaborator

shroffk commented Apr 23, 2024

is it intended that the time, and pvStatus properties never update? why?

The time is meant to reflect the last change on the IOC side, it should match when the IOC was last started.
The pvStatus should update when the connection between the IOC is lost... usually when IOC's go down it might be due to broader n/w issues where the recceiver also goes down, in these situations you might have Channels marked active when the IOC is not.
The hope is that when everything recovers then the data in CF will be corrected once the recceiver+IOC connections are re-established.

CF is trying to build central view over the distributed nature of EPICS which might have edge cases.

We are working on a newer version of recciever and contributions are welcome

@kingspride
Copy link
Contributor Author

thanks!

@jacomago I see you are working on fixing the cleaning operation since its affected by the 10k limit aswell.
Otherwise, I would have reported that here...

You can close the issue if you want, my questions are depleted :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants