Cannot access zwave stick after upgrade to 9.0.1 under Bookworm via Snap #3311

Closed
3 tasks done
jfhautenauven opened this issue Sep 29, 2023 · 51 comments
Labels
question Further information is requested

Comments

@jfhautenauven

Checklist

  • I am not using Home Assistant. Or: a developer has told me to come here.
  • I have checked the troubleshooting section and my problem is not described there.
  • I have read the changelog and my problem is not mentioned there.

Deploy method

Snap

Z-Wave JS UI version

9.0.1

ZwaveJS version

12.0.0

Describe the bug

image

After installing the latest version available on the candidate channel, I get an error saying my stick cannot be accessed anymore.
I think it is related to the snap.

If I install the snap with the --devmode flag, the problem disappears.

To Reproduce

Have a fresh install of Debian Bookworm.
Install Z-Wave JS UI via the Snap package manager.
(Of course, have a stick plugged in.)

Bam, error accessing the stick over serial.

Expected behavior

I should be able to upgrade without losing the connection to my Z-Wave stick :)
@jmgiaever : I think this has already been bubbled up via other channels, but just to make sure, I'm copying you on this issue.

Additional context

No response

@jfhautenauven jfhautenauven added the bug Something isn't working label Sep 29, 2023
@jfhautenauven
Author

Erratum: the devmode trick that used to work in the past doesn't work anymore, same error message... I'm screwed :) HELP :D

@jfhautenauven
Author

some progress : devmode flag active and disabling "soft reset" seems to have done the trick... not optimal, but at least I got my setup working again :)

@jmgiaever
Contributor

Hi,

Have you tried to cold boot your machine? Also unplug the USB device.

@jmgiaever
Contributor

some progress : devmode flag active and disabling "soft reset" seems to have done the trick... not optimal, but at least I got my setup working again :)

Have you tried just to disable soft reset? Without using devmode...

@robertsLando
Member

I think the issue is related to: zwave-js/node-zwave-js#6341

@jfhautenauven
Author

Hi,

Have you tried to cold boot your machine? Also unplug the USB device.

Hi,

Yeah, cold boot gave no result. Also tried plugging and unplugging, but no result either.

I'll try again with just the soft reset disabled. Gimme a few minutes :)

@robertsLando
Member

@jfhautenauven It's related to soft reset, check the issue I linked above. Should be fixed on next zui version

@jfhautenauven
Author

@jfhautenauven It's related to soft reset, check the issue I linked above. Should be fixed on next zui version

@robertsLando : indeed, just tried with soft reset disabled (and devmode disabled), works like a charm.
So indeed, seems soft reset related. Will check again next version.

@jmgiaever : sorry for raising a false alarm, it reminded a previous issue, but was unrelated, sorry for that :)

@sylvaindd

I'm facing this issue as well; I have an Aeotec Gen5.
Disabling soft reset did the trick for me as well.

Will this be a problem in the future? Would changing to a more recent controller fix it?
I'm running ESXi 8 -> Debian -> Docker -> ZwaveJS

@robertsLando
Member

From: zwave-js/node-zwave-js#6341

There are a few known solutions to this:

  1. Prefer /dev/serial/by-id/... paths over /dev/tty... in case the path changes after reconnecting
  2. When using the Aeotec Gen5 or Gen5+, updating the firmware to 1.2 can help, plus you get SmartStart support for free.
  3. If you're using ESXi, updating to 7.0u1 can help.
  4. For some other VMs, this document contains instructions for properly setting up USB passthrough.
  5. Expose the serial port via TCP, either from the host or a different device.
  6. As a last measure, soft-reset can be disabled in the Z-Wave JS UI settings. The stick may need to be re-plugged before it starts working again. It is preferable to try the other solutions first, since being able to restart the stick has the aforementioned upsides.
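
As a rough illustration of the first suggestion, the Python sketch below lists the symlinks under /dev/serial/by-id together with the device node each one currently resolves to. The helper name `list_stable_serial_paths` and its `by_id_dir` parameter are invented for this example and are not part of Z-Wave JS UI; only the directory layout comes from the list above.

```python
import os


def list_stable_serial_paths(by_id_dir="/dev/serial/by-id"):
    """Map each stable by-id symlink to the device node it currently points at.

    Hypothetical helper for illustration only. Returns an empty dict when
    the directory does not exist, e.g. when no USB serial device is attached.
    """
    if not os.path.isdir(by_id_dir):
        return {}
    mapping = {}
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        # Only symlinks are of interest; udev creates one per serial device.
        if os.path.islink(link):
            mapping[link] = os.path.realpath(link)
    return mapping


if __name__ == "__main__":
    for stable, current in list_stable_serial_paths().items():
        print(f"{stable} -> {current}")
```

The key on the left is the path that should survive a re-enumeration; the value on the right is the /dev/tty... node that may change.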

@jmgiaever
Contributor

The package in candidate is now updated, for those who want to give it a try.

@jfhautenauven
Author

yup, gave it a try, version 9.0.3.

So, to summarize and also answer to @robertsLando :

Stick is in FW v1.2
Followed / checked the guide about Synology VM

Still a NOGO.
I'll fully reboot the synology just to make sure.

So: enabling soft reset causes my stick to "freeze" and no longer be detected.
This could be related to the particularities of the Synology environment... at this stage I'm not sure how to investigate further or which logs to provide.

@robertsLando
Member

@jfhautenauven just keep soft reset disabled then. I don't know if @AlCalzone has any other suggestions, but I don't think there is much else to try.

@jfhautenauven
Author

Okay so:

  • Disable soft reset.
  • Shutdown VM.
  • Reboot synology
  • Start up VM via Virtual Machine Manager (syno)
  • Go to ZWJSUI, all works, driver initialized, boots up network, all works

Then

  • go to settings and enable soft reset
  • hit Save
  • back on the equipments page ==> no more sign of life, no error message in the tiny "status" thingy on the upper right corner
  • try going to network view page, loads infinitely (turning circle never ends)
  • going to the "provisioning entities" page : it displays an error message saying the Zwave client is not connected
  • going to settings again and hitting the save button again, there I get an error message displaying in the upper right corner status indicator saying the serial port cannot be opened because it doesn't exist

From there :

  • First attempt : restart the VM via Synology Virtual Machine Manager ==> gives two error messages : the first one, displayed for a few seconds, says there is an error in the Serial API driver executing function softReset (I did not catch the full message, it went too fast for my eyes). The second error message, which stays, says error opening serial port. In the settings, indeed, my serial port device doesn't show up anymore
  • Second attempt : stop the VM via Synology Virtual Machine Manager and unplug the stick for 30 seconds, then replug. Restart VM. ==>
    image

then

image

  • Third attempt : disable soft reset in the settings, shutdown VM, unplug / replug stick on the syno. Start VM ==> all back to normal

image

Although I understand the alleged benefits of soft reset, I would definitely stay away from it in a setup similar to mine:
Aeotec ZW Stick Gen5+ FW 1.2
Synology Virtual Machine + Debian Bookworm + Snap

I'm not sure yet exactly why this issue bubbles up, but I'm starting to wonder whether the soft reset makes the Syno go bananas about the stick and its link between host and VM... or whether it's something else.

For the moment, I'll stay with the Soft Reset disabled.
Trivia: maybe a patch to disable it when using the setup described above could be interesting, because I'm pretty sure that if others update, they'll get issues as well (as my setup is nothing out of the ordinary, I guess I'm not the only one :D )

@robertsLando
Member

@jfhautenauven About this:

  • go to settings and enable soft reset
  • hit Save
  • back on the equipments page ==> no more sign of life, no error message in the tiny "status" thingy on the upper right corner

This is because the status reflects the controller status. Given that the controller is not connected, ZUI keeps trying to open the serial port without success.

  • try going to network view page, loads infinitely (turning circle never ends)

This could be a bug. In any case, it makes no sense to go there or elsewhere in the UI if the controller is not connected, since all APIs require the driver to be up and running.

  • going to the "provisioning entities" page : it displays an error message saying the Zwave client is not connected

Same as above ⬆️

  • going to settings again and hitting the save button again, there I get an error message displaying in the upper right corner status indicator saying the serial port cannot be opened because it doesn't exist

This is because, when the stick is soft reset and you are not using a /dev/serial/by-id path, it can come back under a different path, for example:

/dev/ttyACM0 --> /dev/ttyACM1

This explains the error "cannot be opened because it doesn't exist".
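
To make the path change concrete, here is a small Python sketch that compares the device names seen before and after a reset and reports which ones vanished and which ones appeared. The function name `diff_serial_ports` is hypothetical, and the inputs are plain lists of names so the logic can be tried without any hardware.

```python
def diff_serial_ports(before, after):
    """Return (removed, added) device names between two listings.

    Illustrative helper only: `before` and `after` are iterables of device
    paths, e.g. collected by listing /dev before and after a soft reset.
    """
    before_set, after_set = set(before), set(after)
    removed = sorted(before_set - after_set)
    added = sorted(after_set - before_set)
    return removed, added


if __name__ == "__main__":
    removed, added = diff_serial_ports(
        ["/dev/ttyACM0", "/dev/ttyS0"],
        ["/dev/ttyACM1", "/dev/ttyS0"],
    )
    print("removed:", removed)
    print("added:", added)
```

A non-empty `removed` with an empty `added` would correspond to the stick disappearing entirely rather than being renamed.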

@jfhautenauven
Author

@robertsLando : I've been checking overnight, but I can't seem to get the "by-id" config to show up... am I missing something? Is there something I'm supposed to execute outside of the UI? On the command line?

@robertsLando
Member

robertsLando commented Oct 3, 2023

See this: #2483

@kpine
Contributor

kpine commented Oct 3, 2023

Does Snap even support the /dev/serial/by-id paths? A recent discussion in Discord left me feeling like it does not.

@robertsLando
Member

cc @jmgiaever

@jmgiaever
Contributor

Does Snap even support the /dev/serial/by-id paths? A recent discussion in Discord left me feeling like it does not.

Yes it does. But I'm not sure if it does on all systems.

I know there's something fishy going on in Debian distros. There have been some udev rules or something that messed things up after an upgrade.

And there are also some hacks(?) that need to be done when snap is installed on Raspbian.

Whether this is fixed or still valid, I'm not sure.

On the other hand, snap shouldn't map things the way Docker does, so using the symlinks shouldn't be a problem.

I've never, even once, had a case where I had to change from e.g. ttyACM0 to ttyACM1. And I have quite a lot of installs running, for a very long time.

@robertsLando
Member

robertsLando commented Oct 3, 2023

I had to change from e.g. ttyACM0 to ttyACM1

That is caused by soft reset. The problem is that if you restart that instance, you might need to switch back to ttyACM0 again as both are symlinked

@jmgiaever
Contributor

jmgiaever commented Oct 3, 2023

I'm not sure you are correct, in regards to the snap environment.

I have unplugged the stick and plugged it back in, probably several hundred times, while ZUI is running. And it has never happened that I had to change the path. Not after a soft reset either.

@kpine btw, I switched to the by-id path to verify that it's working. However, I'm running Ubuntu, which ships with snap. Let me know which OS the people who are struggling with it are running, if you know.

@jfhautenauven
Author

@jmgiaever : I can answer you what OS i'm running :)

ZUI via snap on Debian 12 in a virtual machine hosted on a Synology via the Synology Virtual Machine manager.

What I can say for sure is that my stick does not move to ACM001: it doesn't show up in the list after a reboot or anything. I'm left with only 3 choices instead of 4... and those 3 choices have nothing to do with the Z-Wave stick (other peripherals).

@jmgiaever
Contributor

But you find them outside of the snap environment?

In such a case there's probably a udev rule issue, or something like that.

And then you should ask the guys that know more about these things at: https://forum.snapcraft.io/

I'm quite certain this isn't an issue within the snap package itself.

@jfhautenauven
Author

You mean when I try to list them from the shell command line? Yeah, I see them.
What I don't see after the soft reset is the ACM000... it goes away from both ZUI and the OS, as if I had unplugged the stick.

At this stage, to be honest, I don't think I'm clever enough with all the Linux stuff to be of great help :( maybe the others will know better, sorry guys :'(

@jmgiaever
Contributor

Ok, then my guess is it's a VM issue or an OS issue, and most likely not snap.

It must be available in the OS to be available to snap.

@jfhautenauven
Author

I would go even further, i'm thinking about something specific to synology ... but i cannot prove that, just a hunch from previous experience ...

@jmgiaever
Contributor

jmgiaever commented Oct 3, 2023

Or can it simply be the stick? Hardware or software issues?

Have you tried a different stick, just to see how it acts in these situations?

@jfhautenauven
Author

i only have one stick at home, so no, could not test another stick ... i don't think my stick has a hardware problem, since i disabled soft reset, the thing has been running for days ... done several maintenance reboots, and no lockup, nothing out of the ordinary.

@kpine
Contributor

kpine commented Oct 3, 2023

A soft-reset is the same as physically pulling and re-inserting the USB stick. If that doesn't work, then it's a VM configuration problem, and disabling soft-reset avoids the problem entirely.
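
If a soft reset behaves like a replug, the device node should disappear briefly and then come back. The Python sketch below shows one way to check that from the host or inside the VM by polling for a path with a timeout. The function name `wait_for_device` and the default timeout and interval are assumptions chosen for this example, not values used by Z-Wave JS.

```python
import os
import time


def wait_for_device(path, timeout=10.0, interval=0.5):
    """Poll until `path` exists or `timeout` seconds have elapsed.

    Hypothetical helper for illustration. Returns True if the path
    appeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In a VM where USB passthrough does not re-attach the stick after a reset, a call such as `wait_for_device("/dev/ttyACM0")` would be expected to return False, which matches the "as if I had unplugged the stick" symptom described earlier in the thread.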

@AlCalzone AlCalzone added question Further information is requested and removed bug Something isn't working labels Oct 4, 2023
@AlCalzone
Member

Like the aforementioned issue explains: This is likely an incorrect VM configuration. If you can't figure out how it's supposed to be, disabling soft-reset is one solution.

@jfhautenauven
Author

@AlCalzone : I think I'll leave it disabled. If it is supposed to be configured in a more specific way than described in the installation guide, then I'm afraid I'm not able to tweak it any better than I already did.

Thanks anyways folks for investing the time to look into that matter with us all :)

@AlCalzone
Member

That's fine. If we knew the necessary steps for all/most VMs, I'd be happy to document them, but we don't.

@andy-sheen

andy-sheen commented Oct 6, 2023

Also running a snap image. I have exactly the same problem. stable (8.26.0) is fine. Moving to candidate 9.0.3 (as I wanted to upgrade to HA 23.10.0) stopped the code from being able to open the Zwave stick (gen 5 Aeotec). I'm running on an old ESXi (6.0 I think) VM environment with the USB passed through to the HA VM which is running Ubuntu 22.04 latest.

Disabling soft reset allows everything to work. Are there any downsides to disabling soft reset?

@jmgiaever
Contributor

The change in the snap package is only the bump from 9.0.2 to .3, so I don't think the problem is with the snap.

However there might be a change in the code being built (ZUI), but that wouldn't be anything I could do with the snap package to fix it. It has the access it needs to use USB devices (raw-usb).

@AlCalzone
Member

Are there any downsides to disabling soft reset?

It is necessary to apply a restored NVM backup. If you do that, you may have to unplug and replug the stick to achieve the same effect.
And you won't benefit from automatic recovery if your stick becomes unresponsive, but that shouldn't be frequent.

Most likely you won't notice any difference, because soft-reset was disabled automatically before.

@andy-sheen

However there might be a change in the code being built (ZUI), but that wouldn't be anything I could do with the snap package to fix it. It has the access it needs to use USB devices (raw-usb).

OK. As an ex-s/w dev (retired) first thing I'd do is look at the changes between 8.26 and 9.0.x. I suspect there may be a difference in the way the device is accessed. The ONLY thing I have changed here is moving from current to candidate - this exposes the problem, with the soft reset appearing to be the thing that triggers it.

I used to program in C and Python, but have no experience of the codebase here, although I can use git! Does anyone have any pointers to the area of code that deals with this to save me trawling things and trying to figure out how things work in an alien (to me) language?

@jmgiaever
Contributor

jmgiaever commented Oct 6, 2023

look at the changes between 8.26 and 9.0.x

I'm fully aware of that. But as mentioned, the snap already has all the necessary permissions to get access to the USB devices. At least that I'm aware of.

These permissions are not individually developed or maintained by each package maintainer; they are part of the Snap ecosystem, so if there are problems with them, the issue must be in the Snap environment and not in that particular package.

We simply add support for these permissions (interfaces) by adding a few lines of text to the "snap recipe".

If you find any permissions that are missing please let me know, but I have maybe 20 instances running with ZUI on different OSes, and I don't have issues with any of them so I'm not able to debug it.

What you can do is to install the snappy-debug tool and see what it spits out when the stick crashes.

If you don't find anything that can relate to this issue, then you can also try to disable ZUI and run it manually.

sudo snap stop zwave-js-ui 
sudo zwave-js-ui.disable
sudo zwave-js-ui.exec

(Quitting the running command with e.g. CTRL+C will exit the execution.)

You will now be able to see potential application crashes in the terminal window, which isn't visible in the logs made by ZUI or ZWJS.

If you find any, please report them.

When you want to enable ZUI as a service again, quit the execution and issue sudo zwave-js-ui.enable

@andy-sheen

look at the changes between 8.26 and 9.0.x

I'm fully aware of that. But as mentioned, the snap already has all the necessary permissions to get access to the USB devices. At least that I'm aware of.

These permissions are not individually developed or maintained by each package maintainer; they are part of the Snap ecosystem, so if there are problems with them, the issue must be in the Snap environment and not in that particular package.

Given the way this manifests, I don't believe this is a snap permissions issue (although see below if a new permission has been added to the open).

If soft-reset is disabled, the system works perfectly fine. The stick can be opened, and data read from it. Whatever happens when soft-reset is enabled stops the stick from being accessed. See the pic. in the first post by @jfhautenauven . The serial port cannot be opened at all. This is the symptom I am seeing. Enable soft-reset and you can't access the stick. Disable it and it is fine. I'm wondering if a new attribute has been added to the open command if soft-reset is enabled. Kind of like:

if soft_reset:
    attrs = attrs + "new attr"
open(stick, attrs)

Thanks for the help on debugging - I'm away for a few days from tomorrow so will look at it when back.

@andy-sheen

As a PS. I have just tried enabling soft reset in the UI. I get the following error messages in the overview panel after about 30 seconds and the device does not list any entities.

Zwave errors

The logs (snap logs zwave-js-ui -f) give simply a

2023-10-06T17:57:41+01:00 zwave-js-ui.zwave-js-ui[19211]: 2023-10-06 17:57:41.886 INFO Z-WAVE-SERVER: Server closed

They do not give

2023-10-06T17:58:06+01:00 zwave-js-ui.zwave-js-ui[19211]: 2023-10-06 17:58:06.447 INFO Z-WAVE-SERVER: ZwaveJS server listening on 0.0.0.0:3000
2023-10-06T17:58:07+01:00 zwave-js-ui.zwave-js-ui[19211]: 2023-10-06 17:58:07.364 INFO Z-WAVE-SERVER: DNS Service Discovery enabled

when soft reset is enabled.
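
The check described above can be automated. The following Python sketch scans log lines such as those from `snap logs zwave-js-ui` and reports whether a "server listening" line appears after the last "Server closed" line. The function name `server_came_back` is made up for this example, and the matched substrings are taken from the two log excerpts quoted above; other ZUI versions may word these messages differently.

```python
def server_came_back(log_lines):
    """Tell whether the Z-Wave server restarted after it was closed.

    Illustrative helper only. Returns None when no 'Server closed' line is
    present, True when a 'listening' line follows the last 'Server closed'
    line, and False otherwise.
    """
    closed_index = None
    listening_index = None
    for i, line in enumerate(log_lines):
        if "Z-WAVE-SERVER: Server closed" in line:
            closed_index = i
        elif "Z-WAVE-SERVER: ZwaveJS server listening" in line:
            listening_index = i
    if closed_index is None:
        return None
    return listening_index is not None and listening_index > closed_index
```

With soft reset enabled, the logs described in this comment would make the function return False; with it disabled, True.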

@andy-sheen

PPS. toggling soft reset back off and everything works fine.

@jmgiaever
Contributor

I'm quite certain it isn't either. However, it could help to try executing the snap directly as I explained, and not as a service, just to see if it outputs something valuable when this happens.

E.g. an exception is thrown or something that can help @AlCalzone investigate this further.

And have you tried to run the PKG version of ZUI, just to verify whether or not the issue is just related to the snap package?

If you can give it a try, that would be great. Stop the snap service first, so there are no conflicting resources. You probably want to copy your «store dir» from the snap package to the location PKG is using.

The store directory is typically located in the /var/snap/zwave-js-ui/<rev>/ directory, where <rev> is the revision you are running. There is normally a symlink «current» in the same folder as all the revisions, which points to the revision in use.
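
For anyone unsure which revision directory to copy from, this Python sketch resolves the «current» symlink and returns the name of the revision directory it points to. The function name `current_snap_revision` is hypothetical; the default path is the one mentioned above and may differ on other installs.

```python
import os


def current_snap_revision(snap_data_dir="/var/snap/zwave-js-ui"):
    """Return the revision directory name that `current` points to.

    Hypothetical helper for illustration. Returns None when there is no
    `current` symlink in `snap_data_dir`.
    """
    link = os.path.join(snap_data_dir, "current")
    if not os.path.islink(link):
        return None
    # realpath follows the symlink; the last path component is the revision.
    return os.path.basename(os.path.realpath(link))


if __name__ == "__main__":
    print(current_snap_revision())
```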

@robertsLando could probably tell where the PKG version stores its «store dir». It might just be in the same directory you're in when you start the PKG version.

@jmgiaever
Contributor

They do not give

But it gives this when it's disabled, or is it other installation methods of ZUI that give this when it's enabled?

@andy-sheen

OK. Will leave that for sometime next week unless anyone has figured it out. Beer time now!

@andy-sheen

They do not give

But it gives this when it's disabled, or is it other installation methods of ZUI that give this when it's enabled?

Yes. With soft reset disabled, it gives this. With soft reset enabled, it just gives the closed message

@jmgiaever
Contributor

They do not give

But it gives this when it's disabled, or is it other installation methods of ZUI that give this when it's enabled?

Yes. With soft reset disabled, it gives this. With soft reset enabled, it just gives the closed message

Ok. Give the PKG version a go when you can, and report back if it acts similarly. Enjoy 🍻

@AlCalzone
Member

The only thing disabling soft-reset does is prevent Z-Wave JS from sending the soft-reset command. The USB port isn't opened differently or anything like that.

@andy-sheen

The only thing disabling soft-reset does is prevent Z-Wave JS from sending the soft-reset command. The USB port isn't opened differently or anything like that.

How incredibly weird. Given that the two people having issues are both on VMs, it may be that. I really should build a new VM system that is a bit more modern than ESXi 6...

@AlCalzone
Member

It's definitely the VM: zwave-js/node-zwave-js#6341

@andy-sheen

Thanks.

@robertsLando
Member

Closing then.


7 participants