Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't read data via lm-sensors under certain circumstances #133

Open
goc9000 opened this issue Jun 19, 2024 · 4 comments
Open

Can't read data via lm-sensors under certain circumstances #133

goc9000 opened this issue Jun 19, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@goc9000
Copy link

goc9000 commented Jun 19, 2024

Description

On my machine, the hwmon source produces a confusing, unusable mess, so I switched to the lm-sensors source. This worked great with Freon, and it allows one to add overrides in /etc/sensors.d to edit out badly configured or defective sensors.

However, Astra does not display any sensors at all from that source.

I did some digging and I think the problem is that Astra Monitor gets the data by executing sensors -j. I think the expectation was that a JSON output is easier to machine-parse than the regular human-readable version.

Unfortunately, the JSON output as read by Astra Monitor is not guaranteed to be valid. There are at least two bugs which come into play here:

First, if sensors fails to read some values (e.g. sensors are broken, device is turned off etc), it will emit messages on stderr. For instance, on my machine, the output of sensors -j begins like this:

{
   "iwlwifi_1-virtual-0":{
      "Adapter": "Virtual device",
      "temp1":{
ERROR: Can't get value of subfeature temp1_input: Can't read

      }
   },
(... many more lines omitted ...)

Note the glaring error message. It is emitted on stderr, but Utils.executeCommandAsync redirects stderr to stdout, so the message breaks the JSON structure.

Second, and this is a severe bug with sensors itself, even if I turn off the offending sensors in /etc/sensors.d, the output is now:

{
   "iwlwifi_1-virtual-0":{
      "Adapter": "Virtual device",
   },
(... many more lines omitted ...)

Note the trailing comma, invalid for standard JSON. This is because sensors has no real JSON support, it just pastes text lines and doesn't account for all edge cases.

It's unlikely that the bug will be fixed in sensors anytime soon, but here are some workarounds I recommend:

  1. Ensure stderr output is not mixed with stdout, at least for sensors
  2. Apply a regex before JSON.parse to try to get rid of trailing commas, e.g. stdoutString = stdoutString.replace(/,\s*(?=}|])/g, '');

Steps to Reproduce

  1. Ensure that sensors produces some output on your machine

  2. Disable all sensors under some minor category in lm-sensors by adding e.g. /etc/sensors.d/temp.conf with a content like:

    chip "chip-name-here"
            ignore temp1
            ignore temp2
            ... etc ...
    
  3. Run sensors again and note that it should now show a traling comma for that chip/adapter

  4. Switch to lm-sensors in the extension and observe how it doesn't detect any sensors at all anymore

Environment

  • Astra Monitor Version: 25 (EGOv42)
  • GNOME Version: 46
  • Operating System and Version: Ubuntu 24.04
  • Other Relevant System Information:

Logs

Error seen in logs:

Jun 20 00:13:58 REDACTED gjs[1175960]: ###### Astra Monitor ###### Error getting sensors sources: SyntaxError: JSON.parse: expected double-quoted property name at line 4 column 4 of the JSON data

Sensors output with sensor error and stderr redirected to stdout:

sensors_output_with_error.txt

Sensors output with the offending sensor disabled:

sensors_output_with_ignore.txt

Additional Context

N/A

@goc9000 goc9000 added the bug Something isn't working label Jun 19, 2024
@ljuzig
Copy link
Collaborator

ljuzig commented Jul 2, 2024

Thank you! I should create a debug build to better understand why hwmon source produces "a confusing, unusable mess" on your system. That should be the preferred source since it has a far better performance.

Utils.executeCommandAsync shouldn't redirect the stderr into stdout, I used to have the very same error on my system as yours and I've always been able to use lm-sensors until hwmon source has been introduced, the problem must be somewhere else. I'm gonna investigate further into that if I'm able to reproduce it.

The sensors bug with trailing commas is an interesting one, didn't know that. I'm gonna fix that.

ljuzig added a commit that referenced this issue Jul 2, 2024
@goc9000
Copy link
Author

goc9000 commented Jul 3, 2024

Thanks for working on this.

Regarding hwmon, the issue is complicated. Reading the data technically works, but there are are several problems that would require a more thorough re-engineering for it to be useful:

  • Astra shows every hwmon entry as a "sensor" but a lot of those aren't really "sensors" but rather Min/Max values, PWM settings, flags etc. Astra just reports them indiscriminately in the same place, and it's really hard to see what's going on. I'm getting 120+ "sensors" coming from one chip, when in fact there are only 8 or 9 useful values or so. lm-sensors merges the min/max/critical indicators together with the current value, at the very least, and doesn't report flags or such.

  • As far as I know, there isn't any way in hwmon to ignore sensors, rename sensors, or rescale the values. All of which can be done in lm-sensors and can help remove or fix bad data.

  • Astra shows sensors of all types: temperature, fan speed, voltages, PWM settings, etc. and seems to treat them equally. In practice though, this information is not equally important. Temperatures will often be the most important indicator, then fan speeds. Voltages and PWM settings tend to be used for debugging and should be normally deemphasized. It's a hard problem, but there should be some way for the user to adjust the priority of various categories of sensors, so that they don't get overloaded with hundreds of values on one screen.

@ljuzig
Copy link
Collaborator

ljuzig commented Jul 26, 2024

  • I should manage min-max and other variants better, such as including them in the same row as the entry, but this would require a significant amount of work at the moment.
  • I can introduce a regex or a method to exclude sensors, similar to how I handle network interfaces or storage devices.
  • The current category order is determined by hwmon and shouldn't be random but rather intentional.

Having a lot of sensors is currently considered an edge case. In every device I've tested, I've encountered a manageable number of sensors. Nonetheless, I would like to address edge cases as well, but please note that it is not at the top of the priority list right now. If you could provide a screenshot or something to help me understand your specific case, it would be very helpful.

@goc9000
Copy link
Author

goc9000 commented Sep 17, 2024

I've split off the discussion about the hwmon display to a separate issue ( #145 ).

Note that I've upgraded to the 26 EGOv43 version which should include the trailing commas fix but the lm-sensors source still doesn't work. There must be more to it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants