Detect anomalies in getTotalEnergyConsumption return value and fall back to power polling for AMD GPUs #113

Closed
parthraut opened this issue Aug 26, 2024 · 0 comments · Fixed by #132
parthraut commented Aug 26, 2024

The getTotalEnergyConsumption method, which returns the total energy consumption of a GPU, internally calls the amdsmi.amdsmi_get_energy_count() method.

This call sometimes reports wildly inaccurate values that are not physically possible (see this). Zeus will detect such readings, issue a warning, and stop using this amdsmi method internally. It will fall back to power polling by setting the _supportsGetTotalEnergyConsumption attribute in the GPUs class accordingly.
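
One way the detection could look, as a rough sketch only and not the change that eventually landed: sample the energy counter a few times and reject it if any delta is negative or implies an average power far above what the GPU can draw. The names `is_plausible` and `probe_energy_counter`, the `slack` factor, and the sampling parameters are all hypothetical. The counter reader is passed in as a callable, so nothing here assumes a particular amdsmi return format.

```python
import warnings
from typing import Callable


def is_plausible(delta_joules: float, elapsed_s: float,
                 max_power_w: float, slack: float = 1.5) -> bool:
    """Return True if the GPU could physically have used delta_joules in elapsed_s."""
    if elapsed_s <= 0 or delta_joules < 0:
        # An energy counter should never run backwards.
        return False
    # Average power implied by the delta must stay near the GPU's power limit.
    return delta_joules / elapsed_s <= max_power_w * slack


def probe_energy_counter(read_energy_j: Callable[[], float],
                         sleep: Callable[[float], None],
                         max_power_w: float,
                         interval_s: float = 0.1,
                         samples: int = 3) -> bool:
    """Return False (and warn) if any sampled delta is physically impossible.

    Hypothetical helper: the result would be used to set something like
    _supportsGetTotalEnergyConsumption, with False meaning "use power polling".
    The elapsed time is approximated by interval_s; `slack` absorbs the error.
    """
    prev = read_energy_j()
    for _ in range(samples):
        sleep(interval_s)
        cur = read_energy_j()
        if not is_plausible(cur - prev, interval_s, max_power_w):
            warnings.warn(
                "Energy counter returned an implausible value; "
                "falling back to power polling.",
                stacklevel=2,
            )
            return False
        prev = cur
    return True
```

In real use, `read_energy_j` would wrap the amdsmi energy query converted to joules and `sleep` would be `time.sleep`. Injecting both keeps the check testable without a GPU.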

@parthraut parthraut self-assigned this Aug 26, 2024
@jaywonchung jaywonchung linked a pull request Oct 17, 2024 that will close this issue