Add smartctl_device_power_mode and handle standby JSON #327
Open
yonran wants to merge 2 commits into prometheus-community:master from
Hey @yonran, thank you for your effort. Are you able to finalize this by signing your commits? I believe that's why it can't be merged, as the DCO check fails. It would be really nice to see this metric included ;) I'm not related to this project; it's just my guess about the DCO.
Export power mode state from smartctl's power_mode JSON field. This allows monitoring which drives are spinning vs sleeping without waking them up during collection.
Gauge: smartctl_device_power_mode{device}. The value is the ata_value: 0=standby, 255=active, etc.
Cache JSON for standby drives (when smartctl --nocheck=standby exit code is 2) instead of returning stale data from when it was active.
Signed-off-by: Yonathan Randolph <[email protected]>
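The caching rule in the commit message above could be sketched roughly as follows. This is a minimal illustration, not the exporter's actual code: the `probe` type and the `shouldCacheStandby` function are hypothetical names, and the sketch assumes that the JSON carries `smartctl.exit_status` alongside the `power_mode` object.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// standbyBit is bit 1 (value 2) of smartctl's exit status, the bit the commit
// message associates with a --nocheck=standby early exit.
const standbyBit = 1 << 1

// probe models only the fields this sketch inspects (hypothetical type).
type probe struct {
	Smartctl struct {
		ExitStatus int `json:"exit_status"`
	} `json:"smartctl"`
	// A pointer distinguishes a missing power_mode object from a present one.
	PowerMode *struct {
		ATAValue int `json:"ata_value"`
	} `json:"power_mode"`
}

// shouldCacheStandby reports whether raw looks like a standby response that
// is still worth caching: exit-status bit 1 is set and power_mode is present.
func shouldCacheStandby(raw []byte) (bool, error) {
	var p probe
	if err := json.Unmarshal(raw, &p); err != nil {
		return false, err
	}
	return p.Smartctl.ExitStatus&standbyBit != 0 && p.PowerMode != nil, nil
}

func main() {
	standby := []byte(`{"smartctl":{"exit_status":2},"power_mode":{"ata_value":0}}`)
	ok, err := shouldCacheStandby(standby)
	fmt.Println(ok, err)
}
```

Requiring `power_mode` to be present keeps an unrelated failure that happens to set the same bit from overwriting the cache with JSON that carries no useful data.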
When smartctl returns standby (exit status bit 1), we cache the minimal JSON so power_mode can still be exported. That JSON omits capacity, block size, device info, and NVMe health fields, so collectors must skip those metrics when fields are missing to avoid emitting zeros or empty-label series.
Signed-off-by: Yonathan Randolph <[email protected]>
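The "skip when missing" behaviour described in this commit message can be illustrated with a small sketch. It is an assumption-laden simplification rather than the collector's real code: `deviceReport` and `collectGauges` are hypothetical names, the gauges are returned as a plain map without labels, and only the `user_capacity` and `power_mode` fields are modelled.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// deviceReport uses pointer fields so that a field absent from the JSON stays
// nil instead of silently becoming 0 (hypothetical type for this sketch).
type deviceReport struct {
	UserCapacity *struct {
		Blocks *float64 `json:"blocks"`
		Bytes  *float64 `json:"bytes"`
	} `json:"user_capacity"`
	PowerMode *struct {
		ATAValue *float64 `json:"ata_value"`
	} `json:"power_mode"`
}

// collectGauges returns only the gauges whose source fields exist in raw.
func collectGauges(raw []byte) (map[string]float64, error) {
	var r deviceReport
	if err := json.Unmarshal(raw, &r); err != nil {
		return nil, err
	}
	gauges := make(map[string]float64)
	if c := r.UserCapacity; c != nil {
		if c.Blocks != nil {
			gauges["smartctl_device_capacity_blocks"] = *c.Blocks
		}
		if c.Bytes != nil {
			gauges["smartctl_device_capacity_bytes"] = *c.Bytes
		}
	}
	if p := r.PowerMode; p != nil && p.ATAValue != nil {
		gauges["smartctl_device_power_mode"] = *p.ATAValue
	}
	return gauges, nil
}

func main() {
	// Minimal standby-style JSON: power_mode only, no capacity fields.
	standby := []byte(`{"power_mode":{"ata_value":0}}`)
	gauges, err := collectGauges(standby)
	fmt.Println(gauges, err)
}
```

With the standby input, the capacity gauges are simply absent from the result, so a scrape would show a gap for them rather than a drop to 0.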
0c1f17a to f9023ae
Add an ATA smartctl_device_power_mode gauge to implement feature requests #310 and #195. The value looks like a number from 0 to 255 (from the source of smartmontools ataprint.cpp, which follows the latest ATA specification; see the draft of ACS-3):

Note that this does not return NVMe power states (nvmeprint.cpp).

Add smartctl_device_power_mode from power_mode.ata_value and allow low-power --nocheck=standby responses to be cached, so that sleeping drives still export smartctl_device_power_mode and smartctl_device_smartctl_exit_status when exit-status bit 1 is set and power_mode is present.

I skip emitting the following metrics when their fields do not exist in the JSON, so that they don't turn to 0:

- smartctl_device
- smartctl_device_capacity_blocks
- smartctl_device_capacity_bytes
- smartctl_device_block_size
- smartctl_device_percentage_used
- smartctl_device_available_spare
- smartctl_device_available_spare_threshold
- smartctl_device_critical_warning
- smartctl_device_media_errors
- smartctl_device_num_err_log_entries

Without that commit (28d568e), the gauges jump to 0 each time the disk is in standby state. For example, here is a disk that shuts down after 45s of idle because I ran sudo hdparm -S 10 /dev/sdc:

Note that other metrics such as smartctl_device_power_on_seconds were already empty instead of 0 when they don't exist in the JSON:
