powermetrics(1)             General Commands Manual            powermetrics(1)

NAME
    powermetrics

SYNOPSIS
    powermetrics [-i sample_interval_ms] [-r order] [-t wakeup_cost]
        [-o output_file] [-n sample_count]

DESCRIPTION
    powermetrics gathers and displays CPU usage statistics (divided into
    time spent in user mode and supervisor mode), timer and interrupt
    wakeup frequency (total and, for near-idle workloads, those that
    resulted in package idle exits), and, on supported platforms, interrupt
    frequencies (categorized by CPU number), package C-state statistics (an
    indication of the time the core complex and integrated graphics, if
    any, spent in low-power idle states), and the CPU frequency
    distribution during the sample. The tool may also display estimated
    power consumed by various SoC subsystems, such as the CPU, GPU and ANE
    (Apple Neural Engine). Note: average power values reported by
    powermetrics are estimates and may be inaccurate; they should not be
    used to compare devices, but can be used to help optimize apps for
    energy efficiency.

    -h, --help
        Print help message.

    -s samplers, --samplers samplers
        Comma-separated list of samplers and sampler groups. Run with -h to
        see a list of samplers and sampler groups. Specifying "default"
        will display the default set, and specifying "all" will display all
        supported samplers.

    -o file, --output-file file
        Output to file instead of stdout.

    -b size, --buffer-size size
        Set output buffer size (0=none, 1=line).

    -i N, --sample-rate N
        Sample every N ms (0=disabled) [default: 5000ms].

    -n N, --sample-count N
        Obtain N periodic samples (0=infinite) [default: 0].

    -t N, --wakeup-cost N
        Assume package idle wakeups have a CPU time cost of N µs when using
        hybrid sort orders that combine idle wakeups with time-based
        metrics.

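    The exact formula behind the hybrid ordering is not documented; the
    following is only a sketch of the idea, with invented process figures,
    in which each package idle wakeup is billed as a fixed amount of CPU
    time before sorting:

```python
# Hypothetical sketch of a hybrid sort order. powermetrics does not document
# its formula; this only illustrates charging each package idle wakeup a
# fixed CPU-time cost (the -t value, in microseconds).
def hybrid_score(cpu_time_us, idle_wakeups, wakeup_cost_us=10):
    # Each wakeup is billed as if it consumed wakeup_cost_us of CPU time.
    return cpu_time_us + idle_wakeups * wakeup_cost_us

procs = [
    {"name": "daemon_a", "cpu_time_us": 1_000, "idle_wakeups": 500},
    {"name": "daemon_b", "cpu_time_us": 4_000, "idle_wakeups": 10},
]
procs.sort(key=lambda p: hybrid_score(p["cpu_time_us"], p["idle_wakeups"]),
           reverse=True)
# daemon_a scores 1000 + 500*10 = 6000; daemon_b scores 4000 + 10*10 = 4100,
# so the wakeup-heavy daemon sorts first despite using less raw CPU time.
print([p["name"] for p in procs])  # → ['daemon_a', 'daemon_b']
```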
    -r method, --order method
        Order the process list using the specified method [default:
        composite].

        [pid]        process identifier
        [wakeups]    total package idle wakeups (alias: -W)
        [cputime]    total CPU time used (alias: -C)
        [composite]  energy number, see --show-process-energy (alias: -O)

    -f format, --format format
        Display data in the specified format [default: text].

        [text]       human-readable text output
        [plist]      machine-readable property list, NUL-separated

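    Since plist records in the stream are NUL-separated, a consumer can
    split on the separator and parse each record on its own. A minimal
    sketch using Python's standard plistlib, with invented record keys
    standing in for real powermetrics output:

```python
import plistlib

# Two toy XML plist samples joined by a NUL byte, standing in for a stream
# captured with `-f plist`. The keys here are invented for illustration.
def make_sample(n):
    return plistlib.dumps({"sample": n, "elapsed_ns": n * 5_000_000_000})

stream = b"\x00".join(make_sample(n) for n in (1, 2))

# XML plists contain no NUL bytes, so splitting on the separator is safe;
# parse each non-empty record independently.
samples = [plistlib.loads(chunk) for chunk in stream.split(b"\x00") if chunk]
print([s["sample"] for s in samples])  # → [1, 2]
```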
    -a N, --poweravg N
        Display poweravg every N samples (0=disabled) [default: 10].

    --hide-cpu-duty-cycle
        Hide CPU duty cycle data.

    --show-initial-usage
        Print an initial sample covering the entire uptime.

    --show-usage-summary
        Print a final usage summary when exiting.

    --show-pstates
        Show P-state distribution. Only available on certain hardware.

    --show-plimits
        Show plimits, forced idle and RMBS. Only available on certain
        hardware.

    --show-cpu-qos
        Show per-CPU QoS breakdowns.

    --show-process-coalition
        Group processes by coalition and show per-coalition information.
        Processes that exited during the sample will still have their time
        billed to the coalition, making this useful for disambiguating
        DEAD_TASKS time.

    --show-responsible-pid
        Show the responsible pid for XPC services and the parent pid.

    --show-process-wait-times
        Show per-process SFI wait time info.

    --show-process-qos-tiers
        Show per-process QoS latency and throughput tiers.

    --show-process-io
        Show per-process I/O information.

    --show-process-gpu
        Show per-process GPU time. Only available on certain hardware.

    --show-process-netstats
        Show per-process network information.

    --show-process-qos
        Show QoS times aggregated by process. Per-thread information is not
        available.

    --show-process-energy
        Show a per-process energy impact number. This number is a rough
        proxy for the total energy the process uses, including CPU, GPU,
        disk I/O and networking. The weighting of each is platform
        specific. Enabling this implicitly enables sampling of all the
        above per-process statistics.

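    The platform-specific weights are not published, so the shape of the
    composite can only be guessed at. A sketch under that assumption, with
    entirely made-up weights and resource names:

```python
# Illustrative sketch of an "energy impact"-style composite. The real
# per-platform weights are proprietary; these weights and keys are made up.
WEIGHTS = {"cpu_ms": 1.0, "gpu_ms": 2.5, "disk_ios": 0.01, "net_packets": 0.005}

def energy_impact(stats):
    # Weighted sum of per-process resource usage over the sample window;
    # resources absent from the sample contribute nothing.
    return sum(WEIGHTS[k] * stats.get(k, 0.0) for k in WEIGHTS)

print(energy_impact({"cpu_ms": 120.0, "gpu_ms": 8.0}))  # → 140.0
```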
    --show-process-samp-norm
        Show CPU time normalized by the sample window rather than by the
        time since the process launched. For example, a process that
        launched 1 second before the end of a 5-second sample window and
        ran continuously until the end of the window will show up as
        200 ms/s here and 1000 ms/s in the regular column.

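    The example above works out as follows:

```python
# The man page's own example: a process launches 1 s before the end of a
# 5 s sample window and runs continuously until the window closes.
window_s = 5.0
cpu_time_s = 1.0        # CPU time accumulated inside the window
process_age_s = 1.0     # time since the process launched

per_launch = 1000.0 * cpu_time_s / process_age_s  # regular column
per_window = 1000.0 * cpu_time_s / window_s       # --show-process-samp-norm

print(per_launch, per_window)  # → 1000.0 200.0
```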
    --show-process-ipc
        Show per-process instructions and cycles on ARM machines. Use with
        --show-process-amp to show cluster stats.

    --show-all
        Enable all samplers and display all the available information for
        each sampler.

    This tool also implements special behavior upon receipt of certain
    signals to aid with the automated collection of data:

    SIGINFO
        take an immediate sample
    SIGIO
        flush any buffered output
    SIGINT/SIGTERM/SIGHUP
        stop sampling and exit

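    A controlling script would typically run `kill -INFO <pid>` against a
    long-running powermetrics to force a sample on demand. SIGINFO exists
    only on BSD-derived systems, so the portable sketch below uses SIGUSR1
    as a stand-in merely to demonstrate the signal-driven sampling pattern:

```python
import os
import signal

samples = []

def on_demand(signum, frame):
    # In powermetrics, SIGINFO triggers an immediate sample; SIGUSR1 is
    # used here only as a portable stand-in for demonstration.
    samples.append("sample")

signal.signal(signal.SIGUSR1, on_demand)

# A controlling script would signal the powermetrics pid; here we signal
# ourselves twice to exercise the handler.
for _ in range(2):
    os.kill(os.getpid(), signal.SIGUSR1)

print(len(samples))  # → 2
```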
OUTPUT
    Guidelines for energy reduction

    CPU time, deadlines and interrupt wakeups: Lower is better

    Interrupt counts: Lower is better

    C-state residency: Higher is better

    Running Tasks

    1. CPU time consumed by threads assigned to that process, broken down
    into time spent in user mode and kernel mode.

    2. Counts of "short" timers (where the time-to-deadline was < 5
    milliseconds in the future at the point of timer creation) which woke
    threads from that process. High-frequency timers, which typically have
    short times-to-deadline, can result in significant energy consumption.

    3. A count of total interrupt-level wakeups which resulted in
    dispatching a thread from the process in question. For example, if a
    thread were blocked in a usleep() system call, a timer interrupt would
    cause that thread to be dispatched and would increment this counter.
    For workloads with a significant idle component, this metric is useful
    to study in conjunction with the package idle exit metric reported
    below.

    4. A count of "package idle exits" induced by timers/device interrupts
    which awakened threads from the process in question. This is a subset
    of the interrupt wakeup count. Timers and other interrupts that trigger
    "package idle exits" have a greater impact on energy consumption than
    other interrupts. With the exception of some Mac Pro systems, Mac and
    iOS systems are typically single-package systems, wherein all CPUs are
    part of a single processor complex (typically a single IC die) with
    shared logic that can include (depending on system specifics) shared
    last-level caches, an integrated memory controller, etc. When all CPUs
    in the package are idle, the hardware can power-gate significant
    portions of the shared logic in addition to each individual processor's
    logic, as well as take measures such as placing DRAM into self-refresh
    (also referred to as auto-refresh), placing interconnects into
    lower-power states, etc. Hence a timer or interrupt that triggers an
    exit from this package idle state results in a greater increase in
    power than a timer that fires when the CPU in question is already
    executing. The process initiating a package idle wakeup may also be the
    "prime mover", i.e. it may be the trigger for further activity in its
    own or other processes. This metric is most useful when the system is
    relatively idle, as with typical light workloads such as web browsing
    and movie playback; with heavier workloads, CPU activity can be high
    enough that package idle entry is relatively rare, masking package idle
    exits due to the process/thread in question.

    5. If any processes arrived and vanished during the inter-sample
    interval, or a previously sampled process vanished, their statistics
    are reflected in the row labeled "DEAD_TASKS". This can identify issues
    involving transient processes which may be spawned too frequently.
    dtrace ("execsnoop") or other tools can then be used to identify the
    transient processes in question. Running powermetrics in coalition mode
    (see --show-process-coalition) will also help track down
    transient-process issues, by billing the time to the coalition to which
    the process belongs.

    Interrupt Distribution

    The interrupts sampler reports interrupt frequencies, classified by
    interrupt vector and associated device, on a per-CPU basis. Mac OS
    currently assigns all device interrupts to CPU0, but timers and
    interprocessor interrupts can occur on other CPUs. Interrupt
    frequencies can be useful in identifying misconfigured devices or areas
    for improvement in interrupt load, and can serve as a proxy for
    identifying device activity across the sample interval. For example,
    during a network-heavy workload, an increase in interrupts associated
    with AirPort wireless ("ARPT") or wired Ethernet ("ETH0", "ETH1", etc.)
    is not unexpected. However, if the interrupt frequency for a given
    device is non-zero when the device is not active (e.g. if "HDAU"
    interrupts, for High Definition Audio, occur even when no audio is
    playing), that may indicate a driver error. The int_sources sampler
    attributes interrupts to the responsible InterruptEventSources, which
    helps disambiguate the cause of an interrupt if the vector serves more
    than one source.

    Battery Statistics

    The battery sampler reports battery discharge rates, current and
    maximum charge levels, cycle counts and degradation from design
    capacity across the interval in question, if a delta was reported by
    the battery management unit. Note that the battery controller data may
    arrive out of phase with respect to powermetrics samples, which can
    cause aliasing issues across short sample intervals. Discharge rates
    across discontinuities such as sleep/wake may also be inaccurate on
    some systems; however, the rate of change of the total charge level
    across longer intervals is a useful indicator of total system load.
    powermetrics does not filter discharge rates for A/C
    connect/disconnect events, system sleep residency, etc. Battery
    discharge rates are typically not comparable across machine models.

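    As a back-of-envelope illustration of why longer intervals are more
    trustworthy, an average discharge rate can be derived from the charge
    delta across the interval; all figures below are invented:

```python
# Average discharge rate across a long interval, from a charge-level delta.
# All figures are invented; short windows can alias against the battery
# controller's own update phase, so longer intervals are preferred.
start_mAh, end_mAh = 4200.0, 3800.0
hours = 2.0
pack_voltage_V = 11.4  # nominal pack voltage, assumed constant here

avg_mA = (start_mAh - end_mAh) / hours   # average current draw
avg_mW = avg_mA * pack_voltage_V         # approximate average power
print(avg_mA, round(avg_mW, 1))  # → 200.0 2280.0
```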
    Processor Energy Usage

    The cpu_power sampler reports data derived from the Intel energy
    models; as of the Sandy Bridge microarchitecture, the Intel power
    control unit internally maintains an energy consumption model whose
    details are proprietary, but which is likely based on duty cycles for
    individual execution units, current voltage/frequency, etc. These
    numbers are not strictly accurate but are correlated with actual
    energy consumption. This section lists the power dissipated by the
    processor package, which includes the CPU cores, the integrated GPU
    and the system agent (integrated memory controller, last-level cache),
    and, separately, CPU core power and GT (integrated GPU) power (the
    latter two in a forthcoming version). The energy model data is
    generally not comparable across machine models.

    The cpu_power sampler next reports, on processors with Nehalem and
    newer microarchitectures, hardware-derived processor frequency and
    idle residency information, labeled "P-states" and "C-states"
    respectively in Intel terminology.

    C-states are further classified into "package C-states" and per-core
    C-states. The processor enters a C-state in the scheduler's idle loop,
    which results in clock-gating or power-gating the CPU core and,
    potentially, package logic, considerably reducing power dissipation.
    High package C-state residency is a goal to strive for, as energy
    consumption of the CPU complex, the integrated memory controller if
    any, and DRAM is significantly reduced when in a package C-state.
    Package C-states occur when all CPU cores within the package are idle
    and the on-die integrated GPU, if any (Sandy Bridge mobile and
    beyond), is also idle. powermetrics reports package C-state residency
    as a fraction of the time sampled. This is available on Nehalem and
    newer microarchitectures. Note that some systems, such as Mac Pros, do
    not enable package C-states.

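    The residency figure is simply package-idle time over window time; a
    sketch with invented interval durations:

```python
# Package C-state residency as a fraction of the sample window, mirroring
# how powermetrics reports it. Durations below are invented.
window_ms = 5000.0
pkg_idle_ms = [1200.0, 800.0, 1500.0]  # intervals with all cores + GPU idle

residency = sum(pkg_idle_ms) / window_ms
print(f"{residency:.1%}")  # → 70.0%
```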
    powermetrics also reports per-core C-state residencies, signifying
    when the core in question (which can include multiple SMTs or
    "hyperthreads") is idle, as well as active/inactive duty cycle
    histograms for each logical processor within the core. This is
    available on Nehalem and newer microarchitectures.

    This section also lists the average clock frequency at which the given
    logical processor executed when not idle within the sampled interval,
    expressed both as an absolute frequency in MHz and as a percentage of
    the nominal rated frequency. These average frequencies can vary due to
    the operating system's demand-based dynamic voltage and frequency
    scaling. Some systems can execute at frequencies greater than the
    nominal or "P1" frequency, which is termed "turbo mode" on Intel
    systems. Such operation will manifest as > 100% of nominal frequency.
    Lengthy execution in turbo mode is typically energy-inefficient, as
    those frequencies have high voltage requirements, and the resulting
    roughly quadratic increase in power is not outweighed by the reduction
    in execution time. Current systems typically have a single
    voltage/frequency domain per package, but as the processors can
    execute out of phase, they may display different average execution
    frequencies.

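    The percentage-of-nominal figure is straightforward; with an assumed
    2400 MHz nominal frequency, turbo operation shows up as a value above
    100%:

```python
# Average active frequency as a percentage of nominal, as in the cpu_power
# output. The 2400 MHz nominal frequency here is an assumed example value.
nominal_mhz = 2400.0

def pct_of_nominal(avg_mhz):
    return 100.0 * avg_mhz / nominal_mhz

print(pct_of_nominal(1800.0))  # → 75.0
print(pct_of_nominal(3000.0))  # → 125.0  (turbo operation)
```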
    Disk Usage and Network Activity

    The network and disk samplers report deltas in disk and network
    activity that occurred during the sample. Specifying
    --show-process-netstats and --show-process-io additionally reports
    this information on a per-process basis in the tasks sampler.

    Backlight Level

    The battery sampler also reports the instantaneous value of the
    backlight luminosity level. This value is likely not comparable across
    systems and machine models, but can be useful when comparing scenarios
    on a given system.

    Devices

    The devices sampler reports, for each device, the time spent in each
    of the device's states over the course of the sample. The meaning of
    the different states is specific to each device. powermetrics denotes
    low-power states with an "L", device-usable states with a "U" and
    power-on states with an "O".

    SMC

    The smc sampler displays information supplied by the System Management
    Controller. On supported platforms, this includes fan speed and
    information from various temperature sensors. These are instantaneous
    values taken at the end of the sample window, and do not necessarily
    reflect the values at other times in the window.

    Thermal

    The thermal sampler displays the current thermal pressure the system
    is under. This is an instantaneous value taken at the end of the
    sample window, and does not necessarily reflect the value at other
    times in the window.

    SFI

    The sfi sampler shows system-wide selective forced idle statistics.
    Selective forced idle is a mechanism the operating system uses to
    limit system power while minimizing user impact, by throttling certain
    threads on the system. Each thread belongs to an SFI class, and this
    sampler displays how much each SFI class is currently being throttled,
    or nothing if no class is throttled. These are instantaneous values
    taken at the end of the sample window, and do not necessarily reflect
    the values at other times in the window. To get SFI wait time
    statistics on a per-process basis, use --show-process-wait-times.

KNOWN ISSUES
    Changes in system time and sleep/wake can cause minor inaccuracies in
    reported CPU time.

Darwin                            5/1/12                              Darwin