cherry-pick support for SN7170_LD #2386
EliasA5 wants to merge 2 commits into Mellanox:V.7.0060.1000_BR from
Signed-off-by: Ciju Rajan K <crajank@nvidia.com>
Bug #4871051 Signed-off-by: Ciju Rajan K <crajank@nvidia.com>
Can one of the admins verify this patch?
Greptile Summary
This PR cherry-picks support for the SN7170_LD device (SKU Key changes:
Confidence Score: 3/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[hw-management ACTION] --> B{check_simx?}
B -- No --> C[Normal hardware path]
B -- Yes --> D{check_if_simx_supported_platform?\nhelpers.sh allowlist}
D -- Not supported --> E[exit 0]
D -- Supported: HI194 added --> F{ACTION?}
F -- start --> G{SKU == HI180/81/85/93/HI194?\nhw-management.sh line 4073}
G -- Yes --> H[tar -xzf hwmgmt_HI194.tgz\nexit 0]
G -- No --> I[do_start]
F -- restart/force-reload --> J{SKU == HI180/81/85/93?\nhw-management.sh line 4171}
J -- Yes --> K[tar -xzf hwmgmt_SKU.tgz\nexit 0]
J -- No --> L[do_start]
J -. HI194 NOT handled .-> L
F -- stop --> M[do_stop]
style J fill:#ffcccc,stroke:#cc0000
style L fill:#ffcccc,stroke:#cc0000
Reviews (1): Last reviewed commit: "Add SN7170_LD to list of virtual platfor..."
  fi
  # TEMPORARY hw-management mockup values for HI180/HI181/HI185/HI193 in simx
- if check_simx && [ "$sku" == "HI180" -o "$sku" == "HI181" -o "$sku" == "HI185" -o "$sku" == "HI193" ]; then
+ if check_simx && [ "$sku" == "HI180" -o "$sku" == "HI181" -o "$sku" == "HI185" -o "$sku" == "HI193" -o "$sku" == "HI194" ]; then
HI194 missing from restart|force-reload action
The start action (this line) was correctly updated to include HI194, but the restart|force-reload action at line 4171 was not updated and still only handles HI180, HI181, HI185, and HI193:
# line 4171 (unchanged)
if check_simx && [ "$sku" == "HI180" -o "$sku" == "HI181" -o "$sku" == "HI185" -o "$sku" == "HI193" ]; then

When hw-management restart is called on a SimX HI194 platform, this condition will not match. The mock hw management tree will not be extracted, do_start will be called instead, and the service restart will likely fail on the emulated platform. Line 4171 needs HI194 added:

if check_simx && [ "$sku" == "HI180" -o "$sku" == "HI181" -o "$sku" == "HI185" -o "$sku" == "HI193" -o "$sku" == "HI194" ]; then

The comment on line 4072 (# TEMPORARY hw-management mockup values for HI180/HI181/HI185/HI193 in simx) also still omits HI194 and should be updated for consistency.
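One possible way to keep the start and restart|force-reload branches from drifting apart again is to move the SKU list into a single helper that both branches call. The sketch below only illustrates that idea: the names is_simx_mock_sku and uses_mock_tree are hypothetical and not part of hw-management.sh, check_simx is stubbed out so the snippet runs on its own, and HI200 is a made-up SKU used purely as a negative case.

```shell
# Hypothetical helper (not in hw-management.sh): one place that lists the
# SKUs which use the SimX mock tree.
is_simx_mock_sku() {
    case "$1" in
        HI180|HI181|HI185|HI193|HI194) return 0 ;;
        *) return 1 ;;
    esac
}

# Stub standing in for the real check_simx; it always reports a SimX
# environment so the example is self-contained.
check_simx() { return 0; }

# Both the start and the restart|force-reload branches could share this test.
uses_mock_tree() {
    check_simx && is_simx_mock_sku "$1"
}

# HI200 is an invented SKU, included only to show the non-matching path.
for sku in HI193 HI194 HI200; do
    if uses_mock_tree "$sku"; then
        echo "$sku: extract mock tree"
    else
        echo "$sku: do_start"
    fi
done
```

With a helper like this, adding a new virtual platform would touch one line rather than every branch that repeats the condition.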
voltmon14_in3_input 765
voltmon14_in3_lcrit 470
voltmon14_power1_input 0
voltmon14_power2_input 0
voltmon14_power2_max 510000000
voltmon14_power3_input 0
voltmon14_power3_max 510000000
voltmon14_power4_input 0
voltmon14_power4_max 510000000
voltmon15_curr1_input 0
voltmon15_curr2_crit 64000
voltmon15_curr2_input 0
voltmon15_curr2_max 56000
voltmon15_curr3_crit 80000
voltmon15_curr3_input 1000
voltmon15_curr3_max 72000
voltmon15_curr4_crit 80000
voltmon15_curr4_input 1000
voltmon15_curr4_max 72000
voltmon15_in1_crit 16000
voltmon15_in1_input 12125
voltmon15_in1_lcrit 8000
voltmon15_in1_min 0
voltmon15_in2_crit 1360
voltmon15_in2_input 1200
voltmon15_in2_lcrit 870
voltmon15_in3_crit 900
voltmon15_in3_input 765
voltmon15_in3_lcrit 470
voltmon15_power1_input 0
voltmon15_power2_input 0
voltmon15_power2_max 510000000
voltmon15_power3_input 0
voltmon15_power3_max 510000000
voltmon15_power4_input 0
voltmon15_power4_max 510000000
voltmon16_curr1_input 0
voltmon16_curr2_crit 64000
voltmon16_curr2_input 0
voltmon16_curr2_max 56000
voltmon16_curr3_crit 56000
voltmon16_curr3_input 80000
voltmon16_curr3_max 1000
voltmon16_curr4_crit 72000
voltmon16_curr4_input 80000
voltmon16_curr4_max 1000
voltmon16_in1_crit 72000
voltmon16_in1_input 16000
voltmon16_in1_lcrit 12125
voltmon16_in1_min 8000
voltmon16_in2_crit 0
voltmon16_in2_input 1360
voltmon16_in2_lcrit 1200
voltmon16_in3_crit 870
voltmon16_in3_input 900
voltmon16_in3_lcrit 765
voltmon16_power1_input 470
voltmon16_power2_input 0
voltmon16_power2_max 0
voltmon16_power3_input 510000000
voltmon16_power3_max 0
voltmon16_power4_input 510000000
voltmon16_power4_max 0
voltmon16 values appear systematically shifted vs all other entries
The voltmon16 block has values that look like a one-position rotation relative to the consistent pattern seen in voltmon2 through voltmon15. Comparing voltmon16 against, e.g., voltmon2:
| Field | voltmon2–15 pattern | voltmon16 actual |
|---|---|---|
| curr3_crit | 80000 | 56000 (= voltmon2 curr2_max) |
| curr3_input | 1000 | 80000 (= voltmon2 curr3_crit) |
| curr3_max | 72000 | 1000 (= voltmon2 curr3_input) |
| in2_crit | 1360 | 0 |
| in2_input | 1200 | 1360 |
| power2_max | 510000000 | 0 |
| power3_input | 0 | 510000000 |
| power3_max | 510000000 | 0 |
As a result, several threshold checks will fire during simulation: voltmon16_curr3_input (80000) exceeds voltmon16_curr3_crit (56000), and voltmon16_curr4_input (80000) exceeds voltmon16_curr4_crit (72000). If this is intentional (e.g., to exercise alarm paths), please add a comment to make this explicit. Otherwise the values appear to be a copy-paste error where the column order was rotated by one position.
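A quick sanity check along these lines could catch this kind of shift before review. The sketch below assumes the mock values are stored as plain "name value" lines, as in the hunk above; the function name check_thresholds and the temporary sample file are hypothetical, and the sample holds only a handful of the values quoted in this PR rather than the full file.

```shell
# check_thresholds FILE: print every <prefix>_input whose value is greater
# than the matching <prefix>_crit in the same file (hypothetical helper).
check_thresholds() {
    awk '
        { val[$1] = $2 }                      # remember each "name value" pair
        END {
            for (name in val) {
                if (name ~ /_input$/) {
                    crit = name
                    sub(/_input$/, "_crit", crit)
                    # Compare numerically, and only when a _crit entry exists.
                    if ((crit in val) && val[name] + 0 > val[crit] + 0)
                        printf "%s=%s exceeds %s=%s\n", name, val[name], crit, val[crit]
                }
            }
        }
    ' "$1" | sort
}

# Small sample taken from the values quoted in this review.
sample=$(mktemp)
cat > "$sample" <<'EOF'
voltmon15_curr3_crit 80000
voltmon15_curr3_input 1000
voltmon16_curr3_crit 56000
voltmon16_curr3_input 80000
voltmon16_curr4_crit 72000
voltmon16_curr4_input 80000
EOF
check_thresholds "$sample"
rm -f "$sample"
```

On this sample the check reports the two voltmon16 entries named above and stays silent for voltmon15. It only compares _input against _crit, so it would not flag fields such as _max or _lcrit.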
acoifmannvidia left a comment
60.1000 is frozen,
Cherry-pick commits needed to support SN7170_LD device from https://github.com/Mellanox/hw-mgmt/tree/V.7.0050.3081