pbr: pbr nexthop group json fixes#159
pbr: pbr nexthop group json fixes#159suphawitwanl wants to merge 4 commits intoaitest-base-pr-21194from
Conversation
suphawitwanl
commented
Mar 20, 2026
- fix interfaceName and vrfId in nexthop-group json output — Fixes the key naming
- change nexthop-group json output from array to dict — Changes JSON structure and adds nhgCount.
The vrfId JSON key was incorrectly storing the interface name instead of the actual VRF ID for IFINDEX, IPV4_IFINDEX, and IPV6_IFINDEX nexthop types. Separate them into interfaceName for the interface name and vrfId for the numeric VRF ID. Signed-off-by: Sougata Barik <[email protected]>
Change the JSON format of "show pbr nexthop-groups json" from an array to a dictionary keyed by table ID, allowing extra attributes to be added. Add nhgCount to the top-level output. Signed-off-by: Sougata Barik <[email protected]>
Greptile SummaryThis PR makes two JSON output changes to the Major Concerns:
Hi I can read ./greptile thanks
|
| Filename | Overview |
|---|---|
| lib/nexthop_group.c | Fixes two genuine bugs (NEXTHOP_TYPE_IPV4_IFINDEX and IPV6_IFINDEX had "vrfId" key storing an interface name string, and vrfId was a string not an int). Also renames NEXTHOP_TYPE_IFINDEX "nexthop" key to "interfaceName" — a breaking JSON API change with no deprecation path. |
| pbrd/pbr_nht.c | Changes nexthop-group JSON output from array to dict keyed by table_id, adds nhg_count tracking. Breaking change to JSON output structure with no deprecation. nhg_count uses uint16_t which is unnecessarily narrow. |
| pbrd/pbr_nht.h | Header updated to reflect changed function signature of pbr_nht_json_nexthop_group, adding uint16_t *pbr_nhg_count output parameter. Straightforward and correct. |
| pbrd/pbr_vty.c | Switches JSON container from array to object, adds nhgCount metadata at same level as group entries. Mixed metadata/data at the same JSON level is a design smell; no YANG model backing the new structure. |
| tests/topotests/pbr_topo1/test_pbr_topo1.py | Updated tests for new dict-based JSON output. nhg_cmp has a potential flakiness issue (ordering) and no type guard. test_pbr_nhg_with_vrf_disable_enable lacks a type check. File handle leak in runit_nhg. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["show pbr nexthop-groups [WORD] json"] --> B["pbr_vty.c: show_pbr_nexthop_group"]
B --> C["json_object_new_object()"]
C --> D["pbr_nht_json_nexthop_group(j, name, &count)"]
D --> E["hash_iterate(pbr_nhg_hash, pbr_nht_json_nhg)"]
E --> F["pbr_nht_json_nhg per hash bucket"]
F --> G["json_object_object_addf(j, group, '%u', table_id)"]
G --> H["pns->nhg_count++"]
H --> I["Next bucket"]
I --> E
E --> J["*pbr_nhg_count = pns.nhg_count"]
J --> K["json_object_int_add(j, 'nhgCount', count)"]
K --> L["vty_json output"]
F --> M["pbr_nht_json_nhg_nexthops per nexthop"]
M --> N["nexthop_group_json_nexthop(nh_obj, nh)"]
N --> O{nh->type}
O -->|IFINDEX| P["interfaceName + vrfId int"]
O -->|IPV4| Q["nexthop IP"]
O -->|IPV4_IFINDEX| R["nexthop IP + interfaceName + vrfId int"]
O -->|IPV6| S["nexthop IPv6"]
O -->|IPV6_IFINDEX| T["nexthop IPv6 + interfaceName + vrfId int"]
O -->|BLACKHOLE| U["no fields"]
Comments Outside Diff (1)
-
pbrd/pbr_vty.c, line 1870-1881 (link)JSON output structure is a flat dict mixing metadata with data
The resulting JSON structure is:
{ "12345": { "id": 12345, "name": "foo", ... }, "67890": { "id": 67890, "name": "bar", ... }, "nhgCount": 2 }The
nhgCountmetadata is mixed at the same level as the nexthop group entries. This is an unusual and fragile design — any consumer must know to skip integer-valued keys, and adding future metadata fields will further pollute the namespace. A cleaner structure would nest the groups:{ "nhgCount": 2, "nexthopGroups": { "12345": { ... }, "67890": { ... } } }Also note: changing the JSON output from an array to a dict is a breaking change for existing consumers and automation tooling. This should be explicitly noted in the commit message with a YANG model backing this new structure (per
yang-json-outputrule).Prompt To Fix With AI
This is a comment left during a code review. Path: pbrd/pbr_vty.c Line: 1870-1881 Comment: **JSON output structure is a flat dict mixing metadata with data** The resulting JSON structure is: ```json { "12345": { "id": 12345, "name": "foo", ... }, "67890": { "id": 67890, "name": "bar", ... }, "nhgCount": 2 } ``` The `nhgCount` metadata is mixed at the same level as the nexthop group entries. This is an unusual and fragile design — any consumer must know to skip integer-valued keys, and adding future metadata fields will further pollute the namespace. A cleaner structure would nest the groups: ```json { "nhgCount": 2, "nexthopGroups": { "12345": { ... }, "67890": { ... } } } ``` Also note: changing the JSON output from an array to a dict is a **breaking change** for existing consumers and automation tooling. This should be explicitly noted in the commit message with a YANG model backing this new structure (per `yang-json-output` rule). How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: lib/nexthop_group.c
Line: 1093-1095
Comment:
**Breaking JSON API change — no deprecation path**
For `NEXTHOP_TYPE_IFINDEX`, the key is being renamed from `"nexthop"` to `"interfaceName"`. While the old name was semantically wrong, this is a **breaking change** to a public JSON output consumed by operators and tooling. Per FRR's backwards-compatibility rule, CLI and `lib/` JSON output changes require a deprecation path.
The fix for `NEXTHOP_TYPE_IPV4_IFINDEX` and `NEXTHOP_TYPE_IPV6_IFINDEX` is clearly correct (the old code stored the interface name under the `"vrfId"` key, which is an obvious bug), and additionally `vrfId` changing from a string to an integer is correct. But the `NEXTHOP_TYPE_IFINDEX` rename from `"nexthop"` to `"interfaceName"` needs documentation/coordination with anyone consuming this output.
At minimum, this should be called out explicitly in the commit message and PR description as a breaking JSON output change.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: pbrd/pbr_vty.c
Line: 1870-1881
Comment:
**JSON output structure is a flat dict mixing metadata with data**
The resulting JSON structure is:
```json
{
"12345": { "id": 12345, "name": "foo", ... },
"67890": { "id": 67890, "name": "bar", ... },
"nhgCount": 2
}
```
The `nhgCount` metadata is mixed at the same level as the nexthop group entries. This is an unusual and fragile design — any consumer must know to skip integer-valued keys, and adding future metadata fields will further pollute the namespace. A cleaner structure would nest the groups:
```json
{
"nhgCount": 2,
"nexthopGroups": {
"12345": { ... },
"67890": { ... }
}
}
```
Also note: changing the JSON output from an array to a dict is a **breaking change** for existing consumers and automation tooling. This should be explicitly noted in the commit message with a YANG model backing this new structure (per `yang-json-output` rule).
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: tests/topotests/pbr_topo1/test_pbr_topo1.py
Line: 141-145
Comment:
**Dict-to-list extraction loses ordering guarantees — potential test flakiness**
```python
actual_groups = [v for k, v in output.items() if isinstance(v, dict)]
return topotest.json_cmp(actual_groups, expected)
```
The dict returned from parsing the JSON output is keyed by table IDs, which are numeric strings. The iteration order of this dict depends on the hash iteration order from the C daemon — specifically `hash_iterate(pbr_nhg_hash, ...)` — which is **not guaranteed to be stable** across runs or daemon restarts. If `topotest.json_cmp` performs positional list comparison and the expected file has a fixed ordering, this test is potentially flaky.
Consider either:
1. Sorting `actual_groups` by a stable key (e.g., `"name"` or `"id"`) before comparison, or
2. Reworking `nhg_cmp` to do a set-based match rather than positional list comparison.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: tests/topotests/pbr_topo1/test_pbr_topo1.py
Line: 140
Comment:
**No guard against non-dict output from vtysh_cmd**
If `router.vtysh_cmd(cmd, isjson=True)` returns something other than a dict (e.g., an error string, `None`, or a JSON array due to a daemon bug), calling `.items()` will raise `AttributeError` and abort the test with a confusing traceback rather than a clean failure message.
Add a guard:
```suggestion
output = router.vtysh_cmd(cmd, isjson=True)
if not isinstance(output, dict):
return "Expected dict output, got: {}".format(type(output).__name__)
actual_groups = [v for k, v in output.items() if isinstance(v, dict)]
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: tests/topotests/pbr_topo1/test_pbr_topo1.py
Line: 151
Comment:
**File handle leak — use context manager**
`open(expfile).read()` opens the file but never explicitly closes it. Use a context manager for correctness:
```suggestion
with open(expfile) as f:
expected = json.loads(f.read())
```
Per the `python-formatting` rule, Python code should follow best practices formatted with `black`. This also applies to the same pattern used elsewhere (`runit` function, line 127), but that is pre-existing.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: tests/topotests/pbr_topo1/test_pbr_topo1.py
Line: 453-462
Comment:
**No guard against non-dict nhg_data in test_pbr_nhg_with_vrf_disable_enable**
The updated code calls `nhg_data.items()` unconditionally. If `nhg_data` is not a dict (e.g., daemon not responding, JSON parse failure), this will raise `AttributeError` and surface as a cryptic exception rather than a clean test failure. The old code had `if isinstance(nhg_data, list):` which at least provided a safe fallback. Add a corresponding type check:
```python
if not isinstance(nhg_data, dict):
return "Unexpected nhg_data type: {}".format(type(nhg_data).__name__)
for key, item in nhg_data.items():
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: pbrd/pbr_nht.c
Line: 1446-1447
Comment:
**nhg_count type is overly narrow**
`uint16_t nhg_count` caps at 65535. While this limit is unlikely to be hit in practice, `uint32_t` would match the type of `table_id` (which is used as the dict key) and be more future-proof. There's no memory savings benefit to using `uint16_t` here since the struct is on the stack.
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: "chore: add .greptile..."
| case NEXTHOP_TYPE_IFINDEX: | ||
| json_object_string_add(j, "nexthop", | ||
| json_object_string_add(j, "interfaceName", | ||
| ifindex2ifname(nh->ifindex, nh->vrf_id)); |
There was a problem hiding this comment.
Breaking JSON API change — no deprecation path
For NEXTHOP_TYPE_IFINDEX, the key is being renamed from "nexthop" to "interfaceName". While the old name was semantically wrong, this is a breaking change to a public JSON output consumed by operators and tooling. Per FRR's backwards-compatibility rule, CLI and lib/ JSON output changes require a deprecation path.
The fix for NEXTHOP_TYPE_IPV4_IFINDEX and NEXTHOP_TYPE_IPV6_IFINDEX is clearly correct (the old code stored the interface name under the "vrfId" key, which is an obvious bug), and additionally vrfId changing from a string to an integer is correct. But the NEXTHOP_TYPE_IFINDEX rename from "nexthop" to "interfaceName" needs documentation/coordination with anyone consuming this output.
At minimum, this should be called out explicitly in the commit message and PR description as a breaking JSON output change.
Prompt To Fix With AI
This is a comment left during a code review.
Path: lib/nexthop_group.c
Line: 1093-1095
Comment:
**Breaking JSON API change — no deprecation path**
For `NEXTHOP_TYPE_IFINDEX`, the key is being renamed from `"nexthop"` to `"interfaceName"`. While the old name was semantically wrong, this is a **breaking change** to a public JSON output consumed by operators and tooling. Per FRR's backwards-compatibility rule, CLI and `lib/` JSON output changes require a deprecation path.
The fix for `NEXTHOP_TYPE_IPV4_IFINDEX` and `NEXTHOP_TYPE_IPV6_IFINDEX` is clearly correct (the old code stored the interface name under the `"vrfId"` key, which is an obvious bug), and additionally `vrfId` changing from a string to an integer is correct. But the `NEXTHOP_TYPE_IFINDEX` rename from `"nexthop"` to `"interfaceName"` needs documentation/coordination with anyone consuming this output.
At minimum, this should be called out explicitly in the commit message and PR description as a breaking JSON output change.
How can I resolve this? If you propose a fix, please make it concise.| """ | ||
| output = router.vtysh_cmd(cmd, isjson=True) | ||
| actual_groups = [v for k, v in output.items() if isinstance(v, dict)] | ||
| return topotest.json_cmp(actual_groups, expected) | ||
|
|
There was a problem hiding this comment.
Dict-to-list extraction loses ordering guarantees — potential test flakiness
actual_groups = [v for k, v in output.items() if isinstance(v, dict)]
return topotest.json_cmp(actual_groups, expected)The dict returned from parsing the JSON output is keyed by table IDs, which are numeric strings. The iteration order of this dict depends on the hash iteration order from the C daemon — specifically hash_iterate(pbr_nhg_hash, ...) — which is not guaranteed to be stable across runs or daemon restarts. If topotest.json_cmp performs positional list comparison and the expected file has a fixed ordering, this test is potentially flaky.
Consider either:
- Sorting
actual_groupsby a stable key (e.g.,"name"or"id") before comparison, or - Reworking
nhg_cmpto do a set-based match rather than positional list comparison.
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/topotests/pbr_topo1/test_pbr_topo1.py
Line: 141-145
Comment:
**Dict-to-list extraction loses ordering guarantees — potential test flakiness**
```python
actual_groups = [v for k, v in output.items() if isinstance(v, dict)]
return topotest.json_cmp(actual_groups, expected)
```
The dict returned from parsing the JSON output is keyed by table IDs, which are numeric strings. The iteration order of this dict depends on the hash iteration order from the C daemon — specifically `hash_iterate(pbr_nhg_hash, ...)` — which is **not guaranteed to be stable** across runs or daemon restarts. If `topotest.json_cmp` performs positional list comparison and the expected file has a fixed ordering, this test is potentially flaky.
Consider either:
1. Sorting `actual_groups` by a stable key (e.g., `"name"` or `"id"`) before comparison, or
2. Reworking `nhg_cmp` to do a set-based match rather than positional list comparison.
How can I resolve this? If you propose a fix, please make it concise.| """Compare nexthop-groups JSON dict output against expected list. | ||
|
|
||
| The JSON output is a dict keyed by table ID (plus nhgCount). | ||
| Extract the group values and compare as a list against expected. |
There was a problem hiding this comment.
No guard against non-dict output from vtysh_cmd
If router.vtysh_cmd(cmd, isjson=True) returns something other than a dict (e.g., an error string, None, or a JSON array due to a daemon bug), calling .items() will raise AttributeError and abort the test with a confusing traceback rather than a clean failure message.
Add a guard:
| Extract the group values and compare as a list against expected. | |
| output = router.vtysh_cmd(cmd, isjson=True) | |
| if not isinstance(output, dict): | |
| return "Expected dict output, got: {}".format(type(output).__name__) | |
| actual_groups = [v for k, v in output.items() if isinstance(v, dict)] |
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/topotests/pbr_topo1/test_pbr_topo1.py
Line: 140
Comment:
**No guard against non-dict output from vtysh_cmd**
If `router.vtysh_cmd(cmd, isjson=True)` returns something other than a dict (e.g., an error string, `None`, or a JSON array due to a daemon bug), calling `.items()` will raise `AttributeError` and abort the test with a confusing traceback rather than a clean failure message.
Add a guard:
```suggestion
output = router.vtysh_cmd(cmd, isjson=True)
if not isinstance(output, dict):
return "Expected dict output, got: {}".format(type(output).__name__)
actual_groups = [v for k, v in output.items() if isinstance(v, dict)]
```
How can I resolve this? If you propose a fix, please make it concise.| if not nhg_data: | ||
| return "Nexthop-group C data is empty" | ||
|
|
||
| # Find the nexthop-group with name "C" (ID is variable) | ||
| # Find the nexthop-group with name "C" (table ID key is variable) | ||
| nhg = None | ||
|
|
||
| if isinstance(nhg_data, list): | ||
| for item in nhg_data: | ||
| if isinstance(item, dict) and item.get("name") == "C": | ||
| nhg = item | ||
| break | ||
| for key, item in nhg_data.items(): | ||
| if isinstance(item, dict) and item.get("name") == "C": | ||
| nhg = item | ||
| break |
There was a problem hiding this comment.
No guard against non-dict nhg_data in test_pbr_nhg_with_vrf_disable_enable
The updated code calls nhg_data.items() unconditionally. If nhg_data is not a dict (e.g., daemon not responding, JSON parse failure), this will raise AttributeError and surface as a cryptic exception rather than a clean test failure. The old code had if isinstance(nhg_data, list): which at least provided a safe fallback. Add a corresponding type check:
if not isinstance(nhg_data, dict):
return "Unexpected nhg_data type: {}".format(type(nhg_data).__name__)
for key, item in nhg_data.items():Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/topotests/pbr_topo1/test_pbr_topo1.py
Line: 453-462
Comment:
**No guard against non-dict nhg_data in test_pbr_nhg_with_vrf_disable_enable**
The updated code calls `nhg_data.items()` unconditionally. If `nhg_data` is not a dict (e.g., daemon not responding, JSON parse failure), this will raise `AttributeError` and surface as a cryptic exception rather than a clean test failure. The old code had `if isinstance(nhg_data, list):` which at least provided a safe fallback. Add a corresponding type check:
```python
if not isinstance(nhg_data, dict):
return "Unexpected nhg_data type: {}".format(type(nhg_data).__name__)
for key, item in nhg_data.items():
```
How can I resolve this? If you propose a fix, please make it concise.