Skip to content

pbr: pbr nexthop group json fixes#159

Open
suphawitwanl wants to merge 4 commits intoaitest-base-pr-21194from
aitest-21194-scenario-8-dot-greptile
Open

pbr: pbr nexthop group json fixes#159
suphawitwanl wants to merge 4 commits intoaitest-base-pr-21194from
aitest-21194-scenario-8-dot-greptile

Conversation

@suphawitwanl
Copy link

  1. fix interfaceName and vrfId in nexthop-group json output — Fixes the key naming
  2. change nexthop-group json output from array to dict — Changes JSON structure and adds nhgCount.

sougatahitcs and others added 4 commits March 19, 2026 14:55
The vrfId JSON key was incorrectly storing the interface name
instead of the actual VRF ID for IFINDEX, IPV4_IFINDEX, and
IPV6_IFINDEX nexthop types. Separate them into interfaceName
for the interface name and vrfId for the numeric VRF ID.

Signed-off-by: Sougata Barik <[email protected]>
Change the JSON format of "show pbr nexthop-groups json" from
an array to a dictionary keyed by table ID, allowing extra
attributes to be added. Add nhgCount to the top-level output.

Signed-off-by: Sougata Barik <[email protected]>
@greptile-apps
Copy link

greptile-apps bot commented Mar 20, 2026

Greptile Summary

This PR makes two JSON output changes to the pbrd PBR daemon: (1) fixes incorrect key names in nexthop_group_json_nexthopNEXTHOP_TYPE_IPV4_IFINDEX and IPV6_IFINDEX were storing interface names under the "vrfId" key and vrfId was emitted as a string instead of an integer, and (2) changes show pbr nexthop-groups json output from a JSON array to a dict keyed by table ID, adding a nhgCount metadata field. The bug fixes are genuine and correct; the structural array-to-dict change and the NEXTHOP_TYPE_IFINDEX key rename are breaking JSON API changes.

Major Concerns:

  • Breaking JSON API changes without deprecation path: Two changes break existing JSON consumers — the "nexthop""interfaceName" rename in lib/nexthop_group.c for NEXTHOP_TYPE_IFINDEX, and the array-to-dict restructuring in pbrd/pbr_vty.c. Per FRR backwards-compat policy, these must either be documented as intentional breaking changes in the commit message or given a deprecation path.
  • No YANG model backing the new JSON structure: Per yang-json-output rule, new JSON output must be backed by a YANG model. The new dict-keyed-by-table-id structure with nhgCount has no YANG model reference.
  • Flat JSON design mixing metadata and data: nhgCount lives at the same level as the nexthop group entries (keyed by table ID). A nested structure (nexthopGroups: {...}) would be cleaner and allow adding metadata without key collisions.
  • Test flakiness risk: nhg_cmp extracts dict values into a Python list whose order depends on C hash iteration order (not stable). If topotest.json_cmp is positional for list comparison, the test is potentially flaky.
  • Missing type guards in tests: nhg_cmp and test_pbr_nhg_with_vrf_disable_enable both call .items() without first verifying the JSON output is a dict, risking AttributeError on unexpected daemon output.
  • User docs not updated: The show pbr nexthop-groups json output format has changed; doc/user/pbr.rst should be updated to document the new JSON structure.

Hi I can read ./greptile thanks

Confidence Score: 2/5

  • Not safe to merge — contains breaking JSON API changes with no deprecation path, no YANG model backing, and test robustness issues.
  • The bug fixes (wrong key names for interfaceName/vrfId) are correct and important. However, the array-to-dict restructuring and the NEXTHOP_TYPE_IFINDEX key rename are breaking changes to public JSON output with no deprecation path, no YANG model, and flat metadata/data mixing. The topotest additions have ordering-dependent list comparison risks and missing type guards that could cause flaky CI. These concerns must be resolved before merge.
  • lib/nexthop_group.c (breaking key rename), pbrd/pbr_vty.c (breaking struct change, flat metadata design), tests/topotests/pbr_topo1/test_pbr_topo1.py (ordering flakiness, missing type guards)

Important Files Changed

Filename Overview
lib/nexthop_group.c Fixes two genuine bugs (NEXTHOP_TYPE_IPV4_IFINDEX and IPV6_IFINDEX had "vrfId" key storing an interface name string, and vrfId was a string not an int). Also renames NEXTHOP_TYPE_IFINDEX "nexthop" key to "interfaceName" — a breaking JSON API change with no deprecation path.
pbrd/pbr_nht.c Changes nexthop-group JSON output from array to dict keyed by table_id, adds nhg_count tracking. Breaking change to JSON output structure with no deprecation. nhg_count uses uint16_t which is unnecessarily narrow.
pbrd/pbr_nht.h Header updated to reflect changed function signature of pbr_nht_json_nexthop_group, adding uint16_t *pbr_nhg_count output parameter. Straightforward and correct.
pbrd/pbr_vty.c Switches JSON container from array to object, adds nhgCount metadata at same level as group entries. Mixed metadata/data at the same JSON level is a design smell; no YANG model backing the new structure.
tests/topotests/pbr_topo1/test_pbr_topo1.py Updated tests for new dict-based JSON output. nhg_cmp has a potential flakiness issue (ordering) and no type guard. test_pbr_nhg_with_vrf_disable_enable lacks a type check. File handle leak in runit_nhg.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["show pbr nexthop-groups [WORD] json"] --> B["pbr_vty.c: show_pbr_nexthop_group"]
    B --> C["json_object_new_object()"]
    C --> D["pbr_nht_json_nexthop_group(j, name, &count)"]
    D --> E["hash_iterate(pbr_nhg_hash, pbr_nht_json_nhg)"]
    E --> F["pbr_nht_json_nhg per hash bucket"]
    F --> G["json_object_object_addf(j, group, '%u', table_id)"]
    G --> H["pns->nhg_count++"]
    H --> I["Next bucket"]
    I --> E
    E --> J["*pbr_nhg_count = pns.nhg_count"]
    J --> K["json_object_int_add(j, 'nhgCount', count)"]
    K --> L["vty_json output"]

    F --> M["pbr_nht_json_nhg_nexthops per nexthop"]
    M --> N["nexthop_group_json_nexthop(nh_obj, nh)"]
    N --> O{nh->type}
    O -->|IFINDEX| P["interfaceName + vrfId int"]
    O -->|IPV4| Q["nexthop IP"]
    O -->|IPV4_IFINDEX| R["nexthop IP + interfaceName + vrfId int"]
    O -->|IPV6| S["nexthop IPv6"]
    O -->|IPV6_IFINDEX| T["nexthop IPv6 + interfaceName + vrfId int"]
    O -->|BLACKHOLE| U["no fields"]
Loading

Comments Outside Diff (1)

  1. pbrd/pbr_vty.c, line 1870-1881 (link)

    P1 JSON output structure is a flat dict mixing metadata with data

    The resulting JSON structure is:

    {
      "12345": { "id": 12345, "name": "foo", ... },
      "67890": { "id": 67890, "name": "bar", ... },
      "nhgCount": 2
    }

    The nhgCount metadata is mixed at the same level as the nexthop group entries. This is an unusual and fragile design — any consumer must know to skip integer-valued keys, and adding future metadata fields will further pollute the namespace. A cleaner structure would nest the groups:

    {
      "nhgCount": 2,
      "nexthopGroups": {
        "12345": { ... },
        "67890": { ... }
      }
    }

    Also note: changing the JSON output from an array to a dict is a breaking change for existing consumers and automation tooling. This should be explicitly noted in the commit message with a YANG model backing this new structure (per yang-json-output rule).

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: pbrd/pbr_vty.c
    Line: 1870-1881
    
    Comment:
    **JSON output structure is a flat dict mixing metadata with data**
    
    The resulting JSON structure is:
    ```json
    {
      "12345": { "id": 12345, "name": "foo", ... },
      "67890": { "id": 67890, "name": "bar", ... },
      "nhgCount": 2
    }
    ```
    
    The `nhgCount` metadata is mixed at the same level as the nexthop group entries. This is an unusual and fragile design — any consumer must know to skip integer-valued keys, and adding future metadata fields will further pollute the namespace. A cleaner structure would nest the groups:
    ```json
    {
      "nhgCount": 2,
      "nexthopGroups": {
        "12345": { ... },
        "67890": { ... }
      }
    }
    ```
    
    Also note: changing the JSON output from an array to a dict is a **breaking change** for existing consumers and automation tooling. This should be explicitly noted in the commit message with a YANG model backing this new structure (per `yang-json-output` rule).
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: lib/nexthop_group.c
Line: 1093-1095

Comment:
**Breaking JSON API change — no deprecation path**

For `NEXTHOP_TYPE_IFINDEX`, the key is being renamed from `"nexthop"` to `"interfaceName"`. While the old name was semantically wrong, this is a **breaking change** to a public JSON output consumed by operators and tooling. Per FRR's backwards-compatibility rule, CLI and `lib/` JSON output changes require a deprecation path.

The fix for `NEXTHOP_TYPE_IPV4_IFINDEX` and `NEXTHOP_TYPE_IPV6_IFINDEX` is clearly correct (the old code stored the interface name under the `"vrfId"` key, which is an obvious bug), and additionally `vrfId` changing from a string to an integer is correct. But the `NEXTHOP_TYPE_IFINDEX` rename from `"nexthop"` to `"interfaceName"` needs documentation/coordination with anyone consuming this output.

At minimum, this should be called out explicitly in the commit message and PR description as a breaking JSON output change.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: pbrd/pbr_vty.c
Line: 1870-1881

Comment:
**JSON output structure is a flat dict mixing metadata with data**

The resulting JSON structure is:
```json
{
  "12345": { "id": 12345, "name": "foo", ... },
  "67890": { "id": 67890, "name": "bar", ... },
  "nhgCount": 2
}
```

The `nhgCount` metadata is mixed at the same level as the nexthop group entries. This is an unusual and fragile design — any consumer must know to skip integer-valued keys, and adding future metadata fields will further pollute the namespace. A cleaner structure would nest the groups:
```json
{
  "nhgCount": 2,
  "nexthopGroups": {
    "12345": { ... },
    "67890": { ... }
  }
}
```

Also note: changing the JSON output from an array to a dict is a **breaking change** for existing consumers and automation tooling. This should be explicitly noted in the commit message with a YANG model backing this new structure (per `yang-json-output` rule).

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: tests/topotests/pbr_topo1/test_pbr_topo1.py
Line: 141-145

Comment:
**Dict-to-list extraction loses ordering guarantees — potential test flakiness**

```python
actual_groups = [v for k, v in output.items() if isinstance(v, dict)]
return topotest.json_cmp(actual_groups, expected)
```

The dict returned from parsing the JSON output is keyed by table IDs, which are numeric strings. The iteration order of this dict depends on the hash iteration order from the C daemon — specifically `hash_iterate(pbr_nhg_hash, ...)` — which is **not guaranteed to be stable** across runs or daemon restarts. If `topotest.json_cmp` performs positional list comparison and the expected file has a fixed ordering, this test is potentially flaky.

Consider either:
1. Sorting `actual_groups` by a stable key (e.g., `"name"` or `"id"`) before comparison, or
2. Reworking `nhg_cmp` to do a set-based match rather than positional list comparison.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: tests/topotests/pbr_topo1/test_pbr_topo1.py
Line: 140

Comment:
**No guard against non-dict output from vtysh_cmd**

If `router.vtysh_cmd(cmd, isjson=True)` returns something other than a dict (e.g., an error string, `None`, or a JSON array due to a daemon bug), calling `.items()` will raise `AttributeError` and abort the test with a confusing traceback rather than a clean failure message.

Add a guard:
```suggestion
    output = router.vtysh_cmd(cmd, isjson=True)
    if not isinstance(output, dict):
        return "Expected dict output, got: {}".format(type(output).__name__)
    actual_groups = [v for k, v in output.items() if isinstance(v, dict)]
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: tests/topotests/pbr_topo1/test_pbr_topo1.py
Line: 151

Comment:
**File handle leak — use context manager**

`open(expfile).read()` opens the file but never explicitly closes it. Use a context manager for correctness:

```suggestion
    with open(expfile) as f:
        expected = json.loads(f.read())
```

Per the `python-formatting` rule, Python code should follow best practices formatted with `black`. This also applies to the same pattern used elsewhere (`runit` function, line 127), but that is pre-existing.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: tests/topotests/pbr_topo1/test_pbr_topo1.py
Line: 453-462

Comment:
**No guard against non-dict nhg_data in test_pbr_nhg_with_vrf_disable_enable**

The updated code calls `nhg_data.items()` unconditionally. If `nhg_data` is not a dict (e.g., daemon not responding, JSON parse failure), this will raise `AttributeError` and surface as a cryptic exception rather than a clean test failure. The old code had `if isinstance(nhg_data, list):` which at least provided a safe fallback. Add a corresponding type check:

```python
if not isinstance(nhg_data, dict):
    return "Unexpected nhg_data type: {}".format(type(nhg_data).__name__)
for key, item in nhg_data.items():
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: pbrd/pbr_nht.c
Line: 1446-1447

Comment:
**nhg_count type is overly narrow**

`uint16_t nhg_count` caps at 65535. While this limit is unlikely to be hit in practice, `uint32_t` would match the type of `table_id` (which is used as the dict key) and be more future-proof. There's no memory savings benefit to using `uint16_t` here since the struct is on the stack.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "chore: add .greptile..."

Comment on lines 1093 to 1095
case NEXTHOP_TYPE_IFINDEX:
json_object_string_add(j, "nexthop",
json_object_string_add(j, "interfaceName",
ifindex2ifname(nh->ifindex, nh->vrf_id));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Breaking JSON API change — no deprecation path

For NEXTHOP_TYPE_IFINDEX, the key is being renamed from "nexthop" to "interfaceName". While the old name was semantically wrong, this is a breaking change to a public JSON output consumed by operators and tooling. Per FRR's backwards-compatibility rule, CLI and lib/ JSON output changes require a deprecation path.

The fix for NEXTHOP_TYPE_IPV4_IFINDEX and NEXTHOP_TYPE_IPV6_IFINDEX is clearly correct (the old code stored the interface name under the "vrfId" key, which is an obvious bug), and additionally vrfId changing from a string to an integer is correct. But the NEXTHOP_TYPE_IFINDEX rename from "nexthop" to "interfaceName" needs documentation/coordination with anyone consuming this output.

At minimum, this should be called out explicitly in the commit message and PR description as a breaking JSON output change.

Prompt To Fix With AI
This is a comment left during a code review.
Path: lib/nexthop_group.c
Line: 1093-1095

Comment:
**Breaking JSON API change — no deprecation path**

For `NEXTHOP_TYPE_IFINDEX`, the key is being renamed from `"nexthop"` to `"interfaceName"`. While the old name was semantically wrong, this is a **breaking change** to a public JSON output consumed by operators and tooling. Per FRR's backwards-compatibility rule, CLI and `lib/` JSON output changes require a deprecation path.

The fix for `NEXTHOP_TYPE_IPV4_IFINDEX` and `NEXTHOP_TYPE_IPV6_IFINDEX` is clearly correct (the old code stored the interface name under the `"vrfId"` key, which is an obvious bug), and additionally `vrfId` changing from a string to an integer is correct. But the `NEXTHOP_TYPE_IFINDEX` rename from `"nexthop"` to `"interfaceName"` needs documentation/coordination with anyone consuming this output.

At minimum, this should be called out explicitly in the commit message and PR description as a breaking JSON output change.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +141 to +145
"""
output = router.vtysh_cmd(cmd, isjson=True)
actual_groups = [v for k, v in output.items() if isinstance(v, dict)]
return topotest.json_cmp(actual_groups, expected)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Dict-to-list extraction loses ordering guarantees — potential test flakiness

actual_groups = [v for k, v in output.items() if isinstance(v, dict)]
return topotest.json_cmp(actual_groups, expected)

The dict returned from parsing the JSON output is keyed by table IDs, which are numeric strings. The iteration order of this dict depends on the hash iteration order from the C daemon — specifically hash_iterate(pbr_nhg_hash, ...) — which is not guaranteed to be stable across runs or daemon restarts. If topotest.json_cmp performs positional list comparison and the expected file has a fixed ordering, this test is potentially flaky.

Consider either:

  1. Sorting actual_groups by a stable key (e.g., "name" or "id") before comparison, or
  2. Reworking nhg_cmp to do a set-based match rather than positional list comparison.
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/topotests/pbr_topo1/test_pbr_topo1.py
Line: 141-145

Comment:
**Dict-to-list extraction loses ordering guarantees — potential test flakiness**

```python
actual_groups = [v for k, v in output.items() if isinstance(v, dict)]
return topotest.json_cmp(actual_groups, expected)
```

The dict returned from parsing the JSON output is keyed by table IDs, which are numeric strings. The iteration order of this dict depends on the hash iteration order from the C daemon — specifically `hash_iterate(pbr_nhg_hash, ...)` — which is **not guaranteed to be stable** across runs or daemon restarts. If `topotest.json_cmp` performs positional list comparison and the expected file has a fixed ordering, this test is potentially flaky.

Consider either:
1. Sorting `actual_groups` by a stable key (e.g., `"name"` or `"id"`) before comparison, or
2. Reworking `nhg_cmp` to do a set-based match rather than positional list comparison.

How can I resolve this? If you propose a fix, please make it concise.

"""Compare nexthop-groups JSON dict output against expected list.

The JSON output is a dict keyed by table ID (plus nhgCount).
Extract the group values and compare as a list against expected.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 No guard against non-dict output from vtysh_cmd

If router.vtysh_cmd(cmd, isjson=True) returns something other than a dict (e.g., an error string, None, or a JSON array due to a daemon bug), calling .items() will raise AttributeError and abort the test with a confusing traceback rather than a clean failure message.

Add a guard:

Suggested change
Extract the group values and compare as a list against expected.
output = router.vtysh_cmd(cmd, isjson=True)
if not isinstance(output, dict):
return "Expected dict output, got: {}".format(type(output).__name__)
actual_groups = [v for k, v in output.items() if isinstance(v, dict)]
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/topotests/pbr_topo1/test_pbr_topo1.py
Line: 140

Comment:
**No guard against non-dict output from vtysh_cmd**

If `router.vtysh_cmd(cmd, isjson=True)` returns something other than a dict (e.g., an error string, `None`, or a JSON array due to a daemon bug), calling `.items()` will raise `AttributeError` and abort the test with a confusing traceback rather than a clean failure message.

Add a guard:
```suggestion
    output = router.vtysh_cmd(cmd, isjson=True)
    if not isinstance(output, dict):
        return "Expected dict output, got: {}".format(type(output).__name__)
    actual_groups = [v for k, v in output.items() if isinstance(v, dict)]
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines 453 to +462
if not nhg_data:
return "Nexthop-group C data is empty"

# Find the nexthop-group with name "C" (ID is variable)
# Find the nexthop-group with name "C" (table ID key is variable)
nhg = None

if isinstance(nhg_data, list):
for item in nhg_data:
if isinstance(item, dict) and item.get("name") == "C":
nhg = item
break
for key, item in nhg_data.items():
if isinstance(item, dict) and item.get("name") == "C":
nhg = item
break
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 No guard against non-dict nhg_data in test_pbr_nhg_with_vrf_disable_enable

The updated code calls nhg_data.items() unconditionally. If nhg_data is not a dict (e.g., daemon not responding, JSON parse failure), this will raise AttributeError and surface as a cryptic exception rather than a clean test failure. The old code had if isinstance(nhg_data, list): which at least provided a safe fallback. Add a corresponding type check:

if not isinstance(nhg_data, dict):
    return "Unexpected nhg_data type: {}".format(type(nhg_data).__name__)
for key, item in nhg_data.items():
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/topotests/pbr_topo1/test_pbr_topo1.py
Line: 453-462

Comment:
**No guard against non-dict nhg_data in test_pbr_nhg_with_vrf_disable_enable**

The updated code calls `nhg_data.items()` unconditionally. If `nhg_data` is not a dict (e.g., daemon not responding, JSON parse failure), this will raise `AttributeError` and surface as a cryptic exception rather than a clean test failure. The old code had `if isinstance(nhg_data, list):` which at least provided a safe fallback. Add a corresponding type check:

```python
if not isinstance(nhg_data, dict):
    return "Unexpected nhg_data type: {}".format(type(nhg_data).__name__)
for key, item in nhg_data.items():
```

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants