
[SmartSwitch] Support maximum gnmi message size of 125k messages #643

Open
croos12 wants to merge 4 commits into sonic-net:master from croos12:croos-increase-gnmi-message-size


@croos12 croos12 commented Apr 9, 2026

Why I did it

At max DASH scale, a single gNMI call (CA-to-PA, ENI-level object config) can contain up to ~125,000 protobuf messages, which serializes to ~32 MB. sonic-gnmi currently uses gRPC's 4 MB default for MaxRecvMsgSize, so these calls are rejected.

How I did it

| Setting | Before | After |
|---|---|---|
| gRPC MaxRecvMsgSize (server) | 4 MB (gRPC default) | configurable via -max_recv_msg_size |
| gRPC MaxSendMsgSize (server) | math.MaxInt32 (gRPC default) | configurable via -max_send_msg_size |

Defaults are unchanged; sonic-buildimage#26679 sets both to 32 MB for SmartSwitch.

How to verify it

Start telemetry with -max_recv_msg_size above 4 MB and send a gNMI message larger than 4 MB; confirm it is accepted.

Copilot AI review requested due to automatic review settings April 9, 2026 18:23
@croos12 croos12 changed the title from "Support maximum gnmi message size of 125k" to "[SmartSwitch] [DPU] Support maximum gnmi message size of 125k" Apr 9, 2026
@mssonicbld
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Contributor

Copilot AI left a comment

Pull request overview

This PR adds configurable gRPC max receive/send message sizes to the SONiC gNMI telemetry server, to better handle larger config/telemetry payloads at scale.

Changes:

  • Added MaxRecvMsgSize / MaxSendMsgSize to TelemetryConfig.
  • Introduced -max_recv_msg_size and -max_send_msg_size CLI flags.
  • Applied the configured limits to the gRPC server via grpc.MaxRecvMsgSize / grpc.MaxSendMsgSize.

Comment thread telemetry/telemetry.go Outdated
Comment thread telemetry/telemetry.go Outdated
Comment thread telemetry/telemetry.go Outdated
Comment thread telemetry/telemetry.go Outdated
@croos12 croos12 changed the title from "[SmartSwitch] [DPU] Support maximum gnmi message size of 125k" to "[SmartSwitch] [NPU] Support maximum gnmi message size of 125k" Apr 9, 2026
@croos12 croos12 changed the title from "[SmartSwitch] [NPU] Support maximum gnmi message size of 125k" to "[SmartSwitch] Support maximum gnmi message size of 125k" Apr 9, 2026
vivekrnv previously approved these changes Apr 9, 2026
@vivekrnv

vivekrnv commented Apr 9, 2026

@prsunny PFA

@mssonicbld
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@croos12
Author

croos12 commented Apr 10, 2026

/azpw run

@mssonicbld
Contributor

⚠️ Notice: /azpw run only runs failed jobs now. If you want to trigger a whole pipeline run, please rebase your branch or close and reopen the PR.
💡 Tip: You can also use /azpw retry to retry failed jobs directly.

Retrying failed(or canceled) jobs...

@mssonicbld
Contributor

Retrying failed(or canceled) stages in build 1085425:

✅Stage Build:

  • Job build: retried.

prabhataravind previously approved these changes Apr 10, 2026
croos12 added 2 commits April 10, 2026 21:59
Signed-off-by: Connor Roos <croos@nvidia.com>
Signed-off-by: Connor Roos <croos@nvidia.com>
@mssonicbld
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@hdwhdw
Contributor

hdwhdw commented Apr 10, 2026

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@croos12
Author

croos12 commented Apr 13, 2026

/azpw run

@mssonicbld
Contributor

⚠️ Notice: /azpw run only runs failed jobs now. If you want to trigger a whole pipeline run, please rebase your branch or close and reopen the PR.
💡 Tip: You can also use /azpw retry to retry failed jobs directly.

Retrying failed(or canceled) jobs...

@mssonicbld
Contributor

Retrying failed(or canceled) stages in build 1085623:

✅Stage BuildArm64:

  • Job arm64 deb build: retried.

✅Stage BuildAmd64:

  • Job amd64 deb build: retried.

✅Stage Build:

  • Job amd64 deb build: retried.
  • Job arm64 deb build: retried.
  • Job build: retried.

@croos12
Author

croos12 commented Apr 14, 2026

/azpw retry

@mssonicbld
Contributor

Retrying failed(or canceled) jobs...

@mssonicbld
Contributor

Retrying failed(or canceled) stages in build 1085623:

✅Stage BuildAmd64:

  • Job amd64 deb build: retried.

✅Stage BuildArm64:

  • Job arm64 deb build: retried.

✅Stage Build:

  • Job amd64 deb build: retried.
  • Job build: retried.
  • Job arm64 deb build: retried.

@mssonicbld
Contributor

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@hdwhdw
Contributor

hdwhdw commented Apr 16, 2026

@croos12 Just to double-check, do you mean 125K or 125M in the title?

@croos12 croos12 changed the title from "[SmartSwitch] Support maximum gnmi message size of 125k" to "[SmartSwitch] Support maximum gnmi message size 4 MB" Apr 17, 2026
@croos12 croos12 changed the title from "[SmartSwitch] Support maximum gnmi message size 4 MB" to "[SmartSwitch] Support maximum gnmi message size 125 MB" Apr 17, 2026
@croos12 croos12 changed the title from "[SmartSwitch] Support maximum gnmi message size 125 MB" to "[SmartSwitch] Support maximum gnmi message size 125k messages" Apr 17, 2026
@croos12
Author

croos12 commented Apr 18, 2026

> @croos12 Just to double check do you mean 125K or 125M in title?

It's a maximum of 125k total messages per file. In testing, I found that this fits within 32 MB gNMI send/receive limits.

@croos12 croos12 changed the title from "[SmartSwitch] Support maximum gnmi message size 125k messages" to "[SmartSwitch] Support maximum gnmi message size of 125k messages" Apr 19, 2026
@hdwhdw
Contributor

hdwhdw commented Apr 21, 2026

Sorry, this is still a bit confusing. I don't see any mention of 125 KB in the new code, and isn't 125 KB < 4 MB? What exactly are you changing to 125 KB?

@croos12
Author

croos12 commented Apr 21, 2026

> Sorry this is still a bit confusing. I don't see any mentioning of the 125kb in the new code and isn't 125 kb < 4mb? What exactly are you changing to 125kb?

Sorry, the naming here is confusing. The "125k" in the title means 125,000 messages (a count), not 125 kilobytes. The largest ENI-level object configuration in DASH requires up to 125,000 protobuf messages to be sent in a single gNMI call (CA-to-PA). At that scale, the serialized payload needs a maximum message size of roughly 32 MB, which I am setting in my sonic-buildimage PR that depends on this change: sonic-net/sonic-buildimage#26679

@hdwhdw
Contributor

hdwhdw commented Apr 21, 2026

Thanks for the clarification. It would be nice to include some background or pointers to background in the description.

Also, you might want to merge the latest master to fix the pipeline error.

@mssonicbld
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).
