[EXPERIMENTAL] Improved Go tuple library by Semisol · Pull Request #12338 · apple/foundationdb

Semisol · 2025-08-28T22:38:50Z

This PR adds an experimental tuple decoder optimized for fewer allocations and higher performance, adds support for fixed-length byte strings, and fixes some panics.

This should be kept experimental until a way to handle the potentially breaking changes is found (possibly by combining them with other breaking binding changes in a Go bindings "v2").

Compared with the original decoder, the new decoder attempts to decode strings without copying them, and uses a new Boxed type that requires no allocations, unlike the any type, which generally requires its contents to live on the heap.
A neat hack based on on-stack slice allocation is also used to remove the allocations that append would otherwise cause while a tuple is built up, without having to estimate the tuple's length in advance to set a capacity.
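The PR does not spell out the slice trick here, so the following is only a minimal sketch of one common way to do it, not the PR's code. It assumes a fixed-size scratch array (the size 16 and the names `collect` and `scratch` are hypothetical) that the compiler can usually keep on the stack, with one exact-size allocation for the result at the end.

```go
package main

import "fmt"

// collect gathers an unknown number of elements without guessing a capacity.
// scratch is declared as a local array; whether it actually stays on the
// stack is decided by the compiler's escape analysis.
func collect(src []int) []int {
	var scratch [16]int  // hypothetical inline capacity
	items := scratch[:0] // len 0, cap 16, backed by scratch

	for _, v := range src {
		// append only moves to a heap-allocated array once more than
		// 16 elements have been added.
		items = append(items, v)
	}

	// A single exact-size allocation for the value that is returned.
	out := make([]int, len(items))
	copy(out, items)
	return out
}

func main() {
	fmt.Println(collect([]int{1, 2, 3}))
}
```

For inputs that fit in the scratch array, the only allocation is the final `make`; larger inputs fall back to ordinary `append` growth.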

The tests are also improved to ensure unpacking works correctly, and benchmarks are added.

This PR was closed and reopened because the update was moved to a separate branch.

Benchmarking

To run benchmarks: go test -run=NONE -bench=^BenchmarkTupleUnpack

To-do list

Improve code documentation
Add fuzz tests that were used to discover panics that should have been errors when unpacking tuples
Handle the breaking change to google/uuid (we may be able to just alias tuple.UUID to it)
Make the V2 decoder the primary decoder, and remove the original decoder's code before merge.
Handle the fact that some users may expect Unpack to not depend on the input buffer. This could be "fixed" by copying the input buffer or by making a breaking v2 release (alongside possibly other binding improvements).
Add support for big.Int. The new decoder does not support this right now.
Add support for end-of-tuple type encoding and returning bytes after EOT
Consider adding support for a 64-bit identifier type

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

The PR has a description, explaining both the problem and the solution.
The description mentions which forms of testing were done and the testing seems reasonable.
Every function/class/actor that was touched is reasonably well documented.

This commit:
- Requires a minimum Go version 1.20 to allow using newer features in future commits
- Adds support for the length-prefixed bytes type
- Fixes certain panics detected by fuzzing on Unpack
- Switches to binary.BigEndian methods for decoding integers which eliminates allocations and is faster
- Switches to google/uuid for UUIDs to reduce conversion boilerplate as it seems to be the de-facto library for UUIDs
- Does some other refactors for a future tuple "builder" to reduce allocations and eliminate type-casting overhead
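To illustrate the binary.BigEndian point and the "errors instead of panics" point together, here is a hedged sketch of decoding a non-negative tuple integer. It is not the code from this commit: the names `decodePositiveInt`, `intZeroCode` and `errTruncated` are made up for the example, and negative integers and big integers are deliberately left out. It relies on the tuple layer's encoding, where type code 0x14 is zero and 0x14+n is followed by an n-byte big-endian value.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// intZeroCode is the tuple-layer type code for the integer zero.
const intZeroCode = 0x14

var errTruncated = errors.New("tuple: truncated integer")

// decodePositiveInt decodes a non-negative integer starting at b[0].
// It returns the value and the number of bytes consumed. Malformed input
// produces an error rather than an out-of-range panic.
func decodePositiveInt(b []byte) (uint64, int, error) {
	if len(b) == 0 {
		return 0, 0, errTruncated
	}
	n := int(b[0]) - intZeroCode // number of payload bytes
	if n < 0 || n > 8 {
		return 0, 0, errors.New("tuple: not a positive integer type code")
	}
	if len(b) < 1+n {
		return 0, 0, errTruncated
	}
	// Left-pad the payload into a fixed 8-byte array. The array is a local
	// value, so no reader or intermediate buffer needs to be allocated.
	var buf [8]byte
	copy(buf[8-n:], b[1:1+n])
	return binary.BigEndian.Uint64(buf[:]), 1 + n, nil
}

func main() {
	v, n, err := decodePositiveInt([]byte{0x16, 0x01, 0x00})
	fmt.Println(v, n, err) // prints "256 3 <nil>"

	_, _, err = decodePositiveInt([]byte{0x16, 0x01})
	fmt.Println(err) // prints "tuple: truncated integer"
}
```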

This commit adds an experimental tuple unpacker to the Go bindings which reduces unnecessary allocations and tries to optimize decoding. Tests for the FixedLen type are also added, the existing tests are extended to ensure the unpacked value is identical to the starting value and to cover negative numbers, and benchmarks are added.
- Zero-copy is used for strings and byte slices whenever possible to eliminate allocations.
- Surprisingly, a loop was faster than using standard library functions to locate the end of a string and check.
- A function table was used as it could allow extensibility in the future and may be faster on certain platforms.
- A custom Boxed type is used, which is larger than an interface{} but does not require contents to be allocated on the heap. This is effectively a tagged union.
- A neat hack is used in the unpacking loop to only do 1 allocation in the majority of tuple decoding cases.
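The Boxed type and the function table can be pictured roughly as follows. This is a sketch under assumptions, not the PR's implementation: the field layout, the `Kind` constants, the `Box*`/`As*` helpers and `decodeOne` are all hypothetical names. Only the four type codes used (0x00 null, 0x14 integer zero, 0x26 false, 0x27 true) come from the tuple layer encoding, and only single-byte top-level elements are handled.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// Kind tags which member of the union a Boxed value holds.
type Kind uint8

const (
	KindNil Kind = iota
	KindBool
	KindInt
	KindFloat
	KindString
)

// Boxed is a tagged union. It is larger than an interface value, but
// scalars are stored inline in num, so boxing them does not allocate.
type Boxed struct {
	kind Kind
	num  uint64 // bool, int64, or float64 bits, depending on kind
	str  string
}

func BoxInt(v int64) Boxed     { return Boxed{kind: KindInt, num: uint64(v)} }
func BoxFloat(v float64) Boxed { return Boxed{kind: KindFloat, num: math.Float64bits(v)} }
func BoxString(v string) Boxed { return Boxed{kind: KindString, str: v} }
func BoxBool(v bool) Boxed {
	if v {
		return Boxed{kind: KindBool, num: 1}
	}
	return Boxed{kind: KindBool}
}

func (b Boxed) AsBool() (bool, bool)     { return b.num != 0, b.kind == KindBool }
func (b Boxed) AsInt() (int64, bool)     { return int64(b.num), b.kind == KindInt }
func (b Boxed) AsFloat() (float64, bool) { return math.Float64frombits(b.num), b.kind == KindFloat }
func (b Boxed) AsString() (string, bool) { return b.str, b.kind == KindString }

// decodeFunc decodes one element starting at b[0] and reports bytes consumed.
type decodeFunc func(b []byte) (Boxed, int, error)

// decoders is indexed by type code; unset entries mean "unsupported".
var decoders [256]decodeFunc

func init() {
	decoders[0x00] = func([]byte) (Boxed, int, error) { return Boxed{}, 1, nil }
	decoders[0x14] = func([]byte) (Boxed, int, error) { return BoxInt(0), 1, nil }
	decoders[0x26] = func([]byte) (Boxed, int, error) { return BoxBool(false), 1, nil }
	decoders[0x27] = func([]byte) (Boxed, int, error) { return BoxBool(true), 1, nil }
}

// decodeOne dispatches on the type code through the table.
func decodeOne(b []byte) (Boxed, int, error) {
	if len(b) == 0 {
		return Boxed{}, 0, errors.New("tuple: empty input")
	}
	fn := decoders[b[0]]
	if fn == nil {
		return Boxed{}, 0, fmt.Errorf("tuple: unsupported type code 0x%02x", b[0])
	}
	return fn(b)
}

func main() {
	v, n, err := decodeOne([]byte{0x27})
	flag, ok := v.AsBool()
	fmt.Println(flag, ok, n, err)
}
```

New element types can be added by filling in another table slot, which is the extensibility the commit message alludes to.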

vishesh · 2025-09-04T20:21:43Z

Thanks for the PR. Curious what is breaking here? Is it the API or something else?

Semisol · 2025-11-28T19:54:34Z

> Thanks for the PR. Curious what is breaking here? Is it the API or something else?

UnpackV2 (which will become Unpack) assumes that the input byte slice will not be deallocated (for example, if it points to memory not allocated by Go) or modified after the call, because the decoder tries to zero-copy strings and the returned strings may therefore share memory with the input.

The other breaking change is to switch from the tuple.UUID type to google/uuid, as it seems to be the most common convention.

I am currently maintaining an internal fork of the Go bindings to see how they can be improved, as they have a lot of rough edges. I intend to contribute the results back once the fork is more stable, ideally as one large breaking change.

Semisol added 2 commits August 29, 2025 00:28

Semisol mentioned this pull request Aug 28, 2025

[EXPERIMENTAL] Improved Go tuple library #12337

Closed

11 tasks

Semisol wants to merge 2 commits into apple:main from Semisol:go-tuple