Improve IPv4 lookups with Eytzinger index#202
Improve IPv4 lookups with Eytzinger index#202marselester wants to merge 1 commit intooschwald:mainfrom
Conversation
WalkthroughThis pull request introduces an optional IPv4 index feature to optimize IPv4 lookups using an Eytzinger-like precomputed path. A new Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Tip 🧪 Unit Test Generation v2 is now available!We have significantly improved our unit test generation capabilities. To enable: Add this to your reviews:
finishing_touches:
unit_tests:
enabled: trueTry it out by using the Have feedback? Share your thoughts on our Discord thread! Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@reader_test.go`:
- Line 866: Add a functional test that opens the MMDB with the WithIPv4Index
option and verifies the indexed lookup path returns identical results to the
non-indexed path: call Open("GeoLite2-City.mmdb", WithIPv4Index) and compare its
lookup outputs against Open("GeoLite2-City.mmdb") for the same test IPs (add
this either in TestLookupNetwork or as a new TestIPv4IndexLookup that iterates
existing test cases), ensuring you exercise the functions used for
lookups/decoding to validate correctness; also add brief comments to
BenchmarkDecodeCountryCodeWithStruct, BenchmarkDecodePathCountryCode, and
BenchmarkCityLookupConcurrent explaining why WithIPv4Index is intentionally
omitted or add variants that include it for comparison.
In `@reader.go`:
- Around line 503-512: Replace the len-based check with a nil check to clarify
intent: change the conditional that decides between r.traverseTreeWithIndex and
r.traverseTree to use "r.ipv4Index != nil" instead of "len(r.ipv4Index) > 0";
keep the rest of the logic (calling traverseTreeWithIndex(ip) when the index is
present and traverseTree(ip, 0, 128) otherwise) intact to preserve behavior.
- Around line 662-670: The node parameter in Reader.traverseTree is
unused/overwritten (always 0 at callers and replaced when ip.Is4()), so remove
the node uint parameter from traverseTree and update its body to call
r.traverseTreeFromBit(ip, startNode, startBit, stopBit) using r.ipv4Start (and
r.ipv4StartBitDepth) for IPv4 or 0 for IPv6; also update all call sites to drop
the node argument and adjust any signatures that referenced traverseTree
accordingly while keeping traverseTreeFromBit untouched.
|
|
||
| func BenchmarkInterfaceLookup(b *testing.B) { | ||
| db, err := Open("GeoLite2-City.mmdb") | ||
| db, err := Open("GeoLite2-City.mmdb", WithIPv4Index) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check for functional test coverage of WithIPv4Index option
echo "=== WithIPv4Index usage in all test files ==="
rg -n 'WithIPv4Index' --glob '*_test.go'
echo -e "\n=== Test functions (non-benchmark) opening databases ==="
rg -n 'func Test[^B]' --glob '*_test.go' -A 10 | grep -E '(func Test|Open\()'
echo -e "\n=== All Open calls with WithIPv4Index context ==="
rg -n 'Open\(' --glob '*_test.go' -B 2 -A 2 | grep -E '(func|Open\(|WithIPv4Index)'Repository: oschwald/maxminddb-golang
Length of output: 15896
Add functional tests for the IPv4 index path.
The WithIPv4Index option is only used in benchmarks (lines 866, 885, 956, 975). While these benchmarks validate performance with the index enabled, there are no functional unit tests that exercise the indexed lookup path for correctness. Add test cases to TestLookupNetwork or create a dedicated test that opens the database with WithIPv4Index to ensure the index-based traversal returns identical results to the non-indexed path.
Additionally, some IPv4 benchmarks (BenchmarkDecodeCountryCodeWithStruct, BenchmarkDecodePathCountryCode, BenchmarkCityLookupConcurrent) do not use WithIPv4Index. If this is intentional for comparison purposes, a brief comment explaining the rationale would be helpful for maintainability.
🤖 Prompt for AI Agents
In `@reader_test.go` at line 866, Add a functional test that opens the MMDB with
the WithIPv4Index option and verifies the indexed lookup path returns identical
results to the non-indexed path: call Open("GeoLite2-City.mmdb", WithIPv4Index)
and compare its lookup outputs against Open("GeoLite2-City.mmdb") for the same
test IPs (add this either in TestLookupNetwork or as a new TestIPv4IndexLookup
that iterates existing test cases), ensuring you exercise the functions used for
lookups/decoding to validate correctness; also add brief comments to
BenchmarkDecodeCountryCodeWithStruct, BenchmarkDecodePathCountryCode, and
BenchmarkCityLookupConcurrent explaining why WithIPv4Index is intentionally
omitted or add variants that include it for comparison.
| var ( | ||
| node uint | ||
| prefixLength int | ||
| err error | ||
| ) | ||
| if len(r.ipv4Index) > 0 && ip.Is4() { | ||
| node, prefixLength, err = r.traverseTreeWithIndex(ip) | ||
| } else { | ||
| node, prefixLength, err = r.traverseTree(ip, 0, 128) | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Minor: Consider using r.ipv4Index != nil for the nil check.
While len(r.ipv4Index) > 0 works correctly (returns 0 for nil slices), r.ipv4Index != nil is more idiomatic for checking whether an optional feature is enabled. This also more clearly conveys the intent that we're checking if the index was initialized.
♻️ Suggested change
-if len(r.ipv4Index) > 0 && ip.Is4() {
+if r.ipv4Index != nil && ip.Is4() {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| var ( | |
| node uint | |
| prefixLength int | |
| err error | |
| ) | |
| if len(r.ipv4Index) > 0 && ip.Is4() { | |
| node, prefixLength, err = r.traverseTreeWithIndex(ip) | |
| } else { | |
| node, prefixLength, err = r.traverseTree(ip, 0, 128) | |
| } | |
| var ( | |
| node uint | |
| prefixLength int | |
| err error | |
| ) | |
| if r.ipv4Index != nil && ip.Is4() { | |
| node, prefixLength, err = r.traverseTreeWithIndex(ip) | |
| } else { | |
| node, prefixLength, err = r.traverseTree(ip, 0, 128) | |
| } |
🤖 Prompt for AI Agents
In `@reader.go` around lines 503 - 512, Replace the len-based check with a nil
check to clarify intent: change the conditional that decides between
r.traverseTreeWithIndex and r.traverseTree to use "r.ipv4Index != nil" instead
of "len(r.ipv4Index) > 0"; keep the rest of the logic (calling
traverseTreeWithIndex(ip) when the index is present and traverseTree(ip, 0, 128)
otherwise) intact to preserve behavior.
| func (r *Reader) traverseTree(ip netip.Addr, node uint, stopBit int) (uint, int, error) { | ||
| startBit := 0 | ||
| if ip.Is4() { | ||
| startBit = r.ipv4StartBitDepth | ||
| node = r.ipv4Start | ||
| } | ||
|
|
||
| return r.traverseTreeFromBit(ip, node, startBit, stopBit) | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
The node parameter appears unused and may cause confusion.
The node parameter passed to traverseTree is always 0 at all call sites and is immediately overwritten when ip.Is4() is true. For IPv6 addresses, the function starts from node 0 regardless of the parameter value.
Consider removing the parameter or documenting why it exists:
♻️ Suggested simplification
-func (r *Reader) traverseTree(ip netip.Addr, node uint, stopBit int) (uint, int, error) {
- startBit := 0
- if ip.Is4() {
- startBit = r.ipv4StartBitDepth
- node = r.ipv4Start
- }
+func (r *Reader) traverseTree(ip netip.Addr, stopBit int) (uint, int, error) {
+ startBit := 0
+ node := uint(0)
+ if ip.Is4() {
+ startBit = r.ipv4StartBitDepth
+ node = r.ipv4Start
+ }
return r.traverseTreeFromBit(ip, node, startBit, stopBit)
}And update call sites accordingly:
-return r.traverseTree(ip, 0, 128)
+return r.traverseTree(ip, 128)🤖 Prompt for AI Agents
In `@reader.go` around lines 662 - 670, The node parameter in Reader.traverseTree
is unused/overwritten (always 0 at callers and replaced when ip.Is4()), so
remove the node uint parameter from traverseTree and update its body to call
r.traverseTreeFromBit(ip, startNode, startBit, stopBit) using r.ipv4Start (and
r.ipv4StartBitDepth) for IPv4 or 0 for IPv6; also update all call sites to drop
the node argument and adjust any signatures that referenced traverseTree
accordingly while keeping traverseTreeFromBit untouched.
Just an idea to use IPv4 index (1 MB) to speed up lookups by almost 30%.
benchstat
old.txt
new.txt
Summary by CodeRabbit
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.