```
Warning  Unhealthy  35m (x43 over 18h)  kubelet  Liveness probe failed: Get "http://10.100.183.149:9253/dbs/int/global/DBSWriter/healthz": EOF
Normal   Killing    35m (x12 over 18h)  kubelet  Container dbs2go-global-w failed liveness probe, will be restarted
Normal   Pulling    35m (x13 over 18h)  kubelet  Pulling image "registry.cern.ch/cmsweb/dbs2go:v00.06.45"
...
```
Upon taking a look at the logs from those pods, I found them full of the following error messages: [1]
```
[2025-04-04 00:00:00.451537959 +0000 UTC m=+5269.571874437] server.go:3640: http: panic serving 10.100.6.151:45200: internal/sync.HashTrieMap: ran out of hash bits while inserting
goroutine 12975 [running]:
net/http.(*conn).serve.func1()
	/usr/local/go/src/net/http/server.go:1947 +0xbe
panic({0xa06a20?, 0xbce0e0?})
	/usr/local/go/src/runtime/panic.go:792 +0x132
internal/sync.(*HashTrieMap[...]).expand(0xbe1380?, 0xc0007bc1b0, 0xc000343260, 0x98c1c8f0b00498f9, 0x0, 0xc0007b2320)
	/usr/local/go/src/internal/sync/hashtriemap.go:181 +0x1e5
internal/sync.(*HashTrieMap[...]).LoadOrStore(0xbe1380, {0xa06a20, 0xc0003b2180}, {0xa78900, 0xc0007883c0})
	/usr/local/go/src/internal/sync/hashtriemap.go:160 +0x38d
sync.(*Map).LoadOrStore(...)
	/usr/local/go/src/sync/hashtriemap.go:67
github.com/ulule/limiter/v3/drivers/store/memory.(*Cache).LoadOrStore(0xc00042e280, {0xc000734000?, 0xff61c0?}, 0xc0007883c0)
	/go/pkg/mod/github.com/ulule/limiter/v3@v3.11.0/drivers/store/memory/cache.go:137 +0x55
github.com/ulule/limiter/v3/drivers/store/memory.(*Cache).Increment(0xc00042e280, {0xc000734000, 0x14}, 0x1, 0x3b9aca00)
	/go/pkg/mod/github.com/ulule/limiter/v3@v3.11.0/drivers/store/memory/cache.go:198 +0x166
github.com/ulule/limiter/v3/drivers/store/memory.(*Store).Get(0xc000010810, {0x10?, 0x0?}, {0xc0002f4180, 0xc}, {{0xad7796?, 0x8738dd?}, 0x457d49?, 0xc0004a54d8?})
	/go/pkg/mod/github.com/ulule/limiter/v3@v3.11.0/drivers/store/memory/store.go:42 +0x25f
github.com/ulule/limiter/v3.(*Limiter).Get(0xc00006d4e8?, {0xbd6dd8?, 0xc0003430e0?}, {0xc0002f4180?, 0x51f3b0?})
	/go/pkg/mod/github.com/ulule/limiter/v3@v3.11.0/limiter.go:49 +0x3d
github.com/dmwm/dbs2go/web.limitMiddleware.(*Middleware).Handler.func2({0xbd62d0, 0xc00011e180}, 0xc00049cdc0)
	/go/pkg/mod/github.com/ulule/limiter/v3@v3.11.0/drivers/middleware/stdlib/middleware.go:45 +0xac
net/http.HandlerFunc.ServeHTTP(0xc00006d5e8?, {0xbd62d0?, 0xc00011e180?}, 0x0?)
	/usr/local/go/src/net/http/server.go:2294 +0x29
github.com/dmwm/dbs2go/web.validateMiddleware.func1({0xbd62d0, 0xc00011e180}, 0xc00049cdc0)
	/go/src/github.com/vkuznet/dbs2go/web/middlewares.go:103 +0x358
net/http.HandlerFunc.ServeHTTP(0x76925b?, {0xbd62d0?, 0xc00011e180?}, 0x0?)
	/usr/local/go/src/net/http/server.go:2294 +0x29
github.com/dmwm/dbs2go/web.authMiddleware.func1({0xbd62d0, 0xc00011e180}, 0xc00049cdc0)
	/go/src/github.com/vkuznet/dbs2go/web/middlewares.go:79 +0x343
net/http.HandlerFunc.ServeHTTP(0x10?, {0xbd62d0?, 0xc00011e180?}, 0xc00006d848?)
	/usr/local/go/src/net/http/server.go:2294 +0x29
github.com/vkuznet/auth-proxy-server/logging.LoggingMiddleware.func1({0xbd6138, 0xc0000002a0}, 0xc00049cdc0)
	/go/pkg/mod/github.com/vkuznet/auth-proxy-server/logging@v0.0.0-20230224155500-18f9e3f9c368/logging.go:146 +0x158
net/http.HandlerFunc.ServeHTTP(0x0?, {0xbd6138?, 0xc0000002a0?}, 0x6?)
	/usr/local/go/src/net/http/server.go:2294 +0x29
github.com/dmwm/dbs2go/web.headerMiddleware.func1({0xbd6138, 0xc0000002a0}, 0xc00049cdc0)
	/go/src/github.com/vkuznet/dbs2go/web/middlewares.go:149 +0x4db
net/http.HandlerFunc.ServeHTTP(0xc00049cc80?, {0xbd6138?, 0xc0000002a0?}, 0x800?)
	/usr/local/go/src/net/http/server.go:2294 +0x29
github.com/gorilla/mux.(*Router).ServeHTTP(0xc0000bc000, {0xbd6138, 0xc0000002a0}, 0xc00049cb40)
	/go/pkg/mod/github.com/gorilla/mux@v1.8.0/mux.go:210 +0x1e2
net/http.(*ServeMux).ServeHTTP(0x477bf9?, {0xbd6138, 0xc0000002a0}, 0xc00049cb40)
	/usr/local/go/src/net/http/server.go:2822 +0x1c4
net/http.serverHandler.ServeHTTP({0xc000342f30?}, {0xbd6138?, 0xc0000002a0?}, 0x1?)
	/usr/local/go/src/net/http/server.go:3301 +0x8e
net/http.(*conn).serve(0xc0000d05a0, {0xbd6dd8, 0xc0005f3b60})
	/usr/local/go/src/net/http/server.go:2102 +0x625
created by net/http.(*Server).Serve in goroutine 13
	/usr/local/go/src/net/http/server.go:3454 +0x485
```
Impact of the bug
DBS internal
Describe the bug
Upon cutting the latest tag v00.06.45 and deploying it in preproduction, we started experiencing restarts of the k8s pods due to a broken liveness probe. The event log from one of the pods, and the error messages filling the pod logs, are shown above. [1]
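For reference, the liveness probe that triggers these restarts would look roughly like the following in the pod spec. The path and port come from the probe failure message above; the timing values are illustrative assumptions, not the actual deployment values:

```yaml
livenessProbe:
  httpGet:
    path: /dbs/int/global/DBSWriter/healthz
    port: 9253
  # timing values below are assumed for illustration
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3
```

Once the server starts panicking on every request, the probe's HTTP GET is cut off mid-response (the `EOF` in the events), and after `failureThreshold` consecutive failures the kubelet kills and restarts the container.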
This behavior has not been observed with DBS version v00.06.44. The full comparison between the two tags is: v00.06.44...v00.06.45, but if we need to list just the relevant PRs, here they are:

How to reproduce it
Not clear yet.
Expected behavior
The service to be stable.
Additional context and error message
A similar issue was reported at golang/go#69534, although it is not clear what the solution was there.
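For context, the panic is raised inside Go's `sync.Map` (reimplemented on top of `internal/sync.HashTrieMap` in recent Go releases), which the limiter's in-memory cache uses to create one counter per client key via `LoadOrStore`, as the stack trace shows. A minimal stdlib sketch of that access pattern (the names here are illustrative, not actual dbs2go code):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// counters mimics the limiter's memory cache: one counter per
// client key, created on first access via LoadOrStore. Every
// rate-limited request takes this code path, which is where the
// "ran out of hash bits while inserting" panic originates.
var counters sync.Map

// increment atomically bumps and returns the counter for key,
// creating the counter on first use.
func increment(key string) int64 {
	v, _ := counters.LoadOrStore(key, new(atomic.Int64))
	return v.(*atomic.Int64).Add(1)
}

func main() {
	fmt.Println(increment("10.100.6.151")) // 1
	fmt.Println(increment("10.100.6.151")) // 2
	fmt.Println(increment("10.100.7.23"))  // 1
}
```

Nothing in this pattern is misusing the API, which points at the panic being triggered inside the `sync.Map` implementation itself rather than by the caller.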