AGT-26 - Add persistent config store with MQTT get/set#52
Merged
Conversation
0eaf32b to
d71c5d4
Compare
rodneyosodo
reviewed
May 25, 2026
Add a file-backed key-value store (pkg/config/store.go) that survives restarts. On startup config is merged in order: env vars → bootstrap response → persisted overrides via MG_AGENT_CONFIG_PATH. MQTT config get <key> and config set <key> <value> commands read and write the store at runtime. Only the five operational keys are accepted (log_level, heartbeat_interval, terminal_session_timeout, nodered_url, server_port); unknown keys are rejected to prevent arbitrary storage and SSRF via the nodered_url key. Writes use an atomic temp-file rename to avoid store corruption on crash.
- Treat empty config file as an empty store instead of a fatal error - Copy entries under lock and write to disk outside the lock; roll back in-memory state if the disk write fails - Reject zero/negative durations in ApplyConfigEntry to prevent time.NewTicker(0) panic and permanent device bricking on restart - Use strings.SplitN(...,3) in the set handler so values containing commas are not silently truncated - Add default case to ServiceConfig switch so unknown commands return an error instead of silently acknowledging - Unexport settableKeys; remove nodered_url (SSRF risk) and server_port (cannot be re-bound at runtime) from the allowlist - Wire slog.LevelVar through initLogger and agent.New so log_level changes take effect immediately on the running logger - Add heartbeatIntervalCh to reset the selfHeartbeat ticker live when heartbeat_interval is updated via MQTT
… command - Extract Store interface from concrete fileStore so the store is injectable and mockable without leaking the concrete type - Add Remove(key) with atomic write and rollback, mirroring Set - Add MQTT reset command to drop a persisted override (same allowlist as set); returns "not_configured" when no store is wired - Add ApplyConfigEntryForTest export shim for white-box unit tests - Add TestApplyConfigEntry covering all settable keys, invalid duration no-op, and unknown-key no-op - Add TestStore_Remove_* covering existence, persistence, and write-error rollback; add reset cases to TestConfigGetSet
…evert - Add cfgMu (sync.RWMutex) and startupConfig to agent struct; snapshot config in Config(), AddConfig(), Terminal(), Publish(), getTopic(), normalizeNodeRedFlow(), and UpdateLiveness() to eliminate data races - Add validateSettableValue to reject invalid log levels, unparseable durations, and heartbeat intervals below 1s before persisting - Add revertToStartup so 'reset' propagates the startup value to running subsystems (log level, heartbeat ticker) immediately - Sync levelVar with any persisted log_level override on startup in main - Fix Set/Remove to hold the write lock for the full duration of the disk write and roll back in-memory state on failure - Fix NewStore to use errors.Is(err, os.ErrNotExist) - Extend tests: rollback assertion on write error, invalid log level and sub-second heartbeat rejection, zero/negative duration edge cases, and require.NoError for seed operations
Wrap deferred Close calls in closures, prefix fire-and-forget cleanup calls with _ =, and explicitly discard fmt.Fprintf returns in tests.
Extract repeated config key strings and not_configured into named constants (keyLogLevel, keyHeartbeatInterval, keyTerminalSessionTimeout, notConfigured) to satisfy goconst. Auto-format service.go and service_test.go with gci/gofumpt to resolve formatting violations.
Co-authored-by: Rodney Osodo <[email protected]>
- Use DataChan for heartbeat topic in New() - Parse ServiceConfig args consistently using TrimSpace - Add default case to validateSettableValue
rodneyosodo
approved these changes
May 25, 2026
nyagamunene
reviewed
May 25, 2026
| t = fmt.Sprintf("m/%s/c/%s/res", domainID, a.config.Channels.CtrlChan()) | ||
| t = fmt.Sprintf("m/%s/c/%s/res", domainID, cfg.Channels.CtrlChan()) | ||
| case data: | ||
| t = fmt.Sprintf("m/%s/c/%s/gateway/telemetry", domainID, a.config.Channels.DataChan()) |
Contributor
There was a problem hiding this comment.
Should we not update here to cfg.config.Channels
Contributor
Author
There was a problem hiding this comment.
Fixed — getTopic now uses cfg.Channels consistently throughout.
nyagamunene
reviewed
May 25, 2026
| BootstrapRetryDelay string `env:"MG_AGENT_BOOTSTRAP_RETRY_DELAY_SECONDS" envDefault:"10"` | ||
| BootstrapSkipTLS string `env:"MG_AGENT_BOOTSTRAP_SKIP_TLS" envDefault:"false"` | ||
| DeviceDBPath string `env:"MG_AGENT_DEVICE_DB_PATH" envDefault:"/var/lib/agent/devices.db"` | ||
| ConfigPath string `env:"MG_AGENT_CONFIG_PATH" envDefault:"agent-config.json"` |
Contributor
There was a problem hiding this comment.
Should we update the docker compose and .env files with this variable. as of now there is no way to set this to a different path
Contributor
Author
There was a problem hiding this comment.
Added MG_AGENT_CONFIG_PATH to docker/.env and docker-compose.yml.
- Use cfg.Channels consistently in getTopic - Add MG_AGENT_CONFIG_PATH to docker/.env and docker-compose.yml
nyagamunene
approved these changes
May 26, 2026
rodneyosodo
approved these changes
May 26, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What does this do?
Adds a file-backed persistent config store. Supports MQTT
get/setcommands. Live-updates log level and heartbeat interval.Which issue(s) does this PR fix/relate to?
Resolves #26
List any changes that modify/break current functionality
agent.New()gains two new parameters:*config.Store,*slog.LevelVar.Have you included tests for your changes?
Yes. Store and service config tests added.
Did you document any new/modified functionality?
No docs changes needed.
Notes
Settable keys:
log_level,heartbeat_interval,terminal_session_timeout.