config: accept hostname for [domain-fronting] target #480
9seconds merged 5 commits into 9seconds:master
The existing `[domain-fronting].ip` only accepts a literal IP. That forces SNI-router setups to pin a static container address (and a static docker subnet) so mtg can dial the fronting backend directly instead of resolving the secret's hostname via DNS, which would loop back into mtg through the SNI router.

Add a sibling `[domain-fronting].host` that accepts either a hostname or an IP. Hostnames are resolved at dial time by the native dialer (Happy Eyeballs / dual-stack), so a docker-DNS or any A+AAAA record naturally picks the right backend address family per client. Setting both `host` and `ip` is rejected at validation.

The mtglib API stays backward compatible: ProxyOpts.DomainFrontingIP is still a plain string and the dial path already calls JoinHostPort + DialContext, both of which accept hostnames. Only the doc comment was clarified.
Follow-up to the previous commit on this branch:
- Rename Config.GetDomainFrontingIP -> GetDomainFrontingHost. The helper now returns a hostname or an IP, so the old name was a lie. Drop the unused defaultValue net.IP parameter (every caller passed nil). Update internal/cli/run_proxy.go and internal/cli/doctor.go; rename the misleading `ip` local var in doctor.go to `override`.
- Add TOML fixtures (domain_fronting_host.toml, domain_fronting_ip.toml) so the new field is exercised through the actual Parse()->JSON->Config path users hit, not just via direct .Set() calls. Plus a positive backward-compat test confirming an `ip`-only legacy config still validates and resolves correctly, and a no-fronting test confirming the unset case returns empty.
- Clarify example.config.toml: `ip` is kept for backward compatibility, not because it has stricter validation semantics worth choosing over `host`.

mtglib.ProxyOpts.DomainFrontingIP keeps its name (public API).
A few open questions worth deciding before merge; happy to do either branch:
1. Additive vs. rename-with-deprecation. This PR adds
2.
3. Mutual-exclusion scope. The current check rejects
- mtglib/proxy.go: rename private field domainFrontingIP -> domainFrontingHost
and update DomainFrontingAddress() doc comment to reflect that hostnames
are now accepted. The exported mtglib.ProxyOpts.DomainFrontingIP is
unchanged (public API), so the assignment in NewProxy now reads
`domainFrontingHost: opts.DomainFrontingIP,` which makes the
public-vs-internal naming explicitly visible at the boundary.
- internal/config/{parse,config}.go: reorder so Host comes before IP in
the [domain-fronting] struct. Cosmetic, but signals Host is the
preferred forward path.
- Add TestDomainFrontingHostAcceptsLiteralIP + domain_fronting_host_ip.toml
fixture exercising the documented "host accepts hostname or literal IP"
contract end-to-end.
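The "host accepts hostname or literal IP" contract mentioned above can be sketched as follows. This is an approximation under stated assumptions: `validateHost` is a hypothetical name, and the exact rules of the real `TypeHost` in `internal/config/type_host.go` may differ. Only the `strings.ContainsAny` character set is taken from the diff shown in this review.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strings"
)

// validateHost is a hypothetical approximation of the checks described for
// TypeHost: accept a hostname or a literal IP, reject everything else.
func validateHost(value string) error {
	if value == "" {
		return errors.New("host is empty")
	}

	// A literal IPv4 or IPv6 address is accepted as-is.
	if net.ParseIP(value) != nil {
		return nil
	}

	// Whitespace or URL punctuation means this is not a bare host.
	if strings.ContainsAny(value, " \t\n/?#") {
		return fmt.Errorf("host %q contains whitespace or URL characters", value)
	}

	// IPv6 literals returned above, so a remaining colon means host:port.
	if strings.Contains(value, ":") {
		return fmt.Errorf("host %q must not include a port", value)
	}

	return nil
}

func main() {
	inputs := []string{"web", "example.com", "10.10.10.11", "::1", "", "web:443", "https://example.com", "a b"}
	for _, v := range inputs {
		fmt.Printf("%q -> %v\n", v, validateHost(v))
	}
}
```

Note that this sketch is purely syntactic: it performs no DNS lookup, which matches the position argued later in the thread that resolution belongs at dial time.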
	return nil
}

if strings.ContainsAny(value, " \t\n/?#") {
I'm wondering whether it is possible to validate that this domain is resolvable. Right now any double Dutch can be set here. An IP is fine, but I do believe we can do something about resolving the hostname.
I'd push back on doing DNS at parse time. Three reasons:
- Codebase precedent. The closest existing field is `mtglib.Secret.Host` (the secret's SNI hostname), and `Secret.Set()` does no DNS validation, only non-empty. Same "any double dutch" risk, deliberate choice.
- The reachability check already lives in doctor. `checkFrontingDomain()` in `internal/cli/doctor.go` resolves and dials the fronting target end-to-end; with this PR it picks up `host` via `GetDomainFrontingHost()`. A bogus hostname surfaces the dialer's DNS error there. That's the right layer for semantic checks: explicit, opt-in, with proper diagnostics.
- Resolving at parse time defeats the point of accepting a hostname. The motivating case (mtg behind an SNI router on a docker network) specifically needs dial-time resolution: the alias may resolve in-container but not on the host, and the address family can flip between v4/v6 per client (Happy Eyeballs). If we resolve at parse, either we cache the IP and lose all that, or we discard and resolve again at dial, in which case the parse-time resolve is just a flaky boot dependency.

Operational: a transient DNS hiccup at startup would prevent the proxy from starting, and a one-shot resolve doesn't catch the host going stale later, so it adds fragility without much real safety.

If the concern is that doctor's message for an unresolvable host is too generic (it surfaces whatever `DialContext` returns), I can add an explicit `LookupIPAddr` step in `checkFrontingDomain` so the error reads "hostname X cannot be resolved" rather than being nested inside a dial error. Want me to wire that in?
Makes sense, thanks. I see your point. But I must reply to this argument:

> The motivating case (mtg behind an SNI router on a docker network) specifically needs dial-time resolution: the alias may resolve in-container but not on the host, and the address family can flip between v4/v6 per client (Happy Eyeballs)

We should not think about different modes, a host one and a container one. There should be only one environment we have to think about: the one that runs mtg. If it happens to be a container, so be it. If it happens to be a generic host, so be it. If something resolves on the host but not in a container, that is not mtg's concern.
But this is not a performative concern, just my opinion in this regard. Such rigid behavior helps make resilient software.
//
// This is useful when DNS resolution of the fronting host is blocked.
// The hostname from the secret is still used for SNI in the TLS handshake.
// DomainFrontingIP is the address to use when connecting to the fronting
I think it makes sense to deprecate this option then. Otherwise we will have two ways to set the same effective value.
Agreed on the principle: two ways to set the same effective value is confusing. The codebase already has a clean pattern for this: `checkDeprecatedConfig()` in `internal/cli/doctor.go` plus the `tplWDeprecatedConfig` template (already deprecates the flat `domain-fronting-ip`/`domain-fronting-port` in favour of the `[domain-fronting]` block). I can mirror it for `[domain-fronting].ip` → `host`:
- Add a deprecation entry to `checkDeprecatedConfig` (`when = "2.4.0"`, one minor after the existing `2.3.0` removals).
- Update `example.config.toml` to mark `# ip = "10.10.10.11"` deprecated, matching the comment style of the other deprecated blocks.
- Add `// Deprecated: use Host instead.` on `Config.DomainFronting.IP`.
One thing I'd like to pin down before the commit: scope. Are you asking to deprecate the config option [domain-fronting].ip, or also the public Go field mtglib.ProxyOpts.DomainFrontingIP?
The Go field is the dial target consumed by anyone using mtg as a library. With this PR it already holds either a hostname or an IP, so deprecating the symbol would mean introducing a renamed DomainFrontingHost for the same value — a breaking API change. I'd lean toward only deprecating the config key and leaving the Go API alone (maybe with a clarifying note in the godoc), but it's your call.
Thanks, glad we agree here. Yes, I'm thinking about the config option. We can add a comment for DomainFrontingIP saying that it is present but unnecessary, and just remove all its usage. I understand that in theory this is going to break backward compatibility, but this is not a new pattern. I think we are fine here.
For example, this is how Go itself communicates such deprecations: take a look at DualStack in https://pkg.go.dev/net. It has been deprecated and its original meaning flipped. Technically, this was not a backward-compatible change.
Two scope questions before the commit:
- `DomainFrontingIP` deprecation. "Remove all its usage" reads two ways: (a) soft, where `NewProxy` copies `IP → Host` when `Host` is empty and existing library callers keep working; (b) strict no-op à la `net.Dialer.DualStack`, where the field is marked Deprecated and its value silently ignored. Your DualStack reference points at (b); I'll go that way unless you'd rather not silently break out-of-tree callers.
- CLI flag. `--domain-fronting-ip` in `simple_run` is the same situation at the CLI layer. Add `--domain-fronting-host` + deprecate the old flag in this PR, or leave the CLI surface for a follow-up?
I think a good compromise would be to issue a warning in the logs if any value is set, and to ignore it afterwards.
Confirming: warn + ignore for the config key, the DomainFrontingIP Go field, and the --domain-fronting-ip CLI flag.
One side effect worth flagging: a config with only ip set (no host) will warn and effectively disable domain-fronting until the user renames the key — same deliberate compatibility break as net.Dialer.DualStack. Will push the deprecation as the next commit on this branch.
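The warn-and-ignore behaviour agreed on here can be sketched roughly as below. The `proxyOpts` struct and the `frontingHost` helper are simplified, hypothetical stand-ins, not the real `mtglib.ProxyOpts` or `NewProxy`; the sketch only shows the shape of the compatibility break.

```go
package main

import "fmt"

// proxyOpts is a simplified, hypothetical stand-in for mtglib.ProxyOpts.
type proxyOpts struct {
	// DomainFrontingHost is a hostname or literal IP to dial for fronting.
	DomainFrontingHost string

	// Deprecated: use DomainFrontingHost instead. The value is ignored.
	DomainFrontingIP string
}

// frontingHost returns the dial override. A value in the deprecated field
// triggers a warning but never influences the result.
func frontingHost(opts proxyOpts, warn func(msg string)) string {
	if opts.DomainFrontingIP != "" {
		warn("DomainFrontingIP is deprecated and ignored; use DomainFrontingHost")
	}

	return opts.DomainFrontingHost
}

func main() {
	warn := func(msg string) { fmt.Println("WARN:", msg) }

	// Legacy-only options: a warning fires and the override stays empty.
	fmt.Printf("override=%q\n", frontingHost(proxyOpts{DomainFrontingIP: "10.10.10.11"}, warn))

	// New-style options: no warning, the host is used.
	fmt.Printf("override=%q\n", frontingHost(proxyOpts{DomainFrontingHost: "web"}, warn))
}
```

The first call demonstrates the side effect flagged above: with only the deprecated value set, the override comes back empty, so fronting falls back to its default behaviour until the caller switches fields.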
Yeah, I haven't thought about this use case, makes perfect sense.
Per review on 9seconds#480: warn-and-ignore for the IP-shaped paths, mirroring the net.Dialer.DualStack precedent. A config that sets only "ip" will warn at startup and effectively disable domain-fronting until the user switches to "host".
- mtglib.ProxyOpts: add DomainFrontingHost; mark DomainFrontingIP Deprecated and warn-and-drop in NewProxy.
- internal/config: GetDomainFrontingHost returns only [domain-fronting].host; deprecated keys are no longer used to derive the dial target. runProxy logs a startup warning per deprecated key that is set.
- internal/cli: add --domain-fronting-host; --domain-fronting-ip flag is parsed only so the runtime warning can fire.
- internal/cli/doctor: redirect the existing 2.3.0 entry at "host" and add a 2.4.0 entry for [domain-fronting].ip.
- example.config.toml: mark # ip = ... as deprecated.
Summary
Today `[domain-fronting].ip` only accepts a literal IP (`TypeIP` → `net.ParseIP`). When mtg sits behind an SNI router (HAProxy, etc.) whose DNS for the secret's hostname points back at the same host, mtg's default fronting behaviour resolves that hostname and dials itself, looping. Working around this in deployment requires pinning a static docker container IP and subnet so the literal-IP field can target the fronting backend directly.

This adds a sibling `[domain-fronting].host` field that accepts either a hostname or an IP. Hostnames are resolved at dial time by the native dialer (which already handles dual-stack / Happy Eyeballs), so an A+AAAA docker-DNS or normal DNS record reaches the right backend address family for IPv4 and IPv6 clients without a static pin.

`ip` semantics are unchanged. Setting both `host` and `ip` is rejected in `Validate()`.

API stability
`mtglib.ProxyOpts.DomainFrontingIP` is still a plain `string`. The dial path already passes it to `net.JoinHostPort` + `DialContext`, both of which accept hostnames. Only the field's doc comment was clarified: no signature change, no new public symbols in `mtglib`.

`internal/cli/run_proxy.go`, `simple_run.go` and `doctor.go` all continue to work via the existing `Config.GetDomainFrontingIP()` helper, which now prefers `Host` over `IP` over the legacy flat `domain-fronting-ip`.

What's added
- `internal/config/type_host.go`: `TypeHost` (accepts hostname or IP, rejects empty, `host:port`, URLs, whitespace).
- `internal/config/type_host_test.go`: Get / UnmarshalOk / UnmarshalFail.
- `internal/config/config_test.go`: `TestDomainFrontingHostOrIP` covering precedence and mutual-exclusion.
- `[domain-fronting]` section of `example.config.toml` updated to document `host` alongside `ip`.

Motivation

Direct follow-up to a review thread on #478, which currently works around the literal-IP restriction with a pinned docker subnet. Once this lands, that PR collapses to ~3 lines (`host = "web"`) and its IPv6 PROXY-protocol-family caveat goes away.
Test plan
- `go build ./...`
- `go vet ./...`
- `go test ./...`: full suite green
- `TypeHost` tests pass (5 accept, 5 reject)
- `TestDomainFrontingHostOrIP` covers `Validate()` rejection of both-set and Host-wins precedence