doctor: deepen DC verification with MTProto handshake probe #496

Draft
dolonet wants to merge 3 commits into 9seconds:master from dolonet:doctor/rpc-probe

dolonet commented May 5, 2026

Closes #494.

Stacked on top of #495 — keeping this as a draft until #495 merges, then I'll rebase to drop the obfuscation rename commit and leave only the two commits unique to this work. The diff currently shows 3 commits because of that.

What this does

After a successful TCP connect, mtg doctor performs an unauthenticated req_pq_multi → resPQ exchange against each DC. This proves the peer can speak MTProto, not just accept connections on port 443.

  • mtglib/dcprobe — new leaf package, ~170 LOC, no new dependencies. Reuses the now-public mtglib/obfuscation (mtglib: promote obfuscation out of internal #495). Single round-trip, two TL messages, hand-rolled (de)serialization to avoid pulling in gotd/td or similar.
  • Doctor.checkNetworkAddresses — the dial loop now runs the probe and surfaces rpc <rtt> next to the connect line.
  • Failure messages distinguish tcp connect to <addr>: … from rpc handshake to <addr>: … via wrapped errors. errors.Is(err, dcprobe.ErrNotTelegram) lets callers tell the cases apart.

The probe is enabled by default — an opt-in --deep flag would defeat the purpose, since the existing TCP-only check is what motivated the issue. Open to a flag if you'd rather have a soft rollout.

Output sample

Verified locally on a Hetzner Helsinki host with a real config:

Validate native network connectivity
  ✅ DC 1 (rpc 144ms)
  ✅ DC 2 (rpc 42ms)
  ✅ DC 3 (rpc 130ms)
  ✅ DC 4 (rpc 27ms)
  ✅ DC 5 (rpc 185ms)

Tests

  • mtglib/dcprobe/probe_test.go ships two tests:
    • TestProbeRejectsMisbehavingPeer — deterministic, in-process, no network. Spins up a localhost listener that accepts the obfs2 handshake then writes garbage; asserts errors.Is(err, dcprobe.ErrNotTelegram).
    • TestProbeAgainstTelegramDCs — opt-in via MTG_PROBE_NETWORK=1, hits all 5 main DCs + 2 IPv6 DCs. Skipped by default to keep CI hermetic; matches the existing pattern for the obfuscation fuzz tests.
  • go build ./..., go vet ./..., go test ./... all clean.

Notes

  • I chose to have the "TCP ok but RPC failed" case return the same red ❌ as a TCP failure; the wrapped error message tells you which one it was. Happy to add a separate ⚠️ template if you'd like a softer signal there.
  • DC numbers are now passed into checkNetworkAddresses so the probe knows which DC to bake into the obfs2 handshake frame; previously the function only needed the address list.

dolonet added 3 commits May 5, 2026 23:15
Per discussion on 9seconds#494, this allows external packages (e.g. the upcoming
mtglib/dcprobe for the doctor RPC probe) to reuse the obfuscated2
transport without an internal wrapper.

No public-API change beyond the import path. The only exported names
(Obfuscator, its two methods, and the Secret field) were already
exported within the package.

New leaf package that performs the first step of the MTProto handshake
(req_pq_multi -> resPQ) over the existing obfuscated2 transport. No
auth_key is generated; no long-lived state is introduced. Two TL
messages, one round-trip, no new dependencies.

A generic listener cannot fake the reply because it must echo back our
random nonce in resPQ.

Used by the doctor command in a follow-up commit to distinguish a real
Telegram DC from a generic TCP listener bound to port 443.

Closes 9seconds#494.

After a successful TCP connect, run an unauthenticated req_pq_multi ->
resPQ exchange via mtglib/dcprobe. This rejects generic listeners that
happen to bind 443 but cannot speak MTProto.

Output now shows "(rpc <rtt>)" on success; on failure the wrapped error
distinguishes "tcp connect to ...: ..." from "rpc handshake to ...: ...".
The probe runs by default — an opt-in flag would defeat the purpose,
since the existing TCP-only check is what motivated the issue.

dolonet force-pushed the doctor/rpc-probe branch from dee4137 to 408e2c7 May 5, 2026 23:17
Development

Successfully merging this pull request may close these issues.

doctor: deepen DC verification beyond TCP connect