fix: nat lost in some p2p apps #2216

Merged
KKRainbow merged 1 commit into EasyTier:main from 21paradox:develop on May 9, 2026
Conversation

@21paradox (Contributor)

Reuse the connection keyed by dst_peer_id, so that each peer uses only one QUIC connection, to fix the NAT-loss problem.

I ran into the NAT-loss problem. The scenario is transparent proxying: all traffic goes through a tun device (built into gost) to the remote end over gost's own relay protocol (TCP-based). P2P applications (erigon) then end up with 0 caplin peers.

After debugging, handling all connections (open_bi) over a single QUIC connection fixes the problem; that is roughly what this PR does.

If a scenario involves a very high number of connections, you can locally change the max_concurrent_bidi_streams setting in easytier/src/tunnel/quic.rs to 2000 (default 256).
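For orientation, that knob lives on quinn's TransportConfig; below is a minimal sketch assuming quinn's builder API (the helper name and wiring are hypothetical, not EasyTier's actual code in easytier/src/tunnel/quic.rs):

```rust
use std::sync::Arc;

// Hypothetical helper: raise the bidirectional-stream limit on a quinn
// server config. EasyTier's real setup may wire this differently.
fn with_more_bidi_streams(mut cfg: quinn::ServerConfig) -> quinn::ServerConfig {
    let mut transport = quinn::TransportConfig::default();
    // The default discussed above is 256; 2000 for high-connection scenarios.
    transport.max_concurrent_bidi_streams(2000u32.into());
    cfg.transport_config(Arc::new(transport));
    cfg
}
```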

@KKRainbow requested a review from ZnqbuZ on May 6, 2026 16:47
@KKRainbow (Member)

What does "losing NAT" mean?

@KKRainbow (Member)

For some reason I can't leave review comments; this PR has a few serious bugs I wasn't able to comment on.
Also, I think it's still worth investigating why the current approach breaks; simply switching to connection reuse feels like it only hides the problem.

@ZnqbuZ (Contributor) commented May 6, 2026

From a pure QUIC point of view, reusing an existing connection is indeed the right thing to do; I wasn't familiar enough with QUIC when I originally wrote this.

@ZnqbuZ (Contributor) commented May 6, 2026

As for transport_config, the QUIC tunnel and the QUIC proxy probably need different parameters; this still needs testing.

@21paradox (Contributor, Author)

> What does "losing NAT" mean?

Traffic inside the container is forwarded through gost (its relay protocol wraps traffic in TCP, similar to vless); easytier just acts as the network relay and only handles TCP requests. Some P2P-dependent applications (e.g. Ethereum nodes) then can't find peers.
The problem shows up with enable_quic_proxy or enable_kcp_proxy enabled; with the default UDP it doesn't occur, but default UDP only reaches 100-200 KB/s.

@21paradox force-pushed the develop branch 4 times, most recently from 8dee0fc to b16fec3 on May 8, 2026 00:51
@21paradox force-pushed the develop branch 4 times, most recently from 73c1356 to a8ab9ab on May 8, 2026 06:46
@ZnqbuZ (Contributor) commented May 8, 2026

Also, does the problem still occur with only enable_kcp_proxy enabled and enable_quic_proxy disabled?

@21paradox (Contributor, Author)

> Also, does the problem still occur with only enable_kcp_proxy enabled and enable_quic_proxy disabled?

I'm no longer sure whether KCP has the problem; I need to observe it some more.

@21paradox force-pushed the develop branch 2 times, most recently from 70fee73 to bf8e376 on May 9, 2026 00:36
@KKRainbow requested a review from ZnqbuZ on May 9, 2026 01:57
Copilot AI (Contributor) left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR changes the QUIC proxy behavior to reuse a single QUIC connection per destination peer (keyed by dst_peer_id) to mitigate NAT-loss issues observed in some P2P app scenarios over a transparent proxy/tun setup.

Changes:

  • Introduce a per-peer connection cache (moka::future::Cache) to reuse quinn::Connection by PeerId.
  • Replace multi-attempt concurrent connect logic with a simpler retry loop that reuses/invalidates cached connections.
  • Adjust stream receive task handling to await transfer completion and log transfer errors.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| easytier/src/gateway/quic_proxy.rs | Adds per-peer QUIC connection caching + retry/invalidate logic; updates stream task execution to await transfer result. |
| easytier/Cargo.toml | Adds moka dependency (future cache) to support connection reuse. |


Comment on lines +852 to +854:

```rust
let conn_map = Cache::builder()
    .max_capacity(u8::MAX.into()) // same as max_concurrent_bidi_streams, can be increased
    .time_to_idle(Duration::from_secs(600))
```

Review anchors in the connect/retry path:

```rust
let mut connect_tasks = JoinSet::<Result<QuicStream, Error>>::new();
let connect = |tasks: &mut JoinSet<_>| {
    for attempt in 0..2 {
```

```rust
        if attempt == 0 {
            self.conn_map.invalidate(&dst_peer_id).await;
            tokio::time::sleep(Duration::from_millis(300)).await;
        }
    }
```

```rust
    Err(anyhow!("quic connect: failed to establish stream after retry").into())
```

```rust
        .await;

    match stream {
        Ok(stream) => return Ok(stream),
```
Member

If a stream on a connection is still alive but no new connection is ever created, won't the conn also be evicted once it exceeds the TTL?

@21paradox (Contributor, Author) May 9, 2026

Re-initialization is triggered by the next try_get_with: only once the TTL has been exceeded will a new connection be initialized. Or, after a manual invalidate, the next try_get_with triggers a fresh init.
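The lazy semantics described here can be sketched with a stdlib stand-in for moka's time_to_idle + try_get_with (LazyCache and its fields are hypothetical; moka's real API is async and concurrent):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Stand-in for moka's time_to_idle + try_get_with semantics: an entry is
// (re)initialized only when it is requested and found missing or idle past
// the TTI; nothing runs in the background.
struct LazyCache<V> {
    tti: Duration,
    map: HashMap<u32, (Instant, V)>,
    inits: usize, // how many times an init closure actually ran
}

impl<V> LazyCache<V> {
    fn new(tti: Duration) -> Self {
        Self { tti, map: HashMap::new(), inits: 0 }
    }

    fn try_get_with(&mut self, key: u32, now: Instant, init: impl FnOnce() -> V) -> &V {
        let stale = match self.map.get(&key) {
            Some((last_used, _)) => now.duration_since(*last_used) > self.tti,
            None => true,
        };
        if stale {
            self.inits += 1;
            self.map.insert(key, (now, init()));
        } else {
            // an access within the TTI refreshes the idle timer
            self.map.get_mut(&key).unwrap().0 = now;
        }
        &self.map.get(&key).unwrap().1
    }
}

fn main() {
    let mut cache = LazyCache::new(Duration::from_secs(600));
    let t0 = Instant::now();
    cache.try_get_with(7, t0, || "conn-a");
    // a second access within the TTI reuses the entry: no new init
    cache.try_get_with(7, t0 + Duration::from_secs(10), || "conn-b");
    assert_eq!(cache.inits, 1);
    // idle past the TTI: the next access re-initializes
    cache.try_get_with(7, t0 + Duration::from_secs(1200), || "conn-c");
    assert_eq!(cache.inits, 2);
}
```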

Member

What I mean is: if there is a live long-running connection between two nodes and no new connection is ever established, will that long-running connection be wiped by the cache TTL?

@21paradox (Contributor, Author) May 9, 2026

Eviction is lazily triggered; moka does not clean up proactively (there is no background async task). Cleanup happens on the next connect fn call, and then the old quinn::Connection is replaced by a new one. But active SendStream/RecvStream instances hold an internal reference to the connection, so the QUIC connection keeps living.
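The "streams keep the connection alive" point can be illustrated with a stdlib sketch (Conn here is a hypothetical stand-in for quinn::Connection, which is likewise a cheaply clonable handle to shared state):

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-in for quinn::Connection: the cache and every open
// stream each hold a clone; the underlying state is freed only when the
// last clone is dropped.
#[derive(Clone)]
struct Conn(Arc<String>);

fn demo() -> (usize, String) {
    let mut cache: HashMap<u32, Conn> = HashMap::new();
    cache.insert(1, Conn(Arc::new("peer-1".into())));

    // An open stream clones the connection handle out of the cache.
    let stream_conn = cache.get(&1).unwrap().clone();

    // Cache eviction drops only the cache's clone ...
    cache.remove(&1);

    // ... the stream's clone keeps the connection alive.
    (Arc::strong_count(&stream_conn.0), stream_conn.0.to_string())
}

fn main() {
    let (live_handles, peer) = demo();
    assert_eq!(live_handles, 1);
    println!("connection to {} still alive", peer);
}
```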

Member

If cleanup is lazy, then after a burst of connections with no new connections afterwards, is that memory occupied forever?

@21paradox (Contributor, Author)

Later code adds a background task that calls run_pending_tasks every 60 s to clean up.

Member

After that cleanup, does the earlier problem still exist: long-running connections being wrongly cleaned up?

Contributor

Why is manual cleanup needed? Just let the Cache clean up on its own by TTI; otherwise what is the Cache for?

Streams hold a reference to the Connection, so it doesn't matter if the Connection in the Cache is released; that's how I did it before. If you're not comfortable with that, add a unit test, and if there really is a problem, just Clone the Connection handle and tie it to the stream's lifetime.

@21paradox (Contributor, Author)

I'll verify locally what happens to a single connection after cleanup.

@21paradox (Contributor, Author) May 9, 2026

> After that cleanup, does the earlier problem still exist: long-running connections being wrongly cleaned up?

Tested it: long-running connections are not wrongly cleaned up; they stay up (even after the conn is evicted).

Repro steps (local tweaks for testing):

```rust
.try_get_with(dst_peer_id, async move {
    debug!("quic connect begin {}", dst_peer_id); // added log

.time_to_idle(Duration::from_secs(30)) // lowered

let mut interval = tokio::time::interval(Duration::from_secs(10)); // lowered
loop {
    interval.tick().await;
    debug!("quic conn_map_bg run_pending_tasks"); // added log
    conn_map_bg.run_pending_tasks().await;
}
```

Then ssh to the remote (termux dropbear ssh):

```shell
ssh -p 8022 -t u0_a94@10.126.126.8 "while true; do echo $(date) keepalive; sleep 1; done"
```

Filtered log output:

```
May 09 19:35:21 nixos12700 easytier-core[267418]: 2026-05-09T19:35:21.056038686+08:00 DEBUG easytier::gateway::quic_proxy: quic connect begin 790591911
May 09 19:35:29 nixos12700 easytier-core[267418]: 2026-05-09T19:35:29.538870845+08:00 DEBUG easytier::gateway::quic_proxy: quic conn_map_bg run_pending_tasks
May 09 19:35:39 nixos12700 easytier-core[267418]: 2026-05-09T19:35:39.539525309+08:00 DEBUG easytier::gateway::quic_proxy: quic conn_map_bg run_pending_tasks
```

Then open a second ssh session:

```shell
ssh -p 8022 -t u0_a94@10.126.126.8 "while true; do echo $(date) keepalive; sleep 1; done"
```

The log then shows

```
May 09 19:37:01 nixos12700 easytier-core[267418]: 2026-05-09T19:37:01.316130293+08:00 DEBUG easytier::gateway::quic_proxy: quic connect begin 790591911
```

meaning the conn was re-created.

After waiting a while, both ssh sessions keep printing timestamps, and easytier-cli proxy shows 2 Connected entries; closing one ssh turns one entry into Closed, and it disappears after a while.

@KKRainbow KKRainbow merged commit bfbfa2e into EasyTier:main May 9, 2026
45 checks passed

4 participants