---
title: "BanD Width! It's MyBOND!!"
tagline: "Can bonding bring double upload?"
tags: linux networking
redirect_from: /p/78
---

My workstation in my university lab has consistently generated some 600 TiB of annual upload across two PT sites. After a room upgrade a few months back, it received the upgrade I had wanted most for years: a second 1 Gbps line to the campus network. I immediately bonded the two NICs into one 2 Gbps interface, and it has shone many times while seeding popular torrents.

Inspired by my friend [TheRainstorm](https://blog.yfycloud.site/), who managed to double his upload by load-balancing WireGuard over two lines from the same ISP (China Mobile), I figured I'd take this chance to build a better understanding through some more detailed experiments.

## Setup

### Sender

My workstation:

- Ubuntu 24.04 (kernel 6.8)
- Two Intel I210 NICs, connected to the same campus-network switch
- iperf 3.16

Controlled variables:

- Bond mode: round-robin (`balance-rr`, 0) vs transmit load balancing (`balance-tlb`, 5)
  - For `balance-tlb` mode, `xmit_hash_policy` is set to `layer3+4`.
- TCP congestion control (CC) algorithm: CUBIC vs BBR
- Parallelism (the value of iperf3's `-P` option): 1 vs 4
  - Note that due to the way `balance-tlb` works, bonding with only one connection performs no differently from a single NIC.

<!--
- Whether another application is constantly generating background upload activity.
  - This is implemented by letting qBittorrent run in the background with upload rate limited to 1 MB/s, and is labeled `qB` during the experiment.
-->

These variables combine into 8 different scenarios, few enough that each dimension can be examined in isolation.
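
For reference, switching between the two bond modes can be done with plain iproute2. This is a minimal sketch, not my exact setup: `eth0`/`eth1` are assumed interface names, it must run as root, and it assumes no netplan/NetworkManager profile is fighting over the NICs. Note the mode can only be chosen at bond creation time.

```shell
# Create the bond in round-robin mode; eth0/eth1 are assumed names.
modprobe bonding
ip link add bond0 type bond mode balance-rr
ip link set eth0 down; ip link set eth0 master bond0
ip link set eth1 down; ip link set eth1 master bond0
ip link set bond0 up

# For the TLB scenarios, the bond is instead created with:
#   ip link add bond0 type bond mode balance-tlb xmit_hash_policy layer3+4

# Verify the active mode and slave status:
cat /proc/net/bonding/bond0
```

To switch modes between runs, delete the bond (`ip link del bond0`) and re-create it, since the bonding driver refuses to change mode while slaves are attached.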

### Receivers

I sourced three destination (receiver) hosts with > 2 Gbps download bandwidth, labeled as follows:

- **A**: One of our usually idle lab servers.
- **B**: [USTC Mirrors](https://mirrors.ustc.edu.cn/) server.
- **C**: Friend-sponsored home broadband in Shanghai.

Typical traits of these destinations are:

| Destination | Download BW | Latency | BDP\* | Other notes |
| :--: | :--: | :--: | :--: | :--- |
| A | 10 Gbps | 250 ± 30 µs | 62.5 KB | Mostly idle |
| B | 10 Gbps | 300 ± 200 µs | 75 KB | Under constant load |
| C | ~2.2 Gbps | 28 ± 0.2 ms | 7 MB | Mostly idle |

\* Because my workstation can only generate upload at a theoretical maximum of 2 Gbps, BDP is calculated at that speed rather than at each receiver's download bandwidth.
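
The BDP figures are just bandwidth × RTT, divided by 8 to get bytes; a quick awk sketch to reproduce them:

```shell
# BDP in bytes = bandwidth (bits/s) * RTT (s) / 8, at the 2 Gbps sender cap.
bdp_bytes() {
  awk -v rate="$1" -v rtt="$2" 'BEGIN { printf "%d\n", rate * rtt / 8 }'
}
bdp_bytes 2000000000 0.000250   # destination A: 62500 bytes (~62.5 KB)
bdp_bytes 2000000000 0.028      # destination C: 7000000 bytes (7 MB)
```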

## Analyses

iperf3 reports three indicators for each test: transmission bitrate, number of retransmissions, and congestion window (Cwnd) size.
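
Concretely, each scenario comes down to a single sender-side invocation; a sketch, where `receiver` is a placeholder for host A, B, or C:

```shell
# One scenario: 4 parallel BBR streams for 30 s. Bitrate, Retr, and Cwnd
# all appear in the per-interval lines of the sender-side output.
iperf3 -c receiver -C bbr -P 4 -t 30
```

`-C` (`--congestion`) selects the TCP CC algorithm on Linux, provided the module (e.g. `tcp_bbr`) is loaded and listed in `net.ipv4.tcp_allowed_congestion_control`.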

The bond mode is naturally the variable that matters most when testing bonding performance, so let's get straight to it.

### Single stream

| Dest | P | CC | Bitrate (RR) | Bitrate (TLB) | Retr (RR) | Retr (TLB) | Cwnd (RR) | Cwnd (TLB) |
| :--: | :--: | :--: | ---: | ---: | ---: | ---: | ---: | ---: |
| A | 1 | BBR | 1.78 Gbps | 940 Mbps | 20079 | 20 | 331 KB | 233 KB |
| A | 1 | CUBIC | 1.19 Gbps | 936 Mbps | 7258 | 110 | 103 KB | 385 KB |
| B | 1 | BBR | 1.62 Gbps | 944 Mbps | 37018 | 51 | 343 KB | 241 KB |
| B | 1 | CUBIC | 1.19 Gbps | 941 Mbps | 6914 | 72 | 98 KB | 338 KB |
| C | 1 | BBR | 1.11 Gbps | 935 Mbps | 0 | 0 | 8.87 MB | 7.67 MB |
| C | 1 | CUBIC | 1.16 Gbps | 931 Mbps | 0 | 0 | 6.44 MB | 4.20 MB |

We first look at `balance-tlb` mode.
As expected, with a single stream it runs on only one slave interface (confirmed by watching `bmon -p eth0,eth1,bond0` during execution).
And since the rest of the route can carry nearly 1.8 Gbps, the single 1 Gbps NIC is the bottleneck, so it's no surprise that the choice of congestion control algorithm makes no difference to single-stream TLB performance.

We then note the huge difference between the BBR and CUBIC algorithms under RR.
Because destinations A and B both have a very low BDP, the slight latency differences between the two NICs cause packets striped round-robin across them to arrive out of order, and TCP mistakes that reordering for loss. This shows clearly in the retransmission counts and Cwnd sizes: in TLB mode, with only one active NIC, retransmissions stay low, but in RR mode they skyrocket, and loss-based CUBIC backs off on these spurious retransmissions far harder than BBR does.

For destination C, however, it's an entirely different story (zero retransmissions) with its own reason:
the large BDP lets the sender raise its Cwnd to nearly 9 MB, but the receiver advertises a window of only 4 MiB, so the transfer is limited by the receiver, not by congestion or link load.
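
As a sanity check on the receiver-limited claim, the throughput ceiling imposed by a fixed receive window is rwnd / RTT:

```shell
# Max throughput under a fixed receive window: rwnd / RTT.
# A 4 MiB window over a 28 ms path caps the stream at about 1.2 Gbps,
# right where the measured ~1.1 Gbps bitrates for destination C sit.
awk 'BEGIN { printf "%.2f Gbps\n", 4 * 1048576 * 8 / 0.028 / 1e9 }'
```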

### Parallel streams

With 4 parallel streams, the results look much better:

| Dest | P | CC | Bitrate (RR) | Bitrate (TLB) | Retr (RR) | Retr (TLB) | Cwnd (RR) | Cwnd (TLB) |
| :--: | :--: | :--: | ---: | ---: | ---: | ---: | ---: | ---: |
| A | 4 | BBR | 1.79 Gbps | 1.77 Gbps | 41357 | 18 | 222 KB | 178 KB |
| A | 4 | CUBIC | 1.80 Gbps | 1.77 Gbps | 13254 | 82 | 48 KB | 306 KB |
| B | 4 | BBR | 1.79 Gbps | 1.72 Gbps | 61810 | 9368 | 237 KB | 236 KB |
| B | 4 | CUBIC | 1.77 Gbps | 1.74 Gbps | 15549 | 4243 | 31 KB | 174 KB |
| C | 4 | BBR | 1.70 Gbps | 1.82 Gbps | 185 | 0 | 4.04 MB | 3.84 MB |
| C | 4 | CUBIC | 1.74 Gbps | 1.82 Gbps | 20 | 17 | 2.91 MB | 2.89 MB |

The bitrate gap between RR and TLB no longer exists. RR still pays a visible retransmission penalty for reordering, but within each bond mode the remaining differences in retransmissions and Cwnd size come down to the CC algorithms themselves.

## Conclusion
