reliable Integration Plan

1. Source Analysis - reliable Library

1.1 What is reliable?

reliable is a lightweight packet acknowledgement system for UDP-based protocols by Glenn Fiedler (Mas Bandwidth LLC, BSD-3-Clause). Reference implementation: C, ~1500 lines of logic in reliable.c.

Core features:

  1. Packet acknowledgement - each outgoing packet carries a compressed ack field; the sender learns which packets the remote side received.
  2. Packet fragmentation and reassembly - packets above fragment_above bytes are split into fragment_size-byte fragments and reassembled on the remote side transparently.
  3. RTT, packet-loss, and bandwidth estimates - computed from the sent/received sliding windows.

reliable is NOT a reliability layer (no retransmission). It only provides acknowledgements so the application layer can decide whether to retransmit application-level data.


1.2 C Module Map

| Component | Lines (approx) | Responsibility |
| --- | --- | --- |
| Sequence arithmetic | ~20 | Wrapping uint16 greater-than / less-than comparison |
| reliable_sequence_buffer_t | ~200 | Sliding window buffer: uint16 key -> fixed-stride entry |
| Byte I/O primitives | ~100 | Little-endian read/write uint8/16/32/64 |
| Packet header codec | ~150 | Variable-length compressed header (4-9 bytes) |
| Fragment header codec | ~90 | 5-byte fragment header + embedded packet header in frag-0 |
| Fragment reassembly | ~80 | Per-packet reassembly state in sequence buffer |
| reliable_endpoint_t | ~500 | Main endpoint: send, receive, update, acks, telemetry |
| Tests | ~900 | Inline unit and end-to-end tests |

1.3 Key Data Structures

reliable_sequence_buffer_t (internal)

sequence       : uint16  - next expected sequence (sliding window head)
num_entries    : int     - capacity (does NOT need to be power-of-two in C; but we MUST use
                           power-of-two in Java for index = seq & (capacity - 1))
entry_stride   : int     - size in bytes of each fixed-size entry slot
entry_sequence : uint32[]- slot occupancy: 0xFFFFFFFF = empty, else = sequence number stored
entry_data     : uint8[] - flat slab of (num_entries * entry_stride) bytes

Java mapping: SequenceBuffer backed by int[] (entry_sequence) and an Agrona UnsafeBuffer (entry_data slab). Zero allocation in steady state.

reliable_sent_packet_data_t (internal, stored in sent_packets sequence buffer)

time         : double   - time the packet was sent
acked        : 1 bit    - whether the remote side has acked this packet
packet_bytes : 31 bits  - total bytes (including UDP+IP header) for bandwidth estimate

Java mapping: two parallel flat arrays - double[] sentPacketTime and int[] sentPacketInfo where bit-0 = acked, bits 1-31 = packet_bytes. Eliminates object allocation per slot.

reliable_received_packet_data_t (internal, stored in received_packets sequence buffer)

time         : double   - time the packet was received
packet_bytes : uint32   - total bytes for bandwidth estimate

Java mapping: double[] receivedPacketTime + int[] receivedPacketBytes.

reliable_fragment_reassembly_data_t (internal, stored in fragment_reassembly sequence buffer)

sequence              : uint16   - packet sequence number being reassembled
ack                   : uint16   - ack piggybacked from fragment-0 header
ack_bits              : uint32   - ack bits piggybacked from fragment-0 header
num_fragments_received: int      - how many fragments received so far
num_fragments_total   : int      - expected total (from first fragment received)
packet_data           : uint8*   - heap-allocated buffer for reassembled payload
packet_bytes          : int      - reassembled payload size (known when last fragment arrives)
packet_header_bytes   : int      - how many bytes the compact packet header occupies
fragment_received     : uint8[256] - per-fragment flags (one byte each): 1 = this fragment_id received

Java mapping: the fragment_reassembly buffer holds fixed-stride slots (all fields except packet_data). packet_data is a pre-allocated UnsafeBuffer[] pool, one buffer per reassembly slot, sized at startup: RELIABLE_MAX_PACKET_HEADER_BYTES + max_fragments * fragment_size.

reliable_config_t (public configuration)

name[256]                     : string label (debug only)
context                       : void* passed to callbacks
id                            : uint64 - endpoint identity passed to callbacks
max_packet_size               : int    default 16384
fragment_above                : int    default 1024
max_fragments                 : int    default 16  (max 256)
fragment_size                 : int    default 1024
ack_buffer_size               : int    default 256
sent_packets_buffer_size      : int    default 256
received_packets_buffer_size  : int    default 256
fragment_reassembly_buffer_size: int   default 64
rtt_smoothing_factor          : float  default 0.0025
packet_loss_smoothing_factor  : float  default 0.1
bandwidth_smoothing_factor    : float  default 0.1
packet_header_size            : int    default 28 (IPv4+UDP header bytes for bw accounting)
transmit_packet_function      : (context, id, sequence, data, bytes) -> void
process_packet_function       : (context, id, sequence, data, bytes) -> int (1=ack, 0=drop)
allocate_function / free_function

reliable_endpoint_t (public opaque type)

config               : reliable_config_t (copy)
time                 : double
rtt                  : float   (ms, exponential moving average)
packet_loss          : float   (%, EMA over sent_packets_buffer_size/2 samples)
sent_bandwidth_kbps  : float   (EMA)
received_bandwidth_kbps: float (EMA)
acked_bandwidth_kbps : float   (EMA)
num_acks             : int
acks                 : uint16[] (pre-allocated, size = ack_buffer_size)
sequence             : uint16   (next outgoing sequence number)
sent_packets         : reliable_sequence_buffer_t*
received_packets     : reliable_sequence_buffer_t*
fragment_reassembly  : reliable_sequence_buffer_t*
counters             : uint64[10]

1.4 Wire Format

Normal packet header (4 to 9 bytes, variable)

[prefix_byte]    uint8
  bit 0   = 0  (distinguishes from fragment packet where bit 0 = 1)
  bit 1   = 1 if ack_bits[7:0] != 0xFF (byte 0 of ack_bits is present)
  bit 2   = 1 if ack_bits[15:8] != 0xFF
  bit 3   = 1 if ack_bits[23:16] != 0xFF
  bit 4   = 1 if ack_bits[31:24] != 0xFF
  bit 5   = 1 if (sequence - ack) fits in uint8 (ack stored as 1-byte difference)
[sequence]       uint16 little-endian
[ack]            uint8 (difference) OR uint16 little-endian
[ack_bits_byte0] uint8 (omitted if bits[7:0] == 0xFF)
[ack_bits_byte1] uint8 (omitted if bits[15:8] == 0xFF)
[ack_bits_byte2] uint8 (omitted if bits[23:16] == 0xFF)
[ack_bits_byte3] uint8 (omitted if bits[31:24] == 0xFF)

Best case (no packet loss, ack close to sequence): 1 + 2 + 1 = 4 bytes. Worst case (max loss, far apart): 1 + 2 + 2 + 4 = 9 bytes. Constant RELIABLE_MAX_PACKET_HEADER_BYTES = 9.
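The layout above can be sketched as a small codec. This is a minimal illustration, not the planned PacketHeaderCodec: it uses a plain byte[] in place of an Agrona buffer, and the class name PacketHeaderSketch and the int[] out convention (sequence, ack, ackBits) are assumptions made for the example.

```java
final class PacketHeaderSketch {
    static final int MAX_PACKET_HEADER_BYTES = 9;

    private PacketHeaderSketch() {}

    /** Writes the compressed header at buf[offset..]; returns bytes written (4 to 9). */
    static int writeHeader(byte[] buf, int offset, int sequence, int ack, int ackBits) {
        sequence &= 0xFFFF;
        ack &= 0xFFFF;
        int prefix = 0;                               // bit 0 stays 0: normal packet
        for (int i = 0; i < 4; i++) {
            if (((ackBits >>> (8 * i)) & 0xFF) != 0xFF) {
                prefix |= 1 << (i + 1);               // bits 1..4: ack_bits byte i is present
            }
        }
        int diff = (sequence - ack) & 0xFFFF;         // wrapping uint16 difference
        if (diff <= 255) {
            prefix |= 1 << 5;                         // bit 5: ack stored as 1-byte difference
        }
        int p = offset;
        buf[p++] = (byte) prefix;
        buf[p++] = (byte) sequence;                   // little-endian uint16
        buf[p++] = (byte) (sequence >>> 8);
        if (diff <= 255) {
            buf[p++] = (byte) diff;
        } else {
            buf[p++] = (byte) ack;
            buf[p++] = (byte) (ack >>> 8);
        }
        for (int i = 0; i < 4; i++) {
            int b = (ackBits >>> (8 * i)) & 0xFF;
            if (b != 0xFF) {
                buf[p++] = (byte) b;                  // all-ones bytes are omitted
            }
        }
        return p - offset;
    }

    /** Fills out[0]=sequence, out[1]=ack, out[2]=ackBits; returns bytes consumed, or -1 if malformed. */
    static int readHeader(byte[] buf, int offset, int length, int[] out) {
        if (length < 4) {
            return -1;                                // shorter than the smallest header
        }
        int p = offset;
        int end = offset + length;
        int prefix = buf[p++] & 0xFF;
        if ((prefix & 1) != 0) {
            return -1;                                // bit 0 set: this is a fragment packet
        }
        int sequence = (buf[p++] & 0xFF) | ((buf[p++] & 0xFF) << 8);
        int ack;
        if ((prefix & (1 << 5)) != 0) {
            ack = (sequence - (buf[p++] & 0xFF)) & 0xFFFF;
        } else {
            if (end - p < 2) {
                return -1;
            }
            ack = (buf[p++] & 0xFF) | ((buf[p++] & 0xFF) << 8);
        }
        int ackBits = 0xFFFFFFFF;                     // omitted bytes default to 0xFF
        for (int i = 0; i < 4; i++) {
            if ((prefix & (1 << (i + 1))) != 0) {
                if (p >= end) {
                    return -1;
                }
                ackBits &= ~(0xFF << (8 * i));
                ackBits |= (buf[p++] & 0xFF) << (8 * i);
            }
        }
        out[0] = sequence;
        out[1] = ack;
        out[2] = ackBits;
        return p - offset;
    }
}
```

With sequence 100, ack 99 and all ack bits set, writeHeader emits the 4-byte best case; with sequence 1000, ack 0 and ack bits 0 it emits the 9-byte worst case.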

Fragment packet header (5 bytes fixed, frag-0 also carries packet header)

[prefix_byte]   uint8 = 1  (bit 0 = 1 indicates fragment)
[sequence]      uint16 little-endian
[fragment_id]   uint8  (0-based)
[num_fragments-1] uint8 (max 255 -> max 256 fragments, but config caps at max_fragments <= 256)

For fragment_id == 0, the normal packet header immediately follows (before payload data).

Constant RELIABLE_FRAGMENT_HEADER_BYTES = 5.
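A possible shape for the fragment header codec is sketched below. It assumes a plain byte[] rather than an Agrona buffer; the class name FragmentHeaderSketch, the numFragments helper, and the int[] out convention are hypothetical. The validation rules (prefix must equal 1, fragment_id below num_fragments, num_fragments within max_fragments) are assumptions about sensible checks and should be compared against reliable.c before porting.

```java
final class FragmentHeaderSketch {
    static final int FRAGMENT_HEADER_BYTES = 5;

    private FragmentHeaderSketch() {}

    /** Number of fragments needed for a packet (ceiling division). */
    static int numFragments(int packetBytes, int fragmentSize) {
        return (packetBytes + fragmentSize - 1) / fragmentSize;
    }

    /** Writes the fixed 5-byte fragment header; returns FRAGMENT_HEADER_BYTES. */
    static int writeFragmentHeader(byte[] buf, int offset, int sequence,
                                   int fragmentId, int numFragments) {
        buf[offset] = 1;                                // prefix byte: bit 0 = 1 marks a fragment
        buf[offset + 1] = (byte) sequence;              // little-endian uint16
        buf[offset + 2] = (byte) (sequence >>> 8);
        buf[offset + 3] = (byte) fragmentId;            // 0-based
        buf[offset + 4] = (byte) (numFragments - 1);    // stored minus one so 256 fits in a byte
        return FRAGMENT_HEADER_BYTES;
    }

    /**
     * Fills out[0]=sequence, out[1]=fragmentId, out[2]=numFragments.
     * Returns FRAGMENT_HEADER_BYTES, or -1 if the header is malformed.
     */
    static int readFragmentHeader(byte[] buf, int offset, int length,
                                  int maxFragments, int[] out) {
        if (length < FRAGMENT_HEADER_BYTES) {
            return -1;
        }
        if ((buf[offset] & 0xFF) != 1) {
            return -1;                                  // not a fragment packet
        }
        int sequence = (buf[offset + 1] & 0xFF) | ((buf[offset + 2] & 0xFF) << 8);
        int fragmentId = buf[offset + 3] & 0xFF;
        int numFragments = (buf[offset + 4] & 0xFF) + 1;
        if (numFragments > maxFragments || fragmentId >= numFragments) {
            return -1;
        }
        out[0] = sequence;
        out[1] = fragmentId;
        out[2] = numFragments;
        return FRAGMENT_HEADER_BYTES;
    }
}
```

With the default fragment_size of 1024, a 4096-byte packet yields numFragments(4096, 1024) == 4 and a 4097-byte packet yields 5.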


1.5 Sequence Arithmetic (Wrapping uint16)

// s1 > s2 with wrap-around (half-space rule)
s1 > s2 iff (s1 > s2 && s1-s2 <= 32768) || (s1 < s2 && s2-s1 > 32768)

Java: implemented as static boolean sequenceGreaterThan(int s1, int s2), with both arguments masked by & 0xFFFF because Java's short is signed (char is unsigned 16-bit but awkward for arithmetic).
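A minimal sketch of the half-space rule in Java follows. The class name SequenceMath and the sequenceLessThan helper are illustrative assumptions; only sequenceGreaterThan(int, int) is named in this plan.

```java
final class SequenceMath {
    private SequenceMath() {}

    /** True if s1 is newer than s2 under uint16 wrap-around (half-space rule). */
    static boolean sequenceGreaterThan(int s1, int s2) {
        s1 &= 0xFFFF;   // also normalizes sign-extended shorts such as (short) -1
        s2 &= 0xFFFF;
        return ((s1 > s2) && (s1 - s2 <= 32768))
            || ((s1 < s2) && (s2 - s1 > 32768));
    }

    /** True if s1 is older than s2 under uint16 wrap-around. */
    static boolean sequenceLessThan(int s1, int s2) {
        return sequenceGreaterThan(s2, s1);
    }
}
```

For example, sequenceGreaterThan(0, 65535) is true because 0 follows 65535 after wrap-around, while sequenceGreaterThan(65535, 0) is false.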


1.6 Ack Bit Generation

ack      = received_packets.sequence - 1     (most recently received sequence)
ack_bits = 32-bit bitmask where bit i (0=LSB) is set if (ack - i) is in received_packets

Covers the 32 most recent sequences (ack, ack-1, ..., ack-31); bit 0 corresponds to ack itself.
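The ack-bit scan can be sketched directly over the int[] occupancy array planned for SequenceBuffer. This assumes a power-of-two capacity and the 0xFFFFFFFF empty marker described in this plan; the class name AckBitsSketch and the nextSequence parameter (standing in for received_packets.sequence) are assumptions for the example.

```java
final class AckBitsSketch {
    private AckBitsSketch() {}

    /**
     * Builds the 32-bit ack mask for ack = nextSequence - 1.
     * entrySequence[i] holds the sequence stored in slot i, or 0xFFFFFFFF (-1) if empty.
     * The array length must be a power of two.
     */
    static int generateAckBits(int[] entrySequence, int nextSequence) {
        int mask = entrySequence.length - 1;
        int ack = (nextSequence - 1) & 0xFFFF;
        int ackBits = 0;
        for (int i = 0; i < 32; i++) {
            int seq = (ack - i) & 0xFFFF;            // wraps below zero to 65535
            if (entrySequence[seq & mask] == seq) {  // slot holds exactly this sequence
                ackBits |= 1 << i;                   // bit i covers ack - i
            }
        }
        return ackBits;
    }
}
```

If sequences 10, 9 and 7 have been received and the next expected sequence is 11, the result is 0b1011: bits 0, 1 and 3 are set and bit 2 (sequence 8) is clear.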


1.7 Telemetry Update Algorithm (called once per game tick)

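  • packet_loss: scan the oldest sent_packets_buffer_size/2 slots of the sent-packet window and count the entries that are still unacked. The ratio of unacked entries to samples, expressed as a percentage, is EMA-smoothed with packet_loss_smoothing_factor.
  • sent/received/acked bandwidth: scan the same windows, accumulate packet_bytes over the observed time range (earliest to latest packet time in the window), convert to kbps, and EMA-smooth with bandwidth_smoothing_factor.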
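As a rough illustration of the packet-loss step, the sketch below scans the older half of the sent-packet window and applies the EMA. It assumes the parallel-array layout from this plan (occupancy in entrySequence, acked flag in bit 0 of sentPacketInfo). The window base (nextSequence - capacity) and the 0.00001 convergence threshold are assumptions that should be checked against reliable.c, and PacketLossSketch is a hypothetical name.

```java
final class PacketLossSketch {
    private PacketLossSketch() {}

    /**
     * Returns the new smoothed packet loss in percent.
     * entrySequence: slot occupancy (stored sequence, or -1 if empty), power-of-two length.
     * sentPacketInfo: bit 0 = acked, per slot.
     */
    static float updatePacketLoss(float current, int[] entrySequence, int[] sentPacketInfo,
                                  int nextSequence, float smoothingFactor) {
        int capacity = entrySequence.length;
        int mask = capacity - 1;
        int numSamples = capacity / 2;
        int base = (nextSequence - capacity) & 0xFFFF;   // assumed oldest sequence in the window
        int numDropped = 0;
        for (int i = 0; i < numSamples; i++) {
            int seq = (base + i) & 0xFFFF;
            int index = seq & mask;
            boolean present = entrySequence[index] == seq;
            if (present && (sentPacketInfo[index] & 1) == 0) {
                numDropped++;                            // sent, old, and still unacked
            }
        }
        float sample = 100.0f * numDropped / numSamples;
        if (Math.abs(current - sample) > 0.00001f) {
            return current + (sample - current) * smoothingFactor;   // EMA step
        }
        return sample;                                   // already converged: snap to the sample
    }
}
```

With an 8-slot window holding sequences 0..7, of which only 0 and 1 are acked, the sample over the oldest four slots is 50 percent; starting from 0 with a smoothing factor of 0.1 the estimate moves to about 5 percent.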

1.8 Public API Summary

| Function | Java equivalent |
| --- | --- |
| reliable_endpoint_create(config, time) | new ReliableEndpoint(config, clock) |
| reliable_endpoint_destroy(ep) | ep.close() (AutoCloseable) |
| reliable_endpoint_send_packet(ep, data, bytes) | ep.sendPacket(buf, offset, length) |
| reliable_endpoint_receive_packet(ep, data, bytes) | ep.receivePacket(buf, offset, length) |
| reliable_endpoint_update(ep, time) | ep.update(timeSeconds) |
| reliable_endpoint_get_acks(ep, &n) | ep.getAcks() returns short[], ep.numAcks() |
| reliable_endpoint_clear_acks(ep) | ep.clearAcks() |
| reliable_endpoint_next_packet_sequence(ep) | ep.nextPacketSequence() |
| reliable_endpoint_rtt(ep) | ep.rtt() |
| reliable_endpoint_packet_loss(ep) | ep.packetLoss() |
| reliable_endpoint_bandwidth(ep, ...) | ep.sentBandwidthKbps(), etc. |
| reliable_endpoint_counters(ep) | ep.counters() |
| reliable_endpoint_reset(ep) | ep.reset() |
| reliable_endpoint_free_packet(ep, ptr) | not needed (GC / pool return) |

2. Integration with netcode-java

2.1 Role in the Stack

Application / Game Logic
        |        ^
        v        |
  ReliableEndpoint  (ack, fragment, telemetry)
        |        ^
        v        |
  NetcodeClient / NetcodeServer  (encrypt, auth, session)
        |        ^
        v        |
  UdpTransport  (raw UDP send/recv)

reliable sits above the transport and below the application. It is inserted on the CONNECTION_PAYLOAD path only. The other netcode packet types (REQUEST, CHALLENGE, RESPONSE, KEEP_ALIVE, DISCONNECT) bypass reliable entirely because they have their own sequencing or do not need acks.

2.2 Integration Points in Existing Code

| Existing class | Change |
| --- | --- |
| NetcodeClient | After transition to CONNECTED: create one ReliableEndpoint (client-side). Route outgoing payloads through endpoint.sendPacket(). In receivePacket(), for PAYLOAD packets forward raw bytes to endpoint.receivePacket(). Call endpoint.update(time) each doWork(). |
| NetcodeServer | Per client slot (0..maxClients-1): create one ReliableEndpoint. Same routing logic. |
| UdpTransport | Implement TransmitPacketHandler - called by reliable when it has a UDP datagram ready to send. Passes pre-framed bytes (reliable header + payload) directly to DatagramChannel.send(). |
| PacketQueue | Implement ProcessPacketHandler - called by reliable when a fully reassembled payload is ready to deliver to the application. Enqueues into the existing PacketQueue ring. |
| ClientAgent / ServerAgent | Call reliableEndpoint.update(time) once per doWork() tick. Process pending acks from reliableEndpoint.getAcks() and deliver to application. |

3. Java Package Layout

New package: net.ztrust.reliable

net.ztrust.reliable/
  ReliableConstants.java           - all integer constants (#define ports)
  ReliableConfig.java              - configuration value class (immutable after build)
  ReliableConfigBuilder.java       - builder for ReliableConfig
  TransmitPacketHandler.java       - @FunctionalInterface: transmit callback
  ProcessPacketHandler.java        - @FunctionalInterface: process/deliver callback
  SequenceBuffer.java              - sliding window buffer (package-private)
  PacketHeaderCodec.java           - encode/decode compressed packet header (package-private)
  FragmentCodec.java               - encode/decode fragment header (package-private)
  ReliableEndpoint.java            - main public class

New test packages:

src/test/java/.../reliable/
  SequenceBufferTest.java
  PacketHeaderCodecTest.java
  ReliableEndpointTest.java        - ack, fragment, packet-loss, bandwidth unit tests

src/jmh/java/.../reliable/
  ReliableEndpointBenchmark.java   - sendPacket + receivePacket round-trip throughput

4. Performance-Critical Design Decisions

4.1 SequenceBuffer - Zero-Allocation Design

The C reliable_sequence_buffer_t allocates its entry_sequence array and entry_data slab through the configured allocate_function (malloc by default). The Java port replaces this with:

  • int[] entrySequence - size = capacity, value 0xFFFFFFFF = empty.
  • UnsafeBuffer entryData - off-heap slab: capacity * entryStride bytes, pre-allocated at construction. Entry at index i starts at i * entryStride.
  • Index: seq & (capacity - 1) (capacity MUST be power-of-two).
  • No per-entry Object allocation ever. All reads/writes use absolute offset arithmetic on the UnsafeBuffer.

Note: the C reference does NOT require power-of-two capacity (it uses seq % num_entries). The Java port MANDATES power-of-two capacity for the & (capacity-1) index rule from the copilot instructions. Default configs already use 256, 64 which are powers of two.
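The design above might look roughly like the following. This is a sketch under stated assumptions, not the planned SequenceBuffer: a JDK direct ByteBuffer stands in for the Agrona UnsafeBuffer slab, the class name SequenceBufferSketch and its method names are hypothetical, and the stale-check and advance logic follows my reading of the C insert routine, which should be verified against reliable.c.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class SequenceBufferSketch {
    private static final int EMPTY = 0xFFFFFFFF;   // equals -1 as a Java int

    private final int mask;
    private final int entryStride;
    private final int[] entrySequence;
    private final ByteBuffer entryData;            // stand-in for the UnsafeBuffer slab
    private int sequence;                          // next expected sequence (uint16 held in an int)

    SequenceBufferSketch(int capacity, int entryStride) {
        if (capacity <= 0 || capacity > 65536 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two <= 65536");
        }
        this.mask = capacity - 1;
        this.entryStride = entryStride;
        this.entrySequence = new int[capacity];
        Arrays.fill(entrySequence, EMPTY);
        this.entryData = ByteBuffer.allocateDirect(capacity * entryStride);
    }

    /** Claims the slot for seq; returns its byte offset in the slab, or -1 if seq is stale. */
    int insert(int seq) {
        seq &= 0xFFFF;
        int oldest = (sequence - entrySequence.length) & 0xFFFF;
        if (greaterThan(oldest, seq)) {
            return -1;                             // older than the window
        }
        if (greaterThan((seq + 1) & 0xFFFF, sequence)) {
            removeRange(sequence, seq);            // clear slots skipped by the advance
            sequence = (seq + 1) & 0xFFFF;
        }
        int index = seq & mask;
        entrySequence[index] = seq;
        return index * entryStride;
    }

    boolean exists(int seq) {
        seq &= 0xFFFF;
        return entrySequence[seq & mask] == seq;
    }

    ByteBuffer data() {
        return entryData;
    }

    private void removeRange(int start, int finish) {
        if (finish < start) {
            finish += 65536;                       // range wraps past 65535
        }
        if (finish - start < entrySequence.length) {
            for (int s = start; s <= finish; s++) {
                entrySequence[s & mask] = EMPTY;
            }
        } else {
            Arrays.fill(entrySequence, EMPTY);     // advance covers the whole window
        }
    }

    private static boolean greaterThan(int s1, int s2) {
        return ((s1 > s2) && (s1 - s2 <= 32768)) || ((s1 < s2) && (s2 - s1 > 32768));
    }
}
```

After construction no further allocation occurs: insert only writes an int into the occupancy array and returns an offset into the pre-allocated slab, where the caller reads and writes the fixed-stride entry.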

4.2 Sent/Received Packet Data - Parallel Primitive Arrays

Instead of a typed struct inside the sequence buffer:

// sent packets
double[] sentPacketTime;    // indexed by seq & (cap-1)
int[]    sentPacketInfo;    // bit 0 = acked, bits 1-31 = packet_bytes

// received packets
double[] receivedPacketTime;
int[]    receivedPacketBytes;

This avoids any object header overhead and keeps related fields cache-adjacent within each array.
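The bit packing for sentPacketInfo could be wrapped in small static helpers such as the ones below. The class and method names are hypothetical; the layout (bit 0 = acked, bits 1-31 = packet_bytes) is the one described in section 1.3.

```java
final class SentPacketInfo {
    private SentPacketInfo() {}

    /** Packs the acked flag into bit 0 and packetBytes into bits 1-31. */
    static int pack(boolean acked, int packetBytes) {
        if (packetBytes < 0) {
            throw new IllegalArgumentException("packetBytes must be non-negative");
        }
        return (packetBytes << 1) | (acked ? 1 : 0);
    }

    static boolean isAcked(int info) {
        return (info & 1) != 0;
    }

    /** Unsigned shift so the full 31-bit byte count survives the round trip. */
    static int packetBytes(int info) {
        return info >>> 1;
    }

    /** Sets the acked bit without touching the byte count. */
    static int markAcked(int info) {
        return info | 1;
    }
}
```

A typical use would be sentPacketInfo[index] = SentPacketInfo.markAcked(sentPacketInfo[index]) when an ack arrives, which touches only one int and allocates nothing.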

4.3 Fragment Reassembly Buffer - Pre-Allocated Packet Buffers

The C reference mallocs packet_data on first fragment of each fragmented sequence. The Java port pre-allocates a pool of UnsafeBuffer objects at construction, one per reassembly slot:

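// imports: org.agrona.BufferUtil, org.agrona.concurrent.UnsafeBuffer
// at construction: one direct, 64-byte-aligned buffer per reassembly slot
final int bufSize = RELIABLE_MAX_PACKET_HEADER_BYTES + maxFragments * fragmentSize;
UnsafeBuffer[] reassemblyPacketData = new UnsafeBuffer[reassemblyBufferSize];
for (int i = 0; i < reassemblyBufferSize; i++) {
    // allocateDirectAligned is a BufferUtil method returning a ByteBuffer, which UnsafeBuffer wraps
    reassemblyPacketData[i] = new UnsafeBuffer(BufferUtil.allocateDirectAligned(bufSize, 64));
}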

On reassembly slot eviction (advance past stale sequence), the buffer is zeroed and returned to the pool implicitly (index reuse). No heap allocation in steady state.

The per-slot metadata (sequence, ack, ackBits, numFragsReceived, numFragsTotal, packetBytes, packetHeaderBytes) is stored as parallel primitive arrays indexed by slot:

short[] reassemblySequence;
short[] reassemblyAck;
int[]   reassemblyAckBits;
int[]   reassemblyNumFragsReceived;
int[]   reassemblyNumFragsTotal;
int[]   reassemblyPacketBytes;
int[]   reassemblyPacketHeaderBytes;
boolean[][] reassemblyFragmentReceived;  // [slot][fragmentId]

boolean[][] is allocated once at construction (256 booleans per slot).

4.4 Ack Buffer

short[] acks;    // pre-allocated, size = ackBufferSize (default 256)
int numAcks;

Java short is signed; use acks[i] & 0xFFFF when treating as unsigned sequence number.

4.5 Transmit Scratch Buffer

In C, reliable_endpoint_send_packet mallocs a temporary buffer for the transmit datagram. Java port pre-allocates one UnsafeBuffer per endpoint for this:

// size = RELIABLE_FRAGMENT_HEADER_BYTES + RELIABLE_MAX_PACKET_HEADER_BYTES + fragmentSize
UnsafeBuffer transmitScratch;

This buffer is reused on every sendPacket() call. Single-writer principle: ReliableEndpoint is owned by exactly one thread (client agent or server agent), so no synchronization needed.

4.6 Callbacks as Interfaces (Not Lambdas Capturing State)

@FunctionalInterface
public interface TransmitPacketHandler {
    void transmit(long endpointId, int sequence, DirectBuffer data, int offset, int length);
}

@FunctionalInterface
public interface ProcessPacketHandler {
    /**
     * @return true if packet is accepted and should be acked, false to discard
     */
    boolean process(long endpointId, int sequence, DirectBuffer data, int offset, int length);
}

Implementations MUST be stateless or hold pre-allocated state - no lambda captures allocating heap objects. Resolved at construction of ReliableEndpoint, stored as final fields.

4.7 Counters

long[] counters = new long[ReliableConstants.NUM_COUNTERS];  // 10 entries

Indices match C RELIABLE_ENDPOINT_COUNTER_* constants. Off-heap AtomicBuffer not needed because counters are written by the single owner thread only; reads from other threads (metrics reporters) can use a volatile read or periodic snapshot.

4.8 Clock Injection

// construction parameter:
NanoClock clock;  // existing util.NanoClock interface in netcode-java

update(double timeSeconds) takes an explicit time argument (same as C API), so the caller (agent loop) controls when time advances. No System.nanoTime() inside the endpoint.


5. Detailed Implementation Plan

Phase 7 - reliable Foundation (new module)

| Task | File(s) | Notes |
| --- | --- | --- |
| Constants | ReliableConstants.java | Port all #define values |
| Config | ReliableConfig.java, ReliableConfigBuilder.java | Immutable config; builder sets defaults matching C reliable_default_config() |
| Callback interfaces | TransmitPacketHandler.java, ProcessPacketHandler.java | @FunctionalInterface, zero-allocation contract in Javadoc |
| Sequence buffer | SequenceBuffer.java | Off-heap slab via UnsafeBuffer, int[] occupancy array, power-of-two capacity enforced, insert, find, remove, advance, generateAckBits |
| Packet header codec | PacketHeaderCodec.java | writeHeader(buf, offset, seq, ack, ackBits) -> bytes written; readHeader(buf, offset, len, out) -> header bytes consumed |
| Fragment codec | FragmentCodec.java | writeFragmentHeader(buf, offset, seq, fragId, numFrags) -> 5; readFragmentHeader(buf, offset, len, out) |
| Endpoint | ReliableEndpoint.java | All state as primitives + pre-allocated buffers; sendPacket, receivePacket, update, telemetry methods |
| Unit tests | SequenceBufferTest, PacketHeaderCodecTest, ReliableEndpointTest | Port all C test cases: ack bits generation, header round-trip, ack flow, fragment reassembly, packet loss scenario |
| Benchmark | ReliableEndpointBenchmark.java | Measure sendPacket + receivePacket round-trip; target: < 200 ns per packet at 256-byte payload (no fragmentation), zero allocation |

Phase 8 - netcode Integration

| Task | File(s) | Notes |
| --- | --- | --- |
| Wire reliable into client | NetcodeClient.java | Create ReliableEndpoint on connect; route CONNECTION_PAYLOAD through it; call endpoint.update() per doWork() |
| Wire reliable into server | NetcodeServer.java | Per client slot: ReliableEndpoint[]; same routing |
| Transport transmit handler | Implement TransmitPacketHandler in UdpTransport or as inner class in NetcodeClient/NetcodeServer | Direct DatagramChannel.send() call with pre-allocated ByteBuffer view of the Agrona UnsafeBuffer |
| Ack delivery | ClientAgent.java, ServerAgent.java | After endpoint.update(), drain endpoint.getAcks() into application ack callback; endpoint.clearAcks() |
| Fragment size config | ClientConfig.java, ServerConfig.java | Expose fragmentAbove and maxFragments; default matches netcode payload limit (1200 bytes) |
| Integration test | ReliableNetcodeIntegrationTest.java | In-process client+server with TransportOverride; verify acks delivered, fragments reassembled, RTT/loss computed |
| Benchmark | ReliableNetcodeBenchmark.java | End-to-end IPC p50/p99/p99.99 with reliable layer active |

6. Constants Reference

public final class ReliableConstants {
    public static final int MAX_PACKET_HEADER_BYTES    = 9;
    public static final int FRAGMENT_HEADER_BYTES      = 5;

    // Default config values
    public static final int DEFAULT_MAX_PACKET_SIZE                    = 16 * 1024;
    public static final int DEFAULT_FRAGMENT_ABOVE                     = 1024;
    public static final int DEFAULT_MAX_FRAGMENTS                      = 16;
    public static final int DEFAULT_FRAGMENT_SIZE                      = 1024;
    public static final int DEFAULT_ACK_BUFFER_SIZE                    = 256;
    public static final int DEFAULT_SENT_PACKETS_BUFFER_SIZE           = 256;
    public static final int DEFAULT_RECEIVED_PACKETS_BUFFER_SIZE       = 256;
    public static final int DEFAULT_FRAGMENT_REASSEMBLY_BUFFER_SIZE    = 64;
    public static final float DEFAULT_RTT_SMOOTHING_FACTOR             = 0.0025f;
    public static final float DEFAULT_PACKET_LOSS_SMOOTHING_FACTOR     = 0.1f;
    public static final float DEFAULT_BANDWIDTH_SMOOTHING_FACTOR       = 0.1f;
    public static final int DEFAULT_PACKET_HEADER_SIZE                 = 28;  // IPv4+UDP

    // Counter indices
    public static final int COUNTER_PACKETS_SENT               = 0;
    public static final int COUNTER_PACKETS_RECEIVED           = 1;
    public static final int COUNTER_PACKETS_ACKED              = 2;
    public static final int COUNTER_PACKETS_STALE              = 3;
    public static final int COUNTER_PACKETS_INVALID            = 4;
    public static final int COUNTER_PACKETS_TOO_LARGE_TO_SEND  = 5;
    public static final int COUNTER_PACKETS_TOO_LARGE_TO_RECV  = 6;
    public static final int COUNTER_FRAGMENTS_SENT             = 7;
    public static final int COUNTER_FRAGMENTS_RECEIVED         = 8;
    public static final int COUNTER_FRAGMENTS_INVALID          = 9;
    public static final int NUM_COUNTERS                        = 10;
}

7. Performance Budget Targets

| Operation | Target |
| --- | --- |
| sendPacket (256-byte, no fragment) | < 150 ns |
| sendPacket (4096-byte, 4 fragments) | < 500 ns total |
| receivePacket (256-byte, no fragment) | < 100 ns |
| receivePacket (4096-byte, last fragment arriving) | < 300 ns (reassembly) |
| update() (telemetry compute) | < 2 us (called at 60Hz, not per-packet) |
| Allocation per send/receive | 0 bytes |
| Allocation per update | 0 bytes |
Allocation per update 0 bytes

8. Risks and Mitigations

| Risk | Mitigation |
| --- | --- |
| Java short is signed; uint16 sequence arithmetic requires masking | Use & 0xFFFF consistently; add invariant tests for wrap-around at 65535->0 |
| Fragment reassembly buffer uses heap allocation in C reference | Pre-allocate UnsafeBuffer[] pool per-slot at construction; verified by JMH -prof gc |
| Sequence buffer capacity must be power-of-two (Java rule) | Assert in constructor: Integer.bitCount(capacity) == 1; all defaults are already powers of two |
| fragment_received[256] flag array must not be allocated per reassembly event | Pre-allocate boolean[][] at construction, with one 256-element inner array per reassembly slot |
| reliable_endpoint_send_packet in C allocates a temp buffer per call | Pre-allocate transmitScratch UnsafeBuffer per endpoint; single-writer guarantees safe reuse |
| RTT uses fabs(a - b) < 0.00001 float comparison | Port exactly; float precision acceptable for telemetry (not protocol-critical) |
| Bandwidth calculation scans buffer_size/2 entries per update() | This is 128 entries per scan at default config, called at 60Hz max; acceptable. Profile to confirm |
| Java has no uint32 type for entry_sequence | Use int with unsigned semantics; the literal 0xFFFFFFFF is already the int value -1 in Java, so compare with == -1 or == 0xFFFFFFFF (no cast needed) |

9. Mermaid - ReliableEndpoint Lifecycle

stateDiagram-v2
    [*] --> IDLE : new ReliableEndpoint(config, clock)

    IDLE --> ACTIVE : sendPacket() or receivePacket()

    ACTIVE --> ACTIVE : sendPacket()\n- assign sequence\n- write header\n- fragment if needed\n- call transmitHandler

    ACTIVE --> ACTIVE : receivePacket()\n- detect normal vs fragment\n- parse header\n- call processHandler\n- record acks\n- update RTT

    ACTIVE --> ACTIVE : update(time)\n- compute packet_loss\n- compute bandwidth estimates

    ACTIVE --> IDLE : reset()

    IDLE --> [*] : close()

10. Mermaid - Send Path Data Flow

flowchart TD
    A[Application calls sendPacket] --> B{packet_bytes <= fragment_above?}
    B -- yes --> C[Write compact packet header into transmitScratch]
    C --> D[Copy payload after header]
    D --> E[Call transmitHandler once]
    B -- no --> F[Write packet header into local 9-byte stack buffer]
    F --> G[Compute num_fragments]
    G --> H[Loop fragment_id = 0 to num_fragments-1]
    H --> I[Write 5-byte fragment header into transmitScratch]
    I --> J{fragment_id == 0?}
    J -- yes --> K[Copy packet header into transmitScratch after fragment header]
    J -- no --> L[Skip]
    K --> M[Copy fragment payload data]
    L --> M
    M --> N[Call transmitHandler]
    N --> H
    H --> O[Record sent_packet_data: time, bytes, acked=false]
    E --> O
    O --> P[Increment sequence, increment PACKETS_SENT counter]

11. Mermaid - Receive Path Data Flow

flowchart TD
    A[receivePacket called] --> B{prefix_byte bit 0 == 0?}
    B -- yes, normal --> C[Parse compact packet header]
    C --> D{Stale check: sequence too old?}
    D -- stale --> E[Increment PACKETS_STALE, return]
    D -- ok --> F[Call processHandler]
    F --> G{processHandler returns true?}
    G -- no --> H[return without acking]
    G -- yes --> I[Insert into received_packets buffer]
    I --> J[advance fragment_reassembly sequence]
    J --> K[Loop ack_bits 0..31: mark sent_packets as acked, update RTT]
    B -- no, fragment --> L[Parse 5-byte fragment header]
    L --> M{reassembly entry exists for sequence?}
    M -- no --> N[Insert new reassembly entry, allocate packet_data slot]
    M -- yes --> O[Validate num_fragments matches]
    N --> P
    O --> P[Store fragment data into reassembly buffer]
    P --> Q{all fragments received?}
    Q -- no --> R[return, wait for more fragments]
    Q -- yes --> S[Recursive call: receivePacket with reassembled header+payload]
    S --> T[Remove reassembly entry]

12. File Creation Sequence (Implementation Order)

  1. ReliableConstants.java - no dependencies
  2. TransmitPacketHandler.java, ProcessPacketHandler.java - no dependencies
  3. ReliableConfig.java, ReliableConfigBuilder.java - depends on constants
  4. SequenceBuffer.java - depends on Agrona UnsafeBuffer
  5. PacketHeaderCodec.java - depends on Agrona DirectBuffer/MutableDirectBuffer
  6. FragmentCodec.java - depends on PacketHeaderCodec
  7. ReliableEndpoint.java - depends on all of the above
  8. Tests for each file
  9. ReliableEndpointBenchmark.java
  10. Integration into NetcodeClient, NetcodeServer, ClientAgent, ServerAgent

13. Wire Compatibility Note

The Java reliable port operates entirely above the netcode encryption layer. The reliable header is prepended to the plaintext payload before encryption by netcode. On receive, netcode decrypts the outer netcode packet, then hands the decrypted payload (still containing the reliable header + ack fields) to ReliableEndpoint.receivePacket().

This matches how netcode.c and reliable.c compose in the C reference. The reliable wire format is therefore never visible on the wire in raw form - it is always inside an encrypted netcode CONNECTION_PAYLOAD envelope.

Implication: no separate C interop test is needed for wire compatibility. All reliable tests can be pure in-process Java tests. The only compatibility requirement is that two Java endpoints speaking to each other (or one Java endpoint speaking to a Java netcode peer) agree on the reliable header format, which is guaranteed by using the same codec.