μTP2: Resilio Connect and the Evolution of an Enterprise-Grade Protocol

We’ve recently released a preview build of Resilio Connect, our upcoming product designed specifically for moving data in enterprise environments. We wanted to share a deeper dive into the technical improvements we’ve made at the protocol level for Resilio Connect.

Resilio Connect will introduce a new protocol that is dedicated to optimizing transfer speeds over WAN, satellite, or mobile networks. This new μTP2 protocol was designed to work over networks with high latency and some packet loss, which is crucial when moving data over the mixed infrastructure or fat pipes increasingly common for enterprise IT.

We’re building on μTP, a protocol that improved on the original BitTorrent protocol and was designed to be network friendly. μTP dynamically adjusts its sending rate to avoid congestion when network connections are swamped, which is useful when there’s a high volume of traffic. You can learn more about μTP here.

Our new μTP2 protocol architecture is based on a bulk transfer strategy, in contrast to the sliding-window strategy of μTP. The sender sends packets at a fixed interval, which spreads them uniformly in time, and uses a congestion control algorithm to calculate the ideal send rate. The sender does not wait for an acknowledgment of every packet. Instead, the protocol uses interval acknowledgments that cover a group of packets and carry additional information about lost packets.
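To illustrate the fixed packet delay, here is a minimal sketch of how a paced sender could turn a target send rate into evenly spaced send times. The function names, the 1,400-byte packet size, and the example rate are assumptions made for illustration and are not taken from μTP2 itself.

```python
def packet_interval(send_rate_bytes_per_s, packet_size_bytes):
    """Delay between two consecutive packets for a given send rate."""
    if send_rate_bytes_per_s <= 0:
        raise ValueError("send rate must be positive")
    return packet_size_bytes / send_rate_bytes_per_s


def send_schedule(start_time_s, packet_count, send_rate_bytes_per_s,
                  packet_size_bytes=1400):
    """Send times for a group of packets, spread uniformly in time.

    The 1400-byte default packet size is a hypothetical value.
    """
    interval = packet_interval(send_rate_bytes_per_s, packet_size_bytes)
    return [start_time_s + i * interval for i in range(packet_count)]


if __name__ == "__main__":
    # At 1.4 MByte/s with 1400-byte packets, one packet leaves every millisecond.
    for t in send_schedule(0.0, 4, 1_400_000):
        print(f"send at t={t:.3f} s")
```

When the congestion control algorithm picks a new send rate, only the interval changes; the sender never has to wait for a window to open.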

[Figure: WAN diagram]

This acknowledgment, combined with periodic RTT (Round Trip Time) probing, gives the congestion control algorithm the information it needs to calculate the new sending rate. μTP2 then uses a delayed retransmission strategy: a lost packet is retransmitted at most once per RTT, which cuts down on unnecessary retransmissions. This substantially increases the reliability and efficiency of μTP2 transfers over poor network conditions.
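The delayed retransmission rule can be sketched as a small bookkeeping class. This is an illustrative sketch only: the class name, its methods, and the choice to retransmit immediately on the first loss report are assumptions, not details of the actual μTP2 implementation.

```python
class RetransmitTracker:
    """Allows each lost packet to be retransmitted at most once per RTT."""

    def __init__(self):
        # Sequence number -> time of the most recent retransmission.
        self._last_retransmit = {}

    def due(self, lost_seqs, now_s, rtt_s):
        """Return the lost packets that may be retransmitted right now."""
        ready = []
        for seq in lost_seqs:
            last = self._last_retransmit.get(seq)
            # Assumption: the first report of a loss triggers an immediate resend.
            if last is None or now_s - last >= rtt_s:
                self._last_retransmit[seq] = now_s
                ready.append(seq)
        return ready

    def acknowledged(self, seq):
        """Forget a packet once the receiver confirms it."""
        self._last_retransmit.pop(seq, None)


if __name__ == "__main__":
    tracker = RetransmitTracker()
    print(tracker.due([5, 7], now_s=1.0, rtt_s=0.2))   # first report
    print(tracker.due([5, 7], now_s=1.1, rtt_s=0.2))   # reported again within one RTT
    print(tracker.due([5, 7], now_s=1.25, rtt_s=0.2))  # more than one RTT later
```

A packet that is reported lost in several consecutive acknowledgments is resent only once until a full RTT has passed, which is the time it takes to learn whether the resend arrived.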

Let’s take a look at the congestion control algorithm and the most important protocol states:
1. Congestion control
2. Fast start state with fast exponential growth
3. Speed probing state after congestion is detected
4. Stable speed adjusting and congestion avoidance state

1. Congestion control algorithm
The congestion control algorithm for the μTP2 protocol is loss oriented and uses an AIMD (additive-increase/multiplicative-decrease) strategy for speed adjustment. The goal is to separate random losses from congestion losses and to reduce protocol speed only when congestion losses occur. The main source of information about losses is the selective ack, which acknowledges a group of packets and carries additional information about lost packets.

After receiving a selective ack, the protocol calculates the loss probability for the confirmed packets. Based on this value, μTP2 decides whether congestion has occurred. If so, the current transfer speed is reduced to keep the pipeline in a stable state.
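As a rough sketch of this decision, the code below computes a loss probability from one selective ack and applies a multiplicative decrease only when that probability exceeds a threshold. The 2% threshold and the 0.875 decrease factor are hypothetical values chosen for the example; the real criteria μTP2 uses to tell random loss from congestion loss are not described here.

```python
def loss_probability(acked_count, lost_count):
    """Fraction of packets covered by one selective ack that were lost."""
    total = acked_count + lost_count
    return lost_count / total if total else 0.0


def on_selective_ack(send_rate, acked_count, lost_count,
                     loss_threshold=0.02, decrease_factor=0.875):
    """Return (new_send_rate, congestion_detected).

    loss_threshold and decrease_factor are illustrative assumptions.
    Loss at or below the threshold is treated as random and ignored.
    """
    if loss_probability(acked_count, lost_count) > loss_threshold:
        return send_rate * decrease_factor, True
    return send_rate, False


if __name__ == "__main__":
    print(on_selective_ack(1000.0, acked_count=99, lost_count=1))
    print(on_selective_ack(1000.0, acked_count=90, lost_count=10))
```

With these example values, 1% loss leaves the rate untouched, while 10% loss is treated as congestion and triggers the multiplicative decrease.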

2. Fast start
After connection setup, μTP2 enters the fast start state: it starts sending packets at a predefined rate and increases this rate with every ack. As soon as congestion is detected, the protocol calculates the average send rate and uses it as the first estimate of the maximum possible send rate (the line speed) for this connection. The sender then reduces the send rate and enters the speed probing state.
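The fast start phase might look like the following sketch. The growth factor, the decrease factor, and the averaging window are assumptions: the text does not say how fast the rate grows per ack or over which period the average send rate is taken. The `is_congested` callback is a hypothetical stand-in for the loss-based congestion check.

```python
def fast_start(initial_rate, is_congested, growth_factor=2.0,
               decrease_factor=0.5, average_window=2, max_acks=64):
    """Grow the send rate on every ack until congestion is detected.

    Returns (reduced_rate, max_rate_estimate). The estimate is None if
    congestion was never seen within max_acks acks. All numeric defaults
    are illustrative assumptions.
    """
    rate = initial_rate
    history = [rate]
    for _ in range(max_acks):
        if is_congested(rate):
            # Assumption: average over the most recent rates only.
            recent = history[-average_window:]
            max_rate_estimate = sum(recent) / len(recent)
            return rate * decrease_factor, max_rate_estimate
        rate *= growth_factor  # exponential growth, one step per ack
        history.append(rate)
    return rate, None


if __name__ == "__main__":
    # Toy link that shows congestion at 8 rate units or more.
    print(fast_start(1.0, lambda rate: rate >= 8.0))
```

The reduced rate is what the sender would carry into the speed probing state, and the estimate serves as the reference point for later congestion events.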

3. Speed probing state
After the sender detects congestion and reduces the send rate, it enters the speed probing state to check whether the speed reduction was successful and the congestion has cleared. In this state the sender creates a packet train and sends it to the receiver at the current send rate. After receiving the acknowledgment for the packet train, the sender calculates the packet loss probability. If it shows that congestion still exists, the protocol keeps reducing the sending rate until the congestion is gone. Once the sender detects that the congestion is gone, it moves to the congestion avoidance state with the current send rate.
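The probing loop can be sketched as follows. `send_train` is a hypothetical callback that sends one packet train at the given rate and reports how many packets were sent and lost; the threshold, the decrease factor, and the round limit are likewise assumptions for the example.

```python
def speed_probe(send_rate, send_train, loss_threshold=0.02,
                decrease_factor=0.875, max_rounds=32):
    """Reduce the send rate until a packet train no longer shows congestion.

    send_train(rate) must return (packets_sent, packets_lost).
    Returns the rate to use when entering congestion avoidance.
    """
    for _ in range(max_rounds):
        sent, lost = send_train(send_rate)
        loss = lost / sent if sent else 0.0
        if loss <= loss_threshold:
            return send_rate  # congestion is gone
        send_rate *= decrease_factor  # still congested, back off again
    return send_rate


if __name__ == "__main__":
    # Toy link: trains sent faster than 700 rate units lose 10% of packets.
    def toy_link(rate):
        return (100, 10) if rate > 700 else (100, 0)

    print(speed_probe(1000.0, toy_link))
```

Each round costs roughly one RTT, since the sender has to wait for the train's acknowledgment before deciding whether to reduce the rate again.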

4. Congestion avoidance
The congestion avoidance state is the main state for the sender. In this state the sender analyzes the current loss probability and, if no congestion is detected, slightly increases the send rate once per RTT. If congestion occurs at a send rate close to the estimated maximum send rate, the protocol decreases the current send rate. If the next congestion event occurs at a send rate close to that of the previous congestion event, the send rate continues to decrease slightly. This makes the speed reduction more gradual, so the protocol can use as much of the connection’s bandwidth as possible. If the speed is reduced significantly, the protocol enters the speed probing state to make sure the congestion is gone.
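One way to read these rules is the per-RTT update sketched below. Everything numeric here is an assumption (the additive step, the gentle and strong decrease factors, and what counts as "close"), as is the choice to apply the strong decrease whenever neither closeness rule matches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sender:
    rate: float
    max_rate_estimate: float
    last_congestion_rate: Optional[float] = None
    state: str = "avoidance"


def is_close(a, b, tolerance=0.1):
    """True if a is within tolerance (as a fraction of b) of b."""
    return abs(a - b) <= tolerance * b


def on_rtt(sender, congested, step=10.0, gentle=0.95, strong=0.5):
    """Apply one congestion avoidance update; all constants are hypothetical."""
    if not congested:
        sender.rate += step  # additive increase, once per RTT
        return sender
    near_max = is_close(sender.rate, sender.max_rate_estimate)
    near_previous = (sender.last_congestion_rate is not None
                     and is_close(sender.rate, sender.last_congestion_rate))
    sender.last_congestion_rate = sender.rate
    if near_max or near_previous:
        sender.rate *= gentle  # small, gradual reduction
    else:
        sender.rate *= strong  # significant reduction
        sender.state = "probing"  # verify that congestion is gone
    return sender


if __name__ == "__main__":
    s = Sender(rate=900.0, max_rate_estimate=1000.0)
    on_rtt(s, congested=False)
    on_rtt(s, congested=True)
    print(s)
```

Keeping the reduction small near a known congestion point lets the rate hover just below the line speed instead of swinging between half and full capacity.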

The bulk transfer strategy that is the foundation of μTP2 was designed to maintain transfer speeds despite packet delays or packet loss. This results in vastly improved transfer speeds over the WAN, where such network conditions are common.

We compared transferring data between a server in Amsterdam and a server in Hong Kong over a 1 Gbps network connection with a 200 ms packet transfer delay and around 1% packet loss. The two servers were identical, each with a single Intel Xeon E3-1270 processor (4 cores, 3.40 GHz) and 2 GB of RAM.
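To put these link parameters in perspective, the short calculation below estimates how much data has to be in flight to keep such a link full, and how much throughput a fixed window would allow. It assumes the 200 ms figure is the round-trip time and uses a 64 KiB window purely as a hypothetical example; neither assumption comes from the test setup.

```python
def bytes_in_flight(rate_bytes_per_s, rtt_s):
    """Data that must be unacknowledged at any moment to keep the link full."""
    return rate_bytes_per_s * rtt_s


def window_limited_rate(window_bytes, rtt_s):
    """Upper bound on throughput when at most one window is sent per RTT."""
    return window_bytes / rtt_s


LINK_RATE = 1_000_000_000 / 8  # 1 Gbps expressed in bytes per second
RTT = 0.2                      # assumption: 200 ms is the round-trip time

if __name__ == "__main__":
    in_flight_mb = bytes_in_flight(LINK_RATE, RTT) / 1e6
    window_rate_kb = window_limited_rate(64 * 1024, RTT) / 1e3
    print(f"in flight to fill the link: {in_flight_mb:.1f} MByte")
    print(f"rate with a 64 KiB window:  {window_rate_kb:.1f} kByte/s")
```

Under these assumptions a sender needs tens of megabytes in flight to fill the link, which is why a rate-based strategy is attractive on long, fast paths.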

[Figure: speed chart]

Resilio Connect using the μTP2 protocol reached a maximum speed of 121.961 MByte/s, about 97.6% of the 125 MByte/s that a 1 Gbps connection can carry.

μTP2 will be available in the upcoming Resilio Connect to bring speed and efficiency improvements to organizations moving data over WAN or unreliable connections. Resilio Connect is coming soon. If you’re interested, let us know.