A DFSR Alternative: Fast & resilient P2P replication with Connect

DFSR, or Distributed File System Replication, is a feature of Windows Server for replicating files across several servers. Microsoft introduced this feature in Windows NT 4.0 as an add-on, and it became a standard feature in Windows 2000 Server and later.

Basically, it was an attempt by Microsoft to solve the complex problem of distributing files so teams across several offices could collaborate effectively. Nowadays, with the advent of the public cloud, that problem is much easier to solve: instead of sharing files between offices, people work directly with the cloud. However, the cloud doesn’t solve every collaboration issue, and you might still need to build a system in which several servers are synchronized across different offices. Let’s see if DFSR, a technology from 2000, is an adequate response to these needs.

First, let’s see why you might need to synchronize servers directly. One reason might be that you have big files that would take significant time to upload to the cloud. A second might be that, for a variety of reasons, you can’t upload files to the cloud at all: security concerns, company or government policy, and so on. The third and most common reason we see is that moving files or synchronizing servers is just one part of your workflow: you are building a business process that is more complex than uploading and downloading files to the cloud.

DFSR takes a standard client-server, or 1:1, file synchronization approach that was popular 10-15 years ago. Many other commercial solutions use the same model for synchronization: copy the file from point A to point B. This approach has obvious limitations. The first is the availability of A: if server A is not available, then servers B and C will not get the data. The second is resource shortage on A: if server A runs short of any resource (network, memory, CPU), the load cannot be moved to server B, even if B already has all the data.

There are a number of other limitations to DFSR. If you need to extend your file synchronization beyond Windows servers, DFSR is not the right tool for the job. You also can’t assign different roles to different servers, for example when some servers should only read data without being able to change it.

Let’s look into typical problems that DFSR-based solutions run into:

  1. Working over long WAN connections. When you need to send data to remote locations over mobile or satellite connections that have a very long round-trip time and a high potential for packet loss, DFSR can run into problems. It’s based on TCP/IP, which treats every packet loss as a sign of network congestion and backs off its speed in order to reduce the load on the connection. This behavior helps TCP/IP-based applications share a network and collectively settle on the maximum speed they can use for data transfer. On wide-area networks (WAN), however, packet loss often reflects a failure on an intermediate device rather than a congested channel. Therefore, reducing speed on every packet loss is not appropriate for WAN connectivity.
  2. Another problem in the WAN relates to the way TCP/IP guarantees data delivery. It needs the recipient to acknowledge that each packet arrived at the destination: once the recipient gets a packet, it sends a confirmation packet back to the sender. The time it takes for a packet to reach the receiver and for the acknowledgement to come back is called the round-trip time (RTT). On a local network (LAN) it is usually well under a millisecond, but on WAN networks it can be as high as 800 ms or more. Therefore, once the sender has as much unacknowledged data in flight as TCP allows, it can wait up to a second or even more before it’s able to send another packet. DFSR inherits these TCP/IP deficiencies, and overcoming them requires specific hardware or software.
  3. Synchronizing or delivering files to more than one destination. It is extremely rare that an organization needs to send or synchronize files to just one location or server. Usually there is more than one destination server, in more than one location. In such a case, a common approach is to execute jobs one by one: DFSR sends files to one location, then to another.
  4. Tens of millions of files and real-time change detection. DFSR is not optimized for a large number of files. It is usually very slow when you need to synchronize folders with a few million files, since it takes a long time to scan through the folder, find changes, and transfer them. A better approach is to use real-time file system monitoring to pick up changes on the fly, without the need to walk the whole directory tree.
  5. Smart data routing. DFSR needs static IP:port addresses to establish connections to different machines. If a machine gets a new IP:port or is not available, DFSR stops operating and needs a human to reconfigure it.
  6. Scripting. There is no way to have scripting around DFSR. If you need to build workflows beyond a simple “do something after the file arrives at destination”, there is no way to do so with DFSR.
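
To put rough numbers on problems 1 and 2, here is a back-of-the-envelope sketch in Python. It is not a model of DFSR itself. It assumes a classic 64 KB TCP window without window scaling and a 1,460-byte segment size, and it uses the widely cited Mathis approximation for loss-limited TCP throughput. The function names are illustrative.

```python
import math


def window_limited_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound when the sender must wait for ACKs: one window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1_000_000


def loss_limited_mbps(mss_bytes: int, rtt_seconds: float, loss_rate: float) -> float:
    """Mathis et al. approximation: rate ~ (MSS / RTT) * sqrt(3/2) / sqrt(loss)."""
    return mss_bytes * 8 / rtt_seconds * math.sqrt(1.5 / loss_rate) / 1_000_000


if __name__ == "__main__":
    # Assumed values: 65,535-byte window, 1,460-byte segments, 1% packet loss.
    for label, rtt in [("LAN, 1 ms RTT", 0.001), ("WAN, 800 ms RTT", 0.8)]:
        print(
            label,
            round(window_limited_mbps(65_535, rtt), 2), "Mbit/s window-limited,",
            round(loss_limited_mbps(1_460, rtt, 0.01), 2), "Mbit/s at 1% loss",
        )
```

Under these assumptions, a connection that is window-limited to roughly 524 Mbit/s at 1 ms RTT drops to about 0.66 Mbit/s at 800 ms, no matter how much bandwidth the link actually has.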

In contrast to DFSR, Resilio Connect was developed to be a robust replication and file distribution solution that is faster under any network conditions and data volumes. It’s specifically optimized to replicate large volumes of data, and it works well with many small files, without any limits. In addition, thanks to its own network protocol, it’s sensitive to bandwidth changes and is smart enough to avoid network congestion or to use the full bandwidth when possible. Resilio lets you take control over the replication process, see its progress, and evaluate the results.

Resilio Connect data replication is not limited to Windows: it can easily be configured as a cross-platform solution (Linux, OS X, iOS, and Android).

Solutions:

Enough about problems; let’s talk solutions. Resilio Connect alleviates many of the limitations that organizations running DFSR encounter. It’s peer-to-peer synchronization, rebuilt for the enterprise, and it offers significant performance improvements over DFSR in all sorts of scenarios. Let’s cover these in more detail.

How to make DFSR faster
It’s impossible to make DFSR faster. Performance is limited because its basic set of technologies is limited to delta encoding and compression.

You would need to replace DFSR to get faster performance. Our solution, Resilio Connect, adds peer-to-peer data transfer, WAN line optimization, smart routing and real-time file system monitoring in order to speed up syncing for today’s enterprise.

DFSR & large files
It is impractical to use DFSR for large files. DFSR doesn’t have an optimized way of calculating a file’s checksum, which makes calculating file differences take an extremely long time.

Resilio Connect optimizes the checksum calculations so that it can sync very fast for files of any size.
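
How Connect does this internally isn’t covered here, so the following is only a generic illustration of the idea, not Resilio’s implementation: hash a file in fixed-size blocks, then compare block hashes to find which parts changed instead of treating the whole file as one unit. The block size, the choice of SHA-256, and the function names are assumptions made for the example.

```python
import hashlib


def block_hashes(data: bytes, block_size: int = 4096) -> list[str]:
    """Hash the data one fixed-size block at a time."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(old: list[str], new: list[str]) -> list[int]:
    """Indices of blocks in `new` that are missing from or differ from `old`."""
    return [
        index
        for index, digest in enumerate(new)
        if index >= len(old) or old[index] != digest
    ]


if __name__ == "__main__":
    original = bytes(16 * 1024)      # four zero-filled 4 KB blocks
    edited = bytearray(original)
    edited[9000] = 0xFF              # touch one byte inside the third block (index 2)
    print(changed_blocks(block_hashes(original), block_hashes(bytes(edited))))
```

With block-level hashes, a one-byte edit in a large file means one block needs to be re-sent, not the whole file.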

DFSR & VPN
It is impossible to use DFSR without additional encryption. The lack of embedded traffic encryption requires people to install and configure an additional encrypted channel, such as a VPN.

Any good DFSR replacement should include encryption in the product, so no additional products are required. Connect uses AES-128 in CTR mode to encrypt all the traffic that is sent between clients. This includes both data and all the control traffic.

DFSR & static IP addresses
It is impossible to use DFSR in a dynamic network environment. DFSR requires a static IP address and port for each destination. Static IP addresses expose another problem with DFSR: as soon as a server’s IP address changes, DFSR will fail to operate.

Resilio Connect uses a dynamic routing approach. When a rule specifies that machines A and B need to exchange data, both machines use a tracker or multicast to discover each other’s IP:port addresses on the fly. No human intervention is necessary.
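
As a rough picture of what tracker-based discovery means, here is a toy in-memory tracker in Python. It is purely illustrative: the class, its methods, and the identifiers are hypothetical and say nothing about Connect’s actual tracker protocol. The point is only that peers look each other up by identity at connection time, so a changed address doesn’t require reconfiguration.

```python
class ToyTracker:
    """Hypothetical registry mapping each job's peers to their latest address."""

    def __init__(self) -> None:
        self._peers: dict[str, dict[str, tuple[str, int]]] = {}

    def announce(self, job_id: str, peer_id: str, host: str, port: int) -> None:
        """A peer reports where it can currently be reached."""
        self._peers.setdefault(job_id, {})[peer_id] = (host, port)

    def lookup(self, job_id: str, asking_peer: str) -> dict[str, tuple[str, int]]:
        """Return the current addresses of every other peer in the job."""
        return {
            peer: addr
            for peer, addr in self._peers.get(job_id, {}).items()
            if peer != asking_peer
        }


if __name__ == "__main__":
    tracker = ToyTracker()
    tracker.announce("job-1", "A", "10.0.0.5", 3000)
    tracker.announce("job-1", "B", "10.0.1.7", 3000)
    tracker.announce("job-1", "A", "10.0.9.9", 4100)   # A moved to a new IP:port
    print(tracker.lookup("job-1", asking_peer="B"))    # B sees A's new address
```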

DFSR & WAN connection
It is impractical to use DFSR over long connections for offices located 3,000 miles or more apart. These networks are usually referred to as WAN networks. The long distance between offices makes TCP packet travel time long and increases the chance of packet loss due to equipment failure or congestion. TCP slows down significantly on these types of networks. DFSR is based on TCP, so DFSR will be slow too.

Any good DFSR alternative should support WAN networks, which Resilio Connect does. With Connect you can utilize 100% of the available bandwidth in your network, independent of distance, latency, or loss. To achieve that, Connect uses a UDP-based protocol, uTP2, that transfers packets in bulk with selective acknowledgement of lost packets.
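
The details of uTP2 aren’t given here, but the general idea of bulk transfer with selective acknowledgement can be sketched with a toy simulation. Everything below is an assumption for illustration (independent random loss, one acknowledgement per round, the function name): the sender pushes a whole batch, the receiver reports which sequence numbers arrived, and only the missing ones are sent again.

```python
import random


def send_with_selective_ack(num_packets: int, loss_rate: float, seed: int = 0):
    """Return (round_trips, packets_sent) needed to deliver every packet."""
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be in [0, 1)")
    rng = random.Random(seed)
    pending = set(range(num_packets))    # sequence numbers not yet acknowledged
    round_trips = 0
    packets_sent = 0
    while pending:
        round_trips += 1
        packets_sent += len(pending)     # send everything still missing in bulk
        # The receiver's acknowledgement lists exactly which packets arrived.
        acked = {seq for seq in pending if rng.random() >= loss_rate}
        pending -= acked                 # only the lost ones are retransmitted
    return round_trips, packets_sent


if __name__ == "__main__":
    print(send_with_selective_ack(1000, loss_rate=0.05))
```

In this simplified model a lost packet costs one extra packet in a later round; it does not force the whole stream to slow down.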

DFSR & several destinations
A good DFSR alternative will use a peer-to-peer approach that leverages networking between all offices and significantly speeds up data transfer. This approach splits each file into blocks and sends these blocks independently. Each recipient can pass a block on to other recipients once it has received it. This dramatically speeds up syncing, since not only are we transferring to several machines concurrently, but we are also using other network channels to take load off the sender’s network channel. This is the approach Resilio Connect takes.
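
To see why this helps, here is a small round-based simulation in Python. It is a simplified model, not Connect’s scheduler: it assumes every machine can upload one block and download one block per round, and the greedy, rarest-block-first choice is just one reasonable policy picked for the example. It compares sending a file to each receiver in turn with letting receivers forward blocks they already hold.

```python
def sequential_rounds(num_blocks: int, num_receivers: int) -> int:
    """One sender uploads every block to every receiver, one block per round."""
    return num_blocks * num_receivers


def p2p_rounds(num_blocks: int, num_receivers: int) -> int:
    """Rounds needed when receivers also forward blocks they already hold."""
    all_blocks = set(range(num_blocks))
    # Peer 0 is the original sender; the others start with nothing.
    have = [set(all_blocks)] + [set() for _ in range(num_receivers)]
    rounds = 0
    while any(blocks != all_blocks for blocks in have[1:]):
        rounds += 1
        snapshot = [set(blocks) for blocks in have]   # holdings at round start
        busy = set()                                  # receivers already downloading
        for sender, blocks in enumerate(snapshot):
            for receiver in range(1, len(have)):
                if receiver == sender or receiver in busy:
                    continue
                missing = blocks - have[receiver]
                if missing:
                    # Prefer the block that the fewest peers hold so far.
                    block = min(missing, key=lambda b: sum(b in h for h in snapshot))
                    have[receiver].add(block)
                    busy.add(receiver)
                    break                             # one upload per sender per round
    return rounds


if __name__ == "__main__":
    print("sequential:", sequential_rounds(8, 4), "rounds")
    print("peer-to-peer:", p2p_rounds(8, 4), "rounds")
```

In this model the peer-to-peer count can never drop below the number of blocks, because each receiver downloads one block per round, but it stays well below the sequential count as the number of receivers grows.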

DFSR & NAT
It is impossible to use DFSR to connect to a server behind NAT without extra configuration. NAT usually runs on a firewall or router that hides the server’s internal address and exposes an external IP address instead. You will need to open ports for incoming connections on that device so DFSR can establish a connection.

Connect uses NAT traversal techniques that can establish a direct connection between computers without the need to open ports.

Summary
Here is a handy summary table of the features needed in a synchronization solution today, and how Resilio Connect stacks up against DFSR.

Feature            Resilio Connect    DFSR
Delta encoding     +                  +
Compression        +
Dynamic routing    +
Encryption         +
WAN optimization   +
NAT traversal      +
All OS             +
1M+ files          +
Big folders        +
Real-time          +                  +

Let’s see a detailed comparison of the two:

Feature                   DFSR                                                        Connect
Cross-platform solution   No                                                          Yes
Max number of servers     256 for each replication group                              No limits
Max number of files       70 million (WinServer 2012R2), 11 million (WinServer 2008)  No limits
ACL replication           Yes                                                         Yes
Max file size             250 GB (WinServer 2012R2), 64 GB (WinServer 2008)           No limits
Data deduplication        Yes                                                         Yes
WAN acceleration          No                                                          Yes
P2P replication           No                                                          Yes
Time to replicate*
  3.72 GB single file     Around 4 minutes                                            1 minute, 39 seconds
  100k files, 4 KB each   46 minutes                                                  17 minutes
File encryption           No                                                          Yes

* Checked between Windows servers with Intel Core i7 CPU, 3.40 GHz.

 

How much faster can you replicate with Resilio Connect? Schedule a demo to find out.