Knfsd: Use Cases, Limitations, and a Reliable Alternative

Knfsd is an open-source NFS caching tool that’s useful for certain high-performance computing (HPC), burst to cloud, and burst for compute use cases. 

While not an officially supported Google product, knfsd is built for deploying and operating a high-performance NFS cache in Google Cloud (GCP). In this guide, you’ll learn how knfsd works and explore its main use cases.

But, before we dive in, it’s also worth noting that despite its practicality in some rendering scenarios, knfsd comes with serious drawbacks:

  • It’s developed and maintained by only one person. 
  • Its performance degrades with even small amounts of latency.
  • Every NFS server you deploy becomes a single point of failure (SPOF).
  • It’s difficult to use, as you need to write and manage scripts even for very simple tasks.

Due to these downsides, knfsd isn’t a good choice for workflows involving two or more locations (e.g., cloud regions and data centers), as any latency between sites degrades its performance. It’s also not useful for teams that want to use a non-Linux OS or operate across multiple cloud providers, instead of just GCP.

That’s why, in the second part of this guide, we’ll explore how Resilio can help you overcome these downsides. For use cases such as burst to cloud, burst rendering in VFX, and other scenarios, Resilio provides clear advantages over knfsd.

Resilio is a highly reliable and high-performance file caching and synchronization system that works with just about any device, cloud provider, and storage type. It’s a commercial, globally supported solution that can overcome latency across branch offices, cloud regions, data centers, and other endpoints.

Resilio is a superior alternative to knfsd because it provides:

  • Pure Performance: Resilio outperforms knfsd in every scenario. The more latency, the bigger the performance boost you get with Resilio. In a simple test between two GCP regions (GCP East to GCP West), Resilio outperformed knfsd by about 8x on the first read. (We’ll explore the performance comparison in more detail below.)
  • Organic Scalability: With Resilio, you can scale horizontally by adding file caching gateways (Resilio agents), which scales out storage performance non-disruptively.
  • WAN acceleration: Resilio’s proprietary WAN optimization technology lets you fully utilize any network, including VSATs, cell, Wi-Fi, and any IP connection. It also overcomes latency and packet loss to ensure predictable transfer and sync times, irrespective of network quality.
  • Full Mesh Sync: Resilio enables bi-directional and N-way sync within and across sites. You can fully sync or partially sync jobs. Updates can be made in real-time or pulled on-demand.  
  • Storage Flexibility: You can use Resilio with any device (desktops, laptops, servers, IoT devices, etc.), storage type (file, block, object, NAS, DAS, SAN), operating system (Linux, Windows, macOS, Unix), and cloud provider (GCP, AWS, Azure, Backblaze, Wasabi). As a result, you can freely cache, move, and sync data across your storage devices, cloud providers, and other endpoints.
  • Reliability: Resilio uses a peer-to-peer (P2P) replication architecture, which eliminates single points of failure and ensures data is always synchronized as quickly as possible. This unique architecture also scales organically and makes Resilio ideal for several use cases around cloud bursting, remote work collaboration, server synchronization, disaster recovery, and more. 
  • Efficiency: Resilio has features that enable you to optimize costs, such as selective sync and caching. You can also use it to cache frequently accessed files locally as well as get low-latency access to all your files, regardless of where they’re stored.
  • Automation and centralized management: In Resilio, any UI action — including caching, hydration, and synchronization — can be automated using the API. You can integrate with 3rd-party systems and trigger hydration or other actions based on the cache’s state. You can also use the Central Management Console to easily set up and manage jobs across your entire environment.

Organizations use Resilio Connect to provide fast, reliable file sync and access for many use cases, such as cloud bursting, VDI profile sync, disaster recovery, multi-site collaboration and remote work, edge synchronization, server synchronization, and more. To see how Resilio can benefit your business, schedule a demo with our team.

Knfsd 101: How It Works, Main Use Cases, and Limitations

As we said, knfsd is designed for HPC and burst compute scenarios that require a high-performance NFS cache between an NFS server and its downstream NFS clients. It’s built on standard Linux components, such as nfs-kernel-server (the standard Linux NFS server, used here for NFS re-exporting) and cachefilesd (for a persistent on-disk cache of network filesystems).

Knfsd works by mounting NFS exports from a source NFS filer and re-exporting the mount points to downstream NFS clients. For most knfsd use cases, the source filer is located on premises, while the downstream NFS clients are in GCP.
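
As a rough illustration of the mount-and-re-export flow described above, the Python sketch below only builds the two pieces of configuration such a proxy typically needs: an NFS mount command that uses the `fsc` option (so the mount can use FS-Cache) and an `/etc/exports` entry with an explicit `fsid` (needed when re-exporting an NFS mount). The host name, paths, client range, and fsid value are placeholders chosen for this example, and the knfsd repository's own scripts may configure things differently.

```python
# Sketch only: builds the strings, does not run mount or edit /etc/exports.
# All hosts, paths, and option values below are illustrative placeholders.

def mount_command(source_host: str, source_export: str, mount_point: str) -> str:
    """NFS mount command with the `fsc` option so reads can use FS-Cache."""
    return f"mount -t nfs -o ro,fsc {source_host}:{source_export} {mount_point}"


def exports_line(mount_point: str, clients: str, fsid: int) -> str:
    """/etc/exports entry; re-exporting an NFS mount needs an explicit fsid."""
    return f"{mount_point} {clients}(ro,no_subtree_check,fsid={fsid})"


if __name__ == "__main__":
    # Source filer on premises, proxy re-exporting to clients in the cloud.
    print(mount_command("filer.example.internal", "/projects", "/srv/nfs/projects"))
    print(exports_line("/srv/nfs/projects", "10.0.0.0/16", 10))
```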

This enables companies to harness excess capacity by augmenting on-prem compute and storage with a hybrid cloud configuration. It can also help teams minimize costs by taking advantage of low-cost, ephemeral compute resources in the cloud and keeping network changes as low as possible.

Knfsd is a free, open-source solution, so you have to set up and manage it yourself. If you’re interested, check out the instructions in the official GitHub repository, which are broken down into two key sections: building the knfsd image, and deploying and operating a knfsd cluster on GCP. There’s also a suite of tests you can run to check for common problems, such as whether the kernel version is correct and whether cachefilesd is enabled and active.

But, as we said earlier, the solution also suffers from a number of issues. For example:

  • Even small amounts of latency degrade its performance. If you have more than one data center and other geographically distributed endpoints, this problem can disrupt your operations and cancel out many of the benefits of using knfsd.
  • It can only run on Linux and is primarily focused on GCP. This makes it a poor choice for companies that use other operating systems or cloud providers.
  • It doesn’t provide control over caching behavior. For instance, there are no options for cache pinning or pre-caching. The solution also takes a while to invalidate the cache.
  • It’s not reliable or resilient during outages and other disaster scenarios because every NFS server you deploy is a SPOF. 

Due to these downsides, knfsd is only suitable for use cases involving one data center where there’s no latency. Plus, you need an expert or team that has the necessary time and understanding of nfsd, FS-Cache, load balancing, and other components used in the knfsd repository. 

For larger deployments and use cases that require more speed, predictability, and visibility across locations, an alternative solution is required.

Resilio: A Flexible, High-Performance, and Low Latency Alternative to Knfsd

Resilio is a high-performance and scalable file caching, replication, and synchronization system that’s ideal for cloud bursting use cases around rendering, HPC, and more. Like knfsd, you can use it to harness excess capacity in the cloud and control costs.

However, unlike knfsd, Resilio is a global, commercially supported solution that overcomes latency and ensures optimal performance, irrespective of distance. Resilio is also incredibly reliable and versatile because it:

  • Can be used with nearly any storage type, operating system, cloud provider, and device. 
  • Employs a unique P2P architecture that ensures maximum performance and eliminates single points of failure.
  • Can cache, synchronize, transfer, and ingest data across locations and systems, with no limits on file sizes or numbers.
  • Uses a proprietary, UDP-based WAN optimization protocol to maximize transfer speeds across any network and overcome the impact of latency and packet loss.
  • Offers features for improving productivity and reducing costs like local caching, partial downloads, selective sync, and more.

This makes Resilio a high-performance, scalable, and enterprise-ready alternative to knfsd for organizations that want to cache, transfer, and sync data across geographically distributed locations, while having centralized, granular control and visibility over the process.

Resilio vs knfsd: Performance Comparison

In a two-site performance test across two Google Cloud regions (GCP East to GCP West), Resilio outperformed knfsd by about 8x on the first read. In our findings, any latency between sites interferes with knfsd and NFS caching in general. In this scenario, there was about 60-80 ms of latency between the two Google Cloud regions.

The following workload was tested: 

  • ~9,500 files
  • ~34 GB
  • ~60 msec delay
  • Measuring reading time of all files
  • Remounts before each iteration to minimize NFS client caching impact
GCP - East Coast to GCP - West Coast: NFS Storage > Resilio Agent > Resilio Transfer > Resilio Caching Gateway vs KNFSD Proxy

The results showed clearly that knfsd struggles with any latency between sites — in this case, there was about 60 ms between regions. Resilio, by contrast, overcomes latency to obtain faster read-write speeds between Resilio caching agents. In the test, a sample file set containing 9,500 files was transferred between locations. 

                   Resilio    Knfsd
First Read (sec)   208.9      1733.1
Last Read (sec)    50.4       53.4
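
The headline figure can be checked directly from the results above. The short Python calculation below derives the speedup for each row and an approximate average read throughput, assuming the ~34 GB data set size from the workload description and decimal gigabytes. The derived numbers are simple arithmetic on the published values, not additional measurements.

```python
DATASET_GB = 34.0  # approximate data set size from the workload description

# Read times in seconds, taken from the results above.
read_times = {
    "first_read": {"resilio": 208.9, "knfsd": 1733.1},
    "last_read": {"resilio": 50.4, "knfsd": 53.4},
}


def speedup(row: dict) -> float:
    """How many times faster Resilio read the data set than knfsd."""
    return row["knfsd"] / row["resilio"]


def throughput_mb_s(seconds: float) -> float:
    """Average throughput in MB/s for reading the whole data set."""
    return DATASET_GB * 1000 / seconds


for name, row in read_times.items():
    print(f"{name}: speedup {speedup(row):.2f}x, "
          f"Resilio ~{throughput_mb_s(row['resilio']):.0f} MB/s, "
          f"knfsd ~{throughput_mb_s(row['knfsd']):.0f} MB/s")
```

The first (cold) read is where the roughly 8x gap appears. On the last (warm) read, both tools serve from cache and finish within about 6% of each other.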

In the next sections, we’ll explore the main benefits of Resilio that make this drastic performance improvement possible.

Versatile and Efficient Caching and Access across Any Location

Knfsd is designed for the very specific and limited use case of using NFS with Linux and GCP. Beyond these parameters, it either doesn’t work at all or has various issues (like latency and poor scalability).

Conversely, Resilio is a vendor-agnostic solution that you can use with just about any:

  • Device — such as desktops, laptops, file servers, NAS/DAS/SAN devices, mobile devices (Resilio offers iOS and Android apps), and IoT devices. 
  • Cloud storage platform — such as GCP, AWS, Azure, Backblaze, Wasabi, MinIO, and more. 
  • Operating system — such as Linux, Microsoft Windows, macOS, Unix, FreeBSD, OpenBSD, Ubuntu, and more.
  • Virtual machine — such as VMware, Citrix, and Microsoft Hyper-V.
Compatible third-party solutions

Put simply, you can deploy Resilio on your existing infrastructure at a low cost, as you don’t need to buy and manage new hardware or software. At the same time, you’re not limited to a particular vendor or ecosystem — you can freely move, sync, and cache data across multiple cloud providers, data centers, and storage types.

Our solution also acts as a storage gateway for unified, low-latency access to files stored anywhere. All end-users can be provided with the same view of files, regardless of whether they’re stored in the cloud or on-premises. This unified interface operates much like Microsoft OneDrive, making it very familiar and user-friendly for most people.

How to select the "Always keep on this device" option.

Lastly, Resilio offers plenty of ways to reduce costs associated with moving and accessing data. For example, our solution only syncs the changed portion of files, to minimize the amount of data that gets transferred.
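
To make the idea of syncing only the changed portion of a file concrete, here is a minimal Python sketch of block-level change detection using fixed-size blocks and SHA-256 hashes. This is a generic technique shown for illustration. The article does not describe Resilio's actual delta algorithm, and the block size and function names are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size so the example is easy to follow


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """SHA-256 hash of each fixed-size block of the data."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Indices of blocks in `new` that differ from `old` or do not exist in `old`."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [i for i, h in enumerate(new_h) if i >= len(old_h) or old_h[i] != h]


old = b"AAAABBBBCCCCDDDD"
new = b"AAAABXBBCCCCDDDDEE"   # one block edited, one block appended
print(changed_blocks(old, new))  # [1, 4]
```

Only blocks 1 and 4 would need to be transferred here; the other three blocks are already identical on both sides.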

Additionally, Resilio lets you use:

  • Selective caching to choose which files are stored on local devices, providing employees with faster access to the data they need and reducing data egress fees associated with downloading files.
  • Selective sync to select which specific files and folders sync to which endpoints. This ensures that files only sync to the destinations where they’re needed so you can reduce data transfer fees.
  • Flexible downloads to fully or partially download files and folders, so you can get quicker access to the files you need and reduce data egress costs.
  • Automation capabilities to automatically sync, cache, download, and purge any file based on the policies you set.
Operations Team: Resilio Connect Management Console

P2P Replication Architecture and File Chunking for Speed and Reliability

Resilio Connect is one of the very few file synchronization solutions that can sync in real time and in any direction. This is possible thanks to its P2P replication architecture that lets every endpoint in your environment share files directly with every other endpoint. 

This enables you to:

  • Achieve blazing-fast sync speeds.
  • Eliminate single points of failure.
  • Sync in any direction.
  • Scale organically.

The P2P architecture eliminates SPOFs (single points of failure). If any server or network goes down, Resilio can dynamically route around the outage and find the optimal path to deliver files to their destination. If a transfer is interrupted, Resilio can perform a checksum restart to resume the transfer where it left off and will retry all transfers until they’re complete.

Plus, Resilio uses file chunking to break files down into small chunks that can transfer independently of each other. For example, say you wanted to sync a file across five endpoints. Resilio could split that file into five chunks and each endpoint could work together to share them across your environment. Endpoint 1 can share the first chunk with Endpoint 2. Endpoint 2 could immediately share that chunk with any other endpoint, even before it receives the remaining four file chunks. 
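
The chunking idea above can be sketched in a few lines of Python: split a file into independently verifiable chunks, let them arrive in any order, and reassemble them once all are present. This is a simplified illustration of the general technique, not Resilio's implementation; the chunk count, hash choice, and function names are assumptions.

```python
import hashlib
import random


def split_into_chunks(data: bytes, n_chunks: int) -> list[tuple[int, bytes, str]]:
    """Split data into at most n_chunks pieces of (index, payload, sha256)."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division, at least 1 byte
    chunks = []
    for index, start in enumerate(range(0, len(data), size)):
        payload = data[start:start + size]
        chunks.append((index, payload, hashlib.sha256(payload).hexdigest()))
    return chunks


def reassemble(chunks: list[tuple[int, bytes, str]]) -> bytes:
    """Verify every chunk against its hash, then join them in index order."""
    for index, payload, digest in chunks:
        if hashlib.sha256(payload).hexdigest() != digest:
            raise ValueError(f"chunk {index} failed verification")
    ordered = sorted(chunks, key=lambda chunk: chunk[0])
    return b"".join(payload for _, payload, _ in ordered)


original = bytes(range(256)) * 40
chunks = split_into_chunks(original, 5)
random.shuffle(chunks)  # chunks may arrive from different peers in any order
assert reassemble(chunks) == original
```

Because each chunk carries its own hash, an endpoint can verify and forward a chunk as soon as it arrives, without waiting for the rest of the file.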

Soon every endpoint will be sharing file chunks, leading to transfer and sync speeds that are 3-10x faster than hub-and-spoke solutions.

P2P vs Client-Server architecture GIF

As we said, this also enables Resilio to sync in any direction, including:

  • One-way, for migrating to the cloud or backing up files to another location.
  • Two-way, for keeping two endpoints synchronized for multi-site collaboration.
  • One-to-many, for distributing software updates from one server to many endpoints at once. 
  • Many-to-one, for collecting data from multiple endpoints onto one server.
  • N-way, for keeping multiple (hundreds or even thousands) of endpoints synchronized simultaneously.

N-way sync is a particularly powerful capability of Resilio that gives it an advantage over competing solutions for use cases like:

  • Server sync: Resilio enables you to automate, visualize, and accelerate file delivery 10x faster across any server environment, regardless of how large and geographically distributed it is. You can synchronize hundreds of millions of files predictably across any network.
  • Remote and hybrid work collaboration: With N-way sync, employees across a number of remote endpoints can confidently collaborate on files in real-time. If an employee at any location makes a change to a file, that file change is immediately synchronized to every other branch office, so everyone always has the most up-to-date version of files.
  • Hot-site disaster recovery: Resilio provides each endpoint in your environment with the power of a data center or backup site. You can configure failover to or from any endpoint, enabling Active-Active High Availability. In the event of a disaster, every endpoint can work together to bring your application back online, allowing Resilio to achieve sub-five-second RPOs and RTOs within minutes of an outage.
Hot/Live DR: Multi-site Active/Active; Warm DR: Active/Active; Cold DR: Active/Passive; Offsite Copy: Backup Copy

Finally, Resilio’s P2P architecture overcomes another one of knfsd’s key downsides — scalability. Our solution is organically scalable, so it performs better as you add more endpoints. As a result, it can sync 200 endpoints in roughly the same time it takes most hub-and-spoke solutions to sync just two.
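
A toy model helps show why peer-to-peer distribution scales differently from hub-and-spoke. In the Python sketch below, one "round" is the time it takes one endpoint to send a full copy to one other endpoint. In hub-and-spoke only the source sends, while in P2P every endpoint that already has the file can send too. This is an idealized model with assumed equal bandwidth everywhere; it ignores chunk-level pipelining and is not a measurement of Resilio or any other product.

```python
def hub_and_spoke_rounds(receivers: int) -> int:
    """Only the source uploads, one receiver per round."""
    return receivers


def p2p_rounds(receivers: int) -> int:
    """Every endpoint holding the file uploads, so holders double each round."""
    holders, rounds = 1, 0          # start with the source only
    while holders < receivers + 1:  # source plus all receivers
        holders *= 2
        rounds += 1
    return rounds


for n in (2, 20, 200):
    print(f"{n} receivers: hub-and-spoke {hub_and_spoke_rounds(n)} rounds, "
          f"P2P {p2p_rounds(n)} rounds")
```

In this model, hub-and-spoke time grows linearly with the number of endpoints, while P2P time grows only logarithmically.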

Proprietary WAN Optimization Protocol for Predictable Transfers across Any Network

Unlike many standard file transfer and sync solutions, Resilio doesn’t rely on TCP to transfer and sync files over long distances. Instead, our solution uses a proprietary, UDP-based WAN acceleration protocol known as Zero Gravity Transport™ (ZGT).

ZGT enables you to move and sync data across geographically distributed data centers and cloud regions, while overcoming the impact of latency and packet loss. This is a massive benefit over knfsd, which performs poorly even with minimal latency.

ZGT minimizes the impact of latency and allows you to fully utilize any network, including VSAT, Wi-Fi, cell (3G, 4G, 5G), and any IP connection. It accomplishes this using:

  • A congestion control algorithm that constantly probes the RTT (Round Trip Time) to calculate and maintain the ideal data packet send rate. This enables it to maintain a uniform packet distribution over time.
  • Interval acknowledgments, which means it sends acknowledgements for groups of packets, rather than acknowledging each packet receipt.
  • Delayed retransmission, which means it retransmits lost packets in groups once per RTT (rather than immediately retransmitting each packet) to reduce unnecessary retransmissions.
Cross office server sync calculator
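
Interval acknowledgments and grouped retransmission, as described in the list above, can be illustrated with a small Python sketch. The receiver collapses the sequence numbers it has seen into ranges and acknowledges them together, and the sender works out which packets to retransmit as one batch. ZGT is proprietary, so this shows only the general idea; the function names and data layout are assumptions for the example.

```python
def ack_ranges(received: set[int]) -> list[tuple[int, int]]:
    """Collapse received sequence numbers into inclusive (start, end) ranges."""
    ranges: list[tuple[int, int]] = []
    for seq in sorted(received):
        if ranges and seq == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], seq)  # extend the current range
        else:
            ranges.append((seq, seq))          # start a new range
    return ranges


def missing_packets(sent_up_to: int, acked: list[tuple[int, int]]) -> list[int]:
    """Packets 0..sent_up_to that no acknowledged range covers."""
    covered: set[int] = set()
    for start, end in acked:
        covered.update(range(start, end + 1))
    return [seq for seq in range(sent_up_to + 1) if seq not in covered]


received = {0, 1, 2, 4, 5, 8, 9}
acks = ack_ranges(received)
print(acks)                      # [(0, 2), (4, 5), (8, 9)]
print(missing_packets(9, acks))  # [3, 6, 7]
```

Three ranges acknowledge seven packets, and the three missing packets can be resent together rather than one at a time.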

Put simply, ZGT makes Resilio ideal for ingesting, transferring, caching, and syncing data across all locations, especially those located at the far edge with poor network connectivity.

Centralized Management and Automation

Resilio offers a Central Management Console that you can use to manage your entire environment, even if it spans multiple storage types, cloud providers, and data centers. The Console, which can be accessed from any web browser, gives you control over all aspects of data caching, transfer, and synchronization by allowing you to:

  • Create, manage, and monitor jobs.
  • Review a history of all executed jobs.
  • Monitor real-time metrics and reports on active jobs.
  • Adjust replication parameters, such as disk I/O, data hashing, file priorities, syncing metadata, and more.
  • Create set-it-and-forget-it automation policies that control how data is synchronized, cached, purged, and downloaded — such as purging files from a cache after a certain number of days.
  • Adjust bandwidth at each endpoint manually or create profiles that govern how much bandwidth is allocated to each endpoint at certain times of the day or on certain days of the week.
Edit bandwidth schedule 'default'
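
A bandwidth profile of the kind described in the last bullet is essentially a lookup from day and time to a limit. The Python sketch below models that lookup. The rule format, limits, and names are hypothetical and chosen for illustration; they are not the Management Console's actual configuration format.

```python
from datetime import datetime

# Hypothetical profile: (weekdays, start_hour, end_hour, limit_mbps); first match wins.
# Weekdays follow datetime.weekday(): Monday is 0, Sunday is 6.
SCHEDULE = [
    ({0, 1, 2, 3, 4}, 9, 18, 50),  # business hours on weekdays: throttle
    ({5, 6}, 0, 24, 1000),         # weekends: high limit all day
]
DEFAULT_LIMIT_MBPS = 500           # applies when no rule matches


def bandwidth_limit(now: datetime) -> int:
    """Return the bandwidth limit in Mbps that applies at the given time."""
    for weekdays, start_hour, end_hour, limit in SCHEDULE:
        if now.weekday() in weekdays and start_hour <= now.hour < end_hour:
            return limit
    return DEFAULT_LIMIT_MBPS


print(bandwidth_limit(datetime(2024, 1, 1, 10)))  # Monday morning
print(bandwidth_limit(datetime(2024, 1, 6, 3)))   # Saturday night
```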

Alternatively, you can manage Resilio from the command line if you prefer. Plus, you can use Resilio’s powerful REST API to script and automate any functionality that your workflow requires. This is a great way to increase productivity and minimize management time.

Resilio scripts provide three types of triggers:

  • Before a job starts.
  • After a job completes.
  • After all jobs complete.
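
The ordering of these three triggers can be pictured with a small Python sketch. It is a generic hook runner written for illustration only, with hypothetical names throughout; it does not use or represent Resilio's scripting interface or API.

```python
from typing import Callable


def run_jobs(jobs: dict[str, Callable[[], None]],
             before_job: Callable[[str], None],
             after_job: Callable[[str], None],
             after_all: Callable[[], None]) -> None:
    """Run each job with a hook before and after it, then one final hook."""
    for name, job in jobs.items():
        before_job(name)   # trigger 1: before a job starts
        job()
        after_job(name)    # trigger 2: after a job completes
    after_all()            # trigger 3: after all jobs complete


events: list[str] = []
run_jobs(
    {"sync-a": lambda: events.append("run sync-a"),
     "sync-b": lambda: events.append("run sync-b")},
    before_job=lambda name: events.append(f"before {name}"),
    after_job=lambda name: events.append(f"after {name}"),
    after_all=lambda: events.append("after all"),
)
print(events)
```

A "before" hook is a natural place to check free disk space or pause a dependent service, and the "after all" hook suits notifications or cleanup.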

For a real-life example of Resilio’s centralized management and automation, check out this case study on our website. This marine construction company used to rely on Microsoft SCCM to distribute updates to their vessels. However, the solution required hours of management time and was unreliable due to frequent network disconnects, leading to many vessels being years behind in key security updates.

With Resilio, the company now distributes updates to each vessel’s central server, after which they’re distributed to each workstation onboard over the LAN. The whole process is highly reliable and can be remotely monitored by the IT team to ensure all systems are fully patched.

Global headquarters: How Resilio syncs servers from remote vessels

Native Enterprise-Grade Security

Knfsd relies on setting up a VPN for security. It’s also a small, unsupported project, which can further compromise its reliability and your data’s protection if you don’t invest in separate security software. 

Resilio, on the other hand, comes with native security features, which are reviewed by 3rd-party security experts. Some of these features include:

  • End-to-end data encryption: Resilio encrypts data at rest and in transit using AES-256 encryption.
  • Integrity validation: Resilio uses cryptographic data integrity validation to ensure files arrive at their destination intact and uncorrupted.
  • Granular permission controls: You can control who is allowed to access specific files and folders, regardless of where they’re stored.
  • Forward secrecy: Each session is protected with a one-time session encryption key.
  • Mutual authentication: Before initiating a transfer with any endpoint, the endpoint is required to provide an authentication key. This ensures your data is only delivered to approved destinations.
Mutual Authentication: Data is only delivered to designated endpoints; In-Transit Encryption: Data can't be intercepted or hacked; Integrity Validation Process: Ensures data remains intact
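
Integrity validation in general works by comparing a cryptographic digest computed at the source with one computed at the destination. The Python sketch below shows that comparison using SHA-256 from the standard library. The article does not specify which algorithm or scheme Resilio uses, so treat the hash choice and function names as assumptions.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the data."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_digest: str) -> bool:
    """True if the received data matches the digest computed at the source."""
    # compare_digest performs a constant-time comparison
    return hmac.compare_digest(sha256_digest(data), expected_digest)


payload = b"frame_0001.exr contents"
digest_at_source = sha256_digest(payload)

print(verify(payload, digest_at_source))            # True
print(verify(payload + b"\x00", digest_at_source))  # False
```

A plain digest detects corruption in transit; on its own it does not prove who sent the data, which is why it is paired with authentication and encryption.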

Use Resilio for Fast and Reliable File Caching, Access, and Synchronization across All Locations

Resilio is an ideal knfsd alternative for teams looking for a globally supported, high-performance, ultra-low latency solution. Its unique benefits enable it to overcome knfsd’s limitations and give teams the freedom to cache, access, and sync data across all locations while overcoming latency and working with their choice of storage and OS (not just Linux).

Put simply, Resilio is:

  • Fast and reliable: Our solution uses a P2P replication architecture that eliminates single points of failure and provides fast synchronization. It also uses a proprietary WAN acceleration protocol to optimize transfers over any network regardless of quality, latency, or packet loss.
  • Versatile: You can deploy Resilio agents on any on-premise storage device, cloud storage platform, and operating system. This means you’re not limited to GCP or Linux but can instead freely cache, access, and move data across other clouds (e.g., AWS, Azure, or Backblaze), on-prem environments, hybrid cloud setups, and much more.
  • Efficient: You can install Resilio directly on your existing infrastructure, so there’s no need to invest in new hardware, multiple gateways, or security solutions. You can also increase productivity and minimize cloud costs with features like selective sync, local caching, and partial downloads.
  • Easy to manage and monitor: Resilio’s Management Console gives you granular control over how files are cached, replicated, and accessed in your environment, even in hybrid- and multi-cloud scenarios. You can also set up and monitor new jobs without writing code (although you can always script any functionality that your workflow requires).
  • Secure by default: Resilio includes built-in security features that protect your data at rest and in transit. You can be sure that your data is always secure, without having to invest in 3rd-party security tools and VPNs.

Organizations use Resilio Connect to cache, sync, and access data for media workflows (Turner Sports, Innovative), gaming (Wargaming, Larian Studios), remote operations (Mercedes-Benz, Buckeye Power Sales), and more. If you want to learn how Resilio Connect can help your business, as well, schedule a demo with our team.
