Lsyncd stands for Live Syncing Daemon. I’m curious why I’ve never seen a bumper sticker that says Live Sync or Daemon, or a live syncing logo similar to Beastie, the BSD daemon. Lsyncd may not be that popular, but it does serve a purpose.
Lsyncd was first released in 2008 by Mirko Vogt, and it has since been developed and maintained by a community of contributors on Github. Live syncing refers to the process of listening for and relaying file and directory changes on a source server (sometimes referred to as a master) to another program like Rsync (or Rsync-ssh), which handles the actual sync between systems. This is done through scripting commands in an Lsyncd configuration file (lsyncd.conf).
The good news is that Lsyncd provides a basic, automated (via scripting), one-way, semi-current (within a minute or so) sync capability. Lsyncd propagates changes from one Linux system to another to keep the remote (target) server in sync with the local (source) server. So if that type of mirror solution for Linux is what you’re looking for, Lsyncd may be worth a look.
The bad news is that you’ll be building, scripting, troubleshooting, and supporting Lsyncd yourself. Also, it is not a good fit for large-scale deployments. You can only sync one-way from a source to a target between 2 servers.
Other challenges with Lsyncd include:
- Poor management and diagnostic tools: If anything goes wrong in your script such as a loop, there’s no easy way to diagnose or troubleshoot what’s happening. Log files may become your best friends.
- Poor scalability: Short of adding threads or stuffing more vRAM or vCPU in a box, Lsyncd won’t scale beyond a single system. It was designed to replicate changes from one system to another.
- Limited applicability: Got big files or many files? Need to sync to Windows or Mac? Need to sync two-way or N-way? Or to more than a single endpoint? Good luck.
- SPOF: The design, like many traditional point-to-point file sync tools (pick your vendor) is a single point of failure. If the source or target goes down, your sync job goes down with it.
In summary, given the above caveats, if you’re a Linux guru and you like scripting, don’t have a business critical project, and don’t need to sync larger data sets for file systems, then Lsyncd may suffice. You’ll want to be comfortable editing configuration files (lsyncd.conf contains the Lsyncd configuration), writing scripts that call Rsync or rsync-ssh, and running commands like lsyncd -nodaemon to test them in the foreground. And troubleshooting them.
Near-Real-time Sync for Linux
Near-real-time refers to synchronizing changes as fast as possible: this could be within seconds or within a reasonable time-frame, based on your “real-time” requirements. One roll-your-own approach to this for Linux is Lsyncd + another sync tool like Rsync.
The variety of programs that can work with Lsyncd is described in the Lsyncd docs on GitHub. You can clone the source code from axkibe/lsyncd (git clone https://github.com/axkibe/lsyncd.git) to your Linux environment. From there you can cd into lsyncd and build a package, for example with makepkg and sudo. The GitHub repository has specific build instructions for your OS.
Another approach, in stark contrast, and enterprise-capable: Resilio Connect. Real-time can be configured and intervals set to near zero on the peers. The solution runs on Linux but is also cross-platform and runs on most popular operating systems. Resilio Connect is by far the fastest and most scalable offering on the market; so it’s on the high end when considering lsyncd and rsync alternatives.
The Resilio management console and Resilio agents run on a variety of Linux distros. Plus you get built in automation (also exposed via scripting and APIs) that scales the real-time synchronization process to as many endpoints as you can deploy. We have customers automatically synchronizing many thousands of servers in real-time. And reliability combined with centralized management make deploying files a hands-free operation.
Keep reading to learn about the pros and cons of Lsyncd (plus rsync or rsync-ssh) and how it compares to Resilio Connect. If you’d like to see a live sync Resilio demo, please get in touch with us.
Lsyncd + Rsync or Rsync-SSH
You can use Lsyncd in combination with Rsync or Rsync-ssh to mirror one Linux server to another. The basic gist is that the Lsyncd service runs on the source host. Lsyncd listens for file and directory change notifications on the source server using inotify or FSEvents, collects the changes, and then feeds them to a sync engine like rsync or rsync-ssh, which performs a one-way replication to the remote server.
Rsync is a well-established and widely used tool for one-way file synchronization. Unlike Lsyncd, Rsync does not use inotify; on its own it has to be scheduled or scripted to run. Rsync can be called by Lsyncd to sync the changes to the remote server. Lsyncd lets you set how long it collects changes before acting through the delay setting in the lsyncd.conf configuration file (the separate statusInterval setting controls how often the status file is written). These changes are passed on to Rsync or Rsync-ssh.
Rsync has a differential sync engine that sends only the changed portions of a file, which makes it an efficient transfer back end for Lsyncd when synchronizing two systems. But Rsync is also one-way (unidirectional) and does not scale well to transferring large files or many files, especially over slow or unreliable network connections. Rsync does not offer much control over the sync process, but there are at least two ways to check rsync progress.
A command for invoking Rsync from Lsyncd might look something like:
lsyncd -rsync /source your-remote-server.com::target/
This will sync a source directory on server 1 with a target directory on server 2; you can use mkdir to create any source and target directory (set targetdir) you need. Should you use Lsyncd with rsync or rsync-ssh? Lsyncd plus rsync-ssh may be the better option if you need to tunnel through firewalls. That said, Rsync has an optimized differential sync engine making it faster for synchronizing files on the same network or LAN between 2 systems.
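The one-line command above can also be written as a configuration file. The following is a minimal sketch, assuming the Lsyncd 2.1-or-later config syntax (`sync { default.rsync, ... }` and `sync { default.rsyncssh, ... }` with `host` and `targetdir`). The host name, the paths, and the scratch directory are placeholders for illustration, and the script only writes the files and prints the command; it does not start Lsyncd.

```shell
#!/bin/sh
# Sketch: write two alternative Lsyncd configs to a scratch directory.
# Host name and paths are placeholders; adjust them before real use.
conf_dir=$(mktemp -d)

# Variant 1: plain rsync to an rsync daemon module
# (the config-file form of `lsyncd -rsync /source host::target/`).
cat > "$conf_dir/rsync.conf.lua" <<'EOF'
sync {
    default.rsync,
    source = "/source",
    target = "your-remote-server.com::target/",
}
EOF

# Variant 2: rsync over SSH; targetdir is the directory on the remote host.
cat > "$conf_dir/rsyncssh.conf.lua" <<'EOF'
sync {
    default.rsyncssh,
    source = "/source",
    host = "your-remote-server.com",
    targetdir = "/target",
}
EOF

# Print (do not run) the command that would start Lsyncd in the foreground.
echo "lsyncd -nodaemon $conf_dir/rsyncssh.conf.lua"
```

Running in the foreground with -nodaemon keeps log output on the terminal, which tends to make a first test easier to follow than a daemonized run.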
Bottom line is, you will need to be familiar with those tools as well. As an rsync alternative, according to the GitHub documentation, Lsyncd plus rsync-ssh does a better job handling file renames and file moves. Rsync on its own, by contrast, has to retransmit entire files that were moved or renamed.
With Lsyncd, you can kick off scripts using Lua, a lightweight, high-performance programming language that is commonly used for scripting and embedding in other applications. It’s designed to be easy to learn, easy to use, and easy to embed. It has a small footprint, efficient memory management, and a simple and consistent syntax. Lua is also considered fast: the reference implementation compiles scripts to bytecode for a compact virtual machine, and a separate implementation, LuaJIT, adds a just-in-time (JIT) compiler that compiles Lua code to machine code. Lua is often used in game engines, embedded systems, and automation systems, as well as some other applications and platforms.
Through this framework, Lsyncd provides:
- Near-real-time file monitoring: As stated earlier, Lsyncd uses inotify to monitor file changes in a local directory and relay them toward a target directory. Lsyncd detects changes as they happen, then collects them for a configurable delay (a few seconds, a minute, 5 minutes, or whatever value you set in the conf file) before syncing. If set to a low value, this allows for near-instantaneous file synchronization. Yet network latency, file size, number of files, and other factors play a role in the level of “real-time”.
- Incremental file transfers: Lsyncd can call rsync to transfer files incrementally, where only the delta or change portion of the file is transferred. This means that file and directory changes can be sent efficiently between 2 systems, instead of having to send the entire file. That said, there may be problems sending incremental deltas of larger files.
- Include/exclude: Like regex, Lsyncd allows for the use of filters to exclude files or directories from sync jobs. Most contemporary sync solutions support this, but this is useful in environments where a subset of files within a directory tree need to be included or excluded from the transfer or synchronization job.
- Hub-and-trickle distribution: In a hub-and-spoke topology, Lsyncd can be configured to synchronize files to multiple locations sequentially, making it useful for creating backups or keeping multiple servers in sync. But there are caveats with hub-and-spoke. Each spoke pairs with the hub; thus, changes are only propagated between the hub and the one spoke. This is by no means real-time nor scalable, but may suffice for basic file backup use cases.
- Scripting: Lsyncd is written in Lua, a lightweight, high-level scripting language that is commonly used for complex scripting and embedding in other applications. This allows for custom scripts and actions to be executed before and after synchronizing files. When multiple, interdependent programs need to be run, Lua provides an efficient framework for that.
- Multiple operation modes: Depending on your use case, Lsyncd can operate in different modes like mirroring, propagating (basic file copy), or one-way sync. It can also use different methods of offloading the synchronization process like rsync or rsync-over-ssh.
- Logging: Lsyncd provides logging functionality, for example in /var/log/lsyncd/lsyncd.log and /var/log/lsyncd/lsyncd.status, and can output the logs to a file, syslog, or both. It also provides debugging options that can be used to troubleshoot any errors that may occur.
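Several of the items above (logging, the batching delay, exclude filters, and sequential hub-and-spoke distribution) come together in the configuration file. The sketch below assumes the Lsyncd 2.1-or-later syntax with `settings { ... }`, `statusInterval`, `delay`, and `exclude`. The spoke host names, paths, and chosen values are hypothetical, and the script only writes the file to a temporary location.

```shell
#!/bin/sh
# Sketch: one config covering logging, batching delay, excludes, and several targets.
# All host names, paths, and values are illustrative assumptions.
conf=$(mktemp)

cat > "$conf" <<'EOF'
settings {
    logfile = "/var/log/lsyncd/lsyncd.log",
    statusFile = "/var/log/lsyncd/lsyncd.status",
    statusInterval = 20, -- seconds between status file writes
}

-- Hypothetical spoke hosts; each one gets its own one-way sync from this hub.
local spokes = { "spoke1.example.com", "spoke2.example.com" }

for _, host in ipairs(spokes) do
    sync {
        default.rsyncssh,
        source = "/source",
        host = host,
        targetdir = "/target",
        delay = 5,                       -- seconds to collect events before syncing
        exclude = { "*.tmp", "cache/" }, -- rsync-style exclude patterns
    }
end
EOF

echo "wrote $conf"
```

Because the configuration is ordinary Lua, the loop simply declares one sync per spoke; each spoke is still an independent point-to-point job from the hub.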
Challenges with Lsyncd
Watch out for:
- Error handling and troubleshooting: Scripts can have errors. There’s no centralized management or monitoring system to diagnose these. Errors are kept in a logfile (lsyncd.log) as they arise.
- Conflicting changes: Although rare, conflicts may arise. This can happen when the same file is updated on multiple servers at the same time, leading to a conflict that needs to be resolved. Lsyncd can be configured to use a conflict resolution strategy, but it can still be a time-consuming and complex process to resolve conflicts. By contrast, Resilio Connect provides reliable bidirectional sync and conflict avoidance through priority peers, read-only peers, and file rename direction to both avoid conflicts and preserve data integrity.
- Looping syncs: When multiple servers are synced one- or two-ways, there is a risk of creating an infinite loop of syncs, where the changes are passed around endlessly between the servers. This can be hard to diagnose as well as lead to a high load on the servers and affect the performance. This can happen if servers are configured to sync with each other, rather than having a master-slave or hub-spoke configuration.
- Performance issues: Sync jobs and other transfers may not scale well on Lsyncd, especially when syncing large files or many files at once. This can lead to slow transfer speeds, high CPU usage, and high memory usage. It can also put pressure on network resources and create bottlenecks.
- Single points of failure (SPOFs): Every system running Lsyncd is a single point of failure. If either a source or target system fails, or the network connecting the two systems fails, the entire transfer and synchronization process fails.
- Security risks: With one- or two-way syncing, there is an increased risk of unauthorized access or data breaches, especially if the servers are accessible from the internet or if Lsyncd is configured to sync files using plain Rsync without SSH, which does not encrypt the data in transit.
- Lack of versioning: Lsyncd does not provide versioning capability by default. So when conflict occurs, the last synced file will overwrite the older version. This can lead to data loss or corruption in cases where a previous version was needed.
- Lack of scalability: As the number of servers increases, the complexity and resources needed for the syncing increase. Lsyncd can only sync between 2 servers at once. And larger file systems (directory trees) containing larger files and larger numbers of files will bring it to its proverbial syncing knees. Lsyncd will not scale to environments with frequently changing files or larger data sets that need to be kept current across multiple servers or locations.
- Syncing large files or large numbers of files: File sizes, number of files, number of servers, and complexity of the environment are all worth consideration before rolling out Lsyncd.
- Not cross platform: Lsyncd is for Linux only.
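One concrete way the scaling concern shows up on Linux is the inotify watch limit: inotify needs one watch per directory, so a very large directory tree can exceed the kernel's per-user limit. The sketch below counts the directories under a source tree and compares that number to `/proc/sys/fs/inotify/max_user_watches`. The function names are hypothetical, and the demo tree is a throwaway created only for illustration.

```shell
#!/bin/sh
# Sketch: estimate whether a source tree fits within the inotify watch limit.
# inotify needs one watch per directory, so count the directories under the source.
count_dirs() {
    find "$1" -type d | wc -l | tr -d ' '
}

check_watch_budget() {
    src=$1
    limit_file=/proc/sys/fs/inotify/max_user_watches
    dirs=$(count_dirs "$src")
    if [ -r "$limit_file" ]; then
        limit=$(cat "$limit_file")
        echo "directories: $dirs, max_user_watches: $limit"
        # Succeeds only when the tree fits under the limit.
        [ "$dirs" -lt "$limit" ]
    else
        echo "directories: $dirs, watch limit not readable on this system"
    fi
}

# Example with a throwaway tree of three directories (root, a, a/b).
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
check_watch_budget "$demo"
```

If the count gets close to the limit, the limit can usually be raised with sysctl, at the cost of more kernel memory held for watches.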
If speed, reliability, and performance are needed, you may want to consider an Lsyncd alternative like Resilio Connect. As both an Lsyncd and Rsync alternative, Resilio gives you complete automation to keep many servers in sync–in real-time–for any type of payload, across as many locations as needed.
Resilio Connect offers centralized management where the entire real-time sync process can be controlled, tracked, and monitored across as many servers as you need to manage. If there’s a failure anywhere in the network, Resilio’s resilient peer-to-peer solution routes around the failure. In fact, there is no single point of failure in the Resilio solution. If something does happen, it’s easy to diagnose and track down exactly what went wrong, vs having to sift through status files or log files on Lsyncd.
The Resilio management console can run natively on Linux (virtual machines, containers, or any system) and be located anywhere. Resilio agents are installed on each managed system, or “peer” in Resilio vernacular. So if you have a handful of Linux systems, and another handful of Windows and Mac systems, they all can be managed through a single pane of glass—or via scripting and APIs.
Another big difference between Lsyncd, rsync, and Resilio Connect is scalability: you can synchronize files across many systems concurrently (from a few to many thousands of servers)–and do this in about the same time as it takes to sync two servers. So comparing Connect to Lsyncd is not apples to apples. (Note: Resilio Connect is very different from Resilio Sync, an entry-level product designed for syncing fewer files across (at most) two servers and is designed for individuals and workgroups.)
Lsyncd and rsync (and other real-time sync tools) are point-to-point, meaning they can only sync between 2 servers at once, or via a hub-and-spoke. Resilio Connect, by contrast, is peer-to-peer: it can be set up in almost any topology and sync across many servers (or desktops) in parallel. Resilio Connect is not just real-time but can sync in parallel across as many systems as needed. If your apps, users, or workflows have larger file sets, and need their files to be available no matter where they are located, Resilio Connect is worth a look. Files can be of any size and type and each job scales easily to 250+ million files per job. The entire sync process is optimized for speed and efficiency: from picking up the changes on the peers to real-time synchronization across multiple peers concurrently. Automation is provided through the UI, via scripting, and via APIs. Customers rave that Resilio “just works” and they don’t have to do much once it’s operational.
Automation is the name of the game with Resilio. Scripts can be run before or after jobs complete. If you’re familiar with Linux the environment will be familiar; sync jobs can also be automated through the UI and via APIs. If you’re in DevOps, you can use any CI/CD pipeline but automate the file distribution and sync across as many locations and target endpoints as needed. A key consideration for Resilio is the ability to always get the changes where they need to be—no matter how many servers you have, how challenging the network, or how large your files are. Resilio offers a centralized management environment that makes it easy to track, monitor, and control file synchronization and other jobs, like distribution, consolidation, and scripting.
One-way Sync on Linux
With Lsyncd + Rsync your sync job can only be one-way (unidirectional) between 2 systems. Theoretically you can configure this to go two-way, but Lsyncd was designed for one-way sync: to listen for file changes on a local host server and relay those to a remote server. According to Axel Kittenberger on GitHub, master-master mode is not officially supported due to unwanted behavior, such as loops; it was simply not designed for 2-way sync.
Bidirectional or Multidirectional Sync on Linux
Resilio Connect provides reliable two-way (bidirectional) real-time sync for Linux (or another OS) that works across as many servers and locations as needed. Connect supports real-time one- or two-way sync as well as one-to-many, many-to-one, or many-to-many (what we call N-way sync).
Some other advantages of Resilio Connect over Lsyncd include:
- Efficiency: Sync many files of any size and type in real-time in any direction: one- or two-way to one-to-many, or many-to-one, or many-to-many. Resilio automatically syncs files in parallel across as many servers or endpoints as you can throw at it. Some customers synchronize millions of files across hundreds of servers at once.
- Reliability: Resilio is resilient and designed to overcome failures; there’s no SPOF. Resilio’s peer-to-peer engine dynamically routes around failures. A failure could be in the network or a server could be down. Or an entire site down. Files always get to where they need to be, on time and in real-time.
- Extensible automation: All tasks and data movement jobs (distribution, ingest, synchronization, and scripting) can be automated. You can either use the web UI, scripting, or extensible API set. Resilio Connect offers easy scripting by embedding scripts to run before, during, or after sync jobs.
- Ease of use: Resilio Connect has a user-friendly web interface that makes it easy to set up and manage file syncing across multiple servers. In contrast, Lsyncd requires a certain level of technical expertise to configure and maintain.
- Real-time: Resilio supports inotify and other file system change journals to achieve sub-five-second change notifications and replication across many endpoints concurrently. This can also be set based on your needs. Lsyncd defaults to 1 minute change notification intervals. While this can be configured in lsyncd.conf, the likelihood of picking up changes across larger directory trees containing larger numbers of files has not been validated on Lsyncd.
- High performance: Resilio Connect uses a distributed, peer-to-peer architecture for file syncing, which allows for faster transfer speeds and more efficient use of network resources compared to traditional client-server architectures. Lsyncd is point-to-point and while it uses inotify and rsync, it can only sync between 2 servers. Resilio is 10-100x faster than Lsyncd, depending on the network, file sizes, numbers of files, and latency. The more challenging the environment, the bigger the boost you’ll likely see with Resilio.
- Scalability: Resilio Connect is designed to handle files of any size and type, as well as large numbers of files and servers, making it suitable for enterprise-scale file syncing. Lsyncd, on the other hand, is better suited for smaller-scale projects containing fewer numbers of smaller files. Resilio has internally validated synchronizing 250+ million files per sync job but that is not a design limit.
- Security: Resilio includes built-in security features such as end-to-end encryption and role-based access control to protect data during transmission and at rest. There is no reliance on 3rd-party security solutions; Lsyncd does not include these features out of the box, but can be configured to use third-party tools for encryption and authentication.
- Cloud-friendly: If for any reason you need to extend sync jobs to the cloud, Resilio Connect supports cloud storage out of the box (object storage, file storage, and block storage).
Best Practices for Lsyncd and Linux
Here are some best practices for using Lsyncd. Please note that an Lsyncd alternative like Resilio Connect should be considered for real-time sync in larger deployments, server sync, or web and app server sync; pretty much any scenario where you’ll be synchronizing larger files, many files, or a large number of servers. With that in mind, please consider the following for Lsyncd:
- Test and benchmark Lsyncd before deployment: Make sure your scripts are right and do some basic performance testing to ensure adequate sync performance. This includes testing and measuring IOPS for each host system, CPU and memory usage, and overall stability of the software.
- Use a central Lsyncd configuration file: It’s important to use a central configuration file that is shared among all other systems. This allows for easy management of Lsyncd settings and ensures that all servers are configured consistently.
- Use rsync or rsync-over-ssh: Lsyncd can use either rsync or rsync-over-ssh to synchronize files. Rsync-over-ssh provides an additional layer of security by encrypting the data being transferred.
- Monitor and troubleshoot Lsyncd: Keep your log files handy. It’s important to monitor Lsyncd and troubleshoot any issues that may arise. This includes monitoring the log files, checking the sync status, and using debugging options to troubleshoot any errors.
- Limit the number of files and directories to sync: Lsyncd can be configured to sync specific files and directories, where not all files and directories need to be synchronized. This can help to reduce the load on the servers and improve performance.
- Use SSH key-based authentication: Using SSH key-based authentication for Lsyncd is generally more secure than password logins. With this method, no password is sent to the server, and the private key file can be protected with a passphrase, which is still more secure than storing passwords in plain text.
- Use file filters: As mentioned earlier, Lsyncd allows you to use filters to exclude files or directories from sync. This feature can be useful to limit the number of files and directories.
It’s worth noting that some of the above, such as monitoring, troubleshooting, and the use of SSH keys, are not specific to Lsyncd but are general best practices for managing a large number of servers. They can still help to ensure that Lsyncd is deployed and managed in a secure and efficient manner.
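As a small illustration of the monitoring practice, the sketch below counts log lines that mention an error so the result can feed a cron job or an alert. It assumes error lines contain the word "error" in some letter case, which is an assumption about the log format rather than something taken from the Lsyncd documentation. The function name is hypothetical, and the sample lines are made-up placeholders, not real Lsyncd output.

```shell
#!/bin/sh
# Sketch: a cron-friendly check that counts error lines in a log file.
# Assumes error lines contain "error" in any letter case (an assumption).
count_log_errors() {
    # grep -c prints the number of matching lines; "|| true" keeps the
    # exit status at zero when there are no matches.
    grep -ci 'error' "$1" || true
}

# Demo against a made-up placeholder log (not real Lsyncd output),
# so the check can be tried without a running daemon.
demo_log=$(mktemp)
cat > "$demo_log" <<'EOF'
placeholder: sync started
placeholder: error contacting target
placeholder: sync finished
EOF

errors=$(count_log_errors "$demo_log")
echo "error lines: $errors"
```

In practice the function would be pointed at the log file named in the Lsyncd settings, and a non-zero count would trigger whatever notification your environment uses.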
Dealing with the Oops factor
Dealing with conflict resolution and other problems that may arise is non-trivial. How are conflicts resolved when a user in one location may stomp on an operation while another user is accessing a file?
In the Resilio model, file sync conflicts are avoided by following best practices. These include setting priority nodes (certain peers will win if a file operation is in conflict), read-only nodes, and storing renamed files for a given period before the file in conflict is deleted. File renaming conflicts have their own set of best practices.
Lsyncd picks up file system events such as creating, modifying, or deleting files, but renames and moves need more care. With plain rsync, a rename or move is typically handled as the deletion of the old path and the creation of the new one, so the file is sent again; the rsync-ssh mode can instead carry out the move on the target. Where the delete-and-create workaround is used, Lsyncd treats the pair of events as a rename or move operation and synchronizes the target file accordingly. However, this solution requires that the option “nofollow” is set to true in the configuration.
In terms of migrating files, Resilio Connect is POSIX- and NTFS-compliant and can migrate metadata with each file and directory in the sync process. This works a little differently for Linux than for Windows / NTFS. When you want to migrate permissions, this is set as an option with each sync job. So if you ever have to access a file copy, ownership and permissions are maintained.
In the Resilio model, file integrity is maintained end-to-end for files at rest and inflight across all endpoints involved in the transfer or sync process.
When to consider Lsyncd alternatives
If you’re fine with Lsyncd as it is, you may have stopped reading by now. Lsyncd is designed to propagate changes one-way: from 1 Linux host to 1 other Linux remote backup system.
Resilio Connect can do that, but so can Resilio Sync, an entry-level Resilio product. Resilio Connect is designed for larger-scale scenarios where IT or DevOps users need to rapidly, automatically, and efficiently distribute and sync files across a large number of servers (or other devices) in real-time or on a schedule. At least that’s why many enterprise customers use Resilio. Many of those customers are on Linux and are syncing large numbers of Linux servers.
Resilio is also well suited for use cases such as:
- File server and NAS sync: File servers (either Windows or Linux based) and NAS (often Linux or BSD based) are used to store and share files, and are typically used in organizations that have multiple employees or locations. Synchronizing file servers ensures that all users have access to the latest files and reduces the risk of data loss or inconsistencies. Resilio enables fast, reliable, and scalable real-time sync for as many file servers as needed. Optionally, for those on Windows, Resilio offers DFS replication via a turnkey DFSR replacement for DFS namespaces.
- Web server deployment, distribution, and sync: Multiple web servers (NGINX, Node.js, Apache, IIS) are typically used to ensure high availability and fast response times. Synchronizing many web servers in real-time ensures that all servers have the latest content and reduces the risk of downtime or errors. They can also be rapidly load balanced and protected. Resilio Connect offers a low latency, real-time automated deployment solution for web servers (to sync content, code, and apps) where you can publish once (to a single source of truth) and synchronize to many, in parallel.
- App server sync: App servers and their hosted applications may need to be rapidly deployed or updated to ensure all files (build files, containers, or other code and apps) are synchronized. Resilio makes it easy to automate distribution and sync across many servers at once, which is typically done to ensure high availability and fast response times for application users. Synchronizing app servers ensures that all servers have the latest application code and reduces the risk of downtime or errors. Resilio ensures all app servers are in sync in under five seconds and that they can be easily set up for active-active high availability to shorten recovery times and simplify failover.
- Disaster Recovery: If you need real-time replication for disaster recovery to meet faster RTOs and RPOs measured in seconds, Resilio Connect gives you the speed and flexibility you need. You can have as many sites and locations as needed.
- WAN Optimization: It’s worth noting that Resilio Connect can be used to synchronize multiple servers across multiple locations. Through a proprietary, UDP-based WAN optimization capability called Zero Gravity Transport™ (ZGT), Resilio is able to overcome packet loss and latency on WANs. This works by:
- Calculating the ideal send rate via a congestion control algorithm.
- Sending interval acknowledgements for a group of packets (that contains information about packet loss), rather than for each individual packet.
- Retransmitting lost packets just once per RTT (Round Trip Time).
Resilio’s WAN optimization technology speeds up transfer and sync jobs to or from the edge of the network — i.e., to and from remote locations with poor internet connections. One of our customers, Shifo, for example, uses Resilio to ingest and sync files across extremely poor network conditions.
When ZGT is combined with real-time server sync, there’s nothing faster. Mixhits Radio syncs their servers coast to coast using Resilio Connect: “If a music program updates in one location, file changes are detected and propagated across servers within 2 seconds,” said their CEO, Gary Hanna. “That rapid update and real-time synchronization has been a saving grace for us.”
A highly reliable and scalable Lsyncd Alternative
Lsyncd is a Linux-based tool that uses inotify to monitor files and directories and rsync or rsync-ssh to synchronize the files and directories in a point-to-point or hub-and-spoke architecture. The Lsyncd mirror solution is designed to synchronize a directory tree hosted on one server to a remote mirror. Lsyncd is good for smaller scale projects where the files and directories do not change that often. Using Lsyncd + Rsync-SSH provides a way to sync files from a secure area across a firewall to a less secure area.
While Linux gurus may like the seeming flexibility of Lsyncd, files may not be kept current across more than 2 servers unless relayed in a hub-and-spoke. And those 2 servers should contain smaller files and smaller numbers of files. Lsyncd + rsync is a good combo if you want a free, open source, scriptable tool and can live with limited reliability. Linux support includes Fedora or Red Hat via EPEL, Ubuntu, CentOS, Debian, et al.
That said, if you need to reliably sync large files or larger numbers of files in real-time, and across more than 2 servers, Resilio Connect is fast, reliable, and scales to as many servers as needed. Resilio Connect runs on Linux, is fully cross-platform, and automates deployment and sync of as many servers as needed, in real-time.
Please get in touch with us to learn more about Resilio Connect.