Google Storage Transfer Service (GSTS) is Google’s solution for transferring files from on-premise file systems to cloud storage or from cloud to cloud.
It’s a pay-as-you-go product suitable for transferring data in one direction across Google Cloud Platform (GCP), Amazon Web Services (AWS), Azure, and on-premise environments, including on-prem file systems.
This makes GSTS a solid choice for transfer scenarios like:
- Migrating data from file systems into GCP.
- Migrating on-premise data from a source bucket to GCP.
- Moving data in one direction between storage buckets, whether inside GCP or across cloud providers.
However, GSTS also has downsides. Mainly, it can be very difficult to deploy and manage. In situations where you’ll be moving or syncing large files, larger file systems, or multiple file systems in multiple directions (for example, two-way, one-to-many, many-to-one, or many-to-many), you may want to consider alternatives.
In these cases, a solution like Resilio Connect is well suited to moving and syncing data in any direction, across any cloud provider — using any type of device and network. To learn more about Resilio Connect or see it live in action, please schedule a demo.
Specifically, limitations with GSTS include:
- Complex deployment options. GSTS’s agents require you to have Docker installed and running on Linux servers or VMs. This can make the deployment difficult (or even impossible) for companies that rely mainly on Windows or other non-Linux technologies in their stack. Unlike most Google services, agent-based transfers involve setting up containerized instances, permissions, and other manual configuration steps.
- Time-consuming to set up and manage in larger projects. If you want to speed up transfers when moving lots of files, you’ll have to manually partition up your dataset and set up parallel transfers. While useful, this capability takes effort to set up and creates complex management demands down the line.
- Limited one-way transfer. This means you can only transfer data from a single source to a single destination. GSTS isn’t a good choice if you have a subset of data that needs to be continuously synced with data in other locations.
In the second part of this guide, we’ll discuss how Resilio Connect — our WAN-optimized data replication, sync, and file gateway solution — can help you overcome these downsides.
Resilio Connect enables omnidirectional, real-time sync across cloud regions, cloud providers, on-prem environments, and hybrid cloud infrastructure. Our solution offers:
- Versatile, hassle-free deployment options. Resilio Connect works with any device — including servers, desktops, laptops, edge devices, and more — and popular operating systems, including Linux, Windows, Android, Mac, FreeBSD, and more. There’s no limitation to using Docker, Linux, or any other combination of technologies.
- The ability to transfer and sync files in any direction. With Resilio, you can transfer files from endpoints collecting data at the remote edge, push updates to thousands of locations globally, let users in different regions continuously sync files in real time, and much more. There’s also no need to manually partition your dataset to increase your bandwidth, regardless of the number of files or endpoints.
- Reliability for all transfer, sync, and replication use cases. Resilio Connect uses a unique P2P architecture that doesn’t have a single point of failure. If any endpoint in your environment goes down, you can still access your data from another one. Plus, Resilio can also fully utilize any network (even if it’s unreliable), including VSAT, Wi-Fi, 3G/4G/5G cell networks, or any IP network.
- A single point of management for your data. You can control transfer, sync, and replication jobs from a Central Management Console. Plus, with our agents installed, you get a centralized view of all your data, regardless of whether it is in a Windows, Linux, Mac, Android, NAS system, or any other endpoint.
- Blazing-fast transfer speeds. Resilio Connect’s P2P architecture and proprietary, UDP-based WAN optimization deliver the fastest transfer speeds in the industry. We’ve achieved transfer throughput speeds of over 100 Gbps and have transferred 500 GB of data in GCP from London to Australia in 50 seconds.
Brands like Exxon Mobil, 2K Games, Warner Brothers, Cisco, and KFC use Resilio Connect to achieve fast and reliable data transfer, sync, and replication. To see Resilio in action for yourself, schedule a demo with our team.
Google Storage Transfer Service: How It Works, Pricing, & Limitations
GSTS is a software-based solution (SaaS) that can transfer data from various sources to destinations in Google Cloud Storage (unlike Google’s Transfer Appliance which is a hardware-based transfer solution).
At the time of this writing, the supported sources are:
- Amazon Simple Storage Service (S3).
- S3-compatible storage.
- Google Cloud Storage.
- Azure Blob Storage.
- File systems.
- Publicly accessible or signed HTTP and HTTPS URLs.
You can also use GSTS to transfer data from Google Storage to file systems or between two file systems. Make sure to refer to Google’s documentation on sources and sinks for the most up-to-date info on this topic.
Regardless of the source and destination, GSTS:
- Reduces the amount of data that needs to be transferred by only moving files or objects that have been changed since the last transfer.
- Encrypts data in transit automatically and performs data integrity checks to ensure it arrives intact.
- Can be configured to deliver Pub/Sub notifications on transfer completion.
- Enables you to choose to preserve your files’ or objects’ metadata.
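To make the first point concrete, here is a minimal sketch of an incremental comparison. It is not GSTS’s actual change-detection logic. The manifest shape (object name mapped to size and last-modified time) and the function name are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical manifest shape: object name -> (size in bytes, last-modified timestamp).
Manifest = Dict[str, Tuple[int, float]]


def objects_to_transfer(source: Manifest, destination: Manifest) -> List[str]:
    """Return names that are new at the source or differ from the destination copy."""
    pending = []
    for name, fingerprint in source.items():
        # Transfer when the object is missing at the destination or its
        # size/timestamp fingerprint no longer matches.
        if destination.get(name) != fingerprint:
            pending.append(name)
    return sorted(pending)


source = {"a.csv": (100, 1.0), "b.csv": (250, 2.0), "c.csv": (50, 3.0)}
destination = {"a.csv": (100, 1.0), "b.csv": (200, 1.5)}
changed = objects_to_transfer(source, destination)
print(changed)  # ['b.csv', 'c.csv']
```

Here only the modified `b.csv` and the new `c.csv` would be moved, while the unchanged `a.csv` is skipped.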
Depending on the use case, you can get started in different ways, including the Google Cloud console, REST APIs, the gcloud command-line tool, and Java or Python client libraries.
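As a rough illustration of the REST route, the sketch below only builds the JSON body for a bucket-to-bucket transfer job and does not send it. The field names follow the public `transferJobs` resource as we understand it, while the project ID, bucket names, and helper function are placeholders. Authentication and scheduling are left out, so check Google’s API reference before relying on this shape.

```python
import json
from typing import Any, Dict


def build_transfer_job(project_id: str, source_bucket: str, sink_bucket: str,
                       description: str = "") -> Dict[str, Any]:
    """Build a request body for a Cloud Storage to Cloud Storage transfer job."""
    return {
        "description": description,
        "projectId": project_id,
        "status": "ENABLED",
        "transferSpec": {
            # Source and destination buckets (placeholder names in the example below).
            "gcsDataSource": {"bucketName": source_bucket},
            "gcsDataSink": {"bucketName": sink_bucket},
        },
    }


job = build_transfer_job("my-project", "source-bucket", "sink-bucket", "Example copy")
print(json.dumps(job, indent=2))
```

A body like this would be sent in an authenticated POST to the service’s `transferJobs` endpoint. The client libraries wrap the same resource in typed objects.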
In terms of the setup, transfers where the source and/or destination is a file system or S3-compatible storage require agents and agent pools, while transfers from other sources don’t. For more details about each use case and transfer option, check out Google’s documentation on agent-based transfers and agentless transfers.
GSTS also uses a parallelized architecture to accelerate transfer speeds. You can create multiple parallel transfers, which is especially useful (but also time-consuming and difficult to do) when transferring a large number of relatively small files.
Besides simply transferring data to the cloud, you can also use GSTS for:
- Data archival by moving cold data to Cloud Storage from on-premise storage systems.
- Data backup by replicating your data to Google Cloud or creating a copy of a Cloud Storage bucket in another region.
- Data processing pipelines by moving data generated on data centers or other clouds to Google Cloud for analytics using BigQuery.
In terms of pricing, GSTS charges $0.0125 per GB transferred to the destination successfully for agent-based transfers to or from file systems. The service is free for agentless transfers or for agent-based transfers from S3-compatible storage.
However, remember that you’ll also incur regular storage charges, as well as charges for moving data from one storage bucket to another. There are also other factors that can contribute to your charges, and you can learn more about them in Google’s pricing documentation.
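For a quick back-of-the-envelope estimate, the sketch below applies the per-GB rate quoted above. It covers only the transfer fee, not storage, egress, or operation charges. The function name and its flag are illustrative assumptions, and the rate may change over time.

```python
AGENT_TRANSFER_PRICE_PER_GB = 0.0125  # USD per GB, the rate quoted in this article


def estimate_transfer_fee(gigabytes: float, agent_based_file_system: bool = True) -> float:
    """Estimate the GSTS transfer fee in USD, excluding storage and egress charges."""
    if gigabytes < 0:
        raise ValueError("gigabytes must not be negative")
    if not agent_based_file_system:
        # Agentless transfers are described above as free of transfer fees.
        return 0.0
    return round(gigabytes * AGENT_TRANSFER_PRICE_PER_GB, 2)


fee = estimate_transfer_fee(10_000)
print(fee)  # 125.0
```

In other words, moving 10 TB (10,000 GB) from a file system would cost roughly $125 in transfer fees at this rate.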
Downside #1: Limited Deployment Options
If your transfer operation involves a file system or S3-compatible storage, you’ll need to set up GSTS agents. These agents are apps that can only run on Linux servers and VMs inside a Docker container.
There are no other deployment options here, which can overcomplicate your workflow and cost you lots of time if Linux and Docker aren’t part of your regular tech stack.
For example, say you’ve got a NAS system with Windows systems connected to it. In order to use GSTS, you first have to set up the Linux VMs and Docker containers. Then, you have to mount the storage and copy it into the Linux VM, after which it can get transferred. This whole process can take a lot of engineering hours.
Not to mention, IT teams that use Windows (or other non-Linux technologies) likely won’t be comfortable with their data running through Linux.
Downside #2: Time-Consuming Setup and Management for Large Projects
GSTS can be easy to use for simpler projects, like one-time transfers or cloud migrations from one site to GCP. However, things get exponentially more difficult if you want to maximize transfer speeds in a larger project.
For example, say you have to transfer a few hundred million files. GSTS has a maximum number of allowed queries per second (QPS) per transfer job. Since transferring data can trigger list, read, and write operations, you can easily hit this limit. Theoretically, you could overcome this by partitioning up your dataset and having multiple parallel transfers running concurrently.
In reality, this can be a very cumbersome process. First, you have to do the math on exactly how many Linux VMs you’ll need on the source site (or sites) to handle each part of the load based on the throughput you need to achieve. And that’s before you start setting them up.
The bigger issue is that you’re actually creating multiple points of management. For instance, if you need five Linux instances, you now have five points of management. You also have to get the data from your source servers into these instances at scale, which can be very difficult.
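To give a feel for the planning involved, here is a minimal sketch of one way to split a dataset into balanced groups, one per parallel transfer job. It assumes you already know the file count per top-level directory. The directory names, counts, and the greedy "largest first" strategy are our own illustrative choices, not something GSTS provides.

```python
import heapq
from typing import Dict, List


def partition_by_file_count(dir_file_counts: Dict[str, int], jobs: int) -> List[List[str]]:
    """Greedily assign directories to jobs so file counts stay roughly balanced."""
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    # Min-heap of (files assigned so far, job index).
    heap = [(0, index) for index in range(jobs)]
    heapq.heapify(heap)
    groups: List[List[str]] = [[] for _ in range(jobs)]
    # Place the largest directories first, each on the currently lightest job.
    for name, count in sorted(dir_file_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        load, index = heapq.heappop(heap)
        groups[index].append(name)
        heapq.heappush(heap, (load + count, index))
    return groups


# Hypothetical file counts, in millions of files per directory.
counts = {"logs": 90, "images": 60, "video": 50, "docs": 40, "tmp": 10}
groups = partition_by_file_count(counts, 2)
print(groups)  # [['logs', 'docs'], ['images', 'video', 'tmp']]
```

Even with a helper like this, each group still needs its own transfer job, agents, and monitoring, which is where the management overhead comes from.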
Downside #3: Differences in Transfer Features Depending on the Data Source and Destination
GSTS’s features don’t work the same way when it comes to cloud object storage vs. Cloud Storage transfers vs. file system transfers. For example:
- You can adjust transfer bandwidth when dealing with file system transfers but not when moving data from cloud object storage to cloud storage.
- There’s no overwrite option for file system transfers. This means that when GSTS detects a new or changed object on the file system, it copies the complete object to Cloud Storage (and you can’t change that behavior).
- Name-based source data filtering and modified-time source-data filtering are not supported for file system transfers.
You can learn more about these differences in Google’s documentation.
How Resilio Connect Delivers Fast and Reliable Transfer and Sync in Any Direction
Resilio Connect is a software-only, agent-based file transfer, sync, and replication solution that delivers the fastest transfer speeds across multiple locations over any distance.
You can deploy Resilio directly on your endpoint devices — from mobile phones to desktops, servers, cloud instances, and many more. There’s no need to buy new hardware or work with a particular set of technologies (like Linux and Docker).
Our solution also lets you manage on-premise and cloud data from one unified management view, from any location. You can set up, manage, and monitor data transfer, sync, and replication jobs from one central place, without needing to partition your dataset to increase throughput (like with GSTS). Plus, all jobs can be automated via the graphical management console, scripts, or APIs.
With Resilio Connect, you can:
- Achieve near-instant transfer speeds across any cloud region, on-prem storage, and hybrid cloud setups.
- Handle massive data workloads and sync large volumes of data over many endpoints.
- Distribute updates and time-sensitive operational data in predictable timeframes across the globe.
- Collect data from geographically dispersed sites, machines, and vehicles.
- Deploy updates and security patches to vehicles, vessels, and other machines operating at the edge.
- Sync branch offices to ensure good remote work collaboration (even if they’re located in areas with poor or unreliable connectivity).
Flexible Deployment Options — No Technology Limitations
As we said, GSTS’ transfer agents run only on Linux servers or VMs and require having Docker installed. This means you’ll often be forced to use Linux and Docker even if they’re not part of your usual technology stack, which can overcomplicate your workflow.
In contrast, Resilio Connect offers much simpler and more flexible deployment options. Resilio is an agent-based, cross-platform solution that runs on your choice of physical or virtual machines. Agents run on Windows, Linux, macOS, FreeBSD, and some NAS systems. Deployment options include:
- Industry-standard desktops, servers, storage, edge devices, networks, and much more. Plus, Resilio is compatible with all popular operating systems, including Windows, Linux, FreeBSD, OpenBSD, Mac, and Android. Open source or closed source, you can choose the OS that works for you.
- DAS, SAN, and NAS, including OSNexus, Synology, TrueNAS, QNAP, and more. This means you can continue using your current storage, without needing to purchase new hardware.
- Any hypervisor on any hardware running popular operating systems. VMware, Citrix, Hyper-V, and other hypervisors can be used for hosting virtual machines. Docker can be used for containers. There are no dependencies on having to run in a specific type of environment (e.g., Docker on Linux) like with GSTS.
Resilio is also a cloud-agnostic solution that can be deployed on any infrastructure — single-cloud, multi-cloud, hybrid cloud, on-premises, and so on.
For example, Resilio Connect lets you:
- Transfer and sync data across GCP regions and services.
- Use a variety of file storage solutions and cloud storage services simultaneously, including GCP, AWS, Azure, Wasabi, Backblaze, and more.
- Use any persistent or ephemeral compute engine hosted on any cloud. For example, some workloads may perform better on GCP while others perform better on AWS. Resilio makes it easy to move and sync cloud data between cloud providers automatically and transparently.
- Browse and sync files on file, block, or object storage via popular tools on operating systems like Mac and Windows.
- Blend storage capacity from any type of storage — hard drives, SSDs, storage systems, and so on.
Put simply, Resilio lets you continue using your current solutions and gives you the freedom to move data across any number of storage systems and cloud service providers without restricting you to Docker, Linux, or any other technology.
Fast, Reliable, and Predictable Transfer Speeds across Any Network
Conventional transfer solutions like GSTS are limited by “point-to-point” architectures. These architectures usually come in one of two topologies — client-server or “follow-the-sun”.
- In the client-server model, one device is designated as a hub server, while the others are clients. The hub can transfer and receive objects from any device, but clients can only share data with the hub server. For example, if Client 1 wants to sync data with the other devices in your environment, it must first send the data to the hub, which then replicates it to the other clients.
- In the “follow-the-sun” method, transfers can only happen sequentially, from one device to the next. So, if Device 1 wants to sync data across the environment, it must first replicate the data to Device 2, which then replicates it to Device 3, and so on.
Both of these topologies create bottlenecks by limiting data transfers or syncs to only two devices at a time (be it from hub to client or from one device to another). This leads to slower transfer speeds and creates single points of failure in your environment.
To overcome this issue, Resilio Connect uses a unique P2P (peer-to-peer) architecture. With this architecture, every device in your environment can transfer data across other devices. As a result, you can leverage the full bandwidth of your environment, leading to much faster transfer, sync, and replication speeds.
Resilio Connect also uses file chunking to turn files into several chunks when sharing them. Each one can be transferred independently from the others.
For example, say Device A wants to sync data to Devices B, C, D, and E. Device A can transfer the first chunk to Device B. Once it receives the chunk, Device B can share it with any other device in the network while Device A sends the remaining chunks. This mechanism results in transfer speeds that are 3-10x faster than traditional solutions.
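The intuition behind this can be shown with a toy model. The sketch below is not Resilio’s scheduling logic. It simply assumes one chunk, equal bandwidth everywhere, and that each device holding the chunk can send it to one other device per round. It then counts how many rounds it takes for every receiver to get the chunk.

```python
def hub_rounds(receivers: int) -> int:
    """Rounds needed when only the hub may send, one receiver per round."""
    return receivers


def p2p_rounds(receivers: int) -> int:
    """Rounds needed when every device holding the chunk forwards it each round."""
    holders, rounds = 1, 0  # Only the source holds the chunk at the start.
    total = receivers + 1
    while holders < total:
        # Each current holder serves one new device, so holders can double.
        holders = min(holders * 2, total)
        rounds += 1
    return rounds


for n in (4, 10, 100):
    print(n, hub_rounds(n), p2p_rounds(n))
# 4 4 3
# 10 10 4
# 100 100 7
```

Under these simplified assumptions, the hub model grows linearly with the number of receivers, while the peer-to-peer model grows roughly logarithmically. Real-world results depend on bandwidth, latency, and how many chunks are in flight.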
These capabilities work automatically, regardless of the size and number of files you need to transfer or sync. There’s no need to manually partition your data in order to get fast, reliable, and efficient data transfers every time.
In short, our solution is ideal for transferring or syncing lots of files to locations all over the world, regardless of the network, number of endpoints, regions, and cloud providers.
For example, VoiceBase uses Resilio Connect to distribute 50 GB of files to 400 servers quickly and reliably. They were able to achieve an 88% drop in software distribution time thanks to Resilio.
Besides contributing to our industry-leading speeds, the P2P architecture makes Resilio an organically scalable solution.
This means that adding more endpoints only speeds up transfers and increases available bandwidth. As a result, you can use the full bandwidth of your environment to sync thousands of systems and millions of files quickly, reliably, and efficiently. Resilio can sync data 50% faster than point-to-point solutions in a 1:2 scenario and 500% faster in a 1:10 scenario.
The P2P architecture also doesn’t have a single point of failure. If one endpoint goes down, Resilio Connect can simply retrieve data from the nearest available endpoint by automatically routing around the outage.
Plus, in the event of a server failure or another disaster, Resilio can utilize all of your servers to achieve sub-five-second RPOs (Recovery Point Objectives) and RTOs (Recovery Time Objectives) within minutes of an outage.
This makes Resilio an ideal disaster recovery software for use cases like:
- Hot-site DR
- Warm-site DR
- Offsite copy
- Cold DR
Transfer, Sync, and Replicate in Any Direction
Unlike GSTS and other typical file transfer and replication solutions, Resilio Connect can sync all data, including cloud data, in any direction: one-way, two-way, one-to-many, many-to-one, and N-way (full mesh).
This versatility makes Resilio Connect perfect for lots of use cases.
Let’s take n-way synchronization as an example. Say you have a workforce that’s distributed across the globe. With n-way sync, everyone can make changes to files and have those changes immediately distributed to every other office, regardless of where they’re working from. As a result, everyone can be sure they’re working on the same file versions.
The same goes for syncing a large environment of servers. With Resilio’s P2P architecture and n-way sync, every server in the environment can sync across the others in real-time, without having to rely on a hub server. This makes the whole process much faster and more efficient, as you can utilize the environment’s full bandwidth.
Centralized Granular Control over All Ingest, Replication, and Sync Tasks
With GSTS, you may need to partition your dataset manually in order to achieve maximum transfer speeds in some cases. Besides being time-consuming, this process creates multiple points of management (i.e., servers or VMs running Docker and Linux).
Resilio Connect avoids that complexity by giving you a single point of management for all data transfer, replication, and sync jobs in the form of a Central Management Console.
Here are some examples of things you can do with the Console:
- Create new replication, transfer, and sync jobs. A single job, controlled from the Central Management Console, can run across as many agents as necessary to move the data.
- Create automated policies that govern bandwidth allocation for each endpoint.
- Monitor the progress of each job and set up real-time notifications that can be sent via email or webhooks.
- Script any type of functionality your job requires with Resilio’s REST API.
- Adjust key parameters like buffer size, bandwidth usage policies, and disk I/O threads.
- Manage files (or objects) stored in Google Storage, S3, Azure, or another cloud storage provider.
- Configure permissions for who can access, view, and edit your data.
The Management Console also shows you all key metrics for tracking each job’s progress, including duration, number of agents, total bytes to transfer, maximum speed, ETAs, and more.
Lastly, you can store the console pretty much anywhere, including:
- In any virtual or physical Windows or Linux instance.
- In your on-prem environment.
- In a VM instance in a cloud computing provider.
And, because you can easily control every data movement aspect, you don’t need to manage NFS mounts or manage data across remote devices.
For a real-life example of the power of Resilio’s Central Management Console, check out our case study with MixHits Radio. They were able to massively reduce their troubleshooting times thanks to the Management Console.
“We have gone from spending 15 hours on average per week troubleshooting conflicts in the prior solution to spending no time at all with Resilio. We configure jobs once in the Resilio Connect Management Console and never have to look at it again.”
— Gary Hanna, CEO of Mixhits Radio
High-Speed WAN Transfers and Full Utilization of Any Network
Most file transfer solutions use the TCP protocol for transfers over wide area networks (WANs). This can be a big issue because latency and packet loss (which are defining characteristics of WANs) disrupt TCP’s performance.
TCP treats packet loss as a network congestion issue and reduces the transfer speed in response. However, packet loss on WANs is often not caused by congestion, so you end up getting data transfer speeds that are significantly lower than the available bandwidth of your internet provider. This prevents you from fully utilizing expensive WAN connections and synchronizing data across multiple sites and cloud providers.
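To put rough numbers on this, the sketch below uses the well-known Mathis approximation for steady-state TCP throughput, which bounds throughput by (MSS / RTT) × (C / √p) with C ≈ √(3/2). This formula is not from Resilio’s materials, and the example values for segment size, round-trip time, and loss rate are assumptions chosen for illustration.

```python
import math


def tcp_throughput_ceiling_mbps(mss_bytes: int, rtt_ms: float, loss_rate: float) -> float:
    """Approximate upper bound on single-stream TCP throughput (Mathis model)."""
    if rtt_ms <= 0 or not 0 < loss_rate < 1:
        raise ValueError("rtt_ms must be positive and loss_rate must be between 0 and 1")
    c = math.sqrt(3 / 2)  # Constant from the Mathis approximation.
    bits_per_second = (mss_bytes * 8 / (rtt_ms / 1000)) * (c / math.sqrt(loss_rate))
    return bits_per_second / 1e6


# Assumed link: 1460-byte segments, 100 ms round trip, 1% packet loss.
ceiling = tcp_throughput_ceiling_mbps(1460, 100, 0.01)
print(f"{ceiling:.2f} Mbps")
```

With these inputs, the estimate comes out at under 2 Mbps per stream no matter how much bandwidth the link has, which is why high latency and loss hurt TCP-based transfers so much.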
That’s why Resilio Connect uses a proprietary, UDP-based transfer protocol called Zero Gravity Transport™ (ZGT). ZGT intelligently analyzes the underlying network by measuring packet loss, latency, and throughput over time, allowing it to:
- Adjust and maintain consistent speed.
- Adapt to changing conditions in real time.
- Maximize network performance and the utilization of any connection, including broadband, VSAT, Wi-Fi, cell, and more.
This technology is designed for unreliable networks, so you can always rely on it to ingest, sync, and replicate terabytes of data from the edge of a network to a centralized location in predictable timeframes. If you want more details on how ZGT works, check out our detailed WAN Optimization Whitepaper.
In the event of an outage, network failure, or another similar issue, Resilio Connect can also perform checksum restarts. This allows failed transfers to continue where they left off, which reduces the strain on your network since you’re not transferring the same data more than once.
Lastly, we have a free speed calculator on our website, which you can use to see how much time Resilio Connect can save your business, depending on your use case (server sync, file send, remote work, and so on).
Overall, the combination of P2P architecture, WAN acceleration, and file chunking enables Resilio to deliver the fastest transfer speeds in the industry. For example:
- In early tests of a next-generation release, we’ve seen speeds of 200 Gbps across Azure regions.
- We’ve transferred 500 GB of data in the Google Cloud Platform from London to Australia in 50 seconds.
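As a quick sanity check on what the second figure implies, the arithmetic below converts a data volume and duration into an average rate. It assumes decimal gigabytes (1 GB = 8 gigabits) and a steady rate for the whole transfer. The function name is ours.

```python
def average_gbps(gigabytes: float, seconds: float) -> float:
    """Average throughput in gigabits per second for a completed transfer."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return gigabytes * 8 / seconds


rate = average_gbps(500, 50)
print(rate)  # 80.0
```

So 500 GB in 50 seconds works out to an average of about 80 Gbps.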
Efficient File Transfer and Access
Resilio Connect is a flexible object storage gateway solution that lets you ingest, sync, and access files across any cloud storage provider (including GCP, AWS, Azure, Backblaze, and more).
Additionally, our solution operates in a very efficient way to keep your cloud costs as low as possible. Here’s how:
- Resilio detects and transfers or syncs only the changed portion of your files. That way, there’s no unnecessary data being moved across the network.
- Our solution lets you choose the optimal network for your traffic. This feature is called Smart Routing and it enables you to move traffic (or just parts of it) to a LAN, instead of keeping it on an expensive WAN.
- You can download and sync files on demand with Transparent Selective Sync (TSS). TSS lets you browse objects stored in Google Storage or any S3-compatible storage as files, select individual files, and download, partially download, or sync them. This granular control is crucial for minimizing unnecessary data transfers.
- Resilio also lets you store frequently used files locally and keep infrequently used files in long-term cloud file storage. That way, you don’t have to download frequently accessed files from the cloud every time, which helps you reduce egress fees. Files that are not fully downloaded remain in the cloud but still appear local, which saves local storage capacity.
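The first point, transferring only the changed portion of a file, can be sketched with simple block hashing. This is a generic illustration under our own assumptions (fixed-size blocks compared at the same offsets, tiny block size for readability), not a description of Resilio’s actual algorithm.

```python
import hashlib
from typing import List


def block_digests(data: bytes, block_size: int) -> List[str]:
    """Hash each fixed-size block of the data."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old: bytes, new: bytes, block_size: int = 4) -> List[int]:
    """Indexes of blocks in `new` that are absent from or differ from `old`."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    return [index for index, digest in enumerate(new_digests)
            if index >= len(old_digests) or old_digests[index] != digest]


old = b"aaaabbbbccccdddd"
new = b"aaaaBBBBccccddddeeee"
changed = changed_blocks(old, new)
print(changed)  # [1, 4]
```

Only blocks 1 and 4 would need to cross the network here, instead of the whole file. Note that this naive version treats everything after an insertion as changed, which production tools typically avoid with more sophisticated techniques.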
Lastly, Resilio is also a great solution if you want to provide people in your organization with low-latency access to the data they need. This is done through a very simple interface that operates similarly to Microsoft OneDrive.
Using this interface, people in your organization can access the data they need quickly and efficiently, regardless of where it’s stored.
For example, say you have key operational data stores in different environments, including on-prem servers and cloud storage services like Azure Blobs, Amazon S3, and Google Storage. Resilio Connect gives your teams a unified view of all that data, while TSS lets them browse, sync, or download the files they want to on-prem devices where they can be cached locally.
State-of-the-Art Security by Default
Our solution provides end-to-end data protection and encryption for data at rest and in transit. This means there’s no reliance on 3rd-party security tools or services when working with Resilio Connect.
Some of the most essential security features of Resilio include:
- AES 256 encryption for data at rest and in transit.
- Cryptographic data integrity validation to ensure files arrive uncorrupted and intact.
- Mutually authenticated endpoints, meaning endpoints must be pre-approved in order to receive data. That way, data arrives only at its intended destination.
- Data immutability, which protects you from ransomware and data loss by storing immutable copies of data in the public cloud.
- Access management, so you can set granular permissions that govern who can access, view, edit, and copy your files.
- And much more.
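To illustrate the general idea behind cryptographic integrity validation, here is a minimal sketch using SHA-256 from Python’s standard library. The choice of hash and the helper names are assumptions for the example and say nothing about how Resilio implements this internally.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def arrived_intact(received: bytes, expected_digest: str) -> bool:
    """Check received data against the digest computed by the sender."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(sha256_hex(received), expected_digest)


payload = b"quarterly-report"
digest = sha256_hex(payload)  # Computed on the sending side.
ok = arrived_intact(payload, digest)
tampered = arrived_intact(payload + b"!", digest)
print(ok, tampered)  # True False
```

Any change to the data in transit, even a single byte, produces a different digest and fails the check.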
These and all other security features have been reviewed by 3rd-party security experts to ensure their reliability and data protection capabilities.
Achieve Fast, Reliable, Efficient Transfer and Sync in Any Direction
Resilio Connect’s unique P2P architecture and proprietary WAN optimization technology remove single points of failure, speed up transfers, and overcome latency and packet loss across any network, regardless of distance.
Plus, our solution is installed directly on endpoints, like Windows, Mac, Linux, and Android devices, NAS systems, and more. This makes its deployment much easier compared to GSTS’ method, as you’re not required to use Docker or Linux and can continue working with your existing infrastructure.
Additionally, our solution is:
- Able to transfer and sync data in any direction, including one-way, two-way, one-to-many, many-to-one, and n-way.
- Easy to manage, because you can control all aspects of data transfer, sync, and access from a single interface. You also get a single point of management and don’t need to manually partition your data, which saves a ton of time and effort.
- Efficient, as it only transfers or syncs the changed portion of files, so no unnecessary data gets moved across the network. Plus, you can use features like TSS, Smart Routing, and local storage to make each transfer or sync even more efficient.
- Flexible, as you can use it with various storage types, cloud providers, on-prem infrastructures, and hybrid cloud environments. You’re not restricted to any vendor’s ecosystem.
- Secure, thanks to AES 256 encryption and other security features that have been verified by 3rd-party security experts.
- Organically scalable because it performs better as you add more endpoints.
To learn more about how Resilio Connect can help your business, schedule a demo with our team.