On the edge: The only real problem is scaling


“The only real problem is scaling.  All others inherit from that one.” — Mike O’Dell, Chief Scientist, UUNet

Building and running applications at the edge of the network is a complex exercise in distributed systems engineering.   And the challenge is often one of operations and scale.

When we think of scaling challenges at the edge, the first thought is often the large number of devices proliferating in that environment, from industrial IoT sensors to retail automation to an increasingly mobile workforce.  All of these revolutions are powered by an explosion of cheap computing at the edge. But scaling challenges at the edge also arise in other ways: the number of business processes, workflows or transactions that must be executed, the number of locations involved, the integration of third-party applications, the size of the data needed to support emerging technologies like machine learning, the number of files in use, or any combination of the above.

Scaling operations at the edge is a difficult challenge.  A manual business process that works from one site to another can be operated with brute force.  But scaling that manual process to thousands or tens of thousands of locations or endpoints is not feasible, unless you can somehow hire 10^4 more people.

Instead, you will need to make smart and solid technology decisions to automate and power any workflow or application at the edge of the network.  In doing so, you will want to rely on technology and solutions with demonstrated success operating in the unforgiving environment of the edge. And more than that, you want solutions that can scale.

Peer to peer systems are distributed systems that satisfy both requirements.  

To make the point, let’s consider the rich heritage of peer to peer solutions through two well-known examples.

Skype

By some estimates, Skype now carries over half of all international call volume while the incumbent telecommunications companies carry the rest.  Skype is clearly an application that has demonstrated considerable scale, and efficient scale at that.

Since being acquired by Microsoft, Skype has transformed from a peer to peer service to a centralized cloud service, but most of its early growth and scale was achieved as peer to peer software with a relatively small operational workforce, at least compared to the world’s telecommunications companies.  If the domestic telecommunications industry employs nearly 1% of the overall U.S. workforce, then extrapolating globally, the alternative to Skype is like hiring 10^6 more people. Skype simply does it more efficiently by abstracting the underlying infrastructure and intelligently routing calls to preserve quality.

BitTorrent

In the late 2000s, BitTorrent was moving over 1 exabyte (1 billion GB) of data every month across the edge of the network, involving users in every country of the world and on every kind of network imaginable.  It was, at the time, a stunning volume of data, representing double-digit percentage points of aggregate Internet traffic. Nothing else on the Internet was close. Even YouTube, the most successful online video property at the time, was on a run rate to deliver an exabyte of video only every 2.7 years!

And through an extra bit of scaling magic, BitTorrent moved thirty YouTubes’ worth of traffic with only twenty employees and zero data centers!  YouTube, on the other hand, like the telcos in the previous example, wallowed in the vast resources of Google’s entire data center and network infrastructure budget and operations team to achieve a fraction of the result.
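The "thirty YouTubes" comparison follows directly from the two figures quoted above. A quick back-of-the-envelope check in Python, using only those two numbers (the variable names are just for illustration):

```python
# Figures quoted in the text; everything else is derived from them.
BITTORRENT_EB_PER_MONTH = 1.0   # "over 1 exabyte of data every month"
YOUTUBE_YEARS_PER_EB = 2.7      # "an exabyte of video every 2.7 years"
MONTHS_PER_YEAR = 12

# How many months YouTube needed to move a single exabyte.
youtube_months_per_eb = YOUTUBE_YEARS_PER_EB * MONTHS_PER_YEAR

# YouTube's monthly volume in exabytes, and BitTorrent's multiple of it.
youtube_eb_per_month = 1.0 / youtube_months_per_eb
ratio = BITTORRENT_EB_PER_MONTH / youtube_eb_per_month

print(f"YouTube needed about {youtube_months_per_eb:.1f} months per exabyte")
print(f"BitTorrent moved roughly {ratio:.0f}x YouTube's monthly volume")
```

On these numbers the multiple comes out a little above thirty, which is where the round figure comes from.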

++

It’s obvious from the previous examples that peer to peer systems demonstrate magical operational scaling properties.  And that’s exactly what you need at the edge, the very place in your business where you have minimal operational resources at your disposal.

The scaling properties derive from the nature of the peers, which are generally equal in capability and often act as both client and server.  As demand grows, so does aggregate capacity in the system, which makes the application faster and more reliable at scale. This is the exact inverse of the client/server model, where capacity is fixed, so the share available to each client shrinks as demand grows until the service fails outright.  In this way, hundreds of millions of endpoints contribute considerable capacity, making the overall system stronger.
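The contrast can be made concrete with a deliberately simple model. The sketch below is a toy, not a description of any real system: the capacity numbers are hypothetical, and it ignores latency, churn, and uneven peer bandwidth. It assumes a client/server deployment has a fixed capacity shared by all clients, while in a peer to peer system every peer adds its own upload capacity to the pool.

```python
def client_server_per_client(server_capacity: float, clients: int) -> float:
    """Fixed server capacity shared evenly among all clients."""
    return server_capacity / clients


def p2p_per_peer(seed_capacity: float, peer_upload: float, peers: int) -> float:
    """Every peer adds its own upload capacity to the shared pool."""
    aggregate = seed_capacity + peer_upload * peers
    return aggregate / peers


# Hypothetical numbers in arbitrary units (for example Mbit/s).
SERVER_CAPACITY = 10_000.0
PEER_UPLOAD = 5.0

for n in (100, 10_000, 1_000_000):
    cs = client_server_per_client(SERVER_CAPACITY, n)
    p2p = p2p_per_peer(SERVER_CAPACITY, PEER_UPLOAD, n)
    print(f"{n:>9} endpoints: client/server {cs:8.3f}   peer to peer {p2p:8.3f}")
```

In this model the client/server share per endpoint heads toward zero as endpoints are added, while the peer to peer share never falls below what a single peer contributes, because aggregate capacity grows in step with demand.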

Furthermore, peer to peer systems typically abstract the underlying infrastructure by creating an intelligent and programmable overlay network, which removes complexity in routing and other basic operations.  This means there is no added complexity at the edge in configuration or operation, which is a key scaling requirement when the number of edge nodes or locations is increasing. Through built-in discovery mechanisms, the peers automatically find each other to complete the task at hand and intelligently manage the underlying infrastructure to maximum benefit in doing so.  In an edge-connected world, this programmable infrastructure is an absolute requirement for getting the most out of limited resources without a large operations team continuously tuning the infrastructure at every edge node.
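To give a feel for how peers can find each other without per-node configuration, here is a minimal gossip-style simulation. It is an illustrative assumption, not the discovery mechanism of Skype, BitTorrent, or any particular product: each peer starts out knowing a single neighbor, and in every round it contacts one peer it already knows and the two merge their peer lists. The function name and parameters are hypothetical.

```python
import random


def gossip_discovery(num_peers: int, seed: int = 0, max_rounds: int = 1000) -> int:
    """Return the number of rounds until every peer knows every other peer.

    Toy model: peers start in a ring, each knowing only itself and its
    successor. In each round every peer picks one known peer at random and
    the two merge their peer lists.
    """
    if num_peers < 2:
        raise ValueError("need at least two peers")

    rng = random.Random(seed)  # seeded so the simulation is repeatable
    known = {p: {p, (p + 1) % num_peers} for p in range(num_peers)}
    everyone = set(range(num_peers))

    for round_no in range(1, max_rounds + 1):
        for p in range(num_peers):
            # Pick any peer this node has already heard about (not itself).
            partner = rng.choice(sorted(known[p] - {p}))
            merged = known[p] | known[partner]
            known[p] = set(merged)
            known[partner] = set(merged)
        if all(view == everyone for view in known.values()):
            return round_no
    raise RuntimeError("peers did not converge within max_rounds")


rounds = gossip_discovery(32)
print(f"32 peers discovered each other after {rounds} rounds")
```

No peer is given a full membership list up front; the complete view emerges from repeated local exchanges, which is the property that keeps configuration at each edge node constant as the number of nodes grows.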

With such a rich heritage of being resource light and scalable, peer to peer solutions have continued to evolve in support of the modern enterprise.  For example, systems like Resilio Connect are centrally managed and use advanced transport protocols between the peers to intelligently program and exploit the performance of the underlying infrastructure, all from one central console and without requiring the manpower to manage that infrastructure at every point.

The edge environment is ideal for these distributed systems.  If your application is seeking to integrate workflows at the edge with systems in the cloud (so-called edge-to-cloud or edge-connected workflows), the operational scaling problem is the only one that matters.

All others inherit from that one.