Slack releases simple, smart network overlay as open source
Slack has released Nebula, an open source global overlay network that connects computers regardless of their location: in any cloud, on-premises, or behind the gatekeepers of private networks.
The company wrote its own solution to cope with a sudden boom in resources that internal IT teams needed to monitor and manage.
That’s a problem that arises when companies scale: their underlying systems and networks must also expand to support the work of the organization. But managing those ever-growing technological resources can be difficult without a significant investment in some overarching IT management system. And even something as “simple” as sharing files (server configurations, for example) becomes hugely resource-consuming at any significant scale.
There are dozens of applications and services on the market that can help IT teams achieve an abstraction of a network so it appears as a flat, blanket subnet. In some cases, technicians and developers in-house can cobble together a management system over the increasing number of servers and resources under their control. That was the case at Slack, the San Francisco-based collaborative platform company, whose systems engineers used a series of IPsec tunnels between regions to monitor and manage its growing server numbers. But eventually, the downsides to manual oversight became apparent.
Writing in a Medium article, Slack’s chief engineer, Ryan Huber, described how the company’s teams explored and ultimately rejected many of the existing solutions, such as Tinc, a VPN stalwart dating from the 1990s, which can be used to create a single network overlay across complex hybrid clouds.
While Tinc is an excellent piece of software that’s the very definition of mature (at least in years), its lack of stratified, granular, privilege-based network segmentation made it unsuitable for commercial enterprise use, Huber and his team concluded.
What Nebula, Tinc and similar tools do is present each computer with a set of readily available network connections to other nodes, regardless of whether those nodes are behind a complex NAT system, in a different region, in a public or private cloud, or at the user’s home address.
The old adage of developers building what they want for themselves rings true here: many of Slack’s engineers and technicians use Nebula for their own home networks, and to connect their remote machines with their work computers, for example. There’s also a similar personal touch to many of Nebula’s naming conventions: nodes that handle inter-machine addressing and routing issues are called “lighthouses”, and the system daemon that handles the NAT bridging necessary for node-to-node communications is called “punchy”.
But those personalized and seemingly flippant elements belie what is a highly secure, resilient and clever piece of technology. Much of the complexity of joining nodes that operate in wildly different topologies is handled under the hood; the end-user can connect to other nodes as simply as if they were sitting on the same LAN, on the same subnet. The various traversal and networking protocols are tried and optimized by the lighthouse node(s) on the fly, making the network one that stays up even when nodes drop off, the type of occurrence that might wreak havoc with an ad hoc, IPsec-plus-scripts home-baked solution.
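To make the lighthouse idea more concrete, here is a minimal sketch of the discovery step, written in Python purely for illustration. It is not Nebula's code (Nebula itself is written in Go), and the class name, method names and addresses are all hypothetical. It assumes only what is described above: nodes tell a lighthouse where they can currently be reached, and other nodes ask the lighthouse before connecting.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# (public IP, UDP port) as seen from outside any NAT. Illustrative type only.
Endpoint = Tuple[str, int]


@dataclass
class Lighthouse:
    """Toy registry mapping a node's overlay address to its last reported endpoint."""

    known: Dict[str, Endpoint] = field(default_factory=dict)

    def report(self, overlay_ip: str, endpoint: Endpoint) -> None:
        # A node checks in so the lighthouse knows where it can be reached.
        self.known[overlay_ip] = endpoint

    def forget(self, overlay_ip: str) -> None:
        # A node that drops off is removed; every other entry is untouched.
        self.known.pop(overlay_ip, None)

    def lookup(self, overlay_ip: str) -> Optional[Endpoint]:
        # Returns None when the lighthouse has no record of the node.
        return self.known.get(overlay_ip)
```

In this toy model, a laptop at home and a server in a cloud region would each call `report` with their own overlay address, and either could then call `lookup` to find the other. The real system layers encryption and NAT traversal on top of this kind of lookup, none of which is modelled here.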
The entirety of Nebula is open source (available on GitHub) and the company has chosen not to create a proprietary encryption system: that’s a crucial aspect of ensuring the Nebula network is trusted by its users who might otherwise be suspicious of how safe this free-to-use network overlay might be.
To use Nebula, you (or your system techs) will need to be happy using secure key pairs, command-line interfaces and unpacking tarballs, but this is meat and drink to most, of course.
There are Linux, Windows and Mac versions, and an iOS version (created in a day’s hackathon in a skeletal form) is in the pipeline. As a system that’s already been in everyday use for the last two years in production at the global player that is Slack, it’s probably a good bet, even at scale.