This is a memo I initially wrote for internal consumption at Edgeless Systems to decide which open-source license to use for our "always-encrypted" Kubernetes Constellation. We ended up going with AGPL. Here are the considerations that led to this decision...
Felix Schuster
Traditionally, companies have formed around existing, successful open-source projects. In these cases, open source wasn't a deliberate choice and the companies had to work with what was already there. Examples of this setting include well-known "older" companies like Elastic, GitLab, or Redis, as well as some newcomers like Chainguard.
However, these days, open source oftentimes is a deliberate choice. Companies are started with the goal of building an open-source business, or existing companies decide to release products as open source. Examples include well-known companies like CockroachDB, MongoDB, or HashiCorp, as well as an abundance of newcomers like n8n or Kong. For these companies, open source is a means to an end, a strategic choice to maximize value.
With open source, at least some part of the product is free to use in some way. The monetization options are:
Roughly, the benefits of open source for companies are "community", "word of mouth", "low entry barriers", "image", and "trust".
Some of these are directly connected. For example, community and word of mouth are closely related. Further, low entry barriers and image clearly also impact word of mouth.
The risks that come with the above benefits are "competition", "loss of IP value", and "free riding".
It is common wisdom that competition oftentimes doesn't turn out to be a huge problem, because people typically like to buy from the people who originally built the software. For infrastructure SaaS like databases, the story may be different though, because here people typically like to buy from the big cloud service providers (CSPs) regardless.
All open-source companies need to balance the benefits against the risks. It is virtually impossible to have the former without the latter. Central to this tension is the license under which a product's source is published. A loose license that lets people do anything with the code will push all five forms of benefits, while also pushing all three forms of risks. A tighter license will mitigate the risks but will also dampen the benefits.
There are roughly the following groups of licenses.
Loose licenses
Popular examples include MIT, BSD, and Apache. They essentially allow anyone to do anything with the software. Only Apache reserves some trademark rights. Many infrastructure software companies initially use Apache. It is also the required license to become a CNCF project.
Copyleft licenses
Copyleft licenses also allow anyone to do anything with the software. However, they require that source code is made available to users of derived software under certain circumstances. Popular examples include:
It's important to note that AGPL's source-sharing requirement only applies if users directly interact with the software. Thus, it would not apply in cases where someone is running cloud offerings on top of Constellation. It would only apply in cases where Constellation itself is made available as SaaS.
Restricted licenses
A range of restricted licenses exist that reserve certain rights. This includes:
Restricted licenses are typically not considered "real" open source. Rule of thumb: a license is real open source if it is accepted by the Open Source Initiative (OSI). If one declares a product released under a "restricted" license as "open source", one risks a shitstorm. At least for open-source purists, the correct term here is "source available".
The monetization model plays an important role in the choice of license. On one side of the spectrum, SaaS-focused and user-facing companies like n8n, Gitpod, or Confluent can mostly only gain "trust" and "image" from open source, because their product is best consumed as SaaS and few people wish to manage the software on their own. Correspondingly, "competition" and "loss of IP value" are the primary risks. If the audience of a SaaS company is not tech-savvy, then open source doesn't even make sense in most cases. There is little to gain and much to lose. Think of companies like Personio or HubSpot. These are actually at the extreme end of the spectrum. Consequently, such companies often opt for "restricted" licenses, which prevent "competition" (and by extension "loss of IP") and only moderately impact "trust" and "image".
Some developer-focused SaaS companies like Gitpod or Crowd.dev also pick a middle ground with the "copyleft" AGPL: while the license doesn't rule out SaaS competition, it at least requires potential competitors to open-source their offerings, putting them at some disadvantage. This still effectively keeps out a lot of potential competitors while paying into "image".
On the other side of the spectrum is infrastructure software like cloud-native tooling, which is often used in self-managed deployments by experts and is primarily monetized through support and open core. For this type of software, all benefits and risks apply to their fullest. Therefore, the gains but also the losses can be big. The choice of license is critical for the success of the company. Here, many companies go for copyleft or loose licenses. This includes younger companies like HashiCorp, Kong, or Puppet as well as older ones like Red Hat (OpenShift) or VMware (Tanzu). However, the older ones typically have a lot of paid enterprise features and no real community traction.
In between infrastructure software and user-facing SaaS is "somewhat infrastructure & somewhat SaaS" software like databases or AI frameworks. One can observe all types of licenses here, possibly with a bias towards "restricted" licenses.
Main takeaway: In the infrastructure space, most companies go for "loose" or "copyleft" licenses.
It can be observed that quite a few companies, especially in the database space, have changed their licenses in recent years. Examples include:
All these companies cited competition from cloud vendors as their primary motivation for the switch. For example, AWS was offering managed versions of Elasticsearch and MongoDB without paying the original vendors. When the license change happened, AWS forked the last permissively licensed version and kept maintaining that in the open.
However, one can also clearly interpret these license switches as follows: All the companies are mature businesses. Some of them are even publicly traded. Their growth was substantially fueled by the benefits of open source. Now that they have reached a certain point in their development, they have less to gain from the benefits than they stand to lose from the risks. However, they can't afford to fully alienate their communities and therefore do an awkward dance around licenses that, in effect, exclude most competition and also strictly limit free riding.
Main takeaway: Evidence seems to suggest that permissive licenses mostly benefit early-stage companies. (However, one could also interpret the evidence in such a way that permissive licenses are a relic of the past and companies are now switching to more modern, restrictive licensing schemes.)
It is common wisdom that open source works best for markets with many potential customers. There needs to be enough room for an attractive number of paying customers even if a large fraction of customers is "free riding". If there are only five potential (huge) customers in the world, one probably shouldn't go for a "loose" or "copyleft" license. However, it could still work out if some converted to paying customers over time.