Open source is awesome! So many great projects out there. Individuals, teams, and companies pouring their passion into code that folks can use and evolve. Licenses are a fundamental part of open source, but the reality of finding and complying with the licenses can be daunting and mind-numbing. As an industry, we have not come to terms with this challenge and prefer either denial or throwing money at the problem. I think there is a better way.

The landscape today

Open source license compliance is not a very sexy topic, and I’m certainly not here to convince you otherwise. It’s full of mind-numbing accounting, pro forma actions, and cat herding. At the same time, it’s a key part of the open source ecosystem.

Licenses set out the ground rules for using a piece of content (code, images, docs, and so on). They give you the right to use and distribute the content, and list the associated obligations. The license used in a community sets the core tone of how that community operates. Fundamentally, licenses make open source work.

Somewhere along the way, however, licenses and license compliance built up a mythology and a stench of FUD (fear, uncertainty, and doubt). With open source hitting the early and late majorities in various domains, license compliance has become a mainstream procurement topic. For traditional organizations such as governments, regulated industries, and manufacturing, this FUD is taking on a whole new status and driving organizations to take and demand heroic actions.

At the same time, the scale of open source production and consumption has exploded. As I write, there are about 50M public repos on GitHub, with about a third of them created in the last year. With package managers like npm, it’s trivial to pull in a thousand packages with a simple command. At Microsoft, we currently track millions of integrations of hundreds of thousands of open source components across thousands of products. That’s a lot, and we are not alone.

[Figure: Microsoft open source use scale]

Given the complexities around compliance, it’s not surprising that a whole industry has sprung up around figuring out which open source you have, tracking it, getting data about it, and providing workflows to clear and approve open source use. It’s a big industry with many players, both commercial and open source, and big dollars: Black Duck was acquired in December 2017 for half a billion dollars. Companies regularly spend millions of dollars a year on compliance tools, data, and processing.

This is not healthy. It’s driven by fear.

It’s a whole bunch of friction that wastes time and resources that could otherwise go into working on the projects themselves (e.g., improving security) or creating new ones. It inhibits adoption of projects and limits deeper engagement.

Can we do better?

For sure folks need to honor the terms of the licenses — IMHO that’s a given — but somehow, we as a community and an industry need to make that easier, normalized and largely automated. We need to do better here.

Fortunately, times are changing and there is a raft of open source efforts in this space as well as some changing community norms and perspectives.

I don’t have all the answers, but I do have some ideas. This is the first in a series of posts where I’ll share those thoughts and hopefully hear from you. The coming posts will:

  • Distill open source license compliance (coming soon…)
  • Look at the industry around compliance (coming soon…)
  • Show how open source projects themselves play a role (coming soon…)
  • Ultimately sketch an open source solution to this open source problem (coming soon…)

What do you think? How much effort are you putting into compliance? As an open source project team, how do you think about compliance in your community? Do you care?