
tl;dr—use the OSI-approved MIT License

Oh yeah, and IANAL (I Am Not A Lawyer—obviously).


(Background: I am working on a project that involves developing some non-trivial software. I plan to host it on GitHub, as much for the convenience as to make my code available to others.)

I can’t find it now, but not too long ago I was reading an article about open source licenses which showed that a large majority (almost all?) of the repositories on GitHub had no license information. GitHub has acted to address this by adding an option to automatically add a license to a new repository, with several licenses to choose from. The author of the article thought this option would only cause more problems, because users will not consider the effects of the different licenses and will just pick one, probably the GPL as it is the best known. Incompatibility between GPL versions, and more generally between the various open source licenses GitHub offers, has the potential to cause more fragmentation as users pick a license without due consideration.

The upshot is that GitHub users need to be more informed and proactive about the way we release our software. Just because GitHub makes it so easy doesn’t mean we’re done as soon as we type git push origin master.

It was thus that I embarked on a journey to learn what I should know about open source software licenses and make a decision on what license to use. After wandering adrift on the ocean of the internet for approximately 1000x as long as I initially intended, I have come to some conclusions which I have recorded here to try to make myself feel better about how long it took to get here.

Conclusions First (Or Second, Rather)

My conclusion is: use a permissive license listed as compatible with the GPLv2 by the FSF. For myself, I am choosing the MIT License as approved by the OSI. This license is compatible with the GNU GPL, and source code licensed under it can be combined with source code licensed under the GPLv2 (with or without an “any later version” statement) or the GPLv3[1].

My rationale is that it is the least restrictive and most compatible of the commonly used FOSS licenses, along with the Simplified (2-clause) BSD license, to which I believe it is essentially equivalent. Which of the two to pick is mostly arbitrary, but “the BSD license” is ambiguous and refers to two or three different licenses[2]. “The MIT license” can also refer to the X11 license, but I think it most commonly refers to the Expat License, and that is what the OSI approved as the MIT License. GitHub supports the OSI-approved MIT License, among other open source licenses. Thus I think the MIT License is the clearest, easiest, lowest-cost option going forward for licensing a program as open source, for both the developer and the user.

As for the GNU GPL, I think it makes sense to use it if it serves your purpose, but it deserves careful consideration. I don’t think the GPL (any version) should necessarily be the default option for FOSS, especially given the incompatibility between the GPLv2 and GPLv3. If your goal is to make it easy for people to modify and share your software, then it makes more sense to choose a permissive license like the MIT License. If your goal is to ensure that modifications and derivatives of your software remain open source (a perfectly valid goal), then it makes sense to use the GNU GPLv3. (If you use source code that is licensed under the GPLv2 without the “any later version” statement, then you have to use the GPLv2 as well.)
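The rule of thumb above is simple enough to sketch in a few lines of Python. This is only an illustration of my own reasoning, not legal advice. The function name suggest_license and its parameters are made up for this post, the return values are SPDX license identifiers, and returning the "or later" flavor of the GPLv3 is my own assumption.

```python
def suggest_license(keep_derivatives_open: bool,
                    uses_gpl2_only_code: bool = False) -> str:
    """Return an SPDX identifier following the rule of thumb in this post.

    Both parameters are hypothetical names invented for this sketch.
    """
    if uses_gpl2_only_code:
        # Code under the GPLv2 without "any later version" cannot be
        # combined with GPLv3 code, so the combined work stays on GPLv2.
        return "GPL-2.0-only"
    if keep_derivatives_open:
        # Copyleft goal: modifications and derivatives must remain open.
        # Assumption: "or later" rather than "only".
        return "GPL-3.0-or-later"
    # Default: permissive and compatible with both GPL versions.
    return "MIT"


print(suggest_license(keep_derivatives_open=False))
print(suggest_license(keep_derivatives_open=True))
print(suggest_license(keep_derivatives_open=True, uses_gpl2_only_code=True))
```

Real projects have more inputs than two booleans (dependencies under other licenses, employer policies, patents), so treat this as a summary of the paragraph above and nothing more.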

Other Issues


The biggest concern in developing and distributing free or open source software is copyright. Copyright applies as soon as a work is created, without need for any application or process, and software has been copyrightable since 1983. Thus anytime someone writes a piece of code they own the copyright in that code, which means that they can sue anyone else who uses that code without their permission. Software licenses exist to grant others permission to use source code in various ways.

Patents are a separate issue. A patent covers a claimed invention and requires an application to the USPTO; the invention must show originality, yadda yadda, to be granted. The salient point here is that once a patent is granted, it applies to all instantiations of that invention, regardless of whether the creator or manufacturer is even aware of the patent (consequences can be worse if they did know). Thus any source code anywhere could be violating someone’s patent somewhere, and there is no way to ensure that no patents apply other than reading all patents in existence. This is practically impossible, as the number of software patents is in the hundreds of thousands[3]. Yeah, the software patent system in the United States is massively fucked up, not to mince words.

Anyway, corporations and universities often acquire patents, and they may also have reasons to want to open-source software that is related to those patents. Many open source licenses don’t mention patents, and I don’t think it is clear what that means when the patent owners themselves release affected software under those licenses. My guess is that most organizations would be rather leery of this uncertainty. Thus the Apache License 2.0 contains a provision granting permission to use patents in the software and preventing people who modify and redistribute the software from suing other people over those patents. The Educational Community License 2.0 limits the patent license to (roughly) only those patents held by the programmers, and not necessarily all the patents owned by their employing organization. This was apparently to make the license easier for research universities to use without a large amount of administrative work.

This is all irrelevant to me, however, and I suspect to most of GitHub as well. If you do belong to an organization that owns patents, check out the FSF’s list of licenses, and consult a lawyer.

Public Domain

Public domain refers to works that, for whatever reason, are not copyrighted. Most commonly the term refers to works on which the copyright has expired; works produced by the US government are also in the public domain (within the US).

Rather than participate in all of this capitalist, legalist mumbo-jumbo that is copyright and licensing and all that, one alternative that seems attractive is to dedicate your source code to the public domain, thus allowing anyone to use it however they want. The capitalist international legal system turns out not to be so simple, however: not all countries even allow public domain dedications, and such dedications may be revocable. Ultimately, Things Are Not That Simple. Unfortunately.

You can try it if you want to; I feel you, sister, believe me. Apparently The Unlicense has become popular recently, though you might do better with the CC0 dedication as it is more legally tuned and tested. I have a feeling that those might hurt wider adoption somewhat, although the SQLite database is in the public domain, so go figure.

Final Things

I want to mention the CRAPL, by Matt Might at the University of Utah. It’s quite funny, and serious as well: it addresses real needs unique to academic research. On the other hand, I don’t know how it would stand up in court. I do like it, though.

One last thing, which I did not address in the original version of this post: applying a license to your code. I still need to read them, so I will just point to the Software Freedom Law Center’s guidelines on this topic.
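As a rough illustration of what applying a license can look like in practice, here is a small Python sketch that prepends a copyright line and an SPDX license identifier to the text of a source file. This is my own assumption about a reasonable convention, not something taken from the SFLC guidelines. The function name add_license_header is hypothetical, and the holder and year in the example are placeholders.

```python
def add_license_header(source: str, holder: str, year: int,
                       spdx_id: str = "MIT") -> str:
    """Return `source` with a short license header prepended.

    Hypothetical helper for illustration only. It assumes '#' starts a
    comment in the target language (true for Python and shell scripts).
    """
    if "SPDX-License-Identifier:" in source:
        # Already tagged: leave the file alone so reruns are harmless.
        return source
    header = (
        f"# Copyright (c) {year} {holder}\n"
        f"# SPDX-License-Identifier: {spdx_id}\n"
    )
    if source.startswith("#!"):
        # A shebang has to stay on the first line of the file.
        first_line, _, rest = source.partition("\n")
        return first_line + "\n" + header + rest
    return header + source


# Placeholder holder and year, purely for demonstration.
print(add_license_header("print('hi')\n", "Jane Doe", 2013))
```

A header like this only points at the license. The full license text still belongs in the repository, typically in a LICENSE file at the top level.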

Really really super duper finally: the issue of documentation! (which I also forgot about, of course) The upshot: use the Creative Commons CC BY-SA (here) for documentation (see section 8 here, and the rest of that page too while you’re at it)—it’s what Wikipedia would want!

[1] http://www.gnu.org/licenses/license-list.html#Expat
[2] http://en.wikipedia.org/wiki/BSD_license
[3] http://www.groklaw.net/article.php?story=20130715054823358