Thursday 28 June 2018

When open source is the right model

At DSN, I found myself in conversation with some entrepreneurs who were curious to know why, in an era when people are making billions on relatively small ideas, we aren't adopting a more mercenary IP stance with Derecho.  For them, our focus on papers at DSN and TOCS and on open software that really works was a strange choice, given that we do have a shot at products and startups that could be pretty lucrative.

Here are some dimensions of the question worth pondering.
  • Academic research is judged by impact, meaning broad adoption, lots of citations, etc.  
  • We started our project with DARPA MRC funding.  DARPA insisted that we use open source licensing from the start, but we can pretend it didn’t and still reach the same conclusion.
  • Publicly funded research should benefit the taxpayers who wrote the checks.  For a system like Derecho, this means the system needs to be useful, to be adopted by US companies that can leverage our ideas in their products, and to help those companies hire lots of people in high-paying jobs.  Derecho should enable high value applications that would not have been possible without it.
Should Derecho be patented and licensed for a fee?
  • Patents don’t protect mathematical proofs or theorems (or algorithms, or protocols).  Patents protect artifacts.  I often end up in debate with theory people who find this frustrating.  Yet it is the way the system works.  Patents simply cannot be used to protect conceptual aspects, even those fundamental to engineering artifacts that use those concepts in basic ways.  They protect the actual realization: the physical embodiment, the actual lines of code.  
  • Thus my group could perhaps patent Derecho itself, the actual software, through Cornell (this ownership assignment is defined under the US Bayh-Dole Act).  But we cannot pursue a patent on state machine replication, the 1980s model (the theory) underlying Derecho.  Our patent would be narrow, and would not stop you from creating your own system, Tornado, with similar algorithms inspired directly by our papers.  Sure, we could work with a lawyer to arrive at tricky patent-claim wording that a naive reader might believe to cover optimal state machine replication.  Yet even if the USPTO were to allow the resulting claims, no judge would uphold the broad interpretation and rule in our favor against Tornado, because patents are defined to cover artifacts, not mathematical theories.  Software patent claims must be interpreted as statements about Derecho’s embodiment of the principle, not the principle itself.  This is just how patents work.
  • Wait, am I some kind of expert on patents?  How would I know this stuff about patent law?  Without belaboring the point, yes, I actually am an expert on software IP and patents, although I am not an IP lawyer.  I got my expertise by fighting lawsuits starting in the 1990s.  Most recently I was the lead expert witness in a case with a lot of money at stake ($Bs), and I’ve worked with some of the country’s best legal IP talent.  My side in these cases never lost, not once.  I also helped Cornell develop its software IP policies.
  • Can open source still be monetized?  Sure.  Just think about Linux.  Red Hat and other companies add high value, yet Linux itself remains open and free.  Or Databricks (Spark).  Nothing stops us from someday following that path.
So, why should this imply that Derecho should be free, open source?
  • There are software systems that nobody wants to keep secret.  Thousands of people know every line of the Linux kernel source code, maybe even tens of thousands, and this is good because it enables Linux to play a universal role: the most standard “device driver” available to the industry, taking the whole machine to be the device.  We need this form of universal standard, and we learned decades ago that without standards, we end up with a Tower of Babel: components that simply don’t interoperate.  The key enabler is open source.
  • That same issue has denied us a standard, universal solution for state machine replication.  We want this to change, and for Derecho to be the standard.
  • There are already a ton of free, open source software libraries for group communication.  Linux even has a basic block replication layer, built right in.  You need to value an artifact by first assessing the value of other comparable things, then asking what the value-add of your new technology is, then using the two to arrive at a fair market value and sales proposition.
  • But this suggests that because Derecho competes with viable options (“incumbents”) that have a dollar value of zero, then even if it has a high differentiated value as a much better tool, the highest possible monetary value for it would need to be quite low, or the market would reject it.  So yes, we could perhaps charge $5 for a license, but we would be foolish to try and charge $500K.  
  • You might still be able to construct a logic for valuing Derecho very high and then licensing it just once.  The broader market would reject the offering, but some single company might consider taking an exclusive license.  So you could protect Derecho with a patent, then sell it.  The system would end up as a proprietary product owned fully by the buyer: your US tax dollars hard at work on behalf of that one lucky buyer.  But now what happens?  In fact, someone else could simply create a new free version.  Examples?  Think about HDFS and Hadoop and Zookeeper (created to mimic GFS, MapReduce and Chubby, all proprietary).  To me the writing is on the wall: if Derecho isn’t offered for free, the market will reject it in favor of less performant free software, and then if the speed issue becomes a problem, ultimately someone else would build a Derecho clone, offering it as free software to fill the gap.  They would find this fairly easy, given our detailed papers, and it would be totally legal: recall that you can’t patent protocols.
Conclusion? To maximize impact, Derecho needs to be open source.
