Tuesday 25 September 2018

Blockchain verification and the dangers of branching temporal logic models (very technical)

Many years ago, I worked with some colleagues on an ill-fated project: we tried to write down a logical statement capturing the guarantees provided by atomic multicast systems that manage their own membership.  Today, we know how to do that, courtesy of Lamport's Paxos specification and the proof methods he introduced.

But those were the Wild West days, and that particular project occurred before the Paxos specification was created.  Moreover, our atomic multicast (which actually could be configured as a Paxos protocol) also included some optional features Leslie omitted from Paxos, for good reason.  Those centered on a form of optimistic early delivery combined with barrier synchronization (analogous to a memory fence).

Our puzzle centered on expressions describing "all the possible" future behaviors that could arise with this rather complex optimistic form of early delivery.  The problem was that the set of future scenarios grew exponentially as new members joined and existing members failed (or exited voluntarily).  Our logic needed to "account" for all of these possibilities.  In fact, the logic itself had a flaw, but even had we managed to write it down properly, we still would have had a semantics in which statements could only be interpreted within the "space" of future scenarios.  Since it is intractable to do model checking in an exponentially growing state space, such statements are often undecidable: they can have a true or false value, and yet no decision procedure can be created that will terminate in bounded time.

A total mess.  We ultimately abandoned the effort, tails between our legs, and came to view it as an embarrassment.  The less said about it, the better!

Except... I sometimes tell the story, as a cautionary tale.

Not every technology suffers from the issue we encountered.  Lamport's way of formalizing Paxos sidesteps it entirely by avoiding speculative "early delivery", which was the real culprit in our early work.  This is one reason that Lamport's Paxos specification was such a success.

Transactional database models also have a better way of handling such problems: when we ask whether a database system is in a serializable state, the rule is to start by erasing all the uncommitted transactions, at which point serializability is defined as a property of the committed state.  This approach accepts that transactions could glimpse inconsistent states while executing: it isn't a problem so long as those transactions can't commit.  Moreover, it erases all the events that depend on future outcomes, neatly avoiding the whole issue our unfortunate effort ran up against.
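
To make that rule concrete, here is a minimal sketch in Python (the function and every name in it are my own illustration, not any real database engine's API): erase the uncommitted transactions first, and only then test whether the conflict graph over what remains is acyclic.

```python
# Minimal sketch of "serializability is a property of the committed state":
# step 1 erases uncommitted work, and only then do we ask the question.

def is_serializable(schedule, committed):
    """schedule: ordered list of (txn, action, item), action in {"r", "w"};
    committed: the set of transactions that have committed."""
    # Step 1: erase every operation belonging to an uncommitted transaction.
    ops = [(t, a, x) for (t, a, x) in schedule if t in committed]

    # Step 2: conflict graph -- edge t1 -> t2 when t1 touches an item
    # before t2 does and at least one of the two accesses is a write.
    succ = {t: set() for t in committed}
    for i, (t1, a1, x1) in enumerate(ops):
        for (t2, a2, x2) in ops[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (a1, a2):
                succ[t1].add(t2)

    # Step 3: serializable iff the conflict graph is acyclic (DFS check).
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {t: WHITE for t in committed}

    def cyclic(t):
        color[t] = GRAY
        for v in succ[t]:
            if color[v] == GRAY or (color[v] == WHITE and cyclic(v)):
                return True
        color[t] = BLACK
        return False

    return not any(cyclic(t) for t in committed if color[t] == WHITE)
```

The key move is Step 1: a transaction that glimpsed an inconsistent state but never committed simply vanishes from the question, along with every event that depended on its outcome.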

Which brings us to BlockChain.  I'm intrigued by the recent work that seeks to put a kind of transactional database behavior "into" the BlockChain, by incorporating SQL-like statements into the transactions themselves, but then reevaluating them as the BlockChain steadily grows.

To appreciate why this poses the same problem I struggled with twenty years ago, think about a smart contract that says something along the following lines: "John agrees to sell me his horse, Bucky, for the sum of $1,500, and has accepted a $150 deposit.  If I haven't completed the purchase within a week, John agrees to return the deposit.  But in the meanwhile, John can continue to try and sell Bucky.  If he finds another buyer, he can cancel my transaction, but in that case must both return the deposit and also pay me an additional $100, to compensate me for my trouble."

The world is full of contracts like these.  Smart contracts can express things like rules for computing interest that depend on global interest rates.  We probably all remember 2008, when the world financial system melted down over issues with mortgage-backed securities split into interest and principal.  The claim is that the expressive power of smart contracts is a good thing, because smart contracts can be analyzed by tools (compiler-style tools), and hence it should be possible to automatically identify risks.  Risk managers would then have robust ways to compute their risk and hedge against it, and those hedging contracts (insurance deals) could also be carefully evaluated, etc.  Back in 2008, it wasn't the mortgage-backed securities that collapsed, per se.  It was the insurance companies that insured them, but didn't hedge the "secondary" risk properly.

So... how does one attach a "formal meaning" to a smart contract?  Let's go back to John's sale of Bucky.  Notice that this contract depends on how things play out into the future.  For the coming week, the BlockChain will grow, and each new block added to the chain could bring events relevant to the contract.  John could decide to cancel the contract and sign a new contract with Sally (perhaps she is ready to pay more -- enough to more than compensate John for the $100 he'll forfeit).  I could show up with the remaining $1,350, and head home with Bucky.  A week could pass, and John would have to return my $150 deposit.  And it gets worse: John could sign with Sally, but then Sally might cancel her deal, and perhaps John would then want to reinstate his deal with me.

Much as in that early work I tried to do, John's smart contract with me has a meaning that can depend on a branching future state: some large (maybe exponential) space of possibilities, each leading to its own proper interpretation of the correct "thing to do".  Should John hand Bucky over to me, or not?  Do I owe him $1,350, or does he owe me $150, or should it be $250?
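
To see both the appeal and the trap, here is a toy rendering of the Bucky contract in Python.  The encoding and the event names are mine; only the amounts come from the story above.

```python
# Toy model (event names invented): the Bucky contract as a function from
# one concrete future -- a sequence of chain events -- to a settlement.
# The $150 deposit has already been paid; amounts are what John owes me.

DEPOSIT, PRICE, PENALTY = 150, 1_500, 100

def settle(future):
    """Scan one linear future; report (who gets Bucky, what John owes me)."""
    for event in future:
        if event == "i_pay_balance":        # I pay PRICE - DEPOSIT = $1,350
            return ("me", 0)
        if event == "john_sells_to_sally":  # John cancels: deposit + $100
            return ("sally", DEPOSIT + PENALTY)
        if event == "week_elapses":         # deal lapses: deposit comes back
            return ("john", DEPOSIT)
    return (None, None)                     # still unresolved

print(settle(["john_sells_to_sally"]))  # ('sally', 250)
print(settle(["i_pay_balance"]))        # ('me', 0)
print(settle(["week_elapses"]))         # ('john', 150)
```

Even this toy version understates the difficulty: if Sally can later cancel, then "john_sells_to_sally" is not a terminal event after all, and settle() can no longer be written as a single scan over one linear future.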

Without much trouble, we can design sequences of smart contracts in which, to know the proper outcome for my contract, I need to figure out the outcome of Sally's contract (and this then becomes an induction, because Sally's contract may depend on the outcome of Carl's contract).  This is precisely how my early work failed: you end up with scenarios that can be arbitrarily prolonged, and the total space of scenarios grows exponentially in the length of the future chain, because of an ever-longer sequence of new events, each of which depends on its own future outcomes.
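
A back-of-the-envelope sketch of that growth (the parameters are invented, but the shape is the point): if each new block can carry any of b contract-relevant events, a contract whose meaning depends on the next d blocks must, in the worst case, be evaluated against b^d distinct futures.

```python
# Sketch of the blow-up: b relevant events per block, d blocks of future.
from itertools import product

b = 3                            # say: pay / cancel / nothing-relevant

def futures(d):
    """Every possible d-block suffix (what a model checker must cover)."""
    return product(range(b), repeat=d)

# We can only afford to *count* the scenarios, not enumerate them:
for d in (5, 10, 20, 30):
    print(f"depth {d:2}: {b ** d:,} scenarios")
# depth  5: 243 scenarios
# depth 10: 59,049 scenarios
# depth 20: 3,486,784,401 scenarios
# depth 30: 205,891,132,094,649 scenarios
```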

Beyond all of which we have the issue of rollbacks: even if you accept the common wisdom and adopt the view that a BlockChain prefix has magically "committed" once it has been extended by six or more blocks, we still run into the problem that the suffix is unstable.  So we could have one suffix in which Sally's transaction finalizes, but it might then roll back, aborting that outcome and perhaps replacing it with one in which Sally cancels her purchase.
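
Continuing the toy model from above: the same "committed" prefix can be extended by suffixes that settle the contract in incompatible ways, and a reorganization can swap one for the other after we have already acted.

```python
# Two candidate suffixes extending the same six-blocks-deep prefix; the
# settle() function from the toy contract above ignores unrelated events,
# so this runs as-is.
prefix   = ["unrelated_block"] * 6
suffix_a = ["john_sells_to_sally"]             # the fork we acted on
suffix_b = ["sally_cancels", "i_pay_balance"]  # the fork after a rollback

print(settle(prefix + suffix_a))   # ('sally', 250)
print(settle(prefix + suffix_b))   # ('me', 0)
```

Nothing in the contract tells us which of these settlements is "the" meaning until the fork resolves, and the six-block rule only protects the prefix, never the suffix.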

Should it trouble us that smart contracts on BlockChains might not have a tractable meaning -- a reduction to temporal logic -- if they include future references?  For that matter, even without future references, some of the aspects just mentioned would still arise.   Is this bad?

I think so: it seems to me that in computing, if we've learned one thing over the decades, it is the overarching importance of rigorous semantics.  If BlockChains with smart contracts can't be reduced to a stable logical framework in which proofs can be carried out without solving infeasible problems (evaluating logical formulas within exponentially growing state spaces is a well-known infeasible problem), then we are looking at a profoundly untrustworthy framework.

So beware, all of you rabid BlockChain investors!  If you are betting big on smart contracts, you owe it to yourselves to figure out a way to reduce the statements those contracts make to a stable, computationally feasible form.  You know what they say: those who fail to learn from the past are doomed to repeat it.  If you don't find a robust and tractable semantics for your technology, then someday, twenty years from now, you, too, will be writing little blog postings about how your work once took a hopelessly wrong turn... and that Professor Birman's sad story of his unfortunate foray into the theory of branching future executions should have warned you!

Thursday 13 September 2018

Will HPC survive the cloud?

I just got back from an HPC workshop, where a lot of the discussion was focused on the impact of cloud computing on HPC.  Here are a few of the main take-aways.
  • First, to put this up front: HPC is alive and well.  A whole slew of amazing new computers are about to be powered up, operating at speeds that just defy human understanding.  So HPC isn't about to collapse and die tomorrow.  (Ten years out, though, is a more complex question.)
  • Some of the really big financial drivers for HPC are things that genuinely need massive compute infrastructures: tasks like weather prediction, scientific computing from experiments like the LIGO gravitational-wave observatory, modelling the air flow around a supersonic jet.
  • But more and more HPC tasks have an embarrassingly parallel structure and really map down to huge numbers of subtasks that don't need massive computers to perform.  One person estimated that 90 to 95% of the workload on today's biggest computers consists of vast numbers of smaller jobs that run as a batch, but could easily be performed on smaller machines if they had the right hardware.
  • And several speakers put pictures of big cloud computing data centers up, and pointed out that no matter how exciting those new HPC systems will be, even a small cloud data center has vastly more compute power in it, and vastly more storage capacity.
  • On top of this, we have the runaway success of Microsoft's Azure HPC, which has become a genuinely hot cloud platform -- the demand far exceeds what people had expected, based on industry articles I've followed over the past few years.  Azure HPC offers smallish clusters that might have, say, 48 machines, but those machines would then be running the same "bare metal" platforms you see on massive HPC supercomputers.  And nothing stops Azure HPC from ramping up and starting to offer larger and larger configurations.  Rather than run MPI over RoCE, Microsoft just puts a second network infrastructure on their Azure HPC clusters, using InfiniBand for MPI and treating the standard Ethernet as a control network for general TCP/IP uses.

So this is the looming threat to the HPC community: not so much that HPC might suddenly lose its steam, but rather that we could see some non-trivial percentage of the HPC jobs migrate towards platforms like Azure HPC.  And in fact one speaker at the workshop was the head of computing for a large research university, who told us about a consortium being formed to promote just that transition.  What he explained was that while really big HPC still needs the big data centers, like the U. Texas XSEDE systems, most of the campus needs could be adequately served with smaller resources.  This makes it appealing for universities to rent, rather than own, and by forming consortia, they could have the bargaining power to make financially compelling deals with big cloud HPC operators like Microsoft (and not just Microsoft -- he pointed out that as a buyer shopping around, he was getting bids from quite a few cloud providers).

The issue this raises is that it redirects money that would in the past have flowed to the HPC data centers towards those big providers.  Imagine a world in which, say, five years from now, 30% of today's HPC has moved to cloud solutions.  The loss of that income base could make it very hard for the big data centers to continue to invest and upgrade.  Meanwhile, all that cash flowing to the cloud operators would incent them to explore more and more ambitious cloud-hosted HPC products, trying to find the sweet spot that maximizes income without overstretching them.

The second issue I'm seeing relates to my new favorite topic: the intelligent, reactive cloud edge.  Recall from my past few blog postings that I've been fascinated by the evolution of the first tier of the cloud: machines inside the data center, but on the front line, running services that directly handle incoming data from IoT devices, real-time uses like smart cars or smart homes, or other time-critical, highly demanding applications.  Part of my interest is that I'm really not fond of just working on web servers, and these intelligent IoT applications need the mix of fault-tolerance and consistency that my group specializes in: they seem like the right home for our Derecho technology and the file system that runs over it, Freeze Frame.

But this has an HPC ramification too: if companies like Microsoft want Azure HPC to be a player in their cloud applications, they will invest to strengthen the options for using HPC solutions as part of real-time edge applications.  We'll see a growing range of HPC platforms that tie deeply into the Azure IoT Edge, for example, and HPC could start to perform demanding tasks under real-time pressure.

Right now, that integration isn't at all mature -- HPC systems are casual about endlessly slow startup (I did one experiment with MPI and was shocked to realize that multi-minute delays are totally common between when a job "starts" and when the full configuration of the job is actually up and ready to run my application).  We could talk about why this is the case: they do silly things like pulling the container images one by one on the nodes as they launch, and sometimes actually pull DLLs in one by one as needed too, so the causes are totally mundane.  Derecho (or even its RDMC component) could be "life transforming" for this kind of thing!  But the real point is that it can be fixed.
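
That startup delay is easy to measure for yourself.  Here is a minimal probe in the spirit of the experiment I mentioned, written with mpi4py (the script and its invocation are my own illustration; it assumes mpi4py is installed and that node clocks are roughly synchronized):

```python
# startup_probe.py -- measure the gap between launching an MPI job and the
# moment every rank is actually up and ready to compute.  Run it as:
#   LAUNCH=$(python -c 'import time; print(time.time())')
#   mpirun -np 48 python startup_probe.py $LAUNCH
import sys
import time

from mpi4py import MPI          # initializes MPI on import

comm = MPI.COMM_WORLD
launch = float(sys.argv[1])     # wall clock recorded just before mpirun

comm.Barrier()                  # returns only once *every* rank is up
ready = time.time()             # assumes roughly synchronized node clocks

if comm.Get_rank() == 0:
    print(f"job fully up {ready - launch:.1f} seconds after launch")
```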

So imagine that over a five year period, the Azure edge, and similar systems from Amazon and other providers, start to really do a great job of integrating HPC into the cloud.  The rich and extensive tool set the HPC community has developed suddenly becomes available to cloud application creators, for use in real-time situations, and it becomes easy to capture data and "farm it out" to HPC with no significant delays at all (I mean milliseconds, whereas today, that might be minutes...).  Wow, what an exciting thing this could yield!!!

For example, in the electric power grid settings I've worked on, one could do micro-predictions of wind patterns or sunshine patterns and use that to anticipate the power output from wind farms or solar farms.   You could adjust the wind turbines dynamically to maximize their productivity. Someday, with enough knowledge of the communities connected to the grid, we could even predict the exact power output from city-scale rooftop solar deployments.  Just that one use case could be transformative!

Then you can imagine all kinds of image processing and data fusion tasks that would be feasible today in offline settings, but way out of reach for real-time applications.  Suddenly they could become HPC subtasks in this hybrid cloud: a fast, reactive edge with HPC clusters available as a module to the edge applications.  HPC could become a major player in the cloud ecosystem.

This is the bigger threat to the traditional HPC community, as I see it: a threat of explosive innovation that could win by just being more exciting, faster growing, more lucrative, and more massive in scale.  It wouldn't take long before HPC on the cloud would be the hot setting for young researchers to tackle, and HPC on traditional supercomputers would begin to starve simply because it would look more and more like a legacy world.

At the workshop, we actually had one speaker who made the case that HPC supercomputers were in a "race" to own time-critical (real-time) HPC compute tasks.  But there were many speakers, myself included, who argued that no, the race is already over -- the cloud won before the HPC community even knew that the opportunity existed.  Today, the real race is the race to be a player in this new thing: the intelligent IoT-based edge.  And HPC as a component of that story clearly has a very bright future.