Edge computing has a dismal reputation. Although the continuing miniaturization of computing elements has made it possible to put small ARM processors pretty much anywhere, general-purpose tasks don't make much sense at the edge. The most obvious reason is that no matter how powerful the processor might be, a mix of power, bandwidth, and cost constraints argues against that model.
Beyond this, the interesting forms of machine learning and decision-making can't possibly occur in an autonomous way. An edge sensor has the data it captures directly, plus whatever configuration we pushed to it last night, but very little real-time context: if every sensor tried to share its data with every other sensor that might be interested in that data, the resulting n^2 pattern would overwhelm even the beefiest ARM configuration. Yet exchanging smaller data summaries implies that each device will compute against a different mix of detail.
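To see why the n^2 pattern is so punishing, here is a back-of-envelope sketch. The sensor counts and data sizes are hypothetical stand-ins, chosen only to show the scaling gap between raw all-pairs exchange and summary exchange:

```python
# Hypothetical numbers: compare total per-round traffic when every sensor
# sends its full capture to every other sensor, versus sending only a
# compact summary. The quadratic term dominates either way; only the
# per-sensor payload changes.

def all_pairs_traffic(n_sensors: int, bytes_per_sensor: int) -> int:
    """Each of n sensors sends its payload to the other n - 1 sensors."""
    return n_sensors * (n_sensors - 1) * bytes_per_sensor

raw = all_pairs_traffic(100, 5_000_000)   # 100 cameras, ~5 MB frames
summary = all_pairs_traffic(100, 2_000)   # the same cameras, ~2 KB summaries

print(f"raw exchange:     {raw / 1e9:.1f} GB per round")      # 49.5 GB
print(f"summary exchange: {summary / 1e6:.1f} MB per round")  # 19.8 MB
```

Summaries make the traffic tractable, but at the price noted above: each device now works from a different, lossy view of the shared state.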
This creates a computing model constrained by hard theoretical bounds. In papers written in the 1980s, Stony Brook economics professor Pradeep Dubey studied the inefficiency of game-theoretic multiparty optimization. His early results inspired follow-on research by Berkeley's Elias Koutsoupias and Christos Papadimitriou, and by my colleagues here at Cornell, Tim Roughgarden and Eva Tardos. The bottom line is unequivocal: there is a huge "price of anarchy." In an optimization system where parties independently work towards an optimal state using non-identical data, even when they can find a Nash equilibrium, that state can be far from the global optimum.
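A tiny concrete instance makes the idea tangible. The sketch below uses Pigou's classic two-link routing example (the standard textbook illustration, not drawn from the papers cited above): one unit of traffic chooses between a fixed-delay link with cost 1 and a congestible link whose cost equals its load x.

```python
# Pigou's example: selfish routing versus globally optimal routing.
# A fraction x of the traffic takes the congestible link (cost x each);
# the rest takes the fixed link (cost 1 each).

def average_cost(x: float) -> float:
    """Average delay when fraction x uses the congestible link."""
    return x * x + (1 - x) * 1.0

# Nash equilibrium: the congestible link never costs more than 1, so every
# selfish driver takes it. At x = 1 the average cost is 1.
nash_cost = average_cost(1.0)

# Global optimum: split the traffic to minimize x^2 + (1 - x), which is
# achieved at x = 0.5 with average cost 0.75 (found here by a grid search).
opt_cost = min(average_cost(x / 1000) for x in range(1001))

print(nash_cost / opt_cost)  # the price of anarchy here: 4/3
```

Even in this two-link toy, selfish behavior costs a third more than coordination; with non-identical data and richer strategy spaces, the gap can be far worse.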
As a distributed-protocols person who builds systems, my obvious instinct would be to explore more efficient data-exchange protocols for the edge: systems in which the sensors iteratively exchange subsets of data in a smarter way, using consensus to agree on the data so that they are all computing against the same inputs. There has been plenty of work on this, including some of mine. But little of it has been adopted, or even deployed experimentally.
The core problem is that communication constraints make direct sensor-to-sensor data exchange difficult and slow. If a backlink to the cloud is available, it is almost always best to just use it. But if you do, you end up with an IoT cloud model, where data is first uploaded to the cloud, then some computed result is pushed back to the devices. The devices are no longer autonomously intelligent: they are basically peripherals of the cloud.
Optimization is at the heart of machine learning and artificial intelligence, and so all of these observations lead us towards a cloud-hosted model of IoT intelligence. Other options, for example ones in which brilliant sensors are deployed to implement a decentralized intelligent system, might yield collective behavior, but that behavior will be suboptimal and perhaps even unstable (or chaotic). I was once quite interested in swarm computing (it seemed like a natural outgrowth of gossip protocols, on which I was working at the time). Today, I've come to doubt that robot swarms or self-organizing convoys of smart cars can work, and if they can, that the quality of their decision-making could compete against cloud-hosted solutions.
In fact, the cloud has all sorts of magical superpowers that enable it to perform operations inaccessible to the IoT sensors. Consider data fusion: with multiple overlapping cameras operating from different perspectives, we can reconstruct 3D scenes -- in effect, using the images to generate a 3D model and then painting the model with the captured data. But to do this we need lots of parallel computing and heavy processing on GPU devices. Even a swarm of brilliant sensors could never create such a fused scene given today's communication and hardware options.
And yet, even though I believe in the remarkable power of the cloud, I'm also skeptical about an IoT model that presumes the sensors are dumb devices. Devices like cameras actually possess remarkable powers too, ones that no central system can mimic. For example, if preconfigured with some form of interest model, a smart sensor can triage images into three tiers: data to upload, data to retain but report only as a thumbnail with associated metadata, and data to discard outright. A camera may be able to pivot so as to point the lens at an interesting location, to focus in anticipation of some expected event, or to configure a multispectral image sensor. It can decide when to snap the photo, and which of several candidate images to retain (many of today's cameras take multiple images, and some even do so with different depths of field or different focal points). Cameras can also do a wide range of on-device image preprocessing and compression. If we overlook these specialized capabilities, we end up with a very dumb IoT edge and a cloud unable to compensate for its limitations.
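The three-way triage can be sketched in a few lines. The interest score and the two thresholds below are hypothetical stand-ins for whatever interest model the cloud would actually push to the camera:

```python
# A minimal sketch of on-camera triage: upload, keep a thumbnail, or
# discard. The 0..1 "interest" score and the threshold values are made-up
# placeholders for a cloud-supplied interest model.

from dataclasses import dataclass

@dataclass
class Capture:
    frame_id: int
    interest: float  # score from the preconfigured interest model, in [0, 1]

def triage(capture: Capture,
           upload_above: float = 0.8,
           thumbnail_above: float = 0.3) -> str:
    """Classify one capture into one of the three tiers."""
    if capture.interest >= upload_above:
        return "upload"
    if capture.interest >= thumbnail_above:
        return "thumbnail+metadata"
    return "discard"

scores = [0.95, 0.5, 0.1]
decisions = [triage(Capture(i, s)) for i, s in enumerate(scores)]
print(decisions)  # ['upload', 'thumbnail+metadata', 'discard']
```

The point of the sketch is that the decision runs where the data already is: only the top tier ever consumes uplink bandwidth.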
The future, then, actually will demand a form of edge computing -- but one centered on a partnership: the cloud (or perhaps a cloud edge running on a platform near the sensors, as with Azure IoT Edge) working in close concert with the attached sensors to configure them dynamically, reconfigure them as conditions change, and even pass them knowledge models computed in the cloud that they can use on-camera (or radar, lidar, microphone) to improve the quality of the information captured. Each element has its unique capabilities and roles.
Even the IoT network is heading towards a more and more dynamic and reconfigurable model. If one sensor captures a huge and extremely interesting object while the others have nothing notable to report, it may make sense to reconfigure the WiFi network to dedicate maximum resources to that one WiFi link. Moments later, having pulled the video to the cloud edge, we might shift those same resources to a set of motion sensors that are watching an interesting pattern of activity, or to some other camera.
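One simple way to picture this reallocation is proportional sharing of a fixed link budget. The device names, priority numbers, and the 100 Mbps budget below are all invented for illustration:

```python
# A toy sketch of dynamic bandwidth reallocation: divide a fixed WiFi
# budget among sensors in proportion to how "interesting" their current
# data is. All the concrete numbers here are hypothetical.

def allocate(budget_mbps: float, priorities: dict) -> dict:
    """Proportional-share split of the link budget across devices."""
    total = sum(priorities.values())
    return {name: budget_mbps * p / total for name, p in priorities.items()}

# One camera sees a huge interesting object; everything else is quiet.
before = allocate(100.0, {"cam-1": 50.0, "cam-2": 1.0, "motion-1": 1.0})

# Moments later, the video has been uploaded and the motion sensors are
# watching an interesting pattern, so the priorities flip.
after = allocate(100.0, {"cam-1": 1.0, "cam-2": 1.0, "motion-1": 50.0})

print(round(before["cam-1"], 1), round(after["motion-1"], 1))
```

A real deployment would of course involve actual radio scheduling rather than a dictionary of floats, but the control loop -- cloud-edge logic reweighting the network moment to moment -- is the idea the paragraph describes.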
Perhaps we need a new term for this kind of edge computing, but my own instinct is to just co-opt the existing one -- the bottom line is that the classic idea of edge computing never really went very far and, reviled or not, is best "known" to people who aren't even active in the field today. The next generation of edge computing will be done by a new generation of researchers and product developers, and they might as well benefit from the name recognition. I think they can brush off the negative associations fairly easily, given that edge computing never actually took off and then collapsed, nor received any kind of extensive coverage in the commercial press.
The resulting research agenda is an exciting one. We will need to develop models for computing that single globally optimal knowledge state, yet also for "compiling" elements of it to be executed remotely. We'll need to understand how to treat physical-world actions like pivoting and focusing as elements of an otherwise von Neumann computational framework, and to weigh the possibility of capturing new data side by side with the possibility of iterating a stochastic gradient descent one more time. There are questions of long-term knowledge (which we can compute on the back-end cloud using today's existing batched solutions), but also contextual knowledge that must be acquired on the fly, and then physical-world "knowledge," such as a motion detection that might trigger a camera to acquire an image. The problem poses open questions at every level: the machine learning infrastructure, the systems infrastructure on which it runs, and the devices themselves -- not brilliant and autonomous, but not dumb either. As the area matures and we gain some degree of standardization around platforms and approaches, the potential seems enormous!
So next time you teach a class on IoT and mention exciting ideas like smart highways that might sell access to high-speed lanes or other services to drivers or semi-autonomous cars, pause to point out that this kind of setting is a perfect example of a future computing capability that will soon supplant past ideas of edge computing. Teach your students to think of robotic actions like pivoting a camera, focusing it, or even configuring it to select interesting images as one facet of a rich and complex notion of edge computing that can take us into settings inaccessible to the classical cloud, and yet equally inaccessible even to the most brilliant of autonomous sensors. Tell them about those theoretical insights: it is very hard to engineer around an impossibility proof, and if this implies that swarm computing simply won't be the winner, let them think about the implications. You'll be helping them prepare to be leaders in tomorrow's big new thing!