I’m a Cornell faculty member who has worked in distributed computing (and more recently cloud computing) for something like 35 years. I started working in this area as an undergraduate, when it was research just to connect computers together. As a PhD student, I hung around with the BSD group (Bill Joy and the other folks who created the version of Unix most of us use), and at one point I even built my own distributed operating system (it worked, but wasn’t a huge success).
After I got my PhD, I took a job at Cornell, and started what became a whole career exploring fault-tolerance, object-oriented replication and programming tools for distributed computing, and the theory and practice involved in creating systems with strong consistency, fault-tolerance, and other guarantees. So I've been there from the start and have had an unusual opportunity to watch the field evolve and develop, which has been fun.
Why did I decide to write these essays? (I'm not sure this is a blog in the standard sense!) One reason is that I have time on my hands, and a second is that I'm in some interesting places right now: during fall of 2016 and spring 2017 I'm on sabbatical. Blogging will be a sideline: my main goal is to use this trip to work on our new Freeze Frame File System (FFFS) and our Derecho cloud computing platform (learn more about them on my home page, here). But you can't spend all day designing algorithms and protocols, so I'm also talking to people to try to learn how the field is evolving. My notes became these reports: too long to call them normal blogging, but not research papers or book chapters, either.
So far I've been able to connect with people in industry at places like Intel, Google, IBM, Amazon, Microsoft, Mesos, Databricks, Mellanox, Huawei… I’m a good listener, and I read a lot, and I am pulling what I’ve learned into notes (carefully vetted to not reveal anything proprietary, I should emphasize). So this set of notes captures my insights.
I'm the kind of person who is always pretty technical, and that will be true here, too. I'm just not the right person to write engaging human interest stories. So these will be technology essays, and aimed at the kind of people I talk to in a normal day: people who already know a lot about cloud computing and the field.
But don't feel unwelcome if you are new to the area and trying to use these notes to catch up quickly: everyone starts somewhere! So, if you have a naïve question or can't understand something I said, just ask! In fact I love to work with students and that's one thing I miss out here in sabbatical-land. Plus, I'm not always right (some people think I'm never right) and perhaps you'll notice an error. Honestly, I really do welcome comments and feedback, and I want to hear other points of view. Feel free to post here if your question would interest other people. You can also email me if you are just puzzled.
Here are some topics I thought I might start with. I'll post a few all at once because those are already written, but then I'll slow down, so it will take a little while before I post everything. The big topics have multiple threads for subtopics, and on those there will be a "topic leader" and a number to indicate the recommended order to read them in.
Core cloud computing topics:
- Why RDMA will ultimately rule the cloud [big topic].
- How to design applications where safety issues arise in the application itself.
- Derecho: Cornell's cool new tool for building highly available, fault-tolerant cloud services.
- The Internet of Things needs a Real-Time Cloud.
- Is CAP still valid in a world that has RDMA?
- Transactions and their proper role in systems that use replication [big topic].
- What role will the container / Docker trends have?
- How does one georeplicate a cloud service?
- What special issues does multitenancy introduce?
- Heavy tails and Real-Time Skew: The unnoticed barrier to scale?
- To what extent can we offload tasks into hardware accelerators?
- Next generation memory technologies
A few stray topics that don’t fit squarely with cloud computing:
- Is our electric power grid at risk of attack? What can be done to reduce the threat?
- Why liability may ultimately reshape the smart car industry.
- Things I'm learning about Bitcoin.
- It takes an infrastructure: Why AI and ML aren’t the only exciting computing story.