Leveraging RDMA for Strongly Consistent Replication at Large Scale

Ken Birman

Abstract:

My work focuses on ways of replicating data in demanding settings, most recently the cloud. The cloud is a setting where copying information and replicating data or computation is common, yet it remains difficult to actually create new applications that leverage replication. Moreover, the existing libraries are very slow.

I’ll present work on Derecho, a blazingly fast C++ library for creating scalable data replication solutions. Derecho has a remarkably simple API and very low overheads, shifting as much work as possible from runtime to compile time. We separate control plane from data plane, sending data on RDMA over a multicast overlay, then using an RDMA-based distributed shared memory to coordinate delivery so as to ensure total ordering, data persistency or other desired properties such as virtually synchronous membership management. The overall framework is minimalist, yet can support complex applications, including ones that have subgroup structures or that shard data within larger sets of nodes. Derecho consistency subsumes both Paxos and virtual synchrony multicast into a single model.

Performance is very satisfying: As an atomic multicast, Derecho is tens of thousands of times faster than commonly used libraries, and as Paxos, it beats prior solutions by large factors while scaling far better than any existing Paxos solution.

If time permits, I'll also say a few words about proofs. Although the protocols used in Derecho are based on ones familiar from decades of work on state machine replication and virtual synchrony multicasts, albeit with some novel twists, the new monotonic formulation used to decouple the control plane from the data plane within Derecho is of deeper interest. This approach turns out to simplify our protocols, seems to support simpler proofs, and also highlights the ties between our work and older results on knowledge logics.

Bio:

Ken Birman has been a systems researcher and faculty member at Cornell University since getting his PhD from UC Berkeley in 1981. He is best known for work on virtually synchronous process group computing models (an early version of what has become known as the Paxos family of protocols), and his software has been widely used. The Isis Toolkit that Ken built ran the NYSE for more than a decade, and is still used to operate many aspects of the communication system in the French air traffic control system. A follow-on named Vsync is widely used in graduate courses that teach distributed computing. This talk is based on his newest system, called Derecho.

Time and Place

Wednesday, May 17, 4:30pm
Gates 104