In our ICC 2018 paper, we study how to keep network control services stable during intermittent network partitioning. We propose a new leader election algorithm that performs leader election and node clustering according to tunable optimization objectives related to group stability, group size, and merging cost. Our experiments show that, for the same stability requirement, our approach achieves (i) up to 2x larger group size and (ii) up to 12x lower merging costs than existing approaches.
Credit
This is joint work with Shaoteng Liu, Rebecca Steinert and Dejan Kostic, carried out at the RISE SICS Network Intelligence group. The full abstract is as follows:
Abstract
We propose a novel distributed leader election algorithm to deal with controller and control service availability issues in programmable networks, such as Software Defined Networks (SDNs) or programmable Radio Access Networks (RANs). Our approach can deal with a wide range of network failures, especially intermittent network partitions, where splitting and merging of a network occur repeatedly.
In contrast to traditional leader election algorithms, which mainly focus on (eventual) consensus on one leader, the proposed algorithm aims to optimize control service availability and stability and to reduce the controller state synchronization effort during intermittent network partitioning. To this end, we design a new framework that enables dynamic leader election based on real-time estimates acquired from statistical monitoring. With this framework, the proposed leader election algorithm can be flexibly configured to achieve different optimization objectives while adapting to various failure patterns. Compared with two existing algorithms, our approach can significantly reduce the synchronization overhead due to controller state updates (by up to 12x) and keep up to twice as many nodes under a controller.
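To give a feel for what a tunable objective could look like, here is a minimal sketch. It is not the algorithm from the paper. It assumes that each candidate leader comes with three monitored estimates (stability, group size, merging cost) and simply picks the candidate with the highest weighted score. All names (`Candidate`, `score`, `elect_leader`) and the weights are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical per-candidate estimates, e.g. from statistical monitoring."""
    node_id: str
    stability: float   # assumed: estimate in [0, 1] that the group stays connected
    group_size: int    # assumed: number of nodes that would follow this leader
    merge_cost: float  # assumed: state synchronization effort when groups merge


def score(c, w_stability, w_size, w_cost, max_size, max_cost):
    """Weighted objective: reward stability and size, penalize merging cost."""
    # Normalize size and cost to [0, 1] so the weights are comparable.
    size_term = c.group_size / max_size if max_size else 0.0
    cost_term = c.merge_cost / max_cost if max_cost else 0.0
    return w_stability * c.stability + w_size * size_term - w_cost * cost_term


def elect_leader(candidates, w_stability=1.0, w_size=1.0, w_cost=1.0):
    """Return the candidate with the highest weighted score.

    Ties are broken by the lexicographically largest node_id, so every node
    evaluating the same estimates arrives at the same choice.
    """
    if not candidates:
        raise ValueError("no candidates to elect from")
    max_size = max(c.group_size for c in candidates)
    max_cost = max(c.merge_cost for c in candidates)
    return max(
        candidates,
        key=lambda c: (
            score(c, w_stability, w_size, w_cost, max_size, max_cost),
            c.node_id,
        ),
    )


candidates = [
    Candidate("a", stability=0.9, group_size=4, merge_cost=10.0),
    Candidate("b", stability=0.6, group_size=10, merge_cost=2.0),
    Candidate("c", stability=0.95, group_size=2, merge_cost=1.0),
]

balanced = elect_leader(candidates)
stability_first = elect_leader(candidates, w_stability=10.0)
```

With equal weights the sketch favors the large, cheap-to-merge group led by `b`; raising the stability weight shifts the choice to the most stable candidate `c`. Changing the weights is the only knob here, which is the sense in which the objective is tunable in this toy version.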