
Presentation at the IEEE/IFIP NOMS 2018 Workshop: International Workshop on Management of 5G Networks (5GMan)

We recently presented our work on probabilistic abstractions for proactive multi-RAT control and management at the 5GMan workshop, held in conjunction with NOMS 2018. To our joy, the paper also received the Best Paper Award. In the paper, we motivate the need for probabilistic abstractions for monitoring and managing multi-RAT networks. We highlight the challenges involved and propose a simple yet effective management approach based on monitoring the empirical distribution of the estimated attainable throughput. Initial results indicate that using such probabilistic abstractions in mobile management applications for inter-RAT interface selection scenarios (WiFi-LTE) can substantially reduce the number of performance violations relative to throughput objectives (by 116%) and lead to significantly fewer handovers (by 35x) compared to naive and state-of-the-art baselines.

The full paper is available here: http://www.diva-portal.org/smash/record.jsf?pid=diva2%3A1204509&dswid=4672

Credit

This work was done by Akhila Rao and Rebecca Steinert in the Network Intelligence group at RISE SICS. The full abstract follows.

Abstract

Development towards 5G has introduced difficult challenges in effectively managing and operating heterogeneous infrastructures under highly varying network conditions. Enabling, for example, unified coordination and management of radio resources across coexisting, multiple radio access technologies (multi-RAT) requires efficient representation using high-level abstractions of the radio network performance and state. Without such abstractions, users and networks cannot harvest the full potential of increased resource density and connectivity options, resulting in failure to meet the ambitions of 5G.

We present a generic probabilistic approach for unified estimation of performance variability based on attainable throughput of UDP traffic in multi-RATs, and evaluate the applicability in an interface selection control case (involving WiFi and LTE) based on obtaining probabilistic user performance guarantees. From simulations we observe that both users and operators can significantly benefit from this improved service availability at low network cost. Initial results indicate 1) 116% fewer performance violations and 2) 20% fewer performance violations with a reduction by 35 times in the number of handovers, compared to naive and state-of-the-art baselines, respectively.
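To make the idea of selecting an interface from an empirical throughput distribution more concrete, here is a minimal sketch. It is not the method from the paper: the sliding window, the class and function names, and the hysteresis margin are all assumptions made for illustration. Each interface keeps recent throughput samples, and the selector estimates the probability that each interface meets a throughput target.

```python
from collections import deque


class ThroughputMonitor:
    """Sliding-window empirical distribution of throughput samples (Mbps).

    Hypothetical helper; the window size is an assumed parameter.
    """

    def __init__(self, window=100):
        self.samples = deque(maxlen=window)

    def add(self, mbps):
        self.samples.append(mbps)

    def prob_at_least(self, target_mbps):
        """Empirical probability that throughput meets the target."""
        if not self.samples:
            return 0.0
        hits = sum(1 for s in self.samples if s >= target_mbps)
        return hits / len(self.samples)


def select_interface(monitors, current, target_mbps, margin=0.1):
    """Pick the interface most likely to meet the target.

    A handover happens only if the best candidate beats the current
    interface by more than `margin` in probability (an assumed
    hysteresis rule that limits the number of handovers).
    """
    best = max(monitors, key=lambda name: monitors[name].prob_at_least(target_mbps))
    p_best = monitors[best].prob_at_least(target_mbps)
    p_cur = monitors[current].prob_at_least(target_mbps)
    return best if p_best - p_cur > margin else current
```

In this sketch, a bursty WiFi link with a high mean but frequent dips can lose to a steadier LTE link, because the decision uses the probability of meeting the objective and not the average throughput.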

Metron presentation (with video) at NSDI 2018: NFV Service Chains at the True Speed of the Underlying Hardware

The emerging 100-Gbps deployments will soon challenge the packet processing limits of the commodity hardware used for Network Functions Virtualization (NFV). As an illustration, the available time to process a 64-byte packet at 100 Gbps is only about 5 nanoseconds. However, existing NFV platforms unnecessarily expend several nanoseconds exchanging packets between CPU cores to realize chained network functions. Consequently, these systems cannot meet the tight processing requirements of emerging high-speed (i.e., 100 Gbps or beyond) networks. To enable ultra-high-performance service chain deployments, we introduce Metron, an NFV platform that operates at the true speed of the underlying hardware. First, Metron exploits the available programmable network hardware to perform early traffic classification and tagging. Then, Metron uses these tags to accurately dispatch classified packets to the correct CPU core of a server for further stateful processing, eliminating inter-core transfers. With commodity hardware assistance, Metron deeply inspects traffic at 40 Gbps and realizes stateful network functions at the speed of a 100 GbE network card on a single server. Metron achieves up to (i) 4.7x lower latency, (ii) 7.8x higher throughput, and (iii) 6.5x better efficiency than the state of the art. Thus, Metron's contributions are crucial for realizing future high-speed NFV deployments.

Metron is joint work among KTH Royal Institute of Technology, RISE SICS, and University of Liege. Georgios Katsikas presented the paper at the USENIX NSDI conference in Seattle on April 9, 2018. The slides and a video of the talk (on YouTube) are available.
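As a rough check of the per-packet time budget quoted for a 64-byte packet at 100 Gbps, the arithmetic can be sketched as follows. The function name is hypothetical. The optional 20 bytes of on-wire overhead (preamble, start-of-frame delimiter, and inter-frame gap on Ethernet) is included only to show how the budget changes if it is counted; the post itself does not say whether it is.

```python
def packet_time_ns(frame_bytes, link_gbps, wire_overhead_bytes=0):
    """Time one frame occupies the link, in nanoseconds."""
    bits = (frame_bytes + wire_overhead_bytes) * 8
    # bits divided by Gbit/s gives nanoseconds directly
    return bits / link_gbps


# 64-byte frame alone: 512 bits at 100 Gbps, roughly 5 ns
budget = packet_time_ns(64, 100)

# Counting 20 bytes of Ethernet on-wire overhead: 672 bits, under 7 ns
budget_on_wire = packet_time_ns(64, 100, wire_overhead_bytes=20)
```

Either way the budget is a handful of nanoseconds per packet, so spending several nanoseconds on an inter-core transfer consumes most of it.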

Our upcoming NSDI paper on high performance and ultra efficient NFV service chaining

In our upcoming NSDI 2018 paper, we focus on how to realize high performance NFV service chains at the true speed of the underlying hardware. We solve this challenging problem by exploiting the synergy between available network resources (i.e., programmable switches and network cards) and commodity servers, while eliminating inter-core communication among the service chain components. We demonstrate, via 40-Gbps and 100-Gbps experiments, that our approach achieves: (i) 2.75-6.5x better efficiency, (ii) up to 4.7x lower latency, and (iii) up to 7.8x higher throughput than the state of the art.

Credits

This is joint work with Georgios P. Katsikas (RISE SICS Network Intelligence group), Tom Barbette (University of Liege), Dejan Kostic (KTH Royal Institute of Technology), Rebecca Steinert (RISE SICS Network Intelligence group), and Gerald Q. Maguire Jr. (KTH Royal Institute of Technology). The full abstract is as follows:

Abstract

In this paper we present Metron, a Network Functions Virtualization (NFV) platform that achieves high resource utilization by jointly exploiting the underlying network and commodity servers' resources. This synergy allows Metron to: (i) offload part of the packet processing logic to the network, (ii) use smart tagging to set up and exploit the affinity of traffic classes, and (iii) use tag-based hardware dispatching to carry out the remaining packet processing at the speed of the servers' fastest cache(s), with zero inter-core communication. Metron also introduces a novel resource allocation scheme that minimizes the resource allocation overhead for large-scale NFV deployments. With commodity hardware assistance, Metron deeply inspects traffic at 40 Gbps and realizes stateful network functions at the speed of a 100 GbE network card on a single server. Metron has 2.75-6.5x better efficiency than OpenBox, a state-of-the-art NFV system, while ensuring key requirements such as elasticity, fine-grained load balancing, and flexible traffic steering.
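The tagging and dispatching steps (ii) and (iii) can be illustrated with a small software model. This is only a sketch under stated assumptions: in Metron the classification and dispatching are carried out by network hardware, whereas here plain Python dictionaries stand in for it, traffic classes are assumed to be (protocol, destination port) pairs, and tags are pinned to cores round-robin. All function names are hypothetical.

```python
def assign_tags(traffic_classes, num_cores):
    """Give each traffic class a tag and pin each tag to one core (round-robin)."""
    tag_of_class = {}
    core_of_tag = {}
    for tag, cls in enumerate(traffic_classes):
        tag_of_class[cls] = tag
        core_of_tag[tag] = tag % num_cores
    return tag_of_class, core_of_tag


def classify(packet, tag_of_class):
    """Stand-in for classification done in the network (switch or NIC)."""
    return tag_of_class[(packet["proto"], packet["dst_port"])]


def dispatch(packets, tag_of_class, core_of_tag, num_cores):
    """Place each packet directly on the queue of the core that owns its tag."""
    queues = [[] for _ in range(num_cores)]
    for pkt in packets:
        tag = classify(pkt, tag_of_class)
        queues[core_of_tag[tag]].append(pkt)
    return queues
```

Because every packet of a given traffic class lands on the same core, the state for that class stays in that core's cache and no packet needs to be handed from one core to another.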

Our upcoming NOMS paper on the deployment of distributed controllers in programmable networks

In our NOMS 2018 paper, we focus on how to deploy distributed controllers for programmable networks. To solve this challenging problem, we propose an approach that automatically decides the number of controllers, their locations, and their control regions, and is guaranteed to find a congestion-free controller deployment plan that fulfills requirements such as reliability and bandwidth. We demonstrate, via experiments, that our approach finds close-to-optimal solutions under varying conditions and achieves a 20.1%-50.1% reduction in bandwidth usage compared with the state of the art.

Credit

This is joint work with Shaoteng Liu, Rebecca Steinert and Dejan Kostic. The work was done in the RISE SICS Network Intelligence group. The full abstract is as follows:

Abstract

For large-scale programmable networks, flexible deployment of distributed control planes is essential for service availability and performance. However, existing approaches only focus on placing controllers whereas the consequent control traffic is often ignored. In this paper, we propose a black-box optimization framework offering the additional steps for quantifying the effect of the consequent control traffic when deploying a distributed control plane. Evaluating different implementations of the framework over real-world topologies shows that close-to-optimal solutions can be achieved. Moreover, experiments indicate that running a method for controller placement without considering the control traffic causes excessive bandwidth usage (worst cases varying between 20.1% and 50.1% more) and congestion, compared to our approach.
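To illustrate what quantifying the control traffic of a deployment can look like, the following sketch computes per-link load for a given switch-to-controller assignment and checks it against a link capacity. It is a simplification and not the framework from the paper: it assumes a connected topology, shortest-path routing, a uniform link capacity, and only switch-to-controller traffic. All names are hypothetical.

```python
from collections import deque


def shortest_path(adj, src, dst):
    """Breadth-first search; returns the nodes from src to dst.

    Assumes dst is reachable from src.
    """
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def link_loads(adj, assignment, demand):
    """Sum control traffic per undirected link for a switch-to-controller assignment."""
    loads = {}
    for switch, controller in assignment.items():
        path = shortest_path(adj, switch, controller)
        for a, b in zip(path, path[1:]):
            link = tuple(sorted((a, b)))
            loads[link] = loads.get(link, 0.0) + demand[switch]
    return loads


def is_congestion_free(loads, capacity):
    """True if no link carries more control traffic than the assumed capacity."""
    return all(load <= capacity for load in loads.values())
```

A black-box optimizer could call functions like these as the evaluation step for each candidate placement, rejecting plans that congest a link and preferring those with lower total bandwidth usage.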

Our upcoming ICC paper on stable network control under intermittent network partitioning situations

In our ICC 2018 paper, we focus on how to maintain stable network control services during intermittent network partitioning situations. To solve this challenging problem, we propose a new leader election algorithm that can perform leader election and node clustering in line with tunable optimization objectives related to group stability, size, and merging cost. We demonstrate, via experiments, that with the same stability requirement our approach achieves: (i) up to 2x larger group size and (ii) up to 12x lower merging costs than existing approaches.

Credit

This is joint work with Shaoteng Liu, Rebecca Steinert and Dejan Kostic. The work was done in the RISE SICS Network Intelligence group. The full abstract is as follows:

Abstract

We propose a novel distributed leader election algorithm to deal with the controller and control service availability issues in programmable networks, such as Software Defined Networks (SDN) or programmable Radio Access Network (RAN). Our approach can deal with a wide range of network failures, especially intermittent network partitions, where splitting and merging of a network repeatedly occur.

In contrast to traditional leader election algorithms that mainly focus on the (eventual) consensus on one leader, the proposed algorithm aims at optimizing control service availability and stability, and at reducing the controller state synchronization effort, during intermittent network partitioning situations. To this end, we design a new framework that enables dynamic leader election based on real-time estimates acquired from statistical monitoring. With this framework, the proposed leader election algorithm can be flexibly configured to achieve different optimization objectives, while adapting to various failure patterns. Compared with two existing algorithms, our approach can significantly reduce the synchronization overhead due to controller state updates (by up to 12x) and maintain up to twice as many nodes under a controller.
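The idea of electing leaders from monitored estimates can be sketched in a few lines. This is an illustrative toy and not the algorithm from the paper: it assumes each node carries a single stability score in [0, 1], updated as an exponentially weighted moving average of up/down observations, and that each partition simply elects its most stable node. The names and the smoothing factor are assumptions.

```python
def update_estimate(old, observed_up, alpha=0.2):
    """Exponentially weighted moving average of a node's observed availability."""
    sample = 1.0 if observed_up else 0.0
    return (1 - alpha) * old + alpha * sample


def elect_leader(partition, stability):
    """Choose the node with the highest stability estimate.

    Ties are broken in favor of the larger node id so that every node
    in the partition reaches the same decision.
    """
    return max(partition, key=lambda node: (stability[node], node))


def elect_leaders(partitions, stability):
    """Elect one leader per partition after a network split."""
    return {frozenset(p): elect_leader(p, stability) for p in partitions}
```

With a rule like this, a node that keeps dropping out accumulates a low estimate and is unlikely to be elected, which reduces how often groups have to re-synchronize state after a merge.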