Wednesday 19 April 2017

NetBricks: Taking the V out of NFV

Summary
------------
Network Function Virtualization (NFV) proposes replacing hardware middleboxes with software network functions (NFs) running on VMs. The goals are simpler deployment and management, faster development, and lower cost, since multiple NFs can be deployed on the same machine. Despite these advantages, wide-scale NF deployments have been limited. This can be attributed to the lack of tools for building and running NFs that support easy development (through high-level abstractions) and high performance (through low-level optimizations) at the same time. Another problem is that current NFV deployments rely on VMs for memory and performance isolation, which adds performance overhead.

The authors address these problems with a new approach to building and running NFs, called NetBricks. Its programming model provides high-level abstractions for common packet processing tasks, which facilitates rapid development. Its execution environment relies on safe languages and runtimes for isolation instead of VMs, and so avoids the performance cost that VMs impose. Current systems also achieve packet isolation by copying packets as they are passed between NFs, which is expensive. NetBricks instead uses unique types, which prohibit simultaneous access to the same data through static checks at compile time and therefore cause no runtime overhead.
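To make the unique-types idea concrete, here is a minimal sketch in Rust using plain move semantics. The `Packet` type and the `decrement_ttl`, `byte_count`, and `run_chain` functions are hypothetical names made up for illustration and are not NetBricks' actual API. The point is only that handing a packet to the next NF transfers ownership, so the previous holder can no longer touch it, and no copy of the payload is made.

```rust
/// Hypothetical packet type for illustration (not NetBricks' real type).
#[derive(Debug)]
pub struct Packet {
    pub payload: Vec<u8>,
}

/// Toy NF: takes ownership of the packet, decrements its first byte
/// (treated here as a TTL-like field), and hands the packet back.
pub fn decrement_ttl(mut pkt: Packet) -> Packet {
    if let Some(ttl) = pkt.payload.first_mut() {
        *ttl = ttl.saturating_sub(1);
    }
    pkt
}

/// Toy NF: takes ownership, measures the payload, returns both.
pub fn byte_count(pkt: Packet) -> (Packet, usize) {
    let n = pkt.payload.len();
    (pkt, n)
}

/// Chains the two NFs. Each call moves the packet, so the payload buffer
/// is never copied and only one NF can hold the packet at a time.
pub fn run_chain(pkt: Packet) -> (Packet, usize) {
    let forwarded = decrement_ttl(pkt);
    // Using `pkt` here would be rejected at compile time, because
    // ownership was moved into `decrement_ttl` on the line above.
    byte_count(forwarded)
}
```

If a caller tried to read `pkt` after passing it to `decrement_ttl`, the compiler would report a use-after-move error, which is the kind of static check that replaces copying in this sketch.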

Strengths
------------
  • It is a thoroughly thought out framework from the perspective of practicality.
  • Uses type checking and bounds checking to provide packet isolation, recovering the network performance that would otherwise be lost to packet copying.
  • Enables run-to-completion scheduling, i.e., applying all NFs to a packet before moving on to the next packet, which is known to provide high throughput.
  • The paper provides an extensive list of programming abstractions, making it easy for a new developer to use these and write any new NF without worrying about the low-level optimizations.
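As a rough illustration of run-to-completion scheduling, the sketch below pushes each packet through the whole NF chain before the next packet is touched. Everything here is an assumption for illustration: the `Packet` fields, the `Nf` type alias, and the two example NFs in `example_chain` are hypothetical and much simpler than the abstractions the paper describes.

```rust
/// Hypothetical packet with just the fields the toy NFs need.
pub struct Packet {
    pub ttl: u8,
    pub len: usize,
}

/// An NF either forwards the (possibly modified) packet or drops it.
pub type Nf = Box<dyn Fn(Packet) -> Option<Packet>>;

/// Run-to-completion: every NF in the chain is applied to a packet
/// before the next packet is processed.
pub fn run_to_completion(batches: Vec<Vec<Packet>>, chain: &[Nf]) -> Vec<Packet> {
    let mut out = Vec::new();
    for batch in batches {
        for pkt in batch {
            let mut cur = Some(pkt);
            for nf in chain {
                // A dropped packet (None) skips the remaining NFs.
                cur = cur.and_then(|p| nf(p));
            }
            if let Some(p) = cur {
                out.push(p);
            }
        }
    }
    out
}

/// Example chain (assumed for illustration): decrement TTL and drop
/// expired packets, then drop packets longer than 1500 bytes.
pub fn example_chain() -> Vec<Nf> {
    let ttl_nf: Nf = Box::new(|mut p: Packet| {
        if p.ttl <= 1 {
            None
        } else {
            p.ttl -= 1;
            Some(p)
        }
    });
    let size_filter: Nf = Box::new(|p: Packet| if p.len > 1500 { None } else { Some(p) });
    vec![ttl_nf, size_filter]
}
```

Because each NF takes the packet by value, the same ownership hand-off that gives isolation also fits naturally with this scheduling style: there is no queue between NFs where a packet could be shared.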
 
Weaknesses
---------------
  • The framework does not support dynamic NF chaining, and requires a restart to change the chain. I suppose this is because packet isolation relies on type checking, which is performed at compile time.
  • Although the authors provide an extensive list of programming abstractions, we can expect a need for new abstractions to arise as the field advances. It looks like an NF developer currently depends on NetBricks to provide an optimized version of any new abstraction.
  • Currently, all the abstractions use the same batching technique to maximize common-case performance. Different batching techniques might be advantageous in different scenarios, and there can also be scenarios where batching is not desirable.

Discussion Points
----------------------
  • The paper shows that as the number of memory accesses per packet increases (16 or higher), the throughput achieved by a NetBricks NF is comparable to that of an NF written in C using DPDK. Although most NFs might require very few memory accesses per packet, it would be interesting to identify cases that actually need many memory accesses per packet. NFs for such cases could be implemented very easily with NetBricks, using its programming abstractions, without any noticeable performance loss.
  • The use of run-to-completion scheduling raises the question of the order in which packets are processed, and of how to schedule nodes that involve more than one packet. Currently, the authors use a round-robin scheduling policy to schedule such nodes. A discussion of a more complex scheduling policy and its effects might be interesting.
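For reference when discussing alternatives, a round-robin policy can be sketched in a few lines. This is a generic illustration, not the scheduler from the paper: the `Task` struct, its `remaining` work counter, and the `round_robin` function are all hypothetical.

```rust
use std::collections::VecDeque;

/// Hypothetical schedulable node with a fixed amount of pending work.
pub struct Task {
    pub name: &'static str,
    pub remaining: u32,
}

/// Runs each task for one unit of work in turn and returns the order
/// in which tasks were run. Tasks with work left go to the back of the queue.
pub fn round_robin(tasks: Vec<Task>) -> Vec<&'static str> {
    let mut queue: VecDeque<Task> = tasks.into();
    let mut order = Vec::new();
    while let Some(mut task) = queue.pop_front() {
        if task.remaining == 0 {
            continue; // nothing to do for this task
        }
        order.push(task.name);
        task.remaining -= 1;
        if task.remaining > 0 {
            queue.push_back(task);
        }
    }
    order
}
```

A more complex policy would replace the plain queue with something like a priority or deadline ordering, which is where the trade-offs worth discussing would show up.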

2 comments:

  1. Very interesting paper. Use of a single language framework and encapsulation in processes is clearly efficient, but loses much of the flexibility of VMs (one can use different implementations more easily).

  2. It would be interesting to compare the baseline and NetBricks performance without using DPDK. DPDK's polling costs extra CPU cycles, which is one of the reasons it is not as popular as its better throughput would suggest.

