NIC-Level Co-Processors for Resilient Coded Networking and Computation


Resilient time-sensitive services require resilience mechanisms and high packet processing performance at the right location, so that network and host components are not overloaded. Hardware offloading to NIC-level (Network Interface Card) co-processors enables resilient, low-latency computation and can help to free scarce CPU resources.

In the HYPERNIC project, the Chair of Integrated Systems (LIS) and Chair of Network Architectures and Services (NET) of the Technical University of Munich (TUM) propose a novel communication and computation approach that is resilient against potential attacks and partial network failures. We plan to investigate mechanisms and processing platforms that provide resilience efficiently and flexibly. We aim to design a novel class of NICs with processing capabilities that employ techniques such as network coding, low-latency packet retransmissions, and fault-tolerant algorithms.

Research Objectives

The joint project work targets a hybrid software/hardware stack. The software layers provide the necessary flexibility (through programmability) where the latency of the HYPERNIC functions is not the primary concern. The hardware layer provides co-processors and wire-rate data processing pipelines where low, deterministic latencies are required. The two project partners thus combine their complementary and proven competencies in the fields of network architectures, protocols, and network processing engines. This combination of skills and expertise enables theoretical, methodological, architectural, and practical investigations.

In the project, the two partners combine the complementary research foci of their respective research groups. Our chair (NET) has a strong background in measurements and modeling of programmable packet processing systems, a long-term dedication to measurement-driven research, a sophisticated testbed infrastructure, and extensive measurement facilities. LIS contributes various FPGA boards that allow low-level integration of the investigated resilience mechanisms. Besides hardware acceleration, LIS has a research focus on highly reliable, high-bandwidth networks. Both the research areas and the testbed facilities are complementary. This allows the theoretical and practical investigation of the entire networking stack, starting at the NIC and its offloading capabilities, continuing with the management of the NIC through network drivers, and ending with processing in software up to the application.

Structure of Work

The project is structured in three work areas, as follows.

Work Area A

Work Area A investigates fundamental redundancy mechanisms that use multiple independent paths through measures on the protocol level, e.g., packet-level duplication of traffic. More efficient ways to introduce redundancy into network communication, such as network coding (NC), are also investigated. Finally, we investigate methods for resilient computation, in particular frameworks that allow low-latency replication of state across network nodes reliably and securely.
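To illustrate why coding can be more efficient than packet-level duplication, the following minimal sketch uses a single XOR parity packet, which is one of the simplest coding schemes and not necessarily the scheme used in HYPERNIC. The function names `encode` and `decode` are hypothetical and chosen for this example only. Duplicating k packets sends 2k packets in total, whereas the XOR scheme sends k + 1 packets and still tolerates the loss of any single packet, e.g., when one of k + 1 independent paths fails.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(packets: List[bytes]) -> List[bytes]:
    """Return the k source packets plus one XOR parity packet (k + 1 total).

    Shorter packets are zero-padded to the longest length (a simplification;
    a real protocol would also transmit the original lengths).
    """
    size = max(len(p) for p in packets)
    padded = [p.ljust(size, b"\x00") for p in packets]
    parity = reduce(xor_bytes, padded)
    return padded + [parity]


def decode(received: List[Optional[bytes]]) -> List[bytes]:
    """Recover the k source packets if at most one of the k + 1 packets is lost.

    A lost packet is represented by None.
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can repair only a single loss")
    repaired = list(received)
    if missing:
        survivors = [p for p in received if p is not None]
        # XOR of all surviving packets equals the single missing packet.
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired[:-1]  # drop the parity packet


# Usage: two source packets sent over three paths, the second path fails.
source = [b"path-A", b"path-B"]
coded = encode(source)
assert decode([coded[0], None, coded[2]]) == source
```

The trade-off in this sketch is that recovery requires all other packets of the group to arrive, which can add latency compared to plain duplication.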

Work Area B

The first part of Work Area B investigates software implementations of different redundancy schemes, covering analysis and design, the actual implementation, and optimizations. The second part investigates the applicability of hardware offloading for the mechanisms investigated in Work Area A. Offloading specific functions to hardware can increase bandwidth, lower latency, and reduce jitter. We plan to assess these potential benefits by measurements. In the third part, we implement resilient computation. Measurements that help to identify possible bottlenecks are used to plan improvements through targeted hardware acceleration.

Work Area C

Work Area C is dedicated to establishing the testbed facilities needed to perform experiments, with a special focus on supporting cooperation with the research community. We plan to establish a federated testbed between our two research groups to enable distributed experiments under a common experiment framework. The results obtained in the federated testbed will be used to build models that predict properties such as maximum throughput or worst-case latencies for a larger range of systems, scenarios, and configurations. The testbed facilities and the investigated techniques will also be made available to other members of the research community, especially to members of the priority programme. A federated testbed with further research groups can be established to extend the capabilities of the original testbeds.

Related publications

2024-01-01 Henning Stubbe, Sebastian Gallenmüller, Georg Carle, “The pos Experiment Controller: Reproducible & Portable Network Experiments,” in 2024 19th Wireless On-Demand Network Systems and Services Conference (WONS), 2024, pp. 1–8.
2023-12-01 Patrick Sattler, Johannes Zirngibl, Mattijs Jonker, Oliver Gasser, Georg Carle, Ralph Holz, “Packed to the Brim: Investigating the Impact of Highly Responsive Prefixes on Internet-wide Measurement Campaigns,” Proc. ACM Netw., vol. 1, no. CoNEXT3, Dec. 2023.
2023-06-01 Henning Stubbe, Sebastian Gallenmüller, Manuel Simon, Eric Hauser, Dominik Scholz, Georg Carle, “Keeping Up to Date With P4Runtime: An Analysis of Data Plane Updates on P4 Switches,” in International Federation for Information Processing (IFIP) Networking 2023 Conference (IFIP Networking 2023), Barcelona, Spain, Jun. 2023, p. 9.
2023-06-01 Manuel Simon, Sebastian Gallenmüller, Georg Carle, “Never Miss Twice - Add-On-Miss Table Updates in Software Data Planes,” in KuVS Fachgespräch - Würzburg Workshop on Modeling, Analysis and Simulation of Next-Generation Communication Networks 2023 (WueWoWAS’23), Würzburg, Germany, Jun. 2023, p. 5. Best Workshop Contribution

Finished student theses

Author | Title | Type | Advisors | Year
Leon Krix | On-The-Fly Network Erasure Coding Protocol for Delay and Loss-Sensitive Data | BA | Henning Stubbe, Kilian Holzinger | 2023
Luca Otting | Improving QUIC with User Space Networking | BA | Kilian Holzinger, Benedikt Jaeger, Johannes Zirngibl | 2023
Alexander Anton Keil | Comparison of One-Way Delay Measurement Approaches | BA | Kilian Holzinger, Florian Wiedner, Henning Stubbe | 2023
Thomas Senftl | Flexible Precise Path Property Emulation | BA | Kilian Holzinger, Sebastian Gallenmüller, Stefan Lachnit | 2023
Paul Stephan | Improvements to Reliable Multipath Forward Error Correction | BA | Kilian Holzinger, Henning Stubbe | 2023
Felix Hahn | Failure Detection through Active and Passive Techniques in P4 | BA | Manuel Simon, Eric Hauser | 2023
Ruben Bachmann | Comparison of DPDK-Enabled P4 Software Targets | MA | Manuel Simon, Sebastian Gallenmüller, Stefan Lachnit | 2023
Martin Fritz | State of the Art Assessment of Multipath QUIC | MA | Kilian Holzinger, Lion Steger, Marcel Kempf | 2023
Krishna Mavani | Simulation of a Network Redundancy Protocol | BA | Kilian Holzinger, Henning Stubbe | 2022
Tristan Döring | Packet Selection using Concepts from IPFIX and PSAMP | BA | Kilian Holzinger, Henning Stubbe | 2022
Jonas Kaps | High-Performance Low-Latency Forward Error Correction Coding for Reliable Ethernet Communication | MA | Kilian Holzinger, Filip Rezabek | 2022
Timon Tsiolis | Analyzing the Extensibility of Programmable Data Planes | BA | Manuel Simon, Henning Stubbe, Sebastian Gallenmüller | 2022

Open and running student theses

Author | Title | Type | Advisors | Year
Angelo Kleinert | Assessing the Energy Consumption of QUIC | IDP | Kilian Holzinger, Johannes Späth | 2024
Amal Smaoui | Design and Implementation of a Configurable QUIC Workload Framework | BA | Kilian Holzinger, Stefan Lachnit | 2024
Sebastian Warter | Packet Processing with Programmable Data Planes and Trusted Execution Environments | MA | Manuel Simon, Sebastian Gallenmüller | 2024
Nico Greger | Improvements to Forward Erasure Correction Coding | IDP | Kilian Holzinger, Henning Stubbe, Stefan Lachnit | 2023
Michael Hackl | Improvements to Convolutional Forward Erasure Correction Coding | IDP | Kilian Holzinger, Henning Stubbe, Stefan Lachnit | 2023
Felix Christ | MASQUE-Proxying in User-Space | MA | Kilian Holzinger, Lion Steger | 2023
Kilian Warmuth | Implementation of a Testing Toolchain for a Scientific Measurement Tool | IDP | Stefan Lachnit, Eric Hauser, Sebastian Gallenmüller | 2023
open | Forward Erasure Correction Coding in QUIC | IDP, MA | Kilian Holzinger, Henning Stubbe, Stefan Lachnit | 2023