Machine Learning (ML) shows promising potential for enhancing networking tasks by providing early traffic predictions. However, implementing an ML-enabled system is challenging due to network devices' limited resources. While previous works have shown the feasibility of running simple ML models in the data plane, integrating them into a practical end-to-end system remains difficult: it requires addressing resource management and model maintenance so that the performance improvement justifies the system overhead. In this work, we propose DUMBO, a versatile end-to-end system to generate and exploit early flow size predictions at line rate. Our system seamlessly integrates and maintains a simple ML model that offers early coarse-grain flow size predictions in the data plane. We evaluate the proposed system on flow scheduling, per-flow packet inter-arrival time distribution estimation, and flow size estimation using real traffic traces, and perform experiments with an FPGA prototype running on an AMD(R)-Xilinx(R) Alveo U280 SmartNIC. Our results show that DUMBO outperforms traditional state-of-the-art approaches by equipping network devices' data planes with a lightweight ML model. Code is available at https://github.com/cpt-harlock/DUMBO.
2023
CoNEXT
Memory-Efficient Random Forests in FPGA SmartNICs [Poster]
Andrea Monterubbiano, Raphael Azorin, Gabriele Castellano, and 3 more authors
In Companion of the 19th ACM International Conference on Emerging Networking EXperiments and Technologies 2023
Random Forests (RFs) have been a popular Machine Learning (ML) algorithm for more than two decades. This success can be attributed to their simplicity, effectiveness, and explainability. However, implementing them in a high-speed programmable data plane is not trivial. To make predictions, i.e., perform inference, RFs must traverse each tree from the root to a leaf, comparing the feature vector against a threshold at each split node. This process is particularly challenging in network devices, where memory is limited and packet processing cannot be delayed, i.e., predictions must occur at line rate. Nevertheless, such an implementation is crucial for incorporating recent ML advances in the network, which could benefit use cases such as scheduling, measurements, and routing [1]. Prior studies such as Planter [4] have examined the implementation of RFs in network switches, mapping trees to Match-Action Tables (MATs). Another line of work focuses on RF implementations optimized for FPGAs, mapping tree layers to pipeline stages as done in [2]. Such approaches use different tree representations that naturally come with strengths and weaknesses depending on the trees' sparsity, depth, and input features. In this work we (1) propose a novel representation for FPGA-based Random Forests, (2) compare it against state-of-the-art implementations in terms of memory and computation requirements, and (3) evaluate our design on a flow classification task using CAIDA traffic traces.
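The root-to-leaf traversal described in this abstract can be sketched in a few lines. The flat node layout, feature names, and class labels below are illustrative only, not the paper's actual FPGA representation:

```python
# Minimal sketch of root-to-leaf Random Forest inference: each split node
# compares one entry of the feature vector against a threshold.
# Nodes are stored in a flat array, mimicking a table-based memory layout.
# Each node is (feature_index, threshold, left_child, right_child) for a
# split, or ("leaf", class_label) for a leaf.
TREE = [
    (0, 100.0, 1, 2),      # node 0: packet_size <= 100 ? left : right
    ("leaf", "mouse"),     # node 1
    (1, 5.0, 3, 4),        # node 2: inter_arrival <= 5 ? left : right
    ("leaf", "elephant"),  # node 3: short gaps -> heavy flow
    ("leaf", "mouse"),     # node 4
]

def tree_predict(tree, features):
    """Walk from the root to a leaf, one comparison per level."""
    i = 0
    while tree[i][0] != "leaf":
        feat, thr, left, right = tree[i]
        i = left if features[feat] <= thr else right
    return tree[i][1]

def forest_predict(trees, features):
    """Majority vote over the trees of the forest."""
    votes = [tree_predict(t, features) for t in trees]
    return max(set(votes), key=votes.count)

print(forest_predict([TREE], [1400.0, 2.0]))  # prints "elephant"
```

At line rate, the challenge is that each comparison must fit within the device's per-stage memory and timing budget, which is what motivates the alternative tree representations compared in the poster.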
PACMNET
SPADA: A Sparse Approximate Data Structure Representation for Data Plane Per-Flow Monitoring
Andrea Monterubbiano, Raphael Azorin, Gabriele Castellano, and 3 more authors
In Proceedings of the ACM on Networking, Vol. 1, Dec. 2023
Accurate per-flow monitoring is critical for precise network diagnosis, performance analysis, and network operation and management in general. However, the limited amount of memory available on modern programmable devices and the large number of active flows force practitioners to monitor only the most relevant flows with approximate data structures, limiting their view of network traffic. We argue that, due to the skewed nature of network traffic, such data structures are, in practice, heavily underutilized, i.e., sparse, thus wasting a significant amount of memory. This paper proposes a Sparse Approximate Data Structure (SPADA) representation that leverages sparsity to reduce the memory footprint of per-flow monitoring systems in the data plane while preserving their original accuracy. The SPADA representation can be integrated into a generic per-flow monitoring system and is suitable for several measurement use cases. We prototype SPADA in P4 for a commercial FPGA target and test our approach, with a custom simulator that we make publicly available, on four real network traces over three different monitoring tasks. Our results show that SPADA achieves a 2x to 11x memory footprint reduction with respect to the state of the art while maintaining the same accuracy, or even improving it.
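The core observation in this abstract, that skewed traffic leaves most slots of a per-flow structure empty, can be illustrated with a toy sparse representation. This is a simplification for intuition only, not SPADA's actual P4/FPGA design:

```python
# Toy comparison of a dense counter array vs. a sparse layout that stores
# only non-empty slots. Under skewed traffic, few slots are ever touched.

class DenseCounters:
    """Classic layout: one counter per slot, all allocated up front."""
    def __init__(self, size):
        self.slots = [0] * size
    def update(self, flow_id):
        self.slots[hash(flow_id) % len(self.slots)] += 1
    def used_slots(self):
        return sum(1 for c in self.slots if c != 0)

class SparseCounters:
    """Sparse layout: store only the slots that hold a non-zero counter."""
    def __init__(self, size):
        self.size = size
        self.slots = {}  # slot index -> counter
    def update(self, flow_id):
        idx = hash(flow_id) % self.size
        self.slots[idx] = self.slots.get(idx, 0) + 1
    def used_slots(self):
        return len(self.slots)

# Skewed workload: two elephant flows, a handful of mice.
packets = ["flowA"] * 900 + ["flowB"] * 90 + [f"flow{i}" for i in range(10)]
dense, sparse = DenseCounters(4096), SparseCounters(4096)
for p in packets:
    dense.update(p)
    sparse.update(p)

# Same view of the traffic, but the sparse layout stores far fewer entries.
assert dense.used_slots() == sparse.used_slots()
print(f"{sparse.used_slots()} non-empty slots out of 4096")
```

The engineering difficulty, which the paper addresses and this sketch does not, is realizing such a sparse layout within data-plane constraints (fixed memory, bounded per-packet work) while keeping the original structure's accuracy guarantees.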
AAAI
"It’s a Match" - A Benchmark of Task Affinity Scores for Joint Learning [Workshop]
Raphael Azorin, Massimo Gallo, Alessandro Finamore, and 2 more authors
In AAAI’s 2nd International Workshop on Practical Deep Learning 2023
While the promises of Multi-Task Learning (MTL) are attractive, characterizing the conditions of its success is still an open problem in Deep Learning. Some tasks may benefit from being learned together, while others may be detrimental to one another. From a task perspective, grouping cooperative tasks while separating competing tasks is paramount to reap the benefits of MTL, i.e., reducing training and inference costs. Therefore, estimating task affinity for joint learning is a key endeavor. Recent work suggests that the training conditions themselves have a significant impact on the outcomes of MTL. Yet, the literature lacks a benchmark to assess the effectiveness of task affinity estimation techniques and their relation to actual MTL performance. In this paper, we take a first step towards closing this gap by (i) defining a set of affinity scores, both revisiting contributions from previous literature and presenting new ones, and (ii) benchmarking them on the Taskonomy dataset. Our empirical campaign reveals how, even in a small-scale scenario, task affinity scoring does not correlate well with actual MTL performance. Yet, some metrics can be more indicative than others.
2022
CoNEXT
Learned Data Structures for Per-Flow Measurements [Poster]
Andrea Monterubbiano, Raphael Azorin, Gabriele Castellano, and 2 more authors
In Proceedings of the 3rd ACM CoNEXT Student Workshop 2022
This work presents a generic framework that exploits learning to improve the quality of network measurements. The main idea is to reuse measurements collected by network monitoring tasks to train an ML model that learns per-flow characteristics, and to improve measurement quality by re-configuring the memory according to the learned information. We apply this idea to two different monitoring tasks, identify the main issues related to this approach, and present some preliminary results.
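The feedback loop sketched in this abstract, where past measurements inform how counter memory is configured for the next epoch, can be illustrated with a toy example. The prefix-based predictor, flow keys, and thresholds below are purely illustrative assumptions, not the poster's actual model:

```python
# Toy version of "learn from past measurements, re-configure memory":
# flows predicted heavy get dedicated exact counters; the rest share a
# small approximate structure.

# Epoch 1: previously collected per-flow sizes (flow key -> packet count).
history = {"10.0.0.1": 950, "10.0.0.2": 870, "192.168.1.5": 3, "192.168.1.6": 2}

# "Train": remember source prefixes whose flows were heavy in the past.
heavy_prefixes = {k.rsplit(".", 1)[0] for k, size in history.items() if size > 100}

def predicted_heavy(flow_key):
    return flow_key.rsplit(".", 1)[0] in heavy_prefixes

# Epoch 2: route each packet to the memory region chosen by the prediction.
exact, shared = {}, [0] * 8  # dedicated counters vs. small hashed array
for flow in ["10.0.0.3", "192.168.1.7", "10.0.0.1"]:
    if predicted_heavy(flow):
        exact[flow] = exact.get(flow, 0) + 1
    else:
        shared[hash(flow) % len(shared)] += 1

print(sorted(exact))  # flows tracked exactly: ['10.0.0.1', '10.0.0.3']
```

A real system must also handle the issues the poster identifies, such as keeping the model current as traffic shifts and bounding the cost of mispredictions.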
HotNets
Towards a Systematic Multi-Modal Representation Learning for Network Data
Zied Ben Houidi, Raphael Azorin, Massimo Gallo, and 2 more authors
In Proceedings of the 21st ACM Workshop on Hot Topics in Networks 2022
Learning the right representations from complex input data is the key ability of successful machine learning (ML) models. The latter are often tailored to a specific data modality. For example, recurrent neural networks (RNNs) were designed with sequential data in mind, while convolutional neural networks (CNNs) were designed to exploit spatial correlation in images. Unlike computer vision (CV) and natural language processing (NLP), each of which targets a single well-defined modality, network ML problems often have a mixture of data modalities as input. Yet, instead of exploiting such abundance, practitioners tend to rely on sub-features thereof, reducing the problem to a single modality for the sake of simplicity. In this paper, we advocate for exploiting all the modalities naturally present in network data. As a first step, we observe that network data systematically exhibits a mixture of quantities (e.g., measurements) and entities (e.g., IP addresses, names, etc.). Whereas the former are generally well exploited, the latter are often underused or poorly represented (e.g., with one-hot encoding). We propose to systematically leverage language models to learn entity representations, whenever significant sequences of such entities are historically observed. Through two diverse use-cases, we show that such entity encoding can benefit and naturally augment classic quantity-based features.
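The entities-vs-quantities point above can be illustrated with a minimal distributional sketch: instead of one-hot encoding, an entity can be represented by the contexts it appears in, so entities observed in similar sequences get similar vectors. A language model would learn denser versions of this; the entity sequences below are made up for illustration:

```python
# Represent each entity by counts of its sequence neighbors (window of 1),
# a bare-bones stand-in for the learned embeddings the paper advocates.
from collections import Counter, defaultdict

# "Sentences": sequences of entities as historically observed
# (e.g., the services each client contacts, in order).
sequences = [
    ["dns", "web1", "cdn"],
    ["dns", "web2", "cdn"],
    ["dns", "web1", "db"],
    ["scanner", "web1", "web2"],
]

context = defaultdict(Counter)
for seq in sequences:
    for i, ent in enumerate(seq):
        for j in (i - 1, i + 1):  # immediate neighbors only
            if 0 <= j < len(seq):
                context[ent][seq[j]] += 1

# web1 and web2 share contexts ("dns" before, "cdn" after), so their
# count vectors are similar; "scanner" appears in different contexts.
print(dict(context["web1"]))
print(dict(context["web2"]))
```

One-hot encoding would make web1, web2, and scanner all equidistant; a context-based (or learned) representation preserves the similarity structure that the downstream task can exploit.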
ICSOC
A Reproducible Approach for Mining Business Activities from Emails for Process Analytics [Workshop]
Raphael Azorin, Daniela Grigori, and Khalid Belhajjame
In Proceedings of the 19th International Conference on Service-Oriented Computing Workshops 2022
Emails are more than just a means of communication, as they are a valuable source of information about undocumented business activities and processes. In this paper, we examine a solution that leverages machine learning to i) extract business activities from emails, and ii) construct business process instances, which group together the activities involved in achieving a common goal. In addition, we examine how relational learning can exploit the relationship between sub-problems (i) and (ii) to further improve their results. The research results presented in this paper are reproducible, and the recipe and data sets used are freely available to interested readers.
2021
CoNEXT
Towards a Generic Deep Learning Pipeline for Traffic Measurements [Poster]
Raphael Azorin, Massimo Gallo, Alessandro Finamore, and 3 more authors
In Proceedings of the 2nd ACM CoNEXT Student Workshop 2021
Traffic measurements are key for network management, as evidenced by the rich literature from both academia and industry. At their foundation, measurements rely on transformation functions f(x) = y, mapping input traffic data x to an output performance metric y. Yet, common practice adopts a bottom-up (i.e., metric-based) design, which leads practitioners to (i) invest significant effort in (re)discovering how to perform such mappings and (ii) create specialized solutions. For instance, sketches are a compact way to extract traffic properties (heavy hitters, super-spreaders, etc.) but require analytical modeling to offer correctness guarantees and careful engineering to enable in-device deployment and network-wide measurements.
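A Count-Min sketch is the classic example of the sketches mentioned in this abstract: a few rows of hashed counters whose minimum yields an over-estimate of each flow's count. The parameters below (rows, width) are illustrative only:

```python
# Minimal Count-Min sketch: update touches one counter per row; query
# takes the minimum over rows, which never under-estimates the true count.
import hashlib

class CountMin:
    def __init__(self, rows=3, width=64):
        self.counters = [[0] * width for _ in range(rows)]

    def _idx(self, row, key):
        # Per-row hash, derived from a cryptographic hash for simplicity.
        h = hashlib.sha256(f"{row}:{key}".encode()).hexdigest()
        return int(h, 16) % len(self.counters[row])

    def update(self, key, count=1):
        for r in range(len(self.counters)):
            self.counters[r][self._idx(r, key)] += count

    def query(self, key):
        return min(self.counters[r][self._idx(r, key)]
                   for r in range(len(self.counters)))

cm = CountMin()
for _ in range(500):
    cm.update("heavy_hitter")
for i in range(30):
    cm.update(f"mouse_{i}")

print(cm.query("heavy_hitter"))  # at least 500; collisions may inflate it
```

This is exactly the bottom-up pattern the poster critiques: the structure is compact and fast, but its error bounds come from analytical modeling of the rows/width trade-off, and each new metric tends to require a new, specialized structure.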