
Conference: SC Workshops '25: Proceedings of the SC '25 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis

Abstract

Distributed filesystems (DFSs) are a crucial component of modern computing environments, and their performance is critical to the facilities that rely on them. However, predicting DFS I/O performance from the storage system hardware alone is not trivial.

In this paper, we address this challenge by presenting an empirical method that quantitatively assesses how hardware configuration choices influence the performance of a DFS, using Ceph as a case study. We investigate the influence of three hardware parameters: the number of CPU cores, the amount of RAM, and the disk bandwidth. To control these variables, we rely on the Linux hotplug interface and cgroups, avoiding additional software overhead. Our results reveal that, for the analyzed workloads, decreasing hardware resources does not always yield proportional performance losses.
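As an illustration of how such controls can be expressed, the following minimal sketch builds the list of sysfs and cgroup v2 writes that would restrict a node to a given configuration. It is not the authors' tooling. The abstract does not say which mechanism is used for which parameter, so the mapping here (CPU hotplug for cores, cgroup v2 `memory.max` for RAM, cgroup v2 `io.max` for disk write bandwidth) is an assumption, and the function names, the `dfs-test` group name, and the `8:0` device number are hypothetical.

```python
from pathlib import Path

CPU_SYSFS = Path("/sys/devices/system/cpu")   # Linux CPU hotplug interface
CGROUP_ROOT = Path("/sys/fs/cgroup")          # cgroup v2 unified hierarchy


def plan_limits(total_cpus, online_cpus, mem_bytes, device, write_bps,
                group="dfs-test"):
    """Return (path, value) pairs for the planned writes; nothing is written.

    `group` is a hypothetical cgroup name; `device` is a "MAJ:MIN" string.
    """
    if not 1 <= online_cpus <= total_cpus:
        raise ValueError("online_cpus must be between 1 and total_cpus")

    writes = []
    # cpu0 is skipped because it often cannot be taken offline.
    for cpu in range(1, total_cpus):
        state = "1" if cpu < online_cpus else "0"
        writes.append((CPU_SYSFS / f"cpu{cpu}" / "online", state))

    cgroup = CGROUP_ROOT / group
    # Assumption: RAM and disk bandwidth are capped through cgroup v2 files.
    writes.append((cgroup / "memory.max", str(mem_bytes)))
    writes.append((cgroup / "io.max", f"{device} wbps={write_bps}"))
    return writes


def apply_limits(writes):
    """Perform the planned writes (requires root and an existing cgroup)."""
    for path, value in writes:
        path.write_text(value)


if __name__ == "__main__":
    # Example: 2 of 4 cores, 8 GiB of RAM, 100 MiB/s writes on device 8:0.
    for path, value in plan_limits(4, 2, 8 * 1024**3, "8:0", 100 * 1024**2):
        print(f"{path} <- {value}")
```

Separating the plan from its application makes each configuration easy to inspect before touching the system. In practice, the cgroup must already exist, the `memory` and `io` controllers must be enabled in the parent's `cgroup.subtree_control`, and the storage daemons under test must be placed in the group through its `cgroup.procs` file.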

The method offers practical insights for designing cost-effective distributed storage systems and remains general enough to be applied to other filesystems.