AUTHOR=Mallik Aman, Reddy B Ranjith, Sahoo Gadadhar TITLE=A survivability analysis of enterprise hard drives incorporating the impact of workload JOURNAL=Frontiers in Computer Science VOLUME=6 YEAR=2024 URL=https://www.frontiersin.org/journals/computer-science/articles/10.3389/fcomp.2024.1400943 DOI=10.3389/fcomp.2024.1400943 ISSN=2624-9898 ABSTRACT=Introduction

Hard disk drive (HDD) failure is a significant cause of downtime in enterprise storage systems. Research suggests that data access rates strongly influence the survival probability of HDDs.

Methods

This paper proposes a model to estimate the probability of HDD failure, using factors such as the total data (TD) read or written and the average access rate (AAR) for a specific drive model. The study utilizes a dataset of HDD failures to analyze the effects of these variables.
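
As a rough illustration of the two workload factors, the sketch below derives TD and AAR from per-drive counters and computes a Kaplan-Meier survival curve over a group of drives. The abstract does not state the functional form of the proposed model, so Kaplan-Meier is used here only as a standard survival estimator, not as the paper's method. The record layout, the field names, and the definition of AAR as TD divided by power-on hours are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DriveRecord:
    # Hypothetical fields; real failure datasets name and scale these differently.
    bytes_read: float
    bytes_written: float
    power_on_hours: float
    failed: bool  # True if a failure was observed, False if the drive was still healthy (censored)


def total_data(rec: DriveRecord) -> float:
    """TD: total bytes read or written over the drive's life."""
    return rec.bytes_read + rec.bytes_written


def average_access_rate(rec: DriveRecord) -> float:
    """AAR, assumed here to be TD per power-on hour."""
    if rec.power_on_hours <= 0:
        raise ValueError("power_on_hours must be positive")
    return total_data(rec) / rec.power_on_hours


def kaplan_meier(records):
    """Return [(t, S(t))] at each distinct observed failure time t (in power-on hours)."""
    ordered = sorted(records, key=lambda r: r.power_on_hours)
    at_risk = len(ordered)
    survival = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        t = ordered[i].power_on_hours
        failures = 0
        leaving = 0
        # Group all drives whose observation ends at the same time t.
        while i < len(ordered) and ordered[i].power_on_hours == t:
            failures += int(ordered[i].failed)
            leaving += 1
            i += 1
        if failures:
            survival *= 1 - failures / at_risk
            curve.append((t, survival))
        # Failed and censored drives both leave the risk set after time t.
        at_risk -= leaving
    return curve
```

To compare workloads, one could split the drives into low-AAR and high-AAR groups with `average_access_rate` and call `kaplan_meier` on each group separately.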

Results

The model was validated using case studies, demonstrating a strong correlation between access rate management and reduced HDD failure risk. The results indicate that managing data access rates through improved throttle commands can significantly enhance drive reliability.

Discussion

Our results suggest that optimizing throttle commands at the storage controller level can mitigate the risk of HDD failure by controlling data access rates, improving system longevity and reducing downtime in enterprise storage systems.
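
One common way to cap an access rate is a token bucket, sketched below as a stand-in for controller-level throttling. The abstract does not describe the throttle commands themselves, so the class name, its parameters, and the byte-based accounting are hypothetical and do not represent any real controller interface.

```python
class TokenBucket:
    """Admit I/O requests only while the sustained byte rate stays under a cap."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = float(rate_bytes_per_s)   # long-run access rate cap
        self.capacity = float(burst_bytes)    # largest burst admitted at once
        self.tokens = float(burst_bytes)      # bucket starts full
        self.last = 0.0                       # time of the previous call, in seconds

    def allow(self, nbytes: float, now: float) -> bool:
        """Return True if a request of nbytes may proceed at time `now` (seconds, caller supplied)."""
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False  # caller should delay or queue the request
```

Passing the clock in as `now` keeps the sketch deterministic; a real deployment would more likely read a monotonic clock and queue rejected requests rather than drop them.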