Unsupervised Detection of Long-Term Idle Periods in Large-Scale On-Premises Server Fleets

Authors

  • Ahmed Javed, OnitsAI Inc., South Korea
  • Haneul Yang, OnitsAI Inc., South Korea
  • Zohaib Shahid, Loughborough University London, United Kingdom

DOI:

https://doi.org/10.1609/aaaiss.v9i1.42906

Abstract

As on-premises GPU and server fleets scale to meet AI workload demands, substantial hardware assets remain underutilized, resulting in prolonged, high-cost “idle” periods. Detecting these segments in large-scale environments is inherently difficult due to the absence of ground-truth labels and the high volatility of modern workloads. We propose an unsupervised pipeline for identifying long-term idle intervals in unlabeled multivariate utilization time series. By leveraging daily volatility vectors across CPU, memory, GPU, and storage metrics (/data space and /root space), our novel framework, the BGMM-HMM, employs a Bayesian Gaussian Mixture Model for state clustering followed by a Hidden Markov Model to enforce temporal consistency. Experiments on production server-fleet data show that the BGMM-HMM identifies underutilized assets ≈5× more effectively than traditional rule-based baselines. Critically, ablation studies demonstrate that the HMM integration reduces spurious state-switching by >90% compared to standalone clustering, providing the stable, contiguous intervals necessary for practical resource reclamation. Furthermore, robustness tests via synthetic noise injection confirm a 98.3% sensitivity to workload spikes. This framework provides a scalable and operationally stable tool for infrastructure optimization and ESG-aligned sustainable computing.
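
To illustrate why a temporal model suppresses spurious state-switching, the following is a minimal sketch of Viterbi decoding over per-day cluster probabilities. It is not the authors' implementation: the function names, the two-state example data, and the `stay_prob` value are hypothetical, and using cluster probabilities directly as emission scores is a simplifying assumption.

```python
import math

def viterbi_smooth(posteriors, stay_prob=0.95):
    """Most likely state path under a 'sticky' transition model.

    posteriors: one list per day, holding one probability per state
                (assumed to come from a clustering step); needs >= 2 states.
    stay_prob:  assumed chance of keeping the same state from day to day
                (hypothetical value, not taken from the paper).
    """
    n_states = len(posteriors[0])
    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (n_states - 1))
    eps = 1e-12  # guards against log(0)

    # score[s] = best log-score of any path ending in state s (uniform prior)
    score = [math.log(p + eps) - math.log(n_states) for p in posteriors[0]]
    backptr = []
    for day in posteriors[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Pick the best predecessor state for landing in state s today.
            best_prev = max(
                range(n_states),
                key=lambda r: score[r] + (log_stay if r == s else log_switch),
            )
            trans = log_stay if best_prev == s else log_switch
            new_score.append(score[best_prev] + trans + math.log(day[s] + eps))
            pointers.append(best_prev)
        score = new_score
        backptr.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backptr):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def count_switches(path):
    """Number of day-to-day state changes in a label sequence."""
    return sum(1 for a, b in zip(path, path[1:]) if a != b)

# Hypothetical data: state 0 = idle, state 1 = active; day 5 is a noisy blip.
noisy = [[0.9, 0.1]] * 4 + [[0.4, 0.6]] + [[0.9, 0.1]] * 5
raw_labels = [max(range(2), key=lambda s: day[s]) for day in noisy]
smoothed_labels = viterbi_smooth(noisy)
print(count_switches(raw_labels), count_switches(smoothed_labels))
```

In this toy input the per-day argmax flips state around the noisy day, while the decoded path keeps a single contiguous idle interval. A sustained change in the probabilities still produces a switch, because the accumulated emission evidence eventually outweighs the transition penalty.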

Published

2026-06-23

How to Cite

Javed, A., Yang, H., & Shahid, Z. (2026). Unsupervised Detection of Long-Term Idle Periods in Large-Scale On-Premises Server Fleets. Proceedings of the AAAI Symposium Series, 9(1), 61–68. https://doi.org/10.1609/aaaiss.v9i1.42906

Section

AI-Driven Resilience: Building Robust, Adaptive Technologies for a Dynamic World (Full Papers)