Performance Prediction for Large Systems via Text-to-Text Regression
Yash Akhauri, Bryan Lewandowski, Cheng-Hsi Lin, Adrian N. Reyes, Grant C. Forbes, Arissa Wongpanich, Bangding Yang, Mohamed S. Abdelfattah, Sagi Perel, Xingyou Song
2025-06-30
Summary
This paper uses a text-to-text regression model, an AI model that reads its input as text and writes its numeric prediction as text, to predict how efficiently a large computing system, such as Google's Borg cluster, uses its resources.
What's the problem?
Traditional methods for predicting system performance rely on tables with a fixed set of features. They handle complex data such as system logs or configuration files poorly, which makes their predictions less accurate and hard to adapt to new tasks.
What's the solution?
The researchers created a model that treats both the input (system data) and the output (performance measures) as sequences of text, enabling the AI to learn prediction tasks more flexibly. This model was trained from scratch and showed very high accuracy, adapting quickly to new challenges with very few examples and also providing measures of uncertainty in its predictions.
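To make the idea of treating inputs and outputs as text concrete, here is a minimal Python sketch. It is not the authors' implementation. The JSON serialization, the sign/digit/exponent token scheme, and every function name are assumptions chosen for illustration. The last helper assumes that uncertainty is estimated by sampling several outputs from the model and summarizing their spread, and it uses hand-made stand-ins for those samples.

```python
import json
import statistics


def serialize_features(record: dict) -> str:
    """Turn a nested record into one text string (hypothetical input format)."""
    return json.dumps(record, sort_keys=True)


def encode_number(value: float, digits: int = 4) -> list[str]:
    """Encode a float as sign, mantissa-digit, and exponent tokens (assumed scheme)."""
    sign = "<->" if value < 0 else "<+>"
    # Scientific notation gives a normalized mantissa, e.g. 0.625 -> "6.250e-01".
    mantissa_str, exponent_str = f"{abs(value):.{digits - 1}e}".split("e")
    mantissa_digits = mantissa_str.replace(".", "")
    return [sign] + [f"<{d}>" for d in mantissa_digits] + [f"<E{int(exponent_str)}>"]


def decode_number(tokens: list[str]) -> float:
    """Invert encode_number: rebuild the float from its tokens."""
    sign = -1.0 if tokens[0] == "<->" else 1.0
    mantissa = "".join(t[1:-1] for t in tokens[1:-1])
    exponent = int(tokens[-1][2:-1])
    # The mantissa has one digit before the decimal point, so shift the exponent.
    shift = exponent - (len(mantissa) - 1)
    return sign * float(f"{mantissa}e{shift}")


def summarize_samples(samples: list[list[str]]) -> tuple[float, float]:
    """Decode several sampled outputs; return (median, population std dev)."""
    values = [decode_number(s) for s in samples]
    return statistics.median(values), statistics.pstdev(values)


# Hypothetical record; the field names are invented for this example.
record = {"job": {"priority": 200, "cells": ["aa", "bb"]}, "cpu_request": 4.0}
model_input = serialize_features(record)
target_tokens = encode_number(0.625)

# Stand-ins for outputs that a trained model would produce by sampling.
fake_samples = [encode_number(v) for v in (0.60, 0.62, 0.64)]
point_estimate, spread = summarize_samples(fake_samples)
```

Because the input is an arbitrary string, nested or variable-length fields need no fixed feature columns, which is the flexibility the summary describes. A real system would replace `fake_samples` with token sequences decoded from the trained model.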
Why does it matter?
This matters because it offers a general way to predict the behavior of complex systems without complicated manual data preparation, which makes decisions about system efficiency and reliability easier and better informed.
Abstract
A text-to-text regression model achieves high accuracy in predicting resource efficiency for Google's Borg system, surpassing tabular methods, and demonstrates adaptability and uncertainty quantification.