Hardware and Software Platform Inference

Cheng Zhang, Hanna Foerster, Robert D. Mullins, Yiren Zhao, Ilia Shumailov

2024-11-13

Summary

This paper introduces Hardware and Software Platform Inference (HSPI), a new method for verifying which hardware and software stack is actually used to serve the machine learning models that clients access. The goal is to ensure that clients receive the service they are paying for.

What's the problem?

The problem is that businesses often buy access to large language models (LLMs) without being able to check if they are getting what they paid for. Providers may advertise using advanced hardware, like NVIDIA H100 GPUs, but might actually deliver less capable models running on cheaper hardware. This can lead to clients paying for high-performance services but receiving lower-quality results instead.

What's the solution?

To address this issue, the authors introduce HSPI, which analyzes the input-output behavior of machine learning models to identify the underlying hardware and software configuration. Because different GPU architectures and compilers handle numerical computation slightly differently, the numerical patterns in a model's outputs reveal what kind of hardware was used for inference. In the authors' evaluation, the method distinguished between different GPUs with 83.9% to 100% accuracy in a white-box setting, and even when the model is treated as a black box it achieved up to three times random-guess accuracy.
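To make the intuition concrete, here is a minimal toy sketch (not the paper's actual pipeline): the same computation carried out at different numeric precisions, standing in for different hardware/software stacks, leaves distinct traces in the output's bit pattern. The `run_inference` and `fingerprint` helpers are hypothetical names chosen for illustration.

```python
import numpy as np

def run_inference(x, accum_dtype):
    # Toy "model": a dot product whose result depends on the accumulation
    # precision, standing in for differences between hardware/software stacks.
    w = np.linspace(0.1, 1.0, x.size, dtype=np.float32)
    return np.sum(x.astype(accum_dtype) * w.astype(accum_dtype)).astype(np.float32)

def fingerprint(output):
    # Feature: the low-order bits of the float32 output, where
    # precision-dependent rounding differences accumulate.
    return int(np.float32(output).view(np.uint32)) & 0xFFFF

rng = np.random.default_rng(0)
x = rng.standard_normal(1024).astype(np.float32)

fp_expensive = fingerprint(run_inference(x, np.float64))  # high-precision stack
fp_cheap = fingerprint(run_inference(x, np.float16))      # low-precision stack
```

On the same input, the two simulated stacks produce outputs whose low-order bits differ; a classifier trained on such features could tell them apart, which is the flavor of signal HSPI exploits (the real method targets actual GPU and compiler differences rather than simulated precision).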

Why it matters?

This research is important because it helps establish transparency and trust in the machine learning industry. By allowing clients to verify the actual hardware used by service providers, HSPI can prevent situations where customers pay for premium services but receive subpar performance. This could lead to better accountability among model providers and improve overall service quality in AI applications.

Abstract

It is now a common business practice to buy access to large language model (LLM) inference rather than self-host, because of significant upfront hardware infrastructure and energy costs. However, as a buyer, there is no mechanism to verify the authenticity of the advertised service, including the serving hardware platform, e.g. that it is actually being served using an NVIDIA H100. Furthermore, there are reports suggesting that model providers may deliver models that differ slightly from the advertised ones, often to make them run on less expensive hardware. That way, a client pays a premium for capable model access on more expensive hardware, yet ends up being served by a (potentially less capable) cheaper model on cheaper hardware. In this paper we introduce hardware and software platform inference (HSPI) -- a method for identifying the underlying GPU architecture and software stack of a (black-box) machine learning model solely based on its input-output behavior. Our method leverages the inherent differences of various GPU architectures and compilers to distinguish between different GPU types and software stacks. By analyzing the numerical patterns in the model's outputs, we propose a classification framework capable of accurately identifying the GPU used for model inference as well as the underlying software configuration. Our findings demonstrate the feasibility of inferring GPU type from black-box models. We evaluate HSPI against models served on different real hardware and find that in a white-box setting we can distinguish between different GPUs with between 83.9% and 100% accuracy. Even in a black-box setting we are able to achieve results that are up to three times higher than random guess accuracy.