
WEKA storage with the Micron 6500 ION SSD supports 256 AI accelerators

By Wes Vaske - 2023-11-28
Micron recently published our MLPerf Storage v0.5 results on the Micron® 9400 NVMe™ SSD. Those results highlight the high-performance NVMe SSD as a local cache in an AI server, and the Micron 9400 NVMe SSD performs extremely well for that use case. However, most AI training data lives not in local cache but on shared storage. For SC23, we decided to test the same MLPerf Storage AI workload on a WEKA storage cluster powered by the 30TB Micron 6500 ION NVMe SSD.
 
WEKA is a distributed, parallel file system designed for AI workloads, and we wanted to know how the MLPerf Storage AI workload scales on a high-performance SDS solution. The results were illuminating, helping us make sizing recommendations for current-generation AI systems and hinting at the massive throughput future AI storage systems will require.

First, a quick review of MLPerf Storage
MLCommons maintains and develops six different benchmark suites and is developing open datasets to support future state-of-the-art model development. The MLPerf Storage Benchmark Suite is the latest addition to the MLCommons’ benchmark collection. 

MLPerf Storage sets out to address two challenges when characterizing the storage workload for AI training systems — the cost of AI accelerators and the small size of available datasets.

For a deeper dive into the workload generated by MLPerf Storage and a discussion of the benchmark, see our previous blog post.
Next, let's examine the WEKA cluster being tested
My teammate Sujit wrote a post earlier this year describing the performance of this cluster in synthetic workloads. Check out that post for the full results.

The cluster is made up of six storage nodes, each with an identical configuration. In total, the cluster provides 838TB of capacity and, for high-queue-depth workloads, reaches 200 GB/s of throughput.
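As a quick back-of-the-envelope sketch (our own arithmetic from the totals above, not a separately measured figure), that works out to roughly the following per node:

```python
# Rough per-node sizing derived from the cluster totals quoted above.
nodes = 6
total_capacity_tb = 838
total_throughput_gb_s = 200   # at high queue depth

print(f"~{total_capacity_tb / nodes:.0f} TB and "
      f"~{total_throughput_gb_s / nodes:.0f} GB/s per storage node")
# -> ~140 TB and ~33 GB/s per node
```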

Finally, let’s review how this cluster performs in MLPerf Storage 
Quick note: The results presented here are unvalidated as they have not been submitted to MLPerf Storage for review. Also, the MLPerf Storage benchmark is undergoing changes as it moves from v0.5 to the next version, the first 2024 release. The numbers presented here use the same methodology as the v0.5 release (each client uses an independent dataset, clients run independently, and the accelerators on a client share a barrier).
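To make that last point concrete, here is a minimal, hypothetical Python sketch (not the benchmark's own code) of what "the accelerators on a client share a barrier" means: every emulated accelerator on a client must finish reading its batch before any of them can start the next step, so the slowest read gates the whole client.

```python
# Illustrative sketch only: emulated accelerators on one client share a barrier.
import threading
import time
import random

NUM_ACCELERATORS = 16          # one client emulates 16 V100s, like a DGX-2
step_barrier = threading.Barrier(NUM_ACCELERATORS)

def emulated_accelerator(rank: int, steps: int = 3) -> None:
    for step in range(steps):
        # Stand-in for reading a batch from storage; in the real benchmark
        # this is file I/O whose duration varies per accelerator.
        time.sleep(random.uniform(0.01, 0.05))
        # All accelerators on the client synchronize before the next step,
        # so the slowest reader determines the client's step time.
        step_barrier.wait()

threads = [threading.Thread(target=emulated_accelerator, args=(r,))
           for r in range(NUM_ACCELERATORS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```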

The MLPerf Storage benchmark emulates NVIDIA® V100 accelerators in the v0.5 release. The NVIDIA DGX-2 server has 16 V100 accelerators. For this test, we show the number of clients supported on the WEKA cluster where each client emulates 16 V100 accelerators, like an NVIDIA DGX-2.

Additionally, v0.5 of the MLPerf Storage benchmark implements two different models, Unet3D and BERT. Through testing, we found that BERT does not generate significant storage traffic, so we focused our testing on Unet3D. (Unet3D is a 3D medical imaging model.)

This plot shows the total throughput to the storage system for a given number of client nodes. Remember, each node has 16 emulated accelerators. Additionally, to be considered “successful,” a given quantity of nodes and accelerators needs to maintain greater than 90% accelerator utilization. If an accelerator falls below 90% utilization, that represents idle time on the accelerators as they wait for data.
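As a rough illustration of that pass/fail criterion (the benchmark's own accounting may differ in detail), utilization is essentially the share of wall-clock time the accelerator spends computing rather than stalled waiting on storage:

```python
def accelerator_utilization(compute_time_s: float, io_wait_s: float) -> float:
    """Fraction of wall-clock time spent computing rather than waiting on I/O."""
    return compute_time_s / (compute_time_s + io_wait_s)

# A configuration only "passes" if utilization stays above 90%.
util = accelerator_utilization(compute_time_s=9.5, io_wait_s=0.5)
print(f"{util:.0%}")   # -> 95%, above the 90% threshold
assert util > 0.90
```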

Here we see that the six-node WEKA storage cluster supports 16 clients, each emulating 16 accelerators — for a total of 256 emulated accelerators — and reaching 91 GB/s of throughput.

This performance is equivalent to 16 NVIDIA DGX-2 systems (with 16 V100 GPUs each), which is a remarkably high number of AI systems supported by a six-node WEKA cluster.
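Dividing the measured total across the emulated accelerator count gives a rough per-device figure (our own arithmetic from the numbers above, not a separately measured result):

```python
total_throughput_gb_s = 91        # measured cluster throughput
emulated_v100s = 16 * 16          # 16 clients x 16 accelerators each

per_v100 = total_throughput_gb_s / emulated_v100s   # ~0.36 GB/s per V100
per_dgx2 = per_v100 * 16                            # ~5.7 GB/s per DGX-2
print(f"{per_v100:.2f} GB/s per V100, {per_dgx2:.1f} GB/s per DGX-2")
```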

The V100 is a PCIe Gen3 GPU, and the pace of performance increases across NVIDIA's GPU generations is far outstripping platform and PCIe generations. In single-node systems, we find that an emulated NVIDIA A100 GPU is four times faster in this workload.

With a maximum throughput of 91 GB/s, we can estimate that this WEKA deployment would support eight DGX A100 systems (with 8 A100 GPUs each).
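That estimate follows directly from the numbers above; here is a quick sketch of the arithmetic (an approximation, since the real answer depends on the model and the data pipeline):

```python
per_v100_gb_s = 91 / 256              # per-accelerator demand from the V100 result
per_a100_gb_s = per_v100_gb_s * 4     # an A100 is ~4x faster on this workload
per_dgx_a100 = per_a100_gb_s * 8      # a DGX A100 has 8 GPUs -> ~11.4 GB/s

print(round(91 / per_dgx_a100))       # -> 8 DGX A100 systems at 91 GB/s
```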

Looking further into the future at H100 / H200 (PCIe Gen5) and X100 (PCIe Gen6), cutting-edge AI training servers are going to push a massive amount of throughput.

Today, WEKA storage and the Micron 6500 ION NVMe SSD are the perfect combination of capacity, performance and scalability for your AI workloads.

Stay tuned as we continue exploring AI storage!
Wendy Lee-Kadlec

Wes Vaske

Wes Vaske is a Senior Member of Technical Staff on the Micron Data Center Workloads Engineering team in Austin, Texas. He analyzes enterprise workloads to understand the performance effects of Flash and DRAM devices on applications and provides 'real-life' workload characterization to internal design & development teams. Wes's specific focus is artificial intelligence applications and developing tools for tracing and system observation.
