MemVerge and Micron Boost NVIDIA GPU Utilization with CXL® Memory
SAN JOSE, Calif., March 18, 2024 /PRNewswire/ -- MemVerge®, a leader in AI-first Big Memory Software, has joined forces with Micron to unveil a groundbreaking solution that leverages intelligent tiering of CXL memory, boosting the performance of large language models (LLMs) by offloading from GPU HBM to CXL memory. This innovative collaboration is being showcased in Micron booth #1030 at GTC, where attendees can witness firsthand the transformative impact of tiered memory on AI workloads.
- The demonstration, conducted by engineers from MemVerge and Micron, featured the FlexGen high-throughput generation engine and the OPT-66B large language model running on a Supermicro Petascale Server equipped with an AMD Genoa CPU, an NVIDIA A10 GPU, Micron DDR5-4800 DIMMs, CZ120 CXL memory modules, and MemVerge Memory Machine™ X intelligent tiering software.
- During the demonstration, GPU utilization rose from 51.8% to 91.8%, thanks to the transparent tiering of data across DIMMs and CXL modules managed by MemVerge Memory Machine X software.
- For more information about the collaborative efforts of MemVerge, Micron, and Supermicro, and the transformative potential of CXL memory for AI workloads, please visit www.memverge.com/gtc2024.