AMD Delivers Leadership Portfolio of Data Center AI Solutions with AMD Instinct MI300 Series

Retrieved on: 
Wednesday, December 6, 2023

SANTA CLARA, Calif., Dec. 06, 2023 (GLOBE NEWSWIRE) -- Today, AMD (NASDAQ: AMD) announced the availability of the AMD Instinct™ MI300X accelerators – with industry-leading memory bandwidth for generative AI and leadership performance for large language model (LLM) training and inferencing – as well as the AMD Instinct™ MI300A accelerated processing unit (APU) – combining the latest AMD CDNA™ 3 architecture and “Zen 4” CPUs to deliver breakthrough performance for HPC and AI workloads.

Key Points: 
  • “AMD Instinct MI300 Series accelerators are designed with our most advanced technologies, delivering leadership performance, and will be in large scale cloud and enterprise deployments,” said Victor Peng, president, AMD.
  • Oracle Cloud Infrastructure plans to add AMD Instinct MI300X-based bare metal instances to the company’s high-performance accelerated computing instances for AI.
  • Dell showcased the Dell PowerEdge XE9680 server featuring eight AMD Instinct MI300 Series accelerators and the new Dell Validated Design for Generative AI with AMD ROCm-powered AI frameworks.
  • Supermicro announced new additions to its H13 generation of accelerated servers powered by 4th Gen AMD EPYC™ CPUs and AMD Instinct MI300 Series accelerators.

Kinara Edge AI Processor Tackles the Monstrous Compute Demands of Generative AI and Transformer-Based Models

Retrieved on: 
Tuesday, December 12, 2023

Kinara™, Inc., today launched the Kinara Ara-2 Edge AI processor, powering edge servers and laptops with high-performance, cost-effective, and energy-efficient inference to run applications such as video analytics, Large Language Models (LLMs), and other Generative AI models.

Key Points: 
  • The Ara-2 is also ideal for edge applications running traditional AI models and state-of-the-art AI models with transformer-based architectures.
  • View the full release here: https://www.businesswire.com/news/home/20231212788210/en/
  • As an example of its capabilities for processing Generative AI models, Ara-2 can hit 10 seconds per image for Stable Diffusion and tens of tokens per second for LLaMA-7B.

NVIDIA, Global Data Center System Manufacturers to Supercharge Generative AI and Industrial Digitalization

Retrieved on: 
Tuesday, August 8, 2023

“As generative AI transforms every industry, enterprises are increasingly seeking large-scale compute resources in the data center,” said Bob Pette, vice president of professional visualization at NVIDIA.

Key Points: 
  • The next generation of NVIDIA OVX systems powering Omniverse Cloud will feature L40S GPUs to deliver the AI and graphics performance needed to supercharge generative AI pipelines and Omniverse workloads.
  • Global system builders, including ASUS, Dell Technologies, GIGABYTE, HPE, Lenovo, QCT and Supermicro, will soon offer OVX systems that include the NVIDIA L40S GPUs.
  • These servers will help professionals worldwide advance AI and bring generative AI applications like intelligent chatbots, search and summarization tools to users across industries.

BOXX Upgrades AI Workstation Module with New NVIDIA L40S GPUs

Retrieved on: 
Tuesday, August 8, 2023

“Demanding AI workloads require powerful, specific, purpose-built solutions,” said Tim Lawrence, BOXX VP of Engineering.

Key Points: 
  • Additionally, third-generation NVIDIA RTX and DLSS 3 technology enable high-fidelity creative workflows, with 48GB of GPU memory available.
  • The powerful graphics and AI capabilities of the NVIDIA L40S deliver accelerated performance for complex Omniverse workloads like ray-traced and path-traced rendering, physically accurate simulation, and photorealistic 3D synthetic data generation.
  • BOXX RAXX P1G, ideal for deploying AI solutions on a large scale, is available with up to four NVIDIA L40S GPUs and is powered by a single AMD EPYC™ 7003 processor (up to 64 cores).

Tachyum AI Mastered FP8 To Reach FP32 Precision in Updated Version of AI White Paper

Retrieved on: 
Wednesday, May 17, 2023

The results show that FP8 quantized networks can maintain accuracy on par – or even exceed the accuracy – of baseline FP32 models.

Key Points: 
  • FP8 not only reduces the cost of computation but also memory requirements for large and rapidly growing AI models.
  • With FP8 capable of performing mainstream AI functions, leaders like Tachyum are poised to help accelerate the rapid evolution of AI hardware technology.
  • To read more about Prodigy’s capabilities in AI, including results from its implementation of FP8, interested parties can download Tachyum’s latest white paper at https://www.tachyum.com/resources/whitepapers/2023/05/17/tachyum-prodigy... .
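As a rough illustration of what FP8 quantization does to weight values, the sketch below simulates rounding float32 numbers to an E4M3-style grid (4 exponent bits, 3 mantissa bits). This is an educational approximation of the format, not Tachyum's implementation; the function name and the simplified handling of subnormals are my own assumptions.

```python
import numpy as np

def quantize_fp8_e4m3(x):
    """Round float32 values to a simulated FP8 E4M3 grid: 4 exponent bits,
    3 mantissa bits, max normal value 448. Subnormals are simplified away;
    this is an educational sketch, not a production quantizer."""
    x = np.asarray(x, dtype=np.float32)
    sign = np.sign(x)
    mag = np.clip(np.abs(x), 0.0, 448.0)    # E4M3 saturates at 448
    m, e = np.frexp(mag)                    # mag = m * 2**e, with m in [0.5, 1)
    m_q = np.round(m * 16.0) / 16.0         # 3 mantissa bits => steps of 1/16
    return np.where(mag == 0.0, 0.0, sign * np.ldexp(m_q, e)).astype(np.float32)

# Rounding error on normally distributed "weights" stays below 1/16 relative,
# which is one intuition for why many networks tolerate FP8 with little
# accuracy loss relative to FP32.
rng = np.random.default_rng(0)
w = rng.standard_normal(10_000).astype(np.float32)
w8 = quantize_fp8_e4m3(w)
rel_err = np.abs(w8 - w) / np.maximum(np.abs(w), 1e-8)
print(f"max relative rounding error: {rel_err.max():.4f}")
```

In practice FP8 training and inference also require per-tensor scaling so values land inside the narrow E4M3 dynamic range; the sketch omits that step.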

AMD Unveils the Most Powerful AMD Radeon PRO Graphics Cards, Offering Unique Features and Leadership Performance to Tackle Heavy to Extreme Professional Workloads

Retrieved on: 
Thursday, April 13, 2023

SANTA CLARA, Calif., April 13, 2023 (GLOBE NEWSWIRE) -- AMD (NASDAQ: AMD) today announced the AMD Radeon™ PRO W7000 Series graphics, its most powerful workstation graphics cards to date. The AMD Radeon™ PRO W7900 and AMD Radeon™ PRO W7800 graphics cards are built on the groundbreaking AMD RDNA™ 3 architecture, delivering significantly higher performance than the previous generation and exceptional performance-per-dollar compared to competing offerings. The new graphics cards are designed for professionals to create and work with high-polygon-count models seamlessly, deliver incredible image fidelity and color accuracy, and run graphics and compute-based applications concurrently without disrupting workflows.

Key Points: 
  • The AMD Radeon PRO W7900 graphics card, designed for extreme workloads, features 61 TFLOPS (FP32) peak single precision performance, offering 1.5X higher geomean performance on the SPECviewperf 2020 benchmark.
  • Created for heavy workloads, the AMD Radeon PRO W7800 graphics card features 45 TFLOPS (FP32) peak single precision performance and 32GB of GDDR6 memory.
  • AMD Radeon PRO Series workstation graphics and Ryzen Threadripper PRO processors provide exceptional performance, reliability, and stability to power mission-critical professional applications.
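The 61 TFLOPS FP32 figure for the W7900 can be roughly reproduced from first principles. The stream-processor count (6144) and boost clock (~2.5 GHz) used below are not stated in the release; they are assumed from publicly listed RDNA 3 specifications, so treat this as a back-of-the-envelope check rather than an official derivation.

```python
def peak_fp32_tflops(stream_processors, boost_ghz, flops_per_sp_per_clock=4):
    """Peak FP32 = SPs x FLOPs-per-SP-per-clock x clock (GHz) / 1000.
    RDNA 3 counts an FMA as 2 FLOPs and can dual-issue FP32, hence the 4."""
    return stream_processors * flops_per_sp_per_clock * boost_ghz / 1e3

# Assumed W7900 figures: 6144 stream processors, ~2.495 GHz boost clock.
print(round(peak_fp32_tflops(6144, 2.495), 1))  # close to the quoted 61 TFLOPS
```

The same formula with a lower shader count and clock lands near the W7800's 45 TFLOPS figure, which is a useful sanity check when comparing vendor peak numbers.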

Deci Achieves Record-Breaking Inference Speed on NVIDIA GPUs at MLPerf

Retrieved on: 
Wednesday, April 5, 2023

TEL AVIV, Israel, April 5, 2023 /PRNewswire-PRWeb/ -- Deci, the deep learning company harnessing Artificial Intelligence (AI) to build better AI, today announced results for its Natural Language Processing (NLP) model submitted to the MLPerf Inference v3.0 benchmark suite under the open submission track. Notably, the NLP model, generated by Deci's Automated Neural Architecture Construction (AutoNAC) technology, dubbed DeciBERT-Large, delivered a record-breaking throughput performance of more than 100,000 queries per second on 8 NVIDIA A100 GPUs while also delivering improved accuracy. Also, Deci delivered unparalleled throughput performance per TeraFLOPs, outperforming competing submissions made on even stronger hardware setups.

Key Points: 
  • Deci achieves the highest inference speed ever to be published at MLPerf for NLP, while also delivering the highest accuracy.
  • Running successful inference at scale requires meeting various performance criteria such as latency, throughput, and model size, among others.
  • In this case, AutoNAC was used by Deci to generate model architectures tailored for various NVIDIA accelerators and presented unparalleled performance on the NVIDIA A30 GPU, NVIDIA A100 GPU (1 & 8 unit configurations), and the NVIDIA H100 GPU.
  • In other words, with Deci, teams can replace 8 NVIDIA A100 cards with just one NVIDIA H100 card, while getting higher throughput and better accuracy (+0.47 F1).
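The "throughput per TeraFLOPs" comparison can be made concrete with a small calculation. Only the 100,000 queries/second on 8 NVIDIA A100s comes from the announcement; the peak dense FP16 Tensor Core figures (A100: 312 TFLOPS, H100 SXM: ~990 TFLOPS) are datasheet numbers assumed here for illustration.

```python
def qps_per_tflop(qps, n_gpus, peak_tflops_per_gpu):
    """Normalize measured throughput by aggregate peak compute."""
    return qps / (n_gpus * peak_tflops_per_gpu)

# Announced: >100,000 QPS on 8x A100 (312 peak dense FP16 TFLOPS each, assumed).
a100_efficiency = qps_per_tflop(100_000, 8, 312)
print(f"A100 setup: {a100_efficiency:.1f} QPS per peak TFLOP")

# For the claimed 8x A100 -> 1x H100 replacement to hold at equal throughput,
# a single H100 (~990 peak TFLOPS, assumed) must extract far more QPS per TFLOP:
h100_efficiency = qps_per_tflop(100_000, 1, 990)
print(f"H100 setup: {h100_efficiency:.1f} QPS per peak TFLOP")
```

Normalizing by peak compute in this way is what makes submissions on different hardware comparable, since raw QPS alone favors whoever brings the most silicon.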

Axelera AI Announces Metis AI Platform

Retrieved on: 
Thursday, December 15, 2022

European artificial intelligence (AI) startup Axelera AI announced today its Metis AI Platform, encompassing the Metis™ AI Processing Unit (AIPU) chip and the Voyager™ SDK software stack, available as cards, boards, and vision systems to accelerate computer vision processing at the edge.

Key Points: 
  • View the full release here: https://www.businesswire.com/news/home/20221213005651/en/
  • The Metis AI Platform redefines affordability for Edge AI solutions without compromising on performance and is the first product in the market to offer hundreds of TeraOPS (TOPS) of compute performance at an edge price point.
  • Axelera AI is opening its Early Access Program, a unique collaboration opportunity for leading companies to co-design real-world Edge AI solutions based on the Metis AI platform.

RADX Tech Announces Trifecta-SSD COTS PXIe/CPCIe RAID Modules

Retrieved on: 
Wednesday, October 26, 2022

WASHINGTON, Oct. 26, 2022 /PRNewswire-PRWeb/ -- RADX® Technologies, Inc. ("RADX"), today at the Association of Old Crows (AOC) 2022 Annual Convention, announced the Trifecta-GPU™ Family of COTS PXIe/CPCIe GPU Modules. Trifecta-GPUs are the first COTS products that bring the extreme compute acceleration and ease-of-programming of NVIDIA® RTX® A2000 Embedded GPUs to PXIe/CPCIe platforms for modular Test & Measurement (T&M) and Electronic Warfare (EW) applications.

Key Points: 
  • Designed to complement RADX Catalyst-GPU products announced earlier this year, Trifecta-GPUs deliver even greater compute performance by employing NVIDIA RTX Embedded GPUs.
  • For more info on RADX, please visit http://www.radxtech.com, email [email protected], or call +1 (619) 677-1849 x1.
  • RADX, the RADX logo, Trifecta-SSD, Trifecta-GPU, Catalyst-GPU and Catalyst-GbE are trademarks or registered trademarks that are the property of RADX Technologies, Inc. All other trademarks are the property of their respective owners.
  • All specifications are thought to be accurate at the time of publication, but are subject to change without notice.

Aetina Launches New MXM GPU Modules for AI Performance Boost at the Edge

Retrieved on: 
Monday, October 24, 2022

These NVIDIA Ampere architecture-based MXM GPU modules, providing superb performance and power efficiency, are suitable for various types of computer vision applications across different industries such as commercial gaming, aerospace, healthcare, and manufacturing.

Key Points: 
  • The model names of the new Aetina MXM GPU modules powered by the NVIDIA RTX A1000, NVIDIA RTX A2000, and NVIDIA RTX A4500 are M3A1000-PP, M3A2000-VY, and M3A4500-WP, respectively.
  • The new MXM GPU modules, with their compact designs, can be easily integrated into a variety of AI systems to boost the computing performance and run complex AI inference tasks.
  • Moreover, the company offers MXM GPU modules powered by other NVIDIA GPUs, including the Quadro T1000, Quadro RTX 3000, and Quadro RTX 5000, to meet different levels of compute performance required by AI-powered system development.