Syntiant to Demonstrate 100% Speed-Up of its Optimized Large Language Models at NVIDIA GTC
At GTC booth I-103, Syntiant will demonstrate its optimized LLM, which achieved twice the token generation rate of the SOTA GGML LLaMa-7B benchmark while delivering the same accuracy.
- “The ML models we’re demoing at GTC were trained using NVIDIA accelerated computing technologies, which have helped enable AI across many industries,” said Kurt Busch, CEO of Syntiant.
- Optimized to reduce latency and memory footprint, Syntiant’s models are deployable into production on day one and at a lower cost to OEMs.