Large Language Model
vLLM vs TensorRT-LLM vs HF TGI vs LMDeploy: A Deep Technical Comparison for Production LLM Inference
A technical comparison of vLLM, TensorRT-LLM, Hugging Face TGI, and LMDeploy reveals throughput differences of up to 10,000 tokens/second on NVIDIA H100 GPUs.
xAI’s Grok 4.1 Achieves Top Ranking on LMArena with 1483 Elo, Signaling Advances in LLM Preference
xAI’s Grok 4.1 surpasses previous models and competitors, achieving a 64.78% preference rate in A/B testing and securing the top two positions on the LMArena Text Arena leaderboard.
Moonshot AI Introduces Kimi K2 Thinking: A Breakthrough in Long-Horizon Reasoning and Tool Use
Moonshot AI releases Kimi K2 Thinking, an open-source thinking model capable of executing 200–300 sequential tool calls without human intervention, optimized for long-horizon reasoning and agentic tasks.
Liquid AI Releases LFM2-ColBERT-350M: A Compact Late Interaction Model for Multilingual Cross-Lingual Retrieval
Liquid AI introduces LFM2-ColBERT-350M, a 350M-parameter late interaction retriever optimized for multilingual and cross-lingual search, offering high accuracy and fast inference speeds.