AI News
4922 articles in this category (Page 14 of 206)
AI NewsAI InfrastructureLarge Language Model
Sakana AI and NVIDIA Introduce TwELL: 20.5% Faster LLM Inference via Unstructured Sparsity
Sakana AI and NVIDIA introduced TwELL and custom CUDA kernels, achieving 20.5% inference and 21.9% training speedups in LLMs by exploiting activation sparsity.
Read more
AI NewsLarge Language ModelSoftware Engineering
Mastering LLM Distillation: Soft-Label, Hard-Label, and Co-distillation Strategies
LLM distillation uses teacher-student models to transfer reasoning capabilities, reducing costs while maintaining performance through techniques like soft-label and co-distillation.
Read more