Machine Learning
273 articles in this category (Page 2 of 12)
Qwen-Scope: Open-Source Sparse AutoEncoders for LLM Interpretability and Steering
Qwen AI releases Qwen-Scope, an open-source suite of 14 Sparse AutoEncoders (SAEs) for Qwen3/3.5 models, enabling inference-time steering and benchmark analysis without model runs.
OpenAI Privacy Filter: Building a Production PII Redaction Pipeline
Learn to implement a production-grade PII detection pipeline using the OpenAI Privacy Filter to automatically identify and redact sensitive data like API keys and personal addresses.
Talkie-1930: A 13B Vintage LLM Trained Exclusively on Pre-1931 Data
Researchers released Talkie-1930, a 13B-parameter open-weight LLM trained on 260 billion tokens of pre-1931 text to eliminate benchmark contamination and to study historical reasoning.
Mastering OpenMythos: Implementing Recurrent-Depth Transformers with MLA and MoE
OpenMythos enables deeper reasoning via recurrent computation and uses Multi-Head Latent Attention (MLA), which achieves a significantly smaller KV-cache footprint than Grouped-Query Attention (GQA).
Google DeepMind’s Decoupled DiLoCo: Scaling AI Training with 88% Goodput and Asynchronous Fault Tolerance
Google DeepMind's Decoupled DiLoCo achieves 88% goodput under high hardware failure rates and reduces inter-datacenter bandwidth from 198 Gbps to 0.84 Gbps.
Secure LLM Agents with Two-Stage Prompt Injection Detection
ZooClaw releases a specialized prompt injection detection API that uses a two-stage architecture to protect agentic workflows. The system achieves a 0.972 F1 score on English benchmarks, significantly outperforming GPT-4o, and delivers sub-10ms latency for 95 percent of production traffic.