Machine Learning
Building Scalable ML Data Pipelines for Image and Structured Data with Daft
Learn how to build an end-to-end ML pipeline with Daft, a Python-native data engine. The walkthrough covers MNIST image reshaping, feature engineering via batch UDFs, and Parquet persistence for high-performance processing.
Yuan 3.0 Ultra: Optimizing Trillion-Parameter MoE Efficiency via LAEP
YuanLab AI releases Yuan 3.0 Ultra, a 1T-parameter MoE model that achieves a 49% boost in pre-training efficiency. Using Layer-Adaptive Expert Pruning (LAEP) and a Reflection Inhibition Reward Mechanism, it reduces total parameters by 33.3% while maintaining state-of-the-art performance in multimodal retrieval and enterprise benchmarks.
Meet SymTorch: A PyTorch Library for Translating Deep Learning Models into Mathematical Equations
Cambridge researchers introduce SymTorch, a library that uses symbolic regression to translate PyTorch models into closed-form equations, achieving an 8.3% throughput increase in LLM inference benchmarks.
Building Scalable ML Pipelines on Millions of Rows with Vaex
Learn how to build a production-style analytics and ML pipeline on 2 million rows using Vaex, featuring lazy expressions and approximate statistics without materializing data in memory.
ByteDance AI Maps Molecular Bonds in Reasoning to Stabilize Long Chain-of-Thought Models
ByteDance researchers introduce MOLE-SYN, a framework that treats AI reasoning as molecular structures, stabilizing long chain-of-thought (CoT) performance across benchmarks like GSM8K and MATH-500.