Computer Vision
32 articles in this category (Page 2 of 2)
AI News | Computer Vision | Model Optimization
FLUX.2: Black Forest Labs' Next-Gen Image Generator Demands 80GB VRAM for Inference
FLUX.2, Black Forest Labs' new image model, requires 80GB of VRAM for inference and introduces architectural changes such as a single text encoder and fused transformer blocks.
AI News | Applications | Computer Vision
Baidu Releases ERNIE-4.5-VL-28B-A3B-Thinking: An Open-Source and Compact Multimodal Reasoning Model Under the ERNIE-4.5 Family
Baidu’s ERNIE-4.5-VL-28B-A3B-Thinking activates 3B parameters per token out of 30B total parameters, outperforming larger models on multimodal benchmarks.
AI News | Artificial Intelligence | Computer Vision
Spatial Supersensing as the Core Capability for Multimodal AI Systems
This article explores how spatial supersensing is emerging as a critical capability for multimodal AI systems, focusing on the Cambrian-S model and the VSI-SUPER benchmark for evaluating long-video spatial reasoning.