AI Infrastructure
183 articles in this category (Page 3 of 8)
Building an AI-Powered File Type Detection and Security Pipeline with Magika and OpenAI
Learn to integrate Google's Magika deep-learning file detection with OpenAI's GPT-4o to identify over 100 file labels and detect spoofed extensions by analyzing file content at the byte level.
Building Multi-Agent Systems with SmolAgents: Code Execution and Dynamic Orchestration
Learn to build production-ready multi-agent systems using SmolAgents v1.24.0, featuring Python-based code execution and dynamic tool management for complex reasoning tasks.
Advanced Web Scraping with Crawl4AI: Markdown Generation, JS Execution, and Structured LLM Extraction
Learn to implement Crawl4AI v0.8.x for advanced web crawling, featuring JavaScript execution and LLM-based structured data extraction from unstructured HTML.
Microsoft (MSFT) 21-Day Outlook: Oversold Technicals Clash with AI CapEx Concerns Ahead of Q3 Earnings
Despite a 25% YTD decline and mixed sentiment, MSFT's oversold RSI and strong fundamentals suggest a potential rebound heading into its April 29 earnings catalyst.
NVIDIA KVPress: Optimizing Long-Context LLM Inference with KV Cache Compression
NVIDIA’s KVPress framework enables memory-efficient LLM inference by pruning key-value pairs from the KV cache at compression ratios of up to 0.7, significantly reducing GPU memory overhead for long-context tasks.
Five AI Compute Architectures Every Engineer Should Know: CPUs, GPUs, TPUs, NPUs, and LPUs Compared
Understand the trade-offs between AI compute architectures, including Groq’s LPU, which achieves 10x higher energy efficiency than traditional systems for LLM inference.
Optimizing Deep Learning Workflows with NVIDIA Transformer Engine: FP8 and Mixed Precision Implementation
Learn to implement NVIDIA Transformer Engine with FP8 precision to accelerate training while maintaining accuracy through a robust fallback-enabled workflow.