Benchmarking Local Entity Extraction: 2B Parameter Models for Personal Knowledge Graphs
These articles are AI-generated summaries. Please check the original sources for full details.
3. Entity extraction with a 2B model: benchmarks from a personal knowledge graph
The qwen3-vl:2b-instruct-q4_K_M model was benchmarked for local entity extraction across personal notes, emails, and photos. It achieved an F1 score of 0.87 for person names while running entirely on a local CPU via Ollama with only 2GB of RAM.
Why This Matters
While cloud models such as GPT-4 are a common default, a personal knowledge graph running on local hardware has to fit within 2GB of RAM. The benchmark suggests that smaller, 4-bit quantized models can reach usable accuracy for core entities like people and places, even in cases where the model extracts more than the human annotators recorded in the ground truth. This shifts the engineering challenge from model size to downstream entity resolution, and to reconciling subjective human-annotated topics with the model’s concrete extractions.
Key Insights
- Person extraction with a 0.87 F1 score using qwen3-vl (2026).
- Zero JSON parse errors across 25 benchmark cases, which matters for automated pipelines that consume the output directly.
- Bilingual extraction that handles Spanish and English natively without explicit language switching.
- Embedding-based matching using qwen3-embedding (1024 dimensions) with a 0.75 similarity threshold to normalize entity variants.
- Two-stage vision pipeline that generates image descriptions before extracting structured entities from the text.
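The embedding-based matching in the list above can be sketched roughly as follows. This is a minimal illustration, not the benchmark's code: the summary does not say which similarity metric was used, so cosine similarity is an assumption, and the function names and toy 3-dimensional vectors (standing in for 1024-dimensional qwen3-embedding vectors) are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def match_entity(candidate_vec, known, threshold=0.75):
    """Return the most similar known entity name, or None if nothing
    reaches the threshold (0.75 is the value quoted in the article)."""
    best_name, best_score = None, threshold
    for name, vec in known.items():
        score = cosine_similarity(candidate_vec, vec)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical toy vectors; real embeddings would come from the embedding model.
known_entities = {
    "Ana García": [0.9, 0.1, 0.0],
    "Madrid": [0.0, 0.2, 0.9],
}
variant = [0.8, 0.3, 0.1]    # e.g. a spelling variant of a known person
unrelated = [0.1, 0.9, 0.1]  # an entity with no close match

print(match_entity(variant, known_entities))
print(match_entity(unrelated, known_entities))
```

With a threshold this high, an unmatched candidate would presumably be stored as a new entity rather than merged, which keeps false merges rare at the cost of some duplicates.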
Working Examples
The following prompt was used for zero-shot entity extraction across text and vision descriptions:
Extract named entities from the following text. Return ONLY a JSON object:
- persons: array of person names mentioned
- projects: array of project/product names mentioned
- locations: array of place names mentioned
- topics: array of key topics/themes (max 3)
Rules:
- Only extract what is EXPLICITLY mentioned
- Do not invent or infer entities not present
- Normalize names (capitalize properly)
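One way to wire this prompt into a pipeline is sketched below. It is a hedged example: the `Text:` suffix, the helper names, and the validation rules are assumptions, since the summary does not show how the input text is appended or how the output is checked. The request dictionary follows the general shape of Ollama's `/api/generate` payload; the HTTP call itself is left out so the block runs without a server.

```python
import json

# The prompt from the article; the trailing "Text:" section is an assumption.
PROMPT_TEMPLATE = """Extract named entities from the following text. Return ONLY a JSON object:
- persons: array of person names mentioned
- projects: array of project/product names mentioned
- locations: array of place names mentioned
- topics: array of key topics/themes (max 3)
Rules:
- Only extract what is EXPLICITLY mentioned
- Do not invent or infer entities not present
- Normalize names (capitalize properly)

Text:
{text}"""

EXPECTED_KEYS = ("persons", "projects", "locations", "topics")

def build_request(text, model="qwen3-vl:2b-instruct-q4_K_M"):
    """Build a request body for a local model server (hypothetical helper)."""
    return {
        "model": model,
        "prompt": PROMPT_TEMPLATE.format(text=text),
        "stream": False,
        "format": "json",  # ask the server to constrain output to JSON
    }

def parse_entities(raw):
    """Parse the model's JSON reply and coerce it to the expected shape."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    result = {}
    for key in EXPECTED_KEYS:
        value = data.get(key, [])
        if not isinstance(value, list):
            value = []
        result[key] = [str(v).strip() for v in value if str(v).strip()]
    result["topics"] = result["topics"][:3]  # enforce the "max 3" rule
    return result

# A made-up reply, standing in for real model output.
sample_reply = '{"persons": ["Ana García"], "locations": ["Madrid"], "topics": ["a", "b", "c", "d"]}'
print(parse_entities(sample_reply))
```

For photos, the same functions would be applied to the image description produced by the first stage of the vision pipeline rather than to the raw image.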
Practical Applications
- Personal Knowledge Graph: Automatically discovering connections between ‘Ana García’ in emails and meeting notes with ~3s latency. Pitfall: Treating nicknames like ‘Pepe’ as distinct entities from full names without a downstream resolution system.
- Local Multimodal Indexing: Identifying specific objects like ‘olive trees’ or ‘laser levels’ in photos for searchable image databases. Pitfall: Generic descriptions like ‘four people’ being misclassified as person entities, lowering precision metrics.
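The precision pitfall above can be made concrete with a small scoring sketch. The summary does not describe how the benchmark scored matches, so exact matching after case folding is an assumption, and the entity lists are invented for illustration.

```python
def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F1 using case-insensitive exact match."""
    pred = {p.casefold() for p in predicted}
    ref = {g.casefold() for g in gold}
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical annotations for one photo description.
gold = ["Ana García", "Pepe"]
clean = ["Ana García", "Pepe"]
noisy = ["Ana García", "Pepe", "four people"]  # generic phrase counted as a person

print(precision_recall_f1(clean, gold))
print(precision_recall_f1(noisy, gold))
```

In this toy case the single generic phrase leaves recall at 1.0 but drops precision to about 0.67 and F1 to about 0.8, which is why filtering generic descriptions before scoring or indexing may be worthwhile.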