Benchmarking Local Entity Extraction: 2B Parameter Models for Personal Knowledge Graphs

Entity extraction with a 2B model: benchmarks from a personal knowledge graph

The qwen3-vl:2b-instruct-q4_K_M model was benchmarked for local entity extraction across personal notes, emails, and photos. It achieved a 0.87 F1 score for person names while running entirely on a local CPU via Ollama with only 2GB of RAM.
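For readers unfamiliar with the metric, here is a minimal sketch of how a per-type F1 score can be computed from predicted and annotated entity sets. It assumes exact string matching, which may differ from the matching the benchmark actually used, and the sample names are illustrative only, not benchmark data.

```python
def f1_score(predicted: set[str], expected: set[str]) -> tuple[float, float, float]:
    """Return (precision, recall, F1) for one entity type using exact set matching."""
    true_pos = len(predicted & expected)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Illustrative data only, not taken from the benchmark.
predicted = {"Ana García", "Pepe", "four people"}  # what the model returned
expected = {"Ana García", "Pepe", "Luis"}          # what the annotator marked
p, r, f1 = f1_score(predicted, expected)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

One false positive and one missed name out of three each give precision and recall of 2/3, so a single generic description in the output is enough to pull the score down noticeably on small test sets.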

Why This Matters

While industry practice often defaults to large cloud models like GPT-4, a personal knowledge graph running on local hardware has to be efficient enough to fit within 2GB of RAM. In practice, smaller 4-bit quantized models can achieve production-ready accuracy for core entities like people and places, even though the model is sometimes more thorough than the human ground truth and extracts entities the annotators left out. This shifts the engineering challenge from model size to downstream entity resolution, and to reconciling subjective human-annotated topics with the model’s concrete extractions.

Key Insights

High-accuracy person extraction with 0.87 F1 score using qwen3-vl (2026).
Zero JSON parse errors across 25 benchmark cases, ensuring high reliability for automated pipelines.
Bilingual extraction capabilities handling Spanish and English natively without explicit language switching.
Embedding-based matching using qwen3-embedding (1024d) with a 0.75 threshold to normalize entity variants.
Two-stage vision pipeline that generates image descriptions before extracting structured entities from the text.
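The embedding-based matching mentioned above can be sketched as a cosine-similarity lookup against known entities. This is a minimal illustration, not the article's implementation: it assumes cosine similarity is the metric behind the 0.75 threshold, and it uses toy 3-dimensional vectors in place of real 1024-dimensional qwen3-embedding output. The names `match_entity` and `known` are hypothetical.

```python
import math

SIMILARITY_THRESHOLD = 0.75  # threshold reported in the article


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_entity(candidate_vec, known, threshold=SIMILARITY_THRESHOLD):
    """Return the best-matching canonical name, or None if nothing clears the threshold."""
    best_name, best_score = None, threshold
    for name, vec in known.items():
        score = cosine_similarity(candidate_vec, vec)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy vectors standing in for real embeddings (assumption for illustration).
known = {"Ana García": [0.9, 0.1, 0.0], "Madrid": [0.0, 0.2, 0.9]}
variant = [0.8, 0.3, 0.1]  # e.g. the embedding of a spelling variant
print(match_entity(variant, known))
```

Returning `None` below the threshold lets the caller create a new entity instead of forcing a merge, which matters when two distinct people share similar names.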

Working Examples

The prompt used for zero-shot entity extraction across text and vision descriptions:

Extract named entities from the following text. Return ONLY a JSON object:
- persons: array of person names mentioned
- projects: array of project/product names mentioned
- locations: array of place names mentioned
- topics: array of key topics/themes (max 3)
Rules:
- Only extract what is EXPLICITLY mentioned
- Do not invent or infer entities not present
- Normalize names (capitalize properly)
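One way to send this prompt to a local model is through Ollama's HTTP `/api/generate` endpoint, sketched below with only the standard library. Several details are assumptions rather than the article's code: the default endpoint `http://localhost:11434`, the `Text:` separator appended after the rules, the use of `format: "json"`, and the helper names `build_prompt`, `parse_entities`, and `extract_entities`.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint (assumed)
MODEL = "qwen3-vl:2b-instruct-q4_K_M"
ENTITY_KEYS = ("persons", "projects", "locations", "topics")

PROMPT_HEADER = """Extract named entities from the following text. Return ONLY a JSON object:
- persons: array of person names mentioned
- projects: array of project/product names mentioned
- locations: array of place names mentioned
- topics: array of key topics/themes (max 3)
Rules:
- Only extract what is EXPLICITLY mentioned
- Do not invent or infer entities not present
- Normalize names (capitalize properly)"""


def build_prompt(text: str) -> str:
    # The "Text:" separator is an assumption; the article does not show this part.
    return f"{PROMPT_HEADER}\n\nText:\n{text}"


def parse_entities(raw: str) -> dict[str, list[str]]:
    """Parse the model reply into the four expected lists; raises ValueError on malformed JSON."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    result = {}
    for key in ENTITY_KEYS:
        values = data.get(key, [])
        if not isinstance(values, list):
            values = []  # tolerate a missing or mistyped field
        result[key] = [str(v).strip() for v in values if str(v).strip()]
    result["topics"] = result["topics"][:3]  # enforce the "max 3" rule locally
    return result


def extract_entities(text: str) -> dict[str, list[str]]:
    """Call a locally running Ollama server and return parsed entities."""
    payload = json.dumps(
        {"model": MODEL, "prompt": build_prompt(text), "stream": False, "format": "json"}
    ).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return parse_entities(body["response"])


# Parsing a sample reply without contacting the server:
sample_reply = '{"persons": ["Ana García"], "locations": ["Madrid"], "topics": ["budget"]}'
print(parse_entities(sample_reply))
```

Validating the reply locally keeps the pipeline robust even if a future model version returns a missing key or more than three topics.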

Practical Applications

Personal Knowledge Graph: Automatically discovering connections between ‘Ana García’ in emails and meeting notes with ~3s latency. Pitfall: Treating nicknames like ‘Pepe’ as distinct entities from full names without a downstream resolution system.
Local Multimodal Indexing: Identifying specific objects like ‘olive trees’ or ‘laser levels’ in photos for searchable image databases. Pitfall: Generic descriptions like ‘four people’ being misclassified as person entities, lowering precision metrics.
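The 'four people' pitfall can be reduced with a post-processing filter on the persons list. The sketch below is a hypothetical heuristic, not something described in the article: it drops strings that consist of an optional count word followed by a generic noun, and the word lists are illustrative and English-only.

```python
import re

# Hypothetical heuristic: an optional count word or digit followed by a generic noun.
_COUNT = r"(?:\d+|a|an|one|two|three|four|five|six|seven|eight|nine|ten|several|some|many)"
_GENERIC_NOUN = r"(?:people|persons?|men|man|women|woman|children|child|kids?|boys?|girls?|adults?)"
GENERIC_PERSON = re.compile(rf"^(?:{_COUNT}\s+)?{_GENERIC_NOUN}$", re.IGNORECASE)


def filter_person_entities(persons):
    """Keep only entries that do not look like generic descriptions of people."""
    return [p for p in persons if not GENERIC_PERSON.match(p.strip())]


print(filter_person_entities(["Ana García", "four people", "Pepe", "a man"]))
# prints ['Ana García', 'Pepe']
```

A bilingual pipeline would need equivalent Spanish patterns, and a stricter variant could also reject entries with no capitalized token.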

References:

https://dev.to/micelclaw/entity-extraction-with-a-2b-model-benchmarks-from-a-personal-knowledge-graph-2f27

