SuperCompress Hits PyPI: 65% Token Savings With 100% LLM Answer Recall

SuperCompress is now on PyPI: install it in one line with pip install supercompress

Arjun Shah has released SuperCompress, a lightweight open-source prompt compressor, on PyPI. It reduces LLM prompt tokens by 65% on average while maintaining perfect answer recall: the line containing the answer is never dropped.

Why This Matters

LLM API costs scale linearly with prompt token count, so long prompts become expensive in high-volume applications. An ideal compressor would distill context without losing anything, but real compression models often drop critical information. SuperCompress addresses this with a tiny CPU policy that achieves 65% compression while guaranteeing no answer line is lost, cutting per-query costs at ~60ms of added latency.
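To make the linear cost relationship concrete, here is a minimal sketch of the arithmetic. The query volume, prompt size, and price per million tokens are hypothetical placeholders, not figures from SuperCompress or any provider; only the 65% reduction comes from the announcement.

```python
def monthly_prompt_cost(queries, prompt_tokens, usd_per_million_tokens, compression=0.0):
    """Prompt-side cost in USD; `compression` is the fraction of tokens removed."""
    if not 0.0 <= compression < 1.0:
        raise ValueError("compression must be in [0, 1)")
    kept_tokens = prompt_tokens * (1.0 - compression)
    return queries * kept_tokens * usd_per_million_tokens / 1_000_000


# Hypothetical workload: 1M queries/month, 4,000 prompt tokens each,
# at an assumed price of $3.00 per million input tokens.
baseline = monthly_prompt_cost(1_000_000, 4_000, 3.00)
compressed = monthly_prompt_cost(1_000_000, 4_000, 3.00, compression=0.65)

print(f"baseline:   ${baseline:,.2f}")
print(f"compressed: ${compressed:,.2f}")
print(f"saved:      ${baseline - compressed:,.2f}")
```

Because cost is linear in token count, the fraction saved equals the compression ratio regardless of the price or volume you plug in. Output tokens are not affected by prompt compression and are left out of this estimate.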

Key Insights

SuperCompress uses a ~5K parameter CPU policy to score each line of context for relevance, requiring no GPU (2026).
It achieves 65% fewer tokens at 100% oracle recall, so critical answer lines are never dropped (2026).
It runs in ~60ms on CPU, making it accessible for cost-sensitive deployments (2026).
It is released on PyPI under an MIT license with a non-commercial clause, alongside a live comparison demo (2026).
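The idea of scoring each context line and keeping only the most relevant ones can be sketched as follows. This is not SuperCompress's learned policy: the word-overlap scorer, the `keep_ratio` parameter, and the `answer_recall` helper are all hypothetical stand-ins chosen to illustrate line-level selection and one plausible reading of "oracle recall" (the share of known answer lines that survive compression).

```python
import re


def tokenize(text):
    """Lowercased set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score_line(line, question):
    """Hypothetical relevance score: fraction of question words found in the line."""
    q_words = tokenize(question)
    if not q_words:
        return 0.0
    return len(q_words & tokenize(line)) / len(q_words)


def select_lines(context, question, keep_ratio=0.35):
    """Keep the top-scoring lines, preserving their original order."""
    lines = [ln for ln in context.splitlines() if ln.strip()]
    if not lines:
        return []
    k = max(1, round(len(lines) * keep_ratio))
    ranked = sorted(range(len(lines)),
                    key=lambda i: score_line(lines[i], question),
                    reverse=True)
    return [lines[i] for i in sorted(ranked[:k])]


def answer_recall(kept_lines, answer_lines):
    """Fraction of known answer lines that survived selection."""
    if not answer_lines:
        return 1.0
    return sum(1 for a in answer_lines if a in kept_lines) / len(answer_lines)


context = "\n".join([
    "The office opens at nine on weekdays.",
    "Parking is available behind the building.",
    "The capital of France is Paris.",
    "Lunch is served in the cafeteria.",
    "Visitors must sign in at the front desk.",
    "The gift shop closes early on Sundays.",
])
question = "What is the capital of France?"

kept = select_lines(context, question)
recall = answer_recall(kept, ["The capital of France is Paris."])
print(kept, recall)
```

A keep ratio of 0.35 mirrors the reported 65% reduction, but it is counted in lines here, not tokens. A trained scorer would replace `score_line`; the selection and recall bookkeeping would stay the same.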

Working Examples

Install and use SuperCompress to reduce prompt tokens by ~65% while preserving answer accuracy.

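pip install supercompress

from supercompress import compress

# Sample inputs for illustration: context is the source text, question is the query to answer
context = "The office opens at nine.\nThe capital of France is Paris.\nLunch is at noon."
question = "What is the capital of France?"
result = compress(context, question)
print(f"Saved {result['kv_savings_pct']}% tokens")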

Practical Applications

Use case: Developers reduce LLM API costs by trimming irrelevant context before sending prompts, cutting token usage by 65% without quality loss.
Pitfall: Blindly compressing all prompts may remove contextual nuance, but SuperCompress’s 100% oracle recall guarantees the answer line stays intact.
Use case: Teams deploy the ~5K parameter model on CPU-only infrastructure to compress prompts in ~60ms, enabling real-time preprocessing.
Pitfall: Over-reliance on compression without tuning could fail for multi-step reasoning tasks, though the tool is designed for direct question-answering scenarios.
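One way to act on the pitfalls above is to gate compression behind a simple check instead of applying it to every prompt. The sketch below is an assumption-laden illustration: `maybe_compress`, the `task` labels, and the `min_lines` threshold are hypothetical, and the compressor is passed in as a callable so the example runs without the package installed. In real use you would pass `supercompress.compress`, called as `compress(context, question)` as in the working example.

```python
def maybe_compress(context, question, compress_fn, *, task="qa", min_lines=20):
    """Return compress_fn's result, or None when compression is skipped.

    Skips compression for anything other than direct question answering
    and for contexts too short to be worth trimming.
    """
    if task != "qa":
        return None  # e.g. multi-step reasoning: send the full context
    if len(context.splitlines()) < min_lines:
        return None  # little to gain on short prompts
    return compress_fn(context, question)


# Stand-in compressor for demonstration only; it is NOT the real library.
def fake_compress(context, question):
    return {"kv_savings_pct": 65.0}


long_context = "\n".join(f"line {i}" for i in range(50))

print(maybe_compress(long_context, "What is on line 7?", fake_compress))
print(maybe_compress("short context", "What is this?", fake_compress))
print(maybe_compress(long_context, "Plan the steps.", fake_compress, task="multi_step"))
```

Callers treat a `None` result as "send the original prompt unchanged", which keeps the fallback path explicit.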
