Why Intent Prediction Needs More Than an LLM: A Behavioral AI Perspective
These articles are AI-generated summaries. Please check the original sources for full details.
Frank Portman, CTO of behavioral AI company Yobi, argues that LLMs are fundamentally unsuited for intent prediction. Unlike next-token generation, predicting future behavior requires handling a high-cardinality token vocabulary (three orders of magnitude larger than a language model's) and working with proprietary, sensitive data.
Why This Matters
LLMs excel at synthesizing information within a context, such as writing code or composing rap in Shakespearean style, but their inductive bias, predicting the next token in a sequence, is not designed for decision-making under uncertainty. Intent prediction demands forecasting expected value from sparse context (e.g., a single row of ad bid data), which pure language models cannot provide. The stakes are high: Yobi’s production system serves millions of queries per second for ad auctions, where even a slight latency increase or an incorrect prediction can mean wasted spend or missed revenue.
Key Insights
- LLM vocabularies hold roughly 300k–500k base tokens; behavioral token vocabularies are three orders of magnitude larger (Yobi, 2026).
- The transformer architecture from ‘Attention Is All You Need’ is part of the stack, but graph neural networks handle anonymous identifiers (Yobi stack).
- ‘Inductive vs transductive’: The user side changes rapidly, so handling new, previously unseen behaviors requires inductive architectures (Yobi).
- ‘Pre-compute as much as possible’: Yobi caches feature requests, trading memory for lower inference latency at millions of queries per second.
- ‘Batching across stack’: Processing multiple requests together reduces per-request cost significantly at scale.
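The pre-compute and batching points above can be illustrated with a minimal Python sketch. It is not Yobi's implementation: the in-memory `FEATURE_STORE`, the `WEIGHTS` vector, and the function names are all hypothetical, and the linear score stands in for whatever model a real system would call.

```python
from functools import lru_cache
from typing import List, Sequence, Tuple

# Hypothetical stand-in for an expensive feature lookup (e.g. a remote feature store).
FEATURE_STORE = {"u1": (0.2, 0.7), "u2": (0.9, 0.1), "u3": (0.5, 0.5)}

# Illustrative weights for a toy linear scorer; a real system would call a trained model.
WEIGHTS = (0.6, 0.4)


@lru_cache(maxsize=100_000)
def get_features(entity_id: str) -> Tuple[float, float]:
    """Return features for an entity, caching results to trade memory for latency."""
    return FEATURE_STORE.get(entity_id, (0.0, 0.0))


def score_batch(entity_ids: Sequence[str]) -> List[float]:
    """Score several requests in one call so fixed per-call overhead is shared."""
    rows = [get_features(entity_id) for entity_id in entity_ids]
    return [sum(w * x for w, x in zip(WEIGHTS, row)) for row in rows]


if __name__ == "__main__":
    scores = score_batch(["u1", "u2", "u1"])
    print(scores)
    # The second "u1" request is served from the cache rather than the store.
    print(get_features.cache_info())
```

In a production setting the batch would more likely pass through a single vectorized model call; the list comprehension here only shows the shape of the interface.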
Practical Applications
- (Use case) Ad tech: Yobi predicts expected value per impression to decide which creative to show in real-time auctions.
- (Use case) Marketing: Engage existing customers via email/SMS with personalized product recommendations using a fine-tuned foundation model.
- (Pitfall) Heuristic walls: Maintaining walls of if-statements becomes harder than training a model and accrues tech debt (a pre-LLM-era anti-pattern).
- (Pitfall) Chat as default interface: Not all decisions benefit from conversational interaction; agents may need tool-based decisions instead.
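As a rough illustration of the ad tech use case above, choosing a creative by expected value might look like the following sketch. It assumes expected value is the predicted conversion probability multiplied by a payoff per conversion; the source does not specify how Yobi computes it, and the `Creative` class, its fields, and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Creative:
    name: str
    value_if_converted: float  # hypothetical payoff per conversion


def expected_value(p_convert: float, creative: Creative) -> float:
    """Assumed definition: predicted conversion probability times payoff."""
    return p_convert * creative.value_if_converted


def pick_creative(predictions: Dict[Creative, float]) -> Tuple[Creative, float]:
    """Return the creative with the highest expected value, and that value."""
    scored = ((c, expected_value(p, c)) for c, p in predictions.items())
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    banner = Creative("banner", value_if_converted=2.0)
    video = Creative("video", value_if_converted=10.0)
    # Probabilities would come from the behavioral model; these are made up.
    best, ev = pick_creative({banner: 0.30, video: 0.05})
    print(best.name, ev)
```

The point of the sketch is that the decision is a numeric comparison over predictions, not a generated sentence, which is why a chat interface is not the natural fit here.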