When Semantic Search Adds No Value
The Symptom
The team deploys hybrid search. Overall NDCG@5 improves from 0.77 to 0.82. The product manager celebrates. A week later, three support tickets arrive: developers searching for exact method names report that irrelevant conceptual documents now appear in their top results. The semantic arm of hybrid search injects documents that are “conceptually related” but factually wrong for exact-match queries.
The Internals
Semantic search maps text to a continuous vector space where proximity represents meaning similarity. This representation is fundamentally at odds with exact-match search patterns.
Embedding models are trained on natural language corpora. They learn that “configure,” “set up,” and “initialize” are semantically similar. For a user searching for HttpClient.setConnectionTimeout, the model produces a vector close to documents about “configuring HTTP connection parameters,” including documents about SocketFactory.setKeepAlive, RestTemplate.setReadTimeout, and WebClient.connectTimeout. All are “configuring HTTP connection parameters.” Only one is the method the user searched for.
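Proximity in embedding space is commonly measured with cosine similarity. The sketch below shows that computation only; the class and method names are hypothetical, and the text does not say which distance metric this system's vector index actually uses.

```java
final class CosineSimilarity {

    /** Cosine of the angle between two equal-length, non-zero vectors. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("A zero vector has no direction");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

The score depends only on the direction of the two vectors. Nothing in it rewards a shared exact token, which is why a query for one method can land next to documents about its conceptual neighbours.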
The failure modes of semantic search on technical documentation:
- Identifier confusion. `setConnectionTimeout` and `setReadTimeout` are semantically similar (both “set timeouts”) but functionally different methods in different classes. Semantic search conflates them.
- Version specificity. A search for “Spring Boot 3.2 migration guide” returns the 2.7 migration guide because both are “Spring Boot migration” in embedding space.
- Negation blindness. “How to disable SSL verification” and “How to enable SSL verification” have nearly identical embeddings because the embedding model focuses on “SSL verification,” not on the crucial “disable” vs “enable” distinction.
- Code search. A search for `@Transactional(readOnly = true)` produces a vector for “transactional read-only annotation.” The user wanted the exact annotation syntax. Lexical search finds it. Semantic search returns documents about transaction isolation levels.
The Implementation
Query-Type Detection for Selective Hybrid Search
```java
import java.util.regex.Pattern;

public class QueryClassifier {

    // A call such as foo.bar(...) or a Class.member reference
    private static final Pattern METHOD_NAME_PATTERN = Pattern.compile(
        "^[a-zA-Z_][a-zA-Z0-9_.]*\\(.*\\)$|^[a-zA-Z_][a-zA-Z0-9_]*\\.[a-zA-Z_][a-zA-Z0-9_]*$");

    // Lowercase key with at least one separator, so a plain word such as
    // "timeout" is not mistaken for a config key
    private static final Pattern CONFIG_KEY_PATTERN =
        Pattern.compile("^[a-z][a-z0-9]*(?:[._-][a-z0-9]+)+$");

    // Annotation with optional arguments, e.g. @Transactional(readOnly = true)
    private static final Pattern ANNOTATION_PATTERN =
        Pattern.compile("^@[A-Z][a-zA-Z0-9_]*(?:\\(.*\\))?$");

    public enum QueryType {
        EXACT_IDENTIFIER, // Method names, config keys, annotations
        NATURAL_LANGUAGE, // Concept queries, how-to questions
        ERROR_MESSAGE,    // Stack traces, error strings
        MIXED             // Contains both identifiers and natural language
    }

    public QueryType classify(String query) {
        String q = query.trim();
        if (METHOD_NAME_PATTERN.matcher(q).matches()
                || ANNOTATION_PATTERN.matcher(q).matches()
                || CONFIG_KEY_PATTERN.matcher(q).matches()) {
            return QueryType.EXACT_IDENTIFIER;
        }
        // Check error markers before the word-count rule: stack traces and
        // error strings usually contain four or more words.
        if (q.contains("Exception") || q.contains("Error")
                || q.contains("error:") || q.contains("failed")) {
            return QueryType.ERROR_MESSAGE;
        }
        String[] words = q.split("\\s+");
        if (words.length >= 4 && !containsIdentifier(words)) {
            return QueryType.NATURAL_LANGUAGE;
        }
        return QueryType.MIXED;
    }

    // The patterns are anchored with ^ and $, so each whitespace-separated
    // token is tested on its own rather than searching the whole query.
    private boolean containsIdentifier(String[] words) {
        for (String word : words) {
            if (METHOD_NAME_PATTERN.matcher(word).matches()
                    || ANNOTATION_PATTERN.matcher(word).matches()) {
                return true;
            }
        }
        return false;
    }
}
```
Adaptive Hybrid Search
```java
// HARDENED: Apply semantic search only when it adds value.
// These are methods of the search service; classifier, lexicalOnlySearch,
// hybridSearch, and hybridSearchWithCustomK are defined elsewhere.
public List<HybridResult> adaptiveSearch(String tenantId, String query, int k)
        throws Exception {
    QueryClassifier.QueryType queryType = classifier.classify(query);
    return switch (queryType) {
        case EXACT_IDENTIFIER -> lexicalOnlySearch(tenantId, query, k);
        case NATURAL_LANGUAGE -> hybridSearch(tenantId, query, k);
        case ERROR_MESSAGE -> lexicalOnlySearch(tenantId, query, k);
        case MIXED -> hybridSearchWithLexicalBias(tenantId, query, k);
    };
}

private List<HybridResult> hybridSearchWithLexicalBias(
        String tenantId, String query, int k) throws Exception {
    // RRF scores each arm as 1 / (constant + rank), so a larger constant
    // for the semantic arm reduces its influence on the final ranking.
    // Here k is the number of results; the last two arguments are the RRF
    // constants. Lexical: 60 (standard), Semantic: 120 (reduced influence)
    return hybridSearchWithCustomK(tenantId, query, k, 60, 120);
}
```
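The body of `hybridSearchWithCustomK` is not shown. As an illustration of how per-arm RRF constants could bias the fusion, here is a minimal sketch that works on plain lists of document IDs. The class name `WeightedRrf`, the method `fuse`, and the ID-based signature are assumptions made for this example, not the author's implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WeightedRrf {

    /**
     * Fuses two ranked lists of document IDs with Reciprocal Rank Fusion,
     * using a separate constant per arm. Each list is assumed to contain
     * every ID at most once, best result first.
     */
    static List<String> fuse(List<String> lexical, List<String> semantic,
                             int lexicalK, int semanticK, int topN) {
        Map<String, Double> scores = new HashMap<>();
        accumulate(scores, lexical, lexicalK);
        accumulate(scores, semantic, semanticK);

        List<String> fused = new ArrayList<>(scores.keySet());
        // Highest fused score first; ties broken by ID for determinism.
        fused.sort(Comparator
                .comparingDouble((String id) -> -scores.get(id))
                .thenComparing(Comparator.<String>naturalOrder()));
        return List.copyOf(fused.subList(0, Math.min(topN, fused.size())));
    }

    private static void accumulate(Map<String, Double> scores,
                                   List<String> ranking, int k) {
        for (int i = 0; i < ranking.size(); i++) {
            int rank = i + 1; // ranks are 1-based
            scores.merge(ranking.get(i), 1.0 / (k + rank), Double::sum);
        }
    }
}
```

Take a lexical ranking `[exact, related]` and a semantic ranking `[related, other, exact]`. With both constants at 60, `related` scores 1/62 + 1/61 and edges out `exact` at 1/61 + 1/63. Raising the semantic constant to 120 changes the sums to 1/62 + 1/121 and 1/61 + 1/123, and `exact` moves to the top.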
The Measurement
NDCG@5 with adaptive hybrid search vs uniform hybrid search:
| Category | Lexical Only | Uniform Hybrid | Adaptive Hybrid |
|---|---|---|---|
| Method name | 0.89 | 0.87 | 0.89 |
| Concept | 0.71 | 0.77 | 0.77 |
| Error message | 0.72 | 0.71 | 0.72 |
| Config key | 0.81 | 0.79 | 0.81 |
| How-to | 0.65 | 0.78 | 0.78 |
| Overall | 0.77 | 0.82 | 0.83 |
Adaptive hybrid search routes exact-match queries to lexical-only search, preserving their high NDCG, while routing conceptual queries to hybrid search, capturing the semantic benefit. The overall NDCG@5 of 0.83 exceeds both uniform strategies.
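The NDCG@5 figures depend on how gain and the ideal ranking are defined, which the text does not specify. Below is a minimal sketch of one common formulation. It assumes linear gain, a log2 rank discount, and an ideal ranking built from the relevance grades passed in; the class and method names are hypothetical.

```java
import java.util.Arrays;

final class Ndcg {

    /**
     * NDCG@k for a single query. relevances[i] is the graded relevance of
     * the document returned at rank i + 1. The array should include every
     * judged document so that the ideal ordering is meaningful.
     */
    static double ndcgAtK(int[] relevances, int k) {
        int n = relevances.length;
        int[] sorted = relevances.clone();
        Arrays.sort(sorted); // ascending; read from the end for the ideal order

        int depth = Math.min(k, n);
        double dcg = 0.0;
        double idcg = 0.0;
        for (int i = 0; i < depth; i++) {
            double discount = Math.log(i + 2) / Math.log(2); // log2(rank + 1)
            dcg += relevances[i] / discount;
            idcg += sorted[n - 1 - i] / discount; // best remaining grade
        }
        // A query with no relevant documents contributes 0 by convention here.
        return idcg == 0.0 ? 0.0 : dcg / idcg;
    }
}
```

Under these assumptions, each per-category cell in the table would be the mean of this value over the queries in that category.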
The Decision Rule
Do not apply semantic search uniformly to all query types. Classify queries and route exact-match patterns (method names, configuration keys, annotations, error messages) to lexical-only search.
Measure the per-category NDCG impact of semantic search. If semantic search degrades a category by more than 0.02 NDCG without a corresponding improvement elsewhere, the category should be excluded from semantic scoring.
Evaluate the infrastructure cost of the embedding pipeline (GPU for inference, storage for vectors, operational complexity of keeping vectors synchronized) against the NDCG improvement. If the overall improvement is less than 0.03 NDCG and concentrated in a single query category, the infrastructure cost may not be justified.