Have you ever noticed that the more detail you pack into a prompt, the less reliable the AI's output becomes? You carefully spell out the specification, listing every condition and edge case, and hallucinations actually increase.
A February 2026 paper from J.P. Morgan AI Research confirms this intuition with data from approximately 370,000 real-world queries.
What Triggers Hallucination: 17 Linguistic Features and a "Risk Landscape"
Watson et al. (2026) defined 17 query-level features known in classical linguistics to hinder human comprehension, then analyzed their correlation with hallucination rates across 369,837 real queries spanning 13 QA datasets.
The strongest risk-increasing factor was Lack of Specificity, with an odds ratio of 2.382. Queries missing concrete constraints (time, place, scope) showed markedly higher hallucination rates. Clause Complexity followed at OR 1.764, indicating that deeply nested subordinate clauses impose interpretive overhead on LLMs as well.
On the protective side, Answerability had the strongest effect at OR 0.331. Queries with clear, verifiable answers showed dramatically lower hallucination. Intention Grounding also reduced risk significantly: explicitly stating the desired operation ("summarize," "compare," "extract") directly improved output accuracy.
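As a refresher on how to read these numbers, an odds ratio for a binary query feature can be computed from a 2x2 contingency table: values above 1 mean the feature is associated with more hallucination, values below 1 mean it is protective. A minimal sketch with hypothetical counts (not figures from the paper):

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table of query outcomes:
    a = feature present, hallucinated
    b = feature present, not hallucinated
    c = feature absent,  hallucinated
    d = feature absent,  not hallucinated
    """
    return (a / b) / (c / d)

# Hypothetical counts for a vague-query feature.
# Odds of hallucination with the feature: 120/80 = 1.5
# Odds without it: 60/120 = 0.5, so OR = 3.0.
print(odds_ratio(120, 80, 60, 120))  # 3.0
```

An OR of 2.382 for Lack of Specificity therefore means the odds of hallucination were roughly 2.4 times higher when the query lacked concrete constraints, while Answerability's 0.331 means the odds dropped to about a third when the answer was clearly verifiable.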
Human Confusion and LLM Confusion Are Not the Same
One of the study's most striking findings is that several features known to confuse humans had minimal effect on LLMs.
Rare vocabulary, superlative expressions, and complex negation are well-documented sources of difficulty in human reading comprehension. Yet in this large-scale analysis, their association with LLM hallucination was weak.
This carries an important practical implication: specifications written to be thorough and clear for human readers are not necessarily good inputs for AI. Specs inherently tend toward high clause complexity, and striving for completeness invites excessive detail. The format that helps a human understand and the format that prevents LLM hallucination do not always align.
Three High-Impact, Low-Cost Improvements
The paper proposes three low-cost improvements derived directly from the strongest statistical signals:
- Eliminate ambiguity: Add explicit constraints such as time, location, and target entity
- State intent explicitly: Use specific operation verbs such as "summarize," "compare," "extract," "verify"
- Resolve polysemy upfront: When a word could have multiple meanings in context, disambiguate it before the LLM has to guess
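The three rules can be illustrated with a toy prompt rewrite. The before/after strings below are hypothetical examples constructed for this article, not queries from the paper:

```python
# Before: open-ended, no stated operation, ambiguous entity ("bank").
vague = "Tell me about the bank results."

# After: explicit operation verb, concrete time/entity constraints,
# and the polysemous word "bank" disambiguated up front.
improved = (
    "Summarize the Q3 2025 earnings results "          # intent + time scope
    "of Example Bank (the financial institution, "     # polysemy resolved
    "not a river bank) in three bullet points."        # target + output format
)

# Each of the three improvements is verifiably present in the rewrite.
checks = {
    "intent stated":    "Summarize" in improved,
    "time constrained": "Q3 2025" in improved,
    "polysemy resolved": "financial institution" in improved,
}
print(all(checks.values()))  # True
```

Nothing about the rewrite is clever; it simply front-loads the constraints and the operation verb so the model never has to guess what is being asked.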
These improvements are particularly effective for short prompts. Short, open-ended instructions show the widest gap between high-risk and low-risk outcomes.
What "Providing Rationale" Means
These findings connect directly to context design in AI-assisted development.
The evidence that Intention Grounding and Answerability reduce hallucination aligns with the principle that rationale outperforms bare instruction for stabilizing LLM reasoning. sqlew delivers Decisions and Constraints to AI individually, each accompanied by the rationale behind it, preventing the kind of confusion this research quantifies.
Rather than adding more instructions, provide information that is short, intent-clear, and answerable. Even in 2026, prompt quality remains a decisive factor in output quality.
References
- Watson, W., Cho, N., Ganesh, S., & Veloso, M. (2026). "What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance." arXiv:2602.20300. https://doi.org/10.48550/arXiv.2602.20300