Context Window — Glossary

Definition

The context window is the maximum number of tokens, input plus generated output, that a language model can process in a single interaction. Its size determines how much input a user, application, or retrieval-augmented generation (RAG) system can supply to the model: short windows (a few thousand tokens) accommodate brief queries and short documents; large windows (hundreds of thousands to several million tokens) accommodate long documents, codebases, and multi-document research workflows. Modern frontier models offer context windows ranging from about 100,000 tokens to several million, depending on provider and product tier. Effective use of large context windows is non-trivial: model performance can degrade as the context grows, and structuring content so the model can retrieve it from within the window is its own discipline.
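Because the window covers input and output together, any room reserved for the model's response reduces what is left for input. The arithmetic can be sketched as follows. This is a minimal illustration, not any provider's API: the function names and the 200,000-token window are hypothetical, and real limits and token counts should be taken from the provider's documentation and tokenizer.

```python
def input_budget(context_window: int, max_output_tokens: int) -> int:
    """Return the tokens left for input after reserving room for output.

    Both arguments are hypothetical example parameters; actual window
    sizes vary by provider and product tier.
    """
    if context_window <= 0 or max_output_tokens < 0:
        raise ValueError("context_window must be positive and max_output_tokens non-negative")
    # Input and generated output share the same window.
    return max(context_window - max_output_tokens, 0)


def fits(input_tokens: int, context_window: int, max_output_tokens: int) -> bool:
    """Check whether an input of the given token count fits the remaining budget."""
    return input_tokens <= input_budget(context_window, max_output_tokens)


# Assumed example: a 200,000-token window with 4,000 tokens reserved for output.
print(input_budget(200_000, 4_000))
print(fits(150_000, 200_000, 4_000))
print(fits(199_000, 200_000, 4_000))
```

In this sketch a 199,000-token input is rejected even though it is smaller than the window, because the reserved output space counts against the same limit.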

Why it matters for communications

Context window size is now part of the feature-comparison vocabulary in enterprise AI procurement and consumer AI press. Communication about AI product capability, particularly long-document analysis, codebase reasoning, and multi-document research, depends on accurate context-window claims. The discipline of structuring content so that it is retrieved effectively within large context windows overlaps significantly with the structural discipline of Generative Engine Optimization for RAG-based answer engines.

Related terms: Token · Retrieval-Augmented Generation · Tool Use · Long-context evaluation

Related entities: OpenAI · Anthropic · Google · Meta · Mistral · enterprise AI vendors

Primary sources: OpenAI, Anthropic, Google DeepMind, and Meta product documentation · academic literature on long-context performance and retrieval.