Tokens & Context Windows

Learn about tokens, context windows, and why they determine what AI can and cannot do.

AI models don't read words; they read tokens. A token is a chunk of text, typically 3-4 characters of English. The exact split depends on the model's tokenizer: the word "hamburger" might become three tokens ("ham", "bur", "ger"), while a common word like "the" is a single token. Understanding tokens matters because cost, speed, and the amount of text a model can process are all measured in them.

  • English averages about 0.75 words per token, or roughly 4 characters per token
  • A typical page of text (roughly 250-300 words) is about 300-400 tokens
  • Code is more token-dense — a line of code may use 10-20 tokens
  • Non-English languages often use more tokens per word
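
As a rough illustration of these rules of thumb, the sketch below estimates a token count from character and word counts. The function name and the choice to average the two heuristics are assumptions made for this example; an exact count requires the tokenizer of the specific model you are using.

```python
import math

# Rule-of-thumb ratios for English text; real tokenizers will differ.
CHARS_PER_TOKEN = 4
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    """Estimate the token count of English text from two rough heuristics."""
    by_chars = len(text) / CHARS_PER_TOKEN
    by_words = len(text.split()) / WORDS_PER_TOKEN
    # Average the two estimates and round up so budgets err on the safe side.
    return math.ceil((by_chars + by_words) / 2)
```

Treat the result as a planning estimate only. Code and non-English text will usually come out higher than this heuristic suggests.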

The context window is the total amount of text the model can "see" at once: your prompt, any earlier conversation, and the model's response. Think of it as the model's working memory. Once a conversation exceeds the context window, the model literally cannot see the earliest parts of it.
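
One common way applications cope with this limit is to drop the oldest messages first. The sketch below shows that idea under simplifying assumptions: messages are plain strings, token counts come from the rough 4-characters-per-token heuristic, and the names `trim_history` and `rough_token_count` are hypothetical helpers, not part of any particular API.

```python
import math


def rough_token_count(text: str) -> int:
    """Approximate tokens as characters / 4 (a heuristic, not a real tokenizer)."""
    return math.ceil(len(text) / 4)


def trim_history(messages: list[str], max_tokens: int, reserve_for_reply: int) -> list[str]:
    """Keep the most recent messages that fit within the context budget."""
    budget = max_tokens - reserve_for_reply
    kept: list[str] = []
    used = 0
    # Walk backwards so the newest messages are kept first.
    for message in reversed(messages):
        cost = rough_token_count(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # Restore chronological order.
    return kept
```

Reserving part of the window for the reply matters because the response counts against the same limit as the prompt.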

Context window sizes vary dramatically across models:

  • GPT-4o: 128K tokens (~96,000 words, about a full novel)
  • Claude 3.5 Sonnet: 200K tokens (~150,000 words)
  • Gemini 1.5 Pro: 1M tokens (~750,000 words)
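
To make these sizes concrete, here is a small sketch that checks which of the listed models could hold a document of a given length. It uses the window sizes quoted above and the 0.75 words-per-token rule of thumb; the function name `models_that_fit` is made up for this example, and actual limits should be confirmed in each provider's documentation.

```python
import math

# Context window sizes as quoted in this lesson (in tokens).
CONTEXT_WINDOWS = {
    "GPT-4o": 128_000,
    "Claude 3.5 Sonnet": 200_000,
    "Gemini 1.5 Pro": 1_000_000,
}
WORDS_PER_TOKEN = 0.75  # Rule of thumb for English text.


def models_that_fit(word_count: int, reply_tokens: int = 0) -> list[str]:
    """Return the models whose window can hold the document plus the reply."""
    needed = math.ceil(word_count / WORDS_PER_TOKEN) + reply_tokens
    return [name for name, window in CONTEXT_WINDOWS.items() if needed <= window]
```

For example, a 120,000-word manuscript needs roughly 160,000 tokens, which rules out the 128K window before any room is left for a response.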

Bigger context windows don't always mean better results. Models tend to pay more attention to the beginning and end of the context, sometimes losing information in the middle. This is called the "lost in the middle" effect.

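Every token costs money when using AI APIs, and longer prompts take longer to process. Being concise isn't just about clarity; it's about cost and speed. A prompt that uses 500 tokens instead of 2,000 costs a quarter as much in input tokens and is processed faster. Output tokens are billed as well, so the saving on the whole request depends on how long the response is.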
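
The arithmetic can be sketched as follows. The prices used in the usage lines are placeholders invented for illustration, not any provider's real rates, and `request_cost` is a hypothetical helper.

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Cost of one request when prices are quoted per million tokens."""
    total = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    )
    return total / 1_000_000


# Hypothetical prices (dollars per million tokens), for illustration only.
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00

long_prompt = request_cost(2_000, 300, INPUT_PRICE, OUTPUT_PRICE)
short_prompt = request_cost(500, 300, INPUT_PRICE, OUTPUT_PRICE)
print(f"long: ${long_prompt:.4f}  short: ${short_prompt:.4f}")
```

With these made-up prices, the input portion is exactly four times cheaper for the shorter prompt, but the total saving is smaller because the 300 output tokens cost the same in both cases.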

  1. Put the most important information at the beginning and end of your prompt
  2. Remove filler words and redundant instructions
  3. For long documents, summarize or extract key sections before sending to the model
  4. Track your token usage — most API dashboards show this
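
Tips 1 and 3 can be combined in a few lines. The sketch below keeps only the paragraphs that mention a keyword, then builds a prompt that states the instruction first and restates it last. Keyword matching is a deliberately crude stand-in for real extraction or summarization, and both function names are hypothetical.

```python
def select_paragraphs(document: str, keywords: list[str]) -> str:
    """Keep only paragraphs that mention at least one keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]
    relevant = [p for p in paragraphs if any(k in p.lower() for k in lowered)]
    return "\n\n".join(relevant)


def build_prompt(instruction: str, document: str) -> str:
    """Place the instruction first and restate it last, with the document between."""
    instruction = instruction.strip()
    return "\n\n".join([
        instruction,
        "Document:\n" + document.strip(),
        "Reminder: " + instruction,
    ])
```

Restating the instruction adds a few tokens, so this trade-off is most useful when the document in the middle is long.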

Prompt Templates

Document Summarizer (Token-Efficient)

Extracts key information while keeping token usage low.

Summarize this document in under 200 words, focusing on: [SPECIFIC ASPECT]. Use bullet points for key facts. Skip background information I already know.

Document:
[PASTE TEXT]

Long Document Analyzer

Efficiently analyzes specific parts of long documents.

I'm going to give you a long document. Focus on these sections specifically:
1. [SECTION/TOPIC 1]
2. [SECTION/TOPIC 2]

For each, extract: the main claim, supporting evidence, and any caveats. Ignore everything else.

Document:
[PASTE TEXT]
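
If you reuse templates like these in a script, the bracketed placeholders can be filled programmatically. This is a minimal sketch, assuming placeholders are written in capitals inside square brackets as above; `fill_template` is a hypothetical helper, and it raises an error rather than sending a prompt with a blank left in it.

```python
import re

# Matches placeholders such as [SPECIFIC ASPECT] or [SECTION/TOPIC 1].
PLACEHOLDER = re.compile(r"\[([A-Z0-9/ ]+)\]")

SUMMARIZER_TEMPLATE = (
    "Summarize this document in under 200 words, focusing on: [SPECIFIC ASPECT]. "
    "Use bullet points for key facts. Skip background information I already know.\n\n"
    "Document:\n[PASTE TEXT]"
)


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace each [PLACEHOLDER] with its value in a single pass."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"No value supplied for placeholder [{name}]")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


prompt = fill_template(
    SUMMARIZER_TEMPLATE,
    {"SPECIFIC ASPECT": "delivery risks", "PASTE TEXT": "Some text."},
)
```

Replacing in a single pass means bracketed text inside the pasted document is left untouched.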

Test Your Knowledge

Approximately how many tokens does the average English word use?

Key Takeaways

  • Tokens are the fundamental unit of AI text processing — typically 3-4 characters each
  • The context window is the model's total working memory for your conversation
  • Place critical information at the beginning and end of long prompts
  • Concise prompts are cheaper, faster, and often more effective
  • Different models have vastly different context window sizes