Tokens, Context Windows, and Prompts
Understand how text is split into tokens, why context windows matter, and how prompt size affects quality and cost.
Explanation
Models do not see raw sentences the way humans do; they process text as a sequence of chunks called tokens.
A context window is the amount of prompt and response text a model can consider at one time.
Good prompts fit the available context and prioritize the most useful instructions and evidence.
Why this topic matters in practice
In generative AI products, the model is only one part of the system; the surrounding workflow determines whether the output is useful, safe, and maintainable. Understanding tokens and context windows helps you apply these ideas to real tasks such as tutoring, search, copilots, business assistants, and production automation.
Examples
Long documents
If a document exceeds the context window, you may need chunking, summarization, or retrieval.
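As a sketch of the chunking approach, the helper below splits a long document on paragraph boundaries so each chunk stays under a token budget. The 500-token budget and the 4-characters-per-token heuristic are illustrative assumptions, not provider guarantees.

```python
def rough_token_estimate(text: str) -> int:
    # Roughly 4 characters per token is a common rule of thumb for English prose
    return max(1, len(text) // 4)

def chunk_text(text: str, max_tokens: int = 500) -> list[str]:
    """Split text on paragraph boundaries so each chunk fits the budget."""
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        candidate = (current + "\n\n" + paragraph) if current else paragraph
        if rough_token_estimate(candidate) > max_tokens and current:
            chunks.append(current)
            current = paragraph  # a single oversize paragraph still becomes its own chunk
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

# A synthetic "long document" of five large paragraphs
document = "\n\n".join(f"Paragraph {i}: " + "details " * 200 for i in range(5))
chunks = chunk_text(document, max_tokens=500)
print(len(chunks), "chunks:", [rough_token_estimate(c) for c in chunks])
```

Each chunk can then be summarized independently, or indexed for retrieval so only the most relevant chunks enter the prompt.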
Instruction priority
A short, clear system instruction often works better than a long, repetitive prompt.
Conversation history
When chat history grows too long, older messages may need summarization or trimming.
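One simple trimming strategy keeps the system message plus the most recent turns that fit a token budget, dropping older turns (a fuller version would summarize them instead). The message format and the 1,000-token budget are illustrative assumptions.

```python
def rough_token_estimate(text: str) -> int:
    # Roughly 4 characters per token for English prose
    return max(1, len(text) // 4)

def trim_history(messages: list[dict], max_tokens: int = 1000) -> list[dict]:
    """Keep the system message plus the newest turns that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    budget = max_tokens - sum(rough_token_estimate(m["content"]) for m in system)
    kept = []
    for message in reversed(turns):  # walk from newest to oldest
        cost = rough_token_estimate(message["content"])
        if cost > budget:
            break  # older messages beyond this point are dropped
        budget -= cost
        kept.append(message)
    return system + list(reversed(kept))

# Build an overly long tutoring conversation
history = [{"role": "system", "content": "You are a tutor."}]
for i in range(20):
    history.append({"role": "user", "content": f"Question {i}: " + "word " * 100})
    history.append({"role": "assistant", "content": f"Answer {i}: " + "word " * 100})

trimmed = trim_history(history, max_tokens=1000)
print(len(history), "messages ->", len(trimmed), "messages")
```

Note that the system instruction is always preserved; it is usually the highest-priority text in the window.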
Estimate token-heavy prompts with simple heuristics
The code below is intentionally concise so the underlying pattern stays clear. It focuses on the application logic you can reuse, even if you later switch model providers or deployment environments.
def rough_token_estimate(text: str) -> int:
    # Roughly 4 characters per token is a common rule of thumb for English prose
    return max(1, len(text) // 4)

prompt = """You are a tutor.
Explain machine learning to a beginner.
Use 5 bullet points and one real-world example.
"""

estimated_tokens = rough_token_estimate(prompt)
print("Estimated tokens:", estimated_tokens)

How the coding section works
- This rough estimate is only a planning tool, not a replacement for a provider tokenizer.
- Token awareness helps you control prompt size, cost, and context usage.
- In production, use the tokenizer for the exact model you deploy.
Implementation advice
When turning this lesson into a real feature, think beyond the code snippet itself. Decide what inputs should be allowed, how you will validate outputs, how you will recover from errors, and how you will measure whether the feature is actually helping users. Those surrounding choices often determine whether an AI feature feels polished or unreliable.
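As one concrete instance of that advice, a pre-flight check can validate prompt size before any model call and fail loudly instead of silently truncating. The 8,000-token limit and 1,000-token response reserve below are assumed example values; real limits depend on the model you deploy.

```python
def rough_token_estimate(text: str) -> int:
    # Roughly 4 characters per token for English prose
    return max(1, len(text) // 4)

CONTEXT_LIMIT = 8000      # assumed example limit; check your model's documentation
RESPONSE_RESERVE = 1000   # tokens kept free for the model's answer

def validate_prompt(prompt: str) -> str:
    """Raise early if a prompt risks overflowing the context window."""
    estimated = rough_token_estimate(prompt)
    budget = CONTEXT_LIMIT - RESPONSE_RESERVE
    if estimated > budget:
        raise ValueError(
            f"Prompt estimated at {estimated} tokens exceeds the {budget}-token input budget."
        )
    return prompt

validate_prompt("Explain tokens in two sentences.")  # fits comfortably
try:
    validate_prompt("x" * 40000)  # roughly 10,000 tokens, over budget
except ValueError as exc:
    print("Rejected:", exc)
```

Rejecting oversized input at the boundary is easier to debug and measure than letting the provider truncate the prompt unpredictably.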
Summary / key takeaways
- Tokens are the units models actually process.
- Context windows limit how much information the model can use at once.
- Prompt design must balance clarity, completeness, and length.
Exercises
- Rewrite a long instruction into a shorter prompt without losing meaning.
- Why might an overloaded context window reduce answer quality?
- Estimate the token length of three prompts you might use on your site.