Lesson 15 · Intermediate

Retrieval-Augmented Generation (RAG) Basics

Understand the core RAG pattern: retrieve relevant information first, then ask the model to answer using that evidence.

Read the explanation carefully, then review the examples and coding section. The goal is to understand both the concept and how it appears inside a real application workflow.

Explanation

RAG helps ground answers in retrieved documents rather than only model memory.

The basic flow is query -> retrieve -> construct context -> generate answer.
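The four steps above can be sketched end to end. This is a minimal illustration, not a production design: the retriever is a toy word-overlap ranker (real systems typically use embeddings or a search index), and `generate_answer` is a hypothetical placeholder standing in for whatever model call your provider offers. All function names here are assumptions made for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, top_k=2):
    """Toy retriever: rank documents by how many query words they share."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    # Drop documents with no shared words, then put the best matches first.
    matches = [pair for pair in scored if pair[0] > 0]
    ranked = sorted(matches, key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in ranked[:top_k]]

def construct_context(passages):
    """Join passages with blank lines so each one stays distinct."""
    return "\n\n".join(passages)

def generate_answer(prompt):
    """Hypothetical stand-in for a model call; replace with a real client."""
    return f"[model response to a {len(prompt)}-character prompt]"

def answer(query, documents):
    """query -> retrieve -> construct context -> generate answer."""
    passages = retrieve(query, documents)
    context = construct_context(passages)
    prompt = f"Evidence:\n{context}\n\nQuestion:\n{query}"
    return generate_answer(prompt)
```

Exact word overlap is fragile: "refunds" does not match "refund", which is one reason real retrievers rely on stemming, embeddings, or both.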

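RAG improves reliability when the knowledge source is domain-specific or frequently updated, because the answer can draw on current documents instead of whatever the model absorbed during training.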

Why this topic matters in practice

In generative AI products, the model is only one part of the system. The surrounding workflow determines whether the output is useful, safe, and maintainable. RAG is one such workflow, and it applies directly to tasks such as tutoring, search, copilots, business assistants, and production automation.

Examples

Product assistant

Retrieve the latest product policy page before answering customer questions.

School guide bot

Fetch official admission information so the answer reflects current requirements.

Document QA

Search internal manuals first instead of expecting the model to remember niche procedures.

Build a simple RAG prompt

The code below is intentionally concise so the underlying pattern stays clear. It focuses on the application logic you can reuse, even if you later switch model providers or deployment environments.

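def build_rag_prompt(question, retrieved_passages):
    """Build a prompt that restricts the model to the retrieved evidence."""
    # Separate passages with a blank line so each one stays distinct.
    joined_passages = "\n\n".join(retrieved_passages)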
    return f'''
Answer the question using only the retrieved evidence below.
If the answer is not in the evidence, say so clearly.

Evidence:
{joined_passages}

Question:
{question}
'''.strip()

passages = [
    "Refunds may be requested within 14 days of purchase.",
    "Refund processing usually takes 5 business days."
]

print(build_rag_prompt("How long do refunds take?", passages))

How the coding section works

  • The prompt explicitly constrains the model to the retrieved evidence.
  • RAG works best when retrieval returns relevant passages and documents are split into chunks small enough to stay on topic but large enough to keep their context.
  • This pattern is often the starting point for production knowledge assistants.
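
To make the chunking point concrete, here is one possible way to split a long document into overlapping word-based chunks before indexing it. The function name `chunk_words` and the default sizes are assumptions chosen for illustration; suitable values depend on your documents and retriever, and many systems split on sentences or headings instead.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into chunks of up to chunk_size words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears in full in at least one chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks

# Usage: ten "words" split into windows of four with one word of overlap.
sample = " ".join(str(i) for i in range(10))
for chunk in chunk_words(sample, chunk_size=4, overlap=1):
    print(chunk)
```

Each chunk would then be stored as a separate passage for the retriever to search.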

Implementation advice

When turning this lesson into a real feature, think beyond the code snippet itself. Decide what inputs should be allowed, how you will validate outputs, how you will recover from errors, and how you will measure whether the feature is actually helping users. Those surrounding choices often determine whether an AI feature feels polished or unreliable.
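
As a rough sketch of what those surrounding choices might look like in code, the wrapper below validates the question, skips the model entirely when nothing was retrieved, and recovers from a failed model call. The names `answer_safely`, `retrieve`, and `generate`, the length limit, and the fallback messages are all hypothetical; `retrieve` and `generate` are passed in as plain callables so the logic does not depend on any particular provider.

```python
NOT_FOUND = "I could not find this in the available documents."
UNAVAILABLE = "The assistant is temporarily unavailable. Please try again."

def answer_safely(question, retrieve, generate, max_question_chars=500):
    """Answer a question with input checks and simple error recovery.

    retrieve: callable taking a question and returning a list of passages.
    generate: callable taking a prompt and returning the model's text.
    """
    # Input validation: reject empty or oversized questions early.
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    if len(question) > max_question_chars:
        raise ValueError("question is too long")

    # No evidence means no model call: answer "not found" directly.
    passages = retrieve(question)
    if not passages:
        return NOT_FOUND

    evidence = "\n\n".join(passages)
    prompt = (
        "Answer the question using only the retrieved evidence below.\n"
        "If the answer is not in the evidence, say so clearly.\n\n"
        f"Evidence:\n{evidence}\n\nQuestion:\n{question}"
    )

    # Error recovery: a failed model call becomes a user-facing message.
    try:
        reply = generate(prompt)
    except Exception:
        return UNAVAILABLE

    # Output validation: treat a blank reply as "not found".
    return reply.strip() or NOT_FOUND
```

Measuring whether the feature helps users is a separate step; logging how often `NOT_FOUND` and `UNAVAILABLE` are returned is one simple place to start.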

Summary / key takeaways

  • RAG improves factual grounding by retrieving evidence first.
  • Retrieval quality and prompt design both affect answer quality.
  • RAG is one of the most practical generative AI patterns in production today.

Exercises

  1. Explain the difference between a plain chatbot and a RAG chatbot.
  2. Write a RAG prompt for a university FAQ system.
  3. Why should the model say 'not found' when evidence is missing?