Week-by-Week Schedule

Week 17 — Prompt Engineering + Structured Output + LLM APIs

System/User/Assistant structure, OpenAI API basics, async API calls · Anthropic API + Claude vision API (multimodal bonus) · Few-shot prompting, chain-of-thought, self-consistency, ReAct · Structured output: JSON mode, function calling · …

7 daily tasks
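
One of this week's tasks is a resume → JSON extractor with retries and exponential backoff. A minimal sketch of the retry pattern is shown here, using only the standard library. The helper name `retry_with_backoff` and its parameters are illustrative assumptions, not part of any SDK; real API clients often ship their own retry settings.

```python
import random
import time


def retry_with_backoff(fn, max_attempts=4, base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep):
    """Call fn() until it succeeds or max_attempts calls have failed.

    Hypothetical helper: the wait doubles after each failure
    (base_delay, 2x, 4x, ...), is capped at max_delay, and gets up to
    10% random jitter so parallel clients do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            # Out of attempts: re-raise the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

In an extractor, `fn` would typically wrap the model call plus `json.loads` on the reply, so that both transient API errors and malformed JSON trigger another attempt. In practice it is usually better to catch only the specific exceptions you expect than the bare `Exception` used in this sketch.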

Week 18 — RAG Foundations + Vector Databases

What is RAG? Why it exists. RAG vs fine-tuning decision · Chunking strategies: fixed, semantic, recursive, late chunking · Embedding models: BGE, e5, nomic, OpenAI text-embedding-3 · Vector databases: Qdrant + pgvector · …

7 daily tasks
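
Two of this week's building blocks, fixed-size chunking and cosine similarity, are small enough to sketch in plain Python. The function names and default sizes below are assumptions chosen for illustration; a real pipeline would usually chunk by tokens rather than characters and compute similarity inside the vector database.

```python
import math


def chunk_fixed(text, size=200, overlap=50):
    """Split text into character windows of `size` that share `overlap` chars."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    if not text:
        return []
    step = size - overlap
    # Stop before a window that would contain only already-covered overlap.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, last_start, step)]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

The overlap exists so that a sentence cut at a chunk boundary still appears whole in at least one chunk. Cosine similarity ignores vector length, which is why scaling an embedding does not change its score against a query.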

Week 19 — Advanced RAG: Reranking, Query Rewriting, Evaluation

Why basic RAG fails in production · Cross-encoder rerankers: bge-reranker, cohere-rerank · Query rewriting + HyDE (Hypothetical Document Embeddings) · Multi-hop retrieval, parent-document retrieval · …

7 daily tasks

Week 20 — PROJECT 3: Production RAG over Real Corpus

Pick corpus, document ingestion pipeline · Implement chunking · Embedding + Qdrant storage · Retrieval + reranker integration · …

7 daily tasks

Week 21 — Fine-Tuning Theory: LoRA, QLoRA, PEFT

When fine-tuning beats prompting · Full FT vs PEFT trade-offs · LoRA math: W' = W + AB, why low-rank works · QLoRA: 4-bit quantization + LoRA · …

7 daily tasks
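
The LoRA update listed above can be written out with its shapes, which makes the parameter saving visible. This uses the roadmap's notation W' = W + AB; the dimensions d, k and rank r are symbols introduced here for illustration, and the numbers in the example are assumed values, not a recommendation. The original LoRA paper writes the same product as BA with the two factor names swapped.

```latex
W' = W + AB, \qquad
W \in \mathbb{R}^{d \times k},\quad
A \in \mathbb{R}^{d \times r},\quad
B \in \mathbb{R}^{r \times k},\quad
r \ll \min(d, k)

\operatorname{rank}(AB) \le r, \qquad
\text{trainable parameters: } r(d + k) \ \text{instead of}\ dk

\text{Example } (d = k = 4096,\ r = 8):\quad
8 \cdot (4096 + 4096) = 65{,}536
\ \text{vs.}\ 4096^2 = 16{,}777{,}216
\quad (\approx 0.39\%)
```

W stays frozen during training; only A and B receive gradients, which is where the memory saving over full fine-tuning comes from.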

Week 22 — Hands-On QLoRA Fine-Tuning

Pick base model (Qwen2.5-1.5B or Llama-3.2-1B-Instruct) and domain dataset · Prepare dataset in correct chat template format · Set up trl.SFTTrainer + bitsandbytes 4-bit + LoRA config · First training run (small) — debug issues · …

7 daily tasks

Week 23 — Agents + Tool Use + MCP

What is an AI agent? Tool calling protocols · LangGraph for stateful agents · LangGraph tutorial — build first agent · Pydantic AI / instructor for typed agent outputs · …

7 daily tasks

Week 24 — Production LLM Patterns: Observability, Streamlit, Frontend Demo

Streamlit basics — build LLM demos in Python · Wrap your RAG (Project 3) with a Streamlit UI · LLM observability: Langfuse setup + integration · Add tracing + cost tracking to your RAG project · …

7 daily tasks

Topics Covered

Every subtopic below is a separate daily task in the roadmap, with hand-picked resources (YouTube videos, docs, papers) for each.

System/User/Assistant structure, OpenAI API basics, async API calls

Anthropic API + Claude vision API (multimodal bonus)

Few-shot prompting, chain-of-thought, self-consistency, ReAct

Structured output: JSON mode, function calling

Pydantic + instructor library for typed LLM outputs

Build resume → JSON extractor with retries + exponential backoff

Token budgeting, context window management, cost tracking

What is RAG? Why it exists. RAG vs fine-tuning decision

Chunking strategies: fixed, semantic, recursive, late chunking

Embedding models: BGE, e5, nomic, OpenAI text-embedding-3

Vector databases: Qdrant + pgvector

Cosine similarity, HNSW indexing intuition

Build basic RAG: load → chunk → embed → store → retrieve → generate

Hybrid search: BM25 + dense embeddings + multimodal (CLIP) intro

Why basic RAG fails in production

Cross-encoder rerankers: bge-reranker, cohere-rerank

Query rewriting + HyDE (Hypothetical Document Embeddings)

Multi-hop retrieval, parent-document retrieval

RAG evaluation: faithfulness, answer relevance, context precision

Upgrade Week 18 RAG: + reranker + query rewriting + RAGAS

Document everything in README. Before/after metrics comparison

Pick corpus, document ingestion pipeline

Implement chunking

Embedding + Qdrant storage

Retrieval + reranker integration

FastAPI streaming endpoint + auth + rate limiting

Dockerize + deploy to Render/HF Spaces

RAGAS evaluation + README + demo video

When fine-tuning beats prompting

Full FT vs PEFT trade-offs

LoRA math: W' = W + AB, why low-rank works

QLoRA: 4-bit quantization + LoRA

SFT vs DPO vs RLHF concepts

HuggingFace PEFT library hands-on

Dataset formatting: chat templates, ShareGPT, Alpaca

Pick base model (Qwen2.5-1.5B or Llama-3.2-1B-Instruct) and domain dataset

Prepare dataset in correct chat template format

Set up trl.SFTTrainer + bitsandbytes 4-bit + LoRA config

First training run (small) — debug issues

Full training run with W&B logging

Evaluate before/after on held-out set — target ≥5% improvement

Push LoRA adapter to HF Hub with model card

What is an AI agent? Tool calling protocols

LangGraph for stateful agents

LangGraph tutorial — build first agent

Pydantic AI / instructor for typed agent outputs

MCP (Model Context Protocol)

Build tool-using agent: ≥3 tools (web search, calculator, file I/O)

Memory: short-term vs long-term, episodic

Streamlit basics — build LLM demos in Python

Wrap your RAG (Project 3) with a Streamlit UI

LLM observability: Langfuse setup + integration

Add tracing + cost tracking to your RAG project

Build simple vanilla HTML+JS frontend that calls your FastAPI

Add prompt injection guardrails + PII detection (Presidio)

Buffer: catch up on incomplete work, polish projects

Applied LLM Engineering — RAG, Fine-Tuning & AI Agents
