AI Engineering Full-Time

Program Roadmap

Week one through completion, laid out as a locked sprint path. Finish the active sprint and its check before the next sprint opens.

Progress

0 of 60 sprints complete

Active chapter

Build the next production behavior, then prove it with a sprint check.

Content

12 weeks, 60 sprints, one completion path.

Week 1

Foundations

Ship a deployed AI agent on day one. Then make it useful.

Sprint 1

Current

Hello, Agent

Personal AI agent deployed at {handle}.devforgehq.com/my-first-agent.

25 min · 3 skill areas · starter

Sprint 2

Current

Give it memory

Agent remembers the last 10 messages of the conversation.

30 min · 3 skill areas · starter

Sprint 3

Current

Stream the response

Tokens appear as they're generated, not in one wall.

35 min · 3 skill areas · starter

Sprint 4

Current

First tool call

Agent fetches live weather from a real API and grounds its answer in the result.

40 min · 3 skill areas · starter

Sprint 5

Current

Persist to Postgres

Conversations save to Neon Postgres and survive a reload.

45 min · 3 skill areas · starter

Week 2

Tools & Plans

Multi-step reasoning: pick the right tool, then chain them.

Sprint 6

Locked

Tool registry pattern

A typed registry with 5 tools the agent can introspect.

35 min · 3 skill areas · core

Sprint 7

Locked

Routing — which tool when

Agent reliably picks fetch vs calc vs search based on intent.

40 min · 3 skill areas · core

Sprint 8

Locked

Multi-step plans

Agent decomposes 'plan a 3-day trip' into 6 sequential tool calls.

60 min · 3 skill areas · core

Sprint 9

Locked

Cost tracking

Per-message $ cost shown in UI; logged to DB; soft cap at $0.50/day.

40 min · 3 skill areas · core

Sprint 10

Locked

Prompt caching

System prompt cached, saving up to 90% of its input cost on repeated calls.

35 min · 3 skill areas · core

Week 3

Embeddings & Search

Make the agent retrieve relevant context instead of guessing.

Sprint 11

Locked

Embeddings 101

Compute embeddings for 1000 doc chunks, store as JSON.

40 min · 3 skill areas · core

Sprint 12

Locked

Vector store: pgvector

Same docs in Postgres pgvector with HNSW index — queries in <50ms.

50 min · 3 skill areas · core

Sprint 13

Locked

Naive RAG

Q&A bot that retrieves 5 chunks and answers from them.

55 min · 3 skill areas · core

Sprint 14

Locked

Chunking strategies

A/B test 3 chunking strategies; pick the winner with evidence.

60 min · 3 skill areas · core

Sprint 15

Locked

RAG over a real doc set

Q&A over your project's GitHub repo — answers with file:line citations.

70 min · 3 skill areas · stretch

Week 4

Production RAG

Naive RAG breaks at scale. Fix retrieval, citations, and failure modes.

Sprint 16

Locked

Reranking

Cohere/Voyage reranker boosts top-3 accuracy from 64% to 86%.

50 min · 3 skill areas · core

Sprint 17

Locked

Hybrid search

Combine BM25 + vector — beats either alone on your eval set.

60 min · 3 skill areas · core

Sprint 18

Locked

Source-grounded answers

Every claim in the answer links back to a specific chunk.

55 min · 3 skill areas · core

Sprint 19

Locked

Empty-result handling

When retrieval returns nothing, the bot says so — does NOT confabulate.

40 min · 3 skill areas · core

Sprint 20

Locked

RAG evals

An eval suite of 50 questions with groundedness + accuracy scores.

70 min · 3 skill areas · stretch

Week 5

Agents

Beyond single calls — reasoning loops, planning, self-correction.

Sprint 21

Locked

ReAct loop

Agent that interleaves thought / action / observation until done.

60 min · 3 skill areas · core

Sprint 22

Locked

Multi-agent: planner + worker

Planner decomposes; worker executes; loop coordinated by orchestrator.

75 min · 3 skill areas · stretch

Sprint 23

Locked

Self-correction

Agent reviews its own output, catches its own mistakes 70% of the time.

65 min · 3 skill areas · core

Sprint 24

Locked

Long-horizon decomposition

Agent handles a task requiring 20+ steps without losing the thread.

80 min · 3 skill areas · stretch

Sprint 25

Locked

State machines for agents

Agent flow modeled as an XState state machine, with every state, transition, and guard visible.

70 min · 3 skill areas · stretch

Week 6

Production

Make it real: real users, real auth, real observability.

Sprint 26

Locked

Deploy to Vercel

App live on a custom subdomain with TLS and CDN caching.

35 min · 3 skill areas · core

Sprint 27

Locked

Add auth (Clerk)

Email + Google login. Sessions across pages. Per-user state.

50 min · 3 skill areas · core

Sprint 28

Locked

Multi-tenant isolation

Two users can't see each other's data — enforced at the DB layer.

60 min · 3 skill areas · core

Sprint 29

Locked

Cost guardrails

Per-user daily + monthly $ caps. Soft warning, hard stop.

50 min · 3 skill areas · core

Sprint 30

Locked

Structured logging + tracing

Every request traceable end-to-end — Langfuse + OTLP exporter.

65 min · 3 skill areas · core

Week 7

Quality

If you can't measure it, you can't ship it.

Sprint 31

Locked

Eval harness from scratch

Run 100 test cases on every prompt change — see pass-rate trend.

60 min · 3 skill areas · core

Sprint 32

Locked

LLM-as-judge

Auto-grade open-ended answers with a stronger model.

55 min · 3 skill areas · core

Sprint 33

Locked

Regression catching

Prompt A vs prompt B — diff regressions before merging to main.

50 min · 3 skill areas · core

Sprint 34

Locked

A/B testing models

Sonnet vs Haiku vs GPT-4o — pick winner per task with data.

60 min · 3 skill areas · stretch

Sprint 35

Locked

Red-team prompts

20 jailbreak prompts; show your safety layer blocks 18+.

70 min · 3 skill areas · stretch

Week 8

UX

The model is the easy part. The UX is the product.

Sprint 36

Locked

Streaming UX patterns

Cursor blink, partial-render handling, abort button.

50 min · 3 skill areas · core

Sprint 37

Locked

Structured outputs

Agent returns guaranteed JSON matching a Zod schema.

45 min · 3 skill areas · core

Sprint 38

Locked

Show the work

Tool-use trace visible in the UI — like Linear's inline thought.

55 min · 3 skill areas · core

Sprint 39

Locked

Interruptible generation

User can hit ESC mid-stream; partial output saved cleanly.

40 min · 3 skill areas · core

Sprint 40

Locked

Optimistic UI for agents

Action buttons feel instant — server confirms in background.

50 min · 3 skill areas · core

Week 9

Multimodal

Beyond text — images, documents, audio, code, browsers.

Sprint 41

Locked

Vision input

Upload an image; agent extracts structured data from it.

55 min · 3 skill areas · core

Sprint 42

Locked

PDF understanding

PDF → searchable + queryable. Handles tables, charts, footnotes.

70 min · 3 skill areas · stretch

Sprint 43

Locked

Audio: STT + TTS

Voice-driven agent — Whisper in, ElevenLabs out.

65 min · 3 skill areas · core

Sprint 44

Locked

Computer-use agent

Agent navigates a real web page and fills a form via screenshots.

90 min · 3 skill areas · stretch

Sprint 45

Locked

Code-writing agent

Agent writes + tests + ships a small CLI tool end-to-end.

90 min · 3 skill areas · stretch

Week 10

Scale

What breaks at 10k users? Fix it before they show up.

Sprint 46

Locked

Prompt caching at scale

90% hit rate on system-prompt cache — cost down 6×.

50 min · 3 skill areas · core

Sprint 47

Locked

Request batching

Batch API saves 50% on async eval workloads.

55 min · 3 skill areas · core

Sprint 48

Locked

Latency budgets

p50 TTFT < 600ms, p95 < 1.4s, with dashboards proving it.

60 min · 3 skill areas · core

Sprint 49

Locked

Provider failover

Anthropic down → OpenAI takes over in <2s. Users don't notice.

65 min · 3 skill areas · stretch

Sprint 50

Locked

Edge inference

Lightweight classifier runs on Cloudflare Workers AI — 30ms p95.

60 min · 3 skill areas · stretch

Week 11

Hardening

Production AI = security AI. Adversaries are now your users.

Sprint 51

Locked

Prompt injection defense

20 documented attacks; your defense blocks 17+.

70 min · 3 skill areas · stretch

Sprint 52

Locked

Jailbreak resistance

Run published jailbreak corpus — measure & report defense rate.

60 min · 3 skill areas · core

Sprint 53

Locked

PII redaction pipeline

Names/emails/SSNs scrubbed pre-prompt + post-response.

55 min · 3 skill areas · core

Sprint 54

Locked

Rate limiting + abuse

Bot floods blocked at the edge — real users never see slowdowns.

50 min · 3 skill areas · core

Sprint 55

Locked

SOC 2 readiness

Controls matrix + evidence pipeline ready for an auditor.

75 min · 3 skill areas · stretch

Week 12

Capstone

One real project. Ship it. Defend it. Add it to your portfolio.

Sprint 56

Locked

Capstone: design

Written spec, success metric, user research notes — reviewed by mentor.

90 min · 3 skill areas · capstone

Sprint 57

Locked

Capstone: v1 ship

Live URL anyone can use. Works end-to-end on the happy path.

180 min · 3 skill areas · capstone

Sprint 58

Locked

Capstone: add evals

Eval suite with 50+ cases; dashboard shows current pass rate.

120 min · 3 skill areas · capstone

Sprint 59

Locked

Capstone: production hardening

Cost caps, rate limits, observability, auth — all green.

120 min · 3 skill areas · capstone

Sprint 60

Locked

Capstone: demo day + handoff

5-min recorded demo, decision log, mentor endorsement letter, portfolio entry.

120 min · 3 skill areas · capstone

Completion

Portfolio defense and career proof packet

Unlocks after Sprint 60. Includes your case study, GitHub proof, resume bullets, interview story, and final demo package.
