Each phase asks one question and ends with a gate. You don't advance until the gate is green.
Phases 4–6 are a loop, not a line: pick a model → build → eval → (too weak? climb a tier or add RAG; too expensive? drop a tier or shrink the prompt) → eval again. You exit when the eval is green and the cost fits the budget. That loop is the whole job.
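As a rough illustration, the loop can be sketched in a few lines of Python. Everything named here is hypothetical: the tier names, the `EvalResult` fields, and the `evaluate` callback stand in for whatever your own build-and-eval step returns. The sketch only moves between model tiers; adding RAG or shrinking the prompt would be further branches on the same two conditions.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical tier names, ordered cheapest/weakest to priciest/strongest.
TIERS = ["small", "medium", "large"]


@dataclass
class EvalResult:
    score: float  # fraction of eval cases passed, 0.0 to 1.0
    cost: float   # cost per request, in the same unit as the budget


def run_loop(
    evaluate: Callable[[str], EvalResult],
    pass_bar: float,
    budget: float,
    start: int = 0,
) -> Optional[Tuple[str, EvalResult]]:
    """Pick a tier, build and eval it, then climb or drop until both gates pass.

    Returns (tier, result) on success, or None if no tier is both good
    enough and cheap enough.
    """
    tier = start
    tried = set()
    while tier not in tried:  # revisiting a tier means we are going in circles
        tried.add(tier)
        result = evaluate(TIERS[tier])  # stands in for "build, then eval"
        too_weak = result.score < pass_bar
        too_expensive = result.cost > budget

        if not too_weak and not too_expensive:
            return TIERS[tier], result  # eval is green and cost fits: exit
        if too_weak and tier + 1 < len(TIERS):
            tier += 1  # too weak: climb a tier
        elif too_expensive and not too_weak and tier > 0:
            tier -= 1  # too expensive: drop a tier
        # Otherwise there is no move left; the loop condition ends the search.
    return None
```

Returning `None` is the signal that tier changes alone cannot satisfy both gates, which is the point at which the other levers (RAG, a shorter prompt) or a revised budget come into play.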