DDD Melbourne 2026
Throw away the vibes.
Reliable results from AI coding agents don't come from prompting harder. They come from engineering the context you feed the model.
The familiar story
The magic that evaporates.
Most of us feel an intoxicating high the first time we use an AI coding agent. Then we point it at a real codebase and the magic disappears.
It works flawlessly
The toy example lands perfectly. It feels like the future has arrived.
The frustration hits
Point the same agent at a large business repo and the results fall apart. The tools aren't useless — we're feeding them the wrong context.
So the fix is not a better prompt. It's a better information environment.
The premise
The instinct is to keep prompting across many turns until the agent eventually gets it right. The better goal is to fix the context so the first answer lands close to correct — that's context engineering, and it matters far more than the vibes.
Where the effort goes
Stop nudging. Curate the input.
Vibe coding leans on a hopeful loop: generate, notice it's wrong, nudge repeatedly until it converges. That loop is slow, it pollutes the conversation, and it rarely meets our bar.
Correct after the fact
Let the agent generate, then prompt it again and again. The model is a brilliant generator but a weak judge of its own output, so the loop drags on.
Invest in the input
Curate exactly what the model sees before it writes a line, so the very first answer lands close to correct.
Reflex: Once you internalize the generator–judge asymmetry, the job changes.
Part one
Why a long context window isn't enough.
The common reaction is to assume the fix is simply more context. In practice, that approach makes things worse.
The needle problem
A million tokens of noise.
Cramming thousands of files into a huge window doesn't help the model find the one detail that matters. The signal drowns.
The takeaway is not "less context" as a rule. It is just enough context, delivered at the exact moment it is needed.
Four failure modes of long context
How long context breaks down.
Poisoning
A hallucinated error early in a thread lingers and keeps corrupting everything generated after it.
Distraction
Attention is spread unevenly across a long prompt, so the model can miss crucial details buried in large blocks of text: the needle-in-a-haystack problem.
Confusion
Superfluous, redundant information dilutes the signal and pulls the model toward irrelevant details.
Clash
Contradictory pieces of code in the same window leave the model unable to tell which one to trust.
The rule of thumb
Part two
The architecture of context engineering.
A distinct layer that sits on top of prompt engineering and beneath autonomous agents and opinionated engineering workflows.
Where it sits
A layer of its own.
Prompt engineering shapes the instruction. Context engineering shapes the information environment that instruction operates within — the layer everything else builds on.
Keeping context lean and relevant
Four architectural tactics.
Externalizing context
Move context out of volatile chat history into a shared space the human and agent edit as one source of truth.
Tool loadout
Restrict active tools, MCP servers, and skills to only what the problem needs, so the model isn't overwhelmed by options.
Compression
Summarize to condense with some loss, or compact by swapping full text for references the agent can retrieve later.
Isolation
Quarantine work into separate threads or sub-agent scopes so no single agent drowns in the broader orchestration.
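The tool loadout and compaction tactics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ContextStore`, `compact`, `select_tools`, the tag sets, and the 200-character limit are all hypothetical names and values, not part of any particular agent framework.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ContextStore:
    """Hypothetical store that keeps full text outside the prompt."""
    blobs: dict[str, str] = field(default_factory=dict)

    def put(self, text: str) -> str:
        # Derive a short, stable reference from the content itself.
        ref = "ref:" + hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
        self.blobs[ref] = text
        return ref

    def get(self, ref: str) -> str:
        return self.blobs[ref]


def compact(messages: list[str], store: ContextStore, max_chars: int = 200) -> list[str]:
    """Swap any message longer than max_chars for a short, retrievable reference."""
    compacted = []
    for message in messages:
        if len(message) > max_chars:
            ref = store.put(message)
            compacted.append(f"[{len(message)} chars stored as {ref}]")
        else:
            compacted.append(message)
    return compacted


def select_tools(available: dict[str, set[str]], task_tags: set[str]) -> list[str]:
    """Keep only the tools whose tags overlap the task's tags (the loadout)."""
    return sorted(name for name, tags in available.items() if tags & task_tags)
```

Unlike summarizing, nothing is lost here: the agent sees a one-line placeholder and can call `store.get(ref)` if it turns out to need the full text.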
Part three
Practical workflows.
Theory is only useful if it changes how you work on Monday morning. Two workflows turn these ideas into repeatable practice.
Workflow one
The Breadcrumb Protocol.
A lightweight human-and-agent pattern built around a single markdown scratchpad. The conversation is disposable. The file is durable.
Plan & break down
Human and agent co-author a task breakdown in a markdown file before any code is generated.
Iterate & log
As the agent executes, the file is continuously updated with state, decisions, and discoveries.
Quarantine failures
When a thread is polluted, abandon it. Keep the file, note why it failed, and feed it to a fresh session.
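The three steps above can be sketched as a handful of helpers around one markdown file. This is a minimal sketch, not a prescribed format: the function names, the section headings, and the log line layout are assumptions made for illustration.

```python
from datetime import datetime, timezone
from pathlib import Path


def start_breadcrumbs(path: Path, goal: str, tasks: list[str]) -> None:
    """Write the co-authored plan before any code is generated."""
    lines = [f"# Goal: {goal}", "", "## Tasks"]
    lines += [f"- [ ] {task}" for task in tasks]
    lines += ["", "## Log", ""]
    path.write_text("\n".join(lines), encoding="utf-8")


def log_entry(path: Path, kind: str, note: str) -> None:
    """Append a timestamped state, decision, discovery, or failure note."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
    with path.open("a", encoding="utf-8") as handle:
        handle.write(f"- {stamp} [{kind}] {note}\n")


def fresh_session_prompt(path: Path) -> str:
    """Build the opening message for a new thread from the file alone."""
    return "Continue from this breadcrumb file:\n\n" + path.read_text(encoding="utf-8")
```

When a thread goes bad, a call such as `log_entry(path, "failure", "why it failed")` followed by `fresh_session_prompt(path)` carries the durable state forward while the polluted conversation is left behind.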
Workflow two
Research, Plan, Implement, Review.
The same principles scaled across a team. Microsoft's open-source hve-core structures development into a constrained, multi-step pipeline where each stage hands a clean, scoped context to the next.
Isolation: Each stage is the isolation tactic applied at the level of a whole development process.
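One way to picture a staged handoff is as a chain of plain functions, each receiving only the task and the previous stage's artifact. The sketch below is not hve-core's implementation. `Handoff`, `run_pipeline`, and `stub` are hypothetical names, and the stub stages stand in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Handoff:
    """The only thing a stage passes on: a short artifact, not its transcript."""
    stage: str
    artifact: str


# A stage maps (task, previous artifact) to a new artifact.
Stage = Callable[[str, str], str]


def run_pipeline(task: str, stages: list[tuple[str, Stage]]) -> list[Handoff]:
    """Run stages in order; each sees the task and the previous artifact only."""
    handoffs: list[Handoff] = []
    previous = ""
    for name, stage in stages:
        artifact = stage(task, previous)
        handoffs.append(Handoff(name, artifact))
        previous = artifact
    return handoffs


def stub(label: str) -> Stage:
    """Stand-in for a model call, so the sketch runs without any agent."""
    return lambda task, previous: (
        f"{label} notes for {task!r}, given {len(previous)} chars of input"
    )


stages = [(name, stub(name)) for name in ["research", "plan", "implement", "review"]]
handoffs = run_pipeline("add caching", stages)
```

Because each stage is an ordinary function with explicit inputs and outputs, no stage can see another stage's working conversation, which is the point of the isolation tactic.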
Takeaways for 2026 and beyond
Practices worth adopting.
- Watch your context threshold. Past roughly 60% capacity, compact it or start a fresh thread rather than pushing on.
- Treat sub-agents as functions. Resist anthropomorphizing them: each is a discrete, tightly scoped step with clear inputs and outputs.
- Shift the review left. Don't wait for the PR to review a wall of green text. Review incrementally as the work happens.
- Master one harness. Stop chasing every new tool. Pick a core stack, learn its context limits, and get good at harness engineering.
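The context threshold check from the first practice reduces to simple arithmetic. A minimal sketch follows, assuming the harness exposes a token count for the current thread. `context_action` and its return values are hypothetical, and the 0.60 default reflects the rough 60% figure above rather than a hard limit.

```python
def context_action(used_tokens: int, window_tokens: int, threshold: float = 0.60) -> str:
    """Return 'continue' at or below the threshold, else 'compact_or_restart'."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    usage = used_tokens / window_tokens
    return "compact_or_restart" if usage > threshold else "continue"
```

For example, 100,000 tokens used in a 200,000-token window is 50% and returns `"continue"`, while 130,000 tokens is 65% and returns `"compact_or_restart"`.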
Wrapping up
Human expertise hasn't become less valuable. The role shifts from typing code toward scaffolding, steering, and the systematic curation of context.
The full write-up, the talk recording, and the workflows referenced here: