The rise of “Context Engineering”

You may have noticed the term “Context Engineering” becoming more prevalent, replacing the formerly common “Prompt Engineering”.

To understand why, it helps to take a step back and look at the role of context in Large Language Models.

Since LLMs are inherently stateless, they rely on context windows for state; the context window effectively becomes the model’s working memory. The absolute size of these windows continues to grow, but it will likely remain finite for the foreseeable future.

So there are cost, latency, and performance incentives to optimize the use of this scarce resource.

"Prompt Engineering" was coined when it was typical to have a one-off LLM inference run with a single prompt for context.

But now, with the advent of agents that run for longer periods and generate ongoing, “on-the-fly” context, we need more sophisticated techniques to manage that context over the agent’s lifespan to ensure the successful completion of its objectives.

Context Engineering is the act of ensuring the contents of the context window at any given time are optimized for the specific goals of the agent’s trajectory.

Drew Breunig posted an astute analysis of the different ways context can break down as it grows:

https://www.dbreunig.com/2025/06/22/how-contexts-fail-and-how-to-fix-them.html

He identified 4 different categories of “Context Rot” (another new, yet apt term!):

1. Context Poisoning: When a hallucination or other incorrect data compromises the integrity of the context

2. Context Distraction: When the context overwhelms the knowledge embedded in the model training

3. Context Confusion: When superfluous context dilutes the analysis

4. Context Clash: When parts of the context contradict each other

These categories of problems are quite novel in a world where we’ve become accustomed to virtually unlimited resources, whether it be storage or memory. In some ways, it harkens back to the early days of computing, which were severely resource-constrained, as well as to the qualities of the human brain, which, while extraordinary, has its own foibles.

We’ve developed techniques to get the most out of our brains, whether related to memory or focus. We can apply some of these same psychological concepts to Context Engineering architecture.

In our agentic workflows, we’re using some of the following techniques to address these challenges.

Context Storage and Retrieval

Taking the key details and concepts from our context and temporarily moving them to another location, like a simple scratchpad or a semantic datastore. Think of it like a human taking notes in a meeting, easing the cognitive load of having to keep everything in their head. We can always pull this information back into the context window when we need it.
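As a rough illustration, here’s a minimal sketch of a scratchpad. The class, keys, and notes below are hypothetical, not any particular framework’s API; a real system might back this with a file or a vector store.

```python
# A minimal scratchpad sketch: notes live outside the context window,
# keyed by topic, and can be pulled back in when needed.

class Scratchpad:
    """Holds notes outside the context window, keyed by topic."""

    def __init__(self):
        self._notes: dict[str, str] = {}

    def store(self, key: str, note: str) -> None:
        # Move a detail out of the context window into external storage.
        self._notes[key] = note

    def retrieve(self, key: str) -> str | None:
        # Pull a note back into the context window when it's needed again.
        return self._notes.get(key)


# Usage: the agent jots down a finding, frees the window, recalls it later.
pad = Scratchpad()
pad.store("db_schema", "users(id, email, created_at); orders(id, user_id, total)")
# ... many turns later, only this note re-enters the prompt:
prompt_fragment = pad.retrieve("db_schema")
```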

Compressing Context

Compressing the knowledge in our context window so that it’s represented more efficiently, letting us fit more of it without running out of space. We have similar mental tricks for making complicated facts easier to remember, like mnemonic devices.
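To make the idea concrete, here’s one sketch of a compaction pass, under the assumption that you supply your own `summarize` and `count_tokens` functions (both are stand-ins, not a real library’s API): once the history exceeds a token budget, older messages are collapsed into a single summary while recent turns stay verbatim.

```python
from typing import Callable

def compact_history(messages: list[dict], token_budget: int,
                    summarize: Callable[[str], str],
                    count_tokens: Callable[[str], int]) -> list[dict]:
    """Replace the oldest messages with a single summary once the
    history exceeds the token budget; recent messages stay verbatim."""
    total = sum(count_tokens(m["content"]) for m in messages)
    if total <= token_budget:
        return messages  # Still fits; nothing to compress.

    # Keep the most recent messages intact, up to half the budget.
    keep: list[dict] = []
    spent = 0
    for m in reversed(messages):
        spent += count_tokens(m["content"])
        if spent > token_budget // 2:
            break
        keep.append(m)
    keep.reverse()

    # Everything older gets collapsed into one summary message.
    older = messages[: len(messages) - len(keep)]
    summary = summarize("\n".join(m["content"] for m in older))
    return [{"role": "system",
             "content": f"Summary of earlier turns: {summary}"}] + keep
```

The half-budget split is an arbitrary heuristic here; the point is simply that compression trades verbatim detail for headroom.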

Isolating Context

In the same way humans struggle to juggle many ideas or tasks at once, streamlining our context helps the AI model stay focused on the task at hand, without getting distracted by knowledge that isn’t relevant.
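A rough sketch of what isolation can look like in code (every name here, including `run_llm` and `is_relevant`, is a placeholder): a subtask receives only the messages judged relevant to it, rather than inheriting the parent agent’s full history.

```python
# Context isolation sketch: run a subtask on a filtered view of the
# history, so unrelated knowledge never enters its context window.

def run_subtask(task: str, full_history: list[dict],
                is_relevant, run_llm) -> str:
    # Filter the parent's history down to what this task actually needs.
    focused = [m for m in full_history if is_relevant(m, task)]
    isolated_context = focused + [{"role": "user", "content": task}]
    # The sub-agent never sees the unrelated messages, so it can't be
    # distracted or confused by them (Breunig's categories 2 and 3).
    return run_llm(isolated_context)
```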

As we move towards more complex, multi-agent architectures, with each agent updating its own context via knowledge from MCP or tool calls, this Context Engineering becomes even more intricate and consequential.

It’s fascinating how the current challenges of modern AI engineering overlap so closely with characteristics of the human brain, and can benefit from the application of well established psychological theories and techniques.
