An AI agent is a large language model put in a loop. It plans by breaking a goal into steps, acts by calling tools to read data, run code, send messages, or query a system, observes the real result that comes back, and repeats until the goal is met. Three things make this work: the model is the reasoning engine that decides, tools are its hands that touch the outside world, and memory is what makes it stateful, because an LLM on its own keeps nothing between calls and loses earlier context once its window fills. That is the whole idea. Everything else is engineering around those three parts.

This article maps that three-part anatomy the way the people building agents actually describe it. Anthropic, AWS, Google, and IBM use slightly different words, but they converge on the same model. By the end you will have the canonical mental model, in one read, without the jargon. If you would rather we do this for you, see how we run generative AI architecture, but everything here is yours to use on your own.

What is an AI agent, in one sentence?

AWS gives the plain definition: an AI agent is software that can interact with its environment, collect data, and use that data to perform self-directed tasks that meet predetermined goals. The key word is self-directed. Humans set the goal, but the agent independently chooses the actions it takes to get there.

That distinction is what separates an agent from a chatbot and from old-fashioned automation. A chatbot answers and stops. Traditional automation follows a fixed script you wrote in advance and breaks the moment reality does not match the script. An agent reads context, makes a judgment call, takes an action, sees the result, and adjusts. It can handle the messy, multi-step work that used to need a person.

What is the plan, act, observe loop?

The loop is the heartbeat of every agent. Strip away the branding and all four major sources describe the same cycle:

  1. Plan. The model breaks the goal into smaller, ordered steps. AWS calls this the planning module, which sequences the steps logically before work begins.
  2. Act. The agent calls a tool to do something real: fetch a record, run code, send an email, query a database.
  3. Observe. The result comes back from the environment, and the agent reads it. Anthropic stresses that gaining this ground truth from the environment at each step is what keeps the agent honest, instead of confidently making things up.
  4. Repeat. With the new information, the agent re-plans and acts again, looping until it reaches the goal or a stopping point.

AWS frames the same loop as four stages: determine goals, acquire information, implement tasks, and evaluate progress against objectives. Google calls the part of the system that runs this loop the orchestration layer, and says it continues until the agent has reached its goal or a stopping point. Different labels, identical machinery.
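
The cycle above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: `scripted_model` is a hypothetical stand-in for a real LLM call, and `lookup_order` and its order data are made up for the example.

```python
from typing import Callable

# Hypothetical tool: a plain function the agent is allowed to call.
def lookup_order(order_id: str) -> str:
    orders = {"A17": "shipped"}  # made-up data for the sketch
    return orders.get(order_id, "not found")

TOOLS: dict[str, Callable[[str], str]] = {"lookup_order": lookup_order}

def scripted_model(goal: str, observations: list[str]) -> dict:
    """Stand-in for the LLM: pick the next step from what has been observed."""
    if not observations:
        # Nothing observed yet, so the first step is to look the order up.
        return {"action": "lookup_order", "input": "A17"}
    # The latest observation answers the goal, so finish.
    return {"action": "finish", "input": f"Order A17 is {observations[-1]}."}

def run_agent(goal: str, model=scripted_model, max_steps: int = 5) -> str:
    observations: list[str] = []
    for _ in range(max_steps):                         # Repeat, with a hard stopping point
        step = model(goal, observations)               # Plan
        if step["action"] == "finish":
            return step["input"]
        result = TOOLS[step["action"]](step["input"])  # Act
        observations.append(result)                    # Observe
    return "stopped: step limit reached"

print(run_agent("Where is order A17?"))
```

The `max_steps` limit is the "stopping point" in the description: without it, a model that never decides to finish would loop forever.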

One nuance worth knowing. IBM distinguishes planning agents, which anticipate future states and generate a full action plan before they execute, from reactive agents, which respond one step at a time. Most useful agents blend both: they sketch a plan, then adapt it as the observe step feeds back reality.

What does the model do? (the brain)

The model is the reasoning engine. Google calls it the agent's brain and the central decision maker. It is the part that interprets the goal, decides which tool to use, and judges whether the last result moved things forward.

How does it actually reason? A few established techniques show up across the field:

  • Chain-of-Thought. The model decomposes a problem into intermediate logical steps instead of jumping to an answer. IBM notes that agents adjust their strategies using this kind of step-by-step reasoning.
  • ReAct. The model alternates between verbal reasoning and task-specific actions, thinking and then doing in a tight cycle. This is the pattern most agent loops are built on.
  • Tree-of-Thoughts. The model explores a branching tree of reasoning paths rather than a single line, useful when a problem has many possible approaches.
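
To make the ReAct pattern concrete, here is a small sketch of the think-then-do cycle. The `Thought:` / `Action:` / `Final:` line format, the `multiply` tool, and the scripted replies are all assumptions chosen for illustration; a real agent would get each reply from the model and feed the growing transcript back to it.

```python
import re

# Hypothetical scripted replies standing in for a real model's text output.
SCRIPTED_REPLIES = [
    "Thought: I need the product of the two numbers.\nAction: multiply[6, 7]",
    "Thought: The tool returned 42, which answers the question.\nFinal: 42",
]

ACTION_RE = re.compile(r"^Action:\s*(\w+)\[(.*)\]\s*$", re.MULTILINE)
FINAL_RE = re.compile(r"^Final:\s*(.+)$", re.MULTILINE)

def multiply(args: str) -> str:
    a, b = (int(x) for x in args.split(","))
    return str(a * b)

TOOLS = {"multiply": multiply}

def react(question: str, replies=SCRIPTED_REPLIES) -> tuple[str, list[str]]:
    """Alternate reasoning text and tool calls until a final answer appears."""
    transcript = [f"Question: {question}"]
    for reply in replies:  # each reply is one reasoning turn
        transcript.append(reply)
        final = FINAL_RE.search(reply)
        if final:
            return final.group(1).strip(), transcript
        action = ACTION_RE.search(reply)
        if action is None:
            transcript.append("Observation: no action found")
            continue
        name, args = action.groups()
        # The tool result is written back into the transcript as an observation.
        transcript.append(f"Observation: {TOOLS[name](args)}")
    return "no answer", transcript

answer, transcript = react("What is 6 times 7?")
print(answer)
```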

The important thing for a buyer to understand: the model does not contain a special "agent" mode. It is the same kind of LLM you have used in a chat window. What turns it into an agent is wrapping it in the loop, the tools, and the memory described here.

What are tools? (the hands and eyes)

A model on its own can only produce text. Tools are how it touches the world. Google calls them the agent's hands and eyes, and the metaphor is exact: tools are how the agent both senses (reads data) and acts (changes something).

Anthropic frames the basic building block as an LLM enhanced with augmentations such as retrieval, tools, and memory. AWS lists the everyday examples: tools let an agent retrieve data, send emails, run code, query databases, or even control hardware. Google groups them into three types worth knowing:

  • Extensions. A standardized bridge to an external API, so the agent can scale to many systems through a common interface.
  • Functions. Specific capabilities the agent can call, like a single operation in your own software.
  • Data Stores. Vector databases and retrieval (RAG) that give the agent up-to-date, grounded information instead of relying only on what the model memorized during training.
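
Two of these types are easy to sketch. The example below is an illustration under stated assumptions, not Google's API: `Tool`, `call_tool`, and the two policy snippets are hypothetical, and the data store uses naive word overlap where a real one would use a vector database. Extensions are left out because they would need a real external API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str           # what the model reads when choosing a tool
    run: Callable[[str], str]

# Data store stand-in: two made-up policy snippets.
DOCS = {
    "refunds": "Refunds are issued within 14 days of purchase.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
}

def search_docs(query: str) -> str:
    """Return the snippet that shares the most words with the query."""
    query_words = set(query.lower().split())

    def overlap(text: str) -> int:
        return len(query_words & set(text.lower().split()))

    return max(DOCS.values(), key=overlap)

# Function: one specific operation from your own software.
def word_count(text: str) -> str:
    return str(len(text.split()))

REGISTRY = {
    tool.name: tool
    for tool in [
        Tool("search_docs", "Find the policy snippet most relevant to a question.", search_docs),
        Tool("word_count", "Count the words in a piece of text.", word_count),
    ]
}

def call_tool(name: str, argument: str) -> str:
    if name not in REGISTRY:
        # Return the problem as text so the agent can read it and recover.
        return f"error: unknown tool '{name}'"
    return REGISTRY[name].run(argument)

print(call_tool("search_docs", "how long does shipping take"))
```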

Here is where the theory meets reality. Anthropic's three core design principles for agents are: keep it simple, keep it transparent (show the planning steps), and carefully craft the agent-computer interface (the ACI). That last one is the quiet make-or-break. A vague, badly documented tool produces a confused, error-prone agent. A clear tool definition produces a reliable one. Wiring tools well is engineering work, not a checkbox.
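
One way to picture a carefully crafted tool interface is a definition that states its units and rejects bad calls with a message the agent can act on. This is a sketch only: `ToolSpec`, `issue_refund`, and its parameters are hypothetical names, not part of Anthropic's guidance.

```python
from dataclasses import dataclass

@dataclass
class ToolSpec:
    name: str
    description: str
    params: dict[str, type]  # parameter name -> expected Python type

    def validate(self, args: dict) -> list[str]:
        """Return a list of human-readable problems; empty means the call is valid."""
        problems = []
        for key, expected in self.params.items():
            if key not in args:
                problems.append(f"missing '{key}' ({expected.__name__})")
            elif not isinstance(args[key], expected):
                got = type(args[key]).__name__
                problems.append(f"'{key}' must be {expected.__name__}, got {got}")
        for key in args:
            if key not in self.params:
                problems.append(f"unknown parameter '{key}'")
        return problems

# A clear definition: says what the tool does and spells out the unit.
REFUND = ToolSpec(
    name="issue_refund",
    description="Refund one order. amount_cents is an integer number of cents, e.g. 1250 for $12.50.",
    params={"order_id": str, "amount_cents": int},
)

def call(spec: ToolSpec, args: dict) -> str:
    problems = spec.validate(args)
    if problems:
        # The error goes back to the agent as an observation it can correct.
        return f"{spec.name} rejected the call: " + "; ".join(problems)
    return f"{spec.name} ok"

print(call(REFUND, {"order_id": "A17", "amount_cents": "12.50"}))
```

A vague version of the same tool (a parameter called `amount` with no unit and no validation) would accept the string `"12.50"` silently, which is exactly the kind of error the agent cannot see or fix.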

What is agent memory, and why does it matter so much?

This is the part most explainers undersell. IBM states the problem bluntly: LLMs are stateless and do not inherently remember things. Every turn starts from a blank slate. Memory is the layer that lets an agent learn from past interactions, retain information, and maintain context. Without it, your agent forgets the customer's name, the plan it made, and the result of the tool it just called as soon as they fall out of the context window.

Memory comes in two tiers, and the long-term tier has three flavors. AWS and IBM line up on this:

| Memory type | What it holds | Everyday example |
| --- | --- | --- |
| Short-term | The live context window, the current conversation | The chat history in the task you are running now |
| Long-term: episodic | Specific past events | What happened in a customer's previous ticket |
| Long-term: semantic | Structured facts, definitions, and rules | Your product catalog, your policies |
| Long-term: procedural | Learned skills and behaviors | How to run your refund process, step by step |
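
The two tiers and three long-term flavors can be pictured as one small data structure. This is a minimal sketch with hypothetical names (`AgentMemory`, `build_context`) and made-up sample data; real systems back each store with a database rather than in-process lists and dicts.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    # Short-term: a bounded window, so the oldest turns fall out first.
    short_term: deque = field(default_factory=lambda: deque(maxlen=4))
    # Long-term: episodic events, semantic facts, procedural how-tos.
    episodic: list[dict] = field(default_factory=list)
    semantic: dict[str, str] = field(default_factory=dict)
    procedural: dict[str, list[str]] = field(default_factory=dict)

    def remember_turn(self, text: str) -> None:
        self.short_term.append(text)

    def build_context(self, customer: str, task: str) -> str:
        """Assemble what the model should see for this customer and task."""
        past = [e["summary"] for e in self.episodic if e["customer"] == customer]
        return "\n".join([
            "Facts: " + "; ".join(self.semantic.values()),
            "Past events: " + ("; ".join(past) or "none"),
            "Procedure: " + " -> ".join(self.procedural.get(task, [])),
            "Recent turns: " + " | ".join(self.short_term),
        ])

memory = AgentMemory()
memory.semantic["refund_window"] = "Refunds are allowed within 14 days"
memory.episodic.append({"customer": "dana", "summary": "Ticket about a late delivery"})
memory.procedural["refund"] = ["verify order", "check refund window", "issue refund"]
memory.remember_turn("Customer asks for a refund")
print(memory.build_context("dana", "refund"))
```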

Google places memory, state, reasoning, and planning together in the orchestration layer, the part it calls the agent's nervous system. That is the right mental picture: memory is not a bolt-on, it is woven through the loop.

And memory is where real-world reliability and cost actually live. Anthropic published numbers from its memory tool, a file-based system where the model can create, read, update, and delete files in a dedicated memory directory that persists across conversations and sits outside the context window. Paired with context editing, which automatically clears stale tool calls as the model nears its token limit, the results were not subtle:

  • Memory tool plus context editing improved agentic-search performance by 39% over baseline on Anthropic's internal multi-step evaluation.
  • Context editing alone improved performance by 29% on the same evaluation.
  • In a 100-turn web-search test, context editing cut token consumption by 84% and let agents finish workflows that would otherwise have failed from context exhaustion.
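
To see why these two ideas help, here is a toy version of each. It is not Anthropic's memory tool or its context-editing feature: `FileMemory`, `clear_stale_tool_results`, the four-characters-per-token estimate, and the message format are all assumptions made for the sketch.

```python
import tempfile
from pathlib import Path
from typing import Optional

class FileMemory:
    """Toy file-backed memory: notes live on disk, outside the context window."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def write(self, name: str, text: str) -> None:  # create or update
        (self.root / name).write_text(text, encoding="utf-8")

    def read(self, name: str) -> Optional[str]:
        path = self.root / name
        return path.read_text(encoding="utf-8") if path.exists() else None

    def delete(self, name: str) -> None:
        (self.root / name).unlink(missing_ok=True)

def rough_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude assumption: about 4 characters per token

def clear_stale_tool_results(messages: list[dict], budget: int, keep_last: int = 1) -> list[dict]:
    """Blank out the oldest tool results until the transcript fits the budget."""
    edited = [dict(m) for m in messages]  # copy so the caller's list is untouched
    tool_idx = [i for i, m in enumerate(edited) if m["role"] == "tool"]
    clearable = tool_idx[:-keep_last] if keep_last else tool_idx
    for i in clearable:
        if sum(rough_tokens(m["content"]) for m in edited) <= budget:
            break
        edited[i]["content"] = "[cleared]"
    return edited

with tempfile.TemporaryDirectory() as tmp:
    memory = FileMemory(Path(tmp) / "memories")
    memory.write("plan.md", "1. search  2. compare  3. summarize")
    print(memory.read("plan.md"))
```

The design point is the pairing: anything the agent must keep goes into the file store before old tool output is cleared from the transcript.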

Read those numbers again. The difference between an agent that works and one that runs out of context mid-task is largely a memory and context-management decision. That is why we treat memory as the hero of the system, not the third box on a diagram.

Where do all four sources agree?

For a non-technical buyer, the reassuring part is the consensus. Strip away each company's vocabulary and the anatomy is identical:

| The part | Anthropic | AWS | Google | IBM |
| --- | --- | --- | --- | --- |
| Reasoning | The LLM that directs its own process | Foundation model as reasoning engine | The model, the brain and decision maker | Agentic reasoning, decision-making |
| Doing | Tools and retrieval augmentations | Tools (APIs, code, databases) | Tools, the hands and eyes | Action module |
| Remembering | Memory plus context management | Short-term and long-term memory | Memory in the orchestration layer | Stateless LLM made stateful by memory |
| The loop | Ground truth from the environment each step | Determine, acquire, implement, evaluate | Orchestration layer loops to the goal | Plan, then act, then adapt |

Same machine, four dialects. A model that plans and reasons, tools that act, memory that persists, all running in a loop that checks reality at every step.

So why is building an agent still hard?

If the anatomy is this clear, why do so many agent projects stall? Because every source describes the loop as if it runs itself, and none of them says who keeps it running.

In practice, the hard part is not the loop. It is the engineering around it. Anthropic's own guidance is to start with the simplest thing that works (often a fixed workflow, not a fully autonomous agent), define every tool with care, give the agent honest feedback from the environment at each step, and manage context so it does not run out of memory in the middle of a task. Each of those is ongoing work: designing the tool interface, structuring the memory, choosing a context-management strategy, and building the evaluation loop that tells you whether the agent is getting better or quietly drifting.
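
The evaluation loop mentioned above can start very small. The sketch below is one possible shape, with hypothetical test cases, a made-up 5-point tolerance, and two dictionary lookups standing in for two versions of an agent.

```python
from typing import Callable

# Hypothetical test cases: (question, expected answer).
CASES = [("2+2", "4"), ("3*3", "9"), ("10-7", "3")]

def pass_rate(agent: Callable[[str], str], cases=CASES) -> float:
    """Fraction of cases where the agent's answer matches the expected one."""
    passed = sum(1 for question, expected in cases if agent(question) == expected)
    return passed / len(cases)

def check_for_drift(current: float, baseline: float, tolerance: float = 0.05) -> str:
    """Compare this run against the last accepted score."""
    if current < baseline - tolerance:
        return "regressed"
    if current > baseline + tolerance:
        return "improved"
    return "stable"

# Stand-ins for two versions of an agent; v2 gets one case wrong.
agent_v1 = {"2+2": "4", "3*3": "9", "10-7": "3"}.get
agent_v2 = {"2+2": "4", "3*3": "9", "10-7": "4"}.get

baseline = pass_rate(agent_v1)
print(check_for_drift(pass_rate(agent_v2), baseline))
```

Running the same cases after every prompt, tool, or model change is what turns "quietly drifting" into a number you can see.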

That is exactly the gap most companies cannot staff. You now understand the mechanism. Turning the mechanism into something that runs reliably inside your business, day after day, is a different job. It is the one we do: we plan, build, and run the agents (the tools, the memory, the context strategy, and the evals) inside your company, so you get an operating system instead of a science project.

If you want the canonical mental model turned into a working agent, book a free consultation below and we will map your first one together.