How to Build an AI Agent From Scratch

Build a tool-using AI agent from first principles — the agent loop, function calling, memory, and the guardrails that make it safe enough to ship.

Frameworks hide the mechanics of agents. Building one yourself — even a small one — is the fastest way to actually understand agentic AI. Here's the whole thing from first principles, no framework required.

The agent loop

An agent is a loop around an LLM. At each step the model reads the goal and the history and decides on an action; your code runs that action and feeds the result back:

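MAX_STEPS = 10  # hard cap so the loop always terminates

def run(goal: str, tools: dict) -> str:
    # llm, system_prompt, and observation are placeholders for your own
    # model client and prompt-building helpers.
    history = [system_prompt(goal, tools)]
    for _ in range(MAX_STEPS):
        step = llm.complete(history)          # model decides
        if step.type == "final":
            return step.answer
        if step.tool not in tools:            # model named a tool that doesn't exist
            result = f"error: unknown tool {step.tool!r}"
        else:
            result = tools[step.tool](**step.args)  # you act
        history.append(observation(step, result))   # feed the result back
    return "stopped: step limit reached"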

That's it. Everything else — memory, planning, guardrails — hangs off this loop.
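To watch the loop run end to end without calling a model API, here is a minimal sketch that swaps the LLM for a scripted stand-in. The `Step` and `ScriptedLLM` classes and the `add` tool are hypothetical names invented for illustration, and `run` takes the model as an explicit third argument so the example is self-contained.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

MAX_STEPS = 5  # hard cap so the loop always terminates


@dataclass
class Step:
    """One model decision: either a tool call or a final answer."""
    type: str                                   # "tool" or "final"
    tool: str = ""
    args: dict[str, Any] = field(default_factory=dict)
    answer: str = ""


class ScriptedLLM:
    """Hypothetical stand-in for a model client: replays canned steps."""

    def __init__(self, steps: list[Step]) -> None:
        self._steps = iter(steps)

    def complete(self, history: list[str]) -> Step:
        return next(self._steps)


def run(goal: str, tools: dict[str, Callable[..., Any]], llm: ScriptedLLM) -> str:
    history = [f"goal: {goal}; tools: {sorted(tools)}"]
    for _ in range(MAX_STEPS):
        step = llm.complete(history)             # model decides
        if step.type == "final":
            return step.answer
        result = tools[step.tool](**step.args)   # you act
        history.append(f"{step.tool}({step.args}) -> {result}")
    return "stopped: step limit reached"


def add(a: int, b: int) -> int:
    return a + b


script = [Step("tool", tool="add", args={"a": 2, "b": 3}), Step("final", answer="5")]
answer = run("add 2 and 3", {"add": add}, ScriptedLLM(script))
print(answer)
```

A scripted model like this also makes the loop easy to unit-test: feed it a script that never produces a final step, and `run` should return the step-limit message.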

Tools: how the agent acts

Tools are just functions the model can call with structured arguments — this is function calling. Keep each tool small, typed, and least-privilege:

def search_docs(query: str) -> list[str]: ...
def run_sql(query: str) -> list[dict]: ...   # read-only role!

Describe them to the model with names, argument schemas, and one-line docs. The model returns which tool to call and with what arguments; your code runs it. Never let the model execute arbitrary code or hold write access it doesn't need.
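One way to produce those descriptions is to derive them from the function itself, so the schema cannot drift from the code. The sketch below uses Python's standard `inspect` module; the `describe_tool` helper, the `limit` parameter, and the output record shape are assumptions made for illustration, and the exact format a given model API expects will differ.

```python
import inspect
from typing import Callable


def search_docs(query: str, limit: int = 5) -> list[str]:
    """Search the documentation and return matching passages."""
    return []  # stub: a real tool would query an index


def describe_tool(fn: Callable) -> dict:
    """Build a name / description / parameters record from a function."""
    params = {}
    required = []
    for name, p in inspect.signature(fn).parameters.items():
        ann = p.annotation
        if ann is inspect.Parameter.empty:
            params[name] = "any"                 # no type hint given
        elif isinstance(ann, type):
            params[name] = ann.__name__          # plain classes such as str, int
        else:
            params[name] = str(ann)              # generics such as list[str]
        if p.default is inspect.Parameter.empty:
            required.append(name)                # no default means required
    doc = inspect.getdoc(fn) or ""
    return {
        "name": fn.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "parameters": params,
        "required": required,
    }


spec = describe_tool(search_docs)
print(spec)
```

Keeping the one-line doc in the docstring means the description the model sees is the same one a developer reads.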

Memory

Two kinds, and you usually want both:

Working memory — the running history of steps in the current task.
Long-term memory — facts stored in a database or vector store and retrieved when relevant (this is RAG used as a tool).
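As a rough illustration of long-term memory exposed as a tool, the sketch below stores facts in a list and retrieves them by word overlap. The overlap scoring is a deliberately crude stand-in for a vector store, and `MemoryStore`, `remember`, and `recall` are hypothetical names, not any library's API.

```python
class MemoryStore:
    """Toy long-term memory: word-overlap scoring stands in for vector search."""

    def __init__(self) -> None:
        self._facts: list[str] = []

    def remember(self, fact: str) -> None:
        self._facts.append(fact)

    def recall(self, query: str, k: int = 2) -> list[str]:
        words = set(query.lower().split())
        # Score each fact by how many query words it shares.
        scored = [(len(words & set(f.lower().split())), f) for f in self._facts]
        # Drop facts with no overlap, then keep the k best matches.
        ranked = sorted((s for s in scored if s[0] > 0),
                        key=lambda s: s[0], reverse=True)
        return [fact for _, fact in ranked[:k]]


store = MemoryStore()
store.remember("The billing service runs on Postgres 15")
store.remember("Deploys happen every Tuesday")
store.remember("The billing team owns the invoices table")

# Registered like any other tool, so the model decides when to look things up.
tools = {"recall": store.recall}
hits = tools["recall"](query="which database does billing use")
print(hits)
```

Because retrieval is just another tool, the agent loop needs no changes to support it.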

Summarize working memory as it grows so you don't blow the context window.
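A minimal sketch of that compaction step, assuming history is a list of strings and that length is measured in entries rather than tokens (a real agent would count tokens). The `compact` and `naive_summary` helpers are hypothetical; in practice the summary would come from a model call.

```python
from typing import Callable


def compact(history: list[str], summarize: Callable[[list[str]], str],
            max_items: int = 6, keep_recent: int = 3) -> list[str]:
    """Fold older steps into one summary entry once history grows too long."""
    if len(history) <= max_items:
        return history
    head = history[0]                       # system prompt stays verbatim
    old = history[1:-keep_recent]           # middle steps get summarized
    recent = history[-keep_recent:]         # latest steps stay verbatim
    return [head, "summary: " + summarize(old), *recent]


def naive_summary(entries: list[str]) -> str:
    # Placeholder: a real agent would ask the model to write this summary.
    return f"{len(entries)} earlier steps omitted"


history = ["system"] + [f"step {i}" for i in range(1, 9)]
compacted = compact(history, naive_summary)
print(compacted)
```

Keeping the system prompt and the most recent steps verbatim preserves the goal and the immediate context, which are usually what the next decision depends on most.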

Planning and control flow

Simple agents decide the next action one step at a time. For harder tasks, prompt the model to draft a short plan first, then execute it step by step. Always cap the loop (MAX_STEPS) and define a clear stop condition — runaway agents are a cost and safety problem.
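The plan-first pattern can be sketched as two calls: one that drafts the plan and one that executes each step, with the cap applied to the plan length. The `plan_then_execute`, `fake_planner`, and `fake_executor` names are assumptions for illustration; in a real agent both callables would be backed by the model and the agent loop.

```python
from typing import Callable

MAX_STEPS = 4  # hard cap on how many plan steps may run


def plan_then_execute(goal: str,
                      make_plan: Callable[[str], list[str]],
                      do_step: Callable[[str], str]) -> list[str]:
    """Draft a plan once, then execute its steps under a hard cap."""
    plan = make_plan(goal)
    results = []
    for step in plan[:MAX_STEPS]:           # never run more than the cap
        results.append(do_step(step))
    if len(plan) > MAX_STEPS:               # explicit stop condition
        results.append("stopped: step limit reached")
    return results


def fake_planner(goal: str) -> list[str]:
    # Placeholder for a model call that returns a short plan.
    return ["find the table", "count the rows", "report the count"]


def fake_executor(step: str) -> str:
    # Placeholder for running one step through the agent loop.
    return f"done: {step}"


results = plan_then_execute("how many invoices exist?", fake_planner, fake_executor)
print(results)
```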

Guardrails: what makes it shippable

A demo stops at the loop. A production agent adds:

Input isolation — treat retrieved/tool content as untrusted; defend against prompt injection.
Least-privilege tools — read-only by default; approvals for risky actions.
Output validation — enforce schemas on what the agent returns.
Human-in-the-loop — a checkpoint before anything irreversible.
Evals — a test set so you can prove it behaves.
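Three of these guardrails fit in a few lines each. The sketch below is illustrative only: `RISKY_TOOLS`, `wrap_untrusted`, `call_tool`, and `validate_answer` are hypothetical names, the answer schema (`answer` plus `sources`) is an assumption, and wrapping tool output in delimiters reduces but does not eliminate prompt-injection risk.

```python
import json
from typing import Any, Callable

RISKY_TOOLS = {"delete_record"}  # hypothetical tool names that need sign-off


def wrap_untrusted(text: str) -> str:
    """Label tool output as data so the prompt can tell the model not to obey it."""
    return f"<untrusted_data>\n{text}\n</untrusted_data>"


def call_tool(name: str, args: dict[str, Any],
              tools: dict[str, Callable[..., Any]],
              approve: Callable[[str, dict[str, Any]], bool]) -> str:
    """Run a tool, pausing for human approval when it is marked risky."""
    if name in RISKY_TOOLS and not approve(name, args):
        return "error: action rejected by reviewer"
    return wrap_untrusted(str(tools[name](**args)))


def validate_answer(raw: str) -> dict[str, Any]:
    """Accept only a JSON object with a string 'answer' and a list 'sources'."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("answer must be a JSON object")
    if not isinstance(data.get("answer"), str):
        raise ValueError("'answer' must be a string")
    if not isinstance(data.get("sources"), list):
        raise ValueError("'sources' must be a list")
    return data


tools = {"delete_record": lambda record_id: f"deleted {record_id}"}
rejected = call_tool("delete_record", {"record_id": 7}, tools,
                     approve=lambda name, args: False)
print(rejected)
```

In a real deployment the `approve` callback would block on a human decision (a ticket, a chat prompt, a UI button) rather than return immediately.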

Next steps

You now have the mental model. Go deeper with "What is agentic AI?", and see whether you even need a framework in "LangChain vs LlamaIndex". When you're ready to go end-to-end, follow the Agentic AI learning path.