Frameworks hide the mechanics of agents. Building one yourself — even a small one — is the fastest way to actually understand agentic AI. Here's the whole thing from first principles, no framework required.
The agent loop
An agent is a loop around an LLM. At each step the model looks at the goal and the history, decides on an action, you run it, and you feed the result back:
def run(goal: str, tools: dict) -> str:
    history = [system_prompt(goal, tools)]
    for _ in range(MAX_STEPS):
        step = llm.complete(history)            # model decides
        if step.type == "final":
            return step.answer
        result = tools[step.tool](**step.args)  # you act
        history.append(observation(step, result))
    return "stopped: step limit reached"
That's it. Everything else — memory, planning, guardrails — hangs off this loop.
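To try the loop without a model provider, here is a minimal runnable sketch. `Step`, `ScriptedLLM`, and the `add` tool are hypothetical stand-ins invented for this example: the scripted "model" simply replays fixed decisions, and `llm` is passed in as a parameter rather than read from a global. The sketch also assumes that an unknown tool name should be reported back to the model instead of crashing the loop.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

MAX_STEPS = 5  # assumed cap; tune per task


@dataclass
class Step:
    """One model decision: either a tool call or a final answer."""
    type: str                                   # "tool" or "final"
    tool: str = ""
    args: dict[str, Any] = field(default_factory=dict)
    answer: str = ""


class ScriptedLLM:
    """Hypothetical stand-in for a real model client: replays fixed steps."""

    def __init__(self, steps: list[Step]) -> None:
        self._steps = iter(steps)

    def complete(self, history: list[str]) -> Step:
        return next(self._steps)


def run(goal: str, tools: dict[str, Callable[..., Any]], llm: ScriptedLLM) -> str:
    history = [f"Goal: {goal}. Tools: {', '.join(tools)}"]
    for _ in range(MAX_STEPS):
        step = llm.complete(history)            # model decides
        if step.type == "final":
            return step.answer
        if step.tool not in tools:              # unknown tool: report it, keep looping
            history.append(f"error: no tool named {step.tool!r}")
            continue
        result = tools[step.tool](**step.args)  # you act
        history.append(f"{step.tool}({step.args}) -> {result}")
    return "stopped: step limit reached"


def add(a: int, b: int) -> int:
    return a + b


script = [Step("tool", tool="add", args={"a": 2, "b": 3}), Step("final", answer="5")]
answer = run("add 2 and 3", {"add": add}, ScriptedLLM(script))
print(answer)  # prints 5
```

Swapping `ScriptedLLM` for a real client only requires an object whose `complete(history)` returns something shaped like `Step`; the loop itself does not change.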
Tools: how the agent acts
Tools are just functions the model can call with structured arguments — this is function calling. Keep each tool small, typed, and least-privilege:
def search_docs(query: str) -> list[str]: ...
def run_sql(query: str) -> list[dict]: ... # read-only role!
Describe them to the model with names, argument schemas, and one-line docs. The model returns which tool to call and with what arguments; your code runs it. Never let the model execute arbitrary code or hold write access it doesn't need.
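One way to produce those descriptions is to derive them from the functions themselves. The sketch below uses the standard-library `inspect` module; the `describe` and `call_tool` helpers, the output dictionary shape, and the stub body of `search_docs` are assumptions for illustration, not a provider's function-calling format.

```python
import inspect
from typing import Any, Callable


def search_docs(query: str) -> list[str]:
    """Search the documentation and return matching snippets."""
    return [f"stub result for {query!r}"]  # placeholder body for the sketch


def describe(fn: Callable[..., Any]) -> dict[str, Any]:
    """Build a tool description from a function's signature and docstring."""
    sig = inspect.signature(fn)
    params = {
        name: getattr(p.annotation, "__name__", str(p.annotation))
        for name, p in sig.parameters.items()
    }
    doc = (inspect.getdoc(fn) or "").splitlines()
    return {"name": fn.__name__, "args": params, "doc": doc[0] if doc else ""}


def call_tool(tools: dict[str, Callable[..., Any]], name: str, args: dict[str, Any]) -> Any:
    """Run a model-requested call only if the tool and argument names check out."""
    if name not in tools:
        raise ValueError(f"unknown tool: {name}")
    try:
        # bind() checks argument names only; it does not check value types
        inspect.signature(tools[name]).bind(**args)
    except TypeError as exc:
        raise ValueError(f"bad arguments for {name}: {exc}") from exc
    return tools[name](**args)


TOOLS = {"search_docs": search_docs}
spec = describe(search_docs)
print(spec["args"])  # prints {'query': 'str'}
```

Validating before dispatch means a malformed call from the model becomes an error message you can feed back as an observation, rather than an unhandled exception.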
Memory
Two kinds, and you usually want both:
- Working memory — the running history of steps in the current task.
- Long-term memory — facts stored in a database or vector store and retrieved when relevant (this is RAG used as a tool).
Summarize working memory as it grows so you don't blow the context window.
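Both kinds of memory can be sketched in a few lines. Everything here is an assumption made for illustration: the character budget stands in for a real token count, `naive_summary` stands in for a model-written summary, and `recall` ranks facts by shared words where a real system would likely use embeddings in a vector store.

```python
from typing import Callable

MAX_HISTORY_CHARS = 200  # assumed budget; a real agent would count tokens
KEEP_RECENT = 2          # most recent steps are kept verbatim


def compact(history: list[str], summarize: Callable[[list[str]], str]) -> list[str]:
    """Fold older steps into one summary entry once the history exceeds the budget."""
    too_short = len(history) <= KEEP_RECENT + 1
    if too_short or sum(len(h) for h in history) <= MAX_HISTORY_CHARS:
        return history
    head, old, recent = history[0], history[1:-KEEP_RECENT], history[-KEEP_RECENT:]
    return [head, "summary: " + summarize(old), *recent]


def naive_summary(steps: list[str]) -> str:
    """Placeholder summarizer; in practice this would be another model call."""
    return f"{len(steps)} earlier steps omitted"


def recall(store: list[str], query: str, k: int = 2) -> list[str]:
    """Toy long-term memory: rank stored facts by words shared with the query."""
    words = set(query.lower().split())
    scored = [(len(words & set(fact.lower().split())), fact) for fact in store]
    ranked = sorted(scored, key=lambda pair: -pair[0])  # stable: ties keep store order
    return [fact for score, fact in ranked if score > 0][:k]


history = ["system"] + [f"step {i}: " + "x" * 50 for i in range(6)]
print(len(compact(history, naive_summary)))  # prints 4
```

The first entry (the system prompt) and the latest steps survive compaction untouched, so the model keeps its instructions and its immediate context.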
Planning and control flow
Simple agents decide the next action one step at a time. For harder tasks, prompt the model to draft a short plan first, then execute it step by step. Always cap the loop (MAX_STEPS) and define a clear stop condition, because runaway agents are a cost and safety problem.
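A plan-then-execute variant might look like the sketch below. The planner and executor are passed in as plain callables; `fake_planner` and the `"failed"` outcome string are hypothetical, and stopping on the first failure is just one possible stop condition.

```python
from typing import Callable

MAX_STEPS = 4  # assumed cap on executed plan steps


def plan_and_execute(
    goal: str,
    make_plan: Callable[[str], list[str]],
    execute: Callable[[str], str],
) -> list[str]:
    """Draft a plan once, then run its steps in order, never exceeding MAX_STEPS."""
    plan = make_plan(goal)
    results = []
    for step in plan[:MAX_STEPS]:      # hard cap, even if the plan is longer
        outcome = execute(step)
        results.append(f"{step}: {outcome}")
        if outcome == "failed":        # stop condition: do not push on after a failure
            break
    return results


def fake_planner(goal: str) -> list[str]:
    """Hypothetical planner; a real one would prompt the model for a short plan."""
    return ["find docs", "read docs", "draft answer", "check answer", "polish", "publish"]


results = plan_and_execute("answer a question", fake_planner, lambda step: "ok")
print(len(results))  # prints 4: the six-step plan is cut off at MAX_STEPS
```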
Guardrails: what makes it shippable
A demo stops at the loop. A production agent adds:
- Input isolation — treat retrieved/tool content as untrusted; defend against prompt injection.
- Least-privilege tools — read-only by default; approvals for risky actions.
- Output validation — enforce schemas on what the agent returns.
- Human-in-the-loop — a checkpoint before anything irreversible.
- Evals — a test set so you can prove it behaves.
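Two of these guardrails, approvals for risky actions and output validation, can be sketched with the standard library alone. The tool names in `RISKY_TOOLS`, the `approve` callback, and the `answer`/`sources` schema are all assumptions made for this example.

```python
import json
from typing import Any, Callable

RISKY_TOOLS = {"send_email", "delete_record"}  # hypothetical tool names


def guarded_call(
    name: str,
    args: dict[str, Any],
    tools: dict[str, Callable[..., Any]],
    approve: Callable[[str, dict[str, Any]], bool],
) -> Any:
    """Run a tool, but ask a human (the approve callback) before risky ones."""
    if name in RISKY_TOOLS and not approve(name, args):
        return "blocked: approval denied"
    return tools[name](**args)


def validate_answer(raw: str) -> dict[str, Any]:
    """Parse the agent's final output and enforce a small assumed schema."""
    data = json.loads(raw)  # malformed JSON raises json.JSONDecodeError, a ValueError
    if not isinstance(data, dict):
        raise ValueError("answer must be a JSON object")
    if not isinstance(data.get("answer"), str):
        raise ValueError("'answer' must be a string")
    if not isinstance(data.get("sources"), list):
        raise ValueError("'sources' must be a list")
    return data


sent: list[str] = []
tools = {"send_email": lambda to: sent.append(to) or "sent"}
denied = guarded_call("send_email", {"to": "a@example.com"}, tools, lambda n, a: False)
print(denied)  # prints blocked: approval denied
```

In a real deployment the `approve` callback would pause and wait for a person; here it is a lambda so the example stays self-contained.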
Next steps
You now have the mental model. Go deeper with "What is agentic AI?", and see whether you even need a framework in "LangChain vs LlamaIndex". When you're ready to go end-to-end, follow the Agentic AI learning path.