Complete Guide To AI Agents

The document is a comprehensive guide on building AI agents from scratch, detailing a structured approach that includes defining the agent's role, designing input/output protocols, and adding memory and reasoning capabilities. It outlines various steps and tools necessary for development, such as using Pydantic AI, LangChain, and OpenAI frameworks, while also addressing real-world use cases and evaluation metrics. Additionally, it discusses the evolution of AI agents, including self-evolving models and their applications across different industries.


COMPLETE GUIDE TO AI AGENTS
Build from Scratch
Dr. Maryam Miradi


HOW TO BUILD AI AGENTS FROM SCRATCH
Dr. Maryam Miradi

Step 1 》DEFINE THE AGENT’S ROLE AND GOAL
✸ What will your agent do?
✸ Who is it helping?
✸ What kind of output will it generate?
→ Example: A medical assistant agent that reads X-rays, summarizes findings, and speaks results.

Step 2 》DESIGN STRUCTURED INPUT & OUTPUT
✸ Use Pydantic AI or JSON Schemas to define what the agent receives and returns.
✸ Avoid messy text — think like an API.
→ Tools: Pydantic AI, LangChain Output Parsers

Step 3 》TUNE BEHAVIOR & ADD PROTOCOL
✸ Start with role-based system prompts
✸ Use Prompt Tuning or Prefix Tuning
✸ Use MCP to standardize how your agents connect to tools and data
→ Tools: MCP, GPT-4, Claude, Prompt Tuning
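
A minimal sketch of Step 2's "think like an API" idea using plain Pydantic models (Pydantic AI builds on the same pattern). The medical-assistant schemas and field names below are illustrative, not from the original guide:

```python
from pydantic import BaseModel, Field

# Hypothetical schemas for the Step 1 medical-assistant example.
class XRayRequest(BaseModel):
    patient_id: str
    image_url: str
    question: str = Field(description="What the clinician wants to know")

class XRayReport(BaseModel):
    findings: list[str]
    summary: str
    confidence: float = Field(ge=0.0, le=1.0)

def parse_agent_reply(raw_json: str) -> XRayReport:
    # Validation fails loudly on messy output instead of letting
    # malformed text flow downstream -- the agent behaves like an API.
    return XRayReport.model_validate_json(raw_json)

# The JSON Schema can go straight into the system prompt so the
# model knows exactly what shape it must return.
print(XRayReport.model_json_schema())
```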

Step 4 》ADD REASONING AND TOOL USE LOGIC
✸ Equip the agent with reasoning frameworks:
☆ ReAct (Reasoning + Action)
☆ Chain-of-Thought
✸ Allow access to tools like web search, code interpreters, or document retrievers.
→ Tools: LangChain, OpenAI Tools, ReAct Framework
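
A framework-free sketch of the ReAct loop from Step 4. The `call_llm` function and the single `web_search` tool are stand-ins; a real build would use LangChain or OpenAI tool-calling rather than hand-parsing the reply:

```python
import json

def web_search(query: str) -> str:
    # Stand-in tool; a real agent would call an actual search API here.
    return f"(top results for: {query})"

TOOLS = {"web_search": web_search}

SYSTEM = (
    "Alternate Thought and Action steps. Emit actions as a line "
    'Action: {"tool": "web_search", "input": "..."} '
    "and end with a line Final: <answer>."
)

def react_loop(question: str, call_llm, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = call_llm(SYSTEM, transcript)  # Thought + Action, or Final
        transcript += reply + "\n"
        if "Final:" in reply:
            return reply.split("Final:", 1)[1].strip()
        action = json.loads(reply.split("Action:", 1)[1].strip())
        observation = TOOLS[action["tool"]](action["input"])
        transcript += f"Observation: {observation}\n"  # reasoning sees the result
    return "Stopped: hit max_steps (a Step-10-style stop rule)."
```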
Step 5 》MULTI-AGENT STRUCTURE (IF NEEDED)
✸ Use orchestration frameworks to define agent roles and coordination.
✸ Create Planner, Researcher, Reporter agents — each with its own input/output schema.
→ Tools: CrewAI, LangGraph, OpenAI Swarm

Step 6 》ADD MEMORY AND LONG-TERM CONTEXT (RAG)
✸ Does your agent need to remember what happened earlier?
✸ Use conversational memory, summary memory, or vector-based memory.
→ Tools: Zep, LangChain Memory, ChromaDB, FAISS
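
A toy version of Step 6's vector-based memory; ChromaDB or FAISS replace this in practice. The `embed` function is assumed to be supplied (any sentence-embedding model works):

```python
import numpy as np

class VectorMemory:
    """Toy vector-based memory: store past turns, recall the closest ones."""

    def __init__(self, embed):            # embed: str -> np.ndarray (assumed)
        self.embed = embed
        self.texts, self.vecs = [], []

    def add(self, text: str) -> None:
        v = self.embed(text)
        self.texts.append(text)
        self.vecs.append(v / np.linalg.norm(v))   # normalize once at write time

    def recall(self, query: str, k: int = 3) -> list[str]:
        q = self.embed(query)
        q = q / np.linalg.norm(q)
        sims = np.array([v @ q for v in self.vecs])  # cosine similarity
        top = sims.argsort()[::-1][:k]
        return [self.texts[i] for i in top]
```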

Step 7 》ADD VOICE OR VISION CAPABILITIES (OPTIONAL)
✸ Text-to-speech: Use Coqui or ElevenLabs
✸ Image understanding: Use GPT-4o or LLaMA 3.2 Vision
→ Let your agent see and speak.

Step 8 》DELIVER THE OUTPUT
✸ Format outputs into Markdown → PDF or JSON
✸ Output must be readable and parsable
→ Tools: Pydantic AI, LangChain Parsers

Step 9 》WRAP IN A UI
✸ Create a front-end
✸ Use Gradio, Streamlit, or FastAPI
→ This is what turns your agent into a product.

Step 10 》EVALUATE AND MONITOR
✸ Run test prompts and toolchains to check reliability.
✸ Use logs, benchmarks, and feedback to improve over time.
→ Tools: MCP Logs, OpenAI Evaluation API, Custom Metrics Dashboards
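
A minimal Step 9 wrapper using Gradio; `run_agent` is a placeholder for whatever agent you built in Steps 1-8:

```python
import gradio as gr

def run_agent(message: str) -> str:
    # Placeholder: call your agent pipeline here and return its
    # formatted (readable, parsable) answer from Step 8.
    return f"(agent reply to: {message!r})"

demo = gr.Interface(fn=run_agent, inputs="text", outputs="text",
                    title="AI Agent Demo")

if __name__ == "__main__":
    demo.launch()  # serves a local web UI
```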
MASTERING AI AGENTS: CheatSheet
📦 NOW INCLUDES: MCP · EVALUATION
Dr. Maryam Miradi
WHAT ARE AI AGENTS?
✸ AI agents are software applications that use large language models (LLMs) to autonomously perform specific tasks
→ Tasks that demand complex decision-making, autonomy, and adaptability.

TYPES OF AI AGENTS
✸ Fixed Automation
✸ LLM-Enhanced
✸ ReAct
✸ ReAct + RAG
✸ Tool-Enhanced
✸ Self-Reflecting
✸ Memory-Enhanced
✸ Environment Controllers
✸ Self-Learning

10 QUESTIONS TO ASK BEFORE YOU CONSIDER AN AI AGENT
✸ What is the complexity of the task?
✸ How often does the task occur?
✸ What is the expected volume of data or queries?
✸ Does the task require adaptability?
✸ Can the task benefit from learning & evolving over time?
✸ What level of accuracy is required?
✸ Is human expertise or emotional intelligence essential?
✸ What are the privacy and security implications?
✸ What are the regulatory and compliance requirements?
✸ What is the cost-benefit analysis?

REAL-WORLD USE CASES
✸ Clinical History Search Engine
✸ Predictive Maintenance Agent
✸ Protocol Summarizer
✸ 10Q/10K Documents Extraction
✸ Route Optimization System
✸ Marketing Campaign Agent
✸ SOAP Notes Generator
✸ Inventory Management Assistant
✸ Anti-Fraud Agent

TOP FRAMEWORKS IN PYTHON CODE
✸ LangChain
✸ LangGraph
✸ CrewAI
✸ OpenAI Swarm
✸ Pydantic AI
✸ LlamaIndex
✸ AutoGen
✸ Meta AgentKit
✸ Vertex AI Agent Builder

METRICS FOR EVALUATING AI AGENTS
✸ Latency & Speed – Tool call latency, task duration
✸ API Efficiency – Call frequency, token usage
✸ Cost & Resources – Task cost, context usage
✸ Error Rate – LLM call failures
✸ Task Success – Agent and task success rates
✸ Human Input – Steps per task, human help needed
✸ Instruction Match – Follows given instructions
✸ Output Format – Format and context accuracy
✸ Tool Use – Tool selection, arguments, success
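
A small instrumentation sketch for a few of these metrics (latency, error rate, token usage). The field names are illustrative; the dashboards from Step 10 do this at scale:

```python
import time
from dataclasses import dataclass, field

@dataclass
class CallMetrics:
    """One record per tool/LLM call, covering a few metrics above."""
    tool: str
    latency_s: float
    tokens: int
    ok: bool

@dataclass
class AgentMonitor:
    calls: list[CallMetrics] = field(default_factory=list)

    def record(self, tool: str, fn, *args, tokens: int = 0, **kwargs):
        """Time one call and log latency, token count, and success."""
        start = time.perf_counter()
        ok = False
        try:
            result = fn(*args, **kwargs)
            ok = True
            return result
        finally:  # runs whether the call succeeded or raised
            self.calls.append(
                CallMetrics(tool, time.perf_counter() - start, tokens, ok))

    def error_rate(self) -> float:
        return sum(not c.ok for c in self.calls) / max(len(self.calls), 1)
```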

DEVELOPMENT ISSUES & FIXES
✸ Poor prompts → Define objectives, craft detailed personas, use effective prompting
✸ Weak evaluation → Real-world tasks, continuous evaluation, feedback loops

LLM ISSUES & FIXES
✸ Hard to steer → Specialized prompts, hierarchy, fine-tuning
✸ Too expensive → Reduce context, use smaller/cloud models
✸ Planning fails → Decompose tasks, multi-plan selection, refine steps
✸ Weak reasoning → Improve reasoning, fine-tune, use specialist agents
✸ Tool errors → Set behavior parameters, validate outputs, add verification

PRODUCTION ISSUES & FIXES
✸ No guardrails → Rule filters, human oversight, ethics frameworks
✸ Scaling limits → Scalable infra, resource control, performance tracking
✸ No recovery → Add redundancy, automate failover, stateful memory
✸ Infinite loops → Define stop rules, smarter planning, monitoring
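
One concrete fix from the lists above ("Tool errors → validate outputs, add verification"): wrap each tool or LLM call in schema validation with a bounded retry. The `ToolResult` fields are hypothetical:

```python
from pydantic import BaseModel, ValidationError

class ToolResult(BaseModel):   # illustrative output contract
    status: str
    value: float

def call_with_verification(tool, prompt: str, retries: int = 2) -> ToolResult:
    """Validate every reply; re-ask a bounded number of times, then escalate."""
    for attempt in range(retries + 1):
        raw = tool(prompt)  # tool is expected to return a JSON string
        try:
            return ToolResult.model_validate_json(raw)
        except ValidationError as err:
            if attempt == retries:
                raise  # escalate -- e.g. to human oversight
            # feed the error back: a tiny feedback loop / guardrail
            prompt += (f"\nYour last output had {err.error_count()} "
                       "schema errors; return valid JSON only.")
```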
AI AGENTS ECOSYSTEM
Real-World Use Cases, Workflow, and RAG
Dr. Maryam Miradi


SELF-EVOLVING AI AGENTS EXPLAINED
From Static Models to Lifelong Agents
Dr. Maryam Miradi

→ Visual taxonomy of AI agent evolution, LLM-centric learning, and agent optimisation approaches: Model Offline Pretraining (MOP), Model Online Adaptation (MOA), Multi-Agent Orchestration (MAO), and Multi-Agent Self-Evolving (MASE).
→ Conceptual framework of the self-evolving process in agent systems.
SMALL LANGUAGE MODELS ARE THE FUTURE OF AGENTIC AI
Dr. Maryam Miradi


TINY LANGUAGE MODELS POWERED BY RECURSIVE REASONING ARE THE FUTURE OF AI
Dr. Maryam Miradi
25 AI AGENTS REAL-WORLD USE CASES
Dr. Maryam Miradi

1 Event Forecasting Agents
2 Patent Analysis & Transfer
3 GeoAI
4 Drugs & Genes
5 FinRobot

GeoAI: Improved Retrieval for EARTH Snapshots (LLM + VLM).

1 Data Collection
☆ Biomedical papers, molecule libraries, prior experiments.

2 Preprocessing & Tokenization
☆ It parses research goals, then tokenizes, aligns metadata, and formats everything into structured hypotheses.

3 Pretraining + Architecture
☆ It debates, refines, and evolves each research proposal, mirroring the actual scientific method.

4 Post-Training Alignment
☆ The model's hypotheses must be novel, plausible, testable, and aligned.

5 Generating Result (Deployment & Optimization)
☆ In one case, the AI predicted unpublished experimental results that mirrored real-world discoveries.

6 Evaluation & Benchmarking
☆ Hypotheses are judged using an Elo-like system that ranks ideas based on tournament wins. Higher Elo = higher-quality ideas.
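
The Elo-like ranking is standard chess Elo applied to hypothesis "tournaments"; this sketch shows one rating update (the K-factor and pairing scheme are assumptions, not from the source):

```python
def elo_update(rating_a: float, rating_b: float, a_wins: bool,
               k: float = 32.0) -> tuple[float, float]:
    """One head-to-head match between two hypotheses; the winner gains rating."""
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    score_a = 1.0 if a_wins else 0.0
    return (rating_a + k * (score_a - expected_a),
            rating_b + k * ((1.0 - score_a) - (1.0 - expected_a)))

# Example: a judged debate where hypothesis A (1200) beats B (1250).
print(elo_update(1200.0, 1250.0, a_wins=True))  # A rises, B falls
```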

6 Self-Replicating AI
7 AI Coding Assistant
8 2nd Me
9 PODAgent (Podcast Generation)
10 Dementia Detection

AI Coding Assistant:
1 Plan Smarter: Issue Description Generation
2 Code Faster: Code Suggestions, Explanation, Generation, Refactor and Fix
3 Review Easier: Merge Request Summary & Commit Message Generation
4 Debug Quicker: Root Cause Analysis
5 Secure Sooner: Vulnerability Explanation & Resolution
6 Monitor Metrics
11 AI Agent Builder
12 AgentMemory (Multi-Layer Memory, Dynamic Storage)
13 Recommendation Agents
14 Driver Agent (traffic simulations)
15 Graph Memory

AI Agent Builder: Postman Flows
Generate Code. Test LLMs. Prototype Visually. Deploy.

1 Visual Prototyping: Design agent workflows using Postman Flows (drag-and-drop, no-code canvas). → Combine LLM prompts, logic blocks, and API calls visually.
2 Seamless API Integration: Instantly connect to 100,000+ APIs from 18,000+ companies via Postman's global API hub. → No boilerplate or auth headaches.
3 Agent Code Generation: Convert APIs into reusable agent tools by generating Python code for LangChain. → Add instructions, metadata, and validation.
4 LLM Evaluation: Test prompts directly against models like GPT, Claude, Gemini. → Compare cost, latency, and output quality in one interface.
5 AI Agent Code Export: Instantly generate working agent logic, for example as LangChain-ready Python code. → Bring your Postman workflow into your own dev environment.
6 Context-Aware Flow Simulation: Run workflows locally in dev mode. → Inject system prompts, simulate real-world usage, and debug results.
7 Monitoring + Debugging: Track performance, latency, and errors inside Postman.
8 Productionization + API DevOps: Move from test to deployment instantly. → Agents work as HTTP endpoints or inside Postman Flows. → Test the full agent stack before going live.
9 End-to-End LLM Platform: Manage LLMs, APIs, tools, and workflows in one place. → Designed for both technical teams and no-code creators.

Curated by: Dr. Maryam Miradi


Graph Memory (Zep): Personalized AI Agents with Reinforcement Learning + Graph

Graph Construction
1 Episode Subgraph: Raw data (messages, text, JSON) is stored in the Episode Subgraph with Bi-Temporal (Time-Aware) indexing.
2 Semantic Entity Extraction: Entities (Nodes) & Facts (Edges) are extracted, resolving duplicates. Reflexion removes hallucination; finally, Embedding and Cypher Queries are used for storage.
3 Community Formation: Similar Entities are Clustered into Communities using Label Propagation and Map-Reduce Summarization.

Memory Construction
4 Search Layer: Finds relevant knowledge using three methods: Cosine Similarity (semantic meaning), BM25 Search (exact keyword matching), and Breadth-First Search (BFS) (contextual relationships).
5 Reranker: Improves relevance by sorting results.
6 Context Construction: Formats retrieved knowledge for LLMs. It keeps the statement itself, its time validity range, and each entity and community name and summary.

Evaluation
7 Data for Evaluation: Structured Business Data + Unstructured Conversational Data.
8 Evaluation (2 benchmarks): DMR (500 conversations from the Multi-Session Chat dataset) + LongMemEval (extra-long conversations, avg. 115,000 tokens).
9 Performance Results: Zep outperforms MemGPT with higher Accuracy (94.8%) and lower Latency (reduced by ~90%).

Curated by: Dr. Maryam Miradi
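
A toy sketch of the three search methods in step 4 of the Zep pipeline (cosine similarity, keyword matching, BFS over the graph). The naive keyword scorer stands in for real BM25 (e.g. the rank_bm25 package), and node embeddings are assumed to be precomputed:

```python
import networkx as nx
import numpy as np

def cosine_hits(query_vec: np.ndarray, node_vecs: dict, k: int = 5) -> list:
    """Semantic meaning: rank entity nodes by embedding similarity."""
    sims = {n: float(v @ query_vec /
                     (np.linalg.norm(v) * np.linalg.norm(query_vec)))
            for n, v in node_vecs.items()}
    return sorted(sims, key=sims.get, reverse=True)[:k]

def keyword_hits(query: str, node_texts: dict, k: int = 5) -> list:
    """Exact keyword matching (naive overlap standing in for BM25)."""
    q = set(query.lower().split())
    scores = {n: len(q & set(t.lower().split())) for n, t in node_texts.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

def bfs_hits(graph: nx.Graph, seeds: list, depth: int = 2) -> set:
    """Contextual relationships: expand outward from seed entities."""
    found, frontier = set(seeds), list(seeds)
    for _ in range(depth):
        frontier = [m for n in frontier for m in graph.neighbors(n)
                    if m not in found]
        found.update(frontier)
    return found

# Results from all three methods would then go to the reranker (step 5).
```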

16 Ocean AI Agents (Solving Ocean Pollution)
17 BI AI Agents (Data Sources → SQL → LLMs → BI Questions, Data Features)
18 Protein Engineering AI Agents
19 Creativity (Pencil to Pixel, Creativity Scoring)
20 Physical AI (NVIDIA Cosmos)
21 NASA Agents (ROSA + LangChain)
22 Unknown Unknowns (CO-STORM AI Agents)
23 Complex Engineering
24 BIO AI Agents
25 Drug Discovery
60 AI AGENTS USE CASES
Transforming Industries
Dr. Maryam Miradi

🏦 FINANCE
Investment Memo Generator
Buy vs. Sell Side Agent
Due Diligence Assistant
10Q/10K Documents Extraction
Competitive Analysis Assistant
Spreadsheet AI Assistant
Loan Underwriting Assistant
Compliance Assistant
Commodities Copilot
Company Due Diligence
Contract Redlining
KYC Agent
Financial Reports Assistant

🛠 OPERATIONS
AI Staffing Assistant
Staff Training Assistant for New Employees
Infosec Agent
AI Slackbot
Customer Support Chatbot
RFP Response Assistant
Tender Document Analysis
Database Assistant for PostgreSQL
SharePoint Assistant for Ops
Call Center QA
Contract Analyzer
Leads Scoring Assistant for Sales Teams
Tender Offers Review Assistant
Admin Assistant for Personnel
RFP Generation
Training/Onboarding Assistant
Receipts Info Extraction
Custom AI Copilot

🏥 HEALTHCARE
Patient Reports
Call Center QA Agent
SOAP Notes Generator
Protocol Summarizer
Contract Redlining
Physician Assistant
AI Booking Assistant for Patients
Insurance Policy Copilot
Hospital CSR Assistant
Medical Research Review Assistant
Back Office Automation
Clinical History Search Engine

📈 MARKETING
AI SDR
Marketing Campaign Agent
Lead Scoring Agent
SEO Content Creation Agent
AI Writing Assistant
Programmatic SEO Tool
Video to Blog Post Generator
Salesforce Assistant
AI Sales Assistant for Snowflake

🌐 OTHERS
Predictive Maintenance Agent
Route Optimization System
Quality Control Agent
Legal Research Agent
Inventory Management Assistant
Anti-Fraud Agent
HR Support Bot
81 AI AGENTS USE CASES from Real-World Companies
Built with Anthropic
COMMERCE · EDUCATION · SOFTWARE · SECURITY · FINANCE
Dr. Maryam Miradi


AI AGENTS EVALS EXPLAINED
Dr. Maryam Miradi
Based on the AI Engineer talk "Evals Are Not Unit Tests" — Ido Pesok, Vercel v0
