- Himanshu Ramchandani
How Does an AI Agent Work?
Elon Musk is hiring a software engineer, Google's AI is hallucinating, and building your first AI agent.
You have already heard a lot of noise about AI agents. Today’s newsletter breaks down how AI agents work and includes my new video on building your first AI agent.
I added some content, news, and resources about AI agents, Elon Musk, and Google.
In today’s edition:
Roundup Weekly— new YouTube video on AI Agents
Sponsor Spotlight— artisan
Dive Deep Drill— how does an AI agent work?
Build Together— here’s how I can help you.
AI Leadership Academy: join leaders, PMs, VPs, CEOs, consultants & professionals from various domains & get do-it-yourself & done-with-you sessions.
Roundup Weekly
I found these resources, content, and news.
— [news] Google’s AI keeps hallucinating.
— [content] build your first AI agent by using n8n an automation software tool.
— [resource] AI agents notes, quiz, architecture, PPT, etc.
— [news] ChatGPT becomes more Siri-like.
— [content] building agentic systems by the Claude team.
— [resource] white paper by Google on Agents.
— [news] Elon Musk is hiring a software engineer, sounds fun. Check my work at https://localhost:3000
Elon Musk on X
Sponsor Spotlight
Hire Ava, the Industry-Leading AI BDR
Ava automates your entire outbound demand generation so you can get leads delivered to your inbox on autopilot. She operates within the Artisan platform, which consolidates every tool you need for outbound:
300M+ High-Quality B2B Prospects
Automated Lead Enrichment With 10+ Data Sources Included
Full Email Deliverability Management
Personalization Waterfall using LinkedIn, Twitter, Web Scraping & More
Dive Deep Drill
How Do AI Agents Work?
Imagine if your smart fridge had an AI agent that not only ordered milk when you were out but also debated whether almond milk is better for you based on your browsing history.
Scary or helpful? You decide!
This is the simplest way I can define an AI agent:
AI agents have the power to understand our language (because of LLMs), reason, plan, and execute the tasks given to them without human intervention.
AI Agents can handle complex challenges, making them far more dynamic than basic automation tools.
They are designed as full software systems, not just scripts, which allows them to have complex interactions with their environment.
How Are AI Agents Different from Simple Automation?
You probably have the same question.
Well, they are different because of 2 major capabilities:
tools
planning
You have seen ChatGPT make mistakes in basic math problems. That is because it responds based only on patterns in the data it was trained on; it does not actually run the calculation.
In the same way, if I ask you to multiply 85 and 65, you can answer directly if you already know the result, or you can use a tool called a calculator, correct?
You do the same with agents: you give them access to tools.
The second thing is planning.
Take the same calculation. You can only solve it if you know how to multiply, or if you know what to pass to the calculator: the numbers 85 and 65, along with the multiply operation.
That is what planning and reasoning are.
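The calculator example can be sketched in a few lines of Python. This is only an illustration: the names calculator, OPERATIONS, TOOLS, and plan are made up for this sketch and do not come from any agent framework. The plan is hard-coded here; in a real agent the model would produce it.

```python
import operator

# Hypothetical tool: a calculator the agent is allowed to call.
OPERATIONS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
}

def calculator(operation: str, a: float, b: float) -> float:
    """Run one arithmetic operation; this is the 'tool'."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    return OPERATIONS[operation](a, b)

# The 'plan' for "multiply 85 and 65": which tool to use and
# which parameters to pass. Hard-coded here as an assumption.
plan = {"tool": "calculator", "args": {"operation": "multiply", "a": 85, "b": 65}}

# Registry of tools the agent has access to.
TOOLS = {"calculator": calculator}

result = TOOLS[plan["tool"]](**plan["args"])
print(result)  # 5525
```

Notice that the arithmetic never happens inside the "plan". Planning only picks the tool and the parameters; the tool does the work.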
Here is the flow of what happens when you query an AI agent.
The architecture of an AI Agent
3 major components of AI agents:
Orchestration layer
Models
Tools
Let us understand each component individually.
1. Orchestration layer (The Control Center)
Let’s say I want to create an AI agent that schedules meetings. I query the scheduler: “I want to host a webinar for all my students”.
This query acts as the trigger for the AI agent.
The query can be text, audio, video, or an image. (You already know that whatever the type of data, it is always converted into numerical values for the machine.)
The query is handled by the orchestration layer, aka the control center of an AI agent.
The orchestration layer has 4 major jobs:
Memory: maintaining the memory of your whole interaction.
State: storing the current state of the whole process.
Reasoning: guiding the agent’s reasoning.
Planning: deciding what the steps are and which step comes next.
The orchestration layer interacts with the model (LLM).
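Here is a minimal sketch of how a control center could keep track of memory, state, and a plan for the webinar query. Everything in it is an assumption for illustration: the Orchestrator class, its method names, and the three step names are hypothetical, and a fixed plan stands in for the reasoning a real model would guide.

```python
from dataclasses import dataclass, field

@dataclass
class Orchestrator:
    """Hypothetical control center tracking memory, state, and plan."""
    memory: list = field(default_factory=list)  # the whole interaction so far
    state: str = "idle"                         # current stage of the process
    plan: list = field(default_factory=list)    # steps still to run

    def receive(self, query: str) -> None:
        """The user's query is the trigger that starts the process."""
        self.memory.append(("user", query))
        # Assumption: a fixed plan stands in for model-guided reasoning.
        self.plan = ["find_free_slot", "create_meeting", "send_invites"]
        self.state = "planned"

    def next_step(self):
        """Return the next planned step, or None when the plan is finished."""
        if not self.plan:
            self.state = "finished"
            return None
        step = self.plan.pop(0)
        self.state = f"running:{step}"
        self.memory.append(("agent", step))
        return step

agent = Orchestrator()
agent.receive("I want to host a webinar for all my students")

steps = []
while (step := agent.next_step()) is not None:
    steps.append(step)

print(steps)
print(agent.state)
```

Each step is recorded in memory as it runs, so the agent always knows what has already happened and what comes next.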
2. Models (The Brain)
The model is the centralized decision-maker for the whole agent.
It is typically an AI model such as a large language model (LLM).
To understand the query, formulate a plan, and determine the next action, the model uses reasoning and logic frameworks like:
ReAct (Reason + Act): ensures thoughtful and deliberate actions.
Chain-of-Thought: reasons through intermediate steps.
Tree-of-Thoughts: explores multiple paths to find the best solution.
The model determines what actions to take, and performs those actions using specific tools.
3. Tools (The Hands)
Using tools, the agent can interact with the external world.
As I mentioned, these include a calculator, APIs, web search, external databases, etc.
Tools enable agents to perform actions beyond the model's capabilities, access real-time information, or complete real-world tasks.
There are 3 types of tools:
Extensions: when the agent needs external live API calls.
Functions: similar to programming functions for client-side code execution.
Data Stores: vector databases, RAG, structured and unstructured data.
With functions, the model outputs a function name and its arguments but doesn’t make a live API call itself; the client-side code runs the call.
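To make the function case concrete, here is a small sketch. It assumes the model replies with JSON that names a function and its arguments; that JSON shape, the schedule_webinar function, and the ALLOWED_FUNCTIONS registry are all hypothetical, and the model's reply is hard-coded instead of coming from a real LLM.

```python
import json

def schedule_webinar(topic: str, audience: str) -> str:
    """Hypothetical client-side function; a real one might call a calendar API."""
    return f"Scheduled '{topic}' for {audience}"

# Assumption: the model's reply is JSON naming a function and its arguments.
# It is hard-coded here in place of a real model response.
model_output = (
    '{"function": "schedule_webinar", '
    '"args": {"topic": "AI agents", "audience": "all students"}}'
)

# Only functions registered here can run, whatever the model asks for.
ALLOWED_FUNCTIONS = {"schedule_webinar": schedule_webinar}

call = json.loads(model_output)          # the model's output is just data
func = ALLOWED_FUNCTIONS[call["function"]]
confirmation = func(**call["args"])      # the client executes the call
print(confirmation)
```

The model's output is only a description of the call. Your own code decides whether and how to execute it, which keeps control on the client side.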
The whole process will iterate until the goal is reached.
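The loop itself can be sketched like this. It is a simplified, ReAct-style illustration under stated assumptions: fake_model is a scripted stand-in for an LLM, and run_agent, TOOLS, and max_steps are hypothetical names, not part of any library.

```python
def fake_model(goal: str, observations: list) -> dict:
    """Stand-in for an LLM: picks the next action from what it has observed."""
    if not observations:
        # Nothing observed yet, so ask for the multiply tool.
        return {"action": "multiply", "args": (85, 65)}
    # A result is available, so the goal is reached.
    return {"action": "finish", "answer": observations[-1]}

# Tools the agent can use.
TOOLS = {"multiply": lambda a, b: a * b}

def run_agent(goal: str, max_steps: int = 5):
    """Reason, act, observe; repeat until the model says the goal is reached."""
    observations = []
    for _ in range(max_steps):
        decision = fake_model(goal, observations)      # reason
        if decision["action"] == "finish":             # goal reached
            return decision["answer"]
        tool = TOOLS[decision["action"]]               # act
        observations.append(tool(*decision["args"]))   # observe
    raise RuntimeError("goal not reached within max_steps")

answer = run_agent("What is 85 times 65?")
print(answer)  # 5525
```

The max_steps limit is a design choice for this sketch: it stops the agent from looping forever if the goal is never reached.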
If you want to build your first AI Agent, here is my video on [YouTube]:
Want to work together? Here’s how I can help you
AI Engineering & Consulting (B2B) at Dextar—[Request a Brainstorm]
You are a leader?—join the [AI leadership academy]
AI Training for Enterprise Team—[MasterDexter]
Get in front of 50k+ AI leaders & professionals—[Sponsor this Newsletter]
I use BeeHiiv to send this newsletter.
Paper Unfold
A series in which you will get a breakdown of complex research papers into easy-to-understand pointers. If you missed the previous ones:
Personalize Your Experience!
Fill out this survey to help yourself get better AI newsletter editions in 30 seconds.
PS: Reply to this email if you want me to write on the topic you are interested in.
—Him