Let's make our Agent
An overview of Agents and how they work

What is an Agent?
An agent is something that can automatically perform tasks, reason, and generate results.
We have seen that LLMs are like brains that can think, reason, and answer questions. But that isn't very practical on its own. What would you do with just that? Create a chatbot and chat all day?
Here comes the concept of AI Agents, which is much broader than just LLMs! The main idea is to give the LLM some actions it can take, like giving the AI hands and legs so it can carry out its tasks.
So, a commonly cited definition of AI Agents is:
“An AI agent is a software system designed to interact with its environment, gather information, and perform tasks autonomously to achieve predetermined goals set by humans or other systems.”
How does it work?
OK, we are done with the definition. Now for the mechanism. According to some resources, an AI Agent has five core components:
Perception System: Agents receive input from users or sensors. (Generally, the user query)
Reasoning Engine: The LLM that processes information and makes decisions. (The AI models)
Tool Use: The ability to call external functions. (We will get back to it.)
Decision Framework: A structured workflow: plan → action → observe → output
Plan: Decides what to do based on the query
Action: Calls appropriate functions with specific parameters
Observe: Processes the results from function calls
Output: Provides final responses to users
Memory: The agent maintains conversation history to track context.
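Before touching a real LLM, the five components above can be sketched as a tiny offline loop. This is only a toy under stated assumptions: fake_model is a hypothetical stand-in that replays a fixed script instead of reasoning, and get_weather here returns a hard-coded string, so nothing is called over the network.

```python
import json

def fake_model(history):
    """Hypothetical stand-in for the LLM: replays one scripted step per call."""
    script = [
        {"step": "plan", "content": "Look up the weather for Dhaka."},
        {"step": "action", "function": "get_weather", "input": "Dhaka"},
        {"step": "observe", "content": "The tool returned a result."},
        {"step": "output", "content": "It is sunny in Dhaka."},
    ]
    return json.dumps(script[len(history)])

def get_weather(city):
    """Toy tool with a hard-coded answer (no network call)."""
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def run_loop(model, max_steps=10):
    history = []  # memory: every step taken so far
    for _ in range(max_steps):
        step = json.loads(model(history))  # reasoning engine decides the next step
        if step["step"] == "action":       # tool use: call the named function
            step["result"] = TOOLS[step["function"]](step["input"])
        history.append(step)
        if step["step"] == "output":       # final answer ends the loop
            return step["content"], history
    return None, history

answer, history = run_loop(fake_model)
print(answer)
```

The real agent later in this post follows the same shape; the only big difference is that an actual model picks each step.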
This is the basic workflow of an agent. Now, let’s make a simple weather agent of our own :)
Our First AI Agent:
I am gonna explain the code step by step. Don’t forget to read the comments.
Just some general imports:
import os
import json
import requests
from google import genai
from dotenv import load_dotenv
load_dotenv()
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
client = genai.Client(api_key=GEMINI_API_KEY)
# Install the packages/dependencies first: pip install google-genai requests python-dotenv
Perception system: Input from the user.
user_query = input('> ')
Reasoning Engine: I am using the Gemini API here. (It’s kind of free :) )
# Note: available_functions is defined in the Tool Use step below
system_prompt = f"""
You're a helpful AI assistant who is specialized in resolving user query.
You work on plan, action, observe, output mode.
Available tools: {list(available_functions.keys())}
Tool descriptions:
- get_weather(city: str): Returns weather information for a given city
IMPORTANT RULES:
- Return ONLY ONE step per response, not multiple steps
- Start with "plan" step first
- Wait for next input before proceeding to next step
- When step is "action", you MUST specify function and input
Output JSON Format (return only ONE):
{{
    "step": "plan|action|observe|output",
    "content": "description of what you're doing",
    "function": "function name (only for action step)",
    "input": "function parameter (only for action step)"
}}
User Query: """
This is the system prompt, which tells the model which step to produce and when. In short, it drives the reasoning for the whole process based on the user query.
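The prompt above lists the tool descriptions by hand, which is easy to forget to update when a new tool is added. As a rough sketch, the same lines could be generated from the function registry using the standard inspect module. describe_tools is a hypothetical helper name, and the get_weather body here is a placeholder so the example runs offline.

```python
import inspect

def get_weather(city: str):
    """Returns weather information for a given city"""
    return f"(weather for {city})"  # placeholder body, no API call

available_functions = {"get_weather": get_weather}

def describe_tools(functions):
    """Build one '- name(signature): docstring' line per registered tool."""
    lines = []
    for name, func in functions.items():
        signature = inspect.signature(func)
        doc = inspect.getdoc(func) or "No description"
        lines.append(f"- {name}{signature}: {doc}")
    return "\n".join(lines)

print(describe_tools(available_functions))
```

The returned string could then be dropped into the f-string with {describe_tools(available_functions)} in place of the hand-written description line.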
Tool Use: I am using a weather API, and the LLM will call the API when needed.
## This is the main part for the function call
def get_weather(city: str):
    response = requests.get(f"https://wttr.in/{city}?format=%C:%t", timeout=10)
    if response.status_code == 200:
        data = response.text.split(':')
        situation = data[0]
        temp = data[1]
        return f"weather situation: {situation} and temparature: {temp}"
    else:
        # Return the failure as text so the model sees it instead of None
        return f"API failed to get weather data for {city}"
## This is the object for function listing. Notice that in the system_prompt, I am listing the available functions from it
available_functions = {
    "get_weather": get_weather
}
Decision Framework + Memorising:
Observe the code closely.
while step_count < max_steps:
    ## Memorising the previous prompts
    full_prompt = system_prompt + user_query + "\n" + conversation_history
    response = client.models.generate_content(
        model="gemini-2.0-flash-001",
        contents=full_prompt
    )
    print(f"AI Response: {response.text}\n")
    parsed = parse_response(response.text)  ## An additional function for parsing, given in the full code below.
    if not parsed:
        print("Failed to parse response")
        break
    ## Here, the main game begins. First, the loop checks the step name and its content.
    ## Based on the name and content, it decides what to do.
    ## If the step is "action", it calls the named function.
    ## If the step is "output", it prints the answer and stops.
    ## "plan" and "observe" need no action; they are only printed and added to the history.
    step = parsed.get("step")
    content = parsed.get("content")
    if step == "action":
        function_name = parsed.get("function")
        function_input = parsed.get("input")
        if function_name in available_functions:
            result = available_functions[function_name](function_input)
            observation = f"Function {function_name} returned: {result}"
            print(f"Function Call: {function_name} ('{function_input}')")
            print(f"Result: {result}\n")
            conversation_history += f"\nObservation: {observation}"
        else:
            print(f"Function {function_name} not available")
            break
    elif step == "output":
        print("=== FINAL ANSWER ===")
        print(content)
        break
    conversation_history += f"\nstep: {response.text}"
    step_count += 1
if step_count >= max_steps:
    print("Maximum steps reached")
As the comments explain, the code carries out the decision framework step by step.
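One design choice worth a second look: the loop stops as soon as the model names a function that does not exist. A possible alternative, sketched below, is to turn every failure into an observation so the model gets a chance to correct itself. dispatch, shout, and broken are hypothetical names made up for this illustration; they are not part of the agent above.

```python
def dispatch(parsed, tools):
    """Run the tool named in an action step and return an observation string."""
    name = parsed.get("function")
    arg = parsed.get("input")
    if name not in tools:
        return f"Error: function {name} is not available. Choose from {list(tools)}"
    try:
        return f"Function {name} returned: {tools[name](arg)}"
    except Exception as exc:  # a failing tool should not crash the whole loop
        return f"Error: function {name} raised {type(exc).__name__}: {exc}"

def shout(text):
    """Toy tool that always succeeds."""
    return text.upper()

def broken(_):
    """Toy tool that always fails."""
    raise ValueError("no data")

tools = {"shout": shout, "broken": broken}
print(dispatch({"step": "action", "function": "shout", "input": "hi"}, tools))
print(dispatch({"step": "action", "function": "nope", "input": "hi"}, tools))
print(dispatch({"step": "action", "function": "broken", "input": "hi"}, tools))
```

In the loop, the returned string would be appended to conversation_history as the observation, whether the call worked or not.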
So that is the main workflow of an agent. Now, here is the whole code:
import os
import json
import requests
from google import genai
from dotenv import load_dotenv
load_dotenv()
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
client = genai.Client(api_key=GEMINI_API_KEY)
def get_weather(city: str):
    response = requests.get(f"https://wttr.in/{city}?format=%C:%t", timeout=10)
    if response.status_code == 200:
        data = response.text.split(':')
        situation = data[0]
        temp = data[1]
        return f"weather situation: {situation} and temparature: {temp}"
    else:
        # Return the failure as text so the model sees it instead of None
        return f"API failed to get weather data for {city}"
available_functions = {
    "get_weather": get_weather
}
system_prompt = f"""
You're a helpful AI assistant who is specialized in resolving user query.
You work on plan, action, observe, output mode.
Available tools: {list(available_functions.keys())}
Tool descriptions:
- get_weather(city: str): Returns weather information for a given city
IMPORTANT RULES:
- Return ONLY ONE step per response, not multiple steps
- Start with "plan" step first
- Wait for next input before proceeding to next step
- When step is "action", you MUST specify function and input
Output JSON Format (return only ONE):
{{
"step": "plan|action|observe|output",
"content": "description of what you're doing",
"function": "function name (only for action step)",
"input": "function parameter (only for action step)"
}}
User Query: """
def parse_response(response_text):
    """Extract JSON from the response"""
    try:
        # Strip markdown code fences the model may wrap around the JSON
        text = response_text.replace("```json", "").replace('```', "")
        lines = text.strip().split('\n')
        # First try: a complete JSON object on a single line
        for line in lines:
            line = line.strip()
            if line.startswith('{') and line.endswith('}'):
                try:
                    return json.loads(line)
                except json.JSONDecodeError:
                    continue
        # Second try: find the matching closing brace of a multi-line object
        start = text.find('{')
        if start != -1:
            brace_count = 0
            for i, char in enumerate(text[start:], start):
                if char == '{':
                    brace_count += 1
                elif char == '}':
                    brace_count -= 1
                    if brace_count == 0:
                        json_str = text[start: i+1]
                        return json.loads(json_str)
    except Exception as e:
        print(f"Parse error: {e}")
    return None
def run_agent(user_query):
    conversation_history = ""
    step_count = 0
    max_steps = 100
    print(f"User Query: {user_query}\n")
    while step_count < max_steps:
        full_prompt = system_prompt + user_query + "\n" + conversation_history
        response = client.models.generate_content(
            model="gemini-2.0-flash-001",
            contents=full_prompt
        )
        print(f"AI Response: {response.text}\n")
        parsed = parse_response(response.text)
        if not parsed:
            print("Failed to parse response")
            break
        step = parsed.get("step")
        content = parsed.get("content")
        if step == "action":
            function_name = parsed.get("function")
            function_input = parsed.get("input")
            if function_name in available_functions:
                result = available_functions[function_name](function_input)
                observation = f"Function {function_name} returned: {result}"
                print(f"Function Call: {function_name} ('{function_input}')")
                print(f"Result: {result}\n")
                conversation_history += f"\nObservation: {observation}"
            else:
                print(f"Function {function_name} not available")
                break
        elif step == "output":
            print("=== FINAL ANSWER ===")
            print(content)
            break
        conversation_history += f"\nstep: {response.text}"
        step_count += 1
    if step_count >= max_steps:
        print("Maximum steps reached")
if __name__ == "__main__":
    user_query = input('> ')
    run_agent(user_query)
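A side note on parse_response: counting braces can go wrong when a multi-line reply has a brace inside a string value. One possible alternative, shown here only as a sketch, is to let the standard library's json.JSONDecoder.raw_decode do the matching. extract_json is a hypothetical name for this illustration.

```python
import json

def extract_json(text):
    """Return the first JSON object found in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            # and ignores whatever follows it (such as a closing fence)
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one
            start = text.find("{", start + 1)
    return None

reply = 'Sure:\n```json\n{\n  "step": "output",\n  "content": "done :}"\n}\n```'
print(extract_json(reply))
```

Because the decoder understands strings, the stray brace inside "content" does not confuse it.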
Sample Input:
> What is the weather in Satkhira?
Sample Output:
User Query: What is the weather in Satkhira?
AI Response: ```json
{
"step": "plan",
"content": "I need to get the weather information for Satkhira. I will use the get_weather tool to get the weather information.",
"function": null,
"input": null
}
```
AI Response: ```json
{
"step": "action",
"content": "Get weather information for Satkhira",
"function": "get_weather",
"input": "Satkhira"
}
```
Function Call: get_weather ('Satkhira')
Result: weather situation: Overcast and temparature: +31°C
AI Response: step: ```json
{
"step": "observe",
"content": "The weather in Satkhira is Overcast and the temperature is +31°C.",
"function": null,
"input": null
}
```
AI Response: ```json
{
"step": "output",
"content": "The weather in Satkhira is Overcast and the temperature is +31°C.",
"function": null,
"input": null
}
```
=== FINAL ANSWER ===
The weather in Satkhira is Overcast and the temperature is +31°C.
The agent extracts the city name from the query, passes it to the get_weather function, reads the result returned by the API, and presents it as the final answer.
Conclusion:
Seems pretty normal, right? This is just one example of how Agents work. Try giving more complex queries, like finding the average temperature of three districts or extracting the temperatures of several districts and displaying them in a table. Then you'll realize how much more powerful it is compared to a regular API call and query. Just imagine the possibilities if the query is done on a database or the entire Internet.
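If you want to try the average-temperature idea, one way (a sketch, not the only approach) is to register a second tool next to get_weather. average_temperature and parse_temp are hypothetical helpers, and the comma-separated input format is an assumption chosen because the agent passes each tool a single string.

```python
import re

def parse_temp(text):
    """Pull the first signed number out of a reading such as '+31°C'."""
    match = re.search(r"[-+]?\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no temperature found in {text!r}")
    return float(match.group())

def average_temperature(readings: str):
    """Average a comma-separated list of readings, e.g. '+31°C, +29°C'."""
    values = [parse_temp(part) for part in readings.split(",")]
    return f"average temperature: {sum(values) / len(values):.1f}°C"

print(average_temperature("+31°C, +29°C, +33°C"))
```

To make it available to the agent, you would add it to available_functions and describe it in the system prompt, just like get_weather.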




