I’ve spent the last six weekends diving headfirst into four major open source AI agent frameworks. I wanted to know which one actually makes building a useful, multi-step agent easier—not just which one has the best README. Here’s what I found, with real code, real commands, and a brutally honest comparison table.
What We’re Actually Comparing
Let me be clear: I’m not comparing chatbots or RAG pipelines. I’m comparing frameworks that let you build an agent—a program that can reason, use tools, and decide on its own next action. The four contenders I tested are:
- LangChain (v0.3, with LangGraph)
- AutoGen (from Microsoft, v0.2)
- CrewAI (v0.30)
- Semantic Kernel (from Microsoft, v1.0)
Each claims to let you create a “multi-agent” system. I built the same test project in all four: an agent that can search the web, fetch a weather report, and then summarize the results into a single email draft. No fluff—just working code.
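Before diving into the frameworks, here is the shape of that benchmark task in framework-agnostic Python. The function names and stub lambdas are my own, purely to show the data flow each framework has to implement:

```python
def agent_pipeline(query, search, weather, summarize):
    """The benchmark task all four frameworks implement:
    search -> weather -> summarize into one email draft."""
    notes = search(query)            # step 1: web search
    report = weather("London")       # step 2: weather lookup
    return summarize(notes, report)  # step 3: draft the email

# Stubbed run, just to show the wiring:
draft = agent_pipeline(
    "latest AI news",
    search=lambda q: f"[results for {q}]",
    weather=lambda c: f"[weather in {c}]",
    summarize=lambda n, w: f"Email draft: {n} {w}",
)
print(draft)
```

Every framework below is, at its core, a different way of wiring these three calls together and letting an LLM decide the details.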
Requirements Table
Before we write a single line, here’s what you need installed. I’m assuming you’re on Linux or macOS, but Windows with WSL2 works too.
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.10+ | 3.11 recommended for type hints |
| OpenAI API key (or local LLM) | N/A | I used GPT-4o-mini for cost |
| pip | 23+ | Use virtual env to avoid conflicts |
| duckduckgo-search (pip package) | Latest | Free web search, no API key needed |
| requests | 2.31+ | For weather API calls |
Step 1: Setting Up the Environment
Create a virtual environment and install the base dependencies. I use this exact sequence every time:
```bash
python -m venv agent_env
source agent_env/bin/activate
pip install langchain langchain-community langgraph openai pyautogen crewai semantic-kernel duckduckgo-search requests
```

(Note: at v0.2, Microsoft's AutoGen is published on PyPI as `pyautogen`, not `autogen`.)
This installs all four frameworks in one go. Yes, it’s a bit heavy (around 300 MB), but it lets you switch between them without rebuilding.
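If you want the install to be reproducible, pin the versions in a `requirements.txt`. The major versions below mirror the ones I tested (listed at the top of this article); exact patch releases may differ on your machine:

```text
langchain==0.3.*
langchain-community==0.3.*
langgraph>=0.2
openai>=1.0
pyautogen==0.2.*
crewai==0.30.*
semantic-kernel==1.0.*
duckduckgo-search
requests>=2.31
```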
Step 2: Building with LangChain + LangGraph
LangChain’s agent ecosystem has evolved. For my test, I used LangGraph to define a state graph with three nodes: search, weather, and summarize. Here’s the core logic:
```python
from langgraph.graph import StateGraph, MessagesState, START, END
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_openai import ChatOpenAI
import requests

search_tool = DuckDuckGoSearchRun()
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

def search_node(state):
    # Run the web search on the user's request
    query = state["messages"][-1].content
    return {"messages": [("system", f"Search results: {search_tool.invoke(query)}")]}

def weather_node(state):
    # The benchmark task is fixed on London, so the city is hardcoded here
    report = requests.get("https://wttr.in/London?format=%C+%t", timeout=10).text
    return {"messages": [("system", f"Weather in London: {report}")]}

def summarize_node(state):
    # Let the model draft the email from everything gathered so far
    return {"messages": [model.invoke(state["messages"])]}

# Build the graph: three nodes wired in a fixed order
graph = StateGraph(MessagesState)
graph.add_node("search", search_node)
graph.add_node("weather", weather_node)
graph.add_node("summarize", summarize_node)
graph.add_edge(START, "search")
graph.add_edge("search", "weather")
graph.add_edge("weather", "summarize")
graph.add_edge("summarize", END)
app = graph.compile()

result = app.invoke({"messages": [("user", "Latest AI news and weather in London")]})
print(result["messages"][-1].content)
```
What I liked: the graph is explicit and debuggable. What I didn’t: you have to manually wire every edge. For a three-step agent it’s fine, but for ten tools it becomes a mess.
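One small mitigation I ended up using for the edge boilerplate (plain Python, no LangGraph required, names are my own): derive the linear edges from an ordered list of node names instead of writing each `add_edge` call by hand.

```python
def linear_edges(nodes):
    """Pair each node with its successor:
    ["a", "b", "c"] -> [("a", "b"), ("b", "c")]."""
    return list(zip(nodes, nodes[1:]))

# With ten tools this is one list instead of nine add_edge calls:
edges = linear_edges(["search", "weather", "summarize"])
print(edges)  # [('search', 'weather'), ('weather', 'summarize')]
```

You can then loop over `edges` and call `graph.add_edge(src, dst)` for each pair. It does not help with conditional routing, but it keeps a long linear chain honest.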
Step 3: Building with AutoGen
AutoGen takes a different approach: you instantiate conversable agents and let them accomplish the task by messaging each other. Here's the same task:
```python
import autogen

config_list = [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}]

assistant = autogen.AssistantAgent(
    name="Assistant",
    llm_config={"config_list": config_list},
)
user_proxy = autogen.UserProxyAgent(
    name="UserProxy",
    human_input_mode="NEVER",
    code_execution_config={"use_docker": False},
)

task = """1. Search the web for latest AI news.
2. Get weather in London.
3. Write a summary email."""

user_proxy.initiate_chat(assistant, message=task)
```
This is simpler, but AutoGen’s agents are designed for conversation, not tool chaining. To actually call the search tool, you need to register functions. I had to add:
```python
@user_proxy.register_for_execution()
@assistant.register_for_llm(description="Search the web")
def search(query: str) -> str:
    from duckduckgo_search import DDGS
    with DDGS() as ddgs:
        return " ".join(r["body"] for r in ddgs.text(query, max_results=3))

@user_proxy.register_for_execution()
@assistant.register_for_llm(description="Get weather")
def weather(city: str) -> str:
    import requests
    return requests.get(f"https://wttr.in/{city}?format=%C+%t").text
```
The registration pattern is powerful but verbose. For a beginner, the mental model of “who registers what” is confusing.
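The mental model that finally made it click for me: one side of the pair only knows tool *descriptions* (what the LLM may propose), the other side only holds the *callables* (what actually runs). Here is a toy model of that split in plain Python. This is emphatically not AutoGen's real API, just an illustration of why both registrations are required:

```python
class MiniAgentPair:
    """Toy model of AutoGen's two-sided tool registration (NOT the real API).
    The LLM side stores a tool description; the executor side stores the
    callable. A tool only works when both sides know about it."""

    def __init__(self):
        self.llm_schemas = {}  # what the assistant can *propose*
        self.executors = {}    # what the user proxy can *run*

    def register_for_llm(self, name, description):
        self.llm_schemas[name] = description

    def register_for_execution(self, name, fn):
        self.executors[name] = fn

    def call(self, name, *args):
        if name not in self.llm_schemas:
            raise KeyError(f"{name} was never offered to the LLM")
        return self.executors[name](*args)

pair = MiniAgentPair()
pair.register_for_llm("weather", "Get weather")
pair.register_for_execution("weather", lambda city: f"sunny in {city}")
print(pair.call("weather", "London"))  # sunny in London
```

Forget one of the two registrations and the tool either never gets proposed or never gets executed, which is exactly the failure mode I kept hitting.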
Step 4: Building with CrewAI
CrewAI is the most opinionated of the bunch—you define “Agents” and “Tasks” and a “Crew” that runs them. Here’s my implementation:
```python
from crewai import Agent, Task, Crew
from crewai_tools import tool
from duckduckgo_search import DDGS
import requests

# Plain classes with a run() method are NOT valid CrewAI tools;
# use the @tool decorator so the agent can actually call them.
@tool("Web Search")
def search_tool(query: str) -> str:
    """Search the web and return result snippets."""
    with DDGS() as ddgs:
        return " ".join(r["body"] for r in ddgs.text(query, max_results=3))

@tool("Weather Lookup")
def weather_tool(city: str) -> str:
    """Get the current weather for a city."""
    return requests.get(f"https://wttr.in/{city}?format=%C+%t", timeout=10).text

researcher = Agent(
    role="Researcher",
    goal="Search the web and get the weather",
    backstory="You are a research agent.",
    tools=[search_tool, weather_tool],
    verbose=True,
)
writer = Agent(
    role="Writer",
    goal="Write a summary email",
    backstory="You are a professional writer.",
    verbose=True,
)

task1 = Task(
    description="Search for AI news and get London weather",
    expected_output="Raw research notes",
    agent=researcher,
)
task2 = Task(
    description="Write an email summarizing the findings",
    expected_output="A polished email draft",
    agent=writer,
)

crew = Crew(agents=[researcher, writer], tasks=[task1, task2])
print(crew.kickoff())
```
CrewAI’s strength is readability. I could hand this code to a junior developer and they’d understand it in five minutes. The downside: it’s rigid. You can’t easily insert conditional logic (e.g., “if weather is rainy, include an umbrella reminder”).
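The workaround I settled on for conditional logic: do the branching in plain Python *before* `crew.kickoff()`, and bake the result into the task description. A minimal sketch (the function name and the hardcoded weather string are my own):

```python
def email_reminder(weather_report: str) -> str:
    """Decide on an extra line for the email based on the weather text.
    Runs as plain Python before the crew starts, not inside an agent."""
    if "rain" in weather_report.lower():
        return "Reminder: take an umbrella."
    return ""

# The conditional sentence becomes part of the writer's task description:
task_description = "Write an email summarizing the findings. " + email_reminder("Light rain +12°C")
print(task_description)
```

It works, but notice the limitation: the branch runs once, up front. The agent itself still cannot change course mid-run, which is where LangGraph's conditional edges earn their complexity.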
Step 5: Building with Semantic Kernel
Semantic Kernel from Microsoft is the most enterprise-oriented. It uses “plugins” and “planners” rather than agents. Here’s my attempt:
```python
import asyncio
import requests
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion
from semantic_kernel.functions import kernel_function
from semantic_kernel.planners import SequentialPlanner

# In SK 1.0, native functions live on plugin classes decorated
# with @kernel_function, not on bare module-level functions.
class ResearchPlugin:
    @kernel_function(name="search", description="Search the web")
    def search(self, query: str) -> str:
        from duckduckgo_search import DDGS
        with DDGS() as ddgs:
            return " ".join(r["body"] for r in ddgs.text(query, max_results=3))

    @kernel_function(name="weather", description="Get the weather for a city")
    def weather(self, city: str) -> str:
        return requests.get(f"https://wttr.in/{city}?format=%C+%t", timeout=10).text

async def main():
    kernel = sk.Kernel()
    kernel.add_service(OpenAIChatCompletion(service_id="gpt", ai_model_id="gpt-4o-mini", api_key="YOUR_KEY"))
    kernel.add_plugin(ResearchPlugin(), plugin_name="research")

    planner = SequentialPlanner(kernel, service_id="gpt")
    plan = await planner.create_plan("Search for AI news and get London weather, then summarize")
    result = await plan.invoke(kernel)
    print(result)

asyncio.run(main())
```
The planner is supposed to automatically chain the functions, but in practice, it often fails to call the weather function after the search. I had to add explicit step definitions. Semantic Kernel feels like it’s designed for .NET developers—the Python port is a second-class citizen.
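Stripped of the SK machinery, the fix amounted to replacing the planner's guesswork with an explicit ordered list of steps. A framework-free sketch of that pattern (stub lambdas stand in for the real kernel function calls):

```python
def run_explicit_plan(steps, context=""):
    """Run a fixed, ordered list of (name, fn) steps, feeding each
    step's output into the next via a shared context string."""
    for name, fn in steps:
        context = fn(context)
    return context

result = run_explicit_plan([
    ("search",    lambda ctx: ctx + "[search results] "),
    ("weather",   lambda ctx: ctx + "[London weather] "),
    ("summarize", lambda ctx: f"Summary of: {ctx.strip()}"),
])
print(result)
```

Once you are hand-ordering the steps anyway, the planner is buying you very little, which is a big part of why Semantic Kernel scored poorly in my comparison.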
The Honest Comparison Table
After building the same project in all four, here’s my real-world scorecard:
| Criterion | LangChain | AutoGen | CrewAI | Semantic Kernel |
|---|---|---|---|---|
| Ease of setup | Good | Fair | Excellent | Poor |
| Debugging | Graph visualization | Logging only | Verbose output | Minimal |
| Multi-agent support | Via LangGraph | Native | Native | Via plugins |
| Documentation quality | Excellent | Fair | Good | Poor (Python) |
| Learning curve | Steep | Medium | Low | Steep |
Which One Should You Choose?
Here’s my honest take after building the same agent in all four:
- Pick LangChain + LangGraph if you need fine-grained control over agent flow and you're comfortable reading graphs. It's the most flexible, but you pay in complexity.
- Pick AutoGen if you're building a multi-agent conversation system (like a debate or Q&A). The agent-to-agent messaging is excellent, but tool integration is clunky.
- Pick CrewAI if you want to get a prototype running in an afternoon. It's the most intuitive for beginners, but advanced logic requires workarounds.
- Skip Semantic Kernel for Python projects. It's clearly designed for .NET, and the Python version lags in both features and documentation.
For my next real project—a customer support agent that triages tickets—I’m going with CrewAI. The readability wins, and I can always drop down to LangGraph if I need custom routing. But if you’re building a research agent that needs to browse multiple sources, LangChain’s tool ecosystem is hard to beat.
The key lesson I learned: comparing open source AI agent frameworks is not about which one is "best" in the abstract; it's about which one matches your team's skill level and your project's complexity. Start simple with CrewAI, and reach for LangGraph when your routing needs outgrow it.
