I’ve spent the last six weekends diving headfirst into four major open source AI agent frameworks. I wanted to know which one actually makes building a useful, multi-step agent easier—not just which one has the best README. Here’s what I found, with real code, real commands, and a brutally honest comparison table.
What We’re Actually Comparing
Let me be clear: I’m not comparing chatbots or RAG pipelines. I’m comparing frameworks that let you build an agent—a program that can reason, use tools, and decide on its own next action. The four contenders I tested are:
- LangChain (v0.3, with LangGraph)
- AutoGen (from Microsoft, v0.2)
- CrewAI (v0.30)
- Semantic Kernel (from Microsoft, v1.0)
Each claims to let you create a “multi-agent” system. I built the same test project in all four: an agent that can search the web, fetch a weather report, and then summarize the results into a single email draft. No fluff—just working code.
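Before diving into the frameworks, here is the shape of that benchmark task in framework-agnostic Python. The function names and stub lambdas are my own, purely to show the data flow each framework has to implement:

```python
def agent_pipeline(query, search, weather, summarize):
    """The benchmark task all four frameworks implement:
    search -> weather -> summarize into one email draft."""
    notes = search(query)            # step 1: web search
    report = weather("London")       # step 2: weather lookup
    return summarize(notes, report)  # step 3: draft the email

# Stubbed run, just to show the wiring:
draft = agent_pipeline(
    "latest AI news",
    search=lambda q: f"[results for {q}]",
    weather=lambda c: f"[weather in {c}]",
    summarize=lambda n, w: f"Email draft: {n} {w}",
)
print(draft)
```

Every framework below is, at its core, a different way of wiring these three calls together and letting an LLM decide the details.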
Requirements Table
Before we write a single line, here’s what you need installed. I’m assuming you’re on Linux or macOS, but Windows with WSL2 works too.
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.10+ | 3.11 recommended for type hints |
| OpenAI API key (or local LLM) | N/A | I used GPT-4o-mini for cost |
| pip | 23+ | Use virtual env to avoid conflicts |
| duckduckgo-search (pip package) | Latest | Free web search, no API key needed |
| requests | 2.31+ | For weather API calls |
Step 1: Setting Up the Environment
Create a virtual environment and install the base dependencies. I use this exact sequence every time:
```bash
python -m venv agent_env
source agent_env/bin/activate
pip install langchain langchain-community langgraph openai pyautogen crewai semantic-kernel duckduckgo-search requests
```

(Note: at v0.2, Microsoft's AutoGen is published on PyPI as `pyautogen`, not `autogen`.)
This installs all four frameworks in one go. Yes, it’s a bit heavy (around 300 MB), but it lets you switch between them without rebuilding.
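If you want the install to be reproducible, pin the versions in a `requirements.txt`. The major versions below mirror the ones I tested (listed at the top of this article); exact patch releases may differ on your machine:

```text
langchain==0.3.*
langchain-community==0.3.*
langgraph>=0.2
openai>=1.0
pyautogen==0.2.*
crewai==0.30.*
semantic-kernel==1.0.*
duckduckgo-search
requests>=2.31
```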
Step 2: Building with LangChain + LangGraph
LangChain’s agent ecosystem has evolved. For my test, I used LangGraph to define a state graph with three nodes: search, weather, and summarize. Here’s the core logic:
```python
from langgraph.graph import StateGraph, MessagesState, START, END
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_openai import ChatOpenAI
import requests

search_tool = DuckDuckGoSearchRun()
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

def search_node(state):
    # Run the web search on the user's request
    query = state["messages"][-1].content
    return {"messages": [("system", f"Search results: {search_tool.invoke(query)}")]}

def weather_node(state):
    # The benchmark task is fixed on London, so the city is hardcoded here
    report = requests.get("https://wttr.in/London?format=%C+%t", timeout=10).text
    return {"messages": [("system", f"Weather in London: {report}")]}

def summarize_node(state):
    # Let the model draft the email from everything gathered so far
    return {"messages": [model.invoke(state["messages"])]}

# Build the graph: three nodes wired in a fixed order
graph = StateGraph(MessagesState)
graph.add_node("search", search_node)
graph.add_node("weather", weather_node)
graph.add_node("summarize", summarize_node)
graph.add_edge(START, "search")
graph.add_edge("search", "weather")
graph.add_edge("weather", "summarize")
graph.add_edge("summarize", END)
app = graph.compile()

result = app.invoke({"messages": [("user", "Latest AI news and weather in London")]})
print(result["messages"][-1].content)
```
What I liked: the graph is explicit and debuggable. What I didn’t: you have to manually wire every edge. For a three-step agent it’s fine, but for ten tools it becomes a mess.
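One small mitigation I ended up using for the edge boilerplate (plain Python, no LangGraph required, names are my own): derive the linear edges from an ordered list of node names instead of writing each `add_edge` call by hand.

```python
def linear_edges(nodes):
    """Pair each node with its successor:
    ["a", "b", "c"] -> [("a", "b"), ("b", "c")]."""
    return list(zip(nodes, nodes[1:]))

# With ten tools this is one list instead of nine add_edge calls:
edges = linear_edges(["search", "weather", "summarize"])
print(edges)  # [('search', 'weather'), ('weather', 'summarize')]
```

You can then loop over `edges` and call `graph.add_edge(src, dst)` for each pair. It does not help with conditional routing, but it keeps a long linear chain honest.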
Step 3: Building with AutoGen
AutoGen takes a different approach: you instantiate conversable agents and let them accomplish the task by messaging each other. Here's the same task:
```python
import autogen

config_list = [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}]

assistant = autogen.AssistantAgent(
    name="Assistant",
    llm_config={"config_list": config_list},
)
user_proxy = autogen.UserProxyAgent(
    name="UserProxy",
    human_input_mode="NEVER",
    code_execution_config={"use_docker": False},
)

task = """1. Search the web for latest AI news.
2. Get weather in London.
3. Write a summary email."""

user_proxy.initiate_chat(assistant, message=task)
```
This is simpler, but AutoGen’s agents are designed for conversation, not tool chaining. To actually call the search tool, you need to register functions. I had to add:
```python
@user_proxy.register_for_execution()
@assistant.register_for_llm(description="Search the web")
def search(query: str) -> str:
    from duckduckgo_search import DDGS
    with DDGS() as ddgs:
        return " ".join(r["body"] for r in ddgs.text(query, max_results=3))

@user_proxy.register_for_execution()
@assistant.register_for_llm(description="Get weather")
def weather(city: str) -> str:
    import requests
    return requests.get(f"https://wttr.in/{city}?format=%C+%t").text
```
The registration pattern is powerful but verbose. For a beginner, the mental model of “who registers what” is confusing.
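The mental model that finally made it click for me: one side of the pair only knows tool *descriptions* (what the LLM may propose), the other side only holds the *callables* (what actually runs). Here is a toy model of that split in plain Python. This is emphatically not AutoGen's real API, just an illustration of why both registrations are required:

```python
class MiniAgentPair:
    """Toy model of AutoGen's two-sided tool registration (NOT the real API).
    The LLM side stores a tool description; the executor side stores the
    callable. A tool only works when both sides know about it."""

    def __init__(self):
        self.llm_schemas = {}  # what the assistant can *propose*
        self.executors = {}    # what the user proxy can *run*

    def register_for_llm(self, name, description):
        self.llm_schemas[name] = description

    def register_for_execution(self, name, fn):
        self.executors[name] = fn

    def call(self, name, *args):
        if name not in self.llm_schemas:
            raise KeyError(f"{name} was never offered to the LLM")
        return self.executors[name](*args)

pair = MiniAgentPair()
pair.register_for_llm("weather", "Get weather")
pair.register_for_execution("weather", lambda city: f"sunny in {city}")
print(pair.call("weather", "London"))  # sunny in London
```

Forget one of the two registrations and the tool either never gets proposed or never gets executed, which is exactly the failure mode I kept hitting.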
Step 4: Building with CrewAI
CrewAI is the most opinionated of the bunch—you define “Agents” and “Tasks” and a “Crew” that runs them. Here’s my implementation:
```python
from crewai import Agent, Task, Crew
from crewai_tools import tool
from duckduckgo_search import DDGS
import requests

# Plain classes with a run() method are NOT valid CrewAI tools;
# use the @tool decorator so the agent can actually call them.
@tool("Web Search")
def search_tool(query: str) -> str:
    """Search the web and return result snippets."""
    with DDGS() as ddgs:
        return " ".join(r["body"] for r in ddgs.text(query, max_results=3))

@tool("Weather Lookup")
def weather_tool(city: str) -> str:
    """Get the current weather for a city."""
    return requests.get(f"https://wttr.in/{city}?format=%C+%t", timeout=10).text

researcher = Agent(
    role="Researcher",
    goal="Search the web and get the weather",
    backstory="You are a research agent.",
    tools=[search_tool, weather_tool],
    verbose=True,
)
writer = Agent(
    role="Writer",
    goal="Write a summary email",
    backstory="You are a professional writer.",
    verbose=True,
)

task1 = Task(
    description="Search for AI news and get London weather",
    expected_output="Raw research notes",
    agent=researcher,
)
task2 = Task(
    description="Write an email summarizing the findings",
    expected_output="A polished email draft",
    agent=writer,
)

crew = Crew(agents=[researcher, writer], tasks=[task1, task2])
print(crew.kickoff())
```
CrewAI’s strength is readability. I could hand this code to a junior developer and they’d understand it in five minutes. The downside: it’s rigid. You can’t easily insert conditional logic (e.g., “if weather is rainy, include an umbrella reminder”).
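The workaround I settled on for conditional logic: do the branching in plain Python *before* `crew.kickoff()`, and bake the result into the task description. A minimal sketch (the function name and the hardcoded weather string are my own):

```python
def email_reminder(weather_report: str) -> str:
    """Decide on an extra line for the email based on the weather text.
    Runs as plain Python before the crew starts, not inside an agent."""
    if "rain" in weather_report.lower():
        return "Reminder: take an umbrella."
    return ""

# The conditional sentence becomes part of the writer's task description:
task_description = "Write an email summarizing the findings. " + email_reminder("Light rain +12°C")
print(task_description)
```

It works, but notice the limitation: the branch runs once, up front. The agent itself still cannot change course mid-run, which is where LangGraph's conditional edges earn their complexity.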
Step 5: Building with Semantic Kernel
Semantic Kernel from Microsoft is the most enterprise-oriented. It uses “plugins” and “planners” rather than agents. Here’s my attempt:
```python
import asyncio
import requests
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion
from semantic_kernel.functions import kernel_function
from semantic_kernel.planners import SequentialPlanner

# In SK 1.0, native functions live on plugin classes decorated
# with @kernel_function, not on bare module-level functions.
class ResearchPlugin:
    @kernel_function(name="search", description="Search the web")
    def search(self, query: str) -> str:
        from duckduckgo_search import DDGS
        with DDGS() as ddgs:
            return " ".join(r["body"] for r in ddgs.text(query, max_results=3))

    @kernel_function(name="weather", description="Get the weather for a city")
    def weather(self, city: str) -> str:
        return requests.get(f"https://wttr.in/{city}?format=%C+%t", timeout=10).text

async def main():
    kernel = sk.Kernel()
    kernel.add_service(OpenAIChatCompletion(service_id="gpt", ai_model_id="gpt-4o-mini", api_key="YOUR_KEY"))
    kernel.add_plugin(ResearchPlugin(), plugin_name="research")

    planner = SequentialPlanner(kernel, service_id="gpt")
    plan = await planner.create_plan("Search for AI news and get London weather, then summarize")
    result = await plan.invoke(kernel)
    print(result)

asyncio.run(main())
```
The planner is supposed to automatically chain the functions, but in practice, it often fails to call the weather function after the search. I had to add explicit step definitions. Semantic Kernel feels like it’s designed for .NET developers—the Python port is a second-class citizen.
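Stripped of the SK machinery, the fix amounted to replacing the planner's guesswork with an explicit ordered list of steps. A framework-free sketch of that pattern (stub lambdas stand in for the real kernel function calls):

```python
def run_explicit_plan(steps, context=""):
    """Run a fixed, ordered list of (name, fn) steps, feeding each
    step's output into the next via a shared context string."""
    for name, fn in steps:
        context = fn(context)
    return context

result = run_explicit_plan([
    ("search",    lambda ctx: ctx + "[search results] "),
    ("weather",   lambda ctx: ctx + "[London weather] "),
    ("summarize", lambda ctx: f"Summary of: {ctx.strip()}"),
])
print(result)
```

Once you are hand-ordering the steps anyway, the planner is buying you very little, which is a big part of why Semantic Kernel scored poorly in my comparison.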
The Honest Comparison Table
After building the same project in all four, here’s my real-world scorecard:
| Criterion | LangChain | AutoGen | CrewAI | Semantic Kernel |
|---|---|---|---|---|
| Ease of setup | Good | Fair | Excellent | Poor |
| Debugging | Graph visualization | Logging only | Verbose output | Minimal |
| Multi-agent support | Via LangGraph | Native | Native | Via plugins |
| Documentation quality | Excellent | Fair | Good | Poor (Python) |
| Learning curve | Steep | Medium | Low | Steep |
Which One Should You Choose?
Here’s my honest take after building the same agent in all four:
- Pick LangChain + LangGraph if you need fine-grained control over agent flow and you're comfortable reading graphs. It's the most flexible, but you pay in complexity.
- Pick AutoGen if you're building a multi-agent conversation system (like a debate or Q&A). The agent-to-agent messaging is excellent, but tool integration is clunky.
- Pick CrewAI if you want to get a prototype running in an afternoon. It's the most intuitive for beginners, but advanced logic requires workarounds.
- Skip Semantic Kernel for Python projects. It's clearly designed for .NET, and the Python version lags in both features and documentation.
For my next real project—a customer support agent that triages tickets—I’m going with CrewAI. The readability wins, and I can always drop down to LangGraph if I need custom routing. But if you’re building a research agent that needs to browse multiple sources, LangChain’s tool ecosystem is hard to beat.
The key lesson I learned: comparing open source AI agent frameworks is not about which one is "best" in the abstract; it's about which one matches your team's skill level and your project's complexity. Start simple with CrewAI, and reach for LangGraph when your routing needs outgrow it.
