
Agent Framework Evaluation

Which framework should power the OpenClaw Discord bot -- the front-end agent that receives documents, extracts data, calls the Accounting API, and replies in Discord threads?

Requirements

Requirement | Why
Persistent personality | The bot has a defined character -- Norwegian accounting assistant with specific tone and expertise
Multi-user conversation memory | Parallel Discord threads per document, remembers user preferences and past interactions
Function calling / tool use | Calls Accounting API endpoints (POST /receipt/attach, GET /folio/balance, etc.)
Discord native | Gateway connection, thread management, file attachments, embeds
Multi-replica scaling | Horizontal scaling with state in Postgres, not in-process
Self-hosted on k3s | Runs on our Dell k3s cluster via ArgoCD
Postgres memory | Conversation history, learned patterns, user facts stored in PostgreSQL
LLM provider flexibility | Claude, Gemini Flash, or local models depending on task complexity

Tier 1 -- Strong Fits

ElizaOS (elizaOS/eliza)

Character-driven AI agent framework. Originally built for AI personalities with persistent identity.

Attribute | Detail
GitHub | elizaOS/eliza -- 17,886 stars
Language | Rust core (v2), TypeScript plugins
License | MIT
Activity | Daily commits, active community
Discord | Native first-class plugin
Postgres | Native via adapter-postgres + pgvector
Pricing | Free (open source, self-hosted)

Architecture: Plugin-based agent runtime. Agents are defined via JSON character files (name, bio, lore, style, topics, example messages). Plugins provide actions (tools), evaluators, and providers. V2 rewrote the core in Rust for performance.

Character system: First-class. Character files are the core abstraction -- this is what ElizaOS was built for. Define personality, speaking style, knowledge areas, and example conversations in structured JSON.
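A minimal sketch of what such a character definition might look like for this bot, written as a typed TypeScript object rather than raw JSON. The field names follow the list above (name, bio, lore, style, topics, example messages); the exact schema, the `messageExamples` key, and every value shown are assumptions and should be checked against the ElizaOS version in use.

```typescript
// Hypothetical character shape based on the fields named above.
// The real ElizaOS schema may differ; treat every key as an assumption.
interface Character {
  name: string;
  bio: string[];
  lore: string[];
  style: string[];
  topics: string[];
  messageExamples: Array<{ user: string; reply: string }>;
}

// Illustrative values only, not an agreed personality definition.
const accountant: Character = {
  name: "OpenClaw",
  bio: ["Norwegian accounting assistant that processes receipts and invoices."],
  lore: ["Works inside Discord threads, one thread per document."],
  style: ["Concise and precise", "Shows extracted data before acting on it"],
  topics: ["bookkeeping", "receipts", "invoices"],
  messageExamples: [
    {
      user: "Here is a receipt from the hardware store.",
      reply: "Got it. I will extract the total and show you the result first.",
    },
  ],
};

// Returns the names of fields that are empty, so a broken character
// definition can be rejected at startup instead of mid-conversation.
function emptyFields(c: Character): string[] {
  const missing: string[] = [];
  if (c.name.trim() === "") missing.push("name");
  for (const key of ["bio", "lore", "style", "topics", "messageExamples"] as const) {
    if (c[key].length === 0) missing.push(key);
  }
  return missing;
}
```

Keeping the definition typed makes it easy to validate in CI before a character change is deployed.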

Memory: Built-in short-term (conversation context) and long-term (facts extracted from conversations) memory. adapter-postgres provides PostgreSQL + pgvector for vector embedding storage and semantic search.
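To illustrate what "semantic search" over stored facts means here, the sketch below ranks facts by cosine similarity in plain TypeScript. In the real setup this ranking would run inside Postgres (pgvector's `<=>` operator returns cosine distance, which is 1 minus this similarity); the `StoredFact` type and the tiny two-dimensional embeddings are made up for illustration.

```typescript
// Hypothetical in-memory stand-in for a pgvector-backed facts table.
interface StoredFact {
  text: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k facts most similar to the query embedding, best first.
function topK(query: number[], facts: StoredFact[], k: number) {
  return facts
    .map((fact) => ({ fact, score: cosineSimilarity(query, fact.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The retrieved facts would then be injected into the prompt as long-term context alongside the recent conversation.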

Discord support: Native plugin (elizaos-plugin-discord). Handles messages, voice, threads. One of the most mature Discord integrations of any framework.

Function calling: Plugin action system. This is not standard OpenAI-style function calling, but it achieves the same result: actions are defined in plugins, and the agent decides when to call them.

LLM providers: Claude, OpenAI, Gemini, Llama (local) via model provider plugins.

Scaling: Single-process runtime per agent. No built-in horizontal scaling. Postgres adapter enables shared state, but you'd need custom sharding across replicas. Discord gateway is 1 connection per shard.

Concerns:

  • V2 Rust rewrite is recent -- plugin ecosystem still catching up
  • Strong Web3/crypto community focus -- less enterprise attention
  • No native multi-replica coordination
  • Sharding would need external orchestration

Verdict: Best batteries-included option. Native Discord + native character system + native Postgres. Weakest on horizontal scaling.


Mastra + discord.js

TypeScript-first agent framework from the Gatsby team, paired with a thin discord.js gateway layer.

Attribute | Detail
GitHub | mastra-ai/mastra -- 22,243 stars
Language | TypeScript
License | See repo (listed as NOASSERTION -- needs verification)
Activity | Daily commits, fast-moving
Discord | None native -- build ~200 lines of discord.js gateway
Postgres | Native via @mastra/memory + pgvector
Pricing | Free (open source, self-hosted)

Architecture: Agents, workflows, tools, and memory are composable modules. Server adapters (Express, Hono, Fastify) expose agents as HTTP endpoints. Built on Vercel AI SDK.

Character system: System prompts only. No structured character files. You define personality in the instruction string -- functional but not first-class.
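With a prompt-only framework, one way to keep the personality maintainable is to hold it as structured data and render it into the instruction string. The sketch below is framework-agnostic; the `Persona` type, the `buildInstructions` helper, and the example values are all hypothetical, and the resulting string is simply what would be passed as the agent's instructions.

```typescript
// Hypothetical structured persona, rendered to a plain instruction string.
interface Persona {
  name: string;
  role: string;
  tone: string[];
  rules: string[];
}

// Renders the persona as a deterministic, line-oriented system prompt.
function buildInstructions(p: Persona): string {
  return [
    `You are ${p.name}, ${p.role}.`,
    "Tone:",
    ...p.tone.map((t) => `- ${t}`),
    "Rules:",
    ...p.rules.map((r) => `- ${r}`),
  ].join("\n");
}

// Illustrative values only.
const instructions = buildInstructions({
  name: "OpenClaw",
  role: "a Norwegian accounting assistant",
  tone: ["Concise", "Professional"],
  rules: ["Reply in the Discord thread of the document being processed."],
});
```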

Memory: @mastra/memory with Postgres + pgvector. Thread-based conversations, semantic search, working memory, conversation summarization. Supports multi-user parallel threads natively. Memory is externalized -- stateless workers, state in DB.

Discord support: No native adapter. You build a discord.js bot (~200 LOC) that routes messages to Mastra agents via HTTP. Mastra has a Discord MCP server for reading Discord data, but not for running as a bot.
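The core of that gateway layer is a mapping from a Discord message to an agent request. The sketch below shows only that mapping, without importing discord.js, so it stays self-contained. The `IncomingMessage` type is a simplified stand-in for a discord.js message, and the `/api/agents/<id>/generate` path and the `threadId` / `resourceId` body fields are assumptions about the Mastra server API that should be verified against the installed version.

```typescript
// Simplified, hypothetical view of an incoming Discord message.
// Discord threads are channels, so channelId is the thread ID inside a thread.
interface IncomingMessage {
  authorId: string;
  channelId: string;
  content: string;
  isBot: boolean;
}

interface AgentRequest {
  url: string;
  body: {
    messages: Array<{ role: "user"; content: string }>;
    threadId: string; // one conversation per Discord thread (assumed field name)
    resourceId: string; // the Discord user, for per-user memory (assumed field name)
  };
}

// Builds the HTTP request for the agent, or null if the message should be ignored.
function buildAgentRequest(
  msg: IncomingMessage,
  baseUrl: string,
  agentId: string,
): AgentRequest | null {
  // Ignore other bots (including ourselves) and empty messages.
  if (msg.isBot || msg.content.trim() === "") return null;
  return {
    url: `${baseUrl}/api/agents/${agentId}/generate`, // assumed endpoint path
    body: {
      messages: [{ role: "user", content: msg.content }],
      threadId: msg.channelId,
      resourceId: msg.authorId,
    },
  };
}
```

In the real gateway, a discord.js `messageCreate` handler would call this function, POST the body as JSON, and send the agent's reply back into the same thread.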

Function calling: First-class. Tools defined with Zod schemas. OpenAI-style function calling natively. Strong tool ecosystem.

LLM providers: Built on Vercel AI SDK -- OpenAI, Anthropic, Google, Mistral, Groq, and any AI SDK provider. Best provider flexibility of any framework.

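Scaling: Stateless HTTP server with Postgres for all state. Multiple replicas behind a load balancer. Best scaling story of any option here. Note that this applies to the Mastra agent server; the discord.js gateway in front of it still holds the Discord connection and scales through sharding rather than through identical replicas.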

Concerns:

  • 405 open issues -- fast-moving but rough edges possible
  • License needs verification (NOASSERTION on GitHub)
  • No Discord gateway -- you build it yourself
  • Newer framework, less battle-tested than LangGraph

Verdict: Best scaling model, best TypeScript DX, best LLM flexibility. Extra work to build the Discord layer, but it's small.


Custom discord.js + Claude SDK

Build from scratch using proven libraries.

Attribute | Detail
Language | TypeScript
License | Ours
Discord | discord.js -- 25k+ stars, gold standard
Postgres | Full control over schema
Pricing | Free (LLM API costs only)

Architecture: discord.js for gateway, @anthropic-ai/sdk for Claude tool use, custom Postgres layer for memory. You own everything.

Character system: Full control. System prompts, personality injection, and Norwegian context can be implemented however you want. ElizaOS-style character files are an option if desired.

Memory: You build it. Postgres tables for conversations (per-user, per-thread, per-company), pgvector for semantic search, custom retention policies. Full control over schema.

Discord support: discord.js is the definitive Discord library. Full gateway, slash commands, threads, reactions, embeds, buttons, modals. Native sharding support for large deployments.

Function calling: Claude SDK native tool use. Define tools with JSON schemas, Claude returns tool calls, you dispatch. Clean, well-documented.
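A rough sketch of the "you dispatch" step, written without the SDK so it stays self-contained. The block shapes (`tool_use` with `id`, `name`, `input`; `tool_result` with `tool_use_id`) follow the Anthropic Messages API. The tool name `get_folio_balance`, its `folioId` parameter, and the stub handler are hypothetical; a real handler would be asynchronous and call GET /folio/balance on the Accounting API.

```typescript
// Tool definition in the shape sent to Claude: name, description, JSON Schema.
interface ToolDefinition {
  name: string;
  description: string;
  input_schema: { type: "object"; properties: Record<string, unknown>; required?: string[] };
}

// Content block Claude returns when it wants a tool to run.
interface ToolUseBlock {
  type: "tool_use";
  id: string;
  name: string;
  input: Record<string, unknown>;
}

// Content block sent back to Claude in the next user message.
interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

type ToolHandler = (input: Record<string, unknown>) => string;

// Hypothetical tool wrapping GET /folio/balance.
const tools: ToolDefinition[] = [
  {
    name: "get_folio_balance",
    description: "Look up the current balance of a folio.",
    input_schema: {
      type: "object",
      properties: { folioId: { type: "string" } },
      required: ["folioId"],
    },
  },
];

// Stub handler; a real one would call the Accounting API over HTTP.
const handlers = new Map<string, ToolHandler>([
  ["get_folio_balance", (input) => `balance lookup requested for ${String(input.folioId)}`],
]);

// Runs the matching handler and wraps the outcome as a tool_result block.
// Unknown tools and handler failures are reported to the model, not thrown.
function dispatchToolCall(block: ToolUseBlock, registry: Map<string, ToolHandler>): ToolResultBlock {
  const handler = registry.get(block.name);
  if (handler === undefined) {
    return { type: "tool_result", tool_use_id: block.id, content: `Unknown tool: ${block.name}`, is_error: true };
  }
  try {
    return { type: "tool_result", tool_use_id: block.id, content: handler(block.input) };
  } catch (err) {
    return { type: "tool_result", tool_use_id: block.id, content: String(err), is_error: true };
  }
}
```

Returning errors as `tool_result` blocks with `is_error` set lets the model explain the failure to the user or retry, instead of the bot crashing mid-thread.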

LLM providers: You control the client. Swap @anthropic-ai/sdk, openai, @google/generative-ai as needed. Vercel AI SDK for unified interface.
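As an illustration of swapping providers by task, the sketch below routes on a coarse complexity label. The three complexity levels, the routing policy, and the `LlmClient` interface are assumptions made for this example; the provider labels are placeholders rather than real model identifiers.

```typescript
type Provider = "claude" | "gemini-flash" | "local";
type Complexity = "trivial" | "routine" | "complex";

// Hypothetical common interface each provider SDK would be wrapped in.
interface LlmClient {
  provider: Provider;
  complete(prompt: string): Promise<string>;
}

// Hypothetical policy: cheapest model that can plausibly handle the task.
function pickProvider(complexity: Complexity): Provider {
  switch (complexity) {
    case "trivial":
      return "local";
    case "routine":
      return "gemini-flash";
    case "complex":
      return "claude";
  }
}

// Looks up the wrapped client for a task from a registry keyed by provider.
function clientFor(complexity: Complexity, registry: Record<Provider, LlmClient>): LlmClient {
  return registry[pickProvider(complexity)];
}
```

Because callers depend only on `LlmClient`, changing the policy or adding a provider does not touch the Discord or tool-dispatch code.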

Scaling: discord.js supports native sharding. Stateless workers with Postgres. Multiple replicas straightforward.
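For reference, Discord assigns each guild to a gateway shard with the formula `shard_id = (guild_id >> 22) % num_shards`. The sketch below implements that formula; the function name is our own, and `BigInt` is used because guild IDs are 64-bit snowflakes that do not fit safely in a JavaScript number.

```typescript
// Computes which shard receives events for a guild, per Discord's formula:
// shard_id = (guild_id >> 22) % num_shards
function shardForGuild(guildId: string, shardCount: number): number {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new Error("shardCount must be a positive integer");
  }
  const shifted = BigInt(guildId) >> BigInt(22);
  return Number(shifted % BigInt(shardCount));
}

// 20971520 is 5 * 2^22, so the shifted value is 5 and 5 % 4 is 1.
const exampleShard = shardForGuild("20971520", 4); // 1
```

Since the bot serves a small number of guilds, a single shard is likely enough at first; the formula matters once the gateway layer is split across processes.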

Concerns:

  • Most upfront development (estimated 2-4 weeks for core)
  • You build memory management, conversation threading, tool dispatch, error handling, rate limiting
  • No framework community for bug fixes -- it's your code

Verdict: Maximum control, maximum flexibility, no framework risk. More upfront work but no paradigm mismatch, no dead dependencies, no lock-in.


Tier 2 -- Viable with Caveats

LangGraph + discord.py

Stateful graph-based workflow engine from LangChain.

Attribute | Detail
GitHub | langchain-ai/langgraph -- 27,228 stars
Language | Python
License | MIT
Activity | Daily commits
Discord | None -- build with discord.py
Postgres | Native checkpointer + store with semantic search
Pricing | Free self-hosted. LangGraph Platform (hosted) is paid

Architecture: Agents are state machines (nodes = steps, edges = transitions). Checkpointing persists full graph state to Postgres. Supports cycles, branching, human-in-the-loop.

Memory: Strong. PostgresSaver for conversation state, PostgresStore for cross-thread long-term memory with semantic search. Thread-based via thread_id. State survives restarts, works across replicas.

Scaling: Postgres-backed state enables multi-replica. Graph is stateless between invocations.
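The checkpointing idea can be sketched independently of LangGraph. The example below is not LangGraph's API (which is Python); it is a conceptual TypeScript illustration in which a `Map` stands in for the Postgres checkpoint table, and the step names are hypothetical. Each call loads the state for a `threadId`, runs one node, and saves the result, so any replica could run the next step.

```typescript
type Step = "extract" | "confirm" | "done";

interface ThreadState {
  step: Step;
  log: string[];
}

// Nodes: each takes the current state and returns the next state.
const nodes: Record<Exclude<Step, "done">, (s: ThreadState) => ThreadState> = {
  extract: (s) => ({ step: "confirm", log: [...s.log, "extracted"] }),
  confirm: (s) => ({ step: "done", log: [...s.log, "confirmed"] }),
};

// Stand-in for a Postgres-backed checkpointer, keyed by thread ID.
// State is serialized, as it would be when written to a database row.
class Checkpointer {
  private readonly rows = new Map<string, string>();

  load(threadId: string): ThreadState | undefined {
    const raw = this.rows.get(threadId);
    return raw === undefined ? undefined : (JSON.parse(raw) as ThreadState);
  }

  save(threadId: string, state: ThreadState): void {
    this.rows.set(threadId, JSON.stringify(state));
  }
}

// Runs exactly one node for the thread and persists the new state.
// No state is held in the worker between calls.
function advance(threadId: string, checkpointer: Checkpointer): ThreadState {
  const state: ThreadState = checkpointer.load(threadId) ?? { step: "extract", log: [] };
  if (state.step === "done") return state;
  const next = nodes[state.step](state);
  checkpointer.save(threadId, next);
  return next;
}
```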

Concerns:

  • Python-only (rest of our stack is TypeScript/C#)
  • LangChain abstractions can be leaky and complex
  • 476 open issues
  • Graph paradigm has a learning curve
  • No Discord integration -- you build the bot layer

Verdict: Most mature stateful agent framework. Excellent Postgres memory. But Python-only and requires building the Discord layer.


Tier 3 -- Eliminated

CrewAI

Reason | Detail
Wrong paradigm | Multi-agent task orchestration, not conversational bots
Memory model | Crew-scoped, not user-thread-scoped
Discord | None native

CrewAI is designed for batch task workflows (research, content generation) -- agents with roles completing tasks in sequence. Not for real-time conversational interaction. Memory doesn't map to multi-user Discord threads.

Botpress

Reason | Detail
Self-hosted dead | V12 discontinued for new deployments
Cloud pricing | Free: 500 msg/mo, Plus: $79/mo, Team: $445/mo, Enterprise: $2000/mo
Discord | No integration

Cloud-only with per-message pricing. Self-hosted V12 is in maintenance mode. No Discord support.

Rasa

Reason | Detail
Maintenance mode | Last commit 2+ months ago
Not LLM-native | Rule/ML-based, LLM support bolted on via CALM
Discord | No official connector

Predates the LLM era. Transitioning to CALM but momentum has stalled. Not a fit for LLM-native function calling agents.

LiveKit Agents

Reason | Detail
Wrong paradigm | Real-time voice/video AI agents (WebRTC)
No Discord | Uses LiveKit rooms, not Discord gateway
No persistent memory | In-memory only

Built for voice AI (STT -> LLM -> TTS pipelines). Not for text-based Discord bots.

Pipecat

Reason | Detail
Wrong paradigm | Voice/multimodal AI pipelines (by Daily.co)
No Discord | Uses Daily WebRTC
No persistent memory | In-memory only

Same category as LiveKit -- voice/multimodal focused. Not for text chat bots.

AutoGen (Microsoft)

Reason | Detail
Maintenance mode | Being replaced by Microsoft Agent Framework
No Discord | None built-in
No Postgres | ChromaDB/Redis only
License | CC-BY-4.0 (unusual for software -- documentation license)

Microsoft is consolidating AutoGen + Semantic Kernel into the new Microsoft Agent Framework (currently RC, GA expected mid-2026). AutoGen gets bug fixes only.


Comparison Matrix

Criterion | ElizaOS | Mastra + discord.js | Custom Build | LangGraph
Character system | Native (JSON) | Prompt-only | Full control | Prompt-only
Discord | Native plugin | ~200 LOC gateway | discord.js (native) | Build yourself
Postgres memory | Native (pgvector) | Native (pgvector) | Full control | Native (checkpointer)
Multi-replica | Manual sharding | Excellent | discord.js sharding | Good
LLM flexibility | Good | Excellent | Full control | Excellent
Tool/function calling | Plugin actions | Zod schemas | Claude native | LangChain tools
Language | Rust + TypeScript | TypeScript | TypeScript | Python
Development time | Days (config) | ~1 week | 2-4 weeks | ~1-2 weeks
Framework risk | Medium (v2 new) | Medium (fast-moving) | None | Low (mature)
Community | Web3-heavy | Growing | N/A | Large (LangChain)
License | MIT | Verify | Ours | MIT

Recommendation

For the AI Accountant, we recommend Mastra + discord.js for these reasons:

  1. TypeScript -- consistent with our Node.js infrastructure (discord.js, MCP servers)
  2. Best scaling model -- stateless workers + Postgres, designed for multi-replica from day one
  3. Excellent memory -- thread-based, per-user, Postgres + pgvector, with summarization
  4. Best LLM flexibility -- Vercel AI SDK supports all providers, easy to switch between Claude and Gemini Flash based on task
  5. Tool system -- Zod-validated tools map cleanly to our Accounting API endpoints
  6. The Discord gap is small -- ~200 lines of discord.js to route messages to Mastra agents via HTTP

ElizaOS is the runner-up if we want a faster start (days vs. a week) and value the character file system. The tradeoff is weaker scaling and a Web3-oriented community.

Custom build is the fallback if both frameworks prove problematic. It is more work (an estimated 2-4 weeks for the core) but carries no framework risk.


Next Steps

  1. Verify Mastra license (NOASSERTION on GitHub)
  2. Prototype: discord.js gateway -> Mastra agent -> mock Accounting API
  3. Test Postgres memory with parallel threads
  4. Define character/personality as Mastra agent instructions
  5. Deploy to k3s alongside Accounting API