We've shipped a lot in the past few months: multi-model routing, workflow automation, AI advisory boards, agent coordination, browser automation, cloud compute, edge deployment, and a universal installer for 9 IDEs.
But we're just getting started. Here's what's coming next — and why each feature exists.
Agent Observability: The Security Camera for AI Operations
The problem: When your agent makes 47 API calls, routes through 3 models, and spends 200 credits — you have no idea what happened. The response looks fine, but was it efficient? Did it retry failed calls? Did it use the expensive model when the cheap one would've worked?
What we're building: A full observability layer for agent operations. Every request logged with latency, model used, cost, and outcome. Traces that show you exactly how a multi-step workflow executed. Cost attribution that breaks down spending by model, by tool, by workflow, by user.
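To make the idea concrete, here's a toy sketch in Python of what one logged request and a cost rollup could look like. The field names and shapes are illustrative only, not our final schema:

```python
from dataclasses import dataclass
from collections import defaultdict

# Hypothetical shape of one logged agent request.
# Field names are illustrative, not HexaClaw's actual schema.
@dataclass
class RequestLog:
    workflow: str     # which workflow issued the call
    model: str        # model that served it
    latency_ms: int
    credits: float    # cost in platform credits
    outcome: str      # "ok", "retry", "error"

def attribute_costs(logs, key):
    """Roll up credit spend by any dimension (model, workflow, ...)."""
    totals = defaultdict(float)
    for log in logs:
        totals[getattr(log, key)] += log.credits
    return dict(totals)

logs = [
    RequestLog("research", "gpt-4o", 840, 12.0, "ok"),
    RequestLog("research", "gpt-4o-mini", 210, 0.5, "ok"),
    RequestLog("triage", "gpt-4o-mini", 190, 0.4, "retry"),
]

# Same logs, sliced two ways: spend per model, spend per workflow.
print(attribute_costs(logs, "model"))
print(attribute_costs(logs, "workflow"))
```

The point of the design: log once, attribute many ways. The same record answers "which model is burning credits?" and "which workflow is burning credits?" without separate instrumentation.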
Why it matters: As AI spending scales, visibility becomes critical. You wouldn't run a production API without monitoring. You shouldn't run production AI agents without it either. This is the feature that turns HexaClaw from a tool into a platform — the kind of visibility you can't get from any single model provider.
Remote Task Delegation: Cloud Meets Local
The problem: Cloud agents are great for background work. Local agents (like Claude Code on your laptop) are great for interactive coding. But they can't talk to each other. Your cloud agent can't say "I need someone to implement this function" and have your local Claude Code pick it up.
What we're building: A task delegation system where cloud agents can dispatch work to local agent instances. Your cloud research agent finds a bug in the codebase and creates a task. Your local Claude Code session sees the task, claims it, fixes the bug, and reports back. All through the platform.
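The create → claim → report loop is the whole protocol. Here's a minimal in-memory sketch of it in Python — the real system exposes this over the platform API, and every name below is illustrative:

```python
import uuid

# In-memory stand-in for the platform's task queue.
# The real system would expose this over an API; names are illustrative.
class TaskQueue:
    def __init__(self):
        self.tasks = {}

    def create(self, description, created_by):
        task_id = str(uuid.uuid4())
        self.tasks[task_id] = {
            "description": description,
            "created_by": created_by,
            "status": "open",
            "claimed_by": None,
            "result": None,
        }
        return task_id

    def claim(self, task_id, agent):
        task = self.tasks[task_id]
        if task["status"] != "open":
            return False  # another agent got there first
        task["status"] = "claimed"
        task["claimed_by"] = agent
        return True

    def report(self, task_id, result):
        self.tasks[task_id]["status"] = "done"
        self.tasks[task_id]["result"] = result

queue = TaskQueue()
# Cloud research agent finds a bug and files a task...
tid = queue.create("Fix null check in parse_config()", "cloud-research-agent")
# ...a local Claude Code session claims it and reports back.
assert queue.claim(tid, "local-claude-code")
queue.report(tid, "Fixed in commit abc123")
```

Claiming is first-come-first-served by design: a task can only be claimed while it's open, so two local sessions never work the same bug.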
Why it matters: Nobody else does this. The boundary between cloud and local AI is currently a wall. We're turning it into a door. This lets you build workflows where expensive compute happens in the cloud and interactive coding happens locally — coordinated through one system.
Agent-to-Agent Relay: Twilio for AI
The problem: Your agents can talk to each other within HexaClaw. But what about agents on other platforms? What about your partner's agents? Your client's agents? There's no standard way for agents to discover and communicate with agents outside their own system.
What we're building: A public relay network for agent-to-agent communication. Agents register with capabilities. Other agents discover them by what they can do. Messages route through the relay. Think of it as DNS + SMTP for AI agents — a way for any agent to find and talk to any other agent.
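Register, discover, route — that's the core loop. A toy Python version of the relay (agent names, capability strings, and message shapes are all illustrative, not the shipped protocol):

```python
from collections import defaultdict

# Toy relay: agents register capabilities, others discover them by
# capability and route messages through the relay. Illustrative only.
class Relay:
    def __init__(self):
        self.capabilities = defaultdict(set)  # capability -> agent names
        self.inboxes = defaultdict(list)      # agent name -> messages

    def register(self, agent, capabilities):
        for cap in capabilities:
            self.capabilities[cap].add(agent)

    def discover(self, capability):
        return sorted(self.capabilities[capability])

    def send(self, sender, recipient, body):
        self.inboxes[recipient].append({"from": sender, "body": body})

relay = Relay()
relay.register("acme/translator", ["translate"])
relay.register("acme/researcher", ["search", "summarize"])

# Find any agent that can summarize, then message it through the relay.
target = relay.discover("summarize")[0]
relay.send("my-agent", target, "Summarize this thread, please.")
```

Note that the sender never needs the recipient's address up front — it asks for a capability and the relay resolves it, the same way DNS resolves a name before SMTP delivers the mail.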
Why it matters: The AI industry is building toward multi-agent ecosystems. Google has A2A. Anthropic has multi-agent patterns. But nobody offers agent communication as a service. We're positioning HexaClaw as the infrastructure layer — the Twilio of agent-to-agent communication.
Coordination Dashboard: See Your Agent Team
The problem: The agent coordination system works great via API and tools. But there's no visual interface for humans to see what's happening.
What we're building: A full dashboard for agent coordination:
- Agents tab — see all registered agents, their status (idle, working, blocked, offline), and what they're currently working on
- Task Board tab — a Kanban-style view of all tasks. Drag to reassign. Click to see details. Filter by priority, status, or assignee
- Messages tab — threaded conversations between agents. Unread counts. Search across all agent communications
- Escalations tab — pending decisions waiting for your input. Resolve directly from the dashboard with one click
Why it matters: Right now, coordination is invisible unless you query the API. The dashboard makes it tangible. You see your agent team at a glance, spot bottlenecks, resolve escalations, and intervene when needed. It turns agent coordination from an API feature into a product experience.
App Hosting Phase 2: Databases, Storage, Custom Domains
The problem: Phase 1 of our edge deployment platform handles code. But real apps need databases, file storage, and custom domains.
What we're building:
- Databases — per-app SQLite databases at the edge. Your AI-generated app can store and query data without you setting up a database server.
- Object storage — file uploads, image hosting, document storage. Your app gets its own bucket.
- Custom domains — point your own domain to your deployed app. Not just a .workers.dev URL, but app.yourdomain.com.
Why it matters: Phase 1 proved the concept — instant deployment is powerful. Phase 2 makes it practical for real applications. The combination of instant deploy + database + storage + custom domain is a full hosting platform, accessible entirely through your AI assistant.
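The key property of per-app databases is isolation. Here's a sketch of that idea using Python's built-in sqlite3 — in-memory and illustrative; the real platform persists one database per app at the edge:

```python
import sqlite3

# Sketch of per-app isolation: each app name maps to its own database.
# In-memory here; the real platform would persist one DB per app.
_connections = {}

def app_db(app_name):
    if app_name not in _connections:
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
        _connections[app_name] = conn
    return _connections[app_name]

app_db("todo-app").execute("INSERT INTO notes (body) VALUES (?)", ("ship phase 2",))

# A different app sees its own empty, isolated store.
assert app_db("todo-app").execute("SELECT COUNT(*) FROM notes").fetchone()[0] == 1
assert app_db("blog-app").execute("SELECT COUNT(*) FROM notes").fetchone()[0] == 0
```

One app can never read another app's data, because there is no shared database to misconfigure — each app gets its own.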
Knowledge Engine: Production-Grade Memory
The problem: Our current knowledge features work well for simple recall. But production agents need knowledge graphs — relationships between entities, semantic search across documents, and retrieval that improves over time.
What we're building: A production-hardened knowledge engine with entity extraction, relationship mapping, vector search, and feedback loops. When an agent stores information, it's not just a key-value pair — it's a node in a knowledge graph that can be traversed, queried, and enriched.
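"A node that can be traversed" is easier to see than to describe. A minimal Python sketch of the graph side of the engine — entity names, relation names, and the query shape are all illustrative:

```python
from collections import defaultdict

# Minimal knowledge graph: entities as nodes, typed relationships
# as edges. Names and relations are illustrative, not the engine's schema.
class KnowledgeGraph:
    def __init__(self):
        self.edges = defaultdict(list)  # entity -> [(relation, entity)]

    def relate(self, subject, relation, obj):
        self.edges[subject].append((relation, obj))

    def query(self, subject, relation):
        return [obj for rel, obj in self.edges[subject] if rel == relation]

kg = KnowledgeGraph()
kg.relate("billing-service", "written_in", "Go")
kg.relate("billing-service", "owned_by", "payments-team")
kg.relate("payments-team", "prefers", "table-driven tests")

# Traverse two hops: who owns the service, and what do they prefer?
owner = kg.query("billing-service", "owned_by")[0]
print(kg.query(owner, "prefers"))  # ['table-driven tests']
```

That two-hop traversal is the payoff over key-value memory: the agent can answer "how should I write tests for billing-service?" from facts it learned in separate conversations.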
Why it matters: Memory is the difference between a stateless chatbot and a useful agent. An agent that remembers your project's architecture, your team's preferences, your codebase's patterns, and last quarter's decisions is far more valuable than one that starts fresh every conversation.
Browser Automation Upgrades
What's coming:
- Cookie persistence — sessions that survive across runs
- Multi-tab support — agents working across multiple pages simultaneously
- File downloads — extracting invoices, reports, and exports from browser sessions
- Scheduled browser tasks — "Every Monday, screenshot my analytics dashboard and email it to my team"
- Session recordings — full replay of every browser session, searchable and shareable
Expanded Model Catalog
We're targeting 50+ models across all major providers. More options mean better routing — the right model for each task, at the right price point.
New additions coming:
- Hugging Face open-source models
- Specialized models for code, math, and creative writing
- Smaller, faster models for simple routing tasks
- Multi-modal models for image understanding and analysis
Skills Marketplace
The vision: A curated marketplace of agent skills — vetted, tested, and installable with one click. Research skills. Writing skills. Data analysis skills. Code review skills.
Unlike existing skill registries (which are mostly unvetted repositories with security concerns), our marketplace will include:
- Security review before listing
- Compatibility testing across IDEs
- Credit cost transparency
- Usage analytics and ratings
The goal: when you need your agent to do something new, you browse the marketplace, install the skill, and it works. No config. No debugging. No "this skill only works in one specific IDE."
The Timeline
We're not giving dates — we've learned that lesson. But here's the priority order:
- Observability and Coordination Dashboard — these are next
- Remote Task Delegation — the unique differentiator
- App Hosting Phase 2 — databases and storage
- Agent-to-Agent Relay — the long-term bet
- Skills Marketplace — the growth engine
Each feature builds on the last. Observability makes coordination visible. The dashboard makes coordination usable. Task delegation extends coordination across cloud and local. The relay extends it across organizations.
The Big Picture
HexaClaw started as an LLM proxy — a simpler, cheaper way to access AI models. It's becoming something bigger: the operating system for AI agents.
- Runtime: multi-model routing, tool calling, security scanning
- Processes: deterministic workflows, human-in-the-loop automation
- Coordination: task boards, messaging, escalation, checkpoints
- Compute: cloud instances, edge deployment, local integration
- Memory: knowledge graphs, vector search, context persistence
- Observability: traces, cost attribution, performance monitoring
No single piece is revolutionary. Together, they're something new: a complete platform for building, running, and managing AI agents at scale.
We're building this in the open. Follow along, try the features as they ship, and tell us what to prioritize next. The roadmap belongs to our users as much as it belongs to us.