Ask HN: How are you using multi-agent AI systems in your daily workflow?

raffaeleg · 2026-03-15T17:49:00.000Z 1773596940

Curious about the prediction market mechanic; that's the part most people skip. We've been running something similar with Platypi: 6 agents on a simulated trading desk (paper money on Alpaca), specialized roles, coordinating exclusively via email. No dashboard, no human intervention. The coordination patterns that emerged were unexpected: agents developing implicit trust hierarchies, one risk manager consistently blocking the others, disagreements that resolved faster than any human team would. It's live here: https://platypi.empla.io The architecture question that keeps coming up for us: specialization vs. redundancy. Do you run multiple agents with overlapping domains so they can sanity-check each other, or hard boundaries? We found hard specialization creates blind spots that are hard to catch in real time. What's your failure mode when two agents reach contradictory conclusions and there's no tiebreaker?

jovanaccount · 2026-03-15T13:36:47.000Z 1773581807

From my experience building multi-agent systems: the biggest underappreciated problem is state coordination.

Frameworks handle individual agent capabilities well. What they don't handle: preventing two agents from silently overwriting each other's work on shared state. It's a classic race condition but in AI systems the output looks reasonable, so you don't notice it until production.

We open-sourced a coordination layer that adds atomic state management to any framework (LangChain, AutoGen, CrewAI, MCP, etc.): https://github.com/Jovancoding/Network-AI
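
To make the lost-update race above concrete, here is a minimal Go sketch of one common remedy: optimistic concurrency with a version check. It is not the API of the linked project; `Store`, `Read`, `Write`, and `ErrConflict` are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrConflict is returned when the caller's version is stale.
var ErrConflict = errors.New("state changed since it was read")

type entry struct {
	value   string
	version int
}

// Store is a hypothetical shared state with a version counter per key.
type Store struct {
	mu   sync.Mutex
	data map[string]entry
}

func NewStore() *Store { return &Store{data: make(map[string]entry)} }

// Read returns the current value and the version it was read at.
func (s *Store) Read(key string) (string, int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e := s.data[key]
	return e.value, e.version
}

// Write succeeds only if expected still matches the stored version,
// so a second agent cannot silently overwrite the first agent's work.
func (s *Store) Write(key, value string, expected int) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	e := s.data[key]
	if e.version != expected {
		return ErrConflict
	}
	s.data[key] = entry{value: value, version: e.version + 1}
	return nil
}

func main() {
	s := NewStore()
	// Both agents read the same version before either one writes.
	_, vA := s.Read("plan")
	_, vB := s.Read("plan")
	fmt.Println(s.Write("plan", "agent A's draft", vA)) // succeeds
	fmt.Println(s.Write("plan", "agent B's draft", vB)) // rejected as stale
}
```

The second write fails loudly instead of producing a reasonable-looking overwrite, which turns a silent race into an error the agent has to handle (re-read and retry, or escalate).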

formreply · 2026-03-06T16:20:14.000Z 1772814014

What fails spectacularly in our setup: agents that share a conversation thread and try to resolve conflicts in real time. They race to add the last word, produce verbose non-decisions, and eventually one agent just agrees with whatever was said last. Consensus is a bad protocol for async, unequal agents.

What works: role clarity + veto rights. One agent can only block, never propose. One agent makes calls, others can raise flags. You stop the chatbot parliament problem and actually get decisions.
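
A rough Go sketch of that role split, under assumptions: one proposal from the deciding agent, one agent that can only block, and advisors whose flags are recorded but never change the outcome. All type and function names here (`Proposal`, `Vetoer`, `Advisor`, `Decide`, `RiskVeto`, `LiquidityAdvisor`) are hypothetical.

```go
package main

import "fmt"

// Proposal is what the single deciding agent puts forward.
type Proposal struct {
	Action string
	Size   int
}

// Flag is advisory: it is recorded but never changes the outcome.
type Flag struct{ From, Note string }

// Vetoer can only block, never propose.
type Vetoer func(p Proposal) (reason string, blocked bool)

// Advisor may raise a flag; the bool is false when it has nothing to say.
type Advisor func(p Proposal) (Flag, bool)

type Outcome struct {
	Approved bool
	Reason   string
	Flags    []Flag
}

// Decide applies the roles in a fixed order, so there is no back-and-forth.
func Decide(p Proposal, veto Vetoer, advisors []Advisor) Outcome {
	out := Outcome{Approved: true}
	for _, a := range advisors {
		if f, ok := a(p); ok {
			out.Flags = append(out.Flags, f)
		}
	}
	if reason, blocked := veto(p); blocked {
		out.Approved = false
		out.Reason = reason
	}
	return out
}

// RiskVeto is an illustrative blocker with a made-up size limit.
func RiskVeto(p Proposal) (string, bool) {
	if p.Size > 100 {
		return "position size above limit", true
	}
	return "", false
}

// LiquidityAdvisor is an illustrative flag-raiser.
func LiquidityAdvisor(p Proposal) (Flag, bool) {
	if p.Size > 50 {
		return Flag{From: "liquidity", Note: "large order, expect slippage"}, true
	}
	return Flag{}, false
}

func main() {
	advisors := []Advisor{LiquidityAdvisor}
	fmt.Printf("%+v\n", Decide(Proposal{Action: "buy", Size: 60}, RiskVeto, advisors))
	fmt.Printf("%+v\n", Decide(Proposal{Action: "buy", Size: 500}, RiskVeto, advisors))
}
```

Every proposal ends in exactly one of two states, approved or blocked with a reason, which is the property a shared conversation thread does not give you.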

The other pattern worth stealing from production systems: treat inbound events (emails, webhooks, form submissions) as the task boundary, not the conversation turn. An agent that owns a mailbox and processes messages one at a time is dramatically more auditable than one that's always-on and decides what to react to. You can replay it, diff its outputs, and understand why it did what it did.
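
The mailbox idea can be sketched in Go roughly as follows, assuming a deterministic handler so that replay is meaningful (with an LLM behind it you would log the model output rather than recompute it). `Message`, `Record`, `ProcessMailbox`, `Replay`, and the two handlers are hypothetical names.

```go
package main

import "fmt"

// Message is one inbound event: an email, a webhook, a form submission.
type Message struct{ ID, Body string }

// Record is one audit-log entry: what came in and what the agent did.
type Record struct{ MessageID, Input, Output string }

// Handler is the agent's decision for a single message.
type Handler func(m Message) string

// ProcessMailbox handles messages strictly one at a time, in order,
// and returns an append-only audit log.
func ProcessMailbox(inbox []Message, h Handler) []Record {
	audit := make([]Record, 0, len(inbox))
	for _, m := range inbox {
		audit = append(audit, Record{MessageID: m.ID, Input: m.Body, Output: h(m)})
	}
	return audit
}

// Replay re-runs a handler over the logged inputs and returns the IDs
// of messages whose output differs from what was recorded.
func Replay(audit []Record, h Handler) []string {
	var changed []string
	for _, r := range audit {
		if out := h(Message{ID: r.MessageID, Body: r.Input}); out != r.Output {
			changed = append(changed, r.MessageID)
		}
	}
	return changed
}

// AckV1 and AckV2 are two illustrative versions of the same agent.
func AckV1(m Message) string { return "ack: " + m.Body }

func AckV2(m Message) string {
	if m.Body == "refund request" {
		return "escalate: " + m.Body
	}
	return AckV1(m)
}

func main() {
	inbox := []Message{{ID: "m1", Body: "status update"}, {ID: "m2", Body: "refund request"}}
	audit := ProcessMailbox(inbox, AckV1)
	fmt.Println(Replay(audit, AckV1)) // same handler: nothing changed
	fmt.Println(Replay(audit, AckV2)) // new handler: m2 now escalates
}
```

Because the message is the task boundary, diffing two agent versions reduces to replaying the same log through both.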

raffaeleg · 2026-03-08T01:12:28.000Z 1772932348

We're running a live event at platypi.empla.io — a simulated trading desk where 6 agents coordinate entirely via email with no human in the loop. No shared conversation thread, no central orchestrator. Bozen (supervisor) gets a morning briefing from each PM agent, they argue about positions over email, Mizumo executes. The interesting thing isn't the trading — it's that email as coordination protocol produces naturally auditable, replayable agent behavior. Paper money on Alpaca, but the coordination infrastructure is the point.

stokemoney · 2026-03-06T23:43:06.000Z 1772840586

Built my own custom solution that is completely spec-driven. Has concepts of specs, plans, and a kanban board to monitor all agents as they progress.

It takes a plan, breaks it into dependent tasks, has human-in-the-loop for approval, and then is fire-and-forget after the plan is started with parallel agent workers. Has complete code review loops and testing loops for accuracy and quality. Idempotent retries and restarts... Completely frontend-driven so I don't have to deal with dumb terminals like Claude Code...
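
One way to turn a plan with dependent tasks into work for parallel agents is to group tasks into waves, where every task in a wave depends only on earlier waves. The Go sketch below is an illustration under that assumption, not a description of the setup above; `Waves` and the task names are made up.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Waves groups tasks so that each wave depends only on earlier waves.
// deps maps a task to the tasks it needs; every task must appear as a key.
func Waves(deps map[string][]string) ([][]string, error) {
	done := map[string]bool{}
	var waves [][]string
	for len(done) < len(deps) {
		var ready []string
		for task, needs := range deps {
			if done[task] {
				continue
			}
			ok := true
			for _, n := range needs {
				if !done[n] {
					ok = false
					break
				}
			}
			if ok {
				ready = append(ready, task)
			}
		}
		if len(ready) == 0 {
			return nil, errors.New("cycle or unknown dependency in plan")
		}
		sort.Strings(ready) // deterministic order within a wave
		// Mark tasks done only after the whole wave is collected, so tasks
		// in the same wave never unblock each other.
		for _, t := range ready {
			done[t] = true
		}
		waves = append(waves, ready)
	}
	return waves, nil
}

func main() {
	plan := map[string][]string{
		"spec": nil,
		"code": {"spec"},
		"docs": {"spec"},
		"test": {"code"},
	}
	fmt.Println(Waves(plan)) // [[spec] [code docs] [test]] <nil>
}
```

Each wave can be handed to parallel workers, and a restart only needs to know which tasks are already done.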

mrothroc · 2026-03-06T14:52:14.000Z 1772808734

I've been running a multi-agent setup for quite a while to do software development. I set up a workflow with agents at each stage, spec->plan->design->code->review. The key thing I learned was that the arrangement of the checks between agents matters more than which model you pick for any one step. Most failures were omissions that a gate between stages catches.
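
A gate between stages can be as small as a check on the previous stage's output before the next stage sees it. The Go sketch below assumes string artifacts and a keyword check for omissions; `Stage`, `RequireMention`, and `RunPipeline` are hypothetical names, and a real gate would likely be a test run or a reviewer agent.

```go
package main

import (
	"fmt"
	"strings"
)

// Stage is one step in the workflow plus the gate that checks its output.
type Stage struct {
	Name string
	Run  func(in string) string
	Gate func(out string) error // nil means no gate
}

// RequireMention builds a gate that catches one kind of omission:
// the output never mentions the given keyword.
func RequireMention(keyword string) func(string) error {
	return func(out string) error {
		if !strings.Contains(out, keyword) {
			return fmt.Errorf("output omits %q", keyword)
		}
		return nil
	}
}

// RunPipeline stops at the first failed gate and returns the last
// output that passed, so the failure is caught between stages.
func RunPipeline(input string, stages []Stage) (string, error) {
	cur := input
	for _, s := range stages {
		out := s.Run(cur)
		if s.Gate != nil {
			if err := s.Gate(out); err != nil {
				return cur, fmt.Errorf("gate after %s: %w", s.Name, err)
			}
		}
		cur = out
	}
	return cur, nil
}

func main() {
	stages := []Stage{
		{
			Name: "spec",
			Run:  func(in string) string { return in + " | spec: login with rate limiting" },
			Gate: RequireMention("rate limiting"),
		},
		{
			Name: "plan",
			Run:  func(in string) string { return in + " | plan: build login form" },
			Gate: RequireMention("tests"), // this plan forgot the tests
		},
	}
	out, err := RunPipeline("feature request", stages)
	fmt.Println(out)
	fmt.Println(err)
}
```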

Horos · 2026-03-06T11:33:59.000Z 1772796839

I've set up a fully async pattern: blobs chunked into SQLite shards.

It's a blind fire-and-forget Go worker dance,

which can be kept monitored or scaled to multiple instances if needed via simple parameters.

Basically, it's a "job as library" pattern.

If you don't need real time, it's bulletproof and very LLM-friendly.

It's also a good token saver thanks to batching.

leandot · 2026-03-06T12:11:27.000Z 1772799087

Curious to hear more details about this setup.

Horos · 2026-03-06T12:29:58.000Z 1772800198

The "job as library" pattern is simple: instead of wiring jobs into main or a framework, you split into 3 things.

Your queue is a struct with New(db) — it knows submit, poll, complete, fail, nothing else.

Your worker is another struct that loops on the queue and dispatches to handlers registered via RegisterHandler("type", fn). Your handlers are pure functions (ctx,payload) → (result, error) carried by a dependency struct.

Main just assembles: open DB, create queue, create worker, register handlers, call worker.Start(ctx). Result: each handler is unit-testable without the worker or network, the worker is reusable across any pipeline, and lifecycle is controlled by a simple context.Cancel().

Bonus: here the queue is a SQLite table with atomic poll (BEGIN IMMEDIATE), zero external infra.

The whole "framework" is 500 lines of readable Go, not an opaque DSL. TL;DR: every service is a library with New() + Start(ctx), the binary is just an assembler.
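
A condensed Go sketch of the queue/worker/handler split described above. This is not the linked code: the queue here is an in-memory slice standing in for the SQLite table (to avoid a driver dependency), and names such as `NewQueue`, `NewWorker`, and `runDemo` are assumptions.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// Job is one queue row. Status is pending, running, done, or failed.
type Job struct {
	ID                            int
	Type, Payload, Status, Result string
}

// Queue is an in-memory stand-in; the mutex plays the role of BEGIN IMMEDIATE.
type Queue struct {
	mu   sync.Mutex
	jobs []*Job
}

func NewQueue() *Queue { return &Queue{} }

func (q *Queue) Submit(typ, payload string) int {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.jobs = append(q.jobs, &Job{ID: len(q.jobs) + 1, Type: typ, Payload: payload, Status: "pending"})
	return len(q.jobs)
}

// Poll atomically claims the oldest pending job, if there is one.
func (q *Queue) Poll() (Job, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	for _, j := range q.jobs {
		if j.Status == "pending" {
			j.Status = "running"
			return *j, true
		}
	}
	return Job{}, false
}

func (q *Queue) finish(id int, status, result string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.jobs[id-1].Status, q.jobs[id-1].Result = status, result
}

func (q *Queue) Complete(id int, result string) { q.finish(id, "done", result) }
func (q *Queue) Fail(id int, err error)         { q.finish(id, "failed", err.Error()) }

type Handler func(ctx context.Context, payload string) (string, error)

type Worker struct {
	q        *Queue
	handlers map[string]Handler
}

func NewWorker(q *Queue) *Worker { return &Worker{q: q, handlers: map[string]Handler{}} }

func (w *Worker) RegisterHandler(typ string, h Handler) { w.handlers[typ] = h }

// Start drains the queue; a long-running worker would sleep and poll again.
func (w *Worker) Start(ctx context.Context) {
	for ctx.Err() == nil {
		job, ok := w.q.Poll()
		if !ok {
			return
		}
		h, found := w.handlers[job.Type]
		if !found {
			w.q.Fail(job.ID, errors.New("no handler for "+job.Type))
			continue
		}
		if res, err := h(ctx, job.Payload); err != nil {
			w.q.Fail(job.ID, err)
		} else {
			w.q.Complete(job.ID, res)
		}
	}
}

// runDemo is the "main just assembles" step: queue, worker, handlers, start.
func runDemo() (ok, missing Job) {
	q := NewQueue()
	w := NewWorker(q)
	w.RegisterHandler("echo", func(_ context.Context, p string) (string, error) {
		return "echo:" + p, nil
	})
	a, b := q.Submit("echo", "hi"), q.Submit("unknown", "x")
	w.Start(context.Background())
	return *q.jobs[a-1], *q.jobs[b-1] // worker has stopped, so direct reads are safe
}

func main() {
	fmt.Println(runDemo())
}
```

The handler passed to `RegisterHandler` can be unit-tested by calling it directly, with no queue or worker involved, which is the main payoff of the split.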

The "all in connectivity" pattern means every capability in your system — embeddings, document extraction, replication, MCP tools — is called through one interface: router.Call(ctx,"service", payload).

The router looks up a SQLite routes table to decide how to fulfill that call: in-memory function (local), HTTP POST (http), QUIC stream (quic), MCP tool (mcp), vector embedding (embed), DB replication (dbsync), or silent no-op (noop).

You code everything as local function calls — monolith. When you need to split a service out, you UPDATE one row in the routes table, the watcher picks it up via PRAGMA data_version, and the next call goes remote.

Zero code change, zero restart. Built-in circuit breaker, retry with backoff, fallback-to-local on remote failure, SSRF guard.

The caller never knows where the work happens.

That's the "all in connectivity" pattern: the boundary between monolith and microservices is a config row, not an architecture decision.

https://github.com/hazyhaar/pkg/tree/main/connectivity
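
For readers who want the shape of the routing idea without opening the repo, here is a small Go sketch. It is not the linked package's API: the routes table is an in-memory map rather than SQLite, a single `Remote` strategy stands in for http/quic/mcp, and there is no watcher, circuit breaker, or retry. `Router`, `Register`, `SetRoute`, and `runDemo` are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// Strategy says how a call is fulfilled. Remote stands in for http, quic, mcp.
type Strategy string

const (
	Local  Strategy = "local"
	Remote Strategy = "remote"
	Noop   Strategy = "noop"
)

type Func func(ctx context.Context, payload string) (string, error)

// Router maps a service name to a strategy. The routes map is an in-memory
// stand-in for the SQLite routes table.
type Router struct {
	mu            sync.RWMutex
	routes        map[string]Strategy
	local, remote map[string]Func
}

func NewRouter() *Router {
	return &Router{routes: map[string]Strategy{}, local: map[string]Func{}, remote: map[string]Func{}}
}

func (r *Router) Register(service string, local, remote Func) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.local[service], r.remote[service] = local, remote
}

// SetRoute is the equivalent of updating one row in the routes table.
func (r *Router) SetRoute(service string, s Strategy) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.routes[service] = s
}

// Call dispatches by the current route; the caller never sees which one ran.
func (r *Router) Call(ctx context.Context, service, payload string) (string, error) {
	r.mu.RLock()
	strategy, ok := r.routes[service]
	local, remote := r.local[service], r.remote[service]
	r.mu.RUnlock()
	if !ok {
		return "", errors.New("no route for " + service)
	}
	if strategy == Noop {
		return "", nil
	}
	if strategy == Remote && remote != nil {
		if out, err := remote(ctx, payload); err == nil {
			return out, nil
		}
		// Remote failed: fall through to the local implementation.
	}
	if local == nil {
		return "", errors.New("no local implementation for " + service)
	}
	return local(ctx, payload)
}

// runDemo flips the route at runtime and then simulates a remote outage.
func runDemo() []string {
	r := NewRouter()
	remoteUp := true
	r.Register("embed",
		func(_ context.Context, p string) (string, error) { return "local:" + p, nil },
		func(_ context.Context, p string) (string, error) {
			if !remoteUp {
				return "", errors.New("remote unreachable")
			}
			return "remote:" + p, nil
		})
	var out []string
	call := func() {
		res, _ := r.Call(context.Background(), "embed", "doc")
		out = append(out, res)
	}
	r.SetRoute("embed", Local)
	call()
	r.SetRoute("embed", Remote)
	call()
	remoteUp = false
	call()
	return out
}

func main() {
	fmt.Println(runDemo()) // [local:doc remote:doc local:doc]
}
```

The call site is identical in all three cases; only the route entry and the remote's health differ.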

Horos · 2026-03-13T13:39:37.000Z 1773409177

Had a look?

dhruvkar · 2026-03-07T04:32:48.000Z 1772857968

Following.

I'm using Openclaw + Opus. Several subagents.

However, performance degrades when using subagents: scraping is less smart, content is less well written, etc.

I'm curious about using different instances instead, but not sure how to use a shared memory foundation effectively.

humbleharbinger · 2026-03-07T05:03:07.000Z 1772859787

We built a messaging platform for exactly this use case and instruct claws to check in with each other or share context with each other at regular intervals.

Check out https://agentbus.org

xpnsec · 2026-03-06T11:40:42.000Z 1772797242

More interestingly, what frameworks/harnesses/architecture are people using to drive multi-agent workflows?

Nancy0904 · 2026-03-06T08:54:07.000Z 1772787247

It sounds complicated. Is your Agent trying to solve everything?

Irving-AI · 2026-03-06T09:00:45.000Z 1772787645

How well is your agent performing?