AI & LLMs · September 8, 2025 · 11 min read

Synapse Studio: A 2D Virtual Office Where AI Agents Do the Real Work

Gonzalo Monzón

Founder & Lead Architect

What if your AI wasn't a chat window but a virtual office? A visual space where you can see agents walking between desks, sitting down to work on tasks, thinking, collaborating — and delivering real results? That's Synapse Studio: a 2D animated office (think SimTower meets AI) where multi-agent teams with 7 multimodal capabilities process real tasks autonomously.

No React. No frameworks. Zero npm dependencies. Just Vanilla JS, CSS animations, and Cloudflare's edge infrastructure. This article covers the architecture, the multi-agent orchestration, and the iterative image evolution system that makes it unique.

The Concept: Why a Virtual Office?

AI is abstract. You send a prompt, you get a response. But when you're orchestrating multiple AI agents working in parallel on a complex task — research, analysis, image generation, quality review, final compilation — the abstraction breaks down. Who's doing what? What's the status? Where's the bottleneck?

We turned the abstraction into a spatial metaphor:

  • Each agent has a desk in a 2D office building
  • Departments occupy different floors (Marketing, Sales, Support, Creative)
  • Agents physically move to collaborate — you see them walk to a colleague's desk
  • Visual states: idle (at desk), thinking (animation), working (typing), collaborating (at another desk)
  • Task progress is visible in real-time through the office activity

It's not a gimmick — it's a genuine UX solution to the multi-agent observability problem. When you have 4 agents working on a task, you see the work happening.

7 Multimodal Capabilities

Each agent can be configured with any combination of these capabilities:

Capability             Code   What It Does                    Provider
LLM (Text)             Base   Chat, reasoning, analysis       Gemini / Claude / GPT-4o / Groq
Web Search             F1     Real-time internet search       Groq Compound Beta
Vision (ITT)           F2     Analyze and understand images   Gemini 2.0 Flash
Text-to-Image (TTI)    F3     Generate images from text       FLUX Schnell / SDXL Lightning
Image-to-Image (I2I)   F6     Transform images iteratively    Stable Diffusion 1.5 / SDXL
Speech-to-Text (STT)   F4     Transcribe audio                ElevenLabs Scribe / Whisper
Text-to-Speech (TTS)   F5     Generate voice narration        ElevenLabs

The key insight: agents aren't just chatbots with extra features. They're multimodal workers. A "Visual Designer" agent has TTI + I2I + Vision. A "Researcher" has LLM + Web Search. A "Fullstack Media" agent has all 7 capabilities. The role defines the tool belt, not the other way around.
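As a minimal sketch of the tool-belt idea (the agent definitions and the `canHandle` helper are hypothetical, not the actual codebase — capability codes follow the table above):

```javascript
// Hypothetical agent definitions: a role is just a named set of capability codes.
const AGENTS = {
  visualDesigner: { role: "Visual Designer", caps: new Set(["F3", "F6", "F2"]) }, // TTI + I2I + Vision
  researcher:     { role: "Researcher",      caps: new Set(["Base", "F1"]) },     // LLM + Web Search
};

// A task declares the capabilities it needs; an agent qualifies
// only if its tool belt covers every one of them.
function canHandle(agent, requiredCaps) {
  return requiredCaps.every((cap) => agent.caps.has(cap));
}

console.log(canHandle(AGENTS.researcher, ["Base", "F1"])); // true
console.log(canHandle(AGENTS.researcher, ["F3"]));         // false
```

Dispatching on declared capabilities rather than agent names is what lets the orchestrator assemble arbitrary teams per task.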

Multi-Agent Orchestration

When a task arrives — from Cadences, a public form, a scraper, or a webhook — the orchestration engine takes over:

Incoming task
  │
  ├── Orchestrator AI analyzes the task
  │     └── Generates a multi-agent execution plan
  │
  ├── Role assignment
  │     ├── Agent 1: Research (data gathering, web search)
  │     ├── Agent 2: Expertise (deep analysis with premium AI)
  │     ├── Agent 3: Review (quality check, fact verification)
  │     └── Agent 4: Compilation (final deliverable)
  │
  ├── Dual AI execution
  │     ├── Fast AI: quick responses (Groq, DeepSeek)
  │     └── Deep AI: thorough analysis (Gemini 2.5, Claude)
  │
  └── Delivery
        ├── Email
        ├── WhatsApp
        └── Webhook

The Dual AI system is critical for cost optimization. The Fast AI handles initial analysis, routing, and simple subtasks at near-zero cost (Groq is essentially free at our volume). The Deep AI only activates for complex reasoning, creative generation, or quality-critical outputs — saving ~70% on AI costs compared to running everything through premium models.
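The routing decision can be sketched like this — a rough illustration only, with made-up thresholds and model names rather than the production values:

```javascript
// Illustrative model pools for each tier.
const FAST_MODELS = ["groq/llama-3.3-70b", "deepseek-chat"];      // near-zero cost
const DEEP_MODELS = ["gemini-2.5-pro", "claude-sonnet"];          // premium

// Hypothetical routing rule: cheap tier by default, premium tier only
// when the task is creative, quality-critical, or large.
function pickTier(task) {
  const needsDeep =
    task.creative || task.qualityCritical || (task.estTokens || 0) > 4000;
  return needsDeep ? "deep" : "fast";
}

console.log(pickTier({ estTokens: 300 }));                 // "fast"
console.log(pickTier({ creative: true, estTokens: 300 })); // "deep"
```

Because most subtasks (routing, extraction, summaries) fall through to the fast tier, the premium models only see the small fraction of work where quality actually pays for itself.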

1-8 Agents Per Task

Not every task needs 8 agents. A simple "research this topic" might use 1 agent with LLM + Web Search. A complex "create a marketing campaign with visuals" might use 5 agents across Research, Copy, Design, Review, and Compilation roles. The orchestrator decides based on task complexity.

I2I: Iterative Image Evolution

This is the most innovative feature. Instead of generating one image and hoping it's right, Synapse Studio evolves images through chains of transformations:

Original image → I2I Transform (strength: 0.7) → Result v1
                                                     │
                                          AI proposes next step
                                                     │
                                       Auto-subtask created
                                                     │
                                  Result v1 → I2I Transform → v2
                                                                │
                                                      ...continues

Each transformation has two key parameters:

  • Strength (0.0-1.0): How much the image changes. 0.3 = subtle refinement. 0.9 = dramatic reimagining.
  • Guidance (1-20): How literally the AI follows the text prompt. Higher = more literal.

After each transformation, the AI analyzes the result and proposes the next iteration — what to change, what to keep, what strength to use. The chain continues until max_depth is reached or the AI decides the result is optimal.
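The loop above can be sketched as follows. `transform` and `critique` are stubs standing in for the real I2I model call and the AI's analysis step — every name and signature here is illustrative, not the production code:

```javascript
// Stub model calls so the loop is runnable; in production, `transform` is a
// Stable Diffusion I2I call and `critique` a vision-model analysis.
const transform = async (img, step) => `${img}+s${step.strength}`;
const critique = async (img) =>
  img.split("+").length <= 2
    ? { prompt: "refine details", strength: 0.4, guidance: 7 } // propose next iteration
    : null;                                                    // result judged optimal

async function evolveImage(image, prompt, { maxDepth = 4 } = {}) {
  const versions = [image];
  let step = { prompt, strength: 0.7, guidance: 7 }; // initial transform parameters
  for (let depth = 0; depth < maxDepth; depth++) {
    const next = await transform(versions.at(-1), step); // one I2I pass
    versions.push(next);
    step = await critique(next); // AI analyzes the result
    if (step === null) break;    // stop early when the AI is satisfied
  }
  return versions; // full chain: original -> v1 -> v2 -> ...
}
```

With these stubs the chain stops after two transforms; the real loop ends either at `maxDepth` or when the critique step declines to propose another iteration.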

Use cases: logo refinement, concept art iteration, style transfer evolution, progressive detail enhancement. Each step builds on the previous one, maintaining visual consistency while exploring creative directions.

Two Built-In Bots

SynapseBot (The Architect)

A conversational wizard that creates complete agencies from natural language:

"I need a marketing agency with 5 agents specializing in social media, copywriting, and visual design"

SynapseBot generates the entire structure: agency name, departments, agents with predefined roles and capabilities. Instant multi-agent team setup.

CommandBot (The CEO)

An AI CEO that supervises operations in real-time:

  • Approves/rejects subtasks proposed by agents
  • Monitors execution progress
  • Makes prioritization decisions when agents compete for resources
  • Can override agent decisions when quality doesn't meet standards

3-Level Quality System

Every organization can choose its quality tier, balancing cost vs. output quality:

Level   AI Provider               Image Model           Cost
L1      Workers AI (free)         FLUX Schnell (free)   $0
L2      Groq / DeepSeek           FLUX Schnell          ~$0.001/task
L3      Gemini / Claude / GPT-4   SDXL Lightning        ~$0.02-0.10/task

L1 is completely free — using Cloudflare's Workers AI models. Good enough for internal tasks, drafts, and experimentation. L3 is premium quality for client-facing deliverables. Most organizations run L2 for routine work and escalate to L3 for final outputs.
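A sketch of how tier selection might look in code — the tier table mirrors the one above, but the escalation rule and all names are assumptions for illustration:

```javascript
// Illustrative tier definitions; costs are approximate, per the table above.
const QUALITY_TIERS = {
  L1: { llm: "workers-ai", image: "flux-schnell",   approxCostPerTask: 0 },
  L2: { llm: "groq",       image: "flux-schnell",   approxCostPerTask: 0.001 },
  L3: { llm: "gemini",     image: "sdxl-lightning", approxCostPerTask: 0.05 },
};

// Hypothetical escalation rule: routine work runs at the org's default tier,
// client-facing deliverables escalate to L3.
function tierFor(org, task) {
  return task.clientFacing ? QUALITY_TIERS.L3 : QUALITY_TIERS[org.defaultTier];
}
```

The point of encoding tiers as data is that an organization can change its cost/quality trade-off with a single setting, without touching the pipeline.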

Input/Output Sources

Tasks can come from anywhere and results can go anywhere:

Input Sources                      Output Destinations
Cadences Bridge (automatic flow)   Email
Public DATA_TABLE forms            WhatsApp
Web scrapers                       Webhook (any URL)
Webhooks (external APIs)           Back to Cadences

The Cadences Bridge is the most interesting: tasks created in Cadences automatically flow to Synapse Studio for AI processing. Results flow back. The user never leaves Cadences — they just see their task completed with AI-generated deliverables attached.

The Subtask System: Agents That Think Ahead

Agents don't just complete their assigned work — they propose additional work when they identify gaps:

  1. Agent detects that it needs additional information or a complementary output
  2. Proposes a subtask with type and parameters
  3. CommandBot (CEO AI) validates the proposal
  4. Subtask executes with its own pipeline
  5. Result feeds back into the original task

Subtask types: additional research, image generation, I2I evolution, data analysis, complementary writing. This means a simple "write a blog post about Morocco" can organically expand into: research current travel trends → generate hero image → write the post → evolve the image with I2I → review quality → compile final output with images embedded.
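The propose-then-validate gate can be sketched like this (the type list echoes the one above; the depth cap and function names are hypothetical):

```javascript
// Subtask types an agent may propose, per the list above.
const SUBTASK_TYPES = ["research", "image_gen", "i2i_evolution", "analysis", "writing"];

// CommandBot's gate: reject unknown types and cap nesting depth,
// so organic expansion can't turn into scope explosion.
function validateSubtask(proposal, parentDepth, maxDepth = 3) {
  if (!SUBTASK_TYPES.includes(proposal.type)) {
    return { ok: false, reason: "unknown subtask type" };
  }
  if (parentDepth + 1 > maxDepth) {
    return { ok: false, reason: "max depth reached" };
  }
  return { ok: true };
}

console.log(validateSubtask({ type: "image_gen" }, 0)); // { ok: true }
```

An approved proposal then runs through the same pipeline as a top-level task, with its result fed back into the parent.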

Zero Dependencies, Maximum Control

Synapse Studio is built with zero npm dependencies. No React, no Vue, no build step. Pure Vanilla JS + CSS.

Why? Because the animation system — agents walking between floors, sitting at desks, showing thought bubbles, status indicators — requires precise control over DOM manipulation and CSS transitions. No framework overhead, no virtual DOM reconciliation delays. Every animation runs at 60fps because there's nothing between our code and the browser.
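As a rough sketch of why this stays cheap: the JS only computes a target desk position and sets a CSS transform, and the browser's transition engine does all the tweening. Grid dimensions and names below are invented for illustration:

```javascript
// Hypothetical office geometry: fixed grid of desks per floor.
const FLOOR_HEIGHT = 120;
const DESK_WIDTH = 90;
const DESKS_PER_FLOOR = 4;

// Map a desk index to pixel coordinates in the 2D building.
function deskPosition(deskIndex) {
  const floor = Math.floor(deskIndex / DESKS_PER_FLOOR);
  const slot = deskIndex % DESKS_PER_FLOOR;
  return { x: slot * DESK_WIDTH, y: floor * FLOOR_HEIGHT };
}

// In the browser, moving an agent is then two style assignments --
// no per-frame JS loop, the compositor handles the animation:
//   agentEl.style.transition = "transform 1.2s ease-in-out";
//   agentEl.style.transform  = `translate(${x}px, ${y}px)`;
console.log(deskPosition(5)); // { x: 90, y: 120 }
```

Because `transform` animations run on the compositor thread, the walking agents stay smooth even while the main thread is busy handling task updates.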

The 5 main JS files:

  • synapse-canvas.js — The 2D office renderer (building, floors, desks, agents)
  • synapse-engine.js — Task orchestration, agent assignment, execution pipeline
  • synapse-ui.js — Panels, interactions, responsive layout
  • config-panel.js — Agency/agent/capability configuration
  • detail-panel.js — Rich result display (images, I2I comparisons, text)

Backend: a single [[path]].js catch-all route on Cloudflare Workers. Storage: R2 for images, D1 for agencies/agents/tasks/results. Total infrastructure cost? Part of the $65/month Cloudflare bill we detailed in our Cloudflare article.

Key Takeaways

1. Spatial metaphors solve observability. When you have multiple AI agents working in parallel, a chat log isn't enough. A visual office where you see agents move, think, and collaborate makes the invisible visible.

2. Dual AI is the cost sweet spot. Fast AI for routing and simple tasks, Deep AI for quality-critical outputs. The 70% cost savings make multi-agent systems economically viable even for small teams.

3. Image evolution > one-shot generation. The I2I chain system produces dramatically better results than single-prompt image generation, because each iteration refines while maintaining visual consistency.

4. The subtask proposal system is emergent behavior. Agents that can identify gaps and propose additional work create outcomes that exceed what you'd get from rigid task assignment. The CEO Bot prevents scope explosion while allowing organic expansion.

5. Zero dependencies isn't a limitation — it's freedom. No framework lock-in, no dependency vulnerabilities, no build complexity. The entire studio ships as static files to Cloudflare Pages. Deploys in seconds.

Tags

AI Agents Multi-Agent Multimodal Image Generation Vanilla JS Cloudflare Workers

About the Author

Gonzalo Monzón

Founder & Lead Architect

Gonzalo Monzón is a Senior Solutions Architect & AI Engineer with over 26 years building mission-critical systems in Healthcare, Industrial Automation, and enterprise AI. Founder of Cadences Lab, he specializes in bridging legacy infrastructure with cutting-edge technology.
