
Stage · Beta

Voice AI Assistant

Jarvis AI

A JARVIS-style desktop assistant with streaming voice, a multi-agent system, and full PC automation.

Voice: Streaming, sub-second
Agents: 4 specialised
Failover: LLM + STT + TTS
Safety: Plan → Confirm → Execute
Local router: Zero-cost queries
Platform: macOS now; Windows and Linux next

Control your Mac with voice or text. Open apps, manage files, send emails, browse the web, and automate tasks. Under the hood: a streaming voice pipeline with sub-second response, a multi-agent ReAct loop, and a Plan → Confirm → Execute safety pipeline for complex actions.
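
To make the Plan → Confirm → Execute idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the code that ships: `Step`, `Plan`, `run_plan`, and the tool name `FileManager.move` used as a dictionary key are all hypothetical. The point it shows is the ordering: the whole plan is shown to the user first, and nothing runs until it is approved.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    tool: str                  # hypothetical tool key, e.g. "FileManager.move"
    args: Dict[str, str]
    destructive: bool = False  # flagged so the user sees it before approving


@dataclass
class Plan:
    goal: str
    steps: List[Step] = field(default_factory=list)

    def summary(self) -> str:
        """Human-readable plan, shown (or spoken) before anything runs."""
        lines = [f"Goal: {self.goal}"]
        for i, step in enumerate(self.steps, start=1):
            flag = " [destructive]" if step.destructive else ""
            lines.append(f"{i}. {step.tool} {step.args}{flag}")
        return "\n".join(lines)


def run_plan(
    plan: Plan,
    confirm: Callable[[str], bool],
    tools: Dict[str, Callable[..., str]],
) -> List[str]:
    """Plan -> Confirm -> Execute: run steps only after explicit approval."""
    # Reject plans that reference tools we do not have, before asking the user.
    unknown = [s.tool for s in plan.steps if s.tool not in tools]
    if unknown:
        raise ValueError(f"Unknown tools in plan: {unknown}")

    # Confirm: a single yes/no on the full plan. A "no" executes nothing.
    if not confirm(plan.summary()):
        return []

    # Execute: steps run in order; each tool returns a short status string.
    return [tools[step.tool](**step.args) for step in plan.steps]
```

In a voice setting, `confirm` would read the summary aloud and wait for a yes or no; in a text setting it could be a dialog. Either way the callback is the only gate between a plan and its side effects.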

Why this exists.

Existing voice assistants (Siri, Alexa, ChatGPT voice) are either rigid (skill-based, narrow) or stateless (no PC control, no real automation). Power users want a desktop-native assistant that can actually do things on their machine, with sub-second voice response and a safety layer for destructive actions.

  • § 01

    Voice latency kills the loop

    Round-trip voice on most tools takes 3 to 5 seconds. The intent-to-action loop breaks; you give up and use the keyboard.

  • § 02

    No actual PC control

    ChatGPT can describe how to do a thing. It cannot open Safari, send the email, take the screenshot, set the volume, or move the file.

  • § 03

    One-size-fits-all agents

    Research, dev, scheduling, and general help all need different context, voices, and constraints. Single-agent products mash them together.

  • § 04

    Cost runaway

    Always-on voice assistants burn tokens on greetings, time queries, and small talk. Costs creep without visibility.
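
The feature list on this page answers the cost problem with a local intent router that handles time, date, greeting, and status queries without an API call. A rough sketch of that idea follows. The regex patterns, the reply strings, and the name `route_locally` are assumptions made for illustration; the real router may match intents quite differently.

```python
import re
from datetime import datetime
from typing import Callable, List, Optional, Pattern, Tuple

# Hypothetical routes: (pattern, handler). Anything that matches is answered
# on-device, so no LLM tokens are spent on it.
_ROUTES: List[Tuple[Pattern[str], Callable[[datetime], str]]] = [
    (
        re.compile(r"\bwhat time is it\b|\bwhat(?:'s| is) the time\b", re.I),
        lambda now: now.strftime("It is %H:%M."),
    ),
    (
        re.compile(r"\bwhat(?:'s| is) the date\b|\bwhat day is it\b", re.I),
        lambda now: f"Today is {now.date().isoformat()}.",
    ),
    (
        re.compile(r"^\s*(?:hi|hello|hey)\b", re.I),
        lambda now: "Hello. How can I help?",
    ),
]


def route_locally(text: str, now: Optional[datetime] = None) -> Optional[str]:
    """Return a canned reply for trivial queries, or None to fall through.

    A None result means the caller should hand the text to the LLM agent.
    """
    now = now or datetime.now()
    for pattern, handler in _ROUTES:
        if pattern.search(text):
            return handler(now)
    return None
```

The caller checks `route_locally(text)` first and only reaches for a paid model when it returns `None`, which is what keeps greetings and clock queries at zero cost.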

Working software, not slides.

Each item below is in the codebase today. The full architecture and code walkthrough are available under NDA.

  • Streaming voice pipeline: Silero VAD (browser) → Deepgram STT → Agent → Cartesia TTS, with binary PCM in/out and sub-second response.
  • Barge-in: click the orb to interrupt mid-speech.
  • 4-agent system: JARVIS (main), Scout (research), Mason (dev), Echo (scheduler), each with its own voice, profile, and toolset.
  • 9 tools, 25 functions: AppManager (80+ aliases), FileManager, SystemControl, WebSearch, Browser (Playwright), Calculator, Reminder, EmailManager (Gmail OAuth2), AgentSwitch.
  • Plan → Confirm → Execute safety pipeline for complex multi-step tasks.
  • Local intent router handles simple queries (time, date, greetings, status) with zero API calls.
  • SQLite token-usage database tracking cost per model, per hour, and per conversation.
  • Multi-LLM failover: Grok (xAI) → OpenRouter Qwen → OpenAI GPT-4o-mini.
  • Multi-STT failover: Deepgram Nova-2 → OpenAI Whisper → Local Whisper.
  • Multi-TTS failover: Cartesia Sonic → OpenAI → Coqui → gTTS → pyttsx3.
  • Glassmorphic Electron desktop app with HolographicOrb (5 states), animated boot, Orbitron/Rajdhani type system.
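
The LLM, STT, and TTS failover chains above all follow the same pattern: try providers in priority order and move on when one fails. The sketch below shows that pattern in Python. It is a simplified assumption, not the shipped orchestration: `complete_with_failover` and `AllProvidersFailed` are hypothetical names, and each provider is represented by a plain callable rather than a real vendor client.

```python
from typing import Callable, Iterable, List, Tuple


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has errored."""


def complete_with_failover(
    prompt: str,
    providers: Iterable[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, call) pair in order; return (name, reply) on first success."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider error triggers the next fallback
            errors.append(f"{name}: {exc}")
    # Nothing succeeded: surface every failure so the cause is visible.
    raise AllProvidersFailed("; ".join(errors))
```

The same function shape works for speech-to-text and text-to-speech if the callables take audio or text instead of a prompt. Returning the provider name alongside the reply makes it easy to log which tier actually served each request.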

Who buys this.

Power users, developers, executives, and accessibility-conscious operators on macOS, with Windows and Linux distribution to follow.

Where it is today.

Validation signals, in plain language. Detailed numbers under NDA.

  • Streaming voice pipeline working end-to-end
  • Multi-agent system shipped with workspace profiles
  • Token usage dashboard live
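
For a sense of what sits behind a token usage dashboard, here is a small sketch using Python's built-in `sqlite3` module. The table name, column names, and the helpers `open_db`, `record`, and `cost_by` are assumptions for illustration only; the actual schema is part of the NDA material. It covers the three cuts mentioned on this page: cost per model, per hour, and per conversation.

```python
import sqlite3
from typing import Dict

# Hypothetical schema: one row per LLM call.
SCHEMA = """
CREATE TABLE IF NOT EXISTS token_usage (
    id INTEGER PRIMARY KEY,
    ts TEXT NOT NULL,               -- ISO 8601 UTC, e.g. 2024-05-01T10:15:00Z
    conversation_id TEXT NOT NULL,
    model TEXT NOT NULL,
    prompt_tokens INTEGER NOT NULL,
    completion_tokens INTEGER NOT NULL,
    cost_usd REAL NOT NULL
)
"""

# Whitelisted grouping expressions; the hour bucket is the first 13
# characters of the timestamp (YYYY-MM-DDTHH).
_GROUPS = {
    "model": "model",
    "conversation": "conversation_id",
    "hour": "substr(ts, 1, 13)",
}


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record(conn, ts, conversation_id, model,
           prompt_tokens, completion_tokens, cost_usd) -> None:
    """Append one usage row; values are bound as parameters, never formatted in."""
    conn.execute(
        "INSERT INTO token_usage (ts, conversation_id, model, "
        "prompt_tokens, completion_tokens, cost_usd) VALUES (?, ?, ?, ?, ?, ?)",
        (ts, conversation_id, model, prompt_tokens, completion_tokens, cost_usd),
    )
    conn.commit()


def cost_by(conn, dimension: str) -> Dict[str, float]:
    """Total cost grouped by 'model', 'hour', or 'conversation'."""
    expr = _GROUPS[dimension]  # KeyError on anything outside the whitelist
    rows = conn.execute(
        f"SELECT {expr}, SUM(cost_usd) FROM token_usage "
        f"GROUP BY {expr} ORDER BY {expr}"
    ).fetchall()
    return dict(rows)
```

A dashboard would poll `cost_by` for each dimension and chart the results; passing a file path to `open_db` instead of the in-memory default keeps the history across restarts.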

The tools in play.

  • Electron 33
  • React 19
  • Vite 6
  • Tailwind v4
  • Python 3.9+
  • FastAPI
  • Zustand
  • TanStack Query
  • PyAutoGUI
  • Playwright
  • APScheduler
  • SQLite

In place today: the codebase, glassmorphic UI system, agent profiles, and full LLM/STT/TTS failover orchestration. Electron packaging is in progress.

The full transfer package.

Acquisition transfers everything below. Investment unlocks the same access for due diligence: code walkthrough, financial pack, customer/user list, and architecture briefing under NDA.

  • Full source code with complete git history.
  • All architecture, API, and operational documentation.
  • Database schema, migrations, and seed data.
  • Brand assets (logo, design system, marketing copy).
  • Domain and DNS handover.
  • Vendor account transfer support (hosting, AI APIs, third-party services).
  • A 30- to 90-day transition with the founder, including code walkthrough and architecture briefing.
  • Full IP rights, including code copyright and the product brand.

Open to the following.

Anything outside this list is a no for now.

  • Open to investment
  • Open to acquisition
  • Open to partnership

Ready to talk?

Email lands directly with me. I reply within 24 hours, send an NDA, and book a 30-minute call. From there: data room, code walkthrough, and a structured term discussion.
