Reference · AI Host

Agentic Gathering Host

A speaking AI companion that joins Sponic Agentic Gatherings as the host's wing-friend — listens, replies in the speaker's language, and runs the conversational glue. Production now uses LiveKit Agents + Gemini Live native audio, orchestrated by the self-hosted Phoenix worker.

Prepared 2026-05-06 · Sponic Gardens · Sibling of Live Translation Subtitles
Current model path (2026-05-13): production uses livekit.plugins.google.realtime.RealtimeModel with gemini-3.1-flash-live-preview for native audio in/out. The older Deepgram → Claude → ElevenLabs/Gemini-TTS cascade is historical; apps/ai-host/gemini_tts.py remains in the tree but is not the active runtime.
Phase 1: shipped · Pre-credentials · Local-only run mode
Contents
  1. What it is & what it does
  2. How it works (Phase 1 architecture)
  3. File map
  4. First-time setup
  5. Normal operation (day-to-day)
  6. Testing checklist
  7. Production operation (deploy targets, monitoring)
  8. Cost
  9. Troubleshooting
  10. Phase 1 vs the long roadmap

What it is & what it does

The Agentic Gathering Host is a real-time voice agent designed to sit at the head of a Sponic Gardens gathering and play the role of a warm, curious, slightly playful host. It joins a LiveKit room, receives participant audio, sends it through Gemini Live native audio, and publishes spoken responses back into the same room. An optional Simli session renders the host as a live avatar on the venue display.

It is not a generic assistant. The persona is constrained to the Agentic Gathering register: 1–2 sentence replies, no medical/legal/financial advice, no politics or religion unless the table clearly invites it, no pretending to eat or drink.

The active production worker is apps/ai-host/agent.py on Oracle Phoenix, registered as the named LiveKit worker sponic-dinner-host. The historical dev task still documents how the stack evolved, but current cost/model planning should use Gemini Live.

Billing model: LiveKit room cost scales by connected participant minutes. Gemini Live and Simli are room/session costs for the host/avatar, not a separate AI charge per human. The Phoenix worker is self-hosted, so LiveKit Cloud agent-session minutes do not apply unless we move the worker to LiveKit Cloud hosting.

How it works (Phase 1 architecture)

┌───────────────────────────────────────────────────────────────────┐
│                       LiveKit Cloud (SFU)                         │
│       wss://sponic-XXXX.livekit.cloud  ·  WebRTC transport        │
└──────┬───────────────────────────────────────────────┬────────────┘
       │  guest mic audio                              │  agent voice
       ▼                                               │
┌──────────────────────┐                               │
│  Browser / phone     │  (LiveKit Playground for      │
│  in the LiveKit room │   testing; later a venue tab  │
│                      │   or a Sponic intranet page)  │
└──────────────────────┘                               │
                                                       │
                                          ┌────────────▼───────────────┐
                                          │  Python agent worker       │
                                          │  apps/ai-host/agent.py     │
                                          │                            │
                                          │  Gemini Live native audio  │  speech ↔ speech
                                          │  gemini-3.1-flash-live-    │  RealtimeModel
                                          │  preview · voice "Puck"    │
                                          │      │                     │
                                          │      ├─ transcript capture │  for Supabase
                                          │      │                     │
                                          │      └─ optional Simli     │  avatar track
                                          └────────────────────────────┘

The agent worker is a long-lived Python process that registers with LiveKit Cloud as a named worker and gets dispatched into a room when one is created. It attaches an AgentSession backed by Gemini Live native audio. The session handles speech input, reasoning, and spoken output in one realtime model call; the worker captures transcript metadata for Supabase and can attach a Simli avatar session that subscribes to the generated audio.
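
The wiring described above might look roughly like the following sketch. It assumes the livekit-agents 1.x API (AgentSession, Agent, WorkerOptions, cli.run_app) and the RealtimeModel class named earlier in this document. The instructions string and the helper names realtime_settings, entrypoint, and main are illustrative, and transcript capture and Simli are left out, so this is not the contents of the real agent.py. LiveKit imports are deferred into the functions so the settings helper can be exercised without the plugins installed.

```python
import os


def realtime_settings(env=None):
    """Resolve the Gemini Live model and voice, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "model": env.get("GEMINI_MODEL", "gemini-3.1-flash-live-preview"),
        "voice": env.get("GEMINI_VOICE", "Puck"),
    }


async def entrypoint(ctx):
    """Called by the worker for each room dispatch; ctx is a livekit.agents.JobContext."""
    # Deferred imports: only needed once a job actually runs.
    from livekit.agents import Agent, AgentSession
    from livekit.plugins.google.realtime import RealtimeModel

    await ctx.connect()

    # One realtime model handles speech in, reasoning, and speech out.
    session = AgentSession(
        llm=RealtimeModel(
            api_key=os.environ["GOOGLE_AI_API_KEY"],
            **realtime_settings(),
        )
    )

    # Placeholder instructions; the real worker loads its persona from the prompt library.
    host = Agent(instructions="You are a warm, brief gathering host.")
    await session.start(room=ctx.room, agent=host)


def main():
    """Register as the named worker and wait for dispatch (dev/start come from argv)."""
    from livekit.agents import WorkerOptions, cli

    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint, agent_name="sponic-dinner-host"))
```

In a file like agent.py, calling main() under an `if __name__ == "__main__":` guard would let `uv run python agent.py dev` or `start` select the run mode from the command line.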

Component choices & why

Stage | Choice | Why
Transport | LiveKit Cloud (SFU) | Handles WebRTC plumbing, multi-participant rooms, and a hosted SFU we don't have to operate. Free tier covers Phase 1 (well under the 50 connection-min/mo dev limit).
Realtime model | Gemini Live via livekit.plugins.google.realtime | Native audio in/out, multilingual speech, interruption handling, and one billing line for the core host session.
Persona prompts | Supabase prompt library | Runtime prompts are loaded from public.prompts by stable prompt names/codes, so prompt tuning does not require redeploying the worker.
Avatar | Simli (optional) | Runs only when the event has a face/avatar configuration. Cost is per active avatar minute, independent of how many humans are in the room.
Compute | Oracle Phoenix VM | Self-hosted worker, already paid. This avoids LiveKit-hosted agent-session metering while preserving LiveKit Cloud rooms.
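
To illustrate the prompt-library row: a minimal sketch of how a worker could read a persona from public.prompts through Supabase's REST interface. The column names name and body, the environment variable names SUPABASE_URL and SUPABASE_SERVICE_KEY, and both function names are assumptions for illustration. The actual schema and loader in agent.py may differ.

```python
import json
import os
import urllib.parse
import urllib.request


def prompt_url(base_url: str, prompt_name: str) -> str:
    """Build a REST query for one row of the prompts table (hypothetical columns)."""
    query = urllib.parse.urlencode(
        {"name": f"eq.{prompt_name}", "select": "body", "limit": "1"}
    )
    return f"{base_url.rstrip('/')}/rest/v1/prompts?{query}"


def fetch_prompt(prompt_name: str, fallback: str) -> str:
    """Return the stored prompt text, or the fallback if no row matches."""
    key = os.environ["SUPABASE_SERVICE_KEY"]  # hypothetical variable name
    request = urllib.request.Request(
        prompt_url(os.environ["SUPABASE_URL"], prompt_name),  # hypothetical variable name
        headers={"apikey": key, "Authorization": f"Bearer {key}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        rows = json.load(response)
    return rows[0]["body"] if rows else fallback
```

Because the prompt is read when a session starts, editing the row changes the next session without touching the deployed worker, which is the property the table describes.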

File map

Path | Purpose
apps/ai-host/agent.py | The worker entrypoint. Wires Gemini Live native audio into an AgentSession, loads prompt-library personas, records transcripts, starts optional Simli, and registers with LiveKit via cli.run_app(WorkerOptions(...)).
apps/ai-host/persona.py | Prompt/dossier helpers used by the database-managed persona prompts. Internal persona names still include dinner_host and tea_party for compatibility.
apps/ai-host/pyproject.toml | Declares Python 3.12 + livekit-agents 1.x + the Google realtime plugin. Use uv sync to install.
apps/ai-host/gemini_tts.py | Historical custom TTS wrapper. Kept for reference, but production now uses Gemini Live native audio through agent.py.
apps/ai-host/.env.example | Template for runtime credentials. Copy to .env and fill in. Never commit .env.
apps/ai-host/README.md | Quick-start guide (this doc is the longer-form reference; the README is the 5-minute version).

First-time setup

Do this once per machine. After it's done, you never repeat any of these steps — the day-to-day workflow in the next section is just a single command.

Step 1 — Install runtime

You need Python 3.12 and uv (the package manager). On macOS:

# Python 3.12
pyenv install 3.12          # if you use pyenv (recommended)
# or: brew install python@3.12

# uv (manages venv + lockfile)
brew install uv             # macOS
# Linux/Ubuntu: curl -LsSf https://astral.sh/uv/install.sh | sh

# Verify
python3 --version           # → Python 3.12.x
uv --version                # → uv 0.x.y

Step 2 — Provision three service accounts

Sign up for the three services below and grab keys from each. Save every credential to the Bitwarden collection DevOps-sponicgarden (in the ALPU.CA org; it is a collection, not a personal folder, so bw create item requires organizationId + collectionIds). The project never treats .env as the only copy of a key: BW is the source of truth.

BW items already provisioned as of 2026-05-06:

Service | Sign-up | What to grab | BW item name
LiveKit Cloud (WebRTC transport) | cloud.livekit.io | Create project named sponic. From Settings → Keys, copy three values: project URL (wss://sponic-XXXX.livekit.cloud), API key (starts with API), API secret. | LiveKit Cloud — sponic
Google AI (Gemini Live) (native audio model) | aistudio.google.com/app/apikey | Reuse the existing Sponic Gemini key (already in BW). No separate provisioning needed; the same key powers live translation and Gemini Live. Default model: gemini-3.1-flash-live-preview; default voice: Puck. | Google Gemini — SponicGardens (USE THIS)
Simli (optional avatar) | simli.com | Only required for visual avatar sessions. The worker tracks avatar minutes separately with SIMLI_USD_PER_MINUTE. | existing Simli key on Phoenix

Step 3 — Install dependencies

cd apps/ai-host
uv sync          # creates .venv/, installs livekit-agents 1.x + all plugins
                 # takes ~30 s on a warm cache, ~3 min from scratch

Step 4 — Configure .env

cp .env.example .env
# Open .env in your editor and paste these from the BW items above:
#   LIVEKIT_URL=wss://sponic-XXXX.livekit.cloud
#   LIVEKIT_API_KEY=API...
#   LIVEKIT_API_SECRET=...
#   GOOGLE_AI_API_KEY=AIza...
#   GEMINI_MODEL=gemini-3.1-flash-live-preview
#   GEMINI_VOICE=Puck
Never commit .env. The .gitignore excludes it, but if you rename it or add a sibling file (.env.local, etc.) the rule may not catch it. Keep credentials in BW; .env is just a runtime cache.
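
A worker can fail fast on a half-filled .env instead of failing later inside a room. The sketch below shows one way to validate the variables listed above and apply the documented model and voice defaults. HostSettings and load_settings are hypothetical names, not part of agent.py. Treating the two Gemini variables as optional, and requiring a wss:// project URL, are assumptions drawn from the defaults and example values in this section.

```python
import os
from dataclasses import dataclass

# Variables the worker cannot run without (names taken from the .env template above).
REQUIRED = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET", "GOOGLE_AI_API_KEY")


@dataclass(frozen=True)
class HostSettings:
    livekit_url: str
    livekit_api_key: str
    livekit_api_secret: str
    google_ai_api_key: str
    gemini_model: str = "gemini-3.1-flash-live-preview"
    gemini_voice: str = "Puck"


def load_settings(env=None) -> HostSettings:
    """Validate required variables and fill in the documented defaults."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    if not env["LIVEKIT_URL"].startswith("wss://"):
        raise RuntimeError("LIVEKIT_URL should be the wss:// project URL")
    return HostSettings(
        livekit_url=env["LIVEKIT_URL"],
        livekit_api_key=env["LIVEKIT_API_KEY"],
        livekit_api_secret=env["LIVEKIT_API_SECRET"],
        google_ai_api_key=env["GOOGLE_AI_API_KEY"],
        gemini_model=env.get("GEMINI_MODEL") or HostSettings.gemini_model,
        gemini_voice=env.get("GEMINI_VOICE") or HostSettings.gemini_voice,
    )
```

Calling load_settings() once at startup turns a missing key into a single readable error that names the variable.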

Step 5 — First-run smoke test

Confirm the install worked without going near a microphone:

cd apps/ai-host
uv run python -c "from persona import DINNER_HOST_PROMPT, build_prompt; print(f'Persona loaded: {len(build_prompt())} chars')"
# → Persona loaded: 1234 chars

uv run python -c "import ast; ast.parse(open('agent.py').read()); print('agent.py syntax OK')"
# → agent.py syntax OK

If both lines succeed, you're set up. Move to Normal operation.

Normal operation (day-to-day)

Once setup is done, every working session collapses to this:

Start the agent

cd apps/ai-host
uv run python agent.py dev

You should see:

[INFO] connecting to wss://sponic-XXXX.livekit.cloud
[INFO] registered worker: dinner-host-1
[INFO] waiting for room dispatch...

The dev subcommand runs in foreground with hot-reload on file changes — edit persona.py and the worker restarts itself. Use this for everything except production.

The worker is now idle, waiting for someone to join a LiveKit room. It costs nothing while idle (no API calls, no tokens).

Test in the LiveKit Playground

The official Playground is the day-to-day way to talk to your agent — no client code, just a browser tab.

1. Open agents-playground.livekit.io in a fresh tab. Paste your project URL, API key, and API secret (same values as in .env). The Playground caches them in localStorage, so this is one-time per browser.

2. Click Connect. Your worker terminal logs agent joined room within a second; the browser asks for mic permission.

3. You should hear the smoke-test greeting: "Welcome to Sponic Gardens! I'm your Agentic Gathering host tonight." If you hear it, the whole pipeline is healthy.

4. Talk to it. Try English ("What's for the Agentic Gathering?"), then Polish ("Skąd jesteś?"). Replies should arrive in ~1–1.5 s and match the language you spoke.

5. Hit Disconnect when done. The worker logs room closed and goes idle, ready for the next session. Leave it running while you iterate on persona.py.

Stopping

Just Ctrl-C the uv run terminal. There's no state to clean up — the worker is stateless between rooms.

Iterating on the persona

Edit apps/ai-host/persona.py while the worker runs in dev mode. The worker auto-reloads. Reconnect from the Playground (or refresh the page) to test the new prompt — existing rooms keep the old persona until you reconnect.

Updating dependencies

cd apps/ai-host
uv sync --upgrade        # pull latest livekit-agents, plugins, etc.
uv run python -c "from livekit.agents import Agent; print('still works')"

Testing checklist

When validating a build (yours or after a pull), check these in order. They're listed cheapest-to-most-expensive in time, so a failure at any step short-circuits the rest.

Static checks (no creds, no mic)

Live checks (Playground)

Production operation (deploy targets, monitoring)

Everything above assumes you're running locally during dev. To leave the agent running for an Agentic Gathering (or for hands-off staging), you need a host that stays up.

Phase 1 only needs a way to keep one Python worker process running — no public ingress, no DNS, no scaling story. Three reasonable hosts depending on stage:

Where | Use | Pros / cons
M4 (laptop) | Day-to-day dev & one-off demos | Free, instant feedback. Dies when the lid closes, so not for "always on".
Oracle Phoenix | Always-on staging | Already paid (Always Free tier). 4 ARM cores, 24 GB RAM, plenty of headroom alongside the prompt-runner. Outbound-only (worker connects out to LiveKit) so no firewall changes. Recipe in infra/runbook.md → "SponicControl prompt-runner on Oracle Phoenix" (same systemd-service pattern).
ALPUCA (M4 Mac Mini, Poland) | Low-latency for actual Agentic Gatherings | ~150 ms closer to Polish guests than Phoenix. Already runs the live-translation backend. Trade-off: home internet up = critical, single point of failure.

Production-ish run (Phoenix)

Once the agent is stable enough to leave running:

# On Phoenix:
sudo apt install -y python3.12 python3.12-venv     # 22.04 needs the deadsnakes PPA
curl -LsSf https://astral.sh/uv/install.sh | sh
git clone <repo> ~/sponic         # or scp the apps/ai-host subdir
cd ~/sponic/apps/ai-host
uv sync
cp .env.example .env && nano .env                  # paste creds

# systemd unit (mirror the pattern from prompt-runner.service):
sudo nano /etc/systemd/system/sponic-ai-host.service
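# [Unit]
# After=network-online.target
# Wants=network-online.target
#
# [Service]
# WorkingDirectory=/home/ubuntu/sponic/apps/ai-host
# ExecStart=/home/ubuntu/.local/bin/uv run python agent.py start
# Restart=always
# EnvironmentFile=/home/ubuntu/sponic/apps/ai-host/.env
#
# [Install]                       # needed for "systemctl enable" to start the unit at boot
# WantedBy=multi-user.target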

sudo systemctl daemon-reload
sudo systemctl enable --now sponic-ai-host
sudo journalctl -fu sponic-ai-host                  # tail logs
Use agent.py start in production, not dev. Dev mode auto-restarts on file changes, which is useful locally but dangerous on a host.

Monitoring

Cost

Current production costs are session-based: Gemini Live for the host, LiveKit participant minutes for connected clients, and optional Simli avatar minutes.

10-minute scenario | Humans | Simli display | Estimated cost
Interview | 1 | No | $0.24
Agentic Gathering | 4 | No | $0.25
Agentic Gathering | 10 | No | $0.28
Agentic Gathering | 4 | Yes | $3.26
Agentic Gathering | 10 | Yes | $3.29

Assumptions: Gemini Live $0.005/min audio input + $0.018/min audio output, LiveKit WebRTC $0.0005/participant-min, Simli $0.30/min, and one extra display participant when Simli is enabled. The Phoenix worker is self-hosted, so LiveKit-hosted agent-session minutes are not included.
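
The table figures can be reproduced from the stated rates. The sketch below is one reading of them: Gemini Live input and output are both billed for the full session length, the self-hosted worker is not counted as a LiveKit participant, and totals are rounded half-up to cents. Those three points are inferred from the table rather than stated in it, and the name estimate_cost is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-minute rates from the assumptions above (USD).
GEMINI_AUDIO_IN = Decimal("0.005")
GEMINI_AUDIO_OUT = Decimal("0.018")
LIVEKIT_PER_PARTICIPANT = Decimal("0.0005")
SIMLI_AVATAR = Decimal("0.30")


def estimate_cost(minutes: int, humans: int, simli: bool = False) -> Decimal:
    """Estimate the session cost in USD, rounded to cents."""
    # Simli adds one extra display participant to the room.
    participants = humans + (1 if simli else 0)

    total = minutes * (GEMINI_AUDIO_IN + GEMINI_AUDIO_OUT)
    total += minutes * participants * LIVEKIT_PER_PARTICIPANT
    if simli:
        total += minutes * SIMLI_AVATAR

    # Decimal avoids float drift on values like 0.235 and 3.255.
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for humans, simli in [(1, False), (4, False), (10, False), (4, True), (10, True)]:
        print(humans, simli, estimate_cost(10, humans, simli))
```

For a 10-minute gathering with 4 humans and Simli enabled this gives 0.23 (Gemini) + 0.025 (five LiveKit participants) + 3.00 (Simli) = 3.255, which rounds to the $3.26 in the table. The avatar dominates the total; adding humans barely moves it.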

Troubleshooting

Symptom: Worker connects but never gets dispatched
Likely cause: Project URL mismatch; the worker is on a different LiveKit project than the Playground.
Fix: Confirm LIVEKIT_URL in .env matches the URL pasted in the Playground.

Symptom: "401 Unauthorized" from LiveKit
Likely cause: API key/secret swapped or stale.
Fix: Re-copy from Settings → Keys; secrets are only shown once at creation, so you may need a fresh key.

Symptom: No greeting on connect, but worker shows "joined room"
Likely cause: Gemini Live model/voice is unknown, or GOOGLE_AI_API_KEY is missing/invalid.
Fix: Reset GEMINI_MODEL to gemini-3.1-flash-live-preview and GEMINI_VOICE to Puck. Verify the key with: curl -s "https://generativelanguage.googleapis.com/v1beta/models?key=$GOOGLE_AI_API_KEY" | head (should return a JSON list of models, not a 401).

Symptom: Latency > 3 s
Likely cause: Gemini Live response delay, overloaded worker, or network path issue.
Fix: Check Google AI Studio status/quota, Phoenix CPU/memory, and /var/log/sponic-ai-host.log. If the delay grows with context, reduce prompt size or restart the session.

Symptom: Replies are in English no matter what
Likely cause: Persona prompt was edited and the multilingual rule got cut.
Fix: Re-check persona.py for the "Reply in the language of the most recent speaker" line.

Symptom: Replies are 5 paragraphs long
Likely cause: Gemini Live is ignoring a soft persona constraint.
Fix: The 1–2 sentence cap is in the prompt library/persona prompt; tighten the wording (try "Never exceed 30 words.").

Symptom: uv sync fails on macOS arm64
Likely cause: Apple Silicon needs Python 3.12+ for some plugins.
Fix: Confirm python3 --version is 3.12.x. pyenv install 3.12 if not.

Symptom: Worker dies silently after a minute
Likely cause: Missing Gemini key, bad LiveKit credentials, or unhandled plugin error.
Fix: Check /var/log/sponic-ai-host.log, then restart sponic-ai-host.service after fixing env.
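
The two prompt-related symptoms above (English-only replies and overlong replies) can be checked without a live session. The following is a minimal sketch of such a check, assuming the prompt text is available as a plain string. The helper names missing_rules and exceeds_word_cap are hypothetical, and the 30-word cap is the example wording suggested above rather than a confirmed production setting.

```python
# Rule wording quoted from the troubleshooting entry above.
REQUIRED_RULES = {
    "multilingual": "Reply in the language of the most recent speaker",
}


def missing_rules(prompt: str) -> list[str]:
    """Return the names of required rules whose wording is absent from the prompt."""
    lowered = prompt.lower()
    return [
        name
        for name, phrase in REQUIRED_RULES.items()
        if phrase.lower() not in lowered
    ]


def exceeds_word_cap(reply: str, cap: int = 30) -> bool:
    """True when a reply has more whitespace-separated words than the cap."""
    return len(reply.split()) > cap
```

Running missing_rules against the prompt after each edit would catch a dropped multilingual line, and exceeds_word_cap can be applied to captured transcripts to spot replies that drift past the cap.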

Phase 1 vs the long roadmap

This document tracks the active Gemini Live implementation. The older 8-phase plan remains useful as history, but its cascade model/cost estimates are superseded by this page and the Phoenix runbook.

Phase | Adds | Status
1: Native-audio agent | Gemini Live, single participant, smoke-test greeting | Shipped 2026-05-13
2: Multi-mic subscription | Subscribe to all participants in the room, label utterances by speaker | Pending
3: Intervention control | Prompt/tool tuning for when the host should speak versus stay quiet | In progress
4: Programmatic prompts | Data-channel API so the host can deliver scheduled toasts and table announcements on cue | Pending
5: Simli avatar | Live avatar on a venue display, driven by the Gemini Live audio output. | Shipped
6: Translation-app bridge | Subscribe to ALPUCA's existing wss://subs.sponicgardens.com/subtitles?lang=... for transcripts, eliminating duplicate STT cost | Pending
7: Persistence & memory | Per-guest memory across Agentic Gatherings (likes, dietary, prior conversations) | Pending
8: Phone interviewer | Phone-callable host for prospect screening | Pending (extend Vapi, don't rebuild)
Phase 8 caveat: the dev task doc still describes building LiveKit SIP from scratch for the phone interviewer. Sponic already has a working Vapi integration in production at apps/control/supabase/functions/vapi-server/index.ts (caller-ID lookup, role-based prompts, smart-home permission routing) and an admin UI at apps/control/public/spaces/admin/voice.html. When Phase 8 lands, the right move is to extend Vapi — provision a Sponic-owned Vapi account, reuse the assistant-request handler, plug in the same persona — not rebuild the phone path on LiveKit. Estimated savings: ~5 days vs ~10 days. Update the dev task doc before Phase 8 starts.