Reference · AI Host

Agentic Gathering Host

A speaking AI companion that joins Sponic Agentic Gatherings as the host's wing-friend — listens, replies in the speaker's language, and runs the conversational glue. Production now uses LiveKit Agents + Gemini Live native audio, orchestrated by the self-hosted Phoenix worker.

Prepared 2026-05-06 · Sponic Gardens · Sibling of Live Translation Subtitles

Current model path (2026-05-13): production uses livekit.plugins.google.realtime.RealtimeModel with gemini-3.1-flash-live-preview for native audio in/out. The older Deepgram → Claude → ElevenLabs/Gemini-TTS cascade is historical; apps/ai-host/gemini_tts.py remains in the tree but is not the active runtime.

Phase 1 — shipped · Pre-credentials · Local-only run mode

Contents

What it is & what it does
How it works (Phase 1 architecture)
File map
First-time setup
Normal operation (day-to-day)
Testing checklist
Production operation (deploy targets, monitoring)
Cost
Troubleshooting
Phase 1 vs the long roadmap

What it is & what it does

The Agentic Gathering Host is a real-time voice agent designed to sit at the head of a Sponic Gardens gathering and play the role of a warm, curious, slightly playful host. It joins a LiveKit room, receives participant audio, sends it through Gemini Live native audio, and publishes spoken responses back into the same room. Optionally, Simli renders the host as a live avatar on the venue display.

It is not a generic assistant. The persona is constrained to the Agentic Gathering register: 1–2 sentence replies, no medical/legal/financial advice, no politics or religion unless the table clearly invites it, no pretending to eat or drink.

The active production worker is apps/ai-host/agent.py on Oracle Phoenix, registered as the named LiveKit worker sponic-dinner-host. The historical dev task still documents how the stack evolved, but current cost/model planning should use Gemini Live.

Billing model: LiveKit room cost scales by connected participant minutes. Gemini Live and Simli are room/session costs for the host/avatar, not a separate AI charge per human. The Phoenix worker is self-hosted, so LiveKit Cloud agent-session minutes do not apply unless we move the worker to LiveKit Cloud hosting.

How it works (Phase 1 architecture)

┌──────────────────────────────────────────────────────────────────┐
│                       LiveKit Cloud (SFU)                         │
│       wss://sponic-XXXX.livekit.cloud  ·  WebRTC transport        │
└──────┬──────────────────────────────────────────────┬─────────────┘
       │  guest mic audio                              │  agent voice
       ▼                                               │
┌──────────────────────┐                               │
│  Browser / phone     │  (LiveKit Playground for       │
│  in the LiveKit room │   testing; later a venue tab    │
│                      │   or a Sponic intranet page)    │
└──────────────────────┘                               │
                                                       │
                                          ┌────────────▼──────────────┐
                                          │  Python agent worker       │
                                          │  apps/ai-host/agent.py     │
                                          │                            │
                                          │  Gemini Live native audio  │  speech ↔ speech
                                          │  gemini-3.1-flash-live-    │  RealtimeModel
                                          │  preview · voice "Puck"    │
                                          │      │                     │
                                          │      ├─ transcript capture │  for Supabase
                                          │      │                     │
                                          │      └─ optional Simli     │  avatar track
                                          └────────────────────────────┘

The agent worker is a long-lived Python process that registers with LiveKit Cloud as a named worker and gets dispatched into a room when one is created. It attaches an AgentSession backed by Gemini Live native audio. The session handles speech input, reasoning, and spoken output in one realtime model call; the worker captures transcript metadata for Supabase and can attach a Simli avatar session that subscribes to the generated audio.
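
The wiring described above can be sketched roughly as follows. This is a minimal sketch, not the contents of `apps/ai-host/agent.py`: the `HostSettings` dataclass, the `load_settings` helper, the placeholder instruction strings, and the greeting call are illustrative assumptions. It also assumes the `livekit-agents` 1.x API with the Google plugin installed. The LiveKit imports are deferred so the configuration part can be exercised without the SDK.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSettings:
    """Hypothetical container for the model settings named in this doc."""
    model: str
    voice: str
    agent_name: str


def load_settings(env=None) -> HostSettings:
    """Read model and voice from the environment, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return HostSettings(
        model=env.get("GEMINI_MODEL", "gemini-3.1-flash-live-preview"),
        voice=env.get("GEMINI_VOICE", "Puck"),
        agent_name="sponic-dinner-host",
    )


async def entrypoint(ctx):
    """Runs once per dispatched room; ctx is expected to be a livekit.agents.JobContext."""
    # Deferred imports: only needed once a room is actually dispatched.
    from livekit.agents import Agent, AgentSession
    from livekit.plugins import google

    settings = load_settings()
    await ctx.connect()
    session = AgentSession(
        llm=google.realtime.RealtimeModel(
            model=settings.model,
            voice=settings.voice,
            api_key=os.environ["GOOGLE_AI_API_KEY"],
        ),
    )
    # Placeholder instructions: the real worker loads its persona from the prompt library.
    await session.start(room=ctx.room, agent=Agent(instructions="You are the gathering host."))
    await session.generate_reply(instructions="Greet the table briefly.")


def main():
    """Register with LiveKit as a named worker and wait for dispatch."""
    from livekit.agents import WorkerOptions, cli

    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint, agent_name=load_settings().agent_name))
```

Calling `main()` from a script run with the `dev` or `start` subcommand would register the worker. Transcript capture and the optional Simli session are left out of the sketch.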

Component choices & why

Stage	Choice	Why
Transport	LiveKit Cloud (SFU)	Handles WebRTC plumbing, multi-participant rooms, and a hosted SFU we don't have to operate. Free tier covers Phase 1 (well under the 50 connection-min/mo dev limit).
Realtime model	Gemini Live via `livekit.plugins.google.realtime`	Native audio in/out, multilingual speech, interruption handling, and one billing line for the core host session.
Persona prompts	Supabase prompt library	Runtime prompts are loaded from `public.prompts` by stable prompt names/codes, so prompt tuning does not require redeploying the worker.
Avatar	Simli optional	Runs only when the event has a face/avatar configuration. Cost is per active avatar minute, independent of how many humans are in the room.
Compute	Oracle Phoenix VM	Self-hosted worker, already paid. This avoids LiveKit-hosted agent-session metering while preserving LiveKit Cloud rooms.

File map

Path	Purpose
`apps/ai-host/agent.py`	The worker entrypoint. Wires Gemini Live native audio into an `AgentSession`, loads prompt-library personas, records transcripts, starts optional Simli, and registers with LiveKit via `cli.run_app(WorkerOptions(...))`.
`apps/ai-host/persona.py`	Prompt/dossier helpers used by the database-managed persona prompts. Internal persona names still include `dinner_host` and `tea_party` for compatibility.
`apps/ai-host/pyproject.toml`	Declares Python 3.12 + `livekit-agents 1.x` + the Google realtime plugin. Use `uv sync` to install.
`apps/ai-host/gemini_tts.py`	Historical custom TTS wrapper. Kept for reference, but production now uses Gemini Live native audio through `agent.py`.
`apps/ai-host/.env.example`	Template for runtime credentials. Copy to `.env` and fill in. Never commit `.env`.
`apps/ai-host/README.md`	Quick-start guide (this doc is the longer-form reference; the README is the 5-minute version).

First-time setup

Do this once per machine. After it's done, you never repeat any of these steps — the day-to-day workflow in the next section is just a single command.

Step 1 — Install runtime

You need Python 3.12 and uv (the package manager). On macOS:

# Python 3.12
pyenv install 3.12          # if you use pyenv (recommended)
# or: brew install python@3.12

# uv (manages venv + lockfile)
brew install uv             # macOS
# Linux/Ubuntu: curl -LsSf https://astral.sh/uv/install.sh | sh

# Verify
python3 --version           # → Python 3.12.x
uv --version                # → uv 0.x.y

Step 2 — Provision three service accounts

Sign up for the three services in the table below and grab keys from each (Simli is only needed for avatar sessions). Save every credential to the Bitwarden collection DevOps-sponicgarden (in the ALPU.CA org; it is a collection, not a personal folder, so bw create item requires organizationId + collectionIds). The project never reads keys from .env alone: BW is the source of truth.

BW items already provisioned as of 2026-05-06:

LiveKit Cloud — BW item 64d75f73-474a-4312-89e4-b442015302d8
Google AI (Gemini) — BW item c4b16931-335b-457a-9910-b416006d3b8c ("Google Gemini — SponicGardens") (key is in login.password, not in custom fields)

Service	Sign-up	What to grab	BW item name
LiveKit Cloud (WebRTC transport)	cloud.livekit.io	Create project named `sponic`. From Settings → Keys, copy three values: project URL (`wss://sponic-XXXX.livekit.cloud`), API key (starts with `API`), API secret.	`LiveKit Cloud — sponic`
Google AI (Gemini Live) (native audio model)	aistudio.google.com/app/apikey	Reuse the existing Sponic Gemini key (already in BW). No separate provisioning needed — the same key powers live translation and Gemini Live. Default model: `gemini-3.1-flash-live-preview`; default voice: `Puck`.	`Google Gemini — SponicGardens (USE THIS)`
Simli (optional avatar)	simli.com	Only required for visual avatar sessions. The worker tracks avatar minutes separately with `SIMLI_USD_PER_MINUTE`.	existing Simli key on Phoenix

Step 3 — Install dependencies

cd apps/ai-host
uv sync          # creates .venv/, installs livekit-agents 1.x + all plugins
                 # takes ~30 s on a warm cache, ~3 min from scratch

Step 4 — Configure `.env`

cp .env.example .env
# Open .env in your editor and paste these from the BW items above:
#   LIVEKIT_URL=wss://sponic-XXXX.livekit.cloud
#   LIVEKIT_API_KEY=API...
#   LIVEKIT_API_SECRET=...
#   GOOGLE_AI_API_KEY=AIza...
#   GEMINI_MODEL=gemini-3.1-flash-live-preview
#   GEMINI_VOICE=Puck

Never commit .env. The .gitignore excludes it, but if you rename it or add a sibling file (.env.local, etc.) the rule may not catch it. Keep credentials in BW; .env is just a runtime cache.
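
Before starting the worker, it can help to confirm that the `.env` file actually contains the values listed above. The following is a rough sketch under stated assumptions: the helper names `parse_env` and `env_problems` are hypothetical, the required-key list is taken from the template comments above, and the parser only handles plain `KEY=VALUE` lines.

```python
REQUIRED_KEYS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "GOOGLE_AI_API_KEY",
)


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def env_problems(values: dict) -> list:
    """Return human-readable problems; an empty list means the file looks usable."""
    problems = [f"{key} is missing or empty" for key in REQUIRED_KEYS if not values.get(key)]
    url = values.get("LIVEKIT_URL", "")
    if url and not url.startswith("wss://"):
        problems.append("LIVEKIT_URL should start with wss://")
    return problems
```

Passing the text of `.env` through `parse_env` and then `env_problems` returns an empty list when the four required values are present. `GEMINI_MODEL` and `GEMINI_VOICE` are not checked because the doc gives defaults for both.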

Step 5 — First-run smoke test

Confirm the install worked without going near a microphone:

cd apps/ai-host
uv run python -c "from persona import DINNER_HOST_PROMPT, build_prompt; print(f'Persona loaded: {len(build_prompt())} chars')"
# → Persona loaded: <N> chars   (N varies with the current prompt)

uv run python -c "import ast; ast.parse(open('agent.py').read()); print('agent.py syntax OK')"
# → agent.py syntax OK

If both lines succeed, you're set up. Move to Normal operation.

Normal operation (day-to-day)

Once setup is done, every working session collapses to this:

Start the agent

cd apps/ai-host
uv run python agent.py dev

You should see:

[INFO] connecting to wss://sponic-XXXX.livekit.cloud
[INFO] registered worker: dinner-host-1
[INFO] waiting for room dispatch...

The dev subcommand runs in foreground with hot-reload on file changes — edit persona.py and the worker restarts itself. Use this for everything except production.

The worker is now idle, waiting for someone to join a LiveKit room. It costs nothing while idle (no API calls, no tokens).

Test in the LiveKit Playground

The official Playground is the day-to-day way to talk to your agent — no client code, just a browser tab.

Open agents-playground.livekit.io in a fresh tab. Paste your project URL, API key, and API secret (same values as in .env). The Playground caches them in localStorage, so this is one-time per browser.

Click Connect. Your worker terminal logs agent joined room within a second; the browser asks for mic permission.

You should hear the smoke-test greeting: "Welcome to Sponic Gardens! I'm your Agentic Gathering host tonight." If you hear it, the whole pipeline is healthy.

Talk to it. Try English ("What's for the Agentic Gathering?"), then Polish ("Skąd jesteś?"). Replies should arrive in ~1–1.5 s and match the language you spoke.

Hit Disconnect when done. The worker logs room closed and goes idle — ready for the next session. Leave it running while you iterate on persona.py.

Stopping

Just Ctrl-C the uv run terminal. There's no state to clean up — the worker is stateless between rooms.

Iterating on the persona

Edit apps/ai-host/persona.py while the worker runs in dev mode. The worker auto-reloads. Reconnect from the Playground (or refresh the page) to test the new prompt — existing rooms keep the old persona until you reconnect.

Updating dependencies

cd apps/ai-host
uv sync --upgrade        # pull latest livekit-agents, plugins, etc.
uv run python -c "from livekit.agents import Agent; print('still works')"

Testing checklist

When validating a build (yours or after a pull), check these in order. They're listed cheapest-to-most-expensive in time, so a failure at any step short-circuits the rest.

Static checks (no creds, no mic)

uv run python -c "import ast; ast.parse(open('agent.py').read())" — syntax
uv run python -c "import ast; ast.parse(open('persona.py').read())" — syntax
uv run python -c "from persona import DINNER_HOST_PROMPT, build_prompt" — imports + exports
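
The two syntax checks above can also be run in one pass. This is a minimal sketch using only the standard library; the function name `syntax_errors` is hypothetical and it does not replace the import check on `persona.py`.

```python
import ast
from pathlib import Path


def syntax_errors(paths):
    """Return {path: message} for files that are missing or fail to parse."""
    failures = {}
    for path in paths:
        try:
            ast.parse(Path(path).read_text(encoding="utf-8"), filename=str(path))
        except (OSError, SyntaxError) as exc:
            failures[str(path)] = f"{type(exc).__name__}: {exc}"
    return failures
```

Run from `apps/ai-host`, `syntax_errors(["agent.py", "persona.py"])` returns an empty dict when both files parse.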

Live checks (Playground)

Connection: worker logs registered worker within ~3 s of starting
Dispatch: worker logs agent joined room within ~1 s of clicking Connect in Playground
Greeting plays: hearing the smoke-test sentence proves LiveKit + Gemini Live are wired
Latency is acceptable on a short reply (the Playground walkthrough expects ~1–1.5 s). If it is noticeably higher, check Google AI Studio status and the Phoenix worker logs.
Multilingual works: Polish in → Polish out. Spanish in → Spanish out. If everything comes back English, the persona prompt got mangled.
Tone is right: 1–2 sentence replies, warm/curious. If it's lecturing, tighten persona.py.
No crashes through 20+ consecutive turns. Memory should stay flat.
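
The tone check is a judgment call, but the length half of it can be approximated offline against captured reply text. This is a rough sketch, assuming replies are available as plain strings: the function name is hypothetical, the two-sentence limit comes from the persona rule, the 30-word limit from the troubleshooting suggestion, and the sentence splitting is deliberately naive.

```python
import re


def reply_too_long(reply: str, max_sentences: int = 2, max_words: int = 30) -> bool:
    """Flag a reply that exceeds the sentence cap or the word cap."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?…])\s+", reply.strip()) if s]
    words = reply.split()
    return len(sentences) > max_sentences or len(words) > max_words
```

The smoke-test greeting passes this check (two sentences, well under 30 words). Abbreviations and languages with other sentence punctuation will confuse the splitter, so treat a flag as a prompt to listen again rather than a verdict.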

Production operation (deploy targets, monitoring)

Everything above assumes you're running locally during dev. To leave the agent running for an Agentic Gathering (or for hands-off staging), you need a host that stays up.

Phase 1 only needs a way to keep one Python worker process running — no public ingress, no DNS, no scaling story. Three reasonable hosts depending on stage:

Where	Use	Pros / cons
M4 (laptop)	Day-to-day dev & one-off demos	Free, instant feedback. Dies when the lid closes — not for "always on".
Oracle Phoenix	Always-on staging	Already paid (Always Free tier). 4 ARM cores, 24 GB RAM, plenty of headroom alongside the prompt-runner. Outbound-only (worker connects out to LiveKit) so no firewall changes. Recipe in infra/runbook.md → "SponicControl prompt-runner on Oracle Phoenix" — same systemd-service pattern.
ALPUCA (M4 Mac Mini, Poland)	Low-latency for actual Agentic Gatherings	~150 ms closer to Polish guests than Phoenix. Already runs the live-translation backend. Trade-off: it depends on the home internet connection staying up, which makes it a single point of failure.

Production-ish run (Phoenix)

Once the agent is stable enough to leave running:

# On Phoenix:
sudo apt install -y python3.12 python3.12-venv     # 22.04 needs the deadsnakes PPA
curl -LsSf https://astral.sh/uv/install.sh | sh
git clone <repo> ~/sponic         # or scp the apps/ai-host subdir
cd ~/sponic/apps/ai-host
uv sync
cp .env.example .env && nano .env                  # paste creds

# systemd unit (mirror the pattern from prompt-runner.service):
sudo nano /etc/systemd/system/sponic-ai-host.service
# [Service]
# WorkingDirectory=/home/ubuntu/sponic/apps/ai-host
# ExecStart=/home/ubuntu/.local/bin/uv run python agent.py start
# Restart=always
# EnvironmentFile=/home/ubuntu/sponic/apps/ai-host/.env
#
# [Install]
# WantedBy=multi-user.target    # needed for `systemctl enable` to start it at boot

sudo systemctl daemon-reload
sudo systemctl enable --now sponic-ai-host
sudo journalctl -fu sponic-ai-host                  # tail logs

Use agent.py start in production, not dev. The dev mode auto-restarts on file changes — useful locally, dangerous on a host.

Monitoring

LiveKit Cloud dashboard — rooms, participants, connection minutes used. cloud.livekit.io
Google AI Studio dashboard — Gemini Live usage and quota. Same key as live translation, so usage is shared. aistudio.google.com
Simli dashboard — avatar session minutes, if avatar display is enabled.
journalctl if running under systemd. Worker logs to stdout.

Cost

Current production costs are session-based: Gemini Live for the host, LiveKit participant minutes for connected clients, and optional Simli avatar minutes.

10-minute scenario	Humans	Simli display	Estimated cost
Interview	1	No	$0.24
Agentic Gathering	4	No	$0.25
Agentic Gathering	10	No	$0.28
Agentic Gathering	4	Yes	$3.26
Agentic Gathering	10	Yes	$3.29

Assumptions: Gemini Live $0.005/min audio input + $0.018/min audio output, LiveKit WebRTC $0.0005/participant-min, Simli $0.30/min, and one extra display participant when Simli is enabled. The Phoenix worker is self-hosted, so LiveKit-hosted agent-session minutes are not included.
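
The table rows can be reproduced from the stated rates. The sketch below is illustrative only: the function name is hypothetical, and it infers two things the assumptions do not spell out, namely that only humans (plus the display participant when Simli is on) count as LiveKit participants, and that totals are rounded half-up to the cent.

```python
from decimal import Decimal, ROUND_HALF_UP

GEMINI_AUDIO_IN = Decimal("0.005")    # USD per minute, from the assumptions above
GEMINI_AUDIO_OUT = Decimal("0.018")   # USD per minute
LIVEKIT_PARTICIPANT = Decimal("0.0005")  # USD per participant-minute
SIMLI_AVATAR = Decimal("0.30")        # USD per avatar minute


def estimate_cost(minutes, humans: int, simli: bool = False) -> Decimal:
    """Estimate session cost in USD, rounded half-up to the cent."""
    # Inferred: the self-hosted worker is not counted as a billed participant.
    participants = humans + (1 if simli else 0)
    per_minute = GEMINI_AUDIO_IN + GEMINI_AUDIO_OUT + LIVEKIT_PARTICIPANT * participants
    if simli:
        per_minute += SIMLI_AVATAR
    total = per_minute * Decimal(str(minutes))
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Under these assumptions `estimate_cost(10, 4)` gives 0.25 and `estimate_cost(10, 4, simli=True)` gives 3.26, matching the table. `Decimal` is used so that values such as 0.235 round up to 0.24 instead of falling victim to binary floating-point rounding.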

Troubleshooting

Symptom	Likely cause	Fix
Worker connects but never gets dispatched	Project URL mismatch: the worker is on a different LiveKit project than the Playground.	Confirm `LIVEKIT_URL` in `.env` matches the URL pasted in the Playground.
"401 Unauthorized" from LiveKit	API key/secret swapped or stale.	Re-copy from Settings → Keys. Secrets are shown only once at creation, so you may need to generate a fresh key.
No greeting on connect, but worker shows "joined room"	Gemini Live model/voice is unknown, or `GOOGLE_AI_API_KEY` is missing/invalid.	Reset `GEMINI_MODEL` to `gemini-3.1-flash-live-preview` and `GEMINI_VOICE` to `Puck`. Verify the key with: `curl -s "https://generativelanguage.googleapis.com/v1beta/models?key=$GOOGLE_AI_API_KEY" | head`; it should return a JSON list of models, not an error response.
Latency > 3 s	Gemini Live response delay, overloaded worker, or network path issue.	Check Google AI Studio status/quota, Phoenix CPU/memory, and the worker logs (`sudo journalctl -u sponic-ai-host`). If the delay grows with context, reduce prompt size or restart the session.
Replies are in English no matter what	Persona prompt was edited and the multilingual rule got cut.	Re-check `persona.py` for the "Reply in the language of the most recent speaker" line.
Replies are 5 paragraphs long	Gemini Live is ignoring a soft persona constraint.	The 1–2 sentence cap is in the prompt library/persona prompt — tighten the wording (try "Never exceed 30 words.").
`uv sync` fails on macOS arm64	Apple Silicon needs Python 3.12+ for some plugins.	Confirm `python3 --version` is 3.12.x. `pyenv install 3.12` if not.
Worker dies silently after a minute	Missing Gemini key, bad LiveKit credentials, or unhandled plugin error.	Check the worker logs (`sudo journalctl -u sponic-ai-host`), then restart `sponic-ai-host.service` after fixing env.

Phase 1 vs the long roadmap

This document tracks the active Gemini Live implementation. The older 8-phase plan remains useful as history, but its cascade model/cost estimates are superseded by this page and the Phoenix runbook.

Phase	Adds	Status
1 — Native-audio agent	Gemini Live, single participant, smoke-test greeting	Shipped 2026-05-13
2 — Multi-mic subscription	Subscribe to all participants in the room, label utterances by speaker	Pending
3 — Intervention control	Prompt/tool tuning for when the host should speak versus stay quiet	In progress
4 — Programmatic prompts	Data-channel API so the host can deliver scheduled toasts and table announcements on cue	Pending
5 — Simli avatar	Live avatar on a venue display, driven by the Gemini Live audio output.	Shipped
6 — Translation-app bridge	Subscribe to ALPUCA's existing `wss://subs.sponicgardens.com/subtitles?lang=...` for transcripts — eliminates duplicate STT cost	Pending
7 — Persistence & memory	Per-guest memory across Agentic Gatherings (likes, dietary, prior conversations)	Pending
8 — Phone interviewer	Phone-callable host for prospect screening	Pending — extend Vapi, don't rebuild.

Phase 8 caveat: the dev task doc still describes building LiveKit SIP from scratch for the phone interviewer. Sponic already has a working Vapi integration in production at apps/control/supabase/functions/vapi-server/index.ts (caller-ID lookup, role-based prompts, smart-home permission routing) and an admin UI at apps/control/public/spaces/admin/voice.html. When Phase 8 lands, the right move is to extend Vapi — provision a Sponic-owned Vapi account, reuse the assistant-request handler, plug in the same persona — not rebuild the phone path on LiveKit. Estimated savings: ~5 days vs ~10 days. Update the dev task doc before Phase 8 starts.
