
Agentic Gathering host — MVP for first contextual-nudge test

A focused cut on top of the existing realtime stack: anecdote-grade interviews, dual-artifact profiles, a dossier-driven host prompt, and a self-enroll Android lobby — the minimum needed to run the first real test of the personalization hypothesis.

2026-05-09 · Build plan · ~5–7 days · Touches: Android · Supabase · LiveKit agent · Anthropic

How to read this doc

Sections 1–2 frame the goal and split the build into four workstreams. Section 3 is the data model. Sections 4–6 are the prompt specs — this is where most of the product lives; read them carefully if you are touching the agent or the interview. Section 7 covers party lifecycle and Android UX. Section 8 is the phased build sequence; section 9 is how we will evaluate after build. Sections 10–11 list out-of-scope items and tunable knobs. Markdown source: docs/devtasks/ai-host-mvp.md. Builds on ai-host-livekit (realtime stack) and ai-host-ios (iOS client).

Context
Four workstreams
Data model
Interview prompt — v2
Extraction prompt — transcript → profile + dossier
Host prompt — v2
Party lifecycle + Android lobby UX
Build sequence
Testing & tuning protocol
Out of scope (Phase 2)
Open for empirical tuning
Build journal

1. Context

The Sponic Agentic Gathering host already exists end-to-end as a stack — LiveKit room, Gemini Live native audio, optional Simli avatar, Android client, Apple client, and web fallback. The current production worker is apps/ai-host/agent.py on Oracle Phoenix, registered as the named LiveKit worker sponic-dinner-host; it no longer runs the older Deepgram → Claude → ElevenLabs cascade described in the original realtime build plan. See ai-host-livekit for the historical architecture plan and current realtime context. What is missing is the demonstrable personalization layer: the host today does not have meaningful per-guest context, the onboarding interview captures generic interest tags rather than material rich enough to enable interesting nudges, and admins must manually curate participants in the console before each event.

This devtask is the minimum cut to enable the first real test of the product hypothesis: that the AI can seamlessly improve the conversational flow of an Agentic Gathering by finding the right moments to nudge based on contextual knowledge of each participant.

Goal of the first test

Demonstrate that, in a real Agentic Gathering, the host produces demo-worthy nudges — specific moments where attendees feel the AI knew them and used that knowledge well. Examples: bridging a lull with a thread from a quiet guest's interview; weaving a story one guest told weeks ago into a current discussion; opening the Agentic Gathering with a hook that draws three different people in.

Why this is a separate doc

ai-host-livekit owns the realtime stack (rooms, transports, agent process). ai-host-ios owns the iOS client surface. This doc owns the personalization + access flow layer: profile capture, dossier generation, host prompt for nudging, and the Android UX that makes self-enrollment possible. Splitting it keeps the realtime build untouched and lets the personalization workstream proceed in parallel with the iOS client.

2. Four workstreams

A. Onboarding interview rework

Current state: an AI interview happens at signup, but the prompt produces interest tags ("hiking, software, jazz") rather than anecdote-grade material, and the response data is not structured for downstream use. Fix:

A new interview prompt designed to surface stories, strong opinions, geek-outs, life chapters, and specific recent experiences — the kind of material a host could imagine retelling at the Agentic Gathering.
Two-pass design: a natural conversational interview (warm, no quiz feel), then a separate LLM extraction pass over the transcript that produces both derived artifacts described in workstream B below.

B. Profile pipeline — two artifacts from one transcript

Same source interview produces:

User-visible structured profile. Clean, editable fields the user sees in the Android app and can correct. Satisfies the transparency principle: the user understands what we know and how it is shared.
Host-only dossier. Freeform character sketch (one paragraph) + 5–10 concrete hooks. Optimized for the host LLM, not for human display. The user is told the dossier exists but does not edit it directly. The two artifacts intentionally diverge — the dossier preserves nuance the user-visible profile sanitizes away (hot takes, unflattering specifics) so the host has material rich enough to bridge with.

Single latest version per user; re-running the interview replaces both.

C. Host prompt + dossier integration

Reactive, time-blind, dossier-driven. The host receives all participants' dossiers in its system prompt at Agentic Gathering start. Four triggers drive when it speaks (lull, quiet guest, topic exhaustion, lobby/welcome). Bridging behavior weaves dossier threads into the existing conversation rather than reciting them. Detailed in §6.

D. Android lobby + self-enrollment

Replaces console-driven admin enrollment for the test phase. Android shows active joinable Agentic Gatherings; the user taps to join; lands in a lobby room; anyone can verbally signal "let's start" and the host transitions the Agentic Gathering to in-progress mode. Detailed in §7.

3. Data model

Migration at apps/control/migrations/20260509_ai_host_mvp.sql (applied 2026-05-09). The pre-existing dinner_events / dinner_event_participants tables are reused; this was reconciled during Phase 1. The spec originally called these “parties”; the actual table names are kept to avoid a rename.

RLS pattern matches the existing dinner_events policies: staff/admin/oracle SELECT via Supabase auth; mobile clients reach the data through Edge Functions using the service role (which bypasses RLS). user_dossiers is admin/oracle-only on read because hooks marked comfort: personal should not be casually browsable.

public.onboarding_interviews
  id uuid PK default gen_random_uuid(),
  app_user_id uuid references app_users(id) on delete cascade,
  transcript_messages jsonb default '[]',
  transcript_text text,
  status text default 'in_progress'
    CHECK (status IN ('in_progress','completed','extracted','abandoned')),
  supersedes_id uuid references onboarding_interviews(id),
  started_at timestamptz default now(),
  completed_at timestamptz

public.user_profiles                  -- user-visible, editable
  app_user_id uuid PK references app_users(id) on delete cascade,
  intro_one_liner text,
  interests text[] default '{}',
  what_im_into text,
  whats_new text,
  pronouns text,
  source_interview_id uuid references onboarding_interviews(id),
  generated_at timestamptz,
  updated_at timestamptz default now()         -- maintained by trigger

public.user_dossiers                  -- host-only, opaque to user
  app_user_id uuid PK references app_users(id) on delete cascade,
  character_sketch text,
  hooks jsonb default '[]',           -- Hook[] (see §5)
  source_interview_id uuid references onboarding_interviews(id),
  generated_at timestamptz default now()

public.dinner_events                  -- pre-existing; ALTER:
  ADD COLUMN lifecycle text default 'scheduled'
    CHECK (lifecycle IN ('scheduled','lobby_open','in_progress','ended')),
  ADD COLUMN lobby_opened_at timestamptz
  -- (started_at, ended_at already exist on this table)
  -- `status` (draft|live|ended|cancelled) stays in place for backwards
  -- compat with existing console flows; agent reads `lifecycle`.

public.dinner_event_participants      -- pre-existing; ALTER:
  ADD COLUMN session_status text default 'lobby'
    CHECK (session_status IN ('lobby','active','left'))
  -- (participant_role already exists with values 'guest'|'host'|'observer';
  --  joined_at / left_at / invited_at already exist)

onboarding_interviews is added to the supabase_realtime publication so the Android client can react when the extraction pass completes. dinner_events and dinner_event_participants are already published by the prior dinner-events migration.

4. Interview prompt — v2

Conversational, not a checklist. Drives the LLM toward five implicit goals without surfacing them as categories. Sketch:

You are a warm, curious interviewer for Sponic Gardens. You are talking
with someone who just signed up because they want to attend AI-hosted
Agentic Gatherings with strangers. Over the next 10–15 minutes, your job is to
collect anecdote-grade material — specific stories, opinions, current
obsessions, and life chapters a thoughtful host could weave into the Agentic Gathering
conversation.

Tone: a friend over coffee. Curious, not clinical. Never a quizmaster.

What you are trying to surface (work it in naturally — never list these
out loud):
  1. 2–3 short stories they would be willing to retell at the Agentic Gathering —
     funny, formative, weird, recent.
  2. 2–3 strong opinions or small hot takes they hold.
  3. What they are geeking out about right now — a project, rabbit
     hole, or thing they have spent too much time on lately.
  4. The life chapter they are in — what is new, what is ending, what
     they are chewing on.
  5. What they are hoping to get from meeting new people.

Rules:
  - One question at a time. Listen.
  - Never say "next question" or list categories.
  - Probe for specificity. If they say "I love hiking," ask about a
    recent hike, a place they keep coming back to, the weirdest thing
    that has happened on a trail.
  - If they get reflective or vulnerable, do not pivot — let the
    moment breathe and follow it.
  - Wrap when you have collected enough across the five goals, not
    by hitting a turn count.

When you are ready to wrap, say so warmly. Confirm they can update or
rerun anytime in the app.

The interview UI lives in the Android app: the same surface as today, with the new system prompt and richer transcript persistence. Each user message and model reply is appended to onboarding_interviews.transcript_messages. On wrap, the row's status is set to completed and an extraction job is triggered.
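The persistence steps above can be sketched in memory as follows. This is a minimal illustration, not the app's code: the InterviewRow class, the {"role", "content"} message shape, and the "role: content" layout of transcript_text are all assumptions; only the column names and status values come from §3.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class InterviewRow:
    """Hypothetical in-memory stand-in for one onboarding_interviews row."""
    transcript_messages: list = field(default_factory=list)
    transcript_text: str = ""
    status: str = "in_progress"
    completed_at: Optional[datetime] = None


def append_message(row: InterviewRow, role: str, content: str) -> None:
    """Append one turn; refuse once the interview has been wrapped."""
    if row.status != "in_progress":
        raise ValueError(f"cannot append to an interview in status {row.status!r}")
    # Message shape is an assumption; the spec only says turns are appended.
    row.transcript_messages.append({"role": role, "content": content})


def wrap_interview(row: InterviewRow, now: Optional[datetime] = None) -> InterviewRow:
    """Flatten the turns into transcript_text and mark the row completed."""
    row.transcript_text = "\n".join(
        f"{m['role']}: {m['content']}" for m in row.transcript_messages
    )
    row.status = "completed"
    row.completed_at = now or datetime.now(timezone.utc)
    return row
```

In a real deployment the extraction job would be triggered after wrap_interview; that step is left out here because the trigger mechanism is not specified in this section.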

5. Extraction prompt — transcript → profile + dossier

A separate one-shot Claude call (Sonnet 4.6 default) with structured output. Sketch:

You are an extraction pipeline. Given a Sponic onboarding interview
transcript, produce two artifacts as JSON.

ARTIFACT 1 — user-visible structured profile.
{
  "intro_one_liner": string,    // 1 sentence the user could imagine
                                // being introduced with at the Agentic Gathering
  "interests": string[],        // 5–10 broad tags
  "what_im_into": string,       // 2–3 sentence paragraph
  "whats_new": string,          // 1–2 sentences on current life chapter
  "pronouns": string | null
}

Constraint: shown back to the user. Sanitize. No embarrassing specifics,
no hot takes that would feel weird read back, no verbatim quotes. The
user should feel "yes, that's me" — not "oh god, delete this."

ARTIFACT 2 — host-only dossier.
{
  "character_sketch": string,   // 1 paragraph, 4–6 sentences. How
                                // this person comes across. What
                                // makes them them.
  "hooks": Hook[]               // 5–10 concrete threads
}

Hook = {
  "type": "story" | "opinion" | "geek_out" | "life_chapter" | "recent",
  "title": string,              // 4–8 word handle for the host
  "content": string,            // 2–4 sentences with concrete detail
                                // the host can weave in
  "comfort": "public" | "personal"
                                // public = freely bridge to.
                                // personal = bridge gently only when
                                // the room is in a thoughtful register.
}

Constraint: hooks MUST be specific.
  Bad:  "loves cooking"
  Good: "spent six months trying to perfect Sichuan mapo tofu, finally
         cracked it when she stopped using American doubanjiang"

If the transcript lacks specific material, return fewer hooks rather
than generic ones. It is better to surface 3 sharp hooks than 10 vague
ones.

The extraction call writes the user_profiles and user_dossiers rows in a single transaction, then sets onboarding_interviews.status = 'extracted'.
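Before writing rows, it may be worth checking the model's JSON against the shapes above. The sketch below is one possible validator. The top-level {"profile": ..., "dossier": ...} wrapper is an assumption (the prompt does not name wrapper keys), and the function names are hypothetical. It allows fewer than five hooks, since the prompt asks for fewer rather than generic ones.

```python
HOOK_TYPES = {"story", "opinion", "geek_out", "life_chapter", "recent"}
COMFORT_LEVELS = {"public", "personal"}
PROFILE_FIELDS = ("intro_one_liner", "interests", "what_im_into", "whats_new", "pronouns")
MAX_HOOKS = 10  # upper bound from the prompt; no lower bound is enforced


def validate_hook(hook, index):
    """Return a list of problems found in one hook (empty if it is well-formed)."""
    if not isinstance(hook, dict):
        return [f"hooks[{index}] must be an object"]
    errors = []
    if hook.get("type") not in HOOK_TYPES:
        errors.append(f"hooks[{index}].type is not a known hook type")
    if hook.get("comfort") not in COMFORT_LEVELS:
        errors.append(f"hooks[{index}].comfort must be 'public' or 'personal'")
    for key in ("title", "content"):
        value = hook.get(key)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"hooks[{index}].{key} must be a non-empty string")
    return errors


def validate_extraction(payload):
    """Validate the assumed {"profile": {...}, "dossier": {...}} payload."""
    errors = []
    profile = payload.get("profile", {})
    dossier = payload.get("dossier", {})
    for key in PROFILE_FIELDS:
        if key not in profile:
            errors.append(f"profile.{key} is missing")
    if not isinstance(dossier.get("character_sketch"), str):
        errors.append("dossier.character_sketch must be a string")
    hooks = dossier.get("hooks")
    if not isinstance(hooks, list):
        errors.append("dossier.hooks must be a list")
        return errors
    if len(hooks) > MAX_HOOKS:
        errors.append(f"dossier.hooks has more than {MAX_HOOKS} entries")
    for i, hook in enumerate(hooks):
        errors.extend(validate_hook(hook, i))
    return errors
```

An empty result means the payload is structurally sound; it says nothing about whether the hooks are specific enough, which still needs a human or a second model pass.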

6. Host prompt — v2

Reactive, time-blind, dossier-driven. Built into the LiveKit agent process (Phase 4 confirms the exact path against the live agent code). Sketch:

You are the host of a small Agentic Gathering. You invited 4–6 people and
interviewed each of them beforehand. Their dossiers are below.

You are NOT the center of attention. You are the person who makes sure
the rhythm is good and everyone feels included. You speak rarely. When
you do speak, you are brief.

DEFAULT: silent. You speak only on triggers.

  1. LULL — the room has been silent for ~8 seconds. Bridge with a
     thread from someone's dossier that connects to whatever was last
     being discussed. Hook into what was just said; do not pivot
     abruptly.

  2. QUIET GUEST — someone has not contributed in several minutes. Do
     not call them out by name. Raise a topic from THEIR dossier in a
     way that invites them in naturally:
       "Speaking of food projects, has anyone tried something a bit
        obsessive lately?"  (lets Alice volunteer her sourdough)

  3. TOPIC EXHAUSTION — agreement loops, repetition, fading energy.
     Pivot via a dossier bridge.

  4. LOBBY / WELCOME (active when the Agentic Gathering is in lobby_open). Greet
     arrivals warmly and briefly. Light small talk; low-pressure
     tone. When you hear any participant signal readiness — "we're
     all here", "let's start", "should we get going?", etc. — confirm
     warmly and call the start_dinner tool. Open with a thread likely
     to spark, pulled from a hook with comfort: public.

BRIDGING.
  - Weave hooks into the current thread. NEVER recite a dossier item.
  - Bad:  "Alice, you mentioned in your interview that you geek out
           on sourdough."
  - Good: "Speaking of food projects — has anyone tried something
           a bit obsessive lately?"

ANTI-PATTERNS — never do these:
  - Round-robin questions ("let's go around the table").
  - "Tell us about yourself."
  - Quizmaster framing.
  - Putting people on the spot by name.
  - Long monologues.
  - Recapping or summarizing what was said.
  - Quoting the interview transcript directly.

COMFORT.
  - Hooks marked "public" — fine to bridge to openly.
  - Hooks marked "personal" — only bridge if the conversation is
    already in a thoughtful or vulnerable register, and bridge
    gently (no direct callout).

VOICE.
  - Brief. Two sentences maximum for most interjections.
  - Warm, slightly understated. Not a TV host. Not a therapist.
  - Comfortable with silence. Do not fill it just because you can.

TOOLS.
  - start_dinner() — transitions the Agentic Gathering to in_progress. Call when
    a lobby participant signals readiness.

DOSSIERS:
{{DOSSIERS_JSON}}

PARTICIPANTS PRESENT:
{{PARTICIPANT_LIST}}

Dossier injection is one-shot at Agentic Gathering start (token-cheap for ≤6 guests; revisit if average gathering size grows). No mid-gathering tool-call lookup in v1.
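One way the one-shot injection could work is a single-pass substitution of the two placeholders in the prompt text. This is a sketch under assumptions: render_host_prompt is a hypothetical helper, dossiers are assumed to be keyed by display name, and the participant list is assumed to be rendered as a bulleted list of names.

```python
import json
import re

# Matches only the two placeholders used in the host prompt sketch.
PLACEHOLDER = re.compile(r"\{\{(DOSSIERS_JSON|PARTICIPANT_LIST)\}\}")


def render_host_prompt(template: str, dossiers: dict, participants: list) -> str:
    """Fill both placeholders in one pass.

    A single regex pass means text inside a dossier that happens to look
    like a placeholder is never substituted a second time.
    """
    values = {
        "DOSSIERS_JSON": json.dumps(dossiers, ensure_ascii=False, indent=2),
        "PARTICIPANT_LIST": "\n".join(f"- {name}" for name in participants),
    }
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)
```

Usage: call it once when the gathering starts and pass the result as the system prompt. Because injection is one-shot, guests who join later would not appear unless the prompt is rebuilt, which matches the late-joiner limitation listed in §10.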

7. Party lifecycle + Android lobby UX

Lifecycle

scheduled → lobby_open → in_progress → ended
          ↑            ↑
          |            |
  admin opens   start_dinner tool
     lobby        fired by host

scheduled — created by admin, not yet joinable.
lobby_open — agent is in the room in welcome mode; users can join; nothing has "started." The host greets, makes small talk, listens for the start trigger.
in_progress — Agentic Gathering has begun; host is in dossier-bridging mode.
ended — agent has left or admin has closed.

Start trigger

The host detects readiness in natural speech and calls start_dinner. Implementation: a tool exposed to the host LLM (per §6). The LLM does the intent detection inline, with no separate classifier. If false positives become a problem in testing, add a confirmation step ("OK, should we begin?") before the tool call.
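The lifecycle and the start trigger can be modeled as a small state machine. The sketch below is illustrative only: the transition table follows the linear diagram above, the event is a plain dict standing in for a dinner_events row, and making start_dinner idempotent (a repeated call is a no-op) is an assumption rather than something the spec states.

```python
# Allowed lifecycle transitions, taken from the linear diagram in this section.
ALLOWED_TRANSITIONS = {
    "scheduled": {"lobby_open"},
    "lobby_open": {"in_progress"},
    "in_progress": {"ended"},
    "ended": set(),
}


def transition(current: str, target: str) -> str:
    """Return the new lifecycle value, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal lifecycle transition {current!r} -> {target!r}")
    return target


def start_dinner(event: dict) -> dict:
    """Hypothetical handler behind the start_dinner tool.

    Returns a copy of the event with lifecycle set to in_progress. A second
    call on an already started event returns it unchanged (assumed behavior),
    so a repeated "let's start" does not raise.
    """
    if event["lifecycle"] == "in_progress":
        return event
    return dict(event, lifecycle=transition(event["lifecycle"], "in_progress"))
```

Keeping the transition table in one place makes it easy to reject a start_dinner call that arrives while the event is still scheduled or already ended.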

Android UX

Three additions to the app:

Active Agentic Gatherings list. Shows lobby_open and in_progress Agentic Gatherings the user is eligible to join. Open enrollment for the test phase — any authenticated user sees all open Agentic Gatherings. Production guest-list gating is out of scope (§10).
Lobby screen. Avatars + names of who is present, the host's avatar, and a hint: "say 'let's start' when everyone is here." Mic active; user is in the room.
Profile screen. Shows the user-visible structured profile (§5 Artifact 1) with edit affordances. Includes a "redo your interview" button that creates a new onboarding_interviews row and supersedes the prior one.

The existing in-progress Agentic Gathering UI is reused unchanged.

Listen mode

Out of product scope for v1. The host treats every visible participant as a speaker. The Android client retains a hidden "observer" role for dev/internal use; observers join dinner_event_participants with participant_role = 'observer' and are omitted from the PARTICIPANT_LIST passed to the host prompt.
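A minimal sketch of the observer filter, using the participant_role and session_status columns from §3. The display_name field and the function name are hypothetical, and dropping rows whose session_status is 'left' is an added assumption (the spec only says observers are excluded).

```python
def host_visible_participants(rows: list) -> list:
    """Return display names the host prompt should see in PARTICIPANT_LIST.

    Observers are excluded per the spec. Rows marked 'left' are also
    excluded here, which is an assumption about what "present" means.
    """
    return [
        row["display_name"]
        for row in rows
        if row.get("participant_role") != "observer"
        and row.get("session_status") != "left"
    ]
```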

8. Build sequence

Phase	What	Surface
1	Schema migration: `user_profiles`, `user_dossiers`, `onboarding_interviews`, lifecycle columns on `dinner_events` / `dinner_event_participants`. RLS policies.	Supabase
2	Interview prompt v2 + transcript persistence + extraction job (transcript → profile + dossier).	Android (interview UI) + edge fn for extraction
3	User-facing profile screen on Android: read, edit, redo-interview action.	Android
4	Host prompt v2 + dossier injection at Agentic Gathering start. Reactive triggers (lull, quiet guest, topic exhaustion).	LiveKit agent
5	Lobby/welcome host behavior + `start_dinner` tool + lifecycle transitions.	LiveKit agent + Supabase
6	Android active Agentic Gatherings list + lobby screen + join flow.	Android
7	End-to-end test rehearsal with internal participants; tune triggers, prompts, comfort handling.	All

Phases 2–3 and 4–5 can run in parallel after Phase 1.

9. Testing & tuning protocol

Once Phases 1–6 are built, run the Phase 7 rehearsal: an Agentic Gathering with 4–6 internal participants who have completed the new interview. Observe for:

Demo-worthy nudge count. Target ≥ 3 in a 90-minute Agentic Gathering. A demo-worthy nudge is one a participant would notice and be impressed by post-hoc, not just any interjection.
False intervention rate. Host speaks when nobody wanted it to. Target < 1 per 30 minutes.
Bridge specificity. Host bridges draw on concrete dossier material rather than a generic "speaking of food" opener. Target ≥ 60% of interventions.
Lobby trigger reliability. The host detects "let's start" intent and calls start_dinner. Target: 5 of 5 trial signals detected (100% true-positive).
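The four targets above can be checked mechanically from a hand-kept tally of the rehearsal. The sketch below assumes such a tally exists; the RehearsalTally fields are hypothetical, and scaling the nudge and false-intervention targets to the actual session length is an assumption for sessions that do not run exactly 90 minutes.

```python
from dataclasses import dataclass


@dataclass
class RehearsalTally:
    """Hand-counted observations from one rehearsal (hypothetical shape)."""
    duration_minutes: int
    demo_worthy_nudges: int
    false_interventions: int
    interventions: int
    specific_bridges: int
    start_signals: int
    start_signals_detected: int


def evaluate(t: RehearsalTally) -> dict:
    """Return pass/fail per target, using integer arithmetic to avoid rounding."""
    return {
        # at least 3 demo-worthy nudges per 90 minutes
        "demo_worthy_nudges": t.demo_worthy_nudges * 90 >= 3 * t.duration_minutes,
        # fewer than 1 false intervention per 30 minutes
        "false_interventions": t.false_interventions * 30 < t.duration_minutes,
        # at least 60% of interventions use concrete dossier material
        "bridge_specificity": t.interventions > 0
        and t.specific_bridges * 100 >= 60 * t.interventions,
        # every start signal detected, across at least 5 trials
        "lobby_trigger": t.start_signals >= 5
        and t.start_signals_detected == t.start_signals,
    }
```

Whether a nudge counts as demo-worthy remains a human judgment; the function only compares the counts against the stated thresholds.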

Tune the following parameters empirically — they are intentionally not pre-committed in this spec:

Lull threshold (6 vs 10 vs 15 seconds).
Whether the host narrates its move ("let me bring something up…") vs dives in.
Whether the host ever names guests directly when bridging.
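To make the lull threshold easy to vary between rehearsals, silence could be tracked by a small timer object. The spec does not say how silence is measured, so everything here is an assumption: the LullDetector class, its method names, and the choice to fire at most once per stretch of silence are illustrative only. Timestamps are passed in explicitly (in seconds) so the logic can be exercised without a real clock.

```python
class LullDetector:
    """Fires once when no speech has been heard for threshold_s seconds."""

    def __init__(self, threshold_s: float = 8.0):
        self.threshold_s = threshold_s
        self._last_speech_at = None  # timestamp of the most recent speech
        self._fired = False          # True once this silence has triggered

    def on_speech(self, now: float) -> None:
        """Record speech activity and re-arm the detector."""
        self._last_speech_at = now
        self._fired = False

    def poll(self, now: float) -> bool:
        """Return True exactly once per silence that reaches the threshold."""
        if self._last_speech_at is None or self._fired:
            return False
        if now - self._last_speech_at >= self.threshold_s:
            self._fired = True
            return True
        return False
```

Trying 6, 10, or 15 seconds then only requires changing threshold_s. Firing once per silence keeps the host from repeating itself if nobody answers the first bridge.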

Findings from the rehearsal go into the build journal of this devtask, not into the prompt spec — the prompts evolve, the spec stays a snapshot of intent.

10. Out of scope (Phase 2)

Goal-driven host (e.g. "make sure each guest shares something they care about"). Adds coherence at the risk of feeling agenda-pushy. Defer until reactive baseline is proven.
Time-aware behavior arcs (warm-up → meat → wind-down). Adds complexity; easier to add later than to remove.
Late-joiner mid-gathering handling. If someone joins after in_progress, today's behavior is "host treats them as a regular guest"; explicit re-introduction logic and late dossier injection are deferred.
Heat / discomfort detection. Host noticing a topic going sour and de-escalating. Adds significant prompt complexity; prove the simple bridging case first.
Profile freshening. Re-interviews triggered by time decay or post-gathering "what we learned about Alice" updates. Today: user manually re-runs.
Production guest-list gating / invitation model. For testing, open enrollment is fine. Production needs RSVP, capacity caps, and host-curated invites.

11. Open for empirical tuning

Decided as starting points; expect to revise after rehearsal:

Lull threshold (~8s).
Reactive only (no goals).
Time-blind.
Always-bridge style (never recite).
Comfort split into public/personal only (no finer gradient).

These are knobs, not commitments — Phase 7 produces the data to set them.

Build journal

2026-05-13 · Codex · ad-hoc · codex/onboarding-prompt-audit

Prompt selection is now auditable and admin-configurable.

Default selection. Admin → Interviews stores the global onboarding default in public.onboarding_prompt_defaults (key='default_interview'). The AI host checks this table before falling back to ONBOARDING_INTERVIEW_PROMPT_NAME.
Per-person overrides. Admin → Interviews can assign a specific active agent_interview prompt to one user via public.onboarding_prompt_assignments. Clearing the override returns that user to the default.
Historical clarity. Each extracted row in public.onboarding_interviews now stores prompt_name, prompt_code, prompt_version, prompt_record_id, prompt_source, and prompt_selection_reason, so old interviews show which prompt/version was actually used.
Prompt text editing. The prompt library remains the content editor. To change PDHA04 text, use /en/prompts; to change which prompt a user or all users receive next, use Admin → Interviews.
Tested. Production migration applied through the Supabase pooler; dinner-host-control/prompt?persona=interviewer returns PDHA04 v1; extract-host-mvp-profile deployed and loads successfully; Phoenix sponic-ai-host restarted and registered with LiveKit. Local tsc --noEmit and python3 -m py_compile passed. Local next build is blocked on this machine by native SWC/lightningcss binary install/signature issues, so production build verification comes from Cloudflare after push.
Files. apps/control/migrations/20260513_onboarding_prompt_audit.sql, apps/control/src/components/admin/onboarding-reset-panel.tsx, apps/ai-host/agent.py, apps/control/supabase/functions/extract-host-mvp-profile/index.ts, apps/control/supabase/functions/dinner-host-control/index.ts.
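The selection order described in this journal entry (per-person override, then the global default, then the ONBOARDING_INTERVIEW_PROMPT_NAME fallback) can be sketched as below. This is not the code in agent.py: the two tables are modeled as plain dicts, the function name is hypothetical, and the prompt_source and prompt_selection_reason strings are made-up values for illustration.

```python
def select_interview_prompt(user_id: str, assignments: dict, defaults: dict, env: dict) -> dict:
    """Resolve which interview prompt a user receives, with an audit trail.

    assignments: user_id -> prompt name (stand-in for onboarding_prompt_assignments)
    defaults:    key -> prompt name     (stand-in for onboarding_prompt_defaults)
    env:         environment variables  (e.g. dict(os.environ))
    """
    override = assignments.get(user_id)
    if override:
        return {
            "prompt_name": override,
            "prompt_source": "assignment",  # hypothetical label
            "prompt_selection_reason": "per-user override",
        }
    default = defaults.get("default_interview")
    if default:
        return {
            "prompt_name": default,
            "prompt_source": "default_table",  # hypothetical label
            "prompt_selection_reason": "global default",
        }
    return {
        "prompt_name": env.get("ONBOARDING_INTERVIEW_PROMPT_NAME"),
        "prompt_source": "env",  # hypothetical label
        "prompt_selection_reason": "environment fallback",
    }
```

Returning the source and reason alongside the name mirrors the audit columns listed above, so the caller can store them on the interview row without recomputing the decision.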