From 81675335ad81d49e480d8bec7bd7356c5a1b7ee1 Mon Sep 17 00:00:00 2001
From: Matt <matt@letsbe.solutions>
Date: Wed, 1 Apr 2026 13:46:47 -0400
Subject: [PATCH] docs: add voice discovery mode design spec

Captures the pivot from form-filling voice mode to a standalone
consultative discovery experience with separate entry point, rewritten
system prompt, on-screen contact verification, and reconnection handling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 ...2026-04-01-voice-discovery-pivot-design.md | 144 ++++++++++++++++++
 1 file changed, 144 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-04-01-voice-discovery-pivot-design.md

diff --git a/docs/superpowers/specs/2026-04-01-voice-discovery-pivot-design.md b/docs/superpowers/specs/2026-04-01-voice-discovery-pivot-design.md
new file mode 100644
index 0000000..d4d8980
--- /dev/null
+++ b/docs/superpowers/specs/2026-04-01-voice-discovery-pivot-design.md
@@ -0,0 +1,144 @@
+# Voice Discovery Mode — Design Spec
+
+## Overview
+
+Pivot the voice mode from a "faster way to fill out the configurator" into a standalone consultative discovery experience. Exploratory users — people who don't yet know exactly what they need — get a warm, conversational entry point separate from the typed configurator. The conversation is free-flowing and consultant-like, structured data is captured silently, and the user receives a personalized brief at the end.
+
+## Entry Point & Framing
+
+### Placement
+
+A new standalone section on the landing page, positioned after the services or process section — wherever the natural "I'm interested but not sure" moment occurs. Completely decoupled from the configurator.
+
+### Copy Direction
+
+- **Headline:** Warm and inviting — "Not sure where to start?" or "Still figuring out what you need?"
+- **Subtext:** "Tell us what you're thinking and we'll figure it out together. You'll get a personalized brief at the end."
+- **CTA button:** "Let's talk" — styled distinctly from the configurator's CTA.
+- Both EN and FR translations required.
+
+### Behavior
+
+Clicking the CTA scrolls to / reveals the voice conversation panel inline on the page. No route change, no modal. The panel expands in place, keeping the user grounded in the site context.
+
+### Configurator Changes
+
+- Remove the `ModeToggle` component and `mode` state from `WizardContainer.tsx`.
+- The configurator becomes typed-form-only — the "I know what I want" path.
+- No other changes to the configurator itself.
+
+## Voice Conversation UI
+
+### Layout
+
+A dedicated panel, roughly the same width as the configurator card but taller. Three zones stacked vertically:
+
+1. **Agent header** — LetsBe branding mark, agent name, connection status dot. Similar to the current header, but slightly more prominent.
+
+2. **Orb + transcript area** — The orb is larger (24-28 units instead of 20). The live transcript sits below it with significantly more vertical space (`max-h-72` or similar instead of the current `max-h-40`) and autoscrolls via `scrollIntoView` on the bottom ref. **Selection chips are removed** — no visible evidence of structured data capture.
+
+3. **Controls** — Mic toggle and end call button. Same as the current controls, but cleaner without the chips.
+
+### Mobile
+
+Panel goes nearly full-width on small screens. Transcript takes most of the viewport height. Orb may scale down slightly. Controls stay fixed at the bottom for thumb reach.
+
+### Contact Confirmation Card
+
+When the agent captures name and email, a small inline card appears (above controls or below transcript) showing the captured values with an inline edit affordance. The agent says "I've got your details on screen — look right?" The user can tap to edit, then confirm. **This replaces the verbal spell-back entirely.**
+
+This requires a new tool (e.g., `request_contact`) that the agent calls to surface the card, rather than confirming contact info verbally.
+
+### During Brief Generation
+
+After contact confirmation, once `complete_brief` is triggered:
+- The connection is closed (this teardown is already fixed).
+- Panel transitions to a generating state — orb morphs to loader or StepGenerating-style progress indicators.
+- Transcript remains visible so the conversation doesn't vanish.
+
+### On Completion
+
+The panel transitions to the same `StepComplete` view (brief preview + book a call CTA). The brief content will be richer due to the deeper conversation, but the presentation is the same.
+
+## System Prompt & Agent Behavior
+
+### Tone
+
+The agent is a conversational consultant, not an interviewer with a checklist. No numbered topic list to work through. The prompt gives the agent a goal: "understand what this person needs deeply enough to write a compelling brief."
+
+### Behavioral Guidelines
+
+- **Follow the user's thread.** If they talk about a frustration, dig into it. Don't redirect to the next "topic."
+- **One question at a time.** This stays — it works.
+- **Offer perspective, not just questions.** "That sounds like it might be more of a systems problem than a website problem." The agent has opinions, not just a clipboard.
+- **Reference LetsBe naturally.** "We've done something similar for a hospitality client" — not a feature list.
+- **2-3 sentences per response.** Prevents monologuing.
+
+### Structured Data Capture
+
+The `update_selections` tool stays. The agent is never instructed to "cover these topics." It maps what it hears to predefined values silently. If the conversation never touches timeline, that field stays empty — that's fine.
+
+### Brief Generation
+
+`conversationSummary` is the **primary payload**. The prompt instructs the agent to include everything discussed: pain points, current tools, what they want to keep vs change, business context, decision-makers, what success looks like. Structured fields (`services`, `industry`, `timeline`) are metadata that helps organize the brief, not the substance.
+
+### Brief Content Philosophy
+
+The brief should be **diagnostic, not prescriptive:**
+- **Deep on their world** — pain points, current tools, what's broken, customers, what success looks like.
+- **Deep on what matters** — priorities and trade-offs surfaced in conversation.
+- **LetsBe's perspective** — a few sentences of informed opinion on what the real problem is.
+- **High-level on implementation** — no stack recommendations, no architecture, no specific deliverables.
+- **No timeline/cost** — "that's what the call is for."
+
+The brief should make the user feel understood and make the follow-up call feel like a warm continuation, not a cold intro.
+
+### Contact Collection
+
+The agent asks for name and email when the conversation reaches a natural conclusion — "I think I've got a great picture of what you need. Let me put a brief together — what's your name and email?" No forced timing. The `request_contact` tool surfaces the on-screen card for verification.
+
+### Language
+
+Both EN and FR system prompts, same as now.
+
+## Reconnection Handling
+
+Exploratory conversations run longer than form-filling sessions, so a dropped connection is more likely. If the WebSocket drops mid-conversation:
+
+- Preserve the transcript on disconnect.
+- Show a "reconnect" option instead of just an error.
+- On reconnect, seed the new Gemini session with the transcript so far (as context in the system prompt or initial message) so the agent can pick up where it left off.
+- The structured selections captured so far are preserved in state.
+
+## Technical Changes
+
+### Files to Modify
+
+- **`VoiceAgentProvider.tsx`** — Refactor `handleToolCall` so `conversationSummary` is the primary brief input. Add state for the contact confirmation card (name + email captured, pending user confirmation). Add reconnection logic (preserve transcript, re-seed on reconnect). Connection teardown on brief completion is already fixed.
+
+- **`VoiceAgent.tsx`** — New layout: larger orb, bigger transcript area, no selection chips. Add the contact confirmation card component (inline editable name + email). Fix autoscroll with `scrollIntoView`. Guard controls for the brief-complete state (already done). Make the layout mobile-responsive.
+
+- **`gemini-live.ts`** — Rewrite `buildSystemPrompt()` for both locales with consultative tone. Adjust `complete_brief` tool description to emphasize `conversationSummary`. Add `request_contact` tool declaration that surfaces the on-screen card.
+
+- **`WizardContainer.tsx`** — Remove `ModeToggle` component import, `mode` state, and the voice mode rendering branch. Remove `handleVoiceComplete` and `VoiceAgentProvider` wrapper (these move to the new section).
+
+- **`ModeToggle.tsx`** — Delete entirely.
+
+- **New: Discovery section component** — A landing-page section with warm copy, a CTA, and the expandable voice panel. `VoiceAgentProvider` and `VoiceAgent` now live here.
+
+- **Landing page** — Add the new discovery section at the appropriate position.
+
+- **i18n message files** (`en.json`, `fr.json`) — Add translations for discovery section copy. Update voice-related strings as needed.
+
+- **Email template** — Verify the brief email template handles longer, more narrative content gracefully. Adjust if needed.
+
+### What Stays the Same
+
+- WebSocket connection to Gemini Live API
+- Audio worklet recording + playback pipeline
+- `update_selections` tool (used silently now)
+- `/api/configure` route and brief generation logic
+- `/api/gemini-token` route
+- `StepComplete` component
+- `analyze_website` tool (still useful when someone mentions their current site)
+- The typed configurator (minus the mode toggle)