# Voice Discovery Mode — Design Spec
## Overview
Pivot the voice mode from a "faster way to fill out the configurator" into a standalone consultative discovery experience. Exploratory users — people who don't yet know exactly what they need — get a warm, conversational entry point separate from the typed configurator. The conversation is free-flowing and consultant-like, structured data is captured silently, and the user receives a personalized brief at the end.
## Entry Point & Framing
### Placement
A new standalone section on the landing page, positioned after the services or process section — wherever the natural "I'm interested but not sure" moment occurs. Completely decoupled from the configurator.
### Copy Direction
- **Headline:** Warm and inviting — "Not sure where to start?" or "Still figuring out what you need?"
- **Subtext:** "Tell us what you're thinking and we'll figure it out together. You'll get a personalized brief at the end."
- **CTA button:** "Let's talk" — styled distinctly from the configurator's CTA.
- Both EN and FR translations required.
### Behavior
Clicking the CTA scrolls to / reveals the voice conversation panel inline on the page. No route change, no modal. The panel expands in place, keeping the user grounded in the site context.
### Configurator Changes
- Remove the `ModeToggle` component and `mode` state from `WizardContainer.tsx`.
- The configurator becomes typed-form-only — the "I know what I want" path.
- No other changes to the configurator itself.
## Voice Conversation UI
### Layout
A dedicated panel, roughly the same width as the configurator card but taller. Three zones stacked vertically:
1. **Agent header:** LetsBe branding mark, agent name, and connection status dot. Similar to the current header but slightly more prominent.
2. **Orb + transcript area:** The orb is larger (24-28 units instead of 20). The live transcript sits below it with significantly more vertical space (`max-h-72` or similar instead of the current `max-h-40`). Autoscroll is fixed by calling `scrollIntoView` on a ref at the bottom of the transcript. **Selection chips are removed**, so there is no visible evidence of structured data capture.
3. **Controls:** Mic toggle and end-call button. Same as the current controls, but cleaner without the chips.
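
The autoscroll fix in zone 2 could look roughly like the helper below. It is a sketch, assuming a sentinel element is rendered after the last transcript line and its ref is passed in; `ScrollTarget` and `scrollTranscriptToBottom` are hypothetical names, not existing code.

```typescript
// Minimal structural type so the helper accepts a real DOM element
// (e.g. bottomRef.current) or a test double.
interface ScrollTarget {
  scrollIntoView(options?: {
    behavior?: "auto" | "smooth";
    block?: "start" | "center" | "end" | "nearest";
  }): void;
}

// Hypothetical helper: bring the sentinel at the end of the transcript into view.
// Returns true if a scroll was requested, false if the ref is not mounted yet.
function scrollTranscriptToBottom(target: ScrollTarget | null): boolean {
  if (!target) return false;
  // "nearest" keeps movement minimal, which helps avoid shifting the whole
  // landing page when only the transcript container needs to scroll.
  target.scrollIntoView({ behavior: "smooth", block: "nearest" });
  return true;
}
```

In `VoiceAgent.tsx` this would presumably be called from an effect that runs whenever the transcript grows.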
### Mobile
Panel goes nearly full-width on small screens. Transcript takes most of the viewport height. Orb may scale down slightly. Controls stay fixed at the bottom for thumb reach.
### Contact Confirmation Card
When the agent captures name and email, a small inline card appears (above controls or below transcript) showing the captured values with inline edit affordance. The agent says "I've got your details on screen — look right?" User can tap to edit, then confirm. **This replaces the verbal spell-back entirely.**
This requires a new tool (e.g., `request_contact`) that the agent calls to surface the card, rather than verifying contact info verbally.
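
As a rough illustration, the tool declaration and the card's pending/confirm state might be shaped like this. The parameter names (`name`, `email`), the `ContactCard` and `ContactAction` types, and `contactReducer` are all assumptions made for this sketch; the declaration only follows the general name/description/parameters shape of a Gemini function declaration and is not the final schema.

```typescript
// Hypothetical tool declaration; field names are assumptions.
const requestContactTool = {
  name: "request_contact",
  description:
    "Show the on-screen contact card with the name and email heard so far so the user can verify or edit them.",
  parameters: {
    type: "OBJECT",
    properties: {
      name: { type: "STRING", description: "Name as heard in conversation" },
      email: { type: "STRING", description: "Email as heard in conversation" },
    },
    required: ["name", "email"],
  },
};

type ContactCard =
  | { status: "hidden" }
  | { status: "pending"; name: string; email: string }
  | { status: "confirmed"; name: string; email: string };

type ContactAction =
  | { type: "tool_call"; name: string; email: string } // agent called request_contact
  | { type: "edit"; field: "name" | "email"; value: string } // user tapped to edit
  | { type: "confirm" }; // user accepted the values

function contactReducer(state: ContactCard, action: ContactAction): ContactCard {
  switch (action.type) {
    case "tool_call":
      return { status: "pending", name: action.name, email: action.email };
    case "edit":
      // Edits only apply while the card is awaiting confirmation.
      if (state.status !== "pending") return state;
      return action.field === "name"
        ? { ...state, name: action.value }
        : { ...state, email: action.value };
    case "confirm":
      if (state.status !== "pending") return state;
      return { status: "confirmed", name: state.name, email: state.email };
  }
}
```

Keeping the card state as a small discriminated union means `complete_brief` can be gated on `status === "confirmed"` without extra flags.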
### During Brief Generation
After contact confirmation and `complete_brief` trigger:
- Connection is closed (already fixed).
- Panel transitions to a generating state: the orb morphs into a loader or `StepGenerating`-style progress indicators.
- Transcript remains visible so the conversation doesn't vanish.
### On Completion
Transitions to the same `StepComplete` view (brief preview + book a call CTA). The brief content will be richer due to deeper conversation, but presentation is the same.
## System Prompt & Agent Behavior
### Tone
The agent is a conversational consultant, not an interviewer with a checklist. No numbered topic list to work through. The prompt gives the agent a goal: "understand what this person needs deeply enough to write a compelling brief."
### Behavioral Guidelines
- **Follow the user's thread.** If they talk about a frustration, dig into it. Don't redirect to the next "topic."
- **One question at a time.** This stays — it works.
- **Offer perspective, not just questions.** "That sounds like it might be more of a systems problem than a website problem." The agent has opinions, not just a clipboard.
- **Reference LetsBe naturally.** "We've done something similar for a hospitality client" — not a feature list.
- **2-3 sentences per response.** Prevents monologuing.
### Structured Data Capture
The `update_selections` tool stays, but the agent is never instructed to "cover these topics." It silently maps what it hears to predefined values. If the conversation never touches timeline, that field stays empty, and that's fine.
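
One way to keep untouched fields empty is to merge each `update_selections` payload as a partial update. The sketch below assumes a `Selections` shape based on the structured fields this spec names (`services`, `industry`, `timeline`); `mergeSelections` is a hypothetical helper, and merging `services` as a union rather than a replacement is an assumption.

```typescript
// Hypothetical shape; the real predefined values live in the configurator.
interface Selections {
  services?: string[];
  industry?: string;
  timeline?: string;
}

// Merge a partial update_selections payload into existing state.
// Fields the agent did not send are left untouched, so topics the
// conversation never reaches simply stay undefined.
function mergeSelections(prev: Selections, update: Selections): Selections {
  const next: Selections = { ...prev };
  if (update.services !== undefined) {
    // Assumption: services accumulate across calls; duplicates are dropped.
    const combined = (prev.services ?? []).concat(update.services);
    next.services = combined.filter((value, index) => combined.indexOf(value) === index);
  }
  if (update.industry !== undefined) next.industry = update.industry;
  if (update.timeline !== undefined) next.timeline = update.timeline;
  return next;
}
```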
### Brief Generation
`conversationSummary` is the **primary payload**. The prompt instructs the agent to include everything discussed: pain points, current tools, what they want to keep vs change, business context, decision-makers, what success looks like. Structured fields (`services`, `industry`, `timeline`) are metadata that helps organize the brief, not the substance.
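
To make the "summary is primary, structured fields are metadata" rule concrete, the `complete_brief` arguments could be parsed along these lines. This is a sketch only: `BriefPayload` and `parseCompleteBriefArgs` are hypothetical names, and rejecting an empty summary is an assumed policy rather than something the current handler is known to do.

```typescript
interface BriefPayload {
  conversationSummary: string; // primary payload: the substance of the brief
  services?: string[]; // optional metadata
  industry?: string; // optional metadata
  timeline?: string; // optional metadata
}

// Hypothetical parser for complete_brief tool-call arguments.
function parseCompleteBriefArgs(args: Record<string, unknown>): BriefPayload {
  const summary =
    typeof args.conversationSummary === "string" ? args.conversationSummary.trim() : "";
  if (summary.length === 0) {
    // Without a summary there is nothing substantive to write a brief from.
    throw new Error("complete_brief requires a non-empty conversationSummary");
  }
  const payload: BriefPayload = { conversationSummary: summary };
  // Structured fields are copied only when present and well-formed.
  if (Array.isArray(args.services)) {
    payload.services = args.services.filter((s): s is string => typeof s === "string");
  }
  if (typeof args.industry === "string") payload.industry = args.industry;
  if (typeof args.timeline === "string") payload.timeline = args.timeline;
  return payload;
}
```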
### Brief Content Philosophy
The brief should be **diagnostic, not prescriptive:**
- **Deep on their world** — pain points, current tools, what's broken, customers, what success looks like.
- **Deep on what matters** — priorities and trade-offs surfaced in conversation.
- **LetsBe's perspective** — a few sentences of informed opinion on what the real problem is.
- **High-level on implementation** — no stack recommendations, no architecture, no specific deliverables.
- **No timeline/cost** — "that's what the call is for."
The brief should make the user feel understood and make the follow-up call feel like a warm continuation, not a cold intro.
### Contact Collection
The agent asks for name and email when the conversation reaches a natural conclusion — "I think I've got a great picture of what you need. Let me put a brief together — what's your name and email?" No forced timing. The `request_contact` tool surfaces the on-screen card for verification.
### Language
Both EN and FR system prompts, same as now.
## Reconnection Handling
Exploratory conversations run longer than form-filling. If the WebSocket drops mid-conversation:
- Preserve the transcript on disconnect.
- Show a "reconnect" option instead of just an error.
- On reconnect, seed the new Gemini session with the transcript so far (as context in the system prompt or initial message) so the agent can pick up where it left off.
- The structured selections captured so far are preserved in state.
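
The re-seeding step could be sketched as a pure function that turns the preserved transcript and selections into a context string for the new session. `TranscriptEntry`, `buildReconnectContext`, and the wording of the instructions are assumptions for illustration; whether the string goes into the system prompt or an initial message is left open, as above.

```typescript
interface TranscriptEntry {
  role: "user" | "agent";
  text: string;
}

// Hypothetical helper: build the context used to seed a session after a reconnect.
// Returns an empty string when there is nothing to resume.
function buildReconnectContext(
  transcript: TranscriptEntry[],
  selections: Record<string, unknown>
): string {
  if (transcript.length === 0) return "";
  const lines = transcript.map(
    (entry) => `${entry.role === "user" ? "User" : "Agent"}: ${entry.text}`
  );
  return [
    "The connection dropped and has been restored. Conversation so far:",
    ...lines,
    `Selections captured so far: ${JSON.stringify(selections)}`,
    "Continue from where the conversation left off. Do not greet the user again.",
  ].join("\n");
}
```

For very long conversations it may be worth truncating or summarizing older turns before seeding, which this sketch does not attempt.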
## Technical Changes
### Files to Modify
- **`VoiceAgentProvider.tsx`** — Refactor `handleToolCall` so `conversationSummary` is the primary brief input. Add state for contact confirmation card (name + email captured, pending user confirm). Add reconnection logic (preserve transcript, re-seed on reconnect). Connection teardown on brief completion already fixed.
- **`VoiceAgent.tsx`** — New layout: larger orb, bigger transcript area, no selection chips. Add contact confirmation card component (inline editable name + email). Fix autoscroll with `scrollIntoView`. Guard controls for brief-complete state (already done). Mobile-responsive layout.
- **`gemini-live.ts`** — Rewrite `buildSystemPrompt()` for both locales with consultative tone. Adjust `complete_brief` tool description to emphasize `conversationSummary`. Add `request_contact` tool declaration that surfaces the on-screen card.
- **`WizardContainer.tsx`** — Remove `ModeToggle` component import, `mode` state, and the voice mode rendering branch. Remove `handleVoiceComplete` and `VoiceAgentProvider` wrapper (these move to the new section).
- **`ModeToggle.tsx`** — Delete entirely.
- **New: Discovery section component** — New section component for the landing page with warm copy, CTA, and expandable voice panel. This is where `VoiceAgentProvider` and `VoiceAgent` now live.
- **Landing page** — Add the new discovery section at the appropriate position.
- **i18n message files** (`en.json`, `fr.json`) — Add translations for discovery section copy. Update voice-related strings as needed.
- **Email template** — Verify the brief email template handles longer, more narrative content gracefully. Adjust if needed.
### What Stays the Same
- WebSocket connection to Gemini Live API
- Audio worklet recording + playback pipeline
- `update_selections` tool (used silently now)
- `/api/configure` route and brief generation logic
- `/api/gemini-token` route
- `StepComplete` component
- `analyze_website` tool (still useful when someone mentions their current site)
- The typed configurator (minus the mode toggle)