docs: add voice discovery mode design spec
Captures the pivot from form-filling voice mode to a standalone consultative discovery experience with separate entry point, rewritten system prompt, on-screen contact verification, and reconnection handling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,144 @@
# Voice Discovery Mode: Design Spec

## Overview

Pivot the voice mode from a "faster way to fill out the configurator" into a standalone consultative discovery experience. Exploratory users (people who don't yet know exactly what they need) get a warm, conversational entry point separate from the typed configurator. The conversation is free-flowing and consultant-like, structured data is captured silently, and the user receives a personalized brief at the end.

## Entry Point & Framing

### Placement

A new standalone section on the landing page, positioned after the services or process section, wherever the natural "I'm interested but not sure" moment occurs. It is completely decoupled from the configurator.

### Copy Direction

- **Headline:** Warm and inviting, e.g. "Not sure where to start?" or "Still figuring out what you need?"
- **Subtext:** "Tell us what you're thinking and we'll figure it out together. You'll get a personalized brief at the end."
- **CTA button:** "Let's talk", styled distinctly from the configurator's CTA.
- Both EN and FR translations required.

### Behavior

Clicking the CTA scrolls to and reveals the voice conversation panel inline on the page. No route change, no modal. The panel expands in place, keeping the user grounded in the site context.

### Configurator Changes

- Remove the `ModeToggle` component and `mode` state from `WizardContainer.tsx`.
- The configurator becomes typed-form-only: the "I know what I want" path.
- No other changes to the configurator itself.

## Voice Conversation UI

### Layout

A dedicated panel, roughly the same width as the configurator card but taller. Three zones stacked vertically:

1. **Agent header:** LetsBe branding mark, agent name, connection status dot. Similar to the current header but slightly more prominent.

2. **Orb + transcript area:** The orb is larger (24-28 units instead of 20). The live transcript sits below it with significantly more vertical space (`max-h-72` or similar instead of the current `max-h-40`). Proper autoscroll using `scrollIntoView` on the bottom ref. **Selection chips are removed**, so there is no visible evidence of structured data capture.

3. **Controls:** Mic toggle and end call button. Same as current, cleaner without chips.
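The autoscroll fix in zone 2 could look roughly like the sketch below. It assumes a sentinel element rendered after the last transcript line (the "bottom ref"). `ScrollTarget` and `scrollTranscriptToBottom` are hypothetical names introduced here; only `scrollIntoView` and its `behavior` / `block` options come from the DOM API.

```typescript
// Minimal structural type so the sketch also runs outside a browser.
// A real HTMLElement should satisfy it.
interface ScrollTarget {
  scrollIntoView(options?: {
    behavior?: "auto" | "smooth";
    block?: "start" | "center" | "end" | "nearest";
  }): void;
}

// Hypothetical helper: call it whenever the transcript gains a line.
// `bottom` is the sentinel element; null means the panel has not
// mounted yet, so there is nothing to scroll.
function scrollTranscriptToBottom(bottom: ScrollTarget | null): boolean {
  if (bottom === null) return false;
  bottom.scrollIntoView({ behavior: "smooth", block: "end" });
  return true;
}
```

In a React component this would typically run from an effect keyed on the transcript length, with the sentinel's ref passed in.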

### Mobile

Panel goes nearly full-width on small screens. Transcript takes most of the viewport height. Orb may scale down slightly. Controls stay fixed at the bottom for thumb reach.

### Contact Confirmation Card

When the agent captures name and email, a small inline card appears (above the controls or below the transcript) showing the captured values with an inline edit affordance. The agent says "I've got your details on screen, look right?" The user can tap to edit, then confirm. **This replaces the verbal spell-back entirely.**

Requires a new tool (e.g., `request_contact`) that the agent calls to surface the card, rather than confirming contact info verbally.
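A minimal sketch of the tool declaration and the card's state, under stated assumptions: the declaration uses the `name` / `description` / `parameters` shape of Gemini function declarations, but the parameter names, the `ContactCard` type, and the helper functions are illustrative and not part of the spec. The email check is deliberately loose.

```typescript
// Assumed declaration; parameter names are illustrative only.
const requestContactTool = {
  name: "request_contact",
  description:
    "Show the on-screen contact card so the user can verify or edit their name and email.",
  parameters: {
    type: "OBJECT",
    properties: {
      name: { type: "STRING", description: "Name as heard by the agent" },
      email: { type: "STRING", description: "Email as heard by the agent" },
    },
  },
};

// Hypothetical card state: hidden until the tool is called, pending until confirmed.
type ContactCard =
  | { status: "hidden" }
  | { status: "pending"; name: string; email: string }
  | { status: "confirmed"; name: string; email: string };

// Loose check: one "@", text on both sides, a dot in the domain.
function looksLikeEmail(value: string): boolean {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value.trim());
}

// Called from the tool-call handler when the agent invokes request_contact.
function showCard(args: { name?: string; email?: string }): ContactCard {
  return { status: "pending", name: args.name?.trim() ?? "", email: args.email?.trim() ?? "" };
}

// Inline edit: only a pending card can be changed.
function editCard(card: ContactCard, patch: { name?: string; email?: string }): ContactCard {
  if (card.status !== "pending") return card;
  return { status: "pending", name: patch.name ?? card.name, email: patch.email ?? card.email };
}

// Confirm only when both fields are usable; otherwise the card stays pending.
function confirmCard(card: ContactCard): ContactCard {
  if (card.status !== "pending") return card;
  const name = card.name.trim();
  const email = card.email.trim();
  if (name === "" || !looksLikeEmail(email)) return card;
  return { status: "confirmed", name, email };
}
```

Keeping the card as a small discriminated union makes the "pending user confirm" state explicit, which is the state `VoiceAgentProvider.tsx` is expected to hold.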

### During Brief Generation

After contact confirmation and the `complete_brief` trigger:

- Connection is closed (already fixed).
- Panel transitions to a generating state: the orb morphs into a loader or `StepGenerating`-style progress indicators.
- Transcript remains visible so the conversation doesn't vanish.

### On Completion

Transitions to the same `StepComplete` view (brief preview + book a call CTA). The brief content will be richer due to the deeper conversation, but the presentation is the same.
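The panel lifecycle described in this section can be sketched as a small transition table. The phase and event names are hypothetical labels for the states the spec describes, and the assumption that `StepComplete` replaces the transcript is mine, not the spec's.

```typescript
// Hypothetical phases, in the order the spec walks through them.
type Phase = "collapsed" | "live" | "contact" | "confirmed" | "generating" | "complete";
type PanelEvent =
  | "open" // CTA clicked, panel expands in place
  | "request_contact" // agent calls request_contact
  | "confirm_contact" // user confirms the card
  | "complete_brief" // agent calls complete_brief
  | "brief_ready"; // brief generation finished

const transitions: Record<Phase, Partial<Record<PanelEvent, Phase>>> = {
  collapsed: { open: "live" },
  live: { request_contact: "contact" },
  contact: { confirm_contact: "confirmed" },
  confirmed: { complete_brief: "generating" },
  generating: { brief_ready: "complete" },
  complete: {},
};

// Events that are not valid in the current phase are ignored.
function nextPhase(phase: Phase, event: PanelEvent): Phase {
  return transitions[phase][event] ?? phase;
}

// UI flags derived from the phase.
function panelView(phase: Phase) {
  const inCall = phase === "live" || phase === "contact" || phase === "confirmed";
  return {
    connectionOpen: inCall, // closed once generation starts
    controlsEnabled: inCall, // mic and end-call are guarded afterwards
    // Assumed: StepComplete replaces the transcript once the brief is ready.
    transcriptVisible: phase !== "collapsed" && phase !== "complete",
    showStepComplete: phase === "complete",
  };
}
```

A table like this makes it hard to reach the generating state without a confirmed contact, since `complete_brief` is only honored from the `confirmed` phase.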

## System Prompt & Agent Behavior

### Tone

The agent is a conversational consultant, not an interviewer with a checklist. There is no numbered topic list to work through. The prompt gives the agent a goal: "understand what this person needs deeply enough to write a compelling brief."

### Behavioral Guidelines

- **Follow the user's thread.** If they talk about a frustration, dig into it. Don't redirect to the next "topic."
- **One question at a time.** This stays because it works.
- **Offer perspective, not just questions.** "That sounds like it might be more of a systems problem than a website problem." The agent has opinions, not just a clipboard.
- **Reference LetsBe naturally.** "We've done something similar for a hospitality client", not a feature list.
- **2-3 sentences per response.** Prevents monologuing.

### Structured Data Capture

The `update_selections` tool stays. The agent is never instructed to "cover these topics." It silently maps what it hears to predefined values. If the conversation never touches timeline, that field stays empty, and that's fine.
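One way the silent capture could be handled on the client is sketched below. The field names `services`, `industry`, and `timeline` appear elsewhere in this spec; the allowed values in `ALLOWED`, the `Selections` types, and `applySelections` are hypothetical placeholders, since the actual predefined values are not listed here.

```typescript
// Hypothetical catalogue of predefined values; the real lists live in the configurator.
const ALLOWED = {
  services: ["website", "automation", "branding"],
  industry: ["hospitality", "retail", "other"],
  timeline: ["asap", "1-3 months", "flexible"],
} as const;

type Selections = { services: string[]; industry?: string; timeline?: string };
type SelectionUpdate = { services?: string[]; industry?: string; timeline?: string };

// Returns the value only if it is one of the predefined options.
function pick(list: readonly string[], value: string | undefined): string | undefined {
  return value !== undefined && list.indexOf(value) !== -1 ? value : undefined;
}

// Merge one update_selections call into the current state without mutating it.
// Unknown values are dropped; fields the conversation never touched stay empty.
function applySelections(current: Selections, update: SelectionUpdate): Selections {
  const services = current.services.slice();
  for (const s of update.services ?? []) {
    if (pick(ALLOWED.services, s) !== undefined && services.indexOf(s) === -1) services.push(s);
  }
  const next: Selections = { services };
  const industry = pick(ALLOWED.industry, update.industry) ?? current.industry;
  const timeline = pick(ALLOWED.timeline, update.timeline) ?? current.timeline;
  if (industry !== undefined) next.industry = industry;
  if (timeline !== undefined) next.timeline = timeline;
  return next;
}
```

Because nothing here requires every field, a conversation that never mentions timeline simply produces a `Selections` object without one.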

### Brief Generation

`conversationSummary` is the **primary payload**. The prompt instructs the agent to include everything discussed: pain points, current tools, what they want to keep vs. change, business context, decision-makers, what success looks like. Structured fields (`services`, `industry`, `timeline`) are metadata that help organize the brief, not the substance.
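To illustrate the "summary first, fields as metadata" split, here is a sketch of how the brief input might be assembled. `BriefRequest`, `BriefMetadata`, and `buildBriefRequest` are hypothetical; the actual request shape accepted by `/api/configure` is not specified in this document.

```typescript
type BriefMetadata = { services?: string[]; industry?: string; timeline?: string };

// Hypothetical shape: the summary is mandatory, everything else is optional context.
type BriefRequest = { conversationSummary: string; metadata: BriefMetadata };

function buildBriefRequest(summary: string, selections: BriefMetadata): BriefRequest {
  const conversationSummary = summary.trim();
  // Without a summary there is nothing to write the brief from.
  if (conversationSummary === "") throw new Error("conversationSummary is required");

  // Copy over only the fields the conversation actually produced.
  const metadata: BriefMetadata = {};
  if (selections.services && selections.services.length > 0) {
    metadata.services = selections.services.slice();
  }
  if (selections.industry) metadata.industry = selections.industry;
  if (selections.timeline) metadata.timeline = selections.timeline;

  return { conversationSummary, metadata };
}
```

Empty structured fields are omitted rather than sent as blanks, so the brief generator never has to distinguish "not discussed" from "discussed but empty".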

### Brief Content Philosophy

The brief should be **diagnostic, not prescriptive:**

- **Deep on their world:** pain points, current tools, what's broken, customers, what success looks like.
- **Deep on what matters:** priorities and trade-offs surfaced in conversation.
- **LetsBe's perspective:** a few sentences of informed opinion on what the real problem is.
- **High-level on implementation:** no stack recommendations, no architecture, no specific deliverables.
- **No timeline/cost:** "that's what the call is for."

The brief should make the user feel understood and make the follow-up call feel like a warm continuation, not a cold intro.

### Contact Collection

The agent asks for name and email when the conversation reaches a natural conclusion: "I think I've got a great picture of what you need. Let me put a brief together. What's your name and email?" No forced timing. The `request_contact` tool surfaces the on-screen card for verification.

### Language

Both EN and FR system prompts, same as now.

## Reconnection Handling

Exploratory conversations run longer than form-filling sessions. If the WebSocket drops mid-conversation:

- Preserve the transcript on disconnect.
- Show a "reconnect" option instead of just an error.
- On reconnect, seed the new Gemini session with the transcript so far (as context in the system prompt or initial message) so the agent can pick up where it left off.
- The structured selections captured so far are preserved in state.
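The re-seeding step could be sketched as a pure function that appends the preserved transcript to the system prompt. `TranscriptEntry`, `buildResumePrompt`, the wording of the resume instructions, and the `maxEntries` cap are all assumptions for illustration; the sketch makes no Gemini API calls.

```typescript
type TranscriptEntry = { role: "user" | "agent"; text: string };

// Build the system prompt for a reconnected session.
// `maxEntries` is an assumed cap to keep the seeded prompt bounded.
function buildResumePrompt(
  basePrompt: string,
  transcript: TranscriptEntry[],
  maxEntries: number = 40
): string {
  // Nothing said yet: a fresh session needs no resume context.
  if (transcript.length === 0) return basePrompt;

  // Keep only the most recent turns, always at least one.
  const recent = transcript.slice(-Math.max(1, maxEntries));
  const lines = recent.map(
    (entry) => `${entry.role === "user" ? "User" : "Agent"}: ${entry.text.trim()}`
  );

  return [
    basePrompt,
    "",
    "The connection dropped and was restored. Conversation so far:",
    ...lines,
    "",
    "Continue from where the conversation left off. Do not greet the user again or repeat earlier questions.",
  ].join("\n");
}
```

The same string could instead be sent as the initial message, which is the other option the list above mentions; the function is indifferent to where its output is used.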

## Technical Changes

### Files to Modify

- **`VoiceAgentProvider.tsx`**: Refactor `handleToolCall` so `conversationSummary` is the primary brief input. Add state for the contact confirmation card (name + email captured, pending user confirmation). Add reconnection logic (preserve transcript, re-seed on reconnect). Connection teardown on brief completion is already fixed.

- **`VoiceAgent.tsx`**: New layout with a larger orb, a bigger transcript area, and no selection chips. Add the contact confirmation card component (inline editable name + email). Fix autoscroll with `scrollIntoView`. Guard controls for the brief-complete state (already done). Mobile-responsive layout.

- **`gemini-live.ts`**: Rewrite `buildSystemPrompt()` for both locales with a consultative tone. Adjust the `complete_brief` tool description to emphasize `conversationSummary`. Add a `request_contact` tool declaration that surfaces the on-screen card.

- **`WizardContainer.tsx`**: Remove the `ModeToggle` component import, the `mode` state, and the voice mode rendering branch. Remove `handleVoiceComplete` and the `VoiceAgentProvider` wrapper (these move to the new section).

- **`ModeToggle.tsx`**: Delete entirely.

- **New: Discovery section component**: New section component for the landing page with warm copy, CTA, and expandable voice panel. This is where `VoiceAgentProvider` and `VoiceAgent` now live.

- **Landing page**: Add the new discovery section at the appropriate position.

- **i18n message files** (`en.json`, `fr.json`): Add translations for the discovery section copy. Update voice-related strings as needed.

- **Email template**: Verify that the brief email template handles longer, more narrative content gracefully. Adjust if needed.

### What Stays the Same

- WebSocket connection to Gemini Live API
- Audio worklet recording + playback pipeline
- `update_selections` tool (used silently now)
- `/api/configure` route and brief generation logic
- `/api/gemini-token` route
- `StepComplete` component
- `analyze_website` tool (still useful when someone mentions their current site)
- The typed configurator (minus the mode toggle)