Josh Estrada
Enterprise SaaS · Sole Product Designer

AI Voice Assisted Task Creation

Field teams dictate work orders instead of typing them. The AI handles the rest.

Product UI is proprietary. Diagrams represent the design thinking, not the actual interface.

System Architecture

Designing the AI's behavior

The system prompt is where the real design work lives. It tells the AI how to parse messy speech into validated work orders.

Voice Input
"Fix the leaky faucet at unit 4B on Maple Street, assign it to John, it's pretty urgent"
unstructured · ambiguous names · implicit priority · natural language

↓ processes

System Prompt — the designer's artifact

Parsing Behavior
- Extract intent, entities, and field values from natural speech
- Validate against the database — assignees, properties, categories
- Return a human preview and structured JSON simultaneously

Lookup Disambiguation
- Exact match → populate field, confirm to user
- Fuzzy match → present numbered options for voice selection
- No match → set a default, inform the user, remind them they can edit
- Emergency → route to emergency services immediately

Conversation Rules
- Keyboard fallback available at any point
- Voice edits to the preview before form submission
- Two-stage review: conversational → populated form

↓ outputs

Structured Output

Human Preview
- Title: Fix leaky faucet
- Property: 4B Maple Street ✓
- Assignee: John Lee ✓
- Priority: High (inferred)
- Due: Today (inferred)

Form JSON
{ "title": "Fix leaky faucet", "property_id": 2847, "assignee_id": 156, "priority": "high", "due_date": "2026-03-14" }
System prompt as the design deliverable

Instead of static mockups, the system prompt defined the AI's behavior — parsing, disambiguation, error handling, conversation flow. Engineering used it as their production starting point.
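As a rough illustration, the prompt's output contract can be checked mechanically on the engineering side. This is a minimal Python sketch, not the production code; the field names come from the Form JSON example above, and the validator itself is an assumption:

```python
import json

# Field names taken from the Form JSON example; the validator is illustrative.
REQUIRED_FIELDS = {"title", "property_id", "assignee_id", "priority", "due_date"}

def validate_work_order(raw: str) -> dict:
    """Parse the model's structured output and confirm every field is present."""
    order = json.loads(raw)
    missing = REQUIRED_FIELDS - order.keys()
    if missing:
        raise ValueError(f"model output missing fields: {sorted(missing)}")
    return order

sample = ('{"title": "Fix leaky faucet", "property_id": 2847, '
          '"assignee_id": 156, "priority": "high", "due_date": "2026-03-14"}')
order = validate_work_order(sample)  # raises if the model dropped a field
```

A check like this gives engineering a concrete contract to test against while the prompt itself is still being iterated on.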

Two-stage review for trust

Users see a conversational preview first, then the populated form. Research confirmed people wanted AI to infer fields but wanted final sign-off before anything was created.

Keyboard fallback at any point

The voice-first flow never blocks users. They can switch to keyboard input at any time, so the AI-first approach doesn't become a cage.

Director Scrutiny Passed

Product directors grilled us on the edge cases and the research methodology. They had questions on every scenario; the research held up.

Prompt → Production

Engineering used the system prompt from prototyping as their production starting point

Methodology Shift

The UX team started using the working-prototype workflow for their own AI feature testing

Component anatomy

What the system prompt actually controls

The system prompt is a design artifact. It specifies how the AI parses speech, resolves ambiguous names, handles lookup failures, and manages the conversation.

Voice Input
Transcription

"Fix the leaky faucet at unit 4B on Maple Street, assign it to John, it's pretty urgent"

Voice / Keyboard input toggle
"What did the user say?"
VOICE INPUT
Speech-to-text transcription via API

Users tap to record and describe the task in their own words. The app transcribes the speech, then passes it to the AI as raw text. That text is messy by nature: names are ambiguous, priorities are implied, and phrasing is casual.

Design decision: Keyboard fallback is always visible. Voice-first doesn't mean voice-only — users can switch input modalities at any point without losing context.
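The hand-off from recording to the model can be sketched as below. Every function name here is a placeholder — the actual speech-to-text and model APIs are not part of this write-up — but the shape of the pipeline matches the description above:

```python
def transcribe(audio: bytes) -> str:
    """Placeholder for the speech-to-text API call."""
    return ("Fix the leaky faucet at unit 4B on Maple Street, "
            "assign it to John, it's pretty urgent")

def run_model(system_prompt: str, transcript: str) -> str:
    """Placeholder for the LLM call: system prompt plus raw transcript go in."""
    return f"[model sees prompt + transcript: {transcript!r}]"

def create_task_from_voice(audio: bytes, system_prompt: str) -> str:
    # The transcript is passed to the AI as raw text, with no pre-cleaning:
    # the system prompt is what handles the messiness.
    return run_model(system_prompt, transcribe(audio))
```

The design choice worth noting is that all interpretation lives in the prompt, not in pre-processing code, which is why the prompt is the artifact that gets designed and reviewed.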
Lookup Result
assignee: "John"
1) John Lee (Maintenance Tech)
2) Jon Li (Property Manager)
3) John Lee Jr. (Contractor)
AI presents numbered options: "Did you mean 1) John Lee, 2) Jon Li, or 3) John Lee Jr.?"
"Who did they mean?"
AI BEHAVIOR
Lookup disambiguation decision tree

Assignees, properties, and categories all need to match real database records. When the AI finds an exact match, it populates the field automatically. Fuzzy matches get presented as numbered options so the user can select by voice. If nothing matches, the AI sets a sensible default and tells the user why.

Voice-specific constraint: Phonetically similar names (John Lee vs Jon Li) are indistinguishable in speech. Numbered options let users say "one" or "two" to select — critical for a voice-first interface.
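The decision tree fits in a few lines. This is a minimal Python sketch using difflib's string similarity for the fuzzy branch; the assignee table, the IDs, and the cutoff value are illustrative assumptions, not the production lookup:

```python
import difflib

# Illustrative assignee table; the real lookup hits the database.
ASSIGNEES = {"John Lee": 156, "Jon Li": 201, "John Lee Jr.": 312}

def resolve_assignee(spoken: str) -> dict:
    """Exact match -> populate; fuzzy match -> numbered options; none -> default."""
    if spoken in ASSIGNEES:
        return {"status": "exact", "assignee_id": ASSIGNEES[spoken]}
    close = difflib.get_close_matches(spoken, list(ASSIGNEES), n=3, cutoff=0.4)
    if close:
        # Numbered so the user can answer "one", "two", "three" by voice.
        return {"status": "ambiguous", "options": dict(enumerate(close, start=1))}
    return {"status": "none", "assignee_id": None}  # set default, inform user
```

Calling `resolve_assignee("John")` returns the ambiguous branch with numbered candidates, mirroring the "Did you mean 1) John Lee..." behavior above.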
Review
Stage 1: Conversational Preview
AI summarizes: "I'll create a high-priority task to fix the leaky faucet at 4B Maple Street, assigned to John Lee, due today."
Edit by voice · Looks good →
Stage 2: Populated Form
Title: Fix leaky faucet
Property: 4B Maple St
Assignee: John Lee
Priority: High ⚡
Submit task
"Does this look right?"
AI BEHAVIOR
Two-stage review for user trust

User research confirmed people wanted the AI to infer fields like description, priority, and due date. But they also wanted final sign-off before anything was created. The two-stage review gives users a conversational confirmation first, then the populated form for final review and edits.

Showing the AI's work: Inferred fields are visually distinguished from matched fields. Users can see exactly what the AI assumed vs. what it validated against the database.
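The stage-1 summary can be generated directly from the structured fields, which keeps the conversational preview and the populated form in sync. A minimal Python sketch; the field names and values are taken from the review example above, the function itself is hypothetical:

```python
def conversational_preview(order: dict) -> str:
    """Stage 1 of the review: restate the parsed task in plain language."""
    title = order["title"][0].lower() + order["title"][1:]  # mid-sentence casing
    return (f"I'll create a {order['priority']}-priority task to "
            f"{title} at {order['property']}, "
            f"assigned to {order['assignee']}, due {order['due']}.")

order = {"title": "Fix leaky faucet", "property": "4B Maple Street",
         "assignee": "John Lee", "priority": "high", "due": "today"}
preview = conversational_preview(order)
# Mirrors the stage-1 summary shown before the populated form.
```

Because both review stages render from the same structured record, a voice edit in stage 1 automatically carries through to the form in stage 2.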