| 1 | +--- |
| 2 | +title: Email address reading |
| 3 | +subtitle: Get your voice agent to collect, read back, and confirm email addresses clearly |
| 4 | +slug: assistants/email-address-reading |
| 5 | +--- |
| 6 | + |
| 7 | +## Overview |
| 8 | + |
| 9 | +Email addresses are one of the trickiest pieces of information for a voice agent to handle. They contain special characters (`@`, `.`, `-`, `_`), mixed-case text, and domain names that text-to-speech (TTS) engines often mispronounce or blur together when spoken aloud. |
| 10 | + |
| 11 | +This guide covers two sides of the problem: |
| 12 | + |
| 13 | +- **Built-in formatting** -- Vapi automatically transforms email characters for TTS so they sound natural. |
| 14 | +- **Prompt engineering** -- You instruct the LLM *how* to collect, read back, and confirm emails in conversation so users feel confident their address was captured correctly. |
| 15 | + |
| 16 | +## How Vapi handles emails automatically |
| 17 | + |
| 18 | +Vapi's [voice formatting plan](/assistants/voice-formatting-plan) includes a built-in `formatEmails` step that runs before text reaches the TTS provider. It replaces `@` with "at" and `.` with "dot" so the spoken output is intelligible without any prompt changes. |
| 19 | + |
| 20 | +| Raw LLM output | What the user hears | |
| 21 | +|---|---| |
| 22 | +| `john.doe@example.com` | "john dot doe at example dot com" | |
| 23 | +| `SALES@company.org` | "SALES at company dot org" | |
| 24 | + |
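As a rough illustration of the transformation shown in the table above, the following Python sketch replaces `@` and `.` with spoken words inside email-like tokens. It is not Vapi's actual `formatEmails` implementation; the function name `email_to_spoken` and the regular expression are assumptions made for this example.

```python
import re

# Hypothetical helper, not part of any Vapi SDK. The pattern is a loose
# approximation of an email address, good enough for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def email_to_spoken(text: str) -> str:
    """Replace '@' with ' at ' and '.' with ' dot ' inside email-like tokens."""

    def speak(match):
        # Only characters inside the matched address are rewritten, so
        # sentence punctuation elsewhere in the text is left alone.
        return match.group(0).replace("@", " at ").replace(".", " dot ")

    return EMAIL_RE.sub(speak, text)


print(email_to_spoken("john.doe@example.com"))
print(email_to_spoken("SALES@company.org"))
```

A sketch like this can be handy for previewing how an address will sound before you test it in a live call.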
| 25 | +<Tip> |
| 26 | +The `formatEmails` formatter is enabled by default. You do not need to configure anything for basic email reading to work. The rest of this guide focuses on the **prompt-level techniques** that make the full collection-and-confirmation flow reliable. |
| 27 | +</Tip> |
| 28 | + |
| 29 | +## Why prompt engineering still matters |
| 30 | + |
| 31 | +Even though TTS formatting handles the character-level pronunciation, the LLM still controls *how* the conversation flows. Without explicit instructions, the agent might: |
| 32 | + |
| 33 | +- Read the email once at normal speed and move on, leaving the user unsure. |
| 34 | +- Fail to spell out ambiguous parts (was it "Jon" or "John"?). |
| 35 | +- Mispronounce uncommon domain names. |
| 36 | +- Skip a confirmation step entirely. |
| 37 | + |
| 38 | +Good prompt instructions solve these problems at the conversational level. |
| 39 | + |
| 40 | +## System prompt: collecting an email |
| 41 | + |
| 42 | +When asking a user for their email, instruct the agent to be patient and explicit about what it needs. The following snippet can be added to your system prompt. |
| 43 | + |
| 44 | +```md wordWrap title="System prompt -- collecting email" |
| 45 | +[Email Collection] |
| 46 | +When you need to collect the user's email address: |
| 47 | +1. Ask clearly: "Could you please tell me your email address?" |
| 48 | +2. Listen to the full response before repeating anything back. |
| 49 | +3. Once you have the email, read it back using these pronunciation rules: |
| 50 | + - Say "@" as "at" |
| 51 | + - Say "." as "dot" |
| 52 | + - Say "-" as "dash" |
| 53 | + - Say "_" as "underscore" |
| 54 | +4. After reading it back, ask "Is that correct?" |
| 55 | +5. If the user says no, ask them to spell it out letter by letter. |
| 56 | +6. Never guess or autocorrect the email. Use exactly what the user provides. |
| 57 | +``` |
| 58 | + |
| 59 | +## System prompt: reading back and confirming an email |
| 60 | + |
| 61 | +The confirmation step is where agents most often fail: they read the email too fast, or only once. This snippet teaches the agent to slow down and spell when needed. |
| 62 | + |
| 63 | +```md wordWrap title="System prompt -- confirming email" |
| 64 | +[Email Confirmation] |
| 65 | +When reading an email address back to the user: |
| 66 | +1. Speak slowly and clearly. Pause briefly between each part of the email |
| 67 | + (username, "at", domain, "dot", extension). |
| 68 | +2. For the username part, if it contains common words, say the words. |
| 69 | + If it is ambiguous or uncommon, spell it out letter by letter. |
| 70 | + For example: |
| 71 | + - "john.doe" → "john dot doe" |
| 72 | + - "jdoe42" → "j, d, o, e, four, two" |
| 73 | + - "msmith" → "m, s, m, i, t, h" |
| 74 | +3. For the domain, use the familiar name if it is a well-known provider: |
| 75 | + - "gmail.com" → "gmail dot com" |
| 76 | + - "yahoo.com" → "yahoo dot com" |
| 77 | + - "outlook.com" → "outlook dot com" |
| 78 | + - "hotmail.com" → "hotmail dot com" |
| 79 | + If the domain is uncommon, spell it out letter by letter. |
| 80 | +4. Always end with: "Is that correct?" |
| 81 | +5. If the user corrects any part, repeat the entire email back again |
| 82 | + after applying the correction. |
| 83 | +``` |
| 84 | + |
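To generate consistent read-back examples for your prompt (or to check what a spelled-out username should sound like), a small helper along these lines may be useful. This is a minimal sketch; `spell_out` and the lookup tables are hypothetical names, and the symbol words follow the pronunciation rules from the prompts above.

```python
# Hypothetical helper for producing letter-by-letter read-backs.
DIGIT_NAMES = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}
SYMBOL_NAMES = {"@": "at", ".": "dot", "-": "dash", "_": "underscore"}


def spell_out(part: str) -> str:
    """Spell a piece of an email address one character at a time."""
    spoken = []
    for ch in part:
        if ch in DIGIT_NAMES:
            spoken.append(DIGIT_NAMES[ch])   # digits are named individually
        elif ch in SYMBOL_NAMES:
            spoken.append(SYMBOL_NAMES[ch])  # special characters get their word
        else:
            spoken.append(ch.lower())        # case is not announced
    return ", ".join(spoken)


print(spell_out("jdoe42"))
print(spell_out("msmith"))
```

The commas in the output mirror the pacing used in the prompt examples, where each letter is followed by a brief pause.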
| 85 | +## Spelling out letter by letter |
| 86 | + |
| 87 | +For ambiguous usernames or unfamiliar domains, letter-by-letter spelling removes all doubt. Add this instruction to your prompt so the agent knows when and how to spell. |
| 88 | + |
| 89 | +```md wordWrap title="System prompt -- letter-by-letter spelling" |
| 90 | +[Letter-by-Letter Spelling] |
| 91 | +When spelling out part of an email: |
| 92 | +- Say each letter individually with a brief pause between letters. |
| 93 | +- For numbers, say each digit by name ("four", "two"), not as a whole number ("forty-two"). |
| 94 | +- For uppercase vs lowercase, only mention case if the email is case-sensitive |
| 95 | + or the user specifically asks. |
| 96 | +- Use the NATO phonetic alphabet only if the user is having trouble |
| 97 | + understanding individual letters. For example: |
| 98 | + "b as in bravo, d as in delta" |
| 99 | +``` |
| 100 | + |
| 101 | +<Note> |
| 102 | +Most email providers treat addresses as case-insensitive, so you typically do not need to distinguish uppercase from lowercase. Your prompt can note this to keep the conversation simpler. |
| 103 | +</Note> |
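If you want to include phonetic spellings in an example conversation, the "b as in bravo" format can be produced with a short lookup. The sketch below is illustrative only; `spell_phonetic` is a hypothetical name, and the code words are the standard NATO spellings.

```python
# Standard NATO phonetic alphabet code words, keyed by lowercase letter.
NATO = {
    "a": "alfa", "b": "bravo", "c": "charlie", "d": "delta", "e": "echo",
    "f": "foxtrot", "g": "golf", "h": "hotel", "i": "india", "j": "juliett",
    "k": "kilo", "l": "lima", "m": "mike", "n": "november", "o": "oscar",
    "p": "papa", "q": "quebec", "r": "romeo", "s": "sierra", "t": "tango",
    "u": "uniform", "v": "victor", "w": "whiskey", "x": "x-ray",
    "y": "yankee", "z": "zulu",
}


def spell_phonetic(part: str) -> str:
    """Spell letters as 'b as in bravo'; other characters pass through as-is."""
    pieces = []
    for ch in part.lower():
        word = NATO.get(ch)
        pieces.append(f"{ch} as in {word}" if word else ch)
    return ", ".join(pieces)


print(spell_phonetic("bd"))
```

Because the prompt reserves phonetic spelling for cases where the user is struggling, this is best used for a single example line rather than for every read-back.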
| 104 | + |
| 105 | +## Handling common domains naturally |
| 106 | + |
| 107 | +You can make the agent sound more natural by teaching it to recognize popular email domains and say them as single words rather than spelling them out. |
| 108 | + |
| 109 | +```md wordWrap title="System prompt -- common domains" |
| 110 | +[Common Email Domains] |
| 111 | +When reading these domains, say them as words, not spelled out: |
| 112 | +- gmail.com → "gmail dot com" |
| 113 | +- yahoo.com → "yahoo dot com" |
| 114 | +- outlook.com → "outlook dot com" |
| 115 | +- hotmail.com → "hotmail dot com" |
| 116 | +- icloud.com → "icloud dot com" |
| 117 | +- aol.com → "A O L dot com" |
| 118 | +- protonmail.com → "proton mail dot com" |
| 119 | +For any domain not in this list, spell it out letter by letter to avoid confusion. |
| 120 | +``` |
| 121 | + |
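The rule in the prompt above (say listed domains as words, spell everything else) can be expressed as a simple lookup with a fallback. This is a sketch under stated assumptions: `speak_domain` is a hypothetical helper, the table simply copies the list from the prompt, and digits or hyphens in unlisted domains are passed through as single characters.

```python
# Spoken forms copied from the prompt's domain list.
COMMON_DOMAINS = {
    "gmail.com": "gmail dot com",
    "yahoo.com": "yahoo dot com",
    "outlook.com": "outlook dot com",
    "hotmail.com": "hotmail dot com",
    "icloud.com": "icloud dot com",
    "aol.com": "A O L dot com",
    "protonmail.com": "proton mail dot com",
}


def speak_domain(domain: str) -> str:
    """Say listed domains as words; spell any other domain letter by letter."""
    known = COMMON_DOMAINS.get(domain.lower())
    if known is not None:
        return known
    # Fallback: spell each dot-separated label, joining labels with "dot".
    labels = domain.lower().split(".")
    return " dot ".join(", ".join(label) for label in labels)


print(speak_domain("gmail.com"))
print(speak_domain("newcompany.io"))
```

Keeping the list in one place makes it easier to keep the prompt text and any test fixtures in agreement when you add a domain.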
| 122 | +## Complete example: appointment booking agent |
| 123 | + |
| 124 | +Below is a full system prompt section you can copy into your assistant configuration. It combines all the techniques above into a single, production-ready block. |
| 125 | + |
| 126 | +```md wordWrap title="Complete system prompt section" |
| 127 | +[Identity] |
| 128 | +You are Sarah, a friendly appointment scheduling assistant for Acme Dental. |
| 129 | + |
| 130 | +[Email Collection and Confirmation] |
| 131 | +When you need the user's email address: |
| 132 | +1. Ask: "What email address should we send the confirmation to?" |
| 133 | +2. Wait for the full response. Do not interrupt. |
| 134 | +3. Read the email back to the user following these rules: |
| 135 | + - Say "@" as "at" |
| 136 | + - Say "." as "dot" |
| 137 | + - Say "-" as "dash" |
| 138 | + - Say "_" as "underscore" |
| 139 | + - Speak slowly with a brief pause between each part. |
| 140 | + - For well-known domains (gmail, yahoo, outlook, hotmail, icloud), |
| 141 | + say the domain name naturally. |
| 142 | + - For unfamiliar domains, spell them out letter by letter. |
| 143 | + - For the username, if it is a recognizable name or word, say it normally. |
| 144 | + If it looks like an abbreviation or random string, spell it out letter |
| 145 | + by letter. |
| 146 | +4. After reading the email, ask: "Did I get that right?" |
| 147 | +5. If the user says no: |
| 148 | + - Ask: "Could you spell it out for me letter by letter?" |
| 149 | + - Listen carefully, then read the corrected version back. |
| 150 | + - Ask again: "Is that correct now?" |
| 151 | +6. Do not proceed to the next step until the user confirms the email. |
| 152 | +7. Never modify, autocorrect, or guess any part of the email address. |
| 153 | + |
| 154 | +[Example Conversation] |
| 155 | +Agent: "What email address should we send the confirmation to?" |
| 156 | +User: "It's jsmith42@newcompany.io" |
| 157 | +Agent: "Let me read that back. j, s, m, i, t, h, four, two ...at... n, e, w, c, o, m, |
| 158 | +       p, a, n, y ...dot... i, o. Did I get that right?" |
| 159 | +User: "Yes, that's correct." |
| 160 | +``` |
| 161 | + |
| 162 | +<Tip> |
| 163 | +Including an example conversation in your system prompt helps the LLM understand the exact pacing and format you expect. This is one of the most effective techniques for consistent behavior. |
| 164 | +</Tip> |
| 165 | + |
| 166 | +## Using pronunciation dictionaries for domains |
| 167 | + |
| 168 | +If your agents frequently encounter a specific company or domain name that TTS mispronounces, you can use [pronunciation dictionaries](/assistants/pronunciation-dictionaries) (available with ElevenLabs voices) to set the correct pronunciation at the TTS level. |
| 169 | + |
| 170 | +For example, if the domain "vapi.ai" is being pronounced as "vappy dot ay-eye", you could create an alias rule: |
| 171 | + |
| 172 | +```json title="Pronunciation dictionary rule" |
| 173 | +{ |
| 174 | + "rules": [ |
| 175 | + { |
| 176 | + "stringToReplace": "vapi", |
| 177 | + "type": "alias", |
| 178 | + "alias": "vaahpee" |
| 179 | + } |
| 180 | + ] |
| 181 | +} |
| 182 | +``` |
| 183 | + |
| 184 | +This approach is complementary to prompt engineering -- pronunciation dictionaries fix TTS-level pronunciation, while prompt instructions control the conversational flow. |
| 185 | + |
| 186 | +## Using custom keywords for transcription accuracy |
| 187 | + |
| 188 | +If the speech-to-text (STT) transcriber is mishearing specific email domains or usernames, [custom keywords](/customization/custom-keywords) can boost transcription accuracy for those terms. |
| 189 | + |
| 190 | +For example, if users frequently mention their company email domain "contoso.com" and the transcriber misinterprets it, you can add "contoso" as a custom keyword to improve recognition. |
| 191 | + |
| 192 | +## Best practices |
| 193 | + |
| 194 | +<AccordionGroup> |
| 195 | + <Accordion title="Always confirm the full email address"> |
| 196 | + Never assume an email is correct after hearing it once. Always read the |
| 197 | +    complete email back and wait for confirmation before proceeding. This single |
| 198 | +    step catches many email capture errors before they cause problems downstream. |
| 199 | + </Accordion> |
| 200 | + |
| 201 | + <Accordion title="Use a two-pass approach for difficult emails"> |
| 202 | + First, try reading the email back naturally (words and common domains). |
| 203 | + If the user says it is wrong, switch to letter-by-letter spelling for |
| 204 | + the entire address. This keeps simple emails fast while still handling |
| 205 | + complex ones reliably. |
| 206 | + </Accordion> |
| 207 | + |
| 208 | + <Accordion title="Do not autocorrect or assume"> |
| 209 | + Instruct the agent to never modify any part of the email address. |
| 210 | + Common mistakes include changing "jon" to "john" or assuming ".com" |
| 211 | + when the user said ".co". Treat the email as an exact string. |
| 212 | + </Accordion> |
| 213 | + |
| 214 | + <Accordion title="Handle interruptions gracefully"> |
| 215 | + Users sometimes interrupt mid-readback with a correction. Instruct the |
| 216 | + agent to accept the correction, incorporate it, and then restart the |
| 217 | + full readback from the beginning so both parties are aligned. |
| 218 | + </Accordion> |
| 219 | + |
| 220 | + <Accordion title="Keep voice formatting enabled"> |
| 221 | + Vapi's built-in `formatEmails` transformer handles the TTS-level |
| 222 | + conversion of "@" and "." automatically. Disabling the voice formatting |
| 223 | + plan will cause the TTS to receive raw characters, which may produce |
| 224 | + garbled output. Keep `voice.chunkPlan.formatPlan.enabled` set to `true` |
| 225 | + (the default). |
| 226 | + </Accordion> |
| 227 | +</AccordionGroup> |
| 228 | + |
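The two-pass approach from the best practices above can be sketched as a function that picks a read-back style based on the attempt number. This is only an illustration of the idea, not how the LLM produces speech; `read_back` and its helpers are hypothetical names.

```python
SYMBOLS = {".": "dot", "-": "dash", "_": "underscore"}
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def natural(part: str) -> str:
    """First pass: keep words intact, only replace special characters."""
    for symbol, word in SYMBOLS.items():
        part = part.replace(symbol, f" {word} ")
    return " ".join(part.split())  # collapse any doubled spaces


def spelled(part: str) -> str:
    """Second pass: spell every character individually."""
    out = []
    for ch in part.lower():
        if ch in SYMBOLS:
            out.append(SYMBOLS[ch])
        elif ch in "0123456789":
            out.append(DIGITS[int(ch)])
        else:
            out.append(ch)
    return ", ".join(out)


def read_back(email: str, attempt: int) -> str:
    """Attempt 1 reads naturally; later attempts spell the whole address."""
    local, _, domain = email.partition("@")
    if attempt == 1:
        return f"{natural(local)} at {natural(domain)}"
    return f"{spelled(local)}, at, {spelled(domain)}"


print(read_back("john.doe@example.com", 1))
print(read_back("john.doe@example.com", 2))
```

Note that the address itself is never altered between passes; only the way it is spoken changes, which matches the rule against autocorrecting.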
| 229 | +## Common issues |
| 230 | + |
| 231 | +<AccordionGroup> |
| 232 | + <Accordion title="TTS reads the email as a URL or gibberish"> |
| 233 | + This usually happens when voice formatting is disabled. Verify that |
| 234 | + `voice.chunkPlan.formatPlan.enabled` is set to `true` (the default). |
| 235 | + See the [voice formatting plan](/assistants/voice-formatting-plan) for |
| 236 | + details. |
| 237 | + </Accordion> |
| 238 | + |
| 239 | + <Accordion title="Agent skips the confirmation step"> |
| 240 | + Add an explicit instruction like "Do not proceed until the user confirms |
| 241 | + the email" to your system prompt. Reinforcing this with an example |
| 242 | + conversation in the prompt helps the LLM follow the flow consistently. |
| 243 | + </Accordion> |
| 244 | + |
| 245 | + <Accordion title="Agent modifies or autocorrects the email"> |
| 246 | + LLMs sometimes try to be helpful by fixing perceived typos. Add a clear |
| 247 | + rule: "Never modify, autocorrect, or guess any part of the email address. |
| 248 | + Use exactly what the user provides." |
| 249 | + </Accordion> |
| 250 | + |
| 251 | + <Accordion title="User says a letter but transcriber hears a different one"> |
| 252 | + Letters like "b" and "d", or "m" and "n", sound similar over phone audio. |
| 253 | + If this happens frequently, instruct the agent to ask the user to use |
| 254 | + the NATO phonetic alphabet ("b as in bravo") or use |
| 255 | + [custom keywords](/customization/custom-keywords) to improve |
| 256 | + transcription accuracy for commonly confused terms. |
| 257 | + </Accordion> |
| 258 | +</AccordionGroup> |
| 259 | + |
| 260 | +## Next steps |
| 261 | + |
| 262 | +Now that your agent handles email addresses reliably, explore these related guides: |
| 263 | + |
| 264 | +- **[Prompting guide](/prompting-guide)** -- General techniques for writing effective voice AI prompts. |
| 265 | +- **[Voice formatting plan](/assistants/voice-formatting-plan)** -- Understand and customize how Vapi formats text for TTS. |
| 266 | +- **[Pronunciation dictionaries](/assistants/pronunciation-dictionaries)** -- Fine-tune pronunciation for specific words and names. |
| 267 | +- **[Custom keywords](/customization/custom-keywords)** -- Improve transcription accuracy for specific terms. |