Voice Mode API
Control voice interactions with the avatar through the postMessage API.
The Voice Mode API lets a parent frame control the embedded avatar's voice features, which suits custom applications such as desktop companions or kiosk displays.
Voice Modes
| Mode | Behavior | Use Case |
|---|---|---|
| continuous | Always listening with VAD (Voice Activity Detection) | Hands-free conversation |
| pushToTalk | Manual recording control | Noisy environments, precise control |
postMessage Commands
Send commands to the avatar iframe:
const iframe = document.getElementById('avatar-frame');
// Send a command
function sendToAvatar(type, data = {}) {
  iframe.contentWindow?.postMessage({ type, ...data }, '*');
}
Available Commands
| Command | Parameters | Description |
|---|---|---|
| setVoiceMode | { mode: 'continuous' \| 'pushToTalk' } | Switch between voice modes |
| startVoice | { mode?: string } | Start voice mode |
| stopVoice | - | Stop voice mode |
| startRecording | - | Start PTT recording |
| stopRecording | - | Stop PTT recording and send |
| stopSpeaking | - | Interrupt avatar speech |
| getState | - | Request current state |
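The command table can be enforced in code before anything is posted. The sketch below is one possible wrapper, not part of the documented API: `VOICE_MODES`, `COMMANDS`, `buildCommand`, and `createSender` are hypothetical names, and the target origin `https://avatarium.ai` is an assumption taken from the embed URL used on this page. It also assumes `startVoice` accepts only the two documented modes. Passing an explicit origin instead of `'*'` keeps commands from reaching an unexpected document if the iframe ever navigates.

```javascript
// Hypothetical wrapper; command names and parameters come from the table above.
const VOICE_MODES = ['continuous', 'pushToTalk'];

// How each command treats the `mode` parameter.
const COMMANDS = new Map([
  ['setVoiceMode', 'required'],
  ['startVoice', 'optional'],
  ['stopVoice', 'none'],
  ['startRecording', 'none'],
  ['stopRecording', 'none'],
  ['stopSpeaking', 'none'],
  ['getState', 'none'],
]);

// Build the message payload, throwing on unknown commands or invalid modes.
function buildCommand(type, data = {}) {
  const modeRule = COMMANDS.get(type);
  if (modeRule === undefined) throw new Error(`Unknown command: ${type}`);
  if (modeRule === 'none') return { type };
  if (data.mode === undefined) {
    if (modeRule === 'required') throw new Error(`${type} requires a mode`);
    return { type };
  }
  if (!VOICE_MODES.includes(data.mode)) throw new Error(`Invalid mode: ${data.mode}`);
  return { type, mode: data.mode };
}

// Bind a sender to one target window and one explicit target origin.
function createSender(targetWindow, targetOrigin) {
  return (type, data) => targetWindow.postMessage(buildCommand(type, data), targetOrigin);
}

// Browser usage (origin assumed from the embed URL):
// const send = createSender(iframe.contentWindow, 'https://avatarium.ai');
// send('setVoiceMode', { mode: 'pushToTalk' });
```

Validating in the parent frame means a typo such as `startvoice` fails loudly at the call site instead of being silently ignored by the iframe.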
Examples
// Start continuous listening
sendToAvatar('startVoice', { mode: 'continuous' });
// Stop listening
sendToAvatar('stopVoice');
Event Listening
Listen for events from the avatar:
window.addEventListener('message', (event) => {
  if (event.data?.source !== 'avatarium') return;
  const { type, state, mode } = event.data;
  switch (type) {
    case 'ready':
      console.log('Avatar ready');
      break;
    case 'stateChanged':
      console.log('State:', state);
      // States: IDLE, LISTENING, RECORDING, PROCESSING, THINKING, SPEAKING
      break;
    case 'modeChanged':
      console.log('Mode:', mode);
      // Modes: continuous, pushToTalk
      break;
  }
});
Event Types
| Event | Data | Description |
|---|---|---|
| ready | { avatarId, capabilities } | Avatar loaded and ready |
| stateChanged | { state } | Voice state changed |
| modeChanged | { mode } | Voice mode changed |
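Because any frame can post a message with `source: 'avatarium'`, it can help to check the sender's origin and keep only the fields listed for each event. The following is a minimal sketch under stated assumptions: `EVENT_FIELDS` and `parseAvatarEvent` are hypothetical names, the expected origin is assumed to match the embed URL (`https://avatarium.ai`), and the event fields are assumed to sit at the top level of `event.data`, as in the listener shown earlier.

```javascript
// Hypothetical helper; field names follow the Event Types table above.
const EVENT_FIELDS = new Map([
  ['ready', ['avatarId', 'capabilities']],
  ['stateChanged', ['state']],
  ['modeChanged', ['mode']],
]);

// Returns { type, ...fields } for a recognized avatar event, or null otherwise.
function parseAvatarEvent(event, expectedOrigin) {
  if (event.origin !== expectedOrigin) return null; // message came from another origin
  const data = event.data;
  if (!data || data.source !== 'avatarium') return null; // not an avatar message
  const fields = EVENT_FIELDS.get(data.type);
  if (!fields) return null; // unknown event type
  const parsed = { type: data.type };
  for (const name of fields) parsed[name] = data[name];
  return parsed;
}

// Browser usage (origin assumed from the embed URL):
// window.addEventListener('message', (e) => {
//   const evt = parseAvatarEvent(e, 'https://avatarium.ai');
//   if (evt?.type === 'stateChanged') console.log('State:', evt.state);
// });
```

Returning `null` for anything unrecognized lets the listener ignore unrelated messages from browser extensions or other embedded widgets.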
Voice States
IDLE → LISTENING → RECORDING → PROCESSING → THINKING → SPEAKING → IDLE
       ↑                                                    ↓
       └────────────────────────────────────────────────────┘
| State | Description |
|---|---|
| IDLE | Waiting for input |
| LISTENING | VAD active, waiting for speech |
| RECORDING | Capturing user audio |
| PROCESSING | Transcribing audio (STT) |
| THINKING | AI generating response |
| SPEAKING | Avatar speaking response |
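A parent page usually wants one place that remembers the latest state and answers simple questions about it. The sketch below is illustrative only: `createStateTracker`, `isBusy`, and `canInterrupt` are hypothetical names, the initial state is assumed to be `IDLE`, and the grouping of states into "busy" is an interpretation of the table above rather than documented behavior.

```javascript
// State names from the Voice States table above.
const VOICE_STATES = ['IDLE', 'LISTENING', 'RECORDING', 'PROCESSING', 'THINKING', 'SPEAKING'];

// Hypothetical tracker fed by stateChanged events.
function createStateTracker(onChange = () => {}) {
  let current = 'IDLE'; // assumption: the avatar starts idle
  return {
    get state() {
      return current;
    },
    // Apply a reported state; returns true only when the state actually changed.
    update(next) {
      if (!VOICE_STATES.includes(next) || next === current) return false;
      const previous = current;
      current = next;
      onChange(next, previous);
      return true;
    },
    // The avatar is mid-turn while transcribing, generating, or speaking.
    get isBusy() {
      return ['PROCESSING', 'THINKING', 'SPEAKING'].includes(current);
    },
    // stopSpeaking only has something to interrupt while the avatar speaks.
    get canInterrupt() {
      return current === 'SPEAKING';
    },
  };
}

// Browser usage: call tracker.update(event.data.state) inside the
// 'stateChanged' branch of the message listener.
```

Ignoring unknown or repeated states keeps UI updates idempotent if the iframe ever reports the same state twice.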
Complete Example
<!DOCTYPE html>
<html>
<head>
  <title>Voice Mode Demo</title>
</head>
<body>
  <iframe
    id="avatar-frame"
    src="https://avatarium.ai/embed/YOUR_AVATAR_ID"
    allow="microphone"
    style="width: 400px; height: 600px; border: none;"
  ></iframe>
  <div id="controls">
    <button id="voice-btn">🎤 Talk</button>
    <label>
      <input type="checkbox" id="mode-toggle" checked>
      Always-on mode
    </label>
    <span id="status">IDLE</span>
  </div>
  <script>
    const iframe = document.getElementById('avatar-frame');
    const voiceBtn = document.getElementById('voice-btn');
    const modeToggle = document.getElementById('mode-toggle');
    const status = document.getElementById('status');
    let voiceMode = 'continuous';
    let isRecording = false;
    let voiceActive = false; // whether continuous listening was started from this page

    function sendToAvatar(type, data = {}) {
      iframe.contentWindow?.postMessage({ type, ...data }, '*');
    }

    // Listen for avatar events
    window.addEventListener('message', (event) => {
      if (event.data?.source !== 'avatarium') return;
      if (event.data.type === 'stateChanged') {
        status.textContent = event.data.state;
      }
      if (event.data.type === 'modeChanged') {
        voiceMode = event.data.mode;
        modeToggle.checked = voiceMode === 'continuous';
      }
    });

    // Mode toggle
    modeToggle.addEventListener('change', (e) => {
      voiceMode = e.target.checked ? 'continuous' : 'pushToTalk';
      sendToAvatar('setVoiceMode', { mode: voiceMode });
    });

    // PTT helpers shared by the mouse and keyboard handlers
    function startPtt() {
      if (voiceMode !== 'pushToTalk' || isRecording) return;
      sendToAvatar('startRecording');
      isRecording = true;
    }
    function stopPtt() {
      if (!isRecording) return;
      sendToAvatar('stopRecording');
      isRecording = false;
    }

    // Voice button - continuous mode: click to toggle, PTT mode: hold
    voiceBtn.addEventListener('click', () => {
      if (voiceMode !== 'continuous') return;
      voiceActive = !voiceActive;
      if (voiceActive) {
        sendToAvatar('startVoice', { mode: 'continuous' });
      } else {
        sendToAvatar('stopVoice');
      }
    });
    voiceBtn.addEventListener('mousedown', startPtt);
    voiceBtn.addEventListener('mouseup', stopPtt);
    // Also stop if the pointer leaves the button while it is held down
    voiceBtn.addEventListener('mouseleave', stopPtt);

    // Keyboard shortcuts
    document.addEventListener('keydown', (e) => {
      if (e.code === 'Space' && voiceMode === 'pushToTalk') {
        e.preventDefault();
        if (!e.repeat) startPtt(); // ignore auto-repeat while the key is held
      }
      if (e.code === 'Escape') {
        sendToAvatar('stopSpeaking');
      }
    });
    document.addEventListener('keyup', (e) => {
      if (e.code === 'Space' && voiceMode === 'pushToTalk') {
        e.preventDefault();
        stopPtt();
      }
    });
  </script>
</body>
</html>
Best Practices
Push-to-Talk for noisy environments: Use PTT mode when background noise could trigger false activations in continuous mode.
Keyboard Shortcuts
Recommended keyboard bindings:
| Key | Action |
|---|---|
| Space | Hold for PTT, or toggle continuous |
| Escape | Interrupt/stop avatar |
| M | Toggle mode |
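The bindings in the table can be expressed as a pure function that turns a key event into at most one command, which keeps the logic testable outside the browser. This is a sketch under stated assumptions: `commandForKey` and the `ctx.voiceActive` flag are hypothetical, and toggling continuous mode with `startVoice`/`stopVoice` is one reading of "toggle continuous". The key codes `Space`, `Escape`, and `KeyM` are standard `KeyboardEvent.code` values.

```javascript
// Hypothetical mapping from key input to a command, following the table above.
// `key` is { code, phase: 'down' | 'up', repeat }; `ctx` is { mode, voiceActive }.
// Returns { type, data } to send, or null when the key should be ignored.
function commandForKey(key, ctx) {
  if (key.repeat) return null; // a held key re-fires keydown; act only once
  const down = key.phase === 'down';
  if (key.code === 'Escape') {
    return down ? { type: 'stopSpeaking', data: {} } : null;
  }
  if (key.code === 'KeyM') {
    if (!down) return null;
    const mode = ctx.mode === 'continuous' ? 'pushToTalk' : 'continuous';
    return { type: 'setVoiceMode', data: { mode } };
  }
  if (key.code === 'Space') {
    if (ctx.mode === 'pushToTalk') {
      // Hold to talk: record while the key is down, send on release.
      return { type: down ? 'startRecording' : 'stopRecording', data: {} };
    }
    if (!down) return null;
    return ctx.voiceActive
      ? { type: 'stopVoice', data: {} }
      : { type: 'startVoice', data: { mode: 'continuous' } };
  }
  return null;
}

// Browser usage, with sendToAvatar as defined earlier on this page:
// document.addEventListener('keydown', (e) => {
//   const cmd = commandForKey({ code: e.code, phase: 'down', repeat: e.repeat }, { mode: voiceMode, voiceActive });
//   if (cmd) { e.preventDefault(); sendToAvatar(cmd.type, cmd.data); }
// });
```

A matching `keyup` listener would pass `phase: 'up'` so that releasing Space ends a push-to-talk recording.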
Visual Feedback
Always show clear visual feedback for voice state:
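.voice-btn.listening { background: #4CAF50; }
.voice-btn.recording { background: #f44336; animation: pulse 1s infinite; }
.voice-btn.processing { background: #FF9800; }
.voice-btn.speaking { background: #2196F3; }
/* Example definition for the pulse animation used above; adjust to taste */
@keyframes pulse { 0%, 100% { opacity: 1; } 50% { opacity: 0.6; } }
Error Handling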
Handle microphone permission errors gracefully:
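navigator.mediaDevices.getUserMedia({ audio: true })
  .then((stream) => {
    // Microphone available; release the probe stream so this page does not hold the mic
    stream.getTracks().forEach((track) => track.stop());
    sendToAvatar('startVoice');
  })
  .catch((err) => {
    if (err.name === 'NotAllowedError') {
      alert('Microphone access denied. Please enable it in your browser settings.');
    } else if (err.name === 'NotFoundError') {
      alert('No microphone found. Please connect one and try again.');
    } else {
      console.error('Microphone error:', err);
    }
  });
Related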
- Voice Configuration - TTS voice settings
- Embedding - Embed integration basics
- Best Practices - Production tips