Vision & Camera

Enable your avatar to see and respond to what users show through their camera or share as images.

What is Vision?

Vision allows your avatar to:

  • See the user's camera feed
  • Analyze images shared in chat
  • Respond to visual context
  • Provide feedback on what it sees

Common use cases:

  • Product support – "Show me the error on your screen"
  • Education – "Let me see your work and I'll help"
  • Retail – "Show me the item you're looking at"
  • Technical support – "Point your camera at the device"

Enabling Vision

Dashboard

  1. Go to Dashboard > Avatars > [Your Avatar] > Settings
  2. Enable Vision capabilities
  3. Choose an AI provider that supports vision:
    • GPT-4o (recommended)
    • Claude 3.5 Sonnet
    • Gemini Pro Vision
  4. Save changes

SDK

<Avatar
  model="scarlett"
  vision={{
    enabled: true,
    mode: 'camera',  // 'camera' | 'upload' | 'both'
    aiProvider: 'gpt-4o'
  }}
/>
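If the mode comes from user settings or an environment variable rather than a literal, it may help to validate it before passing it to the component. The sketch below is one way to do that. The type and helper names (VisionMode, VisionConfig, isVisionMode, buildVisionConfig) are assumptions made for illustration and are not part of the SDK; only the enabled, mode, and aiProvider fields come from the example above.

```typescript
// The three modes documented above. The type name is an assumption.
type VisionMode = 'camera' | 'upload' | 'both';

// Mirrors the fields shown in the SDK example; the real prop type may differ.
interface VisionConfig {
  enabled: boolean;
  mode: VisionMode;
  aiProvider: string;
}

const VISION_MODES: readonly string[] = ['camera', 'upload', 'both'];

// Type guard: narrows an arbitrary string to a known vision mode.
function isVisionMode(value: string): value is VisionMode {
  return VISION_MODES.indexOf(value) !== -1;
}

// Hypothetical helper: falls back to 'camera' when the input is not a known mode.
function buildVisionConfig(mode: string, aiProvider: string): VisionConfig {
  return {
    enabled: true,
    mode: isVisionMode(mode) ? mode : 'camera',
    aiProvider,
  };
}
```

The result can then be passed as the vision prop, for example vision={buildVisionConfig(savedMode, 'gpt-4o')}, where savedMode is whatever string your app stored.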

Vision Modes

Camera Mode

Real-time camera feed analysis:

<Avatar
  vision={{
    enabled: true,
    mode: 'camera',
    captureInterval: 5000,  // Analyze every 5 seconds
    autoCapture: false      // Require user to click "capture"
  }}
/>
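The example above does not spell out how captureInterval and autoCapture interact. The sketch below shows one plausible reading, stated as an assumption rather than documented SDK behavior: captureInterval acts as a minimum spacing between analyses, and autoCapture decides whether a capture happens on its own or only after the user asks for one. The CaptureSettings type and shouldCapture function are hypothetical names.

```typescript
// Fields taken from the camera mode example; the interface name is an assumption.
interface CaptureSettings {
  captureInterval: number; // minimum milliseconds between analyses
  autoCapture: boolean;    // false: wait for the user to click "capture"
}

// Hypothetical decision helper, not part of the SDK.
// Returns true when a new frame should be sent for analysis.
function shouldCapture(
  settings: CaptureSettings,
  msSinceLastCapture: number,
  userRequested: boolean
): boolean {
  // Assumed throttle: never analyze more often than captureInterval.
  if (msSinceLastCapture < settings.captureInterval) {
    return false;
  }
  // Automatic mode captures as soon as the interval has elapsed;
  // manual mode also needs an explicit user request.
  return settings.autoCapture || userRequested;
}
```

Under this reading, the configuration shown above (captureInterval: 5000, autoCapture: false) would analyze a frame only when the user clicks capture, and at most once every 5 seconds.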

Image Upload Mode

Users can share images from their device:

<Avatar
  vision={{
    enabled: true,
    mode: 'upload',
    maxFileSize: 10 * 1024 * 1024,  // 10MB
    acceptedTypes: ['image/jpeg', 'image/png', 'image/webp']
  }}
/>
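To give users immediate feedback, you may want to check a file against the same limits before handing it to the avatar. The following is a minimal sketch of such a check. The UploadRules and FileInfo types and the validateUpload function are hypothetical and not part of the SDK; FileInfo only assumes the size and type properties that a browser File object provides.

```typescript
// Same fields as the upload mode example; the interface name is an assumption.
interface UploadRules {
  maxFileSize: number;     // bytes
  acceptedTypes: string[]; // MIME types
}

// Subset of the browser File interface used by the check.
interface FileInfo {
  size: number; // bytes
  type: string; // MIME type, may be empty if the browser cannot detect it
}

// Hypothetical helper: returns an error message, or null when the file is acceptable.
function validateUpload(file: FileInfo, rules: UploadRules): string | null {
  if (rules.acceptedTypes.indexOf(file.type) === -1) {
    return `Unsupported file type: ${file.type || 'unknown'}`;
  }
  if (file.size > rules.maxFileSize) {
    const limitMb = rules.maxFileSize / (1024 * 1024);
    return `File is larger than ${limitMb} MB`;
  }
  return null;
}
```

In a browser, a File taken from an input element can be passed directly as the first argument, since it already has size and type.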

Combined Mode

Make both camera capture and image upload available:

<Avatar
  vision={{
    enabled: true,
    mode: 'both'
  }}
/>

Personality Configuration

Update your avatar's personality prompt so it knows how to handle visual input:

const PERSONALITY = `
You are a technical support agent with vision capabilities.
 
When the user shares an image or shows their camera:
1. Acknowledge what you see clearly
2. Ask clarifying questions if the image is unclear
3. Provide specific, actionable guidance
 
Example responses:
- "I can see the error message on your screen. It says..."
- "I see you're pointing at the power button. Try holding it for 10 seconds."
- "The image is a bit blurry. Could you hold the camera steadier?"
`;

Privacy Considerations

Vision involves sensitive user data. Handle it responsibly:

User Consent

Always inform users about vision capabilities:

<Avatar
  vision={{
    enabled: true,
    consentRequired: true,
    consentMessage: "This assistant can see images you share. Your camera feed is processed in real time but not stored."
  }}
/>

Data Handling

Data              Stored?    Details
Camera frames     No         Processed in memory, not saved
Uploaded images   Optional   Configure retention in settings
AI descriptions   Yes        Part of conversation transcript

Disable Recording

Prevent any image storage:

<Avatar
  vision={{
    enabled: true,
    storeImages: false,
    storeDescriptions: true  // Keep AI's text description
  }}
/>

Events

Listen for vision events:

<Avatar
  onImageCapture={(image) => {
    console.log('Image captured:', image.size);
  }}
  onVisionAnalysis={(result) => {
    console.log('AI saw:', result.description);
  }}
/>
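The handlers do not have to be inline. As a rough sketch, the two callbacks can be built by a small factory that accumulates what was captured and described during a session. The CapturedImage and VisionResult shapes below are inferred only from the fields used above (image.size and result.description), and VisionLog and createVisionLogger are hypothetical names; the real SDK types may carry more fields.

```typescript
// Shapes inferred from the event example; assumptions, not SDK types.
interface CapturedImage {
  size: number; // as logged by onImageCapture above
}
interface VisionResult {
  description: string;
}

// Hypothetical session log kept by the application.
interface VisionLog {
  totalBytes: number;
  descriptions: string[];
}

// Returns handlers with the same names as the Avatar props shown above.
function createVisionLogger(log: VisionLog) {
  return {
    onImageCapture: (image: CapturedImage): void => {
      log.totalBytes += image.size;
    },
    onVisionAnalysis: (result: VisionResult): void => {
      // Skip empty or whitespace-only descriptions.
      const text = result.description.trim();
      if (text.length > 0) {
        log.descriptions.push(text);
      }
    },
  };
}
```

Because the returned object uses the prop names from the example, it could be spread onto the component, for instance <Avatar {...createVisionLogger(log)} />, assuming the SDK's callback types are compatible with these shapes.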

Limits

Plan         Vision Analyses/Month
Free         100
Creator      1,000
Pro          10,000
Enterprise   Unlimited

Note: Vision analyses use more tokens than text-only conversations.
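If your application tracks its own usage, a small lookup can tell you how many analyses are left in the month. The sketch below encodes the limits from the table above. The lowercase plan keys, the MONTHLY_VISION_LIMITS constant, and the remainingAnalyses function are hypothetical, and the usage count is assumed to come from your own bookkeeping rather than from an SDK call.

```typescript
// Plan names from the limits table; the lowercase keys are an assumption.
type Plan = 'free' | 'creator' | 'pro' | 'enterprise';

// Monthly vision analyses per plan, as listed in the table.
const MONTHLY_VISION_LIMITS: Record<Plan, number> = {
  free: 100,
  creator: 1000,
  pro: 10000,
  enterprise: Infinity, // "Unlimited"
};

// Hypothetical helper: analyses left this month, never negative.
function remainingAnalyses(plan: Plan, usedThisMonth: number): number {
  const limit = MONTHLY_VISION_LIMITS[plan];
  return Math.max(0, limit - usedThisMonth);
}
```

A result of 0 is a reasonable signal to fall back to text-only conversation or to prompt for an upgrade.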
