Vision & Multimodal

Meloqui supports multimodal models, such as GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro, that accept images as input.

Sending Images

Pass an array of content parts instead of a simple string:

typescript

await client.chat({
  role: 'user',
  content: [
    { type: 'text', text: 'What is in this image?' },
    { type: 'image', image: 'https://example.com/photo.jpg' }
  ]
});

Local Files

Local file paths are read from disk and base64-encoded automatically:

typescript

await client.chat({
  role: 'user',
  content: [
    { type: 'image', image: './screenshots/dashboard.png' },
    { type: 'text', text: 'Analyze this dashboard UI.' }
  ]
});
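To make the automatic encoding concrete, the sketch below shows roughly what base64-encoding a local image involves in Node.js. It is not Meloqui's implementation: the helper names `encodeImage` and `toDataUrl`, the MIME lookup table, and the data-URL output format are all assumptions made for illustration.

```typescript
import { readFileSync } from 'node:fs';
import { extname } from 'node:path';

// Hypothetical MIME lookup for the image formats covered in this guide.
const MIME_TYPES: Record<string, string> = {
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.webp': 'image/webp',
  '.gif': 'image/gif',
};

// Hypothetical helper: turn raw image bytes into a base64 data URL.
function encodeImage(bytes: Buffer, extension: string): string {
  const mime = MIME_TYPES[extension.toLowerCase()];
  if (!mime) {
    throw new Error(`Unsupported image extension: ${extension}`);
  }
  return `data:${mime};base64,${bytes.toString('base64')}`;
}

// Hypothetical helper: read a local file and encode it in one step.
function toDataUrl(path: string): string {
  return encodeImage(readFileSync(path), extname(path));
}
```

With Meloqui you would not normally call anything like this yourself; passing the path in the `image` field is enough.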

Supported Formats

Meloqui supports common image formats:

PNG, JPEG, WebP, GIF
Maximum file size: 20MB (configurable via maxImageSize)
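If you want to reject unsuitable files before they reach the client, a pre-flight check along these lines may help. This is a sketch only: `checkImage` and its return shape are hypothetical, and it assumes "20MB" means 20 × 1024 × 1024 bytes, which this guide does not specify.

```typescript
import { extname } from 'node:path';

// Extensions for the formats listed above (PNG, JPEG, WebP, GIF).
const SUPPORTED_EXTENSIONS = new Set(['.png', '.jpg', '.jpeg', '.webp', '.gif']);

// Assumption: the 20MB default is interpreted as binary megabytes.
const DEFAULT_MAX_IMAGE_SIZE = 20 * 1024 * 1024;

// Hypothetical helper: returns a problem description, or null if the file looks acceptable.
function checkImage(
  path: string,
  sizeInBytes: number,
  maxImageSize: number = DEFAULT_MAX_IMAGE_SIZE,
): string | null {
  const extension = extname(path).toLowerCase();
  if (!SUPPORTED_EXTENSIONS.has(extension)) {
    return `Unsupported format: ${extension || '(no extension)'}`;
  }
  if (sizeInBytes > maxImageSize) {
    return `File is ${sizeInBytes} bytes; the limit is ${maxImageSize} bytes`;
  }
  return null;
}
```

The check looks only at the file extension, so it cannot catch a mislabeled file; treat it as a quick filter rather than validation.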

Image Optimization

Large images consume more tokens and increase latency. Meloqui can automatically resize and compress images before sending them to the API.

Installation

Image optimization requires the optional sharp package:

bash

npm install sharp

Without sharp installed, images pass through unchanged (with a warning logged).
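To find out at startup whether optimization will actually run, you could probe for the package yourself. The sketch below uses a generic dynamic-import pattern; `isSharpAvailable` is a hypothetical helper, not part of Meloqui, and Meloqui's own detection logic may differ.

```typescript
// Hypothetical helper: resolves to true when the optional `sharp` package can be loaded.
async function isSharpAvailable(): Promise<boolean> {
  // Keeping the module name in a variable avoids a compile-time dependency on sharp's types.
  const moduleName = 'sharp';
  try {
    await import(moduleName);
    return true;
  } catch {
    // The import fails when sharp is not installed or its native binary cannot load.
    return false;
  }
}
```

Logging the result once at application start makes it easier to explain why images are being sent unoptimized.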

Per-Request Optimization

Apply optimization to specific requests:

typescript

await client.chat({
  role: 'user',
  content: [
    { type: 'image', image: './high-res-photo.jpg' },
    { type: 'text', text: 'Describe this image.' }
  ]
}, {
  imageOptimization: {
    maxWidth: 1024,    // Resize if wider than 1024px
    maxHeight: 1024,   // Resize if taller than 1024px
    quality: 80,       // JPEG/WebP quality (1-100)
    format: 'jpeg'     // Convert to JPEG
  }
});

Options

Option	Type	Default	Description
`maxWidth`	number	2048	Maximum width in pixels
`maxHeight`	number	2048	Maximum height in pixels
`quality`	number	85	Quality for JPEG/WebP (1-100)
`format`	string	(keep original)	Output format: `'jpeg'`, `'png'`, `'webp'`

Behavior

Images are resized to fit within maxWidth × maxHeight while maintaining aspect ratio
Small images are not upscaled (withoutEnlargement: true)
GIF images are not optimized (to preserve animation)
If sharp is not installed, images pass through unchanged
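The first two rules come down to a small piece of arithmetic: scale by the tighter of the two limits, and never by more than 1. The sketch below illustrates that calculation with a hypothetical `fitWithin` function; the exact rounding sharp applies may differ.

```typescript
interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: compute output dimensions that fit inside the limits
// while keeping the aspect ratio and never enlarging the image.
function fitWithin(original: Size, maxWidth = 2048, maxHeight = 2048): Size {
  // Capping the scale factor at 1 is what prevents upscaling.
  const scale = Math.min(maxWidth / original.width, maxHeight / original.height, 1);
  return {
    width: Math.round(original.width * scale),
    height: Math.round(original.height * scale),
  };
}

// A 4000×3000 photo is limited by its width: the result is 2048×1536.
console.log(fitWithin({ width: 4000, height: 3000 }));

// An 800×600 image already fits, so it is returned unchanged.
console.log(fitWithin({ width: 800, height: 600 }));
```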

When to Use

Use optimization when:

Processing user-uploaded images of unknown size
Working with screenshots or high-resolution photos
Reducing API costs (fewer tokens for smaller images)

Skip optimization when:

Images are already appropriately sized
You need pixel-perfect accuracy
Processing animated GIFs

Checking Vision Support

Verify the provider supports vision before sending images:

typescript

if (client.capabilities.vision) {
  // Safe to send images
  await client.chat({
    role: 'user',
    content: [
      { type: 'image', image: './photo.jpg' },
      { type: 'text', text: 'What is this?' }
    ]
  });
} else {
  // Fall back to text-only
  await client.chat('Describe a sunset');
}
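When the same prompt should work with or without vision, one option is to derive the text-only fallback from the original content parts instead of hard-coding it. The sketch below assumes the part shapes shown in this guide; the `ContentPart` type name and the `toTextOnly` helper are hypothetical and not taken from Meloqui's exports.

```typescript
// Assumed shapes, mirroring the content parts used in the examples above.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image'; image: string };

type TextPart = Extract<ContentPart, { type: 'text' }>;

// Hypothetical helper: drop image parts and join the remaining text.
function toTextOnly(parts: ContentPart[]): string {
  return parts
    .filter((part): part is TextPart => part.type === 'text')
    .map((part) => part.text)
    .join('\n');
}

// Possible usage with the capability check:
//   const content = client.capabilities.vision ? parts : toTextOnly(parts);
```

Note that a prompt such as "What is this?" makes little sense without its image, so the fallback text may still need rewording.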

Error Handling

If you send an image to a provider that doesn't support vision, a CapabilityError is thrown:

typescript

import { ChatClient, CapabilityError } from 'meloqui';

// Assumes `client` is an existing ChatClient instance.
try {
  await client.chat({
    role: 'user',
    content: [{ type: 'image', image: './photo.jpg' }]
  });
} catch (error) {
  if (error instanceof CapabilityError) {
    console.error('This model does not support images');
  } else {
    throw error; // Let unrelated errors propagate
  }
}

For more error handling patterns including vision fallbacks, see the Error Handling guide.
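As one possible shape for such a fallback, the sketch below wraps the try/catch in a reusable helper. `withFallback` is hypothetical and deliberately generic, so it does not depend on Meloqui; the commented usage shows how it might be combined with `CapabilityError`.

```typescript
// Hypothetical helper: run `primary`; if it fails with an error that
// `isRecoverable` accepts, run `fallback` instead. Other errors are rethrown.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
  isRecoverable: (error: unknown) => boolean,
): Promise<T> {
  try {
    return await primary();
  } catch (error) {
    if (isRecoverable(error)) {
      return fallback();
    }
    throw error;
  }
}

// Possible usage, assuming `client` and `parts` exist as in the examples above:
//   const reply = await withFallback(
//     () => client.chat({ role: 'user', content: parts }),
//     () => client.chat('Describe a sunset'),
//     (error) => error instanceof CapabilityError,
//   );
```

Passing the error test as a predicate keeps the helper from swallowing unrelated failures such as network errors.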

Next Steps

Error Handling - Comprehensive error handling patterns
API Reference: Images - Image utility functions
