# Local Models

Meloqui supports running LLMs locally for privacy and offline use. This guide covers the supported local model options.
## Options Comparison

| Option | Provider Type | Setup Complexity | Best For |
|---|---|---|---|
| Ollama | `ollama` | Easy | General local inference |
| Docker Model Runner | `openai` (compatible) | Easy | Docker Desktop users |
| OpenAI-compatible servers | `openai` (compatible) | Varies | Custom deployments |
## Ollama

Ollama is the simplest way to run LLMs locally.

### Setup

- Install Ollama
- Pull a model: `ollama pull llama3`
- Run the server (default: `localhost:11434`)
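Once the server is running, you can confirm it is reachable before wiring up a client. A minimal sketch using Ollama's `/api/tags` model-listing endpoint (`ollamaIsUp` is a hypothetical helper, not part of Meloqui):

```typescript
// Sketch: probe a local Ollama server. GET /api/tags lists the pulled
// models and responds 200 when the server is up.
async function ollamaIsUp(baseUrl = 'http://localhost:11434'): Promise<boolean> {
  try {
    const res = await fetch(`${baseUrl}/api/tags`);
    return res.ok;
  } catch {
    return false; // server not reachable
  }
}
```

If this returns `false`, check that the Ollama server is running and that nothing else is bound to port 11434.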
### Usage
```typescript
import { ChatClient } from 'meloqui';

const client = new ChatClient({
  provider: 'ollama',
  model: 'llama3'
});

const response = await client.chat('Why is the sky blue?');
```

### Remote Server
If Ollama is on another machine:
```typescript
const client = new ChatClient({
  provider: 'ollama',
  model: 'llama3',
  baseUrl: 'http://192.168.1.50:11434'
});
```

### CLI Assistant
```bash
npm install -g @meloqui/ollama-assistant
ollama-chat
```

Or run without a global install:
```bash
npx -p @meloqui/ollama-assistant ollama-chat
```

## Docker Model Runner
Docker Model Runner (DMR) is built into Docker Desktop and exposes an OpenAI-compatible API.
### Setup

- Enable Model Runner in Docker Desktop
- Pull a model: `docker model pull ai/smollm2`
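After pulling a model, you can check what DMR is actually serving. A sketch using the standard OpenAI-style model-listing endpoint (`GET /models`) against DMR's default base URL; the `listDmrModels` helper is illustrative, not part of Meloqui:

```typescript
// Sketch: list models via DMR's OpenAI-compatible endpoint.
// The base URL below is DMR's default; adjust if yours differs.
async function listDmrModels(
  baseUrl = 'http://localhost:12434/engines/llama.cpp/v1'
): Promise<string[]> {
  const res = await fetch(`${baseUrl}/models`);
  if (!res.ok) throw new Error(`DMR responded with ${res.status}`);
  const body = (await res.json()) as { data: { id: string }[] };
  return body.data.map((m) => m.id); // OpenAI-style list payload
}
```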
### Usage

Since DMR uses an OpenAI-compatible API, use `provider: 'openai'` with a custom `baseUrl`:
```typescript
import { ChatClient } from 'meloqui';

const client = new ChatClient({
  provider: 'openai', // OpenAI-compatible API
  model: 'ai/smollm2:latest',
  baseUrl: 'http://localhost:12434/engines/llama.cpp/v1',
  apiKey: 'not-needed' // DMR doesn't require auth
});

const response = await client.chat('Hello!');
```

### CLI Assistant
```bash
npm install -g @meloqui/docker-assistant
docker-chat
```

Or run without a global install:

```bash
npx -p @meloqui/docker-assistant docker-chat
```

### Configuration
| Environment Variable | Default |
|---|---|
| `DOCKER_MODEL` | `ai/smollm2:latest` |
| `DOCKER_BASE_URL` | `http://localhost:12434/engines/llama.cpp/v1` |
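A sketch of how such a lookup could resolve, falling back to the defaults in the table (the `resolveDmrConfig` helper is illustrative, not the assistant's actual implementation):

```typescript
// Sketch: resolve DMR settings from the environment, using the
// documented defaults from the table above as fallbacks.
type DmrConfig = { model: string; baseUrl: string };

function resolveDmrConfig(
  env: Record<string, string | undefined> = process.env
): DmrConfig {
  return {
    model: env.DOCKER_MODEL ?? 'ai/smollm2:latest',
    baseUrl: env.DOCKER_BASE_URL ?? 'http://localhost:12434/engines/llama.cpp/v1',
  };
}
```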
## OpenAI-Compatible Servers

Many local inference servers expose OpenAI-compatible APIs. Use `provider: 'openai'` with a custom `baseUrl`.

### Examples
LocalAI:

```typescript
const client = new ChatClient({
  provider: 'openai',
  model: 'gpt-3.5-turbo', // Model name depends on your setup
  baseUrl: 'http://localhost:8080/v1',
  apiKey: 'not-needed'
});
```

vLLM:
```typescript
const client = new ChatClient({
  provider: 'openai',
  model: 'meta-llama/Llama-2-7b-chat-hf',
  baseUrl: 'http://localhost:8000/v1',
  apiKey: 'not-needed'
});
```

LM Studio:
```typescript
const client = new ChatClient({
  provider: 'openai',
  model: 'local-model',
  baseUrl: 'http://localhost:1234/v1',
  apiKey: 'lm-studio'
});
```

### Provider Type Reference
| Server | Provider | Notes |
|---|---|---|
| Ollama | `ollama` | Native support |
| Docker Model Runner | `openai` | OpenAI-compatible API |
| LocalAI | `openai` | OpenAI-compatible API |
| vLLM | `openai` | OpenAI-compatible API |
| LM Studio | `openai` | OpenAI-compatible API |
| text-generation-webui | `openai` | With OpenAI extension |
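The table above can be captured as a small lookup, e.g. when validating a config file. The server keys below are illustrative identifiers, not part of Meloqui's API:

```typescript
// Sketch: provider lookup mirroring the reference table above.
type LocalProvider = 'ollama' | 'openai';

const providerFor: Record<string, LocalProvider> = {
  'ollama': 'ollama',                // native support
  'docker-model-runner': 'openai',   // OpenAI-compatible API
  'localai': 'openai',
  'vllm': 'openai',
  'lm-studio': 'openai',
  'text-generation-webui': 'openai', // with its OpenAI extension
};
```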
## Privacy

All local model options keep your data on your machine. No requests are sent to external servers.
