Usage Examples¶

This guide provides practical examples of how to use the Ollama Proxy with various tools and applications. These examples will help you get started quickly and show you how to integrate the proxy into your existing workflows.

Table of Contents¶

Basic Usage with Ollama CLI
Using with Ollama Clients
Programming Language Examples
Advanced Configuration Examples
Troubleshooting Common Issues

Basic Usage with Ollama CLI¶

Listing Available Models¶

Once the proxy is running, you can list all available models:

# List models through the proxy
ollama list

# Or using curl directly
curl http://localhost:11434/api/tags

Chat with a Model¶

You can chat with any model available through OpenRouter:

# Start a chat session
ollama run gemini-pro:latest

# Or send a single message
echo "Why is the sky blue?" | ollama run gemini-pro:latest

Generate Text¶

Generate text using a model:

# Simple text generation
ollama run gemini-pro:latest "Write a short poem about programming"

# Using prompt files
ollama run gemini-pro:latest -f ./prompt.txt

Using with Ollama Clients¶

Python Client Example¶

If you're using the Ollama Python client, you can configure it to use the proxy:

import ollama

# Configure client to use proxy
client = ollama.Client(host='http://localhost:11434')

# List models
models = client.list()
print(models)

# Chat with a model
response = client.chat(
    model='gemini-pro:latest',
    messages=[{
        'role': 'user',
        'content': 'Why is the sky blue?',
    }]
)
print(response['message']['content'])

JavaScript/Node.js Client Example¶

For the JavaScript client:

import ollama from 'ollama';

// The client will automatically use http://localhost:11434
// unless you specify a different host

// List models
const models = await ollama.list();
console.log(models);

// Chat with a model
const response = await ollama.chat({
  model: 'gemini-pro:latest',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }]
});
console.log(response.message.content);

Programming Language Examples¶

Python Direct API Usage¶

You can also interact with the proxy directly using HTTP requests:

import requests
import json

# Proxy endpoint
proxy_url = 'http://localhost:11434'

# List available models
response = requests.get(f'{proxy_url}/api/tags')
models = response.json()
print("Available models:", [m['name'] for m in models['models']])

# Chat completion
chat_data = {
    "model": "gemini-pro:latest",
    "messages": [
        {
            "role": "user",
            "content": "Explain quantum computing in simple terms"
        }
    ]
}

response = requests.post(
    f'{proxy_url}/api/chat',
    headers={'Content-Type': 'application/json'},
    data=json.dumps(chat_data)
)

if response.status_code == 200:
    result = response.json()
    print("Response:", result['message']['content'])
else:
    print(f"Error: {response.status_code} - {response.text}")

Streaming Responses¶

To handle streaming responses in Python:

import requests
import json

proxy_url = 'http://localhost:11434'

# Streaming chat completion
chat_data = {
    "model": "gemini-pro:latest",
    "messages": [
        {
            "role": "user",
            "content": "Write a story about a robot learning to paint"
        }
    ],
    "stream": True
}

with requests.post(
    f'{proxy_url}/api/chat',
    headers={'Content-Type': 'application/json'},
    data=json.dumps(chat_data),
    stream=True
) as response:
    for line in response.iter_lines():
        if line:
            chunk = json.loads(line.decode('utf-8'))
            if 'message' in chunk and 'content' in chunk['message']:
                print(chunk['message']['content'], end='', flush=True)
            if chunk.get('done', False):
                break

cURL Examples¶

You can also use cURL to interact with the proxy:

# List models
curl http://localhost:11434/api/tags

# Chat completion
curl http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-pro:latest",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'

# Streaming chat completion
curl http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-pro:latest",
    "messages": [
      {
        "role": "user",
        "content": "Count from 1 to 100."
      }
    ],
    "stream": true
  }'

Advanced Configuration Examples¶

Model Filtering¶

Create a custom models-filter.txt to limit which models are available:

# Only allow these specific models
gemini-pro:latest
gpt-4:latest
claude-3-5-sonnet:latest

# Allow all models from a specific family using wildcards
llama*:*latest

# Exclude specific models (prefix with !)
!llama-2-13b-chat:latest

Then start the proxy with your filter:

ollama-proxy --models-filter ./models-filter.txt

Environment-based Configuration¶

Create a .env file for different environments:

# Development environment
OPENROUTER_API_KEY=your_dev_api_key_here
LOG_LEVEL=DEBUG
HOST=localhost
PORT=11434
MODELS_FILTER_PATH=./dev-models-filter.txt

For production:

# Production environment
OPENROUTER_API_KEY=your_prod_api_key_here
LOG_LEVEL=WARNING
HOST=0.0.0.0
PORT=11434
MODELS_FILTER_PATH=./prod-models-filter.txt
MAX_CONCURRENT_REQUESTS=200

Docker Compose with Custom Configuration¶

Create a docker-compose.prod.yml for production deployment:

version: '3.8'
services:
  ollama-proxy:
    build: .
    ports:
      - "11434:11434"
    environment:
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
      - LOG_LEVEL=INFO
      - MAX_CONCURRENT_REQUESTS=200
    volumes:
      - ./prod-models-filter.txt:/app/models-filter.txt
    restart: unless-stopped

Troubleshooting Common Issues¶

Connection Issues¶

If you can't connect to the proxy:

Check if the proxy is running:

# Check if the port is listening
netstat -tlnp | grep :11434

# Or use lsof
lsof -i :11434

Verify the proxy logs:

# If running with Docker
docker logs ollama-proxy-container

# If running directly
# Check your terminal where you started the proxy

Model Not Found Errors¶

If you get "model not found" errors:

List available models to see what's actually available:

curl http://localhost:11434/api/tags

Check your model filter configuration:

# View your filter file
cat models-filter.txt

Try without filtering to see all models:

ollama-proxy --models-filter ""

API Key Issues¶

If you get authentication errors:

Verify your API key:

echo $OPENROUTER_API_KEY

Test the key directly with OpenRouter:

curl https://openrouter.ai/api/v1/models \
  -H "Authorization: Bearer $OPENROUTER_API_KEY"

Performance Issues¶

If responses are slow:

Check the proxy logs for any errors or warnings.
Monitor the proxy's health endpoint:

curl http://localhost:11434/health

Check the metrics endpoint for performance data:

curl http://localhost:11434/metrics