@RelayPlane/proxy · ⭐ 105 stars · last commit 1 mo ago
Open source cost intelligence proxy for AI agents. Cut costs ~80% with smart model routing. Dashboard, policy engine, 11 providers. MIT licensed.
ProSkills Score: 7.6/10 (AI verified, Mar 9, 2026). Not yet listed on ClawHub or SkillsMP.
# @relayplane/proxy
[npm](https://www.npmjs.com/package/@relayplane/proxy) · [MIT license](https://github.com/RelayPlane/proxy/blob/main/LICENSE)
A **Node.js npm LLM proxy** that sits between your AI agents and providers. Drop-in replacement for OpenAI and Anthropic base URLs — no Docker, no Python, just `npm install`. Tracks every request, shows where the money goes, and offers configurable task-aware routing — all running **locally, for free**.
**The npm-native LLM proxy for Node.js developers.** Works with Claude Code, Cursor, OpenClaw, and any tool that supports `OPENAI_BASE_URL` or `ANTHROPIC_BASE_URL`.
**Free, open-source proxy features:**
- 📊 Per-request cost tracking across 11 providers
- 💰 **Cache-aware cost tracking** - accurately tracks Anthropic prompt caching with cache read savings, creation costs, and true per-request costs
- 🔀 Configurable task-aware routing (complexity-based, cascade, model overrides)
- 🛡️ Circuit breaker - if the proxy fails, your agent doesn't notice
- 📈 **Local dashboard** at `localhost:4100` - cost breakdown, savings analysis, provider health, agent breakdown
- 💵 **Budget enforcement** - daily/hourly/per-request spend limits with block, warn, downgrade, or alert actions
- 🔍 **Anomaly detection** - catches runaway agent loops, cost spikes, and token explosions in real time
- 🔔 **Cost alerts** - threshold alerts at configurable percentages, webhook delivery, alert history
- ⬇️ **Auto-downgrade** - automatically switches to cheaper models when budget thresholds are hit
- 📦 **Aggressive cache** - exact-match response caching with gzipped disk persistence
- 🤖 **Per-agent cost tracking** - identifies agents by system prompt fingerprint and tracks cost per agent
- 📝 **Content logging** - dashboard shows system prompt preview, user message, and response preview per request
- 🔐 **OAuth passthrough** - correctly forwards `user-agent` and `x-app` headers for Claude Max subscription users (OpenClaw compatible)
- 🧠 **Osmosis mesh** - collective learning layer that shares anonymized routing signals across users (on by default, opt-out: `relayplane mesh off`)
- 🔧 **systemd/launchd service** - `relayplane service install` for always-on operation with auto-restart
- 🏥 **Health watchdog** - `/health` endpoint with uptime tracking and active probing
- 🛡️ **Config resilience** - atomic writes, automatic backup/restore, credential separation
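To make the budget-enforcement feature concrete, here is a hypothetical sketch of how daily and per-request limits with a breach action might look in `~/.relayplane/config.json`. Every field name below is an assumption for illustration, not the documented schema — check the config file generated by `relayplane init` for the real field names.

```json
{
  "budget": {
    "dailyUsd": 10,
    "perRequestUsd": 0.50,
    "onBreach": "downgrade"
  }
}
```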
> **Cloud dashboard available separately** - see [Cloud Dashboard & Pro Features](#cloud-dashboard--pro-features) below. Your prompts always stay local.
## Quick Start
```bash
npm install -g @relayplane/proxy
relayplane init
relayplane start
# Dashboard at http://localhost:4100
```
Works with any agent framework that talks to OpenAI or Anthropic APIs. Point your client at `http://localhost:4100` (set `ANTHROPIC_BASE_URL` or `OPENAI_BASE_URL`) and the proxy handles the rest.
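For example, an Anthropic- or OpenAI-compatible client can be pointed at the proxy with the standard base-URL environment variables:

```shell
# Route Anthropic-compatible clients (e.g. Claude Code) through the proxy
export ANTHROPIC_BASE_URL=http://localhost:4100
# OpenAI-compatible clients use the equivalent variable
export OPENAI_BASE_URL=http://localhost:4100
```

Run your agent in the same shell afterward and its requests flow through `localhost:4100`.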
## What's New in v1.8.14+
**Breaking changes for upgraders:**
- **Telemetry is now ON by default.** Previously opt-in. Anonymous metadata (model, tokens, cost, latency) is sent to power the cloud dashboard. Your prompts and responses are never collected. Disable: `relayplane telemetry off`
- **Mesh is now ON by default.** Your proxy contributes anonymized routing data to the collective network. Free users get provider health alerts. Pro users get full routing intelligence. Disable: `relayplane mesh off`
- **Cloud dashboard is now free.** Previously required a paid plan. Just `relayplane login` to access your data at relayplane.com/dashboard.
If you prefer the old behavior: `relayplane telemetry off && relayplane mesh off`
## Supported Providers
**Anthropic** · **OpenAI** · **Google Gemini** · **xAI/Grok** · **OpenRouter** · **DeepSeek** · **Groq** · **Mistral** · **Together** · **Fireworks** · **Perplexity**
## Configuration
RelayPlane reads configuration from `~/.relayplane/config.json`. Override the path with the `RELAYPLANE_CONFIG_PATH` environment variable.
```bash
# Default location
~/.relayplane/config.json
# Override with env var
RELAYPLANE_CONFIG_PATH=/path/to/config.json relayplane start
```
A minimal config file:
```json
{
"enabled": true,
"modelOverrides": {},
"routing": {
"mode": "cascade",
"cascade": { "enabled": true },
"complexity": { "enabled": true }
}
}
```
All configuration is optional - sensible defaults are applied for every field. The proxy merges your config with its defaults via deep merge, so you only need to specify what you want to change.
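Deep merge means nested objects are combined key-by-key rather than replaced wholesale. A minimal sketch of those semantics (illustrative only, not RelayPlane's actual implementation):

```javascript
// Sketch of deep-merge semantics: user values win, but nested
// objects are merged key-by-key instead of being replaced wholesale.
function deepMerge(defaults, overrides) {
  const out = { ...defaults };
  for (const [key, value] of Object.entries(overrides)) {
    const isObj = v => v && typeof v === "object" && !Array.isArray(v);
    out[key] = isObj(value) && isObj(out[key])
      ? deepMerge(out[key], value) // recurse into nested objects
      : value;                     // scalars/arrays: override wins
  }
  return out;
}

const builtinDefaults = {
  routing: { mode: "passthrough", cascade: { enabled: false } }
};
const userConfig = { routing: { mode: "cascade" } };

const merged = deepMerge(builtinDefaults, userConfig);
// merged.routing.mode is "cascade", while merged.routing.cascade
// keeps its default because only the sibling key was overridden
```

This is why a config file that sets only `routing.mode` still inherits every other default.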
## Architecture
```text
Client (Claude Code / Aider / Cursor)
|
| OpenAI/Anthropic-compatible request
v
+-------------------------------------------------------+
| RelayPlane Proxy (local) |
|-------------------------------------------------------|
| 1) Parse request |
| 2) Cache check (exact or aggressive mode) |
| └─ HIT → return cached response (skip provider) |
| 3) Budget check (daily/hourly/per-request limits) |
| └─ BREACH → block / warn / downgrade / alert |
| 4) Anomaly detection (velocity, cost spike, loops) |
| └─ DETECTED → alert + optional block |
| 5) Auto-downgrade (if budget threshold exceeded) |
| └─ Rewrite model to cheaper alternative |
| 6) Infer task/complexity (pre-request) |
| 7) Select route/model |
| - explicit model / passthrough |
| - relayplane:auto/cost/fast/quality |
| - configured complexity/cascade rules |
| 8) Forward request to provider |
| 9) Return provider response + cache it |
| 10) Record telemetry + update budget tracking |
| 11) Mesh sync (push anonymized routing signals) |
+-------------------------------------------------------+
|
v
Provider APIs (Anthropic/OpenAI/Gemini/xAI/...)
```
## How It Works
RelayPlane is a local HTTP proxy. You point your agent at `localhost:4100` by setting `ANTHROPIC_BASE_URL` or `OPENAI_BASE_URL`. The proxy:
1. **Intercepts** your LLM API requests
2. **Classifies** the task using heuristics (token count, prompt patterns, keyword matching - no LLM calls)
3. **Routes** to the configured model based on classification and your routing rules (or passes through to the original model by default)
4. **Forwards** the request directly to the LLM provider (your prompts go straight to the provider, not through RelayPlane servers)
5. **Records** token counts, latency, and cost locally for your dashboard
**Default behavior is passthrough** - requests go to whatever model your agent requested. Routing (cascade, complexity-based) is configurable and must be explicitly enabled.
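The passthrough default can be pictured with a small sketch (hypothetical function and parameter names, not the proxy's internal API):

```javascript
// Hypothetical sketch of the routing decision: unless a routing
// mode is explicitly enabled, the requested model passes through.
function selectModel(requestedModel, tier, config) {
  const complexity = (config.routing && config.routing.complexity) || {};
  if (complexity.enabled && complexity[tier]) {
    return complexity[tier]; // routing enabled: substitute by tier
  }
  return requestedModel;     // default: passthrough, untouched
}

selectModel("gpt-4o", "simple", {});
// → "gpt-4o" (no routing enabled, passthrough)
selectModel("gpt-4o", "simple",
  { routing: { complexity: { enabled: true, simple: "claude-3-5-haiku-latest" } } });
// → "claude-3-5-haiku-latest"
```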
## Complexity-Based Routing
The proxy classifies incoming requests by complexity (simple, moderate, complex) based on prompt length, token patterns, and the presence of tools. Each tier maps to a different model.
```json
{
"routing": {
"complexity": {
"enabled": true,
"simple": "claude-3-5-haiku-latest",
"moderate": "claude-sonnet-4-20250514",
"complex": "claude-opus-4-20250514"
}
}
}
```
**How classification works:**
- **Simple** - Short prompts, straightforward Q&A, basic code tasks
- **Moderate** - Multi-step reasoning, code review, analysis with context
- **Complex** - Architecture decisions, large codebases, tasks with many tools, long prompts with evaluation/comparison language
The classifier scores requests based on message count, total token length, tool usage, and content patterns (e.g., words like "analyze", "compare", "evaluate" increase the score). This happens locally - no prompt content is sent anywhere.
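A toy version of such a heuristic scorer, in the spirit of the description above (the real classifier's signals and weights are internal; everything here is illustrative):

```javascript
// Toy complexity classifier: score by prompt length, message count,
// tool usage, and evaluation-style keywords, then bucket into tiers.
function classify(messages, tools = []) {
  const text = messages.map(m => m.content).join(" ");
  let score = 0;
  if (text.length > 2000) score += 2;   // long prompts lean complex
  if (messages.length > 4) score += 1;  // lots of conversational context
  score += Math.min(tools.length, 3);   // many tools => harder task
  for (const kw of ["analyze", "compare", "evaluate", "architecture"]) {
    if (text.toLowerCase().includes(kw)) score += 1;
  }
  if (score >= 4) return "complex";
  if (score >= 2) return "moderate";
  return "simple";
}

classify([{ content: "What is 2 + 2?" }]);
// → "simple"
```

Note that everything runs on the request metadata already in memory, so classification adds no LLM calls and no network traffic.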
## Model Overrides
Map any model name to a different one. Useful for silently redirecting requests from one model to another without touching your agent's configuration.
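For example, the override below would send every Opus request to Sonnet instead. The mapping direction (requested model name → replacement) is a sketch inferred from the `modelOverrides` field in the minimal config above; verify against your generated config.

```json
{
  "modelOverrides": {
    "claude-opus-4-20250514": "claude-sonnet-4-20250514"
  }
}
```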
ProSkills breakdown: Code Quality 7.5/10 · Documentation 8.5/10 · Functionality 7.5/10 · Maintenance 8/10 · Security 7.5/10 · Uniqueness 7/10 · Usefulness 7/10