@RelayPlane/proxy · ⭐ 105 stars · last commit 1 mo ago
Open source cost intelligence proxy for AI agents. Cut costs ~80% with smart model routing. Dashboard, policy engine, 11 providers. MIT licensed.
ProSkills Score: 7.6/10 (AI verified, Mar 9, 2026). Not yet listed on ClawHub or SkillsMP.
# @relayplane/proxy
[npm](https://www.npmjs.com/package/@relayplane/proxy) · [MIT license](https://github.com/RelayPlane/proxy/blob/main/LICENSE)
A **Node.js npm LLM proxy** that sits between your AI agents and providers. Drop-in replacement for OpenAI and Anthropic base URLs — no Docker, no Python, just `npm install`. Tracks every request, shows where the money goes, and offers configurable task-aware routing — all running **locally, for free**.
**The npm-native LLM proxy for Node.js developers.** Works with Claude Code, Cursor, OpenClaw, and any tool that supports `OPENAI_BASE_URL` or `ANTHROPIC_BASE_URL`.
**Free, open-source proxy features:**
- 📊 Per-request cost tracking across 11 providers
- 💰 **Cache-aware cost tracking** - accurately tracks Anthropic prompt caching with cache read savings, creation costs, and true per-request costs
- 🔀 Configurable task-aware routing (complexity-based, cascade, model overrides)
- 🛡️ Circuit breaker - if the proxy fails, your agent doesn't notice
- 📈 **Local dashboard** at `localhost:4100` - cost breakdown, savings analysis, provider health, agent breakdown
- 💵 **Budget enforcement** - daily/hourly/per-request spend limits with block, warn, downgrade, or alert actions
- 🔍 **Anomaly detection** - catches runaway agent loops, cost spikes, and token explosions in real time
- 🔔 **Cost alerts** - threshold alerts at configurable percentages, webhook delivery, alert history
- ⬇️ **Auto-downgrade** - automatically switches to cheaper models when budget thresholds are hit
- 📦 **Aggressive cache** - exact-match response caching with gzipped disk persistence
- 🤖 **Per-agent cost tracking** - identifies agents by system prompt fingerprint and tracks cost per agent
- 📝 **Content logging** - dashboard shows system prompt preview, user message, and response preview per request
- 🔐 **OAuth passthrough** - correctly forwards `user-agent` and `x-app` headers for Claude Max subscription users (OpenClaw compatible)
- 🧠 **Osmosis mesh** - collective learning layer that shares anonymized routing signals across users (on by default, opt-out: `relayplane mesh off`)
- 🔧 **systemd/launchd service** - `relayplane service install` for always-on operation with auto-restart
- 🏥 **Health watchdog** - `/health` endpoint with uptime tracking and active probing
- 🛡️ **Config resilience** - atomic writes, automatic backup/restore, credential separation
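To make the budget-enforcement feature concrete, here is a hypothetical sketch of how daily and per-request limits with a breach action might look in `~/.relayplane/config.json`. Every field name below is an assumption for illustration, not the documented schema — check the config file generated by `relayplane init` for the real field names.

```json
{
  "budget": {
    "dailyUsd": 10,
    "perRequestUsd": 0.50,
    "onBreach": "downgrade"
  }
}
```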
> **Cloud dashboard available separately** - see [Cloud Dashboard & Pro Features](#cloud-dashboard--pro-features) below. Your prompts always stay local.
## Quick Start
```bash
npm install -g @relayplane/proxy
relayplane init
relayplane start
# Dashboard at http://localhost:4100
```
Works with any agent framework that talks to OpenAI or Anthropic APIs. Point your client at `http://localhost:4100` (set `ANTHROPIC_BASE_URL` or `OPENAI_BASE_URL`) and the proxy handles the rest.
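For example, an Anthropic- or OpenAI-compatible client can be pointed at the proxy with the standard base-URL environment variables:

```shell
# Route Anthropic-compatible clients (e.g. Claude Code) through the proxy
export ANTHROPIC_BASE_URL=http://localhost:4100
# OpenAI-compatible clients use the equivalent variable
export OPENAI_BASE_URL=http://localhost:4100
```

Run your agent in the same shell afterward and its requests flow through `localhost:4100`.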
## What's New in v1.8.14+
**Breaking changes for upgraders:**
- **Telemetry is now ON by default.** Previously opt-in. Anonymous metadata (model, tokens, cost, latency) is sent to power the cloud dashboard. Your prompts and responses are never collected. Disable: `relayplane telemetry off`
- **Mesh is now ON by default.** Your proxy contributes anonymized routing data to the collective network. Free users get provider health alerts. Pro users get full routing intelligence. Disable: `relayplane mesh off`
- **Cloud dashboard is now free.** Previously required a paid plan. Just `relayplane login` to access your data at relayplane.com/dashboard.
If you prefer the old behavior: `relayplane telemetry off && relayplane mesh off`
## Supported Providers
**Anthropic** · **OpenAI** · **Google Gemini** · **xAI/Grok** · **OpenRouter** · **DeepSeek** · **Groq** · **Mistral** · **Together** · **Fireworks** · **Perplexity**
## Configuration
RelayPlane reads configuration from `~/.relayplane/config.json`. Override the path with the `RELAYPLANE_CONFIG_PATH` environment variable.
```bash
# Default location
~/.relayplane/config.json
# Override with env var
RELAYPLANE_CONFIG_PATH=/path/to/config.json relayplane start
```
A minimal config file:
```json
{
"enabled": true,
"modelOverrides": {},
"routing": {
"mode": "cascade",
"cascade": { "enabled": true },
"complexity": { "enabled": true }
}
}
```
All configuration is optional - sensible defaults are applied for every field. The proxy merges your config with its defaults via deep merge, so you only need to specify what you want to change.
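Deep merge means nested objects are combined key-by-key rather than replaced wholesale. A minimal sketch of those semantics (illustrative only, not RelayPlane's actual implementation):

```javascript
// Sketch of deep-merge semantics: user values win, but nested
// objects are merged key-by-key instead of being replaced wholesale.
function deepMerge(defaults, overrides) {
  const out = { ...defaults };
  for (const [key, value] of Object.entries(overrides)) {
    const isObj = v => v && typeof v === "object" && !Array.isArray(v);
    out[key] = isObj(value) && isObj(out[key])
      ? deepMerge(out[key], value) // recurse into nested objects
      : value;                     // scalars/arrays: override wins
  }
  return out;
}

const builtinDefaults = {
  routing: { mode: "passthrough", cascade: { enabled: false } }
};
const userConfig = { routing: { mode: "cascade" } };

const merged = deepMerge(builtinDefaults, userConfig);
// merged.routing.mode is "cascade", while merged.routing.cascade
// keeps its default because only the sibling key was overridden
```

This is why a config file that sets only `routing.mode` still inherits every other default.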
## Architecture
```text
Client (Claude Code / Aider / Cursor)
|
| OpenAI/Anthropic-compatible request
v
+-------------------------------------------------------+
| RelayPlane Proxy (local) |
|-------------------------------------------------------|
| 1) Parse request |
| 2) Cache check (exact or aggressive mode) |
| └─ HIT → return cached response (skip provider) |
| 3) Budget check (daily/hourly/per-request limits) |
| └─ BREACH → block / warn / downgrade / alert |
| 4) Anomaly detection (velocity, cost spike, loops) |
| └─ DETECTED → alert + optional block |
| 5) Auto-downgrade (if budget threshold exceeded) |
| └─ Rewrite model to cheaper alternative |
| 6) Infer task/complexity (pre-request) |
| 7) Select route/model |
| - explicit model / passthrough |
| - relayplane:auto/cost/fast/quality |
| - configured complexity/cascade rules |
| 8) Forward request to provider |
| 9) Return provider response + cache it |
| 10) Record telemetry + update budget tracking |
| 11) Mesh sync (push anonymized routing signals) |
+-------------------------------------------------------+
|
v
Provider APIs (Anthropic/OpenAI/Gemini/xAI/...)
```
## How It Works
RelayPlane is a local HTTP proxy. You point your agent at `localhost:4100` by setting `ANTHROPIC_BASE_URL` or `OPENAI_BASE_URL`. The proxy:
1. **Intercepts** your LLM API requests
2. **Classifies** the task using heuristics (token count, prompt patterns, keyword matching - no LLM calls)
3. **Routes** to the configured model based on classification and your routing rules (or passes through to the original model by default)
4. **Forwards** the request directly to the LLM provider (your prompts go straight to the provider, not through RelayPlane servers)
5. **Records** token counts, latency, and cost locally for your dashboard
**Default behavior is passthrough** - requests go to whatever model your agent requested. Routing (cascade, complexity-based) is configurable and must be explicitly enabled.
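The passthrough default can be pictured with a small sketch (hypothetical function and parameter names, not the proxy's internal API):

```javascript
// Hypothetical sketch of the routing decision: unless a routing
// mode is explicitly enabled, the requested model passes through.
function selectModel(requestedModel, tier, config) {
  const complexity = (config.routing && config.routing.complexity) || {};
  if (complexity.enabled && complexity[tier]) {
    return complexity[tier]; // routing enabled: substitute by tier
  }
  return requestedModel;     // default: passthrough, untouched
}

selectModel("gpt-4o", "simple", {});
// → "gpt-4o" (no routing enabled, passthrough)
selectModel("gpt-4o", "simple",
  { routing: { complexity: { enabled: true, simple: "claude-3-5-haiku-latest" } } });
// → "claude-3-5-haiku-latest"
```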
## Complexity-Based Routing
The proxy classifies incoming requests by complexity (simple, moderate, complex) based on prompt length, token patterns, and the presence of tools. Each tier maps to a different model.
```json
{
"routing": {
"complexity": {
"enabled": true,
"simple": "claude-3-5-haiku-latest",
"moderate": "claude-sonnet-4-20250514",
"complex": "claude-opus-4-20250514"
}
}
}
```
**How classification works:**
- **Simple** - Short prompts, straightforward Q&A, basic code tasks
- **Moderate** - Multi-step reasoning, code review, analysis with context
- **Complex** - Architecture decisions, large codebases, tasks with many tools, long prompts with evaluation/comparison language
The classifier scores requests based on message count, total token length, tool usage, and content patterns (e.g., words like "analyze", "compare", "evaluate" increase the score). This happens locally - no prompt content is sent anywhere.
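A toy version of such a heuristic scorer, in the spirit of the description above (the real classifier's signals and weights are internal; everything here is illustrative):

```javascript
// Toy complexity classifier: score by prompt length, message count,
// tool usage, and evaluation-style keywords, then bucket into tiers.
function classify(messages, tools = []) {
  const text = messages.map(m => m.content).join(" ");
  let score = 0;
  if (text.length > 2000) score += 2;   // long prompts lean complex
  if (messages.length > 4) score += 1;  // lots of conversational context
  score += Math.min(tools.length, 3);   // many tools => harder task
  for (const kw of ["analyze", "compare", "evaluate", "architecture"]) {
    if (text.toLowerCase().includes(kw)) score += 1;
  }
  if (score >= 4) return "complex";
  if (score >= 2) return "moderate";
  return "simple";
}

classify([{ content: "What is 2 + 2?" }]);
// → "simple"
```

Note that everything runs on the request metadata already in memory, so classification adds no LLM calls and no network traffic.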
## Model Overrides
Map any model name to a different one. Useful for silently redirecting requests from one model to another without touching your agent's configuration.
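For example, the override below would send every Opus request to Sonnet instead. The mapping direction (requested model name → replacement) is a sketch inferred from the `modelOverrides` field in the minimal config above; verify against your generated config.

```json
{
  "modelOverrides": {
    "claude-opus-4-20250514": "claude-sonnet-4-20250514"
  }
}
```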
ProSkills breakdown: Code Quality 7.5/10 · Documentation 8.5/10 · Functionality 7.5/10 · Maintenance 8/10 · Security 7.5/10 · Uniqueness 7/10 · Usefulness 7/10