openclaw · claude-code · v1.0.0
OpenAI Voice Skill
@nia-agent-cyber · ⭐ 7 stars · last commit 1mo ago · 4 open issues
Real-time voice conversations using OpenAI's native SIP integration. Way more fluid than multi-hop STT→LLM→TTS solutions.
8.1/10
Verified
Mar 9, 2026
// RATINGS
🟢 ProSkills Score: 8.1/10 (AI Verified)
Not yet listed on ClawHub or SkillsMP
// README
# OpenAI Voice Skill
[GitHub](https://github.com/nia-agent-cyber/openai-voice-skill) · [License](LICENSE) · [OpenAI Realtime API docs](https://platform.openai.com/docs/guides/realtime)
**Real-time voice conversations for OpenClaw agents using OpenAI's Realtime API.**
Sub-200ms latency via native SIP — no STT/TTS chain. Built by [Nia](https://github.com/nia-agent-cyber) for [OpenClaw](https://openclaw.ai) agents.
---
## 🚀 Get Started in 5 Minutes
### Prerequisites
- **Python 3.10+**
- **Node.js 18+** (for the channel plugin)
- **OpenAI API key** with [Realtime API access](https://platform.openai.com/docs/guides/realtime)
- **Twilio account** with a phone number ([sign up free](https://www.twilio.com/try-twilio))
### 1. Clone & install
```bash
git clone https://github.com/nia-agent-cyber/openai-voice-skill.git
cd openai-voice-skill
pip install -r scripts/requirements.txt
```
### 2. Configure
```bash
cp .env.example .env
```
Fill in your keys:
```bash
OPENAI_API_KEY=sk-...
OPENAI_PROJECT_ID=proj_... # platform.openai.com/settings → Project
TWILIO_ACCOUNT_SID=AC...
TWILIO_AUTH_TOKEN=...
TWILIO_PHONE_NUMBER=+1...
```
### 3. Start the server
```bash
python scripts/webhook-server.py
```
### 4. Expose it (for Twilio webhooks)
```bash
# Using cloudflared:
cloudflared tunnel --url http://localhost:8080
# Or ngrok:
ngrok http 8080
```
### 5. Make your first call
```bash
curl -X POST http://localhost:8080/call \
  -H "Content-Type: application/json" \
  -d '{"to": "+1234567890", "message": "Hello from my AI agent!"}'
```
That's it — your agent is on the phone. 📞
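The same request can be sent from Python. This is a minimal sketch using only the standard library; the function names `build_call_request` and `place_call` are hypothetical, and the code assumes the `/call` endpoint accepts exactly the `to` and `message` fields shown in the curl example and replies with JSON.

```python
import json
import urllib.request


def build_call_request(base_url: str, to: str, message: str) -> urllib.request.Request:
    """Build the POST /call request shown in the curl example."""
    body = json.dumps({"to": to, "message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/call",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def place_call(base_url: str, to: str, message: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON reply (needs a running server)."""
    request = build_call_request(base_url, to, message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the server from step 3 running locally:
# place_call("http://localhost:8080", "+1234567890", "Hello from my AI agent!")
```

Keeping request construction separate from sending makes the payload easy to inspect without placing a real call.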
> **Next:** Check out the [examples/](examples/) folder for ready-to-use recipes like a missed-call → appointment handler.
>
> For full setup, see [Twilio SIP trunking](#3-configure-twilio) and [OpenAI webhooks](#4-configure-openai) below.
---
## What This Does
Voice as a first-class channel for your OpenClaw agent:
- **Call your agent** - Dial the Twilio number, talk to your agent
- **Agent calls you** - Outbound calls initiated by the agent
- **Session continuity** - Same phone number = same conversation, across voice and text channels
- **Full agent access** - Voice sessions can invoke OpenClaw's tools via `ask_openclaw`
## ✅ What's Working
| Feature | Status | Notes |
|---------|--------|-------|
| Voice channel in OpenClaw | ✅ | Shows in `openclaw status` |
| Outbound calls | ✅ | HTTP POST to `/call` endpoint |
| Inbound calls | ✅ | OpenAI Realtime handles conversation |
| Session sync | ✅ | Transcripts sync to OpenClaw sessions |
| Cross-channel context | ✅ | Voice ↔ Telegram share conversation history |
| `ask_openclaw` tool | ✅ | Voice can invoke full agent capabilities |
| Sub-200ms latency | ✅ | Native speech-to-speech |
## Why Native SIP?
Most voice solutions chain services with cumulative latency:
```
Phone → Twilio → Server → Deepgram STT → LLM → ElevenLabs TTS → Server → Phone
~300ms ~500ms ~500ms ~300ms
```
**This skill uses OpenAI's Realtime API with native SIP:**
```
Phone → Twilio SIP → OpenAI Realtime API → Phone
~200ms total
```
Single hop. Native speech-to-speech. Conversations feel natural.
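To make the comparison concrete, the approximate figures from the two diagrams can be added up. This is only back-of-the-envelope arithmetic on the numbers quoted above, not a measurement; the variable names are illustrative.

```python
# Approximate per-stage latencies quoted for the chained pipeline (milliseconds).
CHAINED_STAGE_LATENCIES_MS = [300, 500, 500, 300]
# The "~200ms total" figure quoted for the native SIP path.
NATIVE_SIP_TOTAL_MS = 200

chained_total_ms = sum(CHAINED_STAGE_LATENCIES_MS)
ratio = chained_total_ms / NATIVE_SIP_TOTAL_MS

print(f"chained pipeline: ~{chained_total_ms} ms")  # ~1600 ms
print(f"native SIP:       ~{NATIVE_SIP_TOTAL_MS} ms")
print(f"ratio:            ~{ratio:.0f}x")  # ~8x
```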
## Architecture
```
┌─────────────────────────────────────────────────────────────────────┐
│                           OpenClaw Agent                            │
│                                                                     │
│  ┌──────────────────┐    ┌──────────────────┐    ┌───────────────┐  │
│  │ Session Store    │◄───│ Session Bridge   │◄───│ Call Events   │  │
│  │ (voice:+1234...) │    │ (port 8082)      │    │               │  │
│  └──────────────────┘    └──────────────────┘    └───────┬───────┘  │
│                                   │                      │          │
│                                   ▼                      │          │
│  ┌───────────────────────────────────────────────────────┴───────┐  │
│  │ webhook-server.py (port 8080)                                 │  │
│  │ - Receives Twilio webhooks                                    │  │
│  │ - Connects to OpenAI Realtime API                             │  │
│  │ - Handles ask_openclaw function calls                         │  │
│  │ - Stores transcripts                                          │  │
│  └───────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘
                                   │
               ┌───────────────────┼───────────────────┐
               ▼                   ▼                   ▼
      ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
      │ Twilio SIP      │ │ OpenAI Realtime │ │ OpenClaw CLI    │
      │ (phone calls)   │ │ (voice AI)      │ │ (tool execution)│
      └─────────────────┘ └─────────────────┘ └─────────────────┘
```
### Key Components
| Component | Port | Description |
|-----------|------|-------------|
| webhook-server.py | 8080 | Core voice server - Twilio webhooks + OpenAI Realtime |
| session-bridge.ts | 8082 | Syncs transcripts to OpenClaw sessions |
| realtime_tool_handler.py | — | Handles `ask_openclaw` function calls |
| openclaw_executor.py | — | Bridges to OpenClaw CLI |
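As a rough illustration of how a tool handler and an executor could fit together, the sketch below routes an `ask_openclaw` function call to an injected executor callable. It is not the code in `realtime_tool_handler.py` or `openclaw_executor.py`: the tool's parameter schema (a single `request` string), the function name `handle_function_call`, and the JSON result shape are all assumptions made for this example.

```python
import json
from typing import Callable

# Hypothetical tool schema: the README names the tool `ask_openclaw` but does
# not show its parameters, so the single `request` field is an assumption.
ASK_OPENCLAW_TOOL = {
    "type": "function",
    "name": "ask_openclaw",
    "description": "Forward a request to the OpenClaw agent and return its answer.",
    "parameters": {
        "type": "object",
        "properties": {"request": {"type": "string"}},
        "required": ["request"],
    },
}


def handle_function_call(name: str, arguments_json: str, executor: Callable[[str], str]) -> str:
    """Route one function call to the executor and return a JSON result string."""
    if name != ASK_OPENCLAW_TOOL["name"]:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        request = json.loads(arguments_json)["request"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Malformed JSON, a missing field, or a non-object payload.
        return json.dumps({"error": "invalid arguments"})
    return json.dumps({"result": executor(request)})
```

Passing the executor in as a callable keeps the handler testable without the OpenClaw CLI installed; a real executor would be the place to apply `OPENCLAW_TIMEOUT`.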
### Session Sync Flow
1. **Call starts** → Bridge creates session key (`voice:+15551234567`)
2. **During call** → Transcript events sent to bridge
3. **Call ends** → Full transcript synced to OpenClaw session JSONL
4. **Cross-channel** → Same phone = same session in Telegram/other channels
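The flow above hinges on deriving a stable key from the caller's number. A minimal sketch of that idea follows; the helper names and the JSONL record fields (`session`, `role`, `text`) are hypothetical and may not match what the session bridge actually writes.

```python
import json
import re


def session_key(phone_number: str) -> str:
    """Normalise a phone number and build a `voice:+<digits>` session key."""
    digits = re.sub(r"\D", "", phone_number)  # drop spaces, dashes, parentheses
    if not digits:
        raise ValueError(f"no digits in phone number: {phone_number!r}")
    return f"voice:+{digits}"


def transcript_to_jsonl(key: str, turns: list[tuple[str, str]]) -> str:
    """Serialise (role, text) turns as JSONL, one JSON object per line."""
    lines = [json.dumps({"session": key, "role": role, "text": text}) for role, text in turns]
    return "\n".join(lines)


key = session_key("+1 (555) 123-4567")
print(key)  # voice:+15551234567
print(transcript_to_jsonl(key, [("user", "Hi"), ("assistant", "Hello!")]))
```

Because differently formatted versions of the same number normalise to one key, a later call or message from that number would land in the same session.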
## Setup
### Prerequisites
- Python 3.10+
- Node.js 18+ (for channel plugin)
- OpenClaw installed and configured
- Twilio account with phone number
- OpenAI API access (with Realtime API enabled)
### 1. Clone & Install
```bash
git clone https://github.com/nia-agent-cyber/openai-voice-skill.git
cd openai-voice-skill/scripts
pip install -r requirements.txt
```
### 2. Configure Environment
```bash
cp ../.env.example ../.env
```
Edit `.env`:
```bash
# Required
OPENAI_API_KEY=sk-...
OPENAI_PROJECT_ID=proj_...
# For outbound calls
TWILIO_ACCOUNT_SID=AC...
TWILIO_AUTH_TOKEN=...
TWILIO_PHONE_NUMBER=+14402915517
# Public URL (for Twilio webhooks)
PUBLIC_URL=https://api.niavoice.org
# Optional
PORT=8080
OPENCLAW_TIMEOUT=30
```
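A server reading these settings would typically validate them at startup. The sketch below shows one way to do that; `VoiceConfig` and `load_config` are hypothetical names, and treating `8080` and `30` as defaults for the optional settings is an assumption based on the example values above. It reads from the process environment, so loading the `.env` file itself (for example with a dotenv library) is left out.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_KEYS = ("OPENAI_API_KEY", "OPENAI_PROJECT_ID")
OUTBOUND_KEYS = ("TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN", "TWILIO_PHONE_NUMBER")


@dataclass(frozen=True)
class VoiceConfig:
    openai_api_key: str
    openai_project_id: str
    port: int
    openclaw_timeout: int
    outbound_enabled: bool


def load_config(env: Mapping[str, str] = os.environ) -> VoiceConfig:
    """Validate required settings and apply assumed defaults for optional ones."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")
    return VoiceConfig(
        openai_api_key=env["OPENAI_API_KEY"],
        openai_project_id=env["OPENAI_PROJECT_ID"],
        port=int(env.get("PORT", "8080")),  # assumed default
        openclaw_timeout=int(env.get("OPENCLAW_TIMEOUT", "30")),  # assumed default
        # Outbound calls need all three Twilio settings.
        outbound_enabled=all(env.get(key) for key in OUTBOUND_KEYS),
    )
```

Failing fast on a missing key gives a clearer error than a rejected API call later in the first phone conversation.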
### 3. Configure Twilio
**SIP Trunk (for OpenAI Realtime):**
1. Go to Elastic SIP Trunking → Create trunk
2. Origination SIP URI (where Twilio forwards inbound calls): `sip:<OPENAI_PROJECT_ID>@sip.api.openai.com;transport=tls`
3. Assign your phone number to the trunk
**Webhook (for outbound calls):**
1. Phone Numbers → Your number → Voice Configuration
2. Webhook URL: `https://your-domain/voice/twiml`
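For context, a Twilio voice webhook replies with TwiML, an XML document telling Twilio what to do with the call. The README does not show what `/voice/twiml` returns, so the following is only a guess at a plausible shape: a `<Dial><Sip>` instruction that bridges the call to a SIP URI. The function name `dial_sip_twiml` and the example URI are hypothetical.

```python
import xml.etree.ElementTree as ET


def dial_sip_twiml(sip_uri: str) -> str:
    """Build a TwiML document that dials the given SIP URI."""
    response = ET.Element("Response")
    dial = ET.SubElement(response, "Dial")
    sip = ET.SubElement(dial, "Sip")
    sip.text = sip_uri  # ElementTree escapes special characters if present
    return ET.tostring(response, encoding="unicode")


# Placeholder URI for illustration only.
print(dial_sip_twiml("sip:proj_123@sip.example.com;transport=tls"))
```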
### 4. Configure OpenAI
1. Go to platform.openai.com/settings
2. Project → Webhooks
3. Add your server URL + `/webhook`
4. Subscribe to `realtime.call.incoming`
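Once the subscription is active, the `/webhook` endpoint has to pick out incoming-call events from whatever else arrives. The sketch below shows that filtering step in isolation. It assumes the event body is JSON with a top-level `type` and a `data.call_id` field; check the payload format against OpenAI's webhook documentation. The name `incoming_call_id` is hypothetical, and signature verification, which a production endpoint should perform, is omitted.

```python
import json
from typing import Optional


def incoming_call_id(raw_body: bytes) -> Optional[str]:
    """Return the call ID for `realtime.call.incoming` events, else None."""
    try:
        event = json.loads(raw_body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return None
    if not isinstance(event, dict) or event.get("type") != "realtime.call.incoming":
        return None
    data = event.get("data")
    if not isinstance(data, dict):
        return None
    call_id = data.get("call_id")  # assumed field name
    return call_id if isinstance(call_id, str) else None
```

Returning `None` for anything unexpected lets the HTTP handler acknowledge unrelated events without acting on them.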
### 5. Run the Server
```bash
# Start the voice server
python webhook-server.py
# In production, use the cloudflare tunnel
cloudflared tunnel --url http://localhost:8080
```
### 6. Install Channel Plugin (Optional)
For full OpenClaw integration:
```bash
cd channel-plugin
npm install
npm run build
cp -r dist/* ~/.openclaw/extensions/voice-channel/
```
Add to OpenClaw config:
```yaml
channels:
  voice:
    accounts:
      default:
        enabled: true
        webhookUrl: "https://api.niavoice.org"
```
Restart OpenClaw:
```bash
openclaw gateway restart
```
## Usage
### Inbound Calls
Just call your Twilio number! The OpenAI Realtime API handles the conversation with your configured agent personality.
### Outbound Calls
**HTTP API:**
```bash
curl -X POST https://api.niavoice.org/call \
  -H "Content-Type: application/json" \
  -d '{
    "to": "+1234567890",
    "message": "Hello! This is your AI assistant calling."
  }'
```
**Response:**
```json
{
  "status": "initiated",
  "call_id": "CAxxxxxxxxxxxxxxxxxxxxx",
  "me
```
// HOW IT'S BUILT
TECHNOLOGY STACK
Python
JavaScript
This skill is built with Python and JavaScript.
KEY FILES
.env.example
README.md
// PROSKILLS SCORE
8.1/10 (Excellent)
BREAKDOWN
Code Quality: 7.5/10
Documentation: 8.5/10
Functionality: 8.5/10
Maintenance: 7.5/10
Security: 8/10
Uniqueness: 8.5/10
Usefulness: 8.5/10