Documentation

Everything you need to integrate TruthProbe — the Trust + Efficiency Layer for LLM APIs.

Free SDK (Audit Only)

Install the TruthProbe SDK for free local audit detection. Your API key never leaves your machine.

Installation

pip install truthprobe

Usage

Option 1: Auto-patch (recommended)

Automatically hooks into the OpenAI SDK. Zero code changes needed beyond the import.

import truthprobe
truthprobe.patch()

import openai
client = openai.OpenAI()
# All calls are now audited automatically

Option 2: Explicit wrap

Wrap your existing client for more control over configuration.

import truthprobe
import openai

client = openai.OpenAI()
audited = truthprobe.wrap(client)

response = audited.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)

What you'll see

After each request, your terminal shows a colored audit signature:

┌─────────────────────────────────────────┐
│  TruthProbe Audit                       │
│  Model: gpt-4o  ✓ PASS                  │
│  Text complexity: 8.7/10                │
│  Timing CoV: 0.29                       │
│  Confidence: 94%                        │
└─────────────────────────────────────────┘

The SDK reports anonymized signals (timing, text metrics) to improve public rankings. No prompt or response content is ever sent.

Quick Start

Get TruthProbe running in under 30 seconds.

1. Register and set your budget

curl -X POST https://truthprobe.com/register \
  -H "Content-Type: application/json" \
  -d '{
    "upstream_key": "sk-your-openai-key",
    "upstream_url": "https://api.openai.com/v1",
    "monthly_limit_usd": 50.0,
    "over_budget_action": "degrade"
  }'

# Response:
# {
#   "truth_probe_key": "tp-abc123...",
#   "endpoint": "https://truthprobe.com/v1/chat/completions",
#   "budget": { "monthly_limit_usd": 50.0, "over_budget_action": "degrade" }
# }
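
The same registration can be done from Python. A minimal sketch using only the standard library — the endpoint and fields are exactly those from the curl example above; `build_register_payload` is just an illustrative helper:

```python
import json
import urllib.request

def build_register_payload(upstream_key, upstream_url,
                           monthly_limit_usd=50.0, over_budget_action="degrade"):
    # Same fields as the curl example; over_budget_action is "degrade" or "pause".
    return {
        "upstream_key": upstream_key,
        "upstream_url": upstream_url,
        "monthly_limit_usd": monthly_limit_usd,
        "over_budget_action": over_budget_action,
    }

def register(payload):
    # POST /register returns your tp- key and the configured budget.
    req = urllib.request.Request(
        "https://truthprobe.com/register",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["truth_probe_key"]
```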

2. Use it in your code

Python
import openai

client = openai.OpenAI(
    api_key="tp-abc123...",           # Your TruthProbe key
    base_url="https://truthprobe.com/v1"  # Just change this line
)

# Use exactly as before — fully OpenAI-compatible
response = client.chat.completions.create(
    model="claude-opus-4-6",  # Request expensive model
    messages=[
        {"role": "user", "content": "翻译: Hello world"}
    ]
)
# Simple task → auto-routed to Sonnet, effort:low
# Saved ~80% vs. Opus pricing

Node.js
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'tp-abc123...',
  baseURL: 'https://truthprobe.com/v1',
});

const response = await client.chat.completions.create({
  model: 'claude-opus-4-6',
  messages: [{ role: 'user', content: 'Translate: Hello' }],
});

3. Check the response metadata

// Every response includes _truth_probe metadata:
{
  "choices": [...],
  "usage": {"prompt_tokens": 12, "completion_tokens": 8},
  "_truth_probe": {
    "complexity": "simple",
    "effort": "low",
    "original_model": "claude-opus-4-6",
    "actual_model": "claude-sonnet-4-6",
    "cost_usd": 0.000744,
    "would_have_cost_usd": 0.00372,
    "saved_usd": 0.002976,
    "compression_tokens_saved": 0,
    "budget_action": null
  }
}
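
Note that typed SDK response objects may drop unknown top-level fields, so read the raw JSON (a plain HTTP client, or the SDK's raw-response mode) when you need _truth_probe. The numbers above are internally consistent — 0.002976 saved on 0.00372 is exactly 80% — as this small illustrative helper shows:

```python
def summarize_savings(meta):
    # meta is the "_truth_probe" dict from a response parsed as plain JSON.
    pct = 100.0 * meta["saved_usd"] / meta["would_have_cost_usd"]
    return (f"{meta['original_model']} -> {meta['actual_model']}: "
            f"${meta['cost_usd']:.6f} spent, {pct:.0f}% saved")

meta = {
    "original_model": "claude-opus-4-6",
    "actual_model": "claude-sonnet-4-6",
    "cost_usd": 0.000744,
    "would_have_cost_usd": 0.00372,
    "saved_usd": 0.002976,
}
print(summarize_savings(meta))
# claude-opus-4-6 -> claude-sonnet-4-6: $0.000744 spent, 80% saved
```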

How It Works

Every request passes through an optimization pipeline:

Step 1: Complexity Detection

Pattern matching classifies your request as simple, medium, or complex based on keywords and token count.
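
The exact patterns are internal, but the logic can be sketched roughly like this — keywords are taken from the Complexity Detection table at the end of this page, while the multi-pattern rule and thresholds here are illustrative:

```python
SIMPLE_KEYWORDS = ("translate", "summarize", "extract", "classify",
                   "format", "list", "rewrite")
COMPLEX_KEYWORDS = ("analyze", "compare", "design", "implement",
                    "debug", "refactor")

def classify(prompt, token_count):
    # Rough approximation: long or multi-pattern requests are complex,
    # a single complex keyword is medium, simple keywords are simple.
    text = prompt.lower()
    n_complex = sum(k in text for k in COMPLEX_KEYWORDS)
    if token_count > 2000 or n_complex >= 2:
        return "complex"
    if n_complex == 1:
        return "medium"
    if any(k in text for k in SIMPLE_KEYWORDS):
        return "simple"
    return "medium"
```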

Step 2: Budget Check

If you've exceeded your monthly limit, the request is either auto-degraded (cheaper model + low effort) or paused (HTTP 429), depending on your setting.
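
With over_budget_action set to "pause", your client should be ready for 429s. One way to handle that is a generic retry wrapper; the retry counts and delays below are arbitrary, and raising the limit via POST /budget is the other way out:

```python
import time

def with_budget_retry(call, retries=3, base_delay=1.0, sleep=time.sleep):
    # Retries `call` when it fails with an HTTP 429 (budget paused);
    # any other error, or exhausting the retries, is re-raised.
    for attempt in range(retries):
        try:
            return call()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            if status != 429 or attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # exponential backoff

# Usage with any client call:
#   response = with_budget_retry(lambda: client.chat.completions.create(...))
```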

Step 3: Effort Optimization

Simple tasks get effort:low and a capped max_tokens, which reduces output tokens by 40-70% without quality loss.
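
In request terms, the transformation might look like this — the cap value of 256 is a made-up placeholder, not TruthProbe's actual number:

```python
def apply_effort(request, complexity, max_tokens_cap=256):
    # Only simple tasks are touched; the 256-token cap is illustrative.
    if complexity != "simple":
        return request
    optimized = dict(request)
    optimized["effort"] = "low"
    optimized["max_tokens"] = min(optimized.get("max_tokens", max_tokens_cap),
                                  max_tokens_cap)
    return optimized
```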

Step 4: Model Routing

Simple tasks get routed to cheaper models: Opus→Sonnet, Sonnet→Haiku, GPT-4o→Mini. Complex tasks always keep your requested model.
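
The routing can be pictured as a simple lookup. This dict mirrors the table in the Supported Models section; unknown models pass through unchanged:

```python
DOWNGRADE = {
    "claude-opus-4-7": "claude-sonnet-4-6",
    "claude-opus-4-6": "claude-sonnet-4-6",
    "claude-sonnet-4-6": "claude-haiku-4-5",
    "gpt-4o": "gpt-4o-mini",
    "gpt-5.2": "gpt-4o-mini",
    "deepseek-v4-pro": "deepseek-v4-flash",
}

def route(model, complexity):
    # Only simple tasks are downgraded; everything else keeps the requested model.
    return DOWNGRADE.get(model, model) if complexity == "simple" else model
```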

Step 5: Cost Tracking

After the response returns, the actual cost is calculated, logged, and added to your monthly spend, and the full cost metadata is attached to the response.
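
The arithmetic is plain per-token pricing. A sketch — the price arguments are placeholders, since each provider publishes its own per-million-token rates:

```python
def token_cost(prompt_tokens, completion_tokens, in_per_mtok, out_per_mtok):
    # Cost in USD given per-million-token input/output prices.
    return (prompt_tokens * in_per_mtok + completion_tokens * out_per_mtok) / 1_000_000

def saved(prompt_tokens, completion_tokens, requested_prices, actual_prices):
    # saved_usd = what the requested model would have cost minus what was paid.
    would = token_cost(prompt_tokens, completion_tokens, *requested_prices)
    paid = token_cost(prompt_tokens, completion_tokens, *actual_prices)
    return would - paid
```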

Supported Models

TruthProbe works with any OpenAI-compatible API. Model routing map:

| Requested Model | Simple Task Routes To | Savings |
| --- | --- | --- |
| claude-opus-4-7 / claude-opus-4-6 | claude-sonnet-4-6 | ~80% |
| claude-sonnet-4-6 | claude-haiku-4-5 | ~74% |
| gpt-4o | gpt-4o-mini | ~94% |
| gpt-5.2 | gpt-4o-mini | ~94% |
| deepseek-v4-pro | deepseek-v4-flash | ~80% |

Medium and complex tasks always use your requested model at full power.

API Reference

POST /v1/chat/completions

Fully OpenAI-compatible. Supports streaming. Responses include _truth_probe cost metadata.

100% compatible with the official openai SDKs.

POST /register

Register and get a TruthProbe key with budget settings.

// Request body:
{
  "upstream_key": "sk-xxx",        // required
  "upstream_url": "https://...",   // required
  "monthly_limit_usd": 50.0,      // optional, default 50
  "over_budget_action": "degrade"  // "degrade" or "pause"
}

POST /budget

Update budget settings. Requires Authorization header.

curl -X POST https://truthprobe.com/budget \
  -H "Authorization: Bearer tp-xxx" \
  -H "Content-Type: application/json" \
  -d '{"monthly_limit_usd": 100, "over_budget_action": "pause"}'

// Response:
// { "budget": {...}, "current_spend": {"2026-05": 33.70} }

GET /user/stats

Get your spending and savings breakdown.

curl https://truthprobe.com/user/stats \
  -H "Authorization: Bearer tp-xxx"

// Response:
// {
//   "total_requests": 847,
//   "total_saved_usd": 18.42,
//   "monthly_spend_usd": 33.70,
//   "monthly_limit_usd": 50.0,
//   "budget_remaining_usd": 16.30,
//   "budget_used_pct": "67.4%"
// }
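
The derived fields follow directly from the raw numbers. A quick sanity check of the example response, using a hypothetical helper:

```python
def budget_summary(monthly_spend_usd, monthly_limit_usd):
    # Recomputes the derived fields reported by /user/stats.
    return {
        "budget_remaining_usd": round(monthly_limit_usd - monthly_spend_usd, 2),
        "budget_used_pct": f"{100 * monthly_spend_usd / monthly_limit_usd:.1f}%",
    }

print(budget_summary(33.70, 50.0))
# {'budget_remaining_usd': 16.3, 'budget_used_pct': '67.4%'}
```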

GET /logs

Get recent request logs with per-request cost breakdown.

curl "https://truthprobe.com/logs?limit=5" \
  -H "Authorization: Bearer tp-xxx"

// Each log entry:
// {
//   "time": "...",
//   "complexity": "simple",
//   "requested_model": "claude-opus-4-6",
//   "actual_model": "claude-sonnet-4-6",
//   "effort": "low",
//   "cost_usd": 0.000744,
//   "saved_usd": 0.002976,
//   "budget_action": null
// }
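
A typical use of /logs is totaling what the router did for you. A small sketch over parsed log entries:

```python
def summarize_logs(entries):
    # entries: list of dicts as returned by GET /logs.
    total_saved = sum(e["saved_usd"] for e in entries)
    downgraded = sum(1 for e in entries
                     if e["actual_model"] != e["requested_model"])
    return {"total_saved_usd": round(total_saved, 6),
            "downgraded_requests": downgraded}
```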

Self-Hosting

Run TruthProbe on your own infrastructure:

docker run -d \
  -p 8000:8000 \
  -e UPSTREAM_API_KEY=sk-xxx \
  -e UPSTREAM_BASE_URL=https://api.openai.com/v1 \
  truthprobe/truthprobe:latest

# Or run locally:
pip install fastapi uvicorn httpx tiktoken python-dotenv
python main.py

Complexity Detection

How TruthProbe classifies request complexity:

| Classification | Triggers | Action |
| --- | --- | --- |
| simple | translate, summarize, extract, classify, format, list, yes/no, rewrite | effort:low + model downgrade |
| medium | no simple/complex patterns, or <2000 tokens with 1 complex pattern | pass through (no changes) |
| complex | analyze, compare, design, implement, debug, refactor, multi-step, or >2000 tokens | pass through (no changes) |

Complex and medium tasks always use your original model at full power. Only simple tasks get optimized.