Everything you need to integrate TruthProbe — the Trust + Efficiency Layer for LLM APIs.
Install the TruthProbe SDK for free local audit detection. Your API key never leaves your machine.
pip install truthprobe
Automatically hooks into the OpenAI SDK. Zero code changes needed beyond the import.
import truthprobe
truthprobe.patch()

import openai
client = openai.OpenAI()
# All calls are now audited automatically
Wrap your existing client for more control over configuration.
import truthprobe
import openai
client = openai.OpenAI()
audited = truthprobe.wrap(client)
response = audited.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)

After each request, your terminal shows a colored audit signature:
┌─────────────────────────────────────────┐
│ TruthProbe Audit                        │
│ Model: gpt-4o                  ✓ PASS   │
│ Text complexity: 8.7/10                 │
│ Timing CoV: 0.29                        │
│ Confidence: 94%                         │
└─────────────────────────────────────────┘
Get TruthProbe running in under 30 seconds.
curl -X POST https://truthprobe.com/register \
  -H "Content-Type: application/json" \
  -d '{
    "upstream_key": "sk-your-openai-key",
    "upstream_url": "https://api.openai.com/v1",
    "monthly_limit_usd": 50.0,
    "over_budget_action": "degrade"
  }'
# Response:
# {
#   "truth_probe_key": "tp-abc123...",
#   "endpoint": "https://truthprobe.com/v1/chat/completions",
#   "budget": { "monthly_limit_usd": 50.0, "over_budget_action": "degrade" }
# }

import openai
client = openai.OpenAI(
    api_key="tp-abc123...",               # Your TruthProbe key
    base_url="https://truthprobe.com/v1"  # Just change this line
)
# Use exactly as before — fully OpenAI-compatible
response = client.chat.completions.create(
    model="claude-opus-4-6",  # Request expensive model
    messages=[
        {"role": "user", "content": "Translate: Hello world"}
    ]
)
# Simple task → auto-routed to Sonnet, effort:low
# Saved ~80% vs. Opus pricing

import OpenAI from 'openai';
const client = new OpenAI({
  apiKey: 'tp-abc123...',
  baseURL: 'https://truthprobe.com/v1',
});

const response = await client.chat.completions.create({
  model: 'claude-opus-4-6',
  messages: [{ role: 'user', content: 'Translate: Hello' }],
});

// Every response includes _truth_probe metadata:
{
  "choices": [...],
  "usage": {"prompt_tokens": 12, "completion_tokens": 8},
  "_truth_probe": {
    "complexity": "simple",
    "effort": "low",
    "original_model": "claude-opus-4-6",
    "actual_model": "claude-sonnet-4-6",
    "cost_usd": 0.000744,
    "would_have_cost_usd": 0.00372,
    "saved_usd": 0.002976,
    "compression_tokens_saved": 0,
    "budget_action": null
  }
}

Every request passes through an optimization pipeline:
1. **Classify.** Pattern matching classifies your request as simple, medium, or complex based on keywords and token count.
2. **Budget check.** If you've exceeded your monthly limit, the request is either auto-degraded (cheaper model + low effort) or paused (HTTP 429), depending on your setting.
3. **Effort capping.** Simple tasks get effort:low + capped max_tokens. This reduces output tokens by 40-70% without quality loss on simple tasks.
4. **Model routing.** Simple tasks get routed to cheaper models: Opus→Sonnet, Sonnet→Haiku, GPT-4o→Mini. Complex tasks always keep your requested model.
5. **Cost accounting.** After the response, actual cost is calculated and logged. User spending is updated. The response includes full cost metadata.
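The pipeline above can be sketched in Python. This is illustrative only, not TruthProbe's actual source: the keyword lists and routing pairs come from the tables in this document, while the function and variable names are assumptions.

```python
# Illustrative sketch of TruthProbe's request pipeline (not the real implementation).
SIMPLE = {"translate", "summarize", "extract", "classify", "format", "list", "rewrite"}
COMPLEX = {"analyze", "compare", "design", "implement", "debug", "refactor"}
DOWNGRADE = {"claude-opus-4-6": "claude-sonnet-4-6", "gpt-4o": "gpt-4o-mini"}

def classify(prompt: str, tokens: int) -> str:
    # Step 1: keyword + token-count heuristics (simplified from the docs).
    text = prompt.lower()
    if tokens > 2000 or any(k in text for k in COMPLEX):
        return "complex"
    if any(k in text for k in SIMPLE):
        return "simple"
    return "medium"

def plan(prompt: str, tokens: int, model: str,
         spend: float, limit: float, action: str = "degrade") -> dict:
    complexity = classify(prompt, tokens)          # step 1
    if spend >= limit:                             # step 2: budget check
        if action == "pause":
            return {"http_status": 429, "budget_action": "pause"}
        return {"model": DOWNGRADE.get(model, model), "effort": "low",
                "budget_action": "degrade"}
    if complexity == "simple":                     # steps 3-4: cap effort, route down
        return {"model": DOWNGRADE.get(model, model), "effort": "low",
                "budget_action": None}
    return {"model": model, "effort": None, "budget_action": None}  # pass through
```

For example, a translation request under budget is planned as Sonnet-class with effort:low, while an "analyze" request passes through untouched.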
TruthProbe works with any OpenAI-compatible API. Model routing map:
| Requested Model | Simple Task Routes To | Savings |
|---|---|---|
| claude-opus-4-7/4-6 | claude-sonnet-4-6 | ~80% |
| claude-sonnet-4-6 | claude-haiku-4-5 | ~74% |
| gpt-4o | gpt-4o-mini | ~94% |
| gpt-5.2 | gpt-4o-mini | ~94% |
| deepseek-v4-pro | deepseek-v4-flash | ~80% |
Medium and complex tasks always use your requested model at full power.
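The routing map above can be expressed as a plain lookup table. A sketch, with model names taken from the table and the helper name purely illustrative:

```python
# Routing map from the table above; only simple tasks are downgraded.
ROUTES = {
    "claude-opus-4-7": "claude-sonnet-4-6",
    "claude-opus-4-6": "claude-sonnet-4-6",
    "claude-sonnet-4-6": "claude-haiku-4-5",
    "gpt-4o": "gpt-4o-mini",
    "gpt-5.2": "gpt-4o-mini",
    "deepseek-v4-pro": "deepseek-v4-flash",
}

def route(requested: str, complexity: str) -> str:
    """Return the model a request is actually served with."""
    if complexity == "simple":
        # Models not in the map fall through unchanged.
        return ROUTES.get(requested, requested)
    return requested  # medium/complex always keep the requested model
```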
Fully OpenAI-compatible. Supports streaming. Responses include _truth_probe cost metadata.
100% compatible with the openai SDK.

Register and get a TruthProbe key with budget settings.
// Request body:
{
  "upstream_key": "sk-xxx",        // required
  "upstream_url": "https://...",   // required
  "monthly_limit_usd": 50.0,       // optional, default 50
  "over_budget_action": "degrade"  // "degrade" or "pause"
}

Update budget settings. Requires Authorization header.
curl -X POST https://truthprobe.com/budget \
  -H "Authorization: Bearer tp-xxx" \
  -H "Content-Type: application/json" \
  -d '{"monthly_limit_usd": 100, "over_budget_action": "pause"}'
// Response:
// { "budget": {...}, "current_spend": {"2026-05": 33.70} }

Get your spending and savings breakdown.
curl https://truthprobe.com/user/stats \
  -H "Authorization: Bearer tp-xxx"
// Response:
// {
//   "total_requests": 847,
//   "total_saved_usd": 18.42,
//   "monthly_spend_usd": 33.70,
//   "monthly_limit_usd": 50.0,
//   "budget_remaining_usd": 16.30,
//   "budget_used_pct": "67.4%"
// }

Get recent request logs with per-request cost breakdown.
curl "https://truthprobe.com/logs?limit=5" \
  -H "Authorization: Bearer tp-xxx"
// Each log entry:
// {
//   "time": "...",
//   "complexity": "simple",
//   "requested_model": "claude-opus-4-6",
//   "actual_model": "claude-sonnet-4-6",
//   "effort": "low",
//   "cost_usd": 0.000744,
//   "saved_usd": 0.002976,
//   "budget_action": null
// }

Run TruthProbe on your own infrastructure:
docker run -d \
  -p 8000:8000 \
  -e UPSTREAM_API_KEY=sk-xxx \
  -e UPSTREAM_BASE_URL=https://api.openai.com/v1 \
  truthprobe/truthprobe:latest

# Or run locally:
pip install fastapi uvicorn httpx tiktoken python-dotenv
python main.py
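Since the local-run command installs python-dotenv, upstream configuration can presumably also live in a .env file next to main.py. A sketch; the variable names mirror the docker -e flags above, but the file-based setup itself is an assumption:

```shell
# .env -- read at startup via python-dotenv (assumed; mirrors the docker -e flags)
UPSTREAM_API_KEY=sk-xxx
UPSTREAM_BASE_URL=https://api.openai.com/v1
```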
How TruthProbe classifies request complexity:
| Classification | Triggers | Action |
|---|---|---|
| simple | translate, summarize, extract, classify, format, list, yes/no, rewrite | effort:low + model downgrade |
| medium | No simple/complex patterns, or <2000 tokens with 1 complex pattern | pass through (no changes) |
| complex | analyze, compare, design, implement, debug, refactor, multi-step, or >2000 tokens | pass through (no changes) |
Complex and medium tasks always use your original model at full power. Only simple tasks get optimized.
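One possible reading of the classification table as code (a sketch; the real classifier may weigh patterns differently, and this reading resolves the single-complex-pattern case in favor of medium, as the table's medium row describes):

```python
# Heuristic classifier sketched from the table above (illustrative only).
SIMPLE_PATTERNS = ("translate", "summarize", "extract", "classify",
                   "format", "list", "yes/no", "rewrite")
COMPLEX_PATTERNS = ("analyze", "compare", "design", "implement",
                    "debug", "refactor", "multi-step")

def classify(prompt: str, token_count: int) -> str:
    text = prompt.lower()
    complex_hits = sum(p in text for p in COMPLEX_PATTERNS)
    if token_count > 2000 or complex_hits > 1:
        return "complex"
    if complex_hits == 1:
        return "medium"   # <2000 tokens with a single complex pattern
    if any(p in text for p in SIMPLE_PATTERNS):
        return "simple"
    return "medium"       # no simple/complex patterns at all
```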