TruthProbe — the independent trust layer for LLM API transit providers.
TruthProbe is an independent third-party tool that helps you verify whether your LLM transit provider is actually serving the model you're paying for. Install our SDK, audit your provider, and compare prices across providers — all without changing your existing workflow.
Detect if your transit provider is secretly swapping expensive models for cheaper ones.
Compare transit providers by trust score, speed, and price — per model.
Anonymous signals from SDK users feed into public rankings. More users = more accurate data.
The TruthProbe MCP server integrates directly into your AI coding tool. Zero code changes — just add to your config and get continuous trust monitoring.
Works with Claude Code, Codex CLI, Cursor, Windsurf, Hermes Agents, and any other AI tool that supports MCP.
npm install -g truthprobe-mcp
Add to your AI tool's MCP config (e.g., ~/.claude/claude_desktop_config.json for Claude Code):
{
  "mcpServers": {
    "truthprobe": {
      "command": "truthprobe-mcp",
      "env": {
        "ANTHROPIC_BASE_URL": "https://your-transit.com/v1",
        "ANTHROPIC_API_KEY": "sk-xxx"
      }
    }
  }
}

Or set environment variables for auto-detection:
# If your AI tool already uses these env vars,
# TruthProbe auto-detects them, so no config is needed.
export ANTHROPIC_BASE_URL="https://your-transit.com/v1"
export ANTHROPIC_API_KEY="sk-xxx"
truthprobe_verify: Run an on-demand audit probe. Ask your AI: "Is my transit provider honest?"
truthprobe_status: Get current provider status: availability, trust score, and balance at a glance.
truthprobe_balance: Track spending and detect unusual consumption. Shows total spent, daily average, and trend alerts.
truthprobe_report: Generate a full trust report with historical probe data and trends.
truthprobe_ranking: Compare your provider against others for a specific model.
The MCP server automatically runs probes every 30 minutes in the background. If trust drops or anomalies are detected, your AI tool is alerted proactively — no action needed from you.
The TruthProbe Python SDK runs locally on your machine. Your API keys never leave your environment.
pip install truthprobe
One line hooks into the OpenAI SDK. All subsequent API calls are audited automatically.
import truthprobe
truthprobe.patch()

import openai
client = openai.OpenAI(
    api_key="sk-xxx",
    base_url="https://your-transit.com/v1"
)
response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello"}]
)
# Terminal shows audit result automatically

Run a targeted audit against a specific provider and model without patching your client.
import truthprobe
result = truthprobe.verify(
    provider_url="https://your-transit.com/v1",
    api_key="sk-xxx",
    model="claude-sonnet-4-6"
)
print(result)
# {
#   "trust_score": 87.3,
#   "confidence": 0.94,
#   "signals": {
#     "text_complexity": 8.7,
#     "timing_cov": 0.29,
#     "model_field_match": true
#   },
#   "verdict": "PASS"
# }

Query your remaining balance at any transit provider that exposes a balance endpoint.
import truthprobe
balance = truthprobe.balance(
    provider_url="https://your-transit.com",
    api_key="sk-xxx"
)
print(f"Balance: {balance['balance_usd']:.2f} USD")

The truthprobe CLI gives you quick access to audit reports and verification from your terminal.
Show a detailed audit summary for your current provider. Includes trust score breakdown and confidence level.
$ truthprobe report
┌─────────────────────────────────────────────────┐
│  TruthProbe Weekly Report                       │
│                                                 │
│  Provider: your-transit.com                     │
│  Model: claude-sonnet-4-6                       │
│  Requests audited: 142                          │
│                                                 │
│  Trust Score: 87.3 / 100                        │
│  ├─ Text analysis: 8.7/10 (45%)                 │
│  ├─ Timing variance: 0.29 CoV (40%)             │
│  └─ Model field match: ✓ (15%)                  │
│                                                 │
│  Confidence: 94% (SPRT: 20+ observations)       │
│  Verdict: ✓ PASS                                │
└─────────────────────────────────────────────────┘
$ truthprobe score --provider your-transit.com
Trust Score: 87.3 (PASS)
Based on 142 observations, 94% confidence
$ export TRUTHPROBE_API_KEY=sk-xxx
$ truthprobe verify --provider https://your-transit.com/v1 --model claude-sonnet-4-6
Running 5 probe tests...
[1/5] Instruction precision: PASS
[2/5] Response timing: PASS
[3/5] Text complexity: PASS
[4/5] Model field: PASS
[5/5] Entropy check: PASS
Overall: PASS (score: 87.3)
TruthProbe uses 3 independent signals to detect model substitution. Each signal is weighted and combined using SPRT (Sequential Probability Ratio Test) for statistical confidence.
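To make the SPRT step concrete, here is a minimal sketch of Wald's sequential test. It is not TruthProbe's actual implementation: it assumes each probe has already been reduced to a binary outcome (the response looked like the requested model, or it did not), and the pass rates `p_genuine` and `p_swapped` as well as the error rates `alpha` and `beta` are hypothetical values chosen for illustration.

```python
import math


def sprt(observations, p_genuine=0.9, p_swapped=0.5, alpha=0.05, beta=0.05):
    """Wald's SPRT over binary probe outcomes (True = probe looked genuine).

    H0: the provider serves the requested model, P(pass) = p_genuine.
    H1: the provider serves a substitute,        P(pass) = p_swapped.
    All four parameters are illustrative assumptions.
    Returns (verdict, number_of_observations_used).
    """
    upper = math.log((1 - beta) / alpha)  # crossing this accepts H1
    lower = math.log(beta / (1 - alpha))  # crossing this accepts H0
    llr = 0.0  # running log-likelihood ratio of H1 against H0
    for n, passed in enumerate(observations, start=1):
        if passed:
            llr += math.log(p_swapped / p_genuine)
        else:
            llr += math.log((1 - p_swapped) / (1 - p_genuine))
        if llr >= upper:
            return "FAIL", n
        if llr <= lower:
            return "PASS", n
    return "UNDECIDED", len(observations)


if __name__ == "__main__":
    print(sprt([True] * 10))
    print(sprt([True, False, True, True]))
```

The appeal of a sequential test is that it stops as soon as the evidence is strong enough in either direction, so a clearly honest or clearly dishonest provider needs fewer probes than a borderline one.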
Evaluates vocabulary richness, reasoning depth, stylistic markers, and semantic complexity against known baselines for each model family.
Analyzes response time distributions (TTFT, inter-token intervals, throughput). Each model architecture has a characteristic timing signature that is hard to fake.
Checks if the model identifier returned matches what was requested. Low weight because providers can trivially spoof this field.
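The timing signal is reported as a coefficient of variation (the `timing_cov` field and the "CoV" figure in the report). As a rough sketch of how such a number could be derived from streamed token arrival times, assuming the sample standard deviation is used (the source does not say which estimator TruthProbe applies), and with hypothetical function names:

```python
import statistics


def inter_token_intervals(token_timestamps):
    """Gaps between consecutive token arrival times (same unit as the input)."""
    return [b - a for a, b in zip(token_timestamps, token_timestamps[1:])]


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean.

    A unitless measure of spread, so it can be compared across
    providers with different absolute latencies.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples)
    if mean <= 0:
        raise ValueError("mean must be positive")
    return statistics.stdev(samples) / mean


if __name__ == "__main__":
    # Hypothetical arrival times in seconds for five streamed tokens.
    arrivals = [0.28, 0.31, 0.35, 0.38, 0.43]
    print(coefficient_of_variation(inter_token_intervals(arrivals)))
```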
| Detection Scenario | Accuracy |
|---|---|
| Large gap swap (e.g., Opus → Haiku) | 92-97% |
| Adjacent tier (e.g., Opus → Sonnet) | 85-90% |
| Cross-family (Claude vs GPT) | 95%+ |
The ranking page shows per-model provider comparisons. Data comes from three sources with automatic priority:
Real API calls with balance-diff price verification. The gold standard — measures actual cost by checking balance before and after a request.
Automated crawl of provider /v1/models endpoints and OpenRouter pricing data. Runs twice daily and needs no API keys.
Official upstream pricing × known provider markup. Used as fallback when no real data is available.
Higher priority data automatically overrides lower. As more probe data comes in, rankings become increasingly accurate.
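The two ideas above, balance-diff pricing and source priority, can be sketched as follows. This is an illustration under stated assumptions, not TruthProbe's code: only the `balance_diff` label appears in the API response shown in this document, while the `crawl` and `estimate` labels, the function names, and the USD-per-million-tokens unit are hypothetical. The sketch also ignores the difference between input and output token prices.

```python
# Source labels from highest to lowest priority.
# "balance_diff" appears in the ranking response; the other two are made up here.
PRIORITY = ["balance_diff", "crawl", "estimate"]


def price_from_balance_diff(balance_before, balance_after, tokens_used):
    """Implied price in USD per million tokens from a balance drop."""
    if tokens_used <= 0:
        raise ValueError("tokens_used must be positive")
    spent = balance_before - balance_after
    return spent / tokens_used * 1_000_000


def pick_price(records):
    """Return the record that comes from the highest-priority source."""
    return min(records, key=lambda r: PRIORITY.index(r["price_source"]))


if __name__ == "__main__":
    records = [
        {"price_source": "estimate", "price": 3.0},
        {"price_source": "balance_diff",
         "price": price_from_balance_diff(10.0, 9.9972, 1000)},
    ]
    print(pick_price(records))
```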
When you use the SDK with reporting enabled (default), anonymized audit signals are submitted to TruthProbe. This crowd-sourced data powers the public rankings.
import truthprobe

# Community reporting is enabled by default
truthprobe.patch()

# To opt out:
truthprobe.patch(report=False)

# Data submitted is anonymous:
# - provider domain (e.g., "xxx-transit.com")
# - model name
# - timing metrics
# - trust score
# NO prompts, responses, or API keys are ever sent
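To illustrate what reducing a request to anonymous signals might look like, the sketch below keeps only the provider's hostname and a few metrics. The function name and the dictionary keys are hypothetical and are not the SDK's real internals; the point is that the URL path, query string, credentials, prompts, and responses never enter the payload.

```python
from urllib.parse import urlparse


def build_signal(provider_url, model, ttfb_ms, trust_score):
    """Build an anonymous signal from a provider URL and audit metrics.

    Only the hostname is kept from the URL, so any key or token embedded
    in the userinfo, path, or query string is dropped.
    Field names are illustrative assumptions.
    """
    domain = urlparse(provider_url).hostname
    if domain is None:
        raise ValueError("provider_url has no hostname")
    return {
        "provider_domain": domain,
        "model": model,
        "ttfb_ms": ttfb_ms,
        "trust_score": trust_score,
    }


if __name__ == "__main__":
    print(build_signal("https://your-transit.com/v1", "claude-sonnet-4-6", 280, 85.2))
```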
Returns the list of models that have ranking data available.
curl https://truthprobe.com/ranking/models
// Response:
// {
//   "models": ["claude-sonnet-4-6", "claude-haiku-4-5", "gpt-4o", ...]
// }

Returns provider rankings for a specific model, sorted by trust score.
curl "https://truthprobe.com/ranking?model=claude-sonnet-4-6"
// Response:
// {
//   "model": "claude-sonnet-4-6",
//   "providers": [
//     {
//       "provider_id": "timesniper",
//       "trust_score": 85.2,
//       "speed_ms": 340,
//       "price_input": 2.8,
//       "price_output": 11.2,
//       "vs_official": 0.93,
//       "probe_count": 10,
//       "price_source": "balance_diff"
//     }
//   ]
// }

Submit anonymous probe data from the SDK. Used internally by truthprobe.patch(), so you don't need to call this directly.
Rate limit: 60/min/IP
curl -X POST https://truthprobe.com/api/v1/submit-probe \
  -H "Content-Type: application/json" \
  -d '{
    "provider_id": "timesniper",
    "model": "claude-sonnet-4-6",
    "trust_score": 85.2,
    "speed_ms": 340,
    "prompt_tokens": 12,
    "completion_tokens": 156,
    "ttfb_ms": 280
  }'
// Rate limit: 60 requests/min/IP
// Dedup: same data within 5s is ignored

Change your base_url to route through TruthProbe for automatic audit + budget enforcement on every request. No SDK installation needed: it works with any OpenAI-compatible client.
# Coming soon — change one line to get audit + budget enforcement:
import openai
client = openai.OpenAI(
api_key="tp-xxx", # TruthProbe Pro key
base_url="https://truthprobe.com/v1" # Route through TruthProbe
)
# Features:
# ✓ Automatic model audit on every request
# ✓ Monthly budget enforcement (hard cap)
# ✓ Team spending dashboard
# ✓ Alert notifications (budget / trust / balance)
# ✓ Request logs with cost breakdown