Semantic caching & pattern detection

Know what your LLM spend actually costs

A proxy worker that turns raw LLM traffic into cost candles, trace waterfalls, and pattern-based savings. 5ms overhead.

Start for free
View docs
tokytics.dev
Live
Tokytics
Overview
Analytics
Requests
Patterns
Prompt Lab
Keys
Overview
Total Cost
$0
Requests
1.24M
P95 Latency
842ms
Cache Rate
34.2%
Cost trend (30d)
By provider
OpenAI 42%
Anthropic 28%
Google 18%
OpenRouter 12%
Trace
proxy
5ms
cache
8ms
provider.llm
719ms
log
34ms
Live requests
200 gpt-4o-mini $0.003
429 gpt-4o
200 gemini-flash $0.001
Budget
team-a $420 / $500
team-b $180 / $300
proxy.forward status=200 provider=openai model=gpt-4o-mini cache=semantic clickhouse.insert latency_p95=842ms cost=$0.003 budget.check=pass patterns.match=3 tokens=1847 provider=anthropic
Features

Everything you need to
control LLM costs

From real-time monitoring to automatic optimization. No SDK changes required.

How it works

One proxy.
Full visibility.

Point your LLM calls at our proxy. We handle caching, budgets, logging, and analytics.

01
STEP 01

Your app calls our proxy

Swap your base URL. OpenAI, Anthropic, Google — all work. Zero SDK changes.

const res = await fetch(
  "https://proxy.tokytics.dev/v1/chat",
  { method: "POST", body: JSON.stringify({ model: "gpt-4o" }) }
);
↑ only change: base URL
02
STEP 02

Edge proxy in 5ms

A Cloudflare Worker routes each request, checks the cache, enforces your budget, and forwards upstream.

Route
1.2ms
Cache
0.8ms
Budget
0.5ms
Forward
2.5ms
5ms
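The four stages above can be sketched as a single handler. This is a minimal, illustrative model only: every name here (`routeProvider`, `checkBudget`, `handle`, the in-memory maps standing in for edge KV and budget storage) is hypothetical, not the Tokytics implementation.

```typescript
// Hypothetical sketch of the proxy pipeline: route → cache → budget → forward.

type CacheEntry = { body: string };

const cache = new Map<string, CacheEntry>(); // stand-in for an edge KV store
const budgets = new Map<string, { spent: number; limit: number }>();

function routeProvider(model: string): string {
  // Route: pick an upstream from the model name.
  if (model.startsWith("gpt-")) return "https://api.openai.com";
  if (model.startsWith("claude-")) return "https://api.anthropic.com";
  return "https://generativelanguage.googleapis.com";
}

function checkBudget(team: string, cost: number): boolean {
  // Budget: reject when the team's projected spend would exceed its cap.
  const b = budgets.get(team);
  return !b || b.spent + cost <= b.limit;
}

async function handle(team: string, model: string, prompt: string): Promise<string> {
  const key = `${model}:${prompt}`;
  const hit = cache.get(key); // Cache: lookup before spending anything
  if (hit) return hit.body;
  if (!checkBudget(team, 0.003)) throw new Error("budget exceeded");
  // Forward: in a real Worker this would be fetch(routeProvider(model) + "/v1/...").
  const body = `response-for:${prompt} via ${routeProvider(model)}`;
  cache.set(key, { body });
  return body;
}
```

The cache check runs before the budget check on purpose: a cache hit costs nothing, so it should never be blocked by a spend cap.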
03
STEP 03

Async log ingest

Every request, token count, latency, and cost streams into ClickHouse.

gpt-4o 1,847 $0.03
sonnet 923 $0.01
flash 412 $0.00
streaming to ClickHouse
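The ingest step above can be sketched with ClickHouse's HTTP interface and its JSONEachRow format (one JSON object per line). The table name `llm_requests` and the column names are illustrative assumptions, not the real Tokytics schema.

```typescript
// Hypothetical shape of the async log pipeline: serialize each request as
// one JSONEachRow line and POST the batch to ClickHouse over HTTP.

type RequestLog = {
  model: string;
  tokens: number;
  cost_usd: number;
  latency_ms: number;
};

function toJsonEachRow(rows: RequestLog[]): string {
  // JSONEachRow is newline-delimited JSON: one object per row.
  return rows.map((r) => JSON.stringify(r)).join("\n");
}

async function flush(rows: RequestLog[], endpoint: string): Promise<void> {
  // Fire-and-forget insert; in a Worker this would run via ctx.waitUntil()
  // so logging never adds latency to the client response.
  await fetch(`${endpoint}/?query=INSERT%20INTO%20llm_requests%20FORMAT%20JSONEachRow`, {
    method: "POST",
    body: toJsonEachRow(rows),
  });
}
```

Batching rows into a single insert keeps ClickHouse happy: it strongly prefers fewer, larger inserts over many single-row writes.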
04
STEP 04

See everything

Cost candles, trace waterfalls, pattern detection — real-time in your dashboard.

$1.2k
saved
34%
cached
842ms
p95
Quickstart

Two lines.
That's the whole setup.

Replace your provider's base URL with ours. Your API keys, models, and parameters stay exactly the same.

2
lines changed
30s
to integrate
0
dependencies
import OpenAI from "openai"

const client = new OpenAI({
  baseURL: "https://proxy.tokytics.dev/v1",
  apiKey: process.env.OPENAI_KEY
});

// That's it. All calls now go through Tokytics.
proxy.tokytics.dev ○ waiting
Security

Your keys never
touch disk.

Zero-knowledge proxy. Provider keys pass through encrypted, never stored. Every request authenticated at the edge.

API key received
Encrypting in memory
Forwarded to provider
Purged from memory

Zero-knowledge pass-through

Keys are encrypted in memory, forwarded, and immediately purged. Nothing persisted.
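The pass-through can be modeled as a key that exists only inside one request's scope. This is a conceptual sketch, not the Tokytics implementation; `forwardWithKey` and the `Upstream` type are invented for illustration, and note that a garbage-collected runtime can only drop the last reference, not physically zero the string.

```typescript
// Illustrative pass-through: the provider key lives for the lifetime of a
// single request, is used to build one Authorization header, and is never
// written to storage or logs.

type Upstream = (headers: Record<string, string>) => Promise<number>;

async function forwardWithKey(providerKey: string, upstream: Upstream): Promise<number> {
  let key: string | null = providerKey; // key exists only in this scope
  try {
    return await upstream({ Authorization: `Bearer ${key}` });
  } finally {
    key = null; // drop the reference immediately after the forward;
    // nothing after this point can read the key, and it is never persisted
  }
}
```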

time | event | status | detail

Immutable audit trail

Every auth, forward, and budget check logged. RBAC controls. 7-365 day retention.

AES-256 encryption

V8 isolates. Dedicated ClickHouse nodes. SOC 2 compliant infrastructure. Encrypted at rest.

AES-256
SOC 2
V8 Isolates
Zero-knowledge
RBAC
OpenAI
Anthropic
Google
Cohere
Mistral
Groq
Free tier available

Start saving on

LLM costs today.

No credit card. No SDK changes. Set up in 30 seconds.

Start for free
Talk to us
Free tier
No credit card
SOC 2
< 5ms overhead