Semantic caching & pattern detection

Know what your LLM spend actually costs

A proxy worker that turns raw LLM traffic into cost candles, trace waterfalls, and pattern-based savings. 5ms overhead.

Start for free
View docs
tokytics.dev
Live
Tokytics
Overview
Analytics
Requests
Patterns
Prompt Lab
Keys
Overview
Total Cost
$0
Requests
1.24M
P95 Latency
842ms
Cache Rate
34.2%
Cost trend (30d)
By provider
OpenAI 42%
Anthropic 28%
Google 18%
OpenRouter 12%
Trace
proxy
5ms
cache
8ms
provider.llm
719ms
log
34ms
Live requests
200 gpt-4o-mini $0.003
429 gpt-4o
200 gemini-flash $0.001
Budget
team-a $420 / $500
team-b $180 / $300
proxy.forward status=200 provider=openai model=gpt-4o-mini cache=semantic clickhouse.insert latency_p95=842ms cost=$0.003 budget.check=pass patterns.match=3 tokens=1847 provider=anthropic
Features

Everything you need to
control LLM costs

From real-time monitoring to automatic optimization. No SDK changes required.

How it works

One proxy.
Full visibility.

Point your LLM calls at our proxy. We handle caching, budgets, logging, and analytics.

01
STEP 01

Your app calls our proxy

Swap your base URL. OpenAI, Anthropic, Google — all work. Zero SDK changes.

const res = await fetch(
  "https://proxy.tokytics.dev/v1/chat",
  { method: "POST", body: JSON.stringify({ model: "gpt-4o" }) }
);
↑ only change: base URL
02
STEP 02

Edge proxy in 5ms

A Cloudflare Worker routes each request, checks the cache, enforces your budget, and forwards upstream.

Route
1.2ms
Cache
0.8ms
Budget
0.5ms
Forward
2.5ms
5ms
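The four stages above can be sketched as a single handler. This is a minimal, illustrative model only: every name here (`routeProvider`, `checkBudget`, `handle`, the in-memory maps standing in for edge KV and budget storage) is hypothetical, not the Tokytics implementation.

```typescript
// Hypothetical sketch of the proxy pipeline: route → cache → budget → forward.

type CacheEntry = { body: string };

const cache = new Map<string, CacheEntry>(); // stand-in for an edge KV store
const budgets = new Map<string, { spent: number; limit: number }>();

function routeProvider(model: string): string {
  // Route: pick an upstream from the model name.
  if (model.startsWith("gpt-")) return "https://api.openai.com";
  if (model.startsWith("claude-")) return "https://api.anthropic.com";
  return "https://generativelanguage.googleapis.com";
}

function checkBudget(team: string, cost: number): boolean {
  // Budget: reject when the team's projected spend would exceed its cap.
  const b = budgets.get(team);
  return !b || b.spent + cost <= b.limit;
}

async function handle(team: string, model: string, prompt: string): Promise<string> {
  const key = `${model}:${prompt}`;
  const hit = cache.get(key); // Cache: lookup before spending anything
  if (hit) return hit.body;
  if (!checkBudget(team, 0.003)) throw new Error("budget exceeded");
  // Forward: in a real Worker this would be fetch(routeProvider(model) + "/v1/...").
  const body = `response-for:${prompt} via ${routeProvider(model)}`;
  cache.set(key, { body });
  return body;
}
```

The cache check runs before the budget check on purpose: a cache hit costs nothing, so it should never be blocked by a spend cap.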
03
STEP 03

Async log ingest

Every request, token count, latency, and cost streams into ClickHouse.

gpt-4o 1,847 $0.03
sonnet 923 $0.01
flash 412 $0.00
streaming to ClickHouse
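The ingest step above can be sketched with ClickHouse's HTTP interface and its JSONEachRow format (one JSON object per line). The table name `llm_requests` and the column names are illustrative assumptions, not the real Tokytics schema.

```typescript
// Hypothetical shape of the async log pipeline: serialize each request as
// one JSONEachRow line and POST the batch to ClickHouse over HTTP.

type RequestLog = {
  model: string;
  tokens: number;
  cost_usd: number;
  latency_ms: number;
};

function toJsonEachRow(rows: RequestLog[]): string {
  // JSONEachRow is newline-delimited JSON: one object per row.
  return rows.map((r) => JSON.stringify(r)).join("\n");
}

async function flush(rows: RequestLog[], endpoint: string): Promise<void> {
  // Fire-and-forget insert; in a Worker this would run via ctx.waitUntil()
  // so logging never adds latency to the client response.
  await fetch(`${endpoint}/?query=INSERT%20INTO%20llm_requests%20FORMAT%20JSONEachRow`, {
    method: "POST",
    body: toJsonEachRow(rows),
  });
}
```

Batching rows into a single insert keeps ClickHouse happy: it strongly prefers fewer, larger inserts over many single-row writes.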
04
STEP 04

See everything

Cost candles, trace waterfalls, pattern detection — real-time in your dashboard.

$1.2k
saved
34%
cached
842ms
p95
Quickstart

Two lines.
That's the whole setup.

Replace your provider's base URL with ours. Your API keys, models, and parameters stay exactly the same.

2
lines changed
30s
to integrate
0
dependencies
import OpenAI from "openai"

const client = new OpenAI({
  baseURL: "https://proxy.tokytics.dev/v1",
  apiKey: process.env.OPENAI_KEY
});

// That's it. All calls now go through Tokytics.
proxy.tokytics.dev ○ waiting
Security

Your keys never
touch disk.

Zero-knowledge proxy. Provider keys pass through encrypted, never stored. Every request authenticated at the edge.

API key received
Encrypting in memory
Forwarded to provider
Purged from memory

Zero-knowledge pass-through

Keys are encrypted in memory, forwarded, and immediately purged. Nothing persisted.
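The pass-through can be modeled as a key that exists only inside one request's scope. This is a conceptual sketch, not the Tokytics implementation; `forwardWithKey` and the `Upstream` type are invented for illustration, and note that a garbage-collected runtime can only drop the last reference, not physically zero the string.

```typescript
// Illustrative pass-through: the provider key lives for the lifetime of a
// single request, is used to build one Authorization header, and is never
// written to storage or logs.

type Upstream = (headers: Record<string, string>) => Promise<number>;

async function forwardWithKey(providerKey: string, upstream: Upstream): Promise<number> {
  let key: string | null = providerKey; // key exists only in this scope
  try {
    return await upstream({ Authorization: `Bearer ${key}` });
  } finally {
    key = null; // drop the reference immediately after the forward;
    // nothing after this point can read the key, and it is never persisted
  }
}
```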

time | event | status | detail

Immutable audit trail

Every auth, forward, and budget check logged. RBAC controls. 7-365 day retention.

AES-256 encryption

V8 isolates. Dedicated ClickHouse nodes. SOC 2 compliant infrastructure. Encrypted at rest.

AES-256
SOC 2
V8 Isolates
Zero-knowledge
RBAC
OpenAI
Anthropic
Google
Cohere
Mistral
Groq
Free tier available

Start saving on

LLM costs today.

No credit card. No SDK changes. Set up in 30 seconds.

Start for free
Talk to us
Free tier
No credit card
SOC 2
< 5ms overhead