We added TOON compression to our LLM gateway – compress prompts, saves tokens

raaihank · 2026-02-04T06:04:56.000Z 1770185096

Costbase is an LLM cost optimization proxy. We just shipped TOON (Token-Oriented Object Notation) compression.

It converts JSON like this:

    {"id": "cust_001", "name": "Acme", "mrr": 15000}

Into: id: cust_001 name: Acme mrr: 15000

We integrated it into our gateway to automatically compress JSON in tool results, user messages, and tool call arguments before they hit the LLM.

Benchmarks on real payloads:

- CRM query (10 records): 48% tokens saved - E-commerce orders (4 orders): 34% saved - API metrics (8 endpoints): 43% saved

Sub-100μs latency overhead. LLMs parse it correctly in our testing (GPT-4o, Claude, etc).

Not a silver bullet — works best on arrays of objects with uniform schemas. Deeply nested or irregular JSON sees less benefit.

Curious what strategies others use for token compression. We considered CSV for tabular data but it doesn't handle nested structures.