Version 0.3

Token Optimized Data Encoding

A compact, schema-driven data format for LLM prompts, RAG memory, AI agent state, and any workflow where token count matters.

Overview

TODE (Token Optimized Data Encoding) targets environments where token efficiency, structure, and clarity are critical, particularly LLM memory, RAG systems, AI agent state, and high-volume event logs. It combines a minimal syntax with three compression features (dictionaries, schemas, and references) to produce data that is:

Small

Fewer tokens for the same information, so more data fits in a given context window without dropping fields.

Fast

Less text for a language model to read and reason over, which shortens prompts and can reduce inference latency.

Structured

Relational and explicitly typed: relationships are expressed through references rather than through repeated or deeply nested data, so structure adds little overhead.

Goal: Efficiently encode 3-4x more structured data within the same token budget.

Core Principles

| Principle | Meaning |
| --- | --- |
| Single header per table | Column definitions are declared once per table and reused across all rows |
| Minimal quoting | Values are unquoted by default; quotes are used only for spaces or special characters |
| References instead of duplication | Shared entities are referenced by key (ref(table.id) columns, @ref(...) values) rather than repeated |
| Dictionary-based compression | Frequent string values are mapped to short integer codes via @dict |
| Boolean and null shortcuts | Compact representations: t, f, - |
| Positional row values | Rows are encoded positionally after the header: 1,Alice,0 |

Basic Types

| Type | Syntax | Example |
| --- | --- | --- |
| String | text or "with spaces" | Alice, "Bob Jr" |
| Number | 123, 45.6 | 100, 3.14 |
| Boolean | t, f, true, false | t, f |
| Null | -, ~ | - |
| Omitted | _ (optional columns only) | _ |
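The type rules above can be sketched as a small token classifier. This is an illustrative reading of the table, not a reference implementation: the function name `parse_scalar` and the `OMITTED` sentinel are hypothetical, and the number patterns (optional minus sign, optional decimal part) are an assumption, since the table only shows `123` and `45.6`.

```python
import re

# Hypothetical sentinel: distinguishes "_" (value omitted) from "-" (null).
OMITTED = object()

INT_RE = re.compile(r"-?\d+")
FLOAT_RE = re.compile(r"-?\d+\.\d+")


def parse_scalar(token: str):
    """Convert one raw TODE token into a Python value."""
    token = token.strip()
    if token in ("t", "true"):
        return True
    if token in ("f", "false"):
        return False
    if token in ("-", "~"):
        return None
    if token == "_":
        return OMITTED
    # Quoted strings: strip the surrounding quotes (escaping is not specified).
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]
    if INT_RE.fullmatch(token):
        return int(token)
    if FLOAT_RE.fullmatch(token):
        return float(token)
    # Anything else is an unquoted string.
    return token
```

Note that an unquoted `t` or `-` is always read as a boolean or null here, so a literal string with that content would need quotes.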

Dictionaries

Replace frequent or long string values with short numeric identifiers. Dictionary entries can be shared across multiple tables and can be generated automatically by tooling.

@dict role {admin:0, staff:1, guest:2}
@dict act {login:0, logout:1, error:9}
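As a rough sketch of how a reader might load such a declaration, the snippet below parses one `@dict` line into a code-to-label lookup. The function name `parse_dict` is hypothetical, and the sketch assumes labels are simple unquoted words and codes are integers, as in the two examples above.

```python
import re


def parse_dict(line: str):
    """Parse '@dict name {label:code, ...}' into (name, {code: label})."""
    match = re.fullmatch(r"@dict\s+(\w+)\s*\{(.*)\}", line.strip())
    if match is None:
        raise ValueError(f"not a @dict line: {line!r}")
    name, body = match.groups()
    codes = {}
    for pair in body.split(","):
        label, code = pair.split(":")
        codes[int(code)] = label.strip()
    return name, codes


name, codes = parse_dict("@dict role {admin:0, staff:1, guest:2}")
# codes maps each integer code back to its label, e.g. codes[1] is "staff"
```

Storing the mapping code-first makes decoding a single lookup per cell; an encoder would invert it.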

Table Definitions

Define table structure with column modifiers:

@table_name col1,col2:ref(table.id),col3:bool=default

| Modifier | Meaning |
| --- | --- |
| col | String column (default type) |
| col:int | Column constrained to integer values |
| col:ref(table.id) | Foreign key reference to another table |
| col:bool=t | Boolean column with default value true |
| col? | Optional column; values may be omitted using _ |
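One possible way to read a header line with these modifiers is shown below. It is a sketch under assumptions: the `Column` dataclass, `COL_RE`, and `parse_header` are hypothetical names, column names are assumed to be word characters only, and defaults are kept as raw tokens rather than converted.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Column:
    name: str
    type: str = "str"               # "str", "int", "bool", or "ref"
    ref: Optional[str] = None       # "table.column" target for ref columns
    default: Optional[str] = None   # raw default token, left unparsed
    optional: bool = False


# name [":" type] ["=" default] ["?"]
COL_RE = re.compile(
    r"(?P<name>\w+)"
    r"(?::(?P<type>int|bool|ref\((?P<ref>\w+\.\w+)\)))?"
    r"(?:=(?P<default>[^?]+))?"
    r"(?P<opt>\?)?"
)


def parse_header(line: str):
    """Parse '@table col,col:type,...' into (table_name, [Column, ...])."""
    table, _, cols = line.strip().partition(" ")
    columns = []
    for spec in cols.split(","):
        m = COL_RE.fullmatch(spec.strip())
        if m is None:
            raise ValueError(f"bad column spec: {spec!r}")
        typ = "ref" if m["ref"] else (m["type"] or "str")
        columns.append(
            Column(m["name"], typ, m["ref"], m["default"], m["opt"] is not None)
        )
    return table.lstrip("@"), columns


table, columns = parse_header("@events id,user:ref(users.id),act,time")
```

Here `columns[1]` describes `user` as a `ref` column targeting `users.id`, while the untyped columns fall back to the string type.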

Row Data

1,Alice,0,t
2,Bob,1,f
3,Charlie,0,-

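Row values are positional and align with the most recently declared header; the rows above fit a header such as @users id,name,role,active:bool. Column names are not repeated per row, which removes the per-record key overhead of formats like JSON.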
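To make the positional layout concrete, here is a minimal encoder sketch that writes a list of records as one header plus positional rows. The names `encode_value` and `encode_table` are hypothetical, and the quoting rule (quote only when a value contains a space or comma) is an assumption; the format's escaping rules are not specified here.

```python
import json


def encode_value(value) -> str:
    """Render one Python value using the compact TODE shortcuts."""
    if value is None:
        return "-"
    if value is True:
        return "t"
    if value is False:
        return "f"
    text = str(value)
    # Assumption: quote only values containing a space or a comma.
    return f'"{text}"' if (" " in text or "," in text) else text


def encode_table(name: str, columns: list, records: list) -> str:
    """Emit '@name col,...' followed by one positional row per record."""
    lines = ["@" + name + " " + ",".join(columns)]
    for record in records:
        lines.append(",".join(encode_value(record[c]) for c in columns))
    return "\n".join(lines)


users = [
    {"id": 1, "name": "Alice", "role": 0, "active": True},
    {"id": 2, "name": "Bob", "role": 1, "active": False},
    {"id": 3, "name": "Charlie", "role": 0, "active": None},
]
tode_text = encode_table("users", ["id", "name", "role", "active"], users)
json_text = json.dumps(users)
print(len(tode_text), len(json_text))  # character counts, a rough proxy for tokens
```

Character count is only a proxy; actual token savings depend on the tokenizer in use.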

References

Link data across tables efficiently using foreign key references:

@users id,name
1,Alice
2,Bob

@events id,user:ref(users.id),act
100,1,0
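
  • Column types declare a reference with ref(table.id); row values in that column are the bare key (the 1 in 100,1,0 points to the users row with id 1)
  • Outside a typed column, a value references a row with @ref(table.key), for example @ref(users.1)
  • Encoded references are compact (~2 tokens)
  • References can be expanded via joins for downstream consumption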
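Expanding references via a join might look like the following sketch, which works on already-decoded rows represented as Python dicts. The helper name `expand_refs` and the dict-per-row representation are assumptions made for illustration.

```python
def expand_refs(rows, column, target_rows, target_key="id"):
    """Replace each foreign key in `column` with the row it points to."""
    index = {row[target_key]: row for row in target_rows}
    # Build new dicts so the compact input rows are left untouched.
    return [{**row, column: index[row[column]]} for row in rows]


users = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
events = [{"id": 100, "user": 1, "act": 0}]

expanded = expand_refs(events, "user", users)
# expanded[0]["user"] is now the full Alice record instead of the key 1
```

Indexing the target table once keeps the join linear in the number of rows; a missing key raises `KeyError`, which a real consumer may prefer to handle explicitly.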

Lightly Nested Data

Support for simple key-value structures, light object nesting, and arrays while remaining compact and human-readable:

@config
debug: t
limits: {cpu:4, mem:"2G"}
features: [ai,auth]
owner: @ref(users.1)
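A key-value block like this could be read with a small recursive parser. The sketch below is one plausible interpretation, assuming every entry is written as `key: value`; the names `split_top`, `parse_value`, and `parse_block`, and the `{"$ref": (table, key)}` representation of references, are all hypothetical.

```python
def split_top(text: str, sep: str = ","):
    """Split on sep, ignoring separators inside {}, [], () or quotes."""
    parts, buf, depth, quoted = [], [], 0, False
    for ch in text:
        if ch == '"':
            quoted = not quoted
        elif not quoted and ch in "{[(":
            depth += 1
        elif not quoted and ch in "}])":
            depth -= 1
        if ch == sep and depth == 0 and not quoted:
            parts.append("".join(buf).strip())
            buf = []
        else:
            buf.append(ch)
    tail = "".join(buf).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_value(text: str):
    """Parse an object, array, reference, or scalar."""
    text = text.strip()
    if text.startswith("{") and text.endswith("}"):
        obj = {}
        for pair in split_top(text[1:-1]):
            key, _, val = pair.partition(":")
            obj[key.strip()] = parse_value(val)
        return obj
    if text.startswith("[") and text.endswith("]"):
        return [parse_value(item) for item in split_top(text[1:-1])]
    if text.startswith("@ref(") and text.endswith(")"):
        table, _, key = text[5:-1].partition(".")
        return {"$ref": (table, key)}  # hypothetical in-memory form
    if text in ("t", "true"):
        return True
    if text in ("f", "false"):
        return False
    if text in ("-", "~"):
        return None
    if len(text) >= 2 and text[0] == text[-1] == '"':
        return text[1:-1]
    try:
        return int(text)
    except ValueError:
        return text


def parse_block(text: str):
    """Parse '@name' followed by 'key: value' lines into (name, dict)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    name = lines[0].strip().lstrip("@")
    body = {}
    for ln in lines[1:]:
        key, _, val = ln.partition(":")
        body[key.strip()] = parse_value(val)
    return name, body


name, config = parse_block(
    '@config\ndebug: t\nlimits: {cpu:4, mem:"2G"}\n'
    "features: [ai,auth]\nowner: @ref(users.1)"
)
```

The reference is kept as an unresolved marker so that a later step can decide whether to join it against the `users` table.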

Complete Example

A comprehensive example demonstrating users, events, and sessions with dictionaries, references, and positional encoding:

# Dictionaries
@dict role {admin:0, staff:1, guest:2}
@dict act {login:0, logout:1, click:2, error:9}
@dict status {ok:0, fail:1, pending:2}

# Users
@users id,name,role,active:bool=t
1,Alice,0
2,Bob,1,f
3,Charlie,0
4,Dana,2

# Events
@events id,user:ref(users.id),act,time
100,1,0,0s
101,1,1,3600s
102,2,2,7200s
103,3,9,8000s

# Sessions
@sessions id,user:ref(users.id),event:ref(events.id),status,duration
10,1,100,0,3600
11,2,102,2,0
12,1,101,1,300
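The example relies on two conventions that a decoder has to apply: trailing values may be left off when the column has a default, and coded columns are looked up in a dictionary. The sketch below decodes a fragment of the example under two assumptions that the text does not state outright: a column is decoded with the `@dict` of the same name (role, act, status), and missing trailing values take the column default. The function name `decode` is hypothetical, values are left as raw strings, and quoted values containing commas are not handled.

```python
DOC = """\
# Dictionaries
@dict role {admin:0, staff:1, guest:2}

# Users
@users id,name,role,active:bool=t
1,Alice,0
2,Bob,1,f
3,Charlie,0
4,Dana,2
"""


def decode(text: str) -> dict:
    """Decode @dict and table statements into {table: [row dicts]}."""
    dicts, tables, current = {}, {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("@dict "):
            _, name, body = line.split(None, 2)
            pairs = (p.split(":") for p in body.strip("{}").split(","))
            dicts[name] = {code.strip(): label.strip() for label, code in pairs}
        elif line.startswith("@"):
            name, _, header = line[1:].partition(" ")
            cols = []
            for spec in header.split(","):
                spec, _, default = spec.partition("=")
                col_name = spec.split(":")[0].rstrip("?")
                cols.append((col_name, default or None))
            current = (cols, tables.setdefault(name, []))
        else:
            cols, rows = current
            values = line.split(",")
            record = {}
            for i, (col, default) in enumerate(cols):
                # Assumption: short rows fall back to the column default.
                raw = values[i] if i < len(values) else default
                # Assumption: a dictionary applies to the same-named column.
                record[col] = dicts.get(col, {}).get(raw, raw)
            rows.append(record)
    return tables


tables = decode(DOC)
```

With these assumptions, Alice's row decodes to role `admin` with `active` taken from the default `t`, while Bob's explicit `f` overrides it.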

Grammar (EBNF)

Formal grammar specification for TODE:

tode ::= stmt*
stmt ::= dict | table | join | output | comment | kv

dict ::= "@dict" id "{" mapping "}"
mapping ::= (value ":" number) ("," value ":" number)*

table ::= "@" id header nl row*
header ::= col ("," col)*
col ::= id [":" type] ["=" value] ["?"]
type ::= "ref(" id "." id ")" | "bool" | "int"

row ::= value ("," value)* nl
value ::= id | number | bool | null | omit | ref | string | array | object

ref ::= "@ref(" id "." id ")"
bool ::= "t" | "f" | "true" | "false"
null ::= "-" | "~"
omit ::= "_"

comment ::= "#" .* nl
nl ::= "\n"