Version 0.3

Token Optimized Data Encoding

A compact, schema‑driven data format for LLMs, RAG memory, AI agents, and any workflow where token count matters.

Overview

TODE (Token Optimized Data Encoding) targets environments where token efficiency, structure, and clarity are critical, particularly LLM memory, RAG systems, AI agent state, and high‑volume event logs. It combines a minimal syntax with three compression features (dictionaries, schemas, and references) to produce data that is:

Small

Fewer tokens for the same information, so more data fits in a given context window without dropping any fields.

Fast

Fewer input tokens mean less text for a language model to read and reason over, which can shorten inference time.

Structured

Relational and schema-driven: tables, typed columns, and references express relationships between records without repeating data.

Goal: Efficiently encode 3-4x more structured data within the same token budget.

Core Principles

Principle	Meaning
Single header per table	Column definitions are declared once per table and reused across all rows
Minimal quoting	Values are unquoted by default; quotes used only for spaces or special characters
References instead of duplication	Shared entities referenced using `@ref(table.id)` rather than repeated
Dictionary-based compression	Map frequent string values to short integer codes via `@dict`
Boolean and null shortcuts	Compact representations: `t`, `f`, `-`
Positional row values	Rows encoded positionally after header: `1,Alice,0`

Basic Types

Type	Syntax	Example
String	`text` or `"with spaces"`	`Alice`, `"Bob Jr"`
Number	`123`, `45.6`	`100`, `3.14`
Boolean	`t`, `f`, `true`, `false`	`t`, `f`
Null	`-`, `~`	`-`
Omitted	`_` (optional columns only)	`_`
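The type table above can be read as a small decoding routine. The following is a minimal Python sketch, not a reference implementation: `parse_scalar` and the `OMITTED` sentinel are hypothetical names, and the sketch is schema-unaware, so a bare `t` always decodes as a boolean even where a schema would treat it as a string.

```python
import re

# Sentinel for an omitted optional value, kept distinct from null (None).
OMITTED = object()

_INT = re.compile(r"-?\d+")
_FLOAT = re.compile(r"-?\d+\.\d+")


def parse_scalar(token):
    """Decode one raw token using the Basic Types table (hypothetical helper)."""
    token = token.strip()
    if token in ("-", "~"):
        return None
    if token == "_":
        return OMITTED
    if token in ("t", "true"):
        return True
    if token in ("f", "false"):
        return False
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]          # quoted string: strip the quotes
    if _INT.fullmatch(token):
        return int(token)
    if _FLOAT.fullmatch(token):
        return float(token)
    return token                    # anything else is a bare string


values = [parse_scalar(t) for t in ["Alice", '"Bob Jr"', "100", "3.14", "t", "-"]]
```

A schema-aware decoder would consult the column type first and only fall back to this kind of guessing for untyped columns.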

Dictionaries

Replace frequent or long string values with short numeric identifiers. Dictionary entries can be shared across multiple tables and can be generated automatically by tooling.

@dict role {admin:0, staff:1, guest:2}
@dict act {login:0, logout:1, error:9}
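As an illustration of how tooling might read a `@dict` line, here is a short Python sketch. The function name `parse_dict` is hypothetical, and the sketch assumes labels are bare words without commas, colons, or quotes, as in the two lines above.

```python
import re

DICT_LINE = re.compile(r"@dict\s+(\w+)\s*\{(.*)\}")


def parse_dict(line):
    """Return (name, label->code, code->label) for one @dict line."""
    m = DICT_LINE.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"not a @dict line: {line!r}")
    name, body = m.group(1), m.group(2)
    encode = {}
    for entry in body.split(","):
        if not entry.strip():
            continue                      # tolerate an empty body or trailing comma
        label, code = entry.split(":")
        encode[label.strip()] = int(code)
    decode = {code: label for label, code in encode.items()}
    return name, encode, decode


name, encode, decode = parse_dict("@dict act {login:0, logout:1, error:9}")
```

Keeping both directions lets an encoder write `9` in place of `error` and a decoder map it back.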

Table Definitions

Define table structure with column modifiers:

@table_name col1,col2:ref(table.id),col3:bool=default

Modifier	Meaning
`col`	String column (default type)
`col:int`	Column constrained to integer values
`col:ref(table.id)`	Foreign key reference to another table
`col:bool=t`	Boolean column with default value `true`
`col?`	Optional column; values may be omitted using `_`
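The modifiers in the table can be recognized with a single regular expression per column. The Python sketch below is one possible reading of the header syntax; `parse_header` and the dictionary keys it returns are hypothetical names chosen for illustration.

```python
import re

# name, then optional ":type", optional "=default", optional trailing "?"
COL = re.compile(
    r"(?P<name>\w+)"
    r"(?::(?P<type>int|bool|ref\((?P<target>\w+\.\w+)\)))?"
    r"(?:=(?P<default>[^?]+))?"
    r"(?P<optional>\?)?"
)


def parse_header(line):
    """Split '@table col,col:type,...' into a table name and column specs."""
    table, _, cols = line.strip().partition(" ")
    if not table.startswith("@") or not cols:
        raise ValueError(f"not a table header: {line!r}")
    columns = []
    for spec in cols.split(","):
        m = COL.fullmatch(spec.strip())
        if m is None:
            raise ValueError(f"bad column spec: {spec!r}")
        kind = m.group("type") or "str"   # untyped columns default to string
        columns.append({
            "name": m.group("name"),
            "type": "ref" if kind.startswith("ref(") else kind,
            "target": m.group("target"),  # e.g. "users.id" for ref columns
            "default": m.group("default"),
            "optional": m.group("optional") is not None,
        })
    return table[1:], columns


table, columns = parse_header("@events id,user:ref(users.id),act:int,ok:bool=t,note?")
```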

Row Data

1,Alice,0,t
2,Bob,1,f
3,Charlie,0,-

Row values are positional, aligned with the previously declared header. Column names are not repeated per row, significantly reducing token usage.
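To make the positional alignment concrete, the Python sketch below zips each row against a list of column names. The column names `id,name,role,active` are assumed for illustration (the rows above are shown without their header), `decode_rows` is a hypothetical helper, and filling a missing trailing column from its default is inferred from the complete example rather than stated as a rule. Values are left as raw strings.

```python
import csv


def decode_rows(columns, lines, defaults=None):
    """Pair positional row values with column names; values stay as raw strings."""
    defaults = defaults or {}
    records = []
    for fields in csv.reader(lines):      # csv handles "quoted, values"
        if not fields:
            continue                      # skip blank lines
        if len(fields) > len(columns):
            raise ValueError(f"too many values: {fields!r}")
        record = dict(zip(columns, fields))
        # Trailing columns that were left out fall back to their defaults.
        for name in columns[len(fields):]:
            if name not in defaults:
                raise ValueError(f"missing value for column {name!r}")
            record[name] = defaults[name]
        records.append(record)
    return records


records = decode_rows(
    ["id", "name", "role", "active"],
    ["1,Alice,0,t", "2,Bob,1,f", "3,Charlie,0"],
    defaults={"active": "t"},
)
```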

References

Link data across tables efficiently using foreign key references:

@users id,name
1,Alice
2,Bob

@events id,user:ref(users.id),act
100,1,0

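A column is declared as a reference with ref(table.id) in the header; its row values are then bare ids (the 1 in 100,1,0 above points to user 1)
Where no column type applies, a reference is written inline as @ref(table.id), for example @ref(users.1)
Encoded references are compact (~2 tokens)
References can be expanded via joins for downstream consumption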
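Expanding references through a join could look like the Python sketch below. It assumes tables have already been decoded into dictionaries keyed by id; `expand_refs` and its argument layout are hypothetical, not part of TODE.

```python
def expand_refs(rows, ref_columns, tables):
    """Replace reference ids with the rows they point to.

    rows        -- list of decoded row dicts
    ref_columns -- {column name: target table name}
    tables      -- {table name: {id: row dict}}
    """
    expanded = []
    for row in rows:
        out = dict(row)                   # copy so the input rows are not mutated
        for col, target in ref_columns.items():
            key = row[col]
            if key is not None:           # null references stay null
                out[col] = tables[target][key]
        expanded.append(out)
    return expanded


users = {1: {"id": 1, "name": "Alice"}, 2: {"id": 2, "name": "Bob"}}
events = [{"id": 100, "user": 1, "act": 0}]
joined = expand_refs(events, {"user": "users"}, {"users": users})
```

The compact form stays on disk or in the prompt; expansion happens only when a consumer needs the full record.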

Lightly Nested Data

Support for simple key-value structures, light object nesting, and arrays while remaining compact and human-readable:

@config
debug: t
limits {cpu:4, mem:"2G"}
features: [ai,auth]
owner: @ref(users.1)
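The block above might decode as in the following Python sketch. This is a guess at the intended semantics, not a specification: it handles one level of nesting only, accepts a key with or without a trailing colon (both forms appear above), and represents a reference as a `("ref", table, id)` tuple. All function names are hypothetical.

```python
import re

KV = re.compile(r"(\w+)\s*:?\s*(.+)")
REF = re.compile(r"@ref\((\w+)\.(\w+)\)")


def parse_atom(text):
    """Decode a single non-nested value."""
    text = text.strip()
    if text in ("t", "true"):
        return True
    if text in ("f", "false"):
        return False
    if text in ("-", "~"):
        return None
    if re.fullmatch(r"-?\d+", text):
        return int(text)
    m = REF.fullmatch(text)
    if m:
        return ("ref", m.group(1), m.group(2))
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]
    return text


def parse_value(text):
    """Decode an atom, a flat [array], or a flat {object}."""
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        inner = text[1:-1].strip()
        return [parse_atom(p) for p in inner.split(",")] if inner else []
    if text.startswith("{") and text.endswith("}"):
        inner = text[1:-1].strip()
        result = {}
        for pair in (inner.split(",") if inner else []):
            key, _, val = pair.partition(":")
            result[key.strip()] = parse_atom(val)
        return result
    return parse_atom(text)


def parse_block(lines):
    """Parse '@name' followed by key/value lines into (name, dict)."""
    name = lines[0].strip().lstrip("@")
    body = {}
    for line in lines[1:]:
        if not line.strip():
            continue
        m = KV.fullmatch(line.strip())
        if m is None:
            raise ValueError(f"bad key/value line: {line!r}")
        body[m.group(1)] = parse_value(m.group(2))
    return name, body


name, config = parse_block([
    "@config",
    "debug: t",
    'limits {cpu:4, mem:"2G"}',
    "features: [ai,auth]",
    "owner: @ref(users.1)",
])
```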

Complete Example

A complete example with users, events, and sessions, combining dictionaries, references, and positional encoding. User rows that leave out the trailing active value take the column default, t:

# Dictionaries
@dict role {admin:0, staff:1, guest:2}
@dict act {login:0, logout:1, click:2, error:9}
@dict status {ok:0, fail:1, pending:2}

# Users
@users id,name,role,active:bool=t
1,Alice,0
2,Bob,1,f
3,Charlie,0
4,Dana,2

# Events
@events id,user:ref(users.id),act,time
100,1,0,0s
101,1,1,3600s
102,2,2,7200s
103,3,9,8000s

# Sessions
@sessions id,user:ref(users.id),event:ref(events.id),status,duration
10,1,100,0,3600
11,2,102,2,0
12,1,101,1,300
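For a sense of how the pieces fit together, the Python sketch below loads the example into memory and resolves one event back to readable values. It is a simplified loader under stated assumptions: values stay as strings, quoting is not handled, and a column is matched to a dictionary by sharing its name (`act` column, `act` dictionary), which the example suggests but the format does not state. The `load` function is hypothetical.

```python
import re

EXAMPLE = """\
# Dictionaries
@dict role {admin:0, staff:1, guest:2}
@dict act {login:0, logout:1, click:2, error:9}
@dict status {ok:0, fail:1, pending:2}

# Users
@users id,name,role,active:bool=t
1,Alice,0
2,Bob,1,f
3,Charlie,0
4,Dana,2

# Events
@events id,user:ref(users.id),act,time
100,1,0,0s
101,1,1,3600s
102,2,2,7200s
103,3,9,8000s

# Sessions
@sessions id,user:ref(users.id),event:ref(events.id),status,duration
10,1,100,0,3600
11,2,102,2,0
12,1,101,1,300
"""


def load(text):
    """Return (dicts, tables); each table maps row id -> row dict of strings."""
    dicts, tables, cols, rows = {}, {}, [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                                  # blank line or comment
        if line.startswith("@dict "):
            m = re.fullmatch(r"@dict\s+(\w+)\s*\{(.*)\}", line)
            pairs = (p.split(":") for p in m.group(2).split(","))
            dicts[m.group(1)] = {int(code): label.strip() for label, code in pairs}
        elif line.startswith("@"):
            name, _, header = line[1:].partition(" ")
            cols = []
            for spec in header.split(","):
                spec, _, default = spec.partition("=")
                cols.append((spec.split(":")[0].rstrip("?"), default or None))
            rows = tables[name] = {}
        else:
            fields = line.split(",")
            # Missing trailing values fall back to the column default.
            row = {c: (fields[i] if i < len(fields) else d)
                   for i, (c, d) in enumerate(cols)}
            rows[row["id"]] = row
    return dicts, tables


dicts, tables = load(EXAMPLE)
event = tables["events"]["103"]
who = tables["users"][event["user"]]["name"]
what = dicts["act"][int(event["act"])]
```

Here `who` and `what` recover the user name and action label that event 103 stores as the two short codes `3` and `9`.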

Grammar (EBNF)

Formal grammar specification for TODE:

tode ::= stmt*
stmt ::= dict | table | join | output | comment | kv

dict ::= "@dict" id "{" mapping "}"
mapping ::= value ":" number ("," value ":" number)*

table ::= "@" id header nl row*
header ::= col ("," col)*
col ::= id [":" type] ["=" value] ["?"]
type ::= "ref(" id "." id ")" | "bool" | "int"

row ::= value ("," value)*
value ::= id | number | bool | null | omit | ref | string | array | object

ref ::= "@ref(" id "." id ")"
bool ::= "t" | "f" | "true" | "false"
null ::= "-" | "~"
omit ::= "_"

comment ::= "#" .* nl
nl ::= "\n"