Skip to content
/ JToon Public
forked from toon-format/toon-java

β˜• Java πŸŽ’ Token-Oriented Object Notation – JSON for LLMs at half the token cost

License

Notifications You must be signed in to change notification settings

totto/JToon

Β 
Β 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

49 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

TOON logo with step‑by‑step guide

JToon - Token-Oriented Object Notation (TOON)

Build Release Maven Central

Token-Oriented Object Notation is a compact, human-readable format designed for passing structured data to Large Language Models with significantly reduced token usage.

TOON excels at uniform complex objects – multiple fields per row, same structure across items. It borrows YAML's indentation-based structure for nested objects and CSV's tabular format for uniform data rows, then optimizes both for token efficiency in LLM contexts.

Why TOON?

AI is becoming cheaper and more accessible, but larger context windows allow for larger data inputs as well. LLM tokens still cost money – and standard JSON is verbose and token-expensive:

{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

TOON conveys the same information with fewer tokens:

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

Test the differences on THIS online playground

Another reason

xkcd: Standards

Benchmarks

Learn more: For complete format specification, rules, and additional benchmarks, see TOON-SPECIFICATION.md.

Token Efficiency Example

TOON typically achieves 30–60% fewer tokens than JSON. Here's a quick summary:

Total across 4 datasets        β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘  13,418 tokens
                               vs JSON: 26,379  πŸ’° 49.1% saved
                               vs XML:  30,494  πŸ’° 56.0% saved

See TOON-SPECIFICATION.md for detailed benchmark results and LLM retrieval accuracy tests.

Installation

Maven Central

JToon is available on Maven Central. Add it to your project using your preferred build tool:

Gradle (Groovy DSL):

dependencies {
    implementation 'com.felipestanzani:jtoon:0.1.2'
}

Gradle (Kotlin DSL):

dependencies {
    implementation("com.felipestanzani:jtoon:0.1.2")
}

Maven:

<dependency>
    <groupId>com.felipestanzani</groupId>
    <artifactId>jtoon</artifactId>
    <version>0.1.2</version>
</dependency>

Note: See the latest version on Maven Central (also shown in the badge above).

Alternative: Manual Installation

You can also download the JAR directly from the GitHub Releases page and add it to your project's classpath.

Quick Start

import com.felipestanzani.jtoon.JToon;
import java.util.*;

record User(int id, String name, List<String> tags, boolean active, List<?> preferences) {}
record Data(User user) {}

User user = new User(123, "Ada", List.of("reading", "gaming"), true, List.of());
Data data = new Data(user);

System.out.println(JToon.encode(data));

Output:

user:
  id: 123
  name: Ada
  tags[2]: reading,gaming
  active: true
  preferences[0]:

TOON Format Basics

Complete specification: For detailed formatting rules, quoting rules, and comprehensive examples, see TOON-SPECIFICATION.md.

TOON uses indentation-based structure (like YAML) combined with efficient tabular format for uniform arrays (like CSV):

Simple objects:

id: 123
name: Ada

Nested objects:

user:
  id: 123
  name: Ada

Primitive arrays:

tags[3]: admin,ops,dev

Tabular arrays (uniform objects with same fields):

items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5

Type Conversions

Some Java-specific types are automatically normalized for LLM-safe output:

Input Type Output
Number (finite) Decimal form; -0 β†’ 0; whole numbers as integers
Number (NaN, Β±Infinity) null
BigInteger Integer if within Long range, otherwise string (no quotes)
BigDecimal Decimal number
LocalDateTime ISO date-time string in quotes
LocalDate ISO date string in quotes
LocalTime ISO time string in quotes
ZonedDateTime ISO zoned date-time string in quotes
OffsetDateTime ISO offset date-time string in quotes
Instant ISO instant string in quotes
java.util.Date ISO instant string in quotes
Optional<T> Unwrapped value or null if empty
Stream<T> Materialized to array
Map Object with string keys
Collection, arrays Arrays

Number normalization examples:

-0    β†’ 0
1e6   β†’ 1000000
1e-6  β†’ 0.000001

API

JToon.encode(Object value): String

JToon.encode(Object value, EncodeOptions options): String

JToon.encodeJson(String json): String

JToon.encodeJson(String json, EncodeOptions options): String

Converts any Java object or JSON-string to TOON format.

Parameters:

  • value – Any Java object (Map, List, primitive, or nested structure). Non-serializable values are converted to null. Java temporal types are converted to ISO strings, Optional is unwrapped, and Stream is materialized.
  • options – Optional encoding options (EncodeOptions record):
    • indent – Number of spaces per indentation level (default: 2)
    • delimiter – Delimiter enum for array values and tabular rows: Delimiter.COMMA (default), Delimiter.TAB, or Delimiter.PIPE
    • lengthMarker – Boolean to prefix array lengths with # (default: false)

For encodeJson overloads:

  • json – A valid JSON string to be parsed and encoded. Invalid or blank JSON throws IllegalArgumentException.

Returns:

A TOON-formatted string with no trailing newline or spaces.

Example:

import com.felipestanzani.jtoon.JToon;
import java.util.*;

record Item(String sku, int qty, double price) {}
record Data(List<Item> items) {}

Item item1 = new Item("A1", 2, 9.99);
Item item2 = new Item("B2", 1, 14.5);
Data data = new Data(List.of(item1, item2));

System.out.println(JToon.encode(data));

Output:

items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5

Encode a plain JSON string

String json = """
{
  "user": {
    "id": 123,
    "name": "Ada",
    "tags": ["reading", "gaming"]
  }
}
""";
System.out.println(JToon.encodeJson(json));

Output:

user:
  id: 123
  name: Ada
  tags[2]: reading,gaming

Delimiter Options

The delimiter option allows you to choose between comma (default), tab, or pipe delimiters for array values and tabular rows. Alternative delimiters can provide additional token savings in specific contexts.

Tab Delimiter (\t)

Using tab delimiters instead of commas can reduce token count further, especially for tabular data:

import com.felipestanzani.jtoon.*;
import java.util.*;

record Item(String sku, String name, int qty, double price) {}
record Data(List<Item> items) {}

Item item1 = new Item("A1", "Widget", 2, 9.99);
Item item2 = new Item("B2", "Gadget", 1, 14.5);
Data data = new Data(List.of(item1, item2));

EncodeOptions options = new EncodeOptions(2, Delimiter.TAB, false);
System.out.println(JToon.encode(data, options));

Output:

items[2 ]{sku name qty price}:
  A1 Widget 2 9.99
  B2 Gadget 1 14.5

Benefits:

  • Tabs are single characters and often tokenize more efficiently than commas.
  • Tabs rarely appear in natural text, reducing the need for quote-escaping.
  • The delimiter is explicitly encoded in the array header, making it self-descriptive.

Considerations:

  • Some terminals and editors may collapse or expand tabs visually.
  • String values containing tabs will still require quoting.
Pipe Delimiter (|)

Pipe delimiters offer a middle ground between commas and tabs:

// Using the same Item and Data records from above
EncodeOptions options = new EncodeOptions(2, Delimiter.PIPE, false);
System.out.println(JToon.encode(data, options));

Output:

items[2|]{sku|name|qty|price}:
  A1|Widget|2|9.99
  B2|Gadget|1|14.5

Length Marker Option

The lengthMarker option adds an optional hash (#) prefix to array lengths to emphasize that the bracketed value represents a count, not an index:

import com.felipestanzani.jtoon.*;
import java.util.*;

record Item(String sku, int qty, double price) {}
record Data(List<String> tags, List<Item> items) {}

Item item1 = new Item("A1", 2, 9.99);
Item item2 = new Item("B2", 1, 14.5);
Data data = new Data(List.of("reading", "gaming", "coding"), List.of(item1, item2));

System.out.println(JToon.encode(data, new EncodeOptions(2, Delimiter.COMMA, true)));
// tags[#3]: reading,gaming,coding
// items[#2]{sku,qty,price}:
//   A1,2,9.99
//   B2,1,14.5

// Works with custom delimiters
System.out.println(JToon.encode(data, new EncodeOptions(2, Delimiter.PIPE, true)));
// tags[#3|]: reading|gaming|coding
// items[#2|]{sku|qty|price}:
//   A1|2|9.99
//   B2|1|14.5

See Also

Implementations in Other Languages

License

MIT License Β© 2025-PRESENT Felipe Stanzani

About

β˜• Java πŸŽ’ Token-Oriented Object Notation – JSON for LLMs at half the token cost

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Java 100.0%