1. Encoding mode flag

A document can be encoded in five different forms. The form is indicated by the first byte.

Decoder and Encoder pseudo-code are defined here.
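
As an illustration of the first-byte dispatch, a decoder could start with something like the following Python sketch. The names `FORMS` and `detect_form` are hypothetical and not part of this specification; the byte values are the signatures listed for each form in this section.

```python
# Hypothetical lookup table: signature byte -> form name.
FORMS = {
    0x00: "stream",
    0x01: "indexed",
    0x02: "encapsulated",
    0x03: "reference",
    0xFF: "deleted",
}


def detect_form(data: bytes) -> str:
    """Return the form name announced by the first byte of a document."""
    if not data:
        raise ValueError("empty input: no signature byte")
    try:
        return FORMS[data[0]]
    except KeyError:
        raise ValueError(f"unknown form signature 0x{data[0]:02X}") from None
```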

1. Form - Stream

The streamed form allows fast writing with no backward cursor movement. The downside is that the document size is unknown until reading is finished. This form is best suited to tiny and very large documents.

Bytes 0   1               X               Y                               Z
      +---+---+ - - - +---+---+ - - - +---+---+ - - - +---+---+ - - - +---+
      | 0 |  FIELD 1 DEF  | FIELD 1 VALUE |  FIELD 2 DEF  | FIELD 2 VALUE |
      +---+---+ - - - +---+---+ - - - +---+---+ - - - +---+---+ - - - +---+

Bytes     Z
          +---+ - - - +---+---+ - - - +---+---+ - - - +---+---+ - - - +---+
          |     . . .     |     . . .     |  FIELD N DEF  | FIELD N VALUE |
          +---+ - - - +---+---+ - - - +---+---+ - - - +---+---+ - - - +---+
Offset Description
0-1 signature, 0x00 for streamed
1-X field definition if the type does not define all elements
X-Y field value if values are present

2. Form - Indexed

The indexed form allows quick skipping and direct access to properties. The downside is a slightly bigger file and backward cursor positioning when writing.

Bytes 0   1   2               X               Y               Z
      +---+---+ - - - + - - - +---+ - - - +---+---+ - - - +---+
      | 1 | S | SIZE1 | SIZEN |  FIELD N DEF  | FIELD N VALUE |
      +---+---+ - - - + - - - +---+ - - - +---+---+ - - - +---+
Offset Description
0-1 signature, 0x01 for indexed
1-2 number of bytes used to store a field size
2-X size in bytes of each document field; the number of values is given by
the document type. The total document size can be calculated using the
formula: sum ( size1 ... sizeN ) + 2
X-Y field definition if the type does not define all elements
Y-Z field value if values are present

The field definitions and values use the same structure as in the streamed form.
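
The size table is what makes skipping cheap: once it is read, the position of every field is known. The Python sketch below shows one possible reading. It assumes sizes are unsigned big-endian integers (the byte order is not stated here) and that the field count is supplied by the document type; `read_size_table` and `field_count` are hypothetical names.

```python
def read_size_table(data: bytes, field_count: int) -> list[tuple[int, int]]:
    """Return the (offset, size) of each field of an indexed document.

    Assumptions, not stated in the text: sizes are unsigned big-endian
    integers, and field_count comes from the document type.
    """
    if len(data) < 2 or data[0] != 0x01:
        raise ValueError("not an indexed document")
    width = data[1]                      # S: bytes used to store one size
    table_end = 2 + width * field_count  # X: first byte after the size table
    if len(data) < table_end:
        raise ValueError("truncated size table")
    spans = []
    offset = table_end
    for i in range(field_count):
        start = 2 + i * width
        size = int.from_bytes(data[start:start + width], "big")
        spans.append((offset, size))
        offset += size                   # next field starts right after this one
    return spans
```

With the spans in hand, a reader can jump straight to field `i` with `data[offset:offset + size]` without decoding the fields before it.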

3. Form - Encapsulated

The encapsulated form is intended for compression and encryption. The encapsulated document is itself a document in any form, which allows several layers of encapsulation to be combined.

3.1 Fixed size

Bytes 0   1               A               B               C
      +---+---+ - - - +---+---+ - - - +---+---+ - - - +---+
      | 2 |    METHOD     |   ENC. SIZE   |  COMP. DOC.   |
      +---+---+ - - - +---+---+ - - - +---+---+ - - - +---+
Offset Description
0-1 signature, 0x02 for encapsulated
1-A String in UTF-8 to identify the method.
for example [0x03,'Z','I','P'] or [0x03,'A','E','S']
A-B variable size integer to indicate the encapsulated document size
B-C the encapsulated document on N bytes, N being the size defined at [A-B]

If the encapsulated size is not zero, the complete encapsulated document is in the [B-C] bytes.
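
A possible reader for the fixed-size layout is sketched below in Python. Two points are assumptions, since this section does not define them: the variable size integer is taken to be LEB128-style (7 data bits per byte, low bits first, high bit set on all but the last byte), and a string is taken to be a length prefix followed by UTF-8 bytes, which matches the `[0x03,'Z','I','P']` example. `read_varuint` and `read_encapsulated` are hypothetical names.

```python
def read_varuint(data: bytes, pos: int) -> tuple[int, int]:
    """Assumed LEB128-style integer. Returns (value, next position).

    Raises IndexError if the input ends in the middle of the integer.
    """
    value = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:          # high bit clear: last byte of the integer
            return value, pos
        shift += 7


def read_encapsulated(data: bytes) -> tuple[str, bytes]:
    """Return (method, payload) of a fixed-size encapsulated document."""
    if not data or data[0] != 0x02:
        raise ValueError("not an encapsulated document")
    length, pos = read_varuint(data, 1)
    method = data[pos:pos + length].decode("utf-8")
    pos += length
    size, pos = read_varuint(data, pos)
    if size == 0:
        raise ValueError("size is zero: document is stored by block")
    payload = data[pos:pos + size]
    if len(payload) != size:
        raise ValueError("truncated payload")
    return method, payload
```

The payload is returned untouched; decompressing or decrypting it according to `method` yields another document, which can itself be encapsulated.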

3.2 By block

If the encapsulated size is zero, the document is split into blocks as follows.

Bytes 0   1               A   B               C
      +---+---+ - - - +---+---+---+ - - - +---+
      | 2 |    METHOD     | 0 |  BLOCK SIZE   |
      +---+---+ - - - +---+---+---+ - - - +---+

Bytes C               D   E
      +---+ - - - +---+---+---+ - - - +---+---+
      |    BLOCK 1    | F |    BLOCK N    | F |
      +---+ - - - +---+---+---+ - - - +---+---+
Offset Description
0-1 signature, 0x02 for encapsulated
1-A String in UTF-8 to identify the method.
for example [0x03,'Z','I','P'] or [0x03,'A','E','S']
A-B zero
B-C variable size integer to define size of the blocks
C-D block
D-E block flag; a value of zero marks the last block
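
The block loop can be sketched as follows in Python. It relies on assumptions this section does not settle: every block, including the last, occupies exactly BLOCK SIZE bytes; the flag is a single byte; and variable size integers are LEB128-style with length-prefixed UTF-8 strings. `read_varuint` and `read_blocks` are hypothetical names.

```python
def read_varuint(data: bytes, pos: int) -> tuple[int, int]:
    """Assumed LEB128-style integer. Returns (value, next position)."""
    value = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def read_blocks(data: bytes) -> tuple[str, bytes]:
    """Return (method, joined blocks) of a by-block encapsulated document."""
    if not data or data[0] != 0x02:
        raise ValueError("not an encapsulated document")
    length, pos = read_varuint(data, 1)
    method = data[pos:pos + length].decode("utf-8")
    pos += length
    size, pos = read_varuint(data, pos)
    if size != 0:
        raise ValueError("size is not zero: document uses the fixed-size layout")
    block_size, pos = read_varuint(data, pos)
    if block_size == 0:
        raise ValueError("block size must be positive")
    chunks = []
    while True:
        flag_pos = pos + block_size
        if flag_pos >= len(data):        # block or its flag byte is missing
            raise ValueError("truncated block")
        chunks.append(data[pos:flag_pos])
        pos = flag_pos + 1
        if data[flag_pos] == 0:          # zero flag: this was the last block
            return method, b"".join(chunks)
```

Because the total size is never written, an encoder using this layout can emit blocks as they are produced, much like the streamed form.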

4. Form - Reference

In some cases it is necessary to define cyclic, backward or distant references. The reference binary structure encodes a UTF-8 string that points to the document. Common cases include URLs, URNs or file paths, but references are not restricted to those.

Bytes 0   1               X
      +---+---+ - - - +---+
      | 3 |   REFERENCE   |
      +---+---+ - - - +---+
Offset Description
0-1 signature, 0x03 for reference
1-X Reference String in UTF-8
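
On the encoder side, writing a reference could look like the Python sketch below. It assumes the string is stored as a length prefix followed by UTF-8 bytes, and that the prefix is a LEB128-style variable size integer; neither is defined in this section. `write_varuint` and `encode_reference` are hypothetical names.

```python
def write_varuint(value: int) -> bytes:
    """Assumed LEB128-style integer: 7 data bits per byte, low bits first."""
    if value < 0:
        raise ValueError("VarUInt cannot be negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte: high bit clear
            return bytes(out)


def encode_reference(target: str) -> bytes:
    """Encode a reference document pointing at `target`."""
    raw = target.encode("utf-8")
    return b"\x03" + write_varuint(len(raw)) + raw
```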

5. Form - Deleted

Documents can be deleted. This structure allows document files to be modified without rewriting the entire file. Decoders must skip those documents when they occur.

Bytes 0   1               X               Y
      +---+---+ - - - +---+---+ - - - +---+
      |255|   DOC SIZE    |    PADDING    |
      +---+---+ - - - +---+---+ - - - +---+
Offset Description
0-1 signature, 0xFF for deleted
1-X VarUInt, size of the deleted document; the size counts only the
padding length
X-Y bytes to skip; they may contain any kind of data. Encoders should fill
them with random or constant values for security reasons.
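
Skipping a deleted document then reduces to reading the size and advancing the cursor, as in the Python sketch below. The VarUInt is assumed to be LEB128-style, which this section does not confirm; `read_varuint` and `skip_deleted` are hypothetical names.

```python
def read_varuint(data: bytes, pos: int) -> tuple[int, int]:
    """Assumed LEB128-style integer. Returns (value, next position)."""
    value = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def skip_deleted(data: bytes, pos: int) -> int:
    """Return the position just after the deleted document starting at pos."""
    if pos >= len(data) or data[pos] != 0xFF:
        raise ValueError("not a deleted document")
    size, pos = read_varuint(data, pos + 1)
    end = pos + size                 # size counts only the padding bytes
    if end > len(data):
        raise ValueError("truncated padding")
    return end
```

The padding is never inspected, so whatever an encoder left there has no effect on decoding.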