# Annotating FlatBuffers This provides a way to annotate flatbuffer binary data, byte-by-byte, with a schema. It is useful for development purposes and understanding the details of the internal format. ## Annotating Given a `schema`, as either a plain-text (`.fbs`) or a binary schema (`.bfbs`), and `binary` file(s) that were created by the `schema`. You can annotate them using: ```sh flatc --annotate SCHEMA -- BINARY_FILES... ``` This will produce a set of annotated files (`.afb` Annotated FlatBuffer) corresponding to the input binary files. ### Example Taken from the [tests/annotated_binary](https://github.com/google/flatbuffers/tree/master/tests/annotated_binary). ```sh cd tests/annotated_binary ../../flatc --annotate annotated_binary.fbs -- annotated_binary.bin ``` Which will produce a `annotated_binary.afb` file in the current directory. The `annotated_binary.bin` is the flatbufer binary of the data contained within `annotated_binary.json`, which was made by the following command: ```sh ..\..\flatc -b annotated_binary.fbs annotated_binary.json ``` ## .afb Text Format Currently there is a built-in text-based format for outputting the annotations. A full example is shown here: [`annotated_binary.afb`](https://github.com/google/flatbuffers/blob/master/tests/annotated_binary/annotated_binary.afb) The data is organized as a table with fixed [columns](#columns) grouped into Binary [sections](#binary-sections) and [regions](#binary-regions), starting from the beginning of the binary (offset `0`). ### Columns The columns are as follows: 1. The offset from the start of the binary, expressed in hexadecimal format (e.g. `+0x003c`). The prefix `+` is added to make searching for the offset (compared to some random value) a bit easier. 2. The raw binary data, expressed in hexadecimal format. This is in the little endian format the buffer uses internally and what you would see with a normal binary text viewer. 3. The type of the data. This may be the type specified in the schema or some internally defined types: | Internal Type | Purpose | |---------------|----------------------------------------------------| | `VOffset16` | Virtual table offset, relative to the table offset | | `UOffset32` | Unsigned offset, relative to the current offset | | `SOffset32` | Signed offset, relative to the current offset | 4. The value of the data. This is shown in big endian format that is generally written for humans to consume (e.g. `0x0013`). As well as the "casted" value (e.g. `0x0013 `is `19` in decimal) in parentheses. 5. Notes about the particular data. This describes what the data is about, either some internal usage, or tied to the schema. ### Binary Sections The file is broken up into Binary Sections, which are comprised of contiguous [binary regions](#binary-regions) that are logically grouped together. For example, a binary section may be a single instance of a flatbuffer `Table` or its `vtable`. The sections may be labelled with the name of the associated type, as defined in the input schema. An example of a `vtable` Binary Section that is associated with the user-defined `AnnotateBinary.Bar` table. ``` vtable (AnnotatedBinary.Bar): +0x00A0 | 08 00 | uint16_t | 0x0008 (8) | size of this vtable +0x00A2 | 13 00 | uint16_t | 0x0013 (19) | size of referring table +0x00A4 | 08 00 | VOffset16 | 0x0008 (8) | offset to field `a` (id: 0) +0x00A6 | 04 00 | VOffset16 | 0x0004 (4) | offset to field `b` (id: 1) ``` These are purely annotative, there is no embedded information about these regions in the flatbuffer itself. ### Binary Regions Binary regions are contiguous bytes regions that are grouped together to form some sort of value, e.g. a `scalar` or an array of scalars. A binary region may be split up over multiple text lines, if the size of the region is large. #### Annotation Example Looking at an example binary region: ``` vtable (AnnotatedBinary.Bar): +0x00A0 | 08 00 | uint16_t | 0x0008 (8) | size of this vtable ``` The first column (`+0x00A0`) is the offset to this region from the beginning of the buffer. The second column are the raw bytes (hexadecimal) that make up this region. These are expressed in the little-endian format that flatbuffers uses for the wire format. The third column is the type to interpret the bytes as. For the above example, the type is `uint16_t` which is a 16-bit unsigned integer type. The fourth column shows the raw bytes as a compacted, big-endian value. The raw bytes are duplicated in this fashion since it is more intuitive to read the data in the big-endian format (e.g., `0x0008`). This value is followed by the decimal representation of the value (e.g., `(8)`). For strings, the raw string value is shown instead. The fifth column is a textual comment on what the value is. As much metadata as known is provided. ### Offsets If the type in the 3rd column is of an absolute offset (`SOffet32` or `Offset32`), the fourth column also shows an `Loc: +0x025A` value which shows where in the binary this region is pointing to. These values are absolute from the beginning of the file, their calculation from the raw value in the 4th column depends on the context.