Data Serialization in Automation - Everything you need to know

1. Introduction

This article covers the data serialization used by Automation and explains how to read, interpret, and write serialized data. This knowledge was used to make Techpool Unleashed.

1.1 What is data serialization?

Serialization converts structured data into a sequence of bytes that can be stored and later read back. Automation stores its serialized data in a binary format. In software terms, that means that instead of reading and writing text, we work with raw bytes. For a more detailed explanation of serialization, refer to the Wikipedia page.

1.2 Where does Automation apply serialization?

There are a few ways Automation uses serialized data. For example, .car files consist entirely of serialized data. Another place you will see serialization is in some columns of the database. Such columns include, but are not limited to, fixtures, paints and tech pool.

2. Reading and writing Automation serialization

As stated above, serialized data is stored in binary. There are several ways to read bytes from data buffers or binary files. One way is to open the data in a hex editor. Beware, however, that most of the data is not human-readable, and working with it manually is unlikely to be fun.
A better alternative is to write software that reads and writes the data in a way Automation understands.
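As a first step towards such software, a few lines of code can already stand in for a hex editor. The following is a minimal sketch, not part of any Automation tooling: `hexdump` is a hypothetical helper, and the sample bytes are made up for illustration.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex values and printable ASCII, like a hex editor."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as "." in the text column.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part.ljust(width * 3)} {text_part}")
    return "\n".join(lines)


# Made-up sample bytes; with a real file you would use
# open(path, "rb").read() instead.
sample = b"\x01T\x00\x00\x00\x00\x01\x00\x00\x00"
print(hexdump(sample))
```

This only displays the data; interpreting it requires the data types and identifiers described in the following sections.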

2.1 Data types

When writing your own software for this purpose, it is important to understand which data types Automation’s serialization uses, as not every programming language supports all of them natively.
The following table lists all the data types you may encounter when working with Automation’s serialized data:

| Type | Size | Description |
|---|---|---|
| Int32 | 4 bytes | A 32-bit integer. It is not used for values on its own; it accompanies other data types, for example as a string length or a table item count |
| Double | 8 bytes | A floating-point number used for all numeric values. The terms "Double" and "Numeric" are used interchangeably from here on |
| UTF-8 Char | 1 byte | A single character. Characters are used as data identifiers that define how data should be read. An array of characters forms a string, which may be of any length |
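The three types above map directly onto Python's standard `struct` module. The sketch below is illustrative only: the helper names are hypothetical, and the little-endian byte order and signed Int32 are assumptions, since the article does not state either.

```python
import struct

# Assumption: little-endian byte order ("<") and a signed Int32 ("i").
# Neither is confirmed by the article; adjust if real data disagrees.


def read_int32(buf: bytes, pos: int):
    """Read a 4-byte integer at pos; return (value, next position)."""
    (value,) = struct.unpack_from("<i", buf, pos)
    return value, pos + 4


def read_double(buf: bytes, pos: int):
    """Read an 8-byte floating-point number at pos; return (value, next position)."""
    (value,) = struct.unpack_from("<d", buf, pos)
    return value, pos + 8


def read_chars(buf: bytes, pos: int, count: int):
    """Read count bytes at pos and decode them as UTF-8 text."""
    raw = buf[pos:pos + count]
    if len(raw) != count:
        raise ValueError("buffer ends before the requested characters")
    return raw.decode("utf-8"), pos + count


# Hand-built sample buffer: an Int32, a Double, then three characters.
buf = struct.pack("<i", 3) + struct.pack("<d", 1.5) + b"abc"
length, pos = read_int32(buf, 0)
number, pos = read_double(buf, pos)
text, pos = read_chars(buf, pos, length)
print(length, number, text)
```

Each helper returns the next read position alongside the value, which makes it easy to walk through a buffer one field at a time.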

2.2 Data identifiers

The serialization used in Automation has no fixed structure or field order, so identifiers are needed to define how the data should be read. Each identifier is a single character that says how to read the data that comes right after it. The following table lists the data identifiers you may encounter in Automation.

| Identifier | Data | Description |
|---|---|---|
| T | Table | Tables are dictionaries that hold all the data in the serialization. The identifier is followed by two Int32s: the purpose of the first is unknown and it is always 0x00000000; the second gives the number of items in the table. Tables are described in more detail below |
| N | Numeric | The identifier is directly followed by a numeric value (a Double) |
| S | String | Strings are used for the keys and values of a table. The identifier is followed by an Int32 giving the length of the string, then by that many UTF-8 characters |
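To make the `N` and `S` layouts concrete, here is a minimal sketch that encodes and decodes both. The function names are hypothetical. It assumes little-endian byte order and that the string length counts encoded bytes; neither is confirmed above, though for plain ASCII text bytes and characters are the same number.

```python
import struct


def encode_numeric(value: float) -> bytes:
    """'N' followed by an 8-byte Double (little-endian is an assumption)."""
    return b"N" + struct.pack("<d", value)


def encode_string(text: str) -> bytes:
    """'S', an Int32 length, then the characters themselves."""
    raw = text.encode("utf-8")
    # Assumption: the length field counts encoded bytes.
    return b"S" + struct.pack("<i", len(raw)) + raw


def decode_scalar(buf: bytes, pos: int = 0):
    """Read one N or S value at pos; return (value, next position)."""
    tag = buf[pos:pos + 1]
    pos += 1
    if tag == b"N":
        (number,) = struct.unpack_from("<d", buf, pos)
        return number, pos + 8
    if tag == b"S":
        (length,) = struct.unpack_from("<i", buf, pos)
        pos += 4
        # Decoded as text here; strings that carry binary payloads
        # would have to be kept as raw bytes instead.
        return buf[pos:pos + length].decode("utf-8"), pos + length
    raise ValueError(f"unsupported identifier {tag!r} at offset {pos - 1}")


data = encode_string("Name") + encode_numeric(42.0)
first, pos = decode_scalar(data)
second, pos = decode_scalar(data, pos)
print(first, second)
```

The identifier byte is read first and decides how many bytes follow, which is why the format needs no fixed field order.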

2.3 Tables

All serialized data is nested in tables. Tables are dictionaries, that is, collections of key-value pairs. In Automation’s serialization, keys are always strings. Values can be of any other identified type, even another table. Automation uses two types of tables:

  • Top-level tables, which are meant to be treated separately
  • Nested tables, which are stored as a value of another table

Examples of top-level tables include .car files and some items such as fixtures, tech pool, and paints. Top-level tables are not values and thus do not directly have a key; instead, their identifier is preceded by a single 0x01 byte. In .car files, top-level tables other than the main container itself are stored as string values.

Nested tables are values that have a key, and they are not saved as strings. Examples can be found in .car files, where the data is split into tables such as models, trims, families, and variants.

The second Int32 after the table identifier counts the key-value pairs inside the table, not the table's size in bytes.
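Putting the pieces together, the sketch below parses a top-level table, including nested tables and a top-level table embedded in a string value. It is an illustration under stated assumptions, not Automation's actual implementation: byte order is assumed to be little-endian, keys are assumed to carry the `S` identifier like any other string, and all function names and key names (`Name`, `Blob`, `Year`) are made up.

```python
import struct


def parse_value(buf: bytes, pos: int):
    """Read one identified value at pos; return (value, next position)."""
    tag = buf[pos:pos + 1]
    pos += 1
    if tag == b"N":
        (number,) = struct.unpack_from("<d", buf, pos)
        return number, pos + 8
    if tag == b"S":
        (length,) = struct.unpack_from("<i", buf, pos)
        pos += 4
        # Kept as raw bytes: a string value may hold an embedded table.
        return buf[pos:pos + length], pos + length
    if tag == b"T":
        # First Int32: purpose unknown. Second Int32: number of pairs.
        _unknown, count = struct.unpack_from("<ii", buf, pos)
        pos += 8
        table = {}
        for _ in range(count):
            key, pos = parse_value(buf, pos)
            if not isinstance(key, bytes):
                raise ValueError("table keys are expected to be strings")
            value, pos = parse_value(buf, pos)
            table[key.decode("utf-8")] = value
        return table, pos
    raise ValueError(f"unknown identifier {tag!r} at offset {pos - 1}")


def parse_top_level(buf: bytes) -> dict:
    """Parse a top-level table, which starts with a 0x01 byte before 'T'."""
    if buf[:1] != b"\x01":
        raise ValueError("expected 0x01 before the top-level table")
    value, _end = parse_value(buf, 1)
    if not isinstance(value, dict):
        raise ValueError("top-level value is not a table")
    return value


def s(text: str) -> bytes:
    """Build an S value by hand for the demo data below."""
    raw = text.encode("utf-8")
    return b"S" + struct.pack("<i", len(raw)) + raw


# Made-up data: an outer table whose "Blob" string holds another top-level table.
inner = b"\x01T" + struct.pack("<ii", 0, 1) + s("Year") + b"N" + struct.pack("<d", 2020.0)
outer = (b"\x01T" + struct.pack("<ii", 0, 2) + s("Name") + s("Demo")
         + s("Blob") + b"S" + struct.pack("<i", len(inner)) + inner)

car = parse_top_level(outer)
print(car["Name"])
print(parse_top_level(car["Blob"]))
```

Strings are returned as raw bytes on purpose: since top-level tables can be stored as string values, decoding every string as text would fail on the binary content. Callers decode the strings they know to be text and pass the rest back into `parse_top_level`.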

3. Conclusion

The information above covers the fundamentals of Automation’s serialization and should be enough to start working with the data. All that is left is to choose a suitable programming language and write your own software to handle Automation’s serialized data. Unless you are still set on manually editing everything in a hex editor, in which case, bless your soul.
