1. Introduction
This article covers the data serialization used by Automation and explains how to read, interpret and write serialized data. This knowledge was used to make Techpool Unleashed and DotCarDumper.
1.1 What is data serialization?
Serialized data is data stored in binary format. In software terms, this means that instead of reading and writing text, we work with bytes. For a more detailed explanation of serialization, refer to the Wikipedia page.
1.2 Where does Automation apply serialization?
There are a few ways Automation uses serialized data. For example, .car files consist entirely of serialized data. Another place you will see serialization is in some columns of the database. Such columns include, but are not limited to, fixtures, paints and tech pool.
2. Reading and writing Automation serialization
As stated above, serialized data is stored in binary. There are several ways to read bytes from data buffers or binary files. One way is to open the data in a hex editor. Beware, however, that most of the data is not exactly human-readable, and you probably would not have much fun working with it manually.
A better alternative is to write software that can read and write data in a format Automation understands.
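As a first step toward such software, it can help to print the raw bytes in a hex-editor-style layout from a script. The following is a minimal sketch; the function name `hexdump` and the file name in the usage comment are hypothetical and not part of Automation or its tooling.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as lines of: offset, hex values, printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as "." like in most hex editors.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad the hex column so the ASCII column lines up on short rows.
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text}")
    return "\n".join(lines)


# Usage (hypothetical file name):
# with open("example.car", "rb") as f:
#     print(hexdump(f.read(64)))
```

Looking at the first few dozen bytes this way makes the identifier characters described in section 2.2 easy to spot.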
2.1 Data types
When writing your own software for this purpose, it is important to understand which data types Automation’s serialization uses, as not every programming language supports all of them natively.
The following table displays all data types you may encounter when working with Automation’s serialized data:
Type | Size | Description |
---|---|---|
Int32 | 4 Bytes | A 32-bit integer. Not used for values on its own; instead it accompanies other data types, for example as a table size or a string length |
Double | 8 Bytes | A floating-point number used for all numeric values. The terms "Double" and "Numeric" are used interchangeably further on |
UTF-8 Char | 1 Byte* | A single character. Also used as a data identifier that defines how the following data should be read. An array of characters makes up a string, which may be of any length |
Boolean | 1 Byte | A true/false value. Represented by the byte 0x30 for false and 0x31 for true (the ASCII characters "0" and "1") |
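The types in the table above could be read in Python with the standard `struct` module, roughly as sketched below. Two things here are assumptions rather than facts from this article: the byte order (little-endian) and the Int32 being signed. The function names are made up for illustration. Each reader returns the decoded value together with the position of the next unread byte.

```python
import struct

# Assumption: little-endian byte order and a signed Int32.
# The article does not state either; adjust the format strings if needed.
INT32 = struct.Struct("<i")   # 4 bytes
DOUBLE = struct.Struct("<d")  # 8 bytes


def read_int32(buf: bytes, pos: int) -> tuple:
    """Return (value, next_pos) for an Int32 starting at pos."""
    (value,) = INT32.unpack_from(buf, pos)
    return value, pos + INT32.size


def read_double(buf: bytes, pos: int) -> tuple:
    """Return (value, next_pos) for a Double starting at pos."""
    (value,) = DOUBLE.unpack_from(buf, pos)
    return value, pos + DOUBLE.size


def read_boolean(buf: bytes, pos: int) -> tuple:
    """Return (value, next_pos); 0x30 means false and 0x31 means true."""
    byte = buf[pos]
    if byte == 0x30:
        return False, pos + 1
    if byte == 0x31:
        return True, pos + 1
    raise ValueError(f"not a Boolean byte at offset {pos}: {byte:#04x}")
```

Returning the next position from every reader keeps the calling code simple, since the data has no fixed layout and has to be consumed strictly in order.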
2.2 Data identifiers
Automation’s serialization has no strict structure or fixed order of data, so identifiers are needed to define how the data should be read. Each identifier is a single character that says how to read the data that comes right after it. The following table lists the data identifiers you may encounter in Automation.
Identifier | Data | Description |
---|---|---|
T (0x54) | Table | Tables are dictionaries that contain all the data included in serialization. The identifier is followed by two Int32s which define how many items there are in the table. Only one of the two seems to ever have a value, thus the other one can be ignored. More information about tables can be found further below |
N (0x4E) | Numeric | The identifier is directly followed by a Double value |
S (0x53) | String | Strings are used for the keys and values of tables. The identifier is followed by an Int32 that defines the length of the string, which is then followed by that many UTF-8 characters |
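To illustrate how the N and S identifiers frame their data, here is a minimal round-trip sketch in Python. It assumes little-endian byte order and that the string length counts bytes; neither is confirmed by this article. Strings are kept as raw `bytes` because, as section 2.3 describes, a string value may itself hold serialized data. All function names are hypothetical.

```python
import struct


def encode_numeric(value: float) -> bytes:
    """'N' followed by an 8-byte Double (little-endian is an assumption)."""
    return b"N" + struct.pack("<d", value)


def encode_string(raw: bytes) -> bytes:
    """'S', an Int32 length, then the characters themselves.

    Assumption: the length counts bytes, not multi-byte characters.
    """
    return b"S" + struct.pack("<i", len(raw)) + raw


def decode_scalar(buf: bytes, pos: int = 0) -> tuple:
    """Decode one N or S item; return (value, next_pos)."""
    ident = buf[pos:pos + 1]
    pos += 1
    if ident == b"N":
        (value,) = struct.unpack_from("<d", buf, pos)
        return value, pos + 8
    if ident == b"S":
        (length,) = struct.unpack_from("<i", buf, pos)
        pos += 4
        raw = buf[pos:pos + length]
        if len(raw) != length:
            raise ValueError("string is longer than the remaining data")
        return raw, pos + length
    raise ValueError(f"unsupported identifier {ident!r} at offset {pos - 1}")
```

For example, `encode_string(b"Name")` produces the byte `S`, four length bytes and then the four characters, and `decode_scalar` turns that back into `b"Name"`.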
2.3 Tables
All serialized data is nested into tables. Tables are dictionaries: they contain pairs of keys and values. In Automation’s serialization both keys and values can be of any type. There are two types of tables used by Automation:
- Top-level tables, which are meant to be treated separately
- Nested tables, which are represented as a value of another table
Examples of top-level tables include .car files and some items such as fixtures, tech pool, paints etc. Top-level tables are not values themselves and thus do not directly have a key; instead, their identifier is preceded by a 0x01 byte (although they technically do have a key when they are stored as string values). In .car files, all such tables other than the main container itself are stored as string values.
Nested tables are values that have a key and are not saved as strings. Examples can be found in .car files, where the data is split into tables such as models, trims, families and variants.
The table identifier is followed by two Int32s, one of which is always 0 while the other defines the size of the table. Which of the two holds the size varies, but only one ever has a non-zero value.
The size counts the key-value pairs inside the table (1 pair = 1), not the number of bytes.
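Putting the table rules together, a recursive reader could look roughly like the sketch below. It is not the implementation used by any existing tool. It assumes little-endian byte order and byte-counted string lengths, handles only the T, N and S identifiers listed in section 2.2, and uses hypothetical function names. Tables become Python dicts, so a table used as a key would not work here without further changes.

```python
import struct


def parse_value(buf: bytes, pos: int) -> tuple:
    """Decode one identifier-prefixed item; return (value, next_pos)."""
    ident = buf[pos:pos + 1]
    pos += 1
    if ident == b"N":
        (number,) = struct.unpack_from("<d", buf, pos)
        return number, pos + 8
    if ident == b"S":
        (length,) = struct.unpack_from("<i", buf, pos)
        pos += 4
        # Kept as raw bytes: a string may itself hold a serialized table.
        return buf[pos:pos + length], pos + length
    if ident == b"T":
        first, second = struct.unpack_from("<ii", buf, pos)
        pos += 8
        # Only one of the two Int32s is non-zero; that one is the pair count.
        count = first if first else second
        table = {}
        for _ in range(count):
            key, pos = parse_value(buf, pos)
            value, pos = parse_value(buf, pos)
            table[key] = value
        return table, pos
    raise ValueError(f"unsupported identifier {ident!r} at offset {pos - 1}")


def parse_top_level(buf: bytes) -> dict:
    """Parse a top-level table: a 0x01 byte, then a 'T' table."""
    if buf[:1] != b"\x01":
        raise ValueError("expected 0x01 before the top-level table")
    if buf[1:2] != b"T":
        raise ValueError("expected a table identifier after 0x01")
    table, _ = parse_value(buf, 1)
    return table
```

Since top-level tables inside a .car file are stored as string values, a string returned by this parser that begins with 0x01 followed by `T` can presumably be passed to `parse_top_level` again to unpack it.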
3. Conclusion
The information above covers the fundamentals of Automation’s serialization and should be enough to start working with the data. All that is left is to choose a suitable programming language and write your own software to handle any of Automation’s serialized data. Unless you are still set on manually editing everything in a hex editor, in which case, bless your soul.