In Jai, you can write something like:
a: Type = int; b: Type = Video_File;
So what is the value that a and b actually hold? What is
int or Video_File when used as a value?
A Type value is a pointer to a Type_Info struct.
When the compiler compiles your program, it creates exactly one
Type_Info struct for every distinct type in the program. These all live in the
program's data segment. There is one for int, one for float64, one
for Video_File, and so on. There are never duplicates.
Every time you refer to a type, whether as int, as Video_File, or through type_of(x), you get back the same pointer to that type's single Type_Info.
Program Memory (data segment):
0xA000: Type_Info_Integer { runtime_size = 8, signed = true } - this is "int"
0xA040: Type_Info_Float { runtime_size = 8 } - this is "float64"
0xA070: Type_Info_Struct { name = "Video_File", members = ... } - this is "Video_File"
0xA100: Type_Info_Struct { name = "Audio_File", members = ... } - this is "Audio_File"
So when you write:
a: Type = int; // a holds 0xA000
b: Type = int; // b holds 0xA000
c: Type = Video_File; // c holds 0xA070
a == b is 0xA000 == 0xA000 which is true (same type)
a == c is 0xA000 == 0xA070 which is false (different types)
That's it. Every reference to int anywhere in your program resolves to the same pointer. Comparing two types is just comparing two pointers, which typically compiles down to a single machine instruction.
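The pointer-identity behaviour described above can be checked with a few assertions. This is a minimal sketch: the Video_File struct and the same_type helper are placeholders made up for the example, and it assumes the Basic module for assert.

```jai
#import "Basic";

// Placeholder struct, only here so there is a second type to compare against.
Video_File :: struct {
    path: string;
}

// Hypothetical helper: two Type values are equal exactly when they refer to the same type.
same_type :: (a: Type, b: Type) -> bool {
    return a == b;
}

main :: () {
    a: Type = int;
    b: Type = int;
    c: Type = Video_File;
    x := 42;  // x is inferred as int

    assert(same_type(a, b));           // both refer to int
    assert(!same_type(a, c));          // int vs. Video_File
    assert(same_type(type_of(x), a));  // type_of(x) refers to int as well

    // A Type can be cast to *Type_Info to inspect what it points at.
    info := cast(*Type_Info) a;
    assert(info.type == .INTEGER);
    assert(info.runtime_size == 8);
}
```

Nothing here depends on the actual addresses, only on the fact that each type has exactly one Type_Info.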
Why This Matters for Serialization
Because a Type is a pointer into the current process's memory, it is meaningless outside that process.
- You cannot send it over the network. The other machine has its own type info table at different addresses.
- You cannot save it to disk. When the program runs again, the type info table may sit at a different address (because of ASLR, recompilation, and so on).
- You cannot share it between two different programs. They have completely separate type info tables.
If you have a struct with a Type field and you write its bytes to a file or send them over a socket, the Type field becomes a dangling pointer on the other side.
So if you need to serialize type identity, you must convert it to something stable yourself: an enum value, a string name, a hash, or some other identifier that both sides agree on. The Type value itself is purely a fast, in-process convenience.
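As a small illustration of "convert it to something stable", here is a sketch that pulls the struct name out of the Type_Info. The stable_name helper and the Video_File struct are hypothetical, invented for this example; it only handles structs and reports failure for anything else.

```jai
#import "Basic";

// Placeholder struct for the example.
Video_File :: struct {
    path: string;
}

// Hypothetical helper: returns the struct's name, which is the same string in
// every process built from the same source. Returns false for non-struct types.
stable_name :: (type: Type) -> string, bool {
    info := cast(*Type_Info) type;
    if info.type != .STRUCT return "", false;
    return (cast(*Type_Info_Struct) info).name, true;
}

main :: () {
    name, ok := stable_name(Video_File);
    assert(ok);
    assert(name == "Video_File");

    // int is not a struct, so this helper refuses it.
    _, ok_int := stable_name(int);
    assert(!ok_int);
}
```

The name is safe to write to a file or a socket because it does not depend on where the Type_Info happens to live in memory.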
A Solution to Serialization
To make serialization work, we can do something like this:
My_Type_Info :: struct {
    type: Type;
    struct_name: string;
    type_id: u64;
}
types: [..] My_Type_Info;
add_type :: (types: *[..] My_Type_Info, type: Type) {
    info := cast(*Type_Info) type;
    assert(info.type == .STRUCT, "add_type only supports structs");
    ti := cast(*Type_Info_Struct) type;
    name := ti.name;
    // get_hash is assumed to come from the Hash module.
    id := get_hash(name);
    for types.* {
        // Two different types must never end up with the same ID.
        assert(it.type_id != id || it.type == type,
               "Hash collision between '%' and '%'", it.struct_name, name);
        if it.type == type return; // already registered
    }
    array_add(types, .{type, name, id});
}
get_type :: (type_id: u64) -> Type, bool {
    for types {
        if it.type_id == type_id
            return it.type, true;
    }
    // void is used as a placeholder; callers must check the bool.
    return void, false;
}
get_type_id :: (type: Type) -> u64, bool {
    for types {
        if it.type == type
            return it.type_id, true;
    }
    return 0, false;
}
How to Use This
First, at startup, register every struct type that you intend to serialize. Both the sending side and the receiving side must register the same types:
init_types :: () {
    add_type(*types, Video_File);
    add_type(*types, Audio_File);
    add_type(*types, Executable_File);
}
When you need to serialize a document (for example, to send it over the network or write it to a file), convert its Type field into a u64 ID using get_type_id. Write that ID to your output instead of the raw Type pointer:
serialize_document :: (doc: *Document) -> [] u8 {
    id, ok := get_type_id(doc.type);
    assert(ok, "Tried to serialize an unregistered type");
    // Write id (a u64) to your byte stream instead of doc.type.
    // Then write the rest of the struct's fields as normal.
    // ...
}
On the receiving side, read the u64 ID back and convert it to a Type
using get_type. You now have a valid in-process Type pointer that
you can assign to the struct's type field:
deserialize_document :: (bytes: [] u8) -> *Document {
    // Read the u64 type ID from the byte stream.
    id := read_u64(bytes);
    type, ok := get_type(id);
    assert(ok, "Received an unknown type ID: %", id);
    // Allocate the correct struct and set its type field.
    // type is only known at runtime, so the size comes from its Type_Info.
    info := cast(*Type_Info) type;
    doc := cast(*Document) alloc(info.runtime_size);
    doc.type = type;
    // Read the rest of the fields...
    // ...
    return doc;
}
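The snippets above call read_u64 without defining it. One possible shape for it, together with a matching write_u64, is sketched below. Both functions are assumptions made for this example, not library calls. They use a fixed little-endian byte order so that the sender and the receiver agree on the encoding regardless of their native byte order.

```jai
#import "Basic";

// Hypothetical helper: appends value to buffer as 8 bytes, least significant byte first.
write_u64 :: (buffer: *[..] u8, value: u64) {
    for 0..7 {
        shift := cast(u64) (it * 8);
        array_add(buffer, cast(u8) ((value >> shift) & 0xff));
    }
}

// Hypothetical helper: reads the first 8 bytes back into a u64, using the same byte order.
read_u64 :: (bytes: [] u8) -> u64 {
    assert(bytes.count >= 8, "Need at least 8 bytes to read a u64");
    result: u64;
    for 0..7 {
        shift := cast(u64) (it * 8);
        result |= (cast(u64) bytes[it]) << shift;
    }
    return result;
}

main :: () {
    id: u64 = 0x1122334455667788;  // stands in for a value returned by get_type_id

    buffer: [..] u8;
    defer array_free(buffer);

    write_u64(*buffer, id);
    assert(buffer.count == 8);
    assert(buffer[0] == 0x88);  // least significant byte comes first

    assert(read_u64(buffer) == id);
}
```

A real reader would also need to track how many bytes it has consumed so that the remaining fields can be read after the ID; that bookkeeping is left out here.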
The important rule is: both sides must call add_type for the same set of
types. Because the ID is derived from a hash of the struct name, as long as both sides
have a struct with the same name and use the same hash function, the IDs will match
regardless of where in memory each program's type info table lives.
If you add a new type later, just add another add_type call on both sides. If you
rename a struct, the hash changes, and the old ID becomes invalid, which is usually what
you want, since a renamed struct likely has different semantics. If you need to survive renames,
consider hashing the struct's members and layout instead of (or in addition to) the name.
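The last suggestion, hashing members and layout, could look roughly like the sketch below. Everything in it is an assumption made for illustration: the layout_hash, hash_string and hash_u64 helpers, the choice of 64-bit FNV-1a as the hash, and the decision to mix in each member's name, byte offset and size. It ignores finer details such as the exact type of each member.

```jai
#import "Basic";

// 64-bit FNV-1a parameters.
FNV_OFFSET_BASIS : u64 : 0xcbf29ce484222325;
FNV_PRIME        : u64 : 0x100000001b3;

// Mixes the bytes of a string into a running hash.
hash_string :: (h: u64, s: string) -> u64 {
    result := h;
    for 0..s.count-1 {
        result ^= cast(u64) s[it];
        result *= FNV_PRIME;
    }
    return result;
}

// Mixes the 8 bytes of an integer into a running hash.
hash_u64 :: (h: u64, value: u64) -> u64 {
    result := h;
    for 0..7 {
        shift := cast(u64) (it * 8);
        result ^= (value >> shift) & 0xff;
        result *= FNV_PRIME;
    }
    return result;
}

// Hypothetical helper: an ID that depends on the members, not on the struct's name.
layout_hash :: (type: Type) -> u64 {
    info := cast(*Type_Info) type;
    assert(info.type == .STRUCT, "layout_hash only supports structs");
    ti := cast(*Type_Info_Struct) info;

    h := FNV_OFFSET_BASIS;
    for ti.members {
        h = hash_string(h, it.name);
        h = hash_u64(h, cast(u64) it.offset_in_bytes);
        h = hash_u64(h, cast(u64) it.type.runtime_size);
    }
    return h;
}

// Placeholder structs: Movie_File plays the role of a renamed Video_File.
Video_File :: struct { path: string; duration: float64; }
Movie_File :: struct { path: string; duration: float64; }
Audio_File :: struct { path: string; sample_rate: s32; }

main :: () {
    assert(layout_hash(Video_File) == layout_hash(Movie_File));  // survives the rename
    assert(layout_hash(Video_File) != layout_hash(Audio_File));  // different members
}
```

The trade-off is that two unrelated structs with identical members get the same ID, which is why combining the layout hash with the name can be the safer choice when renames are rare.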