Core Types
u8
An unsigned 8-bit integer. It stores whole numbers in the range 0 to 255 (inclusive).
u16
An unsigned 16-bit integer. It stores whole numbers in the range 0 to 65,535 (inclusive).
u32
An unsigned 32-bit integer. It stores whole numbers in the range 0 to 4,294,967,295 (inclusive).
u64
An unsigned 64-bit integer. It stores whole numbers in the range 0 to 18,446,744,073,709,551,615 (inclusive).
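As a quick check of the ranges above, the following sketch stores the all-ones bit pattern in each unsigned type and prints it next to the type's size. It assumes the standard Basic module for print; the variable names are illustrative only.

```jai
#import "Basic";

main :: () {
    // An unsigned type uses every bit for magnitude, so its maximum
    // value is the bit pattern with all bits set.
    max_u8:  u8  = 0xFF;
    max_u16: u16 = 0xFFFF;
    max_u32: u32 = 0xFFFF_FFFF;
    max_u64: u64 = 0xFFFF_FFFF_FFFF_FFFF;

    print("u8:  % byte(s), max %\n", size_of(u8),  max_u8);
    print("u16: % byte(s), max %\n", size_of(u16), max_u16);
    print("u32: % byte(s), max %\n", size_of(u32), max_u32);
    print("u64: % byte(s), max %\n", size_of(u64), max_u64);
}
```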
Two's Complement
Two's complement is a method for representing signed integers in binary. In an n-bit number, the most significant bit is used as the sign bit: 0 for non-negative values and 1 for negative values.
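Negative numbers are represented by inverting all bits of the positive value and then adding 1. For example, in 8 bits, 5 is 00000101; inverting the bits gives 11111010, and adding 1 gives 11111011, which is -5. This representation allows the same addition and subtraction hardware to work for both positive and negative numbers.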
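The invert-and-add-one rule can be tried directly on an 8-bit pattern. This is a minimal sketch: negate_bits is a hypothetical helper name, formatInt from Basic is used to print binary digits, and cast,no_check is used on the assumption that a plain cast from u8 to s8 would be range-checked.

```jai
#import "Basic";

// Negate an 8-bit pattern the two's complement way: invert every bit, then add 1.
negate_bits :: (bits: u8) -> u8 {
    return ~bits + 1;
}

main :: () {
    positive: u8 = 5;                         // 00000101
    negated := negate_bits(positive);         // 11111011

    // Reinterpret the same bits as a signed value without a range check.
    as_signed := cast,no_check(s8) negated;

    print("positive: %\n", formatInt(positive, base = 2, minimum_digits = 8));
    print("negated:  %\n", formatInt(negated,  base = 2, minimum_digits = 8));
    print("as s8:    %\n", as_signed);
}
```

Reading the bit pattern 11111011 as an s8 gives -5, the negation of the starting value.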
s8
A signed 8-bit integer represented using two's complement. It stores whole numbers in the range -128 to 127 (inclusive).
s16
A signed 16-bit integer represented using two's complement. It stores whole numbers in the range -32,768 to 32,767 (inclusive).
s32
A signed 32-bit integer represented using two's complement. It stores whole numbers in the range -2,147,483,648 to 2,147,483,647 (inclusive).
s64
A signed 64-bit integer represented using two's complement. It stores whole numbers in the range -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (inclusive).
int
int is the same type as s64: a signed 64-bit integer with the same range and representation.
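A short sketch of what this equivalence means in practice, assuming the Basic module for assert and print. The variable names are illustrative.

```jai
#import "Basic";

main :: () {
    // int and s64 name the same type, so values move between them
    // without a cast and their Type values compare equal.
    a: int = -42;
    b: s64 = a;
    assert(type_of(a) == s64);
    assert(size_of(int) == 8);

    // The smallest and largest s8 values from the range given above.
    lowest:  s8 = -128;
    highest: s8 = 127;
    print("int is % bytes; b = %; s8 spans % to %\n", size_of(int), b, lowest, highest);
}
```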
Floating Point Representation
Floating point numbers are represented using three components: a sign, an exponent, and a mantissa (also called the significand). These components are stored in fixed positions within the bits of the number.
The most significant bit stores the sign. The next group of bits stores the exponent. The remaining bits store the mantissa.
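For a normal (finite, non-zero) value in the IEEE 754 formats, the value of a floating point number is given by:
((-1)^sign) * 1.mantissa * (2^(exponent - bias)).
Here 1.mantissa is the stored mantissa bits read as a binary fraction with an implicit leading 1, and the stored exponent is offset by a fixed bias (127 for float32, 1023 for float64) so that it can encode both negative and positive powers of two.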
The exponent controls the scale of the number, while the mantissa controls its precision. Because the number of bits is limited, many real numbers cannot be represented exactly and are instead rounded to the nearest representable value.
When a floating point literal (such as 3.14) is written in code, it is converted into this representation by an algorithm that determines the closest representable value using the available bits.
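To make the three fields concrete, the sketch below pulls them out of a float32 by reinterpreting its 32 bits as a u32. The field widths (1, 8, and 23 bits) are those of IEEE 754 single precision; decompose is a hypothetical helper name.

```jai
#import "Basic";

// Split a float32 into its stored sign, exponent, and mantissa fields.
decompose :: (f: float32) -> u32, u32, u32 {
    value := f;                            // local copy so we can take its address
    bits  := (cast(*u32) *value).*;        // reinterpret the same 32 bits as an integer

    sign_bit        := bits >> 31;             // top bit
    stored_exponent := (bits >> 23) & 0xFF;    // next 8 bits, still biased by 127
    mantissa_bits   := bits & 0x7F_FFFF;       // low 23 bits, implicit leading 1 not stored
    return sign_bit, stored_exponent, mantissa_bits;
}

main :: () {
    // -6.5 = -1.625 * 2^2, so the fields should be: sign 1,
    // stored exponent 2 + 127 = 129, mantissa bits 0x500000 (binary 101 followed by zeros).
    sign_bit, stored_exponent, mantissa_bits := decompose(-6.5);
    print("sign = %, exponent = %, mantissa = 0x%\n",
          sign_bit, stored_exponent, formatInt(mantissa_bits, base = 16));
}
```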
float32
A 32-bit floating point number (IEEE 754 single precision): 1 sign bit, 8 exponent bits, and 23 mantissa bits. It provides approximately 7 decimal digits of precision.
float64
A 64-bit floating point number (IEEE 754 double precision): 1 sign bit, 11 exponent bits, and 52 mantissa bits. It provides approximately 15–16 decimal digits of precision.
float
float is the default floating point type. It is equivalent to float32.
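The difference in precision shows up as soon as a value needs more mantissa bits than float32 has. The following sketch, which assumes the Basic module, adds 1 to 2^24 in both widths; the variable names are illustrative.

```jai
#import "Basic";

main :: () {
    // float is float32, so the two Type values compare equal.
    x: float = 1.5;
    assert(type_of(x) == float32);

    // 2^24 = 16,777,216. The next integer needs 25 significant bits,
    // one more than float32 can hold, so the sum rounds back down.
    big32: float32 = 16777216.0;
    next32 := big32 + 1.0;
    assert(next32 == big32);

    // float64 has 53 significant bits, so the same sum is exact.
    big64: float64 = 16777216.0;
    next64 := big64 + 1.0;
    assert(next64 != big64);

    print("float32 lost the +1: %\n", next32 == big32);
    print("float64 kept the +1: %\n", next64 != big64);
}
```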
The Type Type
In Jai, you can write something like:
a: Type = int; b: Type = Video_File;
So what is the value that a and b actually hold? What is
int or Video_File when used as a value?
A Type value is a pointer to a Type_Info struct.
When the compiler compiles your program, it creates exactly one
Type_Info struct for every distinct type in the program. These all live in the
program's data segment. There is one for int, one for float64, one
for Video_File, and so on. There are never duplicates.
Every time you refer to a type, such as int, Video_File, or type_of(x), you get back the same pointer to that type's single Type_Info.
Program Memory (data segment):
0xA000: Type_Info_Integer { runtime_size = 8, signed = true } - this is "int"
0xA040: Type_Info_Float { runtime_size = 8 } - this is "float64"
0xA070: Type_Info_Struct { name = "Video_File", members = ... } - this is "Video_File"
0xA100: Type_Info_Struct { name = "Audio_File", members = ... } - this is "Audio_File"
So when you write:
a: Type = int; // a holds 0xA000
b: Type = int; // b holds 0xA000
c: Type = Video_File; // c holds 0xA070
a == b is 0xA000 == 0xA000 which is true (same type)
a == c is 0xA000 == 0xA070 which is false (different types)
That's it. Every reference to int anywhere in your program resolves to the same pointer, so comparing two types is just comparing two pointers, which typically compiles to a single compare instruction.
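Because a Type is a pointer to a Type_Info, it can be compared like any pointer and cast to read the metadata behind it. The sketch below assumes the Basic module; the Video_File definition is a hypothetical stand-in with made-up members.

```jai
#import "Basic";

// Hypothetical stand-in for the Video_File struct used in the text.
Video_File :: struct {
    path: string;
    duration_seconds: float64;
}

main :: () {
    a: Type = int;
    b: Type = s64;          // int is s64, so this is the same Type value
    c: Type = Video_File;
    assert(a == b);
    assert(a != c);

    // Cast the Type to *Type_Info to read the compiler-generated metadata.
    info := cast(*Type_Info) c;
    print("tag = %, runtime_size = %\n", info.type, info.runtime_size);

    // The tag says which more specific Type_Info struct this really is.
    if info.type == .STRUCT {
        s := cast(*Type_Info_Struct) info;
        print("struct % has % members\n", s.name, s.members.count);
    }
}
```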
Why This Matters for Serialization
Because a Type is a pointer into the current process's memory, it is meaningless outside that process.
- You cannot send it over the network. The other machine has its own type info table at different addresses.
- You cannot save it to disk. When the program restarts, the type info table will be at a different address (due to ASLR, recompilation, etc.).
- You cannot share it between two different programs. They have completely separate type info tables.
If you have a struct with a Type field and you write its bytes to a file or send them over a socket, the Type field becomes a dangling pointer on the other side.
So if you need to serialize type identity, you must convert it to something stable yourself: an enum value, a string name, a hash, or some other identifier that both sides agree on. The Type value itself is purely a fast, in-process convenience.
A Solution to Serialization
To make serialization work, we can keep a registry that maps each Type to a stable ID, something like this:
My_Type_Info :: struct {
    type: Type;
    struct_name: string;
    type_id: u64;
}

types: [..] My_Type_Info;

// Register a struct type under an ID derived from a hash of its name.
add_type :: (types: *[..] My_Type_Info, type: Type) {
    info := cast(*Type_Info) type;
    assert(info.type == .STRUCT, "add_type only supports structs");
    ti := cast(*Type_Info_Struct) info;
    name := ti.name;
    id := get_hash(name);
    for types.* {
        // Two different types with the same ID would make the ID ambiguous.
        assert(it.type_id != id || it.type == type,
               "Hash collision between '%' and '%'", it.struct_name, name);
        if it.type == type return; // already registered
    }
    array_add(types, .{type, name, id});
}

// Look up the in-process Type for a serialized ID.
get_type :: (type_id: u64) -> Type, bool {
    for types {
        if it.type_id == type_id
            return it.type, true;
    }
    return void, false; // placeholder Type; callers must check the bool
}

// Look up the serializable ID for an in-process Type.
get_type_id :: (type: Type) -> u64, bool {
    for types {
        if it.type == type
            return it.type_id, true;
    }
    return 0, false;
}
How to Use This
First, at startup, register every struct type that you intend to serialize. Both the sending side and the receiving side must register the same types:
init_types :: () {
    add_type(*types, Video_File);
    add_type(*types, Audio_File);
    add_type(*types, Executable_File);
}
When you need to serialize a document (for example, to send it over the network or write it to a file), convert its Type field into a u64 ID using get_type_id. Write that ID to your output instead of the raw Type pointer:
serialize_document :: (doc: *Document) -> [] u8 {
    id, ok := get_type_id(doc.type);
    assert(ok, "Tried to serialize an unregistered type");
    // Write id (a u64) to your byte stream instead of doc.type.
    // Then write the rest of the struct's fields as normal.
    // ...
}
On the receiving side, read the u64 ID back and convert it to a Type
using get_type. You now have a valid in-process Type pointer that
you can assign to the struct's type field:
deserialize_document :: (bytes: [] u8) -> *Document {
    // Read the u64 type ID from the byte stream.
    id := read_u64(bytes);
    type, ok := get_type(id);
    assert(ok, "Received an unknown type ID: %", id);
    // Allocate the correct struct and set its type field. The type is only
    // known at runtime, so its size is read from its Type_Info.
    info := cast(*Type_Info) type;
    doc := cast(*Document) alloc(info.runtime_size);
    doc.type = type;
    // Read the rest of the fields...
    // ...
    return doc;
}
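deserialize_document relies on a read_u64 helper that the text does not define. One possible shape for it, together with a matching writer, is sketched below. Both functions are hypothetical, and the fixed little-endian byte order is an assumption chosen so that both sides agree regardless of the machine they run on.

```jai
#import "Basic";

// Append a u64 to the buffer as 8 bytes, least significant byte first.
write_u64 :: (buffer: *[..] u8, value: u64) {
    for 0..7 {
        shift := cast(u64) (it * 8);
        array_add(buffer, cast(u8) ((value >> shift) & 0xFF));
    }
}

// Read a u64 back from the first 8 bytes, least significant byte first.
read_u64 :: (bytes: [] u8) -> u64 {
    assert(bytes.count >= 8, "Need at least 8 bytes to read a u64");
    value: u64 = 0;
    for 0..7 {
        shift := cast(u64) (it * 8);
        value |= (cast(u64) bytes[it]) << shift;
    }
    return value;
}

main :: () {
    buffer: [..] u8;
    id: u64 = 0x1122_3344_5566_7788;
    write_u64(*buffer, id);

    // The ID survives the trip through raw bytes.
    assert(read_u64(buffer) == id);
    print("round trip ok, % bytes written\n", buffer.count);
}
```

A real reader would also advance past the 8 bytes it consumed so that the remaining fields can be read in order.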
The important rule is: both sides must call add_type for the same set of
types. Because the ID is derived from a hash of the struct name, as long as both sides
have a struct with the same name and use the same hash function, the IDs will match
regardless of where in memory each program's type info table lives.
If you add a new type later, just add another add_type call on both sides. If you
rename a struct, the hash changes, and the old ID becomes invalid, which is usually what
you want, since a renamed struct likely has different semantics. If you need to survive renames,
consider hashing the struct's members and layout instead of (or in addition to) the name.
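A layout-based ID could be sketched as follows. This is not part of the registry above: layout_hash and hash_bytes are hypothetical names, the hash is a hand-written 64-bit FNV-1a rather than get_hash, and the Video_File members are made up. It assumes each entry in Type_Info_Struct.members exposes name, type, and offset_in_bytes.

```jai
#import "Basic";

// Hypothetical stand-in for one of the registered structs.
Video_File :: struct {
    path: string;
    duration_seconds: float64;
}

FNV_OFFSET : u64 : 0xcbf29ce484222325;
FNV_PRIME  : u64 : 0x100000001b3;

// Fold `count` bytes into a running 64-bit FNV-1a hash.
hash_bytes :: (h: u64, data: *u8, count: s64) -> u64 {
    result := h;
    for 0..count-1 {
        result ^= data[it];
        result *= FNV_PRIME;
    }
    return result;
}

// Hash each member's name, byte offset, and size, so the ID changes
// when the layout changes but not when only the struct is renamed.
layout_hash :: (type: Type) -> u64 {
    info := cast(*Type_Info) type;
    assert(info.type == .STRUCT, "layout_hash only supports structs");
    s := cast(*Type_Info_Struct) info;

    h := FNV_OFFSET;
    for s.members {
        offset: s64 = it.offset_in_bytes;
        size:   s64 = it.type.runtime_size;
        h = hash_bytes(h, it.name.data, it.name.count);
        h = hash_bytes(h, cast(*u8) *offset, size_of(s64));
        h = hash_bytes(h, cast(*u8) *size,   size_of(s64));
    }
    return h;
}

main :: () {
    print("Video_File layout hash: %\n", formatInt(layout_hash(Video_File), base = 16));
}
```

The trade-off is that two unrelated structs with identical member names, offsets, and sizes would receive the same ID, which is why combining this with the name hash may be the safer choice.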