I Started Development!
I did a few things, such as:
-
Drew some more concept art.
-
Created a window class for handling the window. Who would have guessed?
window.pollEvents(); window.clear(); window.update(); window.isOpen();
-
Implemented a chunk system to reduce memory usage and allow multithreading.
In order to save memory, chunks containing tiles with the same data are simplified into a UniformChunk.
Tiles, Padding, and Byte Alignment
Tile data is stored like so:
struct MaterialFraction {
uint8_t material_id = 0;
uint8_t mass = 0;
};
struct Tile {
MaterialFraction materials[MAX_MATERIALS_COUNT];
uint8_t materialsLength = 0;
uint16_t temperature = 0; // measured in K (Kelvin)
uint8_t totalMass = 0; // measured in KG (kilograms)
uint8_t electricalConductivity = 0; // arbitrary measurement (0 - 100)
uint8_t thermalConductivity = 0; // arbitrary measurement (0 - 100)
};
I use #pragma pack(push, 1)
before the structure definitions and #pragma pack(pop)
after to eliminate any padding between objects. This causes a minor performance decrease due to byte misalignment; however, the difference is negligible.
Memory usage per tile with and without padding:
Without padding: 26 bytes
With padding: 28 bytes
With 1 billion tiles, which is an extreme scenario, I would save 2 gigabytes of RAM!
Tile Performance Test
I was interested to see whether the difference in performance was noticeable on a large set of tiles, so I created a simple test script. It’s designed to test the speed at which my CPU can access the data contained within each tile.
const size_t NUM = 1'000'000;
Tile* tiles = new Tile[NUM];
std::cout << "sizeof(Tile): " << sizeof(Tile) << " bytes\n";
auto start = std::chrono::high_resolution_clock::now();
for (size_t i = 0; i < NUM; ++i) {
tiles[i].totalMass += 1;
}
auto end = std::chrono::high_resolution_clock::now();
std::cout << "Elapsed time: "
<< std::chrono::duration<double>(end - start).count()
<< " sec\n";
delete[] tiles;
The results of running theS test script on my processor, an AMD64, and using a sample size of 1 million (a total sample size of ~26–28 MB):
Without padding: 0.00428 sec
With padding: 0.00428 sec
NOTE: CPUs like x86-64 handle misaligned 1–2 byte fields very well. Performance on other architectures may suffer.
Conclusion
While creating the underlying framework for Praeceptum’s game engine, I found myself putting serious thought into topics like efficiency and memory management. I enjoyed solving the unique challenges posed by Praeceptum’s engine (such as multi-material tiles). While the engine certainly isn’t finished, I now find myself at least a few steps in what is hopefully the right direction.
Thank you for reading, and enjoy the rest of your day!