I Started Development!

I did a few things, such as:

  • Drew some more concept art.

  • Created a window class for handling the window. Who would have guessed?

    window.pollEvents();
    window.clear();
    window.update();
    window.isOpen();
    
  • Implemented a chunk system to reduce memory usage and allow multithreading.

To save memory, a chunk whose tiles all hold the same data is collapsed into a UniformChunk, which stores that tile data once instead of once per tile.
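Here is a minimal sketch of how that collapsing could work. It is not the engine's actual code: the cut-down Tile, the FullChunk type, TILES_PER_CHUNK, and the helper functions getTile, setTile, and simplify are all hypothetical names made up for this illustration. Only the UniformChunk name comes from the engine. It needs C++17 for std::variant.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <variant>
#include <vector>

// Cut-down stand-in for the real Tile, just enough to compare two tiles.
struct Tile {
    std::uint16_t temperature = 0;
    std::uint8_t totalMass = 0;
};

inline bool sameTile(const Tile& a, const Tile& b) {
    return a.temperature == b.temperature && a.totalMass == b.totalMass;
}

// Hypothetical chunk size; the real value is not stated in this post.
constexpr std::size_t TILES_PER_CHUNK = 32 * 32;

struct UniformChunk { Tile tile; };             // one tile stands for all of them
struct FullChunk { std::vector<Tile> tiles; };  // one entry per tile, on the heap
using Chunk = std::variant<UniformChunk, FullChunk>;

inline Tile getTile(const Chunk& chunk, std::size_t index) {
    if (const auto* uniform = std::get_if<UniformChunk>(&chunk)) {
        return uniform->tile;  // every index reads the same tile
    }
    return std::get<FullChunk>(chunk).tiles[index];
}

inline void setTile(Chunk& chunk, std::size_t index, const Tile& tile) {
    if (const auto* uniform = std::get_if<UniformChunk>(&chunk)) {
        if (sameTile(uniform->tile, tile)) return;  // nothing changes, stay uniform
        const Tile fill = uniform->tile;            // copy before the variant is reassigned
        chunk = FullChunk{std::vector<Tile>(TILES_PER_CHUNK, fill)};
    }
    std::get<FullChunk>(chunk).tiles[index] = tile;
}

// Collapse a full chunk back into a UniformChunk when every tile matches.
inline void simplify(Chunk& chunk) {
    const auto* full = std::get_if<FullChunk>(&chunk);
    if (full == nullptr || full->tiles.empty()) return;
    const Tile first = full->tiles.front();
    const bool allSame = std::all_of(
        full->tiles.begin(), full->tiles.end(),
        [&first](const Tile& t) { return sameTile(t, first); });
    if (allSame) chunk = UniformChunk{first};
}
```

The per-tile storage sits in a std::vector rather than directly inside the variant on purpose: a variant is always as large as its biggest alternative, so an inline array would make every chunk full-sized and save nothing.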


Tiles, Padding, and Byte Alignment

Tile data is stored like so:

struct MaterialFraction {
    uint8_t material_id = 0;
    uint8_t mass = 0;
};

struct Tile {
    MaterialFraction materials[MAX_MATERIALS_COUNT];
    uint8_t materialsLength = 0;
    uint16_t temperature = 0; // measured in K (kelvin)
    uint8_t totalMass = 0; // measured in kg (kilograms)
    uint8_t electricalConductivity = 0; // arbitrary measurement (0 - 100)
    uint8_t thermalConductivity = 0; // arbitrary measurement (0 - 100)
};

I wrap the structure definitions in #pragma pack(push, 1) and #pragma pack(pop) to eliminate any padding between members. Packing leaves the 2-byte temperature field at an odd byte offset, and that misalignment can cause a minor performance decrease; however, the difference is negligible.

Memory usage per tile with and without padding:

Without padding: 26 bytes

With padding: 28 bytes

With 1 billion tiles, which is an extreme scenario, I would save 2 gigabytes of RAM!
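The numbers above can be checked at compile time. The sketch below makes one assumption the post does not state: MAX_MATERIALS_COUNT is taken to be 10, which is the value implied by the 26-byte figure (2 * 10 bytes of materials plus 6 bytes of other fields). PaddedTile, PackedTile, and bytesSaved are hypothetical names used only so both layouts can live in one file. The asserts hold on mainstream compilers (GCC, Clang, MSVC), where uint16_t is 2-byte aligned.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: inferred from the 26-byte packed size, not stated in the post.
constexpr std::size_t MAX_MATERIALS_COUNT = 10;

struct MaterialFraction {
    std::uint8_t material_id = 0;
    std::uint8_t mass = 0;
};

// Default layout: the compiler pads so that temperature is 2-byte aligned.
struct PaddedTile {
    MaterialFraction materials[MAX_MATERIALS_COUNT];
    std::uint8_t materialsLength = 0;
    std::uint16_t temperature = 0;
    std::uint8_t totalMass = 0;
    std::uint8_t electricalConductivity = 0;
    std::uint8_t thermalConductivity = 0;
};

// Same members, packed to 1-byte alignment.
#pragma pack(push, 1)
struct PackedTile {
    MaterialFraction materials[MAX_MATERIALS_COUNT];
    std::uint8_t materialsLength = 0;
    std::uint16_t temperature = 0;
    std::uint8_t totalMass = 0;
    std::uint8_t electricalConductivity = 0;
    std::uint8_t thermalConductivity = 0;
};
#pragma pack(pop)

static_assert(sizeof(PaddedTile) == 28, "one pad byte before temperature, one at the end");
static_assert(sizeof(PackedTile) == 26, "no padding at all");
static_assert(offsetof(PaddedTile, temperature) == 22, "aligned offset");
static_assert(offsetof(PackedTile, temperature) == 21, "odd, misaligned offset");

// Bytes saved by packing for a given number of tiles.
constexpr std::uint64_t bytesSaved(std::uint64_t tileCount) {
    return tileCount * (sizeof(PaddedTile) - sizeof(PackedTile));
}

static_assert(bytesSaved(1'000'000'000) == 2'000'000'000, "2 GB saved per billion tiles");
```

The two padding bytes show up for any materials count, not just 10: the materials array always has an even size, so materialsLength always pushes temperature onto an odd offset.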

Tile Performance Test

I was interested to see whether the difference in performance was noticeable on a large set of tiles, so I created a simple test script. It’s designed to test the speed at which my CPU can access the data contained within each tile.


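#include <chrono>
#include <cstddef>
#include <iostream>

// Tile is the struct defined above, compiled with or without #pragma pack.

int main() {
    const std::size_t NUM = 1'000'000;

    Tile* tiles = new Tile[NUM];

    std::cout << "sizeof(Tile): " << sizeof(Tile) << " bytes\n";

    auto start = std::chrono::high_resolution_clock::now();

    // Touch one field in every tile so the whole array is walked once.
    for (std::size_t i = 0; i < NUM; ++i) {
        tiles[i].totalMass += 1;
    }

    auto end = std::chrono::high_resolution_clock::now();
    std::cout << "Elapsed time: "
              << std::chrono::duration<double>(end - start).count()
              << " sec\n";

    delete[] tiles;
    return 0;
}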

The results of running the test script on my x86-64 (AMD64) processor with 1 million tiles (roughly 26–28 MB of tile data in total):

Without padding: 0.00428 sec

With padding: 0.00428 sec

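NOTE: x86-64 CPUs handle misaligned 1–2 byte fields very well. Performance on other architectures may suffer. Also, the loop only touches totalMass, a single-byte field that can never be misaligned, so this test mostly measures the cost of walking the array rather than the cost of misaligned access.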
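If misaligned access ever turns out to matter on another architecture, one alternative is to reorder the members instead of packing them. This is only a sketch, not what the engine does: ReorderedTile is a hypothetical name, and MAX_MATERIALS_COUNT = 10 is again an assumption inferred from the 26-byte figure. With the 2-byte temperature placed first, every field lands on its natural alignment and the compiler has no padding to insert, so the struct comes out at 26 bytes without #pragma pack.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: inferred from the 26-byte packed size, not stated in the post.
constexpr std::size_t MAX_MATERIALS_COUNT = 10;

struct MaterialFraction {
    std::uint8_t material_id = 0;
    std::uint8_t mass = 0;
};

// Widest member first, single-byte members after it.
struct ReorderedTile {
    std::uint16_t temperature = 0;  // offset 0, naturally aligned
    MaterialFraction materials[MAX_MATERIALS_COUNT];
    std::uint8_t materialsLength = 0;
    std::uint8_t totalMass = 0;
    std::uint8_t electricalConductivity = 0;
    std::uint8_t thermalConductivity = 0;
};

static_assert(sizeof(ReorderedTile) == 26, "same size as the packed layout");
static_assert(alignof(ReorderedTile) == 2, "still 2-byte aligned");
static_assert(offsetof(ReorderedTile, temperature) % alignof(std::uint16_t) == 0,
              "temperature is never misaligned");
```

The trade-off is that the member order no longer mirrors the logical order of the fields, and adding a new member later could reintroduce padding, so the static_asserts are worth keeping next to the struct.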

Conclusion

While creating the underlying framework for Praeceptum’s game engine, I found myself putting serious thought into topics like efficiency and memory management. I enjoyed solving the unique challenges posed by Praeceptum’s engine (such as multi-material tiles). While the engine certainly isn’t finished, I now find myself at least a few steps in what is hopefully the right direction.

Thank you for reading, and enjoy the rest of your day!