Increasing game performance

I was given a save to test multicellular performance. At the lowest I got around 8-15 FPS (with the normal entity limit, and with the thread count lowered and the cloud simulation interval increased to get cleaner profiling data). The thing that jumped out at me most was that membrane point generation took a lot of time. Because we have the membrane cache feature, letting the game run for 30+ seconds already got my FPS up into the 30s, and I even saw 60+ FPS at a few points.

Here are some profiling screenshots (notice how the Godot engine takes 80% of the time, so even the most intensive parts of our code take up just a few percent of the processing time):

Left side shows how our microbe processing code takes the most time.

Here’s the membrane radius being the most expensive part of compound absorbing:

But another part of the profiling results shows that detecting whether a compound is useful (and a pow call) takes up a fair amount of time:

So we might get a tiny bit more performance if we didn’t use the pow calculation on this line:

var fractionToTake = 1.0f - (float)Math.Pow(0.5f, delta / Constants.CLOUD_ABSORPTION_HALF_LIFE);
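One possible way to cut the pow cost without changing the result: the fraction depends only on delta and the half-life constant, so it could be computed once per update and reused for every cell. The sketch below assumes all cells in one update receive the same delta. `AbsorptionFractionCache` is a made-up helper, not existing Thrive code, and it is not thread safe, so with threaded microbe processing it would need one instance per thread or a value computed before the parallel part starts.

```csharp
using System;

// Hypothetical helper (not existing Thrive code): caches the absorption
// fraction so that Math.Pow runs once per distinct delta, not once per cell.
public class AbsorptionFractionCache
{
    private readonly float halfLife;

    // Delta is never negative in practice, so -1 marks "nothing cached yet"
    private float cachedDelta = -1;
    private float cachedFraction;

    public AbsorptionFractionCache(float halfLife)
    {
        this.halfLife = halfLife;
    }

    public float Get(float delta)
    {
        // Exact float comparison is intentional: cells processed during the
        // same update are assumed to be given the identical delta value
        if (delta != cachedDelta)
        {
            cachedDelta = delta;
            cachedFraction = 1.0f - (float)Math.Pow(0.5f, delta / halfLife);
        }

        return cachedFraction;
    }
}
```

Usage would be along the lines of `var fractionToTake = cache.Get(delta);` in place of the line above, with the cache constructed from `Constants.CLOUD_ABSORPTION_HALF_LIFE`.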

Another thing to try might be to limit cells to absorb and emit compounds only 30 times per second.
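A minimal sketch of what such a limit could look like, assuming the limited code gets handed the total elapsed time so that time-dependent amounts (like the absorption fraction) stay correct. `RateLimitedUpdate` is a hypothetical name used only for illustration.

```csharp
// Hypothetical helper: lets expensive logic run at most N times per second
// while still reporting the full time that passed since it last ran.
public class RateLimitedUpdate
{
    private readonly float interval;
    private float accumulated;

    public RateLimitedUpdate(float updatesPerSecond)
    {
        interval = 1.0f / updatesPerSecond;
    }

    // Returns true when the limited logic should run this frame. In that case
    // elapsed is the total time since the previous run, otherwise it is 0.
    public bool ShouldRun(float delta, out float elapsed)
    {
        accumulated += delta;

        if (accumulated < interval)
        {
            elapsed = 0;
            return false;
        }

        elapsed = accumulated;
        accumulated = 0;
        return true;
    }
}
```

With a limit of 30 and a game running at 60+ FPS, roughly every other frame would skip the absorb and emit work entirely, and the frames that do run it would use the larger `elapsed` value in place of `delta`.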


Here’s the reproduction part expanded. Growing organelles takes a surprisingly long time (though I guess it is pretty sensible, as the game needs to loop through a ton of organelles to check if they are growing):

Reproduction updates are already limited to 20 times per second, but perhaps an approach where the previously growing organelle is stored would improve performance. I opened an issue to track work on this:
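A rough sketch of the stored-organelle idea, under the assumption that organelles grow one at a time in list order (the real reproduction logic may differ). All type and member names here are hypothetical stand-ins, not the actual Thrive classes.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for an organelle that grows during reproduction
public class GrowableOrganelle
{
    // Growth progress in the range 0..1
    public float Growth;

    public bool FullyGrown => Growth >= 1.0f;
}

// Remembers where the previous search ended, so that already grown
// organelles are not checked again on every reproduction update
public class GrowthTracker
{
    private readonly List<GrowableOrganelle> organelles;
    private int nextToGrow;

    public GrowthTracker(List<GrowableOrganelle> organelles)
    {
        this.organelles = organelles;
    }

    // Returns the organelle that should grow next, or null when all are done
    public GrowableOrganelle FindGrowing()
    {
        while (nextToGrow < organelles.Count &&
               organelles[nextToGrow].FullyGrown)
        {
            ++nextToGrow;
        }

        return nextToGrow < organelles.Count ? organelles[nextToGrow] : null;
    }

    // Needs to be called whenever the organelle list changes or growth is
    // reset (for example after dividing), as the stored index is then stale
    public void Reset()
    {
        nextToGrow = 0;
    }
}
```

The main risk with this kind of cache is a forgotten `Reset` call, so it only pays off if there are few places that modify the organelle list.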


Regarding the new engulfment mechanic, it probably should also have a max rate it progresses at (especially as it looks like it’s pretty expensive to update the shader parameters for all of the organelles):

Opened an issue:


And here’s the last screenshot:

What surprised me a bit is that playing a sound effect takes so long, so we probably should have some kind of distance based sound effect cooldown for non-player cells.
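One way such a cooldown could work, as a sketch only: the minimum time between sounds from a cell grows with its distance to the player, so far away cells trigger sounds rarely while nearby ones stay responsive. The class name and both tuning parameters are invented for this example.

```csharp
// Hypothetical per-cell helper: the further away from the player a cell is,
// the longer it has to wait before it may play another sound effect.
public class DistanceSoundCooldown
{
    private readonly float baseCooldown;
    private readonly float extraCooldownPerUnit;

    // Negative infinity makes the very first sound always allowed
    private float lastPlayed = float.NegativeInfinity;

    public DistanceSoundCooldown(float baseCooldown, float extraCooldownPerUnit)
    {
        this.baseCooldown = baseCooldown;
        this.extraCooldownPerUnit = extraCooldownPerUnit;
    }

    // now is the current game time in seconds. Returns true if the sound
    // should be played, and in that case starts a new cooldown period.
    public bool TryPlay(float now, float distanceToPlayer)
    {
        float required = baseCooldown + extraCooldownPerUnit * distanceToPlayer;

        if (now - lastPlayed < required)
            return false;

        lastPlayed = now;
        return true;
    }
}
```

The player's own cell would simply skip this check so that its sounds are never dropped.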


I again tested disabling organelle rendering, and it seems to maybe double the performance initially when the game is very laggy, but after that the gain is much smaller.

Disabled graphics:

And enabled:


So there don’t seem to be that many easy performance gains. One pretty radical idea I got (on top of the organelle rendering one: Investigate if cell (organelle) graphics can be rendered using MultiMesh · Issue #3709 · Revolutionary-Games/Thrive · GitHub) is: what if we limited cells to process only 20 to 30 times per second? That way most of these expensive things would happen less often, but we could keep the physics process running the way it currently does to hopefully keep the gameplay feel the same.


It just occurred to me, have we ever thought of reducing the physics FPS to lessen the CPU load? We’re making a game that arguably does not require highly accurate and fast physics interactions the majority of the time, so I’m quite certain this could boost the performance a bit while not massively affecting gameplay.

Right now the value is set to 60 times per second, which is the default. I’m thinking we could lower this to 50 TPS (or maybe even 30 if we’re feeling adventurous). The downside is that a lower physics FPS may result in some stuttering, which fortunately can be counteracted with the physics interpolation that Godot comes with built in.

More reading: Physics Interpolation — Godot Engine (stable) documentation in English.

One big drawback is that as our player movement is tied to physics, reducing the physics simulation rate will directly lower the responsiveness of the game to player input.

Way back with Leviathan I actually had the game set to simulate only 20 updates per second and to smoothly interpolate between the simulated updates (for locally generated updates this used the same code as interpolating data received over the network). I couldn’t really tell that anything was wrong and the game was perfectly fine for me, but Oliver and many other people complained about a really laggy feeling that made the game even unplayable for them. After that I fixed the problem by making the game simulate as many updates per second as it could (though there was still a fixed maximum physics rate which I set to 60 or maybe 75, I can’t remember exactly).
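For reference, the general technique described here (a fixed simulation rate with the rendered state interpolated between the two latest simulated states) can be sketched like this. It is a generic one-dimensional illustration, not the Leviathan code, and the class name and constant-velocity "simulation" are placeholders.

```csharp
// Generic sketch of a fixed update rate with render interpolation. The
// "simulation" here is just a point moving at constant velocity.
public class InterpolatedPosition
{
    private readonly float step;
    private readonly float velocity;

    private float accumulator;
    private float previous;
    private float current;

    public InterpolatedPosition(float updatesPerSecond, float velocity)
    {
        step = 1.0f / updatesPerSecond;
        this.velocity = velocity;
    }

    // Called once per rendered frame. Returns the position to draw.
    public float Update(float frameDelta)
    {
        accumulator += frameDelta;

        // Run as many fixed size simulation steps as the elapsed time allows
        while (accumulator >= step)
        {
            previous = current;
            current += velocity * step;
            accumulator -= step;
        }

        // Blend between the two latest simulated states. The drawn state is
        // therefore always up to one simulation step behind the newest one.
        float alpha = accumulator / step;
        return previous + (current - previous) * alpha;
    }
}
```

Because the drawn state trails the simulation by up to one step, at 20 updates per second that is up to 50 ms of extra delay on top of everything else, which may be part of what people perceived as lag.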

Something we could do if we implement our custom logic is to make it so that physics simulation starts just when we have sent Godot data to be rendered, so we could probably entirely run the physics “for free” by running them while rendering is happening as the physics would probably be ready by the time the game has rendered a frame and is ready to simulate the next update.

Assuming that was due to the update rate being too low, input delays might not even be noticeable at an update rate of 30+ times per second.

I think I’ll open a test PR sometime in the future.

I guess that might be the case, after all 30 Hz is 50% more than 20 Hz update rate…
You’ll definitely want someone to test who found 20 updates per second unacceptably laggy.

I’m currently working on a native code module for Thrive that includes an integration to the Jolt physics engine. As preliminary work I did a specific benchmark scene to validate that it is a good idea.

Here’s how that looks:

And here are the test results (note that the scene setup / rendering performance has been a bit problematic so these initial results are with just 64 microbe placeholder physics bodies at once. I plan on trying bigger tests next week):

Jolt single convex shape per microbe (UPDATE: may actually be the spheres case):

Physics time: 0.006744191 Physics FPS limit: 148.2758, FPS: 1
Physics time: 0.004203023 Physics FPS limit: 237.924, FPS: 30
Physics time: 0.002538978 Physics FPS limit: 393.8593, FPS: 30
Physics time: 0.001526676 Physics FPS limit: 655.018, FPS: 308
Physics time: 0.0006721547 Physics FPS limit: 1487.753, FPS: 361
Physics time: 0.0003296497 Physics FPS limit: 3033.523, FPS: 361
Physics time: 0.0001993665 Physics FPS limit: 5015.888, FPS: 360
Physics time: 0.0001796981 Physics FPS limit: 5564.891, FPS: 360
Physics time: 0.0001652893 Physics FPS limit: 6049.998, FPS: 360
Physics time: 0.0001341461 Physics FPS limit: 7454.556, FPS: 360
Physics time: 0.0001477947 Physics FPS limit: 6766.141, FPS: 360

Jolt combined shape from spheres:

Physics time: 0.0008753056 Physics FPS limit: 1142.458, FPS: 1
Physics time: 0.001244984 Physics FPS limit: 803.2229, FPS: 75
Physics time: 0.001221391 Physics FPS limit: 818.7385, FPS: 75
Physics time: 0.001119852 Physics FPS limit: 892.9752, FPS: 329
Physics time: 0.001057841 Physics FPS limit: 945.3221, FPS: 329
Physics time: 0.00107415 Physics FPS limit: 930.9684, FPS: 316
Physics time: 0.000895411 Physics FPS limit: 1116.806, FPS: 313
Physics time: 0.0009229564 Physics FPS limit: 1083.475, FPS: 313
Physics time: 0.0008093275 Physics FPS limit: 1235.594, FPS: 326
Physics time: 0.0006260973 Physics FPS limit: 1597.196, FPS: 326
Physics time: 0.0005499413 Physics FPS limit: 1818.376, FPS: 355
Physics time: 0.0005800998 Physics FPS limit: 1723.841, FPS: 355
Physics time: 0.0004281088 Physics FPS limit: 2335.855, FPS: 360
Physics time: 0.0005050105 Physics FPS limit: 1980.157, FPS: 360

Godot physics (Bullet) convex shape:

Physics time: 0.003198 Physics FPS limit: 312.6954, FPS: 1
Physics time: 0.004069 Physics FPS limit: 245.7606, FPS: 121
Physics time: 0.004069 Physics FPS limit: 245.7606, FPS: 121
Physics time: 0.003704 Physics FPS limit: 269.9784, FPS: 328
Physics time: 0.003704 Physics FPS limit: 269.9784, FPS: 328
Physics time: 0.002906 Physics FPS limit: 344.1156, FPS: 343
Physics time: 0.002906 Physics FPS limit: 344.1156, FPS: 343
Physics time: 0.001898 Physics FPS limit: 526.8704, FPS: 361
Physics time: 0.001898 Physics FPS limit: 526.8704, FPS: 361
Physics time: 0.001539 Physics FPS limit: 649.7726, FPS: 360
Physics time: 0.001539 Physics FPS limit: 649.7726, FPS: 360
Physics time: 0.001116 Physics FPS limit: 896.0574, FPS: 360
Physics time: 0.001116 Physics FPS limit: 896.0574, FPS: 360
Physics time: 0.001397 Physics FPS limit: 715.8196, FPS: 323
Physics time: 0.001397 Physics FPS limit: 715.8196, FPS: 323
Physics time: 0.000698 Physics FPS limit: 1432.665, FPS: 360
Physics time: 0.000698 Physics FPS limit: 1432.665, FPS: 360
Physics time: 0.001458 Physics FPS limit: 685.871, FPS: 360
Physics time: 0.000517 Physics FPS limit: 1934.236, FPS: 360
Physics time: 0.000517 Physics FPS limit: 1934.236, FPS: 360
Physics time: 0.000629 Physics FPS limit: 1589.825, FPS: 360
Physics time: 0.000629 Physics FPS limit: 1589.825, FPS: 360
Physics time: 0.001151 Physics FPS limit: 868.8097, FPS: 360
Physics time: 0.001151 Physics FPS limit: 868.8097, FPS: 360
Physics time: 0.000678 Physics FPS limit: 1474.926, FPS: 360
Physics time: 0.000678 Physics FPS limit: 1474.926, FPS: 360
Physics time: 0.000968 Physics FPS limit: 1033.058, FPS: 361
Physics time: 0.000968 Physics FPS limit: 1033.058, FPS: 361
Physics time: 0.001221 Physics FPS limit: 819.0009, FPS: 360
Physics time: 0.001221 Physics FPS limit: 819.0009, FPS: 360
Physics time: 0.000909 Physics FPS limit: 1100.11, FPS: 360
Physics time: 0.000909 Physics FPS limit: 1100.11, FPS: 360
Physics time: 0.000792 Physics FPS limit: 1262.626, FPS: 360
Physics time: 0.000792 Physics FPS limit: 1262.626, FPS: 360
Physics time: 0.001548 Physics FPS limit: 645.9948, FPS: 360
Physics time: 0.001548 Physics FPS limit: 645.9948, FPS: 360
Physics time: 0.000693 Physics FPS limit: 1443.001, FPS: 360
Physics time: 0.000693 Physics FPS limit: 1443.001, FPS: 360
Physics time: 0.000644 Physics FPS limit: 1552.795, FPS: 360
Physics time: 0.000644 Physics FPS limit: 1552.795, FPS: 360
Physics time: 0.000998 Physics FPS limit: 1002.004, FPS: 360

Godot physics (Bullet) combined spheres (currently the approach used in the game):

Physics time: 0.011677 Physics FPS limit: 85.63844, FPS: 1
Physics time: 0.006466 Physics FPS limit: 154.6551, FPS: 29
Physics time: 0.006466 Physics FPS limit: 154.6551, FPS: 29
Physics time: 0.00437 Physics FPS limit: 228.8329, FPS: 353
Physics time: 0.000726 Physics FPS limit: 1377.411, FPS: 360
Physics time: 0.000726 Physics FPS limit: 1377.411, FPS: 360
Physics time: 0.000945 Physics FPS limit: 1058.201, FPS: 360
Physics time: 0.000945 Physics FPS limit: 1058.201, FPS: 360
Physics time: 0.000448 Physics FPS limit: 2232.143, FPS: 360
Physics time: 0.000448 Physics FPS limit: 2232.143, FPS: 360
Physics time: 0.000415 Physics FPS limit: 2409.639, FPS: 360
Physics time: 0.000415 Physics FPS limit: 2409.639, FPS: 360
Physics time: 0.000362 Physics FPS limit: 2762.431, FPS: 360
Physics time: 0.000362 Physics FPS limit: 2762.431, FPS: 360
Physics time: 0.000373 Physics FPS limit: 2680.965, FPS: 360
Physics time: 0.000373 Physics FPS limit: 2680.965, FPS: 360
Physics time: 0.000396 Physics FPS limit: 2525.253, FPS: 361
Physics time: 0.000396 Physics FPS limit: 2525.253, FPS: 361
Physics time: 0.001432 Physics FPS limit: 698.324, FPS: 360
Physics time: 0.001432 Physics FPS limit: 698.324, FPS: 360
Physics time: 0.000391 Physics FPS limit: 2557.545, FPS: 360
Physics time: 0.000391 Physics FPS limit: 2557.545, FPS: 360
Physics time: 0.000474 Physics FPS limit: 2109.705, FPS: 360
Physics time: 0.000474 Physics FPS limit: 2109.705, FPS: 360
Physics time: 0.001289 Physics FPS limit: 775.7952, FPS: 360
Physics time: 0.001289 Physics FPS limit: 775.7952, FPS: 360
Physics time: 0.000606 Physics FPS limit: 1650.165, FPS: 360
Physics time: 0.000606 Physics FPS limit: 1650.165, FPS: 360
Physics time: 0.000629 Physics FPS limit: 1589.825, FPS: 360
Physics time: 0.000629 Physics FPS limit: 1589.825, FPS: 360
Physics time: 0.000572 Physics FPS limit: 1748.252, FPS: 360
Physics time: 0.000572 Physics FPS limit: 1748.252, FPS: 360
Physics time: 0.000557 Physics FPS limit: 1795.332, FPS: 360
Physics time: 0.000557 Physics FPS limit: 1795.332, FPS: 360
Physics time: 0.000478 Physics FPS limit: 2092.05, FPS: 360
Physics time: 0.000478 Physics FPS limit: 2092.05, FPS: 360
Physics time: 0.000465 Physics FPS limit: 2150.538, FPS: 361
Physics time: 0.000465 Physics FPS limit: 2150.538, FPS: 361
Physics time: 0.000934 Physics FPS limit: 1070.664, FPS: 360
Physics time: 0.000934 Physics FPS limit: 1070.664, FPS: 360
Physics time: 0.00044 Physics FPS limit: 2272.727, FPS: 360
Physics time: 0.00044 Physics FPS limit: 2272.727, FPS: 360
Physics time: 0.000498 Physics FPS limit: 2008.032, FPS: 360
Physics time: 0.000498 Physics FPS limit: 2008.032, FPS: 360
Physics time: 0.005362 Physics FPS limit: 186.4976, FPS: 353
Physics time: 0.005362 Physics FPS limit: 186.4976, FPS: 353
Physics time: 0.000357 Physics FPS limit: 2801.12, FPS: 360

Jolt single thread (instead of 2) convex:

Physics time: 0.00504744 Physics FPS limit: 198.1202, FPS: 1
Physics time: 0.004344175 Physics FPS limit: 230.1933, FPS: 66
Physics time: 0.002714122 Physics FPS limit: 368.4433, FPS: 66
Physics time: 0.001554479 Physics FPS limit: 643.3024, FPS: 317
Physics time: 0.0006783324 Physics FPS limit: 1474.203, FPS: 317
Physics time: 0.0003822733 Physics FPS limit: 2615.929, FPS: 360
Physics time: 0.0002402095 Physics FPS limit: 4163.032, FPS: 360
Physics time: 0.0002077179 Physics FPS limit: 4814.221, FPS: 360
Physics time: 0.0001857197 Physics FPS limit: 5384.457, FPS: 360
Physics time: 0.0001746437 Physics FPS limit: 5725.942, FPS: 360
Physics time: 0.0001679698 Physics FPS limit: 5953.45, FPS: 360
Physics time: 0.0001844394 Physics FPS limit: 5421.835, FPS: 360
Physics time: 0.000174993 Physics FPS limit: 5714.515, FPS: 360

Here are my quick thoughts:

Turns out that our approach of using combined spheres to make microbe collisions is faster than convex shapes in Godot when nothing is colliding (0.000362 vs 0.000698), whereas when shapes are colliding a lot the convex shape is much faster (0.006466 vs 0.003704).

In Jolt the combined sphere shape is much faster when there are a ton of collisions (0.0008753056 vs 0.006744191) but then it doesn’t reach the maximum performance of the convex shape (0.0004281088 vs 0.0001341461). So it seems a bigger test / testing with microbe colonies will be necessary to pick which is overall the better approach for Thrive: combined sphere collision shape or a convex shape (this disallows holes and concave parts of microbes).

And luckily for me, Jolt is faster (even when running with a single thread, and Jolt scales up to speed up at least to 8 threads). Funnily enough it seems the single thread mode of Jolt is faster when literally all of the bodies are colliding in a big clump, as this likely prevents parallel processing.

As a summary, here’s the first frame (a ton of collisions) and then basically the top performance of each test, to give a quick overview of which approaches are good when there’s an absolute ton of collisions and which work well when collisions are rare:

Jolt single convex shape per microbe (UPDATE: may actually be the spheres case):
Physics time: 0.006744191 Physics FPS limit: 148.2758, FPS: 1
Physics time: 0.0001341461 Physics FPS limit: 7454.556, FPS: 360

Jolt combined shape from spheres:
Physics time: 0.0008753056 Physics FPS limit: 1142.458, FPS: 1
Physics time: 0.0004281088 Physics FPS limit: 2335.855, FPS: 360

Godot physics (Bullet) convex shape:
Physics time: 0.003198 Physics FPS limit: 312.6954, FPS: 1
Physics time: 0.000678 Physics FPS limit: 1474.926, FPS: 360

Godot physics (Bullet) combined spheres (currently the approach used in the game):
Physics time: 0.011677 Physics FPS limit: 85.63844, FPS: 1
Physics time: 0.000362 Physics FPS limit: 2762.431, FPS: 360

Jolt single thread (instead of 2) convex:
Physics time: 0.00504744 Physics FPS limit: 198.1202, FPS: 1
Physics time: 0.0001679698 Physics FPS limit: 5953.45, FPS: 360
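As a reading aid for the numbers above: in every row checked, the "Physics FPS limit" value matches 1 divided by the physics time in seconds. The small sketch below redoes that arithmetic and also expresses a physics time as a share of a 60 updates per second frame budget. The helper names are made up for this example.

```csharp
// Hypothetical helpers for interpreting the benchmark log lines
public static class PhysicsBudget
{
    // The highest update rate that would be possible if physics were the
    // only cost: the reciprocal of the time a single physics step takes
    public static double FpsLimit(double physicsSeconds)
    {
        return 1.0 / physicsSeconds;
    }

    // How large a share (0..1) of one frame the physics step uses when the
    // game targets the given number of updates per second
    public static double FrameShare(double physicsSeconds, double targetFps)
    {
        return physicsSeconds * targetFps;
    }
}
```

For example `PhysicsBudget.FpsLimit(0.006744191)` gives about 148.28, matching the first Jolt line, and `PhysicsBudget.FrameShare(0.011677, 60)` gives about 0.70, meaning the worst Godot combined spheres frame alone would eat around 70% of a 60 FPS frame budget.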


Some non-microbe findings: Godot is pretty slow at rendering a ton of multimesh parts that all need to update constantly. Individual Godot nodes have pretty good frustum culling, which gives a pretty nice FPS bonus. I need to investigate what the optimal way is to set up microbe graphics to move around based on Jolt computed data.


Edit: I just realized I probably mixed up the sphere and convex body creation for Jolt, so mentally flip the numbers.


Even though I am not a programmer and don’t understand much about how to improve game performance, I will say that my computer struggles in the Micro-multicellular stage, likely due to all of the information it has to process. It also doesn’t help that my computer is older, not optimized for heavy games, and has no battery in it at the moment (lol)

Okay so I made a bit of an embarrassing mistake (I had a missing !) and I actually tested the opposite of what I wanted with Jolt. So the numbers for convex are actually for the sphere case and vice versa.

So turns out that sphere collision shapes are actually more efficient by about maybe 25-40% in terms of speed (when not literally all bodies are in a huge clump colliding).

Which is pretty nice as it will be a bit simpler to convert the current code to use Jolt when I don’t need to overhaul the order of operations (for convex generation the membrane shape is needed to be computed one frame earlier for collisions to be created on microbe spawn).

Also the physics cost seems to scale less than linearly with the number of spheres each collision shape consists of (performance is still fine with 60 random mutation steps as compared to 25). Even with 100 mutation steps the simulation still gets around 400 physics frames per second. Though, at this point it is starting to be the case that convex collisions are more efficient. So maybe microbe colonies are the point where the cost of sphere collisions explodes and convex shapes are needed?

I’ll give that some thought as I’ll now start trying to get the microbe benchmark (which requires most of the normal microbe stage logic to be working) working with the new physics.


Update: with Godot physics using spheres, the massive 100 step test results in a physics FPS limit of around 80-95, so about a fourth of the Jolt performance. Using Godot convex bodies results in 80-105 FPS (staying on average a bit closer to the higher end). I think this close performance shows that it was not a mistake to originally design Thrive to create microbe collisions out of spheres rather than using a convex body.

So as things get bigger the gap between Jolt and Godot performance stays, and even gets bigger. I think this might give some clues as to why the multicellular performance is especially complained about as it might get almost linearly worse when you have 10 big cells glued together into one physics body.
