Other than reducing the number of entities, I can only think of using Godot’s Servers, as described in “Optimization using Servers” in the Godot Engine (stable) documentation.
While I can’t make a fancy graph like this, I can give you a rough estimate of the performance of Thrive on my system:
RAM: 16 GB DDR3
CPU: AMD FX-8320
GPU: AMD Radeon R7 250 2GB VRAM (Display Resolution: 1920x1080)
FPS Limit: 120
Clouds simulation: 250 ms
Auto-Evo during gameplay: OFF
Used Threads: 8
Main Menu and Patch Map Performance: 120 FPS (usually it is around 211 FPS)
Auto-Evo Report: around 100 FPS probably due to the graphs
Editor: around 90 FPS
Membrane Tab: between 60-70 FPS
During gameplay in the current save the FPS is around 2-12. The save was created in Freebuild Mode and is on generation 16. I managed to increase the performance, probably thanks to the 5 or 6 extinctions I caused in my patch. Currently it has 10 species with various populations (including mine, of course) and the FPS is back to around 30-40.
And here are a few of the things which I wrote in Discord earlier today, I will just leave them as a quote:
I’m starting to get suspicious about what makes the game lag so much, because the moment I’m either saving or waiting for auto-evo to finish, my FPS goes from 3-7 (generation 16 with a freebuild cell) to 50-80:
- The rendering of the clouds probably has a big impact, but does it really? At the beginning of the game you can easily see a lot of clouds and my FPS doesn’t drop so dramatically (49-to FPS at the lowest).
- The number of objects around me is probably far more FPS intensive and causes spikes and FPS drops, but there still shouldn’t be more than 20-30 objects around my cell at most. Then it occurred to me that every cell has 3D objects in it, and the tip of the iceberg is that they (the organelles) are animated as well.
2.1. Does Thrive use occlusion culling? That should solve most of the problems, in my opinion.
2.2. Could we add an option to the Performance tab which disables the organelle animations?
2.3. When zoomed out there is no point in animating the internal cell organelles anyway. If the animations were automatically disabled at a certain zoom level, it would help the performance of the game.
For the occlusion culling, Godot does not support that feature like other engines do. The organelle shader could indeed be made much more performant.
I don’t think we use ambient occlusion, if you are talking about the render effect.
From my profiling results it doesn’t seem that sending rendering commands to the GPU is the bottleneck, nor is waiting for the GPU to process the current frame.
I meant occlusion culling, my bad. It’s an important method for really weak or older GPUs, but as you say, rendering isn’t the bottleneck.
I wrote about ways we could alternatively get around the performance problems, but these ideas are very hard and time consuming to implement:
Just spitballing here, would it be that bad to place an upper limit on the number of species that can spawn per patch?
I wonder how much extra benefit the gameplay receives if instead of 8 species in a patch you have 10, or 12, or more. Plus, we could introduce mechanics to cull underperforming species to free up “slots” for new species to diverge.
We now have an entity cap for how much stuff can spawn in general, and it can be cranked way down. Spawning depends on population so we should still get a somewhat good representation even with the tiniest limit of all species in the patch. I see limiting the number of species as a cruder alternative as their populations could get very high and spawn a ton of stuff if there is no global limit. And we already basically have species limiting per patch as in auto-evo adds huge penalties to species to keep the count from going above a configured limit (I think it’s maybe 10 or 12 currently).
Another spitball idea I had related to this:
It’s clear that bigger cells cause more performance issues, purely because they have more going on (more nodes, more compound processing, etc.). Currently, though, big cells count for exactly the same as small cells as far as the entity limit is concerned. Could we have some kind of weighting so that the bigger the cell, the more “entities” it represents? Obviously we’d have to change the nomenclature, and this will have severe gameplay implications since fewer big cells will spawn. I can also foresee a few other issues, such as the game refusing to spawn any cells at all if the entity cap is set at tiny but all possible species are too big. But I think it’s worth an investigation at least to see how it affects performance.
That’s certainly possible. Still calling the option “entity limit” makes sense, as different kinds of entities naturally have different performance impacts, so not having a direct 1-to-1 impact on some specific entity type is not a problem.
There’s already an issue open about making cells in colonies count differently (though I’m a bit unsure if it should be a discount or actually a penalty):
If I remember right, the spawn system spawns stuff if the limit is not reached, so a single spawn is allowed to go over the limit, but after that it should prevent entities from spawning (with the caveat that there currently seems to be a bug with this).
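A rough sketch of how size weighting could combine with the "spawn while under the limit" rule described above. Everything here is hypothetical: the class, the method names and the "one entity per 10 hexes" ratio are made up for illustration and are not taken from Thrive's code.

```csharp
using System;

// Hypothetical sketch, not Thrive's actual spawn system.
public static class WeightedEntityLimit
{
    // Assumed ratio: every started group of 10 hexes counts as one "entity".
    public const int HexesPerEntityWeight = 10;

    // Bigger cells weigh more; every cell weighs at least 1.
    public static int WeightForCell(int hexCount)
    {
        if (hexCount <= 0)
            return 1;

        // Integer division rounded up
        return (hexCount + HexesPerEntityWeight - 1) / HexesPerEntityWeight;
    }

    // A spawn is allowed whenever the used weight is still below the limit,
    // so a single big cell may overshoot the limit once.
    public static bool CanSpawn(int usedWeight, int limit)
    {
        return usedWeight < limit;
    }
}
```

Because the check only looks at whether the limit has been reached, a species that is "too big" for a tiny limit can still spawn once, which would avoid the case where nothing is ever allowed to spawn.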
The entity limit seems to be the most critical setting for performance, so I’m confident we should also add this option to the new game settings up front. An accompanying tooltip or text should then let the player know about its benefit.
I don’t think it should be part of the new game settings, since it’s nothing to do with the world of a specific game and is instead relevant across multiple games, and can even be changed partway through a game. But I agree it needs to be more prominent at the start of a game. Maybe a pop up when opening the main menu for the first time prompting the player to visit the options menu to set it?
It’s just a matter of discoverability, and I feel this is the most intuitive place. It would be like those options in games that you can set in the new game setup, but where the game also tells you that they can be changed later at any time. Having a pop up in the main menu would probably be jarring, in my opinion.
I disabled organelle graphics. It only gave me an extra 10 FPS in this save (by lowering the number of draw calls by about 800 per frame):
So while rendering a bunch of individual (transparent) organelles does impact performance, removing that entirely does not fix the performance.
So Godot needing to sort and issue draw calls for the individual organelles doesn’t seem to be our biggest performance problem.
One other potential thing people said is that we should try to do something about the fact that each individual organelle is a sphere collision that is added to the overall shape of a cell. Basically what we could do is generate a mesh collision from the organelles and use just 2 shapes one for general and one for pilus collisions (or maybe keeping pilus collisions as is would be easier and not impact performance that much).
But I think it does add up, so if we were to apply a combination of different optimizations (multimesh + frustum culling + entity limit + reducing collision shapes, etc.), that would potentially give a 20 or 30+ FPS increase, which honestly doesn’t seem bad.
Previously in my engulfment revamp PR I made it possible to form a convex collision shape from the shape of the membrane, I think the code is still in the membrane code so it probably can be useful for this.
I think it would be definitely worthwhile to combine a set of several optimizations if together they produce a significant improvement in framerate. We only need to keep performance good until the end of the Microbe Stage, after which metaballs will be able to handle performance much better.
Was also thinking… I know it would be a painful decision, but do we need the reproduction system as it currently stands, where every organelle is split one by one until you reproduce? If it takes a lot of computational power to constantly recalculate the size of cells, couldn’t we switch over to a system where cells simply acquire the required nutrients, and then once they have enough they duplicate all their organelles at once? It would mean only one calculation at the end of a cell’s life cycle, instead of constant recalculations every time it absorbs more nutrients.
It may not seem like much, but just 10 FPS can make a world of difference if you have particularly low framerates. Imagine running the game steadily at 20 FPS; this change would bump you up to 30 or maybe more, making the game go from a slideshow to at least bearable! 30 FPS used to be the industry standard, you know.
As everyone else has said, every little bit can help.
Does that system even have an effect on gameplay? Everything should even out, except maybe cell speed.
It doesn’t take very long to simulate a new membrane shape currently. Only reason organelle splitting causes lag spikes is due to this issue:
Though, this’ll be an entirely different matter when we eventually get 3D membranes with much more expensive computations required.
Also there’s a membrane data caching feature now so each time a membrane is generated, the membrane is stored and can be loaded later. If we don’t care about extra memory use the membrane cache time can be increased to make all intermediate reproduction steps be found in the cache.
I suppose this is a fair point, but the reason why I’m chasing really large improvements is that these kinds of small things will at most allow the game to have like 10 more cells in the game at once, so a performance gain that allows 80 cells instead of 70 actually seems pretty insignificant to me…
Yes. Currently the player has to survive as a partially duplicated cell. Switching to all duplication happening at once basically needs to be made into a reproduction animation that is just played out, instead of doing the current operations all at once (as that would cause big lag spikes if it all happens very quickly).
I was given a save to test multicellular performance. In it I got around 8-15 FPS at the lowest (with the normal entity limit, and with the thread count lowered and the cloud simulation interval increased to get cleaner profiling data). The thing that jumped out at me the most was that the membrane point generation took a bunch of time, and because we have the membrane cache feature, letting the game run for 30+ seconds already got my FPS up to around the 30s, and I even saw 60+ FPS at a few points.
Here are some profiling screenshots (notice how the Godot engine takes 80% of the time, so even our most intensive parts of the code take up just a few percent of the processing time):
Left side shows how our microbe processing code takes the most time.
Here’s the membrane radius being the most expensive part of compound absorbing:
But here another part of the profiling results shows that detecting if a compound is useful (and a pow call) takes up a bunch of time:
So we might get a tiny bit more performance if we didn’t use the pow calculation on this line:
var fractionToTake = 1.0f - (float)Math.Pow(0.5f, delta / Constants.CLOUD_ABSORPTION_HALF_LIFE);
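One way to avoid most of the pow calls without changing the result, sketched under the assumption that many cells are processed with the same delta within a frame (I have not verified that this is how the values arrive). The class name is hypothetical, and the constructor's halfLife parameter stands in for Constants.CLOUD_ABSORPTION_HALF_LIFE:

```csharp
using System;

// Hypothetical helper: remembers the result for the most recent delta so that
// Math.Pow only runs when delta actually changes.
// Not thread-safe; each worker thread would need its own instance.
public sealed class AbsorptionFractionCache
{
    private readonly float halfLife;

    // NaN never compares equal, so the first call always computes
    private float cachedDelta = float.NaN;
    private float cachedFraction;

    public AbsorptionFractionCache(float halfLife)
    {
        this.halfLife = halfLife;
    }

    // Exposed only so the effect of the cache can be observed
    public int PowCalls { get; private set; }

    public float FractionToTake(float delta)
    {
        if (delta != cachedDelta)
        {
            // Same formula as the original line
            cachedFraction = 1.0f - (float)Math.Pow(0.5f, delta / halfLife);
            cachedDelta = delta;
            ++PowCalls;
        }

        return cachedFraction;
    }
}
```

If the deltas turn out to differ between cells (for example because of per-cell update throttling), this cache would rarely hit and would not help.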
Another thing to try might be to limit cells to absorb and emit compounds only 30 times per second.
Here’s the reproduction part expanded. Growing organelles takes a surprisingly long time (well, I guess it is pretty sensible, as the game needs to loop through a ton of organelles to check if they are growing):
Reproduction updates are already limited to 20 times per second, but perhaps an approach where the previously growing organelle is stored would improve performance. I opened an issue to track work on this:
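A minimal sketch of the "remember the previously growing organelle" idea. Organelles are modelled here simply as a list of remaining growth amounts, which is an assumption for illustration and not how Thrive stores them; the class name is hypothetical too.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: resumes the search from the organelle that was found
// growing last time instead of scanning from the start on every update.
public sealed class GrowthCursor
{
    private int lastGrowingIndex;

    // Returns the index of an organelle that still needs to grow,
    // or -1 if none does.
    public int FindGrowing(IReadOnlyList<float> remainingGrowth)
    {
        int count = remainingGrowth.Count;
        if (count == 0)
            return -1;

        // The list may have shrunk since the last call
        if (lastGrowingIndex >= count)
            lastGrowingIndex = 0;

        // Wraps around so organelles before the cursor are still found
        for (int step = 0; step < count; ++step)
        {
            int index = (lastGrowingIndex + step) % count;
            if (remainingGrowth[index] > 0)
            {
                lastGrowingIndex = index;
                return index;
            }
        }

        return -1;
    }
}
```

In the common case the stored organelle is still growing, so the loop exits on its first iteration; the full scan only happens when an organelle finishes.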
Regarding the new engulfment mechanic, it should probably also have a max rate it progresses at (especially as it looks like it’s pretty expensive to update the shader parameters for all of the organelles):
Opened an issue:
And here’s the last screenshot:
What surprised me a bit is that playing a sound effect takes so long, so we probably should have some kind of distance based sound effect cooldown for non-player cells.
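A distance based cooldown could look roughly like this. The names and all three constants are assumptions picked for illustration, and the actual audio playback is left out entirely:

```csharp
using System;

// Hypothetical sketch: the further a cell is from the player, the longer it
// has to wait before it may play another sound effect.
public sealed class DistanceSoundCooldown
{
    // Assumed tuning values, not from Thrive
    public const double BaseCooldown = 0.1;
    public const double SecondsPerDistanceUnit = 0.02;
    public const double MaxCooldown = 2.0;

    private double readyAt;

    public static double CooldownFor(float distanceToPlayer)
    {
        double cooldown = BaseCooldown + distanceToPlayer * SecondsPerDistanceUnit;
        return Math.Min(cooldown, MaxCooldown);
    }

    // Returns true if the sound may be played at time 'now' (in seconds)
    // and starts a new cooldown in that case.
    public bool TryPlay(double now, float distanceToPlayer)
    {
        if (now < readyAt)
            return false;

        readyAt = now + CooldownFor(distanceToPlayer);
        return true;
    }
}
```

Each non-player cell would keep its own instance and only call into the audio system when TryPlay returns true.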
Again, I tested disabling organelle rendering, and it seems to maybe double the performance initially when the game is very laggy, but after that the gain is much smaller.
So there don’t seem to be that many easy performance gains. One more, pretty radical, idea I got (on top of the organelle rendering change in issue #3709, “Investigate if cell (organelle) graphics can be rendered using MultiMesh”, on Revolutionary-Games/Thrive) is: what if we limited cells to process only 20 to 30 times per second? That way most of these expensive things would happen less often, but we could still keep the physics process happening the way it currently is to hopefully keep the gameplay feel the same.
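The rate limiting could be done with a simple time accumulator, sketched below. The class is hypothetical; the important detail is that the accumulated time is handed to the expensive processing, so things like compound absorption would still scale with real elapsed time.

```csharp
using System;

// Hypothetical sketch: lets expensive per-cell processing run at most
// 'updatesPerSecond' times per second while being called every frame.
public sealed class FixedRateThrottle
{
    private readonly float interval;
    private float accumulated;

    public FixedRateThrottle(float updatesPerSecond)
    {
        if (updatesPerSecond <= 0)
            throw new ArgumentOutOfRangeException(nameof(updatesPerSecond));

        interval = 1.0f / updatesPerSecond;
    }

    // Call once per frame with the frame delta. When it returns true,
    // 'elapsed' is the total time since the last processed update.
    public bool ShouldProcess(float delta, out float elapsed)
    {
        accumulated += delta;

        if (accumulated < interval)
        {
            elapsed = 0;
            return false;
        }

        elapsed = accumulated;
        accumulated = 0;
        return true;
    }
}
```

Movement and physics would keep using the normal per-frame delta, and only the throttled parts (absorption, reproduction checks and so on) would be wrapped in a ShouldProcess check.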