Thinking about AI

This is a copy-paste of a slack discussion which we can now continue here.


I was watching this

And thought it might make a more interesting starting point for AI than a Finite State Machine. If we made a simple neural network for each species which had inputs (compound concentration in the environment and in the microbe, certain points on the cell wall that could sense by touching) and outputs (whether to move or release agents) then we could have an AI that evolves with the species it is in.

So I guess how it would work is: every time a member of species A spawns while the player is playing, it gets a variant of the current generation of neural network for that species. Its fitness then gets rated (did it die, did it gain resources, etc.) and once that generation of variants has been run through, the best get selected and mutated for the next generation.
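A minimal sketch of that spawn/rate/select/mutate loop, assuming each brain is just a flat vector of network weights and fitness is computed elsewhere by the game. All the numbers here (genome size, population size, mutation rate) are made-up tuning values:

```python
import random

# Hypothetical constants - these would be tuned for the real game.
GENOME_SIZE = 16      # number of network weights per brain
POP_SIZE = 20         # variants tried per generation
KEEP = 5              # survivors selected for breeding
MUTATION_STD = 0.1    # standard deviation of weight jitter

def random_brain():
    return [random.uniform(-1, 1) for _ in range(GENOME_SIZE)]

def mutate(brain):
    return [w + random.gauss(0, MUTATION_STD) for w in brain]

def next_generation(scored):
    """scored: list of (fitness, brain) pairs from the last generation."""
    scored.sort(key=lambda fb: fb[0], reverse=True)
    parents = [brain for _, brain in scored[:KEEP]]
    # Refill the population with mutated copies of the best performers.
    return [mutate(random.choice(parents)) for _ in range(POP_SIZE)]
```

Each in-game spawn would draw one brain from the current population, and its life outcome would supply the fitness score.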

If we could get it to work well then it would make for an interesting play experience for the player, as the other microbes would learn to act more cleverly around you and fight you better over time.

It would also be a good foundation for similar systems when brains are available.

There are issues. Like in the Mario video the level is the same every time, but in the microbe stage the spawn conditions are different every time, so that might throw off the learning. Also, is it at all reasonable to give a microbe a neural network? Presumably when a real microbe “decides” it’s a chemical network of some kind.

We could build a petri-dish to evolve the ai in where the conditions are the same every time to give it a head start if we needed that.

What do you guys think? Interesting or a waste of time?

I guess in terms of workload making a finite state machine means we have to think a lot about the situations the microbes get in and if we change the game (add new biomes or new compounds) then the ai will need to be changed. If we make something dynamic like this it will work in all conditions.


Cool idea! Given that intelligence is a major evolutionary trait anyway, it’d give the player more of a challenge as well as representing another level of fitness beyond just body functions. My only worries are that such a system would have to tie in with population dynamics which would involve evaluating the comparative successes of behavioural algorithms between species, and that AI would have to be somewhat advanced to start off with and balancing with generation time might prove difficult. The latter could partially be solved by either your petri dish idea, or just giving the AI a simple base to start with (including basic commands like swim away from predators and swim towards compounds) and letting it develop in-game from there.

Player interaction could also play a part, since they’ll be creating their species’ own behavioural system themselves, using similar inputs and outputs with generation-based iterations. There was actually a discussion about this recently on the fan forums of all places which brought up some interesting points:


We’ve actually discussed this before, in this thread, I think:

That’s definitely not to say we shouldn’t discuss it more now - that was a long time ago, long before we had a game which needed AI, and the exact form of AI proposed was a little different. I’m also not going to suggest everyone reads it, as we’re likely to come up with fresher ideas if we start from scratch, but I thought I’d link it for reference. On a side note, that thread was the reason I was asking after Daniferrito a while back, as I kind of wanted to reopen the AI discussion, prompted by the fan forum thread oliver linked.

That aside - for some of the reasons mentioned by tjwhale, and many besides, I’ve always wanted to avoid using a finite state machine, in any way possible. They’re too rigid (especially when we don’t know what most of the creatures in our game are even going to look like, let alone do), rely too much on the imagination of the programmer (i.e.: they’re effectively hard-coded behaviour, something every programmer knows is a bad idea), and are very hard to make interesting.

The question then, is what’s the alternative? The obvious answer is that it needs to be adaptable, and ideally somewhat evolvable, which gives plenty of options. Neural networks are a fairly simple option, and I think what Dani was describing in his thread was a more direct alternative, where rather than evolving a neural structure which responds to simple stimuli, you evolve a function which reacts to multiple stimuli. The end result is similar (one or the other is probably better suited to our application, though I’m not sure which, and I’ll leave that question for another time) - a system which attempts to survive/do well given a situation/stimulus, is selectively bred based on its performance, and eventually develops an optimal response to that stimulus.

The problem (if there is one) is the amount of stimulus needed to evolve a reasonable AI. As mentioned in the video, it took 24 hours to create the solution seen above, and in my own experiments it can take 5-6 hours of solid simulation to get solutions to even much simpler challenges. We don’t have time for that in game, and even if we did, the population of each creature active at any one time will be tiny compared to the numbers required for effective evolution (my 5-6 hour experiment involved a population of around 500, with generations lasting no more than a minute). On top of that comes the sheer complexity of the game, with many potential stimuli both outside (sight/smell/harm/compound gradients/temperature/light…) and inside the cell (compound levels, existing damage, toxin effects), and potential responses. Finally we have the fact that the whole situation isn’t stable, with multiple completely different scenarios possible over the course of one lifetime (scavenging for food when young and weak, avoiding predation, attacking weaker cells, scavenging for rarer nutrients, avoiding hostile environments, etc., etc.), let alone the fact all of this could change drastically both when the species itself evolves, and when every other species around it does.

In short, no, we can’t just use a neural network, it just wouldn’t produce good, or interesting, results. But rather than this being the end of the discussion, I think it’s the start. We would like to use an adaptable system, but what? Does one exist we could just use, or can we adapt an existing one to work in our situation (which is relatively unique in terms of its constraints - i.e.: we want relatively real-time results)? This is a discussion I’ve been wanting to have for years, but it was never the right time to start it.

I’d suggest we start by propping up what the AI can do. As oliver suggests above, we give it basic behaviours to start with (“swim away from hostiles”, “swim towards food”), but we need to go beyond this. I’m not sure how exactly, but now might be a good time to start thinking about it. Can we modularise behaviour? I.e. rather than a neural net where a neuron fires only if it receives above a set stimulus, can we have more complicated behaviour for each neuron? I have some further ideas, but I’d like to hear which direction you guys think we should go first.

A lot of the above concerns also apply to the behaviour editor: if it’s too complicated to evolve a solution to AI behaviour, it’ll be too difficult/tedious for most players to do the same for their species.

I was talking to a friend of mine about this and we came to the idea of an abstract brain that works for all creatures in all stages (up until Society or so). If we can make something like that then it would be pretty powerful. However it would need to be quite abstract.

One way of doing it might be to give a weight to EVERYTHING on the screen all the time. So when you are a microbe you might see another microbe and that microbe might have a pilus. The pilus would give you a -5 but the microbe might give you a +3 as food so overall you would know to avoid it. Maybe when you are hungrier you get a +10 as food so you attack.

That way each species could evolve what it’s weights were (feeding at night or feeding in the day, for example) but the general process of how behaviour would be calculated from the weights would be the same for all creatures in all stages. That way we build it once and then it’ll always work.

I know this is quite abstract, I haven’t got a more specific example as yet of how this would work, just wanted to float it.
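One way the pilus/food weighting above might be sketched: each visible object gets a score from per-species feature weights, with internal state (here, hunger) modulating the value of food. The weight table and the hunger scaling are purely illustrative:

```python
# Hypothetical per-species weight table - these values would be evolved.
SPECIES_WEIGHTS = {"pilus": -5, "food_value": +3}

def score_target(features, hunger=0.0):
    """features: dict of feature -> presence/amount for one visible object."""
    score = 0.0
    for feature, amount in features.items():
        weight = SPECIES_WEIGHTS.get(feature, 0.0)
        if feature == "food_value":
            weight *= (1.0 + hunger)   # hungrier -> food counts for more
        score += weight * amount
    return score
```

A microbe with a pilus scores -5 + 3 = -2 (avoid it), but the same microbe seen while very hungry scores -5 + 12 = +7 (attack), matching the “+10 as food so you attack” example.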

I spent some more time thinking about this. I was thinking that kind of what we want is for each creature to have a relationship with the other creatures on the screen and with its environment, and then choose what to do based on that.

So if you are a microbe alone on the screen (which will never happen as the ai will only be around when the player is around but bear with me) you will decide if you like the concentration of compounds in the area. If you do you will hang around and if you don’t you will move off. That’s your relationship with the environment.

Then say another microbe comes on the screen. You want to be able to judge what sort of relationship to have with the other microbe. Do you want to flee from it, do you want to fight it, do you want to mate with it etc. How you decide that could be based solely on what it looks like (as you have no memory). You could see it has spikes -5 and agents -5 but looks tasty +3 and so you decide it’s an enemy and you should flee.

Maybe you see it and it’s got no spikes or agents and it can photosynthesize, so you decide it’s something you might like to hang around, so you decide to float around near it.

If multiple microbes are on the screen you make a relationship with each one and then decide which relationships are the most important to you. So maybe when a predator comes you all flee. Or maybe you like sticking together with your friends so much that even when predators come you still sit tightly together.

If you are a more complicated creature (like a deer for example) you could have more information. So memory gives you some information about past interaction with that creature. (That way when you improve your brain you could get genuine ai improvements). So if you get better eyes you get more info about something from far away. If you rely on smell you can know to leave an area if predators have been there recently etc.

The weights for different factors (spikes is -3, remembering a kill is -20, etc.) could evolve over time with the species.

And, more importantly, the relationships could be procedural too. So rather than us defining “flee” as a relationship it could be evolved by the organism itself. A bit similar to the blocks based behaviour idea on the fan forums.

So you are floating around and you see a microbe, you add up its component weights and decide to execute relationship X, which causes you to move away from it as fast as you can while spitting out agent. You don’t know what happened. You just know that those inputs caused that behaviour in you.

The player could have access to their own species relationships and could edit them manually which would be quite cool. Other species would have theirs tested and evolved over time.

I don’t know if this allows too much room for stupid behaviour but we could test it and maybe petri-dish it for a while.

I think this system might be flexible enough to work all the way up until the society stage. A cheetah is basically making these relationship decisions with its environment and neighbours just like a microbe is.

I’m interested in feedback, what do you guys think?

Giving a specific weighting to positive and negative influences has been the plan for a while I think, but adaptively changing their values is a new idea, and I love the sound of it. It gives a clear metric for an evolutionary AI system to modify, which should be more effective than just letting it run free given an environment. If the initial state is based on basic actions adapted to fit the new system (if a microbe has a level of -1 or less, swim away), then we can ensure the behavioural evolution at least has a sane starting point. Random initial states would lead to species repeatedly charging into danger and becoming extinct straight away, giving the player an unfair advantage.

You mention how the player might be bound by this system, but that idea doesn’t really make sense or sound fun to me. Like bodily evolution, behavioural change can be thought of as three sections: the player’s individual, the player’s species and other species.

Each one is influenced by the player in different ways. The player’s individual should be directly controlled at all times (in my opinion at least), just like how the player edits an organism’s structure manually. At this level it’s the player trying to survive with an attachment to one particular organism.

I think of the player’s species as a product of indirect control - the player designs it to survive as well as possible without their intervention, just as the player would with the blocks-and-wires behaviour editor. It might not be a good idea to allow the player’s species to develop mentally as the AI would. Success or failure should be tied to the player’s skill at adapting to changes in the environment and NPCs.

Which brings me to the third section: other species. The evolutionary processes of these organisms, whether mental or physical, are the player’s adversaries. They definitely should have neural networks of some kind, with Auto-Evo determining their progression. This would give the player the impression that they’re competing with the environment, and have to keep up with evolutionary change to survive.

Yeah I agree that it’s a good idea to give the ai some freedom to adapt but not let it create insane behaviours.

I also agree the player should control their own individual completely. And that they can have some influence over the ai of their own species. Though that should probably be optional (as in you could automate it if you needed to).

So what you’re talking about is heading in a similar direction to what I was thinking. Rather than have a pure neural network where an output could be as simple as ‘wave left flagella’, you get more complex verbs like ‘flee microbe x’, as well as more complex senses like ‘target microbe has spikes’ rather than simply ‘target mass > x in direction y’.

I’m hoping that we can go a few levels of abstraction further, though I’m not sure how. Perhaps it’s a simplification if we replace ‘flee microbe x if score is < -5’ (with the weights you describe spikes = -3, toxin = -6), with ‘flee all spikes with weighting 3’ and ‘flee all toxin producers with weighting 5’. Perhaps not?
What I am fairly sure about is that we can use boids as basic building blocks. Behaviours like ‘approach’ can be weighted with positive scores for food/friends and negative scores for danger, and we can build custom behaviours for interactions with compound gradients, and combat itself.

That’s a more basic level than what you’ve been thinking about though (?) - given a set of senses to use, and a set of responses you can trigger, how do you link them up? In principle I agree with what you’re saying: everything that can be sensed is given some sort of weighting (which may be evolvable), and apply that to potential responses; probably with different sense-response combinations having different weightings.
Given a sensing range, there’s certainly enough information available for a microbe to choose which nearby cells are threats or food, pick the most important, and move appropriately. Whether you do something beyond simply moving towards/away (such as using offensive/defensive agents) is potentially a separate choice, or may be heavily linked to the first decision. Neural networks give more chance for interactions like that, individual weighted functions less so, though they also don’t exclude that possibility.

I definitely like the idea of these weights being modified by the cell’s status - being more aggressive when hungry for example - as well as environmental stimulus (time of day, temperature).

[quote=“tjwhale, post:3, topic:60”]
So if you are a microbe alone on the screen (which will never happen as the ai will only be around when the player is around but bear with me) you will decide if you like the concentration of compounds in the area. If you do you will hang around and if you don’t you will move off. That’s your relationship with the environment.[/quote]
This is the sort of thing we need to make possible, and again there are several approaches, either we can have a finite state machine detect no other cells present and switch into ‘resource gathering’ mode (no good, for all the reasons in the OP); or we could define some sort of hierarchy of behaviours, with interrupts: ‘gather resources’ is the least urgent default action, which is interrupted by the much more urgent ‘escape predator’. The third option fits my preference for continuous behaviour: microbes are always attempting to gather environmental resources (with a low weighting), and they are always running from predators (with a high weighting), so that when there is no predator present, resource gathering is the dominant driver of behaviour, but that becomes negligible when one appears.
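The continuous option might look something like this: every drive is always active and contributes a weighted direction vector, and fleeing simply swamps gathering when a predator is near. The weights and the distance falloff are hypothetical tuning values:

```python
import math

def blend_drives(my_pos, food_pos, predator_pos=None):
    """Return a movement direction as the weighted sum of active drives."""
    def towards(a, b):
        # Unit vector from a to b, plus the distance between them.
        dx, dy = b[0] - a[0], b[1] - a[1]
        d = math.hypot(dx, dy) or 1.0
        return dx / d, dy / d, d

    fx, fy, _ = towards(my_pos, food_pos)
    vx, vy = 0.2 * fx, 0.2 * fy          # gathering: low, constant weight

    if predator_pos is not None:
        px, py, dist = towards(my_pos, predator_pos)
        urgency = 5.0 / max(dist, 1.0)   # fleeing dominates when close
        vx -= urgency * px
        vy -= urgency * py
    return vx, vy
```

With no predator present the low gathering weight is the only driver; when one appears nearby, its contribution makes the gathering term negligible, with no explicit mode switch.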

Feedback from other cells of your species, or your internal state, can be added into any of the above systems. I’ve experimented with this before, but it’s incredibly difficult to balance manually (i.e.: having ‘run away’ be strong enough when needed, without being overpowering when not, is pretty tough; if you try to add a tendency not to run if in a herd, you’ll probably destroy the original balance, and have trouble both restoring that, and getting good responses in herds which are/aren’t big enough to provide defense). My experiments used almost exclusively linear weightings and (multiplicative) modifiers, so it’s probable that a more intricate control system (such as a neural net, or possibly logarithmic weightings/modifiers) would work better, and that a learning AI would be better at tuning the values - that’s no guarantee though.

The other massive question is just what senses are/should be available? Can a microbe detect whether another cell has spikes, or an internal toxin organelle? Can it even detect its presence without touching it? Compound gradients will give some things away (its presence, leaked toxins from an organelle maybe, some idea of its size), but not everything (note that I’m not suggesting we use this information explicitly, we don’t want to be sampling compound levels for AI behaviour, but we can decide what is reasonably knowable from such information and implicitly sense that). This is something we need to discuss in more detail, but it potentially simplifies the available information, and the resulting behaviours, significantly (for the cell stage). It’s too late at night to go into this further now, so I’ll come back to it next time.

I suggested (on slack) that the player species’ behaviour should follow the same system as the AI’s, so that the building blocks available to the AI would be those available in the behaviour editor. I didn’t mean that the player’s cell should ever behave autonomously, though that might be interesting (in the sense that the player could build something similar to a macro used in an mmo game).

I’m still concerned that we simply won’t have enough replicates available to really train an AI, regardless of whether we use a neural net or something else - that’s still the main problem we need to solve.

I guess one thing that simplifies this a lot, for the microbe stage at least, is that we only have 2 outputs, move and spray agent (also spray which agent? but I think that is a simple choice, and that depends what we want to do about types of agent).

We could go with a sum of simple, 1d, dynamical systems. So basically you have a chart that looks like this

So this is for a friend. When you are at point A you are too close and want to move further away (x’ > 0). When you are at B you are the right distance and are still. When you are at C you are too far and want to move closer, when you are at D you are too far to sense them well and don’t really care.

This would be for a foe

When you are at A you want to run away fast and when you are at B you are relatively chilling.

You could have one of these for each microbe on the screen and the sum of them is the direction you move in.
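The friend/foe charts might be sketched as 1D response functions of distance, where x' > 0 means “increase distance” and x' < 0 means “close in”, summed over everything on screen. The particular curves and constants below are made up:

```python
def friend_response(d, preferred=5.0, sense_range=20.0):
    """x' as a function of distance to a friend."""
    if d > sense_range:
        return 0.0                       # point D: too far to care
    return (preferred - d) * 0.3         # A: push out when too close;
                                         # B: still; C: pull in when far

def foe_response(d, sense_range=20.0):
    """x' as a function of distance to a foe."""
    if d > sense_range:
        return 0.0
    return 10.0 / max(d, 0.5)            # A: flee hard up close; B: relax

def net_response(distances_and_relations):
    """Sum one response per visible microbe: [(distance, 'friend'|'foe')]."""
    funcs = {"friend": friend_response, "foe": foe_response}
    return sum(funcs[rel](d) for d, rel in distances_and_relations)
```

The shapes of these curves (and which curve applies to which relationship) are exactly the parameters that could be left to evolve.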

Choosing whether to spray agent is a relatively simple binary choice. If you are within the agent effective range then you should spray if your relationship with the other microbe is hostile.

Moreover we don’t have to specify what these charts are for, they could be the relationships, and we don’t have to specify what they look like, they could evolve over time.

So for example, you have 3 possible relationships with other microbes, A, B and C. Your relationship to each microbe on the screen is determined by their scores in certain categories. So your calculations look like this

A: spikes +5, agent +4, total score = 9
B: is not of same species, total score = -5
C: tastiness is low, -3, spikes -5, agent -10, total score -18

Therefore you choose to have relationship A with that microbe. Now we know that you have determined they are a predator and not a buddy or prey. But that doesn’t have to be explicit.

Relationship A has the foe graph above, combined with spraying agent if you are in range.

Then another microbe comes on the screen and the calculations are

A: spikes +5, agent +4, total score = 9
B: is of same species, total score = +20
C: tastiness is low, -3, spikes -5, agent -10, total score -18

and so you have relationship B with it. Which means you have the friend graph above and hang around with it.

The relationships you have, what factors they are based on and what your movement graph looks like could all evolve over time.

So maybe this species evolves and adds category D, which identifies cool other species to treat as friends - maybe photosynthesizers with good defences, which it likes to hang around near but firmly out of agent range. Stuff like that.

I guess really this is a sort of procedural FSM because you are picking from a finite set of relationships but with dynamically generated parameters.
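The relationship choice in the worked example above might be sketched as a score-and-argmax: each relationship has evolvable feature weights, and the highest-scoring one is executed. The weight tables mirror the example numbers but are otherwise hypothetical:

```python
# Hypothetical evolvable weight tables, one per relationship.
RELATIONSHIPS = {
    "A": {"spikes": +5, "agent": +4},                        # treat as predator
    "B": {"same_species": +20, "other_species": -5},         # treat as kin
    "C": {"low_tastiness": -3, "spikes": -5, "agent": -10},  # treat as prey
}

def choose_relationship(features):
    """features: set of tags observed on the other microbe."""
    def score(weights):
        return sum(w for tag, w in weights.items() if tag in features)
    return max(RELATIONSHIPS, key=lambda name: score(RELATIONSHIPS[name]))
```

A spiky, agent-bearing stranger scores A = 9, B = -5, C = -18, so relationship A (and its flee graph) wins; the same features on a member of your own species score B = +20, so you hang around instead.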

To discuss some of your points.

It would be good if we could be realistic about what the microbe is able to sense. However I think we may end up writing things like

if the enemy has an agent gland then make it leak agents and have those agents detected by microbes that are close enough and then they know the enemy has agents

or we could just shortcut and let you know if they have agents if you are close enough.

I guess we are missing the level where you might think “I am being poked, I don’t like being poked, I should move away” and that might lead to some interesting non-linear behaviour. Like a group of microbes gets together and one sprays some agent and suddenly they all start spraying agent and moving wildly, which might be quite interesting.

It may be the case that there isn’t enough time to train the microbes. I guess we could pre-evolve the system so when you start playing they all have reasonable behaviours and then as auto evo makes slow changes the behaviours adapt to keep up.

There will be quite a lot of interactions in the game (you will interact with a few hundred or a few thousand microbes every life) and though that may not be enough to train from scratch it is a lot of good info.

This is what I meant when I said implicitly, rather than explicitly, sensing toxin organelles and similar details. We, as developers, decide that something can be sensed due to leaks and compound gradients, but we allow the AI to sense it simply if it’s close enough.

I’m intrigued by the idea of using response functions, as a layer on top of simple decisions - it adds a lot of parameters to optimize (you’re effectively using polynomial functions with 3-4 parameters each), but it allows some relatively smooth adjustments to behaviour, which should be a good thing. Adding the response functions for different targets together has a similar result to what I was thinking, but with a little more detail. The only part I’m not convinced by is how you choose which function to apply (or whether we should even be choosing). A more continuous method would be to give each possible input (spikes, toxins, same species) its own function and sum those up per target - depending on the ratio of inputs to responses that could mean many more parameters, or about the same.

I also think we have a few more possible outputs than just movement and toxins - or rather I hope we will before we call the microbe stage done. A few examples might be some sort of charge/sprint, an attempted engulfment, the choice of toxin, possibly releasing some beneficial toxin to help yourself, or nearby friendlies. At some level we might also have control over what the compound system is doing, by switching off some processes, though that could work automatically as process rates scale with the availability of compounds.

The problem with having a response to features is that you need to have several different modes of operation.

So if you are hungry and you think something is food it doesn’t matter if it has spikes or agents, you want to attack it, even if it hurts you to do so.

Conversely if it is a predator you want to flee.

If (later in the game) it is a potential mate then all of its features, other than that you might mate with it, are basically irrelevant.

It shouldn’t be that as you get hungrier you want to move closer to your predators, or that if your mate develops spikes you want to move further away from them.

So that’s why I think we need some sort of mode switching.

Hey guys,

Here is what I was thinking for how the AI could be implemented, which I think is similar to what @Seregon was proposing. I think the weights that we give to different attributes of both the microbe (hunger, want to reproduce, etc.) and what it sees in others around it (spikes, toxins, attraction to reproduce) should change as a function of some input, i.e. hunger changes as a function of time, but spikes are a function of distance. Spikes can’t hurt me if I’m far, but their danger increases as I get closer. These attribute functions should all have a minimum of 0 and then some maximum that depends on the organism itself, which can be varied as the organisms evolve. So a skittish microbe will run away even if the spikes are far. A ‘horny’ microbe might forgo eating and even risk getting close to an enemy if it means reproducing with a nearby mate.

As the time and distance change, the output of each attribute function changes, but instead of adding them all up to get a final score the organism just listens to the one that is highest. I made a short simulation of this that contains one microbe that moves (blue), one enemy microbe (yellow x) that actively follows our AI, one mating microbe (red) that is stationary and a food source (purple). For simplicity, if the enemy catches the AI I just reset the enemy location.

Short AI simulation

If you watch the video, the AI’s desire to hunt and reproduce increase as a function of time, with hunger increasing faster than randiness. Its desire to flee, however, changes as a function of distance from the enemy. At first the enemy is close and its hunger and randiness are low, so it flees. Eventually though the AI gets so hungry that it no longer cares about fleeing and simply needs to hunt. When it reaches its food source it eats, so its desire to hunt goes to 0. But now it wants to mate way more than it wants to run away so it goes straight for its mate. After it does its deed there are no predators and it’s hungry so it hunts again.

This whole thing might get more complicated as we add in other attributes it has/ can sense. But I think that by not adding up all the attributes to get a final score we avoid the problem that @tjwhale describes, where you might not reproduce with a mate with spikes because the total score says it’s bad. In this case you just stay away until you really need to mate. Also, mating and hunger don’t necessarily have to be a function of time only. It can be a function of distance and time. If I’m not too hungry but an easy meal comes along maybe my desire to hunt shoots up. Maybe I’m not out looking for a mate, but if one passes by…

One more thing. I think making individual relationships with microbes on the screen can cause problems if we aren’t careful. If a weak enemy comes on screen, you say it’s an enemy but not enough of a threat to run away, so you stay. If a second one comes on screen you reach the same conclusion. Now if 10 are on the screen, individually they are all weak so no one enemy results in you fleeing, so you still stay. However, it’s probably in your best interest to run since there are so many of them. In the mechanism I’m proposing, you have a ‘desire to flee’ function which is a result of all the enemies on screen. So 1 strong enemy is equivalent to the sum of many weak enemies. So actions are not based on individual relationships; rather they are based on the environment’s influence on the microbe’s state of mind (its current desire).

Cool post. I really like that it came with a prototype, that’s great!

So ultimately is what you’re suggesting some sort of procedural finite state machine?

So you have a “desire to flee” function. And that looks around the screen and adds up all the things it can see which contribute to that. So you see 5 microbes all with spikes and that gives you a score of +10.

Then you have a “desire to feed” function and you look around and see what is edible around you and you look at your own internal variables and that gives you a +8 so you choose to flee. Have I understood you right?

If so do you think these behaviours should be explicitly defined by us (when you flee move away from the enemy in a straight line etc) or do you think they should be procedural and trained?

What I’m thinking right now is that maybe it’s best to have a set of behaviours with abstract reward functions defined by us. Then when a microbe gets into that state it performs a string of behaviours which are procedurally generated from a general pool. Then those behaviours get mutated and the best ones are chosen.

Ok that’s pretty complicated so an example.

So I am a species of microbe and this is a list of all the possible things I can do

  1. fire agent 1-10
  2. move in direction X
  3. maintain distance from microbe Y

So I have 6 behaviours (which all microbes have, which we gave them in advance), flee, fight, float, flock, graze and hunt.

I look around the environment and decide what behaviour to do, I see a lot of things with spikes and so I choose to flee.

Under flee my current instructions are fire agent 3 and maintain distance 10 from all other microbes. I execute these and see how successful I am.

Over time we evolve BOTH what features I use to decide what behaviour to perform AND what actions I perform when I am doing that behaviour. However the list of behaviours I have is fixed and defined in advance by us.
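The fixed-behaviours/evolvable-actions split might be sketched like this, with the six behaviours given in advance and the parameterised action strings under each one mutating between generations. Parameter names and ranges follow the action list above but are otherwise made up:

```python
import random

# The behaviour set is fixed in advance by us.
BEHAVIOURS = ["flee", "fight", "float", "flock", "graze", "hunt"]

def initial_actions():
    # e.g. flee -> fire agent 3, maintain distance 10 from other microbes.
    return {b: {"fire_agent": random.randint(1, 10),
                "maintain_distance": random.uniform(0, 20)}
            for b in BEHAVIOURS}

def mutate_actions(actions):
    """Jitter the parameters of each behaviour's action string."""
    mutated = {}
    for behaviour, params in actions.items():
        mutated[behaviour] = {
            "fire_agent": min(10, max(1, params["fire_agent"]
                                      + random.choice([-1, 0, 1]))),
            "maintain_distance": max(0.0, params["maintain_distance"]
                                     + random.gauss(0, 1)),
        }
    return mutated
```

The same mutate-and-select loop could also act on the feature weights used to pick a behaviour, so both halves of the system evolve while the behaviour list itself stays fixed.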

Does that make sense? What do you all think?

I think that would be enough evolution to be interesting but it’s structured enough that it could be trained quite quickly. Moreover we could offer seeds for each behaviour which would be starting points which would be reasonable. Also the player could program their own species quite easily, with a list of behaviours and a list of actions that they drag and drop together.

I like the idea of behaviors and actions the way you laid them out. Could we define these behaviors as “instincts” that we all can agree are necessary for survival and thus we don’t need some neural network to run thousands of simulations before it gets to the same conclusion. Then, like you said, the actions taken by the organism based on the instincts can be more open ended and trained through natural selection. By mutating both instincts and actions I think we could get a quick and inexpensive approximation to evolution.

In the current game plan, when would a player not have full control of their organism, and thus have to depend on instincts and actions? I like the idea though. (Idk if y’all have discussed this.) This would be a really cool feature even for the microbe stage. One issue with evolution games is that they model evolution as being linear, when in reality it’s branched and one microbe can be an ancestor to several different species. It would be cool if I mutate my microbe and swim around as it, and the game keeps track of how I play and what my instincts and actions are. However, the microbe I create also spawns as NPCs. At some point, if the player dies or is curious, they can zoom out and see how spawns of their microbe have fared in the tide pool. If I died, or I just think one spawn has evolved into something cool, I can decide to play as it. Obviously this is going to be computationally expensive so it will have to be limited somehow, but I think it would be a neat gameplay feature.

There will be AI-controlled versions of your species swimming around. There’s going to be an organelle which summons other members of your species to you (to help with fighting etc) and that is the precursor to the multicellular stage, where you spend so much time together you fuse.

The plan for the pop-dynamics is to have the ocean split up into “patches”, in each of which there is a CPA system running (Compounds, Population Dynamics and Auto-Evo) which is modelling the entire microbial ecosystem. Your species can spread to other patches and can take different evolutionary routes there, eventually creating branches of your species. However all the members of your species in the same patch are considered genetically identical, so when you mutate one you mutate them all. (Every time you do a mutation 100 million years is going to pass in a leap.)

When you die you can choose to play as another microbe of your species in any patch where your species is present. The patches will be different biomes (coastal, deep ocean, surface ocean, river, volcanic vent etc) and so the conditions will be different in a lot of them.

I like the idea that your actions are being recorded for your ai, that’s a cool idea. It would be really cool to have your species start acting how you act without having to explicitly program it.

The CPA system sounds really cool. Will the player be getting any real time info, or maybe some info in the mutation menu about how well the species is doing in each biome (did we die of starvation/poisoning/attack etc.)?

Maybe you all have already discussed this. Is there a thread where y’all have talked about the CPA system and its features/implementation more in depth? Sorry if this is easy to search for but I haven’t gotten a handle on all the forums yet.

The player should get info from a statistics button on the GUI, with data such as species numbers, death rate, etc. as well as information about the simulated parts of their own microbe.

There’s been loads of discussion on the CPA system in the past - these threads should be good starting points:

As with everything we’ve talked about, ideas from the early discussions are mostly obsolete now unfortunately. @Seregon and @tjwhale can give you a much better idea of the current concept.

Each patch runs its own population model. Each species interacts with the environment (takes up compounds and spits them out) and processes compounds internally. Then there is a food web where different species take compounds off others by predation.

We haven’t talked that much about how this information will be displayed. IMO there should be a separate screen where you see a map of all the patches your species is in, how well it is doing there, and who its predators and prey are (but not the full food web; though it might be cool to let the player see it, it’ll probably be super confusing, especially if there are 300 species there). I guess you could see what weapons the species that kills you most has, although the only two weapons are the pilus and agents; then again, I guess you could evolve specific resistance to different agents.

Moreover we have talked about allowing different patches to evolve the same species in different ways. Until, finally, they are so different they split into two different species and are considered as such. This is all a bit of a pipe dream at the moment.

If you want to work on it, the thing we haven’t really started on is: how do you create a formula to tell you the outcome of hunting attempts between two species? So we want a Lotka-Volterra style model where we know

number of members of species A and B in this patch
organelles of species A and B
compounds stored in A and B as a whole (each species is considered to be made of N identical members who have identical compound stores)

How much of A’s compounds do we transfer to B (and how much to the environment as spillage) as a function of the above inputs? As you can see, a pretty hard problem.
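To make the question concrete, here is the kind of function we’d be hunting for, as a rough Python sketch. Every name and constant is a placeholder, not an agreed formula; in particular `attack_rate` would really be some function of the two species’ organelles (pili, agents, speed...).

```python
# Sketch of a Lotka-Volterra-style predation term. Each species is treated
# as N identical members sharing one compound pool, so a kill releases one
# member's share of compounds, split between the predator and spillage.

def predation_transfer(n_pred, n_prey, prey_compounds, attack_rate, spill_frac):
    """Return (prey killed, compounds to predator, compounds spilled).

    n_pred, n_prey : member counts of predator B and prey A in this patch
    prey_compounds : total compounds stored in species A as a whole
    attack_rate    : encounter/success coefficient (placeholder for some
                     function of both species' organelles)
    spill_frac     : fraction of released compounds lost to the environment
    """
    # Classic mass-action encounter term, capped at the available prey.
    kills = min(attack_rate * n_pred * n_prey, n_prey)
    per_member = prey_compounds / n_prey if n_prey else 0.0
    released = kills * per_member
    return kills, released * (1 - spill_frac), released * spill_frac
```

For example, 10 predators hunting 100 prey holding 500 units of compounds, with `attack_rate=0.001` and 20% spillage, kills one prey per tick and moves 4 units to the predator and 1 unit to the environment.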

We have formulas (CPA master list) for the chemical reactions inside the species.

So getting back on track, I am happy to call the behaviours instincts. Is this a system we are all happy with? @Seregon @whodatXLIV Can you comment on anything you are happy or unhappy with? Have I got the lists right, and is the evolution mechanism reasonable? We’re not having sexual reproduction in this stage, right (is it realistic for microbes to have sexual reproduction)? Should fight and hunt be different? The list of actions seems very short; should there be more there? I’m just freewheeling on the training metric, how can that be improved?

Part 1 : The system

  1. There are a set of “instincts” which are shared by all species, these are

flee, fight, float, flock, graze and hunt.

  2. Every time an AI-controlled member of a species is on the screen it is operating on one of these instincts.

  3. Each instinct has a set of weighted inputs which give the instinct a score. Whichever instinct has the highest score is the instinct the microbe currently employs. The inputs can be any of the following:

compounds in the environment the microbe is in
compounds in other microbes present
compounds in the microbe itself
organelles in other microbes present
organelles in the microbe itself
distance from other microbes
number of other microbes present


For example:

Flee : for each microbe present with spikes +4
Float : for each microbe present with spikes -4; if the environment is rich in resources +3
Graze : for each microbe present with spikes -4; if the environment is poor in resources +3

  4. When operating on an instinct the species performs (and repeats) a sequence of actions, which can be any of the following:

move relative to the closest microbe
move relative to other microbes on the screen (for nice flocking behaviour)
move up or down the gradient of a compound
fire agent
change loaded agent
change internal processes (to provide more energy to run away for example, very important in later stages)


For example:

Flee : move away from the closest microbe
Float : do nothing
Graze : move up the gradient of the most desired compound
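The instinct-selection step above might look something like this in code. The weights are just the example +4 / +3 / -4 values from the instinct list; everything else (function names, the reduced set of three instincts) is an illustrative assumption, not a design decision:

```python
# Minimal sketch of instinct selection: each instinct sums its weighted
# inputs and the highest-scoring instinct wins.

def choose_instinct(n_spiked_nearby, env_rich):
    scores = {
        "flee":  4 * n_spiked_nearby,
        "float": -4 * n_spiked_nearby + (3 if env_rich else 0),
        "graze": -4 * n_spiked_nearby + (3 if not env_rich else 0),
    }
    # argmax over instincts; ties broken by dict order
    return max(scores, key=scores.get)
```

So with two spiked microbes on screen the microbe flees; with none it floats in a rich environment and grazes in a poor one.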

Part 2 : Evolving over time

Either the weighted inputs to the instincts are being evolved OR the actions performed when in an instinct are being evolved but NOT both at the same time.

To evolve input:

Make 20 copies of the ai for species A and mutate the inputs for each instinct slightly in each one.


1 : Flee : for each microbe present with spikes +4
2 : Flee : for each microbe present with spikes +3
3 : Flee : for each microbe present with spikes +4, for each microbe present with agents +2

Then each time a member of that species is spawned it gets the next copy of the AI to be tested. While it is on screen it gets scored based on the following metric (needs improvement):

if in flee if you take damage -1 per unit of damage.
if in fight
if in float -1 per unit of damage, -1 if you run low on compounds
if in flock -1 if you drift too far from a member of your species after being close, -1 if you run low on compounds
if in graze -1 if you take damage
if in hunt

After all the copies have been tested a couple of times then the 5 with the best score are kept, the others are discarded and then the 5 are mutated to form the next generation.

When evolving the actions instead of the inputs the testing metric is the same but it is the actions which are slightly mutated rather than the inputs. The actions and inputs are evolved alternately.
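A minimal sketch of that alternating mutate-test-select loop, using the 20-copies / keep-5 numbers from the text. The genome here is just a dict of instinct-input weights, and the mutation step is an assumed placeholder:

```python
import random

POP, KEEP = 20, 5  # generation size and survivor count, as proposed above

def mutate(weights, rng):
    """Return a copy of the weight set with one weight nudged slightly."""
    w = dict(weights)
    key = rng.choice(list(w))
    w[key] += rng.choice([-1, 1])
    return w

def next_generation(scored, rng):
    """scored: list of (fitness, weights) pairs after a generation's testing.

    Keep the KEEP best copies, discard the rest, and refill the population
    by mutating random survivors.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    survivors = [weights for _, weights in ranked[:KEEP]]
    children = [mutate(rng.choice(survivors), rng) for _ in range(POP - KEEP)]
    return survivors + children
```

On alternate generations the same loop would operate on the action sequences instead of the input weights, per the scheme above.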

Notes: I guess it is the scoring metric which holds the whole system together so that needs to be well designed. You can imagine a situation where, if you swapped the inputs and outputs of fight and float then the whole situation would work the same except for the scoring. Therefore the scoring needs to reward the behaviour we are trying to train the microbe to have.

This system should work well for later stages, I think. It only needs additions rather than rebuilding for a tiger.

There is a problem with the speed of adaptation. If you make 20 copies and each one is tested 3 times before a decision is made, the player needs to see 60 members of a species per behavioural adaptation. Is this a problem? We do want to be able to spawn new species whenever we like with the expectation they will behave at least partially reasonably from the get-go. Maybe we should write some explicit input and output combinations which can be used when a new species is spawned and then adapted over time. That way when a species is spawned it gets something reasonable from the get-go but can still change over time.

@tjwhale, I like your idea. It looks a lot like a simplified neural network—you have the inputs (spikes, fertility, agents…), the weights (the +3 and -4’s you mentioned), and the outputs (flee, fight, float, flock, graze and hunt)—which is very good since we are striving for realism and this is a close approximation to how actual bacteria think. But as you mentioned, speed might be a bit of a problem. In my experience, procedural systems that are based on random mutation are often extremely slow since there is a lot of room for foolish behaviors and you need large population sizes that are heavy on memory and CPU.

We can greatly reduce stupid behaviors by beginning the game with an already working combination of weights i.e., have your system evolve on a computer for a couple of hours for the starting cells and then when you are satisfied with the result save it in a separate file. You can then load this “brain” into every single one of the starting cells at startup. Hopefully, this will prevent cells that swim toward deadly predators and away from easy food.

Now, while on a run today I was thinking about how we could make this system even better. The first thing I thought about was how actual bacteria think and move. Basically, every second they check to see if they are better off than the last second or not. If, for example, they have less energy and more poisons inside their cells than before, they will turn around. That’s right, bacteria have a memory.

This brought me to the idea of having a collective “memory” for AI species. I suggest we model every cell on the screen with a bitstring, where every bit flags the presence or absence of a certain organelle in an enemy cell. For example, 0-1-1-0-0-0-0 could mean that the cell has a pilus and a toxin vacuole, but lacks mitochondria, storage vacuoles, slime, etc. And 1-0-0-1-1-0-0 could mean that it has chloroplasts and storage vacuoles, but lacks a flagellum and a pilus.

We will then use these bitstrings to determine the action of cell A in the following manner. Each species will have a memory in the form of a table with 2 columns. Whenever a member of species A gets killed, the predator’s bitstring is written into the first column. However, if the member of species A eats something tasty, its prey’s bitstring is written into column two.

Now, whenever this species A meets a foreign cell, it decides what to do by looking at its bitstring. If the string is similar or exactly equal to a string in the “Killed by” column, the cell will decide to flee. If it’s similar to something in the “food” column, the cell will attack. And if it’s similar to cell A’s own string it will mate.

We can go even further and add a “risk” or “desperateness” factor. This would basically mean listing the bitstrings in the column by the number of times they get put in there. For example if you get killed by cell B 4 times, by cell C 15 times, and by cell D 8 times, your table would look like this:

Killed by       Count
C’s bitstring   15
D’s bitstring   8
B’s bitstring   4

The cell will then compare a target and see where in the column it is located. If it is in the upper 25% it will flee even if it is super hungry. If it is in the lower 50% it might decide to attack (especially if this bitstring is also located in the “food” column)
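A rough sketch of how the memory tables and similarity test could fit together. The bit layout, the 0.8 similarity threshold, and the "top quartile flees" ranking rule are all assumptions made up for the sketch:

```python
from collections import Counter

def similarity(a, b):
    """Fraction of matching bits between two equal-length bitstrings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def decide(target, killed_by, food, own, threshold=0.8):
    """Pick an action for a foreign cell from the species' memory tables.

    killed_by, food : Counters mapping bitstrings to how often they were
                      recorded as predator / prey
    own             : this species' own bitstring
    """
    # Desperateness factor: flee outright from the most frequent killers
    # (top quartile of the ranked list, at least one entry).
    ranked = [s for s, _ in killed_by.most_common()]
    if target in ranked and ranked.index(target) < max(1, len(ranked) // 4):
        return "flee"
    if any(similarity(target, s) >= threshold for s in killed_by):
        return "flee"
    if any(similarity(target, s) >= threshold for s in food):
        return "attack"
    if similarity(target, own) >= threshold:
        return "mate"
    return "ignore"
```

A cell whose bitstring matches the top of the "killed by" column is fled from even when the cell is hungry; a close match in the "food" column triggers an attack; a near-copy of the species' own string is a mate.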

I believe this idea satisfies all the requirements for the AI that I found in this thread and the one on github, but if I missed something I will think about how I could add it. This “brain” also doesn’t seem like it would take a lot of space and would be fast. I really hope you guys understood everything that I said—I never was too good with explaining my thoughts.

Anyway, I would like to hear your critiques before I start working on a prototype: maybe there is something huge that I completely missed that would render this useless.

@TheCreator I really like your idea, I think it’s very cool. I am 100% on board with the idea of pre-made “brains in a box”, so when you create a new species it’s given an appropriate brain (predator, prey, herd herbivore etc, we could make 10 or 12) which we know works well, and then it can evolve from there. That way the game can handle species creation in a sensible way at any time and know the new species will behave sensibly and still have evolved surprises for the player.

I like the bitstrings, I think that’s cool. It’s a good way of encoding the data. I guess we are trying to encode the same ideas: basically, if you get hurt by a lot of things with spikes then you should build up an aversion to spikes, whether that’s purely a numerical weight or whether it’s “it looks like a thing that hurt me, therefore I am wary of it.”

How does your system handle multiple microbes on the screen at the same time? So say there are 6 microbes and 2 look like food and 4 look like danger, what should I do then? I guess you add up your responses to each? Or maybe fear trumps hunger?

Also how do things leave your lists? Do they fall off after a certain amount of time?

Also what happens if the hunt list is empty? Do you add some seeds that never fall off or do some things get randomly added?

Thinking more generally about instincts I think really we have two instinct categories. One is “stay safe” and that is more important than “gather resources”. However inside these there are different sub states. So for some microbes fleeing is the best way of staying safe but for others it will be sticking with the herd. Under gather resources there is grazing and hunting.

I guess fleeing and fighting something off are kind of the same thing. Because if someone is chasing you and they get close, then you fire agents and stab them with your pili, and that is the same as fighting.

I also think that maybe everything should graze when it’s not hunting and not everything should hunt (as float is a little bit pointless).

So is it true we could boil everything down to


If in danger: flee or herd (whichever your species has chosen)

If not in danger: graze
    if a microbe you could eat comes close enough then hunt it

Because that seems quite simple. Is that really all the behaviour we want from this stage? (later there will be mating which will be under “if not in danger”) If so the bitstrings approach might work really well for determining danger or not and whether to hunt or not.

In fact it would work really well against the player. At first a grazing microbe will stay there and get killed when you get close and attack it, because it hasn’t got any fear of you. However, next time you meet that species it will start to recognise you and will eventually flee as soon as you come on the screen. I think that’s the exact type of behaviour we want from the AI, right? To give the sense that the other species are learning to compete with you.

I had something different in mind for calculating the weighted inputs. It isn’t: microbes with spikes give a +4, and you keep adding them up as more microbes show up. Rather the weights are a function of distance or time or number of species, or all three. These weight functions have a minimum and a maximum and some (probably non linear) rate at which you get from a minimum to a maximum.

So for example microbes with spikes: Spikes cannot hurt me if I’m far away so its weighted input is a function of its distance to me as well as number of spiked microbes present. So if there is one far away from me, or two, or ten I won’t run. They’re too far away and I like where I am. Now if one starts getting closer the weight starts increasing, slowly at first. Who cares if it’s 20 meters away or 19 meters away? But what if its 3 meters away then 2 meters away, clearly much more of a threat, hence the non-linear function of distance. This makes it so that 10 far away is not as frightening as one or two really close. This is the main reason why I don’t like the simple +4 scoring system, it doesn’t capture this dynamic.

So now we have a function describing spiked microbes, and let’s say these guys are getting pretty close. How does the organism decide what to do? Well, our guy has two options: fight or flee. Both fight and flee have their own minimum and maximum functions for spiked microbes and can be handled differently. I think this is important as it adds depth to the species. I’ll illustrate how this depth comes about with a complex example:

So we have a “microbe with spikes” flee input function that depends on distance(d) and number(n) of microbes with spikes. This function has a minimum of 0 and a maximum of 10. The function goes from 0 to 10 slowly as a function of distance, but really really quickly as a function of number.

We also have a “microbe with spikes” fight input function that depends on d and n. This function has a minimum of 0 but a maximum of 9. The function goes from 0 to 9 really quickly as a function of distance, but really slowly as a function of number.

Scenario: There is 1 microbe with spikes and it is getting closer. Well the flee function is more dependent on n and not so much on d, while the fight function depends heavily on d, so the fight function will be getting close to its maximum while the flee will stay close to its minimum. Final decision: Fight!

But wait, now instead of 1 microbe getting closer, some of its friends join in. I may be tough enough to fight one of them, but maybe not tough enough to fight a lot of them. Now as n is increasing my flee function really starts to take off. When I eventually see that there are too many microbes and they’re too close I’ll max out both fight and flee functions, but because my flee function has a higher max than my fight function I’ll choose to run.

The above example shows more depth to an organism’s thought process than simply adding up some set scores. Anyway, evolving would simply mean changing the maxes/mins, as well as the rates of increase. The downside is we would really have to sit and think about what each function depends on. So for example Hunt depends on hunger (which is a function of time), as well as proximity to food (a function of distance), the danger the food possesses, etc.
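Here is a toy version of those saturating weight functions. All the rates and maxima are invented numbers, purely to show the shape of the idea: each weight rises non-linearly from 0 toward its own maximum as threats get closer (d falls) or more numerous (n grows), and flee's maximum (10) sits above fight's (9).

```python
import math

def flee_weight(d, n, max_w=10.0):
    """Rises slowly with closeness but very fast with attacker count."""
    # Extra attackers beyond the first dominate; distance contributes little.
    threat = 0.3 / max(d, 0.1) + 1.2 * max(n - 1, 0)
    return max_w * (1 - math.exp(-threat))

def fight_weight(d, n, max_w=9.0):
    """A single close attacker nearly maxes it out; extra attackers barely add."""
    threat = 3.0 / max(d, 0.1) + 0.1 * max(n - 1, 0)
    return max_w * (1 - math.exp(-threat))
```

With these numbers, one close spiked microbe makes fight outscore flee, while five close ones push flee past fight even though fight rises faster with distance, matching the two scenarios above. Evolving the behaviour then just means mutating the rates and maxima.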

@TheCreator the brain boxes is a really cool idea, and memory could be really interesting to play with. @tjwhale and I had been wondering how to get other members of your species to act like you, and this could be key. I’m wary, though, of using this brain box to completely determine behaviour. Like @tjwhale said, with multiple on-screen foes, food and friends, its behaviour may become unrealistic if it’s simply adding up scores. Instead this memory can be incorporated into the functions method I was talking about earlier. So if organism A has killed me 1 time but organism B has killed me 100 times, then if I encounter A my fight max goes up and my flee max goes down. But if B is in the area my flee max skyrockets and my fight max gets really low. So A I fight and B I run from. If both A and B can’t kill me but I’ve found A is tasty and B isn’t (this could be because A makes more nutrients I need), then my hunt max goes up when A is around but not when B is around. This memory stuff can really add even more depth to an organism’s behaviour.

As for how things leave the list, I believe we can use Markov models for simple, dumb organisms, and as an organism gets smarter it stores more and more information. This would be a really interesting way to evolve intelligence.