Thinking about AI

Cool post. I really like that it came with a prototype, that’s great!

So ultimately is what you’re suggesting some sort of procedural finite state machine?

So you have a “desire to flee” function, which looks around the screen and adds up all the things it can see that contribute to it. Say you see 5 microbes, all with spikes, and that gives you a score of +10.

Then you have a “desire to feed” function: you look around to see what is edible near you, check your own internal variables, and that gives you a +8, so you choose to flee (since 10 beats 8). Have I understood you right?
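Just to check my reading, in code the selection step could be as simple as scoring each desire and taking the maximum. A minimal sketch (all names invented for illustration):

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Desire {
    std::string name;
    double score;  // e.g. flee = +10 from five spiked microbes, feed = +8
};

// Pick the behaviour with the highest score; ties go to the first entry.
inline std::string chooseBehaviour(const std::vector<Desire>& desires) {
    auto best = std::max_element(
        desires.begin(), desires.end(),
        [](const Desire& a, const Desire& b) { return a.score < b.score; });
    return best->name;
}
```

So five spiked microbes giving flee a +10 would beat feed’s +8, and the microbe flees.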

If so, do you think these behaviours should be explicitly defined by us (when you flee, move away from the enemy in a straight line, etc.) or do you think they should be procedural and trained?

What I’m thinking right now is that maybe it’s best to have a set of behaviours with abstract reward functions defined by us. Then when a microbe gets into that state it performs a string of behaviours which are procedurally generated from a general pool. Then those behaviours get mutated and the best ones are chosen.

OK, that’s pretty complicated, so here’s an example.

So I am a species of microbe, and this is a list of all the possible things I can do:

  1. fire agent 1-10
  2. move in direction X
  3. maintain distance from microbe Y

So I have 6 behaviours (which all microbes have, which we gave them in advance), flee, fight, float, flock, graze and hunt.

I look around the environment and decide what behaviour to perform. I see a lot of things with spikes, so I choose to flee.

Under flee my current instructions are fire agent 3 and maintain distance 10 from all other microbes. I execute these and see how successful I am.

Over time we evolve BOTH what features I use to decide what behaviour to perform AND what actions I perform when I am doing that behaviour. However the list of behaviours I have is fixed and defined in advance by us.

Does that make sense? What do you all think?

I think that would be enough evolution to be interesting, but it’s structured enough that it could be trained quite quickly. Moreover, we could offer seeds for each behaviour: reasonable starting points. Also the player could program their own species quite easily, with a list of behaviours and a list of actions that they drag and drop together.

I like the idea of behaviors and actions the way you laid them out. Could we define these behaviors as “instincts” that we all can agree are necessary for survival, so we don’t need some neural network to run thousands of simulations before it reaches the same conclusion? Then, like you said, the actions taken by the organism based on the instincts can be more open-ended and trained through natural selection. By mutating both instincts and actions I think we could get a quick and inexpensive approximation to evolution.

In the current game plan, when would a player not have full control of their organism, and thus have to depend on instincts and actions? I like the idea though. (Idk if y’all have discussed this.) This would be a really cool feature even for the microbe stage. One issue with evolution games is that they model evolution as linear when in reality it’s branched, and one microbe can be an ancestor to several different species. It would be cool if I mutate my microbe, swim around as it, and the game keeps track of how I play and what my instincts and actions are. However, the microbe I create also spawns as NPCs. At some point, if the player dies or is curious, they can zoom out and see how spawns of their microbe have fared in the tide pool. If I died, or I just think one spawn has evolved into something cool, I can decide to play as it. Obviously this is going to be computationally expensive, so it will have to be limited somehow, but I think it would be a neat gameplay feature.

There will be AI-controlled versions of your species swimming around. There’s going to be an organelle which summons other members of your species to you (to help with fighting etc.), and that is the precursor to the multicellular stage, where you spend so much time together you fuse.

The plan for the pop-dynamics is to have the ocean split up into “patches”, in each of which there is a CPA system running (Compounds, Population Dynamics and Auto-Evo) which models the entire microbial ecosystem. Your species can spread to other patches and can take different evolutionary routes there, eventually creating branches of your species. However, all the members of your species in the same patch are considered genetically identical, so when you mutate one you mutate them all. (Every time you do a mutation, 100 million years is going to pass in a leap.)

When you die you choose to play as another microbe from your species in any patch where your species has presence. The patches will be different biomes (coastal, deep ocean, surface ocean, river, volcanic vent etc) and so the conditions will be different in a lot of them.

I like the idea that your actions are being recorded for your AI; that’s a cool idea. It would be really cool to have your species start acting how you act without having to explicitly program it.

The CPA system sounds really cool. Will the player be getting any real time info, or maybe some info in the mutation menu about how well the species is doing in each biome (did we die of starvation/poisoning/attack etc.)?

Maybe you all have already discussed this. Is there a thread where y’all have talked about the CPA system and its features/implementation more in depth? Sorry if this is easy to search for but I haven’t gotten a handle on all the forums yet.

The player should get info from a statistics button on the GUI, with data such as species numbers, death rate, etc. as well as information about the simulated parts of their own microbe.

There’s been loads of discussion on the CPA system in the past - these threads should be good starting points:



As with everything we’ve talked about, ideas from the early discussions are mostly obsolete now unfortunately. @Seregon and @tjwhale can give you a much better idea of the current concept.

Each patch runs its own population model. Each species interacts with the environment (takes up compounds and spits them out) and processes compounds internally. Then there is a food web where different species take compounds off others by predation.

We haven’t talked that much about how this information will be displayed. IMO there should be a separate screen where you see a map of all the patches your species is in, how well they are doing there, and who their predators and prey are (but not the full food web; though it might be cool to let the player see it, it’ll probably be super confusing, especially if there are 300 species there). I guess you could see what weapons the species that kills you most has; however, the only two weapons are pilus and agents, though I guess you could evolve specific resistance to different agents.

Moreover we have talked about allowing different patches to evolve the same species in different ways. Until, finally, they are so different they split into two different species and are considered as such. This is all a bit of a pipe dream at the moment.

If you want to work on it, the thing we haven’t really started on is this: how do you create a formula that tells you the outcome of hunting attempts between two species? We want a Lotka-Volterra style model, so we know:

number of members of species A and B in this patch
organelles of species A and B
compounds stored in A and B as a whole (each species is considered to be made of N identical members who have identical compound stores)

How much of A’s compounds do we transfer to B (and how much to the environment as spillage) as a function of the above inputs? (As you can see, a pretty hard problem.)
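Purely as a strawman to get the discussion going, here is one shape such a formula could take: a logistic “hunt success rate” in the difference between attack/defence scores (which could be derived from organelles), times an encounter term in the two populations, Lotka-Volterra style. Every name and coefficient here is an invented placeholder, not an agreed formula:

```cpp
#include <cassert>
#include <cmath>

// successRate: chance a member of B catches a member of A, from relative scores.
// Logistic in the score difference, so it always stays in (0, 1).
inline double huntSuccessRate(double attackScoreB, double defenceScoreA) {
    return 1.0 / (1.0 + std::exp(-(attackScoreB - defenceScoreA)));
}

// Compounds moved from species A to predator B in one timestep.
// popA, popB: member counts; storeA: A's total compound store;
// spillage: fraction lost to the environment on each kill.
inline double compoundsToPredator(double popA, double popB, double storeA,
                                  double attackB, double defenceA,
                                  double encounterRate, double spillage) {
    double kills = encounterRate * popA * popB * huntSuccessRate(attackB, defenceA);
    double perMember = (popA > 0.0) ? storeA / popA : 0.0;
    return kills * perMember * (1.0 - spillage);
}
```

The `popA * popB` encounter term is the standard Lotka-Volterra mass-action assumption; everything else (the logistic, the spillage fraction) is just one possible choice.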

We have formulas (CPA master list) for the chemical reactions inside the species.

So, getting back on track, I am happy to call the behaviours instincts. Is this a system we are all happy with? @Seregon @whodatXLIV can you comment on anything you are happy or unhappy with? Have I got the lists right, and is the evolution mechanism reasonable? We’re not having sexual reproduction in this stage, right (is it realistic for microbes to have sexual reproduction)? Should fight and hunt be different? The list of actions seems very short; should there be more there? I’m just spitballing on the training metric; how can that be improved?

Part 1 : The system

  1. There are a set of “instincts” which are shared by all species; these are

flee, fight, float, flock, graze and hunt.

  2. Every time an AI-controlled member of a species is on the screen, it is operating on one of these instincts.

  3. Each instinct has a set of weighted inputs which give the instinct a score. Whichever instinct has the highest score is the one the microbe currently employs. The inputs can be any of the following

compounds in the environment the microbe is in
compounds in other microbes present
compounds in the microbe itself
organelles in other microbes present
organelles in the microbe itself
distance from other microbes
number of other microbes present

Example:

Flee : for each microbe present with spikes +4
Float: for each microbe present with spikes -4, if the environment is rich in resources + 3
Graze: for each microbe present with spikes -4, if the environment is poor in resources + 3

  4. When operating on an instinct the species performs (and repeats) a sequence of actions; these can be any of the following

move relative to the closest microbe
move relative to other microbes on the screen (for nice flocking behaviour)
move up or down the gradient of a compound
fire agent
charge?
engulf
change loaded agent
change internal processes (to provide more energy to run away for example, very important in later stages)

Example:

Flee : move away from the closest microbe
Float : do nothing
Graze : move up the gradient of the most desired compound
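A minimal sketch of the scoring-and-selection part of the system above, using the example weights from the flee/float/graze example (the struct and function names are invented):

```cpp
#include <cassert>
#include <map>
#include <string>

struct Inputs {
    int spikedMicrobesPresent;
    bool environmentRichInResources;
};

// Score each instinct from its weighted inputs (weights from the example above).
inline std::map<std::string, double> scoreInstincts(const Inputs& in) {
    std::map<std::string, double> s;
    s["flee"]  =  4.0 * in.spikedMicrobesPresent;
    s["float"] = -4.0 * in.spikedMicrobesPresent
                 + (in.environmentRichInResources ? 3.0 : 0.0);
    s["graze"] = -4.0 * in.spikedMicrobesPresent
                 + (in.environmentRichInResources ? 0.0 : 3.0);
    return s;
}

// The instinct with the highest score is the one currently employed.
inline std::string currentInstinct(const Inputs& in) {
    auto scores = scoreInstincts(in);
    std::string best;
    double bestScore = -1e300;
    for (const auto& kv : scores)
        if (kv.second > bestScore) { best = kv.first; bestScore = kv.second; }
    return best;
}
```

Evolving the inputs would then just mean mutating the weight constants; evolving the actions is a separate table keyed on the chosen instinct.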

Part 2 : Evolving over time

Either the weighted inputs to the instincts are being evolved OR the actions performed when in an instinct are being evolved but NOT both at the same time.

To evolve input:

Make 20 copies of the AI for species A and mutate the inputs for each instinct slightly in each one.

Example:

1 : Flee : for each microbe present with spikes +4
2 : Flee : for each microbe present with spikes +3
3 : Flee : for each microbe present with spikes +4, for each microbe present with agents +2
etc

Then each time a member of that species is spawned it gets the next copy of the AI to be tested. While it is on screen, it gets scored based on the following metric (needs improvement):

if in flee: -1 per unit of damage taken
if in fight:
if in float: -1 per unit of damage taken, -1 if you run low on compounds
if in flock: -1 if you drift too far from a member of your species after being close, -1 if you run low on compounds
if in graze: -1 if you take damage
if in hunt:

After all the copies have been tested a couple of times, the 5 with the best scores are kept, the others are discarded, and the 5 are mutated to form the next generation.

When evolving the actions instead of the inputs the testing metric is the same but it is the actions which are slightly mutated rather than the inputs. The actions and inputs are evolved alternately.
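The keep-5-of-20-and-mutate cycle could be sketched like this (the 20/5 numbers are from the text; the mutation size and how survivors are cloned to refill the population are invented details):

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

struct AiCopy {
    std::vector<double> weights;  // inputs OR actions, whichever is being evolved
    double score = 0.0;           // accumulated from the per-instinct metric
};

// Keep the best 5 of 20, then refill by mutating the survivors slightly.
inline std::vector<AiCopy> nextGeneration(std::vector<AiCopy> pop, std::mt19937& rng) {
    std::sort(pop.begin(), pop.end(),
              [](const AiCopy& a, const AiCopy& b) { return a.score > b.score; });
    pop.resize(5);  // survivors
    std::normal_distribution<double> jitter(0.0, 0.5);
    std::vector<AiCopy> next = pop;
    while (next.size() < 20) {
        AiCopy child = pop[next.size() % 5];         // clone survivors round-robin
        for (double& w : child.weights) w += jitter(rng);
        child.score = 0.0;                           // children start unscored
        next.push_back(child);
    }
    return next;
}
```

Whether the survivors themselves are re-mutated or kept verbatim (as here) is a design choice; keeping them verbatim means the best-known AI is never lost between generations.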

Notes: I guess it is the scoring metric which holds the whole system together, so that needs to be well designed. You can imagine a situation where, if you swapped the inputs and outputs of fight and float, the whole system would work the same except for the scoring. Therefore the scoring needs to reward the behaviour we are trying to train the microbe to have.

This system should work well for later stages, I think. It only needs additions rather than rebuilding for a tiger.

There is a problem with the speed of adaptation. If you make 20 copies and each one is tested 3 times before a decision is made, the player needs to see 60 members of a species per behavioural adaptation. Is this a problem? We do want to be able to spawn new species whenever we like, with the expectation that they will behave partially reasonably from the get-go. Maybe we should write some explicit input and output combinations which can be used when a new species is spawned, and then they can be adapted over time. That way, when a species is spawned it gets something reasonable from the get-go but can still change over time.

@tjwhale, I like your idea. It looks a lot like a simplified neural network—you have the inputs (spikes, fertility, agents…), the weights (the +3 and -4’s you mentioned), and the outputs (flee, fight, float, flock, graze and hunt)—which is very good since we are striving for realism and this is a close approximation to how actual bacteria think. But as you mentioned, speed might be a bit of a problem. In my experience, procedural systems that are based on random mutation are often extremely slow since there is a lot of room for foolish behaviors and you need large population sizes that are heavy on memory and CPU.

We can greatly reduce stupid behaviors by beginning the game with an already working combination of weights i.e., have your system evolve on a computer for a couple of hours for the starting cells and then when you are satisfied with the result save it in a separate file. You can then load this “brain” into every single one of the starting cells at startup. Hopefully, this will prevent cells that swim toward deadly predators and away from easy food.

Now, while on a run today, I was thinking about how we could make this system even better. The first thing I thought about was how actual bacteria think and move. Basically, every second they check to see if they are better off than they were the second before. If, for example, they have less energy and more poisons inside their cells than before, they will turn around. That’s right, bacteria have a memory.

This brought me to the idea of having a collective “memory” for AI species. I suggest we model every cell on the screen with a bitstring, where every single bit describes an organelle. For example, 0-1-1-0-0-0-0 could mean that the cell has a pilus and a toxin vacuole, but lacks mitochondria, storage vacuoles, slime, etc. And 1-0-0-1-1-0-0 could mean that it has chloroplasts and storage vacuoles, and lacks a flagellum and a pilus.

We will then use these bitstrings to determine the action of cell A in the following manner. Each species will have a memory in the form of a table with 2 columns. Whenever a member of species A gets killed, the predator’s bitstring is written in the first column. However, if the member of species A eats something tasty, its prey’s bitstring is written into column two.

Now, whenever this species A meets a foreign cell, it decides what to do by looking at its bitstring. If the string is similar or exactly equal to a string in the “Killed by” column, the cell will decide to flee. If it’s similar to something in the “food” column, the cell will attack. And if it’s similar to cell A’s own string it will mate.
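The “similar or exactly equal” test could be a Hamming distance on the organelle bitstrings, i.e. the number of bits in which two strings differ. A minimal sketch, where the similarity threshold is an invented parameter:

```cpp
#include <bitset>
#include <cassert>
#include <string>
#include <vector>

using Bits = std::bitset<7>;  // one bit per organelle type, as in the examples

// Hamming distance: number of organelle bits in which the two cells differ.
inline std::size_t distance(const Bits& a, const Bits& b) {
    return (a ^ b).count();
}

// "Similar" = within `threshold` differing bits of some remembered string.
inline bool matchesMemory(const Bits& seen, const std::vector<Bits>& memory,
                          std::size_t threshold = 1) {
    for (const Bits& m : memory)
        if (distance(seen, m) <= threshold) return true;
    return false;
}

inline std::string decide(const Bits& seen, const std::vector<Bits>& killedBy,
                          const std::vector<Bits>& food, const Bits& own) {
    if (matchesMemory(seen, killedBy)) return "flee";   // fear checked first
    if (matchesMemory(seen, food))     return "attack";
    if (distance(seen, own) <= 1)      return "mate";
    return "ignore";
}
```

Checking the “Killed by” column first (fear trumps hunger) is my assumption, not something decided above; the ordering could itself be an evolvable trait.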

We can go even further and add a “risk” or “desperateness” factor. This would basically mean ordering the bitstrings in each column by the number of times they get put in there. For example, if you get killed by cell B 4 times, by cell C 15 times, and by cell D 8 times, your table would look like this:

C
D
B

The cell will then compare a target and see where in the column it is located. If it is in the upper 25% it will flee even if it is super hungry. If it is in the lower 50% it might decide to attack (especially if this bitstring is also located in the “food” column)

I believe this idea satisfies all the requirements for the AI that I found in this thread and the one on GitHub, but if I missed something I will think about how I could add it. This “brain” also doesn’t seem like it would take a lot of space, and it would be fast. I really hope you guys understood everything that I said; I never was too good at explaining my thoughts.

Anyway, I would like to hear your critiques before I start working on a prototype: maybe there is something huge that I completely missed that would render this useless.

@TheCreator I really like your idea; I think it’s very cool. I am 100% on board with the idea of pre-made “brains in a box”, so when you create a new species it’s given an appropriate brain (predator, prey, herd herbivore etc.; we could make 10 or 12) which we know works well, and then it can evolve from there. That way the game can handle species creation in a sensible way at any time, knowing new species will behave sensibly + have evolved surprises for the player.

I like the bitstrings, I think that’s cool. It’s a good way of encoding the data. I guess we are trying to encode the same ideas: basically, if you get hurt by a lot of things with spikes then you should build up an aversion to spikes, whether that’s purely a numerical weight or whether it’s “it looks like a thing that hurt me, therefore I am wary of it.”

How does your system handle multiple microbes on the screen at the same time? So say there are 6 microbes and 2 look like food and 4 look like danger, what should I do then? I guess you add up your responses to each? Or maybe fear trumps hunger?

Also how do things leave your lists? Do they fall off after a certain amount of time?

Also what happens if the hunt list is empty? Do you add some seeds that never fall off or do some things get randomly added?

Thinking more generally about instincts I think really we have two instinct categories. One is “stay safe” and that is more important than “gather resources”. However inside these there are different sub states. So for some microbes fleeing is the best way of staying safe but for others it will be sticking with the herd. Under gather resources there is grazing and hunting.

I guess fleeing and fighting something off are kind of the same thing. Because if someone is chasing you and they get close, then you fire agent and stab them with your pili, and that is the same as fighting.

I also think that maybe everything should graze when it’s not hunting and not everything should hunt (as float is a little bit pointless).

So is it true we could boil everything down to

{

    if in danger: flee or herd (whichever your species has chosen)

    if not in danger: graze {
        if a microbe you could eat comes close enough then hunt it
    }

}

Because that seems quite simple. Is that really all the behaviour we want from this stage? (Later there will be mating, which will be under “if not in danger”.) If so, the bitstrings approach might work really well for determining whether you’re in danger and whether to hunt.

In fact it would work really well against the player. So at first you see a grazing microbe, and when you get close and attack it, it will stay there and get killed, because it hasn’t got any fear of you. However, the next time you meet that species it will start to recognise you and will eventually flee as soon as you come on the screen. I think that’s the exact type of behaviour we want from the AI, right? To give the sense that the other species are learning to compete with you.

I had something different in mind for calculating the weighted inputs. It isn’t: microbes with spikes give a +4, and you keep adding them up as more microbes show up. Rather the weights are a function of distance or time or number of species, or all three. These weight functions have a minimum and a maximum and some (probably non linear) rate at which you get from a minimum to a maximum.

So for example microbes with spikes: spikes cannot hurt me if I’m far away, so their weighted input is a function of their distance to me as well as the number of spiked microbes present. So if there is one far away from me, or two, or ten, I won’t run. They’re too far away and I like where I am. Now if one starts getting closer, the weight starts increasing, slowly at first. Who cares if it’s 20 meters away or 19 meters away? But if it’s 3 meters away, then 2 meters away, it’s clearly much more of a threat; hence the non-linear function of distance. This makes it so that 10 far away are not as frightening as one or two really close. This is the main reason why I don’t like the simple +4 scoring system: it doesn’t capture this dynamic.

So now we have a function describing spiked microbes, and let’s say these guys are getting pretty close. How does the organism decide what to do? Well, our guy has two options: fight or flee. Both fight and flee have their own minimum and maximum functions for spiked microbes and can be handled differently. I think this is important, as it adds depth to the species. I’ll illustrate how this depth comes about with a complex example:

So we have a “microbe with spikes” flee input function that depends on distance(d) and number(n) of microbes with spikes. This function has a minimum of 0 and a maximum of 10. The function goes from 0 to 10 slowly as a function of distance, but really really quickly as a function of number.

We also have a “microbe with spikes” fight input function that depends on d and n. This function has a minimum of 0 but a maximum of 9. The function goes from 0 to 9 really quickly as a function of distance, but really slowly as a function of number.

Scenario: There is 1 microbe with spikes and it is getting closer. Well the flee function is more dependent on n and not so much on d, while the fight function depends heavily on d, so the fight function will be getting close to its maximum while the flee will stay close to its minimum. Final decision: Fight!

But wait, now instead of 1 microbe getting closer, some of its friends join in. I may be tough enough to fight one of them, but maybe not tough enough to fight a lot of them. Now as n is increasing my flee function really starts to take off. When I eventually see that there are too many microbes and they’re too close I’ll max out both fight and flee functions, but because my flee function has a higher max than my fight function I’ll choose to run.

The above example shows more depth to an organism’s thought process than simply adding up some set scores. Anyway, evolving would simply mean changing maxes/mins, as well as rates of increase. The downside is we would really have to sit and think about what each function depends on. So for example Hunt depends on hunger (which is a function of time), as well as proximity to food (a function of distance), the danger the food possesses ??, etc.
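For concreteness, one possible shape for these input functions is a product of two logistic curves, one in distance and one in number, scaled between the min and max. All the constants and parameter names here are illustrative, not a proposal:

```cpp
#include <cassert>
#include <cmath>

// Weight rises from `minW` to `maxW` as distance shrinks and count grows.
// dHalf/nHalf set where each curve is halfway; dRate/nRate set steepness,
// so a "fast in d, slow in n" fight function and a "slow in d, fast in n"
// flee function are just different parameter choices.
inline double weightedInput(double dist, double count,
                            double minW, double maxW,
                            double dHalf, double dRate,
                            double nHalf, double nRate) {
    double closeness = 1.0 / (1.0 + std::exp(dRate * (dist - dHalf)));   // ->1 when near
    double crowding  = 1.0 / (1.0 + std::exp(-nRate * (count - nHalf))); // ->1 when many
    return minW + (maxW - minW) * closeness * crowding;
}
```

Evolving the function then means mutating eight numbers per input, which fits the earlier mutate-and-select scheme without needing arbitrary function shapes.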

@TheCreator the brain boxes are a really cool idea. And memory could be really interesting to play with. @tjwhale and I had been wondering how to get other members of your species to act like you, and this could be key. I’m wary, though, of using this brain box to completely determine behaviour. Like @tjwhale said, with multiple on-screen foes, food and friends, its behaviour may become unrealistic if it’s simply adding up scores. Instead this memory can be incorporated into the functions method I was talking about earlier. So if organism A has killed me 1 time but organism B has killed me 100 times, then if I encounter A my fight max goes up and my flee max goes down. But if B is in the area, my flee max skyrockets and my fight max gets really, really low. So A I fight and B I run from. If neither A nor B can kill me, but I’ve found A is tasty and B isn’t (this could be like A makes more nutrients I need), then my hunt max goes up when A is around but not when B is around. This memory stuff can really add even more depth to an organism’s behaviour.

As for how things leave the list, I believe we can use Markov models for simple, dumb organisms, and as an organism gets smarter it stores more and more information. This would be a really interesting way to evolve intelligence.

  1. I really like the idea of including distance in the calculations; that’s a good thought.

  2. I think we are all agreed that what we want is some sort of Finite State Machine: the instincts are the states, there are finitely many of them, and the machine switches between them based on inputs and, when in them, produces outputs.

  3. The more I think about it the more I like the bitstrings idea. One of the few pieces of info we can get is “what killed me” or even “what damaged me”. Using that info to say “the thing which killed me had spikes, therefore avoid things with spikes; the thing which killed me had flagella, therefore avoid things with flagella” etc. I think makes less sense than simply saying “avoid the thing that killed you”. When encountering a new microbe the two systems work much the same: “look at the things which damaged you in the past and try to work out from that if this thing will damage you”.

I guess the reason I think it’s better is that you are going to be able to make a distinction between things which are actively hunting you as opposed to things that occasionally jab you. So “yes it has spikes but it only mildly damaged you one time so maybe it’s ok to be close to it, the other thing has spikes and it killed you 25 times, watch out”. In the “avoid spikes” system the problem is that you will start to avoid everything that has spikes and that may be a mistake.

Later in the game it’s going to be important to be able to tell the difference between a heavily armed herbivore and a carnivore as the heavily armed herbivores actually offer you protection rather than harm.

Does that make sense? I’m wondering if the two systems are functionally identical but I think this might be a difference between them.

  4. The bitstrings combined with distances to each microbe could make a really robust system, I think.

  5. As I mentioned above, I don’t think there is much of a difference between standing and fighting and fleeing. Later there will be (so maybe we should plan for that), but spikes work if you get close, and firing agent works better if you are moving away than if you are still (it makes them swim through a cloud of it), so basically, in this stage, there is no reason to stand and fight that I can think of.

Would love to know what you guys think, we’re really making good progress here, thanks for all the input.

Okay, so starting from bottom up, I was thinking we could have a sort-of “shyness” variable that could evolve. This would influence the behaviors of organisms when they don’t know what the species they met is. This could lead to predators that attack anything that they see unless they know that it will hurt them, and herbivores that flee anything unless they know for sure it isn’t a threat.

I was originally thinking that things would always remain on the list, but I like @whodatXLIV’s idea about allowing a set amount of space to mimic intelligence.

As for multiple microbes on the screen, real cells move by looking at food (or poison) gradients. So I thought that for each microbe we could do a simple calculation:

Vector2D netMovement(0, 0);
for (each enemy microbe on screen)
{
    // gradient weight decays exponentially with distance
    Vector2D movementDirection = CalculateAction( exp(-distanceToMicrobe), typeOfMicrobe );
    netMovement += movementDirection;
}

Basically, it calculates a direction of movement for each piece of food and each cell based on the type of the object and based on the gradient, which exponentially decreases with distance, and adds them all together. However, I am not sure how well this will work.

The whole CalculateAction function will decide whether the cell needs to flee, herd, graze, etc… and based on this it will return a direction and speed of movement. So even if you see a deadly predator you wouldn’t spend all your ATP running away from it if it is far away.
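A self-contained version of that net-movement idea might look like this (the `Vector2D` struct, the food/not-food flag, and the simple attract/repel rule are stand-ins for whatever `CalculateAction` really decides):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vector2D { double x = 0.0, y = 0.0; };

struct Microbe {
    Vector2D pos;
    bool isFood = false;  // stand-in for the real type/memory lookup
};

// Sum a contribution from every microbe on screen: towards food, away from
// everything else, with strength falling off exponentially with distance.
inline Vector2D netMovement(const Vector2D& self, const std::vector<Microbe>& others) {
    Vector2D net;
    for (const Microbe& m : others) {
        double dx = m.pos.x - self.x, dy = m.pos.y - self.y;
        double dist = std::sqrt(dx * dx + dy * dy);
        if (dist == 0.0) continue;  // avoid dividing by zero when overlapping
        double strength = std::exp(-dist) * (m.isFood ? 1.0 : -1.0);
        net.x += strength * dx / dist;  // unit direction scaled by gradient
        net.y += strength * dy / dist;
    }
    return net;
}
```

Because the weight decays exponentially, a nearby food item dominates a distant predator, which matches the “don’t spend all your ATP running from something far away” behaviour you describe.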

Yeah, that makes sense. I didn’t want to use the memory thing to completely determine behavior; I mostly made it to add an extra level of realism to what was described before I joined in.

Yeah, this is exactly what I meant by my previous post. Whenever the AI then meets a species it doesn’t know yet, it will just do a bitstring comparison to decide how best to act.

Edit: For the time being, please ignore everything I said about me writing a prototype. I have run into a bit of a problem where all 20 different members of a species bunch up into 1 member, because they all have a similar brain: if they are in the same vicinity they go in the same direction, swim after the same food source, and run away from the same predators. A simple collision detection function should lessen this, but I have just realized that I have no idea how to do that. I tried something that I thought should work, but apparently it doesn’t…

I originally planned to edit my previous post, but I guess what I am about to say warrants another one.

I’ve worked a bit on a proof of concept for the bitstring memory idea. In the video you see below I started with 115 “cells”. I have split them up into 6 species, each of which can eat only a certain kind of cells. For example, CellC, which is the brown one, can only eat the green bacteria, and the black and the red cells; it simply goes through the orange ones without killing them (it is unable to digest the compounds in them) and it is eaten by the large purple ones.

In the beginning, these cells have absolutely no idea what life is—they start moving in a randomly chosen direction until they hit something. Whenever a cell hits something, it looks at what happens next. If it gets eaten, the code of the predator gets put in a memoryKill vector; on the other hand, if it eats something, the code gets put in a memoryFood vector. These vectors remain with the cell after it respawns in another randomly chosen place on the map.

Next time this cell moves around, it calls a GetClosestObject function every frame. This function iterates through every single cell inside a population vector until it finds the closest one. Originally I planned on using a net movement that was affected by every single cell in the vicinity, but this completely failed on me and cell behavior became really erratic (although it is possible I just messed something up, so if someone else wants to give this a go, go ahead). So at the moment, every cell’s action depends only on the closest living thing to it, be it food or foe. I don’t like this, but coding anything else is outside my programming reach at the moment; I only recently learned C++.

After finding the closest one, the cell checks to see if its genetic id number matches anything in its Kill or Food column. Since I only have 6 different species and no organelles that could be used to compare bits, I didn’t use the bit-string method described above; I used a simple integer 1-6 for an id. Nevertheless, I think that if necessary this should be a fairly easy change. If the id is in memoryKill, the cell moves in the opposite direction. If it’s in the food column, it moves in that direction. If the cell has never seen this id before, it doesn’t change its previous direction (although I think this could easily be changed with a “shyness” bool variable that tells it to either keep the same direction or go the opposite way).

One other bad thing I noticed in my prototype is that there is no fighting mechanism. Most cells spend their whole lives running away in a straight line from a predator who is chasing them. I suppose this could be fixed by either giving different species different speeds or by having the fleeing cell turn around and fight if it feels its resources running low and it has been running away for a really long time. So basically have thresholds, which I think @tjwhale mentioned: a hungry animal might attack a big predator that killed it multiple times if it is desperate enough.

Anyway, here is the aforementioned prototype after a couple minutes of offscreen simulation:

Edit: Ehhh, my video recording software is acting buggy… apparently my free trial ran out. Anyone know a good one?

Re video capture on Windows I use Microsoft Expression Encoder, it’s free and works fine.

Prototype sounds cool. I’d be interested in seeing the video.

Edit: I’m going to rewrite this prototype because I have a much better idea, so feel free to skip this post; it’ll be outdated quite quickly.

I made a prototype today, just based on some very simple principles and I think it’s quite good. It’s based on a lot of the ideas in this thread, memory from the creator and distances and adding up dangers from whodat.

The AI-controlled microbe is the large one in the middle of the screen: when it’s scared it goes red/pink, when it’s hunting it goes green, and when it’s doing nothing it goes grey. If it gets trapped in a corner it is teleported back to the middle of the arena (as this happens quite a lot and is a boring way to die).

On the right-hand side of the screen you can see two columns of dots. The left column is what the other microbes actually are: red is predators (who hunt the microbe and kill it if they get too close), green is prey (who flee the microbe and die if they get hunted), and blue is neutral, which the microbe has no interest in.

Basically, each time it has an interaction (either it gets eaten, or it eats something, or it tries to eat something and can’t) it records the result in its memory. Its memory degrades over time (and so you can see it forget after a while). Sometimes it will forget about a predator and charge in to hunt it, but it quickly learns!

The first minute is one setup (with a lot of reds, where it spends most of its time running away) and after 1:00 is another setup where there are more prey. As you can see, it quickly learns what to hunt. The first minute isn’t great (it’s too overwhelmed by fear), but the second setup is better; it gets some chances to hunt.

This needs some more testing (like giving every microbe the same intelligence and strengths and seeing what happens). However, I would like to argue that this would be really good AI for the player to play against. What it means is that at first this microbe won’t run away when it is weaker than you (in fact it may even hunt you), and you get some easy kills. However, it learns, and when you next turn up it will flee (or flock, but that’s another story). If you are faster than it, then you can still keep hunting it when it is at maximum fear; however, if it is faster than you, you won’t be able to kill it again until it forgets about you, and then you get another free shot.

I think this is a really good balance between “all microbes fleeing as soon as you enter the screen every time” (which is boring) and “all microbes just sitting there and getting eaten” (which is boring). Somehow memory and forgetting give the AI a nice rhythm to play against, a perpetual cat and mouse with a slightly dumb mouse.

It’s important to remember the main goal is to make AI that makes the game fun, not necessarily AI that is super clever.

Anyway what do you all think?

PS: Code is in the prototypes Git repository.

Thanks, but I ended up just using CamStudio, which is free and works fine.

Anyway, here are the awaited videos. The first one is minutes 2-3 of my prototype running, while the second one is minutes 10-12.

They are basically the same thing, but the second video is better in my opinion, since the memories of the cells are more advanced (they know more things). Also, I remind you that there are 115 cells running in this simulation, so as you can see the algorithm is very fast and could easily be applied to over a thousand cells. Huh, let me try and see how far I can stretch the population number before the program becomes slow and jagged.

Edit: lol, I just realized you can completely see my taskbar in the first video :confused: Oops.

As for @tjwhale’s video, it is very close to mine, which is very good since it means we are on the right track. I wanted to add a bit of commentary, but I guess I’ll wait until your next prototype.

I look forward to your comments.

That’s really cool. I like it. I think we are going down the same road as you say.

How about trying it with different species being different speeds and seeing what that looks like?

Also, when you say: “So at the moment, every cell’s action is completely dependent only on the closest living thing to it, be it food or predator. I don’t like this, but coding anything else is outside my programming reach at the moment; I only recently learned C++.”

Firstly, what you’re doing is really good; keep it up. I’m not much of a programmer, but I know sticking with it is the way to get better. If you wanted to track more objects, would it be possible to call GetClosestObject on everything excluding the result of the last call? That would get you the second-closest object.
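That exclusion idea could be sketched like this (all names hypothetical; pass the previously returned indices back in to get the second, third, etc. closest):

```cpp
#include <limits>
#include <vector>

struct Cell { float x, y; };

// Closest cell whose index is not in `exclude`. Put your own index in
// `exclude` to start, then append each result to get the next-closest.
int GetClosestExcluding(const Cell& self, const std::vector<Cell>& pop,
                        const std::vector<int>& exclude) {
    int best = -1;
    float bestDistSq = std::numeric_limits<float>::max();
    for (int i = 0; i < static_cast<int>(pop.size()); ++i) {
        bool skip = false;
        for (int e : exclude)
            if (e == i) { skip = true; break; }
        if (skip) continue;
        float dx = pop[i].x - self.x, dy = pop[i].y - self.y;
        float distSq = dx * dx + dy * dy;
        if (distSq < bestDistSq) { bestDistSq = distSq; best = i; }
    }
    return best;  // -1 if everything was excluded
}
```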

Thanks :slight_smile: I’ll try it with different speeds tomorrow and post my findings. Also, my algorithm starts to break down around 500 cells, which I think is a fairly good result. We probably won’t have more than 5-10 microbes on the screen, but this means we will also be able to simulate microbes outside the player’s reach to create a seamless transition. The lag you can see in my video is actually from my recording software, not the actual simulation.

Calling GetClosestObject multiple times would work, but what I meant was that I don’t really know what to do with all those different direction vectors. Should I add them up, weighted by distance? Check to see if there is a predator nearby and, if so, run away? I would probably confuse myself in all those if and switch statements if I tried to code that.

Anyway, I look forward to your prototype. Should I try to draw a code flowchart of what we expect for the final AI simulation in Thrive, based on the aforementioned ideas?

Sure go for it. I’ll happily take a look.

This is my newest prototype, I think it works really well. It’s based on the following ideas.

You have a memory which has, for each species, 20 slots. So you remember the last 20 interactions you had. You get a 1 if you kill the other microbe and a 0 if you die. From this you can determine your observed probability of winning the next fight (sum of wins / number of observations).
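In code, the rolling 20-slot memory and the observed probability might look something like this (a sketch with made-up names, not the exact prototype code):

```cpp
#include <cstddef>
#include <deque>

// Rolling memory of the last 20 fights against one species:
// 1.0 = we killed them, 0.0 = they killed us.
struct FightMemory {
    std::deque<double> outcomes;  // at most 20 entries

    void Record(double result) {
        const std::size_t kSlots = 20;  // one memory slot per past fight
        outcomes.push_back(result);
        if (outcomes.size() > kSlots) outcomes.pop_front();  // forget the oldest
    }

    // Observed probability of winning the next fight:
    // sum of wins / number of observations.
    double ObservedWinChance() const {
        if (outcomes.empty()) return 0.5;  // no data yet: assume a coin flip
        double sum = 0.0;
        for (double r : outcomes) sum += r;
        return sum / outcomes.size();
    }
};
```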

You also have a strength memory (just a number per species) which is how strong you think each species is (including yourself). From this you can derive your expected probability of winning the next fight.

Each time you fight (and get a new observation) you compare your expected value with the observed value. If the expected value was too high, you lower what you think your own strength is and increase what you think their strength is. Vice versa, if your expected value was too low, you conclude you are stronger and they are weaker than you predicted.
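Here is one way the expected value and the update rule could be written; the learning rate, the clamping, and the default of 50 are my own assumptions, not something pinned down in the post:

```cpp
// What one microbe believes about a fight pairing; these are perceived
// strengths, not the actual values displayed above the microbes' heads.
struct Beliefs {
    double myStrength = 50.0;     // self-image, starts at the average
    double theirStrength = 50.0;  // default guess for any species

    // Expected win chance, matching the fight model:
    // P(win) = mine / (mine + theirs).
    double ExpectedWinChance() const {
        return myStrength / (myStrength + theirStrength);
    }

    // Compare expectation with the observed outcome (1 = won, 0 = lost)
    // and nudge both estimates toward the evidence.
    void Update(double observed, double learningRate = 5.0) {
        double error = ExpectedWinChance() - observed;
        // Expected too high: we are weaker, and they stronger, than we thought.
        myStrength -= learningRate * error;
        theirStrength += learningRate * error;
        if (myStrength < 1.0) myStrength = 1.0;  // keep strengths positive
        if (theirStrength < 1.0) theirStrength = 1.0;
    }
};
```

A nice side effect: a microbe that has lost a lot of fights carries a low `myStrength` into its first meeting with a stranger, so it flees even with no history against that species.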

In this simulation each microbe has an actual strength value (displayed above its head) which determines the outcome of the fights: you win if a random number between 0 and the sum of your actual strength and their actual strength is less than your actual strength. But this is not important; the outcomes of the fights could come from the actual game being played and the system would still work fine. The species would still try to sort out who is strong compared to them and who is weak.
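That fight rule, as described, works out to P(A wins) = strengthA / (strengthA + strengthB); as a sketch:

```cpp
#include <random>

// Resolve a fight from the two *actual* strength values. A uniform draw
// on [0, sA + sB) lands below sA with probability sA / (sA + sB).
bool AWinsFight(double strengthA, double strengthB, std::mt19937& rng) {
    std::uniform_real_distribution<double> roll(0.0, strengthA + strengthB);
    return roll(rng) < strengthA;
}
```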

As you can see in the video, it takes a bit of training but quite quickly they learn to hunt things weaker than them and flee things stronger than them. The microbes are green when they are hunting, grey when they are indifferent, and red when they are scared/fleeing. On the right-hand side you can see into the brain of the first microbe: the column of numbers on the left is the actual strength of the microbes (with the microbe you are considering at the top left) and on the right are the microbe’s guesses as to how strong the others are.

As you can see, there are some crazy times when the 31 hunts the 74 because it hasn’t yet worked out that the 74 is much stronger (you learn a lot more by winning a fight than by losing one). However, they sort themselves out.

I reset the system at 1:30 and you can see their period of insanity (all green, just charging at each other) is quite long, but then they learn: 76 hunts 23 and 14, 23 hunts 14, and 14 only flees.

Hope you like it, all thoughts and comments welcome.

Edit:

Having thought a little more about this I think this system has some good advantages,

  1. You could assign partial results to encounters. So I was using 1 for A kills B and 0 for B kills A, but you could have 0.7 for A does more damage to B than B does to A, etc. If you really wanted to take it further you could record several parameters, like nutrients before and after as well as damage, and that would help the memory a lot.

  2. This system works well for species you haven’t met yet. Say I fight ten species and lose every fight, and you fight ten different species and win every fight. If we meet, I will assume that you have strength 50 (as I have no information, I assume you are average), whereas I have a low opinion of my own strength (it’s down to 10 or something like that), so I flee. However, you think I am average (as you have no information) but you know you are strong (your opinion of your own strength is 100), so you will attack me because you think I am likely to be weaker than you.

Therefore two species who have never met will still react in a sensible way to each other because they have learnt about how strong or weak they are in general, as well as against other, specific, species.

  3. One extension would be to remember the mode of engagement as well as the outcome. So if you see another microbe, you remember: I have fought it 20 times; 10 of those times I stood and fought and 10 of those times I ran away, and I did better running away, so this time I will run. This means you could try different behaviours and assign yourself a different strength for each one.

  4. Later in the game we could offer quite a lot of memory upgrades for when you upgrade your brain. So you could have 40 rather than 20 slots for remembering encounters, or have more pieces of information remembered, or learn new behaviours to try as well as the ones you have. You could also evolve slower forgetting, which would be quite helpful. Or even location dependence, so that as an alligator you could figure out that if you fight in a swamp you win a lot more than if you fight on land; on land you will be tempted to run, but in the swamp you will be more tempted to fight.

These are just ideas.

Yes! It looks great and we are definitely getting somewhere. Now for some comments:

  1. Rather than having the organisms reset in the middle if they are stuck in a corner, could you do something like this?

    if (m_Pos.x > WindowWidth) m_Pos.x = 0;
    if (m_Pos.x < 0) m_Pos.x = WindowWidth;
    if (m_Pos.y > WindowHeight) m_Pos.y = 0;
    if (m_Pos.y < 0) m_Pos.y = WindowHeight;

This isn’t something we would want for the actual game, but it creates an illusion of more space and allows you to see how your organisms would act if given unlimited room to move around.

  2. How does each cell view its environment? Does it only focus on the closest organism and then decide to chase it or run away, or is it able to see every creature in its environment?

  3. You said that you put a 1 if you win the fight and a 0 if you lose it; do these numbers depend at all on the species that killed you, or are they simply a 1 and a 0 with no other data attached? The reason I’m asking is that I don’t completely get where the observed probability comes into play. Could you please delve deeper into expected and observed values and how you use them?

Edit: after re-reading your post a couple of times, I think I am getting a better idea of what they are, but I would still prefer it if you could expand on them.

  1. For your first point, I think rather than using a scale from 0 to 1 it would be better to use a scale from -1 to 1, where negative values mean B kills you and positive values mean you kill B. However, this is really a minor thing that doesn’t matter much.

  2. Maybe the strength-50 default could depend on the species. For example, a really cowardly species could assume any unknown species is strength 100. The system you describe works really well, but by changing this default we could also allow for really powerful yet peaceful creatures, e.g. a huge plant with really potent toxins that always wins its fights but prefers to save energy and stay out of them. Overall, though, you are on the right track; using the memory idea to give species a sense of how strong they are is really, really good.

  3. I think the memory could be an array of Memory class objects. This class could contain different information such as place of encounter, overall outcome, mode of engagement, enemy health loss, your health loss, etc. I think the different strength for each behavior could also go here.
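Something along these lines, maybe; all names and fields here are just a guess at a starting point, combining the per-encounter record with the per-behavior strength idea:

```cpp
#include <vector>

// One record per encounter, instead of a bare 0/1.
enum class Engagement { Fought, Fled };

struct Memory {
    int speciesId = 0;
    Engagement mode = Engagement::Fought;
    double outcome = 0.0;          // -1 = they killed us ... +1 = we killed them
    double ourHealthLost = 0.0;
    double theirHealthLost = 0.0;
    float placeX = 0.0f, placeY = 0.0f;  // where the encounter happened
};

// Average outcome for one mode of engagement, so a microbe can compare
// "how do I do when I stand and fight?" against "how do I do when I run?".
double AverageOutcome(const std::vector<Memory>& log, Engagement mode) {
    double sum = 0.0;
    int count = 0;
    for (const Memory& m : log)
        if (m.mode == mode) { sum += m.outcome; ++count; }
    return count > 0 ? sum / count : 0.0;  // no data: neutral
}
```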