Thinking about AI

tjwhale · June 26, 2015, 9:28pm

I really like the idea of including distance into the calculations, that’s a good thought.
I think we are all agreed that what we want is some sort of Finite State Machine, so, for example, the instincts are the states, there are finitely many of them and the machines switches between them based on inputs and when in them does outputs.
The more I think about it the more I like the bit-strings idea. One of the few pieces of info we can get is “what killed me” or even “what damaged me”. Using that info to say “the thing which killed me had spikes therefore avoid things with spikes, the thing which killed me had flagella, therefore avoid things with flagella” etc I think makes less sense than simply saying “avoid the thing that killed you”. When encountering a new microbe the systems work much the same, “look at the things which damaged you in the past and try to work out from that if this thing will damage you”.

I guess the reason I think it’s better is that you are going to be able to make a distinction between things which are actively hunting you as opposed to things that occasionally jab you. So “yes it has spikes but it only mildly damaged you one time so maybe it’s ok to be close to it, the other thing has spikes and it killed you 25 times, watch out”. In the “avoid spikes” system the problem is that you will start to avoid everything that has spikes and that may be a mistake.

Later in the game it’s going to be important to be able to tell the difference between a heavily armed herbivore and a carnivore as the heavily armed herbivores actually offer you protection rather than harm.

Does that make sense? I’m wondering if the two systems are functionally identical but I think this might be a difference between them.

The bitsrings combined with distances to each microbe could make a really robust system I think.
As I mentioned above I don’t think there is much of a difference between standing and fighting and fleeing. Later there will be (so maybe we should plan for that) but spikes work if you get close and choosing to fire agent works if you are moving away better than if you are still (it makes them swim through a cloud of it) so basically, in this stage, there is no reason to stand and fight that I can think of.

Would love to know what you guys think, we’re really making good progress here, thanks for all the input.

TheCreator · June 27, 2015, 8:17am

Okay, so starting from bottom up, I was thinking we could have a sort-of “shyness” variable that could evolve. This would influence the behaviors of organisms when they don’t know what the species they met is. This could lead to predators that attack anything that they see unless they know that it will hurt them, and herbivores that flee anything unless they know for sure it isn’t a threat.

I was originally thinking that things would always remain on the list, but I like @whodatXLIV’s idea about allowing a set amount of space to mimic intelligence.

As for multiple microbes on the screen, real cells move by looking at food (or poison) gradients. So I thought that for each microbe we could do a simple calculation:

Vector2D netMovement = 0;
For all enemy microbes on screen
{
Vector2D movementDirection = CalculateAction( e^(-1*distanceToMicrobe), typeOfMicrobe );
netMovement += movementDirection;
}

Basically, it calculates a direction of movement for each piece of food and each cell based on the type of the object and based on the gradient, which exponentially decreases with distance, and adds them all together. However, I am not sure how well this will work.

The whole CalculateAction function will decide whether the cell needs to flee, herd, graze, etc… and based on this it will return a direction and speed of movement. So even if you see a deadly predator you wouldn’t spend all your ATP running away from it if it is far away.

Yeah, that makes sense. I didn’t want to use the memory thing to complete determine behavior; I mostly made it to add an extra level of realism to what was described before I joined in.

Yeah, this is exactly what I meant by my previous post. Whenever the AI then meets a species it doesn’t know yet, it will just do a bitstring comparison to decide how best to act.

Edit: For the time being please ignore everything I said about me writing a prototype. I have run into a bit of a problem where all 20 different members in a species bunch up into 1 member because they all have a similar brain and if they are in the same vicinity they go into the same direction: swim after the same food source and run away from the same predators. A simple collision detection function should lessen this, but I have just realized that I have no idea how to do that I tried something that I though should work, but apparently it doesn’t…

TheCreator · June 28, 2015, 4:57pm

I originally planned to edit my previous post, but I guess what I am about to say warrants another one.

I’ve worked a bit on a proof of concept for the bitstring memory idea. In the video you see below I started with 115 “cells”. I have split them up into 6 species, each of which can eat only a certain kind of cells. For example, CellC, which is the brown one, can only eat the green bacteria, and the black and the red cells; it simply goes through the orange ones without killing them (it is unable to digest the compounds in them) and it is eaten by the large purple ones.

In the beginning, these cells have absolutely no idea what life is—they start moving in a randomly chosen direction until they hit something. Whenever a cell hits something, it looks at what happens next. If it gets eaten, the code of the predator gets put in a memoryKill vector; on the other hand, if it eats something, the code gets put in a memoryFood vector. These vectors remain with the cell after it respawns in another randomly chosen place on the map.

Next time when this cell moves around, it calls a GetClosestObject function every frame. This function iterates through every single cell inside a population vector until it finds the closest one. Originally I planned on using a net movement that was affected by every single cell in the vicinity, but this completely failed on me and cell behavior became really erratic (although it is possible I just messed something up, so if someone else wants to give this a go, go ahead). So at the moment, every cells action is completely dependent only on the closest living thing to it, be it food or prey. I don’t like this, but coding anything else is outside my programming reach at the moment, I only recently learned c++.

After finding the closest one, the cell checks to see if its genetic id number is a match for anything in its Kill or Food column. Since I only have 6 different species and no organelles that could be used to compare bits, I didn’t use the bit-string method described above, I used a simple integer 1-6 for an id; nevertheless, I think that if necessary this should be a fairly easy change. If the id is in the memoryKill, the cell moves the opposite direction. If it’s in the food column, it moves in that direction. If the cell has never seen this id before, it doesn’t change its previous direction (although I think this could easily be changed with a “shyness” bool variable that tells it to either go in the same direction or opposite).

One other thing I noticed bad in my prototype is that there is not fighting mechanism. Most cells spend their whole lives running away in a straight line from a predator who is chasing them. I suppose this could be fixed by either giving different species different speeds or by having the fleeing predator turn around and fight if it feels its resources running low and it has been running away for a really long time. So basically have thresholds that I think @tjwhale mentioned—a hungry animal might attack a big predator that killed it multiple times if it is desperate enough.

Anyway, here is the aforementioned prototype after a couple minutes of offscreen simulation:

Edit: Ehhh, my video recording software is acting buggy… apparently my free trial ran out. Anyone know a good one?

tjwhale · June 28, 2015, 5:07pm

Re video capture on Windows I use Microsoft Expression Encoder, it’s free and works fine.

Prototype sounds cool. I’d be interested in seeing the video.

tjwhale · June 28, 2015, 7:31pm

Edit; I’m going to rewrite this prototype because I have a much better idea so feel free to skip this post, it’ll be outdated quite quickly.

I made a prototype today, just based on some very simple principles and I think it’s quite good. It’s based on a lot of the ideas in this thread, memory from the creator and distances and adding up dangers from whodat.

The ai controlled microbe is the large one in the middle of the screen, when it’s scared it goes red/pink, when it’s hunting it goes green and when it’s doing nothing it goes grey. If it get’s trapped in a corner it is teleported back to the middle of the arena (as this happens quite a lot and is a boring way to die).

On the right hand side of the screen you can see two columns of dots. The left column is what the other microbes actually are. Red is predators (who hunt the microbe and kill it if they get too close), Green is prey (who flee the microbe and die if they get hunted) and Blue is neither which the microbe has no interest in.

Basically each time it has an interaction (either it gets eaten, or it eats something, or it tries to eat something and can’t) it records the results in it’s memory. it’s memory degrades over time (and so you can see it forget after a while). Sometimes it will forget about a predator and charge in to hunt it, but it quickly learns!

The first minute is one setup (with a lot of reds where it spends most of it’s time running away) and after 1:00 is another setup where there are more prey. As you can see it quickly learns what to hunt. The first minute isn’t great (it’s too overwhelmed by fear) the second set up is better though, it gets some chances to hunt.

This needs some more testing (like giving every microbe the same intelligence and strengths and seeing what happens). However I would like to argue that this would be really good AI for the player to play against. What it means is at first this microbe won’t run away (in fact it may even hunt you) when it is weaker than you and you get some easy kills. However it learns and now when you turn up it will flee. (Or flock but that’s another story). If you are faster than it then you can still keep hunting it when it is at maximum fear however if it is faster than you you won’t be able to kill it again until it forgets about you and then you get another free shot.

I think this is a really good balance between “all microbes fleeing as soon as you enter the screen everytime” (which is boring) and “all microbes just sitting there and getting eaten” (whic is boring). Somehow memory and forgetting gives the ai a nice rhythum to play against, a perpetual cat and mouse with a slightly dumb mouse.

It’s important to remember the main goal is to make ai that makes the game fun, not necessarily ai that is super clever.

Anyway what do you all think?

Ps. Code is in the prototypes Git repository.

TheCreator · June 29, 2015, 1:09pm

Thanks, but I ended up just using CamStudio, which is free and works fine.

Anyway, here are the awaited videos. The first one is minutes 2-3 of my prototype running, while the second one is minutes 10-12.

They are basically the same thing, but the second video is better in my opinion, since the memories of the cells are more advanced (they know more things). Also, I remind you that there are 115 cells running in this simulation, so as you can see the algorithm is very fast and can easily be applied to over a thousand cells. Huh, let me try and see how far I can stretch the population number until the program becomes slow and jagged.

Edit: lol, I just realized you can completely see my taskbar in the first video Oops.

As for @tjwhale’s video, it is very close to mine, which is very good since it means we are going along the right track. I wanted to add a bit of commentary, but I guess I’ll wait until your next prototype.

I look forward to your comments.

tjwhale · June 29, 2015, 1:55pm

That’s really cool. I like it. I think we are going down the same road as you say.

How about trying it with different species being different speeds and seeing what that looks like?

Also when you say “So at the moment, every cells action is completely dependent only on the closest living thing to it, be it food or prey. I don’t like this, but coding anything else is outside my programming reach at the moment, I only recently learned c++.”

Firstly what you’re doing with it is really good, keep it up, I’m not much of a programmer but I know sticking with it is the way to get better. If you wanted to get more objects would it be possible to call “GetClosestObject” on everything excluding the result of the last call of GetClosestObject? That would get you the second closest object.

TheCreator · June 29, 2015, 3:31pm

tjwhale:

That’s really cool. I like it. I think we are going down the same road as you say.

How about trying it with different species being different speeds and seeing what that looks like?

Also when you say “So at the moment, every cells action is completely dependent only on the closest living thing to it, be it food or prey. I don’t like this, but coding anything else is outside my programming reach at the moment, I only recently learned c++.”

Firstly what you’re doing with it is really good, keep it up, I’m not much of a programmer but I know sticking with it is the way to get better. If you wanted to get more objects would it be possible to call “GetClosestObject” on everything excluding the result of the last call of GetClosestObject? That would get you the second closest object.

Thanks I’ll try it with different speeds tomorrow and post my findings. Also, my algorithm starts to break down around 500 cells, which I think is a fairly good result since. We probably won’t have more than 5-10 microbes on the screen, but this means we will also be able to simulate microbes outside the players reach to create a seamless transition. The lag you can see in my video is actually from my recording software, not the actual simulation.

Calling GetClosestObject multiple times would work, but what I meant was I don’t really know what to do with all those different direction vectors. Should I add them based on distance? Check to see if there is a predator nearby and if that is the case run away? I would probably confuse myself in all those if and switch statements if I tried to code that.

Anyway, look forward to your prototype. Should I try to draw a code flow chart for what we expect for the final AI simulation in Thrive based on the aforementioned ideas?

tjwhale · June 29, 2015, 4:53pm

Sure go for it. I’ll happily take a look.

This is my newest prototype, I think it works really well. It’s based on the following ideas.

You have a memory which has, for each species, 20 slots. So you remember the last 20 interactions you had. You get a 1 if you kill the other microbe and a 0 if you die. From this you can determine your observed probability of winning the next fight (sum of wins / number of observations).

You also have a strength memory (just a number per species) which is how strong you think each species is (including yourself). From this you can derive your expected probability of winning the next fight.

Each time you fight (and get a new observation) you compare your expected value with the observed value. If the expected value was too high you lower what you think your own strength is and increase what you think their strength is. Visa versa if your expected value was too low you think you are stronger and they are weaker than you predicted.

In this simulation each microbe has an actual strength value (displayed above their head) which is what determines the outcome of the fights. (You win if a random number between 0 and the sum of your actual strength and their actual strength is less than your actual strength). But this is not important. The outcomes of the fights could be from the actual game being played and the system would still work fine, the species would still try to sort out who is strong compared to them and who is weak.

As you can see in the video it takes a bit of training but quite quickly they learn to hunt things weaker than them and flee things stronger than them. The microbes are green when they are hunting, grey when they are indifferent and red when they are scared / fleeing. On the right hand side you can see into the brain of the first microbe. The column of numbers on the left is the actual strength of the microbes (with the microbe you are considering at the top left) and on the right is the microbes guesses as to how strong the others are.

As you can see there are some crazy times when the 31 hunts the 74 because it hasn’t worked out yet it is much stronger (you learn a lot more by winning a fight than by losing it). However they sort themselves out.

I reset the system 1:30 in and you can see their period of insanity (all green just charging at each other) is quite long but then they learn. 76 hunts 23 and 14, 23 hunts 14 and 14 only flees.

Hope you like it, all thoughts and comments welcome.

Edit:

Having thought a little more about this I think this system has some good advantages,

You could assign partial results to encounters. So I was using 1 for A kills B and 0 for B kills A but you could have 0.7 for A does more damage to B than B does to A etc. If you really wanted to take it further you could have several parameters, like nutrients before and after as well as damage and that would help the memory a lot.
This system works well for species you haven’t met yet. So say I fight ten species and lose every fight and you fight ten different species and win every fight. If we meet then I will assume that you have strength 50 (as I have no information so assume you are average) whereas I have a low opinion of my own strength (it’s down to 10 or something like that) so I flee. However you think I am average (as you have no information) but you know you are strong (your opinion of your own strength is 100) so you will attack me because you think I am likely to be weaker than you.

Therefore two species who have never met will still react in a sensible way to each other because they have learnt about how strong or weak they are in general, as well as against other, specific, species.

One extension would be to remember the mode of engagement as well as the outcome. So if you see another microbe you remember, I have fought it 20 times, 10 of those times I stood and fought and 10 of those times I ran away and I did better at the running away so therefore I will run this time. This means you could try different behaviours and assign yourself a different strength for each one.
Later in the game we could assign quite a lot of memory upgrades for when you upgrade your brain. So you could have 40 rather than 20 slots for remembering encounters or you could have more pieces of information remembered or you could learn now behaviours to try as well as the ones you have. You could also evolve slower forgetting which would be quite helpful. Or even location dependence so you could figure out that as an alligator if you fight in a swamp you win a lot more than if you fight on the land so on the land you will be tempted to run but in the swamp you be more tempted to fight.

These are just ideas.

TheCreator · June 30, 2015, 2:18pm

Yes! It looks great and we are definitely getting somewhere. Now for some comments:

Rather than having the organisms reset in the middle if they are stuck in a corner, could you do something like this?

if (m_Pos.x > WindowWidth) m_Pos.x = 0;
if (m_Pos.x < 0) m_Pos.x = WindowWidth;
if (m_Pos.y > WindowHeight) m_Pos.y = 0;
if (m_Pos.y < 0) m_Pos.y = WindowHeight;

This isn’t something we would want for the actual game, but it creates an illusion of more space and allows you to see how your organisms would act if given unlimited room to move around.

How does each cell view its environment? Does it only focus on the closest organism and then decide to chase it or run away, or is it able to see every creature in its environment?
You said that you put a 1 if you win the fight and a 0 if you lose it; do these numbers depend at all on the species that killed you, or is it simply a 1 and a 0 with no other data attached? The reason I’m asking is that I don’t completely get where the observed probability comes into play. Could you please delve deeper into expected and observed value and how you use them?

Edit: after re-reading your post a couple of times, I think I am getting a better idea at what they are, but I would still prefer if you could expand on them.

For you first point, I think rather than using a scale 0-1 it would be better to use a scale -1 to 1, where negatives are B kills you and positives are you kill B. However, this is really a minor thing that doesn’t really matter.
Maybe the strength 50 threshold could depend on the species. For example, a really cowardly species could always assume an unknown prey is always strength 100. The system you describe works really well, but by changing this threshold we could also allow for really powerful and strong, but also really peaceful creatures i.e., a huge plant that has really potent toxins so it always wins, but it prefers to save energy and stay out of fights. But overall you are on the right track, using the memory idea to give species an idea of how strong they are is really, really good.
I think we could have the memory be an array of a Memory class object. This class could contain different information such as place of encounter, overall outcome, mode of engagement, enemy health loss, your health loss, etc… I think the different strength for each behavior could also go here.

tjwhale · June 30, 2015, 5:37pm

Here’s another video with 10 microbes all running the same ai. I made some changes (you can see I sped it up).

I made it so that the amount you change your expectations after each fight is proportional to how wrong you were. That means the learning happens faster. I also increased the size of the memory banks to 50.
I added a “self esteem function” because microbes who thought themselves very weak would run away a lot and therefore get less info and no realise they could be wrong.
I corrected for “drift” so the average of all the strengths in your list is 50.
I think the changes were pretty successful. Under the list of guessed strengths I added a deviated ("Sd = ") which is the average difference between your guess of the their strength and their actual strength. As you can see this number decays over time and get’s to a pretty good resting point (for this microbe (which is now coloured blue)) it get’s it’s average to 7, which is pretty close. At the end I scroll through the microbes brains (current brain is highlighted in red) and you can see the highest Sd is about 20 and the lowest is 7 so they have all learnt pretty well.

Something to bear in mind is that it generates good dynamics the whole way through. Some of the microbes are scared, some are neutral and some are hunting always. At the start this is relatively random but by the end it’s pretty clever. So I think this is quite good.

@TheCreator to answer your questions,

I tried wrapping the window round as you suggest (I’d probably call that periodic boundary conditions) however the path finding started doing weird things. As soon as a microbe crossed the boundary it would decide it wanted to go back and would sort of flicker. I could rewrite the path finding to find the shortest route including paths that go over the boundary but didn’t want to bother. However I agree, the boundary is very artificial. One thing I might try next is to say if they reach the boundary they are despawned. That’s what happens in the game and so is the most realistic, new ones should probably be spawned on the boundary too.
When it is hunting it is only interested in the closest enemy however when it is afraid it adds up a component for every microbe in the simulation like this

for i in range(number_of_animals):
…self.distances[i] = (distance1(self,animals[i]))
…if i != self.number and i != self.target:
…fear += 30*self.memory_strength[i]/(self.distances[i] + 0.01)
if fear >= self.memory_strength[self.number]:
…act afraid.

So basically it adds 30x how strong it thinks the other microbe is divided by the distance to it to it’s fear counter. If the fear counter > your own strength then it acts afraid and flees (it moves so as to reduce it’s fear as quickly as possible).

Each microbe has a separate events memory bank for each other species. So I have a memory bank, with 50 slots now, for species A and a completely separate one for species B etc. So if there are 10 species I have 10 memory banks. (Well 9 really as you don’t need one for yourself).

So remember that what A thinks of B may be different than what B thinks of A. This explains when they both hunt each other and are both green and rush towards each other. They both think they are stronger than the other.

So when A fights B it gets a new entry to it’s memory bank. Maybe it has
1,1,1,1,1,0,0,0,0,0
so it beats B 50% of the time. That’s my observed probability of winning a fight. Each time they fight a new observed value is computed.

A also has another memory which is remembering what A thinks the strengths of the microbes are. So A looks in that table and sees A has strength 30 and B has strength 60. This means A should win

30/(30 + 60) = 33% of the time.

So this means what A is expecting is too low. Therefore A increases it’s own strength in it’s memory and decreases B’s (ideally to A = 45 and B = 45) so then it’s expectation

45/ (45 + 45) = 50%

is the same as the observation. Does that make sense? Because A fights C and D and E etc it’s own strength is how it measures the environment. So say A goes off and beats everyone else and thinks “wow my strength is more like 100” and it fights B again and wins and gets
1,1,1,1,1,0,0,0,0,0,1
it looks and says it’s observation is now 54% but it’s expectation is

100/(100 + 45) = 65%

so it thinks “wow B is really strong, it beats me a lot even though I beat everyone else so i’ll increase what I think of it.” Does that make sense? Feel free to ask more, I’m not sure how well I’ve explained it.

Yeah good point, I’d like to start using partial values as well. (So the outcome can be 0.3 or something like that).
The thing is if you can move and you are really strong you want to hunt, you’re just wasting resources by not doing so. it just makes sense. If you are really strong and can’t move then you just never get scared but you don’t hunt.

Because 50 is average for the whole patch but you have changed what you think of yourself the system you suggest is already working. So if I beat everyone I see I will chase anyone I know nothing about (because I think I am strong and they are average). However if I lose to everything then anything I have no info on I will flee automatically (as I think they are average and I am weak). Just as you suggest. If I’ve misunderstood you on this point let me know.

Yeah I agree completely. I think this is a good idea. Storing more information about the fight (where it happened, what happened, different measures before and after etc) is the way to build a really, really clever ai for the animals later on. So they know that at night they are strong if they use stealth but in the day they should flee a lot and use speed etc. I think it’ll be almost scary how the predators who hunt you will learn what works and what doesn’t.

Remember that’s the real goal, that the other animals adapt to the player. So if they always get killed at the water hole they start spending less time there and running away more. If you are fast but weak the stop running away but turn and fight etc. I think we can get and ai that does that and that’s pretty exciting.

Great questions, if you have more let me know.

If you want to have a look at the code it’s in the prototypes github folder.

Seregon · July 1, 2015, 2:54pm

So I’m way behind with this thread, sorry about that. I’m going to go through now and try and reply to everything I’ve missed, hopefully without repeating too much of whats already been said.

…So having read all of the above posts, I like a lot of the ideas, but I also see a lot of impracticalities. Please don’t take the comments below too negatively, I tend to respond mostly when I see a problem, rather than pointing out everything I agree with. I’ll add a tl;dr/summary at the bottom of the main problems so far and what we might want/need to do about them.

While I agree that the AI’s response needs to a stimulus needs to change according to it’s internal state, or other information, I don’t like the idea of ‘mode switching’. There are continuous alternatives which could achieve the same result, but with more finesse in it’s response. For example, as you get hungrier you worry less and less about spikes. At first this means you’ll risk going near a predator to eat prey, then you become more willing to eat spiky/well defended prey, and eventually you may even try to attack a predator. Something like this:

For the reasons you explain later in your post, this is a terrible idea. It’s something I’ve struggled with in some boids based AI’s I’ve written before, even with simply collision avoidance an agent can avoid a single obstacle, but finds it almost impossible to squeeze between, or go around, two nearby ones. You need to aggregate inputs somehow, though I suspect simply adding them would be overly simple.

I like the prototype. I’m guessing it’s improved upon later, so I won’t comment much on this one. The main problem here is that the agent tries to go through the predator, and doesn’t avoid it until it has to, a blended approach might have it giving the predator a wide birth instead. Do you matlab code for these? I’d like to try modifying them at some point.

This is the reason I was referring to, though you have the additional problem of the AI fixating on one stimulus at a time, when a more optimal, and more natural, behaviour is to react to each stimulus in proportion to its importance.

This, and the idea of evolving both how responses are chosen and what they do, is interesting. I still don’t like the idea of having absolute responses, and only picking one, but there may be something in this.

This is possibly the biggest advantage of the above.

The rest of your species is AI controlled, though what your hinting at here is interesting too. Could we have a sort of macro, or instinct, system - where you either trigger a preset behaviour with a keypress, or actually have the AI take over in certain circumstances. Just a thought for now.

This is something we’re probably not doing, translating the players actions into the AI/CPA system is something we’re really not sure is possible.

As much as I have no idea how this would be done, it’d be nice not to force people to use the behaviour editor.

Apart from my usual complaint about continuity, what you propose in this post is pretty good, more specific comments below:

Sort of. You have horizontal gene transfer through plasmids. Some microbes do have sexual reproduction, though I’m not sure how any of this would affect CPA?

As before, this is terrible for a number of reasons, we need to be able to respond to multiple stimuli at the same time. Having discrete responses does make this more difficult, so I’m not sure how that would be possible in this system.

This looks ok for the species in the player patch. One thing I don’t think we’ve considered is how any of this will work in other patches? There will be no microbe interactions from which the AI can learn.

This doesn’t work with the possibility of the AI adding new responses. Picking cost functions is (to me) by far the hardest part of evolving a system, especially when you’ve got competing costs. We can’t evolve the costs functions (meta-evolution), we won’t have anywhere near enough data for that. Short of fixing the possible responses and only allowing those, I’m not sure how we get round this one either.

We really want to avoid running simulations outside whats going on during the players gameplay, the sheer number of species and interactions going on would make it very difficult to do these in an acceptable amount of time.

This is a huge problem, especially when scaled up to the number of species we’d like to have. This is the case pretty much no matter what system we use though, so a key criteria for picking that system has to be how easily it can be evolved.

TheCreator:

This brought me to the idea of having a collective “memory” for AI species. I suggest we model every cell on the screen with a bitstring. Every single bit will describe an enemy cell. For example, 0-1-1-0-0-0-0 could mean that the cell has a pilus and a toxin vacuole, but it lacks mitochondria, storage vacuoles, slime, etc. And 1-0-0-1-1-0-0 could mean that it has chloroplasts, storage vacuoles, and lacks a flagellum and a pilus.

We will then use these bitstrings to determine the action of cell A in the following manner. Each species will have a memory in the form of a table with 2 columns. Whenever a member of the species A gets killed the predators bit string is written is written in the first column. However, if the member of species A eats something tasty, its prey’s bitstring is written into column two.

I like the idea of simplifying input this way, but it falls down due to the binary response to organelles. This would conceivably result in every species evolving a tiny spike, which would cause everything to run away from it… which would eventually lead to any number of spikes being ignored by all AI’s because every cell has at least one. While a bitstring is a nice piece of data to evolve, most species will end up with most organelles, and look very similar if abstracted this way.

Which reminds me - I don’t think we’ve ever considered how AI could play into CPA, if it does at all? Basic stuff, like does spp A consider spp B prey, could have an affect, but the detail of a full AI system would make CPA exponentially more complicated than it already is.
If, then, AI does not feed into CPA, and therefore doesn’t really affect it’s survival chances, that significantly simplifies what the AI needs to do - it doesn’t need to be optimal at anything, it just needs to provide the player with an interesting gameplay experience. It may be we still want more abstract features to feed into CPA (will defend with agents, will ‘berserk’ when outnumbered, will only feed on autotrophs), which could add some flavour to the AI’s in game behaviour. Again, this is just a thought for now, a detailed AI would be nice, but can we afford it, and do we need it?

Conserving energy by doing nothing could be very important in some environments… though rather dull gameplay wise.

This is more the sort of thing I had in mind.

This, and the rest of that post, are much more like it, with continuous and interacting responses to stimuli. It still relies on discrete responses, which works fairly well in this system, though still has drawbacks. It would also require as much if not more data to evolve than @tjwhale summary post above.

While I like this interaction, there are multiple ways of achieving the same thing in the systems proposed so far. You can change your maxes as above (effectively making it almost impossible to flee from A), you can change the shape of the response curves to D and N, you can simply remove the fight response when dealing with B, and so on (I’d give more examples, but I’d end up confusing the different systems proposed so far). My point is, if we want a system to be quickly, cheaply evolvable, I would guess that having only one way to achieve a result would simplify things. Maybe all functions have a [0 1] range, and only the response to stimuli changes? This is something to consider regardless of the system.

As with floating, your discounting energy usage. If you can’t outrun something, standing may give you more energy with which to fight effectively. It’s not always going to be better or worse than fleeing.

I like your prototype, but this is a bit of a flaw. Effectively each species needs to learn very little to be optimal - which species are friend, which are foe, and which are neutral. That’s a very simple set of information to learn given only 6 species. The rest of the setup and process is pretty good, but your not actually testing a sufficiently complex scenario to see if this would work for thrive.

While I agree that the AI shouldn’t be optimal, it should be interesting, I don’t think that should necessarily come from hobbling the AI. An AI which forgets about you every 20 generations is exploitable, and probably as annoying as a dumb one. If possible, the variable behaviour, and the interesting interactions which result, should come from the environment, not the AI. Given multiple cells to respond to, the behaviour of the AI should vary from simply running away or fighting one target. A microbe that’s been cornered by other cells, or is exhausted, will act differently to one faced with just the player (this is another reason I don’t like using simulations for training, the training scenario will never be realistic, involved just one or a few cells to respond to). To feel what this is like, go and play http://agar.io/ for a while - the concept is simple, eat smaller cells, avoid bigger ones, but it becomes much more interesting due to the human behaviour behind each cell. Much bigger cells ignore you, but often (unintentionally) herd medium cells around, forcing them to interact when in an empty arena they could simply outrun each other.

Thinking about this some more, remember that these AI aren’t learning, they’re evolving. What an AI knows isn’t limited by what it experiences in its lifetime (like how a toddler learns) but how its instincts have changed over multiple generations. Is memory relevant in that scenario? Even if it isn’t, if it makes the game more interesting it may be worth having.

This is getting better, you have something more complex to learn, but have managed to keep the response simple. I like that your using knowledge about yourself as well as about others. Would be interesting to see how it fared in a larger simulation.

tl;dr summary:

We need to be careful of how much time/cpu is needed to evolve an effective AI. Pe-evolving some as seeds is a good way to reduce this, but we still need to have effective evolution between generations. I don’t think we can afford to run simulations, and I’m not sure how we’ll get any data for CPA-driven patches.
Given the above, we need to design an AI which can be easily/cheaply evolved. That doesn’t just mean increasing the rate at which it learns, or giving it more information, it means building an abstraction that efficiently represents what the AI needs to know and can be efficiently optimised given the data we can afford to give it. It also means sacrificing optimality, which is fine so long as the resulting behaviour/gameplay is interesting.
If we can’t evolve AI for CPA-driven patches, then the AI can’t have any intricate effects on CPA. Even if we can, the CPA is already incredibly complex, and adding behaviour to it will make it much more so. We can do that, or we can simplify/abstract the AI input to the CPA (e.g.: predatory, aggressive, passive, hunts in packs, etc.). If we do that, then can we simplify the in game AI to the same degree? Do we need to evolve responses to specific species, or can we simply tell the AI (rather than have it learn) whether a given cell is weaker than it, and have it act according to its ‘personality’ rather than learned behaviour? The point of the AI becomes to be interesting, not to evolve/become an optimal survivor.
The AI needs to respond to multiple, possibly conflicting, stimuli. Becoming fixated on a single other cell leads to death far too often. The later prototypes are heading more in this direction (I think @whodatXLIV’s post #20 comes closest).
If possible, I’d like responses to be continuous. I know I say this far too often, but there are two good reasons for this - #1 it leads to more organic behaviour, with degrees of response and blends of behaviour; #2 computers really don’t like making logical (if then, switch) choices repetitively, doing the same calculation ‘do x of a and y of b’ thousands of times is far more efficient that looping over ‘if x < y do a, otherwise do c’.

I think that’s all I can manage for now, tomorrow I’ll try and come up with some of my own suggestions rather than just criticising yours. Your all doing great work, there’s a lot of great suggestions and solutions here, but there’s plenty more work to be done.

TheCreator · July 1, 2015, 3:08pm

Personally, I liked the speed of the old one a lot better. It was more relaxed and easy to see what is going on, but I guess this is more of a personal preference.

Now, I am pretty sure that this is what you planned for this prototype originally (else why would you have an events memory bank for each creature), but I will still ask it just to be on the safe side. We don’t actually plan to have strength values determine fights in the real game, right? The strength memory is an excellent idea to rank different species and to know who to attack, but I always thought that in the real game fights would be determined by combat. Unless of course that would be computationally expensive and having a strength value is the only way.

Then again, where would the population dynamics play into this? If we only use your described system for organisms on the players screen (and the 8 tiles around it for seamless transitions) how will we be able to fill up the memories of the cells quickly and efficiently enough? I think we need to find a way to use the population dynamics equations to affect the strength memory of creatures. That is, unless I am completely misunderstanding how the system is planned to work in the future; in that case I apologize for my pointless rambling and hope that someone can enlighten me.

Yeah, that makes sense; we probably shouldn’t waste times on something that isn’t going to make it into the game. It worked for me since that is how I set up my system originally, but your idea about despawning creatures is great. Although one thing I might add is that we should calculate 9 times the area of the viewing rectangle (basically a 3x3 square with only the middle square rendered). I suggest this because if you despawn a creatures as soon as it leaves the boundary, the things chasing it will stop seeing it and will turn around, which would also look artificial.

Aye! That makes so much more sense. Here I was thinking that it had an only one events memory bank for all creatures.

Partial values should work by comparing the state of the creature before the combat and after. I propose that we first assign a 10 value to every organelle and then add this up. For example, if we have a barebones plant cell with a cell wall, membrane, 2 chloroplasts, mitochondria, and a vacuole we would have 10+10+2*10+10+10=60 points. After this we look at the cell after the fight. For each organelle we will have a specialized function that judges that organelles health e.g., a vacuoles function could be:

Amount of compound after combat / amount of compounds after

The other organelles functions’ would depend on how we implement toxins. We will then add up the results of this specialized functions (say we get 5 + 3 + (8+2) + 4 + 8 = 30) and divide it by the original number: 30/60=0.5. This is the number that would get inserted into the events memory. This would remove cases when A beats B 10 times and thus gets 1,1,1,1,1,1,1,1,1,1 with a 100% probability but is virtually dead after all the encounters and in reality should have a 0.1,0.2,0.05,0.1, etc…

Yeah, that makes sense. I didn’t realize that your system works in the case that I described.

Anyway, I will look at your code in the github directory, scower this topic and put all of the ideas into a code flow chart so it will be a breeze (hopefully) for @Moopli or @crovea to inplement this. Also, know that I think about it, our AI is looking a lot more than a behavioral tree than a finite state machine, not that it matter much.

Edit: didn’t realize that @Seregon has posted before me. I’ve read through it, but it’s too late where I am to reply. You bring up some good points (some of which are incidentally what I just wrote) and I’ll try to reply to them after I wake up.

tjwhale · July 1, 2015, 6:12pm

@Seregon

Nice points, interesting to read.

Mode Switching vs Continuous Response:

I disagree with you that continuous response is more biological. Consider two bucks fighting to be alpha male. The correct actions here are either

A) Risk injury by fighting by trying to get the reward of becoming the alpha OR

B) Don’t fight, cower when the alpha comes near.

There is no middle ground, there is no “half-fight” where there is any merit in fighting but not putting 100% of your effort into it. It’s the same with shoals. Either maintain the shoal OR book it. There’s no point in swimming loosely around each other, that is a bad strategy. Or even with hunting. Either commit to lunging on the buffalo as a lion, knowing that if it gets a good swipe of it’s horns on you you will be badly gored OR stay clear. There is no benefit in half committing to these situations, they are fundamentally different modes of behaviour.

I agree in general with continuous > if statements but here I think it makes no sense.

AI <-> CPA relationship:

This is a complex issue in itself, as I have been beginning with in the other thread with WhoDat the problems we face are

I) How can you compute the flow of compounds between species from information gained from their “blueprint” (number and arrangement of organelles). But moreover

II) How can you tell which of two species is stronger from the blueprint? If Auto-Evo generates two species should A hunt B or B hunt A? Should every species mutually hunt each other?

I think one of the advantages of the latest system I have proposed is that every species is making a guess at it’s own strength. This could then be fed into the CPA system when it comes to predation and from it we could derive the relationships (predator-prey or mutual avoidance etc).

However I would be happy with a system that could go from blueprint -> fight strength directly. That would be fine and we need something like that to make the non-player CPA simulations work at all.

“If we do that, then can we simplify the in game AI to the same degree?”

The problem here is again one of relationships. You should behave very differently around something which is hunting you as opposed to something you are hunting. We could go super simple and say “these dudes always hunt and these dudes always run away” but then what happens when Auto-Evo makes changes? Does the “flavour” of the microbe change and if so how? We might end up with predators who are super weak in the swimming around but the CPA system thinks they are really efficient so makes lots of them and we could get drift that way.

We might end up with a super weak species always swimming into the jaws of a strong one.

IMO the main problem here is this,

knowing some info about yourself and another microbe predict the outcome of a fight between you (including what if you try different strategies).

If you can do that then you can dynamically assign who is a predator and who is prey and that can change in different situations. I really think the memory system originally opposed by The Creator which I have been using solves a lot of problems. It lets you derive, from actual fight data, how strong the microbes are in relation to each other. Once you know that you are pretty much home and dry. You can give them 3-5 behaviours and have them switch on and off as needed.

The reason this is hard for us is we want to procedurally generate all the species. Usually in games the designers know what species there’ll be and how strong they are but we won’t have that info.

Thanks for making the effort to read the thread, it’s appreciated.

@TheCreator

Yeah the outcomes of the fights would come naturally from the player playing the game. So while swimming around it would find an A fighting a B and, even if it just swam away, the result of that encounter could be recorded in A and B’s memory banks.
How it relates to the population dynamics is a difficult question.

Re filling up the memory banks one of the advantages of this system is the memory banks don’t need to be full for the system to generate reasonable behaviour. Sometimes when you see stuff it comes and attacks you, sometimes it swims away and sometimes it is indifferent to you. What’s important is that the ai’s repsonse gets better over time. So at first weak creatures might rush you and strong ones might flee. However after you have been playing for a while the weak ones flee and the strong ones hunt you and it has learnt. Moreover if you get better at playing the game the ai will adjust to this and more species will flee from you because the expect they will lose whereas before they expected they would win. Which sounds nice.

Yeah I dig what you are saying about assigning a value per organelle. Basically we need to say “this microbe was X good when it went in and Y good when it came out so it’s outcome is X - Y” if the other guy did worse then it’s a win, if the other guy did better then it’s a loss.

I agree that Pyrrhic victories (where you only just beat something) are important to record. I think later we can try and record more info, like if they killed a member of your swarm, or what terrain you were fighting in etc.

TheCreator · July 2, 2015, 3:12pm

@Seregon Don’t worry, your comments will not offend anyone (at least not me). We all understand that what you wrote is purely constructive criticism. If we would be afraid to point out faults in other people’s ideas we wouldn’t be where we are now. I will now read through your post and answer some questions and comment on your ideas.

I see your point; however I think the idea of mode switching is in some cases better than a continuous function. For example take antelopes—if it is being chased by a lion, I won’t stop to eat a bush no matter how tasty it might seem. It won’t even slightly divert its path to run close by the bush. I think having a behavioral tree or FSM is the way to go here, but I am open to further discussion.

I think the goal was to give all members of your own species the same memory, so if you, the player, hunt microbes A and B, but run away from C and ignore D, the rest of your species will do the same. But this idea is definitely not flushed out, it was just a benefit of the system described in the same post.

This is something I agree with 100%. Although having the memory is good, it should only work for the microbes rendered on screen and should definitely not be calculated off screen while the player is playing the game.

Well, rather than having a single bit for an organelle we could have a byte. This would increase the memory, but would allow us to store more information. Either way, I think that this idea was more or less abandoned, and @tjwhale and I have decided to have a memory that stores outcomes of a couple of fights and then uses this data to deduce a strength value for each cell.

Since you liked this idea (even if you misquoted me), I decided to spend more time working on it. It didn’t work the first time I tried it a while back; however I managed to finally fix it and add it to my prototype (apparently I accidentally divided by zero… again). Keep in mind that the video below is the exact same as my prototype before (it has none of @tjwhale’s stuff, I only focused on adding the exponential function). The change is very minor (maybe I should have used a smaller value than e), but you can notice that now microbes have curved paths, which is IMO great and adds a lot more realism)

Edit: I changed the video link to a new one where I use 1.1^(-distance) instead of e^(-distance), so the curves are more pronounced. I think that it would even work if I were to use a linear dependency, so that is something I might try next.

Seregon:

So for example microbes with spikes: Spikes cannot hurt me if I’m far away so its weighted input is a function of its distance to me as well as number of spiked microbes present. So if there is one far away from me, or two, or ten I won’t run. They’re too far away and I like where I am. Now if one starts getting closer the weight starts increasing, slowly at first. Who cares if it’s 20 meters away or 19 meters away? But what if its 3 meters away then 2 meters away, clearly much more of a threat, hence the non-linear function of distance. This makes it so that 10 far away is not as frightening as one or two really close. This is the main reason why I don’t like the simple +4 scoring system, it doesn’t capture this dynamic.

This, and the rest of that post, are much more like it, with continuous and interacting responses to stimuli. It still relies on discrete responses, which works fairly well in this system, though still has drawbacks. It would also require as much if not more data to evolve than @tjwhale summary post above.

Unless you are really against using mode switching, I see tjwhale’s system working with this perfectly:

Overall, in my opinion the most menacing problem we have in our current algorithms/prototypes is the incompatibility with the CPA system. This is partially because we don’t really have a CPA system at the moment and it is hard to make a good solid fit with something we didn’t write yet. The second reason is… umm, I forgot what the second reason was… I’ll get back to you on that.

Aaannyway, if we could somehow deduce or approximate the strength of a microbe based on its organelles we could solve a lot of problems we have now.

tjwhale · July 2, 2015, 10:15pm

You are quite right. I think it is possible we have been massively overthinking this whole thing.

What about we just give a score based on number of spikes, amount and of agents and speed of movement and combine them to give each microbe a strength.

Then if there is too much enemy strength around the microbe gets afraid and flees. Otherwise it chooses between floating and hunting one of the other microbes on the screen.

Super simple, robust, and will fit with the CPA system for all patches.

TheCreator · July 3, 2015, 4:08pm

I don’t know… the strength memory system in your last video looked extremely promising and there are a lot of pro’s that come with using it. The most important things are that it would easily work with the other stages and as you mentioned we could use it to help organisms realize that they are, for example, weak on land but strong in the water.

What if we use this method for all bacteria and cells in the players patch, while assign a score based on the cell’s blueprint for the CPA controlled patches? Tell me, how far can the system in your last video be pushed? Would it work for 1000 (or even 5000) cells?

Sadly, I don’t know a word of python (my knowledge ends with C++, Java, Lua, and HLSL); otherwise I would experiment with your code myself.

I just don’t know, @Seregon brought up a lot of good points. Let’s wait until he and the other people respond.

tjwhale · July 3, 2015, 5:52pm

I guess the strength memory system does combine quite well with the “direct derivation from blueprint” method. You could seed all the strength memories with the numbers derived from the blueprint and then let the system evolve from there. As you say for non-player patches we could just use direct derivation.

In terms of larger patches each of the n microbes needs to remember, say, 50 numbers about each other species. So the memory required is 50*n^2 bits so for 1000 species that’s 50 million or about 6.25 megabytes, so not totally crazy.

In terms of speed it’s very quick because the memories only get updated each time an encounter happens between two microbes. So you are swimming around and you see and A and a B and they fight and A wins, only then will A and B’s memories be updated so that is extremely quick. If you had a lot of microbes on screen then there would be more updating but not even that much as it would only do it when the encounter was over. So I think pretty cheap overall.

Seregon · July 20, 2015, 11:34pm

When I was thinking of partial responses I didn’t mean a fleeing deer veering towards a patch of grass, so much as veering around a rock or tree, or some lesser predator. Even though a deer would in reality pay no attention to a patch of grass, having partial steering forces makes the result more interesting, as in one of the prototypes above where responding to multiple predators leads to more interesting curved paths instead of everything bee-lining towards/away from the nearest food/thread. Also, a continuous response doesn’t necessitate a mixed one, some response functions could use a logistic function, which is essentially nearly binary with the right parameters.

That said, I do agree with your examples, and with the idea that singular responses would be much simpler, so yours may be the better option.

This is kind of how I expect to solve that problem - every species has a predator-prey relationship with every other, but most of those relationships collapse to 0 due to some (mathematically) obvious inballance between them.
How we get CPA parameters from a body-plan, let alone an AI mind-plan-thing, I really don’t know yet…

This is one of my biggest problems with a lot of the systems above - the AI’s aren’t learning how to behave, so much as what is or isn’t a threat. If we can’t figure out which cells are a threat and tell the AI, how are we supposed to tell CPA that? Given that we can’t afford to run simulations for non-player patches (certainly not long enough for an AI to develop), we shouldn’t be using the AI’s knowledge (what is/isn’t a threat) as an input to CPA. My point here isn’t so much that the AI couldn’t learn this stuff, but that it would be far cheaper if we could calculate it, and have the AI learn how to react to that information, rather than having to learn it in the first place.

Ok… if I’d read ahead I’d have seen this, which is largely what I said above.

I wasn’t suggesting cells always be aggressive or passive, but rather that given a certain stimuli, they have a ‘flavour’ of response. When faced with a weaker prey, a cell could stalk it, chase it, harass it, go on a suicidal rampage, etc. It could ignore danger while hunting, or be very wary of more powerful cells, it could hunt only those prey much weaker than it, or be more reckless, it could favor slow prey, etc. None of the above necessarily affects the AI’s response to predators, it’s not a case of ‘always run’ or ‘always charge’, so much as how you run or charge, given that you already know which it is you should be doing.

I do agree that having the AI not always know whether or not they can win a fight is interesting, but I think we have other ways of being interesting. Also remember that the cells swimming around you aren’t ‘learning’, they are evolving responses to stimuli - that’s a very slow, gradual process, in response to a relatively slowly changing environment. I wouldn’t expect an AI’s behaviour to change much while you were playing, though it doesn’t need to be perfect either.

Whether or not the other guy came out worse shouldn’t be a factor. If you take damage but don’t kill the other (i…e: don’t gain nutrients), then its a loss, regardless of what state they left in. If you win, the degree of damage you took in the process is important.

This is really difficult to do. How do you define ‘running away from C’, other than moving in the opposite direction when its near? ‘Hunting A’, becomes moving towards it when near. In a complex environment your going to be moving towards and away from a lot of things for a lot of reasons. It is just possible that putting all the data together would give the right idea, but I’m not sure.

This is the prototype I was referring to above - even if the cells shouldn’t (realistically) be responding to distant cells, doing so makes the behaviour look more interesting, more intricate. I think the same goes for response to nearby, relatively unimportant stimuli, so long as the response is equally minor.

Agreed. If we can’t solve that, we need to consider using the same approach as we have for player behaviour - don’t feed back into CPA. At that point the AI only matter for whats visible to the player, what that means for the type of AI to use, I’m not sure, we’ll consider that if we can’t make it work with CPA.

@tjwhale’s post #36 says pretty much what I just wrote - don’t feed back into CPA, have a simple formula for figuring out whats a predator and whats prey, then react. I’d still like the reaction to be somewhat more complex than fight/flee though, if we go down this route.

As I mentioned on Slack I’ll be away for the next three weeks, so please don’t wait on me to continue discussing here. I’ll catch up when I can, but you guys were doing a fine job without me.

tjwhale · July 21, 2015, 5:17pm

Yeah I think this is pretty much the best idea. Seeing as we need a (blueprint) → (strength) function anyway for the CPA then we might aswell use that for the ai. Moreover (hopefully) we can slave the players experience really closely to the CPA in their patch and this will help with that (though it could end up with some big fails like a weak species thinking it is stronger than you, and always charging you, and just feeding you free food. However we can cross that bridge when we come to it.)