A Genetic Algorithm to Pilot Pod Racers

Wow, what a great contest Codingame! I had a lot of fun during the week but didn’t sleep much; and when I did, I dreamt of drones and collisions! After participating in the SF2442 challenge two weeks before, I was initially disappointed when I saw that drones were back, but the differences from SF2442 were more than enough to provide a fresh challenge.

In SF2442, we had only one drone to race around the track, so the strategy was much more limited. It was possible to block the opponent, but at the cost of stopping your own drone for a while.

In Coders Strike Back, having a team of two drones led to extremely complex situations, where cooperation between the team, and aggression towards the opponents, were key. Since the physics engine was the same between the two contests, it was that much boilerplate code I didn’t have to rewrite, and could dive straight into the complex part, coding the AI.

Analyzing the debug view of Coders Strikes Back

Debug view of a replay

Generic Code Architecture

Before I explain genetic algorithms and evaluation functions, I want to write a bit about how I organized my code. I’ve done a few CodinGame competitions by now, so I try to reuse the same basis as much as possible. I separate the representation of the game state from the AI algorithms, so that I can easily switch the AI without recoding everything. I also try to keep all the “magic numbers” at the top of my code, so I can quickly test new variations. Normally this would go in a specific header, but we are limited to a single file…

For almost (all?) multiplayer contests, it is necessary to simulate the game state multiple rounds in the future. For Coders Strike Back, the only interaction between the AI algorithm and the game state was through the drones. The AI sets the drones’ requested angle and requested thrusts, then asks the game state to update to the next round. There is no intelligence in the drone objects themselves, they are just plain objects that any AI can interact with. Check out the class diagram at the end of this document for more info!

Game Simulation

Without an accurate game simulation, even the best AI won’t perform well. So the first step in this contest (well, actually in SF2442, since the environment was luckily the same!) was to write an exact simulation. I already had a few geometric classes I wrote for the Mars Lander puzzle (Points, Vectors, and Segments) so getting the correct speed and position given and angle and thrust was an easy modification from my Mars Lander code. Using the debug logs, I verified that my expected game state matched what was given in input by CodinGame.

But collisions… I had no idea how to do that, and my Google Fu failed me. I wrote something with vectors and dot products and cross products, which was complete nonsense. Well, I think this is where I say thanks to pb4608 for his explanations of the collision algorithm! He pointed me to the Poker Chip Race forum post and explained the weird impulse rebound. I eventually found a fun article with the maths and physics behind it; you can read it here for more all the details.

So I had a working simulation of the game, and could now plug in my AI.

The AI

I’m a huge fan of genetic algorithms, and use them everywhere when I don’t have a clear idea on how to solve the problem directly! So obviously, that’s the route I took. I won’t explain in depth how genetic algorithms work, but if you’re not familiar with them you can start with the Wikipedia page, and follow the links. Be careful, the rabbit hole goes very deep!

Genome Description

In the case of Coders Strike Back, I considered each gene to be a set of instructions for a drone. So a typical genome could be read as:

[(no shield, angle = 12°, thrust = 200), (no shield, angle = 18°, thrust = 150), (shield, angle = 18°, thrust ignored)]

In practice, I simulated the game state up to 6 turns in the future, so there would be 6 triplets of instructions in one drone’s genome. The genome itself is encoded as an array of doubles between 0.0 and 1.0. So a raw genome is actually closer to this:

{(0.08, 0.43, 0.07), (0.27, 0.55, 0.15), (0.18, 0.07, 0.02), (0.51, 0.97, 0.43), (0.15, 0.69, 0.99), (0.33, 0.05, 0.01)};

The algorithm reads it like so, in pseudo code:

for each triplet (gene1, gene2, gene3)
 if(gene1 > 0.95)
     requestShield();
 if(gene2 < 0.25) requestAngle(-18); else if(gene2 > 0.75)
     requestAngle(18);
 else
     requestAngle(-18 + 36 * ((gene2 – 0.25) * 2.0));
 if(gene3 < 0.25) requestThrust(0); else if(gene3 > 0.75)
     requestThrust(200);
 else
     requestThrust(200 * ((gene3 – 0.25) * 2.0));

Specifying the 0.25 and 0.75 limits to the genes makes it select way more often the extreme angles and thrust values, which intuitively seemed like values with extra importance.

Shared Genome

Actually, I concatenated both drones’ genomes into a unique shared genome, a series of 36 genes. The first 18 are for drone1, the last 18 for drone2. By considering both drones as a single entity, I could evolve cooperative behavior that would otherwise be impossible if each drone was evolved separately. That’s where all the “wow!” moments from the replays come from, it’s all emergent behavior. I was amazed when I first saw one of my drones collide with the other drone on purpose to give it a boost to its next checkpoint! I sometimes felt like I was watching a pool game instead of a drone race…

Direct Bot

Remember how I said I could switch AIs easily? Well, I coded a controlling bot (no genetic algorithms here, just basic if/then/else) that would target the next checkpoint for my first drone, and would target the opponent’s best drone for my second drone. By the way, when I write “first” and “second” drone, I mean the first and second with regards to the distance to the finish line, not the order in which they are given in input. Specifying this bot as my main bot allowed me to validate its behavior in the IDE. It would effectively turn to face its target (checkpoint or opponent drone) and go full thrust towards it. Not very effective, as it would overshoot the checkpoint and waste a lot of time turning around, but as a first approximation of the opponent’s behavior, it was good enough.

So my first version of the genetic algorithm would evolve a series of controlling genes for both of my drones, trying to beat a virtual opponent controlled by this “direct bot.” This worked pretty well for a good part of the week, and I spent my time improving the performance and number of simulations I could run in the allotted timespan.

Evaluation Function

I tried a lot of different evaluation functions but eventually settled for something very simple. I believe the most important in this game is having the biggest difference in checkpoints between the two teams and then having the biggest distance to finish line difference between the two teams. The rest is just minor adjustments to achieve this goal. So to compare two genomes, and select the best of the two, I compare the checkpoint differences. Only if they are equal do I compare the distance to finish line difference. And if those are equal, I chose the one where my second drone is closest to the opponent’s next checkpoint (or the one after the next, if it can’t reach the next checkpoint before the opponent.)

There was also a few lines of code in case of danger of timeout, to force both drones to target their respective next checkpoints, and in case the opponent was in risk of timeout, to prevent both opponent drones from reaching their respective next checkpoints.

I was quite happy with my discrete evaluation. There was no need for coefficients to apply to each parameter, keeping it simple and efficient.

First Improvement: Use Direct Bot in Genome

I was dissatisfied by how long my genetic algorithm took to evolve a basic control scheme. For the first tens of milliseconds, the genomes were completely random, wasting precious time resources. I modified the “shield gene” to make it more into a “meta gene”. The rules for the angle and thrust stayed the same, but now the “shield gene” had much more meaning than just shield on or shield off, as it could switch the drone control over to the direct bot:

if(gene1 > 0.95)
     requestShield()
 else if(gene1 < 0.3 && drone is first drone)
     use direct bot for control to next checkpoint
 else if(gene1 < 0.3 && drone is second drone)
     use direct bot to target best opponent
 else if(gene1 < 0.2 && drone is second drone)
     use direct bot to target best opponent’s next checkpoint
 else if(gene1 < 0.1 && drone is second drone)
     use direct bot to target best opponent’s next next checkpoint
 else
     use gene2 and gene3 to control angle and thrust

After the input phase, I also fill one of the genomes with the value 0.3 so that it effectively uses only the direct bot at each turn.

There was an immediate gain in performance. Now, even at generation 0, there was at least one genome which would control a coherent behavior. Printing out the genome in the debug logs at the end of turn showed that about a third of the genes used the direct bot.

Second Improvement: Evolve the Opponent

The second major improvement I did was using the genetic algorithm the evolve the opponent for some time. Effectively, for the first 30ms of a turn, I would pretend I was the opponent, and evolve a good strategy to counter the direct bot. I would then switch back to my own drone, and evolve them against the opponent drones controlled by their best genome.

That provided another huge performance gain. My leading drone would now often accurately predict that the opponent would try to block it, and go around the opponent instead of blindly slamming into it over and over again.

Optimization for Speed and Win Ratio

Throughout the week, I was continuously trying to get more simulations out of my AI. I used Callgrind and Kcachegrind extensively to identify bottlenecks in the simulation code. One of the improvements I implemented was to use a cache for the direct bot calculations, so as not to waste time recalculating 10,000 times the same output angle and thrust for the given input position, speed, and angle. At the end, I could evolve around 1,500 total generations per round (shared between the opponent’s and my drones). So with a population of 10, that means 15,000 game simulations with a prediction of 6 turns in the future.

I also did many tests in the IDE against the top 5 players to select my best parameters. The number of simulation rounds, the 0.25 and 0.75 bounds for the genes, the genetic population size, the time allotted to evolve the opponent… were all set so as to maximize the win ratio.

Conclusion

I’ve written it in the intro, but I’ll write it again, it was a really fun challenge! It’s very easy to write the basic drone controller which will simply go straight towards the next checkpoint, and then the real work starts… However, it would be nice if all the details of the physics were given from the start, maybe in the “Expert” section of the rules so as not to scare off everyone!

See you all in April for the next contest!

Annex: Class Diagram

Congratulations again Jeff06 and thank you for sharing your strategy in detail!

Try to code a genetic algorithm in Coders Strike Back game