The Smash the Code AI Contest ended this weekend. You had 8 days to code the smartest AI on a Puyo-Puyo-like game. This edition was also special as we introduced a new league system (no need to say that we had a bit of an adrenaline rush before the launch :)). 


6,970 participants took part in the contest, of which 2,490 finishers ended up in the leaderboard: congrats folks!
On the podium: pb4608 (France, C++), Medici (China, Go), and Psyho (Poland, C++). Congratulations to the 3 of you!
The game consisted in a superhuman battle. You had to defeat your opponent by grouping colored blocks.
The contest was inspired by the Puyo-Puyo game in a 1 vs. 1 version. Each player was in charge of a grid and had to stack up blocks composed of 2 pieces called “puyo”, for which the colors are aleatory. It’s possible to make the “puyos” disappear by grouping them by 4 or more puyos of the same color.

The goal of the game is to survive as long as possible, by taking into account the 8 next blocks, our grind progress and our opponent’s one.


Thanks to the league system, participants could smoothly get into the contest:


-Beginners league: The 2 pieces have the same color and cannot be rotated.


-Expert league: The 2 pieces have random colors and can be rotated.


So how could you proceed?


A majority of players opted for the following approach: Place the new blocks near an identical color block fostering the small combos of 4 or 5 blocks allowing them to gain time.


Top ranked players (over Top 500) rather favored two strategies that both had a common point: simulate the functioning of the game on several tours and select the “place and rotation” combination that helped them maximize their score.

1. The brute force strategy: simulate all the possible moves on several tours.


  1 tour ahead : 22 combinations


2 tours ahead: 22² : 484 combinations


3 tours ahead : 22³ : 10648 combinations


4 tours ahead : 22⁴ : 234256 combinations
And so on.

2. The aleatory strategy:

Due to the important number of combinations, several players set out algorithms based on aleatory strategies (Monte-Carlo or Monte-Carlo-Tree-Search for the majority) and could therefore explore solutions up to the 8 next blocks. The whole subtility of the game was to be able to maximize it’s own combo and making it explode before it’s opponent.

You can have a look at some player’s strategies on the forum! Here are a few of them to give you  some hints:
Magus (France, C++, Rank on Contest: 12)

“Monte Carlo Tree Search. I search the solution for the opponent for 15ms. Then I search for myself during 85ms. The strategy is the following:
  • I calculate the “score to kill”. It is the “one shot” score minus one line (and modified by my current nuisance)
  • I calculate the “score to be safe”. I check out the greater gap between two columns on my opponent grid. The score to be safe is the amount of needed lines to fill that gap.
  • I calculate a floor combo score. This is 


At this moment, i can now search my solution. I ignore too small combos (under the floor combo score). If i do a combo greater than the score to be safe, i assume that the opponent will not send my a new combo for the next turns of my search.
I searched an evaluation function for all the contest, but i’m not in the top 10. So i assume my evaluation function is not that good, I use the following criteria:
  • Every group of 2 balls is 2 points. Every group of 3 balls is 6 points.
  • Every ball touching an empty cell add 1 point : This one is important because you want to access many ball as fast as possible
  • A late one : Every ball touching another ball (except a skull ball) is 0.2 points by touching balls. With the previous one and this one, your AI will try to build towers. This is the best way to be “skull proof”.
And at last, every score point is one bonus point. But with some additional rules:
  • If your combo is above the kill score, set it to the kill score. Don’t waste your time by building too big combos.
  • If the combo is too small, ignore it (like said above)
  • To force my AI to not seek for a better combo every time she see it, i use a “patience” term. My patience term is 0.7
For example, if i can win 100 points at the turn 0 (the current turn), my eval function will returns 


, so 100 points. But if i can win 500 points at the turn 2, my eval function will returns 

500* 0.7^2

. This way, my AI will prefer a combo now than a bigger combo in 5 turns.”

BlitzProg (France, C++, Rank on Contest: 78)
“My score heuristic:
Basic simulator here, find out how much points I’d get for a particular move
My connection heuristic:
each group of connected n ball is given n² points (so 1, 4 or 9).
My super greedy combo heuristic:
After the score heuristic is done, post-simulate with a vertical mono-colored block, for each color and each column (total of 30 to test) – and get the biggest score!
With no configuration (naively deciding a move by summing up all of these 3 calculations) this got my AI to start filling then entire board with a single overkilling combo. Good start! 
Then I added in a basic Monte Carlo search to look for better configurations ahead. I look for a combo that would fill half of the opponent board (counting from the column that has the lowest amount of cells). If found, try to do it.
If the opponent is about to activate a combo, try to activate mine. if not, build a combo.
If the combo heuristic of my opponent is tingling my AI, my own combo heuristic score will decrease so only score&connection heuristic remains, to flavor immediate action rather than build up and waiting for the correct piece to come up.
mc search is of various depth. it does depth 3, and if there’s some time left, it search a bit with depth 4 and 5. “
Kookiz (France, C#, Rank on Contest: 60)
“My main concern has been performance. I’ve tried to squeeze every bit out of C# (forced inlining, usage of structs, …), but there’s only so much you can do with a managed language. In the end I could simulate about 100k steps during the 100 ms (one step being “pick one move, play the combos, and calculate the score”).
The main algorithm goes like this:
  • Exhaustively bruteforce the opponent moves with a depth of 2, to detect danger (combos that will send at least 2 lines of skulls)
  • Run the simulations: pick n moves randomly, and play them to see if I can find a combo. Repeat until the 100 ms are spent. “n” is picked dynamically: I use a 8-steps depth most of the time, but reduce it to 6 in the end game (as it makes no sense to search 8 move ahead when the grid is almost filled, I prefer to reduce the depths and increase the probability to find a short killer move). I pick the best moves based on the score: best move one turn ahead, best move two turns ahead, best move three turns ahead, etc… If for the same depth I find two moves with the same score, I use two other criteria for the choice: 1/ the variance of the height of each column (so the AI will try to build towers), 2/ I try to make clusters of 3 colors).
  • When the simulations end and I have the best move for each depth, it’s a matter of deciding on how many turns I want to play:
    • If the opponent is going to make a combo in the next turn or the turn after that, I pick the corresponding depth
    • If there is a combo with more than 1500 points in the next 5 turns, I pick it
    • Otherwise, I use a “patience” term (as perfectly explained by Magus) of 0.4 (lowered to 0.2 in the end game, when my AI is just trying to survive)
In the last day I tried a whole different approach, inspired from neural networks, where I would still randomly simulate, but build a tree with each simulated move and associated weights. Whenever a desired trait is found on one branch of tree (such as a combo, a cluster of 3, a construction in towers, etc…), I’d backpropagate and increase the weight of the nodes leading to that move. On the contrary, when an undesired trait is found (scattered colors for instance), I’d reduce the weights of the nodes. The random generator is rigged to pick the moves along the branches of the tree with the highest weights. This way, I was hoping to find interesting combos, then refining the best way to get there (for instance, for similar scores, it’s much more interesting to build in towers rather than flat). Unfortunately, even though the tree is lazily constructed, the number of allocations was taking a huge toll during the garbage collections, and I couldn’t optimize it enough to provide competitive results. So I finally dropped that solution.”
MrAnderson (USA, C#, Rank on Contest: 83)
“Number of blocks in each group, cubed
+1 for each group near another of the same color (the big timewaster I couldn’t get rid of)
+1 if it’s also above said group
-15 if the group has no available open spaced adjacent to it
-2 if the group size is 1
Then, add to it the number of skull rows the move generated times a multiplier that changed a lot.
If the board was almost full, I’d subtract the number of blocks on the board times a high multiplier to make him focus on survival. That felt kludgy but it worked.
I did brute force for 2 turns for both myself and the opponent using the same evaluation, then subtracted his score from mine to get my move. I’d continue my own evaluation until the time got close, so my brute force was more like 2.3 turns most of the time.
Since I applied the opponent’s expected move to my second turn, the bot was good at firing off all my combos right before he was going to be buried, as long as he guessed right. If he guessed wrong, he tended to get owned. He also had a knack for waiting for a weak combo to land, then burying the opponent’s now-shorter stacks in skulls.”
FredericBautista (France, C++, Rank on Contest: 8)
“At first I remembered playing this kind of game back in the late 90’s and so I started browsing the web in a quest for some strategy hints, or AI related pointers. I came across this website: which contains a lot of hint and strategy for a game called puyo.
It was a good starter to get a grasp and start to try to devise a strategy.
As I was kinda impressed by jeff06 article on Genetic Algorithms that he made for the previous contest,
I also looked in that direction and found that page with a few good hints on the matter:
Seeking for more i’ve ended up looking into this:
In fact I spent most of the 3 first days writing code to simulate the game and looking for clues on the web, I first just used a simple random distribution then tried to place colors by their index making some sum mixed with random. It wasn’t so great but it still got me to silver league.
Then when my simulation got to work, I quickly realized that pure brute force would not get me far with so much colors known in advance, the combinations had to be pruned in some way.
I made a pretty basic heuristic that looked for balls of same colors in the neighborhood. Giving weights based on whether there was some “friendly” ball close enough or not. What I had in mind was to eliminate as much combination as possible before attempting to brute force some solution.
And it’s all my algorithm is about, i only check for immediate opponent attack if i have a choice to make at that very moment, all the rest is just about a few tweaked weights for pruning followed by brute forcing to find the best possible combination.
Edit: On a note, if my AI is so limited in term of opponent analysis is that my algorithm showed poor abilities in predicting coming moves from a frame to another it could drastically change prediction because of the nature of the heuristic pruning.
Hence I rather chose to give better weighting to tall structure in order to maximize my chance to survive an attack and to tweak an according coefficient to select my attacking target. I further refined this coefficient by the current remaining space in order to unleash chains faster when I had less remaining space.
Then, I also applied the same concept the other way around to try to pressure the opponent when he was low on space left. I guess there is a lot space for improvement on my algorithm, especially on the heuristic end which is really very basic and often make poor decisions. Strong point of my algorithm being probably the amount of brute force combination I evaluate every frame. I actually average 80-100% of 22*4^8 (~1.44m) moves in 95ms.
And that was enough to get me in the top 10 of the legend league ^^
Once again, it was a lot of fun, thanks!”
Based on your feedback, we’ve tried to sum up what went good and what we could improve for the next editions. Here’s a brief idea, but don’t hesitate to let us know if something is missing! 🙂
What went well:


  • The league system: 80% of you were happily surprised by the league system and 10% expressed no opinion on this question.
  • The rules changes: 73% of you are happy with the rules changes, even though some of you would like to know them before, just to avoid producing a piece of code that cannot be used in the upper leagues.
  • The game on itself: 83% of you had good fun playing at Smash the Code 🙂
What we can improve:
  • The ranking system: One of the main concerns was about the ranking system. We’ve reviewed all your feedback and are looking for ways to improve it for the next edition.
  • Chat: We received several requests to improve the chat system, in order to be able to better interact with other players. We also think there is room for improvement on this side, so keep tuned on this topic! 😉
  • Sharing content: A few days before the contest, we decided to share a video of the game on several social media, which probably wasn’t the best idea as it provided some information that could allow some of the players to start thinking about their strategy before the contest launch. 

Thank you again for your participation and we hope you see you in the next one: Codebusters!